Open Bug 1822650 Opened 1 year ago Updated 10 months ago

Batch BaselineCompiles

Categories

(Core :: JavaScript Engine: JIT, enhancement, P3)

People

(Reporter: alexical, Unassigned)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [sp3])

Just filing this for tracking purposes. Iain mentioned a bit ago that he had a plan for how to batch baseline compiles which should reduce our mprotects. Turns out this can help us out in other ways too, as our mprotects get flagged by AV software, including Windows Defender (see bug 1441918).
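To make the expected saving concrete, here is a minimal sketch of why batching cuts protection changes. It is a toy model under stated assumptions, not Iain's actual plan and not SpiderMonkey code: `CodeRegion`, `installOneByOne` and `installBatched` are hypothetical names, and the region only counts protection flips instead of calling mprotect.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a JIT code region. It counts protection
// changes instead of calling mprotect/VirtualProtect.
class CodeRegion {
public:
    void makeWritable() {
        if (!writable_) { writable_ = true; ++protectionChanges_; }
    }
    void makeExecutable() {
        if (writable_) { writable_ = false; ++protectionChanges_; }
    }
    void write(const std::vector<std::uint8_t>& code) {
        assert(writable_ && "region must be writable before copying code in");
        bytes_.insert(bytes_.end(), code.begin(), code.end());
    }
    std::size_t protectionChanges() const { return protectionChanges_; }
    std::size_t size() const { return bytes_.size(); }

private:
    std::vector<std::uint8_t> bytes_;
    bool writable_ = false;  // regions start out non-writable
    std::size_t protectionChanges_ = 0;
};

using CompiledScript = std::vector<std::uint8_t>;

// Unbatched: flip protection around every single script.
void installOneByOne(CodeRegion& region,
                     const std::vector<CompiledScript>& scripts) {
    for (const auto& script : scripts) {
        region.makeWritable();
        region.write(script);
        region.makeExecutable();
    }
}

// Batched: flip protection once around the whole group.
void installBatched(CodeRegion& region,
                    const std::vector<CompiledScript>& scripts) {
    if (scripts.empty()) return;
    region.makeWritable();
    for (const auto& script : scripts) region.write(script);
    region.makeExecutable();
}
```

In this model, installing N scripts costs 2N protection changes unbatched and 2 when batched, so the saving grows with the batch size. A real implementation would also have to decide when to flush a pending batch, which this sketch ignores.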

Whiteboard: [sp3]
See Also: → 1490849
Blocks: js-perf-experiments
No longer blocks: 1801189
Blocks: 1817284
No longer blocks: js-perf-experiments
See Also: → 1823634

FWIW, it's possible to disable baseline batching in V8 using --no-baseline-batch-compilation. It would be good to check what impact that has on Chrome's score.

With respect to Windows ETW provider Microsoft-Windows-Threat-Intelligence (see bug 1823634), if you have ideas for a reproducible scenario to study which would be well tailored to repeat for estimating our progress over time and/or to compare with other browsers (so a "standard scenario that represents well the average browsing experience"), I'm happy to hear that and collect the data. What I've been doing so far probably isn't the best in that regard (though hopefully still helpful).

Severity: -- → N/A
Priority: -- → P3

Yannis: Depending on what you're looking for, the Speedometer benchmark might be a reasonable starting point, at least as one component of your strategy. It doesn't cover everything in a typical browsing session, but of all the existing JS benchmarks, it's the closest to representing the real web. Speedometer 2 has the advantage of being stable; Speedometer 3, which is under active development, represents a wider range of use cases, but the contents of the benchmark still haven't been finalized.

See Also: 1823634

(In reply to Yannis Juglaret [:yannis] from comment #2)

> With respect to Windows ETW provider Microsoft-Windows-Threat-Intelligence (see bug 1823634), if you have ideas for a reproducible scenario to study which would be well tailored to repeat for estimating our progress over time and/or to compare with other browsers (so a "standard scenario that represents well the average browsing experience"), I'm happy to hear that and collect the data. What I've been doing so far probably isn't the best in that regard (though hopefully still helpful).

(In reply to Iain Ireland [:iain] from comment #3)

> Yannis: Depending on what you're looking for, the Speedometer benchmark might be a reasonable starting point, at least as one component of your strategy. It doesn't cover everything in a typical browsing session, but of all the existing JS benchmarks, it's the closest to representing the real web. Speedometer 2 has the advantage of being stable; Speedometer 3, which is under active development, represents a wider range of use cases, but the contents of the benchmark still haven't been finalized.

Yeah, I'd be very curious to see the data you get when running e.g. https://browserbench.org/Speedometer2.1/ in your environment with Defender, etc.

Flags: needinfo?(yjuglaret)

Hello, sorry for the delay!

For one full run of Speedometer 2.1 (averaged over 5 runs) on an i5-6200U CPU @ 2.30GHz (2 cores, 4 logical processors), I get:

  • Firefox Nightly 113.0.a1 (2023-03-30): ~194077 PROTECTVM, ~1444 ALLOCVM, ~323 SUSPENDRESUME, MsMpEng.exe uses ~2.51% of CPU;
  • Chrome 111.0.5563.147: ~51 ALLOCVM, MsMpEng.exe uses ~0.05% of CPU;
  • Edge 111.0.1661.62: ~52 ALLOCVM, ~13 PROTECTVM, MsMpEng.exe uses ~0.16% of CPU.

Note: MsMpEng.exe CPU usage is with the patched mpengine.dll from bug 1441918, otherwise CPU usage from MsMpEng.exe might be ~4 times higher with all three browsers.

[:jrmuizel] mentioned to me that V8 seems to have given up on switching between RW and RX and is back to using RWX pages, which these surprisingly low numbers seem to confirm.

Would it be useful for me to share a more detailed per-function view for Nightly, like the attachment from bug 1823634?

Flags: needinfo?(yjuglaret)

(In reply to Yannis Juglaret [:yannis] from comment #5)

> For one full run of Speedometer 2.1 (averaged over 5 runs) on an i5-6200U CPU @ 2.30GHz (2 cores, 4 logical processors), I get:
>
>   • Firefox Nightly 113.0.a1 (2023-03-30): ~194077 PROTECTVM, ~1444 ALLOCVM, ~323 SUSPENDRESUME, MsMpEng.exe uses ~2.51% of CPU;
>   • Chrome 111.0.5563.147: ~51 ALLOCVM, MsMpEng.exe uses ~0.05% of CPU;
>   • Edge 111.0.1661.62: ~52 ALLOCVM, ~13 PROTECTVM, MsMpEng.exe uses ~0.16% of CPU.

Continuing the experiments, batching does seem to have a big impact: it appears to cut MsMpEng.exe CPU usage by ~72% with Chrome on the i5-6200U machine when running Speedometer 2.1 (note that a run takes ~75s in Chrome compared to ~90s in Firefox, so you cannot compare MsMpEng.exe CPU usage across browsers based on percentages alone):

  • chrome.exe --js-flags="--write-protect-code-memory": ~27325 ALLOCVM, MsMpEng.exe uses ~0.70% of CPU;
  • chrome.exe --js-flags="--write-protect-code-memory --no-baseline-batch-compilation": ~117635 ALLOCVM, MsMpEng.exe uses ~2.51% of CPU.
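For reference, the ~72% figure and the caveat about run duration can be reproduced with a little arithmetic. This is only a sketch: the helper names are made up, and weighting usage by run time assumes the percentage is an average over the whole run, which the measurements above do not state explicitly.

```cpp
#include <cassert>
#include <cmath>

// Fractional reduction from `before` to `after` (0.72 means a 72% drop).
double relativeDrop(double before, double after) {
    return (before - after) / before;
}

// Usage weighted by run time: average usage (in percent) times run length.
// The result is in "CPU share x seconds", comparable across runs of
// different durations on the same machine.
double cpuShareSeconds(double usagePercent, double runSeconds) {
    return usagePercent / 100.0 * runSeconds;
}
```

With the numbers above, `relativeDrop(2.51, 0.70)` is about 0.72. The same ~2.51% reading corresponds to roughly 2.26 share-seconds over a ~90s run but only about 1.88 over a ~75s run, which is why percentages alone are not comparable across browsers.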

... But also, wait, what? Yes, you read that correctly. --write-protect-code-memory makes Chrome trigger ALLOCVM events, not PROTECTVM. Here is why:

// Chrome uses this variant to change protection: re-committing pages
// that are already committed applies the new protection flags...
VirtualAlloc(addr, size, MEM_COMMIT, PAGE_EXECUTE_READ);

// ... not this more common one used by Firefox!
DWORD oldProtect;
VirtualProtect(addr, size, PAGE_EXECUTE_READ, &oldProtect);

... and indeed, you can use VirtualAlloc to change protection, despite its name:

An attempt to commit a page that is already committed does not cause the function to fail. This means that you can commit pages without first determining the current commitment state of each page.

What does this change? Well, even assuming that AV vendors do know this trick (I did not, or at least I never thought of using VirtualAlloc that way), ALLOCVM events do not include the previous protection whereas PROTECTVM events do. So when AV software receives those events, it has less information to work with, and it will react differently.
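As a rough illustration of the trick, a reprotection helper built on it could look like the sketch below. The helper names (`allocatePages`, `reprotect`, `freePages`) are hypothetical and this is not the code in js::jit::ReprotectRegion. The non-Windows branch uses mprotect only so that the sketch also builds elsewhere.

```cpp
#include <cassert>
#include <cstddef>

#ifdef _WIN32
#include <windows.h>
#else
#include <sys/mman.h>
#include <unistd.h>
#endif

// Hypothetical helpers for illustration; not Firefox or V8 code.
enum class Access { ReadWrite, ReadOnly, ReadExecute };

std::size_t pageSize() {
#ifdef _WIN32
    SYSTEM_INFO info;
    GetSystemInfo(&info);
    return info.dwPageSize;
#else
    return static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
#endif
}

// Reserve and commit read-write pages; returns nullptr on failure.
void* allocatePages(std::size_t size) {
#ifdef _WIN32
    return VirtualAlloc(nullptr, size, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
#else
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
#endif
}

// Change protection on pages that are already committed.
bool reprotect(void* addr, std::size_t size, Access access) {
#ifdef _WIN32
    DWORD flags = PAGE_READWRITE;
    if (access == Access::ReadOnly) flags = PAGE_READONLY;
    if (access == Access::ReadExecute) flags = PAGE_EXECUTE_READ;
    // The trick: MEM_COMMIT on committed pages applies the new protection,
    // without going through VirtualProtect.
    return VirtualAlloc(addr, size, MEM_COMMIT, flags) != nullptr;
#else
    int prot = PROT_READ | PROT_WRITE;
    if (access == Access::ReadOnly) prot = PROT_READ;
    if (access == Access::ReadExecute) prot = PROT_READ | PROT_EXEC;
    return mprotect(addr, size, prot) == 0;
#endif
}

void freePages(void* addr, std::size_t size) {
#ifdef _WIN32
    (void)size;  // MEM_RELEASE requires a size of 0
    VirtualFree(addr, 0, MEM_RELEASE);
#else
    munmap(addr, size);
#endif
}
```

One difference worth keeping in mind: unlike VirtualProtect, this path does not hand back the previous protection, so a caller that needs it has to track it separately.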

Is this a bug? Should using VirtualAlloc trigger PROTECTVM events when it results in changing protection? I don't know. But it seems that just using this trick in js::jit::ReprotectRegion can further reduce MsMpEng.exe CPU usage (and this time, likely also with other antivirus software). On the i5-6200U machine I observed a ~43% drop in MsMpEng.exe CPU usage while Firefox is running Speedometer, when testing these custom Firefox builds: VirtualProtect, VirtualAlloc. It even seemed to have a small positive impact on Speedometer results on that i5-6200U machine (58.2 to 60.0, with heavy recording instrumentation running in the background), maybe as a result of more CPU being available for Firefox. On my work machine, which has plenty of CPU available, I observed no significant difference in Speedometer results, but MsMpEng.exe CPU usage seems to drop by ~64%! All this was measured with the patched mpengine.dll from bug 1441918, which already significantly reduced MsMpEng.exe CPU usage with Firefox.

I suggest we try this trick in bug 1823634 and focus on batching here. The good part is that all these improvements are compatible and cumulative.

No longer blocks: 1823634
See Also: → 1823634
Blocks: 1835323