With OMT Baseline compilation enabled, testcase from bug 1826078 takes 1000ms on load, and intermittently on reloading many times. Else the testcase takes 50ms.
Categories
(Core :: JavaScript Engine, defect, P3)
Tracking
()
| Tracking | Status | |
|---|---|---|
| firefox-esr128 | --- | unaffected |
| firefox-esr140 | --- | disabled |
| firefox134 | --- | unaffected |
| firefox135 | --- | unaffected |
| firefox136 | --- | disabled |
| firefox137 | --- | disabled |
| firefox138 | --- | disabled |
| firefox142 | --- | disabled |
| firefox143 | --- | disabled |
| firefox144 | --- | fixed |
People
(Reporter: mayankleoboy1, Assigned: iain)
References
(Depends on 1 open bug, Blocks 1 open bug, Regression)
Details
(Keywords: regression)
Attachments
(3 files)
- Enable OMT Baseline compilation (javascript.options.experimental.baselinejit.offthread_compilation=TRUE)
- Open the attached testcase from Bug 1826078 (https://bugzilla.mozilla.org/show_bug.cgi?id=1826078#c0) .
- I have increased the loopcount to 10x.
Profile: https://share.firefox.dev/3PQpijC / https://share.firefox.dev/4hsDOtu
The 50ms timings may be hard to notice in the second profile. Take the "pageload marker" (Red vertical lines) in the parent-process for a cue.
| Reporter | ||
Updated•9 months ago
|
| Reporter | ||
Comment 1•9 months ago
|
||
If i disable jit-hints (javascript.options.jithints=FALSE) and enable OMT Baseline compilation, I can still repro the spikes. But the frequency of the spikes gets reduced drastically to the point where the 1000ms spikes stand out in a sea of 50ms calm waters.
Comment 2•9 months ago
|
||
Set release status flags based on info from the regressing bug 1935289
| Assignee | ||
Comment 3•9 months ago
|
||
Interesting!
At first I thought this was caused by long compilation times making us spin in blinterp for longer. However, according to samply, offthread baseline compilation makes this 20x slower even though only a small fraction of time is spent in blinterp. The real problem appears to be that the offthread compilation takes long enough that we've already exceeded the trial inlining threshold, so we don't inline foo into f. If we'd done the inlining, then we would notice that the lambda is dead, so we don't have to allocate a call object for each stack frame. This explains why the profile in comment 1 shows constant minor GCs in the slow cases, and no GC in the fast cases.
Testing locally in an opt build of the shell it looks like the warmUpCount is 1-2K by the time we link the compilation. The exact number seems fairly noisy, which makes sense because it depends on how the OS schedules the background thread. It takes <1.5ms to go from finishing the parse to a full Ion-compile, so I guess the thread synchronization delay is significant for tight loops. It's a little awkward that we're practically reaching the ion threshold before our baseline code is even available, although in real-world code there might be more functions involved and fewer super-tight loops.
If I compare the warm up count to the trial inlining threshold in BaselineCompiler::finishCompile, and reset the count to just before the threshold if necessary, then it looks like the performance gap goes away.
In short: this is further evidence that we need an answer for the trial inlining question before flipping this on.
Updated•9 months ago
|
Updated•9 months ago
|
Updated•8 months ago
|
Updated•8 months ago
|
Updated•6 months ago
|
| Reporter | ||
Comment 4•4 months ago
•
|
||
With hte latest nightly (https://hg.mozilla.org/mozilla-central/rev/262215ce9b5579f207c98d8554ee9cf02613f6d3), I still see occasional spikes of high latency. Good run is ~50ms, Bad run is ~550ms.
Testcase run from bmo: https://share.firefox.dev/3TatZX9
Testcase run as local file: https://share.firefox.dev/4lnOOu1
Aside: The Bad numbers have improved by 50% since the timings in comment 0. They used to take ~1000ms, now they take ~580ms. So yay.
| Assignee | ||
Comment 5•3 months ago
|
||
We can revisit this once I've made the changes to reland bug 1960433.
| Reporter | ||
Comment 6•2 months ago
•
|
||
Here is a testcase that calls filter on an array of size 50000 in a very tight loop.
With OMT Baseline compilation enabled, it roughly takes 15-17 seconds. But sometimes it can take only 4-5s, maing it ~5x faster: https://share.firefox.dev/45nNN07
With OMT BC disabled, it will always take ~15-17s.
| Reporter | ||
Comment 7•2 months ago
|
||
Maybe for some scenarios, make trial inlining more aggressive or go deeper?
| Reporter | ||
Comment 8•2 months ago
|
||
This still repros after bug 1960433 landed on Nightly : https://share.firefox.dev/4lUFcXN (See the spikes with timings of 50ms and 500ms)
| Reporter | ||
Comment 9•2 months ago
•
|
||
With the default nightly, containing bug 1980830:
First testcase
- Default mode (eager compilation only,2): the first testcase always runs in ~50ms now: https://share.firefox.dev/4g5fvlO
- On-demand only (mode 1): it goes to the 50ms-500ms seesaw: https://share.firefox.dev/4p3QkUR
Second testcase
- Default mode (eager compilation only,2): always runs in 15seconds. We dont see the magical 5 second run.
- On-demand only (mode 1): 15seconds, but also got a 30s run https://share.firefox.dev/3I6jZMr .
| Assignee | ||
Comment 10•2 months ago
|
||
Ha! This is both funny and rather embarrassing.
It turns out that bug 1960433 landed all the code that was necessary to do trial inlining from the baseline interpreter, except for the handful of lines to actually, you know, check the warmup counter and call DoTrialInlining.
A tiny patch to actually add that call makes the spikes in the original testcase go away locally (in the non-jit-hint configuration). Testing it on try now.
| Assignee | ||
Comment 11•2 months ago
|
||
Updated•2 months ago
|
Comment 12•2 months ago
|
||
Comment 13•2 months ago
|
||
| bugherder | ||
| Reporter | ||
Comment 14•2 months ago
|
||
On the latest Nightly, With javascript.options.baselinejit.offthread_compilation_strategy =1, I still see spikes of 500ms on the first testcase : https://share.firefox.dev/4peexrA
Updated•2 months ago
|
| Assignee | ||
Comment 15•2 months ago
|
||
Ah! I was initially confused, but poking at it in rr, this is bug 1969020.
Updated•1 month ago
|
Description
•