Closed Bug 1943901 Opened 9 months ago Closed 2 months ago

With OMT Baseline compilation enabled, testcase from bug 1826078 takes 1000ms on load, and intermittently on reloading many times. Else the testcase takes 50ms.

Categories

(Core :: JavaScript Engine, defect, P3)

Tracking

RESOLVED FIXED
144 Branch
Tracking Status
firefox-esr128 --- unaffected
firefox-esr140 --- disabled
firefox134 --- unaffected
firefox135 --- unaffected
firefox136 --- disabled
firefox137 --- disabled
firefox138 --- disabled
firefox142 --- disabled
firefox143 --- disabled
firefox144 --- fixed

People

(Reporter: mayankleoboy1, Assigned: iain)

References

(Depends on 1 open bug, Blocks 1 open bug, Regression)

Details

(Keywords: regression)

Attachments

(3 files)

Profile: https://share.firefox.dev/3PQpijC / https://share.firefox.dev/4hsDOtu
The 50ms timings may be hard to notice in the second profile; use the pageload markers (the red vertical lines) in the parent process as a cue.

Flags: needinfo?(iireland)
Summary: With OMT Baseline enabled, testcase from bug 1826078 takes 1000ms on load, and intermittently on reloading many times. Else the testcase takes 50ms. → With OMT Baselinen compilation enabled, testcase from bug 1826078 takes 1000ms on load, and intermittently on reloading many times. Else the testcase takes 50ms.

If I disable jit-hints (javascript.options.jithints = false) and enable OMT Baseline compilation, I can still repro the spikes, but their frequency drops drastically, to the point where the 1000ms spikes stand out in a sea of 50ms calm waters.

https://share.firefox.dev/4jwY8f6

Set release status flags based on info from the regressing bug 1935289

Interesting!

At first I thought this was caused by long compilation times making us spin in blinterp for longer. However, according to samply, offthread baseline compilation makes this 20x slower even though only a small fraction of time is spent in blinterp. The real problem appears to be that the offthread compilation takes long enough that we've already exceeded the trial inlining threshold, so we don't inline foo into f. If we'd done the inlining, then we would notice that the lambda is dead, so we don't have to allocate a call object for each stack frame. This explains why the profile in comment 1 shows constant minor GCs in the slow cases, and no GC in the fast cases.
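To make that concrete, here is a minimal sketch of the kind of shape being described. This is purely illustrative: the real testcase is the one from bug 1826078, and the f/foo names and all details below are assumptions taken from the description above, not the actual code.

  // Illustrative sketch only -- not the actual testcase from bug 1826078.
  // `f` creates a lambda that captures a local, so each call to `f` would
  // normally allocate a call (environment) object for its frame. If `foo`
  // gets trial-inlined into `f`, the lambda's call can be inlined too, the
  // lambda is then dead, and the per-call allocation can be skipped. Missing
  // the inlining means allocating on every iteration, which lines up with
  // the constant minor GCs seen in the slow profiles.
  function foo(cb, x) {
    return cb(x);
  }

  function f(x) {
    let local = x * 2;              // captured by the lambda below
    return foo(y => y + local, x);  // lambda forces a call object for f's frame
  }

  let total = 0;
  for (let i = 0; i < 1_000_000; i++) {
    total += f(i);
  }
  console.log(total);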

Testing locally in an opt build of the shell it looks like the warmUpCount is 1-2K by the time we link the compilation. The exact number seems fairly noisy, which makes sense because it depends on how the OS schedules the background thread. It takes <1.5ms to go from finishing the parse to a full Ion-compile, so I guess the thread synchronization delay is significant for tight loops. It's a little awkward that we're practically reaching the ion threshold before our baseline code is even available, although in real-world code there might be more functions involved and fewer super-tight loops.

If I compare the warmUpCount to the trial inlining threshold in BaselineCompiler::finishCompile, and reset the count to just below the threshold if necessary, then it looks like the performance gap goes away.

In short: this is further evidence that we need an answer for the trial inlining question before flipping this on.

Flags: needinfo?(iireland)
Blocks: 1490849
Severity: -- → S3
Priority: -- → P3
Summary: With OMT Baselinen compilation enabled, testcase from bug 1826078 takes 1000ms on load, and intermittently on reloading many times. Else the testcase takes 50ms. → With OMT Baseline compilation enabled, testcase from bug 1826078 takes 1000ms on load, and intermittently on reloading many times. Else the testcase takes 50ms.
Depends on: 1960433

With the latest nightly (https://hg.mozilla.org/mozilla-central/rev/262215ce9b5579f207c98d8554ee9cf02613f6d3), I still see occasional spikes of high latency. A good run is ~50ms; a bad run is ~550ms.

Testcase run from bmo: https://share.firefox.dev/3TatZX9
Testcase run as local file: https://share.firefox.dev/4lnOOu1

Aside: the bad timings have improved by roughly half since comment 0: they used to take ~1000ms and now take ~580ms. So yay.

Flags: needinfo?(iireland)

We can revisit this once I've made the changes to reland bug 1960433.

Flags: needinfo?(iireland)

Here is a testcase that calls filter on an array of size 50000 in a very tight loop.

With OMT Baseline compilation enabled, it takes roughly 15-17 seconds. But sometimes it can take only 4-5s, making it ~5x faster: https://share.firefox.dev/45nNN07
With OMT BC disabled, it will always take ~15-17s.
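For reference, here is a rough sketch of what that testcase does. The shape (filter over a 50000-element array in a tight loop) is from the description above; the callback and iteration count are assumptions, and the actual attachment may differ.

  // Rough sketch of the attached testcase's shape; the callback and the
  // iteration count are assumptions. Whether the filter callback gets
  // (trial-)inlined makes a big difference in per-call allocation.
  const arr = [];
  for (let i = 0; i < 50000; i++) {
    arr.push(i);
  }

  let kept = 0;
  const start = performance.now();
  for (let iter = 0; iter < 1000; iter++) {
    kept += arr.filter(x => (x & 1) === 0).length;
  }
  console.log(`kept ${kept}, took ${(performance.now() - start).toFixed(0)} ms`);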

Maybe for some scenarios, make trial inlining more aggressive or go deeper?

This still repros after bug 1960433 landed on Nightly: https://share.firefox.dev/4lUFcXN (see the spikes with timings of 50ms and 500ms).

With the default nightly, containing bug 1980830:

First testcase

Second testcase

  • Default mode (eager compilation only, mode 2): always runs in 15 seconds. We don't see the magical 5-second run.
  • On-demand only (mode 1): 15 seconds, but also got a 30s run: https://share.firefox.dev/3I6jZMr

Ha! This is both funny and rather embarrassing.

It turns out that bug 1960433 landed all the code that was necessary to do trial inlining from the baseline interpreter, except for the handful of lines to actually, you know, check the warmup counter and call DoTrialInlining.

A tiny patch to actually add that call makes the spikes in the original testcase go away locally (in the non-jit-hint configuration). Testing it on try now.

Assignee: nobody → iireland
Status: NEW → ASSIGNED
Status: ASSIGNED → RESOLVED
Closed: 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → 144 Branch
Regressions: 1987639

On the latest Nightly, with javascript.options.baselinejit.offthread_compilation_strategy = 1, I still see spikes of 500ms on the first testcase: https://share.firefox.dev/4peexrA

Flags: needinfo?(iireland)

Ah! I was initially confused, but poking at it in rr, this is bug 1969020.

Flags: needinfo?(iireland)
Depends on: 1969020
QA Whiteboard: [qa-triage-done-c145/b144]