Closed Bug 1867193 Opened 7 months ago Closed 7 months ago

Multiple regressions on AWFY-Jetstream2 benchmark (regexp and and splay) around 14November2023

Categories

(Core :: JavaScript Engine, defect)

defect

Tracking

()

RESOLVED FIXED
122 Branch
Tracking Status
firefox-esr115 --- unaffected
firefox120 --- unaffected
firefox121 --- wontfix
firefox122 --- fixed

People

(Reporter: mayankleoboy1, Assigned: jandem)

References

(Regression)

Details

(Keywords: regression)

Attachments

(5 files)

Summary: 5% regression on AWFY-Jetstream2-regexp* benchmark around 14November2023 → Multiple regressions on AWFY-Jetstream2-regexp* benchmark around 14November2023
Summary: Multiple regressions on AWFY-Jetstream2-regexp* benchmark around 14November2023 → Multiple regressions on AWFY-Jetstream2 benchmark (regexp and and splay) around 14November2023

This bug has been marked as a regression. Setting status flag for Nightly to affected.

Setting Bug 1863939 as the regressor based on Comment 0. Will let the bot do the need-info.

:jandem, since you are the author of the regressor, bug 1863939, could you take a look? Also, could you set the severity field?

For more information, please visit BugBot documentation.

Flags: needinfo?(jdemooij)

I can reproduce the splay regression on my machine using the octane version of splay. Note that the regression window also includes Yoshi's patch for bug 1860655, but building locally I can reproduce the regression with only bug 1863939.

Bug 1863939 got a bunch of wins elsewhere and generally cleaned up an ugly bit of code. It's confusing that it hits splay so hard.

We only call ICCacheIRStub::clone once, so I assume that we're not spending lots of time copying stubs. My next guess was that we spent our time reattaching baseline ICs after throwing them away, but the number of calls to AttachBaselineCacheIRStub only goes from 703 to 737, which doesn't explain the large performance regression.

I do see significantly more bailouts after the patch, which implies that maybe we are compiling to Ion with worse CacheIR because we did a GC and threw away a bunch of stubs, then didn't get a chance to reattach them before compiling. If that's the case, I don't know what we do to fix this. We could tweak heuristics about how often we discard jitcode and how we reset warmup counters after GC, but my understanding is that splay is a very weird benchmark, so I don't know how much we want to tune our heuristics to help it.

I can reproduce this on octane-splay but it's a bit intermittent. Most reliable is to run the full Octane suite with run.js.

We also get slower if I change purgeOptimizedStubs to discard all stubs on the 'before' revision.

I narrowed it down to the allocation sites we create for NewObject and NewArray stubs. I have some small changes that fix this locally but I still have to verify it fixes the JetStream regression on Try.

Assignee: nobody → jdemooij
Status: NEW → ASSIGNED

Allocation sites for IC stubs are created either when we Baseline compile the script,
or when we attach a new stub after Baseline compilation.

When attaching a stub we were checking whether we have an interpreter frame, but that
fails when we have a Baseline script but are currently running in the interpreter.
This can happen after a bailout for example. In this case we'd never create the allocation
site.

I posted one patch. I have some other patches for issues I noticed but they need more work/testing.

I also started a try push to see if this fix is sufficient for the splay regression.

I confirmed on Try that this fixes the JetStream-Splay regression, but there's a bit more I want to do that should also fix the smaller regression on the regexp test.

Keywords: leave-open
Pushed by jdemooij@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/c148a4f270ca
Fix baseline-compiled check in MaybeCreateAllocSite. r=iain

(In reply to Mayank Bansal from comment #0)
I'm seeing 20-30% improvement testing locally, and much lower variability in results with this change.

This matches other places. Later patches depend on this and add an assertion.

The active flag is currently set on the JitScript, but the next patch will also
use this to mark trial-inlined ICScripts that are on the stack. We can use this to
preserve those trial-inlined scripts and discard the rest.

To properly discard the trial-inlined ICScripts we also need to know the bytecode size
of the script, to update InliningRoot::totalBytecodeSize_, so this patch also adds
this to the IC script.

Depends on D196171

After discarding IC stubs, we would no longer support trial inlining, resulting
in potential performance cliffs.

With this patch we try to reset as much state as possible. The main exception is
that we preserve trial inlining data for call sites if the callee is active
on the stack and running in Baseline. In this case we also clone the IC stubs
for the call site.

For all other ICs we now reset the ICState and we also purge inactive inlined
ICScripts.

Depends on D196172

Depends on D196173

These latest patches show some high-confidence 3-8% improvements on the segmentation, espree-wtb, and regexp sub-tests of JetStream 2:

https://treeherder.mozilla.org/perfherder/comparesubtest?originalProject=try&newProject=try&newRevision=8d2418e8c7e65020575e746e242649db6db4641d&originalSignature=3455571&newSignature=3455571&framework=13&application=firefox&originalRevision=0d821795e4dd44bd16f1ff8a01813e707c67c26b&page=1

The 'segmentation' test is interesting because it uses Workers. Those might have longer-running stack frames and these patches probably fix some perf cliffs there.

Depends on: 1869759
Pushed by jdemooij@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a74e6255c240
Remove ICScript for pc when failing trial inlining. r=iain
https://hg.mozilla.org/integration/autoland/rev/0e24a59eae53
Move active flag to ICScript, and add bytecode size field to ICScript. r=iain
https://hg.mozilla.org/integration/autoland/rev/dd251ff86895
Reset more IC state when discarding IC stubs. r=iain
https://hg.mozilla.org/integration/autoland/rev/86021bfb36c3
Add tests. r=iain

The changes in this bug + bug 1869759 have landed so this is fixed now.

(In reply to Mayank Bansal from comment #19)

5% win on AWFY-Jetstream2-prepack-wtb-First

Nice, I hadn't seen that yet :)

Status: ASSIGNED → RESOLVED
Closed: 7 months ago
Flags: needinfo?(jdemooij)
Resolution: --- → FIXED
Target Milestone: --- → 122 Branch

Some improvements and regressions on AWFY-Jetstream2:
35% regression on json-parse-inspector-First
10% improvement on json-parse-inspector-Worst

Not sure if the inspector-First regression is worth tracking as there is improvement on inspector-Worst.

See Also: → 1870762
Regressions: CVE-2024-0744
Regressions: 1871947
No longer regressions: CVE-2024-0744
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: