Multiple regressions on AWFY-Jetstream2 benchmark (regexp and and splay) around 14November2023
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr115 | --- | unaffected |
firefox120 | --- | unaffected |
firefox121 | --- | wontfix |
firefox122 | --- | fixed |
People
(Reporter: mayankleoboy1, Assigned: jandem)
References
(Regression)
Details
(Keywords: regression)
Attachments
(5 files)
Reporter | ||
Updated•1 year ago
|
Reporter | ||
Updated•1 year ago
|
Comment 1•1 year ago
|
||
This bug has been marked as a regression. Setting status flag for Nightly to affected
.
Comment 2•1 year ago
|
||
Setting Bug 1863939 as the regressor based on Comment 0. Will let the bot do the need-info.
Comment 3•1 year ago
|
||
:jandem, since you are the author of the regressor, bug 1863939, could you take a look? Also, could you set the severity field?
For more information, please visit BugBot documentation.
Updated•1 year ago
|
Comment 4•1 year ago
|
||
I can reproduce the splay regression on my machine using the octane version of splay. Note that the regression window also includes Yoshi's patch for bug 1860655, but building locally I can reproduce the regression with only bug 1863939.
Bug 1863939 got a bunch of wins elsewhere and generally cleaned up an ugly bit of code. It's confusing that it hits splay so hard.
We only call ICCacheIRStub::clone
once, so I assume that we're not spending lots of time copying stubs. My next guess was that we spent our time reattaching baseline ICs after throwing them away, but the number of calls to AttachBaselineCacheIRStub
only goes from 703 to 737, which doesn't explain the large performance regression.
I do see significantly more bailouts after the patch, which implies that maybe we are compiling to Ion with worse CacheIR because we did a GC and threw away a bunch of stubs, then didn't get a chance to reattach them before compiling. If that's the case, I don't know what we do to fix this. We could tweak heuristics about how often we discard jitcode and how we reset warmup counters after GC, but my understanding is that splay is a very weird benchmark, so I don't know how much we want to tune our heuristics to help it.
Updated•1 year ago
|
Assignee | ||
Comment 5•1 year ago
|
||
I can reproduce this on octane-splay but it's a bit intermittent. Most reliable is to run the full Octane suite with run.js
.
We also get slower if I change purgeOptimizedStubs
to discard all stubs on the 'before' revision.
I narrowed it down to the allocation sites we create for NewObject
and NewArray
stubs. I have some small changes that fix this locally but I still have to verify it fixes the JetStream regression on Try.
Assignee | ||
Comment 6•1 year ago
|
||
Allocation sites for IC stubs are created either when we Baseline compile the script,
or when we attach a new stub after Baseline compilation.
When attaching a stub we were checking whether we have an interpreter frame, but that
fails when we have a Baseline script but are currently running in the interpreter.
This can happen after a bailout for example. In this case we'd never create the allocation
site.
Assignee | ||
Comment 7•1 year ago
|
||
I posted one patch. I have some other patches for issues I noticed but they need more work/testing.
I also started a try push to see if this fix is sufficient for the splay regression.
Assignee | ||
Comment 8•1 year ago
|
||
I confirmed on Try that this fixes the JetStream-Splay regression, but there's a bit more I want to do that should also fix the smaller regression on the regexp test.
Comment 10•1 year ago
|
||
bugherder |
Comment 11•1 year ago
|
||
(In reply to Mayank Bansal from comment #0)
I'm seeing 20-30% improvement testing locally, and much lower variability in results with this change.
Assignee | ||
Comment 12•1 year ago
|
||
This matches other places. Later patches depend on this and add an assertion.
Assignee | ||
Comment 13•1 year ago
|
||
The active flag is currently set on the JitScript, but the next patch will also
use this to mark trial-inlined ICScripts that are on the stack. We can use this to
preserve those trial-inlined scripts and discard the rest.
To properly discard the trial-inlined ICScripts we also need to know the bytecode size
of the script, to update InliningRoot::totalBytecodeSize_
, so this patch also adds
this to the IC script.
Depends on D196171
Assignee | ||
Comment 14•1 year ago
|
||
After discarding IC stubs, we would no longer support trial inlining, resulting
in potential performance cliffs.
With this patch we try to reset as much state as possible. The main exception is
that we preserve trial inlining data for call sites if the callee is active
on the stack and running in Baseline. In this case we also clone the IC stubs
for the call site.
For all other ICs we now reset the ICState and we also purge inactive inlined
ICScripts.
Depends on D196172
Assignee | ||
Comment 15•1 year ago
|
||
Depends on D196173
Assignee | ||
Comment 16•1 year ago
|
||
These latest patches show some high-confidence 3-8% improvements on the segmentation, espree-wtb, and regexp sub-tests of JetStream 2:
The 'segmentation' test is interesting because it uses Workers. Those might have longer-running stack frames and these patches probably fix some perf cliffs there.
Comment 17•1 year ago
|
||
Comment 18•1 year ago
|
||
bugherder |
Reporter | ||
Comment 19•1 year ago
|
||
Assignee | ||
Comment 20•1 year ago
|
||
The changes in this bug + bug 1869759 have landed so this is fixed now.
(In reply to Mayank Bansal from comment #19)
Nice, I hadn't seen that yet :)
Assignee | ||
Updated•1 year ago
|
Updated•1 year ago
|
Reporter | ||
Comment 21•1 year ago
|
||
Some improvements and regressions on AWFY-Jetstream2:
35% regression on json-parse-inspector-First
10% improvement on json-parse-inspector-Worst
Not sure if the inspector-First regression is worth tracking as there is improvement on inspector-Worst.
Updated•1 year ago
|
Assignee | ||
Updated•1 year ago
|
Description
•