Closed Bug 1867193 Opened 1 year ago Closed 1 year ago

Multiple regressions on AWFY-Jetstream2 benchmark (regexp and splay) around 14November2023

Status:

RESOLVED FIXED

Milestone:

122 Branch

Tracking Flags:

                 Tracking  Status
firefox-esr115   ---       unaffected
firefox120       ---       unaffected
firefox121       ---       wontfix
firefox122       ---       fixed

People

(Reporter: mayankleoboy1, Assigned: jandem)

References

(Regression)

Details

(Keywords: regression)

Attachments

(5 files)

Bug 1867193 - Fix baseline-compiled check in MaybeCreateAllocSite. r?iain! (1 year ago, Jan de Mooij [:jandem], 48 bytes, text/x-phabricator-request)
Bug 1867193 - Remove ICScript for pc when failing trial inlining. r?iain! (1 year ago, Jan de Mooij [:jandem], 48 bytes, text/x-phabricator-request)
Bug 1867193 - Move active flag to ICScript, and add bytecode size field to ICScript. r?iain! (1 year ago, Jan de Mooij [:jandem], 48 bytes, text/x-phabricator-request)
Bug 1867193 - Reset more IC state when discarding IC stubs. r?iain! (1 year ago, Jan de Mooij [:jandem], 48 bytes, text/x-phabricator-request)
Bug 1867193 - Add tests. r?iain! (1 year ago, Jan de Mooij [:jandem], 48 bytes, text/x-phabricator-request)

Mayank Bansal

Reporter

Description

•

1 year ago

•

Edited

See these Perfherder graphs:

5.6% on Jetstream2-regexp* :
https://treeherder.mozilla.org/perfherder/graphs?highlightAlerts=1&highlightChangelogData=1&highlightCommonAlerts=0&selected=3911463,1794107134&series=mozilla-central,3738076,1,13&series=autoland,3911463,1,13&timerange=5184000&zoom=1699740804052,1700121942300,313.8853140374159,392.72783795608234

27% on splay-Average:
https://treeherder.mozilla.org/perfherder/graphs?highlightAlerts=1&highlightChangelogData=1&highlightCommonAlerts=0&selected=3911478,1794107149&series=mozilla-central,3738091,1,13&series=autoland,3911478,1,13&timerange=5184000&zoom=1699805283247,1700028034171,65.55931685230334,282.61877529005636

Suspect: bug 1863939

This is a low priority bug that may end up being wontfix

Mayank Bansal

Reporter

Updated

•

1 year ago

Summary: 5% regression on AWFY-Jetstream2-regexp* benchmark around 14November2023 → Multiple regressions on AWFY-Jetstream2-regexp* benchmark around 14November2023

Mayank Bansal

Reporter

Updated

•

1 year ago

Summary: Multiple regressions on AWFY-Jetstream2-regexp* benchmark around 14November2023 → Multiple regressions on AWFY-Jetstream2 benchmark (regexp and splay) around 14November2023

BugBot [:suhaib / :marco/ :calixte]

Comment 1

•

1 year ago

This bug has been marked as a regression. Setting status flag for Nightly to affected.

status-firefox122: --- → affected

Donal Meehan [:dmeehan]

Comment 2

•

1 year ago

Setting Bug 1863939 as the regressor based on Comment 0. Will let the bot do the need-info.

status-firefox120: --- → unaffected

status-firefox121: --- → affected

status-firefox-esr115: --- → unaffected

Regressed by: 1863939

BugBot [:suhaib / :marco/ :calixte]

Comment 3

•

1 year ago

:jandem, since you are the author of the regressor, bug 1863939, could you take a look? Also, could you set the severity field?

For more information, please visit BugBot documentation.

Flags: needinfo?(jdemooij)

Stuart Colville [:muffinresearch]

Updated

•

1 year ago

status-firefox121: affected → fix-optional

Iain Ireland [:iain]

Comment 4

•

1 year ago

I can reproduce the splay regression on my machine using the octane version of splay. Note that the regression window also includes Yoshi's patch for bug 1860655, but building locally I can reproduce the regression with only bug 1863939.

Bug 1863939 got a bunch of wins elsewhere and generally cleaned up an ugly bit of code. It's confusing that it hits splay so hard.

We only call ICCacheIRStub::clone once, so I assume that we're not spending lots of time copying stubs. My next guess was that we spent our time reattaching baseline ICs after throwing them away, but the number of calls to AttachBaselineCacheIRStub only goes from 703 to 737, which doesn't explain the large performance regression.

I do see significantly more bailouts after the patch, which implies that maybe we are compiling to Ion with worse CacheIR because we did a GC and threw away a bunch of stubs, then didn't get a chance to reattach them before compiling. If that's the case, I don't know what we do to fix this. We could tweak heuristics about how often we discard jitcode and how we reset warmup counters after GC, but my understanding is that splay is a very weird benchmark, so I don't know how much we want to tune our heuristics to help it.

Ryan VanderMeulen [:RyanVM]

Updated

•

1 year ago

status-firefox121: fix-optional → wontfix

Jan de Mooij [:jandem]

Assignee

Comment 5

•

1 year ago

I can reproduce this on octane-splay, but it's a bit intermittent. The most reliable way is to run the full Octane suite with run.js.

We also get slower if I change purgeOptimizedStubs to discard all stubs on the 'before' revision.

I narrowed it down to the allocation sites we create for NewObject and NewArray stubs. I have some small changes that fix this locally but I still have to verify it fixes the JetStream regression on Try.

Assignee: nobody → jdemooij

Status: NEW → ASSIGNED

Jan de Mooij [:jandem]

Assignee

Comment 6

•

1 year ago

Attached file Bug 1867193 - Fix baseline-compiled check in MaybeCreateAllocSite. r?iain! — Details

Allocation sites for IC stubs are created either when we Baseline compile the script,
or when we attach a new stub after Baseline compilation.

When attaching a stub we were checking whether we have an interpreter frame, but that
fails when we have a Baseline script but are currently running in the interpreter.
This can happen after a bailout for example. In this case we'd never create the allocation
site.
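
A minimal sketch of the difference between the two checks, as a toy model rather than the actual SpiderMonkey code. `ToyScript`, `ToyFrame`, `shouldCreateSiteOld` and `shouldCreateSiteNew` are hypothetical names invented for illustration; the only thing taken from the description above is the idea of asking "has this script been Baseline-compiled?" instead of "is the current frame an interpreter frame?".

```cpp
#include <cassert>

// Hypothetical stand-ins; none of these are real SpiderMonkey types.
enum class FrameKind { Interpreter, Baseline };

struct ToyScript {
  // Set once the script has been Baseline-compiled.
  bool hasBaselineScript = false;
};

struct ToyFrame {
  FrameKind kind;
  const ToyScript* script;
};

// Old shape of the check: infers "not yet Baseline-compiled" from the kind of
// frame that is currently running. This skips site creation whenever we are in
// the interpreter, even if the script was already Baseline-compiled.
bool shouldCreateSiteOld(const ToyFrame& frame) {
  return frame.kind != FrameKind::Interpreter;
}

// Fixed shape of the check: asks the script itself whether it has been
// Baseline-compiled, regardless of where execution currently is.
bool shouldCreateSiteNew(const ToyFrame& frame) {
  assert(frame.script != nullptr);
  return frame.script->hasBaselineScript;
}
```

With a frame of kind `Interpreter` whose script has `hasBaselineScript == true` (the post-bailout case), the old check declines to create a site and the new one creates it.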

Jan de Mooij [:jandem]

Assignee

Comment 7

•

1 year ago

I posted one patch. I have some other patches for issues I noticed but they need more work/testing.

I also started a try push to see if this fix is sufficient for the splay regression.

Jan de Mooij [:jandem]

Assignee

Comment 8

•

1 year ago

I confirmed on Try that this fixes the JetStream-Splay regression, but there's a bit more I want to do that should also fix the smaller regression on the regexp test.

Keywords: leave-open

Pulsebot

Comment 9

•

1 year ago

Pushed by jdemooij@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/c148a4f270ca Fix baseline-compiled check in MaybeCreateAllocSite. r=iain

Cristian Tuns

Comment 10

•

1 year ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/c148a4f270ca

Jon Coppeard (:jonco)

Comment 11

•

1 year ago

(In reply to Mayank Bansal from comment #0)
I'm seeing 20-30% improvement testing locally, and much lower variability in results with this change.

Jan de Mooij [:jandem]

Assignee

Comment 12

•

1 year ago

Attached file Bug 1867193 - Remove ICScript for pc when failing trial inlining. r?iain! — Details

This matches other places. Later patches depend on this and add an assertion.

Jan de Mooij [:jandem]

Assignee

Comment 13

•

1 year ago

Attached file Bug 1867193 - Move active flag to ICScript, and add bytecode size field to ICScript. r?iain! — Details

The active flag is currently set on the JitScript, but the next patch will also
use this to mark trial-inlined ICScripts that are on the stack. We can use this to
preserve those trial-inlined scripts and discard the rest.

To properly discard the trial-inlined ICScripts we also need to know the bytecode size
of the script, to update InliningRoot::totalBytecodeSize_, so this patch also adds
this to the IC script.

Depends on D196171
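
To illustrate why both fields are needed on the IC script, here is a rough toy model. `ToyICScript`, `ToyInliningRoot`, `add` and `purgeInactive` are hypothetical names, and the accounting is simplified to cover only the inlined scripts; it is a sketch of the idea, not the patch itself.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical model; field and type names are illustrative only.
struct ToyICScript {
  std::size_t bytecodeSize = 0;  // bytecode size of the script this belongs to
  bool active = false;           // true while a frame using it is on the stack
};

struct ToyInliningRoot {
  std::vector<std::unique_ptr<ToyICScript>> inlinedScripts;
  std::size_t totalBytecodeSize = 0;  // running total over inlinedScripts

  void add(std::size_t bytecodeSize, bool active) {
    auto script = std::make_unique<ToyICScript>();
    script->bytecodeSize = bytecodeSize;
    script->active = active;
    totalBytecodeSize += bytecodeSize;
    inlinedScripts.push_back(std::move(script));
  }

  // Drop every inlined ICScript that is not active and keep the total in sync.
  // Without a per-script size there would be nothing to subtract here.
  void purgeInactive() {
    std::vector<std::unique_ptr<ToyICScript>> kept;
    for (auto& script : inlinedScripts) {
      if (script->active) {
        kept.push_back(std::move(script));
      } else {
        assert(totalBytecodeSize >= script->bytecodeSize);
        totalBytecodeSize -= script->bytecodeSize;
      }
    }
    inlinedScripts = std::move(kept);
  }
};
```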

Jan de Mooij [:jandem]

Assignee

Comment 14

•

1 year ago

Attached file Bug 1867193 - Reset more IC state when discarding IC stubs. r?iain! — Details

After discarding IC stubs, we would no longer support trial inlining, resulting
in potential performance cliffs.

With this patch we try to reset as much state as possible. The main exception is
that we preserve trial inlining data for call sites if the callee is active
on the stack and running in Baseline. In this case we also clone the IC stubs
for the call site.

For all other ICs we now reset the ICState and we also purge inactive inlined
ICScripts.

Depends on D196172
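
The policy described above can be sketched roughly as follows. This is a toy model under stated assumptions: `ToyIC`, `cloneStubs` and `discardStubs` are hypothetical names, stubs are reduced to opaque integers, and "reset the ICState" is modeled as returning the entry to its default-constructed state.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of one inline cache entry; not SpiderMonkey's real types.
struct ToyIC {
  std::vector<int> stubs;               // attached optimized stubs (opaque ids)
  int numFailures = 0;                  // state that affects later attach decisions
  bool trialInliningDone = false;       // trial inlining already ran at this site
  bool calleeActiveInBaseline = false;  // inlined callee is on the stack in Baseline
};

// Stand-in for cloning a call site's stubs so they survive the discard.
std::vector<int> cloneStubs(const std::vector<int>& stubs) {
  return std::vector<int>(stubs);
}

void discardStubs(std::vector<ToyIC>& ics) {
  for (ToyIC& ic : ics) {
    if (ic.trialInliningDone && ic.calleeActiveInBaseline) {
      // Exception: a running frame still depends on this site's trial-inlined
      // data, so keep it and carry over a copy of its stubs.
      ic.stubs = cloneStubs(ic.stubs);
      continue;
    }
    // Everything else goes back to a fresh state, so the site can warm up
    // and be considered for trial inlining again.
    ic = ToyIC{};
  }
}
```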

Jan de Mooij [:jandem]

Assignee

Comment 15

•

1 year ago

Attached file Bug 1867193 - Add tests. r?iain! — Details

Depends on D196173

Jan de Mooij [:jandem]

Assignee

Comment 16

•

1 year ago

These latest patches show some high-confidence 3-8% improvements on the segmentation, espree-wtb, and regexp sub-tests of JetStream 2:

https://treeherder.mozilla.org/perfherder/comparesubtest?originalProject=try&newProject=try&newRevision=8d2418e8c7e65020575e746e242649db6db4641d&originalSignature=3455571&newSignature=3455571&framework=13&application=firefox&originalRevision=0d821795e4dd44bd16f1ff8a01813e707c67c26b&page=1

The 'segmentation' test is interesting because it uses Workers. Those might have longer-running stack frames and these patches probably fix some perf cliffs there.

Jan de Mooij [:jandem]

Assignee

Updated

•

1 year ago

Depends on: 1869759

Pulsebot

Comment 17

•

1 year ago

Pushed by jdemooij@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a74e6255c240
Remove ICScript for pc when failing trial inlining. r=iain
https://hg.mozilla.org/integration/autoland/rev/0e24a59eae53
Move active flag to ICScript, and add bytecode size field to ICScript. r=iain
https://hg.mozilla.org/integration/autoland/rev/dd251ff86895
Reset more IC state when discarding IC stubs. r=iain
https://hg.mozilla.org/integration/autoland/rev/86021bfb36c3
Add tests. r=iain

Atila Butkovits

Comment 18

•

1 year ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/a74e6255c240
https://hg.mozilla.org/mozilla-central/rev/0e24a59eae53
https://hg.mozilla.org/mozilla-central/rev/dd251ff86895
https://hg.mozilla.org/mozilla-central/rev/86021bfb36c3

Mayank Bansal

Reporter

Comment 19

•

1 year ago

5% win on AWFY-Jetstream2-prepack-wtb-First

Jan de Mooij [:jandem]

Assignee

Comment 20

•

1 year ago

The changes in this bug + bug 1869759 have landed so this is fixed now.

(In reply to Mayank Bansal from comment #19)
> 5% win on AWFY-Jetstream2-prepack-wtb-First

Nice, I hadn't seen that yet :)

Status: ASSIGNED → RESOLVED

Closed: 1 year ago

Flags: needinfo?(jdemooij) (cleared)

Resolution: --- → FIXED

Jan de Mooij [:jandem]

Assignee

Updated

•

1 year ago

status-firefox122: affected → fixed

Target Milestone: --- → 122 Branch

BugBot (nomail) [:suhaib / :marco/ :calixte]

Updated

•

1 year ago

Keywords: leave-open (removed)

Mayank Bansal

Reporter

Comment 21

•

1 year ago

Some improvements and regressions on AWFY-Jetstream2:
35% regression on json-parse-inspector-First
10% improvement on json-parse-inspector-Worst

Not sure if the inspector-First regression is worth tracking as there is improvement on inspector-Worst.

Steve Fink [:sfink] [:s:]

Updated

•

1 year ago

Regressions: 1870925

Iain Ireland [:iain]

Updated

•

1 year ago

Regressions: CVE-2024-0744

Jan de Mooij [:jandem]

Assignee

Updated

•

1 year ago

Regressions: 1871947

Jan de Mooij [:jandem]

Assignee

Updated

•

1 year ago

No longer regressions: CVE-2024-0744
