Open Bug 2000849 Opened 17 days ago Updated 4 days ago

13.06 - 5.41% jetstream3 raytrace-private-class-fields-Average / jetstream3 raytrace-private-class-fields-Geometric + 5 more (Linux, OSX) regression on Thu November 13 2025

Categories

(Core :: JavaScript Engine, defect, P1)

defect

Tracking

Tracking Status
firefox-esr140 --- unaffected
firefox145 --- unaffected
firefox146 --- unaffected
firefox147 --- affected

People

(Reporter: intermittent-bug-filer, Assigned: bthrall)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: perf, perf-alert, regression)

Perfherder has detected a browsertime performance regression from push 89313c06df5e30c50ab42494e3435b6c69e18840. Since you authored one of the patches included in that push, we need your help to address this regression.

Please acknowledge and begin investigating this alert within 3 business days, or the patch(es) may be backed out in accordance with our regression policy. Our guide to handling regression bugs has information about how you can proceed with this investigation.

If you have any questions or need any help with the investigation, please reach out to bacasandrei@mozilla.com. Alternatively, you can find help on Slack by joining #perf-help, and on Matrix you can find help by joining #perftest.

Regressions:

Ratio Test Platform Options Absolute values (old vs new) Performance Profiles
13% jetstream3 raytrace-private-class-fields-Average linux1804-64-shippable-qr fission webrender 49.48 -> 55.94 Before/After
12% jetstream3 raytrace-private-class-fields-Worst linux1804-64-shippable-qr fission webrender 50.86 -> 56.76 Before/After
8% jetstream3 raytrace-private-class-fields-Geometric linux1804-64-shippable-qr fission webrender 84.98 -> 77.97 Before/After
7% jetstream3 raytrace-private-class-fields-Average macosx1500-aarch64-shippable fission webrender 11.74 -> 12.59 Before/After
7% jetstream3 raytrace-private-class-fields-Worst macosx1500-aarch64-shippable fission webrender 11.93 -> 12.75 Before/After
7% jetstream3 raytrace-private-class-fields-Worst macosx1500-aarch64-shippable fission webrender 12.02 -> 12.80 Before/After
5% jetstream3 raytrace-private-class-fields-Geometric macosx1500-aarch64-shippable fission webrender 371.57 -> 351.45 Before/After

Improvements:

Ratio Test Platform Options Absolute values (old vs new) Performance Profiles
14% jetstream3 raytrace-private-class-fields-Average windows11-64-24h2-shippable fission webrender 25.29 -> 21.88 Before/After
14% jetstream3 raytrace-private-class-fields-Worst windows11-64-24h2-shippable fission webrender 26.37 -> 22.81 Before/After
13% jetstream3 raytrace-private-class-fields-Geometric windows11-64-24h2-shippable fission webrender 178.95 -> 201.70 Before/After

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.

If you need the profiling jobs, you can trigger them yourself from the Treeherder job view or ask bacasandrei@mozilla.com to do that for you.

You can run all of these tests on try with ./mach try perf --alert 47445

The following documentation link provides more information about this command.

Flags: needinfo?(bthrall)

Set release status flags based on info from the regressing bug 1966196

Duplicate of this bug: 2000356

:iain, this is apparently caused by increasing the environment coordinate hops bitwidth to 16. What is your take on whether we should accept this performance change? How important is raytrace-private-class-fields in the grand scheme of things?

Flags: needinfo?(bthrall) → needinfo?(iireland)
Blocks: sm-jits

Am I reading this correctly that this is a 13% regression on Linux/Mac, but a 13% improvement on Windows? That's a very weird outcome.

Re-reading the patch, I would not expect it to have any meaningful impact on the behaviour of this benchmark. I would expect this benchmark to be dominated by Ion code. (A quick local run shows 90% of my profile spent in Ion code, which if anything undercounts the fraction of measured time in Ion.) Nothing in the patch seems like it should affect anything that isn't actively looking at bytecode.

So there's some sort of weird second-order effect here, presumably the result of the bytecode being slightly larger. I think it's probably nonsense, but before closing it, I would test the following (rough sketches of both checks are below):
a) If you run this test standalone locally, can you reproduce any performance change? If not, it's more likely to be meaningless noise. (For example, maybe this bumps GC timing, and this subtest is shorter running so it improves/regresses more than the subtest that the GC moves to/from.)
b) How much does the bytecode size for this subtest increase? Is it significantly more than other subtests? If this test is an outlier for bytecode size increase, that makes it more likely that this is a real regression. If the increase here is in line with other subtests, it makes it more likely to be ignorable.
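For reference, a rough sketch of both checks against before/after builds of the JS shell. The paths, the extracted subtest script, and someHotFunction are illustrative placeholders, not the actual JetStream3 layout; dis() is only available in debug shell builds.

  # (a) Standalone timing comparison between the two builds.
  for i in 1 2 3; do time /path/to/before/dist/bin/js raytrace-private-class-fields.js; done
  for i in 1 2 3; do time /path/to/after/dist/bin/js raytrace-private-class-fields.js; done

  # (b) Bytecode-size comparison: dis(f) prints a function's bytecode, so the
  # highest printed offset approximates its length. someHotFunction is a placeholder.
  /path/to/before/dist/bin/js -e 'load("raytrace-private-class-fields.js"); dis(someHotFunction)'
  /path/to/after/dist/bin/js -e 'load("raytrace-private-class-fields.js"); dis(someHotFunction)'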

Flags: needinfo?(iireland)
Severity: -- → S3
Priority: -- → P1

It's possible that this change pushed some JS function's bytecode over the smallFunctionMaxBytecodeLength threshold of 130, so that we no longer inline it. If you run this single test locally, you could check for that by comparing the IonSpew output for before/after builds (IONFLAGS=warp-trial-inlining, or add a similar printf).
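A sketch of how that comparison could be captured, assuming before/after builds of the JS shell (paths and the script name are placeholders; the spew channel name comes from the comment above, and JitSpew normally goes to stderr):

  IONFLAGS=warp-trial-inlining /path/to/before/js raytrace-private-class-fields.js 2> inlining-before.txt
  IONFLAGS=warp-trial-inlining /path/to/after/js raytrace-private-class-fields.js 2> inlining-after.txt
  diff inlining-before.txt inlining-after.txt

Any function that was trial-inlined before the push but is rejected afterwards should show up in the diff.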

Assignee: nobody → bthrall

On my workstation (AMD Ryzen Threadripper PRO 3975WX 32-Cores, x86_64), I only see about a 3% difference in Jetstream3 raytrace-private-class-fields-Average scores between main and main-without-1966196.

I also see only a 0.4% difference in the number of trial-inlining attempts that fail because of the smallFunctionMaxBytecodeLength threshold. It isn't obvious to me which of the Jetstream3 resources belong to raytrace-private-class-fields, so I don't know exactly how those scripts changed.

Jetstream mangles all of its source into a blob to avoid measuring network timing. You can use the --no-prefetch option with cli.js to get better source information in the shell. In the browser I think it's prefetchResources=false.
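For example, assuming a local JetStream3 checkout (paths are placeholders, and the option spellings are taken from the comment above, so they may differ between JetStream versions):

  cd /path/to/JetStream3
  /path/to/objdir/dist/bin/js cli.js --no-prefetch

In the browser the equivalent is presumably a query parameter, e.g. index.html?prefetchResources=false, so individual scripts keep their original filenames in profiles and spew output.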

A 3% regression locally feels like a real thing, if it's consistent between runs. Jan's suggestion about trial inlining is astute; I hadn't considered that we sometimes use bytecode length as a proxy for code size. I would try tweaking this value slightly to see if it recovers the regression.
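For anyone following along, a minimal sketch of the experiment being suggested, assuming the default still lives with the other JIT tuning knobs in js/src/jit/JitOptions.cpp (the exact macro and value may differ; verify against the current source):

  // Trial inlining uses bytecode length as a proxy for code size. The current
  // cutoff is 130; raising it slightly keeps functions that grew past the
  // threshold eligible for inlining. Try e.g. 136 or 140 and re-measure.
  SET_DEFAULT(smallFunctionMaxBytecodeLength, 130);

Raising the cutoff also inlines more (and larger) functions everywhere else, so any new value would need to be validated against full benchmark runs, not just this subtest.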

It has been over 7 days with no activity on this performance regression.

:bthrall, since you are the author of the regressor, bug 1966196, which triggered this performance alert, could you please provide a progress update?

If this regression is something that fixes a bug, changes the baseline of the regression metrics, or otherwise will not be fixed, please consider closing it as WONTFIX. See this documentation for more information on how to handle regressions.

For additional information or help, please needinfo the performance sheriff who filed this alert (they can be found in comment #0), or reach out in #perftest or #perfsheriffs on Element.

For more information, please visit BugBot documentation.

Flags: needinfo?(bthrall)

Partial status update: my understanding is that Bryan is working on this, but is OOO until Monday due to American Thanksgiving.

:iain is correct; I'm working on finding a reasonable value for smallFunctionMaxBytecodeLength to minimize the regression.

Flags: needinfo?(bthrall)