Closed Bug 1981532 Opened 3 months ago Closed 3 months ago

17.53 - 7.38% perf_reftest_singletons bloom-basic.html / perf_reftest_singletons deeply-nested-grid-1.html + 5 more (OSX, Windows) regression on Wed July 30 2025

Categories

(Core :: Layout: Grid, defect)

RESOLVED WONTFIX
Tracking Status
firefox-esr128 --- unaffected
firefox-esr140 --- unaffected
firefox141 --- unaffected
firefox142 --- unaffected
firefox143 --- wontfix

People

(Reporter: intermittent-bug-filer, Unassigned)

References

(Regression)

Perfherder has detected a talos performance regression from push e4353f5fe7af933450feba1e94e33cbfea2272cc. As you are the author of one of the patches included in that push, we need your help to address this regression.

Please acknowledge, and begin investigating this alert within 3 business days, or the patch(es) may be backed out in accordance with our regression policy. Our guide to handling regression bugs has information about how you can proceed with this investigation.

If you have any questions or need any help with the investigation, please reach out to fbilt@mozilla.com. Alternatively, you can find help on Slack by joining #perf-help, and on Matrix you can find help by joining #perftest.

Regressions:

Ratio | Test | Platform | Options | Absolute values (old vs new)
18% | perf_reftest_singletons bloom-basic.html | macosx1470-64-shippable | e10s fission stylo webrender | 41.89 -> 49.23
17% | perf_reftest_singletons bloom-basic-2.html | macosx1470-64-shippable | e10s fission stylo webrender | 42.10 -> 49.39
11% | perf_reftest_singletons deeply-nested-grid-1.html | macosx1470-64-shippable | e10s fission stylo webrender | 1.58 -> 1.75
11% | perf_reftest_singletons deeply-nested-grid-2.html | macosx1470-64-shippable | e10s fission stylo webrender | 2.34 -> 2.60
9% | perf_reftest_singletons deeply-nested-grid-2.html | windows11-64-24h2-shippable | e10s fission stylo webrender | 1.25 -> 1.36
8% | perf_reftest_singletons deeply-nested-grid-2.html | windows11-64-24h2-shippable | e10s fission stylo webrender | 1.26 -> 1.35
7% | perf_reftest_singletons deeply-nested-grid-1.html | windows11-64-24h2-shippable | e10s fission stylo webrender | 0.89 -> 0.96
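
For readers who want to sanity-check the Ratio column, here is a minimal sketch of the usual relative-change calculation. It assumes the ratio is (new - old) / old. The alert presumably computes it from unrounded measurements, so values recomputed from the rounded numbers in the table can differ by a point or so. The function name `percent_change` is made up for illustration.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent (positive means slower)."""
    return (new - old) / old * 100.0


# Rounded absolute values copied from two rows of the table above.
rows = [
    ("bloom-basic.html (macOS)", 41.89, 49.23),
    ("deeply-nested-grid-1.html (macOS)", 1.58, 1.75),
]

for name, old, new in rows:
    print(f"{name}: {percent_change(old, new):.2f}%")
```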

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask fbilt@mozilla.com to do that for you.

You can run all of these tests on try with ./mach try perf --alert 46123

The following documentation link provides more information about this command.

Flags: needinfo?(aethanyc)

Set release status flags based on info from the regressing bug 1957244

The perf regressions (on the order of ~10%) in deeply-nested-grid-1.html seem acceptable to me, given that the regressor was adding some additional work that's required for correctness, and the cost here seems to not be exponential (there are ~25 layers of grids in that testcase, so if we were doing something like doubling the work at each nesting-level in terms of its subtree, then this would have blown up a lot more than 10%).
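
The back-of-the-envelope reasoning above can be sketched numerically. This is a toy model, not a description of how grid layout cost is actually structured: it assumes the total cost is multiplied by some fixed factor at each of the ~25 nesting levels, and the names `LAYERS` and `relative_cost` are hypothetical.

```python
LAYERS = 25  # approximate nesting depth quoted in the comment above


def relative_cost(per_level_factor: float, layers: int = LAYERS) -> float:
    """Total cost multiplier if every nesting level scales the work by per_level_factor."""
    return per_level_factor ** layers


# If the work doubled at every nesting level, the slowdown would be enormous.
doubling = relative_cost(2.0)

# The observed slowdown is roughly 10%; in this toy model that corresponds to
# only a tiny per-level factor.
observed = 1.10
implied_per_level = observed ** (1 / LAYERS)

print(f"doubling per level: {doubling:,.0f}x slower")
print(f"per-level factor implied by a 10% slowdown: {implied_per_level:.4f}")
```

Under that assumption, a per-level doubling would make the test tens of millions of times slower, while a 10% overall slowdown corresponds to well under 1% extra work per level.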

However: I'm surprised & slightly-concerned that there were reported changes in bloom-basic.html and bloom-basic-2.html, since those don't use CSS Grid at all:
https://searchfox.org/mozilla-central/rev/7f12cb24e022aefb9c716211e592a4f4b0a5890c/testing/talos/talos/tests/perf-reftest/bloom-basic.html
https://searchfox.org/mozilla-central/rev/7f12cb24e022aefb9c716211e592a4f4b0a5890c/testing/talos/talos/tests/perf-reftest/bloom-basic-2.html

(Both tests use build_dom from util.js, and that file doesn't mention "grid" either:
https://searchfox.org/mozilla-central/rev/7f12cb24e022aefb9c716211e592a4f4b0a5890c/testing/talos/talos/tests/perf-reftest/util.js#6-28 )

Looking at the graph, though, I'm not actually seeing any clear trend or jump upwards there, and the data looks pretty noisy (scattered between ~34 and ~62). So maybe this was just an unfortunately timed outlier, or a statistical anomaly where we got a few high measurements in a row while the longer-term trend stayed the same...

Here's the perf.compare UI for the regressing commit ( e4353f5fe7af933450feba1e94e33cbfea2272cc ) vs. its parent as the base commit:
https://perf.compare/subtests-compare-results?baseRev=fc8e84aadc7b540ad6457928f97074835c8cdfec&baseRepo=autoland&newRev=e4353f5fe7af933450feba1e94e33cbfea2272cc&newRepo=autoland&framework=1&baseParentSignature=300908&newParentSignature=300908

The bloom-* subtests do indeed show a regression there right now. I just retriggered to request 9 more jobs before/after to get more data, to test whether that might just be statistical noise... (which it sorta seems like it has to be, given that these tests don't seem to use css grid and the regressor was just a pref-flip for a pref that's only used in CSS grid...)

I'm aligned with Daniel's analysis above in comment 2 and comment 3. The regression on deeply-nested-grid-1.html and deeply-nested-grid-2.html is acceptable because the new behavior introduces more work for grid layout. We were already aware of the performance impact, which was flagged as a heads-up in bug 1957244 comment 3.

Looking at the past 14 days of performance data, both bloom-basic.html and bloom-basic-2.html already showed bimodal timing behavior even before bug 1957244 landed. The fast times fluctuate around 37–38 ms, while the slow times are around 48–49 ms.

Bottom line: I'm inclined to close this as WONTFIX.

Flags: needinfo?(aethanyc)
Status: NEW → RESOLVED
Closed: 3 months ago
Resolution: --- → WONTFIX

(Re the bimodal timing: that appears to be at a per-build level, maybe due to PGO decisions that vary for each build. A given build will tend to land in either the 37-38 ms range or the 48-49 ms range. So my retriggers in comment 3 do superficially show a strong indication of a regression, but in fact they just reflect that the base build got lucky and the new build got unlucky in terms of PGO decisions and which of the two modes each build tends to land on.)
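
To illustrate why retriggering the same two builds can't separate a real regression from per-build bimodality, here is a small simulation sketch. Everything in it is assumed for illustration: the mode midpoints are taken from the ranges quoted above, while the jitter, the run counts, and the 50/50 chance of a build landing in either mode are made-up parameters, not measured values.

```python
import random
from statistics import mean

FAST_MS, SLOW_MS = 37.5, 48.5  # midpoints of the two ranges quoted above


def run_build(mode_ms, retriggers, rng, jitter_ms=0.5):
    """Simulate retriggers of one build: every run lands near that build's own mode."""
    return [rng.gauss(mode_ms, jitter_ms) for _ in range(retriggers)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# One "lucky" base build and one "unlucky" new build, 10 runs each.
base = run_build(FAST_MS, 10, rng)
new = run_build(SLOW_MS, 10, rng)
gap = mean(new) - mean(base)
print(f"base vs new gap: {gap:.1f} ms (looks like a solid regression)")

# Many builds with NO code change at all: each build independently lands in
# one of the two modes, so per-build averages split into two groups anyway.
build_means = [
    mean(run_build(rng.choice([FAST_MS, SLOW_MS]), 10, rng)) for _ in range(200)
]
slow_share = sum(m > 43 for m in build_means) / len(build_means)
print(f"share of unchanged builds in the slow mode: {slow_share:.0%}")
```

More retriggers only tighten the estimate of each build's own mode. Telling the two explanations apart needs data from many different builds on each side of the change.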
