Closed Bug 1901283 Opened 5 months ago Closed 5 months ago

2.11 - 0.87% speedometer3 TodoMVC-Svelte-Complex-DOM/DeletingAllItems/Async / speedometer3 TodoMVC-Svelte-Complex-DOM/DeletingAllItems/Async (OSX) regression on Tue May 28 2024

Categories

(Core :: Layout, defect)

Tracking

()

RESOLVED INVALID
Tracking Status
firefox129 --- affected

People

(Reporter: afinder, Unassigned)

References

(Regression)

Details

(Keywords: perf, perf-alert, regression)

Perfherder has detected a browsertime performance regression from push ba88bf442c827bc84857e2353c9b58a16f1e7f99. Since you are the author of one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio | Test | Platform | Options | Absolute values (old vs new) | Performance Profiles
2% | speedometer3 TodoMVC-Svelte-Complex-DOM/DeletingAllItems/Async | macosx1400-64-shippable-qr | fission webrender | 1.83 -> 1.87 | Before/After
1% | speedometer3 TodoMVC-Svelte-Complex-DOM/DeletingAllItems/Async | macosx1400-64-shippable-qr | fission webrender | 1.84 -> 1.85 | Before/After

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the patch(es) may be backed out in accordance with our regression policy.

If you need the profiling jobs, you can trigger them yourself from the Treeherder job view or ask a sheriff to do that for you.

You can run these tests on try with ./mach try perf --alert 626

For more information on performance sheriffing please see our FAQ.

Flags: needinfo?(aethanyc)

My patch in bug 1896875 improves performance in extreme cases, but it shouldn't slow anything down.

As for the test that shows the 2% regression, I see that its running time oscillates between 1.8ms and 1.9ms, per https://treeherder.mozilla.org/perfherder/graphs?timerange=1209600&series=autoland,5042295,1,13

Alex, could you check if this alert is in the range of noise?

Flags: needinfo?(aethanyc) → needinfo?(afinder)

I requested 20 retriggers of the macosx1400-64-shippable-qr opt BTime[tier2] (sp3) task just now (on top of the 10 runs that we already had), to get a bit more data here.

Here's the relevant "compare" view:
https://treeherder.mozilla.org/perfherder/comparesubtest?originalProject=autoland&newProject=autoland&newRevision=ba88bf442c827bc84857e2353c9b58a16f1e7f99&originalSignature=5042134&newSignature=5042134&framework=13&application=firefox&originalRevision=c7a6ad12193fba796a76da3491137b661ba43569&page=1&filter=TodoMVC-Svelte-Complex-DOM%2FDeletingAllItems%2FAsync

As of right now (with only a few retriggers having completed), it still shows this as a regression, with delta = 0.77% and confidence "low". We can see what happens to the confidence & delta as more retriggers complete.

(Right now, it looks like the 'before' revision may simply have gotten lucky, with its values skewed towards the lower end of the range by chance, which made the 'after' revision look superficially like a regression.)

Some retriggers are still pending, but right now (comparing 29 jobs for "before" and 22 for "after"), the average value has converged to 1.85 for both before and after, with a delta of 0.10% between the two commits. That is well within the range of error (which is +/- 2.44% for the "before" commit and +/- 1.01% for the "after" commit).
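
For anyone who wants to sanity-check this kind of comparison by hand, here is a minimal Python sketch using only the figures quoted in this bug. The function names and the "smaller than the tighter error range" rule are assumptions made for illustration; this is not how Perfherder computes its delta or confidence.

```python
def percent_delta(before_mean: float, after_mean: float) -> float:
    """Relative change from 'before' to 'after', in percent."""
    return (after_mean - before_mean) / before_mean * 100.0


def within_noise(delta_pct: float, before_err_pct: float, after_err_pct: float) -> bool:
    """Assumed rule of thumb (not Perfherder's algorithm): treat the delta
    as noise if it is smaller than the tighter of the two error ranges."""
    return abs(delta_pct) < min(before_err_pct, after_err_pct)


# Figures from the compare view after the retriggers:
# delta 0.10%, error +/- 2.44% (before) and +/- 1.01% (after).
print(within_noise(0.10, 2.44, 1.01))  # True

# The rounded values from the original alert row (1.83 -> 1.87).
# Rounding in the inputs means this need not match the 2.11% in the bug title.
print(round(percent_delta(1.83, 1.87), 2))  # 2.19
```

Under this rule, the 0.10% delta counts as noise, while a delta the size of the original alert would not, which is why the extra retriggers mattered.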

Status: NEW → RESOLVED
Closed: 5 months ago
Resolution: --- → INVALID

So I think we can conclude that this alert was just a false positive from noisiness in the data [hence, closed as INVALID].

afinder, feel free to reopen / add more data if you're seeing a signal that we're overlooking here.

Daniel, thank you for providing extra analysis here :)

Flags: needinfo?(afinder)