Open Bug 1898433 Opened 4 months ago Updated 3 months ago

3.26 - 2.32% jetstream2 / jetstream2 (Linux, OSX) regression on Thu May 9 2024

Categories

(Testing :: Performance, defect, P5)

Tracking

Tracking Status
firefox-esr115 --- unaffected
firefox126 --- unaffected
firefox127 --- fix-optional
firefox128 --- fix-optional
firefox129 --- fix-optional

People

(Reporter: afinder, Unassigned)

References

(Regression)

Details

(Keywords: perf, perf-alert, regression)

Perfherder has detected a browsertime performance regression from push 92d53efb67c7efa140ae3a92bf3b04dadc882c97. As you are the author of one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio  Test        Platform                    Options            Absolute values (old vs new)  Performance Profiles
3%     jetstream2  linux1804-64-shippable-qr   fission webrender  88.63 -> 85.74                Before/After
2%     jetstream2  macosx1015-64-shippable-qr  fission webrender  116.72 -> 114.01              Before/After
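
As a quick sanity check on the ratios, the percentages in the bug title can be recomputed from the absolute values in the table. This is a minimal sketch, assuming jetstream2 is a higher-is-better score (the alert treats the drop as a regression); the function name `percent_regression` is made up for illustration and is not part of Perfherder.

```python
def percent_regression(old, new):
    """Relative drop from old to new, in percent (positive means the score fell)."""
    return (old - new) / old * 100.0


# Absolute values copied from the alert table above.
alerts = {
    "linux1804-64-shippable-qr": (88.63, 85.74),
    "macosx1015-64-shippable-qr": (116.72, 114.01),
}

for platform, (old, new) in alerts.items():
    print(f"{platform}: {percent_regression(old, new):.2f}%")
```

The two results round to 3.26% and 2.32%, matching the range given in the bug summary.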

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the patch(es) may be backed out in accordance with our regression policy.

If you need the profiling jobs, you can trigger them yourself from the Treeherder job view or ask a sheriff to do that for you.

You can run these tests on try with ./mach try perf --alert 329

For more information on performance sheriffing please see our FAQ.

Flags: needinfo?(kshampur)

Set release status flags based on info from the regressing bug 1893066

While Bug 1893066 is expected to have changed some tests' scores, I am not sure this alert is correct, because jetstream2 doesn't use the geometric mean. See e.g. https://searchfox.org/mozilla-central/rev/a44891c52387ca4bd7c35b50f0d335f3980ef36a/testing/raptor/raptor/output.py#1124

:afinder, if it is not too much work, could you double-check this?
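
To illustrate why the choice of summary statistic matters for attributing this alert, here is a minimal sketch showing that a geometric mean and an arithmetic mean of the same subtest scores give different summaries. A change to a geomean helper would therefore only move suites whose summary is actually computed with it. The function names and the score values are hypothetical and are not taken from raptor's output.py.

```python
import math


def geometric_mean(scores):
    """exp of the mean of the logs; requires strictly positive scores."""
    return math.exp(sum(math.log(s) for s in scores) / len(scores))


def arithmetic_mean(scores):
    """Plain average of the scores."""
    return sum(scores) / len(scores)


# Hypothetical subtest scores, invented for illustration only.
scores = [50.0, 100.0, 200.0]

geo = geometric_mean(scores)
ari = arithmetic_mean(scores)
print(f"geometric mean:  {geo:.2f}")
print(f"arithmetic mean: {ari:.2f}")
```

For these made-up scores the geometric mean is 100 while the arithmetic mean is about 116.67, so the same subtest data can yield noticeably different suite scores depending on the summary used.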

Flags: needinfo?(kshampur) → needinfo?(afinder)

Hi Kash! Thanks for the heads up! I retriggered some data points to check for a potential infra issue. I see that the graph had already dropped before the mentioned revision. I will come back with results later today.

Flags: needinfo?(afinder)
Severity: -- → S4
Priority: -- → P5

(In reply to Kash Shampur [:kshampur] ⌚EST from comment #2)

> While Bug 1893066 is expected to have changed some tests' scores, I am not sure this alert is correct, because jetstream2 doesn't use the geometric mean. See e.g. https://searchfox.org/mozilla-central/rev/a44891c52387ca4bd7c35b50f0d335f3980ef36a/testing/raptor/raptor/output.py#1124
>
> :afinder, if it is not too much work, could you double-check this?

Hi Kash! Sorry for the late response on this. I ran a try push with this commit backed out for the mac and linux jobs. In each graph, the highlighted revision is the backout of the "Fix geomean" commit together with the subsequent patch on autoland, and the previous revision is the autoland commit that preceded the "Fix geomean" commit. The backout seems to be more aligned with the baseline of the graph, which suggests that the difference in numbers is valid and not caused by infra. I can rerun the jobs on try without the backout and recheck the results when I'm back from PTO on Wednesday. Please let me know if I'm missing something here.

Flags: needinfo?(afinder)

Thanks for taking a look Alex!

> I can rerun the jobs on try, without the backout and recheck the results

If you think this is a valid regression, then I think it is worth checking.

However, based on what I said in comment 2, I don't think this is valid (or at least, not correctly attributed to bug 1893066, since jetstream does not seem to use the geometric mean).

Looking at the absolute values of the regression and the last 60 days of the graph in comment 0, these don't seem to be out of the ordinary either. But I could be wrong, and maybe another backfill is needed?

What do you think?

Flags: needinfo?(afinder)

Because of the severity and priority rating, I'm marking this fix-optional for current releases.
While we are happy to take a patch, we don't need to keep looking at the bug in weekly regression triage, since the team is aware and has rated it.

(In reply to Kash Shampur [:kshampur] ⌚EST from comment #6)

> Thanks for taking a look Alex!
>
> > I can rerun the jobs on try, without the backout and recheck the results
>
> If you think this is a valid regression, then I think it is worth checking.
>
> However, based on what I said in comment 2, I don't think this is valid (or at least, not correctly attributed to bug 1893066, since jetstream does not seem to use the geometric mean).
>
> Looking at the absolute values of the regression and the last 60 days of the graph in comment 0, these don't seem to be out of the ordinary either. But I could be wrong, and maybe another backfill is needed?
>
> What do you think?

I just resumed looking at this issue after PTO and plan to work on it today. I don't think another backfill would help, since the gap before the culprit commit is already filled. I'm still seeing some signs of an infra issue before the culprit commit on the graph. I'll run the jobs on try again to see if the results fall under the new apparent baseline (this would clearly indicate that it's infra). Another rerun of the jobs with a backout might also help clear this up. If the results are still unclear, we will need to explore other options.

Flags: needinfo?(afinder)

Thanks Alex.
IMO, I would vote to close this as a wontfix. This doesn't seem to be a true regression to me, based on what I said before: looking at the absolute values and comparing current and previous data trends. For example, the current data for mac and linux shows quite a spread, and the reported values in comment 0 are within that range (the same holds if you extend the window to 30 or 60 days).

If you've already started the rechecking process, then please continue. Otherwise, I don't think this is worth putting too much effort into.
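
The "within the spread" argument above can be sketched as a simple range check: take recent data points for a platform and test whether the post-push value falls between their minimum and maximum. This is only an illustration. The `history` values below are invented, not real Perfherder data, and the helper name `within_recent_range` is hypothetical; only the candidate value 85.74 comes from comment 0.

```python
def within_recent_range(history, candidate):
    """Return True if candidate lies inside the min/max spread of history."""
    return min(history) <= candidate <= max(history)


# Invented sample of recent linux scores, for illustration only.
history = [84.9, 86.2, 88.1, 87.4, 85.3, 89.0]

# The "new" linux value reported in comment 0.
candidate = 85.74

print(within_recent_range(history, candidate))
```

A min/max check is deliberately crude: it says nothing about whether the baseline itself shifted, which is what the try pushes with and without the backout are meant to show.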

Just submitted a new try push for linux and mac, which are currently in progress. We should see some results soon.

Flags: needinfo?(afinder)