Closed Bug 1876070 Opened 2 years ago Closed 2 years ago

566.57 - 171.1% cnn loadtime / cnn ContentfulSpeedIndex + 2 more (Android) regression on Wed January 10 2024

Categories

(Firefox for Android :: General, defect, P1)
Platform: All Android
Tracking

Status: RESOLVED FIXED
Target Milestone: 124 Branch
Tracking Status
firefox122 --- unaffected
firefox123 + wontfix
firefox124 --- fixed

People

(Reporter: afinder, Assigned: sparky)

References

(Regression)

Details

(Keywords: perf, perf-alert, regression)

Perfherder has detected a browsertime performance regression from push ffeb819f258e7846454a422eb92ec5cbecbd978c. As the author of one of the patches included in that push, we need your help to address this regression.

Regressions:

| Ratio | Test | Platform | Options | Absolute values (old vs new) | Performance Profiles |
|-------|------|----------|---------|------------------------------|----------------------|
| 567% | cnn loadtime | android-hw-a51-11-0-aarch64-shippable-qr | warm webrender | 1,242.11 -> 8,279.50 | Before/After |
| 346% | cnn LastVisualChange | android-hw-a51-11-0-aarch64-shippable-qr | warm webrender | 2,557.08 -> 11,403.03 | Before/After |
| 274% | cnn SpeedIndex | android-hw-a51-11-0-aarch64-shippable-qr | warm webrender | 1,841.76 -> 6,894.97 | Before/After |
| 171% | cnn ContentfulSpeedIndex | android-hw-a51-11-0-aarch64-shippable-qr | warm webrender | 1,550.60 -> 4,203.72 | Before/After |
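
For reference, the Ratio column is the percentage increase of the new value over the old one; checking against the first row reproduces the 566.57% figure from the bug title:

```python
# Quick check of how the Ratio column is derived from the absolute values above
# (first row of the table). The alert reports the percentage increase over the old value.
old_ms, new_ms = 1242.11, 8279.50                 # cnn loadtime, before -> after the push
increase_pct = (new_ms - old_ms) / old_ms * 100
print(f"{increase_pct:.2f}%")                     # 566.57%, shown rounded as 567% in the table
```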

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the patch(es) may be backed out in accordance with our regression policy.

If you need the profiling jobs, you can trigger them yourself from the Treeherder job view or ask a sheriff to do that for you.

You can run these tests on try with ./mach try perf --alert 41120

For more information on performance sheriffing please see our FAQ.

Flags: needinfo?(towhite)

Set release status flags based on info from the regressing bug 1867431

This is a very confusing report.

  1. The graphs linked from the "Ratio" column of the table show a regression from Jan 18, not the regression from Jan 10 that this bug is about.
  2. The "Performance Profiles" links don't work at all - it looks like these jobs don't support being run with the profiler (this deserves a separate bug)

Furthermore, the regression range doesn't make a lot of sense - it's hard to believe that a string change could affect cnn load time.
The graphs show a "Puppet host configuration management" line before the regression, is there a way to find out what kind of configuration change happened?

Flags: needinfo?(afinder)

Jeff, could you please assign Priority/Severity ratings to this report when you get a chance?

Flags: needinfo?(jboek)
Assignee: nobody → jboek
Severity: -- → S2
Flags: needinfo?(jboek)
Priority: -- → P1

(In reply to Markus Stange [:mstange] from comment #2)

> This is a very confusing report.
>
>   1. The graphs linked from the "Ratio" column of the table show a regression from Jan 18, not the regression from Jan 10 that this bug is about.
>   2. The "Performance Profiles" links don't work at all - it looks like these jobs don't support being run with the profiler (this deserves a separate bug)
>
> Furthermore, the regression range doesn't make a lot of sense - it's hard to believe that a string change could affect cnn load time.
> The graphs show a "Puppet host configuration management" line before the regression; is there a way to find out what kind of configuration change happened?

Hi Markus! Sorry for taking so long to respond.
The confusion in the graphs comes from the fact that, by default, we show the last 14 days from the moment the graph is opened. Here is the graph narrowed down to the culprit mentioned in comment 0.
As for the performance profiles not working, I suspect it's because this test runs on firefox-android, and I think the profiler links only work for desktop tests. Maybe someone from the Profiler team can clarify why the links are not working here.
The yellow infra line you mentioned, "Puppet host configuration management", isn't always indicative of an actual infra change that affects the performance results for these tests. I retriggered the cnn job for revision 38af1c1363aa3; if this is an actual infra regression, we should see the test results increase to match the latest trend change.
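
In other words, the check boils down to the following (a minimal sketch for illustration; the retrigger numbers are hypothetical, while the before/after values are the cnn loadtime figures from the alert table):

```python
# Minimal sketch of the infra-vs-code check described above: retrigger the job on a
# pre-regression revision and see whether the fresh results track the old baseline or
# the new, regressed trend. The retrigger numbers below are hypothetical; the baseline
# and regressed values are the cnn loadtime numbers from the alert table.
from statistics import median

baseline_ms   = [1242.11]                  # before the culprit push (from the alert)
regressed_ms  = [8279.50]                  # after the culprit push (from the alert)
retriggers_ms = [1248.0, 1244.5, 1251.9]   # hypothetical fresh runs on revision 38af1c1363aa3

mid = median(retriggers_ms)
if abs(mid - median(regressed_ms)) < abs(mid - median(baseline_ms)):
    print("Retriggers follow the new trend -> likely an infra change")
else:
    print("Retriggers follow the old baseline -> the regression tracks the pushed code")
```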

Flags: needinfo?(afinder)

Leaving a :ni for myself to check back on the retriggers later in the day.

Flags: needinfo?(afinder)

Jeff, we are nearing the end of the beta cycle. Are we going to do something for 123, or will the fix happen in a later release? Thanks

Flags: needinfo?(jboek)

Pascal, I will dig into this tomorrow and let you know if there is potential for a fix in 123.

Flags: needinfo?(jboek)

Clearing the needinfo, since it looks like this is a valid regression. The metrics for revision 38af1c1363aa3 did not align with the regression range after the retriggers.

Flags: needinfo?(afinder)

> The metrics for revision 38af1c1363aa3 did not align with the regression range after the retriggers.

Alex, does this imply that ffeb819f258e7846454a422eb92ec5cbecbd978c is still the cause of the regression?

Flags: needinfo?(towhite) → needinfo?(afinder)

Can we please get an update on this, Jeff? Fx123 goes to release next week.

Flags: needinfo?(jboek)

Bob, looking at the graphs, it appears this regression was resolved in the last week of January. I don't believe the Fenix commit this was flagged on was the culprit; it was more likely something that came from Gecko through GeckoView.

Alex, is this something we can close?

Flags: needinfo?(jboek)

Alex is currently on PTO until Monday. I'd like to take another look at the alert myself - I retriggered some jobs and am waiting to see the results first; then we can come to a conclusion.

Flags: needinfo?(afinder) → needinfo?(bacasandrei)

Jeff, any chance you could identify the commit from the end of January that would explain the improvement?

Flags: needinfo?(jboek)

Marc said we may have a tool we can use to check these commits and generate a comparison report.

Flags: needinfo?(jboek) → needinfo?(mleclair)

(In reply to twhite from comment #9)

> > The metrics for revision 38af1c1363aa3 did not align with the regression range after the retriggers.
>
> Alex, does this imply that ffeb819f258e7846454a422eb92ec5cbecbd978c is still the cause of the regression?

Yes. I have also retriggered multiple jobs to double-check everything Alex concluded, and the results in the graph do not indicate an infra change.

Flags: needinfo?(bacasandrei)

I looked into this, and bug 1867431 did not cause a regression here.

Bug 1867431's commit changed the mozilla-central revision used for the raptor/browsertime harness from this push: https://treeherder.mozilla.org/jobs?repo=mozilla-central&revision=b29d6ace45f4ca84cd58b15ab4430194755a7056
to this push: https://treeherder.mozilla.org/jobs?repo=mozilla-central&revision=33d47d05c368a7b03c3843f70deb11c351eab0e7

In the new mozilla-central base, there's a patch that changes the tests to use TOML manifests instead of INI manifests: https://hg.mozilla.org/mozilla-central/pushloghtml?changeset=33d47d05c368a7b03c3843f70deb11c351eab0e7

Specifically this one: https://hg.mozilla.org/mozilla-central/rev/d4ff02b92c066b55eaedf705b0bec7eab81178b7

That change caused the metrics for the cnn/cnn-amp tests to shift because they started using the live-site variants. It was then fixed by the patch in bug 1877177, which brought the cnn/cnn-amp metrics back to their previous values since the tests were using the recorded pages again: https://bugzilla.mozilla.org/show_bug.cgi?id=1877177
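
As a side note, here is a minimal sketch of the kind of equivalence check that can help catch this sort of drift when migrating a manifest from INI to TOML; the manifest content and key names below are illustrative placeholders, not the actual raptor/browsertime schema:

```python
# Hypothetical sketch: sanity-checking that an INI manifest and its TOML conversion
# expose the same test sections and keys. The content below is illustrative only and
# does not reproduce the actual raptor/browsertime manifest schema.
import configparser
import tomllib  # Python 3.11+

ini_text = """
[cnn]
test_url = https://www.cnn.com
use_live_sites = false
"""

toml_text = """
[cnn]
test_url = "https://www.cnn.com"
use_live_sites = false
"""

ini = configparser.ConfigParser()
ini.read_string(ini_text)
toml_data = tomllib.loads(toml_text)

ini_view = {section: set(ini[section]) for section in ini.sections()}
toml_view = {section: set(values) for section, values in toml_data.items()}

# A mismatch here is the kind of subtle drift that could change which variant
# of a test (recorded page vs. live site) ends up being scheduled.
print("manifests agree:", ini_view == toml_view)
```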

I've changed the regressing bug to the one that actually caused the regression, and resolved this bug as fixed.

Status: NEW → RESOLVED
Closed: 2 years ago
Regressed by: 1859911
No longer regressed by: 1867431
Resolution: --- → FIXED
Target Milestone: --- → 124 Branch
Assignee: jboek → gmierz2
Depends on: 1877177
Flags: needinfo?(mleclair)