Closed Bug 1785290 Opened 3 years ago Closed 3 years ago

17.1 - 1.91% wikipedia fcp / twitter PerceptualSpeedIndex + 27 more (Android, Linux, OSX, Windows) regression on Tue August 9 2022

Categories

(Core :: DOM: Navigation, defect)

RESOLVED FIXED
Tracking Status
firefox-esr91 --- unaffected
firefox-esr102 --- unaffected
firefox103 --- unaffected
firefox104 --- unaffected
firefox105 --- disabled

People

(Reporter: afinder, Unassigned)

References

(Regression)

Details

(Keywords: perf, perf-alert, regression)

Perfherder has detected a browsertime performance regression from push 44255b7d9b1c37527cb7ece8306031fff378069b. As you are the author of one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio | Test | Platform | Options | Absolute values (old vs new)
17% | wikipedia fcp | macosx1015-64-shippable-qr | fission warm webrender | 37.90 -> 44.38
12% | google fcp | android-hw-a51-11-0-aarch64-shippable-qr | warm webrender | 225.42 -> 253.42
12% | google loadtime | android-hw-a51-11-0-aarch64-shippable-qr | warm webrender | 228.79 -> 256.88
11% | google-search-restaurants fcp | android-hw-a51-11-0-aarch64-shippable-qr | warm webrender | 288.46 -> 321.46
11% | booking fcp | android-hw-a51-11-0-aarch64-shippable-qr | warm webrender | 294.69 -> 326.50
11% | bing loadtime | android-hw-a51-11-0-aarch64-shippable-qr | warm webrender | 270.50 -> 299.21
10% | bing fcp | android-hw-a51-11-0-aarch64-shippable-qr | warm webrender | 268.08 -> 295.38
10% | reddit fcp | android-hw-a51-11-0-aarch64-shippable-qr | warm webrender | 292.10 -> 321.00
10% | google-search-restaurants FirstVisualChange | android-hw-a51-11-0-aarch64-shippable-qr | warm webrender | 345.92 -> 379.83
9% | google FirstVisualChange | android-hw-a51-11-0-aarch64-shippable-qr | warm webrender | 280.58 -> 306.92
... | ... | ... | ... | ...
5% | paypal loadtime | windows10-64-shippable-qr | fission warm webrender | 570.81 -> 601.04
4% | google LastVisualChange | android-hw-a51-11-0-aarch64-shippable-qr | warm webrender | 690.92 -> 716.42
4% | google-search-restaurants LastVisualChange | android-hw-a51-11-0-aarch64-shippable-qr | warm webrender | 781.33 -> 809.83
2% | twitter loadtime | linux1804-64-shippable-qr | fission warm webrender | 589.98 -> 602.88
2% | twitter PerceptualSpeedIndex | linux1804-64-shippable-qr | fission warm webrender | 596.58 -> 608.00
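
The Ratio column appears to be the relative change between the old and new absolute values. A minimal Python sketch of that arithmetic, assuming the ratio is (new - old) / old; the helper name `regression_ratio` is hypothetical and is not part of Perfherder:

```python
def regression_ratio(old: float, new: float) -> float:
    """Relative change from old to new, as a percentage (assumed formula)."""
    if old == 0:
        raise ValueError("old value must be non-zero")
    return (new - old) / old * 100.0


# (test, old, new) values copied from rows of the regression table.
rows = [
    ("wikipedia fcp", 37.90, 44.38),
    ("google fcp", 225.42, 253.42),
    ("twitter PerceptualSpeedIndex", 596.58, 608.00),
]

for test, old, new in rows:
    print(f"{test}: {regression_ratio(old, new):.2f}%")
```

Under that assumption, the first and last rows reproduce the 17.1% and 1.91% bounds given in the bug summary.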

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) may be backed out in accordance with our regression policy.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

For more information on performance sheriffing please see our FAQ.

Flags: needinfo?(continuation)

Set release status flags based on info from the regressing bug 1746524

Olli, do you have any suggestions about this? There are certainly a lot of regressions, but I don't know how actionable it is. Certainly this will mess up a lot of timings, but I have no idea if there is a real issue I should fix, or what.

Flags: needinfo?(smaug)

I can't recall how we get all those numbers. https://matrix.to/#/#perftest:mozilla.org would be a reasonable place to ask.
Then think about how the page load has changed. I'd need to read the relevant code to see how parent-controlled loads affect page loads.
There is some odd-looking code, like https://searchfox.org/mozilla-central/rev/f3616b887b8627d8ad841bb1a11138ed658206c5/netwerk/ipc/DocumentLoadListener.cpp#2441,2466
Sorry, I'm not too useful here. I've just never read the code for parent-initiated loads very carefully (because it has been disabled).

Flags: needinfo?(smaug)

Getting some profiles with and without the pref set might be useful. ./mach try fuzzy can be used to trigger the relevant vismet tests, and then the treeherder UI has buttons to trigger profile creation, IIRC.

I think we should back out bug 1746524 for now. We're into the soft freeze and it seems unlikely I'll be able to figure out and fix whatever the performance issue is in the next day or two.

Flags: needinfo?(continuation)
Severity: -- → S2

Skimming through the many regressions, it looks more or less like the regression went away when my patch was backed out.

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED

I've been trying to investigate this issue.

The bulk of the issues are with Android, but I've been unable to get Android to work on try. Instead, the job attempted to run for 24 hours and then presumably hit some timeout.

I have managed to reproduce one of the Linux issues that showed up very distinctly. Linux1804-64 shippable browsertime twitter fcp opt fission warm webrender went from 187ms to 203ms (i.e., an increase of 16ms).

Unfortunately, when I enable the profiler, the FCP time with the pref enabled drops from 203ms to 140ms. The score without my patch and the profiler enabled doesn't look much different.

There's also an OSX regression in Wikipedia fcp warm, from 37.90ms to 44.38ms (a difference of only 6.48ms), and it looks quite noisy, so I'm not sure how well I'll be able to investigate, even though I do have an OSX machine.
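
One rough way to judge whether a difference of this size stands out from the noise is to compare it against the spread of the replicates. A minimal sketch using Welch's t statistic follows. The replicate values are made-up placeholders, not measurements from this bug, and `welch_t` is a hypothetical helper rather than anything Perfherder provides:

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(b) - mean(a)) / standard_error


# Placeholder replicates in ms (hypothetical, for illustration only).
baseline = [36.0, 39.5, 38.0, 41.0, 35.0]
patched = [43.0, 47.5, 41.0, 46.0, 44.5]

t = welch_t(baseline, patched)
print(f"mean diff = {mean(patched) - mean(baseline):.2f} ms, t = {t:.2f}")
```

As a rule of thumb, a t value with magnitude well above 2 suggests the shift is larger than run-to-run noise, while a value near 0 suggests more replicates are needed before drawing a conclusion.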

The Windows regressions are paypal fcp, loadtime. The OSX regression is wikipedia fcp. The Linux regressions are twitter fcp, loadtime, PerceptualSpeedIndex.

Then there are a ton of Android regressions: reddit fcp; booking fcp; amazon fcp; bing ContentfulSpeedIndex, fcp, FirstVisualChange, loadtime; google-search-restaurants ContentfulSpeedIndex, fcp, FirstVisualChange, LastVisualChange, loadtime, PerceptualSpeedIndex, SpeedIndex; google ContentfulSpeedIndex, fcp, FirstVisualChange, LastVisualChange, loadtime, PerceptualSpeedIndex, SpeedIndex.

> Unfortunately, when I enable the profiler, the FCP time with the pref enabled drops from 203ms to 140ms. The score without my patch and the profiler enabled doesn't look much different.

Is this done with the try push?

I wonder if the regression is reproducible with just local testing, not even via browsertime, by just loading the page normally. FCP should correlate well with the FirstContentfulComposite marker in the profile.
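
A minimal sketch of how that correlation could be checked once a handful of local loads have been recorded. The numbers below are hypothetical placeholders, and reading the marker time out of each profile is assumed to be done by hand; `pearson` is an illustrative helper:

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical per-load values in ms: the reported FCP and the
# FirstContentfulComposite marker time noted manually from each profile.
fcp_ms = [187.0, 203.0, 195.0, 210.0, 190.0]
marker_ms = [185.0, 200.0, 194.0, 207.0, 189.0]

print(f"r = {pearson(fcp_ms, marker_ms):.3f}")
```

A coefficient close to 1 would support using the marker as a stand-in for FCP when comparing local loads with and without the pref.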

(In reply to Sean Feng [:sefeng] from comment #9)

> > Unfortunately, when I enable the profiler, the FCP time with the pref enabled drops from 203ms to 140ms. The score without my patch and the profiler enabled doesn't look much different.
>
> Is this done with the try push?

Yes.

> I wonder if the regression is reproducible with just local testing, not even via browsertime, by just loading the page normally. FCP should correlate well with the FirstContentfulComposite marker in the profile.

Yeah, my next step here is to see if I can reproduce any of these issues locally. Unfortunately I only have an OSX machine readily available, though I do have an Android phone I guess I can try setting up.

I'm going to see if the Google search regressions show up on OSX desktop when Fission is disabled. Hopefully the Android regressions are mostly due to Fission being disabled and not due to it being Android.
