Open Bug 1706333 Opened 26 days ago Updated 4 days ago

122.22 - 2.73% cnn-ampstories FirstVisualChange / ebay ContentfulSpeedIndex + 22 more (Windows) regression on Fri April 16 2021

Categories

(Firefox :: Theme, defect, P2)


Tracking


Tracking Status
firefox-esr78 --- unaffected
firefox88 --- unaffected
firefox89 --- wontfix
firefox90 --- affected

People

(Reporter: Bebe, Unassigned)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: perf, perf-alert, regression, Whiteboard: [proton-icons] [priority:2b][qf:p1:pageload])

Attachments

(5 files)

Perfherder has detected a browsertime performance regression from push 84499c442e82d5b0d09018e7ca0d943647659e6d. You are the author of one of the patches included in that push, and we need your help to address this regression.

Regressions:

Ratio Suite Test Platform Options Absolute values (old vs new)
122% cnn-ampstories FirstVisualChange windows10-64-shippable-qr cold live webrender 45.00 -> 100.00
85% cnn-ampstories FirstVisualChange windows10-64-shippable-qr cold live webrender 45.00 -> 83.33
27% cnn-ampstories fnbpaint windows10-64-shippable-qr cold live webrender 292.71 -> 372.25
26% cnn-ampstories ContentfulSpeedIndex windows10-64-shippable-qr cold live webrender 120.04 -> 150.83
21% google-slides FirstVisualChange windows10-64-shippable-qr cold webrender 787.42 -> 954.50
19% cnn-ampstories loadtime windows10-64-shippable-qr cold live webrender 492.02 -> 583.58
17% office FirstVisualChange windows10-64-shippable-qr cold webrender 1,068.88 -> 1,252.42
15% office FirstVisualChange windows10-64-shippable-qr cold webrender 1,079.83 -> 1,241.08
14% cnn-ampstories fcp windows10-64-shippable-qr cold live webrender 464.83 -> 530.42
12% office SpeedIndex windows10-64-shippable-qr cold webrender 1,156.17 -> 1,300.08
... ... ... ... ... ...
6% yahoo-mail ContentfulSpeedIndex windows10-64-shippable-qr warm webrender 319.38 -> 340.08
6% instagram LastVisualChange windows10-64-shippable-qr cold webrender 1,329.75 -> 1,413.75
5% google-slides ContentfulSpeedIndex windows10-64-shippable-qr cold webrender 1,523.42 -> 1,592.00
4% office ContentfulSpeedIndex windows10-64-shippable-qr cold webrender 1,523.04 -> 1,581.33
3% ebay ContentfulSpeedIndex windows10-64-shippable-qr cold webrender 902.67 -> 927.33
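For reference, the Ratio column above is just the relative delta between the old and new values, expressed as a percentage. A minimal sketch, checked against the first rows of the table:

```python
def regression_pct(old: float, new: float) -> float:
    """Relative regression as Perfherder reports it: (new - old) / old * 100."""
    return (new - old) / old * 100

# cnn-ampstories FirstVisualChange, 45.00 -> 100.00 (the 122% in the bug title):
print(round(regression_pct(45.00, 100.00), 2))   # → 122.22
# cnn-ampstories fnbpaint, 292.71 -> 372.25:
print(round(regression_pct(292.71, 372.25), 2))  # → 27.17
```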

Improvements:

Ratio Suite Test Platform Options Absolute values (old vs new)
34% twitch FirstVisualChange windows10-64-shippable-qr cold webrender 198.33 -> 130.50
28% youtube dcf windows10-64-shippable warm 1,194.08 -> 862.00
28% youtube dcf windows10-64-shippable-qr warm webrender 1,207.12 -> 874.38
16% netflix LastVisualChange windows10-64-shippable cold 1,580.00 -> 1,320.00
16% instagram loadtime windows10-64-shippable-qr cold webrender 1,550.62 -> 1,308.12
... ... ... ... ... ...
7% youtube LastVisualChange windows10-64-shippable cold 2,786.67 -> 2,583.33

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) will be backed out in accordance with our regression policy.

For more information on performance sheriffing please see our FAQ.

Flags: needinfo?(sfoster)

Sam, before you spend ages setting up stuff to debug this, I'd like to point you to Bug 1704795, which I'm investigating, and Bug 1704463, which Mike is investigating.
Short story: these tests are very sensitive to animation and sizing changes in the chrome toolbox.
I suspect longer or shorter animations are shifting the gfx pipeline around. Of course we can't yet exclude an actual regression without investigation, but be aware of that when investigating.

Another thing that may be useful, which I discovered recently: the -vismet tests only run their calculations over the output of the non-vismet ones. For example, "youtube-vismet" only runs calculations over "youtube". That means if you want to generate a perf profile you must do it on the "youtube" task, and from that same youtube task you can download "browsertime-results.tgz", which contains a video for each test run, so you can also visually compare whether we're doing additional work.
Feel free to ask me or mconley if you have further questions; since we've spent some time on these, we may save you some time.
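As an illustration of the browsertime-results.tgz workflow described above, here is a sketch of listing the per-run video files with Python's tarfile module. The directory layout inside the archive (and the .mp4 suffix) is an assumption; the demo builds a synthetic archive rather than a real browsertime one.

```python
import io
import tarfile

def list_run_videos(tgz_path: str) -> list[str]:
    # Return the video files inside a browsertime-results.tgz, one per test
    # run. The exact layout in a real archive may differ; adjust the suffix
    # filter if the videos are packaged differently.
    with tarfile.open(tgz_path, "r:gz") as tgz:
        return sorted(m.name for m in tgz.getmembers()
                      if m.isfile() and m.name.endswith(".mp4"))

# Demo on a synthetic archive mimicking one video per run:
with tarfile.open("browsertime-results.tgz", "w:gz") as tgz:
    for name in ("youtube/run-1/video-1.mp4", "youtube/run-2/video-1.mp4"):
        data = b"\x00"
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tgz.addfile(info, io.BytesIO(data))

print(list_run_videos("browsertime-results.tgz"))
```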

See Also: → 1704463, 1704795

(In reply to Marco Bonardo [:mak] from comment #1)

Sam, before you spend ages setting up stuff to debug this, I'd like to point you to Bug 1704795 that I'm investigating and Bug 1704463 that Mike is investigating.

Thanks, yeah, I will likely take you up on that offer, as I've not needed to do this before, so this is all new to me! The patch reduced the length of the animations (fewer frames) and substituted new images for old. Unless there's something intrinsic to the new SVGs which caused a perf regression, the timing change seems most plausible. I can probably eliminate the first possibility by chopping down the old SVGs to the new length and measuring that: if the regression is still present, presumably it is just the timing change, and if not, I need to fiddle with the new SVGs themselves.
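For the "chop the old SVGs down to the new length" experiment, a hedged sketch of one way to do it, assuming the icons animate via SMIL `<animate>` elements with a `dur` attribute (an assumption; the real patch may drive the animation differently, e.g. via CSS over sprite frames):

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)

def shorten_animations(svg_text: str, new_dur: str) -> str:
    # Rewrite the dur attribute on every SMIL <animate> element so the old
    # icon animates for the same length as the new one.
    root = ET.fromstring(svg_text)
    for anim in root.iter(f"{{{SVG_NS}}}animate"):
        anim.set("dur", new_dur)
    return ET.tostring(root, encoding="unicode")

# Toy icon with a hypothetical 0.9s animation, trimmed to 0.3s:
svg = (f'<svg xmlns="{SVG_NS}"><circle r="8">'
       '<animate attributeName="r" from="8" to="4" dur="0.9s"/>'
       '</circle></svg>')
print(shorten_animations(svg, "0.3s"))
```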

Blocks: proton
Flags: needinfo?(sfoster)
Whiteboard: [proton-icons]

== Change summary for alert #29733 (as of Mon, 19 Apr 2021 07:03:05 GMT) ==

Regressions:

Ratio Suite Test Platform Options Absolute values (old vs new)
5% Images windows10-64-shippable-qr 7,099,821.25 -> 7,452,324.08
5% Images macosx1015-64-shippable 6,015,146.47 -> 6,306,169.22
3% Images macosx1015-64-shippable-qr tp6 8,523,948.68 -> 8,799,856.30
3% Images macosx1015-64-shippable tp6 8,755,887.19 -> 9,022,129.24

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=29733

Flags: needinfo?(sfoster)

Set release status flags based on info from the regressing bug 1702281

Priority: -- → P2
Whiteboard: [proton-icons] → [proton-icons] [priority:2b]
Attached image amp-regression.gif

Something does appear to be up here. In this side-by-side comparison, the "good" case is on the left, the "bad" case is on the right. Notice how there's a significant delay in the "bad" case before the page load even begins.

I've triggered some profile jobs on the "bad" case to see if we can detect what's going on there.

So here are some profiles:

Before: https://share.firefox.dev/3tJ0k9c
After: https://share.firefox.dev/3xbQMWd

Looking at the whole run, it seems that only the first run in the "bad" case has this delay. And it's really not clear where it's coming from: according to the profile, the networking layer spends 770ms waiting for a response to the page in the bad case, versus 6.4ms in the good case.

So a few thoughts here in the CNN case.

  1. We should probably ask our performance team to dig deeper and see where this initial network lag is coming from. The fact that we only changed icons here might point to some deeper issue with how Firefox starts up and schedules things.
  2. Since this only seems to affect the first network request in this case, I don't think we need to block on this.

I'll look at the other ones now.

Looks like mconley is all over this, so clearing my need-info. Thanks.

Flags: needinfo?(sfoster)

Looks like things load in slightly different order with Office, and "before" gets there just a hair earlier.

Here's an mp4 version of the CNN amp test side by side comparison.

Looks like the first paint comes in a bit later here in the "after" video.

So the general pattern here seems to be that somehow changing the stop/reload animation has made it so that some loads start a little later? I think this is something we should point our platform performance team at. But I don't think this should block MR1.

Whiteboard: [proton-icons] [priority:2b] → [proton-icons] [priority:2b][qf]
Whiteboard: [proton-icons] [priority:2b][qf] → [proton-icons] [priority:2b][qf:p1:pageload]