Closed Bug 1769195 Opened 2 years ago Closed 2 years ago

38.74% reddit-billgates-ama.members ContentfulSpeedIndex (Linux) regression on Thu May 5 2022

Categories

(Core :: Graphics: ImageLib, defect)

Firefox 102
defect

Tracking

RESOLVED FIXED
102 Branch
Tracking Status
firefox-esr91 --- unaffected
firefox100 --- unaffected
firefox101 --- unaffected
firefox102 --- fixed

People

(Reporter: alexandrui, Assigned: tnikkel)

References

(Regression)

Details

(Keywords: perf, perf-alert, regression)

Attachments

(1 file)

Perfherder has detected a browsertime performance regression from push 846e7307e1a3894c84993f4d96d178de6917681e. Since you authored one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio | Test | Platform | Options | Absolute values (old vs new)
39% | reddit-billgates-ama.members ContentfulSpeedIndex | linux1804-64-shippable-qr | cold fission webrender | 299.73 -> 415.83

Improvements:

Ratio | Test | Platform | Options | Absolute values (old vs new)
81% | facebook-nav.marketplace LastVisualChange | macosx1015-64-shippable-qr | cold fission webrender | 6,238.33 -> 1,206.67
80% | facebook-nav.marketplace LastVisualChange | linux1804-64-shippable-qr | cold fission webrender | 6,321.67 -> 1,233.33
80% | facebook-nav.marketplace LastVisualChange | macosx1015-64-shippable-qr | cold fission webrender | 6,240.00 -> 1,230.00
7% | facebook-nav.marketplace ContentfulSpeedIndex | linux1804-64-shippable-qr | cold fission webrender | 1,125.75 -> 1,049.29
5% | facebook-nav.marketplace SpeedIndex | linux1804-64-shippable-qr | cold fission webrender | 1,140.27 -> 1,080.83

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) will be backed out in accordance with our regression policy.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

For more information on performance sheriffing please see our FAQ.

Flags: needinfo?(tnikkel)

Set release status flags based on info from the regressing bug 1231622

Flags: needinfo?(tnikkel) → needinfo?(aionescu)

I suspect this is mis-attributed. I pushed current trunk to try, and then current trunk with bug 1231622 backed out, and retriggered the job in question 5 times. Trunk averaged 436; with the backout, 435.4. There's an infra change marker right before the jump in the graph; maybe that's related?

I also downloaded the browsertime results tgz from before and after the change to look at the videos and see if anything was going on. I didn't notice anything related to this specific bug, but I did notice what seems to be a bug in how we calculate these figures.

It looks like we run the test 10 times. In each video the previous page is showing first, then an "orange div" covers the page. In 2 of the 10 videos the page we are measuring then appears in its very early load state (call this case A), while in the other 8 videos we briefly show the previous page again before switching to the very early load stages of the page in question (call this case B). This difference changes when we determine various events to have occurred: in case A, first visual change happens when a significant amount of content is visible on the page; in case B, first visual change happens when we switch from the previous page to the new page with almost nothing drawn. It affects other events too, like what's visible at SpeedIndex. Not sure if this is known and/or if there is someone/some team that is interested in this finding.
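To make the case A / case B difference concrete, here's a toy model of how a first-visual-change style metric falls out of per-frame differences. This is not browsertime's actual code; baseline, diff, and the threshold are all stand-ins.

def first_visual_change(frames, baseline, diff, threshold=0.01):
    # frames: list of (timestamp_ms, image) pairs sampled from the video.
    # Returns the timestamp of the first frame whose difference from the
    # baseline frame exceeds the threshold.
    for timestamp, image in frames:
        if diff(baseline, image) > threshold:
            return timestamp
    return None

# Case A: orange div -> early load of the page being measured. The first
# frame that trips the threshold already shows a meaningful amount of content.
# Case B: orange div -> brief flash of the previous page -> page being
# measured. The switch away from the previous page trips the threshold while
# almost nothing of the new page is drawn, so FirstVisualChange (and
# everything keyed off it, like what's visible at SpeedIndex) is measured
# from a different point than in case A.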

I'm pretty sure that at one point bug 1763643 improved this, and the backout of bug 1766333 would have caused the reversion to the mean.

The graph is very noisy and, despite the magnitude of the change, the regression is hard to make out around that point. I retriggered f0fda878f51a5 to check the behavior of the infra. If it's not infra, we can close this.

Flags: needinfo?(aionescu)

After doing a bunch of retriggers on autoland it's pretty clear that the graph does go up exactly at bug 1231622.

However, when I push various things to try I only get the higher numbers. Things I've pushed to try: current trunk, current trunk with my patch backed out, and hg update <revid> where revid is one of various revisions well before my changeset landed. So it's impossible to investigate this via try; somehow it gets different numbers than what autoland gets. And I'm comparing against jobs triggered on autoland at the same time as the pushes, so infra changes shouldn't be a factor.

I've done a deep dive into the downloadable browsertime json/videos to try to understand what is going on. In addition to the problem I noted in comment 2, I've also noticed that a page load that is faster in all respects (all visual milestones are reached sooner) can get a larger ContentfulSpeedIndex. In detail, the visual load of the page in question happens in 5 discrete chunks: the skeleton page first shows up, the bg image is drawn, the title image is drawn, the skeleton page goes away, and the actual page content shows up.

Comparing two page loads; each number is the timestamp in ms at which that load hits the milestone:

Milestone                  Load A  Load B
skeleton UI shows up          200     160
bg image drawn                320     320
title image drawn             320     320
skeleton page goes away       480     440
page content shows up         920     840

As you can see, page load B is always as fast as or faster than page load A; however, page load A scores a ContentfulSpeedIndex of 284 while page load B scores 345.

Digging into the browsertime json file, we can find a map from timestamp to ContentfulSpeedIndex percent complete, which looks like it is used to compute the final score. At timestamp 160ms page load A is 52 percent complete (note that page load A hasn't even reached its first visual milestone at that point, but perhaps 160ms just got rounded up to 200ms, so we'll let it pass). Page load B, on the other hand, doesn't even hit 52 percent at its second-to-last visual milestone (skeleton page goes away); it only surpasses 52 percent when it hits 99 percent at 760ms, just before the page content fully shows up at 840ms.

This is not an isolated example. So my trust in this metric is not very high.
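For reference, here is a back-of-the-envelope sketch of how a SpeedIndex-style score is typically derived from that timestamp -> percent-complete map, assuming the usual "area above the completeness curve" definition. The curves below are made up (they are not the actual data from this run), but they have roughly the shape described above and show how an early jump in estimated "contentful" completeness lowers the score even when every visual milestone lands later.

def speed_index(progress):
    # progress: sorted list of (timestamp_ms, percent_complete) samples,
    # ending at 100. The score is the area above the completeness curve:
    # for each interval, (1 - completeness at start of interval) * length.
    score = 0.0
    prev_t, prev_p = 0, 0.0
    for t, p in progress:
        score += (1.0 - prev_p / 100.0) * (t - prev_t)
        prev_t, prev_p = t, p
    return score

# Made-up completeness curves: load A is judged ~52% "contentful" very
# early, load B stays low until just before the content appears.
load_a = [(200, 52), (320, 60), (480, 70), (920, 100)]
load_b = [(160, 10), (320, 20), (440, 30), (760, 99), (840, 100)]

print(speed_index(load_a))  # ~454: scores better (lower) than load B...
print(speed_index(load_b))  # ~625: ...even though load B reaches every
                            # visual milestone at the same time or sooner.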

The draw will be pointless, and it regresses one perf metric.
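For readers following along, a minimal sketch of the guard the patch describes, written in Python purely for brevity; the real change is C++ in ImageLib/layout, and every name below is hypothetical rather than an actual Gecko API.

def maybe_draw_background_image(image, target_region, context):
    # Hypothetical stand-in for the background-image drawing path.
    if image.decoded_pixel_count() == 0:
        # Nothing has been decoded yet, so a partial draw would paint
        # nothing useful; skip the pointless work.
        return
    context.draw(image, image.decoded_region().intersect(target_region))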

Assignee: nobody → tnikkel
Status: NEW → ASSIGNED
Pushed by tnikkel@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/eda72c9d12f1 Don't bother to try to do a partial draw of a background image if we haven't decoded any pixels. r=aosmond
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 102 Branch
Regressions: 1770464

(In reply to Pulsebot from comment #7)

Pushed by tnikkel@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/eda72c9d12f1
Don't bother to try to do a partial draw of a background image if we haven't
decoded any pixels. r=aosmond

== Change summary for alert #34197 (as of Sun, 22 May 2022 00:27:03 GMT) ==

Regressions:

Ratio | Test | Platform | Options | Absolute values (old vs new)
424% | facebook-nav.marketplace LastVisualChange | macosx1015-64-shippable-qr | cold fission webrender | 1,193.33 -> 6,253.33
407% | facebook-nav.marketplace LastVisualChange | linux1804-64-shippable-qr | cold fission webrender | 1,250.00 -> 6,331.67
12% | outlook ContentfulSpeedIndex | windows10-64-shippable-qr | fission warm webrender | 870.75 -> 976.67
6% | facebook-nav.marketplace SpeedIndex | linux1804-64-shippable-qr | cold fission webrender | 1,082.38 -> 1,150.88

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=34197

:alexandrui did you mean to needinfo anyone on the last comment? did the last patch cause further regressions, or are the latest alerts from the original bug?

The huge facebook-nav.marketplace LastVisualChange changes are just the tests going back to normal (the first comment here shows the reverse change), but the tests seem buggy; these patches shouldn't be having that kind of impact on a properly calibrated test.

The facebook-nav.marketplace SpeedIndex is also the test going back to normal (reverse change is in comment 0).

Looking at the graph for outlook ContentfulSpeedIndex, it looks like bug 1231622 caused an improvement equal to the regression here when it landed.

So everything is back to normal here as far as I can tell.

Why these two patches move the numbers here at all I'm not sure; I suspect the tests aren't well calibrated or something.

(In reply to Dave Hunt [:davehunt] [he/him] ⌚GMT from comment #10)

:alexandrui did you mean to needinfo anyone on the last comment? did the last patch cause further regressions, or are the latest alerts from the original bug?

nope, it's just a regression fix.

I looked into the facebook-nav.marketplace LastVisualChange changes that happened here because I was investigating something similar (bug 1771977). When LastVisualChange is around 1 second, it's because we never draw the little chat overlay icon in the bottom right (or we draw it after the browsertime analysis is complete). When LastVisualChange is 6 seconds, we waited until that little chat overlay icon was drawn.
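A toy model of why that makes the metric so bimodal; again, this is not browsertime's actual code, and diff and the threshold are stand-ins.

def last_visual_change(frames, diff, threshold=0.01):
    # frames: list of (timestamp_ms, image) pairs from the recording.
    # Returns the timestamp of the last frame that visibly differs from
    # the frame before it, i.e. the last time anything on screen changed.
    last = frames[0][0]
    for (_, prev), (t, cur) in zip(frames, frames[1:]):
        if diff(prev, cur) > threshold:
            last = t
    return last

# If the chat overlay icon gets painted while the recording is still being
# analyzed, it is the final visible change and LastVisualChange lands around
# 6 s. If it never gets painted (or only after analysis is done), the last
# change is the main content finishing around 1 s, which is why the metric
# clusters around two very different values.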

See Also: → 1773020