2.35 - 6.31% cnn / ebay-kleinanzeigen-search google / google-maps / instagram (android-hw-g5-7-0-arm7-api-16-shippable, android-hw-p2-8-0-android-aarch64-shippable) regression on push 8692f8ad44b3ca14d0a7ef839af156b170fb5869 (Wed August 26 2020)
Categories
(Core :: JavaScript Engine, defect, P1)
Tracking
()
Release | Tracking | Status |
---|---|---|
firefox-esr68 | --- | unaffected |
firefox-esr78 | --- | unaffected |
firefox80 | --- | unaffected |
firefox81 | --- | unaffected |
firefox82 | --- | wontfix |
People
(Reporter: Bebe, Assigned: tcampbell)
References
(Regression)
Details
(Keywords: perf, perf-alert, regression)
Perfherder has detected a browsertime performance regression from push e87bcd8a1f949b155dff0cc0c7cbad7f4fe03f77. As you are the author of one of the patches included in that push, we need your help to address this regression.
Regressions:
6% ebay-kleinanzeigen-search ContentfulSpeedIndex android-hw-g5-7-0-arm7-api-16-shippable opt cold 2,128.09 -> 2,262.33
6% ebay-kleinanzeigen-search PerceptualSpeedIndex android-hw-g5-7-0-arm7-api-16-shippable opt cold 2,371.36 -> 2,518.25
5% ebay-kleinanzeigen-search SpeedIndex android-hw-g5-7-0-arm7-api-16-shippable opt cold 2,623.30 -> 2,759.83
4% cnn ContentfulSpeedIndex android-hw-g5-7-0-arm7-api-16-shippable opt cold 4,702.83 -> 4,901.17
4% instagram loadtime android-hw-g5-7-0-arm7-api-16-shippable opt cold 2,945.48 -> 3,062.08
4% instagram SpeedIndex android-hw-g5-7-0-arm7-api-16-shippable opt cold 2,870.39 -> 2,982.92
4% google-maps android-hw-g5-7-0-arm7-api-16-shippable opt cold 1,111.72 -> 1,155.08
4% google-maps fcp android-hw-g5-7-0-arm7-api-16-shippable opt cold 1,061.50 -> 1,102.92
4% instagram PerceptualSpeedIndex android-hw-g5-7-0-arm7-api-16-shippable opt cold 1,719.48 -> 1,783.00
4% google-maps SpeedIndex android-hw-g5-7-0-arm7-api-16-shippable opt cold 1,170.48 -> 1,212.67
4% instagram LastVisualChange android-hw-g5-7-0-arm7-api-16-shippable opt cold 3,135.65 -> 3,247.92
3% google-maps FirstVisualChange android-hw-g5-7-0-arm7-api-16-shippable opt cold 1,077.52 -> 1,114.42
2% google Similarity2D android-hw-p2-8-0-android-aarch64-shippable opt cold 0.95 -> 0.93
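The percentages in the list above appear to be the plain relative change between the two reported values. A minimal sketch of that arithmetic, assuming this is how the alert figures are derived (Perfherder's actual computation may differ, and `percent_change` is a hypothetical helper, not a Perfherder API):

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Values copied from the alert above. For these metrics lower is better,
# so a positive change is a regression.
alerts = {
    "ebay-kleinanzeigen-search ContentfulSpeedIndex": (2128.09, 2262.33),
    "instagram loadtime": (2945.48, 3062.08),
    "google-maps fcp": (1061.50, 1102.92),
}

for name, (before, after) in alerts.items():
    print(f"{name}: {percent_change(before, after):+.2f}%")
```

Under that assumption, the ebay-kleinanzeigen-search ContentfulSpeedIndex row works out to roughly the 6.31% upper bound quoted in the bug title.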
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) will be backed out in accordance with our regression policy.
For more information on performance sheriffing please see our FAQ.
Comment 1 (Reporter) • 4 years ago
== Change summary for alert #26885 (as of Tue, 08 Sep 2020 11:57:55 GMT) ==
Regressions:
180% raptor-tp6-sheets-firefox-cold loadtime macosx1014-64-shippable opt 2,712.58 -> 7,583.75
84% raptor-tp6-sheets-firefox-cold loadtime macosx1014-64-shippable opt 3,848.96 -> 7,068.92
31% raptor-tp6-sheets-firefox-cold macosx1014-64-shippable opt 1,251.10 -> 1,640.50
7% raptor-tp6-netflix-firefox-cold loadtime linux64-shippable-qr opt webrender 795.67 -> 847.42
6% raptor-tp6-netflix-firefox-cold loadtime linux64-shippable opt 766.38 -> 808.67
3% raptor-tp6-office-firefox-cold fcp linux64-shippable-qr opt webrender 968.85 -> 994.25
3% raptor-tp6-yahoo-mail-firefox-cold loadtime linux64-shippable-qr opt webrender 1,073.00 -> 1,100.25
2% raptor-tp6-office-firefox-cold fcp linux64-shippable opt 940.85 -> 963.42
2% raptor-tp6-sheets-firefox-cold confidence macosx1014-64-shippable opt 95.15 -> 93.17
Improvements:
4% raptor-tp6-facebook-redesign-firefox-cold loadtime linux64-shippable opt 1,941.85 -> 1,866.17
For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=26885
Comment 2 • 4 years ago
Set release status flags based on info from the regressing bug 1660798
Comment 3 (Assignee) • 4 years ago
We are investigating some ideas here but believe these are addressed by Bug 1662102 (which requires the regressing patch). We will update with more data tomorrow.
Comment 4 (Assignee) • 4 years ago
The macosx1014 tp6-sheets regressions are very bimodal and hardware dependent. When I toggle the pref for Bug 1662102 (which is not entirely complete), I see the regression go away. Looking at the "replicates" view, I still see some bimodal behaviour. It is unclear whether the final version of Bug 1662102 will make things even better than before or whether there are other aspects of the test that make it very inconsistent.
Comment 5 (Assignee) • 4 years ago
Update: I've filed Bug 1664312 to fix a specific detail that affects the raptor-tp6-netflix-firefox-cold case (and probably some of the others).
Comment 6 • 4 years ago
I'm trying to understand the status here, looking for help on the following:
- is this code riding the trains or is it preffed off somehow?
- did bug 1664312 and/or bug 1664189 make a difference here, and if yes what's the remaining gap? I see 1664312 itself triggered a new alert and that is planned to be addressed in bug 1666274
- what is the plan for 82 now that it is on beta?
Comment 7 (Assignee) • 4 years ago
[firefox82]
Baseline Revision:
ParserAtoms - Bug 1660798
- https://hg.mozilla.org/integration/autoland/pushloghtml?changeset=2258849b0ee1e5037093ade2d648b0c8a2d04286
- Introduces regressions:
- (A) Handful of Android vismet tests have 2-6% page load regressions
- Very noisy and difficult to pinpoint changes without 20-40 retries.
- (B) OSX raptor-tp6-sheets-firefox-cold loadtime regression
- This test has discrete results of 2.5, 8.0, 12.0s page load.
- Reported number is a blend of these depending on precise GC timing, which is highly machine-dependent.
- This regression changed the balance of how often we hit the fastpath.
- https://treeherder.mozilla.org/perf.html#/comparesubtestdistribution?originalProject=autoland&newProject=autoland&originalRevision=f0ec22dea9acbd01f0a20a31805ad27952b3f538&newRevision=e87bcd8a1f949b155dff0cc0c7cbad7f4fe03f77&originalSubtestSignature=2134364&newSubtestSignature=2134364
- (C) Linux64 raptor-tp6-netflix-firefox-cold loadtime regression (6-7%)
- (D) Linux64 raptor-tp6-office-firefox cold fcp regression (2-3%)
- (E) Linux64 raptor-tp6-yahoo-mail-firefox cold loadtime regression (2%)
Bug 1649968, Bug 1658556, Bug 1660699, Bug 1658720, Bug 1660891, Bug 1661098, Bug 1661079, Bug 1659595, Bug 1662374, Bug 1658631, Bug 1658971
- https://hg.mozilla.org/integration/autoland/pushloghtml?changeset=f15e01260d1c0c06467bc2a81195e538d371890f
- https://hg.mozilla.org/integration/autoland/pushloghtml?changeset=b2665585f2a56f8743f1105296f5f6c5834c2496
- https://hg.mozilla.org/integration/autoland/pushloghtml?changeset=bea475748ad2ed90e68fc86b0afcc1dfd29c0833
- https://hg.mozilla.org/integration/autoland/pushloghtml?changeset=4a8df2bc77137574c578e2f5056c12c6dac0deeb
- Follow-up work depending on ParserAtoms changes.
- Browsertime update changes the definition of these metrics so graphs
are hard to follow. Also got backed out a couple times.
- https://hg.mozilla.org/integration/autoland/pushloghtml?changeset=691a86eef68649b434b921080f5b55960b3c9f91
- Minor perf fix for ParserAtoms.
- The exact impact is hard to determine in the noise.
- https://hg.mozilla.org/integration/autoland/pushloghtml?changeset=b8e17781c000be1bcec132ae37457589de2f87af
- Reduces regression (C) to < 2%
- Introduces regressions:
- (F) 150kB Base Content Heap Unclassified memory regression
[firefox83]
- To be landed.
- Eliminates memory regression (F).
- Could be uplifted to 82 although patch is non-trivial.
- To be landed in FF83.
- Eliminates regression (B) by removing some non-determinism.
- The code needed to resolve regression is in FF82, but pref is not flipped.
- A handful of people have been running with the pref flipped for
weeks but it seems risky to flip on in Beta.
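To illustrate regression (B): if each run lands on one of the discrete load times (2.5, 8.0 or 12.0s), the reported number is a weighted blend, so a shift in how often the fast path is hit moves the average a lot even though no individual mode got slower. A minimal sketch; the hit rates below are hypothetical and chosen only for illustration, they are not measured data from this bug:

```python
def blended_mean(modes, weights):
    """Weighted average of discrete outcomes (weights need not sum to 1)."""
    total = sum(weights)
    return sum(m * w for m, w in zip(modes, weights)) / total


# Discrete page-load times in seconds, as described for regression (B).
modes_s = [2.5, 8.0, 12.0]

# HYPOTHETICAL hit rates for each mode before and after the change.
before = blended_mean(modes_s, [0.90, 0.08, 0.02])
after = blended_mean(modes_s, [0.60, 0.30, 0.10])

print(f"before: {before:.2f}s, after: {after:.2f}s")
```

With these made-up rates the blended value rises from about 3.1s to about 5.1s purely because the fast path is hit less often, which is the shape of effect described for the sheets test.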
Comment 8 (Assignee) • 4 years ago
(In reply to Julien Cristau [:jcristau] from comment #6)
> I'm trying to understand the status here, looking for help on the following:
> - is this code riding the trains or is it preffed off somehow?
The regressing "ParserAtoms" patch is currently riding trains and there isn't a pref for it.
> - did bug 1664312 and/or bug 1664189 make a difference here, and if yes what's the remaining gap? I see 1664312 itself triggered a new alert and that is planned to be addressed in bug 1666274
Some of the regressions such as netflix load are down to < 2%.
Android data is too noisy to tell one way or another. I can try to simulate changes with artificial stacks that only have relevant changes and ignore anything else in 82 (which seems to have affected the numbers but not enough to trigger alerts). This is probably about 1000 raptor jobs on the android hardware to get through the noise and look at the various test pages so it would likely take a few to get results.
> - what is the plan for 82 now that it is on beta?
I'm not sure what the least worst thing is here. It isn't really feasible to revert at this point (12 bugs would need to go, and another 5-10 would need merge conflicts resolved).
Regressions:
(A) Hard to tell from data what current impact is. The ebay one is most concerning but I can't tell from the graphs if Bug 1664312 helped or it is just noise.
(B) This seems like an unlucky test case. It only appears on a single platform and the underlying results between retries have 4x variation. I'm not sure there is anything to do for 82. In 83 the results will bounce back (and early data suggests they will be even better).
(C) This is helped a lot by Bug 1664312 and the graph shows we are now better than before.
(D,E) These are noisy and may have been helped a little by Bug 1664312. I don't think there is much to do in 82. Further tuning in 83 will happen.
(F) There is a possible uplift of medium-complexity that will get us back the 150kB base content overhead. Since fission is not shipping by default yet, I'm leaning towards not uplifting at this point. We can re-evaluate at end of this week after it has baked on nightly though.
Comment 9 • 4 years ago
Bug 1666282, Bug 1666274 are now uplifted to 82.
Comment 10 • 4 years ago
I guess this is as fixed as it's going to be for 82 at least.
Comment 11 (Assignee) • 4 years ago
We've been doing a lot of perf work in this area and it is hard to compare directly against these old versions, so I will close this bug. I think we've fixed the main issues identified by the alert.