Closed Bug 1663665 Opened 1 year ago Closed 11 months ago

2.35 - 6.31% cnn / ebay-kleinanzeigen-search / google / google-maps / instagram (android-hw-g5-7-0-arm7-api-16-shippable, android-hw-p2-8-0-android-aarch64-shippable) regression on push 8692f8ad44b3ca14d0a7ef839af156b170fb5869 (Wed August 26 2020)

Categories

(Core :: JavaScript Engine, defect, P1)

Firefox 82
defect

Tracking

()

RESOLVED FIXED
Tracking Status
firefox-esr68 --- unaffected
firefox-esr78 --- unaffected
firefox80 --- unaffected
firefox81 --- unaffected
firefox82 --- wontfix

People

(Reporter: Bebe, Assigned: tcampbell)

References

(Regression)

Details

(Keywords: perf, perf-alert, regression)

Perfherder has detected a browsertime performance regression from push e87bcd8a1f949b155dff0cc0c7cbad7f4fe03f77. Since you are the author of one of the patches included in that push, we need your help to address this regression.

Regressions:

6% ebay-kleinanzeigen-search ContentfulSpeedIndex android-hw-g5-7-0-arm7-api-16-shippable opt cold 2,128.09 -> 2,262.33
6% ebay-kleinanzeigen-search PerceptualSpeedIndex android-hw-g5-7-0-arm7-api-16-shippable opt cold 2,371.36 -> 2,518.25
5% ebay-kleinanzeigen-search SpeedIndex android-hw-g5-7-0-arm7-api-16-shippable opt cold 2,623.30 -> 2,759.83
4% cnn ContentfulSpeedIndex android-hw-g5-7-0-arm7-api-16-shippable opt cold 4,702.83 -> 4,901.17
4% instagram loadtime android-hw-g5-7-0-arm7-api-16-shippable opt cold 2,945.48 -> 3,062.08
4% instagram SpeedIndex android-hw-g5-7-0-arm7-api-16-shippable opt cold 2,870.39 -> 2,982.92
4% google-maps android-hw-g5-7-0-arm7-api-16-shippable opt cold 1,111.72 -> 1,155.08
4% google-maps fcp android-hw-g5-7-0-arm7-api-16-shippable opt cold 1,061.50 -> 1,102.92
4% instagram PerceptualSpeedIndex android-hw-g5-7-0-arm7-api-16-shippable opt cold 1,719.48 -> 1,783.00
4% google-maps SpeedIndex android-hw-g5-7-0-arm7-api-16-shippable opt cold 1,170.48 -> 1,212.67
4% instagram LastVisualChange android-hw-g5-7-0-arm7-api-16-shippable opt cold 3,135.65 -> 3,247.92
3% google-maps FirstVisualChange android-hw-g5-7-0-arm7-api-16-shippable opt cold 1,077.52 -> 1,114.42
2% google Similarity2D android-hw-p2-8-0-android-aarch64-shippable opt cold 0.95 -> 0.93
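
For readers checking the numbers: the percentages in the list above look like plain relative changes between the "before" and "after" values. The sketch below is a minimal illustration under that assumption; it is not Perfherder's own code, and `percent_change` is a hypothetical helper name.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent.

    Assumption: the alert percentages are simple relative changes of the
    two values shown in each line. Perfherder's exact computation may differ.
    """
    if before == 0:
        raise ValueError("before must be non-zero")
    return (after - before) / before * 100.0


# Values copied from the first entry in the list above
# (ebay-kleinanzeigen-search ContentfulSpeedIndex, 2,128.09 -> 2,262.33).
ebay_csi = percent_change(2128.09, 2262.33)
print(f"{ebay_csi:.2f}%")  # prints "6.31%"
```

For timing metrics a positive change is a regression; for a metric such as Similarity2D, where higher appears to be better, the sign has to be read the other way.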

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) will be backed out in accordance with our regression policy.

For more information on performance sheriffing please see our FAQ.

Component: Performance → JavaScript Engine
Flags: needinfo?(kvijayan)
Product: Testing → Core

== Change summary for alert #26885 (as of Tue, 08 Sep 2020 11:57:55 GMT) ==

Regressions:

180% raptor-tp6-sheets-firefox-cold loadtime macosx1014-64-shippable opt 2,712.58 -> 7,583.75
84% raptor-tp6-sheets-firefox-cold loadtime macosx1014-64-shippable opt 3,848.96 -> 7,068.92
31% raptor-tp6-sheets-firefox-cold macosx1014-64-shippable opt 1,251.10 -> 1,640.50
7% raptor-tp6-netflix-firefox-cold loadtime linux64-shippable-qr opt webrender 795.67 -> 847.42
6% raptor-tp6-netflix-firefox-cold loadtime linux64-shippable opt 766.38 -> 808.67
3% raptor-tp6-office-firefox-cold fcp linux64-shippable-qr opt webrender 968.85 -> 994.25
3% raptor-tp6-yahoo-mail-firefox-cold loadtime linux64-shippable-qr opt webrender 1,073.00 -> 1,100.25
2% raptor-tp6-office-firefox-cold fcp linux64-shippable opt 940.85 -> 963.42
2% raptor-tp6-sheets-firefox-cold confidence macosx1014-64-shippable opt 95.15 -> 93.17

Improvements:

4% raptor-tp6-facebook-redesign-firefox-cold loadtime linux64-shippable opt 1,941.85 -> 1,866.17

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=26885

We are investigating some ideas here but believe these are addressed by Bug 1662102 (which requires the regressing patch). We will update with more data tomorrow.

The macosx1014 tp6-sheets regressions are very bimodal and hardware dependent. When I toggle the pref for Bug 1662102 (which is not entirely complete), I see the regression go away. Looking at the "replicates" view, I still see some bimodal behaviour. It is unclear whether the final version of Bug 1662102 will make things even better than before or whether there are other aspects of the test that make it very inconsistent.
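
One rough way to see bimodal behaviour in a set of replicates is to split the sorted values at their widest gap and compare the two groups. The sketch below only illustrates that idea; the function names are hypothetical and the sample values are made up, not taken from the alert.

```python
from statistics import mean


def split_at_largest_gap(replicates):
    """Split the sorted replicates into two groups at the widest gap."""
    ordered = sorted(replicates)
    if len(ordered) < 2:
        raise ValueError("need at least two replicates")
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]


def mode_ratio(replicates):
    """Ratio of the slow group's mean to the fast group's mean."""
    low, high = split_at_largest_gap(replicates)
    return mean(high) / mean(low)


# Made-up load times in ms, for illustration only (NOT real test data).
sample = [2600, 2700, 2750, 7400, 7500, 7600]
low, high = split_at_largest_gap(sample)
print("fast runs:", low)
print("slow runs:", high)
print("slow/fast ratio:", round(mode_ratio(sample), 2))
```

A ratio far above 1, combined with tight groups on either side of the gap, suggests that the mean is mostly tracking how many runs landed in the slow mode rather than a uniform slowdown.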

Depends on: 1664312

Update: I've filed Bug 1664312 to fix a specific detail that affects the raptor-tp6-netflix-firefox-cold case (and probably some of the others).

Assignee: nobody → tcampbell
Flags: needinfo?(kvijayan)
Severity: -- → S3
Priority: -- → P1
Depends on: 1664189
See Also: → 1609486

I'm trying to understand the status here, looking for help on the following:

  • is this code riding the trains or is it preffed off somehow?
  • did bug 1664312 and/or bug 1664189 make a difference here, and if yes what's the remaining gap? I see 1664312 itself triggered a new alert and that is planned to be addressed in bug 1666274
  • what is the plan for 82 now that it is on beta?
Flags: needinfo?(tcampbell)

[firefox82]

Baseline Revision:

ParserAtoms - Bug 1660798

Bug 1649968, Bug 1658556, Bug 1660699, Bug 1658720, Bug 1660891, Bug 1661098, Bug 1661079, Bug 1659595, Bug 1662374, Bug 1658631, Bug 1658971

Bug 1518999

  • Browsertime update changes the definition of these metrics so graphs
    are hard to follow. Also got backed out a couple times.

Bug 1664189

Bug 1664312

[firefox83]

Bug 1666282, Bug 1666274

  • To be landed.
  • Eliminates memory regression (F).
  • Could be uplifted to 82 although patch is non-trivial.

Bug 1662102

  • To be landed in FF83.
  • Eliminates regression (B) by removing some non-determinism.
    • The code needed to resolve regression is in FF82, but pref is not flipped.
    • A handful of people have been running with the pref flipped for
      weeks but it seems risky to flip on in Beta.

(In reply to Julien Cristau [:jcristau] from comment #6)

I'm trying to understand the status here, looking for help on the following:

  • is this code riding the trains or is it preffed off somehow?

The regressing "ParserAtoms" patch is currently riding trains and there isn't a pref for it.

  • did bug 1664312 and/or bug 1664189 make a difference here, and if yes what's the remaining gap? I see 1664312 itself triggered a new alert and that is planned to be addressed in bug 1666274

Some of the regressions such as netflix load are down to < 2%.
Android data is too noisy to tell one way or another. I can try to simulate changes with artificial stacks that only have relevant changes and ignore anything else in 82 (which seems to have affected the numbers but not enough to trigger alerts). This is probably about 1000 Raptor jobs on the Android hardware to get through the noise and look at the various test pages, so it would likely take a few to get results.
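
To give a feel for why small regressions on noisy hardware need so many runs, here is a textbook two-sample power calculation. It is a sketch only and is not how the estimate above was derived; the function name and the noise and effect figures are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def runs_per_arm(rel_noise, rel_effect, alpha=0.05, power=0.8):
    """Approximate runs needed in EACH arm (baseline and patched).

    Standard normal-approximation formula for a two-sided, two-sample
    comparison of means: n = 2 * ((z_alpha + z_beta) * sigma / delta) ** 2.
    `rel_noise` (run-to-run standard deviation) and `rel_effect` (change to
    detect) are both fractions of the mean, e.g. 0.10 for 10%.
    """
    if rel_noise <= 0 or rel_effect <= 0:
        raise ValueError("rel_noise and rel_effect must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * rel_noise / rel_effect) ** 2)


# Hypothetical figures: 10% run-to-run noise, 2% change to detect.
print(runs_per_arm(0.10, 0.02))
```

The required count grows with the square of noise divided by effect, so halving the change to detect roughly quadruples the runs, and the total multiplies again across test pages and platforms.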

  • what is the plan for 82 now that it is on beta?

I'm not sure what the least worst thing is here. It isn't really feasible to revert at this point (12 bugs would need to go, and another 5-10 would need merge conflicts resolved).
Regressions:
(A) Hard to tell from data what current impact is. The ebay one is most concerning but I can't tell from the graphs if Bug 1664312 helped or it is just noise.
(B) This seems like a test case hitting bad luck. It only appears on a single platform and the underlying results between retries have 4x variation. I'm not sure there is anything to do for 82. In 83 the results will bounce back (and early data suggests they will be even better).
(C) This is helped a lot by Bug 1664312 and the graph shows we are now better than before.
(D,E) These are noisy and may have been helped a little by Bug 1664312. I don't think there is much to do in 82. Further tuning in 83 will happen.
(F) There is a possible uplift of medium-complexity that will get us back the 150kB base content overhead. Since fission is not shipping by default yet, I'm leaning towards not uplifting at this point. We can re-evaluate at end of this week after it has baked on nightly though.

Flags: needinfo?(tcampbell)

Bug 1666282, Bug 1666274 are now uplifted to 82.

I guess this is as fixed as it's going to be for 82 at least.

We've been doing a lot of perf work in this area and it is hard to compare directly against these old versions, so I will close this bug. I think we've fixed the main issues identified by the alert.

Status: NEW → RESOLVED
Closed: 11 months ago
Resolution: --- → FIXED
See Also: → 1665095