Speedometer scores started to regress on Android around October 15 and then dipped again on October 27: https://treeherder.mozilla.org/perf.html#/graphs?timerange=5184000&series=mozilla-central,1776166,1,10&series=mozilla-central,1775979,1,10 Here is the hg pushlog for the second dip: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=6e96c7ec0d1187c1b488dd4ba645df9cfd68ec16&tochange=7007206a3cd46da7c478e5a9992b608b743a1ff9 Android LTO (bug 1480006) landed in that regression range, but it was also backed out. PLT stubs (bug 1500154) also landed. That change improved page load performance, but perhaps it regressed Speedometer?
Summary: Speedometer regression → GeckoView Speedometer regression on Google Pixel 2 (ARM32 and ARM64)
Keep in mind that a lot has changed to make the tests run again on bitbar (second dip); we will see whether the tooling or the browser is the culprit.
Bob, do you know why Speedometer results from Moto G5 stopped after October 23? I don't know if that was related to the regression on Pixel 2, but they happened around the same time: https://treeherder.mozilla.org/perf.html#/graphs?timerange=5184000&series=mozilla-central,1776166,1,10&series=mozilla-central,1775979,1,10
Priority: -- → P1
:cpeterson, all bitbar testing was broken after an SSL cert was renewed and we had too many redirects accessing artifacts (i.e. bug 1499246). We fixed the issue for the most part, but we turned off all testing for a few days to understand the issue, and ended up updating the docker image that runs the tests and serves the webpages to a new version of Ubuntu with new and updated libraries.
My gut feeling here is that it's related to a11y stuff, since Speedometer does a bunch of form actions. Maybe Eitan has insight?
Does speedometer have any accessibility service active?
Pushing 3 different pushes, I see results similar to the graph: 2 regressions. Since these just ran today in the last 1.5 hours, there were no changes going into the bitbar environment, only changes in the browser code, so I think this is a code change. We can bisect this if needed; it takes a few hours to deal with tree updates, etc., but we could bisect on try if no culprits are identified.
I don't think there is anything specific to a11y. We set preferences:
https://searchfox.org/mozilla-central/source/testing/profiles/common/user.js
https://searchfox.org/mozilla-central/source/testing/profiles/perf/user.js
and the speedometer source is here:
https://searchfox.org/mozilla-central/source/third_party/webkit/PerformanceTests/Speedometer
I am not sure if we have made progress on this. As it would be some work to bisect, I haven't done that, but I would be happy to if we are not sure anything else will fix the problem.
The logs don't suggest accessibility is enabled, so I'm not sure how the a11y changes could have caused this. Can't rule it out, though.
I have done a lot of bisecting on this and cannot come to any conclusions. It is obvious that we have a sustained change, but I see it show up in a smaller range, then I bisect further and it yields no results, as if I am unable to reproduce it.
(In reply to Joel Maher ( :jmaher ) (UTC-4) from comment #10)
> I have done a lot of bisecting on this and cannot come to any conclusions,
> it is obvious that we have a sustained change, but I see it show up in a
> smaller range, then I bisect more and it yields no results, as if I am
> unable to reproduce this.

@ Joel, what is your smaller regression range? Is your range for the first (October 16) or second (October 27) regression? The regressions look more pronounced on android-hw-p2-8-0-android-aarch64 (ARM64), where we don't have Ion JIT yet (bug 1380129). What do you recommend we do next? With the smaller range, perhaps a Gecko developer can try to bisect by running Speedometer locally.

@ Kannan, the range for the first regression (October 16, around [1]) might be [2] or [3]. Range [2] includes a JIT fix for bug 1496847. Could that cause a Speedometer regression on ARM? Range [3] includes a raptor test change for Sunspider Chrome bug 1487374.

[1] https://treeherder.mozilla.org/perf.html#/graphs?timerange=5184000&series=mozilla-central,1776166,1,10&series=mozilla-central,1775979,1,10&zoom=1539315355586.678,1540316452294.6199,15.998876335915556,18.392134762881845
[2] https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=f2e35ed6a6928b35e085462dc4268fd86f00fdb9&tochange=31724aea10cae55f30b825ade226c4d25e11a899
[3] https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=9079bbe837184ed183b133a374753865b6768bc4&tochange=f2e35ed6a6928b35e085462dc4268fd86f00fdb9
See Also: 1500154 →
I had looked at the Oct 27th regression; it seemed that I started narrowing down a range and then wasn't able to reproduce it on a smaller range, and I played whack-a-mole for ~100 try pushes. If you want, I can look into the Oct 16th regression tomorrow; let me know if that is desired.
Remember that when using the bitbar devices we may get a different device each time, which may affect the ability to bisect. It might be possible to manually bisect with a single local device.
Chris: No, that patch should behave strictly as speedup on all platforms.
[Tracking Requested - why for this release]: This is a performance regression on Speedometer, one of our benchmarks. GV 64 Nightly merged to Beta on 2018-10-22, so GV 64 would be affected by the apparent October 15 regression. GV 65 would be affected by both the October 15 and 27 regressions.
Eitan - speedometer is known for doing rapid-fire changing of form element states (creating checkboxes, checking 100 checkboxes at a time, deleting them). If any code is in that processing path in order to support accessibility (maintaining state, perhaps), perhaps it could have a perf impact even if a11y isn't in use at that time. Obviously trying to narrow regression ranges is the primary tool to attack this for now
*Possible* regression range for the first regression: https://hg.mozilla.org/mozilla-central/pushloghtml?startdate=Oct+15+2018&enddate=Oct+16+2018

Note: I'm eyeballing the distribution and saying it regressed late-15th-ish, or *maybe* early 16th. Roughly 17.75 down to 17.0, eyeballing through the noise.
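The eyeballed drop above (roughly 17.75 down to 17.0) can also be located mechanically. A minimal sketch, not part of any existing Perfherder tooling, with a made-up window size and synthetic data; `find_step` is a hypothetical helper:

```python
import statistics

# Rough sketch: locate a step change in a noisy series of Speedometer
# scores by sliding a window along the series and reporting the index
# with the biggest drop between the median of the `window` points
# before it and the median of the `window` points after it.

def find_step(scores, window=5):
    best_idx, best_drop = None, 0.0
    for i in range(window, len(scores) - window + 1):
        before = statistics.median(scores[i - window:i])
        after = statistics.median(scores[i:i + window])
        drop = before - after
        if drop > best_drop:
            best_idx, best_drop = i, drop
    return best_idx, best_drop
```

On a synthetic series that drops from ~17.75 to ~17.0 mid-way, this points at the boundary (give or take a point or two of noise) and reports a drop of about 0.75, matching the eyeballed numbers above.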
Looking at that range: Bug 1498699 was an android-only bug in that range... and Bug 1498715 (unlikely, though). Bug 1479757 was for GV, but may be WebRender-only. Bug 1498329 seems very unlikely. There was a big non-android landing for tables (Bug 1484126). A JIT piece landed at the end of that range which we've discussed: Bug 1496847. I presume this didn't impact desktop. jmaher: thoughts? How does this mesh with the regression-range tests you did, or did you only focus on the second?
:jesup, I only focused on the second. Basically someone could do this process:

foreach revision in range:
    hg update revision
    <apply any patches to fix build or scheduling>
    ./mach try fuzzy -q 'android aarch64 speedometer' --rebuild 5
    <remove any patches to fix build or scheduling>

Sadly this takes a long time for me locally, and the CPU load on my machine is high, so doing this while in a vidyo meeting creates a lot of interference. I did this on the second range and found that it was inconclusive, which happens maybe 10% of the time when I bisect; usually more frequently the longer back in history I go and the more custom patches are needed to get stuff to build or run.
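The per-revision loop above is effectively a binary search over the push range. A minimal sketch of just the search logic; `run_speedometer` is a hypothetical callback (in practice it would be an `hg update`, a try push, and reading back the median score), not real mach/hg tooling:

```python
# Sketch of bisecting an ordered push range as a plain binary search.
# revisions[0] is known fast (good) and revisions[-1] known slow (bad);
# run_speedometer(rev) returns that revision's median Speedometer score.

def first_bad(revisions, run_speedometer, threshold):
    """Return the first revision whose score falls below `threshold`."""
    lo, hi = 0, len(revisions) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if run_speedometer(revisions[mid]) >= threshold:
            lo = mid   # still fast: the regression landed later
        else:
            hi = mid   # already slow: the regression is at or before mid
    return revisions[hi]
```

With 5 rebuilds per push, each probe is expensive, which is why an inconclusive probe (noise straddling the threshold) can send the whole search down the wrong branch, consistent with the whack-a-mole experience described above.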
I'm about to try to bisect this locally by just loading the speedometer page on a dedicated device. That should be good enough, right? My goal is to observe that there *is* in fact a regression, and see if and how it is related to a11y.
Loading speedometer on a device while iterating through builds should show a regression; I would run it at least twice for each build. I did this one day to compare against Google Chrome and different versions of the browser; overall I got a strong signal. As a note, this is on a Pixel 2 device.
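Running each build at least twice matters because of run-to-run noise. A crude sketch of how one might compare two builds' repeated runs; the `noise_factor` of 2 is an arbitrary choice for illustration, not a statistically rigorous test:

```python
import statistics

# Decide whether build B scored meaningfully lower than build A, given
# repeated Speedometer runs of each: the gap in mean scores must clearly
# exceed the run-to-run spread.

def builds_differ(scores_a, scores_b, noise_factor=2.0):
    """Return (differ, gap) for two lists of at least two runs each."""
    gap = statistics.mean(scores_a) - statistics.mean(scores_b)
    # Crude pooled noise estimate: average the per-build stdevs.
    noise = (statistics.stdev(scores_a) + statistics.stdev(scores_b)) / 2
    return gap > noise_factor * noise, gap
```

A ~0.75-point gap like the one in this bug dwarfs typical run-to-run spread of a few hundredths of a point, so even two runs per build would give a strong signal.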
Eitan - yes, dedicated device should be good. You can bisect on m-c merges in that time range (from archive) once you verify there's a difference from (say) 10/14 vs 10/17.
Ok, the Oct 16th regression in speedometer is understood: glandium backed out Android LTO: "Back out changeset 9481b9ecaf2f (bug 1480006) as being a suspect in the rise of crashes such as bug 1489953. " See https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&searchStr=pixel2%2Cspeedometer&selectedJob=211353831&revision=9079bbe837184ed183b133a374753865b6768bc4 I identified the regression looking at m-c pushes with https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&searchStr=pixel2%2Cspeedometer&startdate=2018-10-15&enddate=2018-10-17&selectedJob=211353831, then retriggered the speedometer runs around the regression before noticing this backout (above). So it's just the Oct 27th regression to understand
> ./mach try fuzzy -q 'android aarch64 speedometer' --rebuild 5

So long as I reland the bitbar stuff for each push (and am careful of glandium landing and backing out LTO in this range), it works. So the range is down to 8 changesets: 524c546c7ad5 (good) to ab584824a073 (bad) -- 4 bugs, 8 changesets (5 for 1 bug). One changeset is WebGL only (unlikely...). Two are by glandium and involve possible perf issues, though one looks to be desktop-only, and the other is in graphite(!?) and so seems unlikely:

434f70360933 Mike Hommey — Bug 1492663 - Upgrade most CI builds to clang 7 r=froydnj
02ac5668f849 Mike Hommey — Bug 1498072 - Prevent inlining of the direct_run function. r=jfkthame

The other patches are for 2 bugs:

42961627bc9a Ryan Hunt — Bug 1453425 - Remove mScrollPosAtLastPaint from nsGfxScrollFrame. r=botond (and 4 more patches)
5ef21255897e Jeff Gilbert — Bug 1478216 - Don't init tex images in FBAttachment::IsComplete. r=kvark

Jeff's bug appears to be all WebGL. So if glandium's patches don't pan out as the cause, it's bug 1453425.
Try run with 434f70360933 Mike Hommey — Bug 1492663 is slow, so it has to be that or bug 1498072.

Fast: https://treeherder.mozilla.org/#/jobs?repo=try&selectedJob=211484789&revision=d49d773fe06dad3b08cbbef6c904d2d0fe9c90c5

Slow: https://treeherder.mozilla.org/#/jobs?repo=try&selectedJob=211484789&revision=4d30905771e75c7a3882343c59874c81ce53f40f (includes both patches for the bugs above)

Still running: https://treeherder.mozilla.org/#/jobs?repo=try&revision=0bb91544975ad5911033f09dfca051c3bf3f29af (which has Bug 1498072 - Prevent inlining of the direct_run function, but not the other patch).
Flags: needinfo?(eitan) → needinfo?(mh+mozilla)
> Still running:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=0bb91544975ad5911033f09dfca051c3bf3f29af
> (which has Bug 1498072 - Prevent inlining of the direct_run function, but not the other patch).

That was fast, so it's bug 1492663
Summary: GeckoView Speedometer regression on Google Pixel 2 (ARM32 and ARM64) → GeckoView Speedometer regression on Google Pixel 2 (ARM32 and ARM64) from bug 1492663
(In reply to Randell Jesup [:jesup] from comment #27)
> That was fast, so it's bug 1492663

Thanks for pinpointing the regression, Randell. That's a bummer. There's not a lot we can do to recover from a performance regression caused by upgrading to a new clang version. Did macOS or Linux regress on Speedometer after upgrading to clang 7?

Skimming the clang 7 release notes [1] for any interesting changes:

* -fmerge-all-constants is no longer applied by default because it is a non-conforming optimization and miscompiles the Linux kernel [2]. Maybe we should try re-enabling it?
* The new -fforce-emit-vtables option [3] forces the emission of vtables even in modules where it isn't necessary, which increases opportunities for devirtualization. Forcing the emission of vtables adds a reference to these inline virtual functions. Note that GCC has always done this.

[1] http://releases.llvm.org/7.0.0/tools/clang/docs/ReleaseNotes.html
[2] https://patchwork.ozlabs.org/patch/888487/
[3] https://reviews.llvm.org/rL334600
(In reply to Chris Peterson [:cpeterson] from comment #28)
> (In reply to Randell Jesup [:jesup] from comment #27)
> > That was fast, so it's bug 1492663
>
> Thanks for pinpointing the regression, Randell. That's a bummer. There's not
> a lot we can do to recover from a performance regression caused by upgrading
> to a new clang version. Did macOS or Linux regress on Speedometer after
> upgrading to clang 7?

Slightly improved on linux: https://treeherder.mozilla.org/perf.html#/graphs?timerange=2592000&series=mozilla-inbound,1640394,1,1&series=mozilla-inbound,1641346,1,1&highlightedRevisions=434f70360933

Unchanged on Mac: https://treeherder.mozilla.org/perf.html#/graphs?timerange=2592000&series=mozilla-central,1647998,1,1&highlightedRevisions=434f70360933

Or is it one of those benchmarks where the lower score is better?
(In reply to Mike Hommey [:glandium] from comment #29) > Or is it one of those benchmarks where the lower score is better? No. A higher Speedometer score is better, so Linux did improve slightly.
[qf] to put this regression on the Quantum Flow team's radar. We don't have an action plan at this time.
Summary: GeckoView Speedometer regression on Google Pixel 2 (ARM32 and ARM64) from bug 1492663 → GeckoView Speedometer regression on Google Pixel 2 (ARM32 and ARM64) from clang 7 bug 1492663
Has anyone tried bisecting LLVM trunk to see if this is due to a specific change? It may give us a clue on how to work around the regression or at least tell us who to poke upstream.
Pushed by email@example.com: https://hg.mozilla.org/integration/autoland/rev/386ec4aee534 Switch android arm/aarch64 opt/nightly builds back to clang 6. r=dmajor
Pushed by firstname.lastname@example.org: https://hg.mozilla.org/integration/autoland/rev/3a0b29de3cfb fix clang failures on a CLOSED TREE
Thanks Mike! I see FF64 is marked tracking and affected so I think that means we'll need to do the same thing for beta? (See comment 15)
We can see Speedometer performance restored on Perfherder already: https://treeherder.mozilla.org/perf.html#/graphs?series=mozilla-central,1775979,1,10&series=mozilla-central,1776360,1,10&series=mozilla-central,1776166,1,10 android-hw-p2-8-0-android-aarch64 was impacted the most. aarch64 currently only has Baseline JIT support, so apparently clang 7 was hurting Baseline JIT code more than Ion JIT code. Once we have ARM64 support for Ion (bug 1380129), we can probably switch back to clang 7 (unless we find a clang 7 fix before then).
(In reply to David Bolter [:davidb] (NeedInfo me for attention) from comment #37)
> Thanks Mike! I see FF64 is marked tracking and affected so I think that
> means we'll need to do the same thing for beta? (See comment 15)

This bug covers two Speedometer regressions: one in 64 Nightly on October ~16 and one in 65 Nightly on October 27.

The 65 regression was clang 7 bug 1492663, which has been fixed by reverting to clang 6. As such, I am unneedinfo'ing Mike because I don't think any action is needed from him.

According to Randell's comment 24, the 64 regression appears to be the backout [1] of Android LTO bug 1480006 from the first time Mike tried to land it. So 64 did not actually regress; it was a return to pre-LTO performance. I'll mark 64=unaffected. The good news is that we now know that LTO measurably improved our performance on Android!

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1480006#c12
(In reply to Chris Peterson [:cpeterson] from comment #38) > We can see Speedometer performance restored on Perfherder already: > > https://treeherder.mozilla.org/perf.html#/graphs?series=mozilla-central, > 1775979,1,10&series=mozilla-central,1776360,1,10&series=mozilla-central, > 1776166,1,10 Wait, we have talos for android?!
Taking a step back here, it seems the bug summary is wrong, in that arm32 was not affected. I'm bisecting, and I see no differences there along the way. Only on arm64. What this tells me is that we could leave arm32 alone. And since we're not shipping aarch64... do we really need to keep using clang 6 for it (when it would be only platform using it)?
(In reply to Mike Hommey [:glandium] from comment #40)
> Wait, we have talos for android?!

Not all of Talos. We're running Speedometer, Unity WebGL, and soon the Talos tp6 page load benchmarks using a new test runner (for all Firefox platforms) called Raptor.

(In reply to Mike Hommey [:glandium] from comment #41)
> Taking a step back here, it seems the bug summary is wrong, in that arm32
> was not affected. I'm bisecting, and I see no differences there along the
> way. Only on arm64. What this tells me is that we could leave arm32 alone.
> And since we're not shipping aarch64... do we really need to keep using
> clang 6 for it (when it would be only platform using it)?

sstangl is adding Ion support for arm64 in bug 1380129, which should land in 2019 Q1. In the meantime, we could go back to clang 7 on all platforms, accepting the arm64 regression with the hope that Ion will recover the clang 7 regression. We can't stay on clang 6 forever, but perhaps arm64 would still be faster with clang 6 than clang 7 even after we have Ion support? As a Build peer, the compiler decisions are yours.

If I squint at this Perfherder graph, I think both arm32 and arm64 regressed on October 27, but arm64 was definitely hit harder: https://treeherder.mozilla.org/perf.html#/graphs?timerange=5184000&series=mozilla-central,1775979,1,10&series=mozilla-central,1776360,1,10&series=mozilla-central,1776166,1,10

We don't have Ion JIT support for arm64 yet, which is why arm64 (android-hw-p2-8-0-android-aarch64) is so much slower than arm32 (android-hw-p2-8-0-arm7-api-16) on the same device (Google Pixel 2).