9.77 - 2.2% espn fcp / imdb PerceptualSpeedIndex + 113 more (Android) regression on Wed July 21 2021
Categories
(GeckoView :: General, defect)
Tracking
(firefox-esr78 unaffected, firefox-esr91 unaffected, firefox92 wontfix, firefox93 wontfix, firefox94 wontfix, firefox95 fix-optional)
People
(Reporter: alexandrui, Unassigned)
References
(Regression)
Details
(Keywords: perf, perf-alert, regression, Whiteboard: [geckoview:m93?] [geckoview:2022h2?])
Perfherder has detected a browsertime performance regression from push 2f8bbf2478c7bf6e6f9d586cfa89e30a332a735b. Because you authored one of the patches included in that push, we need your help to address this regression.
Regressions:
Ratio | Suite | Test | Platform | Options | Absolute values (old vs new) |
---|---|---|---|---|---|
10% | espn | fcp | android-hw-g5-7-0-arm7-shippable | cold | 2,316.92 -> 2,543.38 |
10% | espn | fnbpaint | android-hw-g5-7-0-arm7-shippable | cold | 2,326.79 -> 2,553.38 |
9% | allrecipes | dcf | android-hw-g5-7-0-arm7-shippable | warm | 2,031.08 -> 2,218.83 |
9% | allrecipes | fcp | android-hw-g5-7-0-arm7-shippable | warm | 2,070.00 -> 2,257.17 |
9% | allrecipes | fnbpaint | android-hw-g5-7-0-arm7-shippable | warm | 2,089.50 -> 2,276.12 |
9% | espn | FirstVisualChange | android-hw-g5-7-0-arm7-shippable | cold | 2,681.17 -> 2,911.50 |
8% | allrecipes | FirstVisualChange | android-hw-g5-7-0-arm7-shippable | warm | 2,273.21 -> 2,458.83 |
7% | allrecipes | FirstVisualChange | android-hw-g5-7-0-arm7-shippable-qr | warm webrender | 2,178.71 -> 2,341.58 |
7% | booking | loadtime | android-hw-g5-7-0-arm7-shippable-qr | warm webrender | 1,290.40 -> 1,384.71 |
7% | youtube | dcf | android-hw-g5-7-0-arm7-shippable-qr | warm webrender | 840.69 -> 896.71 |
... | ... | ... | ... | ... | ... |
3% | amazon-search | FirstVisualChange | android-hw-g5-7-0-arm7-shippable | warm | 942.75 -> 967.25 |
3% | amazon-search | SpeedIndex | android-hw-g5-7-0-arm7-shippable | warm | 1,011.50 -> 1,036.75 |
2% | imdb | SpeedIndex | android-hw-g5-7-0-arm7-shippable-qr | warm webrender | 2,611.67 -> 2,676.50 |
2% | amazon-search | PerceptualSpeedIndex | android-hw-g5-7-0-arm7-shippable | warm | 1,018.50 -> 1,043.50 |
2% | imdb | PerceptualSpeedIndex | android-hw-g5-7-0-arm7-shippable-qr | warm webrender | 4,258.12 -> 4,351.67 |
Improvements:
Ratio | Suite | Test | Platform | Options | Absolute values (old vs new) |
---|---|---|---|---|---|
3% | booking | fnbpaint | android-hw-p2-8-0-android-aarch64-shippable-qr | warm webrender | 452.29 -> 439.96 |
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) will be backed out in accordance with our regression policy.
For more information on performance sheriffing please see our FAQ.
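For reference, the Ratio column is consistent with a plain relative change between the old and new absolute values. A minimal Python sketch, assuming that is how the ratios are derived (the helper name `pct_change` is illustrative and not part of Perfherder):

```python
def pct_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent (positive = slower)."""
    return (new - old) / old * 100.0


# Values copied from the regression and improvement tables above.
espn_fcp = pct_change(2316.92, 2543.38)
imdb_psi = pct_change(4258.12, 4351.67)
booking_fnbpaint = pct_change(452.29, 439.96)

print(f"espn fcp: {espn_fcp:.2f}%")
print(f"imdb PerceptualSpeedIndex: {imdb_psi:.1f}%")
print(f"booking fnbpaint: {booking_fnbpaint:.1f}%")
```

The first two results line up with the 9.77% and 2.2% bounds in the bug title; the last is negative because it is an improvement.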
Reporter
Comment 1•3 years ago
I'm not sure which of the two bugs caused the regression. :agi, if you know, feel free to leave only the regressing bug. Thanks!
Comment 2•3 years ago
The most likely culprit is https://hg.mozilla.org/integration/autoland/rev/59beb0677c0f
Comment 3•3 years ago
It's interesting that this only seems to affect 32-bit builds, while 64-bit builds are unaffected. I might have screwed up something in the 32-bit PGO build pipeline.
Comment 4•3 years ago
(In reply to Agi Sferro | :agi | ni? for questions | ⏰ PST | he/him from comment #3)
It's interesting that this only seems to affect 32-bit builds, while 64-bit builds are unaffected. I might have screwed up something in the 32-bit PGO build pipeline.
Agi, did you have time to look into this more?
Comment 5•3 years ago
My current theory is that we need to run the ARM PGO profile on an ARM CPU to get the performance back. When I talked to aklotz last month, he mentioned that he believes PGO profiles should be run on actual devices only. I'm going to look into that; it might get us better perf on arm64 too (which is the large majority of our users).
Comment 6•3 years ago
jmaher will get an estimate for running the profile-generate Android job on actual devices for aarch64 and arm.
Comment 7•3 years ago
ok, these jobs take ~27 minutes total to complete (I will assume same runtime on physical phones). Adding ~6 minutes to account for any reboots - I would round up to 35 minutes per job max.
In the last month we have had 719 64-bit jobs and 809 x86 jobs. Accounting for down devices and peak loads, I would round up to 1000 profile runs/day. Over the last month our load has been higher than in the 4 months prior; I assume we are having more pushes because we have fewer PTO days?
Doing the math:
27 minutes @800 runs/day = 15 devices x86_64 and 15 devices x86
35 minutes @1000 runs/day = 24 devices x86_64 and 24 devices x86
What I don't know:
- how long it takes to run on a physical device
- what the reboot/overhead is of the devices
- if there is a reason for higher load in the last month
- if we have other pgo types that are not represented in x86_64 and x86 (new versions upcoming?!?)
I would probably pick between 30 and 45 devices - rough math indicates that 30 devices would be ~$180K/year in infrastructure cost.
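The device estimates above follow from dividing total job minutes per day by the 1,440 minutes in a day. A rough sketch of that arithmetic, assuming each device runs jobs back to back around the clock (the function name is illustrative):

```python
MINUTES_PER_DAY = 24 * 60


def devices_needed(minutes_per_run: float, runs_per_day: float) -> float:
    """Devices required if each one runs jobs back to back all day."""
    return minutes_per_run * runs_per_day / MINUTES_PER_DAY


low = devices_needed(27, 800)     # 21600 job minutes / 1440
high = devices_needed(35, 1000)   # 35000 job minutes / 1440

print(f"27 min @ 800 runs/day:  {low:.1f} devices per architecture")
print(f"35 min @ 1000 runs/day: {high:.1f} devices per architecture")

# Implied per-device cost from the ~$180K/year figure quoted for 30 devices.
print(f"~${180_000 / 30:,.0f} per device per year")
```

The second case works out to a little over 24, so a strict round-up would give 25 devices per architecture.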
Comment 8•2 years ago
(In reply to Joel Maher ( :jmaher ) (UTC -0800) from comment #7)
ok, these jobs take ~27 minutes total to complete (I will assume same runtime on physical phones). Adding ~6 minutes to account for any reboots - I would round up to 35 minutes per job max.
In the last month we have had 719 64-bit jobs and 809 x86 jobs. Accounting for down devices and peak loads, I would round up to 1000 profile runs/day. Over the last month our load has been higher than in the 4 months prior; I assume we are having more pushes because we have fewer PTO days?
Doing the math:
27 minutes @800 runs/day = 15 devices x86_64 and 15 devices x86
35 minutes @1000 runs/day = 24 devices x86_64 and 24 devices x86
What I don't know:
- how long it takes to run on a physical device
- what the reboot/overhead is of the devices
- if there is a reason for higher load in the last month
- if we have other pgo types that are not represented in x86_64 and x86 (new versions upcoming?!?)
I would probably pick between 30 and 45 devices - rough math indicates that 30 devices would be ~$180K/year in infrastructure cost.
Maybe I'm missing something, but don't we only need to run these jobs for mozilla-central (and beta and release) builds? That should only be 8-ish runs per day, not 800-1000.
Comment 9•2 years ago
Good point - I think I overlooked the obvious. Rounding up to 10 to account for blue jobs (that fail in the middle and auto-retry) or higher load.
Given the math, then we have:
35 minutes/run * 10 runs/day = 350 minutes/day.
That is < 1/2 device/day.
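As a quick check of the revised estimate, assuming a device is available for all 1,440 minutes of a day:

```python
MINUTES_PER_DAY = 24 * 60

minutes_per_run = 35   # per-job upper bound from comment 7
runs_per_day = 10      # 8-ish runs, rounded up for retries and load

busy_minutes = minutes_per_run * runs_per_day
device_fraction = busy_minutes / MINUTES_PER_DAY

print(f"{busy_minutes} minutes/day = {device_fraction:.2f} of one device")
```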
Comment 10•2 years ago
Joel, what is the next step for this bug?
In comment 2, Agi said he thinks his change to use an "instrumented build on x86_64" (https://hg.mozilla.org/integration/autoland/rev/59beb0677c0f) caused this page load regression. Is that "instrumented build" used to generate the profile data that is then used for PGO? Would this regression affect real users, or is this only a perf regression for generating the profile data?
The regression in comment 0 (from July 2021) is for android-hw-g5-7-0-arm7-shippable, which I believe we have already retired.
Comment 11•2 years ago
It looks like the change switched from profiling on arm7 -> arm64. That means that we probably optimize well for arm64 and not as well for arm7.
If there is a strong desire to look into this, I would suggest running tip on the a51 phones on the try server, then backing out the root cause and running a second push, then comparing the two to see the difference. The a51s are aarch64, and that is all we run on these days, so quite likely this won't be seen.
If this isn't seen, then we need to determine if it really is arm7 and if arm7 is a real concern for us and the marketplace.
Comment 12•2 years ago
If this isn't seen, then we need to determine if it really is arm7 and if arm7 is a real concern for us and the marketplace.
Thanks. I'll follow up with PM about the priority of arm7 vs arm64 performance.