Open Bug 1722266 Opened 4 years ago Updated 11 months ago

9.77 - 2.2% espn fcp / imdb PerceptualSpeedIndex + 113 more (Android) regression on Wed July 21 2021

Tracking

(firefox-esr78 unaffected, firefox-esr91 unaffected, firefox92 wontfix, firefox93 wontfix, firefox94 wontfix, firefox95 fix-optional)

Status:

NEW

Tracking Flags:

	Tracking	Status
firefox-esr78	---	unaffected
firefox-esr91	---	unaffected
firefox92	---	wontfix
firefox93	---	wontfix
firefox94	---	wontfix
firefox95	---	fix-optional

People

(Reporter: alexandrui, Unassigned)

References

(Regression)

Details

(Keywords: perf, perf-alert, regression, Whiteboard: [geckoview:m93?] [geckoview:2022h2?])

Alexandru Ionescu (needinfo me) [:alexandrui]

Reporter

Description

•

4 years ago

Perfherder has detected a browsertime performance regression from push 2f8bbf2478c7bf6e6f9d586cfa89e30a332a735b. As you are the author of one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio	Suite	Test	Platform	Options	Absolute values (old vs new)
10%	espn	fcp	android-hw-g5-7-0-arm7-shippable	cold	2,316.92 -> 2,543.38
10%	espn	fnbpaint	android-hw-g5-7-0-arm7-shippable	cold	2,326.79 -> 2,553.38
9%	allrecipes	dcf	android-hw-g5-7-0-arm7-shippable	warm	2,031.08 -> 2,218.83
9%	allrecipes	fcp	android-hw-g5-7-0-arm7-shippable	warm	2,070.00 -> 2,257.17
9%	allrecipes	fnbpaint	android-hw-g5-7-0-arm7-shippable	warm	2,089.50 -> 2,276.12
9%	espn	FirstVisualChange	android-hw-g5-7-0-arm7-shippable	cold	2,681.17 -> 2,911.50
8%	allrecipes	FirstVisualChange	android-hw-g5-7-0-arm7-shippable	warm	2,273.21 -> 2,458.83
7%	allrecipes	FirstVisualChange	android-hw-g5-7-0-arm7-shippable-qr	warm webrender	2,178.71 -> 2,341.58
7%	booking	loadtime	android-hw-g5-7-0-arm7-shippable-qr	warm webrender	1,290.40 -> 1,384.71
7%	youtube	dcf	android-hw-g5-7-0-arm7-shippable-qr	warm webrender	840.69 -> 896.71
...	...	...	...	...	...
3%	amazon-search	FirstVisualChange	android-hw-g5-7-0-arm7-shippable	warm	942.75 -> 967.25
3%	amazon-search	SpeedIndex	android-hw-g5-7-0-arm7-shippable	warm	1,011.50 -> 1,036.75
2%	imdb	SpeedIndex	android-hw-g5-7-0-arm7-shippable-qr	warm webrender	2,611.67 -> 2,676.50
2%	amazon-search	PerceptualSpeedIndex	android-hw-g5-7-0-arm7-shippable	warm	1,018.50 -> 1,043.50
2%	imdb	PerceptualSpeedIndex	android-hw-g5-7-0-arm7-shippable-qr	warm webrender	4,258.12 -> 4,351.67

Improvements:

Ratio	Suite	Test	Platform	Options	Absolute values (old vs new)
3%	booking	fnbpaint	android-hw-p2-8-0-android-aarch64-shippable-qr	warm webrender	452.29 -> 439.96
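
For reference, the Ratio column can be reproduced from the absolute values. The sketch below is not Perfherder's code; it assumes the ratio is simply (new - old) / old, and the helper name `percent_change` is made up for illustration. Under that assumption espn fcp comes out at about 9.77% and imdb PerceptualSpeedIndex at about 2.2%, which matches the range in the bug title.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent.

    For these page-load metrics lower is better, so a positive
    result is a regression and a negative one an improvement.
    """
    return (new - old) / old * 100.0


# (suite, test, old, new) copied from the tables above.
rows = [
    ("espn", "fcp", 2316.92, 2543.38),
    ("imdb", "PerceptualSpeedIndex", 4258.12, 4351.67),
    ("booking", "fnbpaint", 452.29, 439.96),
]

for suite, test, old, new in rows:
    print(f"{suite} {test}: {percent_change(old, new):+.2f}%")
```

The table appears to round these values to whole percentages, so the booking improvement of roughly 2.7% shows up as 3%.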

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) will be backed out in accordance with our regression policy.

For more information on performance sheriffing please see our FAQ.

Flags: needinfo?(agi)

Alexandru Ionescu (needinfo me) [:alexandrui]

Reporter

Comment 1

•

4 years ago

Not sure which of the 2 bugs caused the regression. :agi, feel free to leave only the regressing bug if you know which one it is. Thanks!

[ex-Mozilla] Agi Sferro | :agi

Comment 2

•

4 years ago

The most likely culprit is https://hg.mozilla.org/integration/autoland/rev/59beb0677c0f

Flags: needinfo?(agi)

No longer regressed by: 1709640

BMO Automation

Updated

•

4 years ago

Has Regression Range: --- → yes

[ex-Mozilla] Agi Sferro | :agi

Comment 3

•

4 years ago

It's interesting that this only seems to affect 32-bit builds, while 64-bit builds are unaffected. I might have screwed up something in the 32-bit PGO build pipeline.

Emily Toop (:fluffyemily)

Updated

•

4 years ago

Priority: -- → P2

Whiteboard: [geckoview:m93?]

Steven DeTar [:sdetar]

Updated

•

4 years ago

status-firefox92: affected → wontfix

status-firefox93: --- → affected

status-firefox94: --- → affected

Pascal Chevrel:pascalc

Updated

•

4 years ago

status-firefox93: affected → wontfix

Marco Castelluccio [:marco]

Comment 4

•

4 years ago

(In reply to Agi Sferro | :agi | ni? for questions | ⏰ PST | he/him from comment #3)

It's interesting that this only seems to affect 32-bit builds, while 64-bit builds are unaffected. I might have screwed up something in the 32-bit PGO build pipeline.

Agi, did you have time to look into this more?

status-firefox95: --- → affected

status-firefox-esr78: --- → unaffected

status-firefox-esr91: --- → unaffected

Flags: needinfo?(agi)

[ex-Mozilla] Agi Sferro | :agi

Comment 5

•

4 years ago

•

Edited

My current theory is that we need to run the arm PGO profile on an arm CPU to get the performance back. When I talked to aklotz last month, he mentioned that he believes PGO profiles should be run on actual devices only. I'm going to look into that; it might get us better perf on arm64 too (which is the large majority of our users).

Flags: needinfo?(agi)

[ex-Mozilla] Agi Sferro | :agi

Comment 6

•

4 years ago

jmaher will get an estimate for running the profile-generate Android job on actual devices for aarch64 and arm.

Flags: needinfo?(jmaher)

Joel Maher ( :jmaher ) (UTC -8)

Comment 7

•

4 years ago

ok, these jobs take ~27 minutes total to complete (I will assume same runtime on physical phones). Adding ~6 minutes to account for any reboots - I would round up to 35 minutes per job max.

In the last month we have had 719 64-bit jobs and 809 x86 jobs. Accounting for down devices and peak loads, I would round up to 1000 profile runs/day. Over the last month our load has been higher than in the 4 months prior; I assume we are having more pushes as we have fewer PTO days?

Doing the math:
27 minutes @800 runs/day = 15 devices x86_64 and 15 devices x86
35 minutes @1000 runs/day = 24 devices x86_64 and 24 devices x86

What I don't know:

how long it takes to run on a physical device
what the reboot/overhead is of the devices
if there is a reason for higher load in the last month
if we have other pgo types that are not represented in x86_64 and x86 (new versions upcoming?!?)

I would probably pick between 30 and 45 devices - rough math indicates that 30 devices would be ~$180K/year in infrastructure cost.

Flags: needinfo?(jmaher)

Dianna Smith [:diannaS]

Updated

•

4 years ago

status-firefox94: affected → wontfix

status-firefox95: affected → fix-optional

Chris Peterson [:cpeterson]

Updated

•

3 years ago

Keywords: perf, perf-alert, regression

Whiteboard: [geckoview:m93?] → [geckoview:m93?] [geckoview:2022h2?]

[ex-Mozilla] Agi Sferro | :agi

Comment 8

•

3 years ago

(In reply to Joel Maher ( :jmaher ) (UTC -0800) from comment #7)

ok, these jobs take ~27 minutes total to complete (I will assume same runtime on physical phones). Adding ~6 minutes to account for any reboots - I would round up to 35 minutes per job max.

In the last month we have had 719 64-bit jobs and 809 x86 jobs. Accounting for down devices and peak loads, I would round up to 1000 profile runs/day. Over the last month our load has been higher than in the 4 months prior; I assume we are having more pushes as we have fewer PTO days?

Doing the math:
27 minutes @800 runs/day = 15 devices x86_64 and 15 devices x86
35 minutes @1000 runs/day = 24 devices x86_64 and 24 devices x86

What I don't know:

how long it takes to run on a physical device

what the reboot/overhead is of the devices

if there is a reason for higher load in the last month

if we have other pgo types that are not represented in x86_64 and x86 (new versions upcoming?!?)

I would probably pick between 30 and 45 devices - rough math indicates that 30 devices would be ~$180K/year in infrastructure cost.

Maybe I'm missing something, but don't we only need to run these jobs for mozilla-central (and beta and release) builds? That should only be 8-ish runs per day, not 800-1000.

Joel Maher ( :jmaher ) (UTC -8)

Comment 9

•

3 years ago

Good point, I think I overlooked the obvious. Rounding up to 10 to account for blue jobs (jobs that fail in the middle and auto-retry) or higher load.

Given that math, we have:
35 minutes/run * 10 runs/day = 350 minutes/day.

That is < 1/2 device/day (350 of the 1,440 minutes in a day).
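
The device estimates in comment 7 and the revised figure here follow from the same arithmetic, which can be sketched as below. This is only an illustration: it assumes each device is available around the clock with no queueing or idle gaps, and the names `devices_needed` and `MINUTES_PER_DAY` are made up for this sketch.

```python
MINUTES_PER_DAY = 24 * 60  # 1,440 minutes of device time per day


def devices_needed(minutes_per_run: float, runs_per_day: float) -> float:
    """Device-days of work generated per day for one architecture.

    Assumes a device can run jobs back to back for the full 24 hours.
    """
    return minutes_per_run * runs_per_day / MINUTES_PER_DAY


# (label, minutes per run, runs per day) taken from comments 7 and 9.
scenarios = [
    ("comment 7, measured runtime", 27, 800),
    ("comment 7, padded runtime", 35, 1000),
    ("comment 9, mozilla-central only", 35, 10),
]

for label, minutes, runs in scenarios:
    print(f"{label}: {devices_needed(minutes, runs):.2f} devices")
```

The first two scenarios give 15 and roughly 24.3 devices per architecture, in line with the 15 and 24 quoted in comment 7. The revised scenario needs about a quarter of one device's day.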

BugBot [:suhaib / :marco/ :calixte]

Updated

•

3 years ago

Keywords: regression

Marco Castelluccio [:marco]

Updated

•

3 years ago

Keywords: perf, perf-alert

Chris Peterson [:cpeterson]

Comment 10

•

3 years ago

Joel, what is the next step for this bug?

In comment 2, Agi said he thinks his change to use an "instrumented build on x86_64" (https://hg.mozilla.org/integration/autoland/rev/59beb0677c0f) caused this page load regression. Is that "instrumented build" used to generate the profile data that is then used for PGO? Is this regression affecting real users, or is it only a perf regression for generating the profile data?

The regression in comment 0 (from July 2021) is for android-hw-g5-7-0-arm7-shippable, which I believe we have already retired.

Flags: needinfo?(jmaher)

Joel Maher ( :jmaher ) (UTC -8)

Comment 11

•

3 years ago

It looks like the change switched from profiling on arm7 to profiling on arm64. That means we probably optimize for arm64 and not as well for arm7.

If there is a strong desire to look into this, I would suggest running tip on the a51 phones on the try server, then backing out the root cause and running a second push, then comparing the two to see the difference. The a51s are aarch64, and that is all we run on these days, so quite likely this won't be seen.

If this isn't seen, then we need to determine if it really is arm7 and if arm7 is a real concern for us and the marketplace.

Flags: needinfo?(jmaher)

Chris Peterson [:cpeterson]

Comment 12

•

3 years ago

If this isn't seen, then we need to determine if it really is arm7 and if arm7 is a real concern for us and the marketplace.

Thanks. I'll follow up with PM about the priority of arm7 vs arm64 performance.

Olivia Hall [:olivia]

Updated

•

11 months ago

Priority: P2 → --
