Closed
Bug 1429319
Opened 7 years ago
Closed 7 years ago
2.05 - 36.55% Multiple platform_microbenchmark test (windows10-64, windows7-32) regressions on push 3ede11fe526eed5f34040399dfaed3af8f1e7c71 (Thu Jan 4 2018)
Categories
(Infrastructure & Operations :: SRE, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: igoldan, Unassigned)
References
Details
(Keywords: perf, regression)
We have detected a platform microbenchmarks regression from push:
https://hg.mozilla.org/integration/autoland/pushloghtml?changeset=3ede11fe526eed5f34040399dfaed3af8f1e7c71
Since you are the author of one of the patches included in that push, we need your help to address this regression.
Regressions:
37% Strings PerfStripWhitespace windows10-64 opt 116,022.00 -> 158,428.00
13% Stylo Gecko_nsCSSParser_ParseSheet_Bench windows7-32 opt 73,231.21 -> 82,789.25
12% Stylo Servo_StyleSheet_FromUTF8Bytes_Bench windows7-32 opt 71,707.29 -> 80,128.25
11% Stylo Servo_StyleSheet_FromUTF8Bytes_Bench windows10-64 opt 63,095.21 -> 70,302.64
11% Stylo Gecko_nsCSSParser_ParseSheet_Bench windows10-64 opt 60,969.83 -> 67,773.71
10% Strings PerfStripCRLF windows10-64 opt 83,435.83 -> 91,606.42
2% TestStandardURL NormalizePerf windows7-32 opt 73,572.17 -> 75,078.83
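The percentages in the list above can be reproduced from the before and after values. The snippet below is a minimal sketch, assuming the alert reports the relative change against the old value and that higher values are worse (the alert labels increases as regressions). The helper name `percent_change` is illustrative and is not part of any Perfherder API.

```python
def percent_change(old, new):
    """Relative change from old to new, as a percentage of the old value."""
    return (new - old) / old * 100.0


# (test, platform, old value, new value), copied from the alert above.
alerts = [
    ("Strings PerfStripWhitespace", "windows10-64", 116022.00, 158428.00),
    ("TestStandardURL NormalizePerf", "windows7-32", 73572.17, 75078.83),
]

for name, platform, old, new in alerts:
    print(f"{name} {platform}: {percent_change(old, new):.2f}%")
```

The two rows shown are the largest and smallest regressions, which correspond to the 2.05 - 36.55% range in the bug title.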
You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=11073
On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the jobs in a pushlog format.
To learn more about the regressing test(s), please see: https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Automated_Performance_Testing_and_Sheriffing/Platform_Microbenchmarks
Comment 1 (Reporter) • 7 years ago
I have been investigating these regressions for about a week now. They don't appear to be related to any in-tree changes.
I retriggered all of the tests above on older changesets, some dating back to December 8th. The new values closely match the regression levels above, so this looks like an infrastructure change.
The same behavior also shows up in the build time metrics, for which :gps suggested two possible causes:
* TaskCluster rolling out new AMI (which is slower for some reason). grenade: did you roll out anything last week?
* Spectre and Meltdown patches being applied by AWS. We know the mitigations will make machines slower. But the impact is workload dependent and it is unclear what the impact on Firefox builds will be.
I'm inclined to attribute these regressions to one of :gps's explanations.
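The retrigger reasoning in this comment can be sketched as a simple check: if rerunning an old, unchanged revision now yields a clearly different value from the one recorded at the time, the shift follows the environment rather than the code. This is only an illustration. The function name and the 5% tolerance are assumptions, and the retriggered value is a stand-in taken from the alert's regressed value, since the comment only says the retriggered results resemble the regression levels.

```python
def shift_follows_environment(recorded, retriggered, tolerance=0.05):
    """Return True if rerunning the same old revision moved the value
    by more than `tolerance` (a hypothetical noise threshold)."""
    return abs(retriggered - recorded) / recorded > tolerance


# Recorded value before the regression (PerfStripWhitespace, windows10-64).
recorded = 116022.00
# Stand-in for a retriggered run on an old changeset: the alert's regressed value.
retriggered = 158428.00

print(shift_follows_environment(recorded, retriggered))
```

Because the source code of an old revision is fixed, a result like this points at the machines or images running the test rather than at a patch in the push.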
Comment 2 • 7 years ago
recent windows ami updates:
- gecko-t-win10-64-gpu: 2018-01-04
https://github.com/mozilla-releng/OpenCloudConfig/commit/289627fcf95d9e9ec7cefb559f43f9e411626ed9
- gecko-3-b-win2012: 2017-12-08
https://github.com/mozilla-releng/OpenCloudConfig/commit/4bb45fd6d9860d047d18a9d8f8953e016a0e0f55
- gecko-1-b-win2012 & gecko-1-b-win2012: 2017-12-07
https://github.com/mozilla-releng/OpenCloudConfig/commit/9d038b8532b25ade976cdff47661f9e91960f7d9
additionally all windows instances were updated to use the "high performance" power plan (see bug 1362613). this change would affect any instance booted after december 13th. no ami update was required for this change as instances would have picked up the config on boot. https://github.com/mozilla-releng/OpenCloudConfig/commit/d1e4a1e06989e46dda64522c050e2a5b1e2e3379
Updated • 7 years ago
status-firefox57: --- → unaffected
status-firefox58: --- → unaffected
status-firefox59: --- → affected
Updated (Reporter) • 7 years ago
status-firefox57: unaffected → ---
status-firefox58: unaffected → ---
status-firefox59: affected → ---
Component: Untriaged → Infrastructure: AWS
Product: Firefox → Infrastructure & Operations
QA Contact: cshields
Comment 3 (Reporter) • 7 years ago
I moved this bug to the Infrastructure & Operations :: Infrastructure: AWS component, based on :gps' comments [1] in a similar bug.
[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1429311#c2
Comment 4 • 7 years ago
And it appears the perf regressions have returned to previous baseline as of a few days ago. Good times.
Comment 5 (Reporter) • 7 years ago
(In reply to Gregory Szorc [:gps] from comment #4)
> And it appears the perf regressions have returned to previous baseline as of
> a few days ago. Good times.
Yes, it looks like it did. Marking this bug as resolved.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Comment 6 (Reporter) • 7 years ago
These results confirm that the baselines have returned to normal:
== Change summary for alert #11160 (as of Fri, 12 Jan 2018 08:36:28 GMT) ==
Improvements:
26% Strings PerfStripWhitespace windows10-64 opt 160,245.67 -> 118,289.67
11% Stylo Servo_StyleSheet_FromUTF8Bytes_Bench windows7-32 opt 79,586.58 -> 71,181.50
10% Stylo Gecko_nsCSSParser_ParseSheet_Bench windows7-32 opt 81,710.83 -> 73,613.08
8% Strings PerfIsASCIIHundred windows7-32 opt 3,070.33 -> 2,816.15
2% TestStandardURL NormalizePerf windows7-32 opt 75,047.38 -> 73,416.92
2% Strings PerfIsUTF8Example3 windows7-32 opt 8,021.84 -> 7,860.23
For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=11160
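One way to check the claim that the baselines have returned to normal is to compare each pre-regression value from the original alert with the post-recovery value from alert #11160. The sketch below does that for the four test and platform pairs that appear in both alerts. The 3% tolerance and the helper name are assumptions chosen for illustration, not thresholds used by Perfherder.

```python
def within_baseline(before_regression, after_recovery, tolerance=0.03):
    """True if the recovered value is within `tolerance` of the
    original baseline (tolerance is a hypothetical threshold)."""
    return abs(after_recovery - before_regression) / before_regression <= tolerance


# (test, platform, value before the regression, value after recovery),
# taken from the two alerts quoted in this bug.
pairs = [
    ("Strings PerfStripWhitespace", "windows10-64", 116022.00, 118289.67),
    ("Stylo Servo_StyleSheet_FromUTF8Bytes_Bench", "windows7-32", 71707.29, 71181.50),
    ("Stylo Gecko_nsCSSParser_ParseSheet_Bench", "windows7-32", 73231.21, 73613.08),
    ("TestStandardURL NormalizePerf", "windows7-32", 73572.17, 73416.92),
]

for name, platform, before, after in pairs:
    status = "recovered" if within_baseline(before, after) else "still off"
    print(f"{name} {platform}: {status}")
```

Under this tolerance all four pairs count as recovered. The largest remaining gap is PerfStripWhitespace on windows10-64, at just under 2% above its earlier value.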