Bug 1373396: Reduce variances for AWFY speedometer benchmark
Status: RESOLVED FIXED (opened 7 years ago, closed 7 years ago)
Categories: Testing Graveyard :: AWFY, enhancement
Tracking: Not tracked
People: Reporter: armenzg, Assigned: armenzg
References: Blocks 1 open bug
Details: Whiteboard: [PI:June]
I’ve run the benchmark with different configurations:
         1st run  2nd run  3rd run
Nightly    60.84    54.75    64.02   (inbound build - non-PGO - AWFY/proxy)
Nightly    62.44    62.26    62.23   (inbound build - non-PGO)
Nightly    64.94    63.93    62.77   (inbound build - non-PGO) with config changes [1]
Canary    103.50
These numbers were obtained on the Quantum reference laptop.
I restarted the browser in between runs to simulate similar conditions to AWFY.
[1] https://github.com/mozilla/arewefastyet/blob/master/slave/configs.py#L20-L22
Setting JSGC_DISABLE_POISONING=1 in the environment is required in addition to the about:config changes.
How can I verify that manually setting the env and then starting the browser worked as expected?
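One way to at least be sure the variable reaches the browser process is to launch the browser from a script that builds the environment explicitly. This is a minimal sketch, not what AWFY does: the binary path and URL are hypothetical placeholders, and it only guarantees what the child process receives, not that the JS engine honoured it.

```python
import os
import subprocess


def benchmark_env(base_env=None):
    """Return a copy of the environment with GC poisoning disabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["JSGC_DISABLE_POISONING"] = "1"
    return env


def launch_browser(binary, url, env=None):
    """Start the browser with the benchmark environment and return the process handle."""
    env = benchmark_env() if env is None else env
    return subprocess.Popen([binary, url], env=env)


if __name__ == "__main__":
    # Hypothetical path and URL; adjust for the machine under test.
    proc = launch_browser("/path/to/firefox", "http://localhost:8000/speedometer/")
    proc.wait()
```

Passing the environment explicitly avoids depending on whichever shell session happened to export the variable.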
I believe the scores indicate that running speedometer via AWFY adds unwanted score variance.
I also believe that running with the configuration recommendations from the following site improves the scores; however, it also increases the variance a bit:
https://developer.mozilla.org/en-US/docs/Mozilla/Benchmarking
WARNING: Three runs are not enough to say with confidence that all my deductions are correct.
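To put rough numbers on the spread, the three Nightly configurations above can be summarised with a mean, a sample standard deviation and a coefficient of variation. This is only an illustrative sketch using the scores quoted in this bug; with three samples per configuration the estimates are themselves very noisy.

```python
import statistics

# Scores copied from the three Nightly rows reported in this bug.
runs = {
    "non-PGO via AWFY/proxy": [60.84, 54.75, 64.02],
    "non-PGO": [62.44, 62.26, 62.23],
    "non-PGO with config changes": [64.94, 63.93, 62.77],
}


def summarize(scores):
    """Return (mean, sample standard deviation, coefficient of variation in %)."""
    mean = statistics.mean(scores)
    stdev = statistics.stdev(scores)
    return mean, stdev, 100.0 * stdev / mean


summary = {name: summarize(scores) for name, scores in runs.items()}

if __name__ == "__main__":
    for name, (mean, stdev, cv) in summary.items():
        print(f"{name:30s} mean={mean:.2f} stdev={stdev:.2f} cv={cv:.1f}%")
```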
At the moment I have a few ideas to investigate to reduce the current variance with AWFY:
1) Read closely what the automation code does
2) Run the harness within a RAM disk
* I wonder if the proxy is slowing us down
3) Switch AWFY to use the real website
* So far it’s proven to yield better results
4) Set up the speedometer site on a host we control
Comment 1 • 7 years ago
We need to decide whether to set up AWFY automation to use environment variables and prefs, similar to what we do with other unit tests and perf tests. I am leaning towards the fewer the better, since we would be comparing against the default values of other browsers, but the numbers need to be reliable.
Whiteboard: [PI:June]
Comment 2 (Assignee) • 7 years ago
It seems that the score on the Quantum reference laptop actually hits a consistent score of 51:
https://arewefastyet.com/#machine=36&view=single&suite=speedometer-misc&subtest=score
I need to determine why my reference laptop has a score of about 60.
Trying to determine this over email with Sean.
Comment 3 (Assignee) • 7 years ago
The latest hypothesis is that running speedometer alongside all the other benchmarks can cause the machine to overheat, in which case the machine may reduce the CPU frequency to cool down. This could account for the variance.
Running all benchmarks on my reference laptop (scores in the 50s match automation):
1. 55.98
2. 63.84
3. 58.74 (run right after the previous run)
4. 66.40 (today; after machine had been off all night)
5. 66.20
I've thought of outputting CPU frequencies (and temperature, if possible) at the end of each run to determine whether there's a correlation between low scores and low CPU frequencies.
I was hoping to use psutil; unfortunately, it does not work under Cygwin.
https://github.com/giampaolo/psutil/issues/82
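For hosts where psutil does work (that is, not under Cygwin, per the issue above), recording the frequency next to each score could look roughly like the sketch below. The helper names and the record format are hypothetical, not part of the AWFY harness, and whether the reported value reflects thermal throttling depends on the platform.

```python
import time


def read_cpu_freq_mhz():
    """Return the current CPU frequency in MHz, or None if it cannot be read."""
    try:
        import psutil  # third-party; assumed to be installed and working on the host
        freq = psutil.cpu_freq()
    except Exception:
        # psutil missing, or cpu_freq() unsupported on this platform.
        return None
    return None if freq is None else float(freq.current)


def format_run_record(benchmark, score, freq_mhz):
    """Build one log line pairing a benchmark score with the CPU frequency."""
    freq = "unknown" if freq_mhz is None else f"{freq_mhz:.0f} MHz"
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    return f"{stamp} {benchmark} score={score:.2f} cpu_freq={freq}"


if __name__ == "__main__":
    print(format_run_record("speedometer-misc", 55.98, read_cpu_freq_mhz()))
```

Collecting one such line per run would give paired (score, frequency) data points to check the overheating hypothesis against.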
Comment 4 (Assignee) • 7 years ago
My current hypothesis, which has given 3 positive results in a row, is that the preference "shut off display after 15 minutes" is set on the production machine.
* 56.06
* 55.49
* 55.51
All benchmarks against opt m-i (like production).
I'm now narrowing it down to just speedometer-misc to see if I can gather data points faster.
Comment 5 (Assignee) • 7 years ago
If I run dromaeo and speedometer I also hit the reduced score (speedometer finishes in less than 15 minutes, so I added dromaeo).
Running speedometer by itself with a 15-minute timer does *not* show the reduced score.
Running speedometer by itself with a 5-minute timer *does* show the reduced score.
I would like to change the setting on machine 16. I'm asking where to announce this change.
Comment 6 (Assignee) • 7 years ago
I've changed the setting on the machine. I had reached out to the Quantum team before doing so.
I will wait for official results before closing this.
Updated • 7 years ago
Blocks: Speedometer_V2
Comment 7 (Assignee) • 7 years ago
On automation, we're now hitting scores around 67 (non-PGO) instead of 56.
ehsan posted some results with his reference laptop in here:
https://ehsanakhgari.org/blog/2017-06-23/quantum-flow-engineering-newsletter-14
A couple of differences from AWFY are that he runs it in full screen and did not apply the GC poisoning change:
https://developer.mozilla.org/en-US/docs/Mozilla/Benchmarking
Running on my local machine:
AWFY     67.20 - with all perf changes - **not** maximized
benj.me  68.05 - clean profile - maximized
benj.me  69.47 - no GC poisoning but other perf changes - maximized
benj.me  70.53 - with all perf changes - maximized
I believe that if we manage to make AWFY run in full screen we would gain those last 2-3 points.
I'm looking for an answer on how to make Firefox run in full screen under automation:
https://groups.google.com/forum/#!topic/mozilla.dev.platform/w2AoQLeD-Ss
Comment 8 • 7 years ago
FWIW, I have been getting a variance of a few points locally for a few weeks now; I don't think that is unusual.
Comment 9 (Assignee) • 7 years ago
I've run a PGO build via AWFY after this PR [1] and I've got a score of 71.50.
Automation with inbound builds is at 68.69.
STR:
python download.py --repo mozilla-central -o ~/repos/mozilla-central-pgo/ -c 64bit -b pgo
python execute.py -s remote -b remote.speedometer-misc -e ~/repos/mozilla-central-pgo/ -c default
[1] https://github.com/mozilla/arewefastyet/pull/134
Comment 10 (Assignee) • 7 years ago
We started running PGO twice in a row once a day:
https://arewefastyet.com/#machine=36&view=single&suite=speedometer-misc&subtest=score
We're getting satisfactory results.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Updated • 7 years ago
Component: General → AWFY
Updated • 5 years ago
Product: Testing → Testing Graveyard