[meta] Determine cause(s) of variable page load timing results
Categories
(Core :: Performance: General, task, P1)
Tracking
()
Tracking | Status | |
---|---|---|
firefox65 | --- | affected |
People
(Reporter: acreskey, Assigned: acreskey)
References
(Depends on 2 open bugs, Blocks 1 open bug)
Details
(Keywords: meta)
Attachments
(3 files)
Assignee | ||
Comment 10•6 years ago
|
||
The most significant update is that :jesup landed a large performance optimization that deferred the execution of JS triggered by setTimeout() to an idle queue.
bug 1270059
This had the side effect of cutting the Noise Metric (roughly the sum of std deviations) by ~40% to 60% depending on the platform, visible here:
I haven't investigated exactly why this reduces the noise, but it's fair to say that long JS functions potentially executing during pageload (depending on timing) were leading to nondeterministic scheduling.
This was observed in profiles.
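As a rough illustration of the Noise Metric mentioned above (described only as "roughly the sum of std deviations"), here is a minimal Python sketch. It assumes the metric is the plain sum of per-test sample standard deviations; the function names and the replicate values are hypothetical and are not taken from raptor or Perfherder.

```python
from statistics import stdev


def noise_metric(replicates_by_test):
    """Sum the sample standard deviation of each test's replicates.

    Assumption: this mirrors "roughly the sum of std deviations";
    the real metric may weight or normalize results differently.
    """
    return sum(
        stdev(values)
        for values in replicates_by_test.values()
        if len(values) > 1  # stdev needs at least two data points
    )


def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Hypothetical load times (ms) for two tests, before and after a change.
before = {"test-a": [100, 140, 90, 130], "test-b": [200, 260, 210, 250]}
after = {"test-a": [100, 110, 105, 115], "test-b": [200, 215, 210, 205]}

change = percent_change(noise_metric(before), noise_metric(after))
print(f"noise metric change: {change:.1f}%")
```

A negative result means the change reduced the summed spread across tests.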
Assignee | ||
Comment 11•6 years ago
|
||
Bug 1517323 captures the work of removing mitmproxy's live upstream connections.
Unfortunately it is regressing performance tests.
I investigated and determined that, at least for some of the tests (e.g. bing), the performance drop comes from mitmproxy in "offline mode" downgrading HTTP/2 playback to HTTP/1.
This can have a very significant performance impact, particularly when playing back from cached recordings.
Assignee | ||
Comment 12•6 years ago
|
||
I'll be looking into this issue shortly - Bug 1524609.
At least on these raptor pageload tests in the lab, disabling RCWN is reducing the noise by 7% to 70%, depending on the platform.
My goal will be primarily performance tuning of RCWN, but it appears that noise reduction is very possible.
I had earlier done tests of disabling RCWN but at that time I was more focused on finding the source of the bimodal loading distributions and did not notice a drop in the Noise Metric.
Assignee | ||
Comment 13•6 years ago
|
||
It looks like the browser is still busy doing work which interferes with page load after the 30 seconds that raptor waits post-startup.
A quick experiment here extends raptor's startup delay from 30 to 90 seconds and noise is significantly reduced:
Assignee | ||
Comment 14•6 years ago
|
||
Just logged Bug 1543776 after :vchin noticed the erratic load times on www.allrecipes.com (see image)
I've seen slow loads coinciding with long GC majors many times.
If we can resolve this we will significantly reduce noise in numerous tp6 tests.
Assignee | ||
Comment 15•6 years ago
|
||
Fixing Bug 1548355 led to a very significant drop in noise (and loadtime) on raptor-tp6-reddit-firefox loadtime.
We believe that this is because it defers GC out of the loading stage, and thus off-thread parsing isn't blocked (Bug 1543806).
Updated•6 years ago
|
Assignee | ||
Comment 16•6 years ago
|
||
This issue is a major contributor to noise: Bug 1555306
For this given site, on the 2017 reference laptop, page load results will vary from ~2000ms to ~5000ms when testing against a recorded http session.
Assignee | ||
Comment 17•6 years ago
|
||
The behaviour described in Bug 1564569 was identified as the cause of large spikes in recorded load time such as these (amazon cold load tests on android).
Assignee | ||
Updated•6 years ago
|
Assignee | ||
Comment 18•6 years ago
|
||
I described my findings in this post:
https://groups.google.com/a/mozilla.com/d/msg/perfteam/Zk9GN6kx2Fc/NU6QQby8DwAJ
Summarized here:
In Bug 1564569 it was shown that the injection of new resources via setTimeout() (which may or may not occur before onload) is the source of the bimodal load performance in many, likely most, cases (the first 3 of 3 in tp6m).
You can see the difference between our current behaviour (top) and a prototype in which we run the setTimeout()s after load here:
https://imgur.com/NBOKnLe
Given that the prototype solution is not viable (hurts visual metrics, poor behaviour on some sites), two alternatives were discussed:
• For completeness, Markus raised the idea of creating our own metric, e.g. onunbrokendependencyload, which would then exhibit less noise.
• Given the preference for visual metrics, the idea of focusing noise reduction on a visual metric (which may be more stable) was discussed and was favorably received.
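To illustrate why a setTimeout() that races with onload can produce two clusters of load times, here is a toy Python model. It is a sketch under stated assumptions, not Gecko's scheduling: it assumes a resource injected before onload delays the load event until that resource finishes, while one injected after onload does not affect it. All names and timings are hypothetical.

```python
import random


def onload_time(base_load_ms, timer_fire_ms, injected_fetch_ms):
    """Toy model of when onload fires.

    If the timer callback runs before the page would otherwise finish,
    the injected resource is assumed to block onload until it is fetched.
    """
    if timer_fire_ms < base_load_ms:
        return max(base_load_ms, timer_fire_ms + injected_fetch_ms)
    return base_load_ms


def simulate(runs, seed=0):
    """Sample load times with small jitter on both competing events."""
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        base = rng.gauss(1000, 20)   # hypothetical baseline load time (ms)
        timer = rng.gauss(1000, 20)  # hypothetical timer firing time (ms)
        results.append(onload_time(base, timer, 800))
    return results


loads = simulate(1000)
fast = [t for t in loads if t < 1400]   # timer lost the race
slow = [t for t in loads if t >= 1400]  # timer won the race
print(f"fast loads: {len(fast)}, slow loads: {len(slow)}")
```

In this model a few milliseconds of jitter decide which mode a run lands in, which is one way a bimodal distribution can arise without either code path being slow on its own.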
Comment 19•6 years ago
|
||
Making into a meta bug; we should link any bugs affecting variability to this; reports in general on variability should go here
Assignee | ||
Comment 20•6 years ago
|
||
Linking bugs that were closed and worked on this quarter in an effort to reduce noise in the pageload tests:
Bug 1558189 Determine if android device temperature is introducing noise in test results (it's not the cause of the major outliers)
Bug 1563209 Bi-modal raptor tests results: this was caused by mitmproxy's live upstream cert issue which Tarek found a workaround for
Bug 1565325 Determine if using the same mitmproxy session for all cold page load browser cycles introduces noise (ruled out as a source of major noise)
Bug 1558191 Determine if different android devices are introducing noise in test results (not closed but ruled out as a source of major outliers)
Bug 1543776 5+ second delays seen while loading www.allrecipes.com (although not resolved, this issue has been worked on by :smaug, :jonco and myself (e.g. Bug 1575943, Bug 1579426))
Assignee | ||
Comment 21•6 years ago
|
||
My recommendations going forward are:
• Pursue Bug 1561324: being able to test on reference PC hardware is important and will be even more so with Fission
• As discussed here, begin noise reduction efforts for visual metrics: compare them to current navigation timing metrics, evaluate noise between visual metrics (speedIndex, contentfulnessIndex, etc), and find the sources of noise within those metrics
Assignee | ||
Comment 22•6 years ago
|
||
FYI, currently attempting to reduce noise on the reference laptops in the lab so that they may be usable for performance work. (Bug 1561324)
Assignee | ||
Comment 23•6 years ago
|
||
Found a significant source of noise on desktop cold pageload tests: Bug 1589070
The desktop cold load tests were incorrectly re-using the same profile for each cold load so resources would be in the disk cache for subsequent loads.
Each load was designed to be independent, but because of this bug the performance results would be bimodal with the first out of every 25 cold loads being slow (pictured).
For example:
"name": "loadtime",
"replicates": [
1506,
752,
697,
717,
715,
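The pattern in the replicates above can be checked with a short Python sketch that compares the first load against the median of the loads that follow it. Only the five values shown above are used (the original list is truncated); the function name is hypothetical.

```python
from statistics import median


def first_vs_rest(replicates):
    """Ratio of the first replicate to the median of the remaining ones.

    A ratio well above 1 suggests the first load was the only truly
    cold one and later loads benefited from cached resources.
    """
    first, rest = replicates[0], replicates[1:]
    return first / median(rest)


# The first five loadtime replicates quoted above (list is truncated).
replicates = [1506, 752, 697, 717, 715]

ratio = first_vs_rest(replicates)
print(f"first load is {ratio:.2f}x the median of the rest")
```

Using the median rather than the mean keeps a single slow outlier among the later loads from masking the effect.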
Assignee | ||
Comment 24•6 years ago
|
||
We noticed in Bug 1595537 that the JS debug option for async stacks ends up enabled in the raptor performance tests (desktop).
Disabling this pref reduces noise significantly and improves test 'realism', so that's landing shortly.
Platform | Noise metric change |
---|---|
linux64-shippable | -9.52% |
macosx1014-64-shippable | -6.86% |
windows10-64-shippable | -22.01% |
See also: Talos version - bug 1597297
Assignee | ||
Comment 25•6 years ago
|
||
In Bug 1597862 stephend found that the shift to conditioned profiles is cutting the standard deviation on test results by half.
This will be landing shortly on desktop.
Updated•3 years ago
|