Closed
Bug 1483610
Opened 6 years ago
Closed 6 years ago
70.62 - 92.53% displaylist_mutate (linux64-qr, windows10-64-qr) regression on push 3a0e1fb203fad7e435ab65e1b78a5ceb5998bdc0 (Tue Aug 14 2018)
Categories
(Core :: Graphics: WebRender, defect, P2)
Tracking
()
RESOLVED
FIXED
mozilla64
Tracking | Status | |
---|---|---|
firefox-esr60 | --- | unaffected |
firefox62 | --- | unaffected |
firefox63 | --- | disabled |
firefox64 | --- | fixed |
People
(Reporter: jmaher, Assigned: jrmuizel)
References
(Depends on 1 open bug)
Details
(Keywords: perf, regression, talos-regression, Whiteboard: [gfx-noted])
Attachments
(3 files)
Talos has detected a Firefox performance regression from push:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=3a0e1fb203fad7e435ab65e1b78a5ceb5998bdc0

As author of one of the patches included in that push, we need your help to address this regression.

Regressions:
93% displaylist_mutate windows10-64-qr opt e10s stylo 4,151.73 -> 7,993.22
71% displaylist_mutate linux64-qr opt e10s stylo 4,849.90 -> 8,274.71

Improvements:
6% tp5o_scroll linux64-qr opt e10s stylo 0.52 -> 0.49

You can find links to graphs and comparison views for each of the above tests at:
https://treeherder.mozilla.org/perf.html#/alerts?id=14984

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see:
https://wiki.mozilla.org/Buildbot/Talos/Tests

For information on reproducing and debugging the regression, either on try or locally, see:
https://wiki.mozilla.org/Buildbot/Talos/Running

*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations:
https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
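For readers checking the numbers, here is a minimal sketch of how the percentages in the alert relate to the raw scores. It assumes the percentage is simply the relative change from the old score to the new one; the function name percent_change is illustrative and not part of Talos or Perfherder.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Scores copied from the alert above: (before, after).
results = {
    "displaylist_mutate windows10-64-qr": (4151.73, 7993.22),
    "displaylist_mutate linux64-qr": (4849.90, 8274.71),
    "tp5o_scroll linux64-qr": (0.52, 0.49),
}

for name, (before, after) in results.items():
    print(f"{name}: {percent_change(before, after):+.2f}%")
```

The two displaylist_mutate results should match the 70.62 - 92.53% range in the bug title, and the tp5o_scroll result rounds to the 6% improvement listed.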
Reporter | ||
Comment 1•6 years ago
|
||
:jrmuizel, I see you landed the code in bug 1481570. This has caused a regression for displaylist_mutate (which is also very noisy now instead of very stable). Can you look at the regression to see whether there is a fix we can make, or help decide whether we need to back out or accept this regression?
Component: General → Graphics: WebRender
Flags: needinfo?(jmuizelaar)
Product: Testing → Core
Comment 2•6 years ago
|
||
The MotionMark score became very bad on the latest nightly. It might also be related to this bug.
Comment 3•6 years ago
|
||
I am going to check which change caused the regression.
Comment 4•6 years ago
|
||
Using the try pushes at https://treeherder.mozilla.org/#/jobs?repo=try&author=kgupta@mozilla.com, the regression seems to happen within
https://github.com/servo/webrender/compare/e4750616750f20fcbb278df0e324ec26aebd0a3c...c68118517dfcb81090139ea9acaff4f6c8b26431

The Talos comparison is the following.
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=e895882944543b1d12bcfb940552af76a1569ef9&newProject=try&newRevision=33ab362f07b201dbde0292ad9de6ed19ed01ab96&framework=1
Comment 5•6 years ago
|
||
I confirmed that Bug 1481570 also regressed HTML5 Fish Bowl on my P50 (Win10). Before Bug 1481570, the WR profiler showed 25-30 fps. Since Bug 1481570, the WR profiler shows 7-15 fps. It might be a simpler use case for the regression.
https://testdrive-archive.azurewebsites.net/performance/fishbowl/

I used the following command to confirm it.
> mozregression --good 2018-08-12 --pref gfx.webrender.all:true gfx.webrender.debug.compact-profiler:true gfx.webrender.debug.profiler:true -a https://testdrive-archive.azurewebsites.net/performance/fishbowl/

mozregression showed the following result.
----------------------------------------------
8:17.28 INFO: No more inbound revisions, bisection finished.
8:17.28 INFO: Last good revision: 08bf805f6f0ef61f68686ef1ca2cc6f750a2cfa0
8:17.28 INFO: First bad revision: 7846bdd3762cf494ec24efc9cfc3472dc715ce4f
8:17.28 INFO: Pushlog: https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=08bf805f6f0ef61f68686ef1ca2cc6f750a2cfa0&tochange=7846bdd3762cf494ec24efc9cfc3472dc715ce4f
Comment 6•6 years ago
|
||
In the HTML5 Fish Bowl case, the GPU usage shown in Windows Task Manager seems to have increased from around 80% to around 95%.
Comment 7•6 years ago
|
||
Comment 8•6 years ago
|
||
Capture of WebRender profiler at current nightly
Comment 9•6 years ago
|
||
From attachment 9001871 [details], since the Bug 1481570 fix, the C_CLIP task seems to have increased compared to attachment 9001870 [details].
Comment 10•6 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #4)
> By using
> https://treeherder.mozilla.org/#/jobs?repo=try&author=kgupta@mozilla.com the
> regression seems to happen within
> https://github.com/servo/webrender/compare/e4750616750f20fcbb278df0e324ec26aebd0a3c...c68118517dfcb81090139ea9acaff4f6c8b26431

Within the above changes, the changes by :gw seem like the likely culprit.
Comment 12•6 years ago
|
||
I suspected we may have a few regressions similar to this when that patch landed. I suspect what's happening is that there is a case no longer handled where we should be removing a redundant clip mask. I will investigate these test cases tomorrow, thanks for looking into it!
Flags: needinfo?(gwatson)
Updated•6 years ago
|
Assignee: nobody → gwatson
Updated•6 years ago
|
Whiteboard: [gfx-noted]
Comment 13•6 years ago
|
||
I have a WIP patch that resolves the performance regression (on fishbowl, at least). Unfortunately it breaks a couple of reftests on try, so I need to investigate and fix those before the patch will be ready for review.
Comment 14•6 years ago
|
||
The patch linked to above fixes the regression on the fishbowl performance test. I expect it will fix the other tests too (and perhaps improve over the original performance), although I haven't confirmed that yet.
Comment 15•6 years ago
|
||
(In reply to Glenn Watson [:gw] from comment #14)
> The patch linked to above fixes the regression on the fishbowl performance
> test. I expect it will fix the other tests too (and perhaps improve over the
> original performance), although I haven't confirmed that yet.

I confirmed that the MotionMark regression was addressed, using the following command. Bug 1481570 dropped the score to 7-10 on my P50 (Win10); the fix raised the score to 100-130.
> mozregression --repo try --launch a9d93837c673748ce01d0f45ccfc150d5f4fb036 -B release --pref gfx.webrender.all:true -a https://browserbench.org/MotionMark/
Assignee | ||
Updated•6 years ago
|
Flags: needinfo?(jmuizelaar)
Priority: -- → P2
Comment 16•6 years ago
|
||
Sotaro, is this fixed based on your comment in #15? Or is there still work to be done here?
Flags: needinfo?(sotaro.ikeda.g)
Comment 17•6 years ago
|
||
The perf problems (heavy clipping task) in MotionMark and fishbowl were addressed. But displaylist_mutate seems to be a different problem :( The following graph shows that displaylist_mutate is still bad.
https://treeherder.mozilla.org/perf.html#/graphs?timerange=2592000&series=autoland,1663497,1,1&series=autoland,1683784,1,1&series=autoland,1663687,1,1&series=autoland,1683809,1,1
Flags: needinfo?(sotaro.ikeda.g)
Comment 18•6 years ago
|
||
The following is a direct perf comparison between the build before the regression and the latest WebRender build at https://treeherder.mozilla.org/#/jobs?repo=try&author=kgupta@mozilla.com
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=e895882944543b1d12bcfb940552af76a1569ef9&newProject=try&newRevision=796b97711e9db6ee09dfe9d67d4baaf7bd3277b4&framework=1
Assignee | ||
Updated•6 years ago
|
Priority: P2 → P1
Comment 19•6 years ago
|
||
This HTML file was created by modifying https://searchfox.org/mozilla-central/source/testing/talos/talos/tests/layout/benchmarks/displaylist_mutate.html
Comment 20•6 years ago
|
||
attachment 9005100 [details] can be used to check the regression. With the following mozregression commands, I confirmed that the fps regressed: [1] ran at 24-25 fps and [2] at 39-40 fps on my P50 (Win10).
[1] latest webrender build
mozregression --repo try --launch 796b97711e9db6ee09dfe9d67d4baaf7bd3277b4 -B release --pref gfx.webrender.all:true gfx.webrender.debug.compact-profiler:true gfx.webrender.debug.profiler:true
[2] build before regression
mozregression --repo try --launch e895882944543b1d12bcfb940552af76a1569ef9 -B release --pref gfx.webrender.all:true gfx.webrender.debug.compact-profiler:true gfx.webrender.debug.profiler:true
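Since the two commands above differ only in the try revision, a small helper can generate both and avoid copy-paste mistakes. This is a sketch only: the helper name launch_command and the BUILDS mapping are illustrative, and the flags are copied verbatim from the commands in this comment rather than taken from the mozregression documentation.

```python
import shlex
from typing import List, Sequence

# Prefs copied from the commands in this comment.
PREFS = (
    "gfx.webrender.all:true",
    "gfx.webrender.debug.compact-profiler:true",
    "gfx.webrender.debug.profiler:true",
)

# Try revisions named in this comment.
BUILDS = {
    "[1] latest webrender build": "796b97711e9db6ee09dfe9d67d4baaf7bd3277b4",
    "[2] build before regression": "e895882944543b1d12bcfb940552af76a1569ef9",
}


def launch_command(revision: str, prefs: Sequence[str] = PREFS) -> List[str]:
    """Build the argv for launching one try build with the given prefs."""
    return [
        "mozregression", "--repo", "try",
        "--launch", revision,
        "-B", "release",
        "--pref", *prefs,
    ]


for label, revision in BUILDS.items():
    print(label)
    print(shlex.join(launch_command(revision)))
```

The script only prints the command lines; passing the list to subprocess.run would launch the build, assuming mozregression is installed and on the PATH.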
Comment 21•6 years ago
|
||
From the WebRender profiler, CPU (backend) seems to have become busier. I checked it with the following command.
mozregression --good 2018-08-10 --pref gfx.webrender.all:true gfx.webrender.debug.profiler:true gfx.webrender.debug.gpu-sample-queries:true gfx.webrender.debug.gpu-time-queries:true -a https://bug1483610.bmoattachments.org/attachment.cgi?id=9005100
Before the regression, the mean CPU (backend) time was 6-7 ms. Since the regression, it has been 8-10 ms.
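As rough arithmetic only, the fps figures from comment 20 can be converted to per-frame intervals and set beside the backend times above. The sketch assumes the midpoint of each reported range is representative, which the thread does not state; the names frame_interval_ms and midpoint are illustrative.

```python
from typing import Tuple

Range = Tuple[float, float]


def frame_interval_ms(fps: float) -> float:
    """Average time per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def midpoint(r: Range) -> float:
    """Midpoint of a (low, high) range; an assumption, not a measurement."""
    return (r[0] + r[1]) / 2.0


# Ranges reported in comments 20 and 21.
before_fps, after_fps = (39.0, 40.0), (24.0, 25.0)
before_backend_ms, after_backend_ms = (6.0, 7.0), (8.0, 10.0)

frame_delta = (frame_interval_ms(midpoint(after_fps))
               - frame_interval_ms(midpoint(before_fps)))
backend_delta = midpoint(after_backend_ms) - midpoint(before_backend_ms)

print(f"frame interval grew by about {frame_delta:.1f} ms")
print(f"CPU (backend) mean time grew by about {backend_delta:.1f} ms")
```

Under these assumptions the per-frame interval grows by considerably more than the backend time does, so the backend increase on its own may not account for the whole fps drop. This is not a measurement and does not identify the cause.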
Updated•6 years ago
|
Blocks: stage-wr-trains
Comment 22•6 years ago
|
||
The latest profile for this looks a lot like bug 1487864. We're painting frames in pairs, and no threads are being maxed out.
Depends on: frame-scheduling
Comment 23•6 years ago
|
||
We need to fix this before WR goes to the field, but this shouldn't block WR riding to beta.
Priority: P1 → P2
Assignee | ||
Updated•6 years ago
|
Assignee: gwatson → nobody
Comment 24•6 years ago
|
||
Part of the regression might be addressed by https://github.com/servo/webrender/pull/3117.
Updated•6 years ago
|
See Also: → https://github.com/servo/webrender/pull/3117
Reporter | ||
Comment 25•6 years ago
|
||
After bug 1494042, it seems these scores are back to normal. :jrmuizel, do you want to resolve this?
Flags: needinfo?(jmuizelaar)
Assignee | ||
Comment 26•6 years ago
|
||
Sure.
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: needinfo?(jmuizelaar)
Resolution: --- → FIXED
Updated•6 years ago
|
Assignee: nobody → jmuizelaar
status-firefox62:
--- → unaffected
status-firefox63:
--- → disabled
status-firefox64:
--- → fixed
status-firefox-esr60:
--- → unaffected
Target Milestone: --- → mozilla64