3.75 - 6.03% ts_paint / ts_paint_webext (windows10-64-qr) regression on push 2cee53dd5773 (Wed Oct 10 2018)
Categories
(Core :: Graphics: WebRender, defect, P3)
Tracking
| | Tracking | Status |
| --- | --- | --- |
| firefox-esr60 | --- | unaffected |
| firefox-esr68 | --- | disabled |
| firefox69 | --- | wontfix |
| firefox70 | --- | wontfix |
| firefox71 | --- | fix-optional |
People
(Reporter: igoldan, Assigned: jrmuizel)
Talos has detected a Firefox performance regression from push: https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=4f55976a9e9115c9f41075843bc48955684364d9&tochange=2cee53dd577363866c3cc6ed7baf679a9936abbf

As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

6% ts_paint_webext windows10-64-qr opt e10s stylo 327.00 -> 346.73
4% ts_paint windows10-64-qr opt e10s stylo 324.50 -> 336.67

Improvements:

32% glterrain windows10-64-qr opt e10s stylo 2.14 -> 1.45
22% sessionrestore windows10-64-qr opt e10s stylo 401.33 -> 314.64
8% tp5o_scroll windows10-64-qr opt e10s stylo 2.82 -> 2.59
5% tscrollx linux64-qr opt e10s stylo 2.35 -> 2.25

You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=16667

On the page above you can see an alert for each affected platform, as well as a link to a graph showing the history of scores for this test. There is also a link to a Treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Buildbot/Talos/Tests

For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Buildbot/Talos/Running

*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
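(For reference, the percentages in these alerts are just the relative change between the old and new test values quoted above. A quick standalone sanity check, not part of the Talos tooling, reproduces the figures:)

```rust
// Standalone sanity check (not part of Talos): the alert percentages are the
// relative change between the old and new values from the summary above.
fn percent_change(old: f64, new: f64) -> f64 {
    (new - old) / old * 100.0
}

fn main() {
    println!("ts_paint_webext: {:+.2}%", percent_change(327.00, 346.73)); // +6.03%
    println!("ts_paint:        {:+.2}%", percent_change(324.50, 336.67)); // +3.75%
    println!("glterrain:       {:+.2}%", percent_change(2.14, 1.45));     // -32.24%
}
```

(The first two values match the 3.75 - 6.03% range in the bug title.)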
Reporter
Comment 1 • 6 years ago
These regressions were caused by one of the following bugs: bug 1496670, bug 1461239, or bug 1495902. :jrmuizelaar, which one is most related to our problem?
Reporter
Comment 2 • 6 years ago
We also noticed these AWSY regressions were caused by the same patch:

== Change summary for alert #16687 (as of Wed, 10 Oct 2018 00:20:39 GMT) ==

Regressions:

4% Explicit Memory windows10-64-qr opt stylo 318,964,286.10 -> 333,257,110.10
3% Resident Memory windows10-64-qr opt stylo 823,379,704.81 -> 850,549,640.65

For up-to-date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=16687
Assignee
Comment 3 • 6 years ago
I guess we should try to narrow down what commit actually caused this.
Comment 4 • 6 years ago
I bisected on try: https://treeherder.mozilla.org/perf.html#/graphs?timerange=86400&series=try,1682838,1,1&zoom=1539700822850.41,1539702991000,320.61610532610604,341.5149817306004

and this (bug 1496670) is the root cause: https://hg.mozilla.org/integration/mozilla-inbound/rev/2cee53dd5773

I really don't get this, because that push is just a change to reftest.list, yet the data clearly shows it causing the exact regression seen for win10-qr ts_paint. Could the builds be non-deterministic?
Assignee
Comment 5 • 6 years ago
That's super weird. I have no idea what it would mean. I suppose we could try other reftest.list changes and see if it's just changing that particular one that causes the problem.
Comment 6 • 6 years ago
I am going to do some try pushes as "backouts" instead; that might give us a better idea.
Comment 7 • 6 years ago
The backouts had a lot of conflicts; I could only get one to back out, and the other one had a build issue. Needless to say, backing out the reftest.list change didn't cause a perf difference.
Comment 8 • 6 years ago
I just came across this bug. I've been bisecting the AWSY regressions as well, and narrowed them down [1] to this push [2]. I'm bisecting further and will report back.

[1] https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=68738fc5a9ce6a0ec0c1a13bb1e7e43528e63db0&newProject=try&newRevision=4c1fcac81edcf4f6e218a6fb9ac30b4976ec6380&framework=4
[2] https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=7801a4fb37db
Comment 9 • 6 years ago
Did a more precise bisect within that WR update [1], which gives us [2] plus the gecko-side patch in bug 1495902 (which was necessary to make things compile). Unless we think Glenn's shadow flattening work might be to blame, that seems like pretty strong evidence that the shader caching work is responsible here, at least for the AWSY regressions. We could retrigger the two linked try pushes to include ts_paint if that's of interest.

[1] https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=68738fc5a9ce6a0ec0c1a13bb1e7e43528e63db0&newProject=try&newRevision=5e08c890491b725dcedfeb49a963d5619349ef98&framework=4
[2] https://github.com/servo/webrender/compare/3c3f9a4e919b81639f078d7bd101012de61b9396...9dd465162183c127e7cefbe50ad9173e6ec27bb3
Reporter
Comment 11 • 6 years ago
:vchin, we need your help concluding this bug; it has been stuck for more than two weeks.
Comment 12 • 6 years ago
Given that this only affects WebRender, which is not yet riding the trains, I think it's ok to wait for Matt (or someone else) to get to it.
Assignee
Comment 13 • 6 years ago
Here's the ts_paint regression:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=4f55976a9e9115c9f41075843bc48955684364d9&newProject=try&newRevision=0b82df8f44d17cca635a8b8aef328095f2f6cb03&framework=1
Assignee
Comment 14 • 6 years ago
And just to confirm with the shadow changes removed:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=4f55976a9e9115c9f41075843bc48955684364d9&newProject=try&newRevision=81b5499bde1afc9a19fd1a9e8e1bc1a3a44d77e5&framework=1
Assignee
Comment 15 • 6 years ago
Profiles from the time of the regression show a clear reason for it: we're spending a bunch of time compiling shaders when we don't expect to. However, recent profiles comparing WR and non-WR don't show the same problem. Further, when profiling is enabled, the startup time regression disappears; this happens even when lowering the sampling rate to 10ms.
Given this, I'm inclined to think that the regression we're seeing here may not be real, and may just be an artifact of the particular timing on the test machines.
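(For readers unfamiliar with the mechanism under discussion: the shader caching work amounts to keying compiled programs so a warm startup can skip recompilation. Below is a minimal illustrative sketch of that idea only; the names are hypothetical and this is not WebRender's actual implementation, which stores real driver program binaries rather than bytes.)

```rust
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Minimal illustrative sketch (hypothetical names, not WebRender's actual
// implementation): a shader cache keyed by a hash of the shader source, so a
// warm startup can skip the expensive compile step the profiles pointed at.
struct ShaderCache {
    binaries: HashMap<u64, Vec<u8>>, // source hash -> compiled program bytes
}

impl ShaderCache {
    fn new() -> Self {
        ShaderCache { binaries: HashMap::new() }
    }

    // Return the cached binary if present; otherwise compile once and store it.
    fn get_or_compile(&mut self, source: &str, compile: impl Fn(&str) -> Vec<u8>) -> &[u8] {
        let key = hash_source(source);
        self.binaries.entry(key).or_insert_with(|| compile(source))
    }
}

fn hash_source(source: &str) -> u64 {
    let mut h = std::collections::hash_map::DefaultHasher::new();
    source.hash(&mut h);
    h.finish()
}

fn main() {
    let mut cache = ShaderCache::new();
    // Stand-in "compiler"; in practice this would be a real GL program build.
    let compile = |src: &str| src.as_bytes().to_vec();
    let cold = cache.get_or_compile("void main() {}", compile).len(); // compiles
    let warm = cache.get_or_compile("void main() {}", compile).len(); // cache hit
    assert_eq!(cold, warm);
}
```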
Comment 16 • 6 years ago
Yeah, I did a bunch of work around shader compilation in December, so that's likely to change things.
Comment 18 • 5 years ago
:jrmuizel, is there anything we can do on this? Can we close this as WONTFIX?
Assignee
Comment 19 • 5 years ago
Yeah, let's.