Closed
Bug 1469258
Opened 6 years ago
Closed 6 years ago
12.6 - 51.94% displaylist_mutate / glterrain (linux64-qr, windows10-64-qr) regression on push 7fbf8a9126e5 (Fri Jun 15 2018)
Categories
(Core :: Graphics: WebRender, defect, P2)
Tracking
()
RESOLVED
WONTFIX
Tracking | Status | |
---|---|---|
firefox60 | --- | unaffected |
firefox61 | --- | unaffected |
firefox62 | --- | disabled |
firefox63 | --- | affected |
People
(Reporter: igoldan, Unassigned)
References
Details
(Keywords: perf, regression, talos-regression, Whiteboard: [gfx-noted])
Talos has detected a Firefox performance regression from push:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=c1d664fefdff80e066dde0140c22e8ff74fe8694&tochange=7fbf8a9126e54addb86d01ff6ec1e62e7fda18e5
As author of one of the patches included in that push, we need your help to address this regression.
Regressions:
52% glterrain windows10-64-qr opt e10s stylo 1.41 -> 2.14
19% displaylist_mutate windows10-64-qr opt e10s stylo4,115.04 -> 4,911.79
13% displaylist_mutate linux64-qr opt e10s stylo 4,832.74 -> 5,441.80
You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=13840
On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.
To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Buildbot/Talos/Tests
For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Buildbot/Talos/Running
*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***
Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
Reporter | ||
Updated•6 years ago
|
Component: General → Graphics: WebRender
Product: Testing → Core
Reporter | ||
Comment 1•6 years ago
|
||
And here are the Gecko profiles:
For Windows 10 Quantum Render:
Before: https://perf-html.io/from-url/https%3A%2F%2Fqueue.taskcluster.net%2Fv1%2Ftask%2FEHVHvaCgQoeQz1n-47LfOQ%2Fruns%2F0%2Fartifacts%2Fpublic%2Ftest_info%2Fprofile_displaylist_mutate.zip
After: https://perf-html.io/from-url/https%3A%2F%2Fqueue.taskcluster.net%2Fv1%2Ftask%2FAZh-2Z3rQE-7pUKEkbVwZg%2Fruns%2F0%2Fartifacts%2Fpublic%2Ftest_info%2Fprofile_displaylist_mutate.zip
Reporter | ||
Comment 2•6 years ago
|
||
The perf regressions were caused by this landing: https://bugzilla.mozilla.org/show_bug.cgi?id=1467096#c13
:kats What are the plans for these? Can we fix them now?
Flags: needinfo?(bugmail)
Reporter | ||
Comment 3•6 years ago
|
||
(In reply to Ionuț Goldan [:igoldan], Performance Sheriffing from comment #1)
> And here are the Gecko profiles:
>
> For Windows 10 Quantum Render:
>
> Before:
> https://perf-html.io/from-url/https%3A%2F%2Fqueue.taskcluster.
> net%2Fv1%2Ftask%2FEHVHvaCgQoeQz1n-
> 47LfOQ%2Fruns%2F0%2Fartifacts%2Fpublic%2Ftest_info%2Fprofile_displaylist_muta
> te.zip
>
> After:
> https://perf-html.io/from-url/https%3A%2F%2Fqueue.taskcluster.
> net%2Fv1%2Ftask%2FAZh-2Z3rQE-
> 7pUKEkbVwZg%2Fruns%2F0%2Fartifacts%2Fpublic%2Ftest_info%2Fprofile_displaylist
> _mutate.zip
These are for the displaylist_mutate perf test.
Reporter | ||
Comment 4•6 years ago
|
||
Also, these are the Gecko profiles for displaylist_mutate, from Linux 64 Quantum Render:
Before: https://perf-html.io/from-url/https%3A%2F%2Fqueue.taskcluster.net%2Fv1%2Ftask%2FINfV0yriQLuse_7dj01QfA%2Fruns%2F0%2Fartifacts%2Fpublic%2Ftest_info%2Fprofile_displaylist_mutate.zip
After: https://perf-html.io/from-url/https%3A%2F%2Fqueue.taskcluster.net%2Fv1%2Ftask%2FQ7pUiFe7QsOlHOUn_y1y2g%2Fruns%2F0%2Fartifacts%2Fpublic%2Ftest_info%2Fprofile_displaylist_mutate.zip
Comment 5•6 years ago
|
||
The regression was expected (see bug 1467096 comment 10) and the webrender issue at servo/webrender#2807 will help with this. I plan to work on that this week.
Updated•6 years ago
|
status-firefox60:
--- → unaffected
status-firefox61:
--- → unaffected
status-firefox62:
--- → affected
Comment 6•6 years ago
|
||
Quick update: I wrote a patch for servo/webrender#2807 and it helped with displaylist_mutate but regressed tresize. I'm going to investigate that a bit more.
Comment 7•6 years ago
|
||
I looked into the tresize regression and found a the root of the problem. tresize doesn't run in ASAP mode, so the thing it's measuring (average time between window.resize call to the following MozAfterPaint) is affected in large part by the amount of work that happens between the vsync that triggers composition, and the end of the composition. If we spend extra time doing stuff *before* that vsync it doesn't really factor into the end result, except if it causes a whole frame delay.
So, currently the sequence of events is approximately this:
(a) window.resize
(b) DL build
(c) WR scene build
(d) schedule composite
(e) WR render (for rebuilding hit-testing info)
(f) (wait for vsync)
(g) WR render (triggered via GenerateFrame transaction in WRBP::CompositeToTarget)
(h) composite and fire MozAfterPaint
With my patches, step (e) changes so that instead of doing a full render we just rebuild the hit-testing info. I verified that this does take an order of magnitude less time at step (e). However, it also *increases* the time spent in step (g), presumably because that render has to do more work and can't just reuse the results from (e). So while the amount of work done *before* the vsync is reduced, that doesn't affect the end result, but the small increase in work done *after* the vsync has a disproportionate impact on the test result.
I verified that turning on ASAP mode for this test locally removes the regression, but I'll do a try push to confirm that since I don't know how far I can trust my local results.
Comment 8•6 years ago
|
||
On tryserver there's still a small regression with the test in ASAP mode when comparing with my patches [1] and without [2]. Not sure why yet.
[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=52fc13269c5d050d325526f649590d063957eaaf
[2] https://treeherder.mozilla.org/#/jobs?repo=try&revision=57055a302cb756ff5935ae80b7a6d46a2c5ede72
Comment 9•6 years ago
|
||
assuming this is high priority since kats is working on it
Comment 11•6 years ago
|
||
(flipping bits, on the assumption this WR nightly-only for the time being)
status-firefox63:
--- → affected
Comment 12•6 years ago
|
||
I rebased my patches for servo/webrender#2807 and kicked off another try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=ca9520ec0b1ea14f97f7260d58ff5ccfd7860200. I'll see if there's still talos regressions from this.
Comment 13•6 years ago
|
||
It still hurts tresize on windows.
Comment 14•6 years ago
|
||
Kat is on leave for 3 more weeks. Un-assigning so we know it needs a new owner.
Assignee: kats → nobody
Updated•6 years ago
|
Priority: P2 → P3
Updated•6 years ago
|
Priority: P3 → P2
Comment 15•6 years ago
|
||
Kats, is there anything we want to still do with this bug?
Flags: needinfo?(kats)
Comment 16•6 years ago
|
||
Not raelly, I think we can close it. We have existing bugs tracking the talos regressions compared to non-WR. And the patches I was working on earlier were obsoleted by stuff nical landed which accomplished the same thing, I believe.
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: needinfo?(kats)
Resolution: --- → WONTFIX
You need to log in
before you can comment on or make changes to this bug.
Description
•