Closed Bug 1281105 Opened 4 years ago Closed 4 years ago
.9 - 25.13% sessionrestore / sessionrestore_no_auto_restore / tp5o Main_RSS / tresize / ts_paint (linux64) regression on push b3930a21b6ed (Wed Jun 8 2016)
Talos has detected a Firefox performance regression from push b3930a21b6ed:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=e9a46afd9375bf14dbbef5fb775473745d26a12b&tochange=b3930a21b6edbf8e3138ca739f1b7ead13ecf703

As author of one of the patches included in that push, we need your help to address this regression.

This is a list of all known regressions and improvements related to the push:
https://treeherder.mozilla.org/perf.html#/alerts?id=1494

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see:
https://wiki.mozilla.org/Buildbot/Talos/Tests#ts_paint
https://wiki.mozilla.org/Buildbot/Talos/Tests#sessionrestore.2Fsessionrestore_no_auto_restore
https://wiki.mozilla.org/Buildbot/Talos/Tests#tresize
https://wiki.mozilla.org/Buildbot/Talos/Tests#tp5
https://wiki.mozilla.org/Buildbot/Talos/Tests#tsvg-opacity

Reproducing and debugging the regression:

If you would like to re-run this Talos test on a potential fix, use try with the following syntax:
    try: -b o -p linux64 -u none -t other,other-e10s,chromez,chromez-e10s,tp5o,tp5o-e10s,svgr --rebuild 5  # add "mozharness: --spsProfile" to generate profile data
(we suggest --rebuild 5 to be more confident in the results)

To run the test locally and do a more in-depth investigation, first set up a local Talos environment:
https://wiki.mozilla.org/Buildbot/Talos/Running#Running_locally_-_Source_Code

Then run the following command from the directory where you set up Talos:
    talos --develop -e [path]/firefox -a ts_paint:sessionrestore:sessionrestore_no_auto_restore:tresize:tp5o:tsvgr_opacity
(add --e10s to run tests in e10s mode)

Making a decision:

As the patch author we need your feedback to help us handle this regression.
*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! *** Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
I did a lot of retriggers here and I see these changes compared to the previous revision:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=e9a46afd9375bf14dbbef5fb775473745d26a12b&newProject=mozilla-inbound&newRevision=b3930a21b6edbf8e3138ca739f1b7ead13ecf703&framework=1&showOnlyConfident=1

Those regressions line up with what is reported in the alert detection! The largest regression is on session restore; possibly there is something we can do there?

:acomminos, can you take a look at this? Given the fact that we are filing this 1.5 weeks later (yay for work weeks), I would say this falls outside of the backout/urgent category. It would be nice to understand this and try to resolve it before Firefox 50 hits aurora.
Component: Untriaged → Graphics
Product: Firefox → Core
The sessionrestore regression isn't particularly surprising, considering that we block on the main thread during startup in order to initialize the GLX context. I'd like to keep things this way, as it provides a safe way to fall back to software vsync if initialization fails.

I'll look into the other regressions some more.
(In reply to Andrew Comminos [:acomminos] from comment #2)
> The sessionrestore regression isn't particularly surprising, considering
> that we block on the main thread during startup in order to initialize the
> GLX context. I'd like to keep things this way, as it provides a safe way to
> fallback to software vsync if initialization fails.
>
> I'll look into the other regressions some more.

A >10% regression on sessionrestore is basically a must-fix. How much more work would having the hardware vsync come online lazily be? I'd be happy to sit down on Vidyo and help with this.
Severity: normal → major
Whiteboard: [talos_regression] → [talos_regression] gfx-noted
[Tracking Requested - why for this release]: Comment 1 suggests we should be tracking this.
Review commit: https://reviewboard.mozilla.org/r/60070/diff/#index_header See other reviews: https://reviewboard.mozilla.org/r/60070/
jgilbert and I agreed that it only makes sense to block the main thread to initialize a GLX context on startup when using the GL compositor. Let's do that for now, and potentially look to implementing thread-safe error handlers on X11 in the future for lazy initialization.
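A minimal sketch of the decision described above, for readers following along. This is not the actual Gecko code: every name here (CompositorBackend, VsyncSource, ChooseVsyncSource, and the glxInit callback standing in for the blocking GLX context setup) is hypothetical and chosen only for illustration.

```cpp
#include <functional>

// Hypothetical types; the real Gecko names and structure differ.
enum class CompositorBackend { Basic, OpenGL };
enum class VsyncSource { Software, HardwareGLX };

// glxInit is an assumed stand-in for the blocking GLX context
// initialization; it returns true on success.
VsyncSource ChooseVsyncSource(CompositorBackend backend,
                              const std::function<bool()>& glxInit) {
  // Basic compositor: skip GLX entirely so startup never blocks on it.
  if (backend != CompositorBackend::OpenGL) {
    return VsyncSource::Software;
  }
  // GL compositor: initialize synchronously, and fall back to software
  // vsync if initialization fails.
  return glxInit() ? VsyncSource::HardwareGLX : VsyncSource::Software;
}
```

The point of the sketch is that the initializer is never invoked for the basic compositor, so the startup cost is only paid where hardware vsync can actually be used.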
https://reviewboard.mozilla.org/r/60068/#review57280 This looks good to me, but jrmuizel should sign off that this does what I expect. MozReview doesn't appear to let me reassign review, so I'll file a bug for that.
Comment on attachment 8764012 [details] Bug 1281105 - Disable GLX vsync when using the basic compositor. https://reviewboard.mozilla.org/r/60070/#review57282 This seems good to me, but jrmuizel should review it.
Attachment #8764012 - Flags: review?(jmuizelaar) → review+
Comment on attachment 8764012 [details] Bug 1281105 - Disable GLX vsync when using the basic compositor. https://reviewboard.mozilla.org/r/60070/#review57700
Pushed by email@example.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/fcb19dc55671 Disable GLX vsync when using the basic compositor. r=jrmuizel
Excellent, this fix has shown performance improvements comparable to the originally reported regressions: https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=7876627840cb368f324b00086a5afc3281648e75&newProject=mozilla-inbound&newRevision=87bbd3f58b4b153facdd1a2995105ae28cfbcd21&framework=1&showOnlyImportant=0&showOnlyConfident=1 Thanks!