Closed Bug 1247100 Opened 4 years ago Closed 4 years ago
.08% sessionrestore (linux64) regression on push cb036027df84 (Mon Feb 8 2016)
Talos has detected a Firefox performance regression from push cb036027df84. As author of one of the patches included in that push, we need your help to address this regression.

This is a list of all known regressions and improvements related to the push:
https://treeherder.allizom.org/perf.html#/alerts?id=52

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see:
https://wiki.mozilla.org/Buildbot/Talos/Tests#sessionrestore.2Fsessionrestore_no_auto_restore

Reproducing and debugging the regression:

If you would like to re-run this Talos test on a potential fix, use try with the following syntax:

try: -b o -p linux64 -u none -t other --rebuild 5 # add "mozharness: --spsProfile" to generate profile data

(We suggest --rebuild 5 to be more confident in the results.)

To run the test locally and do a more in-depth investigation, first set up a local Talos environment:
https://wiki.mozilla.org/Buildbot/Talos/Running#Running_locally_-_Source_Code

Then run the following command from the directory where you set up Talos:

talos --develop -e <path>/firefox -a sessionrestore

(Add --e10s to run tests in e10s mode.)

Making a decision:

As the patch author, we need your feedback to help us handle this regression.

*** Please let us know your plans by Friday, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations:
https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
I was borderline on even reporting this, since it seems to be linux64 non-e10s only (which I assume isn't a configuration we'll be actively supporting much longer?), but I figured I should file it at least to acknowledge what happened. While its importance may be dubious, it's unmistakably a regression; see this comparison view:

https://treeherder.allizom.org/perf.html#/compare?originalProject=fx-team&originalRevision=c882b4072883&newProject=fx-team&newRevision=cb036027df84&framework=1

mconley: Can you give this a quick gander? Feel free to resolve as wontfix if you think it's not relevant.
Looking. I'll get some profiles.
Here are some profiles to look at:

Baseline: http://people.mozilla.org/~bgirard/cleopatra/?zippedProfile=http://mozilla-releng-blobs.s3.amazonaws.com/blobs/Try/sha512/67debc943a55a36955bd094233c86c8f6d2b998cf1e03a25ca71d2f6e7abd0ead23e96cd57fda2106bcb4b2c3d9cf30af9523f293930176510952cb97dde25e7&pathInZip=profile_sessionrestore/startup/cycle_9.sps#report=58520bc1e354c234a75fba114f2b62fde2848537&invertCallback=true&filter=%5B%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A1897,%22end%22%3A4259%7D%5D&selection=%22(total)%22,42

Regression: http://people.mozilla.org/~bgirard/cleopatra/?zippedProfile=http://mozilla-releng-blobs.s3.amazonaws.com/blobs/Try/sha512/01326357b4468bfa96a9e16dafd0cfff6f5863885d9fbf2fbe0703aa26c88af620fa8f1eb16c9fe0a41a38920538e5eb06b7151b78a3fd67d38dc606c4f781d5&pathInZip=profile_sessionrestore/startup/cycle_9.sps#report=b292d8eb495cd43b64d47a08f02695ce9983047a&invertCallback=true&filter=%5B%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A1934,%22end%22%3A4392%7D%5D&selection=%22(total)%22,935

I think I'm seeing roughly 80ms spent doing some GC stuff, and the time to load tab-content.js appears to have increased by about 30ms, so I think that's where the regression is coming from. I might be able to bring the tab-content loading down some. The GC... well, we'll see.
I was able to claw back 4.89% by moving the RefreshBlocker into a JSM:

https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=3b75af5d3d60&newProject=try&newRevision=c4222aa717d5&framework=1

It's busted for non-e10s, but I think I can fix that.
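For context, the JSM technique is roughly the following sketch (file paths, symbol names, and the module body are assumed for illustration; the actual RefreshBlocker patch differs). The idea is that a JSM is compiled and evaluated once per process, whereas code inlined in a frame script like tab-content.js is compiled for every <browser> that loads it, so hoisting the logic into a lazily imported module cuts per-frame-script load time:

// RefreshBlocker.jsm (hypothetical shape and resource:// path).
var EXPORTED_SYMBOLS = ["RefreshBlocker"];

var RefreshBlocker = {
  init(docShell) {
    // ... attach a progress listener and intercept refresh attempts ...
  },
};

// tab-content.js: lazily import the module so it isn't even loaded
// until the first time RefreshBlocker is actually touched.
Components.utils.import("resource://gre/modules/XPCOMUtils.jsm");
XPCOMUtils.defineLazyModuleGetter(this, "RefreshBlocker",
  "resource:///modules/RefreshBlocker.jsm");

This fragment only runs in Firefox's chrome environment, so treat it as a sketch of the pattern rather than a drop-in patch.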
Strangely, the back-out of bug 1221144 for bug 1246396 seems to have put us back to where we were... and with that backout landed, the JSM technique I was trying on the 10th in comment 3 no longer has any effect.

I suspect my patch exacerbated a JS problem that was introduced when bug 1221144 landed, and when that was backed out, the regression went with it.

Here's where we're currently at:

https://treeherder.mozilla.org/perf.html#/graphs?timerange=5184000&series=[mozilla-inbound,d8bde8adc167a1143cf99d33f679eb734f3dedc7,1]&series=[fx-team,d8bde8adc167a1143cf99d33f679eb734f3dedc7,1]&highlightedRevisions=dd847049b535&zoom=1454304889313.5518,1455904437303.8086,2016.8115588201993,2364.637645776721

jmaher, do you agree with my assessment?
Flags: needinfo?(mconley) → needinfo?(jmaher)
Yes, this seems valid. This is the second time in 2016 that we've had multiple patches playing off each other. I'm just glad we have this resolved :)
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 47
Version: unspecified → Trunk