Closed
Bug 1361889
Opened 7 years ago
Closed 7 years ago
13.62 - 14.72% sessionrestore / sessionrestore_no_auto_restore (osx-10-10) regression on push 51ed3d07846d9a133b901a99c4430feabe33cd8a (Tue May 2 2017)
Categories
(Firefox :: Session Restore, defect)
Tracking
()
RESOLVED
INVALID
People
(Reporter: jmaher, Unassigned)
References
Details
(Keywords: perf, regression, talos-regression)
Talos has detected a Firefox performance regression from push 51ed3d07846d9a133b901a99c4430feabe33cd8a. As author of one of the patches included in that push, we need your help to address this regression. Regressions: 15% sessionrestore osx-10-10 opt e10s 894.75 -> 1026.42 14% sessionrestore_no_auto_restore osx-10-10 opt e10s930.75 -> 1057.5 You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=6328 On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format. To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Buildbot/Talos/Tests For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Buildbot/Talos/Running *** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! *** Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
Reporter | ||
Comment 1•7 years ago
|
||
this regression happened when talos was broken for many commits: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=5c15d347dabbd647954d5266606420e15fbf6823&tochange=51ed3d07846d9a133b901a99c4430feabe33cd8a so this will require some bisecting where we either: 1) update to the original source and backout the offending patch 2) backout patch by patch this will be a series of 10 try pushes (ignoring 1 servo merge, 1 servo test expectations in bug 1341102, eslint bug 1359540, root cause bug 1352204+backout, and bug 1357116+backout). This means these bugs are suspect: Bug 1359655 Bug 1361417 Bug 1360597 Bug 1360863 Bug 1352305 Bug 1360299 Bug 1348278 Bug 1354079 Bug 1361072 if patch authors of the above bugs could help determine if they caused what appears to be osx only startup regression that would help. Ionut, can you do these try pushes with this syntax: |mach try -b o -p macosx64 -u none -t other-e10s --rebuild 5|
Comment 2•7 years ago
|
||
jmaher: you should ?needinfo people if you need info from them, as many people get too much bugmail to read all of it. You got lucky that I noticed. ;) Bug 1360863 (re-)adds a string, so I don't see how that could cause a talos regression.
Joel, you listed bug 1361417 as a potential cause, this is a bugzilla.mozilla.org administrative change to a component name. Are there hard coded dependencies in Talos on component name? If so, this could become a problem because we're trying to clean up a lot of components, naming schemes, and hierarchies this quarter.
Reporter | ||
Comment 4•7 years ago
|
||
I listed all bugs that showed up in the range where we had no data due to failed tests- that looks like an obvious candidate to skip. Thanks for the info!
No longer blocks: 1361417
Bug 1348278 adds some behavior to mousedown to speed up pageload latency. The chances of it affecting session restore are somewhere between slim and none.
Comment 6•7 years ago
|
||
Bug 1352305 changes behavior of sizing dialog windows when 'privacy.resistFingerprinting' is true. There is no chance for causing this problem since 'privacy.resistFingerprinting' is off by default.
Comment 7•7 years ago
|
||
Bug 1361072 is a simple cleanup of existing logic of AccessibleCaret, and is enabled only on platforms supporting touch events. It's unlikely to be the cause of this regression.
No longer blocks: 1361072
Comment 8•7 years ago
|
||
(In reply to Joel Maher ( :jmaher) from comment #1) > this regression happened when talos was broken for many commits: > https://hg.mozilla.org/integration/autoland/ > pushloghtml?fromchange=5c15d347dabbd647954d5266606420e15fbf6823&tochange=51ed > 3d07846d9a133b901a99c4430feabe33cd8a How confident are you that this range is correct? It looks like the regression is gone again. I think it likely was fallout from bug 1054740, which has been backed out for other reasons, although it's weird that it would affect sessionrestore_no_auto_restore, since that bug and its patch are about actually restoring a session.
Comment 9•7 years ago
|
||
(In reply to Joel Maher ( :jmaher) from comment #1) > Ionut, can you do these try pushes with this syntax: |mach try -b o -p > macosx64 -u none -t other-e10s --rebuild 5| Yes, will do that.
Flags: needinfo?(ionut.goldan)
Comment 10•7 years ago
|
||
I do not believe Bug 1360299 is the cause; it applies to a non-default compilation flag.
Reporter | ||
Comment 11•7 years ago
|
||
original list: Bug 1359655 Bug 1361417 Bug 1360597 Bug 1360863 Bug 1352305 Bug 1360299 Bug 1348278 Bug 1354079 Bug 1361072 not suspect based on authors comments: Bug 1360299 Bug 1361072 Bug 1352305 Bug 1348278 Bug 1361417 Bug 1360863 That leaves 3 bugs as suspect: Bug 1359655 Bug 1360597 Bug 1354079 An interesting question if this is the right range. I did look over the data again, and I see: https://treeherder.mozilla.org/perf.html#/graphs?series=%5Bmozilla-inbound,c22155824871783bb5d8f514b3a01b4c885034c0,1,1%5D&series=%5Bautoland,c22155824871783bb5d8f514b3a01b4c885034c0,1,1%5D&zoom=1493714834615.245,1493902105000,860.545218655611,1118.1660364994773 This is the range that caused the regression, but nothing in here fixed the problem.
Reporter | ||
Comment 12•7 years ago
|
||
:mkelly, can you comment on the possibility of the code from bug 1359655 causing a session restore regression on osx?
Flags: needinfo?(mkelly)
Reporter | ||
Comment 13•7 years ago
|
||
:dao, can you comment on the possibility of the code from bug 1360597 causing a session restore regression on osx?
Flags: needinfo?(dao+bmo)
Reporter | ||
Comment 14•7 years ago
|
||
:gijs, can you comment on the possibility of the code from bug 1354079 causing a session restore regression on osx?
Flags: needinfo?(gijskruitbosch+bugs)
Comment 15•7 years ago
|
||
(In reply to Joel Maher ( :jmaher) from comment #14) > :gijs, can you comment on the possibility of the code from bug 1354079 > causing a session restore regression on osx? Anything is possible, but it seems unlikely. Ditto for the other changes here. Dão is right though, the regression seems fixed based on the last points on the graph you linked.
Flags: needinfo?(gijskruitbosch+bugs)
Updated•7 years ago
|
Flags: needinfo?(dao+bmo)
Reporter | ||
Comment 16•7 years ago
|
||
while it is fixed, the fix is clearly bug 1353571, and not related to anything in this list. So there is a lingering regression to figure out... :) I am still not sure about bug 1054740 which Dao mentioned, in that case it landed and was backed out in a couple hours difference, this regression was May 2 and fixed May 4 (really 32 hours difference)
Comment 17•7 years ago
|
||
(In reply to Joel Maher ( :jmaher) from comment #16) > while it is fixed, the fix is clearly bug 1353571, and not related to > anything in this list. So there is a lingering regression to figure out... > :) I am still not sure about bug 1054740 which Dao mentioned, in that case > it landed and was backed out in a couple hours difference, this regression > was May 2 and fixed May 4 (really 32 hours difference) Then I don't know. If we're down to those 3 bugs, can someone do trypushes backing each of them out and seeing which reverts the perf regression, or something? Did the trypushes from comment #1 already happen? I'd rather we measure and know, than just guess which patches are/aren't responsible here.
Comment 18•7 years ago
|
||
(In reply to Joel Maher ( :jmaher) from comment #16) > I am still not sure about bug 1054740 which Dao mentioned, in that case > it landed and was backed out in a couple hours difference, this regression > was May 2 and fixed May 4 (really 32 hours difference) Hmm, okay. In that case I'm going to attempt to re-land this. As for bug 1360597, that code may be in the startup path, but nothing expensive is (or at least should be) happening there.
Comment 19•7 years ago
|
||
(In reply to Joel Maher ( :jmaher) from comment #12) > :mkelly, can you comment on the possibility of the code from bug 1359655 > causing a session restore regression on osx? I'll echo comment 18: The SHIELD sync adds code to the startup path, although our startup pretty quickly jumps to some async code, which I think(?) makes add-on startup move on, and our startup isn't terribly expensive anyway. To see if it affects anything, I pushed a backout of the SHIELD changes to try with the command from comment 1: https://treeherder.mozilla.org/#/jobs?repo=try&revision=fd9a18ca7ad0ef9d90ac8dafdc881908dce5d038
Flags: needinfo?(mkelly)
Comment 20•7 years ago
|
||
With jmaher's help, I did 6 local Talos runs: 3 with mozilla-central as it is, and 3 with the Normandy sync backed out: https://gist.github.com/Osmose/e5ee0a0d55e2622494eab7d7ffc5764a > <jmaher> I only question without_normandy1.json, all the values seem higher > <jmaher> but it could be random stuff > <Osmose> That was my first run after a rebuild > <jmaher> ok > <jmaher> so I see 1238 vs 1268, really <3% difference
Updated•7 years ago
|
Component: Untriaged → Session Restore
Reporter | ||
Comment 21•7 years ago
|
||
Ionut did a lot of try runs: https://treeherder.mozilla.org/#/jobs?repo=try&author=ionut.goldan@softvision.ro&filter-searchStr=talos%20other&selectedJob=96697223&group_state=expanded none of them showed the culprit, so this is not really solvable.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → INVALID
You need to log in
before you can comment on or make changes to this bug.
Description
•