Closed Bug 1289321 Opened 4 years ago Closed 4 years ago
.93 - 27 .4% cart (windowsxp) regression on push 2e6d8283458c202a196842653838ed83384e3cbf (Mon Jul 25 2016)
Talos has detected a Firefox performance regression from push 2e6d8283458c202a196842653838ed83384e3cbf. As author of one of the patches included in that push, we need your help to address this regression. Summary of tests that regressed: cart summary windowsxp opt - 13.93% worse cart summary windowsxp pgo - 27.4% worse You can find links to graphs and comparsion views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=2051 On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format. To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Buildbot/Talos/Tests For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Buildbot/Talos/Running *** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! *** Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
Here is the zooming better view: https://treeherder.mozilla.org/perf.html#/graphs?series=%5Bmozilla-inbound,60b82c205cac462d8ada5801ecb26d1e1fbf38d2,1,1%5D&series=%5Bmozilla-inbound,898b13577b15bcc31f2ea1bccc3ba5ed4533db14,1,1%5D&zoom=1469332370271.628,1469490988266.0466,30,49.850187265917604&selected=%5Bmozilla-inbound,60b82c205cac462d8ada5801ecb26d1e1fbf38d2,34799,32591365,1%5D This issue might be caused by 2e6d8283458c202a196842653838ed83384e3cbf. https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=93d4bd8a4ad33d4202c8705e1a4e4fe716843614&tochange=2e6d8283458c202a196842653838ed83384e3cbf Hi Matt, as you are the patch author, can you take a look at this and determine what is the root cause? Thanks!
I'll look into this, I don't think we need to rush into backing out.
Assignee: nobody → matt.woodrow
thanks Matt, as a note, we will be merging trunk -> Aurora on Monday, so any fixes landed after Monday should be considered for Aurora as well.
This regression patch 2e6d8283458c has been merged into Aurora, so please make sure to uplift any fixes to Aurora, thank you. https://treeherder.mozilla.org/perf.html#/alerts?id=2185 cart summary windowsxp pgo - 25.12% worse cart summary windowsxp pgo e10s - 1.72% worse
Matt, it has been a few weeks, do you have an update for this bug? It is a large regression for the CART test.
Just based off of the >15% regression, I consider this blocking Aurora -> Beta50 transition.
Hello Jet, I pinged you on IRC about this bug. Joel's team considers this a pretty severe pref regression. Is there a fix in the works that addresses this? Please let me know.
I'm inclined to back out the fix for bug 1217803 since it doesn't seem to be affecting a public site. The regression reported here looks pretty bad.
I'm currently waiting on try results for a possible fix. I'll make sure this gets fixed or backed out before we merge to beta.
(In reply to Matt Woodrow (:mattwoodrow) from comment #9) > I'm currently waiting on try results for a possible fix. I'll make sure this > gets fixed or backed out before we merge to beta. Thanks Jet and Matt for investigating this perf issue. Instead of waiting for merge day (9/12), I think we should make a final decision by 9/5. Perf team plans to do another round of perf testing during that week I think. So, we should address this regression before that (by way of a fix or backout). Hi Joel, when do you plan to do another round of perf runs on Aurora50 before it's merged to Beta? Does a fix/backout by 9/5 work for you?
Yep, having a decision date of 9/5 sounds great to me. I've had a try push running going for 48 hours at this point, fingers crossed that it completes eventually and gives positive results.
yes, September 5th is a good date to have this resolved. Matt, I did some retriggers and priority bumping- you should have data.
Unfortunately that didn't help at all. I tried avoiding painting extra mask layers when they were on leaves, since we can be sure that those won't be animated asynchronously (async animations are only on ContainerLayers) and when scrolled they move with the mask. It looks like the mask layers that are slowing us down are on ContainerLayers, where it's very hard to determine if they can be affected by async animations. Given that this only affected WinXP (and not on e10s), then I'm tempted to just accept this. What do you think Markus?
Yeah, this seems fine to accept, but it would be nice to at least have a layer tree dump so that we can be sure what's going on.
Hi Matt, we are at that point where we need to make a decision regarding this perf regression. Should we consider backing out this patch?
I think we should leave it in, it's a valuable correctness fix.
please resolve this as wontfix :)
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WONTFIX
(In reply to Matt Woodrow (:mattwoodrow) from comment #13) > > Given that this only affected WinXP (and not on e10s), then I'm tempted to > just accept this. > Given this impact (winxp only and non-e10s only), I am ok with a wontfix and untagging this as a blocker for Fx50.
You need to log in before you can comment on or make changes to this bug.