Closed Bug 1165351 Opened 9 years ago Closed 9 years ago

2.5-32% Win7 svg-asap/svg-row-opacity/tart/cart/ts_paint/paint/tresize/tp5o/session_noautorestore regression on Mozilla-Inbound-Non-PGO (v.41) on May 14, 2015 from push fde31eaa2638

Categories: Testing :: Talos, defect
Importance: Not set, normal
Tracking: Not tracked
Status: RESOLVED FIXED
People: Reporter: vaibhav1994; Assignee: Unassigned

Talos has detected a Firefox performance regression.  We need you to address this regression.

This is a list of all known regressions and improvements related to your bug:
http://alertmanager.allizom.org:8080/alerts.html?rev=fde31eaa2638&showAll=1&testIndex=0&platIndex=0

On the page above you can see the Talos alert for each affected platform, as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test, please see: https://wiki.mozilla.org/Buildbot/Talos/Tests#tp5

Reproducing and debugging the regression:
If you would like to re-run this Talos test on a potential fix, use try with the following syntax:
try: -b o -p win32 -u none -t tp5o  # add "mozharness: --spsProfile" to generate profile data
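
For reference, a minimal sketch of the push itself, assuming the classic workflow where the try syntax goes into the message of the topmost commit being pushed (exact steps vary by local setup; the amend/push commands below are illustrative):

hg commit --amend -m "Bug NNNNNN - potential fix. try: -b o -p win32 -u none -t tp5o"
hg push -f ssh://hg.mozilla.org/try   # or "hg push -f try" if a try path is configured in .hgrc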

To run the test locally and do a more in-depth investigation, first set up a local Talos environment:
https://wiki.mozilla.org/Buildbot/Talos/Running#Running_locally_-_Source_Code

Then run the following command from the directory where you set up Talos:
talos --develop -e <path>/firefox -a tp5o
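
As a rough sketch of that local workflow (assuming the virtualenv-based setup described on the wiki page above; paths and exact steps may differ from the current instructions):

hg clone http://hg.mozilla.org/build/talos        # fetch the Talos harness
cd talos
virtualenv venv && . venv/bin/activate            # keep the Python dependencies isolated
python setup.py develop                           # install talos into the virtualenv
talos --develop -e /path/to/firefox -a tp5o       # point -e at the binary of the build under test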

Making a decision:
As the patch author, we need your feedback to help us handle this regression.
*** Please let us know your plans by Tuesday, or the offending patch will be backed out! ***

Our wiki page outlines the common responses and expectations:
https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
These are the revisions coalesced in fde31eaa2638: http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=fde31eaa2638

Out of these, Daniel Holbert's and Ehsan's patches are very small and unlikely to cause these regressions; most likely, one of the others is the root cause. Michal, Mason, Markus, can you see which revision is the likely root cause of this regression? Thanks!
Flags: needinfo?(mstange)
Flags: needinfo?(michal.novotny)
Flags: needinfo?(mchang)
FYI: normally we would have done some retriggers to gain more data points, but these are not working due to closed trees and hg/taskcluster issues.
Blocks: 1165349
It's unclear to me which patch could be responsible here. I think we need a few try pushes to figure this out, and since try is closed, that might take a while.

Also, try talos profiling is broken at the moment, in case somebody was going to try that (bug 1165361).
Flags: needinfo?(mstange)
If this was caused by Mason's box-shadow patch (bug 1155828), the next thing to test would be whether bug 1162824 fixes the regression.
I would be surprised if it's the box-shadow patch. We made it ~10x faster, but maybe we get different results on Windows 7. Can we wait until the trees open up again to get more data points?
Flags: needinfo?(mchang)
I don't think this can be caused by bug 1163900. The actual change was made in bug 1156493; this bug just fixes a crash caused by bug 1156493 and in fact reverts some of its changes.
Flags: needinfo?(michal.novotny)
It looks like it was the box-shadow change. I checked the tart numbers on the treeherder pages manually, because the perfherder links just showed me empty pages.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=46b424bb13d7 has tart numbers ~9
https://treeherder.mozilla.org/#/jobs?repo=try&revision=3c1e5e3d999c has tart numbers ~6 and is the first of the try pushes that had http://hg.mozilla.org/integration/mozilla-inbound/rev/fde31eaa2638 backed out.
Blocks: 1155828
(In reply to Markus Stange [:mstange] from comment #5)
> If this was caused by Mason's box-shadow patch (bug 1155828), the next thing
> to test would be whether bug 1162824 fixes the regression.

This has landed in the meantime, but it has not fixed the regression. https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=d8a584dddc0c and https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=d8a584dddc0c have Win7 tart numbers of around 8.7.
:mchang, any thoughts on how to proceed here? It looks like this is the main cause of many regressions, although only a few were actually tested on try server.

This would be the accurate compare-perf view:
https://treeherder.mozilla.org/perf.html#/compare?originalRevision=46b424bb13d7&originalProject=try&newProject=try&newRevision=bb297c86f841

If requested, I could easily schedule the other talos jobs for both try pushes there so we could get a complete picture. Keep in mind that a lot of green shows up there as improvements: the old revision is the one with the existing regression, and the new revision reflects the tree prior to the regression.
Flags: needinfo?(mchang)
(In reply to Joel Maher (:jmaher) from comment #18)
> :mchang, any thought on how to proceed here?  It looks like this is the main
> cause of many regressions although only a few were actually tested on try
> server.
> 
> This would be the accurate compare-perf view:
> https://treeherder.mozilla.org/perf.html#/
> compare?originalRevision=46b424bb13d7&originalProject=try&newProject=try&newR
> evision=bb297c86f841
> 
> If requested, I could easily schedule the other talos jobs for both try
> pushes there so we could get a complete picture.  Keep in mind that a lot of
> green is showing up there as improvements- that would be the old revision as
> the existing regression and the new revision as the tree prior to the
> regression.

Given the number of regressions, I backed out both bug 1155828 and bug 1162824. Can we please resolve this bug once we confirm the talos regressions are gone? Thanks!
Flags: needinfo?(mchang)
I verified the regressions were fixed as a result of the backout.  Thanks for taking care of this.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Sorry for just getting around to this. From comment 18, it looks like we actually made a lot of things better? Or is higher better? Thanks!
Flags: needinfo?(jmaher)
For all the mentioned regressions, lower is better. Looking at the graphs again as a sanity check, I see we were higher, then lower as a result of the backout. There is a good chance I flipped the try server revisions in the compare view, so old/new shows as new/old or vice versa.
Flags: needinfo?(jmaher)