Closed Bug 1229695 Opened 6 years ago Closed 6 years ago

Investigate perf impact of bug 1221050 and bug 1222490

Categories

(Firefox Graveyard :: Panorama, defect)

defect
Not set
normal

Tracking

(firefox45 fixed)

RESOLVED FIXED
Firefox 45
Tracking Status
firefox45 --- fixed

People

(Reporter: Gijs, Assigned: Gijs)

References

Details

(Keywords: perf, regression, Whiteboard: [talos_regression])

I woke up to a number of regression alerts. So far I've seen alerts for:

* non-PGO
** windows
*** Sessionrestore no auto restore: e10s + non-e10s: 2-3.5% across the board;
*** Ts, Paint: 2.4% on win7 and win8;
*** TART: 5.2% on win8 only;

Nothing on PGO or on non-windows.

This is surprising, considering these patches mostly removed about 25kloc of code... I was expecting the opposite, perf-wise, really... :-)

I have theories about what's happening, but I prefer evidence. Going to do some trypushes with selective backouts to see if I can make sense of this.

Note that because of the UI migration and because the patches touched , backing this out is going to not be a great idea, so I essentially have 2 weeks to fix this before this hits aurora. :-\
(In reply to :Gijs Kruitbosch from comment #0)
> Note that because of the UI migration and because the patches touched
... so many files.
Bad / good news: commenting out the migration doesn't seem to make a difference locally, which means it's likely something else, which is going to be harder to figure out...
Blocks: 1220148
Keywords: regression
Whiteboard: [talos_regression]
Baseline with all the patches:
 https://treeherder.mozilla.org/#/jobs?repo=try&revision=67d8cbefa419

Undo string and misc other removal:
 https://treeherder.mozilla.org/#/jobs?repo=try&revision=905382315c8a

Undo CSS and test removal as well:
 https://treeherder.mozilla.org/#/jobs?repo=try&revision=517d4572bd36

Undo everything except the main removal of components/tabview (so keep e.g. the toolbar button):
 https://treeherder.mozilla.org/#/jobs?repo=try&revision=c2597e50e567

Undo the entirety of the removal:
 https://treeherder.mozilla.org/#/jobs?repo=try&revision=cb6bd6d530a1
a compare view:
https://treeherder.allizom.org/perf.html#/compare?originalProject=fx-team&originalRevision=eedcfc7cefd3&newProject=fx-team&newRevision=fbb5323919c3&showOnlyConfident=1;

this is quite misleading as there is bi-modal data and many of the notifications on there are false.  Looking at the graphs for each of these, the sessionrestore* and ts_paint are the ones to focus on, this matches the alerts.
So at least on my local machine I can fix everything by just removing the stale reference to browser-tabview.js in global-scripts.inc, which I missed when rebasing bug 1222490.

TBH, I think it's probably a bug that a non-existing file impacts these talos tests so much, but there we go. Landed that because getting review for it didn't seem to serve any purpose. Going to mark this as leave-open so we can close this once some talos data comes in that indicates perf has gone back to normal.
Keywords: leave-open
hey, this seems to have fixed the problem!  Great stuff Gijs!
(In reply to Joel Maher (:jmaher) from comment #7)
> hey, this seems to have fixed the problem!  Great stuff Gijs!

Did it? I hate digging my own hole here, but the emails I got were only for session restore, and the "new" times seemed to still be slightly worse than they were before the original patches for panorama's removal landed. The net difference is much smaller, and probably below our normal threshold for investigation. It could be that some of the other code changes (e.g. the browser.js refactoring) contributed a smaller regression as well that is muddying the 12-run averages that the emails are using.

In any case, what I'm more worried about is that I haven't had emails about ts, paint and tart yet. Has that also gone back to normal?
Flags: needinfo?(jmaher)
(In reply to :Gijs Kruitbosch from comment #8)
> (In reply to Joel Maher (:jmaher) from comment #7)
> > hey, this seems to have fixed the problem!  Great stuff Gijs!
> 
> Did it? I hate digging my own hole here, but the emails I got were only for
> session restore, and the "new" times seemed to still be slightly worse than
> they were before the original patches for panorama's removal landed. The net
> difference is much smaller, and probably below our normal threshold for
> investigation. It could be that some of the other code changes (e.g. the
> browser.js refactoring) contributed a smaller regression as well that is
> muddying the 12-run averages that the emails are using.
> 
> In any case, what I'm more worried about is that I haven't had emails about
> ts, paint and tart yet. Has that also gone back to normal?

OK, I now got an email for one of the ts,paint changes returning to normal. However, TART from the graphs looks like it's still out. However however, if you look at this graph:

http://graphs.mozilla.org/graph.html#tests=[[293,132,31]]&sel=1448943729331.1067,1448962023993.0286,4.8441963876996725,5.634374959128244&displayrange=7&datatype=geo

then actually, the first high is cset ade2734419a41fc0d881c34fa255b5c828c4b873 which is after the panorama changes, which scored "normally" for my remove-panorama push. TART seems to be fairly stable and the increase fairly big, so it seems unlikely this is a fluke.

The pushlog for the increase is https://hg.mozilla.org/integration/fx-team/pushloghtml?fromchange=fbb5323919c3b2aaddc66d121a39a26822fe5d4d&tochange=ade2734419a41fc0d881c34fa255b5c828c4b873 and that looks like another issue that maybe warrants investigating, judging by the general trend of the results: http://graphs.mozilla.org/graph.html#tests=[[293,132,31]]&sel=1448923899614.2922,1449135819826.1191,3.967741935483871,5.806451612903226&displayrange=7&datatype=geo

I expect it got regressed by something in the merge from m-c there. That probably warrants another bug. Leaving ni on jmaher for that.

But based on that, it does seem like we can mark this particular bug fixed, at least.
Status: NEW → RESOLVED
Closed: 6 years ago
Keywords: leave-open
Resolution: --- → FIXED
looking at all the graphs, I see this regression getting fixed, I say we are all good. Thanks Gijs!
Flags: needinfo?(jmaher)
the tart stuff is from the m-i merge and is tracked in bug 1229518.  It was just a scheduling/noise thing which ended up putting the blame on your changeset- it is all good :)
Target Milestone: --- → Firefox 45
Product: Firefox → Firefox Graveyard
You need to log in before you can comment on or make changes to this bug.