Closed Bug 1392573 Opened 2 years ago Closed 2 years ago

2.04 - 22.01% cart / tart / tp5o Main_RSS / tp5o responsiveness / tp5o_webext Main_RSS / tp5o_webext responsiveness / tpaint / tsvg_static (linux64, osx-10-10, windows10-64, windows7-32) regression on push a9df4b7dd62f (Wed Aug 16 2017)

Categories

(Firefox :: Screenshots, defect)

Tracking

RESOLVED WONTFIX

People

(Reporter: igoldan, Unassigned)

Details

(Keywords: perf, regression, talos-regression)

Talos has detected a Firefox performance regression from push:
https://hg.mozilla.org/releases/mozilla-beta/rev/a9df4b7dd62f

As you are the author of one of the patches included in that push, we need your help to address this regression.

Regressions:

 22%  tp5o responsiveness windows10-64 opt e10s     2.60 -> 3.17
 21%  tp5o responsiveness linux64 opt e10s          2.85 -> 3.45
 19%  tsvg_static summary windows7-32 opt e10s      49.28 -> 58.70
 11%  tp5o_webext responsiveness linux64 opt e10s   3.20 -> 3.55
  7%  tsvg_static summary osx-10-10 opt e10s        54.77 -> 58.46
  5%  tp5o Main_RSS linux64 opt e10s                161,038,768.02 -> 169,385,608.16
  5%  tp5o Main_RSS windows7-32 opt e10s            112,824,700.39 -> 118,028,869.02
  3%  tpaint summary windows7-32 opt e10s           201.05 -> 208.06
  3%  tart summary windows7-32 opt e10s             4.96 -> 5.11
  2%  tp5o_webext Main_RSS linux64 opt e10s         167,098,178.34 -> 170,918,726.09
  2%  cart summary windows7-32 opt e10s             22.55 -> 23.01
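
For reference, the percentages in the table can be rechecked from the before/after values. The snippet below is only a sketch: `percent_change` and `ALERTS` are names made up for illustration and are not part of Talos or Perfherder, and only a few rows are copied in. The table shows rounded values, so results can differ slightly in the decimals from the alert (for example, the title says 22.01% where the rounded inputs give about 21.9%).

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# (test and platform, value before the push, value after the push),
# copied from a few rows of the regression table above.
ALERTS = [
    ("tp5o responsiveness windows10-64", 2.60, 3.17),
    ("tsvg_static summary windows7-32", 49.28, 58.70),
    ("tp5o Main_RSS windows7-32", 112_824_700.39, 118_028_869.02),
    ("cart summary windows7-32", 22.55, 23.01),
]

for name, before, after in ALERTS:
    # For these tests a higher value is worse, so a positive change is a regression.
    print(f"{percent_change(before, after):5.1f}%  {name}")
```

Rounding each result to a whole number gives the percentages listed in the table for these rows.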


You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=8861

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Buildbot/Talos/Tests

For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Buildbot/Talos/Running

*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
Component: Untriaged → Screenshots
Product: Firefox → Cloud Services
Jared, I see you are the owner of bug 1386333. Were these regressions expected, or should we look for an improvement/backout?
Flags: needinfo?(jhirsch)
As a note, on August 1st this same pref landed on trunk:
https://hg.mozilla.org/mozilla-central/rev/ba769a1d2f82
As a note, these seem for the most part to erase the gains we saw while uplifting 56 to beta. Put another way, this is back to parity, for many tests, with what we shipped Firefox 55 with.

There are a few exceptions:
  5%  tp5o Main_RSS windows7-32 opt e10s            112,824,700.39 -> 118,028,869.02
  7%  tsvg_static summary osx-10-10 opt e10s        54.77 -> 58.46
  3%  tpaint summary windows7-32 opt e10s           201.05 -> 208.06
  2%  cart summary windows7-32 opt e10s             22.55 -> 23.01


tsvg_static has gone up and down; I suspect this might be an artifact of PGO:
https://treeherder.mozilla.org/perf.html#/graphs?timerange=7776000&series=mozilla-inbound,1455931,0,1&series=mozilla-beta,1533495,1,1&series=mozilla-inbound,1455996,1,1

As for tpaint, I am not sure; the regression is small, and the test measures the time to create a new window.

The cart regression is small, and this is probably ok.

The tp5o Main_RSS regression seems to be 100% related to this. At 5%, maybe that is the overhead for webext + screenshots?


Given that we rely heavily on PGO for linux/windows, there are some [un]lucky draws we take when changing code. I think we should focus on the smaller list, specifically on tsvg_static (osx) and tp5o Main_RSS on windows 7.
Flags: needinfo?(jhirsch)
I didn't mean to unset the needinfo, I was just CCing some people. I'm looking at this now.
These are the same regressions we already decided to eat when we originally preffed this on in nightly.

The responsiveness and tsvg regressions are unfortunately most likely the result of enabling screenshots shortly after startup, rather than before first paint. I suppose we could try to shove some of that work into idle slices, but it may be easier said than done, since most of it is async, and we wouldn't want to do that for the before-first-paint loads.
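
To make the idle-slice idea concrete, here is a rough sketch of what running deferred work in idle slices means in general. It is written in Python for illustration and is not Firefox code: `run_idle_slice`, `monotonic_ms`, and the budget values are all hypothetical, and it assumes the work can be split into small synchronous tasks, which is exactly the part that is hard when most of the work is async.

```python
import time
from collections import deque
from typing import Callable, Deque


def monotonic_ms() -> float:
    """Current monotonic time in milliseconds."""
    return time.monotonic() * 1000.0


def run_idle_slice(
    queue: Deque[Callable[[], None]],
    budget_ms: float,
    now_ms: Callable[[], float] = monotonic_ms,
) -> int:
    """Run queued tasks until the slice budget is used up.

    Returns the number of tasks that ran. Tasks left in the queue wait
    for the next idle slice. A task that starts before the deadline is
    allowed to finish, so a slice can overrun by one task's duration.
    """
    deadline = now_ms() + budget_ms
    ran = 0
    while queue and now_ms() < deadline:
        task = queue.popleft()
        task()
        ran += 1
    return ran


# Hypothetical usage: five small startup tasks, drained 10 ms at a time.
pending: Deque[Callable[[], None]] = deque(lambda: None for _ in range(5))
while pending:
    run_idle_slice(pending, budget_ms=10.0)
```

The clock is passed in as a parameter so the budget logic can be exercised with a fake clock instead of real time.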
(In reply to Kris Maglione [:kmag] from comment #5)
> These are the same regressions we already decided to eat when we originally
> preffed this on in nightly.
> 
> The responsiveness and tsvg regressions are unfortunately most likely the
> result of enabling screenshots shortly after startup, rather than before
> first paint. I suppose we could try to shove some of that work into idle
> slices, but it may be easier said than done, since most of it is async, and
> we wouldn't want to do that for the before-first-paint loads.

Is that something that you think people would be comfortable uplifting to Beta at this point, or do you think it's too complex of a patch for that?  

Is there anything we can do in the add-on at this point?

Any other ideas for improving this in the Beta timeline?
Flags: needinfo?(kmaglione+bmo)
I talked to Kris on IRC.  He does not think the idle slice patch would be upliftable and it is not an add-on patch.
Flags: needinfo?(kmaglione+bmo)
As Kris pointed out, there is a lot of overlap with the regressions we accepted in bug 1361792.  Kris has a bunch of performance improvements landing in Nightly, but they aren't good candidates to uplift to address this in Beta.  I confirmed with Jeff and Dave that we should accept the same regressions we did in Nightly and let this bug ride in Beta.
Assuming these regressions are ones we've already accepted in Nightly, we have sign-off from Jeff G. and Dave C. to accept these in Beta as well.
Per comments 8/9.
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → WONTFIX
Product: Cloud Services → Firefox