Closed Bug 1248244 Opened 8 years ago Closed 8 years ago

2.54 - 2.67% ts_paint:Win8, tp5o_scroll:winxp regression fx-team (v.47) from push 9799df240b37 (Tue Feb 9 2016)

Categories

(Toolkit :: Add-ons Manager, defect)

45 Branch
defect
Not set
normal

RESOLVED WONTFIX

People

(Reporter: jmaher, Unassigned)

Details

(Keywords: perf, regression, Whiteboard: [talos_regression])

Talos has detected a Firefox performance regression from push 9799df240b37. Since you are the author of one of the patches included in that push, we need your help to address this regression.

This is a list of all known regressions and improvements related to the push:

https://treeherder.allizom.org/perf.html#/alerts?id=119

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see:

https://wiki.mozilla.org/Buildbot/Talos/Tests#ts_paint

Reproducing and debugging the regression:

If you would like to re-run this Talos test on a potential fix, use try with the following syntax:

try: -b o -p win64 -u none -t other-e10s[Windows 8] --rebuild 5  # add "mozharness: --spsProfile" to generate profile data

(we suggest --rebuild 5 to be more confident in the results)
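
As a rough illustration of why several rebuilds help, the sketch below compares two sets of replicate scores and reports the percent change between their means next to the run-to-run noise. The sample values and the helper names `percent_change` and `noise_percent` are hypothetical, and this is not how Perfherder computes its alerts; it is only a minimal sketch, assuming lower scores are better.

```python
from statistics import mean, stdev


def percent_change(base, new):
    """Percent change of the mean of `new` relative to the mean of `base`."""
    base_mean = mean(base)
    return (mean(new) - base_mean) / base_mean * 100.0


def noise_percent(samples):
    """Sample standard deviation expressed as a percentage of the mean."""
    return stdev(samples) / mean(samples) * 100.0


# Hypothetical replicate scores in ms, NOT taken from this bug.
base_runs = [800.0, 810.0, 790.0, 805.0, 795.0]
new_runs = [820.0, 830.0, 815.0, 825.0, 810.0]

print(f"change: {percent_change(base_runs, new_runs):+.2f}%")
print(f"base noise: {noise_percent(base_runs):.2f}%")
print(f"new noise: {noise_percent(new_runs):.2f}%")
```

With only one run per revision, a change of a few percent can be hard to tell apart from noise of similar size, which is why more replicates give more confidence.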

To run the test locally and do a more in-depth investigation, first set up a local Talos environment:

https://wiki.mozilla.org/Buildbot/Talos/Running#Running_locally_-_Source_Code

Then run the following command from the directory where you set up Talos:

talos --develop -e <path>/firefox -a ts_paint --e10s

(add --e10s to run tests in e10s mode)
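
For repeated local runs, the command above could be assembled by a small script. This is a minimal sketch that only uses the flags quoted in this comment (`--develop`, `-e`, `-a`, `--e10s`); the function name `build_talos_command` and the example binary path are hypothetical, and the script only prints the command rather than running it.

```python
import shlex


def build_talos_command(firefox_path, test="ts_paint", e10s=False):
    """Return the argv list for a local Talos run using the flags quoted above."""
    argv = ["talos", "--develop", "-e", firefox_path, "-a", test]
    if e10s:
        # Only add the flag when e10s mode is wanted.
        argv.append("--e10s")
    return argv


# Hypothetical binary location; substitute the path to your own build.
cmd = build_talos_command("/path/to/firefox", e10s=True)
print(shlex.join(cmd))
```

Printing the e10s and non-e10s variants side by side makes it easy to run both and compare the results.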

Making a decision:

Since you are the patch author, we need your feedback to help us handle this regression.
*** Please let us know your plans by Wednesday, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations:

https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
Component: Untriaged → Add-ons Manager
Product: Firefox → Toolkit
I did a lot of retriggers on try to narrow this down:
https://treeherder.mozilla.org/#/jobs?repo=fx-team&filter-searchStr=talos&tochange=9fd143cd5996&fromchange=683c1c3ca832&group_state=expanded

in addition, looking at the compare view of this change vs the previous:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=fx-team&originalRevision=5afb767f3591&newProject=fx-team&newRevision=9799df240b37&framework=1

we have a windows xp tp5o_scroll regression as well.

:mossop, can you take this regression and help us get to a resolution?
Flags: needinfo?(dtownsend)
It's possible that bug 1244248 causes us to load the certificate database earlier than we did before, which might move that work into the range that ts_paint measures, but as soon as we hit an https site that database would be loaded anyway. If necessary, I might be able to figure out a way to still do the actual work later.

I don't think there is any way that that bug would be affecting scrolling performance.
I'm also going to guess that this doesn't affect non-e10s because something else there is loading the certificate database before ts_paint completes. We're probably only seeing a regression for e10s because ts_paint seems to finish early there (bug 1174767).
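
The theory above can be pictured with a toy model: the total startup work stays the same, but a task that moves ahead of the point where the measurement stops gets counted. This is only a sketch; the task names, costs, and the function `measured_startup` are hypothetical and do not reflect the actual Firefox startup sequence or how ts_paint is implemented.

```python
def measured_startup(tasks, stop_after):
    """Sum the cost of tasks that run up to and including `stop_after`."""
    total = 0.0
    for name, cost_ms in tasks:
        total += cost_ms
        if name == stop_after:
            return total
    raise ValueError(f"{stop_after!r} not in task list")


# Hypothetical task names and costs in ms, purely illustrative.
before = [("init", 500.0), ("first_paint", 300.0), ("load_cert_db", 20.0)]
after = [("init", 500.0), ("load_cert_db", 20.0), ("first_paint", 300.0)]

print("measured before:", measured_startup(before, "first_paint"))
print("measured after:", measured_startup(after, "first_paint"))
print("total work unchanged:",
      sum(cost for _, cost in before) == sum(cost for _, cost in after))
```

In this model the measured number goes up even though no extra work was added, which matches the idea that the load is being moved rather than introduced.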
the only non-e10s regression I see is winxp tp5o_scroll:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=fx-team&originalRevision=5afb767f3591&newProject=fx-team&newRevision=9799df240b37&framework=1

this seems to be across the majority of the pages; I'm not sure if that validates or invalidates the theory here.
I ran two try runs, one with a current m-c changeset and one with my patch backed out, and did a bunch of talos runs on each. I'm not seeing the ts_paint regression at all: https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=f44f547c4e5d&newProject=try&newRevision=5eac731959f9&framework=1&filter=ts_paint&showOnlyImportant=0

I am seeing the e10s tart and tp5o_scroll regressions on windows xp though. My only guess for what is going on here is that loading the certificate database on startup triggers some background task that then does extra work during the test runs.

Unfortunately I don't know if there is a way to solve that without regressing bug 1244248. I'm not sure how to proceed here.
Flags: needinfo?(dtownsend) → needinfo?(jmaher)
thanks for running some try runs.  If there is no clear path to fixing the regression, then we only have two realistic choices:
1) backout
2) accept

My understanding is that backing out isn't a realistic option, so we should accept this.  If we can take a stab at trying to explain why this happened as part of "accepting it", then we have more data and information should we need to debug in the future.

Assuming we go forward with accepting this, let's document and mark this as resolved/wontfix!
Flags: needinfo?(jmaher)
I'm not sure who gets to make a call here, but I think we need to just accept this. In reality for the vast majority of users this is just moving something that happens later in startup to earlier and I don't see a way to work around this without breaking bug 1244248.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
Version: unspecified → 45 Branch