Closed
Bug 1250362
Opened 9 years ago
Closed 9 years ago
2.08 - 19.14% sessionrestore / sessionrestore_no_auto_restore / ts_paint (linux64, windows7-32, windows8-64) e10s regression on push 135340a254f4 (Mon Feb 22 2016)
Categories
(Firefox :: Session Restore, defect, P2)
Tracking
()
RESOLVED
FIXED
Firefox 47
| Tracking | Status | |
|---|---|---|
| e10s | + | --- |
| firefox45 | --- | unaffected |
| firefox46 | --- | fixed |
| firefox47 | --- | fixed |
People
(Reporter: wlach, Assigned: mak)
References
(Blocks 1 open bug)
Details
(Keywords: perf, regression, Whiteboard: [talos_regression])
Attachments
(1 file)
Hi Marco, could you investigate this please? It seems like a fairly serious regression.
--
Talos has detected a Firefox performance regression from push 135340a254f4. As author of one of the patches included in that push, we need your help to address this regression.
This is a list of all known regressions and improvements related to the push:
https://treeherder.allizom.org/perf.html#/alerts?id=237
On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.
To learn more about the regressing test(s), please see:
https://wiki.mozilla.org/Buildbot/Talos/Tests#sessionrestore.2Fsessionrestore_no_auto_restore
Reproducing and debugging the regression:
If you would like to re-run this Talos test on a potential fix, use try with the following syntax:
try: -b o -p linux64,win64 -u none -t other-e10s[Windows 8] --rebuild 5 # add "mozharness: --spsProfile" to generate profile data
(we suggest --rebuild 5 to be more confident in the results)
To run the test locally and do a more in-depth investigation, first set up a local Talos environment:
https://wiki.mozilla.lorg/Buildbot/Talos/Running#Running_locally_-_Source_Code
Then run the following command from the directory where you set up Talos:
talos --develop -e <path>/firefox -a sessionrestore_no_auto_restore:sessionrestore
(add --e10s to run tests in e10s mode)
Making a decision:
As the patch author we need your feedback to help us handle this regression.
*** Please let us know your plans by Thursday, or the offending patch(es) will be backed out! ***
Our wiki page outlines the common responses and expectations:
https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
Flags: needinfo?(mak77)
| Assignee | ||
Comment 1•9 years ago
|
||
David, any idea about these session restore tests?
it looks "fancy", at a minimum.
Flags: needinfo?(mak77) → needinfo?(dteller)
| Assignee | ||
Updated•9 years ago
|
Flags: needinfo?(mak77)
ochameau, it looks like Bug 1248600 is also in the changeset, and that's one of yours. It seems to be touching startup, too. Could you take a look on your side?
wlach, any chance you could bissect further and find out whether it's Bug 1249786 or Bug 1248600?
Flags: needinfo?(wlachance)
Flags: needinfo?(poirot.alex)
Flags: needinfo?(dteller)
Comment 3•9 years ago
|
||
Bug 1248600 has been backed out, so I imagine we will get new numbers with just bug 1249786?
Otherwise, bug 1248600 is moving slightly the very precise moment when the devtools modules are loaded.
It moves from in middle of gBrowserInit.onLoad to browser-delayed-startup-finished notification.
If this perf test happen to watch something during browser-delayed-startup-finished, it may have some impact.
But if that happen to be that, this is just a move. We are going to free some computation time during earlier stages. Don't we also see a win in some other tests?
Flags: needinfo?(poirot.alex)
(In reply to Alexandre Poirot [:ochameau] from comment #3)
> But if that happen to be that, this is just a move. We are going to free
> some computation time during earlier stages. Don't we also see a win in some
> other tests?
Note that this Talos test is somewhat broken (see bug 1250362). I therefore believe that, if the regression is due to mak's patch, we shouldn't block on it, because they are quite important for stability.
I meant see bug 1245891.
See Also: → 1245891
| Assignee | ||
Comment 6•9 years ago
|
||
I'd still like to know the reason the change is moving the timing on this test, it may be useful to understand more of the sanitize impact on start.
Flags: needinfo?(mak77)
Comment 7•9 years ago
|
||
here is a compare view (still waiting on more data):
https://treeherder.mozilla.org/perf.html#/compare?originalProject=fx-team&originalRevision=1bd2197b4d48&newProject=fx-team&newRevision=135340a254f4&framework=1
two data points on there are false alarms:
* winxp a11y e10s
* win7 svg_opacity e10s
* win7 tresize
and this seems to be fixed by the backout or false alarm:
* win7 session restore e10s
which means we have:
session restore: win8/linux64
session restore no auto restore: win8/linux64
ts paint: win8/linux64
^ some mix of e10s/pgo/opt
:Yoric, if we feel that sessionrestore test is not valid, lets update it (yes, I see a patch) and delete the old test. Having trust in our tests is important!
Updated•9 years ago
|
Component: Untriaged → Session Restore
| Reporter | ||
Comment 8•9 years ago
|
||
(In reply to Alexandre Poirot [:ochameau] from comment #3)
> Bug 1248600 has been backed out, so I imagine we will get new numbers with
> just bug 1249786?
Yes, and the numbers haven't improved after the backout, so it looks like we've confirmed that the problem is bug 1249786.
Flags: needinfo?(wlachance)
| Reporter | ||
Comment 9•9 years ago
|
||
(In reply to William Lachance (:wlach) from comment #8)
> (In reply to Alexandre Poirot [:ochameau] from comment #3)
> > Bug 1248600 has been backed out, so I imagine we will get new numbers with
> > just bug 1249786?
>
> Yes, and the numbers haven't improved after the backout, so it looks like
> we've confirmed that the problem is bug 1249786.
Sorry, meant to link to a graph (though you can really just pick any in the alert summary above: https://treeherder.allizom.org/perf.html#/alerts?id=237):
https://treeherder.allizom.org/perf.html#/graphs?series=[fx-team,cc4be2ad4c55588a87acfe014c9c55cdde0528f6,1]
Summary: 4.02 - 10.52% sessionrestore / sessionrestore_no_auto_restore (linux64, windows8-64) e10s regression on push 135340a254f4 (Mon Feb 22 2016) → 2.08 - 19.14% sessionrestore / sessionrestore_no_auto_restore / ts_paint (linux64, windows7-32, windows8-64) e10s regression on push 135340a254f4 (Mon Feb 22 2016)
| Assignee | ||
Comment 10•9 years ago
|
||
David, does this test sanitize or set a sanitize pref? Do we use a clean profile at every run?
| Assignee | ||
Comment 11•9 years ago
|
||
I think I may have figured something, Sanitizer.onStartup is invoked at final-ui-startup and it now does the work that before was done by browser.js::_initializeSanitizer in delayedStartup
Among other things it was calling gPrefService.savePrefFile(null). In bug 1249786 I added back that call in Sanitizer.onStartup, unconditionally. That could not be a good time to save the prefs. We can reset the pref just before starting shutdown sanitization and only when it's set.
I can try to push a patch doing that to try and see if that's it.
The test starts with a clean profile (except for the session itself). But, as you mention, flushing preferences on startup may have a performance hit.
Comment 13•9 years ago
|
||
marco, can you take this?
I'm not 100% sure that we need to flush. For one thing, bug 789945 will ensure that the flush takes place OMT without having to call `savePrefFile`, hopefully very soon. Even before that, the worst that can happen is having sanitization run once again on next startup, if there is a crash before something else causes a flush. I suspect that we can live with that.
But yes, mak, as you mention, just by restricting the flush to the case when we actually touch the preference, we should grab back our performance loss.
| Assignee | ||
Comment 15•9 years ago
|
||
well, I'm going to try my idea and see if the Tryserver agrees with my theory.
(In reply to David Rajchenbach-Teller [:Yoric] (please use "needinfo") from comment #14)
Even before that, the worst that can happen is having
> sanitization run once again on next startup, if there is a crash before
> something else causes a flush. I suspect that we can live with that.
No, it's the opposite, the reason to clear the pref is so that if we are supposed to clear on shutdown, but we crash before shutdown, we clear on startup. So we must be sure the pref is stored before an eventual crash.
I got something wrong in the previous code comments, indeed the cases where sanitizeInProgress is not set may be more common than I expected. Will take care of updating the comments.
Assignee: nobody → mak77
Flags: needinfo?(mak77)
| Assignee | ||
Comment 16•9 years ago
|
||
(In reply to Marco Bonardo [::mak] from comment #15)
> So we must be sure the pref is stored before an eventual crash.
And here I mean, even before we start a shutdown sanitize.
So the scenario is browser starts, it is supposed to clear on shutdown but it crashes (at any time), shutdown won't ever happen. We want to clear on startup.
| Assignee | ||
Comment 17•9 years ago
|
||
| Assignee | ||
Comment 18•9 years ago
|
||
Review commit: https://reviewboard.mozilla.org/r/36183/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/36183/
Attachment #8722711 -
Flags: review?(dteller)
Comment on attachment 8722711 [details]
MozReview Request: Bug 1250362 - don't flush prefs on startup unless it's really needed. r=yoric
https://reviewboard.mozilla.org/r/36183/#review32773
Looks good to me.
Attachment #8722711 -
Flags: review?(dteller) → review+
Ah, right.
Well, now let's see if this improves the perf.
| Assignee | ||
Comment 21•9 years ago
|
||
Try is taking ages to run 5 talos jobs...
Btw, the first job shows sessionrestore opt e10s -3.15%, sessionrestore_no_auto_restore opt e10s -9.40%, ts_paint -0.9%... that makes me think the fix is solving this.
I'll still wait a little bit for the missing jobs but I think we're fine.
Comment 23•9 years ago
|
||
| Reporter | ||
Comment 24•9 years ago
|
||
Looks to be fixed on the graphs as well:
https://treeherder.allizom.org/perf.html#/alerts?id=261
Comment 25•9 years ago
|
||
| bugherder | ||
Status: NEW → RESOLVED
Closed: 9 years ago
status-firefox47:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 47
Comment 26•9 years ago
|
||
:yoric, can we uplift this to aurora? the perf regression is on aurora as well
Flags: needinfo?(dteller)
I'm 99% sure that we can, but mak should have the final word.
Flags: needinfo?(dteller) → needinfo?(mak77)
| Assignee | ||
Comment 28•9 years ago
|
||
already uplifted along with bug 1249786.
Comment 29•9 years ago
|
||
[bugday-20160323]
Status: RESOLVED,FIXED -> UNVERIFIED
Comments:
STR: Not clear.
Developer specific testing
Component:
Name Firefox
Version 46.0b9
Build ID 20160322075646
Update Channel beta
User Agent Mozilla/5.0 (Windows NT 6.1; WOW64; rv:46.0) Gecko/20100101 Firefox/46.0
OS Windows 7 SP1 x86_64
Expected Results:
Developer specific testing
Actual Results:
As expected
Updated•9 years ago
|
Version: unspecified → 46 Branch
You need to log in
before you can comment on or make changes to this bug.
Description
•