Closed Bug 1068631 Opened 10 years ago Closed 9 years ago

2.21% All platforms Session Restore Test regression on Inbound (v.35) Sept 16 from revision cf9ed5c35329

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: jmaher, Assigned: lth)

References

Details

(Keywords: perf, regression, Whiteboard: [talos_regression])

Attachments

(2 files)

session restore took a hit yesterday:
http://graphs.mozilla.org/graph.html#tests=[[313,63,21],[313,63,24],[313,131,35],[313,131,37],[313,131,25],[313,131,31],[313,131,33]]&sel=1410357539125,1410962339125&displayrange=7&datatype=running

I did some retriggers on osx 10.6, this is not a quirk in the system or a misalignment of the planets:
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&fromchange=03c78025728a&tochange=ae2e4ca015fd&jobname=Rev4%20MacOSX%20Snow%20Leopard%2010.6%20mozilla-inbound%20talos%20other

it appears the patch in bug 1054882 is quite large, but this is affecting too many platforms to just live with a regression without at least understanding why this is causing us problems.
Lars, can you take a look at this?

here is more information about the session restore tests:
https://wiki.mozilla.org/Buildbot/Talos/Tests#sessionrestore.2Fsessionrestore_no_auto_restore

I can help you run them locally or on try server if you need to go there.
Flags: needinfo?(lhansen)
Yes, I will take a look at this, though I'm fighting a couple of fires right now and it won't be until tomorrow at the earliest.  I'll be in touch when I need more info.
Assignee: nobody → lhansen
Flags: needinfo?(lhansen)
thanks!
Looks like the SharedTypedArrayBuffers would benefit from some lazification, as the SIMD object. Could this regression also be fixed as a side effect of bug 1067459?
(In reply to Benjamin Bouvier [:bbouvier] from comment #4)
> Looks like the SharedTypedArrayBuffers would benefit from some lazification,
> as the SIMD object. Could this regression also be fixed as a side effect of
> bug 1067459?

Thanks, that's a nice lead, I doubt I would have thought of that.  Strange, we're only talking about seven new global types here, it does not feel like that should have a great impact.  (Though perhaps it's being multiplied by hundreds of tabs in this test; I haven't looked yet.)
Excuse me. Nightly went "bad" on 9/16 for me: only pinned tabs are restored on start, Session Manager add-on doesn't appear on the Tools menu, and About -> Update presentation is broken.  Before I start filing bugs (that likely will be dups), is this bug the likely culprit for all that?

Sidenote: BZ is for reporting of specific behavior. Where's the right place to ask "Hey, is Nightly just broken now?"  Maybe the old Mozillazine ... or @FirefoxNightly?
(In reply to Frank Burleigh from comment #6)
> Excuse me. Nightly went "bad" on 9/16 for me: only pinned tabs are restored
> on start, Session Manager add-on doesn't appear on the Tools menu, and About
> -> Update presentation is broken.  Before I start filing bugs (that likely
> will be dups), is this bug the likely culprit for all that?

If you mean bug 1054882, then probably not.  The About->Update thing was bug 1068384;
problems with attachments was bug 1067351.  Bug 1054882 had an impact on indexDB, and
it's possible that can show up in various places; that was tracked on bug 1068539.

> Sidenote: BZ is for reporting of specific behavior. Where's the right place
> to ask "Hey, is Nightly just broken now?"  Maybe the old Mozillazine ... or
> @FirefoxNightly?

I don't know.  I usually end up looking / searching in Bugzilla for recent bugs, but the volume can be daunting and more often I just file the problem and hope somebody who knows the domain will triage.  (That has worked pretty well so far.)
(In reply to Frank Burleigh from comment #6)
> Excuse me. Nightly went "bad" on 9/16 for me: only pinned tabs are restored
> on start, Session Manager add-on doesn't appear on the Tools menu, and About
> -> Update presentation is broken.  Before I start filing bugs (that likely
> will be dups), is this bug the likely culprit for all that?

This bug is about a slight performance regression in some fairly specific tests. It's not about any functional regressions at all.

> 
> Sidenote: BZ is for reporting of specific behavior. Where's the right place
> to ask "Hey, is Nightly just broken now?"  Maybe the old Mozillazine ... or
> @FirefoxNightly?

@FirefoxNightly sounds like a good bet. I have no idea how active Mozillazine is nowadays.

Also, when in doubt file new bugs. Duping them isn't that big of a deal, at least compared to potentially not getting reports about real new issues.
this is now on mozilla-beta
Joel, how do I run this test locally?
Flags: needinfo?(jmaher)
Hi :lth, thanks for checking into this.

There are some basic instructions up here:
https://wiki.mozilla.org/Buildbot/Talos/Running#Running_locally_-_Source_Code

On top of that for this specific test you might want to do this:
talos -e <path>/dist/bin/firefox -a sessionrestore --develop --results_url sessionrestore.out

likewise, we also have:
talos -e <path>/dist/bin/firefox -a sessionrestore_no_auto_restore --develop --results_url sessionrestore_no_auto.out

these will launch the browser and close it quickly, 20 times- no need to be alarmed.  When it is done, you can look at the .out file and see the raw values, likewise just scroll up in your console window.
Flags: needinfo?(jmaher)
Related to this is probably this change:

changeset:   156877:72db5a6ae5c8
user:        Bobby Holley <bobbyholley@gmail.com>
date:        Fri Nov 22 10:55:42 2013 -0800
summary:     Bug 933681 - Define JSStdName tables in terms of jsprototypes.h. r=jorendorff

In that changeset, the JS_FOR_EACH_PROTOTYPE macro is introduced.  This macro should be used to elide definitions that are not enabled in a particular build and should have prevented the present regression, whatever its cause, from reaching mozilla-beta, since the feature (lots of new shared-memory types) is only enabled on Nightly.

The underlying mechanism is that different macros should be used to process certain definitions, one if the feature is enabled and another if the feature is disabled.  However, the definition of JS_FOR_EACH_PROTOTYPE passes the same macro in both cases:

#define JS_FOR_EACH_PROTOTYPE(macro) JS_FOR_PROTOTYPES(macro,macro)

Hence the work is done whether the feature is enabled or not.
Of course, if the class definition were to be elided by JS_FOR_EACH_PROTOTYPE, a lot of ifdeffery would be required elsewhere in the code.
Ubuntu 14.10 64-bit, AMD FX4100 (quad-core) @ 3+ GHz max clock frequency, Nvidia GeForce GTX 560 Ti with Nvidia driver.  Firefox compiled with ./mach build without any mozconfig.

Attached is a hacky patch that (in principle) disables the salient parts of the code introduced by bug 1054882, this should apply cleanly to mozilla-inbound 4b1566e6f3d0 (16 December).

On my system, the session restore test is consistently faster with the patch applied than without, but the difference is only about 0.6%.  The methodology for computing that is to run the session restore test five times with each config and then average all the measurements, and take the ratios of the averages.

I'm attaching an archive that contains the output files for the run and the processing code.
Attached file Measurement results
locally this drops the score about 10 points!  nice.  Can you push this to try:
try: -b o -p linux,linux64,macosx64,win32,win64 -u none -t other,other_nol64
(In reply to Joel Maher (:jmaher) from comment #17)
> locally this drops the score about 10 points!  nice.

That is nice.  I wonder what the difference is.  What hardware are you running on?  I suspect my AMD system of being a little clunky but I have as yet been unable to run the talos tests locally on my OSX laptop.

> Can you push this to try:
> try: -b o -p linux,linux64,macosx64,win32,win64 -u none -t other,other_nol64

I will do that, and report back.

Note, even if this patch could be checked in to stop the regression on beta, which it can't because it won't be approved in anything like its current state, the regression would reappear once this feature starts riding the trains.  So at most the patch gives us reason to look further for a solution that allows the feature to be initialized lazily, and I don't know how long that will take.  The talos regression may therefore be locked in for a couple releases, unless that's completely unacceptable.  Do let me know.
the 10 point drop was from the data you had provided from your local runs.  If we don't have any luck fixing this 100%, then we have documented our work- made what improvements we can, and will move forward with our new reality.
I misunderstood the meaning of "points"; I get it now.

Try build running:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=d654ee76c849
I don't see these making an improvement, but the numbers on try server are at the [very] bottom of the range that we see on m-c.

It is hard to compare this to mozilla-beta as that is PGO only.
This is a try run with the same parameters but without the patch that disables the new constructors, in an effort to create an apples-to-apples comparison:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=c83dff24eb63
this baseline comparison try run shows numbers in the range, but in the middle, not the bottom.  Overall I believe we will see a small win from this.
this is almost 6 months old- the regression has shipped, any problem with resolving this as wontfix?
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: