Closed Bug 1009260 Opened 6 years ago Closed 5 years ago

2.4% Session Restore regression on Windows XP/7 from bug 984361 on May 9, 2014 in Firefox PGO (v32)

Categories

(Core :: JavaScript: GC, defect)

x86
Windows 7
defect
Not set

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: jmaher, Unassigned)

References

Details

(Keywords: perf, regression, Whiteboard: [talos_regression])

Session restore is a new set of tests and we are excited to have them:
https://wiki.mozilla.org/Buildbot/Talos/Tests#sessionrestore.2Fsessionrestore_no_auto_restore

But we have our first regression!  You can see it on graph server:
https://wiki.mozilla.org/Buildbot/Talos/Tests#sessionrestore.2Fsessionrestore_no_auto_restore

I have done some retriggers to verify this is the root cause and not assigning bad blame due to noise or other alignment of the stars:
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&jobname=mozilla-inbound%20pgo%20talos%20other&fromchange=264aef76b71d&tochange=62d2d22055e8
I think the graph is here, joel? http://graphs.mozilla.org/graph.html#tests=[[313,63,31],[313,63,25],[313,63,37]]&sel=1398930743314.6582,1399938291904,1034.2610929879706,1481.5771963676923&displayrange=30&datatype=running

However, there was big improvement prior to that, and it's hard to identify a consistent behavior prior to the change IMO. Also, I think win8 behaves similarly to win xp/7 (see graph).

Overall, this metric doesn't seem very stable to me, so I'm not sure how much we can declare changes as "regressions".

Yoric, what do you think? should we expect relatively stable numbers here? would you consider these change magnitudes as something we should  try to follow?
ah, I didn't paste the graph, thanks avi!

While there is noise (just like any other benchmark), you can clearly see the shift for a win that was seen on may 6th, and the regression pointed out in here.  Avi, please look at the data in my retriggers and you can see the trend by clicking on the tbpl links and seeing the values.  They repeat very well and increase consistently when we hit the change from bug 984361.
I'm not arguing that we can't see a relatively clear increase in values on this specific date, and I'm trusting your investigation that bug 984361 is indeed responsible for it.

However, during the short time this test has been running, I couldn't yet recognize the long term pattern of the results. It's been going up and down in 2% range since we started getting results for it.

I'd have preferred to declare regressions only after we can identify some stable pattern. Otherwise, it's possible that this value is affected by many things, and its "natural noise" is such that it goes up and down few % all the time. At least that's what it has been doing so far.
fair enough.  What is enough history to show a pattern?  We have a week of data and the numbers fall into a nice little block.  We had an improvement, and now a regression.  The noise range is roughly 2%, but if the high value of the noise before the regression is 2% less than the high value after the regression there is a noticeable difference.

Lets look at the patch in question and see if it makes sense that it could be the cause of a regression :)
(In reply to Joel Maher (:jmaher) from comment #4)
> What is enough history to show a pattern?

I don't know. The best answer is probably "until we can recognize a pattern". And if we can't then we need to understand why. But I'd wait at least few more weeks before declaring that we can't see a pattern.
ok, the backout shows a fix in session restore:
https://tbpl.mozilla.org/?tree=Try&rev=71303953a5b4

no need to argue about the validity of this, we need to either fix it or live with it.
looking at this data for aurora, winxp shows a regression, win7 a slight one, and any regressions from win8 are now fixed:
http://graphs.mozilla.org/graph.html#tests=[[313,52,37],[313,52,25],[313,52,31]]&sel=1397324535203,1405100535203&displayrange=90&datatype=running
I see regularly variations from ±10% that quickly return to normal, so I'm not really panicked about 2%.
as we have shipped v.32, can we just close this out?
Flags: needinfo?(dteller)
Sounds good to me.
Flags: needinfo?(dteller)
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.