Closed Bug 1110175 Opened 10 years ago Closed 9 years ago

5.5% Win8/7 V8 regression on Inbound (v.36) from a variety of pushes

Categories

(Testing :: Talos, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: jmaher, Unassigned)

References

Details

(Keywords: perf, regression, Whiteboard: [talos_regression])

I saw this on Aurora; in fact, our V8 bugs/alerts are not doing a good job of keeping us up to date.

Win7 (pgo + non-pgo):
Oct 17: 18200 -> 17800
Oct 30: 17800 -> 17200
** this didn't make it to aurora for some reason
http://graphs.mozilla.org/graph.html#tests=%5B%5B230,131,25%5D,%5B230,63,25%5D,%5B230,52,25%5D%5D&sel=1410529660982,1418305660982&displayrange=90&datatype=running

Win8 (pgo + non-pgo):
NOTE: we switched to Win8 64-bit builds on Oct 22, so the drop there is expected:
http://graphs.mozilla.org/graph.html#tests=%5B%5B230,52,31%5D,%5B230,63,31%5D,%5B230,131,31%5D%5D&sel=1410529660982,1418305660982&displayrange=90&datatype=running
Oct 30: 17100 -> 16600 - regression
Nov 3:  16600 -> 16900 - improvement
Nov 9:  16900 -> 16800 - regression
Nov 21: 16800 -> 16600 - regression

Linux 32+64 (pgo + non-pgo):
nothing but improvements

OSX 10.6 / 10.8:
nothing but improvements
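
As a rough check of the numbers above, the following sketch computes the per-step and cumulative percentage change for the Win7 and Win8 series. The scores are the ones quoted in this comment; the helper names (`pct_change`, `summarize`) are made up for illustration and are not part of Talos. It assumes the V8 score is higher-is-better, which matches how drops are labelled as regressions here.

```python
def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# (date, score before, score after), as quoted in this bug
win7_steps = [("Oct 17", 18200, 17800), ("Oct 30", 17800, 17200)]
win8_steps = [
    ("Oct 30", 17100, 16600),
    ("Nov 3", 16600, 16900),
    ("Nov 9", 16900, 16800),
    ("Nov 21", 16800, 16600),
]


def summarize(steps):
    """Return ([(date, step change %)], cumulative change %), rounded to 2 places."""
    per_step = [(date, round(pct_change(b, a), 2)) for date, b, a in steps]
    # Cumulative change: first "before" value to last "after" value
    total = round(pct_change(steps[0][1], steps[-1][2]), 2)
    return per_step, total


if __name__ == "__main__":
    for name, steps in (("Win7", win7_steps), ("Win8", win8_steps)):
        per_step, total = summarize(steps)
        print(name, per_step, "cumulative:", total)
```

On these figures the two Win7 steps add up to roughly the 5.5% in the bug title, while the Win8 steps listed (which exclude the expected 64-bit switch drop) net out to a smaller cumulative change.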
this is now on mozilla-beta.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX
To give this a story:

1) V8 is the old benchmark, which has since been renamed to Octane. Currently in JS we only track Octane, since the V8 benchmark had some shortcomings and Octane brought some improvements. Roughly speaking, Octane = V8 + some additional benchmarks.

2) I suspect these regressions are most likely in octane-splay / v8-splay:
http://arewefastyet.com/#machine=17&view=single&suite=octane&subtest=Splay&start=1412441480&end=1420430377
We also saw this on the AWFY Windows 7 slave.

3) The regressions are most likely related to:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=2681f9b134c2&tochange=47f47eea45d3
where some of the GC timings were adjusted.

The changes that landed in that range were meant to decrease latency, decrease max memory use, and spread the GC over multiple smaller execution units. As a result, some unfortunate workloads, e.g. v8-splay / octane-splay, also pay some extra overhead. In Octane, the newly added splay-latency benchmark shows how good the latency is (V8 doesn't have this benchmark). In the same regression range, we see a big increase in score on splay-latency:
http://arewefastyet.com/#machine=17&view=single&suite=octane&subtest=SplayLatency&start=1412441480&end=1420430377

This last one offsets the splay regression and makes the changes a net win on the Octane benchmark. Also, in theory, doing multiple smaller GCs is much preferable to one big stop-the-world GC.
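
To illustrate why the same change can look like a regression on V8 but a win on Octane, here is a minimal sketch. Both suites report a geometric mean of their subtest scores; the subtest numbers below are purely hypothetical (they are not taken from AWFY or Talos), and "Other" is a made-up stand-in for all the remaining subtests.

```python
import math


def geometric_mean(scores):
    """Geometric mean of a sequence of positive scores."""
    scores = list(scores)
    return math.exp(sum(math.log(s) for s in scores) / len(scores))


# HYPOTHETICAL scores, for illustration only (higher is better).
before = {"Splay": 20000.0, "SplayLatency": 10000.0, "Other": 25000.0}
after = {"Splay": 18000.0, "SplayLatency": 20000.0, "Other": 25000.0}


def v8_style(scores):
    # The V8 suite has no splay-latency subtest, so it is left out here.
    return geometric_mean(v for k, v in scores.items() if k != "SplayLatency")


def octane_style(scores):
    # Octane includes splay-latency alongside splay.
    return geometric_mean(scores.values())


if __name__ == "__main__":
    print("V8-style:    ", v8_style(before), "->", v8_style(after))
    print("Octane-style:", octane_style(before), "->", octane_style(after))
```

With these made-up numbers the V8-style score drops because only the splay slowdown is visible, while the Octane-style score rises because the splay-latency gain outweighs it.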
oh thanks for the info here!  Should we replace V8 with octane in Talos?  We also run Kraken, not sure if that provides benefit.
Flags: needinfo?(hv1989)
(In reply to Joel Maher (:jmaher) from comment #3)
> oh thanks for the info here!  Should we replace V8 with octane in Talos?  We
> also run Kraken, not sure if that provides benefit.

If possible, yes.

On AWFY we also run kraken. It has some nice benchmarks in it and we also keep tracking regressions for it.
Flags: needinfo?(hv1989)