If you think a bug might affect users in the 57 release, please set the correct tracking and status flags for Release Management.

Dromaeo-css and V8v7 regressions from what should be innocuous parser changes

REOPENED
Assigned to

Status

()

Core
JavaScript Engine
REOPENED
4 years ago
4 years ago

People

(Reporter: bz, Assigned: naveed)

Tracking

({regression})

unspecified
x86
Mac OS X
regression
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox24 unaffected, firefox25- affected)

Details

There are a bunch of regression mails on tree-management with this checkin range: http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=169644d6f9e6&tochange=c39ede0eea8f

Some examples:

Regression: Mozilla-Inbound-Non-PGO - V8 version 7 - Ubuntu HW 12.04 x64 - 29.7% decrease: http://mzl.la/14WWTvp
Regression: Mozilla-Inbound-Non-PGO - V8 version 7 - Ubuntu HW 12.04 - 32.7% decrease: http://mzl.la/14WWU2A
Regression: Mozilla-Inbound - V8 version 7 - MacOSX 10.8 - 27.3% decrease: http://mzl.la/14WWY29
Regression: Mozilla-Inbound-Non-PGO - V8 version 7 - WINNT 6.1 (ix) - 32.7% decrease: http://mzl.la/1382v8r

Regression: Mozilla-Inbound-Non-PGO - Dromaeo (CSS) - Ubuntu HW 12.04 x64 - 2.32% decrease: http://mzl.la/14WWU2s
Regression: Mozilla-Inbound-Non-PGO - Dromaeo (CSS) - WINNT 6.2 x64 - 3.64% decrease: http://mzl.la/1382voZ
Regression: Mozilla-Inbound - Dromaeo (CSS) - MacOSX 10.7 - 2.94% decrease: http://mzl.la/14WWXLD

The compare-talos results across that push range are: http://perf.snarkfest.net/compare-talos/index.html?oldRevs=169644d6f9e6&newRev=c39ede0eea8f&submit=true

The dromaeo-css bit is showing hits for all of jquery/mootools/prototype, with the biggest hit for prototype.

Looking at the logs at https://tbpl.mozilla.org/php/getParsedLog.php?id=24589881&tree=Mozilla-Inbound&full=1 and https://tbpl.mozilla.org/php/getParsedLog.php?id=24590921&tree=Mozilla-Inbound&full=1 which are the before and after logs for a Windows dromaeo run, the V8 bits look like this:


INFO : v8 benchmark
INFO :  Crypto: 17394.2285354
INFO :  DeltaBlue: 17574.1644
INFO :  EarlyBoyer: 20933.8048564
INFO :  NavierStokes: 22511.7764471
INFO :  RayTrace: 25943.8374502
INFO :  RegExp: 2725.35973516
INFO :  Richards: 18173.4696
INFO :  Splay: 12362.1847
INFO : Score: 14789.0785369


INFO : v8 benchmark
INFO :  Crypto: 16894.5433219
INFO :  DeltaBlue: 11537.591
INFO :  EarlyBoyer: 21021.188716
INFO :  NavierStokes: 21326.9461078
INFO :  RayTrace: 1753.35884913
INFO :  RegExp: 2900.64875622
INFO :  Richards: 17728.6644
INFO :  Splay: 12125.8608
INFO : Score: 9943.17907491

Note that RayTrace numbers.  I'd probably start there...

It's interesting to me that nothing on awfy seems to have caught this?
tracking-firefox25: --- → ?
And in particular, is the v8v7 raytrace significantly different from the Octane raytrace?
(In reply to Boris Zbarsky (:bz) (reading mail, but on paternity leave) from comment #1)
> And in particular, is the v8v7 raytrace significantly different from the
> Octane raytrace?
No, should be the same

Fyi awfy caught this too. Look at unagi/arm builds. For some bizar reason x86 stays relative calm under it. I think I do saw a box2d regression, though...

Unagi regression range:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=8255862d8b2dd0d403128b50ef9fc44cbfdedc9a&tochange=0b81bf199408bbec041eea8652a5cd87cf98dc48

So when combining both ranges we get the following list (could still be wrong, but would be very coincidental:
c39ede0eea8f	Jason Orendorff — Bug 844805, part 3 - Remove two calls to FoldConstants from the parser. r=Waldo.
5428fa083db3	Jason Orendorff — Bug 844805, part 2 - Don't even set the pn_op field of PNK_DOT/ELEM nodes. r=Waldo.
3f152e51be2b	Jason Orendorff — Bug 844805, part 1 - Don't use the pn_op field of PNK_DOT/ELEM nodes. r=Waldo.
71c097c1ec7a	Jason Orendorff — Add passing test for bug 826124 which went away with JM. no_r=me.
818298a5a183	Jason Orendorff — Bug 885463 - Warn about 'yield' without operand. r=Waldo.
> Fyi awfy caught this too. Look at unagi/arm builds.

Huh, indeed.  I wonder why this is happening on ARM but not x86 there, but definitely on x86 in browser....
I didn't have time to investigate today so I backed out my changes (the most likely culprit). Hope this doesn't help at all! :)

https://hg.mozilla.org/integration/mozilla-inbound/rev/f9bc146f973a
https://hg.mozilla.org/integration/mozilla-inbound/rev/5a1907fc70ed
https://hg.mozilla.org/integration/mozilla-inbound/rev/fe6c678b6e40
Found the bug. I'll fix in bug 844805.
Sorry to be the bearer of bad news, but the backout also introduced a regression to asm.js benchmarks: zlib (18% on Odin) and primes (7% slowdown on non-Odin).

(Awfy x86/x64 regression ranges weren't reliable since GGC is shown on awfy. So I bisected locally. Awfy x86/x64 regression ranges should be correct from now on.)

Updated

4 years ago
Assignee: general → jorendorff
status-firefox24: --- → unaffected
status-firefox25: --- → affected
tracking-firefox25: ? → +
Keywords: regression
OK, the backout happened, and then I fixed the bug and checked it back in. Now where do we stand?
Looks good; I didn't see any notifications in tree-management about this problem.
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → WORKSFORME

Comment 9

4 years ago
The slowdowns mentioned in comment 6 are still on awfy. So either the relanding did not fix the slowdowns, or the slowdowns were caused by something else.
Resolution: WORKSFORME → FIXED
Reopening then. What slowdown exactly does comment 6 describe? Are we talking about the spike in
  http://arewefastyet.com/#machine=11&view=breakdown&suite=asmjs-apps
on the red line on the chart named "asmjs-apps-zlib-workload0"?

If so, the regression appears to be GGC only. Is that right?

It's really hard for me to see how that parser backout could have this effect -- or, if it did, how checking it back in could fail to negate it...
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Definitely something odd going on. In comment 6 Hannes says he could only find the regressing changeset by manual local regression, not awfy, because awfy was flakey.

This is not a GGC issue.

On primes for example http://arewefastyet.com/#machine=11&view=breakdown&suite=asmjs-ubench there are two recent slowdowns. Not the very last one but the one before it, is the one relevant here. On zlib workload 4 (not 0), you can see a slowdown in the orange line, that is also the relevant regression here.
The spikes mentioned are visible on June 27th on asmjs-apps-zlib-workload1 (asmjs, orange line) and asmjs-ubench-primes (no-asmjs, grey line). This is on x86.

The regression range of both incorrectly points to June 27th with regression range:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=e5f664c8b66d&tochange=aeae60c65cc7

The range is wrong, since awfy forgot to rebuild the engine (since we started showing ggc). So the actual regression range is 26th June till 27th of June. So I had to manually bisect and verify the mentioned regression are caused by the backout.

If you want to run the benchmarks, they are on (asmjs-apps and asmjs-ubench):
https://github.com/dvander/arewefastyet/tree/master/benchmarks

running is just doing "python harness.py $JS_PATH/js" in the directory
I bisected locally and got a different regression range for the primes slowdown (bug 885186 was implicated). It is possible it has a different cause than the zlib one, and that is what was measured before?
I see a 1.5% regression locally on zlib on that other bug, which is a regression, but I do not see the large regression we saw on awfy. So I am not sure if that regression is also to blame on the other bug or not. But it seems possible this bug here is not to blame for anything.
Un-taking.
Assignee: jorendorff → general
Naveed, who should own this? Have we determined if the perf regression remains?
Assignee: general → nihsanullah
(Assignee)

Comment 17

4 years ago
The original regression was addressed by jorendorff's backout. You can see this by looking at results between regressed changeset c39ede0eea8f and corrected changeset fe6c678b6e40. http://graphs.mozilla.org/graph.html#tests=[[230,131,33]]&sel=1372087389438.0361,1372404015944.0603,8148.14814814815,15555.555555555555&displayrange=365&datatype=running.

This may have introduced small odin regressions. I will have h4writer dig into those to see if they are still present. If they are I think we we should deal with them as another bug.

Alon are you still see these specific dromaeo perf regressions after jorendorff's backout in fe6c678b6e40? 

The graph seems to indicate we have been ticking upwards ever since so unless azakai is still experiencing these specific v8 regressions I think we can safely close this particular bug down.
Flags: needinfo?(azakai)
(I assume you mean asm regressions for me, not dromeo?) I wasn't able to reliably reproduce the asm regressions from before, so from my perspective I don't think there is anything left to investigate there.
Flags: needinfo?(azakai)
(In reply to Naveed Ihsanullah [:naveed] from comment #17)
> The original regression was addressed by jorendorff's backout.

Thanks!
tracking-firefox25: + → -
Blocks: 935296
You need to log in before you can comment on or make changes to this bug.