asm.js performance regression bisected to #1294606

RESOLVED FIXED in Firefox 51

Status

()

defect
P1
major
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: pipcet, Assigned: pipcet)

Tracking

(Blocks 1 bug, {perf, regression})

51 Branch
mozilla53
x86_64
Linux
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox-esr45 unaffected, firefox50 unaffected, firefox51 fixed, firefox52 fixed, firefox53 fixed)

Details

Attachments

(2 attachments)

I'm running into severe compilation time issues with recent versions of SpiderMonkey when running a (very) large asm.js file. I've bisected the problem down to patch part 2 of #1294606. My first guess at what's going wrong is that the new MRSh::foldsTo function just gives up if it's not looking at a sign extension, rather than calling MBinaryBitwiseInstruction::foldsTo.

The details:
I'm working on a gcc (binutils,glibc) backend for asmjs (and wasm) at https://github.com/pipcet/asmjs. I'm producing large switch statements, one for each function, rather than using an Emscripten-like relooper. 

When compiling Perl, the Perl_yylex function turns into about 30,000 lines of code. When switching from SpiderMonkey-as-of-a-few-months ago to the current inbound branch, compilation time drastically increased, from one minute real time to about three minutes. Most of that time was indeed spent compiling Perl_yylex, and the problem appeared to be in the Ion register allocators. --wasm-always-baseline resulted in acceptable startup times, and nodejs --harmony similarly didn't seem to be too problematic.

I bisected it down to part 2 of the patch for #1294606, noticed the lack of inheritance for ::foldsTo, tried fixing it, and performance appears to have gone back to the previous level.

I've attached a proposed patch which applies to the current (well, a few days old) inbound tree. If it looks okay I'll convert to SVN format.
Attachment #8812551 - Flags: feedback?(nicolas.b.pierron)
No luck reducing the script size for a test script yet, so I've uploaded a 7z archive with perl.js to https://gist.github.com/pipcet/ed5e1a3ffeae8167bdf7905ed035834d.

I'm seeing timings like

real	0m26.592s
user	1m8.212s
sys	0m5.528s

with my patch versus

real	2m28.336s
user	4m40.672s
sys	0m7.272s

without it.

If I should upload that file anywhere else or can be of any assistance building the script from the repository (which is probably broken right now. Sorry, working on it.), or if there's anything else needed to fix this bug, just let me know.
Blocks: 1294606
Severity: normal → major
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: perf, regression
Version: Trunk → 51 Branch
Hi pipcet, it looks like you already have a patch so I'm gonna assign it to you. Thanks for working on this!
Assignee: nobody → pipcet
Can someone put this on the test server? Thanks!
Attachment #8812675 - Flags: review?(nicolas.b.pierron)
So with the patch it still takes 1 minute to compile this? Looks like there may be other issues.
Flags: needinfo?(luke)
(In reply to Jan de Mooij [:jandem] from comment #4)
> So with the patch it still takes 1 minute to compile this? Looks like there
> may be other issues.

Oops. I really should have mentioned I'd reduced CPU speed to get more comparable numbers. Sorry!

Which is not to say there aren't other issues, most likely on my side.

I think lazy asm.js/wasm compilation would be a great feature that would probably reduce that time further; after all, perl.js contains much of glibc, most of which is never called.
(In reply to pipcet from comment #5)
> Oops. I really should have mentioned I'd reduced CPU speed to get more
> comparable numbers. Sorry!

Ah! No problem. If you notice any cases that are slower than in Safari/Chrome/Edge, please do file new bugs (in the "JavaScript Engine: JIT" component) :)
Flags: needinfo?(luke)
Comment on attachment 8812675 [details] [diff] [review]
Mercurial-formatted patch

Review of attachment 8812675 [details] [diff] [review]:
-----------------------------------------------------------------

Good catch!
Attachment #8812675 - Flags: review?(nicolas.b.pierron) → review+
(In reply to pipcet from comment #3)
> Can someone put this on the test server? Thanks!

https://treeherder.mozilla.org/#/jobs?repo=try&revision=803996f29a5418a66533a135dcadc5063559ca12

I only scheduled it on Linux with the JS test suite, I do not expect any operating system behaviour from this patch.
Good find!
Blocks: jsperf
Priority: -- → P1
(In reply to pipcet from comment #5)
> (In reply to Jan de Mooij [:jandem] from comment #4)
> > So with the patch it still takes 1 minute to compile this? Looks like there
> > may be other issues.
> 
> I think lazy asm.js/wasm compilation would be a great feature that would
> probably reduce that time further; after all, perl.js contains much of
> glibc, most of which is never called.

Or you can use a faster compiler:

$ time ~/moz/mozilla-inbound/js/src/build-release/dist/bin/js --wasm-always-baseline perl.js 

real	0m3.602s
user	0m3.384s
sys	0m1.415s

$ time ~/moz/mozilla-inbound/js/src/build-release/dist/bin/js perl.js

real	0m35.537s
user	1m12.007s
sys	0m2.597s

(With both of these I get a strange error, the output is this:

  lseek 0 0 1
  lseek 1 0 1
  lseek 2 0 1
  Can't open perl script "HOME=/Users/lhansen": Invalid argument

Is that expected?)
(In reply to Lars T Hansen [:lth] from comment #10)
> (In reply to pipcet from comment #5)
> > I think lazy asm.js/wasm compilation would be a great feature that would
> > probably reduce that time further; after all, perl.js contains much of
> > glibc, most of which is never called.
> 
> Or you can use a faster compiler:

Thanks! That's actually what I've been doing to build Perl.

> (With both of these I get a strange error, the output is this:
> 
>   lseek 0 0 1
>   lseek 1 0 1
>   lseek 2 0 1
>   Can't open perl script "HOME=/Users/lhansen": Invalid argument
> 
> Is that expected?)

Yes, that's expected. I was so eager to report the bug I didn't stop to fix unrelated issues, so the script isn't actually functional right now.

I'd like to thank everyone for responding so quickly and being so helpful.
(In reply to pipcet from comment #1)
> No luck reducing the script size for a test script yet, so I've uploaded a
> 7z archive with perl.js to
> https://gist.github.com/pipcet/ed5e1a3ffeae8167bdf7905ed035834d.
> 
> I'm seeing timings like
> 
> real	0m26.592s
> user	1m8.212s
> sys	0m5.528s
> 
> with my patch versus
> 
> real	2m28.336s
> user	4m40.672s
> sys	0m7.272s
> 
> without it.
> 
> If I should upload that file anywhere else or can be of any assistance
> building the script from the repository (which is probably broken right now.
> Sorry, working on it.), or if there's anything else needed to fix this bug,
> just let me know.

Good find! ;) My mistake.
Comment on attachment 8812551 [details] [diff] [review]
proposed patch

(feedback provided as part of the review of the other patch)
Attachment #8812551 - Flags: feedback?(nicolas.b.pierron)
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f77c4250f23a
constant-fold asm.js RSh expressions again. r=nbp
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/f77c4250f23a
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Hi :luke,
WebAssembly won't be shipped in 51, is 51 still affected? or Should I mark 51 as disabled?
Flags: needinfo?(luke)
(In reply to Gerry Chang [:gchang] from comment #16)
> WebAssembly won't be shipped in 51, is 51 still affected? or Should I mark
> 51 as disabled?

Yes, this issue is also for JS, not only WebAssembly & asm.js.
We should backport to 51 and 52.
Flags: needinfo?(luke)
Nicolas, can you please help out with the Aurora/Beta approval requests?
Flags: needinfo?(nicolas.b.pierron)
Comment on attachment 8812675 [details] [diff] [review]
Mercurial-formatted patch

Approval Request Comment
[Feature/regressing bug #]: Bug 1294606

[User impact if declined]: IonMonkey is unable to optimize ">>" operators, such as folding constants.

[Describe test coverage new/current, TreeHerder]: Landed on central, and more than likely exercised by jit-test test suite.

[Risks and why]: None, this re-establish the function call which used to be implicit (via virtual function calls & inheritance) before the implementation of MRsh::foldsTo.

[String/UUID change made/needed]: N/A
Flags: needinfo?(nicolas.b.pierron)
Attachment #8812675 - Flags: approval-mozilla-beta?
Attachment #8812675 - Flags: approval-mozilla-aurora?
Comment on attachment 8812675 [details] [diff] [review]
Mercurial-formatted patch

Fix a perf regression. Beta51+ and Aurora52+. Should be 51 beta 3.
Attachment #8812675 - Flags: approval-mozilla-beta?
Attachment #8812675 - Flags: approval-mozilla-beta+
Attachment #8812675 - Flags: approval-mozilla-aurora?
Attachment #8812675 - Flags: approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.