Closed Bug 678687 Opened 13 years ago Closed 13 years ago

TI: ammo.js runs very slowly

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla10
Tracking Status
firefox9 - ---

People

(Reporter: azakai, Assigned: bhackett1024)

References

(Blocks 1 open bug)

Details

(Keywords: regression, Whiteboard: [paladin])

Attachments

(5 files, 1 obsolete file)

Attached file stress benchmark
The attachments are ammo.js (bullet physics engine + autogenerated js bindings) and a small stress benchmark. Running the benchmark gives much poorer results with -n than without:

mozilla-central  -m -j -p     1.88
jaegermonkey     -m -j -p     1.96
jaegermonkey     -m -j -p -n  5.47

Results are printed in the benchmark itself and are in seconds (lower is better). jaegermonkey without -n is slightly slower than m-c, while jaegermonkey with -n is almost 3 times slower.

This code is implicitly statically typed (it's compiled from C++), so I was hoping -n would make it faster :(

If it's important, I can provide a non-minified version of ammo.js.
Can you attach or link to the non-minified version?
Here is a pretty-printed version of ammo.js.
The -D output is pointing to a bunch of element accesses which are almost always stubbed.  I looked at the inferred types for some of these and the element type accessed is int|float.  It's kind of silly to still not have a fast path for accesses on double-representable-as-int elements, but I think that a better fix is to try to convert doubles to ints at type barriers which have never seen doubles before.  This will speed up not just the element accesses but also other uses of the not-really-a-double doubles.  e.g. we take 4 million stub calls in the below function because a/c are thought to be doubles, but the 'a + e' and 'c + e' will go faster if a/c are known integers.

function B(a, c, d) {
  for(var e = 0;e < d;e++) {
    z[a + e] = z[c + e]
  }
}
I'm curious, why are these values thought to be doubles in the first place? Is that just the default before anything else is known, or something else? (I am fairly sure that in functions similar to that in ammo.js, only integers would ever be passed in)
The arguments will only be marked as doubles if a double value has actually been observed to have been passed in.  The problem is that this property holds of doubles like '4.0', and JM+TI will coerce integers into doubles within blocks of code used to manipulate both ints and doubles (this allows JM to store the value in a floating point register and avoid testing its type whenever it is accessed).  If those ints-converted-to-doubles are then passed to functions that had previously seen only integers, the arguments get marked as doubles, every time the function gets called any integer argument is converted to a double, and if those new ints-converted-to-doubles are then passed somewhere else the process repeats.

This problem of imprecision steadily leaking across the program is normally addressed by inserting dynamic type tests ('type barriers') at key points (property access sites and function entry points), but as currently performed these type barriers can't cope with the above kind of leaking and need to be improved (should not be hard).
Very interesting, thanks for the explanation. Sounds like ammo.js (which has heavy use of both ints and floats) would greatly benefit from limiting that imprecision leaking.
Attached patch WIP (obsolete) — Splinter Review
WIP for coercing doubles at type barriers, improves TI's time from ~1.9s to 1.0s (m-c is .8s, d8 is 2.5s).
Here are courtesy of humph three very cool demos on the web that use ammo.js,

http://granular.cs.umu.se/ammo/Demos/index.html

They run much slower in FF9 and Nightly compared to earlier FFs because of this bug.
Whiteboard: [paladin]
Updated patch.  Sorry about the delay here.
Assignee: general → bhackett1024
Attachment #556004 - Attachment is obsolete: true
Attachment #568063 - Flags: review?(dvander)
Attachment #568063 - Flags: review?(dvander) → review+
https://hg.mozilla.org/mozilla-central/rev/3b5b10d76887
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla10
hm, tree-management points out a V8 regression of 2% on Linux and 4% on Linux64...
and also a Dromaeo (String/Array/Eval/Regex) 8% regression
Backed out due to further dev.tree-management confirmation of regressing V8/Dromaeo on multiple platforms since comment 14:
https://groups.google.com/forum/#!topic/mozilla.dev.tree-management/mps5nX3AvzA
...even if the fix ends up being easy, it will be easier to see if it is successful, if it lands as a relanding :-)

https://hg.mozilla.org/integration/mozilla-inbound/rev/6049e8ad9755
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla10 → ---
(In reply to Ed Morley [:edmorley] from comment #15)
> Backed out due to further dev.tree-management confirmation of regressing
> V8/Dromaeo on multiple platforms since comment 14:
> https://groups.google.com/forum/#!topic/mozilla.dev.tree-management/
> mps5nX3AvzA
> ...even if the fix ends up being easy, it will be easier to see if it is
> successful, if it lands as a relanding :-)
> 
> https://hg.mozilla.org/integration/mozilla-inbound/rev/6049e8ad9755

Thanks!  This also dinged some AWFY numbers a little.  I think I will need to use a more targeted fix instead of this patch.
Alternative approach which just allows GETELEM/SETELEM paths to work on known doubles, by first trying to convert the double to an integer.  This gives a similar speedup to the ammo benchmark.

pre-TI js -m -j -p:  3.49
js -m -n (old):      4.52
js -m -n (new):      3.46
d8:                  6.22

The Dromaeo regressions were caused by the VM changes, trying to convert doubles to ints after element/property accesses and when making function calls.  I rearranged things to be more efficient but the slowdown remained when pushing things to try.  I want to figure out what's up and get the real fix in, but will not be able to do so for Firefox 10 (shouldn't affect ammo.js performance much, however).
Attachment #570702 - Flags: review?(dvander)
So that last change just makes ammo.js faster without making the VM changes that affected dromaeo?
(In reply to Boris Zbarsky (:bz) from comment #18)
> So that last change just makes ammo.js faster without making the VM changes
> that affected dromaeo?

Yes.  The latest patch just improves compiler paths so that x[y] and x[y] = z are faster when y is a double convertible to an int, and makes things strictly faster than they are now.
Attachment #570702 - Flags: review?(dvander) → review+
And backed out again due to an intermittent timeout on test_jQuery...

https://hg.mozilla.org/integration/mozilla-inbound/rev/7cb178191144
Round 3:

https://hg.mozilla.org/integration/mozilla-inbound/rev/f8bbd27f41ce

I tweaked the patch a bit to slow path when accessing a double identifier in GETELEM or SETELEM ops which require an IC.  With this change, I wasn't able to reproduce the timeout on try:

https://tbpl.mozilla.org/?tree=Try&rev=995136d9f804
Assignee: bhackett1024 → general
https://hg.mozilla.org/mozilla-central/rev/f8bbd27f41ce
Status: REOPENED → RESOLVED
Closed: 13 years ago13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla10
Assignee: general → bhackett1024
This patch is a big improvement, I now get

-m -j -p   0.699
-m -n      0.959

but it is still 37% slower with TI - and TI is something which should, in principle, make this kind of code *faster* (like it does on most benchmarks).

Do we know what the remaining issues are here?
(In reply to Alon Zakai (:azakai) from comment #24)
> This patch is a big improvement, I now get
> 
> -m -j -p   0.699
> -m -n      0.959
> 
> but it is still 37% slower with TI - and TI is something which should, in
> principle, make this kind of code *faster* (like it does on most benchmarks).
> 
> Do we know what the remaining issues are here?

What is the command line you are running?  I do js -f ammo.pretty.js -f ammo.stress.js, and get similar times with -mjp and -mn of about 3.5 seconds.
I am using the original 2 files attached in this bug, and doing |js [opts] stress.js|. I'm on 32-bit linux. As mentioned in comment 0, I am looking at the number printed in the benchmark itself, not the overall time (the overall time would include startup, which isn't important in practice with this code).
[Triage Comment]
This was nominated for tracking on FF9, and is now resolved/fixed. Is this still something that we want to take on aurorabeta? If so, please nominate the patch.
Depends on: 723574
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: