Bug 913892 (Open) · Opened 11 years ago · Updated 2 years ago

Performance of JS arrays could be much closer to that of typed arrays

Categories: (Core :: JavaScript Engine, defect)

Platform: x86_64 Windows 7

People: (Reporter: kael, Unassigned)

References: (Blocks 1 open bug)

Attachments: (2 files)

Attached file ByteVsBoolArray.zip
V8 seems to have a mechanism through which it can identify that a JS Array instance contains only elements of a single type. In comparison, right now in SpiderMonkey, if you have two JS Array instances, one containing booleans and one containing small integers, the one containing small integers is much faster.

This test case is some simple JSIL-compiled code that does a sieve algorithm; one test function uses an array of booleans (which maps to JS Array) and the other uses an array of bytes (which maps to a Uint8Array in the JSIL runtime).
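The two variants can be sketched as follows (a hypothetical simplification of the JSIL-compiled code, not the attached test case itself; the function names are illustrative):

```javascript
// Stand-in for the "bools" variant: a plain JS Array holding booleans.
function sieveBools(limit) {
  var composite = new Array(limit).fill(false);
  var count = 0;
  for (var i = 2; i < limit; i++) {
    if (!composite[i]) {
      count++;
      for (var j = i * i; j < limit; j += i) composite[j] = true;
    }
  }
  return count; // number of primes below limit
}

// Stand-in for the "bytes" variant: a Uint8Array holding 0/1 flags.
function sieveBytes(limit) {
  var composite = new Uint8Array(limit); // zero-initialized
  var count = 0;
  for (var i = 2; i < limit; i++) {
    if (!composite[i]) {
      count++;
      for (var j = i * i; j < limit; j += i) composite[j] = 1;
    }
  }
  return count;
}
```

Both functions do the same work; the only difference is the element storage, which is what the timings below isolate.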

In SpiderMonkey, the byte test case is 2x as fast as the boolean test case:
C:\Users\Kevin\Desktop\ByteVsBoolArray>js runTest.js
18367
Sieve Bytes: 02554.00ms
Sieve Bools: 05306.00ms

In V8, somehow they manage to generate the same code for both test cases:
18367
Sieve Bytes: 01627.00ms
Sieve Bools: 01616.00ms

I suspect that if this optimization applied more generally it would be useful in a lot of cases, especially since it lets you keep using the JS boolean type instead of having to change code to use type-hinted integers.
Blocks: JSIL
From your description, I can't see why you come to the conclusion that the performance of untyped arrays depends on the element type. AFAICT, you're comparing an untyped array of booleans to a typed array of Uint8s, right?
Blocks: 885526
Yeah, sorry, the title is wrong. I noticed after writing up the bug. It was an untyped array of bytes at one point but the test case actually isn't :)
I get about the same numbers with an inbound build from last week, but tip works a lot better and is faster than V8 here. With --ion-parallel-compile=on I get:

Build from last week:

Sieve Bytes: 02318.00ms
Sieve Bools: 05292.00ms

Inbound tip:

Sieve Bytes: 01271.00ms
Sieve Bools: 01511.00ms

d8 3.21.12:

Sieve Bytes: 02351.00ms
Sieve Bools: 02959.00ms

Also note that for me with d8 the second test is also slower.
I should note that in my testing I observed inconsistent behavior out of V8 - sometimes the test cases would run miserably slowly for no obvious reason (opening the debugger in Chrome was an easy way to trigger it). Interesting to see that d8 does not perform consistently with what I saw.

I recently landed an optimization in JSIL that uses Uint8Array as the backing store for Boolean[], which makes the two test cases produce nearly equivalent code. Interestingly once this is done *both* test cases run in around 320ms in Firefox - that is, even the bytes test case becomes radically faster - and *both* test cases are now miserably slow in Chrome. Given the inconsistency here I'm not sure how much this optimization affects real world apps in SpiderMonkey or V8...
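That kind of backing-store optimization could look roughly like this (a minimal sketch, not JSIL's actual implementation; the `BoolArray` name and its API are hypothetical):

```javascript
// Hypothetical sketch: back a Boolean[] with a Uint8Array, storing
// booleans as 0/1 bytes and converting back to true/false on read.
function BoolArray(length) {
  this.bytes = new Uint8Array(length); // zero-initialized, i.e. all false
}
BoolArray.prototype.get = function (i) {
  return this.bytes[i] !== 0; // byte -> boolean
};
BoolArray.prototype.set = function (i, value) {
  this.bytes[i] = value ? 1 : 0; // boolean -> byte
};

var flags = new BoolArray(4);
flags.set(2, true);
```

With this wrapper, both the "bools" and "bytes" test cases end up iterating over a Uint8Array, which is presumably why the generated code converges.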

Updating bug title to try and make it more accurate; sorry it was a mess before.
Summary: Performance of JS arrays depends on element type and is dramatically inferior to typed arrays → Performance of JS arrays could be much closer to that of typed arrays
Attached file ByteVsBoolArray2.zip
Here's the test case as run by the latest version of JSIL; FF absolutely crushes it since typed arrays are being used to store booleans, and somehow the byte case has gotten faster. Not really a bug in any fashion, but the fact that the byte case got so much faster makes me wonder why the original test case couldn't be just as fast.

V8's performance on it is bizarre.
Something seems to be weird with the test case - or I just don't understand it well enough. :)

When I do s/SieveBools/SieveBytes/ in line 64 of the first testcase, times for SieveBytes drop from ~1400ms to ~400ms, with times for SieveBools staying at ~1600ms.

Maybe your second test case just changes that, resulting in the faster times for the bytes version?
The second test case doesn't change that though; I actually tried swapping the warming call in the second test case and saw no difference (though of course, the second test case is incredibly fast). Hrm.

Applying your warming tweak to the original test case gives me:

before
Sieve Bytes: 02579.00ms
Sieve Bools: 03636.00ms

after
Sieve Bytes: 00364.00ms
Sieve Bools: 03672.00ms

Which is a lot bigger than I expected. I wonder if the problem is that if SieveBools runs first, the type information becomes a mess and SieveBytes never gets optimized? Reversing the order of the actual timing runs seems to support this:

after2
Sieve Bools: 03612.00ms
Sieve Bytes: 02674.00ms

In the above case I ran SieveBytes, then timed SieveBools, then timed SieveBytes. So it looks like SieveBools deopts SieveBytes. That's frustrating (probably one of the library calls' shared type information...)
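The "after2" run order described above can be sketched as a small harness (hypothetical; the sieve bodies are simplified stand-ins for the test case's functions, and absolute timings will vary by engine):

```javascript
// Time a single call and report it, mirroring the test case's output style.
function time(label, fn, arg) {
  var start = Date.now();
  fn(arg);
  var elapsed = Date.now() - start;
  console.log(label + ": " + elapsed + "ms");
  return elapsed;
}

// Stand-in for the byte-array sieve from the test case.
function sieveBytes(n) {
  var a = new Uint8Array(n);
  for (var i = 2; i < n; i++)
    if (!a[i])
      for (var j = i * i; j < n; j += i) a[j] = 1;
}

// Stand-in for the boolean-array sieve from the test case.
function sieveBools(n) {
  var a = new Array(n).fill(false);
  for (var i = 2; i < n; i++)
    if (!a[i])
      for (var j = i * i; j < n; j += i) a[j] = true;
}

sieveBytes(1000);                       // warm-up run, not timed
time("Sieve Bools", sieveBools, 1000);  // timed first
time("Sieve Bytes", sieveBytes, 1000);  // timed after SieveBools has run
```

If SieveBools pollutes type information shared with SieveBytes, warming SieveBytes first is not enough: the slowdown shows up once SieveBools has executed, which is what the "after2" numbers suggest.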

This at least explains why everything is faster in v2 of the test case - both functions now use Uint8Array, so spidermonkey is like 'who cares, they're the same'.

This suggests that JSIL performance in general is being kneecapped all over the place by type information getting polluted, as I've suspected. Blech.
(In reply to K. Gadd (:kael) from comment #7)
> This suggests that JSIL performance in general is being kneecapped all over
> the place by type information getting polluted, as I've suspected. Blech.

Yeah, that's a shame. And most certainly true for other code bases such as Shumway, too.

Jandem, do we have anything at all in the pipeline or at least as a vague idea that would help fix issues like this?
Flags: needinfo?(jdemooij)
(In reply to Till Schneidereit [:till] from comment #8)
> Jandem, do we have anything at all in the pipeline or at least as a vague
> idea that would help fix issues like this?

Not that I know of. It's a hard problem; maybe the callsite cloning mechanism we have for self-hosted functions could help, but we'd have to infer where to use that and it's not straightforward.
Flags: needinfo?(jdemooij)
Assignee: general → nobody
Severity: normal → S3