Closed Bug 945382 Opened 11 years ago Closed 11 years ago

Typed array behavior different across different JS engines

Categories

Product/Component: Core :: JavaScript Engine (defect)
Version: 26 Branch
Hardware: x86, macOS
Priority: Not set
Severity: normal

Tracking

Status: RESOLVED INVALID

People

(Reporter: heidi.pan, Unassigned)

Details

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36

Steps to reproduce:
Ran https://gist.github.com/wahbahdoo/7755708 using both the SpiderMonkey (26.0a1) and V8 (3.22.19) shells and got differing results for typed array construction. (OS X 10.8.5, Core i7-2720QM)

Actual results:
SpiderMonkey: i[0] != a[0]
V8: i[0] == a[0]
The results seem to differ for all negative numbers.

Expected results:
Please advise on the correct behavior and make the two engines consistent. Thanks!
What you are seeing is NaN canonicalization. -1 is 0xffffffff which, when interpreted as a float, is a NaN value. Due to the use of NaN-boxing in SM/JSC, we canonicalize all float/double values on load from a typed array, which will change the value to the canonical NaN value of 0x7fc00000. This divergence between engines is known and explicitly allowed by the spec. Tentatively resolving as 'invalid', but let me know if you have any further questions or think there is still a problem.
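The divergence can be reproduced with plain typed arrays. This is a minimal sketch of the pattern described above (variable names mirror the report and are not taken from the original gist):

```javascript
// -1 stored as an int32 has the bit pattern 0xffffffff, which is a NaN
// when those same 4 bytes are viewed as a float32.
const i = new Int32Array([-1]);
const f = new Float32Array(i.buffer);   // same bytes, viewed as float32

const loaded = f[0];                    // a NaN; SM/JSC may canonicalize on this load

const a = new Int32Array(1);
new Float32Array(a.buffer)[0] = loaded; // write the (possibly canonical) NaN back

// In a canonicalizing engine a[0] may be 0x7fc00000 (so a[0] !== i[0]);
// in an engine that preserves NaN payloads, the original 0xffffffff
// survives (a[0] === i[0]). Both behaviors are spec-conformant.
console.log((a[0] >>> 0).toString(16));
```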
Status: UNCONFIRMED → RESOLVED
Closed: 11 years ago
Resolution: --- → INVALID
Ahh, thanks! This came up from a contrived test case for ECMAScript SIMD (https://github.com/johnmccutchan/ecmascript_simd/pull/36), but in practice there should only be float32x4->int32x4->float32x4 bit-cast operations (all bitwise ops are int32x4,int32x4->int32x4, and comparison ops return int32x4 masks), and no int32x4->float32x4->int32x4 bit-cast operations, so we should be ok.
Actually, this is posing a problem for Emscripten when translating existing x86 intrinsics (the main source of existing SIMD code) to JS SIMD. 1) The comparison operations return 0xffffffff or 0 as a float, e.g. _mm_cmplt_ps(a, b) => SIMD.int32x4.bitsToFloat32x4(SIMD.float32x4.lessThan(a, b)). This is problematic when a < b, because the result is most likely going to be used as a mask in a bitwise op or fed into movemask. 2) The bitwise operations are done in floating point, e.g. _mm_and_ps(a, b), but b could easily be a mask created as _mm_castsi128_ps(_mm_set1_epi32(0xffffffff)), or created from a comparison operation. Suggestions?
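The failure mode being described can be sketched with plain typed arrays (the SIMD.* API above never shipped; the helper below is hypothetical and emulates a single lane of the _mm_cmplt_ps / _mm_and_ps pattern a polyfill would generate):

```javascript
// One lane of an SSE-style compare: all-ones bits where a < b, else zero.
function cmpltLane(a, b) {
  return a < b ? -1 : 0;                // -1 is the 0xffffffff mask
}

const maskBits = cmpltLane(1.0, 2.0);   // lane selected: 0xffffffff

// A float-typed polyfill stores the mask in a float32 lane...
const lane = new Float32Array(new Int32Array([maskBits]).buffer)[0];
// ...and later reloads its bits to AND with a value's bits:
const reloaded = new Int32Array(new Float32Array([lane]).buffer)[0];

const valueBits = new Int32Array(new Float32Array([3.5]).buffer)[0];
const selected = new Float32Array(
  new Int32Array([valueBits & reloaded]).buffer)[0];

// If `reloaded` is still 0xffffffff, selected === 3.5. If the engine
// canonicalized the NaN to 0x7fc00000 in transit, low mantissa bits of
// 3.5 are zeroed and the "select" silently yields 3.0.
console.log(selected);
```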
IIUC, float32x4 wouldn't have this problem (that is, all float32x4 operations would leave the float components uncanonicalized; the canonicalization that would occur is when someone loads a component). Does that address your concern?
That helps a lot, but would still be a problem in the polyfill. Is there any way to mimic this behavior for a polyfill implementation? Thx!
Actually, do you also mind clarifying 'the canonicalization that would occur is when someone loads a component'? The concern here is bit-casting an int32x4 to a float32x4, which would most likely create a new float32x4 out of the original int32x4 components. Would canonicalization happen if one of the int32x4 components were 0xffffffff?
By "loads a component" I mean accessing the .x of a float32x4 array. You're right, though, this would be an issue for self-hosted code! Perhaps we'd just implement the SIMD methods in C++ (where canonicalization doesn't happen unless you ask for it). Self-hosting is nice but it also has its own hidden pitfalls, and, for simple math functions, the C++ doesn't end up being too complicated (see Math.abs: http://hg.mozilla.org/mozilla-central/file/c93cfe704487/js/src/jsmath.cpp#l119).
The one example I can come up with where this poses a weird semantic issue is described at https://github.com/johnmccutchan/ecmascript_simd/issues/38
The Dart VM implements the unaccelerated version of SIMD in C++. I would hope that both V8 and SpiderMonkey do the same.
Isn't this still an issue even if the code is in C++? The SIMD API lets you bit-cast an i32 to a float and read that float into a normal JS variable. That is where NaN canonicalization must occur, since a normal JS variable cannot (in at least 2 of the major JS engines) contain a value dangerous for NaN-boxing, I think?
In C++ you only get canonicalization if you ask for it. The C++ would have to be careful when dealing with float/double variables not to stick them in a js::Value that could escape into the VM or be observed by the GC, but that shouldn't be hard for these operations.
Luke, just to clarify my understanding, are you saying that the runtime implementation of n2 = SIMD.float32x4.withX(n, n.x); won't canonicalize n.x as long as it's not used elsewhere (and the runtime is smart enough to figure that out)?
(In reply to Heidi Pan from comment #14) In your example, the "n.x" is one of the "component loads" (I should have said "Lane Accessors") I was mentioning in comment 6 so "n.x" would canonicalize the resultant float32. Now, is this an example that people would write in a real SIMD kernel? I was under the impression that this sort of manipulation was bad form and slow [1]. [1] https://www.dartlang.org/articles/simd/#reading-the-value-of-individual-lanes
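A polyfill or self-hosted implementation makes the point above concrete: a lane accessor returns a plain JS number, so the lane value passes through the engine's NaN handling. This sketch is hypothetical (the function names are illustrative, not from the ecmascript_simd polyfill):

```javascript
// A float32x4 represented as a Float32Array of its four lanes.
function makeFloat32x4FromBits(bits) {         // bits: a 4-element Int32Array
  return { storage: new Float32Array(bits.buffer.slice(0)) };
}
function getX(n) { return n.storage[0]; }      // the "n.x" lane load in question
function withX(n, x) {
  const s = new Float32Array(n.storage);       // copy the four lanes
  s[0] = x;                                    // store the JS number back
  return { storage: s };
}

const n = makeFloat32x4FromBits(new Int32Array([-1, 0, 0, 0]));
const n2 = withX(n, getX(n));                  // the example from comment 14

const xBits = new Int32Array(n2.storage.buffer)[0] >>> 0;
// xBits stays 0xffffffff only if the engine preserves the NaN payload
// across the getX/withX round trip; a canonicalizing engine yields
// 0x7fc00000. A C++ implementation that shuttles raw bits and never
// materializes the lane as a JS number sidesteps this entirely.
console.log(xBits.toString(16));
```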
I think it's OK to leave it to the implementation to decide whether to canonicalize a lane access, just like it's left to the implementation to decide whether to canonicalize a Float32Array/Float64Array element access. Also, see the comments at: https://github.com/johnmccutchan/ecmascript_simd/issues/38
Yes it is bad practice, so I hope not. I just wanted to make sure I understood the behavior regardless, and since we are providing such an API (making lane accesses much easier to do in JS than in C), it'll probably be good to spell these things out so that developers don't accidentally shoot themselves in the foot. Thanks!
Hmm. Of course, I *did* suggest that we implement the various accessors as self-hosted code, due to all the usual advantages -- I didn't consider this issue at the time. I suppose we can reimplement as C++ where needed, but how unfortunate.