Closed Bug 1259280 Opened 4 years ago Closed 4 years ago

asm.js/--no-asmjs difference in treatment of non-canonical NaN

Categories

(Core :: JavaScript Engine: JIT, defect, minor)

x86_64
Linux
defect
Not set
minor

Tracking

()

RESOLVED WONTFIX

People

(Reporter: pipcet, Unassigned)

Details

Attachments

(2 files)

474 bytes, application/javascript
Details
710 bytes, application/javascript
Details
Attached file test case
It's quite probable this isn't a bug, but I'll report just in case it is.

In non-asm.js JavaScript code, HEAPF64[0] = +HEAPF64[0] canonicalizes a non-canonical NaN stored in the first 8 bytes of HEAPF64; subsequently reading back the sign bit with HEAPU32[4]&0x80000000 will always produce 0.

In asm.js code, HEAPF64[0>>3] = +HEAPF64[0>>3] appears to get turned into a nop, so if there was a non-canonical NaN there, HEAPU32[4]&0x80000000 will produce non-zero.

With the attached test case:

$ ./js  ./nan2.js
-2147483648
$ ./js --no-asmjs ./nan2.js
0

Note that (unlike https://bugzilla.mozilla.org/show_bug.cgi?id=584168), this should have no security implications. Also, the test case depends on byte order and the representation of floating-point numbers.
Thanks for reporting; in most cases we do expect the same behavior with asm.js.  In this case, however, the JS spec specifically includes language for allowing possible canonicalization of NaNs (http://tc39.github.io/ecma262/#sec-arraybuffer-objects) so asm.js only does canonicalization of NaNs at FFI and export return boundaries, instead of at each typed array load (as with normal JS).  It's unfortunate that asm.js and JS have visibly divergent behavior, of course, but moving around NaN canonicalization is something that even normal JITs might do now or in the future (given local escape analysis of the loaded double).  Resolving WONTFIX based on this, but feel free to reopen if there's more to this issue.
Status: UNCONFIRMED → RESOLVED
Closed: 4 years ago
Resolution: --- → WONTFIX
Thanks for the response!

I'm reading the spec differently, I'm afraid:

    An implementation must always choose the same encoding for each implementation distinguishable NaN value.

I think that implies that asm.js and non-asm.js code mixed in the same program must not treat NaNs differently: we can argue that HEAPF64[0>>3] is "really" a negative NaN, but how can we claim that it's distinguishable from itself based on whether we're in asm.js code?

It's hard to be sure of this since the spec doesn't specify what "implementation distinguishable" means, but I don't think the same bit pattern in memory can result in two different "implementation distinguishable" NaN values.

To be honest, I think the right solution here is to fix the spec, not the code.

Eventually it would be nice to allow all NaNs as distinct values (by turning non-canonical NaNs into garbage-collected "objects" containing the payload) so the non-asm.js code would behave the same as the asm.js code does now.

I'm attaching a program that copies HEAPF64[0>>3] to two places in memory, once in asm.js code and once in JS. By my reading of the spec, the bit patterns in the two memory locations must match, but they don't.

My interpretation of "implementation distinguishable NaN value" is that there is a function sameNaNAs(x,y), defined in an implementation-defined manner, which defines an equivalence relation on JS values representing NaN, and that two NaN values are implementation distinguishable iff !sameNaNAs(x,y). But then we must have sameNaNAs(HEAPF64[0>>3], HEAPF64[0>>3]), so my code should produce the same bit pattern twice.
Attached file nan3.js
Yes, the wording is pretty fuzzy there, but it is indeed a stretch regardless to have the canonicalization happen at some arbitrary program point, unrelated to the load that produced the noncanonical NaN.  I'm not positive, but I think Chakra also does NaN-boxing on 64-bit so they may also have the same canonicalization behavior.
Actually, I noticed something I missed earlier: the above-quoted caveat in 24.1.1.6 is in *Set*ValueInBuffer (not GetValueInBuffer, as I had been imagining).  When you combine this with the language that seems to imply each store may or may not perform a canonicalization, then I think current asm.js behavior does seem valid: the stores inside asm.js are choosing not to canonicalize in some cases.  Still weird, but this spec language makes sense since it's really only storing to an arraybuffer that makes the otherwise-unobservable NaN payload bits observable.
Hmm. I think the spec is unclear on this, since 24.1.1.5 talks about _the_ NaN value when 24.1.1.6 mentions several implementation distinguishable NaN values.

My vague fear is that someone is going to implement memcpy() using Float64Arrays and be really surprised when their implementation breaks only when debugging is enabled.  Unfortunate, but some surprising behavior is going to be inevitable with IEEE NaNs no matter what we do.

(I've opened an issue at https://github.com/tc39/ecma262/issues/546.)
Hehe, yes, independently, the "the" in 24.1.1.5 sounds like it's *mandating* NaN canonicalization of *all* loads, which is clearly not the intention, so I think that wording is just off.

You're right to worry about memcpy() + Float64Array: that is a known problem in Emscripten-land.  However, the high-order bit is simply to assume NaN canonicalization might happen.
You need to log in before you can comment on or make changes to this bug.