Closed Bug 1245627 Opened 8 years ago Closed 8 years ago

Storing into float32 typed arrays must canonicalize in more-deterministic builds

Categories

(Core :: JavaScript Engine: JIT, defect)

defect
Not set
major

Tracking

()

RESOLVED FIXED
mozilla49
Tracking Status
firefox47 --- wontfix
firefox49 --- fixed

People

(Reporter: gkw, Assigned: bbouvier)

References

Details

(Keywords: regression, testcase)

Attachments

(3 files, 4 obsolete files)

f = (function(stdlib, foreign, heap) {
    "use asm";
    var Float32ArrayView = new stdlib.Float32Array(heap);
    var Float64ArrayView = new stdlib.Float64Array(heap);
    function f(i0) {
        Float32ArrayView[i0 >>> 0] = 0 / 0
        return (Float64ArrayView[0]);
    }
    return f;
})(this, {}, new ArrayBuffer(4096));
for (var j = 0; j < 2; ++j) {
    print(f());
}


$ ./js-dbg-64-dm-clang-darwin-5f9ba76eb3b1 --fuzzing-safe --no-threads --baseline-eager testcase.js
1.058925634e-314
2.1199235295e-314

$ ./js-dbg-64-dm-clang-darwin-5f9ba76eb3b1 --fuzzing-safe --no-threads --ion-eager testcase.js
1.058925634e-314
1.058925634e-314

Tested this on m-c rev 5f9ba76eb3b1.

My configure flags are:

CC="clang -Qunused-arguments" CXX="clang++ -Qunused-arguments" AR=ar AUTOCONF=/usr/local/Cellar/autoconf213/2.13/bin/autoconf213 sh /Users/skywalker/trees/mozilla-central/js/src/configure --target=x86_64-apple-darwin14.5.0 --disable-jemalloc --enable-debug --enable-more-deterministic --with-ccache --enable-gczeal --enable-debug-symbols --disable-tests

python -u ~/funfuzz/js/compileShell.py -b "--enable-debug --enable-more-deterministic" -r 5f9ba76eb3b1

This goes back past m-c rev 54be5416ae5d (nov 2014), so it will be great if Luke or Benjamin might be able to diagnose for a start, and move it on if necessary.
Flags: needinfo?(luke)
Flags: needinfo?(bbouvier)
Hah, wild.  This is invalid asm.js so the "use asm" is only serving to force eager parsing.  This test case also shows the same behavior:

f = (function(stdlib, foreign, heap) {
    with({}){}
    var Float32ArrayView = new stdlib.Float32Array(heap);
    var Float64ArrayView = new stdlib.Float64Array(heap);
    function f(i0) {
        Float32ArrayView[i0 >>> 0] = 0 / 0
        return (Float64ArrayView[0]);
    }
    return f;
})(this, {}, new ArrayBuffer(4096));
for (var j = 0; j < 2; ++j) {
    print(f());
}

(using 'with' instead of "use asm" to force eager parsing) and you can see the difference with only the --baseline-eager flag.  Perhaps you could autobisect back further using this?
Flags: needinfo?(luke)
Flags: needinfo?(bbouvier)
Summary: Differential Testing: Different output message involving "use asm" → Differential Testing: Different output message involving eager parsing and --baseline-eager
Another simpler test case:

var heap = new ArrayBuffer(8);
var Float32ArrayView = new Float32Array(heap);
var Float64ArrayView = new Float64Array(heap);

var f = (function() {
    with({}){}
    return function () {
        Float32ArrayView[0] = 0 / 0;
        return Float64ArrayView[0];
    }
})();

for (var j = 0; j < 2; ++j) {
    print(f());
}
It goes back before m-c rev 54be5416ae5d. Setting needinfo? from Hannes and Jan as a start.
Flags: needinfo?(jdemooij)
Flags: needinfo?(hv1989)
Hannes can you investigate this? I'm trying to forward/distribute the fuzz bugs a bit better.
Flags: needinfo?(jdemooij)
Sure. Though I'm on PTO this week. I can investigate next week.
I investigated this, for the learning exercise in working in JIT code.  :-)  The issue is that division in the interpreter (i.e. via js::NumberDiv produces JS::GenericNaN(), but division in the JITs fails to canonicalize -- and on x86-64 at least, +0 / +0 produces a NaN bit pattern that is *not* JS::GenericNaN() (rather, it's that value with the sign bit set).  So we need to canonicalize to JS::GenericNaN() when dividing in the JITs, for consistency.
Attachment #8717699 - Flags: review?(bbouvier)
Assignee: nobody → jwalden+bmo
Status: NEW → ASSIGNED
Flags: needinfo?(hv1989)
Hardware: x86_64 → All
Summary: Differential Testing: Different output message involving eager parsing and --baseline-eager → Floating-point division in the JITs doesn't canonicalize (observable assigning the result into a typed array, then punning)
What if we do this only in more-deterministic builds? I think NegD can also produce NaNs with the sign bit set, because we XOR the double:

https://dxr.mozilla.org/mozilla-central/rev/2dfb45d74f42d2a0010696f5fd47c7a7da94cedb/js/src/jit/x86-shared/MacroAssembler-x86-shared-inl.h#172

Hm, actually, we already canonicalize when we store to typed arrays in more-deterministic builds:

https://dxr.mozilla.org/mozilla-central/rev/2dfb45d74f42d2a0010696f5fd47c7a7da94cedb/js/src/jit/MacroAssembler.cpp#241

Why isn't that working here?
(In reply to Jan de Mooij [:jandem] from comment #7)
> Why isn't that working here?

Oh probably because we do that in the Scalar::Float64 case but not for Scalar::Float32...
I agree with jandem, we do not need to have determinism in the general case as long as we do not store a Value.

For fuzzing, we only need to be more deterministic in the float32 case, where we store into the type array.

https://dxr.mozilla.org/mozilla-central/rev/2dfb45d74f42d2a0010696f5fd47c7a7da94cedb/js/src/jit/MacroAssembler.cpp#238

And maybe we should do the same in Ion and for SIMD store.

https://dxr.mozilla.org/mozilla-central/source/js/src/jit/x86/CodeGenerator-x86.cpp#481-482,519
Comment on attachment 8717699 [details] [diff] [review]
Canonicalize floating-point division results in the JITs

Review of attachment 8717699 [details] [diff] [review]:
-----------------------------------------------------------------

Clearing review, see previous comments (only the f32 store should canonicalize).
Attachment #8717699 - Flags: review?(bbouvier)
are you planning on finishing this up Jeff?
Flags: needinfo?(jwalden+bmo)
Attached patch v2, as desired, I think (obsolete) — Splinter Review
Attachment #8744172 - Flags: review?(bbouvier)
Boom.  (I think.)
Flags: needinfo?(jwalden+bmo)
Summary: Floating-point division in the JITs doesn't canonicalize (observable assigning the result into a typed array, then punning) → Storing into float32 typed arrays must canonicalize in more-deterministic builds
Comment on attachment 8744172 [details] [diff] [review]
v2, as desired, I think

Review of attachment 8744172 [details] [diff] [review]:
-----------------------------------------------------------------

The fix looks good in MacroAssembler.cpp, but I have a few remarks/questions for the rest:
- Can you add the test, please?
- I think you're either doing too much or not enough: canonicalization happens on MIPS and x86 only for asm.js stores, but not on x64 and arm (see also visitAsmJSStoreHeap methods). Also, if you do it for asm.js, you'll probably need to do it for SIMD stores as well.
- Can you split the (nice) refactoring in CodeGenerator-arm/CodeGenerator-mips-shared.cpp in another patch? (and asign it insta r+ from me, if you want)
Attachment #8744172 - Flags: review?(bbouvier)
(In reply to Benjamin Bouvier [:bbouvier] from comment #14)
> - Can you add the test, please?

I don't think so?  The test is only "valid" in more-deterministic builds.  And I don't see a jit-tests sigil to require that, or at least |ack -i eterministic jit-tests| doesn't turn it up.
Attached patch Maybe better? (obsolete) — Splinter Review
Does this hit the points you also wanted fixed?  I confess I'm not sure I followed all this code correctly (and there's at least one XXX location you should look at specifically).

Also, still no test because I don't believe there's a way to detect more-deterministic builds and run a test only in them.  (We could perhaps add one, but do we actually regularly run tests in more-deterministic builds?  And compare against not-such-builds to verify there's a distinction being performed?  Unless we do that, a test for this is not really going to do much but occupy space.)
Attachment #8744552 - Flags: review?(bbouvier)
Attachment #8744172 - Attachment is obsolete: true
Comment on attachment 8744552 [details] [diff] [review]
Maybe better?

Review of attachment 8744552 [details] [diff] [review]:
-----------------------------------------------------------------

You're right about the test, sorry I had forgotten that it was only under deterministic builds that this would happen. I agree that it sounds wasteful to check in the test, only to be reproduced by hand (guess this never happens).

I think asm.js SIMD code might want more. The good news is this should address the cases from bug 1259280 too (when compiled with MORE_DETERMINISTIC, at least).

There is some careful bookkeeping made for x64 and x86: note the `before` and `after` variables in emitSimdStore / visitAsmJSStoreHeap, and their use in HeapAccess, which gets reused later for patching accesses on x86 or for the signal handler trick on x64. After your modifications, I *think* the bounds check mechanisms should still work on all the platforms, but to be sure about that, you might want to run the tests under asm.js/ on x86 and x64.

::: js/src/jit/arm/CodeGenerator-arm.cpp
@@ +2324,1 @@
>                  masm.ma_vstr(vd.singleOverlay(), Address(HeapReg, ptrImm), Assembler::Always);

You could also factor out the Address() here.

::: js/src/jit/mips-shared/CodeGenerator-mips-shared.cpp
@@ +1707,3 @@
>              if (size == 32) {
> +                masm.canonicalizeFloatIfDeterministic(freg);
> +                masm.storeFloat32(freg, Address(HeapReg, ptrImm));

If you wanted to, you could also factor our the Address in both branches.

@@ +1728,3 @@
>              if (size == 32) {
> +                masm.canonicalizeFloatIfDeterministic(freg);
> +                masm.storeFloat32(freg, BaseIndex(HeapReg, ptrReg, TimesOne));

ditto with the BaseIndex

::: js/src/jit/x64/CodeGenerator-x64.cpp
@@ +707,3 @@
>            // See comment above, which also applies to movsd.
> +          case 2:
> +            masm.canonicalizeDoubleIfDeterministic(in);

This is incorrect: here, we're using storeDouble because it's equivalent to store the two low float32s, but the canonicalization must happen on each float32 lane, not on the two lanes considered as a single double (this would introduce strange behavior. Out of curiosity, is it caught by the test cases? If not, it'd be nice that it'd be, but I also wouldn't spend too much time on it as it is a fairly uncommon case)

All the cases (case 1, 2 and 4 in this function + case 3 in the caller of this function) would need canonicalization, as for x86.

::: js/src/jit/x86/CodeGenerator-x86.cpp
@@ +554,5 @@
> +
> +          case 4:
> +            // XXX If this is storing a SIMD value into memory, doesn't it
> +            //     need canonicalization help too?  Zero idea how to do that
> +            //     if it's really needed here (and maybe in other backends?).

For the sake of completeness, yes. It would be needed for x86 too. We could use an helper canonicalizeFloat32x4IfMoreDeterministic in MacroAssembler-x86-shared, which would do something like comparing the input against the generic NaN (look at visitSimdBinaryCompFx4, loadConstantFloat32x4) and then replace the value if needed (a simple movps).

As you'd need canonicalization for the partial loads too (`case 1`, `case 2` mapping to SIMD.Float32x4.load1/load2), I would just canonicalizeIfDeterministic the entire vector before the inner switch. The case for 3 values has to be handled separately in emitSimdStore (this makes me a bit sad to realize that this code is so scattered).
Attachment #8744552 - Flags: review?(bbouvier)
This is really unlikely to bite anyone, so we're unlikely to uplift. WONTFIX 47.
I think this bug's passing the point where I'm productive enough to make it worthwhile as a learning experience, versus letting someone already in this area of code do it.  Particularly not when overnight ping times seem required, and one side knows exactly what's desired and the other doesn't.
Attachment #8744552 - Attachment is obsolete: true
Assignee: jwalden+bmo → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(bbouvier)
Attached patch canonicalize.patch (obsolete) — Splinter Review
Based on Waldo's patch. I've tried on x86 deterministic mode and on x64 deterministic mode: the test now prints the same thing in both cases. Not including the test, because it's triggering only in deterministic mode.
Assignee: nobody → bbouvier
Attachment #8746399 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Flags: needinfo?(bbouvier)
Attachment #8755577 - Flags: review?(nicolas.b.pierron)
Comment on attachment 8755577 [details] [diff] [review]
canonicalize.patch

Review of attachment 8755577 [details] [diff] [review]:
-----------------------------------------------------------------

I do not understand why this patch instrument the CodeGenerator.

From what I understand, we have deterministic loads, but we do not have deterministic store.  To satisfy the more-deterministic builds, we need to have deterministic stores as well.
To me this sounds like a property of the Store, and not a property of the CodeGenerator.  Thus I would expect this patch to instrument the storeFloat32, and storeDouble function, and maybe create a storeFloat32x4 function, where we call the canonicaliseTypeIfDeterministic functions.

I can understand that we have no such abstraction for Float32x4 yet, and I am willing to accept that for the moment, but the Float32 and Double cases should all go through these wrappers, I think.

::: js/src/jit-test/tests/asm.js/testSIMD-load-store.js
@@ +186,5 @@
>  assertEqX4(vec, slice(F32, 4, 4));
>  
>  reset();
> +f32s(4, vecWithNaN);
> +assertEqX4(vecWithNaN, slice(F32, 4, 4));

Thanks for the test case.

Can we make a slice function which uses a function exposed by the asm.js module?  Such that we can also check the round-trip within asm.js and verify that we do canonicalize on loads too.

::: js/src/jit/MacroAssembler-inl.h
@@ +603,5 @@
>  void
> +MacroAssembler::canonicalizeFloatIfDeterministic(FloatRegister reg)
> +{
> +#ifdef JS_MORE_DETERMINISTIC
> +    // See the comment in TypedArrayObjectTemplate::getIndexValue.

nit: TypedArrayObjectTemplate<float>::getIndexValue

@@ +621,5 @@
> +void
> +MacroAssembler::canonicalizeDoubleIfDeterministic(FloatRegister reg)
> +{
> +#ifdef JS_MORE_DETERMINISTIC
> +    // See the comment in TypedArrayObjectTemplate::getIndexValue.

nit: TypedArrayObjectTemplate<double>::getIndexValue

::: js/src/jit/arm/CodeGenerator-arm.cpp
@@ +2334,1 @@
>                  masm.ma_vstr(vd.singleOverlay(), addr, Assembler::Always);

existing nit: use masm.storeFloat32 and masm.storeDouble, and identically below (as done on other architectures)
Comment on attachment 8755577 [details] [diff] [review]
canonicalize.patch

Review of attachment 8755577 [details] [diff] [review]:
-----------------------------------------------------------------

(cancelling the review, re-ask for review & argue if you think my previous remark was not valid)
Attachment #8755577 - Flags: review?(nicolas.b.pierron)
(In reply to Benjamin Bouvier [:bbouvier] from comment #24)
> Thank you for the review!
> 
> About where the code should live: unfortunately (at least for asm.js), we
> can't rely on hiding the potential canonicalization behind the
> storeDouble/storeFloat functions themselves. We need storeDouble/storeFloat
> to be exactly one instruction, as raw offsets in the masm (see before/after
> in x86 and x64 CodeGenerator :: visitAsmJS{Load/Store}Heap) are recorded and
> reused later (either for patching the array length on x86, or used in the
> signal handlers).
> 
> I'd better not leak the canonicalization to the signal handler, or to the
> store/load bookkeeping structure, so for asm.js, I'd prefer to keep those as
> is.

I agree with this.  If we were to include the canonicalization in the store function, then we have 2 options:
 - Either we carry the range information out-side the store function.
 - Or we could append the wasm::HeapAccess under the store function.

The second approach sounds better, as the wasm::HeapAccess is directly related to the load/store, but this would be larger changes compared to the current goal which is to ensure that we canonicalize properly in deterministic builds.

In which case, I will suggest to go for the second approach as a follow-up bug.

> For non asm.js test cases, we could have a
> storeMaybeCanonicalize{Float,Double} function (or by inverting the meanings,
> asm.js would use storeNoCanonicalize and non-asm would use store). What do
> you think? See also remark below.

From what I understand, we always want to canonicalize by default (for deterministic builds), so we should have   "storeDouble"  which does the canonicalization in deterministic builds, and  "storeNotCanonicalizedDouble" (the exception, thus larger name) which would be used by asm.js with the proper calls to the canonicalize function in deterministic builds.

> (In reply to Nicolas B. Pierron [:nbp] from comment #22)
> > >  reset();
> > > +f32s(4, vecWithNaN);
> > > +assertEqX4(vecWithNaN, slice(F32, 4, 4));
> > 
> > Thanks for the test case.
> > 
> > Can we make a slice function which uses a function exposed by the asm.js
> > module?  Such that we can also check the round-trip within asm.js and verify
> > that we do canonicalize on loads too.
> 
> We don't canonicalize on loads: in asm.js, floating point values get
> canonicalized only when they escape asm.js (through an FFI or when we leave
> asm.js). We could add this too, but I would assume it is unreachable from
> the fuzzers anyways, as it is now. It's still conformant to the spec, so I
> propose to defer doing this until a fuzzer actually hits it?

IIUC, this means that as long as the condition of the assertion is not written in asm.js code this is not observable.
So what I suggested before, which involves writing a slice function in non-asm.js code, should pass, right?
Flags: needinfo?(nicolas.b.pierron)
Thank you for the review!

About where the code should live: unfortunately (at least for asm.js), we can't rely on hiding the potential canonicalization behind the storeDouble/storeFloat functions themselves. We need storeDouble/storeFloat to be exactly one instruction, as raw offsets in the masm (see before/after in x86 and x64 CodeGenerator :: visitAsmJS{Load/Store}Heap) are recorded and reused later (either for patching the array length on x86, or used in the signal handlers).

I'd better not leak the canonicalization to the signal handler, or to the store/load bookkeeping structure, so for asm.js, I'd prefer to keep those as is.

For non asm.js test cases, we could have a storeMaybeCanonicalize{Float,Double} function (or by inverting the meanings, asm.js would use storeNoCanonicalize and non-asm would use store). What do you think? See also remark below.

(In reply to Nicolas B. Pierron [:nbp] from comment #22)
> Comment on attachment 8755577 [details] [diff] [review]
> canonicalize.patch
> 
> Review of attachment 8755577 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I do not understand why this patch instrument the CodeGenerator.
> 
> From what I understand, we have deterministic loads, but we do not have
> deterministic store.  To satisfy the more-deterministic builds, we need to
> have deterministic stores as well.
> To me this sounds like a property of the Store, and not a property of the
> CodeGenerator.  Thus I would expect this patch to instrument the
> storeFloat32, and storeDouble function, and maybe create a storeFloat32x4
> function, where we call the canonicaliseTypeIfDeterministic functions.
> 
> I can understand that we have no such abstraction for Float32x4 yet, and I
> am willing to accept that for the moment, but the Float32 and Double cases
> should all go through these wrappers, I think.
> 
> ::: js/src/jit-test/tests/asm.js/testSIMD-load-store.js
> @@ +186,5 @@
> >  assertEqX4(vec, slice(F32, 4, 4));
> >  
> >  reset();
> > +f32s(4, vecWithNaN);
> > +assertEqX4(vecWithNaN, slice(F32, 4, 4));
> 
> Thanks for the test case.
> 
> Can we make a slice function which uses a function exposed by the asm.js
> module?  Such that we can also check the round-trip within asm.js and verify
> that we do canonicalize on loads too.

We don't canonicalize on loads: in asm.js, floating point values get canonicalized only when they escape asm.js (through an FFI or when we leave asm.js). We could add this too, but I would assume it is unreachable from the fuzzers anyways, as it is now. It's still conformant to the spec, so I propose to defer doing this until a fuzzer actually hits it?
Flags: needinfo?(nicolas.b.pierron)
Attached patch 1.hoist.patchSplinter Review
Attachment #8755577 - Attachment is obsolete: true
Attachment #8757990 - Flags: review?(nicolas.b.pierron)
Attachment #8757991 - Flags: review?(nicolas.b.pierron)
Comment on attachment 8757990 [details] [diff] [review]
1.hoist.patch

Review of attachment 8757990 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/MacroAssembler.h
@@ +1114,5 @@
>      inline void branchTestMagicImpl(Condition cond, const T& t, L label)
>          DEFINED_ON(arm, arm64, x86_shared);
>  
> +  public:
> +    // Memory access primitives.

nit: add the following comment above this line.

    // ===============================================================

@@ +1116,5 @@
>  
> +  public:
> +    // Memory access primitives.
> +    inline void storeDouble(FloatRegister src, const Address& dest)
> +        DEFINED_ON(x86_shared, arm, arm64, mips32, mips64);

mips32 and mips64 implementations are identical, move the function definitions to mips-shared-inl.h and use the PER_SHARED_ARCH marco.

::: js/src/jit/arm/MacroAssembler-arm-inl.h
@@ +1202,5 @@
>      branchTestMagic(cond, valaddr, label);
>      branch32(cond, ToPayload(valaddr), Imm32(why), label);
>  }
>  
> +// Memory accesses.

nit: Same here, and in other files.

@@ +1218,5 @@
> +
> +void
> +MacroAssembler::storeFloat32(FloatRegister src, const Address& addr)
> +{
> +    ma_vstr(VFPRegister(src).singleOverlay(), addr);

existing nit:  use “ src.asSingle() ” instead of “ VFPRegister(src).singleOverlay() ”.
(same below)
Attachment #8757990 - Flags: review?(nicolas.b.pierron) → review+
Comment on attachment 8757991 [details] [diff] [review]
2.canonicalize.patch

Review of attachment 8757991 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks! :)

::: js/src/jit/MacroAssembler.h
@@ +1114,5 @@
>      inline void branchTestMagicImpl(Condition cond, const T& t, L label)
>          DEFINED_ON(arm, arm64, x86_shared);
>  
>    public:
> +    // Canonicalization primitives.

nit: Repeat the comment with =* above each new section, and do the same for all other files.

@@ +1128,1 @@
>      // Memory access primitives.

nit: repeat the visibility on each new section.
Attachment #8757991 - Flags: review?(nicolas.b.pierron) → review+
Thanks for the review!

(In reply to Nicolas B. Pierron [:nbp] from comment #28)
> @@ +1116,5 @@
> >  
> > +  public:
> > +    // Memory access primitives.
> > +    inline void storeDouble(FloatRegister src, const Address& dest)
> > +        DEFINED_ON(x86_shared, arm, arm64, mips32, mips64);
> 
> mips32 and mips64 implementations are identical, move the function
> definitions to mips-shared-inl.h and use the PER_SHARED_ARCH marco.

They're not: they seem to be the same for BaseIndex, but not for Address. I've kept them both as is in the meanwhile.
Pushed by bbouvier@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/65f18c79794b
Hoist store{Float32,Double} functions into the MacroAssembler; r=nbp
https://hg.mozilla.org/integration/mozilla-inbound/rev/2c86039b1c68
Canonicalize before storing a floating point value in deterministic mode; r=nbp
https://hg.mozilla.org/mozilla-central/rev/65f18c79794b
https://hg.mozilla.org/mozilla-central/rev/2c86039b1c68
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla49
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: