Closed Bug 1523515 Opened 5 years ago Closed 5 years ago

Differential Testing: Different output message involving 32-bit ARM simulator and --arm-hwcap=vfp

Categories

(Core :: JavaScript Engine: JIT, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla67
Tracking Status
firefox65 --- wontfix
firefox66 --- fixed
firefox67 --- fixed

People

(Reporter: gkw, Assigned: wingo)

References

Details

(Keywords: testcase)

Attachments

(2 files)

function f(x, y) {
    return (y ? 1 : x | 0) && x / 1;
}
x = [[], 1];   
for (var j = 0; j < 2; ++j) {   
    for (var k = 0; k < 3; ++k) {
        print(f(x[j], x[k]));
    }
}
$ ./js-32-dm-armSim-linux-f039f8426b5f --fuzzing-safe --no-threads --ion-eager testcase.js 
0
0
0
1
1
1
$ ./js-32-dm-armSim-linux-f039f8426b5f --fuzzing-safe --no-threads --ion-eager --arm-hwcap=vfp testcase.js 
0
0
0
1
1
0

Tested this on m-c rev f039f8426b5f.

My configure flags are:

PKG_CONFIG_LIBDIR=/usr/lib/pkgconfig 'CC="gcc -m32 -msse2 -mfpmath=sse"' 'CXX="g++ -m32 -msse2 -mfpmath=sse"' AR=ar sh /home/ubuntu/trees/mozilla-central/js/src/configure --target=i686-pc-linux --enable-simulator=arm --enable-more-deterministic --with-ccache --enable-gczeal --enable-debug-symbols --disable-cranelift --disable-tests

python3 -u -m funfuzz.js.compile_shell -b "--32 --enable-more-deterministic --enable-simulator=arm" -r f039f8426b5f

Trying to get a bisection result...

autobisectjs shows this is probably related to the following changeset:

The first bad revision is:
changeset: https://hg.mozilla.org/mozilla-central/rev/d5c22661c860
user: Robin Templeton
date: Tue Sep 18 04:04:53 2018 +0000
summary: bug 1490387 - Part 3: Implement BigInt support for bitwise operators. r=jandem

Jan, is bug 1490387 a likely regressor?

Blocks: 1490387
Flags: needinfo?(jdemooij)

Soapbox: --arm-hwcap=vfp is probably unsupported, as that is an ARMv6 device with floating point and ARMv6 is not supported.

See https://searchfox.org/mozilla-central/source/js/src/jit/arm/Architecture-arm.cpp#51 et seq.

In the past I've argued that we should just error out or print prominent warnings for unsupported / nonsensical device specs. --arm-hwcap is a sharp tool and many combinations of features do not make sense.

(Not to say a bug wasn't found here but then it should also be visible with a more reasonable set of flags.)

(In reply to Gary Kwong [:gkw] [:nth10sd] from comment #3)

That flag is still in fuzz-flags.txt in case anyone wonders:

https://hg.mozilla.org/mozilla-central/annotate/14defbde23a6/js/src/shell/fuzz-flags.txt#l67

Mkay. Should be possible to cook up a more interesting list of flag combinations there based on what we consider tier-1 hardware...

Robin, can you take a look at this? See also comment 2, but just in case there's an actual/unrelated bug here.

Flags: needinfo?(jdemooij) → needinfo?(robin)

i can reproduce this locally, but i don't know yet what's causing the incorrect output

the testcase appears to work with --arm-hwcap=idiva and fails with any other --arm-hwcap flag

IONFLAGS=all output looks similar for --arm-hwcap=idiva and --arm-hwcap=vfp except for additional "pools" log entries with vfp:

 [Codegen] emitGuardIsInt32
 [Codegen] emitGuardIsInt32
 [Codegen] emitInt32DivResult
+[Pools] [0] Inserting 1 entries into pool
+[Pools] [0] data is: 0xf63c8aa4
+[Pools] [0] Entry has index 0, offset 120
 [Codegen] emitReturnFromIC

Hm, something is wonky with Ion type analysis here. Consider that the original loop is actually this (note the update to the array):

x = [[], 1, undefined];   
for (var j = 0; j < 2; ++j) {   
    for (var k = 0; k < 3; ++k) {
        print(f(x[j], x[k]));
    }
}

and this has the same result as Gary reported. But if one expands that loop nest like this:

print(f([], []));
print(f([], 1));
print(f([], undefined));
print(f(1, []));
print(f(1, 1));
print(f(1, undefined));

then the error does not occur.

(:lth stops playing around with the JIT and goes back to what he's supposed to be doing.)

running the shell with --fuzzing-safe --no-threads --ion-eager --arm-hwcap=vfp produces incorrect output, but adding any combination of the following flags prevents the error: --cache-ir-stubs=off, --cache-ir-stubs=nobinary, --ion-inlining=off, --ion-osr=off, --ion-regalloc=stupid

Flags: needinfo?(robin)

additional test cases that print incorrect output with shell options --fuzzing-safe --no-threads --ion-eager --arm-hwcap=vfp:

z=[[],1];for(var j=0;j<2;++j){for(var k=0;k<3;++k){print(((x, y) => (y?1:(x|0))&&(x/1))(z[j],z[k]));}}
z=[[],1];for(var j=0;j<2;++j){for(var k=0;k<3;++k){print(((x, y) => (y?1:(x&1))&&(x/1))(z[j],z[k]));}}
z=[[],1];for(var j=0;j<2;++j){for(var k=0;k<3;++k){print(((x, y) => (y?1:(x>>0))&&(x/1))(z[j],z[k]));}}
z=[[],1];for(var j=0;j<2;++j){for(var k=0;k<3;++k){print(((x, y) => (y?1:(x<<0))&&(x/1))(z[j],z[k]));}}
z=[[],1];for(var j=0;j<2;++j){for(var k=0;k<3;++k){print(((x, y) => (y?1:(x>>>0))&&(x/1))(z[j],z[k]));}}
z=[[],1];for(var j=0;j<2;++j){for(var k=0;k<3;++k){print(((x, y) => (y?1:(x**1))&&(x/1))(z[j],z[k]));}}

Robin just noted that the following change to register allocation fixes the symptoms for these tests: https://hg.mozilla.org/try/diff/6c1550a4ae06/js/src/jit/CacheIRCompiler.cpp. But I speculate that the workaround doesn't fix the fundamental cause, and in any case I don't think it works for arches with hard-coded output registers for division.

I suspect that the change above "works" by decreasing register pressure and preventing creation of the constant pool, which seems necessary to trigger the bug.

Incidentally the original patch probably increases register pressure (and possibly constant pool creation) by adding the additional MIRType::Value case for the bitwise op result types, which will result in more box/unbox ops, more live values, and thus perhaps triggering the bug in the same way.

Could something about the VFP support in the simulator have always always been buggy in some way, with this change just making it show? E.g. perhaps the simulator+vfp clobbers more registers in an "sdiv" than it should have been doing? But is it possible for that to be the case?

Surely an Ion type analysis bug is more likely, given the nature of the patch. I will also take a look at this this week.

another possibly related commit found by git bisect for a test case using exponentiation: mozilla-central changeset 5b25886cfb64 (which added a cacheir operation for int32 division)

$ ./js --fuzzing-safe --ion-eager --no-threads -e 'z=[[],1];for(var j=0;j<2;++j){for(var k=0;k<3;++k){print(((x, y) => (y?1:(x**1))&&(x/1))(z[j],z[k]));}}'
0
0
0
1
1
1
$ ./js --fuzzing-safe --ion-eager --no-threads --arm-hwcap=vfp -e 'z=[[],1];for(var j=0;j<2;++j){for(var k=0;k<3;++k){print(((x, y) => (y?1:(x**1))&&(x/1))(z[j],z[k]));}}'
0
0
0
1
1
0

configure flags: --target=i686-pc-linux --enable-simulator=arm --enable-more-deterministic --enable-debug --disable-optimize --disable-tests --enable-ccache

That's pointing to a commit by Matthew -> setting needinfo? from him.

Flags: needinfo?(mgaudet)

I just want to toss out Bug 1519077 as being potentially related, though all I have done so far is read the comments.

The specific thing that makes me think this is the Try'd patch in Comment 10; Bug 1519077 is addressing the fact that we can clobber input registers, which is a problem in a fallible IC if we need to reset the state.

I'll take a closer look at this later to see, and try the patch as well. Leaving ni? to make sure I do it.

Ok, so the patch attached to Bug 1519077 does not resolve the differences in Comment 11 in my local testing.

This does feel more like an issue with the CacheIR stubs than an Ion type analysis Bug IMO. My gut (and a bit of printf debugging to make sure I'm on the right arm of the branch) says the bug should be in here: https://searchfox.org/mozilla-central/rev/9eb30227b21e0aa40d51d9f9b08bb0b113c5fadb/js/src/jit/arm/MacroAssembler-arm.cpp#5707-5728

The printf gives me this:

./dist/bin/js --arm-hwcap=vfp  --fuzzing-safe --ion-eager --no-threads -e 'z=[[],1];for(var j=0;j<2;++j){for(var k=0;k<3;++k){print(((x, y) => (y?1:(x**1))&&(x/1))(z[j],z[k]));}}'
0
0
0
(noidiv) rhs: r1, lhsOutput r2, remOutput r10
1
(noidiv) rhs: r5, lhsOutput r2, remOutput r0
1
0

(It's perhaps notable that register assignment is the same without arm-hwcap=vfp, except we'll use the other branch of the if-else there).

I'm at EOD here, so dumping this info lest someone else get a chance to look at this before I do.

(In reply to Matthew Gaudet (he/him) [:mgaudet] from comment #14)

Ok, so the patch attached to Bug 1519077 does not resolve the differences in Comment 11 in my local testing.

This does feel more like an issue with the CacheIR stubs than an Ion type analysis Bug IMO. My gut (and a bit of printf debugging to make sure I'm on the right arm of the branch) says the bug should be in here: https://searchfox.org/mozilla-central/rev/9eb30227b21e0aa40d51d9f9b08bb0b113c5fadb/js/src/jit/arm/MacroAssembler-arm.cpp#5707-5728

The printf gives me this:

./dist/bin/js --arm-hwcap=vfp  --fuzzing-safe --ion-eager --no-threads -e 'z=[[],1];for(var j=0;j<2;++j){for(var k=0;k<3;++k){print(((x, y) => (y?1:(x**1))&&(x/1))(z[j],z[k]));}}'
0
0
0
(noidiv) rhs: r1, lhsOutput r2, remOutput r10
1
(noidiv) rhs: r5, lhsOutput r2, remOutput r0
1
0

Excellent sleuthing!!

Specifically I think it's here:
https://searchfox.org/mozilla-central/rev/9eb30227b21e0aa40d51d9f9b08bb0b113c5fadb/js/src/jit/arm/MacroAssembler-arm.cpp#5722

If remOutput is r0, then that's the same as RegisterOutputVal0; but the code does this:

    mov(ReturnRegVal1, remOutput);
    mov(ReturnRegVal0, lhsOutput);

which is

   mov(r1, r0);
   mov(r0, lhsOutput);

I.e. the shuffle clobbers one of the return registers.

I'll see what I can do to poke the MoveResolver into submission.

Assignee: nobody → wingo

Clearing needinfo? from mgaudet; review is welcome though!

Flags: needinfo?(mgaudet)

Hah -- I came in here to drop a link to the ScopedMoveResolution as a possible solution. I think I might like yours better tho!

Only halfway into my first cup of coffee, so I'll refrain from reviewing in more detail just yet.

Bug seems to originate in https://bugzilla.mozilla.org/show_bug.cgi?id=1438727, which first appeared in FF63. Probably we will want to backport to FF66 and F65.

Blocks: 1438727
No longer blocks: 1490387

Set checkin-needed for ff67 but I don't know what's needed for ff65 / ff66, can someone take care of that please? Tx in advance :)

Keywords: checkin-needed

Pushed by rvandermeulen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/c9f9e4c965cf
Fix shuffling of two-valued return on ARM r=jandem

Keywords: checkin-needed
Priority: -- → P1
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla67

ni? myself to remind me to prepare backport patches and request approval.

Flags: needinfo?(mgaudet)

Beta/Release Uplift Approval Request

Feature/Bug causing the regression

Bug 1438727

User impact if declined

Potential wrong output or crashes

Is this code covered by automated tests?

Yes

Has the fix been verified in Nightly?

Yes

Needs manual test from QE?

No

If yes, steps to reproduce

List of other uplifts needed

None

Risk to taking this patch

Low

Why is the change risky/not risky? (and alternatives if risky)

Change has been tested on nightly, and simplifies handling of an edge case, while correcting an error.

String changes made/needed

None

Flags: needinfo?(mgaudet)
Attachment #9042966 - Flags: approval-mozilla-beta?

Note, the landed rev c9f9e4c965cf grafts cleanly onto beta and release; uploaded the patch mostly to get access to the beta form.

I haven't requested release approval because I don't think this patch hits the bar for a release backport, however, I do note that it will graft cleanly onto release so we can do that easily if desired.

Comment on attachment 9042966 [details] [diff] [review]
Bug1523515.beta.patch

Fix for potential crashes (found by fuzzing).
Let's uplift for beta 8.
Attachment #9042966 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: