IonMonkey: ARM: asm.js/testAtomics.js test failure on Rpi2

RESOLVED FIXED in Firefox 43

Status

()

Core
JavaScript Engine: JIT
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: hev, Assigned: lth)

Tracking

(Blocks: 1 bug)

unspecified
mozilla43
ARM
Linux
Points:
---

Firefox Tracking Flags

(firefox43 fixed)

Details

Attachments

(2 attachments, 1 obsolete attachment)

(Reporter)

Description

3 years ago
[heiher@alarmpi jit-test]$ python2 ./jit_test.py -f --tbpl ../arm/js/src/shell/js asm.js/testAtomics.js                        
Exit code: -4
FAIL - asm.js/testAtomics.js
[0|1|0|0]  14% ======>                                                |   1.1s
Exit code: -4
FAIL - asm.js/testAtomics.js
[0|2|0|0]  28% ==============>                                        |   1.2s
Exit code: -4
FAIL - asm.js/testAtomics.js
[0|3|0|0]  42% ======================>                                |   1.3s
Exit code: -4
FAIL - asm.js/testAtomics.js
[1|4|0|0]  71% ======================================>                |  11.5s
Exit code: -4
FAIL - asm.js/testAtomics.js
[1|5|0|0]  85% ==============================================>        |  13.1s
Exit code: -4
FAIL - asm.js/testAtomics.js
[1|6|0|0] 100% ======================================================>|  13.2s
FAILURES:
    /home/heiher/work/git/gecko-dev/js/src/arm/js/src/shell/js -f /home/heiher/work/git/gecko-dev/js/src/jit-test/lib/prologue.js --js-cache /home/heiher/work/git/gecko-dev/js/src/jit-test/.js-cache --ion-extra-checks --non-writable-jitcode --ion-check-range-analysis --ion-offthread-compile=off --no-threads --ion-eager --no-sse3 -e "const platform='linux2'; const libdir='/home/heiher/work/git/gecko-dev/js/src/jit-test/lib/'; const scriptdir='/home/heiher/work/git/gecko-dev/js/src/jit-test/tests/asm.js/'" -f /home/heiher/work/git/gecko-dev/js/src/jit-test/tests/asm.js/testAtomics.js
    /home/heiher/work/git/gecko-dev/js/src/arm/js/src/shell/js -f /home/heiher/work/git/gecko-dev/js/src/jit-test/lib/prologue.js --js-cache /home/heiher/work/git/gecko-dev/js/src/jit-test/.js-cache --baseline-eager -e "const platform='linux2'; const libdir='/home/heiher/work/git/gecko-dev/js/src/jit-test/lib/'; const scriptdir='/home/heiher/work/git/gecko-dev/js/src/jit-test/tests/asm.js/'" -f /home/heiher/work/git/gecko-dev/js/src/jit-test/tests/asm.js/testAtomics.js
    /home/heiher/work/git/gecko-dev/js/src/arm/js/src/shell/js -f /home/heiher/work/git/gecko-dev/js/src/jit-test/lib/prologue.js --js-cache /home/heiher/work/git/gecko-dev/js/src/jit-test/.js-cache -e "const platform='linux2'; const libdir='/home/heiher/work/git/gecko-dev/js/src/jit-test/lib/'; const scriptdir='/home/heiher/work/git/gecko-dev/js/src/jit-test/tests/asm.js/'" -f /home/heiher/work/git/gecko-dev/js/src/jit-test/tests/asm.js/testAtomics.js
    /home/heiher/work/git/gecko-dev/js/src/arm/js/src/shell/js -f /home/heiher/work/git/gecko-dev/js/src/jit-test/lib/prologue.js --js-cache /home/heiher/work/git/gecko-dev/js/src/jit-test/.js-cache --baseline-eager --no-fpu -e "const platform='linux2'; const libdir='/home/heiher/work/git/gecko-dev/js/src/jit-test/lib/'; const scriptdir='/home/heiher/work/git/gecko-dev/js/src/jit-test/tests/asm.js/'" -f /home/heiher/work/git/gecko-dev/js/src/jit-test/tests/asm.js/testAtomics.js
    /home/heiher/work/git/gecko-dev/js/src/arm/js/src/shell/js -f /home/heiher/work/git/gecko-dev/js/src/jit-test/lib/prologue.js --js-cache /home/heiher/work/git/gecko-dev/js/src/jit-test/.js-cache --no-baseline --no-ion -e "const platform='linux2'; const libdir='/home/heiher/work/git/gecko-dev/js/src/jit-test/lib/'; const scriptdir='/home/heiher/work/git/gecko-dev/js/src/jit-test/tests/asm.js/'" -f /home/heiher/work/git/gecko-dev/js/src/jit-test/tests/asm.js/testAtomics.js
    /home/heiher/work/git/gecko-dev/js/src/arm/js/src/shell/js -f /home/heiher/work/git/gecko-dev/js/src/jit-test/lib/prologue.js --js-cache /home/heiher/work/git/gecko-dev/js/src/jit-test/.js-cache --ion-eager --ion-offthread-compile=off -e "const platform='linux2'; const libdir='/home/heiher/work/git/gecko-dev/js/src/jit-test/lib/'; const scriptdir='/home/heiher/work/git/gecko-dev/js/src/jit-test/tests/asm.js/'" -f /home/heiher/work/git/gecko-dev/js/src/jit-test/tests/asm.js/testAtomics.js
TIMEOUTS:
(Reporter)

Comment 1

3 years ago
[heiher@alarmpi jit-test]$ /home/heiher/work/git/gecko-dev/js/src/arm/js/src/shell/js -f /home/heiher/work/git/gecko-dev/js/src/jit-test/lib/prologue.js --js-cache /home/heiher/work/git/gecko-dev/js/src/jit-test/.js-cache --no-baseline --no-ion -e "const platform='linux2'; const libdir='/home/heiher/work/git/gecko-dev/js/src/jit-test/lib/'; const scriptdir='/home/heiher/work/git/gecko-dev/js/src/jit-test/tests/asm.js/'" -f /home/heiher/work/git/gecko-dev/js/src/jit-test/tests/asm.js/testAtomics.js
Illegal instruction (core dumped)
[heiher@alarmpi jit-test]$ /home/heiher/work/git/gecko-dev/js/src/arm/js/src/shell/js -f /home/heiher/work/git/gecko-dev/js/src/jit-test/lib/prologue.js --js-cache /home/heiher/work/git/gecko-dev/js/src/jit-test/.js-cache --no-baseline --no-ion -e "const platform='linux2'; const libdir='/home/heiher/work/git/gecko-dev/js/src/jit-test/lib/'; const scriptdir='/home/heiher/work/git/gecko-dev/js/src/jit-test/tests/asm.js/'" -f /home/heiher/work/git/gecko-dev/js/src/jit-test/tests/asm.js/testAtomics.js --no-asmjs
[heiher@alarmpi jit-test]$ echo $?
0
(Reporter)

Comment 2

3 years ago
Starting program: /home/heiher/work/git/gecko-dev/js/src/arm/js/src/shell/js -f /home/heiher/work/git/gecko-dev/js/src/jit-test/lib/prologue.js --js-cache /home/heiher/work/git/gecko-dev/js/src/jit-test/.js-cache --no-baseline --no-ion -e const\ platform=\'linux2\'\;\ const\ libdir=\'/home/heiher/work/git/gecko-dev/js/src/jit-test/lib/\'\;\ const\ scriptdir=\'/home/heiher/work/git/gecko-dev/js/src/jit-test/tests/asm.js/\' -f /home/heiher/work/git/gecko-dev/js/src/jit-test/tests/asm.js/testAtomics.js
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
[New Thread 0x768ff450 (LWP 13240)]
[New Thread 0x766ff450 (LWP 13241)]
[New Thread 0x764ff450 (LWP 13242)]
[New Thread 0x762ff450 (LWP 13243)]
[New Thread 0x760ff450 (LWP 13244)]
[New Thread 0x75eff450 (LWP 13245)]
[New Thread 0x75cff450 (LWP 13246)]
[New Thread 0x75aff450 (LWP 13247)]

Program received signal SIGILL, Illegal instruction.
0x76bb81cc in ?? ()
(gdb) x/20i 0x76bb81cc - 32
   0x76bb81ac:  sub     sp, sp, #4
   0x76bb81b0:  mov     r1, #10
   0x76bb81b4:  mov     r2, #37 ; 0x25
   0x76bb81b8:  add     lr, r11, r1
   0x76bb81bc:  dmb     sy
   0x76bb81c0:  ldrexb  r0, [lr]
   0x76bb81c4:  sxtb    r0, r0
   0x76bb81c8:  add     r12, r0, r2
=> 0x76bb81cc:  strexb  r12, r12, [lr]
   0x76bb81d0:  cmp     r12, #1
   0x76bb81d4:  beq     0x76bb81c0
   0x76bb81d8:  dmb     sy
   0x76bb81dc:  nop     {0}
   0x76bb81e0:  add     sp, sp, #4
   0x76bb81e4:  pop     {pc}            ; (ldr pc, [sp], #4)
   0x76bb81e8:  ldr     r2, [r10, #-1024]       ; 0xfffffc00
   0x76bb81ec:  ldr     r3, [sp]
   0x76bb81f0:  str     r3, [r2, #100]  ; 0x64
   0x76bb81f4:  add     sp, sp, #4
   0x76bb81f8:  pop     {pc}            ; (ldr pc, [sp], #4)
(gdb)
(Assignee)

Comment 3

3 years ago
Model One Raspberry Pi, I assume?
(Reporter)

Comment 4

3 years ago
[heiher@alarmpi jit-test]$ cat /proc/cpuinfo 
processor       : 0
model name      : ARMv7 Processor rev 5 (v7l)
BogoMIPS        : 38.40
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm 
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xc07
CPU revision    : 5

processor       : 1
model name      : ARMv7 Processor rev 5 (v7l)
BogoMIPS        : 38.40
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm 
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xc07
CPU revision    : 5

processor       : 2
model name      : ARMv7 Processor rev 5 (v7l)
BogoMIPS        : 38.40
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm 
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xc07
CPU revision    : 5

processor       : 3
model name      : ARMv7 Processor rev 5 (v7l)
BogoMIPS        : 38.40
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm 
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xc07
CPU revision    : 5

Hardware        : BCM2709
Revision        : a01041
Serial          : 00000000b9148d2d
(Assignee)

Comment 5

3 years ago
I can't repro this on my local ARM system.

(Said system is an NVIDIA "Jetson" TK1 dev board: Quad-core Cortex-A15 @ 2.3GHz, 2GB RAM, Ubuntu 14.04.)

It's curious that:

* the machine, which is clearly ARMv7, must have strexb so the instruction should not be "illegal"
* ldrexb succeeds but strexb fails (and the latter can't execute without the former, modulo branch bugs),
  which is even more bizarre
* the run in comment #2 shows jitted code as I would expect it to look, yet that is a run with the
  flags --no-baseline --no-ion so should not be jitted... Maybe the flags don't matter because it's asm.js.

This is unlikely to be directly related (bare-metal programming, MMU disabled) but I'm including it anyway: http://stackoverflow.com/questions/26013319/raspberry-pi-ldrex-causes-data-abort.
See also bug 1098508, and in particular 1098508 comment 15, where the issue is also reproduced.
(Assignee)

Comment 7

3 years ago
(In reply to Benjamin Bouvier [:bbouvier] from comment #6)
> See also bug 1098508, and in particular 1098508 comment 15, where the issue
> is also reproduced.

That's a scary bug, but did you really intend this comment for bug 1201490?  I don't think the problems you're reporting in 1098508 are the same as the crashes we're seeing here.
Flags: needinfo?(benj)
Thanks for adding it to bug 1201490 too. I was merely thinking that it could be related to this current bug as well, as 1) the same test is failing and 2) it seems to not be falling locally on the simulator. As a result, looks like these tiny differences in CPUs are possible culprits in both cases...
Flags: needinfo?(benj)
(Assignee)

Comment 9

3 years ago
On IRC, :hev comments "this test fail too on mips64 after patch that bug 1178793 (Throw on atomic OOB access, asm.js) applied".

At this point, the most useful thing would be a memory dump from the faulting instruction stream (something like, x/16x $pc-32 in gdb parlance) but the gdb on the device has been reported as being broken, will need to work around that somehow.  At a minimum we should be sure that the code looks right.  I checked the assembler and it does set the reserved bits the right way, but it doesn't hurt to check the output.

Speculating wildly now, given that the load works but the store faults, there could be a protection issue, though SIGILL is a weird signal for that.  It could especially be a protection issue given that this is asm.js on ARM - not the most tested combination probably.  A simple off-by-one in an mprotect or mmap is enough.
(Assignee)

Comment 10

3 years ago
The instruction encoding is correct but there is some fine print...

The instruction is STREXB with Rn=14, Rd=12, Rt=12.  The ARM manual states: "if d == n || d == t then UNPREDICTABLE".  Clearly we have Rd==Rt.  So it's a code generation bug, we are violating this constraint and it matters on the chip in the RPi2 but not on the chip in my NVIDIA system.

(If MIPS has the same problem it is not likely to be caused by the fine print in the ARM manual, but by something else.)
Assignee: nobody → lhansen
Status: NEW → ASSIGNED
(Assignee)

Comment 11

3 years ago
BTW, this bug should also affect non-asm.js execution on the platform, and if that is not observable during testing it is probably because we're not testing well enough.  (This could be a problem with the test suite, not just with methodology.  I've observed that even on other platforms it can be hard to test code generation properly because the JIT kicks in late, and if we force it to kick in early it does not have enough information to inline primitives anyhow.)

The Raspberry Pi 2 has a Cortex-A7 MPCore CPU, which is a 900MHz in-order design.
(Assignee)

Comment 12

3 years ago
Created attachment 8658848 [details] [diff] [review]
bug1202650-strex-constraint.patch

Probable fix; it passes testing on the simulator but I need to test on misc devices, and more broadly.

This is really an ARM-only change but it infects *all* platforms because a number of cases are funnelled through common code; thus adding a temp register argument causes bad ripples throughout.  The code "should" be stable now but if this happens again we should bite the bullet and just duplicate it everywhere (or something).  As it is, eg the x86 macro assembler methods receive arguments they don't need.  This seems evil.
(Assignee)

Comment 13

3 years ago
Created attachment 8659204 [details] [diff] [review]
bug1202650-refactor.patch
Attachment #8658848 - Attachment is obsolete: true
(Assignee)

Comment 14

3 years ago
Created attachment 8659206 [details] [diff] [review]
bug1202650-arm-flagtemp.patch

A better attempt in two parts: move the generic code into the platform ports, and then adapt the ARM part.  This is a larger change but it's the right thing to do, and there will be other benefits: there were already comments in the code regarding things we could not do because the code was platform-independent, for example.

These patches still need more testing.
(Reporter)

Comment 15

3 years ago
(In reply to Lars T Hansen [:lth] from comment #14)
> Created attachment 8659206 [details] [diff] [review]
> bug1202650-arm-flagtemp.patch
> 
> A better attempt in two parts: move the generic code into the platform
> ports, and then adapt the ARM part.  This is a larger change but it's the
> right thing to do, and there will be other benefits: there were already
> comments in the code regarding things we could not do because the code was
> platform-independent, for example.
> 
> These patches still need more testing.

Is this generic code bad for arm only? if it works fine on mostly platforms, why can not just override these functions on arm?
(Assignee)

Updated

3 years ago
Blocks: 1203490
(Assignee)

Comment 16

3 years ago
(In reply to Heiher [:hev] from comment #15)
> (In reply to Lars T Hansen [:lth] from comment #14)
> > Created attachment 8659206 [details] [diff] [review]
> > bug1202650-arm-flagtemp.patch
> > 
> > A better attempt in two parts: move the generic code into the platform
> > ports, and then adapt the ARM part.  This is a larger change but it's the
> > right thing to do, and there will be other benefits: there were already
> > comments in the code regarding things we could not do because the code was
> > platform-independent, for example.
> > 
> > These patches still need more testing.
> 
> Is this generic code bad for arm only? if it works fine on mostly platforms,
> why can not just override these functions on arm?

The generic code was a good fit initially but as the back-ends are evolving they are becoming increasingly difficult to deal with; for example, keeping the code generic would require adding a dummy temp register argument to a large number of functions in all back-ends but only ARM32 would need it to be valid.

This was always expected to happen, but when I wrote the code I had other priorities, so I went with the generic approach knowing I'd have to fix it eventually.  This is as good a time as ever; as far as I can tell, neither MIPS nor ARM64 have done much of the work here and will not suffer from the reorg.

(Also, as optimizations are becoming more sophisticated, the generic code is an ever smaller part of the total per-platform code for atomics.)
(Assignee)

Comment 17

3 years ago
On IRC, :hev reports that these patches pass testing on his Raspberry Pi 2.
(Assignee)

Comment 18

3 years ago
Comment on attachment 8659204 [details] [diff] [review]
bug1202650-refactor.patch

This patch removes common code and injects it into x86-shared and ARM, opening up for per-platform specialization, which is desired.  No reengineering here.
Attachment #8659204 - Flags: review?(nicolas.b.pierron)
(Assignee)

Comment 19

3 years ago
Comment on attachment 8659206 [details] [diff] [review]
bug1202650-arm-flagtemp.patch

This patch specializes the new code in the ARM platform directory.  Very little happens here except renaming variables and introducing another temp argument in one case; the action is in the ARM macro-assembler, where the temp is used, and in lowering, where the temp is allocated (in several places, for different node types).

For the code in LIR-shared.h I could have chosen to move the LIR definitions into platform code too, to specialize.  It's possible that would be better, since the addrTemp is used on x86/x64 and the flagTemp is used on ARM (and later ARM64 and MIPS).  For now I kept it here.  Please offer opinions :)
Attachment #8659206 - Flags: review?(nicolas.b.pierron)
Comment on attachment 8659204 [details] [diff] [review]
bug1202650-refactor.patch

Review of attachment 8659204 [details] [diff] [review]:
-----------------------------------------------------------------

This sounds good, even if I still have a question.
By moving the code for generating atomic operation to the CodeGen, we effectively prevent our-self from reusing this code in Baseline ICs, which means that we restrict our-self to use C++ function to handle atomic mutation on shared array buffer, right?  Do we really want that?
Attachment #8659204 - Flags: review?(nicolas.b.pierron) → review+
(Assignee)

Comment 21

3 years ago
(In reply to Nicolas B. Pierron [:nbp] from comment #20)
> Comment on attachment 8659204 [details] [diff] [review]
> bug1202650-refactor.patch
> 
> Review of attachment 8659204 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This sounds good, even if I still have a question.
> By moving the code for generating atomic operation to the CodeGen, we
> effectively prevent our-self from reusing this code in Baseline ICs, which
> means that we restrict our-self to use C++ function to handle atomic
> mutation on shared array buffer, right?  Do we really want that?

Is the concern that there needs to be an entry point in the platform-independent CodeGenerator or MacroAssembler to support Baseline properly?

(Baseline hasn't been much of a concern so far since atomics are generally quite expensive anyway.)
Comment on attachment 8659206 [details] [diff] [review]
bug1202650-arm-flagtemp.patch

Review of attachment 8659206 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/arm/Assembler-arm.cpp
@@ +2274,5 @@
>  
>  BufferOffset
>  Assembler::as_strex(Register rd, Register rt, Register rn, Condition c)
>  {
> +    MOZ_ASSERT(rd != rn && rd != rt); // True restriction on Cortex-A7 (RPi2)

Thanks for adding these assertions :)

::: js/src/jit/arm/CodeGenerator-arm.cpp
@@ +1697,3 @@
>  {
> +    MOZ_ASSERT(flagTemp != InvalidReg);
> +    MOZ_ASSERT(outTemp == InvalidReg || arrayType == Scalar::Uint32);

nit: I think the following would be easier to read:

  MOZ_ASSERT_IF(arrayType == Scalar::Uint32, outTemp != InvalidReg);

::: js/src/jit/shared/LIR-shared.h
@@ +5242,5 @@
>    public:
>      LIR_HEADER(AtomicTypedArrayElementBinopForEffect)
>  
>      LAtomicTypedArrayElementBinopForEffect(const LAllocation& elements, const LAllocation& index,
>                                             const LAllocation& value)

nit: Remove setFlagTemp, and add a fourth argument for the temporary flagTemp with a default value.

At the moment, we always define all virtual-register allocations as the argument of the constructor of the LIR instruction, adding the setFlagTemp goes against this implicit rule, and I do not think this is justified as all uses of setFlagTemp are located right after the constructor call.

@@ +6711,5 @@
>  
>      static const int32_t valueOp = 1;
>  
>      LAsmJSAtomicBinopHeap(const LAllocation& ptr, const LAllocation& value,
>                            const LDefinition& temp)

nit: same here.

@@ +6757,4 @@
>  {
>    public:
>      LIR_HEADER(AsmJSAtomicBinopHeapForEffect);
>      LAsmJSAtomicBinopHeapForEffect(const LAllocation& ptr, const LAllocation& value)

nit: and here.
Attachment #8659206 - Flags: review?(nicolas.b.pierron) → review+
https://hg.mozilla.org/mozilla-central/rev/530967994450
https://hg.mozilla.org/mozilla-central/rev/61ab9f5612e4
Status: ASSIGNED → RESOLVED
Last Resolved: 3 years ago
status-firefox43: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla43
You need to log in before you can comment on or make changes to this bug.