Closed Bug 1032202 Opened 10 years ago Closed 9 years ago

Intermittent test_fileapi.html | application crashed [@ JSObject::uninlinedIsProxy() const]

Categories

(Core :: JavaScript Engine: JIT, defect)

ARM
Android
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: cbook, Unassigned)

References

()

Details

(Keywords: crash, intermittent-failure)

Android 4.0 Panda mozilla-central debug test mochitest-1 on 2014-06-29 17:39:17 PDT for push c3387c5ddba9

slave: panda-0514

https://tbpl.mozilla.org/php/getParsedLog.php?id=42730882&tree=Mozilla-Central

17:57:09  WARNING -  PROCESS-CRASH | /tests/content/base/test/test_fileapi.html | application crashed [@ JSObject::uninlinedIsProxy() const]
17:57:09     INFO -  Crash dump filename: /tmp/tmpmoqLtK/54fed6fe-7f85-059a-73f99836-0894d080.dmp
17:57:09     INFO -  Operating system: Android
17:57:09     INFO -                    0.0.0 Linux 3.2.0+ #2 SMP PREEMPT Thu Nov 29 08:06:57 EST 2012 armv7l pandaboard/pandaboard/pandaboard:4.0.4/IMM76I/5:eng/test-keys
17:57:09     INFO -  CPU: arm
17:57:09     INFO -       2 CPUs
17:57:09     INFO -  Crash reason:  SIGILL
17:57:09     INFO -  Crash address: 0x6214331a
17:57:09     INFO -  Thread 13 (crashed)
17:57:09     INFO -   0  libxul.so!JSObject::uninlinedIsProxy() const [jsobj.cpp:c3387c5ddba9 : 5810 + 0x2]
17:57:09     INFO -       r4 = 0x722a5be0    r5 = 0x722a5be0    r6 = 0x5d49dea0    r7 = 0x5d49e024
17:57:09     INFO -       r8 = 0x64034ff4    r9 = 0x64034ff4   r10 = 0x5d49e024    fp = 0x00000000
17:57:09     INFO -       sp = 0x5d49de78    lr = 0x6236827d    pc = 0x6214331a
17:57:09     INFO -      Found by: given as instruction pointer in context
17:57:09     INFO -   1  libxul.so!js::GetProxyHandler(JSObject*) [jsproxy.h:c3387c5ddba9 : 381 + 0x3]
17:57:09     INFO -       r4 = 0x722a5be0    r5 = 0x722a5be0    r6 = 0x5d49dea0    r7 = 0x5d49e024
17:57:09     INFO -       r8 = 0x64034ff4    r9 = 0x64034ff4   r10 = 0x5d49e024    fp = 0x00000000
17:57:09     INFO -       sp = 0x5d49de78    pc = 0x6236827d
17:57:09     INFO -      Found by: call frame info
17:57:09     INFO -   2  libxul.so!js::Wrapper::wrappedObject(JSObject*) [jswrapper.h:c3387c5ddba9 : 258 + 0x3]
17:57:09     INFO -       r4 = 0x722a5be0    r5 = 0x63d2b484    r6 = 0x5d49dea0    r7 = 0x5d49e024
17:57:09     INFO -       r8 = 0x64034ff4    r9 = 0x64034ff4   r10 = 0x5d49e024    fp = 0x00000000
17:57:09     INFO -       sp = 0x5d49de88    pc = 0x633e5c53
17:57:09     INFO -      Found by: call frame info
17:57:09     INFO -   3  libxul.so!js::CrossCompartmentWrapper::get(JSContext*, JS::Handle<JSObject*>, JS::Handle<JSObject*>, JS::Handle<jsid>, JS::MutableHandle<JS::Value>) [jswrapper.cpp:c3387c5ddba9 : 312 + 0x3]
17:57:09     INFO -       r4 = 0x6afc11b0    r5 = 0x00000001    r6 = 0x5d49dea0    r7 = 0x5d49e024
17:57:09     INFO -       r8 = 0x64034ff4    r9 = 0x64034ff4   r10 = 0x5d49e024    fp = 0x00000000
17:57:09     INFO -       sp = 0x5d49de98    pc = 0x6340a995
17:57:09     INFO -      Found by: call frame info
17:57:09     INFO -   4  libxul.so!js::Proxy::get(JSContext*, JS::Handle<JSObject*>, JS::Handle<JSObject*>, JS::Handle<jsid>, JS::MutableHandle<JS::Value>) [jsproxy.cpp:c3387c5ddba9 : 2286 + 0x17]
17:57:09     INFO -       r4 = 0x6340a959    r5 = 0x5d49e2f8    r6 = 0x6afc11b0    r7 = 0x63d2b484
17:57:09     INFO -       r8 = 0x64034ff4    r9 = 0x5d49e034   r10 = 0x5d49e024    fp = 0x00000000
17:57:09     INFO -       sp = 0x5d49ded0    pc = 0x633f6667
17:57:09     INFO -      Found by: call frame info
17:57:09     INFO -   5  libxul.so!JSObject::getGeneric(JSContext*, JS::Handle<JSObject*>, JS::Handle<JSObject*>, JS::Handle<jsid>, JS::MutableHandle<JS::Value>) [jsobj.h:c3387c5ddba9 : 966 + 0x3]
17:57:09     INFO -       r4 = 0x00000001    r5 = 0x633f6709    r6 = 0x00000000    r7 = 0x5d49e2f8
17:57:09     INFO -       r8 = 0x0000002f    r9 = 0x66472c79   r10 = 0x66472ca8    fp = 0x00000000
17:57:09     INFO -       sp = 0x5d49df60    pc = 0x631462af
17:57:09     INFO -      Found by: call frame info
17:57:09     INFO -   6  libxul.so!js::jit::DoGetPropFallback [BaselineIC.cpp:c3387c5ddba9 : 6465 + 0x7]
This raises a bunch of questions:
  1) how reliable are the actual errors that we get in the dumps? Looking at the full logs, I see (line numbers prepended):

   3028 17:57:09     INFO -  Crash reason:  SIGILL
   3029 17:57:09     INFO -  Crash address: 0x6214331a

then later (this is the part I actually tried to get information from):

   8129 17:57:12     INFO -  06-29 17:56:47.992 I/DEBUG   ( 1289): signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr f8280840
   8130 17:57:12     INFO -  06-29 17:56:47.992 I/DEBUG   ( 1289):  r0 7c140420  r1 6afc11bc  r2 00000001  r3 63f27e30
   8131 17:57:12     INFO -  06-29 17:56:47.992 I/DEBUG   ( 1289):  r4 722a5be0  r5 722a5be0  r6 5d49dea0  r7 5d49e024
   8132 17:57:12     INFO -  06-29 17:56:47.992 I/DEBUG   ( 1289):  r8 64034ff4  r9 64034ff4  10 5d49e024  fp 00000000
   8133 17:57:12     INFO -  06-29 17:56:47.992 I/DEBUG   ( 1289):  ip 62368275  sp 5d49de78  lr 6236827d  pc 6214331a  cpsr 80000130


They agree on the pc, but they disagree on the faulting address and the type of fault.
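To double-check that the two dumps really describe the same register state, the comparison can be done mechanically. This is only a rough sketch: the register text is pasted from frame 0 of the minidump and from the I/DEBUG lines above, and the helper name `parse` and the two regexes are made up for illustration.

```python
import re

# Frame 0 registers as printed in the minidump stack trace.
MINIDUMP = """
r4 = 0x722a5be0    r5 = 0x722a5be0    r6 = 0x5d49dea0    r7 = 0x5d49e024
r8 = 0x64034ff4    r9 = 0x64034ff4   r10 = 0x5d49e024    fp = 0x00000000
sp = 0x5d49de78    lr = 0x6236827d    pc = 0x6214331a
"""

# Register block as printed in the I/DEBUG lines of the full log.
LOGCAT = """
r0 7c140420  r1 6afc11bc  r2 00000001  r3 63f27e30
r4 722a5be0  r5 722a5be0  r6 5d49dea0  r7 5d49e024
r8 64034ff4  r9 64034ff4  10 5d49e024  fp 00000000
ip 62368275  sp 5d49de78  lr 6236827d  pc 6214331a  cpsr 80000130
"""

def parse(text, pattern):
    """Collect (name, hex value) pairs into a dict of register -> int."""
    regs = {}
    for name, value in re.findall(pattern, text):
        if name == "10":  # the log prints r10 as a bare "10"
            name = "r10"
        regs[name] = int(value, 16)
    return regs

mini = parse(MINIDUMP, r"(\w+) = 0x([0-9a-f]+)")
logc = parse(LOGCAT, r"(\w+) ([0-9a-f]{8})\b")

# Only compare registers that both dumps report.
common = sorted(set(mini) & set(logc))
mismatched = [r for r in common if mini[r] != logc[r]]
print("compared:", common)
print("mismatched:", mismatched)
```

If the list of mismatched registers comes back empty, the disagreement is limited to the signal and the fault address, not the register contents.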

It looks like the issue comes from a misaligned pc, with very little to explain how it got that way.
The code around the pc is (disassembled from the full log):
<arr>:	ldr	r3, [r3, #8]
<arr+2>:	lsrs	r3, r3, #27
<arr+4>:	cmp	r1, r3
<arr+6>:	bcs.n	0x105b0 <arr+16>
<arr+8>:	adds	r0, #16
<arr+10>:	add.w	r0, r0, r1, lsl #3
<arr+14>:	bx	lr
<arr+16>:	subs	r1, r1, r3
<arr+18>:	ldr	r2, [r0, #8]
<arr+20>:	add.w	r0, r2, r1, lsl #3
<arr+24>:	bx	lr
<arr+26>:	ldr	r3, [r0, #4]
<arr+28>:	ldr	r3, [r3, #0]
<arr+30>:	ldr	r0, [r3, #4]
<arr+32>:	ubfx	r0, r0, #20, #1
<arr+36>:	bx	lr
<arr+38>:	movs	r0, r0
<arr+40>:	push	{r4, r5, r6, lr}
<arr+42>:	mov	r5, r0
<arr+44>:	mov	r6, r1

with the pc corresponding to arr + 34 (0x6214331a - 0x621432f8), which is halfway through the 4-byte ubfx at arr+32.
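The pc offset can be checked against the listing. A minimal sketch, assuming the instruction sizes implied by the gaps between the offsets in the listing (2 bytes for the narrow instructions, 4 for ubfx); `INSTRS` and `locate` are names invented here for illustration:

```python
# Base address of the pc-relative dump window and the crashing pc,
# both taken from the comment above.
ARR_BASE = 0x621432f8
PC = 0x6214331a

# (offset, size in bytes, text) for the instructions near the crash,
# transcribed from the listing; sizes follow from the offset gaps.
INSTRS = [
    (26, 2, "ldr r3, [r0, #4]"),
    (28, 2, "ldr r3, [r3, #0]"),
    (30, 2, "ldr r0, [r3, #4]"),
    (32, 4, "ubfx r0, r0, #20, #1"),
    (36, 2, "bx lr"),
]

def locate(offset):
    """Return (text, bytes into the instruction) for the instruction covering offset."""
    for start, size, text in INSTRS:
        if start <= offset < start + size:
            return text, offset - start
    return None, None

pc_offset = PC - ARR_BASE
text, into = locate(pc_offset)
print(f"pc is at arr+{pc_offset}, {into} byte(s) into '{text}'")
```

A nonzero "bytes into" value means the pc does not sit on an instruction boundary.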
The code around lr is:
<arr>:	add	sp, #16
<arr+2>:	pop	{r4, r5, r6, pc}
<arr+4>:	adds	r5, #44	; 0x2c
<arr+6>:	lsls	r2, r6, #4
<arr+8>:	adds	r5, #88	; 0x58
<arr+10>:	lsls	r2, r6, #4
<arr+12>:	movs	r4, #136	; 0x88
<arr+14>:	lsls	r5, r0, #5
<arr+16>:	stmia	r1!, {r2, r3, r4, r5, r6}
<arr+18>:	lsls	r0, r5, #5
<arr+20>:	asrs	r4, r7, #21
<arr+22>:	lsls	r6, r3, #6
<arr+24>:	push	{r3, r4, r5, lr}
<arr+26>:	mov	r5, r0
<arr+28>:	bl	0xffdeb656
<arr+32>:	mov	r4, r0
<arr+34>:	cbnz	r0, 0x105dc <arr+60>
<arr+36>:	ldr	r0, [pc, #36]	; (0x105ec <arr+76>)
<arr+38>:	movw	r2, #381	; 0x17d
<arr+42>:	ldr	r1, [pc, #36]	; (0x105f0 <arr+80>)

with the lr translating to: 0x6236827d - 0x6236825c - 1 = 32 (the - 1 strips the Thumb bit), or right after the only call instruction around. I believe (the math is kind of annoying, and it is easy to make mistakes) that the call instruction tried to jump to:
0x62143312 - 0x621432f8 = 26 bytes into the pc-relative dump, which is right after a return instruction, and on an instruction boundary. Taken together, these indicate that the call shown above (the bl at arr+28) was in fact the last call, and that it targeted the start of a full instruction. Then, without any further control flow, the program counter got out of sync.
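Since the math is easy to get wrong by hand, here is a minimal sketch that redoes it. It assumes the disassembler placed <arr> at 0x105a0 (inferred from `cbnz r0, 0x105dc <arr+60>` in the listing) and that the bl displacement is pc-relative, so "target minus instruction address" is the same in the listing as in the live process. All variable names are made up for illustration.

```python
LR = 0x6236827d               # lr from the crash dump; bit 0 is the Thumb bit
LR_BASE = 0x6236825c          # base of the lr-relative dump window
PC_BASE = 0x621432f8          # base of the pc-relative dump window
DISASM_BASE = 0x105a0         # assumed: where the disassembler placed <arr>
BL_OFFSET = 28                # offset of the bl within the lr window
BL_SHOWN_TARGET = 0xffdeb656  # bl target as printed by the disassembler

# Return address relative to the lr window, with the Thumb bit cleared.
return_offset = (LR & ~1) - LR_BASE

# Displacement encoded in the bl, recovered as a signed 32-bit value.
disp = (BL_SHOWN_TARGET - (DISASM_BASE + BL_OFFSET)) & 0xffffffff
if disp >= 0x80000000:
    disp -= 1 << 32

# Apply the same displacement at the real address of the bl.
real_target = LR_BASE + BL_OFFSET + disp
target_offset = real_target - PC_BASE

print(f"return address is at arr+{return_offset} in the lr window")
print(f"bl target is {real_target:#x}, arr+{target_offset} in the pc window")
```

The return offset should equal the bl offset plus the 4-byte size of the bl, and the target offset should land on one of the instruction starts in the pc-relative listing.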
Assuming all of this analysis holds, possible culprits are:
a cache coherency bug related to code loading (those are always fun to track down!),
failing hardware,
a random act of god^H^H^Hcosmic ray,
misbehaving code spraying all over the text section, somehow overwriting some important code around the crash site. I really hope that all of our code is marked RO, so that such a write would segfault rather than lead to cache-incoherent random code execution.
Inactive; closing (see bug 1180138).
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME