Closed Bug 1414172 Opened 8 years ago Closed 8 years ago

Crash in js::jit::DoSetPropFallback

Categories

(Core :: JavaScript Engine: JIT, defect, P1)

52 Branch

Tracking

RESOLVED INVALID
Tracking Status
thunderbird_esr52 --- wontfix
firefox-esr52 --- wontfix
firefox56 --- wontfix
firefox57 --- wontfix
firefox58 --- wontfix

People

(Reporter: cyu, Unassigned)

Details

(Keywords: crash, Whiteboard: [tbird crash])

Crash Data

This is 100% reproducible on my workstation. thunderbird-52.4.0 on ubuntu 16.04.

Crash report: https://crash-stats.mozilla.org/report/index/1258a599-f2cf-40b6-be53-c17a50171103#tab-details

Running with gdb:

Thread 1 received signal SIGSEGV, Segmentation fault.
JS::Rooted<JSScript*>::Rooted<JSContext*, JSScript*>(JSContext* const&, JSScript*&&) (initial=<optimized out>, cx=<synthetic pointer>, this=0xfffe7f1f4ae01b20) at /build/thunderbird-jd5xjR/thunderbird-52.4.0+build1/obj-x86_64-linux-gnu/dist/include/js/RootingAPI.h:788
788 /build/thunderbird-jd5xjR/thunderbird-52.4.0+build1/obj-x86_64-linux-gnu/dist/include/js/RootingAPI.h: No such file or directory.
(rr) bt
#0 JS::Rooted<JSScript*>::Rooted<JSContext*, JSScript*>(JSContext* const&, JSScript*&&) (initial=<optimized out>, cx=<synthetic pointer>, this=0xfffe7f1f4ae01b20) at /build/thunderbird-jd5xjR/thunderbird-52.4.0+build1/obj-x86_64-linux-gnu/dist/include/js/RootingAPI.h:788
#1 js::jit::DoSetPropFallback (cx=0x7f1f4f27a000, frame=0x7ffed035f388, stub_=0x7f1f49b27288, lhs=..., rhs=..., res=...) at /build/thunderbird-jd5xjR/thunderbird-52.4.0+build1/mozilla/js/src/jit/BaselineIC.cpp:4503
#2 0x00002cef4f0ebc06 in ?? ()
#3 0x0000000000000000 in ?? ()
Crash Signature: js::jit::DoSetPropFallback
More information: this crash happened after Thunderbird had been open for > 2hr. Then I couldn't open Thunderbird anymore. It crashed during startup. This startup crash was gone after I rebooted my workstation. I have packed the rr trace and it still reproduces the crash. I can provide the trace on request.
Severity: normal → critical
Crash Signature: js::jit::DoSetPropFallback → [@ js::jit::DoSetPropFallback]
Flags: needinfo?(cyu)
Sorry, I can't try safe mode because the crash is already gone after reboot. What I have is the packed rr trace of the crash and I can still replay the crash.
Flags: needinfo?(cyu)
Wayne, are you aware of Robert O'Callahan's "rr" (record and replay): http://robert.ocallahan.org/2015/10/rr-40-released-with-reverse-execution.html Anyway, I can't even see any TB code in the call stack given in comment #0.
Looking at the crash reports, it also happens on Firefox. Re-categorize as a JS bug.
Component: General → JavaScript Engine: JIT
Product: Thunderbird → Core
Summary: Startup crash in js::jit::DoSetPropFallback → Crash in js::jit::DoSetPropFallback
arai, this is Thunderbird, not Firefox, but a recorded crash is a rare opportunity. Would you like to get in touch with Cervantes and try to debug it? (Cervantes is in GMT+8.) (ni?jandem too)
Flags: needinfo?(jdemooij)
Flags: needinfo?(arai.unmht)
Priority: -- → P1
I don't have much experience with rr, so it may take some time to get used to it before looking into the recording. If it's okay, I'll try working on this.
Flags: needinfo?(arai.unmht)
Arai, that is fine. I expect Cervantes can supply the rr expertise if you supply the JS expertise! If it doesn't work out, ni?me and we'll try something else...
Reports in the last week on Socorro:
> Firefox 58.0*: 1
> Firefox 57.0*: 6
> Firefox 56.0*: 13
> Firefox ESR 52.4.*: 28
I don't consider this to block Quantum at these volumes.
cervantes, would it be possible to send me the trace somewhere? We could also try some remote debugging on IRC but that might take more time.
Flags: needinfo?(cyu)
(In reply to Jan de Mooij [:jandem] from comment #10)
> cervantes, would it be possible to send me the trace somewhere? We could
> also try some remote debugging on IRC but that might take more time.
It was taken with a Core i7 3770. I tried to run the trace on another machine with a different model (Haswell), but it fails to replay because of CPUID issues: https://github.com/mozilla/rr/wiki/Trace-Portability
I can send the trace to you. If it fails to replay on your side I can set up a guest account on my workstation for debugging.
Flags: needinfo?(cyu)
Whiteboard: [tbird crash]
(In reply to Cervantes Yu [:cyu] [:cervantes] from comment #11)
> It was taken with a Core i7 3770. I tried to run the trace on another machine
> with a different model (Haswell), but it fails to replay because of CPUID
> issues: https://github.com/mozilla/rr/wiki/Trace-Portability
> I can send the trace to you. If it fails to replay on your side I can set up
> a guest account on my workstation for debugging.
Thank you. It fails for me too, something about an unrecorded syscall. What's your rr version? Access to that machine or STR would be appreciated :)
I sent you the ssh logon information for rr replaying the crash on my workstation.
(In reply to Cervantes Yu [:cyu] [:cervantes] from comment #13)
> I sent you the ssh logon information for rr replaying the crash on my
> workstation.
Thanks! This is pretty weird, we're crashing here in DoSetPropFallback:

   0x00007f1f5be9089f <+79>:  and $0xfffffff,%eax
   0x00007f1f5be908a4 <+84>:  mov %eax,0x28(%rsp)
   0x00007f1f5be908a8 <+88>:  callq 0x7f1f5be62f40 <js::jit::ScriptFromCalleeToken(js::jit::CalleeToken)>
   0x00007f1f5be908ad <+93>:  mov 0x80001b0(%r14),%rcx
=> 0x00007f1f5be908b4 <+100>: mov %rax,0x40000b0(%rsp)
   0x00007f1f5be908bc <+108>: mov %rbx,0x40(%rsp)
   0x00007f1f5be908c1 <+113>: lea 0x10(%rcx),%rdx
   0x00007f1f5be908c5 <+117>: test %rcx,%rcx

Basically we're storing $rax to stack-pointer + 64 MB. That makes no sense. Is this your own build? It's probably either memory/download corruption or a compiler bug...
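The "stack-pointer + 64 MB" figure can be checked by splitting the displacement of the faulting store into whole mebibytes and a remainder. The following is a minimal sketch, not part of the original analysis. The helper name `split_displacement` is made up for illustration, and the 8 MiB stack size is an assumption (a common Linux default) that this bug does not state.

```python
MIB = 1 << 20  # bytes per mebibyte


def split_displacement(disp):
    """Return (whole MiB, leftover bytes) for a byte displacement."""
    return divmod(disp, MIB)


# Displacement copied from the faulting instruction in the listing above:
#   mov %rax,0x40000b0(%rsp)
store_disp = 0x40000B0

whole_mib, leftover = split_displacement(store_disp)
print(f"rsp + {whole_mib} MiB + {leftover:#x} bytes")  # rsp + 64 MiB + 0xb0 bytes

# ASSUMPTION: an 8 MiB main-thread stack; the real limit on the reporter's
# machine is not given in this bug.
ASSUMED_STACK_BYTES = 8 * MIB
print(store_disp > ASSUMED_STACK_BYTES)  # True: the offset exceeds the whole assumed stack
```

Under that assumption the store targets an address further above the stack pointer than the whole stack is large, which is consistent with the segmentation fault.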
(In reply to Cervantes Yu [:cyu] [:cervantes] from comment #1)
> More information: this crash happened after Thunderbird had been open for > 2hr.
> Then I couldn't open Thunderbird anymore. It crashed during startup. This
> startup crash was gone after I rebooted my workstation.
This + comment 14 really suggests bad RAM. Could you run MemTest overnight?
Flags: needinfo?(cyu)
Any chance there's an extension that touches the native part? (js-ctypes is available on 52.4.0 and it can easily break heap/stack)
(In reply to Tooru Fujisawa [:arai] from comment #16)
> Any chance there's an extension that touches the native part?
> (js-ctypes is available on 52.4.0 and it can easily break heap/stack)
It's possible, but this extension would first have to mprotect the code to make it writable, then touch some bits. Also, Cervantes got the crash once and after that it was impossible to start Thunderbird again (until a reboot). The kernel likely stores/caches a copy of libxul.so in RAM, and that data being corrupted would explain all these symptoms perfectly.
The build is from Ubuntu and not a local build, so we can rule out compiler bugs. The crash looks like a bad RAM issue. I disassembled the function in a new gdb session:

   0x00007fffe9f9089f <+79>:  and $0xfffffff,%eax
   0x00007fffe9f908a4 <+84>:  mov %eax,0x28(%rsp)
   0x00007fffe9f908a8 <+88>:  callq 0x7fffe9f62f40 <js::jit::ScriptFromCalleeToken(js::jit::CalleeToken)>
   0x00007fffe9f908ad <+93>:  mov 0x1b0(%r14),%rcx
   0x00007fffe9f908b4 <+100>: mov %rax,0xb0(%rsp)
   0x00007fffe9f908bc <+108>: mov %rbx,0x40(%rsp)

When it runs normally, the instruction is mov %rax,0xb0(%rsp), while in the rr capture it's mov %rax,0x40000b0(%rsp). So one bit of the offset is flipped, and the corrupted copy is likely cached in the kernel. That explains why it crashed 100% of the time after the first crash. Closing as invalid since this is a hardware issue.
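The single-bit claim can be confirmed by XOR-ing the displacement from the normal run with the one from the rr capture and listing the bits that differ. This is a minimal sketch. The function name `flipped_bits` is made up for illustration, and the constants are copied from the two disassembly listings in this bug.

```python
def flipped_bits(expected, observed):
    """Return the positions (0 = least significant) of bits that differ."""
    diff = expected ^ observed
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


# Displacements of the store at <+100>:
normal_store = 0xB0        # mov %rax,0xb0(%rsp)       (fresh gdb session)
traced_store = 0x40000B0   # mov %rax,0x40000b0(%rsp)  (rr capture)

bits = flipped_bits(normal_store, traced_store)
print(bits)                # [26]
print(hex(1 << bits[0]))   # 0x4000000, i.e. 64 MiB
```

Exactly one bit differs, bit 26, which accounts for the extra 64 MiB in the store offset. Applying the same comparison to the load at <+93>, `flipped_bits(0x1B0, 0x80001B0)`, returns `[27]`, so that displacement also differs between the two listings by a single bit.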
Flags: needinfo?(cyu)
Closing as discussed.
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(jdemooij)
Resolution: --- → INVALID