Closed
Bug 1414172
Opened 8 years ago
Closed 8 years ago
Crash in js::jit::DoSetPropFallback
Categories
(Core :: JavaScript Engine: JIT, defect, P1)
Tracking
()
People
(Reporter: cyu, Unassigned)
Details
(Keywords: crash, Whiteboard: [tbird crash])
Crash Data
This is 100% reproducible on my workstation: Thunderbird 52.4.0 on Ubuntu 16.04.
Crash report: https://crash-stats.mozilla.org/report/index/1258a599-f2cf-40b6-be53-c17a50171103#tab-details
Running with gdb:
Thread 1 received signal SIGSEGV, Segmentation fault.
JS::Rooted<JSScript*>::Rooted<JSContext*, JSScript*>(JSContext* const&, JSScript*&&) (initial=<optimized out>, cx=<synthetic pointer>, this=0xfffe7f1f4ae01b20) at /build/thunderbird-jd5xjR/thunderbird-52.4.0+build1/obj-x86_64-linux-gnu/dist/include/js/RootingAPI.h:788
788 /build/thunderbird-jd5xjR/thunderbird-52.4.0+build1/obj-x86_64-linux-gnu/dist/include/js/RootingAPI.h: No such file or directory.
(rr) bt
#0 JS::Rooted<JSScript*>::Rooted<JSContext*, JSScript*>(JSContext* const&, JSScript*&&) (initial=<optimized out>, cx=<synthetic pointer>, this=0xfffe7f1f4ae01b20) at /build/thunderbird-jd5xjR/thunderbird-52.4.0+build1/obj-x86_64-linux-gnu/dist/include/js/RootingAPI.h:788
#1 js::jit::DoSetPropFallback (cx=0x7f1f4f27a000, frame=0x7ffed035f388, stub_=0x7f1f49b27288, lhs=..., rhs=..., res=...) at /build/thunderbird-jd5xjR/thunderbird-52.4.0+build1/mozilla/js/src/jit/BaselineIC.cpp:4503
#2 0x00002cef4f0ebc06 in ?? ()
#3 0x0000000000000000 in ?? ()
Reporter
Updated•8 years ago
Crash Signature: js::jit::DoSetPropFallback
Reporter
Comment 1•8 years ago
More information: this crash happened after Thunderbird had been open for more than 2 hours. After that I couldn't start Thunderbird anymore: it crashed during startup. The startup crash went away after I rebooted my workstation.
I have a packed rr trace and it still reproduces the crash. I can provide the trace on request.
Comment 2•8 years ago
Cervantes, does the crash reproduce in safe mode? https://support.mozilla.org/en-US/kb/safe-mode-thunderbird
On the bugzilla tab*, "Bugzilla - Report this bug in Thunderbird" will give you a nicely formatted bug report.
* https://crash-stats.mozilla.org/report/index/1258a599-f2cf-40b6-be53-c17a50171103#tab-details
This crash signature is rare according to https://crash-stats.mozilla.org/signature/?product=Thunderbird&signature=js%3A%3Ajit%3A%3ADoSetPropFallback&date=%3E%3D2017-10-27T02%3A55%3A47.000Z&date=%3C2017-11-03T02%3A55%3A47.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_columns=install_time&_sort=-date&page=1#graphs
Perhaps neander will have an idea.
Severity: normal → critical
Crash Signature: js::jit::DoSetPropFallback → [@ js::jit::DoSetPropFallback]
Flags: needinfo?(cyu)
Reporter
Comment 3•8 years ago
Sorry, I can't try safe mode because the crash is already gone after reboot. What I have is the packed rr trace of the crash and I can still replay the crash.
Flags: needinfo?(cyu)
Comment 4•8 years ago
Wayne, are you aware of Robert O'Callahan's "rr" (record and replay):
http://robert.ocallahan.org/2015/10/rr-40-released-with-reverse-execution.html
Anyway, I can't even see any TB code in the call stack given in comment #0.
Reporter
Comment 5•8 years ago
Looking at the crash reports, this also happens on Firefox. Re-categorizing as a JS bug.
Component: General → JavaScript Engine: JIT
Product: Thunderbird → Core
Summary: Startup crash in js::jit::DoSetPropFallback → Crash in js::jit::DoSetPropFallback
Comment 6•8 years ago
arai, this is Thunderbird, not Firefox, but a recorded crash is a rare opportunity. Would you like to get in touch with Cervantes and try to debug it? (Cervantes is in GMT+8.)
(ni?jandem too)
Flags: needinfo?(jdemooij)
Flags: needinfo?(arai.unmht)
Priority: -- → P1
Comment 7•8 years ago
I don't have much experience with rr, so it may take some time to get used to it before I can look into the recording.
If that's okay, I'll try working on this.
Flags: needinfo?(arai.unmht)
Comment 8•8 years ago
Arai, that is fine. I expect Cervantes can supply the rr expertise if you supply the JS expertise!
If it doesn't work out, ni?me and we'll try something else...
Comment 9•8 years ago
Reports in the last week on Socorro:
> Firefox 58.0*: 1
> Firefox 57.0*: 6
> Firefox 56.0*: 13
> Firefox ESR 52.4.*: 28
I don't consider this to block Quantum at these volumes.
status-firefox56: --- → wontfix
status-firefox57: --- → wontfix
status-firefox58: --- → affected
status-firefox-esr52: --- → affected
status-thunderbird_esr52: --- → affected
Comment 10•8 years ago
cervantes, would it be possible to send me the trace somewhere? We could also try some remote debugging on IRC but that might take more time.
Flags: needinfo?(cyu)
Reporter
Comment 11•8 years ago
(In reply to Jan de Mooij [:jandem] from comment #10)
> cervantes, would it be possible to send me the trace somewhere? We could
> also try some remote debugging on IRC but that might take more time.
It was taken with a Core i7-3770. I tried to run the trace on another machine with a different model (Haswell), but it fails to replay because of CPUID issues: https://github.com/mozilla/rr/wiki/Trace-Portability
I can send the trace to you. If it fails to replay on your side I can set up a guest account on my workstation for debugging.
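For a rough idea of why a trace recorded on one CPU may not replay on another, the sketch below compares the `flags` lines of two `/proc/cpuinfo` dumps. This is not how rr itself decides whether a trace is portable; the function names and the sample flag lists are made up for illustration, and real dumps are much longer.

```python
from typing import Dict, Set


def cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Extract the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def flag_differences(record_cpuinfo: str, replay_cpuinfo: str) -> Dict[str, Set[str]]:
    """Return the flags that are present on only one of the two machines."""
    rec = cpu_flags(record_cpuinfo)
    rep = cpu_flags(replay_cpuinfo)
    return {"only_on_recorder": rec - rep, "only_on_replayer": rep - rec}


# Illustrative, heavily shortened flag lists (hypothetical, not real dumps).
recorder = "model name\t: hypothetical recording CPU\nflags\t\t: fpu sse2 avx rdrand\n"
replayer = "model name\t: hypothetical replay CPU\nflags\t\t: fpu sse2 avx avx2 bmi2 rdrand\n"

diff = flag_differences(recorder, replayer)
print("only on recorder:", sorted(diff["only_on_recorder"]))
print("only on replayer:", sorted(diff["only_on_replayer"]))
```

A difference in either direction is only a hint that replay may be affected; the wiki page linked above is the place to look for what rr actually requires.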
Flags: needinfo?(cyu)
Updated•8 years ago
Whiteboard: [tbird crash]
Comment 12•8 years ago
(In reply to Cervantes Yu [:cyu] [:cervantes] from comment #11)
> It was taken with a Core i7-3770. I tried to run the trace on another machine
> with a different model (Haswell), but it fails to replay because of CPUID
> issues: https://github.com/mozilla/rr/wiki/Trace-Portability
> I can send the trace to you. If it fails to replay on your side I can set up
> a guest account on my workstation for debugging.
Thank you. It fails for me too, something about an unrecorded syscall. What's your rr version?
Access to that machine or STR would be appreciated :)
Reporter
Comment 13•8 years ago
I sent you the ssh logon information for rr replaying the crash on my workstation.
Comment 14•8 years ago
(In reply to Cervantes Yu [:cyu] [:cervantes] from comment #13)
> I sent you the ssh logon information for rr replaying the crash on my
> workstation.
Thanks!
This is pretty weird, we're crashing here in DoSetPropFallback:
0x00007f1f5be9089f <+79>: and $0xfffffff,%eax
0x00007f1f5be908a4 <+84>: mov %eax,0x28(%rsp)
0x00007f1f5be908a8 <+88>: callq 0x7f1f5be62f40 <js::jit::ScriptFromCalleeToken(js::jit::CalleeToken)>
0x00007f1f5be908ad <+93>: mov 0x80001b0(%r14),%rcx
=> 0x00007f1f5be908b4 <+100>: mov %rax,0x40000b0(%rsp)
0x00007f1f5be908bc <+108>: mov %rbx,0x40(%rsp)
0x00007f1f5be908c1 <+113>: lea 0x10(%rcx),%rdx
0x00007f1f5be908c5 <+117>: test %rcx,%rcx
Basically we're storing $rax to stack-pointer + 64 MB. That makes no sense. Is this your own build? It's probably either memory/download corruption or a compiler bug...
Comment 15•8 years ago
(In reply to Cervantes Yu [:cyu] [:cervantes] from comment #1)
> More information: this crash happened after Thunderbird had been open for
> more than 2 hours. After that I couldn't start Thunderbird anymore: it crashed
> during startup. The startup crash went away after I rebooted my workstation.
This + comment 14 really suggests bad RAM. Could you run MemTest overnight?
Flags: needinfo?(cyu)
Comment 16•8 years ago
Any chance there's an extension that touches native code?
(js-ctypes is available in 52.4.0 and it can easily corrupt the heap or stack.)
Comment 17•8 years ago
(In reply to Tooru Fujisawa [:arai] from comment #16)
> Any chance there's an extension that touches native code?
> (js-ctypes is available in 52.4.0 and it can easily corrupt the heap or stack.)
It's possible, but this extension would first have to mprotect the code to make it writable, then touch some bits.
Also, Cervantes got the crash once and after that it was impossible to start Thunderbird again (until a reboot). The kernel likely stores/caches a copy of libxul.so in RAM and that data being corrupted would explain all these symptoms perfectly.
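One way to test the cached-copy theory, as a sketch only: compare the installed library, as the running system reads it, against a known-good copy (for example one extracted fresh from the package) and list the pages that differ. The helper name and the 4 KiB page size are assumptions, and the demo uses in-memory buffers instead of real files.

```python
import io
from typing import BinaryIO, List

PAGE_SIZE = 4096  # assumption: 4 KiB pages, typical on x86-64 Linux


def differing_pages(suspect: BinaryIO, reference: BinaryIO,
                    page_size: int = PAGE_SIZE) -> List[int]:
    """Return the indices of page-sized chunks whose contents differ."""
    bad = []
    index = 0
    while True:
        a = suspect.read(page_size)
        b = reference.read(page_size)
        if not a and not b:
            break  # both streams exhausted
        if a != b:
            bad.append(index)
        index += 1
    return bad


# Demo with synthetic data: flip one bit in the second page of a 3-page blob.
good = bytes(3 * PAGE_SIZE)
corrupted = bytearray(good)
corrupted[PAGE_SIZE + 7] ^= 0x04

print(differing_pages(io.BytesIO(bytes(corrupted)), io.BytesIO(good)))
```

With real files, both arguments would be opened with `open(path, "rb")`. Reads of the installed file normally go through the page cache, so a corrupted cached page would show up here even if the data on disk is intact.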
Reporter
Comment 18•8 years ago
The build is from Ubuntu and not a local build, so we can rule out compiler bugs. The crash looks like a bad RAM issue. I disassembled the function in a new gdb session:
0x00007fffe9f9089f <+79>: and $0xfffffff,%eax
0x00007fffe9f908a4 <+84>: mov %eax,0x28(%rsp)
0x00007fffe9f908a8 <+88>: callq 0x7fffe9f62f40 <js::jit::ScriptFromCalleeToken(js::jit::CalleeToken)>
0x00007fffe9f908ad <+93>: mov 0x1b0(%r14),%rcx
0x00007fffe9f908b4 <+100>: mov %rax,0xb0(%rsp)
0x00007fffe9f908bc <+108>: mov %rbx,0x40(%rsp)
When it runs normally, the instruction is mov %rax,0xb0(%rsp), while in the rr capture it's mov %rax,0x40000b0(%rsp). So one bit of the offset is flipped, and the corrupted copy is likely cached in the kernel. That explains why it crashes 100% of the time after the first crash.
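The single-bit claim can be checked by XOR-ing the displacements from the two disassemblies. The sketch below does that for the crashing store at +100 and also for the load at +93, whose displacement differs between the two listings as well (0x1b0 vs 0x80001b0). The helper name is made up; the constants are copied from the listings in this bug.

```python
from typing import List


def flipped_bits(expected: int, observed: int) -> List[int]:
    """Return the bit positions at which the two integers differ."""
    diff = expected ^ observed
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


# (good displacement, displacement seen in the rr capture)
pairs = {
    "+93  mov disp(%r14),%rcx": (0x1B0, 0x80001B0),
    "+100 mov %rax,disp(%rsp)": (0xB0, 0x40000B0),
}

for label, (good, bad) in pairs.items():
    bits = flipped_bits(good, bad)
    delta_mib = (bad - good) // 2**20
    print(f"{label}: {good:#x} -> {bad:#x}, flipped bits {bits}, delta {delta_mib} MiB")
```

Each pair differs in exactly one bit: bit 27 for the load (128 MiB) and bit 26 for the store (64 MiB, matching the "stack-pointer + 64 MB" observation in comment 14).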
Closing as invalid since this is a hardware issue.
Flags: needinfo?(cyu)
Comment 19•8 years ago
Closing as discussed.
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(jdemooij)
Resolution: --- → INVALID
Updated•3 years ago