Firefox 87.0 topcrash in [@ JS_WrapValue] with Intel GeminiLake (UHD Graphics 600/605)
Categories
(Core :: JavaScript Engine, defect, P1)
Tracking
Release | Tracking | Status |
---|---|---|
firefox-esr78 | --- | wontfix |
firefox87 | + | wontfix |
firefox88 | --- | unaffected |
firefox89 | --- | unaffected |
firefox106 | + | affected |
People
(Reporter: aryx, Unassigned)
Details
(Keywords: crash, topcrash)
[Tracking Requested - why for this release]: Top crash
This topcrash (#6 for Firefox 87.0) is new at this frequency (1.1k crashes so far). All but one are on Windows 10, and >99% are on Intel Gemini Lake (UHD Graphics 600/605):
CPU info | Crashes | Share |
---|---|---|
family 6 model 122 stepping 1 | 1094 | 97.07 % |
family 6 model 122 stepping 8 | 31 | 2.75 % |
Only one non-87.0 crash (for 89.0a1) - might a dot release fix this?
Crash report: https://crash-stats.mozilla.org/report/index/e7d40e1a-8254-4ee2-943b-3b1f00210330
Reason: EXCEPTION_ACCESS_VIOLATION_READ
Top 10 frames of crashing thread:
0 xul.dll JS_WrapValue js/src/jsapi.cpp:656
1 xul.dll trunc
2 xul.dll static XPCWrappedNative::CallMethod js/xpconnect/src/XPCWrappedNative.cpp:1142
3 xul.dll XPC_WN_CallMethod js/xpconnect/src/XPCWrappedNativeJSOps.cpp:925
4 xul.dll js::InternalCallOrConstruct js/src/vm/Interpreter.cpp:520
5 xul.dll Interpret js/src/vm/Interpreter.cpp:3243
6 xul.dll js::InternalCallOrConstruct js/src/vm/Interpreter.cpp:552
7 xul.dll js::jit::DoCallFallback js/src/jit/BaselineIC.cpp:1841
8 @0x1dab9873ebe
9 xul.dll trunc
Comment 1•4 years ago
78.9.0's crash rate is also elevated (78.6.0 was also affected).
Comment 2•4 years ago
Hmm.. That CPU has given both us and Chrome trouble before..
Comment 3•4 years ago
https://bugs.chromium.org/p/chromium/issues/detail?id=1157639#c14
Chrome reports a recent spike in the last two weeks which matches us.
This really looks like Intel shipped some microcode update and lost the stability fix.
Comment 4•4 years ago
This link suggests Microsoft started rolling out KB4589212 on March 10th. In the KB4589212 description, they list Gemini Lake (the stepping 1 ID), with a footnote that says:
1 Rolled back to microcode updates related to Spectre Variant 3a (CVE-2018-3640: "Rogue System Register Read (RSRE)"), Spectre Variant 4 (CVE-2018-3639: "Speculative Store Bypass (SSB)"), and L1TF (CVE-2018-3620, CVE-2018-3646: "L1 Terminal Fault")
Comment 5•4 years ago
Changing the priority to P1 as the bug is tracked by a release manager for the current release.
See What Do You Triage for more information.
Comment 6•4 years ago
This crash is only happening in Release on CPUs that have hardware bugs. In the past, when we've tried to work around this, we haven't been able to have a direct impact. Without any better ideas, and with the merge a few days away, I think our best option is to cross our fingers and hope that the 88.0 build does not generate code that hits the same pattern.
Comment 7•3 years ago
Went away in 88+ as expected.
Comment 8•3 years ago
This crash is back on Gemini Lake with Firefox 91.0.1.
Comment 9•2 years ago
My brother is hitting this on his laptop on current nightly (bp-61c854a8-cd3b-447e-9768-da1800221021, bp-4013e77f-f13a-411a-8746-87bbb0221021). It seems to happen on Gmail and with a rather slow internet connection (one of the crashes has the background hang monitor on the stack).
Is there something that would be useful to investigate here?
Comment 10•2 years ago
The crashes here are all coming from machines using Gemini Lake processors, which suggests a CPU bug, especially given that the stacks are all different. I don't think there's much we can do. If this is triggered by a particular code sequence, then the next version of nightly should make the problem disappear from his laptop (at least until we end up with the same code sequence again).
Comment 11•2 years ago
Gabriele, could you help us diagnose this in 106.0.4 and confirm that we are hitting the same CPU bug in 106.0?
Assuming so, the only workaround is a 106.0 dot release?
Comment 12•2 years ago
Given the current spike I had another look. I can confirm that this is indeed a CPU bug. All the crashes are coming from machines with Gemini Lake CPUs, also known as Goldmont Plus. The crashes manifest themselves as an ACCESS_VIOLATION_WRITE exception, which requires a memory access to be triggered, but the crashing instruction is mov r15, rcx, which does not access memory and couldn't possibly cause that exception.
I skimmed over the errata document for these CPUs but couldn't find a specific issue that could cause this, yet the number of issues in this core is quite large so I might have missed something.
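For anyone re-checking the "all Gemini Lake" claim against a fresh batch of reports, here is a minimal sketch of how the CPU info strings could be tallied. It assumes the strings have the "family 6 model 122 stepping 1" shape quoted in the description, and that family 6 model 122 is the Gemini Lake identifier, as the breakdown there indicates. The function names and the sample list are hypothetical; nothing here queries crash-stats.

```python
import re
from collections import Counter

# Assumption: CPU info strings look like "family 6 model 122 stepping 1",
# matching the breakdown quoted in the bug description.
CPU_INFO_RE = re.compile(r"family (\d+) model (\d+) stepping (\d+)")

# Assumption: family 6 model 122 identifies Gemini Lake (Goldmont Plus),
# based on the CPU info reported in this bug.
GEMINI_LAKE = (6, 122)


def parse_cpu_info(text):
    """Return (family, model, stepping) as ints, or None if unparseable."""
    match = CPU_INFO_RE.search(text)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


def is_gemini_lake(text):
    """True if the CPU info string has the Gemini Lake family and model."""
    parsed = parse_cpu_info(text)
    return parsed is not None and parsed[:2] == GEMINI_LAKE


def gemini_lake_share(cpu_infos):
    """Fraction of reports whose CPU info matches Gemini Lake."""
    infos = list(cpu_infos)
    if not infos:
        return 0.0
    hits = sum(1 for info in infos if is_gemini_lake(info))
    return hits / len(infos)


def stepping_counts(cpu_infos):
    """Count Gemini Lake reports per stepping."""
    counts = Counter()
    for info in cpu_infos:
        parsed = parse_cpu_info(info)
        if parsed is not None and parsed[:2] == GEMINI_LAKE:
            counts[parsed[2]] += 1
    return counts


# Hypothetical sample, not real crash data.
sample = [
    "family 6 model 122 stepping 1",
    "family 6 model 122 stepping 1",
    "family 6 model 122 stepping 8",
    "family 6 model 158 stepping 10",
]
share = gemini_lake_share(sample)
per_stepping = stepping_counts(sample)
```

A share close to 1.0 spread across unrelated stacks would be consistent with the hardware explanation above, though it would not prove it on its own.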