Crash in [@ SkScalerContext::AutoDescriptorGivenRecAndEffects] on Intel CPU family 6 model 122 stepping 1
Categories
(Core :: Graphics, defect, P2)
Tracking
()
Release | Tracking | Status |
---|---|---|
firefox69 | --- | wontfix |
People
(Reporter: marcia, Unassigned)
References
Details
(Keywords: crash, regression, steps-wanted)
Crash Data
This bug is for crash report bp-3082dfe6-8d7f-4a3f-baec-c026d0190904.
Seen while looking at release crashes. This is currently the #4 crash with no bug associated with it: https://bit.ly/2jZ0rPf. Comments mention repeated crashing.
Top 10 frames of crashing thread:
0 xul.dll SkScalerContext::AutoDescriptorGivenRecAndEffects gfx/skia/skia/src/core/SkScalerContext.cpp:1130
1 xul.dll SkStrikeCache::FindOrCreateStrikeExclusive gfx/skia/skia/src/core/SkStrikeCache.cpp:190
2 xul.dll SkGlyphRunListPainter::drawForBitmapDevice gfx/skia/skia/src/core/SkGlyphRunPainter.cpp:225
3 xul.dll SkBitmapDevice::drawGlyphRunList gfx/skia/skia/src/core/SkBitmapDevice.cpp:541
4 xul.dll SkGlyphRunBuilder::drawTextBlob gfx/skia/skia/src/core/SkGlyphRun.cpp:232
5 xul.dll SkCanvas::onDrawTextBlob gfx/skia/skia/src/core/SkCanvas.cpp:2552
6 xul.dll SkCanvas::drawTextBlob gfx/skia/skia/src/core/SkCanvas.cpp:2573
7 xul.dll mozilla::gfx::DrawTargetSkia::DrawGlyphs gfx/2d/DrawTargetSkia.cpp:1391
8 xul.dll void mozilla::gfx::FillGlyphsCommand::ExecuteOnDT gfx/2d/DrawCommands.h:577
9 xul.dll mozilla::gfx::DrawTarget::DrawCapturedDT gfx/2d/DrawTarget.cpp:167
Comment 1•5 years ago (Reporter)
This crash was mentioned in the Channel meeting yesterday. These are entry-level Intel CPUs (Gemini Lake). Philipp noted we have had trouble with these in previous releases.
Comment 2•5 years ago
Hi Lee, this crash is spiking in 69.0 post-release. Can you please take a look?
Comment 3•5 years ago
We have already had build-specific crash signatures spiking in the past with this particular CPU, for example bug 1524257, bug 1544192 and bug 1553380.
Comment 4•5 years ago
Without a way to reproduce this, it is difficult to say what is going on. The stack doesn't point to a specific line where the problem might be occurring, and many different objects are in play near that area of the stack, none of which looks obviously wrong enough to cause the crash. So the first step would be to get an initial lead on the cause so that we can reproduce it.
Comment 5•5 years ago (Reporter)
The comments aren't really useful in terms of getting any steps - they just mention repeated crashing. My guess is we would have to get a machine with this spec if we wanted to reproduce. Some correlations:
(100.0% in signature vs 01.83% overall) CPU Info = family 6 model 122 stepping 1
(97.87% in signature vs 01.57% overall) address = 0x5a
(100.0% in signature vs 07.92% overall) reason = EXCEPTION_ACCESS_VIOLATION_WRITE
(20.43% in signature vs 99.99% overall) graphics_startup_test = null
(33.23% in signature vs 00.95% overall) adapter_vendor_id = 0x00ba [61.24% vs 01.35% if process_type = content]
(95.43% in signature vs 41.98% overall) platform_pretty_version = Windows 10
(20.43% in signature vs 73.86% overall) app_init_dlls = null
(35.06% in signature vs 00.84% overall) adapter_device_id = 0x3185 [48.67% vs 01.13% if startup_crash = 0]
(30.79% in signature vs 00.56% overall) adapter_device_id = 0x3184 [46.76% vs 01.01% if adapter_vendor_id = 0x8086]
(100.0% in signature vs 61.66% overall) cpu_arch = amd64
(25.30% in signature vs 03.59% overall) bios_manufacturer = Insyde Corp. [42.13% vs 03.26% if process_type = content]
(95.12% in signature vs 47.78% overall) Module "wshbth.dll" = true [92.00% vs 57.85% if platform_version = 10.0.17134]
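One way to read the correlation list above is to divide the "in signature" share by the "overall" share, which shows how over-represented each value is among these crashes. The snippet below is only an illustrative sketch: it uses a hand-picked subset of the lines above (with the bracketed conditional part dropped), and the name `lift` is a label chosen for this example, not something crash-stats reports.

```python
import re

# A subset of the correlation lines quoted above, copied verbatim
# (the bracketed "[... if ...]" conditional parts are omitted).
CORRELATIONS = """\
(100.0% in signature vs 01.83% overall) CPU Info = family 6 model 122 stepping 1
(97.87% in signature vs 01.57% overall) address = 0x5a
(100.0% in signature vs 07.92% overall) reason = EXCEPTION_ACCESS_VIOLATION_WRITE
(35.06% in signature vs 00.84% overall) adapter_device_id = 0x3185
(20.43% in signature vs 99.99% overall) graphics_startup_test = null
"""

LINE = re.compile(
    r"\((?P<sig>[\d.]+)% in signature vs (?P<all>[\d.]+)% overall\) (?P<field>.+)"
)


def lift(line):
    """Return (field, ratio of in-signature share to overall share)."""
    m = LINE.match(line)
    if m is None:
        raise ValueError(f"unrecognised correlation line: {line!r}")
    return m["field"], float(m["sig"]) / float(m["all"])


# Sort so the most over-represented values come first.
rows = sorted(
    (lift(line) for line in CORRELATIONS.splitlines()),
    key=lambda row: row[1],
    reverse=True,
)

for field, ratio in rows:
    print(f"{ratio:6.1f}x  {field}")
```

A ratio far above 1 (the CPU model, the crash address) marks a value that is much more common in this signature than in crashes overall; a ratio below 1 marks one that is less common.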
Comment 6•5 years ago
It looks like the 70.0b4 beta build is also affected.
Comment 7•5 years ago
This seems like it's probably a CPU issue: https://bugs.chromium.org/p/chromium/issues/detail?id=968683
Comment 8•5 years ago
In case an affected user ends up reading this bug report: according to the Chrome thread and our stability data, switching to a 32-bit version of the browser might fix this crash pattern. You can get the 32-bit installer from https://www.mozilla.org/en-US/firefox/all/
Comment 9•5 years ago
Chrome landed a speculative workaround for this, not sure if it can apply here too: https://chromium.googlesource.com/v8/v8.git/+/10360127e8bcc4a683ca2f49c0459d548299551b
Comment 10•5 years ago
(In reply to Emilio Cobos Álvarez (:emilio) from comment #9)
Chrome landed a speculative workaround for this, not sure if it can apply here too: https://chromium.googlesource.com/v8/v8.git/+/10360127e8bcc4a683ca2f49c0459d548299551b
Indeed, we're seeing the same failure mode in this signature. Looking at https://crash-stats.mozilla.org/report/index/52c7bc49-6030-4cca-a701-ac6980191007#tab-rawdump,
0:000> db xul+0x3f1639e-10 L20
0000000183f1638e cc cc 41 57 41 56 56 57-53 48 81 ec b0 00 00 00 ..AWAVVWSH......
0000000183f1639e 4d 89 c6 48 89 d6 48 89-cf 48 8b 05 8a fc 75 01 M..H..H..H....u.
Let's see what happens if the CPU makes the same "off by 16" mistake when crossing the 16-byte boundary:
0:000> eb . 4d 89 41 57; u . L1
ntdll!LdrpDoDebuggerBreak+0x30:
00007ff8`4e4511dc 4d894157 mov qword ptr [r9+57h],r8
In that report, r9 == 3, so r9 + 57h == 0x5a, which matches the crash address in the description.
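The byte stitching above can be replayed outside the debugger. The following is a rough sketch under stated assumptions: the dump bytes, the function address and the r9 value are copied from this comment, while `fetch`, `decode_mov` and the `glitch` flag are hypothetical helpers written for illustration. The decoder handles only the `REX.W 89 /r` encoding seen here, and the "read 16 bytes too early after the boundary" model is the hypothesis being tested, not a documented CPU behavior.

```python
# Bytes copied from the `db` output above; the dump starts 0x10 bytes
# before the function entry at xul+0x3f1639e.
DUMP_BASE = 0x183F1638E
DUMP = bytes.fromhex(
    "cccc415741565657534881ecb0000000"
    "4d89c64889d64889cf488b058afc7501"
)
FUNC = 0x183F1639E  # address of the first instruction of the function

REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
         "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def fetch(addr, length, glitch=False):
    """Read `length` instruction bytes starting at `addr`.

    With glitch=True, bytes at or past the next 16-byte boundary are
    read from 16 bytes too early (the hypothesised CPU mistake).
    """
    boundary = (addr | 0xF) + 1
    out = bytearray()
    for a in range(addr, addr + length):
        src = a - 16 if glitch and a >= boundary else a
        out.append(DUMP[src - DUMP_BASE])
    return bytes(out)


def decode_mov(code):
    """Decode only `REX.W 89 /r` (mov r/m64, r64); anything else is rejected."""
    rex, opcode, modrm = code[0], code[1], code[2]
    if rex & 0xF8 != 0x48 or opcode != 0x89:
        raise ValueError("unsupported encoding")
    mod = modrm >> 6
    reg = REG64[((modrm >> 3) & 7) | ((rex & 4) << 1)]  # REX.R extends reg
    rm = REG64[(modrm & 7) | ((rex & 1) << 3)]          # REX.B extends r/m
    if mod == 3:  # register-to-register form
        return f"mov {rm}, {reg}", None, 0
    if mod == 1 and (modrm & 7) != 4:  # [base + disp8], no SIB byte
        disp = int.from_bytes(code[3:4], "little", signed=True)
        return f"mov qword ptr [{rm}{disp:+#x}], {reg}", rm, disp
    raise ValueError("unsupported addressing mode")


good, _, _ = decode_mov(fetch(FUNC, 4))
bad, base, disp = decode_mov(fetch(FUNC, 4, glitch=True))

regs = {"r9": 3}  # r9 value taken from the crash report quoted above
fault_address = regs[base] + disp

print("intended:", good)
print("stitched:", bad)
print("write to:", hex(fault_address))
```

The intended instruction only copies one register to another, whereas the stitched one is a memory write through r9, which is consistent with the EXCEPTION_ACCESS_VIOLATION_WRITE reason on these reports.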
I went looking for crashes specific to this CPU and also found the same off-by-16 in style::properties::NonCustomPropertyId::allowed_in in 69.0.1.
Comment 11•5 years ago
Some maybe-relevant IRC discussion:
19:23 <dmajor> emilio: do you want to try landing chrome's cpu workarounds? I don't know how to word the attribute for the rust one.
20:02 <emilio> dmajor: sorry, was on a meeting. Hmm, not sure `#[repr(align)]` will work on functions...
20:05 <emilio> dmajor: nope, that doesn't seem to work... I'll poke a bit more
20:12 <emilio> dmajor: I guess that what the attribute does in clang is setting the `alignstack(N)`?
20:12 <emilio> dmajor: from http://llvm.org/docs/LangRef.html#function-attributes
20:13 <dmajor> emilio: I don't think it would be related to the stack
20:16 <emilio> dmajor: ah, true, it just emits the "align" attribute in the IR
20:16 <emilio> dmajor: (looking at https://godbolt.org/z/eU9sm8)
20:20 <emilio> dmajor: I don't see anything relevant in https://doc.rust-lang.org/reference/items/functions.html#attributes-on-functions
20:21 <emilio> dmajor: the closest I can see that we could use is https://doc.rust-lang.org/reference/abi.html#the-link_section-attribute, specifying a custom section that we know is well-aligned
Not sure how feasible / reasonable that would be...
Comment 12•5 years ago
Maybe it is easier to do this at the linker level for all functions? Otherwise it may become a game of whack-a-mole.
Comment 13•4 years ago
This CPU bug now affects 82.0.2.
Comment 14•4 years ago
I tried stitching the bytes together as in comment 10, and it doesn't seem to be the same off-by-16 failure mode this time (although I won't rule out the possibility that there are deeper layers of the hardware bug that we don't understand). I suspect that any active intervention we might try would be no more likely to succeed than just spinning a fresh build.
Comment 16•2 years ago
Closing because no crashes have been reported for 12 weeks.