Crash in [@ gfxContext::GetAzureDeviceSpaceClipBounds]
Categories
(Core :: Graphics, defect)
Tracking
()
People
(Reporter: diannaS, Assigned: tnikkel)
References
(Blocks 1 open bug, Regression)
Details
(6 keywords, Whiteboard: [fixed by 1842325][tbird crash][firefox crash][adv-main116+r][adv-ESR115.1+r][adv-ESR102.14+r])
Crash Data
Crash report: https://crash-stats.mozilla.org/report/index/ea9ecc55-426c-4dea-a04b-5449d0230306
Reason: EXCEPTION_ACCESS_VIOLATION_READ
Top 10 frames of crashing thread:
0 xul.dll gfxContext::GetAzureDeviceSpaceClipBounds const gfx/thebes/gfxContext.cpp:821
0 xul.dll gfxContext::GetClipExtents const gfx/thebes/gfxContext.cpp:510
1 xul.dll mozilla::layers::WebRenderLayerManager::MakeSnapshotIfRequired gfx/layers/wr/WebRenderLayerManager.cpp:530
2 xul.dll mozilla::layers::WebRenderLayerManager::EndTransactionWithoutLayer gfx/layers/wr/WebRenderLayerManager.cpp:485
3 xul.dll mozilla::nsDisplayList::PaintRoot layout/painting/nsDisplayList.cpp:2300
4 xul.dll nsLayoutUtils::PaintFrame layout/base/nsLayoutUtils.cpp:3413
5 xul.dll mozilla::PresShell::PaintInternal layout/base/PresShell.cpp:6430
6 xul.dll nsViewManager::ProcessPendingUpdatesPaint view/nsViewManager.cpp:433
7 xul.dll nsViewManager::ProcessPendingUpdatesForView view/nsViewManager.cpp:368
8 xul.dll nsViewManager::ProcessPendingUpdates view/nsViewManager.cpp:941
Comment 1•2 years ago
Timothy, any ideas what could cause this?
Comment 3•2 years ago
For this Thunderbird user bp-5a679e96-4786-4a5c-b8d2-799670230319, it was simply a startup crash.
Comment 4•2 years ago
About 10 out of 44 crashes I see look like they are on poison-ish values, like this one: bp-6dc56e90-80d2-49bb-955e-dab940230412
Another 10 of the crashes are specifically on the value 0x5441554156415791, and they don't seem to be all from the same install time either, so that's odd.
bp-9f4e58a8-e13d-411e-ab48-5f34a0230412
bp-546afb95-aea1-422a-8513-2594d0230412
Comment 5•2 years ago
Given the wildptrs and clearish UAFs, making a sec bug
Comment 6•2 years ago
The severity field for this bug is set to S3. However, the bug is flagged with the sec-high keyword.
:bhood, could you consider increasing the severity of this security bug?
For more information, please visit BugBot documentation.
Comment 7•2 years ago
From the correlations it looks like it mostly happens after we run into driver issues.
Also, it is oddly dominated by AMD GPUs and CPUs (although there are a few Intel CPUs and GPUs in the lot).
(98.48% in signature vs 02.99% overall) GFX_ERROR "Killing GPU process due to IPC reply timeout" = true [100.0% vs 13.54% if adapter_device_id = 0x15d8]
(98.48% in signature vs 03.04% overall) GFX_ERROR "timeout" = true [100.0% vs 13.54% if adapter_device_id = 0x15d8]
(98.48% in signature vs 14.27% overall) adapter_vendor_id = 0x1002
Comment 8•2 years ago
Got this crash today after I updated and restarted Nightly.
https://crash-stats.mozilla.org/report/index/b97cb6e3-65b4-4dfa-8d28-629270230617#tab-bugzilla
Comment 9•2 years ago
Bug 1837198 is definitely this bug. Summing up what I found out there: the gfxContext object pointed to by this has been overwritten with data that most likely belongs to another object. I haven't checked, but chances are that the object was freed and a similarly sized object was written on top of it.
Comment 10•2 years ago
[@ std::_Func_class<T>::_Tidy | std::_Func_class<T>::~_Func_class | mozilla::ManagedPostRefreshObserver::~ManagedPostRefreshObserver] has this in the stack for Firefox 114.0.2, e.g. bp-4c9c8f20-662a-42df-8994-aaa420230622.
Top 10 frames of crashing thread:
0 xul.dll std::_Func_class<mozilla::ManagedPostRefreshObserver::Unregister, bool>::_Tidy /builds/worker/fetches/vs/VC/Tools/MSVC/14.16.27023/include/functional:1391
0 xul.dll std::_Func_class<mozilla::ManagedPostRefreshObserver::Unregister, bool>::~_Func_class /builds/worker/fetches/vs/VC/Tools/MSVC/14.16.27023/include/functional:1271
0 xul.dll mozilla::ManagedPostRefreshObserver::~ManagedPostRefreshObserver layout/base/nsRefreshObservers.cpp:19
0 xul.dll mozilla::ManagedPostRefreshObserver::~ManagedPostRefreshObserver layout/base/nsRefreshObservers.cpp:19
1 xul.dll gfxContext::GetAzureDeviceSpaceClipBounds const gfx/thebes/gfxContext.cpp:579
1 xul.dll gfxContext::GetClipExtents const gfx/thebes/gfxContext.cpp:348
2 xul.dll mozilla::layers::WebRenderLayerManager::MakeSnapshotIfRequired gfx/layers/wr/WebRenderLayerManager.cpp:532
3 xul.dll mozilla::layers::WebRenderLayerManager::EndTransactionWithoutLayer gfx/layers/wr/WebRenderLayerManager.cpp:487
4 xul.dll mozilla::nsDisplayList::PaintRoot layout/painting/nsDisplayList.cpp:2342
5 xul.dll nsLayoutUtils::PaintFrame layout/base/nsLayoutUtils.cpp:3428
Comment 11•2 years ago
Tim, could you take a look?
Assignee
Comment 12•2 years ago
The crashes happen when accessing WebRenderLayerManager::mTarget. mTarget is usually null for the normal painting-to-screen code path. It's only non-null if we are being asked to render to some other surface (drawWindow, for example). mTarget gets set at the start of a transaction and cleared at the end of the transaction. In all the crashes I looked at we are on the normal painting path, so mTarget should be null for the entire transaction. So either (1) the mTarget pointer is getting overwritten with another pointer during the transaction or (2) mTarget is not getting properly cleared at the end of a previous transaction. (1) is basically impossible to find, as we have no info from the crashes about when that overwriting might be happening. However, I can definitely see how (2) could be happening. I filed bug 1842325 with a patch to make (2) impossible. We can land that and hopefully it makes these crashes stop happening.
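For readers following along, here is a minimal sketch of one way case (2) can be ruled out by construction: a scope guard that clears the pointer on every exit path of a transaction. This is an illustration only, not the patch from bug 1842325. LayerManagerSketch, AutoTarget, RunTransaction, HasTarget and the DrawTarget stand-in are all hypothetical names, not the real Gecko classes.

```cpp
#include <cassert>

// Hypothetical stand-in for a drawing surface; not the real Gecko type.
struct DrawTarget {};

class LayerManagerSketch {
 public:
  // RAII guard: installs the target for one transaction and clears it again
  // when the guard goes out of scope, including on early returns.
  class AutoTarget {
   public:
    AutoTarget(LayerManagerSketch& aManager, DrawTarget* aTarget)
        : mManager(aManager) {
      // A leftover target here would mean an earlier transaction leaked it.
      assert(!mManager.mTarget && "previous transaction left a target behind");
      mManager.mTarget = aTarget;
    }
    ~AutoTarget() { mManager.mTarget = nullptr; }
    AutoTarget(const AutoTarget&) = delete;
    AutoTarget& operator=(const AutoTarget&) = delete;

   private:
    LayerManagerSketch& mManager;
  };

  // Returns false when the transaction bails out early, true otherwise.
  // Either way the guard's destructor resets mTarget.
  bool RunTransaction(DrawTarget* aTarget, bool aBailOutEarly) {
    AutoTarget guard(*this, aTarget);
    if (aBailOutEarly) {
      return false;  // early exit: mTarget is still cleared by the guard
    }
    return true;
  }

  bool HasTarget() const { return mTarget != nullptr; }

 private:
  DrawTarget* mTarget = nullptr;  // non-owning, valid only during a transaction
};
```

With this shape, a bailout in the middle of a transaction cannot leave a stale pointer behind for the next normal paint to trip over, because no code path has to remember to reset it by hand.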
Comment 13•2 years ago
If we ignore the Thunderbird 102 crashes, this is essentially a regression in Firefox 112; there were single digit crashes before that, none of which looked memory-poisoned.
(In reply to Andrew McCreight [:mccr8] from comment #4)
> Another 10 of the crashes are specifically on the value 0x5441554156415791, and they don't seem to be all from the same install time either, so that's odd.
There are small clumps of "same specific address" that do look odd. Maybe data values that were interpreted as pointers? But frame-poisoning was supposed to prevent that from happening. I guess this isn't a simple use-after-free of the frame object but rather something stomping the gfxContext that's referencing them?
The value above could be interpreted as strange text: TAUAVAWA. The last "A" only if I take the liberty of assuming there's a +50 offset to a base address. Arbitrary to fit the pattern, but also there were a number of other crashing addresses that ended in "50" so maybe? Might also be part of an array of int16 counting up, interpreted as an address? 0x4154, 0x4155, 0x4156, 0x4157 (again, taking the liberty of assuming a +50 offset).
There are also 5 or 6 that crash with what looks like a frame-poisoning address: 0x7ffffffff0de7fff
Comment 14•2 years ago
Confirmed the "+50" guess: the crashing instruction for the ones I checked was mov rax, qword [rax + 0x50]
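The arithmetic behind that guess can be sketched as follows. The sketch assumes the reported crash address is the bogus pointer plus the 0x50 displacement from the faulting instruction, and that the machine is little-endian amd64. The helper names bogusPointerFromFaultAddress and bytesInMemoryOrder are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Displacement taken from the faulting load: mov rax, qword [rax + 0x50].
constexpr std::uint64_t kMemberOffset = 0x50;

// Undo the displacement to recover the value that was used as a pointer.
std::uint64_t bogusPointerFromFaultAddress(std::uint64_t faultAddress) {
  assert(faultAddress >= kMemberOffset);
  return faultAddress - kMemberOffset;
}

// Return the eight bytes of `value` in the order they sit in memory on a
// little-endian machine (least significant byte first).
std::string bytesInMemoryOrder(std::uint64_t value) {
  std::string out;
  for (int i = 0; i < 8; ++i) {
    out.push_back(static_cast<char>((value >> (8 * i)) & 0xFF));
  }
  return out;
}
```

For the address from comment 4, 0x5441554156415791 minus 0x50 is 0x5441554156415741. Read most significant byte first, that is the text TAUAVAWA; in memory order the same bytes read AWAVAUAT.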
Assignee | ||
Comment 15•2 years ago
(In reply to Daniel Veditz [:dveditz] from comment #13)
> If we ignore the Thunderbird 102 crashes, this is essentially a regression in Firefox 112; there were single digit crashes before that, none of which looked memory-poisoned.
I took a quick look at the change log of files involved here around that time and https://hg.mozilla.org/integration/autoland/rev/6525cdd895fc sticks out. That changeset made the mTarget pointer a raw ptr, whereas it had been a refptr before.
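To illustrate why that kind of change matters, here is a small sketch of the difference between an owning and a non-owning member. It uses std::shared_ptr as a stand-in for Gecko's RefPtr, which is an assumption made to keep the example self-contained. Target, OwningHolder and RawHolder are hypothetical types, not code from the changeset.

```cpp
#include <memory>

// Hypothetical stand-in for the object a layer manager paints into.
struct Target {};

struct OwningHolder {
  std::shared_ptr<Target> mTarget;  // holds a reference, keeps Target alive
};

struct RawHolder {
  Target* mTarget;  // non-owning: says nothing about Target's lifetime
};

// The caller drops its reference while an owning holder still has one.
// Returns whether the Target is still alive afterwards.
bool targetSurvivesWithOwningHolder() {
  auto target = std::make_shared<Target>();
  std::weak_ptr<Target> watcher = target;
  OwningHolder holder{target};
  target.reset();
  return !watcher.expired();
}

// Same sequence with a raw pointer. After reset() the Target is destroyed,
// so holder.mTarget dangles and must not be dereferenced.
bool targetSurvivesWithRawHolder() {
  auto target = std::make_shared<Target>();
  std::weak_ptr<Target> watcher = target;
  RawHolder holder{target.get()};
  target.reset();
  (void)holder;  // intentionally never dereferenced
  return !watcher.expired();
}
```

With the owning member, forgetting to clear the pointer only delays destruction. With the raw member, the same mistake leaves a pointer to freed memory, which would be consistent with the poisoned and reused values seen in the crash reports.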
Comment 16•2 years ago
Set release status flags based on info from the regressing bug 1815404
Comment 17•2 years ago
We'll check back next week to see if bug 1842325 landing stopped the crashes.
Reporter
Comment 18•2 years ago
looks like no crashes since b5 (when bug 1842325 landed)
Comment 19•2 years ago
Fixed by bug 1842325!
Comment 20•1 years ago
(In reply to Daniel Veditz [:dveditz] from comment #13)
> (In reply to Andrew McCreight [:mccr8] from comment #4)
> > Another 10 of the crashes are specifically on the value 0x5441554156415791, and they don't seem to be all from the same install time either, so that's odd.
> The value above could be interpreted as strange text: TAUAVAWA.
Probably doesn't matter at this point, but that looks like an amd64 function prologue: A is a REX prefix, P – W are register push instructions. AWAVAUAT pushes r15, r14, r13, r12 (and it shows up a lot in strings/hexdump output).
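A small sketch of that decoding, for anyone who wants to check it by hand. It handles only the one pattern discussed here: a 0x41 prefix byte (REX with the B bit set, ASCII "A") followed by a one-byte push opcode in the range 0x50 to 0x57 (ASCII "P" to "W"). The function name decodeRexPushes is made up, and this is not a general disassembler.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Decode a byte sequence consisting solely of "0x41, 0x50..0x57" pairs.
// With the REX.B prefix, opcode 0x50 + n pushes register r(8 + n).
// Returns the pushed register names in order, or an empty vector if the
// input contains anything other than such pairs.
std::vector<std::string> decodeRexPushes(
    const std::vector<std::uint8_t>& bytes) {
  std::vector<std::string> regs;
  if (bytes.size() % 2 != 0) {
    return {};
  }
  for (std::size_t i = 0; i < bytes.size(); i += 2) {
    const std::uint8_t prefix = bytes[i];
    const std::uint8_t opcode = bytes[i + 1];
    if (prefix != 0x41 || opcode < 0x50 || opcode > 0x57) {
      return {};
    }
    regs.push_back("r" + std::to_string(8 + (opcode - 0x50)));
  }
  return regs;
}
```

Feeding it the bytes of AWAVAUAT (0x41 0x57 0x41 0x56 0x41 0x55 0x41 0x54) gives r15, r14, r13, r12, matching the reading above. That fits the idea that a pointer-sized slot was overwritten with, or pointed into, machine code rather than heap data.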