Crash in [@ mozilla::fontlist::FontList::ShmBlock::Memory] on poison values from OffscreenCanvas
Categories
(Core :: Graphics: Canvas2D, defect, P1)
Tracking
()
People
(Reporter: cpeterson, Assigned: jnicol)
References
(Blocks 1 open bug)
Details
(5 keywords, Whiteboard: [adv-main118+r][adv-esr115.3+r])
Crash Data
Attachments
(1 file)
|
48 bytes,
text/x-phabricator-request
|
pascalc
:
approval-mozilla-beta+
ryanvm
:
approval-mozilla-esr115+
tjr
:
sec-approval+
|
Details | Review |
@ Lee, this UAF OffscreenCanvas crash seems to have spiked in 116.
+++ This bug was initially created as a clone of Bug #1815890 +++
Crash report: https://crash-stats.mozilla.org/report/index/5e81a1b8-d8de-4975-9cc7-057c00230209
Reason: SIGSEGV / SEGV_MAPERR
Top 10 frames of crashing thread:
0 libxul.so mozilla::UniquePtr<base::SharedMemory, mozilla::DefaultDelete<base::SharedMemory> >::get const mfbt/UniquePtr.h:286
0 libxul.so mozilla::UniquePtr<base::SharedMemory, mozilla::DefaultDelete<base::SharedMemory> >::operator-> const mfbt/UniquePtr.h:281
0 libxul.so mozilla::fontlist::FontList::ShmBlock::Memory const gfx/thebes/SharedFontList-impl.h:311
0 libxul.so mozilla::fontlist::Pointer::ToPtr const gfx/thebes/SharedFontList.cpp:78
1 libxul.so mozilla::fontlist::Family::FindAllFacesForStyleInternal const gfx/thebes/SharedFontList.cpp:347
2 libxul.so mozilla::fontlist::Family::FindAllFacesForStyle const gfx/thebes/SharedFontList.cpp:392
2 libxul.so mozilla::fontlist::Family::FindFaceForStyle const gfx/thebes/SharedFontList.cpp:442
3 libxul.so gfxFontGroup::GetDefaultFont gfx/thebes/gfxTextRun.cpp:2174
4 libxul.so gfxFontGroup::GetFirstValidFont gfx/thebes/gfxTextRun.cpp:2370
5 libxul.so mozilla::dom::CanvasRenderingContext2D::DrawOrMeasureText dom/canvas/CanvasRenderingContext2D.cpp:4083
75 crashes with this signature on poison values in the last week. 96% Android. All of them had OffscreenCanvasRenderingContext2D_Binding, and the 10 I looked at were all on DOM worker threads. Some crashes are on 111.
Updated•2 years ago
|
Comment 1•2 years ago
|
||
The bug is linked to a topcrash signature, which matches the following criterion:
- Top 10 AArch64 and ARM crashes on release
For more information, please visit BugBot documentation.
Updated•2 years ago
|
Updated•2 years ago
|
Comment 2•2 years ago
|
||
Seems like the return of bug 1825417?
Updated•2 years ago
|
Updated•2 years ago
|
Comment 3•2 years ago
|
||
Curious that it seems to really spike about a week after the release of 116, not immediately. Is it typical that lots of users upgrade a week after initial release, or is there some other factor contributing to the big increase around Aug 9?
Comment 4•2 years ago
|
||
Jamie, could you have a look here and see if we can get a cause?
| Assignee | ||
Comment 5•2 years ago
|
||
(In reply to Jonathan Kew [:jfkthame] from comment #3)
Curious that it seems to really spike about a week after the release of 116, not immediately. Is it typical that lots of users upgrade a week after initial release, or is there some other factor contributing to the big increase around Aug 9?
That is a bit curious. I asked on slack and it looks like 116.0 comfortably rolled out by then, and on the decline as 116.2 rolled out.
There's not all that much to go on here. It all seems to be coming from offscreen canvas. Can you think of anything that changed in 116 Andrew?
| Assignee | ||
Comment 6•2 years ago
|
||
If I filter the crash stats to only show beta or nightly crashes, then it also appears to uptick around the same time (start of august) . The first beta build with an increase in crashes appears to be 116.0b8.
The numbers on nightly and beta are low so hard to say definitively, but that could indicate a late uplift to 116 beta. Looking at the pushlog, bug 1833876 jumps out. I don't have permission to view that bug, but it appears to be an offscreen canvas change. Could that have caused this, Andrew?
Comment 7•2 years ago
|
||
I'm not sure how it might affect severity or prioritization, but I'll note that on supported Fenix versions (117 and later) that this signature constitutes around 74% of all crashes on jemalloc poison values (eg where the crash address contains e5e5) in the last week.
| Assignee | ||
Comment 8•2 years ago
•
|
||
Okay, I don't see how bug 1833876 could be relevant.
Looking through the code. We're crashing trying to read the raw pointer address owned by the UniquePtr mShmem here. Not dereferencing that address, but reading the value from the UniquePtr. Which I believe indicates it's the ShmBlock itself that has been poisoned.
The proto signatures mostly show either FontList::GetHeader() or Pointer::ToPtr() immediately above in the stack. Each of these call ShmBlock::Memory() by accessing the ShmBlock through the FontList::mBlocks nsTArray. If this nsTArray was reallocated from underneath the DOM worker thread I believe we could therefore crash in this manner.
In the content process we append to the array in FontList::ShmBlockAdded() (called from gfxPlatformFontList::ShmBlockAdded()). And clear the list in FontList::DetachShmBlocks(). The latter only gets called in the FontList constructor and destructor, so I don't believe could be an issue. But the former seems like it might be.
Jonathan, am I missing anything guarding FontList::mBlocks anywhere? Should we perhaps be locking mLock here ?
Comment 9•2 years ago
|
||
Aha - you may be onto something here. The ShmBlockAdded scenario makes sense as a case where main-thread activity could result in a worker thread crash like this. So I think you're right, we should hold the lock in gfxPlatformFontList::ShmBlockAdded to avoid this risk.
It looks like FontList::GetHeader() should also take the lock, if we're not on the main thread (like Pointer::ToPtr already does).
Comment 10•2 years ago
|
||
Comment 11•2 years ago
|
||
Not having known steps to reproduce this, the patch must be a bit speculative until we see what crash-stats show, but I think it looks very plausible.
Comment 12•2 years ago
|
||
Comment on attachment 9352055 [details]
Bug 1848890 - Guard access to FontList::mBlocks with the font-list lock. r=jnicol
Security Approval Request
- How easily could an exploit be constructed based on the patch?: Not easily - the patch adds locking, which suggests a possible race condition in the font list, but it's unclear how to trigger it with any consistency
- Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: No
- Which older supported branches are affected by this flaw?: all
- If not all supported branches, which bug introduced the flaw?: None
- Do you have backports for the affected branches?: No
- If not, how different, hard to create, and risky will they be?: Straightforward to transplant
- How likely is this patch to cause regressions; how much testing does it need?: Low risk - no change in behavior, just added locking for safety
- Is Android affected?: Yes
Updated•2 years ago
|
Comment 13•2 years ago
|
||
The crash is only showing up on Android, but the code being fixed doesn't look Android-specific, so I'll mark ESR-115 as affected.
Comment 14•2 years ago
|
||
Yes, esr115 is definitely affected (there are actually a handful of crash reports from there, too, though nowhere near the Android numbers).
Updated•2 years ago
|
Comment 15•2 years ago
|
||
Comment on attachment 9352055 [details]
Bug 1848890 - Guard access to FontList::mBlocks with the font-list lock. r=jnicol
Approved to land and request uplift
Comment 16•2 years ago
|
||
Comment 17•2 years ago
|
||
Comment 18•2 years ago
|
||
Comment on attachment 9352055 [details]
Bug 1848890 - Guard access to FontList::mBlocks with the font-list lock. r=jnicol
Beta/Release Uplift Approval Request
- User impact if declined: Possible crashes (UAF of poisoned value) during offscreen-canvas font usage
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: No
- Needs manual test from QE?: No
- If yes, steps to reproduce: (No STR, just crash reports in the wild)
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Just adding lock around an array access
- String changes made/needed:
- Is Android affected?: Yes
Comment 19•2 years ago
|
||
Comment on attachment 9352055 [details]
Bug 1848890 - Guard access to FontList::mBlocks with the font-list lock. r=jnicol
Approved for 118.0b7, thanks.
Comment 20•2 years ago
|
||
| uplift | ||
Updated•2 years ago
|
Comment 21•2 years ago
|
||
Comment on attachment 9352055 [details]
Bug 1848890 - Guard access to FontList::mBlocks with the font-list lock. r=jnicol
Approved for 115.3esr
Comment 22•2 years ago
|
||
| uplift | ||
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Description
•