Crash in [@ nsPresContext::GetDocShell]
Categories
(Core :: Layout, defect)
Tracking
()
| Tracking | Status | |
|---|---|---|
| firefox-esr115 | --- | unaffected |
| firefox126 | --- | unaffected |
| firefox127 | --- | unaffected |
| firefox128 | + | fixed |
People
(Reporter: aryx, Assigned: emilio)
References
(Regression)
Details
(4 keywords)
Crash Data
Attachments
(1 file)
18 crashes from 10 installs of Firefox 128.0a1 on multiple operating systems, and that's with updates turned off for Windows for two days. First reported build ID is 20240514215634, no reports for 127.0a1. From bug 1896586 or bug 1895870?
Crash report: https://crash-stats.mozilla.org/report/index/7b003d3d-af1c-457c-8d7a-2e7330240516
Reason: EXCEPTION_ACCESS_VIOLATION_READ
Top 10 frames:
0 xul.dll RefPtr<mozilla::dom::Document>::get() const mfbt/RefPtr.h:314
0 xul.dll RefPtr<mozilla::dom::Document>::operator->() const mfbt/RefPtr.h:344
0 xul.dll nsPresContext::GetDocShell() const layout/base/nsPresContext.cpp:1652
0 xul.dll mozilla::PresShell::DoFlushPendingNotifications(mozilla::ChangesToFlush) layout/base/PresShell.cpp:4283
1 xul.dll mozilla::PresShell::FlushPendingNotifications(mozilla::ChangesToFlush) layout/base/PresShell.h:1464
1 xul.dll nsRefreshDriver::FlushLayoutOnPendingDocsAndFixUpFocus() layout/base/nsRefreshDriver.cpp:2240
1 xul.dll nsRefreshDriver::Tick(mozilla::layers::BaseTransactionId<mozilla::VsyncIdType... layout/base/nsRefreshDriver.cpp:2732
2 xul.dll mozilla::RefreshDriverTimer::TickDriver(nsRefreshDriver*, mozilla::layers::Ba... layout/base/nsRefreshDriver.cpp:367
2 xul.dll mozilla::RefreshDriverTimer::TickRefreshDrivers(mozilla::layers::BaseTransact... layout/base/nsRefreshDriver.cpp:345
3 xul.dll mozilla::RefreshDriverTimer::Tick(mozilla::layers::BaseTransactionId<mozilla:... layout/base/nsRefreshDriver.cpp:361
Comment 1•1 year ago
|
||
This majority of these crashes are on the jemalloc poison value, so I'm going to hide this.
For example: bp-6fd31c88-be1c-41e7-91db-16e0f0240516
Comment 2•1 year ago
|
||
Bug 1896024 is another UAF with refresh driver stuff on the stack, though that's supposed to be Android-only. It has a patch that just got merged to m-c today.
Comment 3•1 year ago
|
||
This is likely a regression from bug 1896593. FlushLayoutOnPendingDocsAndFixUpFocus is in the backtrace, and that function was added in bug 1896593 part 1.
Updated•1 year ago
|
Comment 4•1 year ago
•
|
||
Looking in Visual Studio: when we reach PresShell::DoFlushPendingNotifications, we have "this" pointing to a PresShell object that's filled with the poison value:
this = 0x000001b4d8a3c000 mozilla::PresShell *
[...]
+ mUpdateApproximateFrameVisibilityEvent {mEvent={mRawPtr=0xe5e5e5e5e5e5e5e5 {...} } } nsRevocableEventPtr<nsRunnableMethod<mozilla::PresShell,void,1,0>>
[...]
+ mSelection {mRawPtr=0xe5e5e5e5e5e5e5e5 {mRefCnt={mRefCntAndFlags=??? } mDomSelections=0xe5e5e5e5e5e5e5ed {{...}, ...} ...} } RefPtr<nsFrameSelection>
If I move up the stack, the next two frames are marked as [Inline Frame], notably PresShell::FlushPendingNotifications and nsRefreshDriver::FlushLayoutOnPendingDocsAndFixUpFocus -- so really, moving up puts me in the next non-inline stack frame which is nsRefreshDriver::Tick.
There, I have a valid-looking this pointer and a valid-looking presShell (at a different address from the poisoned one discussed above):
this = 0x000001b4d5acaf00 nsRefreshDriver *
pressShell.mRawPtr = 0x000001b4d5c12000 mozilla::PresShell *
Comment 5•1 year ago
•
|
||
So the presShell local-var in nsRefreshDriver::Tick is still "good", but one of the ones we iterate over in nsRefreshDriver::FlushLayoutOnPendingDocsAndFixUpFocus ends up "bad", even though we add a strong stack reference to it.
FlushLayoutOnPendingDocsAndFixUpFocus does traverse an array of PresShells in a loop, and it only adds a strong reference for the one we're currently iterating over. I wonder if one iteration of the loop somehow blows away the remaining PresShell objects, and then we're hosed when we get to the next loop iteration?
If that's what's happening, then an extremely-simple paper-over might be to change the array to hold RefPtr instances here:
https://searchfox.org/mozilla-central/rev/a18a7c526cf3c531f2fc24db4f0dffbc16290a7e/layout/base/nsRefreshDriver.cpp#2223-2224
void nsRefreshDriver::FlushLayoutOnPendingDocsAndFixUpFocus() {
AutoTArray<PresShell*, 16> observers;
so i.e. switch to...
AutoTArray<RefPtr<PresShell>, 16>
...or something along those lines. I'm not sure if that might introduce other problems, though. But I think we do need to somehow account for the possibility that one loop iteration might blow away the remaining still-to-be-iterated PresShells...
Updated•1 year ago
|
| Assignee | ||
Comment 6•1 year ago
|
||
I think what's supposed to make this safe is the RemoveElement check below that tries to guarantee that we're not removed from the observer list already.
But yeah that's a bit sketchy anyways and is only marginally more efficient.
| Assignee | ||
Comment 7•1 year ago
|
||
Much like we deal with resize events.
Updated•1 year ago
|
| Assignee | ||
Updated•1 year ago
|
Updated•1 year ago
|
Comment 10•1 year ago
|
||
| Reporter | ||
Comment 11•1 year ago
|
||
Comment 12•1 year ago
|
||
I found a similar-looking crash on Fenix Nightly, again all on poison values. The signature is [@ mozilla::dom::Document::GetWindow ]. Here's an example crash: bp-777b647b-f6a4-413e-a70e-135430240517. Hopefully this will get fixed by the same patch. We'll have to keep an eye out for that.
Comment 13•1 year ago
|
||
Another possible variant: [@ mozilla::WeakPtr<T>::get ] bp-cb149673-675a-42eb-8646-612b00240517 There's only one instance of that crash so I won't add the signature.
Updated•1 year ago
|
Comment 15•1 year ago
|
||
There is a test case in bug 1897571, if you'd like a crash test.
Comment 17•1 year ago
|
||
Copying crash signatures from duplicate bugs.
Updated•1 year ago
|
Comment 19•1 year ago
|
||
Copying crash signatures from duplicate bugs.
Comment 20•1 year ago
|
||
As of right now, there are zero crashes in the myriad signatures here after the patch landed, according to the little chart in Bugzilla.
Comment 22•1 year ago
|
||
Copying crash signatures from duplicate bugs.
Updated•1 year ago
|
Comment 24•1 year ago
|
||
More signatures.
Comment 26•1 year ago
|
||
Copying crash signatures from duplicate bugs.
Updated•1 year ago
|
Comment 27•1 year ago
|
||
The patch landed in nightly and beta is affected.
:emilio, is this bug important enough to require an uplift?
- If yes, please nominate the patch for beta approval.
- If no, please set
status-firefox127towontfix.
For more information, please visit BugBot documentation.
Comment 28•1 year ago
|
||
I don't the existence of this signature on other branches is relevant. This specific spike had a clear regressor that was Nightly only.
Comment 30•1 year ago
|
||
Should this have had sec-review and approval flags set & approval+ before landing?
Comment 31•1 year ago
|
||
Oh, looks like that is not needed for a nightly-only fix (from https://firefox-source-docs.mozilla.org/bug-mgmt/processes/security-approval.html#process-for-security-bugs-developer-perspective). Never mind!
Comment 32•1 year ago
|
||
Emilio, can you confirm that 127 and earlier are unaffected, and have we identified the regressor?
Updated•1 year ago
|
Updated•8 months ago
|
Description
•