Closed Bug 1407740 Opened 2 years ago Closed 2 years ago

Crash with failed "@mozilla.org/docshell;1" instance

Categories

(Core :: DOM: Navigation, defect, P1)

defect

Tracking

()

VERIFIED FIXED
mozilla58
Tracking Status
thunderbird_esr52 --- fixed
firefox-esr52 57+ verified
firefox56 --- wontfix
firefox57 + verified
firefox58 + verified

People

(Reporter: Oriol, Assigned: mccr8)

Details

(Keywords: crash, csectype-uaf, sec-high, Whiteboard: [adv-main57+][adv-esr52.5+])

Crash Data

Attachments

(2 files, 1 obsolete file)

Andrew, can you investigate?
Flags: needinfo?(continuation)
Priority: -- → P3
Component: XPCOM → XPConnect
Sure, I can check it out, though I'm not sure how much I know about this stuff. It looks like it is crashing on this line of the XPCWrappedNative traverse method:
    cb.NoteXPCOMChild(tmp->GetIdentityObject());
Flags: needinfo?(continuation)
When I run this locally, I get an error in nsJSCID::CreateInstance(), as you'd expect.

Then later I get a crash like this:

#0  0x00007fffdfdb3953 in mozilla::RefPtrTraits<nsISupports>::Release (aPtr=0x7fffacf87808) at /media/ssd/mc/obj-nopt-dbg.noindex/dist/include/mozilla/RefPtr.h:41
#1  0x00007fffdfdb3935 in RefPtr<nsISupports>::ConstRemovingRefPtrTraits<nsISupports>::Release (aPtr=0x7fffacf87808)
    at /media/ssd/mc/obj-nopt-dbg.noindex/dist/include/mozilla/RefPtr.h:398
#2  0x00007fffe109b156 in RefPtr<nsISupports>::assign_assuming_AddRef (this=0x7fffd1966428, aNewPtr=0x0) at /media/ssd/mc/obj-nopt-dbg.noindex/dist/include/mozilla/RefPtr.h:66
#3  0x00007fffe109b119 in RefPtr<nsISupports>::assign_with_AddRef (this=0x7fffd1966428, aRawPtr=0x0) at /media/ssd/mc/obj-nopt-dbg.noindex/dist/include/mozilla/RefPtr.h:57
#4  0x00007fffe109b0cf in RefPtr<nsISupports>::operator= (this=0x7fffd1966428, aRhs=0x0) at /media/ssd/mc/obj-nopt-dbg.noindex/dist/include/mozilla/RefPtr.h:193
#5  0x00007fffe109398c in XPCWrappedNativeTearOff::SetNative (this=0x7fffd1966420, Native=0x0) at /media/ssd/mc/js/xpconnect/src/xpcprivate.h:1502
#6  0x00007fffe1096ec9 in XPCWrappedNative::SweepTearOffs (this=0x7fffd19663e0) at /media/ssd/mc/js/xpconnect/src/XPCInlines.h:500
#7  0x00007fffe10829ea in XPCWrappedNativeScope::SweepAllWrappedNativeTearOffs () at /media/ssd/mc/js/xpconnect/src/XPCWrappedNativeScope.cpp:633
#8  0x00007fffe1016482 in XPCJSRuntime::FinalizeCallback (fop=0x7fffffffaed8, status=JSFINALIZE_COLLECTION_END, data=0x0) at /media/ssd/mc/js/xpconnect/src/XPCJSRuntime.cpp:868

Maybe one of the tearoffs is junk? The crash seems to be from calling release on an nsISupports held by a tearoff.
Group: dom-core-security
This is some kind of use after free. The test case involves chrome code so I'm not sure exactly how bad it is.
Assignee: nobody → continuation
Keywords: csectype-uaf
Attached file ASan log
Based on the log, this feels more like a bug in nsDocShell::Init().
Component: XPConnect → Document Navigation
(In reply to Andrew McCreight [:mccr8] from comment #4)
> This is some kind of use after free. The test case involves chrome code so
> I'm not sure exactly how bad it is.

We made bug 1407751, which is vaguely similar, a sec-high, even though it was chrome code.  Who knows, we might shoot ourselves in the foot from chrome code...
Keywords: sec-high
That's probably a little conservative given that both these bugs seem to crash immediately on the next GC/CC, but sure, better safe than sorry. :)
I'm going to assume whatever is going wrong here is ancient.
Something like this is happening:
1. The doc shell is created in nsDocShellConstructor.
2. QI is called on the docshell (to nsIXHR, presumably), which fails.
3. Because of the failure, nsDocShellConstructor decides to return, which calls release on the docshell.
4. There are no other references to the doc shell, so we begin to call ~nsDocShell, which calls nsDocShell::Destroy().
4. Destroy() effectively does NotifyObservers(this, NS_WEBNAVIGATION_DESTROY, nullptr);
5. Some devtools stuff written in JS listens for NS_WEBNAVIGATION_DESTROY (maybe this is only happening because the browser console is open? I'm not sure), so we need to create an XPCWN _for the doc shell that we are in the middle of destroying_.
6. We create a "strong reference" to the doc shell in the XPCWN, then do whatever to finish out NotifyObservers.
7. Finally, we finish ~nsDocShell, and so we free the DocShell.
8. At some later point, we try to do something with the XPCWN's native pointer, but this is the object we destroyed in step 7, so we crash.
The core of the problem is:
- The dtor for nsDocShell passes |this| to NotifyObservers().
- Observers implemented in JS cause an XPCWN to be created with a "strong reference" to |this|.
- When the dtor exits, we destroy the nsDocShell, but the XPCWN has a "strong reference" to the doc shell.

I'm not sure how to fix this, besides removing the NotifyObservers() call, which would require fixing up devtools to not rely on it.
Maybe we should not NotifyObservers if we're being called from the dtor?  Where else does Destroy() get called from?  This must work in the normal case, because otherwise we'd be dealing with docshell UAFs all over, no?

Alternatively, maybe have a custom nsDocShellConstructor that calls Destroy() while we still have a strong ref, and don't call Destroy() from the dtor?  I'm sure that would wreak all kinds of havoc, though...
I'd also note that in this particular case we're sending the DESTROY message without having sent the CREATE message. If we had called the CREATE message in the ctor, say, then we'd have an XPCWN already for it and it would be okay because we wouldn't call the dtor when we did the release on docshell constructor, though that's a bit fragile. So just checking if we've called CREATE before sending DESTROY would not be a very robust solution, I think.

(In reply to Nathan Froyd [:froydnj] from comment #12)
> Maybe we should not NotifyObservers if we're being called from the dtor? 

Glancing at the code, that looks reasonable to me. Just setting mIsBeingDestroyed to true before we call Destroy() in the dtor would do the trick. There's a bunch of other places we call Destroy() so maybe this is only happening in the weird case where we don't really use the doc shell.
I looked over the other places where we call NotifyObservers, and nothing else seems to be obviously called from a dtor. The closest is TabParent::ActorDestroy(), but I don't think that's actually called from the dtor so it should be okay.
MozReview-Commit-ID: LtTzTycuJxR
Attachment #8919494 - Attachment is obsolete: true
Attachment #8919496 - Flags: review?(bzbarsky)
Comment on attachment 8919496 [details] [diff] [review]
Fix a crash by setting a flag to true.

r=me
Attachment #8919496 - Flags: review?(bzbarsky) → review+
Comment on attachment 8919496 [details] [diff] [review]
Fix a crash by setting a flag to true.

[Security approval request comment]
How easily could an exploit be constructed based on the patch? not easily

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? Not really.

Which older supported branches are affected by this flaw? All

If not all supported branches, which bug introduced the flaw?

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? It should be trivially backportable.

How likely is this patch to cause regressions; how much testing does it need? It seems fairly safe to me.
Attachment #8919496 - Flags: sec-approval?
Priority: P3 → P1
Sec-approval+ for trunk. I don't know if we can backport this to beta at this point. Ritu?

The patch is rather tiny so I'd like to take it if possible.
Flags: needinfo?(rkothari)
Attachment #8919496 - Flags: sec-approval? → sec-approval+
Flags: needinfo?(lhenry)
Yes, it should be fine to uplift to beta (and esr)
Flags: needinfo?(lhenry)
Flags: needinfo?(rkothari)
Andrew can you request uplift to beta and esr? Thanks!
Flags: needinfo?(continuation)
Comment on attachment 8919496 [details] [diff] [review]
Fix a crash by setting a flag to true.

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration: sec-high
User impact if declined: sec-high
Fix Landed on Version: 58
Risk to taking this patch (and alternatives if risky): low, it shouldn't do much of anything in the common case
String or UUID changes made by this patch: none
Flags: needinfo?(continuation)
Attachment #8919496 - Flags: approval-mozilla-esr52?
Attachment #8919496 - Flags: approval-mozilla-beta?
Comment on attachment 8919496 [details] [diff] [review]
Fix a crash by setting a flag to true.

Crash fix, taking this for beta 11 and ESR.
Attachment #8919496 - Flags: approval-mozilla-esr52?
Attachment #8919496 - Flags: approval-mozilla-esr52+
Attachment #8919496 - Flags: approval-mozilla-beta?
Attachment #8919496 - Flags: approval-mozilla-beta+
https://hg.mozilla.org/mozilla-central/rev/23d3b458f69c
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Group: dom-core-security → core-security-release
Whiteboard: [adv-main57+][adv-esr52.5+]
I managed to reproduce the initial issue on 52.4.1esr build1 (20171005074949) and 56.0.2 (20171024165158). I also can confirm that 58.0a1 (2017-11-08), 57.0 build1 (20171106194249) and 52.5.0esr build2 (20171107091003) are verified fixed, using Windows 10 x64, Ubuntu 16.04 x64 and macOS 10.13.
Status: RESOLVED → VERIFIED
Flags: qe-verify+
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.