Closed Bug 895294 Opened 7 years ago Closed 7 years ago

Crash [@ mozilla::dom::MaybeWrapObjectValue] with document.all after navigating away and GCing

Categories

(Core :: DOM: Core & HTML, defect, critical)

x86_64
macOS
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla26
Tracking Status
firefox23 --- wontfix
firefox24 + verified
firefox25 + verified
firefox26 + verified
firefox-esr17 --- unaffected
b2g18 --- unaffected

People

(Reporter: jruderman, Assigned: mccr8)

References

(Blocks 2 open bugs)

Details

(4 keywords, Whiteboard: [adv-main24+])

Attachments

(5 files, 1 obsolete file)

1. Set:
     user_pref("javascript.options.baselinejit.unsafe_eager_compilation", true);
2. Install https://www.squarefree.com/extensions/domFuzzLite3.xpi
3. Load the testcase

Result: Crash [@ mozilla::dom::MaybeWrapObjectValue]

Might be debug-only.
Attached file testcase
Attached file stack
Looks like just a null crash, but I'll mark it sec-high for now.
Keywords: sec-high
Er, this is bad.  What we have is an ImageDocument for which "p/x *self->mAll.ptr" shows:

    shape_ = {
      <js::EncapsulatedPtr<js::Shape, unsigned long>> = {
        {
          value = 0xdadadadadadadada, 
          other = 0xdadadadadadadada
        }
      }, <No data fields>}, 

etc.

As in, the frameDoc.all got GCed!
We seem to never traverse the ImageDocument in question during the CC the page triggers....
Assignee: nobody → continuation
I can't reproduce on a 138350:18467a85acf6 build from 7/13, I'll try updating.
Did you set the preference from comment 0?
Assuming we had javascript.options.baselinejit.unsafe_eager_compilation at that time.
(In reply to Boris Zbarsky (:bz) from comment #7)
> Did you set the preference from comment 0?

I did, assuming that going into about:config and adding it is enough.
I can't reproduce this, even with an updated build on a fresh profile updated with the addon and the pref, and restarting.  I'm not sure why that is.
I tried a Linux ASAN non-debug build, and I couldn't reproduce it there, either.  Though I meant to do a debug build, I'll retry once I rebuild...
Okay, it does reproduce on a Linux ASAN debug build.
Attached file stack for destroy
What is happening is that we're first triggering nsDocument::Destroy.  I'm not sure why that is, but I've attached the stack.  It involves nsDocShell::EndPageLoad, PresShell::UnsuppressAndInvalidate, nsDocumentViewer::Show and some other stuff.

nsDocument::Destroy manually calls ReleaseWrapper for some reason, which seems bad:
7788   // XXX We really should let cycle collection do this, but that currently still
7789   //     leaks (see https://bugzilla.mozilla.org/show_bug.cgi?id=406684).
7790   ReleaseWrapper(static_cast<nsINode*>(this));

This causes us to drop the document from the JS holders table, which doesn't trigger the assertion because we haven't set mAll to anything.  The ImageDocument now has PreservingWrapper() equal to false.

Then DOMProxyHandler::EnsureExpandoObject sets PreservingWrapper() to true on the ImageDocument, but doesn't add it as a JSHolder again.

Then we hit the code that actually sets mAll for the first time, but because the document already has preserving set to true, we don't hold it.

From there we are doomed.  We GC, don't trace the image document because it isn't held, get the saved value of mAll, which is now garbage.

I see at least two problems here:

1) The ReleaseWrapper() in nsDocument::Destroy() is bad.  It might be okay if we're only supposed to call nsDocument::Destroy() on dead documents, in which case whatever the trigger is for the Destroy() is bad.

2) EnsureExpandoObject is setting the preserved bit to true without ensuring the object is held. I'm not sure exactly what this code does, but that seems bad to me.  It would be nice if we had an assertion in SetPreservingWrapper(true) that checked to make sure the object was held.

I'll look more into this tomorrow, focusing on the second issue first, as I'm more likely to understand that code.
Olli, are we supposed to Destroy documents that are still alive?  I'm guessing no...
Flags: needinfo?(bugs)
Destroy() is called when you navigate away from a document, basically.  So it can most definitely happen on still-alive documents.
Thanks.  Hopefully that comment about needing release wrapper to avoid leaks is no longer accurate.  I could imagine this could be triggered with the wrapper instead of the mAll pointer.  I'll try experimenting with that.
Flags: needinfo?(bugs)
I suspect the reason that eager_compilation is required is that without optimizations the result of the first call to |frameDoc.all| is being stuck on the stack somewhere, so the conservative stack scanner keeps the "all" object alive.  I'd guess that with eager_compilation, the compiler is smart enough to realize that the result of frameDoc.all isn't use, so it throws out the intermediate result, and we don't keep the all object alive through the GC.

This theory is supported by the fact that the variant in this attachment works without eager_compilation set.  The only change is that it explicitly puts the result of frameDoc.all into a local variable x, then sets the value of x to something else, thus I'd guess we don't keep around the all object, like in the optimized case.

I could verify this by looking at a GC log, but I'm lazy.
Anyways, deleting the ReleaseWrapper call in nsDocument::Destroy fixes the test case, and doesn't seem to leak ( https://tbpl.mozilla.org/?tree=Try&rev=0de6a6910f3c ), but then we still have the funny behavior where DOMProxyHandler::EnsureExpandoObject sets a wrapper as preserving without doing a hold, so we'll end up being a use-after-unlink error away from a use-after-free (triggering an EnsureExpandoObject after an Unlink will put us in the same situation as the test case), so I'm going to try to figure out what is going on there.
This is enough to fix the patch.  Is there some reason we can't just do this, Peter?

mAll gets cleared in Destroy, so maybe the ReleaseWrapper in Destroy isn't so terrible.  I can file a separate bug for removing that, as it still feels a bit archaic.
Attachment #788433 - Flags: feedback?(peterv)
Comment on attachment 788433 [details] [diff] [review]
preserve wrapper in last case in EnsureExpandoObject

I don't really know what this last case is supposed to be for, but this at least passes try...
Attachment #788433 - Flags: feedback?(peterv) → review?(peterv)
(In reply to Andrew McCreight [:mccr8] from comment #15)
> Then DOMProxyHandler::EnsureExpandoObject sets PreservingWrapper() to true
> on the ImageDocument, but doesn't add it as a JSHolder again.

But it does call XPCWrappedNativeScope::RegisterDOMExpandoObject,

> From there we are doomed.  We GC, don't trace the image document

so why isn't XPCWrappedNativeScope::TraceWrappedNativesInAllScopes not tracing it?
Flags: needinfo?(continuation)
It looks to me like that only traces the expando, and what we're missing here is tracing the mAll pointer.

I hadn't really realized about the RegisterDOMExpandoObject mechanism for tracing expandos, that's good to know.
Flags: needinfo?(continuation)
So this is a little confusing, because really what should be happening for ImageDocument (which inherits from HTMLDocument) is that we take the |if (expandoAndGeneration) {|. It looks like that is a problem with the OverrideBuiltins (we don't check for that on base interfaces).
(In reply to Peter Van der Beken [:peterv] from comment #26)
> we take the |if (expandoAndGeneration) {|

... branch which does |cache->PreserveWrapper(native)|.
From what I'm seeing, what is happening is that the call of nsDocument::Destroy into ReleaseWrapper is calling GetAndClearExpandoObject on the ImageDocument, which causes JSPROXYSLOT_EXPANDO to become undefined, so then when we call into EnsureExpandoObject, we don't hit the expandoAndGeneration case.  Is OverrideBuiltins supposed to prevent that somehow?
Comment on attachment 788433 [details] [diff] [review]
preserve wrapper in last case in EnsureExpandoObject

It sounds like this is the wrong solution.

I'll poke around in the bindings generation code for whatever deals with OverrideBuiltins to try to figure out what that is doing.
Attachment #788433 - Flags: review?(peterv)
I still don't know what it does, but I can confirm that manually adding OverrideBuiltins to ImageDocument in the .webidl file fixes this crash.
Let's do that for now to fix this. I'll file a separate bug to make that mandatory (I don't think it makes sense to have OverrideBuiltins on a base interface but not on a derived one).
Great!  I checked and there are no other cases of classes inheriting from OverrideBuiltins.

I'm going to call this a regression from the original WebIDLization of ImageDocument, though it isn't clear to me that it can be exploited without the addition of the document.all pointer.
I still need to do a try run.
Attachment #788433 - Attachment is obsolete: true
Attachment #790844 - Flags: review?(peterv)
Attachment #790844 - Flags: review?(peterv) → review+
Comment on attachment 790844 [details] [diff] [review]
Make ImageDocument OverrideBuiltins

try run: https://tbpl.mozilla.org/?tree=Try&rev=ef9284fca0eb

[Security approval request comment]
How easily could an exploit be constructed based on the patch? Very difficult.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? No.

Which older supported branches are affected by this flaw? 23, 24 and 25, though this might not be a security problem on 23. The actual testcase here is only going to work on 24 and later.

If not all supported branches, which bug introduced the flaw? Bug 868929 or bug 877277.

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? Trivial.

How likely is this patch to cause regressions; how much testing does it need? This looks very safe.
Attachment #790844 - Flags: sec-approval?
Comment on attachment 790844 [details] [diff] [review]
Make ImageDocument OverrideBuiltins

Sec-approval+ for trunk. Once it is in and green, let's take this on aurora and beta. I think this is the smallest patch I've seen this week and the risk appears to be very low.
Attachment #790844 - Flags: sec-approval?
Attachment #790844 - Flags: sec-approval+
Attachment #790844 - Flags: approval-mozilla-beta+
Attachment #790844 - Flags: approval-mozilla-aurora+
https://hg.mozilla.org/mozilla-central/rev/3c60eec96230
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla26
https://hg.mozilla.org/releases/mozilla-aurora/rev/b4b0dd527993
https://hg.mozilla.org/releases/mozilla-beta/rev/7bd99c764420

The patch didn't apply cleanly to beta due bug 885177 not landing there. I assume that having just the OverrideBuiltins without the ChromeOnly won't hurt anything.
Whiteboard: [adv-main24+]
Confirmed crash on ASan FF25 2013-06-25.
Verified fixed on ASan FF25 2013-09-14.
Verified fixed on ASan FF26 2013-11-20.
Group: core-security
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.