Closed Bug 1406570 Opened 4 years ago Closed 4 years ago

Rooting hazards revealed by improved constructor handling

Categories

(Core :: JavaScript Engine, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla58
Tracking Status
firefox-esr52 --- wontfix
firefox56 --- wontfix
firefox57 --- fixed
firefox58 --- fixed

People

(Reporter: sfink, Assigned: sfink)

References

Details

(Keywords: sec-audit, Whiteboard: [adv-main57-][post-critsmash-triage])

Attachments

(8 files, 3 obsolete files)

1.56 KB, patch
jonco
: review+
Details | Diff | Splinter Review
1.50 KB, patch
jonco
: review+
Details | Diff | Splinter Review
3.44 KB, patch
jonco
: review+
Details | Diff | Splinter Review
5.70 KB, patch
bzbarsky
: review+
Details | Diff | Splinter Review
3.78 KB, patch
Details | Diff | Splinter Review
4.56 KB, patch
jonco
: review+
Details | Diff | Splinter Review
2.65 KB, patch
jonco
: review+
Details | Diff | Splinter Review
16.74 KB, patch
sfink
: review+
Details | Diff | Splinter Review
I'm working on a sixgill toolchain build, which has the side effect of applying a couple of fixes to sixgill that I apparently never released. And which fix a break in the callgraph, resulting in a rooting hazard.
Trivial fix
Attachment #8916182 - Flags: review?(jcoppeard)
Assignee: nobody → sphink
Status: NEW → ASSIGNED
Group: core-security → javascript-core-security
Attachment #8916182 - Flags: review?(jcoppeard) → review+
Blocks: 1339989
It seems like a pain to file a bug and separately request approval for each of the things uncovered by updating sixgill, so I'm going to try putting them all here for now.

I either need to land these before switching over to the sixgill toolchain task, since it's building a newer sixgill, or figure out what version I was using before so I can disconnect these landings.
Summary: Rooting hazard in structured clone writeArrayBuffer → Rooting hazards revealed by improved constructor handling
This is *probably* a false positive, but I'm not confident. The hazard analysis believes that the mConstructor CustomElementConstructor can GC, for somewhat questionable reasons. I annotated away a few, but it came up with more, and the fix to avoid the problem is easy -- copy the aPrototype JSObject* into mPrototype first, before calling the CustomElementConstructor ctor. Then aPrototype will be dead, and mPrototype will have the post barrier run on it.

jonco -- except... why does this work? It will protect it from the generational collection's pointer move. But what if CustomElementConstructor::CustomElementConstructor triggered a compacting GC? In CustomElementDefinition constructor, the Heap<JSObject*> here is not actually linked into the heap yet, so won't get traced. Are we relying on compacting GCs from not getting triggered in weird places like this? (eg if compacting GCs only happened from an empty event loop or something, this would be fine.) And why do I not know this already?
Attachment #8916713 - Flags: review?(nfroyd)
Attachment #8916713 - Flags: review?(jcoppeard)
My try run hasn't finished yet to confirm that this resolves all of the problems here, but I'm requesting review to ensure that there's not some crazy way that aCallback->CallbackPreserveColor() could somehow return a different value here than it did at the beginning, or something. Or if you have a better idea of how to fix this one.

The issue is that AutoEntryScript has an AutoJSAPI which can GC during construction, and we're holding realCallback and wrappedCallback live across it. This was revealed when I rebuilt sixgill with a fix I made a while back that improves constructor handling (and that I thought I was already running with.) The hazard in detail:

Function

  void mozilla::dom::CallbackObject::CallSetup::CallSetup(mozilla::dom::CallbackObject*, mozilla::ErrorResult*, int8*, uint32, JSCompartment*, uint8)

has unrooted 'realCallback' of type 'JSObject*' live across GC call

  void mozilla::Maybe<T>::emplace(Args&& ...) [with Args = {nsIGlobalObject*&, const char*&, const bool&}; T = mozilla::dom::AutoEntryScript]' at dom/bindings/CallbackObject.cpp:197

GC Function: _ZN7mozilla5MaybeINS_3dom15AutoEntryScriptEE7emplaceIIRP15nsIGlobalObjectRPKcRKbEEEvDpOT_$void mozilla::Maybe<T>::emplace(Args&& ...) [with Args = {nsIGlobalObject*&, const char*&, const bool&}; T = mozilla::dom::AutoEntryScript]
    void mozilla::dom::AutoEntryScript::AutoEntryScript(nsIGlobalObject*, int8*, uint8)
    void mozilla::dom::AutoEntryScript::AutoEntryScript(nsIGlobalObject*, int8*, uint8)
    void mozilla::dom::AutoJSAPI::AutoJSAPI(nsIGlobalObject*, uint8, uint32)
    void mozilla::dom::AutoJSAPI::AutoJSAPI(nsIGlobalObject*, uint8, uint32)
    void mozilla::dom::AutoJSAPI::InitInternal(nsIGlobalObject*, JSObject*, JSContext*, uint8)
    uint8 JS_GetProperty(JSContext*, class JS::Handle<JSObject*>, int8*, class JS::MutableHandle<JS::Value>)
    ...(more stuff, but yeah, JS_GetProperty can GC)...

try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=efaffaebbc27ccdec6b7b0604fd66bb70c59f310
Attachment #8916717 - Flags: review?(continuation)
Attached patch Rooting in IPCSplinter Review
The analysis now believes that JSAutoCompartment's ctor can GC, with a reason that sounds plausble to me: it could yield to another zone group, which could then trigger a GC.
Attachment #8916718 - Flags: review?(jcoppeard)
I don't remember now whether these annotations are necessary for any of the new hazards, or whether they just improve the reasons for GCing that the analysis reoprts. Either way, they're useful.
Attachment #8916719 - Flags: review?(jcoppeard)
Comment on attachment 8916713 [details] [diff] [review]
Resolve rooting hazard in CustomElementRegistry by reordering initialization

Review of attachment 8916713 [details] [diff] [review]:
-----------------------------------------------------------------

rs=me, I guess?, though depending on ordering of initialization for this stuff to not trigger hazards seems pretty fragile.
Attachment #8916713 - Flags: review?(nfroyd) → review+
Comment on attachment 8916717 [details] [diff] [review]
Re-fetch realCallback after AutoEntryScript

Review of attachment 8916717 [details] [diff] [review]:
-----------------------------------------------------------------

Some of this changed in bug 1396904, so it might be worth taking a look at that.

This looks reasonable to me, but bz should take a look at it.

::: dom/bindings/CallbackObject.cpp
@@ +229,5 @@
>      // Check that it's ok to run this callback at all.
> +    //
> +    // Re-fetch the (unwrapped) callback because we may have GC'd and
> +    // invalidated realCallback and wrappedCallback since they were originally
> +    // retrieved. not the wrapper.

"not the wrapper." looks like it should be removed.
Attachment #8916717 - Flags: review?(continuation)
Attachment #8916717 - Flags: review?(bzbarsky)
Attachment #8916717 - Flags: feedback+
Comment on attachment 8916717 [details] [diff] [review]
Re-fetch realCallback after AutoEntryScript

So this is happening because the JSAutoCompartment inside AutoEntryScript can gc?  Or is there something else going on?

I honestly don't see how JSAutoCompartment being able to gc can be safe.  It gets passed a JSObject*.  Given the above comments, the GC path is via the JSContext::enterZoneGroup call in JSContext::enterNonAtomsCompartment, right?  But after that call we keep using the JSCompartment that came in via our argument... except by that point it can be dead, no?

There are also comments in this code about how we can't CC above this point, and it's not clear to me whether _that_ is still true...

Anyway, my preferred fix here would be to move this whole "is script enabled?" check to right after we grab realCallback.  And probably re-scope some things so that realCallback and wrappedCallback go out of scope as soon as possible (right after setting up win and globalObject for the former, right after grabbing realCallback for the latter) instead of lingering like this.
Attachment #8916717 - Flags: review?(bzbarsky) → review-
Comment on attachment 8916713 [details] [diff] [review]
Resolve rooting hazard in CustomElementRegistry by reordering initialization

Review of attachment 8916713 [details] [diff] [review]:
-----------------------------------------------------------------

Ugh.  I feel like we should have a better solution than reordering of fields for problems like this.  How about moving the possible GC call out of the constructor?
Attachment #8916718 - Flags: review?(jcoppeard) → review+
Attachment #8916719 - Flags: review?(jcoppeard) → review+
(In reply to Boris Zbarsky [:bz] (still digging out from vacation mail) from comment #9)
> Comment on attachment 8916717 [details] [diff] [review]
> Re-fetch realCallback after AutoEntryScript
> 
> So this is happening because the JSAutoCompartment inside AutoEntryScript
> can gc?  Or is there something else going on?

Yes, that's what it is.

> I honestly don't see how JSAutoCompartment being able to gc can be safe.  It
> gets passed a JSObject*.  Given the above comments, the GC path is via the
> JSContext::enterZoneGroup call in JSContext::enterNonAtomsCompartment,
> right?  But after that call we keep using the JSCompartment that came in via
> our argument... except by that point it can be dead, no?

The JSObject* live range ends before we could GC, but yes, the JSCompartment's live range does reach the GC call. However, entering a compartment "roots" the compartment (and its global, fwiw: http://searchfox.org/mozilla-central/source/js/src/jscompartment.cpp#770 ).

> There are also comments in this code about how we can't CC above this point,
> and it's not clear to me whether _that_ is still true...
> 
> Anyway, my preferred fix here would be to move this whole "is script
> enabled?" check to right after we grab realCallback.  And probably re-scope
> some things so that realCallback and wrappedCallback go out of scope as soon
> as possible (right after setting up win and globalObject for the former,
> right after grabbing realCallback for the latter) instead of lingering like
> this.

Ok, though the analysis doesn't care; it looks at where the live range ends.
I should have looked closer to see that this code is easily movable. I was afraid to disturb anything.

This patch is going to be a pain to review with whitespace changes. I'll upload a non-whitespace diff separately.
Attachment #8917199 - Flags: review?(bzbarsky)
Attachment #8916717 - Attachment is obsolete: true
> the JSCompartment's live range does reach the GC call. However, entering a compartment "roots" the compartment

That doesn't help, unfortunately.  The order of operations in JSContext::enterNonAtomsCompartment is:

1)  Call enterZoneGroup.
2)  Call c->enter().

The enterCompartmentDepth on c is bumped in step 2.  The potential GC happens in step 1, as far as I can tell.
Flags: needinfo?(sphink)
Comment on attachment 8917199 [details] [diff] [review]
Avoid holding realCallback across AutoEntryScript

r=me; thank you for the -w diff!
Attachment #8917199 - Flags: review?(bzbarsky) → review+
(In reply to Boris Zbarsky [:bz] (still digging out from vacation mail) from comment #14)
> > the JSCompartment's live range does reach the GC call. However, entering a compartment "roots" the compartment
> 
> That doesn't help, unfortunately.  The order of operations in
> JSContext::enterNonAtomsCompartment is:
> 
> 1)  Call enterZoneGroup.
> 2)  Call c->enter().
> 
> The enterCompartmentDepth on c is bumped in step 2.  The potential GC
> happens in step 1, as far as I can tell.

Gah. You are right, of course. The full order of operations is

1) enterCompartmentDepth_++
2) Call enterZoneGroup
3) Call c->enter()

which fooled me into thinking that enterCompartmentDepth had already been incremented, because I mixed up JSContext::enterCompartmentDepth_ and JSCompartment::enterCompartmentDepth. :-( Maybe using the same name isn't such a good idea.

Well, crud. It would seem a little weird to swap the order, because then the compartment would be entered during a potentially yielding enterZoneGroup, and I'm not sure of the invariants here -- the cx running in that ZoneGroup might reasonably expect that all of JSCompartment::enterCompartmentDepth_ is from itself. I wonder if the MOZ_ASSERT(!source->hasBeenEntered()) in mergeCompartments might fail?

Heh. I suppose we could make a JSCompartment::root() (or hold()) that increments a different counter that also roots the global.
Flags: needinfo?(sphink)
Sounds like we should just get a separate bug filed on that...
This is pretty heavyweight. And perhaps I should work on making JSCompartment a GCThing so it can be Rooted<>. Hmm... or maybe to speed up compartment entry, we'd want to specialize Rooted<JSCompartment*> to do exactly what this is doing?

Alternatively, we could just move up the c->enter() call and remove/replace any assertions that might break.
Attachment #8917618 - Flags: review?(jcoppeard)
Comment on attachment 8917618 [details] [diff] [review]
"Root" compartment while entering it

Review of attachment 8917618 [details] [diff] [review]:
-----------------------------------------------------------------

> we could just move up the c->enter() call and remove/replace any assertions that might break.

Yeah we should try that... I don't know what the invariants are here either.

But this is fine.

::: js/src/jscompartment.h
@@ +660,5 @@
> +    }
> +    void releaseGlobal() {
> +        globalHolds--;
> +    }
> +    bool canCollectGlobal() const { return globalHolds == 0 && !hasBeenEntered(); }

I'd prefer inverting the sense of this - shouldTraceGlobal() is clearer to me than canCollectGlobal().
Attachment #8917618 - Flags: review?(jcoppeard) → review+
Keywords: sec-audit
Filed bug 1408120 for extending the analysis to detect the untraced-during-construction hazards.
Ok, I'm going to go with a simple fix for now. This just passes in a Handle to avoid the reported hazard. The current field order is correct; if that mConstructor constructor really *can* GC, then you'd want it in the existing order so that the object stays safe in the Handle during the GC, then gets placed into the Heap<> afterwards. Reordering would make the hazard report go away, but would actually introduce a real GC invalidation hazard because of the untraced Heap<> that wouldn't be reported until bug 1408120 is implemented.
Attachment #8917964 - Flags: review?(jcoppeard)
Attachment #8916713 - Attachment is obsolete: true
Attachment #8916713 - Flags: review?(jcoppeard)
Attached patch Rollup patch (obsolete) — Splinter Review
Attachment #8918040 - Flags: review+
Comment on attachment 8918040 [details] [diff] [review]
Rollup patch

This was labeled sec-audit, and I'm not sure whether I need sec-approval for that. So I'll request it. Note that one constituent patch is not yet r+, but in its final form it's pretty trivial. I will wait for r+ before landing.

[Security approval request comment]
How easily could an exploit be constructed based on the patch?

Difficult. You need to trigger a GC at just the right time, and you don't have a whole lot of control over how the stale memory gets used.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? 

"rooting hazards" imply an opportunity for UAF for people aware of what they are.

Which older supported branches are affected by this flaw?

all

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?

They should be pretty much the same. I wouldn't be surprised if the analysis patch, at least, had some cosmetic fuzz.

How likely is this patch to cause regressions; how much testing does it need?

Unlikely. There is very little behavior change in this patch. Some operations are reordered a bit.
Attachment #8918040 - Flags: sec-approval?
Comment on attachment 8918040 [details] [diff] [review]
Rollup patch

Technically a sec-audit doesn't need sec-approval but I'll give it here anyway in case we re-rate it at some point.
Attachment #8918040 - Flags: sec-approval? → sec-approval+
Attachment #8917964 - Flags: review?(jcoppeard) → review+
https://hg.mozilla.org/mozilla-central/rev/750de14d8371

Are there branch backports that need to be considered here or can this ride the 58 train?
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Flags: needinfo?(sphink)
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Comment on attachment 8918040 [details] [diff] [review]
Rollup patch

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
User impact if declined: 
Fix Landed on Version:
Risk to taking this patch (and alternatives if risky): 
String or UUID changes made by this patch: 

See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
User impact if declined: 
Fix Landed on Version:
Risk to taking this patch (and alternatives if risky): 
String or UUID changes made by this patch: 

See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.

Approval Request Comment
[Feature/Bug causing the regression]: This goes way back, more or less to generational GC.
[User impact if declined]: probably small number of random crashes, vaguely possible exploits
[Is this code covered by automated tests?]:not usefully, or we would have gotten crashes
[Has the fix been verified in Nightly?]:pushed 2017-10-12 (yesterday)
[Needs manual test from QE? If yes, steps to reproduce]: no
[List of other uplifts needed for the feature/fix]:none. The hazard analysis fixes to detect these could be uplifted in bug 1339989, I guess.
[Is the change risky?]: none of the individual changes seemed very risky at all
[Why is the change risky/not risky?]: most of it is just simple rooting, with one thing involving some code reordering but there's not much that could go wrong within that one function
[String changes made/needed]:none

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration: I'm not sure whether we generally land these sorts of minor cleanups on esr52.
User impact if declined: see above. Mostly theoretical crashiness (that would likely fall into one of the existing buckets of "weird crashes we don't know what to do about".)
Fix Landed on Version: 58
Risk to taking this patch (and alternatives if risky): not much
String or UUID changes made by this patch: none
Flags: needinfo?(sphink)
Attachment #8918040 - Flags: approval-mozilla-esr52?
Attachment #8918040 - Flags: approval-mozilla-beta?
Attached patch rollup for betaSplinter Review
Ok, I've made a mess here. Everything that should have landed has landed, and if this is rejected for backporting, we're done. (I have no opinion on whether it's worth a backport.)

But if it is approved, use this patch.

What happened: I somehow decided that the patch '"Root" compartment while entering it' belonged to a different, non-security bug, and I spliced it out of my patch stack and landed it separately. Then when RyanVM merged it to mozilla-central, I saw the comment in this bug and assumed that he'd landed the rollup, which I thought contained everything. (Why did I think he would land this patch for me? I dunno, it's a security bug with backport requests, and he does magic landings for backports, and I got confused.) Then later, I landed bug 1339989 which was in the same stack in my workspace (because it depends on these bugs being fixed so they don't show up as hazards after bug 1339989), which resulted in the rollup patch getting pushed. RyanVM noticed that I landed another patch from a bug that was already marked closed.

Bleagh. Sorry for the confusion.
Attachment #8918040 - Attachment is obsolete: true
Attachment #8918040 - Flags: approval-mozilla-esr52?
Attachment #8918040 - Flags: approval-mozilla-beta?
Attachment #8918483 - Flags: review+
Attachment #8918483 - Flags: approval-mozilla-esr52?
Attachment #8918483 - Flags: approval-mozilla-beta?
Comment on attachment 8918483 [details] [diff] [review]
rollup for beta

Crash fix, Beta57+
Attachment #8918483 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Group: javascript-core-security → core-security-release
Since this is a security issue and we've fixed it on other branches, OK, let's try it on ESR.
Attachment #8918483 - Flags: approval-mozilla-esr52? → approval-mozilla-esr52+
Whiteboard: [adv-main57-]
Flags: qe-verify-
Whiteboard: [adv-main57-] → [adv-main57-][post-critsmash-triage]
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.