Closed Bug 887334 (CVE-2013-1738) Opened 11 years ago Closed 11 years ago

GC hazard with default compartments and frame chain restoration

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla25
Tracking Status
firefox22 --- wontfix
firefox23 --- wontfix
firefox24 + fixed
firefox25 + fixed
firefox-esr17 --- wontfix
firefox-esr24 --- fixed
b2g18 24+ wontfix
b2g-v1.1hd --- wontfix
b2g-v1.2 --- fixed

People

(Reporter: bholley, Assigned: bholley)

References

(Blocks 1 open bug)

Details

(Keywords: sec-critical, Whiteboard: [adv-main24+] underlying cause/fix of 882897,883301 and probably 886174)

Attachments

(17 files, 2 obsolete files)

2.56 KB, patch
luke
: review+
Details | Diff | Splinter Review
1.34 KB, patch
luke
: review+
Details | Diff | Splinter Review
917 bytes, patch
luke
: review+
Details | Diff | Splinter Review
1.63 KB, patch
luke
: review+
Details | Diff | Splinter Review
2.49 KB, patch
luke
: review+
Details | Diff | Splinter Review
1.57 KB, patch
luke
: review+
Details | Diff | Splinter Review
2.67 KB, patch
luke
: review+
gkrizsanits
: review+
Details | Diff | Splinter Review
1.95 KB, patch
luke
: review+
Details | Diff | Splinter Review
9.12 KB, patch
luke
: review+
Details | Diff | Splinter Review
1.74 KB, patch
luke
: review+
Details | Diff | Splinter Review
1.30 KB, patch
bent.mozilla
: review+
Details | Diff | Splinter Review
1.08 KB, patch
gkrizsanits
: review+
Details | Diff | Splinter Review
4.05 KB, patch
luke
: review+
Details | Diff | Splinter Review
967 bytes, patch
luke
: review+
Details | Diff | Splinter Review
2.54 KB, patch
luke
: review+
Details | Diff | Splinter Review
6.05 KB, patch
bhackett1024
: review+
Details | Diff | Splinter Review
2.71 KB, patch
mrbkap
: review+
Details | Diff | Splinter Review
Default compartments make our book keeping tricky in a number of ways: * When we're in the default compartment, enterCompartmentDepth is zero but cx->compartment() is non-null. * The default compartment on a cx can change at any time, even when we've already got code running on that cx. * The default compartment is actually tracked by a "default compartment object", which is the outer window in Gecko. But this object gets transplanted, so even if it stays the same, the compartment we compute from it varies with time. Meanwhile, saveFrameChain slurps up the current compartment and enterCompartmentDepth of a cx and pushes it onto a stack, putting the cx in whatever compartment currently holds the associated default compartment object. restoreFrameChain pops the stack and resets the previous compartment and enterCompartmentDepth. This whole setup significantly complicates our attempt to mark all the live globals in the system. Currently, we do three things: (A) (In the JS Engine): Mark the global associated with each frame on the stack. (B) (In the JS Engine): Iterate over all the compartments, and mark those we've called enter() on. (C) (In XPConnect): Iterate over all the contexts, and mark the global associated with the default compartment object. This usually works. But it breaks in the following case: 1 - cx->enterCompartmentDepth is zero, and cx is in the default compartment. There is no code running on cx. 2 - We push cx, causing us to call saveFrameChain, stashing the tuple (0, defaultCompartmentObject->compartment()) onto the SavedFrameChain stack. 3 - We either call JS_SetGlobalObject(cx, someOtherObject), or transplant defaultCompartmentObject to another compartment. 4 - We pop cx, causing us to call restoreFrameChain, setting cx->enterCompartmentDepth to 0, and putting cx in the old compartment, which is no longer equal to cx->defaultCompartmentObject->compartment(). After (3), nothing is keeping the saved compartment alive (See A-C above): (A) No code was running in (1), we don't have any stack frames to trace. (B) We never call enter() on default compartments. (C) The compartment doesn't match that of the defaultCompartmentObject on the cx, so the XPConnect hook misses it. We can keep the compartment alive between (3) and (4) by calling enter() on all the compartments we push onto the SavedFrameChain stack (luke suggested this on monday). But once (4) happens, we'd have to call leave(), and still be screwed. I then thought that we could check in saveFrameChain whether we either (a) have manually entered the current compartment or (b) have a non-empty scope-chain in the current compartment. If we have neither of the above, we'd set the compartment of the SavedFrameChain to null, and compute the default compartment when it comes time to actually use it. But if (b) is the case, then we're still in trouble, because as soon as the associated code is finished running, the cx will remain in that compartment, but we'll have no way to mark it (since none of A-C will apply once the last stack frame is popped). In general, the solution here is to rip out default compartment objects, and I have a plan to do that. But I'd like to find a well-contained fix here that we can backport. Thoughts, luke?
Flags: needinfo?(luke)
Nice catch with the problem of enter()/leave() on save/restore. This defaultCompartment stuff is madness. Agreed that the long-term fix is to remove it all, but how about this as an intermediate solution: we have cx->compartment = NULL by default (until a compartment is explicitly via AutoCompartment). This means adding "AutoCompartment ac(cx, GetDefaultGlobalForContext(cx)" in the few remaining places that depend on the defaultCompartmentObject. (So JS_SaveFrameChain would reset cx->compartment = NULL.) This would also help out bug 722345 which wants to remove the request API to use "cx->compartment == NULL" in place of "cx->outstandingRequests > 0". I started to do this but I didn't know enough to pick the right compartment to enter in all cases; istr you saying this would be much easier now.
Flags: needinfo?(luke)
(In reply to Luke Wagner [:luke] from comment #1) > This defaultCompartment stuff is madness. Agreed that the long-term fix is > to remove it all Yes. > but how about this as an intermediate solution: we have > cx->compartment = NULL by default (until a compartment is explicitly via > AutoCompartment). This means adding "AutoCompartment ac(cx, > GetDefaultGlobalForContext(cx)" in the few remaining places that depend on > the defaultCompartmentObject. (So JS_SaveFrameChain would reset > cx->compartment = NULL.) This is tantamount to "removing it all", AFAICT. We'd either need to fix a bunch of consumers to use JSAutoCompartments, or fix a bunch of places in the JS engine to handle cx->compartment being null. I have a plan for doing the former, but I don't think it's easily backportable. Do you have any ideas for a more localized fix?
Flags: needinfo?(luke)
(In reply to Bobby Holley (:bholley) from comment #2) > This is tantamount to "removing it all", AFAICT. Well, except that there is still a GetDefaultGlobalForContext/JS_SetGlobalObject available. I thought that any path into the JS engine already had to funnel through one of a few of this CxPusher paths (which is where I assumed you'd add the AutoCompartment). > We'd either need to fix a > bunch of consumers to use JSAutoCompartments, or fix a bunch of places in > the JS engine to handle cx->compartment being null. The former is what I intended; you have to enter a compartment to enter the JS engine. > I have a plan for doing > the former, but I don't think it's easily backportable. Do you have any > ideas for a more localized fix? Nope, this is all pretty gnarly. OTOH, other than these ASan bugs, do you think this is crashing much in the wild? If not, letting the fix ride the trains seems preferable.
Flags: needinfo?(luke)
Blocks: 882897
Blocks: 883301
(In reply to Luke Wagner [:luke] from comment #3) > (In reply to Bobby Holley (:bholley) from comment #2) > > This is tantamount to "removing it all", AFAICT. > > Well, except that there is still a > GetDefaultGlobalForContext/JS_SetGlobalObject available. I thought that any > path into the JS engine already had to funnel through one of a few of this > CxPusher paths (which is where I assumed you'd add the AutoCompartment). As of recently yeah, but that doesn't help us on branches. > > I have a plan for doing > > the former, but I don't think it's easily backportable. Do you have any > > ideas for a more localized fix? > > Nope, this is all pretty gnarly. OTOH, other than these ASan bugs, do you > think this is crashing much in the wild? If not, letting the fix ride the > trains seems preferable. I'm concerned about letting a pretty exploitable gc hazard go unfixed for the lifetime union of all our supported branches. It was less detectable before when we were pushing/popping less. But now that it is, it seems pretty likely that somebody is going to found it. Nils found it, and a lot of those tools are open-source. Is there a choke-point where the last frame pops off the callstack for a given cx, and would doing a little bit of work there be acceptable performance-wise? If so, we could do the following: * If restoreFramechain pops a SavedFrameChain with enterCompartmentDepth == 0, it throws away the stashed compartment and uses the current defaultCompartmentObject, _unless_ there are frames on the stack. * In the event of the above, we rely on code in the aforementioned choke point to reset cx->compartment based on the defaultCompartmentObject when the stack becomes empty. I know the above is kind of gross, but the code complexity will be short-lived, since we'll follow it up with a removal of all this gunk. But if it's feasible from a performance standpoint, I think we should do it for the sake of branch security.
(In reply to Bobby Holley (:bholley) from comment #4) > Is there a choke-point where the last frame pops off the callstack for a > given cx, and would doing a little bit of work there be acceptable > performance-wise? If so, we could do the following: Yeah, on branches, ContextStack::popSegment() (if !cx->hasfp()), on trunk/aurora, Activation::~Activation() (if !currentlyRunning()). > * If restoreFramechain pops a SavedFrameChain with enterCompartmentDepth == > 0, it throws away the stashed compartment and uses the current > defaultCompartmentObject, _unless_ there are frames on the stack. I didn't consider this an option because it would change observable behavior (what if someone assumed that, after the JS_RestoreFrameChain that the compartment was as it was before JS_SaveFrameChain? Maybe you understand more about this, but in the abstract this seems scary for a Beta branch patch.
(In reply to Luke Wagner [:luke] from comment #5) > (In reply to Bobby Holley (:bholley) from comment #4) > > Is there a choke-point where the last frame pops off the callstack for a > > given cx, and would doing a little bit of work there be acceptable > > performance-wise? If so, we could do the following: > > Yeah, on branches, ContextStack::popSegment() (if !cx->hasfp()), on > trunk/aurora, Activation::~Activation() (if !currentlyRunning()). Ok, I will give that a try. > I didn't consider this an option because it would change observable behavior > (what if someone assumed that, after the JS_RestoreFrameChain that the > compartment was as it was before JS_SaveFrameChain? Maybe you understand > more about this, but in the abstract this seems scary for a Beta branch > patch. Hm. I guess it might be an issue if someone was in the default compartment on a cx, allocated an object, then did a scoped push/pop of that cx, then used that object. That doesn't seem super likely, but it's hard to say. I'm going to give this some thought and dig into the code a bit.
Whiteboard: underlying cause/fix of 882897,883301 and probably 886174
Talked with luke about this. I think we have no other good option but to remove default compartment objects. We can backport that to aurora24 (and definitely should, given esr24), but probably not anything earlier. dveditz is in the loop on this too.
Whiteboard: underlying cause/fix of 882897,883301 and probably 886174
Whiteboard: underlying cause/fix of 882897,883301 and probably 886174
I assume this would get rated sec-high or sec-crit?
(In reply to Andrew McCreight [:mccr8] from comment #8) > I assume this would get rated sec-high or sec-crit? Yeah.
Keywords: sec-critical
Blocks: 886174
Depends on: 888104
Blocks: 889387
Depends on: 889714
As an update - I'd run into some nasty GC crashes in IPC xpcshell tests, which I fixed by spending a week gutting XPCShellTestEnvironment and friends to be much simpler and single-cx-ready (bug 889714). The patches in this bug are now green. Uploading and flagging for review.
This thing is...yikes.
Attachment #773636 - Flags: review?(luke)
The current code makes calls that assume (implicitly, via assertions) that |cx| is in a compartment, which isn't a valid assumption going forward.
Attachment #773637 - Flags: review?(luke)
cx->global() assumes a non-null compartment(). When we fix up various bugs related to being in a compartment when we shouldn't be, we start to crash here. Fix it.
Attachment #773639 - Flags: review?(luke)
The call to JS_SetGlobalObject causes cx->compartment_ to be set to the self- hosting global, which means that the JSAutoCompartment picks up that compartment as the 'previous' compartment. So despite the attempt to restore things with JS_SetGlobalObject at the end of the function, the JSAutoCompartment destructor actually ends up leaving cx in the self-hosting global's compartment at the end of this function. Moving the JSAutoCompartment construction above the call to JS_SetGlobalObject fixes the problem.
Attachment #773640 - Flags: review?(luke)
This gets rid of the wonky setCompartment usage in that function.
Attachment #773641 - Flags: review?(luke)
This should hopefully take care of any cases where consumers expect to be in the default compartment.
Attachment #773645 - Flags: review?(luke)
Attachment #773645 - Flags: review?(gkrizsanits)
The stuff in nsXBLProtoImplMethod is doing the same thing, so let's just have it call into nsJSUtils.
Attachment #773646 - Flags: review?(luke)
Attachment #773647 - Flags: review?(luke)
I don't know why it's done this way, but it sure needs to change.
Attachment #773648 - Flags: review?(luke)
Attachment #773651 - Flags: review?(bent.mozilla)
With this change, defaultCompartmentObject_ is just an opaque (though traced) piece of embedder state on the cx.
Attachment #773653 - Flags: review?(luke)
If it's non-null, that means we've got a JSAutoCompartment on the stack, which is going to run into trouble when it tries to restore the old compartment on the now-dead cx.
Attachment #773657 - Flags: review?(luke)
Attachment #773650 - Flags: review?(bent.mozilla) → review+
Comment on attachment 773651 [details] [diff] [review] Part 13 - Conditionally enter a compartment in WorkerPrivate::DoRunLoop. v1 Review of attachment 773651 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/workers/WorkerPrivate.cpp @@ +1696,5 @@ > + // lazily create a global, in which case we need to be in its compartment > + // when calling PostRun() below. Maybe<> this time... > + if (mTarget == WorkerThread && ac.empty() && > + js::GetDefaultGlobalForContext(cx)) > + { Nit: { on previous line @@ +2869,5 @@ > + // "operate in the compartment of the worker global if it exists" > + // behavior here. This could probably be improved with some refactoring. > + Maybe<JSAutoCompartment> maybeAC; > + if (js::GetDefaultGlobalForContext(aCx)) > + maybeAC.construct(aCx, js::GetDefaultGlobalForContext(aCx)); So I don't know if it's a good idea to do this each time in the loop. Couldn't we just do it the first time and be done? Also, nit: Use braces
Attachment #773652 - Flags: review?(gkrizsanits) → review+
Comment on attachment 773636 [details] [diff] [review] Part 1 - Remove AutoSwitchCompartment. v1 Die die die my darling
Attachment #773636 - Flags: review?(luke) → review+
Attachment #773637 - Flags: review?(luke) → review+
Attachment #773639 - Flags: review?(luke) → review+
Attachment #773640 - Flags: review?(luke) → review+
Attachment #773641 - Flags: review?(luke) → review+
Comment on attachment 773642 [details] [diff] [review] Part 6 - Don't rethrow in quickstubs if there's already an exception pending. v1 This is in xpc, so your call, but I assume the "we used to" comment will become stale/irrelevant in a few months and so maybe you'd rather remove it (and but it in the cset comment for anyone that notices the change and hg annotates).
Attachment #773642 - Flags: review?(luke) → review+
Attachment #773643 - Flags: review?(luke) → review+
Attachment #773645 - Flags: review?(luke) → review+
Attachment #773646 - Flags: review?(luke) → review+
Attachment #773647 - Flags: review?(luke) → review+
Attachment #773648 - Flags: review?(luke) → review+
Comment on attachment 773653 [details] [diff] [review] Part 15 - Stop setting the compartment to defaultCompartmentObject_->compartment(). v1 Our hero
Attachment #773653 - Flags: review?(luke) → review+
Attachment #773657 - Flags: review?(luke) → review+
Attachment #773658 - Flags: review?(luke) → review+
Great job Bobby!
Attachment #773645 - Flags: review?(gkrizsanits) → review+
(In reply to Luke Wagner [:luke] from comment #34) > Comment on attachment 773642 [details] [diff] [review] > Part 6 - Don't rethrow in quickstubs if there's already an exception > pending. v1 > > This is in xpc, so your call, but I assume the "we used to" comment will > become stale/irrelevant in a few months and so maybe you'd rather remove it > (and but it in the cset comment for anyone that notices the change and hg > annotates). Quickstubs will be gone in a few months, so I think it should be ok.
We have to do some temporary hackiness to deal with some of the new PJS work. This patch stays as true to the old world as possible, so that we can more easily backport it.
Attachment #773643 - Attachment is obsolete: true
Attachment #776726 - Flags: review?(bhackett1024)
Attachment #773651 - Attachment is obsolete: true
Attachment #776767 - Flags: review?(bent.mozilla)
Attachment #776726 - Flags: review?(bhackett1024) → review+
The xpcshell fail on that last push was the result of me pushing with some patches missing from bug 889911. Doing an xpcshell try push again, just in case: https://tbpl.mozilla.org/?tree=Try&rev=a583efc39fb2
Comment on attachment 776767 [details] [diff] [review] Part 13 - Conditionally enter a compartment in WorkerPrivate::DoRunLoop. v2 Review of attachment 776767 [details] [diff] [review]: ----------------------------------------------------------------- Stealing this review from bent at bholley's request. ::: dom/workers/WorkerPrivate.cpp @@ +2868,5 @@ > + // explicitly entering it. That no longer works, so we mimic the > + // "operate in the compartment of the worker global once it exists" > + // behavior here. This could probably be improved with some refactoring. > + if (maybeAC.empty() && js::GetDefaultGlobalForContext(aCx)) > + maybeAC.construct(aCx, js::GetDefaultGlobalForContext(aCx)); Nit: single-line ifs need braces in this component.
Attachment #776767 - Flags: review?(bent.mozilla) → review+
followup bustage fix for b2g desktop c++ unit tests: https://hg.mozilla.org/integration/mozilla-inbound/rev/ff75b1f352e9
Comment on attachment 773658 [details] [diff] [review] Part 17 - Reorder some bookkeeping and assert in setCompartment that both the old and new compartments are marked as entered. v1 Flagging these patches, along with dependencies, for Firefox 24 uplift, given that this fixes a major otherwise unfixable sec-critical that fuzzers are starting to find (see dependent bug). dveditz is in the loop here, and definitely wants this on 24 (see comment 7). This request applies to the patches in this bug, as well as those in bug 889714 and bug 889911. Note that the fixes in those bugs are _only_ needed to fix failures in the IPC XPCshell test harness, and they largely only affect testing infrastructure. [Approval Request Comment] Bug caused by (feature/regressing bug #): longstanding User impact if declined: sec-critical Testing completed (on m-c, etc.): Just landed on m-c Risk to taking this patch (and alternatives if risky): Moderate risk, but no less-risky alternatives String or IDL/UUID changes made by this patch: None
Attachment #773658 - Flags: approval-mozilla-aurora?
Depends on: 895774
Comment on attachment 773658 [details] [diff] [review] Part 17 - Reorder some bookkeeping and assert in setCompartment that both the old and new compartments are marked as entered. v1 Approving, given we need this on Fx24. Given, moderate risk here if there is any QA testing/verification needed here that could mitigate that , please loop in QA here(Matt Wobensmith). And an eyeout on possible regression this may have in the next couple of days will be helpful as well, given the bake time on m-c.
Attachment #773658 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
I'm ready to assist in testing, which will require test case(s) and steps to reproduce, if possible.
(In reply to Bobby Holley (:bholley) from comment #49) > https://hg.mozilla.org/releases/mozilla-aurora/ > pushloghtml?changeset=84b828b63115 Backed out from Aurora for possibly causing xpcshell crashes along with the other changes from bholley's push in https://hg.mozilla.org/releases/mozilla-aurora/rev/659b0d61fbc6
No longer blocks: 883301
Whiteboard: underlying cause/fix of 882897,883301 and probably 886174 → [adv-main24+] underlying cause/fix of 882897,883301 and probably 886174
Don't we want to embargo this for esr17?
Flags: needinfo?(abillings)
My impression of the embargo is we just wouldn't open up the bug to the public until we end of life ESR17, not that we wouldn't write an advisory. Dan, what do you think?
Flags: needinfo?(abillings) → needinfo?(dveditz)
If that's the case, then the embargo can't say anything much more useful than "we fixed a GC hazard".
Sounds like it is best off the list entirely then!
Alias: CVE-2013-1738
Flags: needinfo?(dveditz)
:bholley looks like this will need a backport to mozilla-b2g18 given its still affected. Can you please help with a follow-up patch?
Flags: needinfo?(bobbyholley+bmo)
(In reply to bhavana bajaj [:bajaj] from comment #61) > :bholley looks like this will need a backport to mozilla-b2g18 given its > still affected. Can you please help with a follow-up patch? This is not backportable. See comment 7.
Flags: needinfo?(bobbyholley+bmo)
Group: core-security
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: