Closed Bug 991661 Opened 10 years ago Closed 6 years ago

Sort out a mechanism to get the current realm in the presence of Xrays

Categories

(Core :: XPConnect, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED INACTIVE

People

(Reporter: bholley, Unassigned)

Details

Splitting this off into a separate bug. See (H) in bug 989528 comment 2, as well as the discussion starting in bug 989528 comment 13.
(In reply to Boris Zbarsky [:bz] from bug 989528 comment 13)
> Right now, compartments conflate two things: they're the security context
> for security checks and they're the global that will be used for JS_New*.

Yes, but I would say "tie" instead of "conflate", because the security guarantees of the former depend on the latter.
 
> But if we want to move to the ES6 Realm model, the Realm becomes the global
> you use for JS_New*, while in the Xray case the security context is
> something different...  I'd like us to at least have an idea of how we plan
> to go about dealing with this.

I agree that this is something we should sort out.

Boris, I seem to recall us discussing this in various outer bugs ~1 year ago. Do you remember what they are? It would be helpful to link to that discussion as a starting point. In particular, I remember some patch to poison object creation in Xrays until an explicit compartment is entered, but I don't know what happened to it.
Bobby, I don't recall the discussion you're referring to.  :(
(In reply to Boris Zbarsky [:bz] from comment #2)
> Bobby, I don't recall the discussion you're referring to.  :(

It was bug 751926. See in particular my proposal (1) in bug 751926 comment 4. This doesn't provide us a mechanism to get the Realm (something we still need to sort out), but it's a good enforcement mechanism to make sure we get it right.
So I have a perhaps totally naive proposal that at least will help me understand the problem (a lot of bug 751926 went over my head).

It sounds like for almost everything (ES6 Realms included) we want to be in the callee's compartment even when invoked from XrayWrapper.  The only problem is that we want to execute with chrome privileges (in the chrome-Xraying-content case).  In bug 751926 comment 3, bz proposed having two compartments (one for security, one for "everything else"), but bholley rejected this b/c it'd complicate the global security mental model.

So what I want to propose is a minor variation on this that is rather like what we did with (Hide,Unhide,Describe)ScriptedCaller and doesn't feel scary at all, at least to me:
 - In addition to a "current compartment" a JSContext has a "caller principal" (JSPrincipal*)
 - By default (without doing anything special), caller principal == current compartment's principal
   - Specifically, entering/leaving a compartment always sets the current principal to the newly-current compartment's principal
 - We add a JS::SetCallerPrincipal which straight-up clobbers 'current principal'
 - XrayWrapper enters the compartment of the callee always, but JS:SetCallerPrincipal's to the caller's principal (and unsets when returning, either explicitly or just via ~AutoCompartment)

What this whole proposal depends on, of course, is that this "caller principal" concept is enough for all the various ways we need to express we are running with chrome privileges.  Is this the case?
(In reply to Luke Wagner [:luke] from comment #4)
> It sounds like for almost everything (ES6 Realms included) we want to be in
> the callee's compartment even when invoked from XrayWrapper. The only
> problem is that we want to execute with chrome privileges (in the
> chrome-Xraying-content case). 

That's not really true. The whole reason that we can securely compute the subject principal from the compartment is that we know that any script we might run (or state we might inspect) is associated with that compartment (and its principal). The whole reason we know that Xrays are secure is that we don't (in general) enter the content compartment at all, so it's safe to hold onto the caller's principal.

So while something like JS::SetCallerPrincipal might seem convenient, it sends us back to the pre-CPG world of "caller privilege hijacking" security bugs.

In a world of fuzzy JS/DOM distinctions, it's true that there will be some places where we need to enter the target compartment. But we should strive to scope that JSAutoCompartment as narrowly as possible - pretty much just around the creation of the object itself. I think we can do that effectively in the bindings layer, and the assertion mechanism mentioned in comment 3 will enforce it.
(In reply to Bobby Holley (:bholley) from comment #5)
That argument is a bit to general for me to follow.

I guess what I don't get is how and where JS::GetCallerPrincipal() would be used (or even, for that matter, what is happening while executing Xray'd code that even depends on chrome privileges in the first place).  It seems like everything would, by default, use the current compartment which is content-privileged which seems like a safe default.
Luke, Xrays are used for cross-origin content-to-content access (on Window and Location), not just for chrome-to-content access.
(In reply to Boris Zbarsky [:bz] from comment #7)
> Luke, Xrays are used for cross-origin content-to-content access (on Window
> and Location), not just for chrome-to-content access.

Sure, those use cases make sense and, naively, I'd assume a JS::GetCallerPrincipal() would satisfy these use cases and that we're also already super-careful for this small set of cross-origin-exposed-props.  I guess I'm more interested in the chrome-Xraying-content case since it seems like there is a much bigger surface area which is I'd expect the source of bholley's concerns.
(In reply to Luke Wagner [:luke] from comment #6)
> (In reply to Bobby Holley (:bholley) from comment #5)

> I guess what I don't get is how and where JS::GetCallerPrincipal() would be
> used

I had assumed in your proposal that you intended it to affect the behavior of nsContentUtils::GetSubjectPrincipal(). If not, it adding it doesn't really fix the problem.

> (or even, for that matter, what is happening while executing Xray'd
> code that even depends on chrome privileges in the first place).

Any kind of C++ security check we do, a subset of which are here: mxr.mozilla.org/mozilla-central/search?string=IsCallerChrome

For example, chrome expects to be able to read tainted canvases and whatnot over Xrays.
Ah, I see, so there are several widely used utilities that would leak JS::GetCallerPrincipal all over the place.  Now, can you explain what you are concerned about happening with this JSPrincipal?  In the old days, it might get passed into to the creation of some random object or compiled script, but now all that is determined by the compartment.
> But we should strive to scope that JSAutoCompartment as narrowly as possible - pretty
> much just around the creation of the object itself. I think we can do that effectively
> in the bindings layer

Fixing bindings might be doable.  Doing it without a perf hit will require either threading the callee throughout the world or adding something that hangs off the JSContext and that Xrays set that allows quickly entering the callee compartment.

But as more and more things (promises!) move into the JS engine, we will have more and more people having to create JSAPI things manually, not leaning on the bindings.  And what that means is that every single piece of code that touches JSAPI, ever, needs to think about maybe being called over Xrays and carefully manage things if it is.  That's a pretty huge cognitive burden.

My goal here is that writing web-facing code should be as little of a footgun as possible, and vast swaths of that code have to worry about maybe being called via Xrays seems to me like a spectacular footgun.  :(

One thing I considered is having two compartments hanging off the JSContext, one of which is used for everything but allocating objects, while the other is used for allocating objects.  JSAutoCompartment sets both.  Xrays set the object-allocation compartment to the callee.

The unfortunate aspect of that setup is that it requires anyone who wants to allocate and then manipulate an object to immediately enter the compartment of the object being manipulated. Which is not really any better than entering the "callee" compartment before allocating the object, I guess...  It's just a bit clearer, in that you don't have to worry about Xrays or not; it's just a simple rule: if you just allocated an object, it might be in a random compartment; if you want to work with it, you have to enter its compartment.
(In reply to Luke Wagner [:luke] from comment #10)
> Now, can you explain what you
> are concerned about happening with this JSPrincipal?  In the old days, it
> might get passed into to the creation of some random object or compiled
> script, but now all that is determined by the compartment.

Xrayed method tries to manipulate JSAPI. Accidentally triggers content script. Content script runs with the Xray principal. And even if no content script runs, the caller might otherwise get confused and do something it didn't intend to do.

Basically, if superman wants to get drunk in a downtown bar, he has to leave his superpowers at the door.
(In reply to Bobby Holley (:bholley) from comment #12)
> Xrayed method tries to manipulate JSAPI. Accidentally triggers content
> script. Content script runs with the Xray principal.

Ahhh, I see the problem now, thanks for walking me through it.

Please humor me with one more iteration:
 - Start with the same proposal as before
 - When a JSContext has caller-principals explicitly set, it will either MOZ_CRASH or throw security exception if any code tries to run (we have good pinchpoints to do this)

So, with this, we'd get to execute Xrays in the callee compartment with caller privileges, but if we tried to run any code without AutoCompartment'ing, we'll throw/abort.  We can make this assertion even stronger by having, in DEBUG builds, non-Xray calls do SetCallerPrincipals(cx, GetCallerPrincipals(cx)), which is effectively a noop principals-wise, but puts cx into the asserting state so that we can hit the asserts even when Xrays aren't being used.

If I'm understanding correctly, this will avoid all the narrowly-scoped AutoCompartments that would be required when we stay in the caller's compartment for Xrays.
(In reply to Luke Wagner [:luke] from comment #13)
>  - Start with the same proposal as before
>  - When a JSContext has caller-principals explicitly set, it will either
> MOZ_CRASH or throw security exception if any code tries to run (we have good
> pinchpoints to do this)

As noted in comment 12, it's not just an issue of content script running. It's the much broader problem of interacting with untrusted JS objects. Properties can be redefined in strange ways. Prototype chains might lead unexpected (cross-origin) places.

Moreover, it totally muddies the water of our security machinery. For example, we'll end up in situations where the subject principal is system but the wrapper vetoes something (IsCallerChrome() && CheckedUnwrap(obj) == nullptr). Should we allow the operation, or not? It's not clear.

Doing something like this is basically like dummy frames - something that seems plausible at the time, but that eventually causes trouble because we lose simple hard invariants.

(In reply to Boris Zbarsky [:bz] from comment #11)
> One thing I considered is having two compartments hanging off the JSContext,
> one of which is used for everything but allocating objects, while the other
> is used for allocating objects.  JSAutoCompartment sets both.  Xrays set the
> object-allocation compartment to the callee.

I've been thinking about this, and I think something like this is probably the way to go.
 
> The unfortunate aspect of that setup is that it requires anyone who wants to
> allocate and then manipulate an object to immediately enter the compartment
> of the object being manipulated.

Actually, I don't think this is a problem. It's really the same situation with WebIDL wrapping: the underlying object may not be same-compartment, so we either allow the creation of security wrappers, or we accept the fact that the returned object may not be same-compartment. If I recall, you're in favor of the default being "wrap it", and letting the caller unwrap if they want the raw object. That way, the worst that happens in the Xray case is that some security wrapper blocks the operation, rather than having a compartment mismatch.

Basically, objects are supposed to be allocated in the current Realm, which is ambient state depending on nature of the most-recently-invoked Function. So the separate compartment for allocations make sense. It's the most-analagous-thing to what we do for WebIDL.
(In reply to Bobby Holley (:bholley) from comment #14)
From everything you're saying, it sounds like, by and large, we almost always want to operate with content principals b/c we are dealing with untrusted content.  I realize there are a ton of IsCallerChrome checks, but are all those necessary for the chrome-Xraying-content case or are they mostly for normal chrome JS doing chrome JS things?  That is, is there, by any chance, a small finite number of ways in which chrome-Xraying-content needs an exception if Xrays simply entered the content compartment and ran with content privileges?

> > One thing I considered is having two compartments hanging off the JSContext,
> > one of which is used for everything but allocating objects, while the other
> > is used for allocating objects.  JSAutoCompartment sets both.  Xrays set the
> > object-allocation compartment to the callee.
> 
> I've been thinking about this, and I think something like this is probably
> the way to go.

This seems to also muddy the waters since there are tons of ways to allocate objects in the engine and, necessarily, only some will use this second allocation compartment.  If we did this, it seems error-prone to just change the semantics of a select few existing JSAPIs and we'd instead introduce new explicit APIs with "InAllocationCompartment" in their name/contract.

Also: wouldn't this solution still have the problem that (1) compartment assertions would only catch bugs if the Xray wrappers were actually used/fuzzed, (2) random paths that escape into content code (that forget to AutoCompartment b/c it just works in non-Xray) would be cross-compartment/security bugs?
> If I recall, you're in favor of the default being "wrap it"

I don't know that I have an opinion yet...  If anything, I'm in favor of not having a default and forcing the caller to choose or something.

> That way, the worst that happens in the Xray case is that some security wrapper blocks
> the operation, rather than having a compartment mismatch.

That's fair as long as we have test coverage.  In particular, need to have Xray coverage for every single API, including the b2g-only ones...  I guess that's true pretty much no matter what we do here, in fact.

I believe our current Xray test coverage is somewhere between "spotty" and "nonexistent".

> since there are tons of ways to allocate objects in the engine

There are two ways to approach this, I think.

The first is to change all the engine internals.  The second is to enforce that all public (that includes friend) APIs that can end up allocating enter the allocation compartment before doing anything else.  I'm not sure how feasible either one is.

> and we'd instead introduce new explicit APIs 

We could try to do that, yes.  The danger there is whether we can end up with accidental allocations via other APIs...

Maybe we should make a list of the sorts of gcthing-allocating activities Gecko code is likely to actually engage in?

> compartment assertions would only catch bugs if the Xray wrappers were actually
> used/fuzzed

Yes.  Just like the current situation with nodes, say, as Bobby points out.
(In reply to Boris Zbarsky [:bz] from comment #16)
> > since there are tons of ways to allocate objects in the engine
> 
> There are two ways to approach this, I think.
> 
> The first is to change all the engine internals.  The second is to enforce
> that all public (that includes friend) APIs that can end up allocating enter
> the allocation compartment before doing anything else.  I'm not sure how
> feasible either one is.

The first approach sounds like it would immediately lead to cross-compartmet violations inside the engine as compartments mix.

The second option is interesting.  Considering that practically any JSAPI function that cares what the current compartment is can allocate, what this effectively means is basically a lazy version of EnterCompartment that says: here is a new compartment; don't enter it now, but enter it right before any JSAPI call (and leave it when the JSAPI call returns).

Considering that this is for a very Gecko-specific issue, one might hope the problem could be solved in Gecko (viz., by having Gecko AutoCompartment around all the relevant JSAPI calls).  Unfortunately it seems like there's no real pinch-point for that since there is so much widespread manual JSAPI usage.  You and Bobby talked a while ago by having all DOM/binding code go through a set of JSAPI wrappers that could ensure invariants (things like this).

> > compartment assertions would only catch bugs if the Xray wrappers were actually
> > used/fuzzed
> 
> Yes.  Just like the current situation with nodes, say, as Bobby points out.

I know, but some of the ideas (including the original one Bobby proposed) would catch errors in debug mode even without tests that exercised Xrays and I liked that property.
(In reply to Luke Wagner [:luke] from comment #17)
> The second option is interesting.  Considering that practically any JSAPI
> function that cares what the current compartment is can allocate, what this
> effectively means is basically a lazy version of EnterCompartment that says:
> here is a new compartment; don't enter it now, but enter it right before any
> JSAPI call (and leave it when the JSAPI call returns).

I think this proposal loses sight of the original problem. We don't want to do anything special for execute, call, getProperty, and other sorts of operations, because those already have sufficient context surrounding them to make sure that objects get allocated in the right place. The only actual problem here is somebody executing a bare JS_NewFoo or JS_CreateFoo. And the number of API entry points for this kind of thing are auditable and limited. And really, in the long run, most of this stuff should go through DOM-y wrappers anyway (like TypedArrays do).

> Considering that this is for a very Gecko-specific issue, one might hope the
> problem could be solved in Gecko (viz., by having Gecko AutoCompartment
> around all the relevant JSAPI calls).  Unfortunately it seems like there's
> no real pinch-point for that since there is so much widespread manual JSAPI
> usage.  You and Bobby talked a while ago by having all DOM/binding code go
> through a set of JSAPI wrappers that could ensure invariants (things like
> this).

I'm still in favor of this.
> The only actual problem here is somebody executing a bare JS_NewFoo or JS_CreateFoo

Well, and the subsequent JS_DefineProperty calls they might do, if any....

Maybe the right answer is in fact to just have typedarray-like DOM wrappers for all the places where we're talking to the JS engine and asking it to allocate stuff.  It wouldn't be all that hard to hook up; we just have to keep extending it any time new kinds of objects get added to the JS engine that DOM code might want to allocate.

So maybe we need a list of the sort of things DOM code might want to allocate.  Typed arrays are covered.  We need something for JS_NewObject (though usually people should be using dictionaries).  We need something for JS_NewArrayObject, though maybe we can just refactor the bit I'm adding in bug 991080 to handle this case and unhook it from Promise.

Anything else?
(In reply to Boris Zbarsky [:bz] from comment #19)
> Well, and the subsequent JS_DefineProperty calls they might do, if any....

I would hope there's not too much of that going on. See below.
 
> Maybe the right answer is in fact to just have typedarray-like DOM wrappers
> for all the places where we're talking to the JS engine and asking it to
> allocate stuff.

Agreed.

> It wouldn't be all that hard to hook up; we just have to
> keep extending it any time new kinds of objects get added to the JS engine
> that DOM code might want to allocate.

Yeah. I think the set of things is actually pretty limited.
 
> So maybe we need a list of the sort of things DOM code might want to
> allocate.  Typed arrays are covered.  We need something for JS_NewObject
> (though usually people should be using dictionaries).

Yeah, we should track down the use cases here. I don't think anybody should be doing this kind of thing directly. It's just too error prone.

> We need something for
> JS_NewArrayObject, though maybe we can just refactor the bit I'm adding in
> bug 991080 to handle this case and unhook it from Promise.

That sounds great.

> Anything else?

I'm exposing JS_NewWeakMapObject and friends in bug 990290, but the use case there is pretty internal (XBL data structures). I don't think we should be encouraging DOM APIs to do that.
Per policy at https://wiki.mozilla.org/Bug_Triage/Projects/Bug_Handling/Bug_Husbandry#Inactive_Bugs. If this bug is not an enhancement request or a bug not present in a supported release of Firefox, then it may be reopened.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → INACTIVE
You need to log in before you can comment on or make changes to this bug.