Closed Bug 1511359 Opened 9 months ago Closed 7 months ago

JSIID objects not 100% portable across privileged sandbox

Categories

(Core :: XPConnect, defect, P1)

defect

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox63 --- unaffected
firefox64 --- unaffected
firefox65 --- fixed
firefox66 --- fix-optional

People

(Reporter: neil, Assigned: Nika)

References

Details

(Keywords: regression)

Attachments

(1 file, 10 obsolete files)

47 bytes, text/x-phabricator-request
Details | Review
I am trying to write a WebExperiment. The extension API loads the code for this into a system principal sandbox. I then want to call a regular API from a module loaded with `ChromeUtils.import`. The API in question expects an object with a property that is an array of JSIID objects. It then creates an XPCOM object and tries to use `instanceof` to check that the XPCOM object implements the interfaces and add the necessary properties to the XPCWrappedNative. This works in Beta, but fails in Nightly.

In particular, I have determined that if code running in a system principal sandbox retrieves a JSIID from Components.interfaces, and passes that to code in a regular scope, then code in that scope can successfully test for that interface using `instanceof`, but the XPCWrappedNative does not gain any additional properties.
Conversely, it is also true that if code running in a regular scope retrieves a JSIID from Components.interfaces, and passes that to code in a system principal sandbox, then code in that scope can successfully test for that interface using `instanceof`, but the XPCWrappedNative does not gain any additional properties.

Code using `instanceof` with a JSIID retrieved from the same scope works as expected of course.

I have created a minimal test case displaying the behaviour. Simply evaluate the following line of code in the browser console:
[c = Components.classes["@mozilla.org/timer;1"].createInstance(), c instanceof Cu.evalInSandbox("Ci.nsITimer", Cu.Sandbox(Services.scriptSecurityManager.getSystemPrincipal())), c.init]
When evaluated in Firefox Beta, the result is Array(3) [ XPCWrappedNative_NoHelper, true, init() ]
When evaluated in Firefox Nightly, the result is Array(3) [ XPCWrappedNative_NoHelper, true, undefined ]
It sounds like maybe we're wrapping the LHS object into the compartment of the ID object before doing the check, rather than the reverse?
> It sounds like maybe we're wrapping the LHS object into the compartment of the ID object before doing the check, rather than the reverse?

That is in fact how it would work, yes.  The hasInstance hook is on the RHS object, and when calling class hooks we would be in the compartment of that object.

Presumably this used to work because when we handed the IID object to a different compartment we just created a new XPCNW for it in that compartment.  Now we create a CCW around it instead, and end up doing the hasInstance on an Xray in the above-described situation, I expect.

What I'm not sure about is what the right solution here is.  We could CheckedUnwrap the LHS in IID_HasInstance but that would (I think) change behavior in cases where people are purposefully passing Xrays.  Do we care?

I suppose we could restore the old passing-cross-compartment behavior for IIDs of just creating a separate object, but we'd need to keep a map of them, etc, which is not good.
(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #2)
> What I'm not sure about is what the right solution here is.  We could
> CheckedUnwrap the LHS in IID_HasInstance but that would (I think) change
> behavior in cases where people are purposefully passing Xrays.  Do we care?

I don't think unwrapping the LHS would help. For most XPCWNs, I think we'd generally just get the same object anyway, since most of them just get new per-compartment reflectors rather than wrappers.

I agree IID maps are probably a bad idea... though there's a small chance it may actually save memory if we have a bunch of duplicate IID objects as it is... Maybe we could somehow get away with storing a non-wrapper IID object in the wrapper map, though?

I think, ideally, we'd just special-case hasInstance for IIDs ro create a new same-compartment object rather than wrapping the LHS... but I have no idea how feasible that is.
> but I have no idea how feasible that is.

So I am pretty sure the entrypoint to this stuff is js::HasInstance, for the case we care about (where RHS is not a function).  When the RHS is a CCW, it ends up in Proxy::hasInstance and then CrossCompartmentWrapper::hasInstance.

One thing we could conceivably do is create a new type (or types if we need different security policies) of CCW for IIDs that override this hook to not do the wrap() call, I guess...
(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #2)
> Presumably this used to work because when we handed the IID object to a
> different compartment we just created a new XPCNW for it in that
> compartment.  Now we create a CCW around it instead

When did that change?

> and end up doing the
> hasInstance on an Xray in the above-described situation, I expect.

Note that hasInstance Xrays are getting better in bug 1504660, but that's not really relevant here.

> What I'm not sure about is what the right solution here is.  We could
> CheckedUnwrap the LHS in IID_HasInstance but that would (I think) change
> behavior in cases where people are purposefully passing Xrays.  Do we care?

I'm not sure how it would change behavior. The IID hasintance code just gets the isupports and does a QI. I don't think XrayWrappers make a difference there, unless I'm missing something?

So CheckedUnwrap seems reasonable to me. It does seem like it would also allow doing IID instanceof on content objects, which I'm not sure would work before?
 
> I suppose we could restore the old passing-cross-compartment behavior for
> IIDs of just creating a separate object, but we'd need to keep a map of
> them, etc, which is not good.

Yeah, I don't want to overrotate and do anything complex too here.
(In reply to Bobby Holley (:bholley) from comment #5)
> (In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #2)
> > Presumably this used to work because when we handed the IID object to a
> > different compartment we just created a new XPCNW for it in that
> > compartment.  Now we create a CCW around it instead
> 
> When did that change?

After bug 1477432, when IIDs stopped being XPCWNs.

> > What I'm not sure about is what the right solution here is.  We could
> > CheckedUnwrap the LHS in IID_HasInstance but that would (I think) change
> > behavior in cases where people are purposefully passing Xrays.  Do we care?
> 
> I'm not sure how it would change behavior. The IID hasintance code just gets
> the isupports and does a QI. I don't think XrayWrappers make a difference
> there, unless I'm missing something?

It's not an issue of X-rays so much as the compartment we access the LHS object in. We add the tear-off for the RHS interface to the wrapper in the caller compartment, so when we enter the compartment of the ID object, and it's a separate compartment from the calling script, the LHS object remains unchanged for the caller.
I suppose the other possibility is to just revisit the idea of creating all NoHelper XPCWNs in the shared JSM scope... I'm still not sure what the performance or memory effects of that will be at this point, though.
Ah, so the key bit is [1], wherein people are relying on instanceof doing an implicit QI? Can't we fix that by just making the drive-by QI do a CheckedUnwrap first (perhaps asserting that the target is also system)?

[1] https://searchfox.org/mozilla-central/rev/8f0db72fb6e35414fb9a6fc88af19c69f332425f/js/xpconnect/src/XPCJSID.cpp#443
(In reply to Bobby Holley (:bholley) from comment #8)
> Ah, so the key bit is [1], wherein people are relying on instanceof doing an
> implicit QI?

Yup.

> Can't we fix that by just making the drive-by QI do a CheckedUnwrap first (perhaps asserting that the target is also system)?

I don't think so. At that point, we generally don't have a wrapper at all since we just wrap for the new compartment by creating a whole new XPCWN (or getting an existing one from the cache). And we don't have any way to figure out what compartment the original WN came from (no sane way, anyway...).
Oh I see. Yuck.

I guess the other question is: do we need to support the drive-by QI when doing instanceof? Given that the test suite is green, and there's no corpus of addons we might break, can't we just tell people to do the QI explicitly?

The no-home-compartment-for-XPCWN issue does continue to bite us every now and then, and I'm certainly open to the idea of pinnig them somewhere. But it also doesn't feel like the most important thing to spend time on right now.
(In reply to Bobby Holley (:bholley) from comment #10)
> I guess the other question is: do we need to support the drive-by QI when
> doing instanceof? Given that the test suite is green, and there's no corpus
> of addons we might break, can't we just tell people to do the QI explicitly?

I think we probably do. Having it only work sometimes feels like a pretty big footgun, and making it stop working altogether would require changing *a lot* of code that depends on it. I suspect a lot of it isn't even really tested...

And even if we did that, I'm not sure it would be a good idea. Having `instanceof` return true and then requiring a separate QI to be sure the interface is really available feels like enough of a footgun on its own, especially given how easy it already is to write code that works sometimes because of cached interfaces, but then breaks at random if a GC happens at the wrong time.

QI calls are also not especially cheap, so having `instanceof` checks always followed by QI that does more or less the same thing will probably add measurable overhead in some places.
Ok.

It also seems like this is a somewhat temporary problem, since the long-term plan here is to have compartment-per-origin here, and thus no wrappers at all, right? Could we accelerate compartment-per-origin for the System Principal case, where a lot of the security issues don't apply?
(In reply to Bobby Holley (:bholley) from comment #12)
> Could we accelerate compartment-per-origin for the System
> Principal case, where a lot of the security issues don't apply?

Oh, I'd love to get CPO enabled somewhere :)

Boris, does this seem reasonable to you too?
Flags: needinfo?(bzbarsky)
That sounds like a good plan to me. Hopefully we don't have any system principal sandboxes that really do need their own compartments... They all get unique compartment X-ray expandos as it is, though I can't think of any that actually rely on them.
I think a single compartment for all system principals is worth at least trying, yes.  Doing it for system-principal sandboxes is a _little_ worrying, but might be ok.

It would mean we will have a single zone for all system stuff.  Are we OK with that?
Flags: needinfo?(bzbarsky)
In particular, presumably we would only be able to do this in cases when we're already using the system zone for the sandbox (so "freshZone" was not passed and "sameZoneAs" was not passed or is a system-zone object; we'd need new JSAPI to detect this last).

Right now system-principal windows are _not_ in the system zone (they get separate per-window zones), but presumably we would need to change that, right?
That said, the only uses of "freshZone" seem to be in tests.  I wonder whether we could just remove those.

"sameZoneAs" is only used for non-test purposes in EnsureContentXBLScope which will never be a system-principal situation.  Anyway, if "freshZone" is eliminated and system-principal windows (and system-principal compartments in general) go in the system zone, then system principal would always imply system zone and we wouldn't need to worry about "sameZoneAs".
(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #16)
> Right now system-principal windows are _not_ in the system zone (they get
> separate per-window zones), but presumably we would need to change that,
> right?

I think that would be desirable. I was under the impression that system-principal windows were in the system zone already. I do know for a fact that cross-compartment overhead for system principal windows is already very high (both in terms of memory and in terms of perf), so I imagine cross-zone overhead if anything makes the situation much worse.
Depends on: 1512029
Jan, do you think this (based on the work in bug 1512029) is likely to be good for uplift to 65 or should it stay in 66?
Flags: needinfo?(jdemooij)
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #19)
> Jan, do you think this (based on the work in bug 1512029) is likely to be
> good for uplift to 65 or should it stay in 66?

Uplift is unlikely. There are a lot of dependencies there and these changes are big enough that they deserve a Nightly release cycle IMO.
Flags: needinfo?(jdemooij)
We should probably back bug 1477432 out from beta then?
Flags: needinfo?(nika)
I think we need to track this for 65, and solve it by backing out bug 1477432 on beta.
(In reply to Bobby Holley (:bholley) from comment #21)
> We should probably back bug 1477432 out from beta then?

I think backing that out from beta is probably reasonable enough. I can't think of anything that immediately depends on it. And I'm still not sure what code may randomly get bitten by this if it runs at the wrong time.
Note that bug 1512029 will only change things for system-principal *sandboxes*, not other globals (yet). Do we need a more general fix here?
(In reply to Jan de Mooij [:jandem] from comment #24)
> Note that bug 1512029 will only change things for system-principal
> *sandboxes*, not other globals (yet). Do we need a more general fix here?

We do, yes. We're going to hit this for chrome windows that get ID objects from JSMs, and vice versa.
The issue here is that we call |hasInstance| on the IID object, which then causes the XPCWN to be re-wrapped into the new scope, creating a brand-new XPCWN. So we QI on the wrong wrapper.

It wouldn't be super expensive to copy that weird behaviour and re-wrap the IID object in each compartment, especially because we already have a global lookup table in the form of the Components.interfaces object. I'm not sure how to customize the wrapping behaviour right now though - I'd have to read the code again.

I'm also OK with just backing this out of beta. It's not super essential and shouldn't break anything too badly, although doing the backout may be tricky with the reformatting and other changes in the area.

One last thing: If we do decide to merge all system principals into the same compartment, we should make the Components object be compartment-global, rather than per-global, so that we don't end up with N copies of the IIDs per compartment.
Flags: needinfo?(nika)
(In reply to :Nika Layzell from comment #26)
> It wouldn't be super expensive to copy that weird behaviour and re-wrap the
> IID object in each compartment, especially because we already have a global
> lookup table in the form of the Components.interfaces object. I'm not sure
> how to customize the wrapping behaviour right now though - I'd have to read
> the code again.

We talked about that, but we can't really do that without a map to avoid creating a bunch of duplicate ID objects, or abusing an existing wrapper map to hold non-wrapper objects.

I don't really want a new map, and abusing existing wrapper maps makes me nervous.

And since we know compartment-per-origin will obviate this problem, I'm not sure there's a point in investing effort into anything like that, anyway.

> One last thing: If we do decide to merge all system principals into the same
> compartment

We've already made that decision.

> we should make the Components object be compartment-global,
> rather than per-global, so that we don't end up with N copies of the IIDs
> per compartment.

That's probably a good idea, but I'm not sure it really affects this one way or the other.
Priority: -- → P1
Nika, can you own the backout?
Flags: needinfo?(nika)
Attachment #9032453 - Attachment description: Bug 1511359 - Backout bug 1477432 Part 11 due to nsJSID compartment concerns → Bug 1511359 - Backout bug 1477432 due to nsJSID compartment concerns,
Attachment #9032454 - Attachment is obsolete: true
Attachment #9032455 - Attachment is obsolete: true
Attachment #9032456 - Attachment is obsolete: true
Attachment #9032457 - Attachment is obsolete: true
Attachment #9032458 - Attachment is obsolete: true
Attachment #9032459 - Attachment is obsolete: true
Attachment #9032460 - Attachment is obsolete: true
Attachment #9032461 - Attachment is obsolete: true
Attachment #9032462 - Attachment is obsolete: true
Attachment #9032463 - Attachment is obsolete: true
(sorry about the spam - accidentally pushed the full 11-backout stack instead of the combined one)
Flags: needinfo?(nika)
Comment on attachment 9032453 [details]
Bug 1511359 - Backout bug 1477432 due to nsJSID compartment concerns,

[Beta/Release Uplift Approval Request]

Feature/Bug causing the regression: Bug 1477432

User impact if declined: Potential for issues around passing IID objects between JS compartments.

Is this code covered by automated tests?: Unknown

Has the fix been verified in Nightly?: No

Needs manual test from QE?: No

If yes, steps to reproduce: 

List of other uplifts needed: None

Risk to taking this patch: Low

Why is the change risky/not risky? (and alternatives if risky): Backout of patch which caused the issue

String changes made/needed:
Attachment #9032453 - Flags: approval-mozilla-beta?
Assignee: nobody → nika
Comment on attachment 9032453 [details]
Bug 1511359 - Backout bug 1477432 due to nsJSID compartment concerns,

[Triage Comment]
Fixes potential for issues around passing IID objects between JS compartments by backing out the regressing patch from Beta. Fx66+ are being handled via other bugs, so nothing needs to be done on trunk here. Approved for 65.0b6.
Attachment #9032453 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
This was only backed out on beta, shouldn't be resolved.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Sorry, guess I misunderstood the relationship with bug 1512029.
Status: REOPENED → NEW
I verified the case in comment 0 will be fixed by bug 1514210 and bug 1512029. Not all our system principal globals will be in the shared compartment, though.

Are we OK closing this now? The offending patches are now on beta, but so is the "single compartment for system-principal sandboxes and windows".

Flags: needinfo?(bobbyholley)

I think so - kmag, WDYT?

Flags: needinfo?(bobbyholley) → needinfo?(kmaglione+bmo)

Yeah, I think we've covered all of the cases I'm really worried about.

Status: NEW → RESOLVED
Closed: 8 months ago7 months ago
Flags: needinfo?(kmaglione+bmo)
Resolution: --- → WORKSFORME

The status for 66 is a little confusing here. I'm going to mark it wontfix and assume that you all will file new bugs for followup on any remaining issues.

You need to log in before you can comment on or make changes to this bug.