Last Comment Bug 351370 - XSS by setting img.src to javascript: URL
: XSS by setting img.src to javascript: URL
Status: VERIFIED FIXED
[sg:high][at risk] needs branch landing
: verified1.8.0.9, verified1.8.1.1
Product: Core
Classification: Components
Component: Security (show other bugs)
: Trunk
: x86 Windows XP
: -- normal (vote)
: ---
Assigned To: Johnny Stenback (:jst, jst@mozilla.com)
:
Mentors:
Depends on: 355365 368655 CVE-2007-0994
Blocks: 350433 353732
  Show dependency treegraph
 
Reported: 2006-09-04 19:24 PDT by moz_bug_r_a4
Modified: 2007-08-10 11:26 PDT (History)
12 users (show)
dveditz: blocking1.7.14?
dveditz: blocking‑aviary1.0.9?
dveditz: blocking1.9?
mtschrep: blocking1.8.1-
dveditz: blocking1.8.1.1+
dveditz: blocking1.8.0.9+
jwalden+bmo: in‑testsuite?
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Patch that shows what's needed to make imglib take an owner (3.87 KB, patch)
2006-09-11 17:52 PDT, Johnny Stenback (:jst, jst@mozilla.com)
no flags Details | Diff | Splinter Review
Make js urls from different origins evaluate in a sandbox. (19.68 KB, patch)
2006-09-13 23:01 PDT, Johnny Stenback (:jst, jst@mozilla.com)
no flags Details | Diff | Splinter Review
Updated diff for sandboxing. (20.57 KB, patch)
2006-09-19 21:59 PDT, Johnny Stenback (:jst, jst@mozilla.com)
mrbkap: review+
bzbarsky: superreview+
Details | Diff | Splinter Review
Some testcases (1.13 KB, text/html)
2006-09-21 19:47 PDT, Boris Zbarsky [:bz]
no flags Details
Better testcases (2.02 KB, text/html)
2006-09-21 20:00 PDT, Boris Zbarsky [:bz]
no flags Details
Patch (against a clean trunk) that was checked in on the trunk. (22.90 KB, patch)
2006-09-23 10:22 PDT, Johnny Stenback (:jst, jst@mozilla.com)
no flags Details | Diff | Splinter Review
Patch for the 1.8.0 branch. (23.93 KB, patch)
2006-11-28 14:36 PST, Johnny Stenback (:jst, jst@mozilla.com)
mrbkap: review+
bzbarsky: superreview+
dveditz: approval1.8.0.9+
dveditz: approval1.8.1.1+
Details | Diff | Splinter Review

Description moz_bug_r_a4 2006-09-04 19:24:07 PDT
1. Get an image element (I) that was created in a document that was loaded in
   another window (W).
2. Load an other domain page (X) into W.
3. Set I.src to a javascript: URL.

A script is executed in the context of X, and its principal is a null principal
(and, on branches, it's a codebase principal whose URI is "javascript:...").
This means that the script cannot directly access properties of the X's global
scope.  But the script can create a function that has the X's principal by
using a named function expression or a nested function.

  img.src = "javascript:alert(1)";
  -> Permission denied to get property Window.alert

  img.src = "javascript:(function() { alert(1) }).call('')";
  -> Permission denied to get property Window.alert

  img.src = "javascript:(function f() { alert(1) }).call('')";
  -> ok

  img.src = "javascript:(function() {\
    (function() { alert(1) }).call(''); }).call('')";
  -> ok

The trunk, fx2.0, fx1.5.0.7, fx1.0.8 and moz1.7.13 are affected.
Comment 2 Blake Kaplan (:mrbkap) 2006-09-04 23:23:35 PDT
I looked into this some tonight, and I think the problem is basically that we compile the javascript: url with principals that cannot access the given scope chain. The javascript engine doesn't check (in either the anonymous function object or the named function object case) whether the function that we're cloning can access the top of the scope chain.

So what happens is that when we run the javascript: url, it starts out with null principals, but we then clone the named function object (more about why the named vs anon funobj case matters in a bit) which means that when looking for the subject principals during the time that it's running, caps uses its object principals which, due to the cloning, is the window's principals.

The reason that it matters whether we use a named function object or an anonymous one is that in the anonymous function object case, we optimize the function cloning away (see the OBJ_GET_PARENT(...) != parent check in the JSOP_ANONFUNOBJ case), meaning that caps treats the resulting function like any other, and uses the (null) principals that we compiled into the javascript: url instead of the object principals leading to the window. We don't do this optimization in the named fun object case, meaning that do get the object principals.

I'm not sure exactly how we should fix this -- more origin tracking would save us (since we wouldn't fall into the "give the javascript: url null principals" case) ... alternatively, we could check to make sure that the function cloning won't change a function's object principals from its script principals, but that might be too restrictive.
Comment 3 Boris Zbarsky [:bz] 2006-09-06 23:59:54 PDT
> more origin tracking would save us

You mean allow img.src to access the page?  We don't want that -- that causes XSS issues.

Comment 4 Blake Kaplan (:mrbkap) 2006-09-07 07:32:20 PDT
(In reply to comment #3)
> You mean allow img.src to access the page?  We don't want that -- that causes
> XSS issues.

Well, if we could note that the javascript: url was originated by script running in the page with bugzilla's principals, we'd deny access to run the javascript: url at all and avoid having to use the null principal.
Comment 5 Boris Zbarsky [:bz] 2006-09-07 08:39:15 PDT
> if we could note that the javascript: url was originated by script
> running in the page with bugzilla's principals

Ah, ok.  Yeah.  But this is Hard.  :(
Comment 6 Johnny Stenback (:jst, jst@mozilla.com) 2006-09-08 17:07:39 PDT
Wouldn't passing in an owner (principal) to imgLoader::LoadImage() and have it set the channel's owner to it do just that? A local test (on the trunk) seems to show that that does indeed work and blocks the javascript: URL from running at all in this case. Or are there other complications that I didn't notice in imgLoader that makes this harder than it seems?
Comment 7 Jonas Sicking (:sicking) No longer reading bugmail consistently 2006-09-08 17:11:20 PDT
The part here that is the real bug IMHO is the fact that code compiled with null principal ends up getting the principals of the window. Seems like the entire idea of null principals is twarted if you can get a new principal by just doing
 (function f() {...}).call()

Why can't the cloned function get the same principal as the script the original function was in?
Comment 8 Jonas Sicking (:sicking) No longer reading bugmail consistently 2006-09-08 17:38:21 PDT
Oh, btw, i can reproduce this in 1.5.0.6 too, where we don't give javascript uris in images a null principal, right? Though maybe we there fall into a different trap, but with the same result.
Comment 9 Boris Zbarsky [:bz] 2006-09-08 19:59:16 PDT
> Wouldn't passing in an owner (principal) to imgLoader::LoadImage()

Which principal?  How would that help?

> where we don't give javascript uris in images a null principal

We give them something equivalent.  A "null principal" on trunk is just a one-off codebase principal.  On branch we use an almost-one-off codebase.  They should be identical from the POV of the JS engine in terms of what they can and cannot do.
Comment 10 Blake Kaplan (:mrbkap) 2006-09-08 20:39:47 PDT
(In reply to comment #7)
> Why can't the cloned function get the same principal as the script the original
> function was in?

Because cloned function objects don't use the principals compiled into the script. Instead, we find the object principal of the function object via the parent link (like any other object), and given that the parent of the cloned function object is the scope chain it's running in (the window object), so we lose.
Comment 11 Boris Zbarsky [:bz] 2006-09-08 20:52:37 PDT
So the real conceptual issue is that we're assuming that there's one principal per scope object, which is kinda not the case here...
Comment 12 Brendan Eich [:brendan] 2006-09-08 21:26:42 PDT
(In reply to comment #9)
> > Wouldn't passing in an owner (principal) to imgLoader::LoadImage()
> 
> Which principal?

The subject principal, the unnamed actor doing the verbs in comment 0's steps 1 through 3.  Let's call it S, as opposed to moz_bug_r_a4's X for the domain that forms the main part of the victim codebase principal.

jst's right to suggest this, but even if we use some fake principal, it ought not match the inner window's object principal that nsJSThunk::EvaluateScript tests it against.  Something else is wrong.

To answer sicking's question:

Closures use their static scope to find their principals.  It's not a matter of what subject calls them, but of the object in which their definition, statement, or expresson is evaluated.  That they are precompiled (sometimes in a different object) is an optimization.

If we don't want to form an evil closure in a victim object, we shouldn't evaluate it in that context.  Just as you don't <script src="evil.org/x.js"/> without taking the consequences.  Static scope <=> security context in JS.

The problem here is that native code in nsJSProtocolHandler.cpp is using the wrong static scope for the evaluation.

My question is why we allow S to inject code essentially "into" X's inner window (by forming closures scoped by it) just because it has its hands on an image that was in a previous inner window (and therefore still parented by that previous and different inner window -- right?).

It may be that because S is the object principal of the image (S loaded the document that contained the image, so its parent slot reached the previous inner window) we mistakenly think any closures in the javascript: URL will capture the scope chain terminating in that previous inner window whose object principal is S. But the javascript: URL handler must be evaluating using the present inner window, the one whose principal is victim X.

It seems that nsJSThunk::EvaluateScript must be getting the previous inner window, not the present one, from the channel when it calls:

    NS_QueryNotificationCallbacks(aChannel, globalOwner);

Otherwise, the later securityManager->CheckSameOriginPrincipal(principal, objectPrincipal) would not succeed.  Or perhaps something else is going wrong, but in any event we shouldn't even let this evaluation take place using the wrong static scope, and I don't believe it's hard to fix this bug.

Question: what is the objectPrincipal returned by the securityManager at line 209 of current nsJSProtocolHandler.cpp, and what is the (channel owner, supposedly subject) principal QI'd at line 198.

/be
Comment 13 Boris Zbarsky [:bz] 2006-09-08 21:34:04 PDT
> why we allow S to inject code

Because the logic in nsJSThunk::EvaluateScript looks more or less like this:

if (channel has owner) {
  QI owner to principal
  Check whether this principal is same-origin with the object principal
    of the global we plan to run on.
} else {
  Set principal to null principal.
}

> Otherwise, the later securityManager->CheckSameOriginPrincipal(principal,
> objectPrincipal) would not succeed.

That check isn't even being made in this case.

> and what is the (channel owner, supposedly subject) principal

In this case, null.
Comment 14 Brendan Eich [:brendan] 2006-09-08 21:35:41 PDT
(In reply to comment #11)
> So the real conceptual issue is that we're assuming that there's one principal
> per scope object, which is kinda not the case here...

No, it is the case: we store a principal per inner window (or document, same one to one relationship).

I just mid-air'ed.  Why is the owner null?

/be
Comment 15 Boris Zbarsky [:bz] 2006-09-08 21:40:21 PDT
So I was going to change the code to create the null principal before doing the same-origin check, but then I realized that's equivalent to simply not evaluating javascript: URIs for which we have no subject principal.

Note that this would mean that

   <img src="javascript:stuff to generate image data and return it">

(which never actually touches the global object in question) would stop working.  Currently it works.  If we can't safely do what we're doing now, I think we should indeed make this stop working until such a time as we have better origin tracking... or something.  Note that we'd want to continue not evaluating <img> srcs with the page's principal, imo.

> No, it is the case: we store a principal per inner window

Yes, but we have a separate principal for this script we're compiling.  So now we have one inner window, but two principals floating around.  Our system can't deal with this.

> Why is the owner null?

This keeps coming up every time we talk about javascript:.  Once again: All owners are always null, except for loads of certain URIs happening in docshells.  Period.  Again, the only time the channel might have owner is when:

1)  It's being loaded in a docshell (frame/iframe/browser)
2)  The URI is about:blank, data:, or javascript:

This channel is not being loaded in a docshell.  So it has a null owner.
Comment 16 Brendan Eich [:brendan] 2006-09-08 22:23:27 PDT
(In reply to comment #15)
> Note that this would mean that
> 
>    <img src="javascript:stuff to generate image data and return it">
> 
> (which never actually touches the global object in question) would stop
> working.  Currently it works.

We should not break this.  We can fix it.  It's not hard :-P.

> > No, it is the case: we store a principal per inner window
> 
> Yes, but we have a separate principal for this script we're compiling.

You said we have more than one principal per scope object.  I said not so.  Now you say:

> So now
> we have one inner window, but two principals floating around.

Define "floating around".  The question we were disputing was whether there can be more than one subject principal safely acting on an object under same-origin.  The answer is no, and since we have one principal per final scope object, we must have a bug.  Indeed, any time two different non-system principals mix as subject or object in any window (except for a few cases where XSS property setting is allowed and except for document gets), we have a bug.

> Our system can't deal with this.

Right.  I'm saying we must prevent the evil subject principal S from evaluating. Then we are back to one trust label for all data and code in a container.

> > Why is the owner null?
> 
> This keeps coming up every time we talk about javascript:.  Once again: All
> owners are always null, except for loads of certain URIs happening in
> docshells.  Period.  Again, the only time the channel might have owner is when:
> 
> 1)  It's being loaded in a docshell (frame/iframe/browser)
> 2)  The URI is about:blank, data:, or javascript:
> 
> This channel is not being loaded in a docshell.  So it has a null owner.

Yes, but why can't we fix this as jst proposed?

/be

Comment 17 Boris Zbarsky [:bz] 2006-09-08 22:45:12 PDT
> It's not hard :-P.

I must be missing something then.

> Define "floating around".

"That this code ends up having to work with", if you prefer...  The subject (which is not known, hence defaults to null principal), and the object (window).

> The question we were disputing

I wasn't disputing anything.  I merely stated that we're effectively trying to mix two different principals in the same global scope.

> we must have a bug.

I'll buy that.  ;)

> Indeed, any time two different non-system principals mix as subject or object
> in any window

This is a problem, by the way -- XBL2 depends on such mixing being allowed in controlled ways.

> I'm saying we must prevent the evil subject principal S from evaluating.
> Then we are back to one trust label for all data and code in a container.

Agreed.  We keep violently agreeing on all points of this except "it's not hard", I think.  ;)

> Yes, but why can't we fix this as jst proposed?

I was trying to clarify exactly what jst proposed with my question in comment 9.  Could we get back to answering that question?  Exactly which principal do we propose to pass to LoadImage and from where?  How do we propose to prevent XSS attacks on websites that allow users to post images in the process?

Again, I feel like we keep having the same conversation over and over again...  It's starting to be a waste of my time and yours.  The way forward is to continue work on that document you started....
Comment 18 Boris Zbarsky [:bz] 2006-09-08 22:50:18 PDT
Just to make it clear, we could perhaps handle this particular case by making setting of .src do something different from setting the "src" attribute security-wise.  But we'd still have the problem where having a static <img src="...."> wouldn't work.  Given the XSS issues and our restrictions on multiple principals interacting with one scope object, I see no way to make it work.

Quite frankly, I think we should just break javascript: for <img src>.  Some testing in other browsers would be needed, but for reference IE7 doesn't allow such javascript: to touch the main page, if it even executes it at all (I didn't see any obvious security errors).
Comment 19 Brendan Eich [:brendan] 2006-09-08 22:56:28 PDT
(In reply to comment #17)
> > Indeed, any time two different non-system principals mix as subject or object
> > in any window
> 
> This is a problem, by the way -- XBL2 depends on such mixing being allowed in
> controlled ways.

I agree it is a problem.  Does the XBL2 advert to it?

> Agreed.  We keep violently agreeing on all points of this except "it's not
> hard", I think.  ;)

But then in the next comment you seem to say we could do the obvious (to me, sorry) thing of propagating principal S from JS through the owner to the code in nsJSThunk::EvaluateScript, as we do for other script-called setters.

> I was trying to clarify exactly what jst proposed with my question in comment
> 9.  Could we get back to answering that question?  Exactly which principal do
> we propose to pass to LoadImage and from where?

S, from the setter for img.src.

> How do we propose to prevent
> XSS attacks on websites that allow users to post images in the process?

I have said this before too, and I'm sure you've read it.  We should not mix up XSS policy with de-facto restrictons imposed by doomed user-generated content sanitization hacks and hosting security flaws.

> Again, I feel like we keep having the same conversation over and over again... 
> It's starting to be a waste of my time and yours.  The way forward is to
> continue work on that document you started....

I'll have time for that some day soon, but I don't think that's needed to fix this bug.  That model or something like it will be needed for XBL2, but for this bug I think we need to get S passed through via the channel owner.

/be
Comment 20 Brendan Eich [:brendan] 2006-09-08 23:03:27 PDT
Sorry to repeat stuff from elsewhere, but I think we should not break javascript: URLs set as image src statically (by markup) or dynamically (by script).  We seem to agree that the dynamic case can be handled by the JS-invoked setter for the src property getting the subject principal and setting it as the imglib channel owner (sorry if I'm mangling terminology).

For the static (markup) case, why can't we set the owner otherwise?  Why can only docshell loads of javascript: URLs set the owner?  Here I'm asking for education or a reminder (if I once knew and forgot).  And mainly, since it's all software, what is hard (rather than tedious) in fixing this?

I agree mixed trust labels in the same trust domain is hard; that's being studied still with interesting research, and models based on the lambda calculus with a labeled data model and a semi-lattice of trust labels in which control and data flows join or meet.  That's the long-haired stuff.

But here, in this bug, we should be upholdng same-origin and preventing S from starting the evaluation in the first place.  There should never be multiple principals flying around. ;-)

/be
Comment 21 Brendan Eich [:brendan] 2006-09-08 23:14:00 PDT
(In reply to comment #19)
> (In reply to comment #17)
> > > Indeed, any time two different non-system principals mix as subject or object
> > > in any window
> > 
> > This is a problem, by the way -- XBL2 depends on such mixing being allowed in
> > controlled ways.
> 
> I agree it is a problem.  Does the XBL2 advert to it?

The XBL2 spec, I meant to write.

/be
Comment 22 Boris Zbarsky [:bz] 2006-09-08 23:14:20 PDT
> I agree it is a problem.  Does the XBL2 advert to it?

The problem is that we can't handle it.  With XBL2, nodes with different origins mix in the same window and share a single presentation.

> as we do for other script-called setters

The basic problem is that we have no good way of telling whether a setter is in fact script-called.  Right now we have methods where we effectively document that no one will ever call them from non-script (but SetAttribute sure doesn't fall in that category).  For everything else we're at various levels of bugginess due to what we do or don't assume about where we're called from.

In this case we could easily declare nsHTMLImageElement::SetSrc as only callable from JS and pass S there.  But then the POC could just use SetAttribute.  And we can't declare SetAttribute only callable from JS...  One other thing that keeps coming up seems to be that people assume we have this infallible knowledge of who's currently acting. We don't.  Not even close.

We could pass the image node's principal in for S, which is probably the sanest thing to do in general.  But this leads us back to XSS issues.

> S, from the setter for img.src.

Again, the basic problem is we generally don't know who the setter is.  Imo.  Or rather, we don't know whether the JS stack has anything to do with the setter.

> with de-facto restrictons imposed by doomed user-generated content
> sanitization hacks

OK.  But that doesn't answer my question.  What's our story on user-provided images?  "Sites do this at your own risk, you're on your own?"  Or something else?  If something else, what's the concrete security model we're proposing?  That's all I'm trying to get a straight answer to.
Comment 23 Boris Zbarsky [:bz] 2006-09-08 23:20:03 PDT
> We seem to agree that the dynamic case can be handled by the
> JS-invoked setter for the src property getting the subject principal

We agree that this particular testcase can be fixed this way.  Modifications of the testcase to use SetAttribute cannot.

> For the static (markup) case, why can't we set the owner otherwise? 

I'd like an answer to the last question in comment 22 before trying to give a coherent answer to this; the answer will depend strongly on the answer to that question.

> Why can only docshell loads of javascript: URLs set the owner?

Again, end of comment 22 matters here.

> preventing S from starting the evaluation in the first place

_This_ I agree with.  The problem is that I feel that there are cases when we want to allow evaluation but not allow access.... and we have no way to do that, given the function cloning thing.  Does that make any sense?
Comment 24 Brendan Eich [:brendan] 2006-09-08 23:26:34 PDT
(In reply to comment #22)
> > I agree it is a problem.  Does the XBL2 advert to it?
> 
> The problem is that we can't handle it.  With XBL2, nodes with different
> origins mix in the same window and share a single presentation.

And code from different origins mix too -- nodes == codes (methods, handlers).  If it were only data, it would be easier.

But I repeat that the XBL2 problem is not this bug.  This is a same-origin hole.

> > as we do for other script-called setters
> 
> The basic problem is that we have no good way of telling whether a setter is in
> fact script-called.

I'm beginning to like generated glue code for the JS/DOM binding, then.  We ought let XPConnect obscure the situation.  Or any other layer.  If the issue is script security then the scripted call into native code has to be mediated by one layer that sets the right owner.  Period, end of story, anything else is a bug.  So how would we fix this bug, if we were starting more from a clean slate but still using XPConnect?

> We could pass the image node's principal in for S, which is probably the sanest
> thing to do in general.  But this leads us back to XSS issues.

Do you mean user-generated content doomed sanitization mitigation, instead of XSS? If not, what XSS issues?

> > S, from the setter for img.src.
> 
> Again, the basic problem is we generally don't know who the setter is.  Imo. 
> Or rather, we don't know whether the JS stack has anything to do with the
> setter.

There can be two cases: the JS stack does indicate the subject principal, from the top-most scripted frame; the stack does not, because some native code needs to override that principal, or that principal is unrelated to the native code that is running on top of the script.  Is this why we push different (sometimes null) thread-context stack items?  Is it too hard to add more auto-pushers?

>> > with de-facto restrictons imposed by doomed user-generated content
> > sanitization hacks
> 
> OK.  But that doesn't answer my question.  What's our story on user-provided
> images?  "Sites do this at your own risk, you're on your own?"  Or something
> else?  If something else, what's the concrete security model we're proposing? 
> That's all I'm trying to get a straight answer to.

I'd like to separate that from this bug, if you don't mind.  My answer for now is "do what IE6 did", but that's mainly to buy us time and interop until a new age dawns.  Mixed trust labels on the server (user-generated content) is as hairy as on the client.  Again, that ain't this bug and we don't have to solve all problems at once.  Let's divide and conquer.

/be
Comment 25 Brendan Eich [:brendan] 2006-09-08 23:27:43 PDT
It should go without saying that I'm not interested in patching just one symptom, or one avenue of attack, where obvious variations are unpatched.

I just want that subject principal propagated in the path that counts, the scripted path.

/be
Comment 26 Jonas Sicking (:sicking) No longer reading bugmail consistently 2006-09-11 14:57:08 PDT
I agree, just fixing <img src> will most likely leave plenty of holes.

I still don't get why setting a different principal than the null principal increases security. That seems very unintuitive for me and a serious shortcomming of the way we treat null principals. Is it because we for non-null principals execute the script in some different context? Could we do the same for null principals and create a temporary context?
Comment 27 Johnny Stenback (:jst, jst@mozilla.com) 2006-09-11 16:10:44 PDT
(In reply to comment #26)

AFAIK the null principal itself is in no way special here, it's just that if we don't have an owner in the javascript: protocol case, we create a null principal and execute the script in the URL with that principal. If we do have an owner, then we use that owner and check if it's the same origin as the page we're loading the javascript: URL into and if the principals match we execute the script in the URL, if the principals don't match we block the script in the URL.

IOW, we only get into a situation where the running principals in a javascript: URL don't match that of the page where the script in the URL executes when a javascript: URL is loaded with no owner.
Comment 28 Johnny Stenback (:jst, jst@mozilla.com) 2006-09-11 17:52:36 PDT
Created attachment 237876 [details] [diff] [review]
Patch that shows what's needed to make imglib take an owner

This patch shows (at least most of) what needs to be changed to make imglib take an owner and pass it through to the channel. The patch simply uses the documents or image elements NodePrincipal() as the principal, which is obviously not correct, so feel free to ignore that part of the patch...
Comment 29 Boris Zbarsky [:bz] 2006-09-11 17:56:39 PDT
So I think what we should do here for the branches and for trunk for now is to execute the javascript: URI in a sandbox if its principal doesn't match the cx principal.  This would include the case when its principal is the null principal.

That should be safe but still allow some things to work, as long as they don't touch the page DOM.

Going forward, we can work on a better security system for trunk....
Comment 30 Johnny Stenback (:jst, jst@mozilla.com) 2006-09-13 18:04:37 PDT
I've got a sandbox execution scheme limping along here. Seems to work pretty well (with the obvious limitations of not being able to reach DOM things), but still needs a bunch of cleaning up.
Comment 31 Johnny Stenback (:jst, jst@mozilla.com) 2006-09-13 23:01:06 PDT
Created attachment 238375 [details] [diff] [review]
Make js urls from different origins evaluate in a sandbox.

This isn't done, but it works and is pretty close to what we want I think. Mostly needs a look over to make sure the refactoring of the existing code I did is correct, and some more comments etc. This adds the two following methods to nsIXPConnect:

+    [noscript] nsIXPConnectJSObjectHolder createSandbox(in JSContextPtr cx,
+                                                        in nsIPrincipal princip
al);
+    [noscript] JSVal evalInSandboxObject(in AString source, in JSContextPtr cx,
+                                         in nsIXPConnectJSObjectHolder sandbox)
;

and makes nsJSProtocolHandler use those if the URL we're loading has an owner that differs from the principal in the target window (or has no owner). This way JS continues to run in all cases, but it runs in an empty sandbox.

Comments/ideas welcome.
Comment 32 Johnny Stenback (:jst, jst@mozilla.com) 2006-09-13 23:03:45 PDT
Oh, and I meant to say that very little of this is new code, it all uses the existing evalInSandbox code, but I had to refactor a bunch of that code. The rest of this is pretty straight forward. And this is against a trunk tree for now.
Comment 33 Mike Schroepfer 2006-09-14 11:19:27 PDT
Hey guys - is this practical for 1.8.1 or should we look for a fix in 1.8.1.1?
Comment 34 Brendan Eich [:brendan] 2006-09-14 11:33:30 PDT
My advice is to defer.  We can keep this bug confidential, and I don't think it's likely to be discovered independently.  If I'm wrong, I'll help do the 1.8.1.1 in a hurry.  If anyone knows of public information pointing in the direction of this bug's testcase, please cite it now.

/be
Comment 35 Jonas Sicking (:sicking) No longer reading bugmail consistently 2006-09-14 13:56:24 PDT
My question would be, how big is the risk involved? If the risk is low enough i'd say it'd be good to get in even though it's unknown to the public right now.
Comment 36 Brendan Eich [:brendan] 2006-09-14 14:01:50 PDT
We don't have a patch yet.  Trying to code-freeze .8.1 today.  Therefore...

/be
Comment 37 Mike Schroepfer 2006-09-14 15:20:36 PDT
pushing to 1.8.1.1
Comment 38 Boris Zbarsky [:bz] 2006-09-18 21:44:15 PDT
Comment on attachment 238375 [details] [diff] [review]
Make js urls from different origins evaluate in a sandbox.

This isn't a thorough review; in particular I haven't reviewed the nuts and  bolts of the XPConnect refactoring.

>+    if (useSandbox) {
...
>+            JSAutoRequest ar(cx);
>+            rv = JSValueToAString(cx, rval, &result, &bIsUndefined);

So if the rval returned is a JSObject with a toString method, against what global object will that method get run?  I'd hope it's still against the sandbox global, but make sure?

>diff --git a/js/src/xpconnect/src/nsXPConnect.cpp 
>+    jsval rval;
>+    AUTO_MARK_JSVAL(ccx, rval);

I think you want to init |rval| to JSVAL_VOID and to pass &rval to AUTO_MARK_JSVAL.  Otherwise you're marking the current (random) value of rval, which isn't really what we want here.

Other than that and the need for docs and a review of the XPConnect refactoring stuff, looks pretty good!
Comment 39 Daniel Veditz [:dveditz] 2006-09-19 15:58:14 PDT
Restoring lost blocking flag
Comment 40 Johnny Stenback (:jst, jst@mozilla.com) 2006-09-19 21:59:27 PDT
Created attachment 239306 [details] [diff] [review]
Updated diff for sandboxing.

This should be close. As for what JS_ValueToString() does on the result when we do evaluate in a sandbox, it does evaluate the code in the scope of the sandbox object. BUT it does that on the context from the web page it does not have access to, not on the temporary sandbox context that the sandbox code uses. I'm not sure that's important as it should only matter in cases where the context's global object is used, I'm hoping mrbkap can help shed some light on this.
Comment 41 Mike Schroepfer 2006-09-20 10:34:26 PDT
Would this be feasible/risk free enough to move back into a 1.8.1 RC2?
Comment 42 Boris Zbarsky [:bz] 2006-09-21 19:27:35 PDT
So... That patch doesn't apply to trunk over here because nsIXPConnect doesn't have anything like:

       readonly attribute JSClassConstPtr XPCNativeWrapperClass;

in it.  Some sort of local git repository mix-up?
Comment 43 Boris Zbarsky [:bz] 2006-09-21 19:47:59 PDT
Created attachment 239604 [details]
Some testcases

Attached patch passes these.
Comment 44 Boris Zbarsky [:bz] 2006-09-21 20:00:20 PDT
Created attachment 239605 [details]
Better testcases

The last few test the case I was worried about.  We seem to be ok!
Comment 45 Boris Zbarsky [:bz] 2006-09-21 20:46:09 PDT
Comment on attachment 239306 [details] [diff] [review]
Updated diff for sandboxing.

>+++ b/js/src/xpconnect/idl/nsIXPConnect.idl
>+     * Create a sandbox for evaluating code in isolation in using

s/in using/using/

>+    [noscript] JSVal evalInSandboxObject(in AString source, in JSContextPtr cx,
>+                                         in nsIXPConnectJSObjectHolder sandbox);

Document that the caller should be GC-rooting the jsval* that's the out param for this method?

>diff --git a/js/src/xpconnect/src/nsXPConnect.cpp

>+nsXPConnect::CreateSandbox(JSContext *cx, nsIPrincipal *principal,

Comment the #else and #endif with /* XPCONNECT_STANDALONE */, please.

>+    AUTO_MARK_JSVAL(ccx, rval);

This should be

  AUTO_MARK_JSVAL(ccx, &rval);

That protects whatever is stored in rval, whereas |AUTO_MARK_JSVAL(ccx, rval);| protects the current value of rval only (in this case JSVAL_VOID).

>+    nsresult rv = xpc_CreateSandboxObject(cx, &rval, principal);
>+
>+    if (NS_SUCCEEDED(rv) && JSVAL_IS_OBJECT(rval)) {

So if !JSVAL_IS_OBJECT we'll return success and a null holder?

Can we simply asser that |if (NS_SUCCEEDED(rv))| then JSVAL_IS_OBJECT?  I fail to see how that could happen to be false, and I think it's reasonable to demand that out of xpc_CreateSandboxObject.

>+nsXPConnect::EvalInSandboxObject(const nsAString& source, JSContext *cx,

Again, comment the #else and #endif.

>diff --git a/js/src/xpconnect/src/xpccomponents.cpp 

>+xpc_CreateSandboxObject(JSContext * cx, jsval * vp, nsISupports *prinOrSop)

Comment the #endif after this as XPCONNECT_STANDALONE.

>+            principal = do_CreateInstance("@mozilla.org/nullprincipal;1", &rv);
>+
>+            if (!principal || NS_FAILED(rv)) {

I guess this is ok, but really we should be able to assume (and assert) that NS_SUCCEEDED(rv) means principal is not null here.  do_CreateInstance does promise that.

In nsXPCComponents_utils_Sandbox::CallOrConstruct, would it make any sense to avoid creating the PrincipalHolder and just pass the principal to the sandbox?  Or would the games with an nsISupports (holding either nsIPrincipal or nsIScriptObjectPrincipal) involved be just as much code as the 2-3 lines to create the PrincipalHolder?

>+    // Get the current source info from xpc. Use the codebase as a fallback,
>+    // though.

That last part happens inside xpc_EvalInSandbox, right?  Comment makes it sound like we should be doing it here...

>+xpc_EvalInSandbox(JSContext *cx, JSObject *sandbox, const nsAString& source,
...
>-#endif /* !XPCONNECT_STANDALONE */
> }
>+#endif

Put the comment back, please.

>--- a/js/src/xpconnect/src/xpcprivate.h

>+xpc_CreateSandboxObject(JSContext * cx, jsval * vp, nsISupports *prinOrSop);

Please document, especially the last arg, since it has nontrivial behavior, and the expected GC behavior of vp (caller protects).

>+xpc_EvalInSandbox(JSContext *cx, JSObject *sandbox, const nsAString& source,
>+                  const char *filename, PRInt32 lineNo, jsval *rval);

Please document.

sr=bzbarsky with those nits.  Especially the &rval thing in CreateSandbox.
Comment 46 Blake Kaplan (:mrbkap) 2006-09-22 11:43:48 PDT
Comment on attachment 239306 [details] [diff] [review]
Updated diff for sandboxing.

>diff --git a/js/src/xpconnect/src/nsXPConnect.cpp 

>+    if (NS_SUCCEEDED(rv) && JSVAL_IS_OBJECT(rval)) {
>+        *_retval = XPCJSObjectHolder::newHolder(cx, JSVAL_TO_OBJECT(rval));

Do you want to make that second check !JSVAL_IS_PRIMITIVE to avoid null dereferences? Or does newHolder null check?
Comment 47 Mike Schroepfer 2006-09-22 19:09:26 PDT
We are doing a 181RC2.   This safe enough for that or should we wait until 1.8.1.1?
Comment 48 Johnny Stenback (:jst, jst@mozilla.com) 2006-09-23 10:22:32 PDT
Created attachment 239804 [details] [diff] [review]
Patch (against a clean trunk) that was checked in on the trunk.

This landed on the trunk. bz, please have a look at the last changes I made to not create a PrincipalHolder in nsXPCComponents_utils_Sandbox::CallOrConstruct() as you suggested.

I'm not super excited about taking this on the branch just like that, but if we get good feedback from this on the trunk now and others agree I'm not against taking it...
Comment 49 Blake Kaplan (:mrbkap) 2006-09-23 10:59:39 PDT
(In reply to comment #40)
> I'm hoping mrbkap can help shed some light on this.

I don't believe that running untrusted code on the protected context is a problem. AFAICT, the most evil thing the toString call could do would be to mess up the context-based regexp statics (like multiline, parenCount, etc.) because the scope chain is inherited from the parent of the callable object and all the sandbox code has access to is other sandbox code.
Comment 50 Boris Zbarsky [:bz] 2006-09-24 10:34:08 PDT
> please have a look at the last changes I made to not create a PrincipalHolder

Looks good.

Should we resolve this fixed?
Comment 51 Mike Schroepfer 2006-09-26 15:21:10 PDT
Conversation and comments from JST seem to indicate that we'd want some serious trunk testing of this before it was safe for branch - so I'm moving the flags back to 1.8.1.1.  If you think otherwise please re-nom for 1.8.1.
Comment 52 Daniel Veditz [:dveditz] 2006-11-27 10:41:12 PST
resolving FIXED based on comment 48

What about the branches?
Comment 53 Johnny Stenback (:jst, jst@mozilla.com) 2006-11-27 17:49:13 PST
I've got this mostly merged to the 1.8.0 branch, but the lack of null principals on the branches means we need another way to sandbox JS to have no access...
Comment 54 Johnny Stenback (:jst, jst@mozilla.com) 2006-11-28 14:36:51 PST
Created attachment 246851 [details] [diff] [review]
Patch for the 1.8.0 branch.

This is essentially the same except for not changing the nsIXPConnect API directly (changing nsIXPConnect_MOZILLA_1_8_0_BRANCH), and not using the null principals that don't exist on the branch. The alternative apporoach I picked to get around the lack of null princpals was to generate mostly unique URIs in the form of about:noaccess-XXX where XXX is a 32-bit random number. Other than that the changes are mostly the same, but it was a pretty tricky merge so I'd like to have more eyes on this before landing it.
Comment 55 Blake Kaplan (:mrbkap) 2006-11-30 17:16:59 PST
Comment on attachment 246851 [details] [diff] [review]
Patch for the 1.8.0 branch.

Looks okay.
Comment 56 Jonas Sicking (:sicking) No longer reading bugmail consistently 2006-11-30 17:45:01 PST
Comment on attachment 246851 [details] [diff] [review]
Patch for the 1.8.0 branch.

rs=me. I don't really understand the code itself, but the porting looks fine.
Comment 57 Johnny Stenback (:jst, jst@mozilla.com) 2006-11-30 17:47:14 PST
Comment on attachment 246851 [details] [diff] [review]
Patch for the 1.8.0 branch.

This is the 1.8.0 branch diff, the 1.8 branch diff is essentially identical
Comment 58 Boris Zbarsky [:bz] 2006-11-30 18:12:31 PST
Comment on attachment 246851 [details] [diff] [review]
Patch for the 1.8.0 branch.

>diff --git a/js/src/xpconnect/idl/nsIXPConnect.idl b/js/src/xpconnect/idl/nsIXPConnect.idl

Rev the iid of nsIXPConnect_MOZILLA_1_8_0_BRANCH?  I guess it's ok to not put this on a new interface, though changing even nsIXPConnect_MOZILLA_1_8_0_BRANCH bothers me.

sr=bzbarsky with that nit.
Comment 59 Daniel Veditz [:dveditz] 2006-11-30 23:08:27 PST
Comment on attachment 246851 [details] [diff] [review]
Patch for the 1.8.0 branch.

approved for 1.8/1.8.0 branches, a=dveditz (with bz's sr= nit)
Comment 60 Johnny Stenback (:jst, jst@mozilla.com) 2006-12-01 16:46:28 PST
Fix landed on branches.
Comment 61 Bob Clary [:bc:] 2006-12-06 12:34:06 PST
verified fixed 1.8.0.9, 1.8.1.1, 1.9 windows/linux/macpcc
Comment 62 chris hofmann 2007-04-24 15:29:31 PDT
pvnick is doing a bit of research on XSS and also gathering up bugs with security related test cases to help add to the regression/certification test suites.  adding him to the cc list in these...

Note You need to log in before you can comment on or make changes to this bug.