Closed Bug 105705 Opened 23 years ago Closed 23 years ago

When chrome calls into content, content can call chrome functions

Categories

(Core :: Security: CAPS, defect)

defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla0.9.6

People

(Reporter: security-bugs, Assigned: security-bugs)

References

Details

(Keywords: perf)

Attachments

(3 files, 8 obsolete files)

487 bytes, text/html
Details
1.10 KB, application/vnd.mozilla.xul+xml
Details
9.00 KB, patch
brendan
: review+
Details | Diff | Splinter Review
When a function in navigator.js calls a function under _content, if the function
has been modified, it can call arguments.callee.caller.__parent__ which is the
chrome window object, and call functions on that object, which can do lots of
things that content is not supposed to be able to do. I think the problem is
that the same-origin policy is not being enforced on the function.caller
property. If it were, a function would not be able to access its caller if the
function is content and its caller is chrome. Examples to follow.
Nominating ...
Keywords: nsbranch
Whiteboard: [PDT]
See bug 65683. It was thought that the caller property of a function object is
safe, because it can't be used to walk up the call contexts the way __caller__
on the activation object could. However, the function object at function.caller
has a __parent__ property which could be the chrome window.
Even though content can get the reference to the chome window, why are security 
checks not stopping content from accessing the chrome window's properties?

Is ensuring that content can't even get references to chome windows the only 
viable solution?
I don't mind belt-and-braces (defense in depth), but I wish I could see what the
bad script is doing with the chrome window.  I can't seem to view-source or save
the Example, however -- is that part of its evil?

Mitch, why don't existing subject/object checks suffice?

The patch I attached provides a JSACC_CALLER mode for checkAccess implementors
to use when deciding policy.  It's just mechanism, use it if you need it.

/be
Comment on attachment 54293 [details] [diff] [review]
proposed JS extension: JSACC_CALLER access mode for JSClass.checkAccess and JSObjectOps.checkAccess, called from jsfun.c

Looks Ok to me. I'm wondering why this got inserted where it did in the enum.
I assume a patch to nsDOMClassInfo::CheckAccess and the security manager is on the way?
Attachment #54293 - Flags: superreview+
Example attachment saves for me. Looks like:

<html>
Written by georgi guninski

<script src="show_props.js">
</script>
<script>
window._content.__defineGetter__("document",get1);
function get1()
{
//alert('in get'+arguments.callee.caller.__parent__);
try {
alert("homepage="+arguments.callee.caller.__parent__.getHomePage());
}
catch (e) {};
alert('load chrome');
try {
arguments.callee.caller.__parent__.loadURI("chrome://navigator/content");
}
catch (e) {};
}
</script>
<a href="#" id="l1">link</a>
</html>
Adding sol to cc list.
Status: NEW → ASSIGNED
Removing the needsSecurityCheck optimization (in nsDOMClassInfo)seems to fix the
problem, that's an easy fix, however I am looking for an alternative fix so we
don't lose the benefits of that optimization. The problem is that we've got one
context (the chrome window) calling into a second context (content) which is
calling back into the first context. The needsSecurityCheck function compares
the calling context to the one it's got cached, sees that both are chrome, and
happily bypasses the security check.

What I'm wondering is, is there any quick way to identify this situation, where
you've got a finction in context A calling a function in context B, which then
calls back into A? Obviously, the security manager can identify this situation,
but does the JS engine have a fast way to identify this?
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
whoops, didn't mean to mark it fixed yet...
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Removing this optimization will make tight loops in JS run considerably slower,
but it fixes the exploit.

Other options include: 
1) remove the optimization for chrome contexts only. This will leave
signed/privileged scripts still vulnerable to this class of exploit and that
will probably come back to bite us. But it's a possibility.
2) Find some other low-cost way of telling if we're in the situation I described
above. I haven't found this yet.

(1) is a serious option. How important is this optimization?
Status: REOPENED → ASSIGNED
Comment on attachment 54560 [details] [diff] [review]
the real fix - please ignore the last patch

sr=jst

It'd be really nice to have some kind of optimization here
but given the limited amount of time we have to experiment with this
I'm ok with taking out the security check optimization on the branch
and try to come up with a better fix on the trunk later on.
Attachment #54560 - Flags: superreview+
Comment on attachment 54560 [details] [diff] [review]
the real fix - please ignore the last patch

r=harishd
Attachment #54560 - Flags: review+
when and where might we observe slowdowns after the optimization is removed?
need this to test and figure out if we would want to go ahead and release
with the slowdowns or to wait for a better solution.
pls check this into the 094 branch, as soon as you can.
Whiteboard: [PDT] → [PDT+]
We're all convinced that slowdowns are not going to be noticable at all in real 
world use. It is really only an issue for code that makes very large numbers of 
accesses to global vars; e.g. loop counters. So, I think the only negative 
impact to worry about here is the possible impact on any benchmark tests that 
might be based on running a *lot* of JS code.
Checked into the 0.9.4 branch. I'm leaving this bug open until we get a
permanent fix for the trunk.

Removing this optimization will slow down frequent access to global properties
such as top-level variables defined by a script. I will attach a very simple
benchmark test. This sort of test (a very tight loop) probably doesn't happen in
real life very often, though it may affect our standing on the critics' benchmarks.

We have an alternate fix in mind that allows us to keep the optimization. We can
 walk up the JS stack, and if we encounter any other context, then we do the
full security check. If there's only one context on the JS call stack, then we
can safely skip the check. However, this fix is more involved and much more
error-prone than simply removing the optimization. Similar stack-walking code in
nsScriptSecurityManager contained sevaral subtle errors that took a couple of
months to find. I think, for the branch, we are better off taking a small
performance hit than running the risk of crating more security holes, especially
since the performance hit is not in a real-world situation. Jband agrees with me
on this.
Whiteboard: [PDT+] → [PDT]
And, I think the new optimization scheme with the stack walking should not try 
to be too clever. It should only accept the simplest local call pattern and 
reject *everything* else (and let the caps code decide via its more complicated 
scheme)
jrgm - can you run the iBench tests before and after this fix?

mitch - can you suggest areas of testing which we want to focus on in the short
time we have before release?  Thank you.
For testing, we'll also run automation (DOM Core/HTML, Inhr, DHTML, JavaScript)
on all four platforms on Monday evening, or Tuesday Branch builds. It will
ensure no regressions happen in these areas.

Gerardo - post to this bug when that testing is completed.  Branch builds to be
done by 7:30pm PST tonight.
I've verify the specific test case attached to this bug report.

Here is what I see before the fix:
1.  Load testcase
2.  Get alert dlg.  Text reads:
"homepage=http://home.netscape.com/bookmark/6_2/homebutton.htm.
3.  Click OK.
4.  Get another alert.  Text reads:  "load chrome"
5.  Click OK.  Alerts in 2-4 repeat.
6.  Click OK.  The following URL loads in the browser window: 
chrome://navigator/content/navigator.xul.

I'll go verify on the newer branch builds now.
Using the Win32 2001-10-22-18-094 build, I now only get two "load chrome" alerts
and nothing else happens (not URL for the chrome loads).

Can someone pls let me know if this is the correct behavior now?

Thanks
On the Linux respin build (branch) 20011022, I've also verified that I only see
two "load chrome" alerts as I describe above.
On Mac OS X branch build 2001-10-22-18-094, have verified only two load chrome
alerts come up.

Whiteboard: [PDT] → [PDT+][fixed and verified on 094 branch]
Some of us are trying to help QA this on the branch tonight but this bug is very
poorly written from a testing point of view =). Mitch would it be possible for
you to outline some testing scenarios for this security problem? 

In particular, can you describe what the user should see when they run the
posted test case on a build with out your fix and with your fix? I get an alert
saying Loaded chrome which comes up twice. From reading the bug description,
it's not clear to me whether that's what I'm supposed to be seeing or not. 

If you could post some clear steps that would be awesome. Thanks!
I just attached a WIP patch that attempts to fix this exploit w/o slowing down
access to global script defined properties too much, it's a bit scary, so we
need to think about this alot.

The attached patch makes assumptions about that lifetime of JSStackFrame's, and
I don't know if my assumptions about that expose potential holes or not. I
wouldn't be surprised if it does.

So the patch works as follows:

The first time we come through needsSecurityCheck() after our cached pointers
have been invalidated (or we're called on a context/wrapper other than the
cached ones) we check that there are no functions on the stack that are defined
in a different global object. If all functions on the stack are on the same
global object that we're accessing a property on, we cache the context/wrapper
pointers. 

Next time needsSecurityCheck() is called we see that the context/wrapper match
the cached ones and we think we probably don't need to do the security check,
but to make sure that that's the case we check that the function at the top of
the stack is from the same window that we're trying to access a property on. If
this is the case then we don't do the security check, if not we go ahead and do
a full security check.

An additional optimization that I made was to only check if the inner most
function on the stack is in the same window once, so I cache a JSStackFrame
pointer and if needsSecurityCheck() is called again with the same context,
wrapper, and frame (cached_fp), we don't do a security check, and we don't even
look at the inner most function on the stack. This is the optimization that I'm
not sure about, the rest of the patch *should* be good, but someone who knows
the details about JSStackFrame n' stuff should look this over real close. Brendan?
Whiteboard: [PDT+][fixed and verified on 094 branch] → [PDT]
mscott, if you see an alert that says "homepage:http://...", and mozilla lods a
XUL file (after showing some more alerts), your build is voulnerable to the
attack. If you only see two alerts saying "loading chrome", and mozilla just
shows you a page that says "Written by georgi guninski link" (and stops there),
the exploit has been plugged. Mitch, correct me if I'm wrong.
Verified on 2001-10-22-branch build on WinNT.

As described above, I get two alerts that says "load chrome" and the browser
window is opened that says "Written by George..."

Also, on the JS console I see following text:
Error: window._content.document has no properties
Source File: chrome://global/content/charsetOverlay.js
Line: 199

Marking it verified fixed.
I am going to verify on the Linux now.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago23 years ago
Resolution: --- → FIXED
Reopening, this is still not fixed on the trunk.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
I ran page load tests on mac/linux/win32, for ~266MHz/128MB systems. There 
_may_ be a slight slowdown in these tests (~1%), but that is putting too fine
a line on the precision of the test. Whatever it is, it isn't a deal-breaker.
[Note: I ran the page load tests, since the ibench test's content has zero
javascript in them].

You are correct about the testcase - if you see an alert which shows your
homepage URL, and then navigator.xul loads *within* the content area (browser
within a browser) then the exploit is *not* fixed. When fixed, you should see
the "load chrome" alert twice but nothing else. I will post an additional testcase.
Verified on 2001-10-22-branch build on WinNT.

I see the 3 alert boxes:
1. chrome called us
2. Permission denied to get Window.scriptGlobals
3. trying to access /tmp/a.js

Marking verified fixed. Will check on Linux as well in a short while. 
Status: REOPENED → RESOLVED
Closed: 23 years ago23 years ago
Resolution: --- → FIXED
bindu - pls leave this bug open until there is a fix for the trunk.  I've noted
in the status whiteboard that we've verified on 094 branch.  Thanks.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: [PDT] → [PDT+][fixed and verified on 094 branch]
Also note that a better fix is expected to also land on the branch *after* the 
branch has been handed off to embedding.
Tested Client-side JavaScript and DOM Core/HTML on 10/22 branch builds. No 
regressions were found on these areas.
Blocks: 107065
Group: security?
Keywords: nsbranch
=======
WARNING
=======

This bug appears to have been removed from the security group as part of a mass
nsbeta1 keyword cleanup change last week. This is probably due to bug 107718,
which is a bugzilla bug, since changing the group doesn't seem likely as part of
this mass change. Readding security group restriction, but if there was still a
security hole in this bug, it better get fixed soon...

This doesn't appear to be the case for any of the two bugs affected, though -
they're both open for better fixes (rather than a fix) on the branch later. The
assignee/qa of this bug should check that, though.
Group: security?
Target Milestone: --- → mozilla0.9.6
Just noting that we take a 1% cost in removing this optimization (see current 
tinderbox btek)
Blocks: 71668
Keywords: perf
Mitch, we never agree to take out the optimization on the trunk, what's the
hurry? Why not wait for the real fix?
This is the mother of all security holes and we need at least the band-aid in
place for 0.9.6 -- leaving this known hole open because of a mere 1% degradation
would be ethically wrong.
That's fine with me, but that doesn't mean this needed to land on the trunk.
Is the scenario, that we take the patch for the 096 milestone, and leave this
bug open, until we can have a fix with the optimization in place? what are the
chances, that a hacker would go after a Mozilla release, and who would make
public notice of this security hole (i.e. What's our exposure? Would it get the
press visibility, and would anyone, other than guniski know about this problem?)
I'm fine with this being on the trunk, so long as we can get some of the
optimization back ASAP.
A few quick issues (not security related), the hunk:

@@ -1904,12 +1918,10 @@
   }
 #endif
 
-  JSObject *global = obj;
-  JSObject *tmp;
+  rv = if_info->GetName(getter_Copies(name));
+  NS_ENSURE_SUCCESS(rv, rv);
...

is bogus, this can't even compile, the two lines that were added should not be
there.

Also, unused leftover debugging code at:

@@ -2314,15 +2404,21 @@
 {
   nsresult rv = NS_OK;
 
-  rv = doCheckReadAccess(cx, obj, id, wrapper);
+  char *foo = nsnull;
   ^^^^^^^^^^^^^^^^^^^
Build Gecko/20011107 stops the above testcases for me but the problem is not
solved. Today sent a testcase to Mitchell which basically does the same -
execution of programs.
For me the root of the problem is that chrome calls user defined js and stack
walking is possible.
Think that this should be addressed.
I agree with georgi -- we should stop the stack-walk cold with the access check
support I'm reattaching.  The optimization may be safe once we do this.  If it
is not safe, then the attack must not involve arguments.caller -- but I know of
no such attack.  Let's stop dancing around the problem.

/be
Attachment #54293 - Attachment is obsolete: true
Comment on attachment 57048 [details] [diff] [review]
JSACC_CALLER extension to JSClass.checkAccess, updated patch

No half measures:
rip out __call__ altogether over in bug 109113.

/be
Attachment #57048 - Attachment is obsolete: true
This patch combines doCheckReadAccess and doCheckWriteAccess, since they're
almost the same anyway. Also, I've changed the combined function to pass the
actual property name being accessed on to the security manager, instead of
"Window.scriptglobals". This will facilitate some other security fixes I've got
planned.
Attachment #54557 - Attachment is obsolete: true
Attachment #54560 - Attachment is obsolete: true
Attachment #54618 - Attachment is obsolete: true
Attachment #56964 - Attachment is obsolete: true
I'll wait for jst's r=, but wonder why PRInt16 accessMode argument isn't PRUintn
or some such.  It doesn't pay (it can cost on some common architectures) to use
short integer types for parameters that don't need longer integer values chopped
down to size.

/be
You're right, that should have been PRUint32. I've made the change locally, but
I'll wait for further review before posting a new patch.
Mitchell, I had a look and the changes look good, fix what brendan pointed out
and remove the stray comment:

+// nsIXPCSecurityManager::ACCESS_GET_PROPERTY

Brendan, any thoughts about the XXX comment? Will we ever see two or more native
frames (with not function object) in a row at the top of the stack followed by a
JS frame?

This patch gives us a nice little *50* times speedup for a tight while(++i <
hugenumber) loop :-)

r=jst with the above fixed, be careful with merging this.

Oh, and one more thing I noticed while reviewing this, the security manager
API's are not safe against properties with '\0' embedded in the name. I don't
see this ever being exploitable on human written scripts :-) but in theory it
is. Here's a hypothetical, and way far stretched, case where the security checks
would fail to block cross domain access due to this problem:

  mybank.com defines a property named "location\0_password";

  evil.com opens mybank.com, I log in and evil.com accesses
  mybankwindow["location\0_password"]

the security manager will think evil.com is accessing "location" ('cause of the
embedded '\0') on mybankwindow and it'll say sure, anyone can access
window.location, go ahead. Is this worthy of it's own bug?
Attachment #57872 - Flags: review+
Actually this change should be taken out:

@@ -2025,19 +2048,13 @@
 {
   if ((mode == JSACC_WATCH || mode == JSACC_PROTO || mode == JSACC_PARENT) &&
sSecMan) {
-    JSString *str = ::JS_ValueToString(cx, id);
 
-    if (!str)
-      return NS_ERROR_UNEXPECTED;
-
     JSObject *real_obj = nsnull;
     nsresult rv = wrapper->GetJSObject(&real_obj);
     NS_ENSURE_SUCCESS(rv, rv);
 
     NS_ConvertUCS2toUTF8
-      prop_name(NS_REINTERPRET_CAST(const PRUnichar *,
-                                    ::JS_GetStringChars(str)),
-                ::JS_GetStringLength(str));
+      prop_name(JSValIDToString(cx, id));
 
     rv =
       sSecMan->CheckPropertyAccess(cx, real_obj, sClassInfoData[mID].mName,

This change is a potential embedded null problem (if we ever fix the security
manager API's) and this change makes this code slower since we're now creating
the NS_ConvertUCS2toUTF8 string object w/o passing it the length.
Attachment #57872 - Attachment is obsolete: true
Comment on attachment 58145 [details] [diff] [review]
Patch take 3 with jst's and brendan's suggestions

Yes, you should loop over native frames, there may be more than one (e.g., a
native method calls into the JS API to call another native).

Nits:

Why introduce the |global| local when you can use obj:

+static inline JSObject *
+GetGlobalJSObject(JSContext *cx, JSObject *obj)
+{
+  JSObject *global = obj;
+  JSObject *tmp;
+
+  while ((tmp = ::JS_GetParent(cx, global))) {
+    global = tmp;
+  }
+
+  return global;
+}

Follow local style here, or jst will whack it later (2-space indentation,
opening brace of function body on its own line):

+// Result of this function should not be freed.
+static inline const PRUnichar *
+JSValIDToString(JSContext *aJSContext, const jsval idval) {
+    JSString *str = JS_ValueToString(aJSContext, idval);
+    if(!str)
+	 return nsnull;
+    return NS_REINTERPRET_CAST(const PRUnichar*, JS_GetStringChars(str));
+}

/be
Attachment #58145 - Flags: needs-work+
Attached patch Patch take 4Splinter Review
Attachment #58145 - Attachment is obsolete: true
Comment on attachment 58478 [details] [diff] [review]
Patch take 4

r/sr=brendan@mozilla.org (upgrade me to sr= if you need to, although jst can
sr= this one).

/be
Attachment #58478 - Flags: review+
Fix the single space indentation in the new loop, and to avoid double checking
for null fp every iteration I'd change the loop to look like this:

+  do {
+    fp = ::JS_FrameIterator(cx, &fp);
+
+    if (!fp) {
+      break;
+    }
+
+    fp_obj = ::JS_GetFrameFunctionObject(cx, fp);
+  } while (!fp_obj);

(the compiler would probably optimize your version to do the same, but we might
as well do it ourselves to make it clearer what this code is doing.)

sr=jst
Attachment #58478 - Flags: superreview+
Checked in on trunk - waiting for driver approval for branch
mitch, thank you for putting back the page-load performance we had lost!  :-)
Verified on 2001-11-26-6.2.1 build on WinNT.

Both of above test case passes.

Will verify on trunk now.
Adding Antonio to Cc.  He is helping with Mac testing.
Verified on 2001-11-26-Branch build on Linux 7.2, Mac OS X and Mac OS 9.1.

Both of above test case passes.


Verified on 2001-11-29-Trunk build on WinNT.

Both of the above test case passes.
Status: REOPENED → RESOLVED
Closed: 23 years ago23 years ago
Resolution: --- → FIXED
Verified on
build: 2001-12-11-o3-Trunk
platform: Win NT

Both of the above test case load fine.
Status: RESOLVED → VERIFIED
Severity: normal → critical
Group: security?
Keywords: verified0.9.4
Whiteboard: [PDT+][fixed and verified on 094 branch]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: