Closed Bug 346659 Opened 18 years ago Closed 16 years ago

XSS by using document.open or document.write

Categories

(Core :: Security, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla1.9.1

People

(Reporter: moz_bug_r_a4, Assigned: bzbarsky)

References

Details

(Keywords: fixed1.9.0, fixed1.9.1, Whiteboard: [sg:high])

Attachments

(4 files)

See also bug 344495.

1) When document.open, or document.write on a document that has finished
loading, is called by a script in a different security context, the document's
principal is set to the caller's principal.  In such case, if there are
references to objects/functions that were created in that document's context,
then the references also get the caller's principal.  This can be used to
exploit in situations like bug 344495, bug 344494, bug 344751 and bug 345305.

attacker:
  * Note: an exploit page needs to be constructed in about:blank.
  <ifarme src="about:blank"/>
  <ifarme src="target site"/>
  func = frames[0].eval("create a function", frames[0]);
  window.__defineGetter__("x", frames[0].document.write);
  window.__proto__.__proto__ = frames[0].document;

target site:
  parent.x;

func will get the target site's principal.

2) The proposed patch in bug 344495 does *not* fix 1), and a regression that is
caused by that patch makes the problem far worse -- an attacker can do XSS
attacks against arbitrary sites.  With that patch applied, when document.open
is called, the document's principal is set to the document.open function's
principal instead of the caller's principal.  Thus, an attacker can get a
document.open method from a target site and call it on another document to give
the target site's principal to an attacker-defined function.

attacker:
  <ifarme src="about:blank"/>
  <ifarme src="target site"/>
  func = frames[0].eval("create a function", frames[0]);
  frames[1].document.open.call(frames[0].document);

func will get the target site's principal.
Attached file testcase 1
Attached file testcase 2
This works only with the patch in bug 344495 applied.
Adding this to the dep list for the bug with the patch on it, and asking for blocking-1.8.1
Blocks: 344495
Flags: blocking1.8.1?
So the key here is that we're calling document.open on about:blank, which preserves the old inner window, right?  This is sorta "fixed" in my build that has the patch for bug 332182, but only because the document.write call throws a security exception (since it's a cross-domain access).  That patch wouldn't help for the case when the page being attacked has UniversalXPConnect or whatever.

Perhaps we should preserve the inner for about:blank only in the following cases:

1)  The about:blank document has no parent.
  or
2)  The caller's principal is SameOrigin with the parent's principal.
Flags: blocking1.8.0.7?
OS: Windows XP → All
Hardware: PC → All
Flags: blocking1.8.1? → blocking1.8.1+
Whiteboard: [sg:high]
I think comment 4 is correct. My first knee-jerk reaction to fix this bug is to change our behavior from "when you call document.open on an about:blank document you own it" into "when you set a property on an about:blank document, you own it". Brendan had some other comments, which he'll hopefully put here.
Does the patch for bug 343168 actually help somewhat here?
It's my opinion that we should not patch further with an unsound trust label calculus.  We need some very simple rules that do not have "escape clauses" such as "oh, certain native method activations are trusted or invisible to the stack walker" or "because we have always done it, we must allow anyone to access a new window loading about:blank, without changing its trust label to something other than 'about:blank'".

We call trust labels principals for historical reasons.  Every running script, every object in the DOM, and every window object, is owned by a principal for which its static markup source was loaded, or else the object was created by script loaded from some source that has an associated principal.  Leaving out PKI-based principals for now, some definitions and propositions:

Types:
principal = (null, codebase, system)  // disjoint union
null      = {null1, ... nullM}        // set of M nulls
codebase  = {origin1, ... originN}    // set of N origins

Null principals are unique tokens distinct from all other principals.

An origin is a string of the form scheme://host[:port] where [:port] is optional, port is a 64-bit unsigned number if present, host is a domain name, and scheme is a URI scheme or "protocol name" other than "chrome".

The principal associated with all chrome://... origins is the system principal.

Let P be the set of all principals.

Let <= be the binary relation by which P is partially ordered.

For all p in P, p <= system.

For all codebase principals cp and cq such that cp != cq, !(cp <= cq) and !(cq <= cp) -- IOW distinct codebase principals are unordered with respect to one another.

For all null principals np and nq such that np != nq, np and nq are unordered.

For all principals p and q, there exists in P the greatest lower bound (p ^ q), the meet of p and q, defined by <=.  Therefore (P, <=) is a meet semi-lattice.

For all p in P, (p ^ system) == p.

Claim #1: a sound mandatory access control system must compute the meet of all principals active on the JS stack, and use that meet as the subject principal, s.  Let t be the object principal of the target object being accessed.  Access must be denied unless t <= s.

Proof later, but if you buy this, then *no frame skipping hacks*.  mrbkap will confirm the wisdom of this ;-).

Objections concern the allAccess properties such as window.alert, which may be invoked across origin; document.write and its ilk; and the "fake" URI schemes such as about:, javascript:, and data:.

Claim #2: a sound mandatory access control system must identify all codebases by origin (scheme://host[:port]), never by a fake URI scheme lacking a domain name host part (about:, data:, javascript:).

This is vacuous given the definintion of origin above, but if (as we used to do) you allow fake URIs to be origins, or you translate fake URIs to null principals (as we do today), you end up with principals for content that are unordered with respect to the principal of the script that generated the content and now wants to manipulate its DOM.  This means access denied where it ought to be allowed, at the limit -- a bug and an interoperation hazard with respect to other browsers (ignore any user-generated content anti-sanitization constraints placed on the DOM, they are straw men).

This claim implies tracking origins of URIs.  In the case of static markup, origin tracking is easy given the loading page's origin and the loaded inline element's src or similar URI-valued attribute (details of current code parameterization and data structures notwithstanding :-P).  In the case of dynamic content generation via script, we have subject principals and stack walking to help.

This comment is meant as a starting point.  We should work through any hard cases or apparent counterexamples and evolve these definitions and rules to address all such objections.  We should not hack more patches in haste, unless we really have to right now or something very bad will happen :-/.  It's time to get a sound trust label calculus written down as a logical system.  Otherwise I think we are on a fool's errand.

/be
(In reply to comment #7)
> Types:
> principal = (null, codebase, system)  // disjoint union
> null      = {null1, ... nullM}        // set of M nulls
> codebase  = {origin1, ... originN}    // set of N origins
> 
> Null principals are unique tokens distinct from all other principals.
[. . .]
> For all p in P, p <= system.
> 
> For all codebase principals cp and cq such that cp != cq, !(cp <= cq) and !(cq
> <= cp) -- IOW distinct codebase principals are unordered with respect to one
> another.
> 
> For all null principals np and nq such that np != nq, np and nq are unordered.
[. . .]
> For all p in P, (p ^ system) == p.

Originally I thought one null principal, a lattice bottom, would do.  But in lieu of origin tracking we have M null principals, unordered with respect to all other principals and < system.  So these are really just like codebase principals except that they have more restricted rights -- I'm not sure how to model those, but we ought to be able to write down some rules.  I could use bz's expert help.

mrbkap points out my window.alert gaffe -- window.alert is not allAccess.  He says that document.open a better example.

Anyway, this is the kind of knowledge we need to document and rationalize.  Someone else chime in.

/be
I'm going to respond to Brendan's comment sometime soon, but I think this discussion is worth having on an open mailing list (.platform or .security or whatever, with initial cross-posts to the other one of those two and probably .dom and .network to make sure that all the interested parties see it).

Anyone object to that?  If so, I'd still prefer to have the conversation in a separate bug filed just for that purpose....
(In reply to comment #9)
> I'm going to respond to Brendan's comment sometime soon, but I think this
> discussion is worth having on an open mailing list (.platform or .security or
> whatever, with initial cross-posts to the other one of those two and probably
> .dom and .network to make sure that all the interested parties see it).

I'll start a wiki.mozilla.org page and post a pointer to it later tonight.

> Anyone object to that?  If so, I'd still prefer to have the conversation in a
> separate bug filed just for that purpose....

Sure, but it will block this bug for a few days.  We (mrbkap and I) are tired of whack-a-mole.

/be
For what it's worth, I don't think a wiki is the right medium here, unless we plan to make extensive use of the Talk page.  I'm running into a fair amount of differences in assumptions while reading comment 7, so we'll have a good bit of discussion, not just documentation of fact.

Speaking of which, I assume we are describing a conceptual system that would do the right thing while at the same time allowing our existing functionality, without worrying too much about performance or what our actual implementation looks like at the moment, correct?
testcase 2 might be ambiguous, because there are 2 regressions regarding
docment.open and Function.prototype.call.  (Function.prototype.call's
regression is critical. See bug 344495 comment #27.)

I'll attach new testcase that does not use document.open.call().
Attached file testcase 3
This works only with the patch in bug 344495 applied.

difference from testcase 2:
- frames[1].document.open.call(d);
+ d.o = frames[1].document.open;
+ d.o();
(In reply to comment #7)
> Claim #2: a sound mandatory access control system must 
> identify all codebases by origin (]), 
> never by a fake URI scheme lacking a domain name host part 
> (about:, data:, javascript:).

Just out of interest (this is a problem I ran into while trying to spec this), what's the "scheme://host[:port]" origin of a page that was loaded by the user clicking a data: URI bookmark, or the user running a javascript: bookmarklet on about:blank, or a file: resource?
(In reply to comment #11)
> For what it's worth, I don't think a wiki is the right medium here, unless we
> plan to make extensive use of the Talk page.  I'm running into a fair amount of
> differences in assumptions while reading comment 7, so we'll have a good bit of
> discussion, not just documentation of fact.

We should not make a sky castle, so we need to handle compatibility requirements. The discussion will center around what's required, what may seem required but has unwanted false positives, etc.  We've had this already re: css background URL being a javascript: URL (hence my minor snark about anti-sanitization strawmen).

But why are we arguing?  I'm absolutely going to wiki the formal write-up.  There is no point in posting it repeatedly instead of linking to it, and I don't intend to keep it in a private editing sandbox -- it's perfect for a wiki.  For threaded discussion short of wiki-talk, you are right and that's why I will post to a newsgroup (or two).

I'm catching up on other bugs, so this post didn't happen last night.  Doing it soon, but probably after lunch.

> Speaking of which, I assume we are describing a conceptual system that would do
> the right thing while at the same time allowing our existing functionality,
> without worrying too much about performance or what our actual implementation
> looks like at the moment, correct?

Yes, we should design for correctness and optimize after.  We're all seasoned enough not to screw performance with a sky-castle design, I think :-).

/be
(In reply to comment #14)
> (In reply to comment #7)
> > Claim #2: a sound mandatory access control system must 
> > identify all codebases by origin (]), 
> > never by a fake URI scheme lacking a domain name host part 
> > (about:, data:, javascript:).
> 
> Just out of interest (this is a problem I ran into while trying to spec this),
> what's the "scheme://host[:port]" origin of a page that was loaded by the user
> clicking a data: URI bookmark,

We had such things in the old days, and the answer was something like the "null principal" (http://lxr.mozilla.org/classic/ident?i=lm_unknown_origin_str).

From http://lxr.mozilla.org/classic/source/network/protocol/js/mkmocha.c#235, and if you ignore the cruft about "grid_parent" (in Netscape 2-4, when you navigated back or forward to a frameset page, the frame window contexts were recreated, and their session histories were repopulated from saved data along with saved form data; this made for special cases), ignore the memory leak, and with other jargon updates, the algorithm goes like this:

function getOriginFromURL(win, url) {
  // URL origin is computed lazily, so if we have one already, return it.
  if (url.origin)
    return url.origin;

  // In the basis case, a javascript: URL targets a window from a window
  // whose document loaded a non-javascript URL.  The referrer stored in
  // the URL tells us the origin.
  if (url.referer)
    return url.origin = url.referer;

  // No referrer.  Use the URL currently loaded in win, if possible.  This
  // is the inductive step, as the recursive call suggests.
  entry = getCurrentHistoryEntry(win);
  if (entry)
    return url.origin = getOriginFromURL(win, entry.url);

  // No basis case, no way to induct: use the unknown origin.
  return url.origin = UNKNOWN;
}

Note that win here is the window being targeted, not the window in which the referring page was loaded.  When targeting a different window from the one that loaded the referring page, the system stipulated that url.referer was set.

This means that the other critical part of the old codebase was maintaining the referrer for a URL object correctly.  In the case of link clicking, this would be the same as the uncensored HTTP Referer header, but if the referring page was created by script or a "fake" URI type, then url.referer was set from the origin associated with the referring page, not its user-facing location.

> or the user running a javascript: bookmarklet on
> about:blank, or a file: resource?

Those are two distinct cases.  about:blank could have an origin if it was loaded by src= or similar from a page -- it would inherit the parent page's origin.  Similar arguments for window.open and other scripted means of loading apply.

OTOH file: is at least one distinct origin (in the old codebase it was, but we wanted to make many file: origins, distinguished somehow; IIRC we have bugs now on file asking to make File/Save As/Web Page (Complete) somehow annotate the saved page, as I believe IE does, to make it and its inline elements subdir be one distinct trust domain).

/be
How viable is the current state of this bug, in such respect that if I'm reading most of the above comments correctly, the idea is to revisit security handling, and if I'm looking at the flags, they're implying that this is going to get fixed on the 1.8 branch, and that people would prefer it to be fixed on the 1.8.0 branch as well. The details of exactly what you, Brendan, Blake and Boris, want to discuss are not clear to me, and probably needn't be - but I *am* concerned whether this will indeed be fixed on those branches. Would love to hear your thoughts in that respect. If I can be of any help, let me know.
(In reply to comment #17)
> but I
> *am* concerned whether this will indeed be fixed on those branches. Would love
> to hear your thoughts in that respect. If I can be of any help, let me know.

It's a bit premature, but we are losing at whack-a-mole (google for it if you haven't seen this term before).  Patching branches with ad-hoc fixes is very likely (a) not adding to the security of those branches; (b) wasting our time by keeping us from the analysis needed for security soundness.

/be
brendan: cool, thanks. That helps.
Flags: blocking1.8.0.7? → blocking1.8.0.7+
Note that the current patch for bug 332182 removes the "you own it" block completely...
Target Milestone: --- → mozilla1.8.1
This looks unhopeful for 1.8.0.7 :-(
Flags: blocking1.8.0.8?
Flags: blocking1.8.0.7-
Flags: blocking1.8.0.7+
moving out to 1.8.1.1
Flags: blocking1.8.1.1?
Flags: blocking1.8.1-
Flags: blocking1.8.1+
Restoring lost blocking flag
Flags: blocking1.8.0.9?
Blake: any thoughts on these? Is there an alternative assignee that makes sense?
Flags: wanted1.8.1.x+
Flags: wanted1.8.0.x+
Flags: blocking1.8.1.1?
Flags: blocking1.8.1.1-
Flags: blocking1.8.0.9?
Flags: blocking1.8.0.9-
Hoping we can wrap up the bug 344495 related fallout in this release.
Flags: blocking1.8.1.2+
Flags: blocking1.8.0.10+
Flags: blocking1.8.1.2+
Flags: blocking1.8.0.10+
This seems to be WFM on the trunk now
Loading the page for me doesn't trigger the alerts. When I go back, though, I get XSS alerts.
See comment 4.  This particular testcase is not a problem, but the general issue remains.
Assignee: dveditz → mrbkap
Flags: wanted1.9.0.x?
Apparently there are currently no known exploits with this since we've tightened up things better elsewhere. But the main issue is still there.

So not giving very high priority, but keeping on radar.
Flags: wanted1.9.0.x? → wanted1.9.0.x+
Priority: -- → P3
If there's no current exploit for this problem can we close it? It's pointless to have an sg:high bug floating around that we don't intend to do anything about -- either it's important or it's not. If we find future exploits based on this we can open new security bugs later.
I still think fixing comment 4 is pretty important, myself.  Brendan just seemed opposed to it.  :(
I'm not opposed to it in particular, only to whack-a-mole that may break compat where we care while not increasing "security". But if the particular idea here is not going to break compat we care about, and (even better) if other browsers do likewise, sure: one more whack.

/be
Is this an issue still even though we no longer let you call document.open or document.write on cross origin documents?
Brendan, I don't think this should break compat any more than we already have; in fact it will only affect cases when the site being attacked has UniversalBrowserRead privileges

Jonas, see comment 4.
Er, actually changing the window-preservation behavior might affect more things than that.  I need to think about how to test behavior in other UAs...
bz: ok, full speed ahead then.

/be
(In reply to comment #36)
> bz: ok, full speed ahead then.

Er, on the other-UAs-testing, that is ;-).

/be
Could we make it such that when domain A calls window.open to open a document from domain B, the about:blank document we create is given a principal from domain B.
That could break sites, I suspect....
How? Seems very unlikely that sites rely on being able to write stuff into the about:blank window before the other site has loaded. Especially since they'd essentially be racing with the load.
They might be relying on doing other things though (setting event listeners, document.location stuff, etc).

You don't race with the load if you do stuff right after open() returns.

In any case, that wouldn't help with the initial bug as filed here, would it?
(In reply to comment #41)
> You don't race with the load if you do stuff right after open() returns.

Right. Sicking writes "JS execution model is run-to-completion" 100 times on the blackboard :-P.

> In any case, that wouldn't help with the initial bug as filed here, would it?

Right.

/be
yes, it would break people that do

w=window.open(<cross domain>);
w.document.write(...);

But is that really common enough that we need to care? It seems like a very nonsensical thing to do.

And wouldn't it help since calling .write would cause XOW to throw? (or in most cases even accessing .document would throw even)
Jonas, the testcases in this bug doesn't use window.open.
Talked with bz about this and the issue seems fixed. We no longer reuse the inner window when we shouldn't because document.open can these days never change (the origin of) the principal of the document.

However still reuse the inner window if:
1257 // -- We are not currently a content window (i.e., we're currently a chrome
1258 //    window).

This is a little concerning. Does that mean that if there somehow is a about:blank chrome window somewhere that content can reach there can be badness?
Content wouldn't be able to touch it, though, because the principals are different.  So there shouldn't be badness (unless it's UniversalXPConnect content, but then we lose anyway).

I think we need the chrome window exception so that chrome can open a window, set some stuff on it, then load something non-system in there and have it see those things.  Or something.

Maybe we can remove that check (in a separate bug); need to do some CVS archeology.
Makes sense, then all we need is a testcase.
Flags: in-testsuite?
Is this going to affect compat? I think we really need to get this into a 3.1 rather than hope to get it into a minor update later.
Flags: blocking1.9.1?
Given the compat risk I think this should be a b3 blocker.
Flags: blocking1.9.1? → blocking1.9.1+
Priority: P3 → P1
The only thing remaining here is writing tests, there's no code changes or anything else to do here that has any compatibility impact. Marking this a P2 as it'd be awesome to get the tests in place for the release, but that's all that's remaining here.
Priority: P1 → P2
Target Milestone: mozilla1.8.1 → mozilla1.9.1
We should also file a new bug on what comment 46 talks about
Actually, comment 4 covers an actual code change that would have possible compat impact that I think we should make...
Ok. Over to bz per discussion on IRC.
Assignee: mrbkap → bzbarsky
Well, some good news is that the frame version also doesn't work because we now _never_ reuse the inner window for subframes.  That seems to be compatible-enough with the web, apparently!

The testcase could be recast in terms of an additional opened window, though.
OK.  So in fact the change for bug 332182 did fix this bug completely, by making us not reuse the inner if the two documents were not strictly same-origin (not just subsuming, but actually same-origin).

Then we added some extra fixage on trunk when we disallowed even doing document.write on documents from a different origin.

In any case, this is fixed.  I'll post a testcase that exercises the various pieces, and in particular makes sure that we don't reuse the inner on origin change even if privileged code is about.  The chrome issue is still outstanding, but I don't think it's really related to this bug.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
This _is_ sorta an issue on the 1.8 branch, maybe (if someone is using privileged code, etc), but I'm not going to worry about that too much, honestly.
Depends on: 332182
Attached patch MochitestSplinter Review
Pushed tests as http://hg.mozilla.org/mozilla-central/rev/95c1ce032033
Flags: in-testsuite? → in-testsuite+
(In reply to comment #55)
> OK.  So in fact the change for bug 332182 did fix this bug completely

Does that make this fixed1.9.1 as well?
Yeah, and fixed1.9.0.0 if you want too.
Keywords: fixed1.9.1
Keywords: fixed1.9.0
Flags: blocking1.8.1.next-
Group: core-security
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: