Last Comment Bug 696331 - "ASSERTION: Principal mismatch. Not good" with marquee, document.write, onmouseleave
: "ASSERTION: Principal mismatch. Not good" with marquee, document.write, onmo...
Status: RESOLVED FIXED
[sg:high]
: assertion, testcase
Product: Core
Classification: Components
Component: DOM: Events (show other bugs)
: Trunk
: x86_64 Linux
: -- normal (vote)
: mozilla13
Assigned To: Bobby Holley (busy)
:
Mentors:
Depends on:
Blocks: 325861 594645
  Show dependency treegraph
 
Reported: 2011-10-21 00:36 PDT by Jesse Ruderman
Modified: 2012-02-27 15:07 PST (History)
6 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
testcase (follow instructions) (146 bytes, text/html)
2011-10-21 00:36 PDT, Jesse Ruderman
no flags Details
stack trace (5.05 KB, text/plain)
2011-10-21 00:36 PDT, Jesse Ruderman
no flags Details
warning stack (10.64 KB, text/plain)
2011-11-09 16:39 PST, Bobby Holley (busy)
no flags Details
Clear the document principle in nsDocument::ResetToURI after teardown, rather than before. v1 (1.68 KB, patch)
2012-02-14 10:59 PST, Bobby Holley (busy)
jst: review+
Details | Diff | Review

Description Jesse Ruderman 2011-10-21 00:36:13 PDT
Created attachment 568619 [details]
testcase (follow instructions)

###!!! ASSERTION: Principal mismatch.  Not good: 'strcmp(jsClass->name, "Location") == 0 ? NS_SUCCEEDED(CheckSameOriginPrincipal(result, principal)) : result == principal', file caps/src/nsScriptSecurityManager.cpp, line 2504
Comment 1 Jesse Ruderman 2011-10-21 00:36:47 PDT
Created attachment 568620 [details]
stack trace
Comment 2 Johnny Stenback (:jst, jst@mozilla.com) 2011-11-03 13:47:17 PDT
Bobby, can you look into this? Guessing that this is sg:high
Comment 3 Jesse Ruderman 2011-11-07 21:34:06 PST
This assertion also shows up in bug 700561.
Comment 4 Bobby Holley (busy) 2011-11-09 16:39:56 PST
Created attachment 573370 [details]
warning stack

The assertion comes from deep inside our XBL implementation of marquee, which I'd sure love to avoid digging into. ;-) It's preceded by a warning, though:

WARNING: Moving XPConnect wrappedNative to new scope, but can't fixup __proto__

This comes from within the call to document.open(), as a result of wrapper reparenting. Not sure if they're related, or if that's what the issue is. I can dig in more soon.

Attached the warning stack trace, in case it's useful to anyone.
Comment 5 Bobby Holley (busy) 2011-11-10 06:18:13 PST
This bug involves a crap-ton of stuff I've never worked with before, but I'm still managing to make decent progress. ;-)

In a nutshell, the warning does seem to be related. The principal mismatch reported in the assertion is the result of nsScriptSecurityManager::doGetObjectPrincipal returning different principals depending on whether aAllowShortCircuit is true or not. The object in question is the function object of onmouseleave.

The default path uses short circuit, but in debug mode we double-check at the end of doGetObjectPrincipal that we get the same result by with aShortCircuit == false. In this case we don't.

With short circuit, the principal is the one from i.html. Without short circuit, the principal is the null principal.

What appears to be happening is that we fail to entirely reparent the HTMLHtmlElement __proto__ to the new document (which presumably has the null principal). So we're left in a state where the short circuit leads us to the old global, and the long path leads us to the new global. I'm guessing that this has to do with the short circuit going through the __proto__, but I haven't dug into that yet.

More to come, though it may get interrupted by mozCamp EU stuff. :-)
Comment 6 Johnny Stenback (:jst, jst@mozilla.com) 2011-11-10 09:43:14 PST
Cc:ing more people who know about wrapper reparenting and failure to do so.
Comment 7 Bobby Holley (busy) 2011-12-25 15:34:14 PST
Ok, so I was totally wrong about this one. The wrapper reparenting thing was a red herring. On the plus side, it caused me to learn all about that code, which was probably time well spent. After convincing myself that the wrapper reparenting code was doing the right thing, I re-evaluated various initial assumptions I made about things.

From what I gather, i.html is actually the correct principal in this case, meaning that the inner window (short-circuit path via the reparented wrapper) is actually giving us the right answer. The wrong answer comes from the document (non-short-circuited path via the parent chain).

What's happening is that the call to nsDocument::ResetToURI begins with a call to SetPrincipal(nsnull): http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsDocument.cpp#2107

It then ejects the old content from the DOM, and then finally calls SetPrincipal(aPrincipal).

Between the two calls SetPrincipal(), we eject the XBL-bound DIV from the DOM. When this ends up in the Binding Manager, it does a context push, which ends up triggering a query to GetCxSubjectPrincipal(), which calls doGetObjectPrincipal() on the calling function, which asserts.

The correct behavior here might be a bit debatable, but I think the short circuit path should probably be returning the null principle here as well. Currently, old content is being ejected from the DOM with the authority of the new principal, which doesn't seem quite right.

The fundamental issue here is that we have two principals - one on the document, and one on the window, and they're not always precisely synced up during document.open. Why do we have two, anyway?

It seems to me that the window should undergo the same old->null->new principal transition as the document, probably at the same time during nsDocument::ResetToURI().

I don't know this code well, though. What do people think? Maybe bz can weigh in?
Comment 8 Boris Zbarsky [:bz] (Out June 25-July 6) 2011-12-25 19:19:39 PST
> Why do we have two, anyway?

Before inner windows, the outer window and the document had to store principals independently, because you could have situations (e.g closed windows) where a window had no document...

jst or mrbkap know this stuff better than I do.
Comment 9 Bobby Holley (busy) 2012-01-03 21:26:32 PST
(In reply to Boris Zbarsky (:bz) from comment #8)
> jst or mrbkap know this stuff better than I do.

jst, can you weigh in here?
Comment 10 Bobby Holley (busy) 2012-01-04 16:47:19 PST
(In reply to Boris Zbarsky (:bz) from comment #8)
> > Why do we have two, anyway?
> 
> Before inner windows, the outer window and the document had to store
> principals independently, because you could have situations (e.g closed
> windows) where a window had no document...

This seems to be causing similar problems over in bug 700561.
Comment 11 Bobby Holley (busy) 2012-02-14 10:58:16 PST
So the basic story here is that, in nsHTMLDocument::Open, we call nsGlobalWindow::SetNewDocument on the outer window, which invokes teardown (FreeInnerObjects) on the current inner window. The inner window then saves a copy of its principal and nulls out its document, locking in the principal at that moment for the rest of its (limited) lifespan.

However, in ResetToURI, we alter the document principal, setting it to null before the teardown process. Within some of the teardown stuff we can still get ahold of the inner window, leading to the principal mismatch.

It seems like we have 4 options here:

1 - Stop caching the principal on the window once we null out mDoc, and return null (or the null principal) instead. This might break some things, and is only coincidentally correct in this case (since the document's principal happens to be null as well). I don't think this is a good option.

2 - Call SetNewDocument on the window later. I'm not confident that this is easy to do with the current code.

3 - Null out the principal after document teardown, rather than before. This seems reasonable, and I don't see an obvious reason why we don't do it that way now. I've been digging through the blame a little bit, and it seems rooted in a long-ago security bug, bug 14443:

http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/content/base/src/nsDocument.cpp&rev=3.392&root=/cvsroot
http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&root=/cvsroot&subdir=mozilla/content/base/src&command=DIFF_FRAMESET&file=nsDocument.cpp&rev2=3.153&rev1=3.152

However, that code appeared to just keep the old principal around pseudo-indefinitely, so I'm not convinced the reasoning is still relevant.

4 - Ignore the bug, and wait until we can rip out principals entirely.

Given that, I think the reasonable options are 3 and 4. Since 4 requires no action, I've written a patch for 3, and I'll let jst decide via his review.

Also, it seems like we should declassify this bug, yeah?
Comment 12 Bobby Holley (busy) 2012-02-14 10:59:20 PST
Created attachment 597099 [details] [diff] [review]
Clear the document principle in nsDocument::ResetToURI after teardown, rather than before. v1

Attaching a patch per comment 11, flagging jst to make the call here.
Comment 13 Johnny Stenback (:jst, jst@mozilla.com) 2012-02-14 11:47:42 PST
Comment on attachment 597099 [details] [diff] [review]
Clear the document principle in nsDocument::ResetToURI after teardown, rather than before. v1

I think this is what we should do here, and I think this is a better option than 4, even if this is unlikely something that causes any actual problems for us.
Comment 14 Bobby Holley (busy) 2012-02-16 14:20:27 PST
Also, this bug should probably be unprotected, yeah?
Comment 15 Bobby Holley (busy) 2012-02-17 23:53:52 PST
Pushed to try: https://tbpl.mozilla.org/?tree=Try&rev=ebdd13c91942
Comment 16 Bobby Holley (busy) 2012-02-18 08:58:44 PST
Landed to m-i: http://hg.mozilla.org/integration/mozilla-inbound/rev/bab2083815b7

I think this bug can be un-hidden, but I'll let jst make the call.
Comment 17 Ed Morley [:emorley] 2012-02-20 10:25:22 PST
https://hg.mozilla.org/mozilla-central/rev/bab2083815b7
Comment 18 Johnny Stenback (:jst, jst@mozilla.com) 2012-02-27 15:07:57 PST
Yes, no need to keep this hidden.

Note You need to log in before you can comment on or make changes to this bug.