Last Comment Bug 691178 - "ASSERTION: should have a JS object by this point" with iframe hash navigation, GC
: "ASSERTION: should have a JS object by this point" with iframe hash navigatio...
Status: VERIFIED FIXED
[sg:nse][qa!][advisory-tracking+]
: assertion, testcase
Product: Core
Classification: Components
Component: XPConnect (show other bugs)
: Trunk
: x86_64 Mac OS X
: -- normal (vote)
: mozilla13
Assigned To: Bobby Holley (:bholley) (busy with Stylo)
:
: Andrew Overholt [:overholt]
Mentors:
Depends on:
Blocks: 594645
  Show dependency treegraph
 
Reported: 2011-10-02 11:38 PDT by Jesse Ruderman
Modified: 2012-05-22 16:46 PDT (History)
8 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
verified
wontfix


Attachments
testcase (requires extension for GC) (486 bytes, text/html)
2011-10-02 11:38 PDT, Jesse Ruderman
no flags Details
stack traces (4.85 KB, text/plain)
2011-10-02 11:38 PDT, Jesse Ruderman
no flags Details
Refuse to wrap windows that are on the way out. v1 (1.25 KB, patch)
2012-02-17 17:11 PST, Bobby Holley (:bholley) (busy with Stylo)
jst: review+
Details | Diff | Splinter Review
Refuse to wrap windows that are on the way out. v2 (1.03 KB, patch)
2012-02-24 15:24 PST, Bobby Holley (:bholley) (busy with Stylo)
jst: review+
Details | Diff | Splinter Review

Description Jesse Ruderman 2011-10-02 11:38:23 PDT
Created attachment 564067 [details]
testcase (requires extension for GC)

1. Install https://www.squarefree.com/extensions/domFuzzLite2.xpi
2. Load the testcase

Result:

###!!! ASSERTION: should have a JS object by this point: 'win->GetOuterWindowInternal()->IsCreatingInnerWindow()', file dom/base/nsDOMClassInfo.cpp, line 4916

###!!! ASSERTION: Non-global object has the wrong flags: '!(jsclazz->flags & JSCLASS_IS_GLOBAL)', file js/src/xpconnect/src/xpcwrappednative.cpp, line 1146
Comment 1 Jesse Ruderman 2011-10-02 11:38:42 PDT
Created attachment 564068 [details]
stack traces
Comment 2 Johnny Stenback (:jst, jst@mozilla.com) 2011-10-05 16:44:05 PDT
Blake, can you suggest a severity for this?
Comment 3 Johnny Stenback (:jst, jst@mozilla.com) 2011-11-03 13:29:03 PDT
Bobby, can you dig in here? It's unclear what the severity of this is, so as soon as you get an understanding of what's going on here we can put the appropriate rating on this.
Comment 4 Johnny Stenback (:jst, jst@mozilla.com) 2012-01-11 18:25:37 PST
Bobby, when do you think you'll get to investigating here?
Comment 5 Bobby Holley (:bholley) (busy with Stylo) 2012-01-11 18:52:24 PST
(In reply to Johnny Stenback (:jst, jst@mozilla.com) from comment #4)
> Bobby, when do you think you'll get to investigating here?

Not totally sure. Ideally sometime in the next few weeks, but I have a pretty full plate at the moment. :-(
Comment 6 Bobby Holley (:bholley) (busy with Stylo) 2012-01-19 21:12:18 PST
Hm, I can't reproduce this (using a recent nightly debug build with domFuzzLite3, OSX Lion, x86_64). Jesse, can you still reproduce?
Comment 7 Jesse Ruderman 2012-01-21 05:51:16 PST
I can still reproduce. Apparently the testcase has to be local.
Comment 8 Bobby Holley (:bholley) (busy with Stylo) 2012-01-21 07:27:33 PST
Still can't reproduce on macosx, but I can on my 32-bit linux VM. I'll take a look.
Comment 9 Johnny Stenback (:jst, jst@mozilla.com) 2012-02-15 17:01:03 PST
Bobby, is this still behaving the same after your cpg related xpconnect changes?
Comment 10 Bobby Holley (:bholley) (busy with Stylo) 2012-02-15 19:03:18 PST
I've been looking into this today. I don't have a handle on the problem yet, but my sense thus far is that it's unaffected by c-p-g. I'll have more info shortly.
Comment 11 Bobby Holley (:bholley) (busy with Stylo) 2012-02-17 16:55:10 PST
Ok, so here's what's going on here:

1 - We make an iframe. We give it a url, and an onload handler P. We then insert the iframe into the dom to kick off the load.

2 - The iframe loads, and onload is fired, which triggers P. P simply does setTimeout(j, 0), sticking j at the end of the event loop.

3 - Once onload has been fired, pageshow fires. This triggers a chrome event listener Q, defined here: http://mxr.mozilla.org/mozilla-central/source/browser/base/content/browser.js#1533

4 - Q is called. It creates a closure R that holds onto the event (!), and posts that closure to the end of the event queue.

5 - We return to the event loop and process the next thing in line. j from step (2) is called. It sets the hash on the iframe window to something arbitrary. This is necessary for the testcase, because it appears to keep the native window alive across the next GC, but _not_ the window JS Object. I think this has to do with how the docshell handles hash loads (something about creating an asynchronous load event, but it actually just reusing the same document), but I'm not totally sure.

6 - Next, j does a redundant appendChild to insert the iframe into the master document. Because it's already there, this first removes it, which causes the frameloader and docshell to be destroyed. This tears down the document and window, and makes them GC-able (for the window, this happens when we destroy the DocShell and call nsGlobalWindow::CleanUp, where we null out mInnerWindowHolder). The window loses its pointer to the document, and the document loses mScriptGlobalObject. But the document still has the weak mScopeObject pointer, so it can find the window as long as it's alive.

6 - Finally, j forces a GC. This collects the window object, eventually invoking nsGlobalWindow::OnFinalize, which nulls out the JSObject pointer.

7 - j returns, and then R rears its ugly head. The first thing it does is to access originalTarget on the event, which corresponds to the (now dead) document. We try to create a wrapper for it, and call nsNodeSH::PreCreate. The native window is still alive and accessible by the weak pointer, so we proceed in trying to wrap it. But the underlying JS Object is long gone, so win->GetFlatJSObject() is null. This confuses us in nsWindowSH::PreCreate, because it looks like the window is not-yet-born, but it's actually long-dead. So we assert.

So in a nutshell, the issue is the native document stays alive (via the Event), and the native window stays alive (via the short-circuit hashtag docshell load), but they're both torn down, and their wrappers are gone.

I think the best solution here is just to throw when we try to wrap a window that's been closed. This will presumably also throw for torn-down documents, because they need to wrap their window before wrapping themselves, assuming we always tear windows and documents down at the same time. Alternatively, we might not care about wrapping torn-down documents at all.

I'm going to write up this fix and attach it.
Comment 12 Jesse Ruderman 2012-02-17 16:58:33 PST
> I think the best solution here is just to throw when we try to wrap a window that's 
> been closed.

With this fix, will the web-page-visible behavior still depend on GC timing?
Comment 13 Bobby Holley (:bholley) (busy with Stylo) 2012-02-17 17:11:25 PST
Created attachment 598447 [details] [diff] [review]
Refuse to wrap windows that are on the way out. v1

Attaching the fix proposed in the previous comment. With this fix, the testcase throws in browser.js:pageShowEventHandlers, which is (I think) what we want.

Flagging jst for review.

What do we think about the security rating here? I'd like to turn the testcase into a crashtest, which is why I switched it to a fatal assert. I'll let jst make the call here.
Comment 14 Jesse Ruderman 2012-02-17 23:15:54 PST
> I'd like to turn the testcase into a crashtest, which is why I switched it to a 
> fatal assert.

Crashtest considers unexpected NS_ASSERTIONs to be test failures, fwiw.
Comment 15 Bobby Holley (:bholley) (busy with Stylo) 2012-02-17 23:50:38 PST
(In reply to Jesse Ruderman from comment #14)
> > I'd like to turn the testcase into a crashtest, which is why I switched it to a 
> > fatal assert.
> 
> Crashtest considers unexpected NS_ASSERTIONs to be test failures, fwiw.

Oh, that's interesting. Is there any way to trigger a GC in crashtests? I was under the impression that there wasn't any way to access SpecialPowers and all that in that context.
Comment 16 Jesse Ruderman 2012-02-17 23:54:22 PST
I guess not, which means you'll be using mochitest, which doesn't yet consider NS_ASSERTION to be a failure (bug 404077).
Comment 17 Bobby Holley (:bholley) (busy with Stylo) 2012-02-18 09:03:37 PST
(In reply to Jesse Ruderman from comment #12)
> > I think the best solution here is just to throw when we try to wrap a window that's 
> > been closed.
> 
> With this fix, will the web-page-visible behavior still depend on GC timing?

It does, I think. But I'm not really sure how to change that. We'd either need to find a way to trace all the events in the event queue, or alter the ownership/teardown model for windows I guess.
Comment 18 Johnny Stenback (:jst, jst@mozilla.com) 2012-02-23 22:03:06 PST
Comment on attachment 598447 [details] [diff] [review]
Refuse to wrap windows that are on the way out. v1

+  // See bug 691178 comment 11 for why this is necessary.
+  if (!win->IsClosedOrClosing())
+    return NS_ERROR_FAILURE;
+

I would say this check should be inside the if !winObj check below, it's fine to reach into another window after it's been closed, which could trigger this PreCreate() hook being called. I think we should throw here only in the case where the window has been collected already, and we're attempting to re-wrap it.

r=jst with that changed.
Comment 19 Bobby Holley (:bholley) (busy with Stylo) 2012-02-24 15:24:34 PST
Created attachment 600547 [details] [diff] [review]
Refuse to wrap windows that are on the way out. v2

It looks like I actually had the previous check backwards. :-( Reflagging jst on that account.

I also gave up trying to make a mochitest in favor of working on other stuff. So I've also reverted to an NS_ASSERTION.
Comment 20 Bobby Holley (:bholley) (busy with Stylo) 2012-02-24 17:21:53 PST
Pushed this to try: https://tbpl.mozilla.org/?tree=Try&rev=67d964886265
Comment 21 Bobby Holley (:bholley) (busy with Stylo) 2012-02-29 18:10:12 PST
Pushed to m-i: http://hg.mozilla.org/integration/mozilla-inbound/rev/782603d06047
Comment 22 Bobby Holley (:bholley) (busy with Stylo) 2012-03-01 09:01:22 PST
Marking at mak's request:

merged to m-c:
https://hg.mozilla.org/mozilla-central/rev/782603d06047
Comment 23 Daniel Veditz [:dveditz] 2012-03-29 14:57:22 PDT
Given the understanding in comment 11 is this an exploitable condition? What happened after nsWindowSH::PreCreate returned after firing the assertion? R has an empty originalTarget, then what? Other than the assertions I don't see any obvious evidence of bad memory use in a debug build.

Or should we just fix this on ESR and not worry about it.
Comment 24 Bobby Holley (:bholley) (busy with Stylo) 2012-03-29 16:00:52 PDT
Hm. Well, the parent object will remain unchanged, so we'll end up wrapping the DOM object in a scope different from its old window. I don't see any glaring security issues with this, though.

Given that the bug is hard enough to trigger as is, I'd be tempted to just let it ride. We could also land the fix though, since it's super low-risk. That might be a better idea.
Comment 25 Vlad [QA] 2012-05-22 00:45:59 PDT
I have verified this on:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:13.0) Gecko/20120517 Firefox/13.0 debug beta build

I've followed the steps from the description and no assertion is shown in the terminal.
Marking this as Verified Fixed.

Note You need to log in before you can comment on or make changes to this bug.