Open Bug 1391211 Opened 7 years ago Updated 2 years ago

Assertion failure: aScriptGlobalObject || !mAnimationController || mAnimationController->IsPausedByType( nsSMILTimeContainer::PAUSE_PAGEHIDE | nsSMILTimeContainer::PAUSE_BEGIN) (Clearing window pointer while animations are unpaused),

Categories

(Core :: Layout, defect, P3)

56 Branch
defect

Tracking

()

Tracking Status
firefox-esr52 --- unaffected
firefox55 --- unaffected
firefox56 --- wontfix
firefox57 --- affected
firefox58 --- affected

People

(Reporter: cbook, Unassigned)

References

(Blocks 1 open bug, )

Details

(Keywords: assertion, regression, testcase)

Attachments

(2 files)

Assertion failure: aScriptGlobalObject || !mAnimationController || mAnimationController->IsPausedByType( nsSMILTimeContainer::PAUSE_PAGEHIDE | nsSMILTimeContainer::PAUSE_BEGIN) (Clearing window pointer while animations are unpaused), at z:/build/build/src/dom/base/nsDocument.cpp:4816

found by bughunter and reproduced on latest nightly from taskcluster build from https://hg.mozilla.org/mozilla-central/rev/932388b8c22c9775264e543697ce918415db9e23

Steps to reproduce:
-> Load http://forum.kerbalspaceprogram.com/index.php?/topic/164055-how-to-landing-leg-a-base-autostrut-explosion-conundrum/#comment-3142152
--> Assertion failure
Priority: -- → P3
Attached file Testcase
Attached reduced testcase.
Depends on: domino
Flags: in-testsuite?
Keywords: testcase
Blocks: domino
No longer depends on: domino
I can't reproduce the assertions on the live site from #c0 (even with builds from around the time it was filed), but I can with the attached testcase.

For the attached testcase:
INFO: Last good revision: 9c8803d97c7dc4d536d7f218f3052f80de94ccb2
INFO: First bad revision: 956306ea34f36514350ce6f03c114eedbabe934f
INFO: Pushlog:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=9c8803d97c7dc4d536d7f218f3052f80de94ccb2&tochange=956306ea34f36514350ce6f03c114eedbabe934f
Blocks: 1343728
Flags: needinfo?(michael)
Version: unspecified → 56 Branch
(In reply to Ryan VanderMeulen [:RyanVM] from comment #3)
> I can't reproduce the assertions on the live site from #c0 (even with builds
> from around the time it was filed), but I can with the attached testcase.
> 
> For the attached testcase:
> INFO: Last good revision: 9c8803d97c7dc4d536d7f218f3052f80de94ccb2
> INFO: First bad revision: 956306ea34f36514350ce6f03c114eedbabe934f
> INFO: Pushlog:
> https://hg.mozilla.org/integration/mozilla-inbound/
> pushloghtml?fromchange=9c8803d97c7dc4d536d7f218f3052f80de94ccb2&tochange=9563
> 06ea34f36514350ce6f03c114eedbabe934f

This problem is not a particular problem with window.open, but rather a problem with nested event loops in general.

The summary of what's going on here is as follows.

            o1 = document.createElement('frame')
            document.documentElement.appendChild(o1)
            o1.contentDocument.addEventListener('visibilitychange', function () { window.open('', '', '4') }, false)

// At this point we have an iframe in the document with the about:blank document displayed,
// and a listener for the visibilitychange event registered on that document.

            o1.src = 'data:text/html,<svg>'

// Setting the `src` attribute doesn't perform the load synchronously, but rather queues up
// some work which should perform the load later.

            document.documentElement.appendChild(o1)

// re-appending the document to the body causes the nsFrameLoader to be torn down, 
// as before we can insert it at its new location, we first have to unmount it from 
// its original location. What happens is effectively the following:
 * We notify the current document that it is being hidden, as it's about to be unloaded.
 * The document detects that its being hidden, and tries to clean up its animation controller.
 * The document synchronously dispatches the visibilitychange event to content, this allows JS to run. 
 * JS runs window.open, which causes a nested event loop to spin (since bug 1343728). 
   This would happen with alert() or sync XHR before then. We could also probably simulate 
   this with any action which synchronously loads a new document in the iframe.
 * While the nested event loop is spinning, the asynchronous event to load the svg document 
   into the iframe is fired, which sets up a new document inside of the iframe with an animation controller.
 * The document finishes clearing its visibility, and proceeds to try to tear down the docshell.
 * This assertion fails, because the current document in the docshell wasn't told that it has been hidden.

My immediate thought is to, in nsDocumentViewer::PageHide, loop calling mDocument->OnPageHide on its document until its document stops changing. This could theoretically allow for infinite loops however, as we constantly replace the document in the visibilitychange event.

ni? olli to see if he has a better idea.
Flags: needinfo?(nika) → needinfo?(bugs)
We could also just move disabling the animation controller to happen after we notify content, which would fix this particular crash.
(In reply to Nika Layzell [:mystor] from comment #4)
> My immediate thought is to, in nsDocumentViewer::PageHide, loop calling
> mDocument->OnPageHide on its document until its document stops changing.

I don't understand what this means.
Don't we change nsDocumentViewer here all the time, not just the document of it.?
Flags: needinfo?(bugs) → needinfo?(nika)
I'll do my best to reconstruct the path of logic which got me to writing that post:

1. When calling AppendElement to move the iframe around, we first destroy the old iframe's docshell, which thus calls nsDocShell::Destroy (http://searchfox.org/mozilla-central/rev/298033405057ca7aa5099153797467eceeaa08b5/docshell/base/nsDocShell.cpp#5892)

2. The crash happens here when we call mContentViewer->Destroy() in this function, as mContentViewer->Destroy calls nsDocument->Destroy which calls nsDocument->SetScriptGlobalObject, which asserts that its animation controller has been cleaned up. (http://searchfox.org/mozilla-central/rev/298033405057ca7aa5099153797467eceeaa08b5/docshell/base/nsDocShell.cpp#5967 .. http://searchfox.org/mozilla-central/rev/298033405057ca7aa5099153797467eceeaa08b5/dom/base/nsDocument.cpp#4991-4995)

3. This value is supposed to be cleaned up by the call to FirePageHideNotification(true) in nsDocShell::Destroy, as the animation context is cleaned up by hiding the document (http://searchfox.org/mozilla-central/rev/298033405057ca7aa5099153797467eceeaa08b5/docshell/base/nsDocShell.cpp#5923)

4. The specific problem here is that mAnimationController is cleaned up well before we update the visibility state, which fires an event to synchronously run javascript, so javscript gets the chance to replace the document inside the nsDocShell before the docshell goes away with a new document which has an animation controller (the svg documeent), and the assertion fires: (http://searchfox.org/mozilla-central/rev/298033405057ca7aa5099153797467eceeaa08b5/dom/base/nsDocument.cpp#9503-9505,9541)

I'm not sure my original idea was very good but from what I can tell:
a) We probably can't guarantee a document has received a PageHide before its docshell is cleared, as in pagehide we can replace the document and the new one won't get pagehide, so.
b) We should probably change the assertion and modify the code around it to make sure we notify the animation controller and clean it up ourselves before we need it to be gone.
Flags: needinfo?(nika)
I tried the simplest fix I could think of, and it didn't work (just moving when we call mAnimationController->OnPageHide()). I did make a crashtest for this bug, so I figured I'd just post the patch so whoever actually fixes it (perhaps me if I figure out what the best solution is) doesn't have to do that :-).

MozReview-Commit-ID: 1p67dhny7Vu
Has Regression Range: --- → yes
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: