Assertion failure: aScriptGlobalObject || !mAnimationController || mAnimationController->IsPausedByType( SMILTimeContainer::PAUSE_PAGEHIDE | SMILTimeContainer::PAUSE_BEGIN) (Clearing window pointer while animations are unpaused), at /builds/worker/workspa

RESOLVED FIXED in Firefox 68

Status

()

defect
P3
normal
RESOLVED FIXED
3 months ago
3 months ago

People

(Reporter: jkratzer, Assigned: dholbert)

Tracking

(Blocks 1 bug, {assertion, testcase})

unspecified
mozilla68
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(firefox-esr60 wontfix, firefox66 wontfix, firefox67 wontfix, firefox68 fixed)

Details

Attachments

(7 attachments)

Reporter

Description

3 months ago
Posted file testcase.html

Testcase found while fuzzing mozilla-central rev ab709310d23f.rax = 0x0000561ab2b06e40 rdx = 0x0000000000000000
rcx = 0x00007f74d25876b3 rbx = 0x00007f74c3cb4000
rsi = 0x00007f74df2ef8b0 rdi = 0x00007f74df2ee680
rbp = 0x00007ffed9eb8f90 rsp = 0x00007ffed9eb8ee0
r8 = 0x00007f74df2ef8b0 r9 = 0x00007f74e044c740
r10 = 0x0000000000000000 r11 = 0x0000000000000000
r12 = 0x00007ffed9eb9000 r13 = 0x0000000000000000
r14 = 0x00007f74c3cb5068 r15 = 0x00007f74c4f13998
rip = 0x00007f74cdb640de
OS|Linux|0.0.0 Linux 4.18.0-16-generic #17~18.04.1-Ubuntu SMP Tue Feb 12 13:35:51 UTC 2019 x86_64
CPU|amd64|family 6 model 94 stepping 3|1
GPU|||
Crash|SIGSEGV /SEGV_MAPERR|0x0|0
0|0|libxul.so|mozilla::dom::Document::SetScriptGlobalObject(nsIScriptGlobalObject*)|hg:hg.mozilla.org/mozilla-central:mfbt/AlreadyAddRefed.h:ab709310d23f9b7b17ba50731c63666aaf67945b|128|0x0
0|1|libxul.so|mozilla::dom::Document::Destroy()|hg:hg.mozilla.org/mozilla-central:dom/base/Document.cpp:ab709310d23f9b7b17ba50731c63666aaf67945b|7519|0xe
0|2|libxul.so|nsDocumentViewer::Destroy()|hg:hg.mozilla.org/mozilla-central:layout/base/nsDocumentViewer.cpp:ab709310d23f9b7b17ba50731c63666aaf67945b|1806|0x18
0|3|libxul.so|nsDocumentViewer::Show()|hg:hg.mozilla.org/mozilla-central:layout/base/nsDocumentViewer.cpp:ab709310d23f9b7b17ba50731c63666aaf67945b|2091|0x18
0|4|libxul.so|nsPresContext::EnsureVisible()|hg:hg.mozilla.org/mozilla-central:layout/base/nsPresContext.cpp:ab709310d23f9b7b17ba50731c63666aaf67945b|1609|0x14
0|5|libxul.so|nsIPresShell::UnsuppressAndInvalidate()|hg:hg.mozilla.org/mozilla-central:layout/base/PresShell.cpp:ab709310d23f9b7b17ba50731c63666aaf67945b|3792|0x11
0|6|libxul.so|mozilla::PresShell::sPaintSuppressionCallback(nsITimer*, void*)|hg:hg.mozilla.org/mozilla-central:layout/base/PresShell.cpp:ab709310d23f9b7b17ba50731c63666aaf67945b|1827|0x10
0|7|libxul.so|nsTimerImpl::Fire(int)|hg:hg.mozilla.org/mozilla-central:xpcom/threads/nsTimerImpl.cpp:ab709310d23f9b7b17ba50731c63666aaf67945b|559|0x11
0|8|libxul.so|nsTimerEvent::Run()|hg:hg.mozilla.org/mozilla-central:xpcom/threads/TimerThread.cpp:ab709310d23f9b7b17ba50731c63666aaf67945b|260|0x18
0|9|libxul.so|mozilla::SchedulerGroup::Runnable::Run()|hg:hg.mozilla.org/mozilla-central:xpcom/threads/SchedulerGroup.cpp:ab709310d23f9b7b17ba50731c63666aaf67945b|295|0x15
0|10|libxul.so|nsThread::ProcessNextEvent(bool, bool*)|hg:hg.mozilla.org/mozilla-central:xpcom/threads/nsThread.cpp:ab709310d23f9b7b17ba50731c63666aaf67945b|1179|0x15
0|11|libxul.so|NS_ProcessNextEvent(nsIThread*, bool)|hg:hg.mozilla.org/mozilla-central:xpcom/threads/nsThreadUtils.cpp:ab709310d23f9b7b17ba50731c63666aaf67945b|482|0x11
0|12|libxul.so|mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*)|hg:hg.mozilla.org/mozilla-central:ipc/glue/MessagePump.cpp:ab709310d23f9b7b17ba50731c63666aaf67945b|88|0xa
0|13|libxul.so|MessageLoop::RunInternal()|hg:hg.mozilla.org/mozilla-central:ipc/chromium/src/base/message_loop.cc:ab709310d23f9b7b17ba50731c63666aaf67945b|315|0x17
0|14|libxul.so|MessageLoop::Run()|hg:hg.mozilla.org/mozilla-central:ipc/chromium/src/base/message_loop.cc:ab709310d23f9b7b17ba50731c63666aaf67945b|308|0x8
0|15|libxul.so|nsBaseAppShell::Run()|hg:hg.mozilla.org/mozilla-central:widget/nsBaseAppShell.cpp:ab709310d23f9b7b17ba50731c63666aaf67945b|137|0xd
0|16|libxul.so|XRE_RunAppShell()|hg:hg.mozilla.org/mozilla-central:toolkit/xre/nsEmbedFunctions.cpp:ab709310d23f9b7b17ba50731c63666aaf67945b|933|0x11
0|17|libxul.so|mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate*)|hg:hg.mozilla.org/mozilla-central:ipc/glue/MessagePump.cpp:ab709310d23f9b7b17ba50731c63666aaf67945b|238|0x5
0|18|libxul.so|MessageLoop::RunInternal()|hg:hg.mozilla.org/mozilla-central:ipc/chromium/src/base/message_loop.cc:ab709310d23f9b7b17ba50731c63666aaf67945b|315|0x17
0|19|libxul.so|MessageLoop::Run()|hg:hg.mozilla.org/mozilla-central:ipc/chromium/src/base/message_loop.cc:ab709310d23f9b7b17ba50731c63666aaf67945b|308|0x8
0|20|libxul.so|XRE_InitChildProcess(int, char**, XREChildData const*)|hg:hg.mozilla.org/mozilla-central:toolkit/xre/nsEmbedFunctions.cpp:ab709310d23f9b7b17ba50731c63666aaf67945b|771|0xc
0|21|firefox-bin|content_process_main(mozilla::Bootstrap*, int, char**)|hg:hg.mozilla.org/mozilla-central:ipc/contentproc/plugin-container.cpp:ab709310d23f9b7b17ba50731c63666aaf67945b|56|0x14
0|22|firefox-bin|main|hg:hg.mozilla.org/mozilla-central:browser/app/nsBrowserApp.cpp:ab709310d23f9b7b17ba50731c63666aaf67945b|265|0x11
0|23|libc-2.27.so||||0x21b97
0|24|firefox-bin|MOZ_ReportCrash|hg:hg.mozilla.org/mozilla-central:mfbt/Assertions.h:ab709310d23f9b7b17ba50731c63666aaf67945b|184|0x5

Assertion failure: aScriptGlobalObject || !mAnimationController || mAnimationController->IsPausedByType( SMILTimeContainer::PAUSE_PAGEHIDE | SMILTimeContainer::PAUSE_BEGIN) (Clearing window pointer while animations are unpaused), at /builds/worker/workspace/build/src/dom/base/Document.cpp:4358

Flags: in-testsuite?

Updated

3 months ago
Component: DOM: Core & HTML → DOM: Animation
Component: DOM: Animation → SVG
Priority: -- → P3

Adding Daniel because it looks like this assertion was added in bug 536834.

I reduced the test case somewhat and turned it into a crashtest.

I expect this can be reduced further but it's the weekend here now. I can look at this next week unless Daniel gets a chance before then.

(Adding bz too because when it involves XHR docs I normally end up asking him anyway.)

Assignee

Comment 4

3 months ago

(In reply to Brian Birtles (:birtles) from comment #3)

I can look at this next week unless Daniel gets a chance before then.

I probably won't get a chance right away (started to take a look & then noticed it requires the fuzzer extension to trigger GC/CC), so I'll hold off since you've already started investigating.

Thanks!

Flags: needinfo?(bbirtles)

(In reply to Brian Birtles (:birtles) from comment #3)
(started to take a look & then noticed it requires the fuzzer extension to trigger GC/CC)

It doesn't actually need the fuzzer extension. SpecialPowers.forceGC() etc. is enough. The reduced test case I posted in attachment 9051211 [details] [diff] [review] uses SpecialPowers and should trigger the assertion in a vanilla debug build.

Flags: needinfo?(bbirtles)
Assignee

Comment 6

3 months ago

Thanks! I hadn't noticed that. I'm taking a look now.

So, it seems this is simply a way to trigger Document destruction without having ever fired "PageHide()". And SMILAnimationController expects/asserts that PageHide should've been fired before it's torn down. This used to be more important back in the pre-RefreshDriver days, when (I think) SMIL had its own timer, which could hypothetically continue firing even after the document was destroyed if we neglected to officially stop sampling. "PageHide" is where we expect to stop sampling -- we expect to always receive that notification & officially stop sampling before the document & animation controller are torn down.

I wonder if it's known/intended that we can hit mozilla::dom::Document::Destroy in this scenario without ever having received a Document::OnPageHide call (and if that causes any other trouble)...?

Assignee

Comment 7

3 months ago

tl;dr:
So when running the attached crashtest, the httprequest "send" triggers PageShow and then PageHide, and then later on we call PageShow again (unpausing the animation controller), and then we fire sPaintSuppressionCallback which ends up flushing some cleanup work which destroys the (old?) document.

MORE DETAILS:

Here's how "send" triggers PageShow:

#0  0x00007f6b4f52f5fb in mozilla::dom::Document::OnPageShow(bool, mozilla::dom::EventTarget*, bool) (this=0x7f6b3f950000, aPersisted=false, aDispatchStartTarget=0x0, aOnlySystemGroup=false)
    at $SRC/dom/base/Document.cpp:7801
#1  0x00007f6b528c0395 in nsDocumentViewer::LoadComplete(nsresult) (this=0x7f6b4376a9b0, aStatus=-2142109678) at $SRC/layout/base/nsDocumentViewer.cpp:1128
#2  0x00007f6b548c0582 in nsDocShell::EndPageLoad(nsIWebProgress*, nsIChannel*, nsresult) (this=0x7f6b43589000, aProgress=0x7f6b43589028, aChannel=0x7f6b5e2c1f68, aStatus=-2142109678)
    at $SRC/docshell/base/nsDocShell.cpp:6588
#3  0x00007f6b548bfe0d in nsDocShell::OnStateChange(nsIWebProgress*, nsIRequest*, unsigned int, nsresult)
    (this=0x7f6b43589000, aProgress=0x7f6b43589028, aRequest=0x7f6b5e2c1f68, aStateFlags=131088, aStatus=-2142109678) at $SRC/docshell/base/nsDocShell.cpp:6389
#4  0x00007f6b4e934a87 in nsDocLoader::DoFireOnStateChange(nsIWebProgress*, nsIRequest*, int&, nsresult)
    (this=0x7f6b43589000, aProgress=0x7f6b43589028, aRequest=0x7f6b5e2c1f68, aStateFlags=@0x7ffed2ab281c: 131088, aStatus=-2142109678)
    at $SRC/uriloader/base/nsDocLoader.cpp:1312
#5  0x00007f6b4e9344c1 in nsDocLoader::doStopDocumentLoad(nsIRequest*, nsresult) (this=0x7f6b43589000, request=0x7f6b5e2c1f68, aStatus=-2142109678)
    at $SRC/uriloader/base/nsDocLoader.cpp:871
#6  0x00007f6b4e9328f8 in nsDocLoader::DocLoaderIsEmpty(bool) (this=0x7f6b43589000, aFlushLayout=true) at $SRC/uriloader/base/nsDocLoader.cpp:709
#7  0x00007f6b4e933e0f in nsDocLoader::OnStopRequest(nsIRequest*, nsresult) (this=0x7f6b43589000, aRequest=0x7f6b5e2c1f68, aStatus=-2142109678)
    at $SRC/uriloader/base/nsDocLoader.cpp:597
#8  0x00007f6b4d300ff7 in mozilla::net::nsLoadGroup::RemoveRequest(nsIRequest*, nsISupports*, nsresult) (this=0x7f6b43725d90, request=0x7f6b5e2c1f68, ctxt=0x0, aStatus=-2142109678)
    at $SRC/netwerk/base/nsLoadGroup.cpp:568
#9  0x00007f6b4d2c8b2d in nsBaseChannel::OnStopRequest(nsIRequest*, nsresult) (this=0x7f6b5e2c1f20, request=0x7f6b3f94c2f0, status=-2142109678)
    at $SRC/netwerk/base/nsBaseChannel.cpp:786
#10 0x00007f6b4d2fdb87 in nsInputStreamPump::OnStateStop() (this=0x7f6b3f94c2f0) at $SRC/netwerk/base/nsInputStreamPump.cpp:655
#11 0x00007f6b4d2fca63 in nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream*) (this=0x7f6b3f94c2f0, stream=0x7f6b3f9bd0d0)
    at $SRC/netwerk/base/nsInputStreamPump.cpp:403
#12 0x00007f6b4d0b457b in nsInputStreamReadyEvent::Run() (this=0x7f6b3f95e9c0) at $SRC/xpcom/io/nsStreamUtils.cpp:91
#13 0x00007f6b4d1262a9 in mozilla::SchedulerGroup::Runnable::Run() (this=0x7f6b3f95e740) at $SRC/xpcom/threads/SchedulerGroup.cpp:295
#14 0x00007f6b4d152a22 in nsThread::ProcessNextEvent(bool, bool*) (this=0x7f6b43724050, aMayWait=true, aResult=0x7ffed2ab37b7)
    at $SRC/xpcom/threads/nsThread.cpp:1179
#15 0x00007f6b4d156033 in NS_ProcessNextEvent(nsIThread*, bool) (aThread=0x7f6b43724050, aMayWait=true) at $SRC/xpcom/threads/nsThreadUtils.cpp:482
#16 0x00007f6b5221415d in mozilla::SpinEventLoopUntil<(mozilla::ProcessFailureBehavior)1, mozilla::dom::XMLHttpRequestMainThread::SendInternal(mozilla::dom::BodyExtractorBase const*, bool)::$_0>(mozilla::dom::XMLHttpRequestMainThread::SendInternal(mozilla::dom::BodyExtractorBase const*, bool)::$_0&&, nsIThread*) (aPredicate=..., aThread=0x0) at ../../dist/include/nsThreadUtils.h:348
#17 0x00007f6b52213581 in mozilla::dom::XMLHttpRequestMainThread::SendInternal(mozilla::dom::BodyExtractorBase const*, bool) (this=0x7f6b40fce400, aBody=0x0, aBodyIsDocumentOrString=false)
    at $SRC/dom/xhr/XMLHttpRequestMainThread.cpp:2868
#18 0x00007f6b522125cd in mozilla::dom::XMLHttpRequestMainThread::Send(JSContext*, mozilla::dom::Nullable<mozilla::dom::DocumentOrBlobOrArrayBufferViewOrArrayBufferOrFormDataOrURLSearchParamsOrUSVString> const&, mozilla::ErrorResult&) (this=0x7f6b40fce400, aCx=0x7f6b43325000, aData=..., aRv=...) at $SRC/dom/xhr/XMLHttpRequestMainThread.cpp:2645
#19 0x00007f6b50568447 in mozilla::dom::XMLHttpRequest_Binding::send(JSContext*, JS::Handle<JSObject*>, mozilla::dom::XMLHttpRequest*, JSJitMethodCallArgs const&)
    (cx=0x7f6b43325000, obj=(JSObject * const) 0x1045e23d4220 [object XMLHttpRequest], self=0x7f6b40fce400, args=...) at ./XMLHttpRequestBinding.cpp:1342

This stack unwinds up to the NS_ProcessNextEvent call, which then handles another nsInputStreamReadyEvent, which calls nsInputStreamPump::OnStateStart, which:
(1) (~11 stack levels deeper) calls Document::OnPageHide and pauses the SMILAnimationController.
(2) calls nsDocShell::SetupNewViewer, which puts a new nsDocumentViewer in place with a non-null mPreviousViewer.

Then, a bit later (after we leave XMLHttpRequestMainThread::Send), we end up in DispatchContentLoadedEvents which calls EndPageLoad / OnPageShow on the old nsDocument. (NOTE: Maybe this is supposed to be happening for the new document?)

So, we're back to having an unpaused animation controller in that old document.

Then, a bit later, mozilla::PresShell::sPaintSuppressionCallback fires for the new nsDocumentViewer, which triggers this callstack with the assertion failing at the innermost stack level here:

#0  0x00007f6b4f51e53c in mozilla::dom::Document::SetScriptGlobalObject(nsIScriptGlobalObject*) (this=0x7f6b3f950000, aScriptGlobalObject=0x0)
    at $SRC/dom/base/Document.cpp:4366
#1  0x00007f6b4f52ecdf in mozilla::dom::Document::Destroy() (this=0x7f6b3f950000) at $SRC/dom/base/Document.cpp:7535
#2  0x00007f6b528bbc79 in nsDocumentViewer::Destroy() (this=0x7f6b4376a9b0) at $SRC/layout/base/nsDocumentViewer.cpp:1806
#3  0x00007f6b528c4936 in nsDocumentViewer::Show() (this=0x7f6b447e4350) at $SRC/layout/base/nsDocumentViewer.cpp:2091
#4  0x00007f6b528f81c6 in nsPresContext::EnsureVisible() (this=0x7f6b40fce800) at $SRC/layout/base/nsPresContext.cpp:1612
#5  0x00007f6b5283555b in nsIPresShell::UnsuppressAndInvalidate() (this=0x7f6b3f91a000) at $SRC/layout/base/PresShell.cpp:3791
#6  0x00007f6b5282b8d4 in mozilla::PresShell::UnsuppressPainting() (this=0x7f6b3f91a000) at $SRC/layout/base/PresShell.cpp:3830
#7  0x00007f6b5282b6e4 in mozilla::PresShell::sPaintSuppressionCallback(nsITimer*, void*) (aTimer=0x7f6b41f9e100, aPresShell=0x7f6b3f91a000)
    at $SRC/layout/base/PresShell.cpp:1827
#8  0x00007f6b4d149996 in nsTimerImpl::Fire(int) (this=0x7f6b43418090, aGeneration=1) at /scratch/work/builds/mozilla-central/mozilla/xpcom/threads/nsTimerImpl.cpp:559
#9  0x00007f6b4d1494ad in nsTimerEvent::Run() (this=0x7f6b437401d0) at /scratch/work/builds/mozilla-central/mozilla/xpcom/threads/TimerThread.cpp:260

In trying to paint the new document, we're finding a non-null "mPreviousViewer" and cleaning it up before we show the new document, here:
https://searchfox.org/mozilla-central/rev/358f816f63da072145c593e9e2ac36b7250ecd25/layout/base/nsDocumentViewer.cpp#2086-2091
...and we're failing the assertion because the old document's animation controller is unpaused because its last notification was PageShow.

Assignee

Comment 8

3 months ago

Aha, so the problem here seems to be the following:

During a call to nsDocumentViewer::LoadComplete, we reenter into JS and set a new nsDocumentViewer, replacing ourselves! (i.e. our nsDocumentViewer ends up being shoved into a new nsDocumentViewer's "mPreviousViewer" variable).

Then, as we wind back up the stack (finishing out this same LoadComplete() call for the suddenly "old" nsDocumentViewer), we fire PageShow on its nsDocument, despite the fact that this nsDocument will actually never be shown. And then we never get a corresponding PageHide event, which causes the assertion failure.

Assignee

Updated

3 months ago
Attachment #9052145 - Attachment description: backtrace of when we replace our nsDocumentViewer, inside of the suddenly old nsDocumentViewer's call to LoadCOmplete → backtrace of when we replace our nsDocumentViewer, inside of the suddenly old nsDocumentViewer's call to LoadComplete
Assignee

Comment 10

3 months ago

Perhaps we need to add a new condition here (to the OnPageShow call), to skip it in this case:
https://searchfox.org/mozilla-central/rev/358f816f63da072145c593e9e2ac36b7250ecd25/layout/base/nsDocumentViewer.cpp#1126-1128

Assignee

Comment 12

3 months ago

Posted a patch, which includes the crashtest & which fixes the crashtest (based on local testing).

Note: in the patch, I'm assuming that the content viewer "stopped" state is something we can trust to indicate that the page is not in fact shown and that PageShow would be inappropriate to send. It does seem to be a useful signal in this case, at least, because the contentviewer-replacing code does call Stop() on the old viewer before replacing it:

nsresult nsDocShell::SetupNewViewer(nsIContentViewer* aNewViewer) {
[...]
  nsCOMPtr<nsIContentViewer> contentViewer = mContentViewer;
  if (contentViewer) {
    // Stop any activity that may be happening in the old document before
    // releasing it...
    contentViewer->Stop();
[...]
  mContentViewer = aNewViewer;

https://searchfox.org/mozilla-central/rev/358f816f63da072145c593e9e2ac36b7250ecd25/docshell/base/nsDocShell.cpp#8288-8292

Anyway, here's a try run to see if it causes any trouble:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1240a154aa5583132a9bb54a9e6e135c1955c257

What does the spec say about firing a pageshow event to the web page in this situation? What do other browsers do as far as that observable behavior goes?

Flags: needinfo?(dholbert)
Assignee

Comment 15

3 months ago
Posted file interop testcase 1
Assignee

Comment 16

3 months ago

(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #13)

What does the spec say about firing a pageshow event to the web page in this situation?

To be honest, I'm not entirely sure what the "the web page in this situation" even is. :) This bug is about a content viewer that has been swapped out, in favor of a newer one, during a sync XHR. I'm not sure the spec goes into much detail about these subtleties. https://html.spec.whatwg.org/multipage/indices.html#event-pageshow says that these events should be fired when "the page's entry in the session history becomes [or stops being] the current entry".

What do other browsers do as far as that observable behavior goes?

I'm also not sure what aspects of this bug (and this change) are observable -- if you have any specific pointers, I'd be much appreciative.

I did post "interop testcase 1", which lacks a sync XHR (for simplicity) and just logs pageshow/pagehide events in a scenario where we reassign document.location, and I can confirm that Chrome and Firefox Nightly both seem to log the following:

  pageshow page1
  pagehide page1
  pageshow page2

One interesting interop difference: after interop-testcase-1 finishes loading (when it's landed on page 2, the "helper file"), Chrome won't let you navigate back to page1 (it skips over it entirely if you hit nav-backarrow on toolbar), whereas Firefox does let you navigate back to page1... This seems to contradict what the spec says about these events being tied to session history. mccr8 tells me this is probably because Chrome doesn't yet have a bfcache implementation, i.e. maybe this is a semi-expected side effect of that missing feature.

I'm not sure the spec goes into much detail about these subtleties

Well, the spec in theory defines the behavior here. The fact that we spin the event loop for XHR may mean that our behavior looks nothing like the spec...

I'm also not sure what aspects of this bug (and this change) are observable

If I understood correctly, we are in a situation where the load event runs some script that unloads the page before the event finishes firing. The proposed fix is then to not fire the pageshow event, right?

I am pretty sure per spec you can't have a case where "load" fires but pageshow does not. On the other hand, per spec the "load" event can't unload the page either... though it could do something like remove the iframe the page is in from the DOM. In that situation, does the pageshow event still fire now? Would it fire with this patch?

Chrome won't let you navigate back to page1

Presumably because it did a refresh load for page2? The are various conditions under which loads get turned into replace loads... but I am surprised they do it after 1s!

Assignee

Comment 18

3 months ago
Posted file interop testcase 2
Flags: needinfo?(dholbert)
Assignee

Comment 19

3 months ago

(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #17)

If I understood correctly, we are in a situation where the load event runs some script that unloads the page before the event finishes firing. The proposed fix is then to not fire the pageshow event, right?

That was my initial proposed fix, only because I thought "mStopped" meant that nothing that happened would be observable from that point on.

But that's apparently not the case. My current patch changes behavior on "interop testcase 2" (attachment 9052416 [details]) -- it makes us stop logging "pageshow page1" there.

When I was initially debugging, I thought we were sending a redundant/mistargeted pageshow for page1, but I'm less sure about that now... I'm diving back into rr to poke around some more.

Assignee

Comment 20

3 months ago

Yeah, so the real issue here seems to be that we're firing pagehide before pageshow (because we redirect during onload, due to the sync XHR which synchronously spins the event loop).

In testcases that trigger the assertion failure, here's what happens:
(1) onload fires for the initial document (from nsDocumentViewer::LoadComplete), which triggers JS that does a redirect + then spins the event loop via sync XHR.

(2) During the event-loop-spin, we start honoring the redirect, which makes us call nsDocShell::CreateContentViewer, which calls FirePageHideNotification for the old document. This fires pagehide for the first doc. Note that we have not yet fired pageshow for that document!

(3) When we pop back up the stack to nsDocumentViewer::LoadComplete, we now fire pageshow for the already-replaced document.

So we end up with a soon-to-be-destroyed document that thinks it's visible, which causes the fatal assertion failure reported here, along with these ones afterwards if I let us get past that one:

###!!! ASSERTION: Destroying a currently-showing document: '!mIsShowing', file dom/base/Document.cpp, line 1398
###!!! ASSERTION: Expecting to be paused for pagehide before disconnect: 'mPauseState & SMILTimeContainer::PAUSE_PAGEHIDE', file dom/smil/SMILAnimationController.cpp, line 69

Attachment #9052146 - Attachment is obsolete: true

OK, right. So in some ways the real bug here is the "doing the navigation during the sync XHR" thing... But it might make sense to just work around the invariant violation for now by not doing pageshow stuff if we've been navigated away from during the load event firing...

Assignee

Comment 22

3 months ago

Yeah. Is there a good way to check that we've been navigated away from?

Possible idea: would it make sense to compare "this" against the result of docShell->GetContentViewer(...), in nsDocumentViewer::LoadComplete where we fire pageshow? (To check if the docshell has replaced us with some other content viewer)

Source link: https://searchfox.org/mozilla-central/rev/7c20ad925005fbad7b8e08813115f1ec7fa1c248/layout/base/nsDocumentViewer.cpp#1117-1128

Flags: needinfo?(bzbarsky)

Yeah. Is there a good way to check that we've been navigated away from?

Does checking nsIDocument::IsCurrentActiveDocument on our document (and maybe checking whether mDocument is null) do the right thing in this case? I'd think it should...

But yes, we could compare us to the docshell's current viewer too, probably. It might be fairly equivalent.

Flags: needinfo?(bzbarsky)
Assignee

Comment 24

3 months ago

Thanks! That seems to work.

Assignee

Comment 25

3 months ago

Here's a still-in-progress try run with that strategy (checking IsCurrentActiveDocument):
https://treeherder.mozilla.org/#/jobs?repo=try&revision=dd434358a852af511355a7908984b95be1e3d834

Attachment #9052146 - Attachment is obsolete: false
Attachment #9052146 - Attachment description: Bug 1535388: Don't fire PageShow from nsDocumentViewer::LoadComplete, if onload caused a new viewer to replace us. r?smaug → Bug 1535388: Don't fire PageShow from nsDocumentViewer::LoadComplete, if onload caused a new viewer to replace us. r?bz

Comment 26

3 months ago
Pushed by dholbert@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8f3474b08474
Don't fire PageShow from nsDocumentViewer::LoadComplete, if onload caused a new viewer to replace us. r=bzbarsky

Comment 27

3 months ago
bugherder
Status: NEW → RESOLVED
Closed: 3 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla68
Assignee: nobody → dholbert
Flags: in-testsuite? → in-testsuite+
You need to log in before you can comment on or make changes to this bug.