Closed Bug 1758199 Opened 3 years ago Closed 3 years ago

crash near null in [@ nsPresContext::NotifyDOMContentFlushed]

Categories

(Core :: Layout, defect)

Tracking

VERIFIED FIXED
100 Branch
Tracking Status
firefox-esr91 --- wontfix
firefox98 --- wontfix
firefox99 --- wontfix
firefox100 --- verified

People

(Reporter: tsmith, Assigned: dholbert)

References

(Blocks 1 open bug)

Details

(Keywords: crash, testcase, Whiteboard: [bugmon:bisected,confirmed])

Attachments

(2 files, 1 obsolete file)

Attached file testcase.html (obsolete)

Found while fuzzing m-c 20220225-c875dbd49223 (--enable-address-sanitizer --enable-fuzzing)

To reproduce via Grizzly Replay:

$ pip install fuzzfetch grizzly-framework
$ python -m fuzzfetch -a --fuzzing -n firefox
$ python -m grizzly.replay ./firefox/firefox testcase.html
#0 0x7f1856d2d8ef in nsPresContext::NotifyDOMContentFlushed() /gecko/layout/base/nsPresContext.cpp:2598:3
#1 0x7f1851a2d400 in mozilla::dom::Document::UnblockDOMContentLoaded() /gecko/dom/base/Document.cpp:8101:36
#2 0x7f1851a2ce5f in mozilla::dom::Document::EndLoad() /gecko/dom/base/Document.cpp:8065:3
#3 0x7f1850a703e2 in nsHtml5TreeOpExecutor::DidBuildModel(bool) /gecko/parser/html/nsHtml5TreeOpExecutor.cpp:210:16
#4 0x7f18509f5c97 in nsHtml5Parser::Terminate() /gecko/parser/html/nsHtml5Parser.cpp:470:20
#5 0x7f1856c9c927 in nsDocumentViewer::Stop() /gecko/layout/base/nsDocumentViewer.cpp:1795:16
#6 0x7f185a6941d2 in nsDocShell::Stop(unsigned int) /gecko/docshell/base/nsDocShell.cpp:4289:11
#7 0x7f185a6e26df in non-virtual thunk to nsDocShell::Stop(unsigned int) /gecko/docshell/base/nsDocShell.cpp
#8 0x7f185a6944cb in nsDocShell::Stop(unsigned int) /gecko/docshell/base/nsDocShell.cpp:4324:19
#9 0x7f185a6b8a5c in nsDocShell::Destroy() /gecko/docshell/base/nsDocShell.cpp:4564:3
#10 0x7f185ad9703d in nsWebBrowser::SetDocShell(nsDocShell*) /gecko/toolkit/components/browser/nsWebBrowser.cpp:1123:18
#11 0x7f185ad964ac in nsWebBrowser::InternalDestroy() /gecko/toolkit/components/browser/nsWebBrowser.cpp:176:3
#12 0x7f185ad9b8fc in Destroy /gecko/toolkit/components/browser/nsWebBrowser.cpp:856:3
#13 0x7f185ad9b8fc in non-virtual thunk to nsWebBrowser::Destroy() /gecko/toolkit/components/browser/nsWebBrowser.cpp
#14 0x7f1855cbb480 in mozilla::dom::BrowserChild::DestroyWindow() /gecko/dom/ipc/BrowserChild.cpp:880:31
#15 0x7f1855cd3c8c in mozilla::dom::BrowserChild::RecvDestroy() /gecko/dom/ipc/BrowserChild.cpp:2601:3
#16 0x7f18503e42b8 in mozilla::dom::PBrowserChild::OnMessageReceived(IPC::Message const&) /builds/worker/workspace/obj-build/ipc/ipdl/PBrowserChild.cpp:6637:56
#17 0x7f184f90c18e in mozilla::dom::PContentChild::OnMessageReceived(IPC::Message const&) /builds/worker/workspace/obj-build/ipc/ipdl/PContentChild.cpp:8274:32
#18 0x7f184f67df19 in mozilla::ipc::MessageChannel::DispatchAsyncMessage(mozilla::ipc::ActorLifecycleProxy*, IPC::Message const&) /gecko/ipc/glue/MessageChannel.cpp:1665:25
#19 0x7f184f67bb19 in mozilla::ipc::MessageChannel::DispatchMessage(IPC::Message&&) /gecko/ipc/glue/MessageChannel.cpp:1590:9
#20 0x7f184f67d057 in mozilla::ipc::MessageChannel::MessageTask::Run() /gecko/ipc/glue/MessageChannel.cpp:1486:14
#21 0x7f184e16b812 in mozilla::RunnableTask::Run() /gecko/xpcom/threads/TaskController.cpp:467:16
#22 0x7f184e12fd3d in mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) /gecko/xpcom/threads/TaskController.cpp:770:26
#23 0x7f184e12d298 in mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) /gecko/xpcom/threads/TaskController.cpp:606:15
#24 0x7f184e12d9a9 in mozilla::TaskController::ProcessPendingMTTask(bool) /gecko/xpcom/threads/TaskController.cpp:390:36
#25 0x7f184e1741b4 in operator() /gecko/xpcom/threads/TaskController.cpp:127:37
#26 0x7f184e1741b4 in mozilla::detail::RunnableFunction<mozilla::TaskController::InitializeInternal()::$_1>::Run() /gecko/xpcom/threads/nsThreadUtils.h:531:5
#27 0x7f184e150ad7 in nsThread::ProcessNextEvent(bool, bool*) /gecko/xpcom/threads/nsThread.cpp:1173:16
#28 0x7f184e15c04c in NS_ProcessNextEvent(nsIThread*, bool) /gecko/xpcom/threads/nsThreadUtils.cpp:467:10
#29 0x7f1851844d46 in bool mozilla::SpinEventLoopUntil<(mozilla::ProcessFailureBehavior)1, nsGlobalWindowOuter::Print(nsIPrintSettings*, nsIWebProgressListener*, nsIDocShell*, nsGlobalWindowOuter::IsPreview, nsGlobalWindowOuter::IsForWindowDotPrint, std::function<void (mozilla::dom::PrintPreviewResultInfo const&)>&&, mozilla::ErrorResult&)::$_3>(nsTSubstring<char> const&, nsGlobalWindowOuter::Print(nsIPrintSettings*, nsIWebProgressListener*, nsIDocShell*, nsGlobalWindowOuter::IsPreview, nsGlobalWindowOuter::IsForWindowDotPrint, std::function<void (mozilla::dom::PrintPreviewResultInfo const&)>&&, mozilla::ErrorResult&)::$_3&&, nsIThread*) /builds/worker/workspace/obj-build/dist/include/mozilla/SpinEventLoopUntil.h:176:25
#30 0x7f18518406df in nsGlobalWindowOuter::Print(nsIPrintSettings*, nsIWebProgressListener*, nsIDocShell*, nsGlobalWindowOuter::IsPreview, nsGlobalWindowOuter::IsForWindowDotPrint, std::function<void (mozilla::dom::PrintPreviewResultInfo const&)>&&, mozilla::ErrorResult&) /gecko/dom/base/nsGlobalWindowOuter.cpp:5351:5
#31 0x7f185183e57c in nsGlobalWindowOuter::PrintOuter(mozilla::ErrorResult&) /gecko/dom/base/nsGlobalWindowOuter.cpp:5150:3
#32 0x7f1856c97995 in nsDocumentViewer::LoadComplete(nsresult) /gecko/layout/base/nsDocumentViewer.cpp:1171:43
#33 0x7f185a6f0e43 in nsDocShell::EndPageLoad(nsIWebProgress*, nsIChannel*, nsresult) /gecko/docshell/base/nsDocShell.cpp:6416:20
#34 0x7f185a6f013b in nsDocShell::OnStateChange(nsIWebProgress*, nsIRequest*, unsigned int, nsresult) /gecko/docshell/base/nsDocShell.cpp:5805:7
#35 0x7f185a6f210f in non-virtual thunk to nsDocShell::OnStateChange(nsIWebProgress*, nsIRequest*, unsigned int, nsresult) /gecko/docshell/base/nsDocShell.cpp
#36 0x7f18507e8eb0 in nsDocLoader::DoFireOnStateChange(nsIWebProgress*, nsIRequest*, int&, nsresult) /gecko/uriloader/base/nsDocLoader.cpp:1377:3
#37 0x7f18507e7ac4 in nsDocLoader::doStopDocumentLoad(nsIRequest*, nsresult) /gecko/uriloader/base/nsDocLoader.cpp:975:14
#38 0x7f18507e42f2 in nsDocLoader::DocLoaderIsEmpty(bool, mozilla::Maybe<nsresult> const&) /gecko/uriloader/base/nsDocLoader.cpp:794:9
#39 0x7f18507e7e0a in ChildDoneWithOnload /builds/worker/workspace/obj-build/dist/include/nsDocLoader.h:228:5
#40 0x7f18507e7e0a in nsDocLoader::NotifyDoneWithOnload(nsDocLoader*) /gecko/uriloader/base/nsDocLoader.cpp:869:14
#41 0x7f18507e42fd in nsDocLoader::DocLoaderIsEmpty(bool, mozilla::Maybe<nsresult> const&) /gecko/uriloader/base/nsDocLoader.cpp:796:9
#42 0x7f18507e64b5 in nsDocLoader::OnStopRequest(nsIRequest*, nsresult) /gecko/uriloader/base/nsDocLoader.cpp:677:5
#43 0x7f185a72b63b in nsDocShell::OnStopRequest(nsIRequest*, nsresult) /gecko/docshell/base/nsDocShell.cpp:13792:23
#44 0x7f184e4bb84e in mozilla::net::nsLoadGroup::NotifyRemovalObservers(nsIRequest*, nsresult) /gecko/netwerk/base/nsLoadGroup.cpp:614:22
#45 0x7f184e4be293 in mozilla::net::nsLoadGroup::RemoveRequest(nsIRequest*, nsISupports*, nsresult) /gecko/netwerk/base/nsLoadGroup.cpp:518:10
#46 0x7f184f309391 in operator() /gecko/netwerk/ipc/DocumentChannel.cpp:118:22
#47 0x7f184f309391 in mozilla::detail::RunnableFunction<mozilla::net::DocumentChannel::ShutdownListeners(nsresult)::$_0>::Run() /builds/worker/workspace/obj-build/dist/include/nsThreadUtils.h:531:5
#48 0x7f184e16b812 in mozilla::RunnableTask::Run() /gecko/xpcom/threads/TaskController.cpp:467:16
#49 0x7f184e12fd3d in mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) /gecko/xpcom/threads/TaskController.cpp:770:26
#50 0x7f184e12d298 in mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) /gecko/xpcom/threads/TaskController.cpp:606:15
#51 0x7f184e12d9a9 in mozilla::TaskController::ProcessPendingMTTask(bool) /gecko/xpcom/threads/TaskController.cpp:390:36
#52 0x7f184e174181 in operator() /gecko/xpcom/threads/TaskController.cpp:124:37
#53 0x7f184e174181 in mozilla::detail::RunnableFunction<mozilla::TaskController::InitializeInternal()::$_0>::Run() /gecko/xpcom/threads/nsThreadUtils.h:531:5
#54 0x7f184e150ad7 in nsThread::ProcessNextEvent(bool, bool*) /gecko/xpcom/threads/nsThread.cpp:1173:16
#55 0x7f184e15c04c in NS_ProcessNextEvent(nsIThread*, bool) /gecko/xpcom/threads/nsThreadUtils.cpp:467:10
#56 0x7f184f68517f in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) /gecko/ipc/glue/MessagePump.cpp:85:21
#57 0x7f184f4fd7b1 in RunInternal /gecko/ipc/chromium/src/base/message_loop.cc:331:10
#58 0x7f184f4fd7b1 in RunHandler /gecko/ipc/chromium/src/base/message_loop.cc:324:3
#59 0x7f184f4fd7b1 in MessageLoop::Run() /gecko/ipc/chromium/src/base/message_loop.cc:306:3
#60 0x7f1856630507 in nsBaseAppShell::Run() /gecko/widget/nsBaseAppShell.cpp:137:27
#61 0x7f185b387c7f in XRE_RunAppShell() /gecko/toolkit/xre/nsEmbedFunctions.cpp:878:20
#62 0x7f184f4fd7b1 in RunInternal /gecko/ipc/chromium/src/base/message_loop.cc:331:10
#63 0x7f184f4fd7b1 in RunHandler /gecko/ipc/chromium/src/base/message_loop.cc:324:3
#64 0x7f184f4fd7b1 in MessageLoop::Run() /gecko/ipc/chromium/src/base/message_loop.cc:306:3
#65 0x7f185b386eb3 in XRE_InitChildProcess(int, char**, XREChildData const*) /gecko/toolkit/xre/nsEmbedFunctions.cpp:715:34
#66 0x55c884d307ed in content_process_main(mozilla::Bootstrap*, int, char**) /gecko/browser/app/../../ipc/contentproc/plugin-container.cpp:57:28
#67 0x55c884d30c20 in main /gecko/browser/app/nsBrowserApp.cpp:327:18
#68 0x7f1872bae0b2 in __libc_start_main /build/glibc-eX1tMB/glibc-2.31/csu/../csu/libc-start.c:308:16
#69 0x55c884c7f8d9 in _start (/home/worker/builds/m-c-20220225104705-fuzzing-asan-opt/firefox+0x5d8d9)
Flags: in-testsuite?
Crash Signature: [@ nsPresContext::NotifyDOMContentFlushed]

Bugmon Analysis
Verified bug as reproducible on mozilla-central 20220304214025-967ae1edad41.
The bug appears to have been introduced in the following build range:

Start: 256b8ee1c2cf7a1f08b903738344a5aacf97bde3 (20220102213625)
End: 1cb2015e6fbc11f3a03137692fe60b111b94693a (20220103062926)
Pushlog: https://hg.mozilla.org/mozilla-unified/pushloghtml?fromchange=256b8ee1c2cf7a1f08b903738344a5aacf97bde3&tochange=1cb2015e6fbc11f3a03137692fe60b111b94693a

Keywords: regression
Whiteboard: [bugmon:bisected,confirmed]

Tyson, do you see this crashing in regular builds (not fuzzing builds), out of curiosity? (And how reliable does it seem to be?)

I didn't get it to crash locally (in a regular nightly), but I do notice that the page throbber keeps spinning forever after I close the print dialog (if I don't have print.always_print_silent set).

Also, would you mind posting a pernosco trace?

Flags: needinfo?(twsmith)
Attached file testcase.html

Updated test case to include window.location.reload(true) to help reproduce the crash automatically.

Attachment #9266612 - Attachment is obsolete: true

(In reply to Daniel Holbert [:dholbert] from comment #2)

> Tyson, do you see this crashing in regular builds (not fuzzing builds), out of curiosity? (And how reliable does it seem to be?)

Yes, it is fairly reliable: 1 in 4, maybe? Much better with the updated test case.

> I didn't get it to crash locally (in a regular nightly), but I do notice that the page throbber keeps spinning forever after I close the print dialog (if I don't have print.always_print_silent set).

The test case seems to require print.always_print_silent=true to trigger the crash; otherwise it seems to trigger a hang.

> Also, would you mind posting a pernosco trace?

I am unable to trigger the crash in a no-opt build so getting a useful Pernosco session is likely out of the question, sorry.

Flags: needinfo?(twsmith)

Thanks for the updated testcase! I was able to repro with that.

> I am unable to trigger the crash in a no-opt build so getting a useful Pernosco session is likely out of the question, sorry.

FWIW, pernosco is still useful with optimized-but-still-debuggable builds, e.g. ac_add_options --enable-debug --enable-optimize="-O1". That's worth trying in these cases. (Some variables are still optimized out which can be a little annoying, but often you can infer their value from going up or down the stack or forward/backward in time. And often --enable-optimize="-O1" is enough to make us "win" race conditions in the same way that we do in fully-optimized builds.)

I tried capturing an rr trace locally, and I was able to do so, with a debug-and-optimized build:
https://pernos.co/debug/7Z0WKneMa1U4ney7Vnrl3w/index.html

...and also in an --enable-debug --disable-optimize build -- pernosco's still processing that trace, and I'll post the link when I get it.

Here's the --enable-debug --disable-optimize trace:
https://pernos.co/debug/olCzvsbj_sCBH-bl760c-g/index.html#f{m[DrOD,AA_,t[rQ,Nok_,f{e[DrOA,Cqo_,s{af8ltdsAA,bDHc,uFpRYgA,oFpYFUw___/

At the point where we crash in nsPresContext::NotifyDOMContentFlushed, the this pointer is null.

There's one missing backtrace entry for nsRefreshDriver::NotifyDOMContentLoaded in comment 0's backtrace (possibly due to inlining or some other optimization).

When we crash, we actually have this backtrace:

#0  nsPresContext::NotifyDOMContentFlushed (this=0x0) at layout/base/nsPresContext.cpp:2598
#1  0x00007fc9801c4ea2 in nsRefreshDriver::NotifyDOMContentLoaded (this=0x7fc9685e5400) at layout/base/nsRefreshDriver.cpp:1523
#2  0x00007fc97b8b359c in mozilla::dom::Document::UnblockDOMContentLoaded (this=0x7fc960ce9100) at dom/base/Document.cpp:8104
#3  0x00007fc97b8b33b5 in mozilla::dom::Document::EndLoad (this=0x7fc960ce9100) at dom/base/Document.cpp:8068
#4  0x00007fc97abb0021 in nsHtml5TreeOpExecutor::DidBuildModel (this=0x7fc95ded8800, aTerminated=true) at parser/html/nsHtml5TreeOpExecutor.cpp:210
#5  0x00007fc97ab67e80 in nsHtml5Parser::Terminate (this=0x7fc95c01c500) at parser/html/nsHtml5Parser.cpp:470
#6  0x00007fc97b89fc97 in mozilla::dom::Document::StopDocumentLoad (this=0x7fc960ce9100) at dom/base/Document.cpp:3899

(notice the "this=0x0" in backtrace level 0 there -- that's why we crash)

We've got a null nsPresContext because the nsRefreshDriver (the one from stack level 1) has its mPresContext member set to null -- that happened a bit earlier in nsRefreshDriver::Disconnect() which explicitly nulls out that member here:
https://searchfox.org/mozilla-central/rev/b0779bcc485dc1c04334dfb9ea024cbfff7b961a/layout/base/nsRefreshDriver.cpp#3143

We reach that point while we're deep inside of nsDocumentViewer::OnDonePrinting, which destroys the nsPrintData and nsPrintObject and a bunch of presentation-related stuff. Here's a pernosco link to that moment where we disconnect the refresh driver:
https://pernos.co/debug/olCzvsbj_sCBH-bl760c-g/index.html#f{m[DrFI,MJgt_,t[rQ,Nok_,f{e[DrFI,LglU_,s{af8ltdsAA,bDHc,uD9DOzA,oD9iNqA___/

I suspect we just need a null-check for GetPresContext() in nsRefreshDriver::NotifyDOMContentLoaded. It's very suspicious/footgunny to just be directly dereferencing the return-value of GetPresContext() there, given the "Get" in its name and given that we do have null-checks in other callers of the same API in nsRefreshDriver.
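
A minimal, self-contained sketch of the kind of null-check suggested above. The PresContext and RefreshDriver types, the mFlushedCount counter, the bool return value, and DemoSequence() are hypothetical stand-ins for illustration only; this is not the real Gecko code or the patch that landed.

```cpp
#include <cassert>

// Hypothetical stand-in for nsPresContext; counts notifications it receives.
struct PresContext {
  int mFlushedCount = 0;
  void NotifyDOMContentFlushed() { ++mFlushedCount; }
};

// Hypothetical stand-in for nsRefreshDriver.
class RefreshDriver {
 public:
  explicit RefreshDriver(PresContext* aPresContext)
      : mPresContext(aPresContext) {}

  // Like the Disconnect() described above: drops the pres context pointer.
  void Disconnect() { mPresContext = nullptr; }

  // May return null once Disconnect() has run.
  PresContext* GetPresContext() const { return mPresContext; }

  // Returns true if a pres context was notified, false if there was none.
  bool NotifyDOMContentLoaded() {
    if (PresContext* pc = GetPresContext()) {
      pc->NotifyDOMContentFlushed();
      return true;
    }
    // No pres context (e.g. teardown in progress): skip the notification
    // instead of calling a method through a null pointer.
    return false;
  }

 private:
  PresContext* mPresContext;
};

// Exercises the sequence from this bug: notify, disconnect, notify again.
// Returns how many notifications the pres context received.
int DemoSequence() {
  PresContext pc;
  RefreshDriver driver(&pc);
  bool first = driver.NotifyDOMContentLoaded();   // delivered
  driver.Disconnect();
  bool second = driver.NotifyDOMContentLoaded();  // skipped, no crash
  assert(first && !second);
  return pc.mFlushedCount;
}
```

In this sketch the second call simply becomes a no-op after Disconnect(), which matches the reasoning that a disconnected refresh driver has nothing left to notify.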

Assignee: nobody → dholbert
Status: NEW → ASSIGNED

Tyson, FWIW the original testcase here (the one I'm including in the patch) makes one of our linters angry due to Windows line-return characters (I think our linter is expecting Unix line-return characters):

TEST-UNEXPECTED-ERROR | /builds/worker/checkouts/gecko/layout/printing/crashtests/1758199-1.html:0 | Windows line return (file-whitespace)

https://treeherder.mozilla.org/logviewer?job_id=371035854&repo=try&lineNumber=117

If we've got a fuzzer that's generating testcases with Windows line returns, that would probably be worth fixing to avoid tripping over this linter issue.
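
As a rough illustration of the kind of fix-up a testcase generator could apply before writing a file, here is a small sketch that converts CRLF pairs to LF. NormalizeLineEndings is a hypothetical helper name; it is not part of Grizzly, the linter, or any existing Mozilla tool.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns a copy of aInput with every "\r\n" pair
// replaced by "\n". A lone '\r' not followed by '\n' is left untouched.
std::string NormalizeLineEndings(const std::string& aInput) {
  std::string out;
  out.reserve(aInput.size());
  for (std::size_t i = 0; i < aInput.size(); ++i) {
    // Drop a carriage return only when a newline immediately follows it.
    if (aInput[i] == '\r' && i + 1 < aInput.size() && aInput[i + 1] == '\n') {
      continue;
    }
    out.push_back(aInput[i]);
  }
  return out;
}
```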

Pushed by dholbert@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/bafe0d9393ed Add null-check for GetPresContext() in nsRefreshDriver::NotifyDOMContentLoaded. r=emilio
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 100 Branch

:dholbert, since this bug contains a bisection range, could you fill (if possible) the regressed_by field?
For more information, please visit auto_nag documentation.

Flags: needinfo?(dholbert)

Bugmon Analysis
Verified bug as fixed on rev mozilla-central 20220315091352-d36030f3adc3.
Removing bugmon keyword as no further action possible. Please review the bug and re-add the keyword for further analysis.

Status: RESOLVED → VERIFIED
Keywords: bugmon (removed)

(In reply to Release mgmt bot [:marco/ :calixte] from comment #13)

> :dholbert, since this bug contains a bisection range, could you fill (if possible) the regressed_by field?

I'll drop the regression keyword -- looking at the regression range, bug 1741698 is what would have been responsible for the fuzzers starting to catch this -- but this wasn't a "regression" from bug 1741698. Rather, it's just that printing was failing entirely in fuzzing configurations for some while before that patch landed, so they wouldn't get far enough to trigger the issue.

Flags: needinfo?(dholbert)
Keywords: regression (removed)

Ticking 'in-testsuite+' since the commit included the original testcase as a crashtest.

Flags: in-testsuite? → in-testsuite+