Crash in [@ mozilla::dom::ContentParent::AssertAlive]
Categories
(Core :: DOM: Content Processes, defect, P2)
Tracking
()
People
(Reporter: gsvelto, Assigned: jstutte)
References
Details
(Keywords: crash)
Crash Data
Attachments
(1 file)
Crash report: https://crash-stats.mozilla.org/report/index/2227b679-4683-40ee-95b8-7cdb70230121
MOZ_CRASH Reason: MOZ_DIAGNOSTIC_ASSERT(!mIsSignaledImpendingShutdown)
Top 10 frames of crashing thread:
0 xul.dll mozilla::dom::ContentParent::AssertAlive dom/ipc/ContentParent.cpp:1961
1 xul.dll mozilla::dom::ContentParent::GetNewOrUsedLaunchingBrowserProcess dom/ipc/ContentParent.cpp:1063
2 xul.dll mozilla::dom::CanonicalBrowsingContext::ChangeRemoteness docshell/base/CanonicalBrowsingContext.cpp:2072
3 xul.dll mozilla::net::DocumentLoadListener::TriggerProcessSwitch netwerk/ipc/DocumentLoadListener.cpp:1979
4 xul.dll mozilla::net::DocumentLoadListener::MaybeTriggerProcessSwitch netwerk/ipc/DocumentLoadListener.cpp:1907
4 xul.dll mozilla::net::DocumentLoadListener::OnStartRequest netwerk/ipc/DocumentLoadListener.cpp:2503
5 xul.dll mozilla::net::ParentChannelListener::OnStartRequest netwerk/protocol/http/ParentChannelListener.cpp:86
6 xul.dll nsDocumentOpenInfo::OnStartRequest uriloader/base/nsURILoader.cpp:167
7 xul.dll mozilla::net::ParentProcessDocumentOpenInfo::OnDocumentStartRequest netwerk/ipc/DocumentLoadListener.cpp:297
7 xul.dll mozilla::net::ParentProcessDocumentOpenInfo::OnStartRequest netwerk/ipc/DocumentLoadListener.cpp:345
The earliest affected buildid is 20230120212103. Only one crash report has a comment that says:
restart after NIGHTLY update
Assignee | ||
Updated•2 years ago
|
Assignee | ||
Comment 1•2 years ago
•
|
||
These crashes come through DocumentLoadListener::OnStartRequest
and will hopefully be addressed by bug 1811195. At least I am pretty sure that there should be no other edge case left than parent shutdown here. But I'll not dupe it until we'll see it go away.
There remains also the question, where are good places to put all the IsInOrBeyond
checks. We currently have them at:
ContentParent::BeginSubprocessLaunch
but only in or beyondAppShutdown
ContentParent::CreateBrowser
but only in or beyondAppShutdown
and bug 1811195 wants to add them at
DocumentLoadListener::OnStartRequest
which will probably help hereAsyncFrameInit::Run
which can always be in flight when we want to start shutdown
which would make the first two kind of obsolete. But I overlooked that ContentParent::GetNewOrUsedLaunchingBrowserProcess
has a shortcut that hands out a used process, too. I think that we should:
- have a check/assert/diagnostic assert and bail out at the beginning of
ContentParent::GetNewOrUsedLaunchingBrowserProcess
- have regular bail outs (without assert) at higher level places (like bug 1811195 proposed)
- have only (diagnostic) asserts inside the internal launching functions (via
AssertAlive
)
Edit: Actually looking at call paths to BeginSubprocessLaunch
we should probably have regular checks also at:
SharedWorkerManager::MaybeCreateRemoteWorker
andBackgroundParentImpl::AllocPRemoteWorkerControllerParent
ContentParent::RecvCloneDocumentTreeInto
nsFrameLoader::EnsureRemoteBrowser
and/or some of its callersthe
PreallocatedProcessManager`should stop to serve processes, too, which has been already addressed by bug 1795964.
I assume that XRE_SendTestShellCommand
and nsIXULRuntime::EnsureContentProcess
ahould not check anything but just let the launching code assert at lower levels in case.
Edit 2: I moved this aspect to bug 1811195 comment 5 for further analysis.
Comment 2•2 years ago
|
||
The bug is linked to a topcrash signature, which matches the following criterion:
- Top 10 desktop browser crashes on nightly
:janv, could you consider increasing the severity of this top-crash bug?
For more information, please visit auto_nag documentation.
Assignee | ||
Updated•2 years ago
|
Assignee | ||
Comment 3•2 years ago
•
|
||
It seems we can find our process still subscribed to the given BrowsingContextGroup
despite having started our content process' shutdown. IIUC, we always will ContentParent::MarkAsDead
and thus ContentParent::RemoveFromList
where we unsubscribe the process from all mGroups
at the same time when we have done SignalImpendingShutdownToContentJS
, so this is not really expected, unless there is some misalignment in our book-keeping for BrowsingContextGroup
s between what the ContentParent
knows about it and who else might reference them.
I assume we could just check for !contentParent->IsShuttingDown()
here to paint over this here but it feels wrong that those BrowsingContextGroup
still refer to an apparently dying process?
Comment 4•2 years ago
|
||
Based on the topcrash criteria, the crash signature linked to this bug is not a topcrash signature anymore.
For more information, please visit auto_nag documentation.
Assignee | ||
Comment 5•2 years ago
|
||
So the volume went down since bug 1814603 landed and the remaining crashes happen in a different place than comment 3 assumed:
0 XUL mozilla::dom::ContentParent::AssertAlive() dom/ipc/ContentParent.cpp:1993 context
1 XUL mozilla::dom::ContentParent::AddKeepAlive() dom/ipc/ContentParent.cpp:2434 cfi
2 XUL mozilla::dom::CanonicalBrowsingContext::ChangeRemoteness(mozilla::dom::NavigationIsolationOptions const&, unsigned long long) docshell/base/CanonicalBrowsingContext.cpp:2063 cfi
3 XUL mozilla::net::DocumentLoadListener::TriggerProcessSwitch(mozilla::dom::CanonicalBrowsingContext*, mozilla::dom::NavigationIsolationOptions const&, bool) netwerk/ipc/DocumentLoadListener.cpp:1994 cfi
4 XUL mozilla::net::DocumentLoadListener::MaybeTriggerProcessSwitch(bool*) netwerk/ipc/DocumentLoadListener.cpp:1922 inlined
4 XUL mozilla::net::DocumentLoadListener::OnStartRequest(nsIRequest*) netwerk/ipc/DocumentLoadListener.cpp:2525 cfi
5 XUL mozilla::net::ParentChannelListener::OnStartRequest(nsIRequest*) netwerk/protocol/http/ParentChannelListener.cpp:86 cfi
6 XUL nsDocumentOpenInfo::OnStartRequest(nsIRequest*) uriloader/base/nsURILoader.cpp:167 cfi
I think we should just do the same IsShuttingDown
check as here also for this content parent ?
Assignee | ||
Comment 6•2 years ago
|
||
Updated•2 years ago
|
Comment 8•2 years ago
|
||
bugherder |
Updated•2 years ago
|
Assignee | ||
Comment 10•3 months ago
|
||
It seems the recent (small) spike started with bug 1728331 landing.
Description
•