Closed Bug 1924461 Opened 1 year ago Closed 7 months ago

Crash in [@ mozilla::dom::ContentParent::AssertAlive]

Categories

(Core :: DOM: Content Processes, defect, P3)

Unspecified
Windows
defect

Tracking

()

RESOLVED FIXED
138 Branch
Tracking Status
firefox-esr115 --- disabled
firefox-esr128 --- disabled
firefox136 --- disabled
firefox137 --- disabled
firefox138 --- fixed

People

(Reporter: gsvelto, Assigned: jstutte)

References

Details

(Keywords: crash)

Crash Data

Attachments

(3 files, 1 obsolete file)

Crash report: https://crash-stats.mozilla.org/report/index/76f15a01-4889-4930-8162-e5e510241014

MOZ_CRASH Reason:

MOZ_DIAGNOSTIC_ASSERT(!mIsSignaledImpendingShutdown)

Top 10 frames:

0  xul.dll  AnnotateMozCrashReason(char const*)  mfbt/Assertions.h:58
0  xul.dll  mozilla::dom::ContentParent::AssertAlive()  dom/ipc/ContentParent.cpp:1818
0  xul.dll  mozilla::dom::ContentParent::GetNewOrUsedLaunchingBrowserProcess(nsTSubstring...  dom/ipc/ContentParent.cpp:999
1  xul.dll  mozilla::dom::ContentParent::GetNewOrUsedBrowserProcess(nsTSubstring<char> co...  dom/ipc/ContentParent.cpp:1029
1  xul.dll  mozilla::dom::ContentParent::CreateBrowser(mozilla::dom::TabContext const&, m...  dom/ipc/ContentParent.cpp:1409
1  xul.dll  nsFrameLoader::TryRemoteBrowserInternal()  dom/base/nsFrameLoader.cpp:2683
2  xul.dll  nsFrameLoader::TryRemoteBrowser()  dom/base/nsFrameLoader.cpp:2749
2  xul.dll  nsFrameLoader::EnsureRemoteBrowser()  dom/base/nsFrameLoader.cpp:2485
2  xul.dll  nsFrameLoader::ShowRemoteFrame(nsSubDocumentFrame*)  dom/base/nsFrameLoader.cpp:1091
3  xul.dll  nsFrameLoader::Show(nsSubDocumentFrame*)  dom/base/nsFrameLoader.cpp:957

This is only happening on nightly & beta because it's a diagnostic assertion, the volume is significant though. Could this be showing up as a different problem in release?

Severity: -- → S3
Priority: -- → P3
Assignee: nobody → jstutte
Status: NEW → ASSIGNED

Could this be showing up as a different problem in release?

I have no evidence but I would not be surprised if it manifests as some kind of hang or not working browser.

The patch is only diagnostic, I have a possible theory what might go wrong from breaking up a minidump, but I'd like to confirm it's the (only) thing.

Keywords: leave-open
Pushed by jstutte@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/e4cda21f9cdd Add diagnostic assertions to pin down which type of process selection goes wrong. r=smaug
Crash Signature: [@ mozilla::dom::ContentParent::AssertAlive] → [@ mozilla::dom::ContentParent::AssertAlive] [@ mozilla::dom::ContentParent::GetNewOrUsedLaunchingBrowserProcess ]

So all new crashes come from an existing host process attached to this BrowsingContextGroup which is already shutting down when we get here. Given the earlier general shutdown check which seems to never trigger in recent versions, I assume the host process is shutting down for other reasons and we have some race here.

Most likely we see a misalignment between mIsSignaledImpendingShutdown and mThreadsafeHandle->mShutdownStarted here. I assume we should stop adding new uses to a process as soon as mIsSignaledImpendingShutdown happened.

In most places where we SignalImpendingShutdownToContentJS we already MarkAsDead, except for the shutdown blocker for QuitApplicationGranted, where we explictly do not MarkAsDead, yet.

Note that this patch is supposed to help only if we see a very edgy race between the earlier general shutdown check and the content process' shutdown blocker being invoked. But for now I have no better bet.

Pushed by jstutte@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/c90cf1207b44 Ensure we always set mThreadsafeHandle->mShutdownStarted when we SignalImpendingShutdownToContentJS. r=smaug

So most of the crashes have

XPCOMSpinEventLoopStack: AsyncShutdown Spinner for quit-application-granted (or more but always containing the "quit-application-granted" piece.)

Now there is a misalignment between ShutdownPhase::AppShutdownConfirmed mapping to the "quit-application" notification and the AsyncShutdown exposing quitApplicationGranted which maps (not very surprisingly) to "quit-application-granted" instead.

Any blocker using quitApplicationGranted that might cause (directly or indirectly) a content process to begin its shutdown might race with our shutdown check based on ShutdownPhase::AppShutdownConfirmed.

I think the right course of action would probably be to align this better. It might actually be enough to map this barrier to "quit-application", as there seems to be no mapping for that one, anyways. There might be a slight risk of mixing those with other things that are already happening during "quit-application", changing the relative order, but I think it is worth to check this for fallout.

Depends on: 1954988

"quit-application-granted" is notified before AppShutdown enters ShutdownPhase::AppShutdownConfirmed, which confusingly maps to "quit-application".
Bug 1954988 tracks a general alignment between AsyncShutdown and ShutdownPhase, but that might influence shutdown order and be a bit more risky.
In the meantime we can just deal with this situation by keeping track of when we entered the quitApplicationGranted shutdown barrier.

No longer depends on: 1954988
See Also: → 1954988
Attachment #9473208 - Attachment description: WIP: Bug 1924461 - Have a static flag for when quitApplicationGranted has been notified to ContentParent. r?smaug → Bug 1924461 - Have a static flag for when quitApplicationGranted has been notified to ContentParent. r?smaug
Keywords: leave-open
Depends on: 1955532
Attachment #9473208 - Attachment is obsolete: true

After bug 1955532 removed the last inconsistency we can rely on IsInOrBeyond(ShutdownPhase::AppShutdownConfirmed) to be sufficient to check such that the later assertions of a single process not being in shutdown are still meaningful.

Callers always had to react on nullptr, and in release we already did so here.

Pushed by jstutte@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/97d784aff636 Rely on callers of GetNewOrUsedLaunchingBrowserProcess to handle a nullptr on shutdown. r=smaug
Status: ASSIGNED → RESOLVED
Closed: 7 months ago
Resolution: --- → FIXED
Target Milestone: --- → 138 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: