Closed Bug 1724083 Opened 4 years ago Closed 1 year ago

Crash in [@ mozilla::dom::BrowserParent::ActorDestroy]

Categories

(Core :: DOM: Content Processes, defect, P3)

Unspecified
Windows 10
defect

Tracking

()

RESOLVED FIXED
127 Branch
Tracking Status
firefox-esr78 --- wontfix
firefox-esr91 --- wontfix
firefox-esr115 --- wontfix
firefox91 --- wontfix
firefox92 --- wontfix
firefox93 --- wontfix
firefox125 --- wontfix
firefox126 --- wontfix
firefox127 --- fixed

People

(Reporter: Gijs, Assigned: nika)

References

Details

(Keywords: crash)

Crash Data

Attachments

(3 files)

Crash report: https://crash-stats.mozilla.org/report/index/47400ad0-3133-4177-bedb-92a620210804

MOZ_CRASH Reason: MOZ_DIAGNOSTIC_ASSERT(childBp)

Top 10 frames of crashing thread:

0 xul.dll mozilla::dom::BrowserParent::ActorDestroy dom/ipc/BrowserParent.cpp:678
1 xul.dll mozilla::ipc::IProtocol::DestroySubtree ipc/glue/ProtocolUtils.cpp:618
2 xul.dll mozilla::dom::PBrowserParent::OnMessageReceived ipc/ipdl/PBrowserParent.cpp:4182
3 xul.dll mozilla::dom::PContentParent::OnMessageReceived ipc/ipdl/PContentParent.cpp:6612
4 xul.dll mozilla::ipc::MessageChannel::DispatchMessage ipc/glue/MessageChannel.cpp:1978
5 xul.dll mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal xpcom/threads/TaskController.cpp:805
6 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1148
7 xul.dll mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:85
8 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:324
9 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:306

STR (from bug 1722880 comment 6):

If you create a new tab, load a PDF from file, navigate to example.org - breakpoint in didDestroy in PdfJsParent doesn't fire.
If you then close the tab, it does fire.
when you then resume it crashes, but that only happens when debugging and only with the breakpoint set - not sure if that is related.

(to be clear, "breakpoint" here means setting a breakpoint with the browser debugger)

https://crash-stats.mozilla.org/signature/?product=Firefox&signature=mozilla%3A%3Adom%3A%3ABrowserParent%3A%3AActorDestroy&date=%3E%3D2021-07-07T21%3A14%3A00.000Z&date=%3C2021-08-04T21%3A14%3A00.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_columns=install_time&_columns=startup_crash&_sort=-date&page=1

shows enough of these that I think there may be other cases where whatever this diagnostic assert is about breaks.

Note that the diagnostic assert doesn't mean we don't crash - the next line just dereferences the pointer we're checking so if it's null we just crash on the next line in release (and there are release crashes on crash-stats, so we do, in practice).

Severity: -- → S2
See Also: → 1722880

Assigning to Nika.

spinning nested event loop inside actor teardown of the actor hierarchy? Attempting to destroy BrowserParent twice after destroying JSWindowActorParent?

Nika will add some extra checks for recursive actor destruction.

Assignee: nobody → nika
Priority: -- → P2
See Also: 1722880

Looking at one of the newer crashes, it seems we fail to UnregisterRemoteFrame(mTabId) as mBrowserParentMap.Extract(aChildTabId) is giving us a nullptr.

There are a few instances of the MOZ_DIAGNOSTIC_ASSERT(childBp); that wants to guard us here, too.

I assume that while it is probably weird that we cannot find our registration anymore, it is probably not fatal enough to always crash and we can rely on the diagnostic assert to examine this further?

BTW, in this crash I see no nested SpinEventLoop whatsoever. We are apparently in a normal event loop. And the frequency is low enough to bump down severity, I think.

Severity: S2 → S3
Priority: P2 → P3

(In reply to Chris Peterson [:cpeterson] from comment #1)

Nika will add some extra checks for recursive actor destruction.

Hi Nika, are you still planning to do this?

Flags: needinfo?(nika)
Duplicate of this bug: 1869141
Flags: needinfo?(nika)

renewing my ni? so I remember to get back to this.

Flags: needinfo?(nika)

Copying crash signatures from duplicate bugs.

Crash Signature: [@ mozilla::dom::BrowserParent::ActorDestroy] → [@ mozilla::dom::BrowserParent::ActorDestroy] [@ mozilla::Maybe<T>::operator* | mozilla::dom::ContentProcessManager::UnregisterRemoteFrame] [@ mozilla::dom::ContentProcessManager::UnregisterRemoteFrame]

After the changes in the previous part, actors are now removed from their
manager's managed list before invoking ActorDestroy. There were a few places in
tree which specifically checked for the old behaviour, due to calls from the
managed actor during ActorDestroy to potentially trigger clean-up.

This patch adjusts these checks to instead expect a count of 0 as the actor
will now already have been removed at that point.

Depends on D198840

Pushed by nlayzell@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/c53881f61ea6 Part 1: Rework ActorDestroy logic to be more resilient to nested event loops, r=ipc-reviewers,mccr8 https://hg.mozilla.org/integration/autoland/rev/fef6f9c26f1c Part 2: Update Actor destruction checks for new manager list behaviour, r=ipc-reviewers,mccr8 https://hg.mozilla.org/integration/autoland/rev/70121b8066c2 Part 3: Add a test to ensure IPDL actors handle reentrant destruction, r=ipc-reviewers,mccr8
Crash Signature: [@ mozilla::dom::BrowserParent::ActorDestroy] [@ mozilla::Maybe<T>::operator* | mozilla::dom::ContentProcessManager::UnregisterRemoteFrame] [@ mozilla::dom::ContentProcessManager::UnregisterRemoteFrame] → [@ mozilla::dom::BrowserParent::ActorDestroy] [@ mozilla::Maybe<T>::operator* | mozilla::dom::ContentProcessManager::UnregisterRemoteFrame] [@ mozilla::dom::ContentProcessManager::UnregisterRemoteFrame]
Regressions: 1893087
Regressions: 1893648
Regressions: 1893150
See Also: → 1895358
Flags: needinfo?(nika)
See Also: → 1909374
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: