Closed Bug 1566196 Opened 8 months ago Closed 6 months ago

Enable subframe crashing regression test

Categories

(Core :: DOM: Content Processes, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla71
Fission Milestone M4
Tracking Status
firefox71 --- fixed

People

(Reporter: mconley, Assigned: mconley)

References

Details

Attachments

(2 files)

Attempting to crash a subframe just brings the whole parent down. We desperately need bug 1559244 fixed so we can get some coverage and this stops happening. :/

I'll collect a stack trace and see what I can find.

I'm failing an assertion. Here's the output from lldb from a local opt build:

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x00000001077638da XUL`mozilla::dom::ContentParent::RecvCacheBrowsingContextChildren(this=0x0000000124f5a000, aContext=0x000000012a5c7570) at ContentParent.cpp:5852:5 [opt]
   5849	    // process. This is illegal since the owner of the BrowsingContext
   5850	    // is the proccess with the in-process docshell, which is tracked
   5851	    // by OwnerProcessId.
-> 5852	    MOZ_DIAGNOSTIC_ASSERT(false, "Trying to cache out of process context");
   5853
   5854	    MOZ_LOG(BrowsingContext::GetLog(), LogLevel::Warning,
   5855	            ("ParentIPC: Trying to cache out of process context 0x%08" PRIx64,
(lldb) bt

  * frame #0: 0x00000001077638da XUL`mozilla::dom::ContentParent::RecvCacheBrowsingContextChildren(this=0x0000000124f5a000, aContext=0x000000012a5c7570) at ContentParent.cpp:5852:5 [opt]
    frame #1: 0x000000010587dbde XUL`mozilla::dom::PContentParent::OnMessageReceived(this=0x0000000124f5a000, msg__=0x00000001166428c0) at PContentParent.cpp:10604:57 [opt]
    frame #2: 0x00000001057d0347 XUL`mozilla::ipc::MessageChannel::DispatchAsyncMessage(this=0x0000000124f5a120, aProxy=0x00000001166be120, aMsg=0x00000001166428c0) at MessageChannel.cpp:2158:25 [opt]
    frame #3: 0x00000001057cf1e1 XUL`mozilla::ipc::MessageChannel::DispatchMessage(this=0x0000000124f5a120, aMsg=0x00000001166428c0) at MessageChannel.cpp:2082:9 [opt]
    frame #4: 0x00000001057cfbfe XUL`mozilla::ipc::MessageChannel::MessageTask::Run(this=0x0000000116642870) at MessageChannel.cpp:1970:13 [opt]
    frame #5: 0x000000010522fc93 XUL`nsThread::ProcessNextEvent(this=0x0000000101fe13c0, aMayWait=<unavailable>, aResult=<unavailable>) at nsThread.cpp:1225:14 [opt]
    frame #6: 0x000000010522dde2 XUL`NS_ProcessPendingEvents(aThread=0x0000000101fe13c0, aTimeout=10) at nsThreadUtils.cpp:434:19 [opt]
    frame #7: 0x0000000107a1ec5f XUL`nsBaseAppShell::NativeEventCallback(this=0x0000000101fd1700) at nsBaseAppShell.cpp:87:3 [opt]
    frame #8: 0x0000000107a88718 XUL`nsAppShell::ProcessGeckoEvents(aInfo=0x0000000101fd1700) at nsAppShell.mm:440:11 [opt]
    frame #9: 0x00007fff5185e161 CoreFoundation`__CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 17
    frame #10: 0x00007fff5191872c CoreFoundation`__CFRunLoopDoSource0 + 108
    frame #11: 0x00007fff51840bc0 CoreFoundation`__CFRunLoopDoSources0 + 208
    frame #12: 0x00007fff5184003d CoreFoundation`__CFRunLoopRun + 1293
    frame #13: 0x00007fff5183f8a3 CoreFoundation`CFRunLoopRunSpecific + 483
    frame #14: 0x00007fff50b29d96 HIToolbox`RunCurrentEventLoopInMode + 286
    frame #15: 0x00007fff50b29b06 HIToolbox`ReceiveNextEventCommon + 613
    frame #16: 0x00007fff50b29884 HIToolbox`_BlockUntilNextEventMatchingListInModeWithFilter + 64
    frame #17: 0x00007fff4edd9a73 AppKit`_DPSNextEvent + 2085
    frame #18: 0x00007fff4f56fe34 AppKit`-[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 3044
    frame #19: 0x0000000107a87c3b XUL`::-[GeckoNSApplication nextEventMatchingMask:untilDate:inMode:dequeue:](self=0x0000000101fd18b0, _cmd=<unavailable>, mask=18446744073709551615, expiration=0x422d63c37f00000d, mode=0x00007fffab952128, flag=YES) at nsAppShell.mm:169:24 [opt]
    frame #20: 0x00007fff4edce885 AppKit`-[NSApplication run] + 764
    frame #21: 0x0000000107a88e29 XUL`nsAppShell::Run(this=0x0000000101fd1700) at nsAppShell.mm:703:5 [opt]
    frame #22: 0x0000000108d1b44e XUL`nsAppStartup::Run(this=0x0000000101fdd1f0) at nsAppStartup.cpp:276:30 [opt]
    frame #23: 0x0000000108e45e90 XUL`XREMain::XRE_mainRun(this=0x00007ffeedf4fe20) at nsAppRunner.cpp:4636:22 [opt]
    frame #24: 0x0000000108e46916 XUL`XREMain::XRE_main(this=0x00007ffeedf4fe20, argc=<unavailable>, argv=<unavailable>, aConfig=0x00007ffeedf4ff90) at nsAppRunner.cpp:4771:8 [opt]
    frame #25: 0x0000000108e46e7b XUL`XRE_main(argc=<unavailable>, argv=<unavailable>, aConfig=<unavailable>) at nsAppRunner.cpp:4852:21 [opt]
    frame #26: 0x0000000101cb003b firefox`main [inlined] do_main(argc=<unavailable>, argv=<unavailable>, envp=0x00007ffeedf50418) at nsBrowserApp.cpp:213:22 [opt]
    frame #27: 0x0000000101cafe35 firefox`main(argc=<unavailable>, argv=<unavailable>, envp=0x00007ffeedf50418) at nsBrowserApp.cpp:295 [opt]
    frame #28: 0x00007fff7976c015 libdyld.dylib`start + 1
    frame #29: 0x00007fff7976c015 libdyld.dylib`start + 1
Assignee: nobody → mconley

Hey nika,

I apologize - you walked me through what we needed to do here a week or so back, and it's completely slipped my mind. These are the fragments I remember:

  1. We probably don't need the SubframeCrashChild actor anymore - the parent process can do the work of flipping the remoteness of the crashed frame and sending it to about:framecrashed.
  2. We're going to need to update the method for flipping remoteness to allow passing in a document loading error code.

I just can't remember which mechanism I'm supposed to be using to swap out the crashed subframe with about:framecrashed. Can you remind me again?

Flags: needinfo?(nika)

The method I was thinking about was WindowGlobalParent::ChangeFrameRemoteness. Also pointing you at bug 1563619, which might be necessary to fix some issues with subframes of crashed processes.

It might be good to coordinate with :farre on figuring this out.

Flags: needinfo?(nika) → needinfo?(mconley)
Fission Milestone: --- → M4
Flags: needinfo?(mconley)
Flags: needinfo?(mconley)
See Also: → 1563619

I'm getting a new stack when crashing OOP subframes now:

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x00000001077d1bd0 XUL`mozilla::dom::ContentProcessManager::GetContentProcessById(this=0x000000013bb61c70, aChildCpId=0x00007ffeefbf6b28) at ContentProcessManager.cpp:92
    frame #1: 0x00000001077b78d6 XUL`mozilla::dom::ContentParent::NotifyTabDestroying(aTabId=0x00007ffeefbf6b30, aCpId=0x00007ffeefbf6b28) at ContentParent.cpp:1782
    frame #2: 0x00000001077945ae XUL`mozilla::dom::BrowserParent::Destroy(this=0x000000016d74c800) at BrowserParent.cpp:623
    frame #3: 0x00000001077923e8 XUL`mozilla::dom::BrowserBridgeParent::Destroy(this=0x000000016e787280) at BrowserBridgeParent.cpp:116
    frame #4: 0x000000010779609c XUL`mozilla::dom::BrowserBridgeParent::ActorDestroy(this=0x000000016e787280, aWhy=Deletion) at BrowserBridgeParent.cpp:223
    frame #5: 0x0000000102ed8d8b XUL`mozilla::ipc::IProtocol::DestroySubtree(this=0x000000016e787280, aWhy=Deletion) at ProtocolUtils.cpp:659
    frame #6: 0x00000001036ee830 XUL`mozilla::dom::PBrowserBridgeParent::OnMessageReceived(this=0x000000016e787280, msg__=0x000000016aa086b8) at PBrowserBridgeParent.cpp:234
    frame #7: 0x00000001030dce7c XUL`mozilla::dom::PContentParent::OnMessageReceived(this=0x000000016dcd4000, msg__=0x000000016aa086b8) at PContentParent.cpp:5551
    frame #8: 0x0000000102ece326 XUL`mozilla::ipc::MessageChannel::DispatchAsyncMessage(this=0x000000016dcd4140, aProxy=0x000000016df50b20, aMsg=0x000000016aa086b8) at MessageChannel.cpp:2158
    frame #9: 0x0000000102ecc67e XUL`mozilla::ipc::MessageChannel::DispatchMessage(this=0x000000016dcd4140, aMsg=0x000000016aa086b8) at MessageChannel.cpp:2082
    frame #10: 0x0000000102ecd10e XUL`mozilla::ipc::MessageChannel::RunMessage(this=0x000000016dcd4140, aTask=0x000000016aa08660) at MessageChannel.cpp:1939
    frame #11: 0x0000000102ecd7d6 XUL`mozilla::ipc::MessageChannel::MessageTask::Run(this=0x000000016aa08660) at MessageChannel.cpp:1970
    frame #12: 0x000000010213bcb1 XUL`nsThread::ProcessNextEvent(this=0x000000010192d980, aMayWait=false, aResult=0x00007ffeefbfcb1f) at nsThread.cpp:1225
    frame #13: 0x000000010213808a XUL`NS_ProcessPendingEvents(aThread=0x000000010192d980, aTimeout=10) at nsThreadUtils.cpp:434
    frame #14: 0x0000000107ede1c0 XUL`nsBaseAppShell::NativeEventCallback(this=0x00000001019e4160) at nsBaseAppShell.cpp:87
    frame #15: 0x0000000107f926a1 XUL`nsAppShell::ProcessGeckoEvents(aInfo=0x00000001019e4160) at nsAppShell.mm:440
    frame #16: 0x00007fff5185e161 CoreFoundation`__CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 17
    frame #17: 0x00007fff5191872c CoreFoundation`__CFRunLoopDoSource0 + 108
    frame #18: 0x00007fff51840bc0 CoreFoundation`__CFRunLoopDoSources0 + 208
    frame #19: 0x00007fff5184003d CoreFoundation`__CFRunLoopRun + 1293
    frame #20: 0x00007fff5183f8a3 CoreFoundation`CFRunLoopRunSpecific + 483
    frame #21: 0x00007fff50b29d96 HIToolbox`RunCurrentEventLoopInMode + 286
    frame #22: 0x00007fff50b29b06 HIToolbox`ReceiveNextEventCommon + 613
    frame #23: 0x00007fff50b29884 HIToolbox`_BlockUntilNextEventMatchingListInModeWithFilter + 64
    frame #24: 0x00007fff4edd9a73 AppKit`_DPSNextEvent + 2085
    frame #25: 0x00007fff4f56fe34 AppKit`-[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 3044
    frame #26: 0x0000000107f90f3c XUL`::-[GeckoNSApplication nextEventMatchingMask:untilDate:inMode:dequeue:](self=0x00000001019d3040, _cmd="nextEventMatchingMask:untilDate:inMode:dequeue:", mask=18446744073709551615, expiration=4001-01-01 00:00:00 UTC, mode="kCFRunLoopDefaultMode", flag=YES) at nsAppShell.mm:169
    frame #27: 0x00007fff4edce885 AppKit`-[NSApplication run] + 764
    frame #28: 0x0000000107f933b7 XUL`nsAppShell::Run(this=0x00000001019e4160) at nsAppShell.mm:703
    frame #29: 0x000000010aa5b12b XUL`nsAppStartup::Run(this=0x00000001019dd2e0) at nsAppStartup.cpp:276
    frame #30: 0x000000010acf2ac6 XUL`XREMain::XRE_mainRun(this=0x00007ffeefbfed68) at nsAppRunner.cpp:4636
    frame #31: 0x000000010acf3c16 XUL`XREMain::XRE_main(this=0x00007ffeefbfed68, argc=5, argv=0x00007ffeefbff3e0, aConfig=0x00007ffeefbfef08) at nsAppRunner.cpp:4771
    frame #32: 0x000000010acf4429 XUL`XRE_main(argc=5, argv=0x00007ffeefbff3e0, aConfig=0x00007ffeefbfef08) at nsAppRunner.cpp:4852
    frame #33: 0x000000010ad0b637 XUL`mozilla::BootstrapImpl::XRE_main(this=0x0000000101911230, argc=5, argv=0x00007ffeefbff3e0, aConfig=0x00007ffeefbfef08) at Bootstrap.cpp:45
    frame #34: 0x000000010000134d firefox`do_main(argc=5, argv=0x00007ffeefbff3e0, envp=0x00007ffeefbff410) at nsBrowserApp.cpp:213
    frame #35: 0x0000000100000d19 firefox`main(argc=5, argv=0x00007ffeefbff3e0, envp=0x00007ffeefbff410) at nsBrowserApp.cpp:295
    frame #36: 0x00007fff7976c015 libdyld.dylib`start + 1
Status: NEW → ASSIGNED
Priority: -- → P2

It looks like we're crashing at the stack in comment 4 because we're removing the ContentParent from the ContentProcessManager id->ContentParent mapping with this stack after the crash:

* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x00000001077d1df6 XUL`mozilla::dom::ContentProcessManager::RemoveContentProcess(this=0x000000012eb090d0, aChildCpId=0x00007ffeefbfbe88) at ContentProcessManager.cpp:60
    frame #1: 0x00000001077d1217 XUL`mozilla::dom::ContentParent::ActorDestroy(this=0x000000016b54e000, why=AbnormalShutdown) at ContentParent.cpp:1680
    frame #2: 0x0000000102ed8d8b XUL`mozilla::ipc::IProtocol::DestroySubtree(this=0x000000016b54e000, aWhy=AbnormalShutdown) at ProtocolUtils.cpp:659
    frame #3: 0x00000001030fd734 XUL`mozilla::dom::PContentParent::OnChannelError(this=0x000000016b54e000) at PContentParent.cpp:12291
    frame #4: 0x00000001077d02da XUL`mozilla::dom::ContentParent::OnChannelError(this=0x000000016b54e000) at ContentParent.cpp:1511
    frame #5: 0x0000000102ecf8d3 XUL`mozilla::ipc::MessageChannel::NotifyMaybeChannelError(this=0x000000016b54e140) at MessageChannel.cpp:2572
    frame #6: 0x0000000102ecfb04 XUL`mozilla::ipc::MessageChannel::OnNotifyMaybeChannelError(this=0x000000016b54e140) at MessageChannel.cpp:2602
    frame #7: 0x0000000102ee91ce XUL`decltype(o=0x000000016b54e140, m=90 f9 ec 02 01 00 00 00 00 00 00 00 00 00 00 00, args=0x0000000137b105f8, (null)=std::__1::index_sequence<> @ 0x00007ffeefbfc2b8).*fp0(Get<>(fp1).PassAsParameter())) mozilla::detail::RunnableMethodArguments<>::applyImpl<mozilla::ipc::MessageChannel, void (mozilla::ipc::MessageChannel::*)()>(mozilla::ipc::MessageChannel*, void (mozilla::ipc::MessageChannel::*)(), mozilla::Tuple<>&, std::__1::integer_sequence<unsigned long>) at nsThreadUtils.h:1124
    frame #8: 0x0000000102ee913d XUL`_ZN7mozilla6detail23RunnableMethodArgumentsIJEE5applyINS_3ipc14MessageChannelEMS5_FvvEEEDTcl9applyImplfp_fp0_dtdefpT10mArgumentstlNSt3__116integer_sequenceImJEEEEEEPT_T0_(this=0x0000000137b105f8, o=0x000000016b54e140, m=90 f9 ec 02 01 00 00 00 00 00 00 00 00 00 00 00) at nsThreadUtils.h:1130
    frame #9: 0x0000000102ee8f2e XUL`mozilla::detail::RunnableMethodImpl<mozilla::ipc::MessageChannel*, void (mozilla::ipc::MessageChannel::*)(), false, (mozilla::RunnableKind)1>::Run(this=0x0000000137b105b0) at nsThreadUtils.h:1176
    frame #10: 0x000000010213bcb1 XUL`nsThread::ProcessNextEvent(this=0x000000010192d980, aMayWait=false, aResult=0x00007ffeefbfcb1f) at nsThread.cpp:1225
    frame #11: 0x000000010213808a XUL`NS_ProcessPendingEvents(aThread=0x000000010192d980, aTimeout=10) at nsThreadUtils.cpp:434
    frame #12: 0x0000000107ede1c0 XUL`nsBaseAppShell::NativeEventCallback(this=0x00000001019e3160) at nsBaseAppShell.cpp:87
    frame #13: 0x0000000107f926a1 XUL`nsAppShell::ProcessGeckoEvents(aInfo=0x00000001019e3160) at nsAppShell.mm:440
    frame #14: 0x00007fff5185e161 CoreFoundation`__CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 17
    frame #15: 0x00007fff5191872c CoreFoundation`__CFRunLoopDoSource0 + 108
    frame #16: 0x00007fff51840bc0 CoreFoundation`__CFRunLoopDoSources0 + 208
    frame #17: 0x00007fff5184003d CoreFoundation`__CFRunLoopRun + 1293
    frame #18: 0x00007fff5183f8a3 CoreFoundation`CFRunLoopRunSpecific + 483
    frame #19: 0x00007fff50b29d96 HIToolbox`RunCurrentEventLoopInMode + 286
    frame #20: 0x00007fff50b29b06 HIToolbox`ReceiveNextEventCommon + 613
    frame #21: 0x00007fff50b29884 HIToolbox`_BlockUntilNextEventMatchingListInModeWithFilter + 64
    frame #22: 0x00007fff4edd9a73 AppKit`_DPSNextEvent + 2085
    frame #23: 0x00007fff4f56fe34 AppKit`-[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 3044
    frame #24: 0x0000000107f90f3c XUL`::-[GeckoNSApplication nextEventMatchingMask:untilDate:inMode:dequeue:](self=0x00000001019d3040, _cmd="nextEventMatchingMask:untilDate:inMode:dequeue:", mask=18446744073709551615, expiration=4001-01-01 00:00:00 UTC, mode="kCFRunLoopDefaultMode", flag=YES) at nsAppShell.mm:169
    frame #25: 0x00007fff4edce885 AppKit`-[NSApplication run] + 764
    frame #26: 0x0000000107f933b7 XUL`nsAppShell::Run(this=0x00000001019e3160) at nsAppShell.mm:703
    frame #27: 0x000000010aa5b12b XUL`nsAppStartup::Run(this=0x00000001019d71a0) at nsAppStartup.cpp:276
    frame #28: 0x000000010acf2ac6 XUL`XREMain::XRE_mainRun(this=0x00007ffeefbfed68) at nsAppRunner.cpp:4636
    frame #29: 0x000000010acf3c16 XUL`XREMain::XRE_main(this=0x00007ffeefbfed68, argc=5, argv=0x00007ffeefbff3e0, aConfig=0x00007ffeefbfef08) at nsAppRunner.cpp:4771
    frame #30: 0x000000010acf4429 XUL`XRE_main(argc=5, argv=0x00007ffeefbff3e0, aConfig=0x00007ffeefbfef08) at nsAppRunner.cpp:4852
    frame #31: 0x000000010ad0b637 XUL`mozilla::BootstrapImpl::XRE_main(this=0x0000000101911230, argc=5, argv=0x00007ffeefbff3e0, aConfig=0x00007ffeefbfef08) at Bootstrap.cpp:45
    frame #32: 0x000000010000134d firefox`do_main(argc=5, argv=0x00007ffeefbff3e0, envp=0x00007ffeefbff410) at nsBrowserApp.cpp:213
    frame #33: 0x0000000100000d19 firefox`main(argc=5, argv=0x00007ffeefbff3e0, envp=0x00007ffeefbff410) at nsBrowserApp.cpp:295
    frame #34: 0x00007fff7976c015 libdyld.dylib`start + 1
Flags: needinfo?(mconley)

mozregression suggests this was regressed by bug 1555287.

6:02.23 INFO: No more inbound revisions, bisection finished.
6:02.23 INFO: Last good revision: 36f11c1d8950e9106831a71c60fa6336bc0df606
6:02.23 INFO: First bad revision: 695c34774fbce79bf31e8d11ce2d7d92d1739f33
6:02.23 INFO: Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=36f11c1d8950e9106831a71c60fa6336bc0df606&tochange=695c34774fbce79bf31e8d11ce2d7d92d1739f33

Blocks: 1555287
Flags: needinfo?(afarre)

Well, yes. It does look a bit weird. I guess that this is with the fission pref on?

Flags: needinfo?(afarre) → needinfo?(mconley)

That's correct, yes. I guess our teardown logic in the event of a crash is out of order or something?

Flags: needinfo?(mconley) → needinfo?(afarre)
Flags: needinfo?(afarre)
Summary: Subframe crashing is broken again → Enable subframe crashing regression test

Bug 1559244 landed, which added a new subframe crashing test (disabled by default), and a little while later, bug 1563619 landed, which actually fixed subframe crashing.

This patch I'm about to post modifies the test to better match what happens now when subframes crash, and also enables the test by default.

The test originally assumed that the BrowsingContext for the crashing
frame would get destroyed and replaced with a new one. Having read through
some of bug 1563619 though, it looks as if we preserve the BrowsingContext
after the subframe crashes, so I've modified the test to skip that check.

It also looks like we have to wait until the WindowGlobalParent is created
asynchronously from the about:blank load that's initiated in the content
process before we can query the subframe for its current location, so
I've added a waitForCondition for that.

So in debug builds, this test still causes a parent process crash due to an assertion failure with the same stacks from comment 4.

The problem appears to be that when we have a crashing subframe, we end up calling ActorDestroy on the BrowserBridgeParent, which calls Destroy on the BrowserParent, which then calls ContentParent::NotifyTabDestroying.

ContentParent::NotifyTabDestroying attempts to get a ContentParent associated with the ContentParentId that was passed to it, but because that ContentParent was removed from its list via the stack in comment 5, we end up hitting this ASSERT_UNLESS_FUZZING in ContentProcessManager::GetContentProcessById.

That's why we're crashing on debug builds.

One immediate observation is that in the non-debug build case, instead of failing with ASSERT_UNLESS_FUZZING, ContentProcessManager::GetContentProcessById returns nullptr, which then causes NotifyTabDestroying to bail out immediately.

What I suspect we want to do here is to not call NotifyTabDestroying if we're responding to a crash. As far as I can tell, none of the things inside of NotifyTabDestroying make sense to do if we're crashing (it's mostly concerned with bookkeeping to determine whether or not we should shut down the content process that the frame or tab is associated with).

What I suggest is that we pass a boolean from BrowserBridgeParent down to BrowserParent when calling the Destroy method, and that bit tells us whether or not the destruction is due to an AbnormalShutdown. If so, we have BrowserParent::Destroy skip calling NotifyTabDestroying if we're crashing.

Any objections to that plan farre, or nika?

Flags: needinfo?(nika)
Flags: needinfo?(afarre)

(In reply to Mike Conley (:mconley) (:⚙️) from comment #12)

ContentParent::NotifyTabDestroying attempts to get a ContentParent associated with the ContentParentId that was passed to it, but because that ContentParent was removed from its list via the stack in comment 5, we end up hitting this ASSERT_UNLESS_FUZZING in ContentProcessManager::GetContentProcessById.

I've somewhat frequently run into issues where these ASSERT_UNLESS_FUZZING assertions cause unnecessary crashes which need to be worked around awkwardly. As this is another instance of that, I'm mildly inclined towards simply removing this assertion entirely.

The assertion was originally added in bug 1020172. As far as I can tell, the intention was to assert that messages from remote processes contained a valid ID in them, and fail tests if they did not. However, it also causes issues due to how we often treat IDs as a sort-of weak reference, and the ContentParent can easily disappear.

One immediate observation is that in the non-debug build case, instead of failing with ASSERT_UNLESS_FUZZING, ContentProcessManager::GetContentProcessById returns nullptr, which then causes NotifyTabDestroying to bail out immediately.

As far as I am aware every call site of this method has correct recovery code for handling the missing ContentParent case (and would be fairly unsound if it did not, as this is a debug assert).

Flags: needinfo?(nika)

I should also note that NotifyTabDestroying using ChildID is a relic of the old B2G "nested content process" setup. There are technically 2 callsites, one from over IPC and the other from within the current process, but the IPC one is never invoked, as we have removed all code which can allow nested content processes. We should probably also change the logic to not try to use nested content processes here.

See Also: → 1579121

Yeah, the patch in bug 1579121 fixes things nicely. Thanks!

It turns out I also needed to add a CSP header to the about:framecrashed header, or debug builds would assert on me.

Here's a try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=50b97d29b151fb57d6bbdedc7fe95288a830a33f

Flags: needinfo?(afarre)
Pushed by mconley@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2de26156c571
Apply a CSP to about:framecrashed. r=jaws
https://hg.mozilla.org/integration/autoland/rev/eaf4a3284f00
Enable subframe crashing regression test. r=layely
Pushed by mconley@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/872417d970d5
Apply a CSP to about:framecrashed. r=jaws
https://hg.mozilla.org/integration/autoland/rev/3e0217846b4b
Enable subframe crashing regression test. r=layely

I forgot that crash tests like this need to be disabled for builds that don't have crashreporter, and for verify jobs. Re-landed with those job types skipped.

Flags: needinfo?(mconley)
Status: ASSIGNED → RESOLVED
Closed: 6 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla71
Regressions: 1579716
You need to log in before you can comment on or make changes to this bug.