Closed Bug 1789126 Opened 2 years ago Closed 2 years ago

Intermittent Assertion failure: false (MOZ_ASSERT_UNREACHABLE: Failure when starting actor), at /builds/worker/checkouts/gecko/ipc/glue/UtilityProcessManager.cpp:258

Categories

(Core :: IPC, defect)

defect

Tracking

()

RESOLVED FIXED
114 Branch
Tracking Status
firefox-esr102 --- unaffected
firefox112 --- wontfix
firefox113 --- wontfix
firefox114 --- fixed

People

(Reporter: intermittent-bug-filer, Unassigned)

References

(Regression)

Details

(Keywords: assertion, intermittent-failure, regression, Whiteboard: [stockwell unknown][stockwell unknown])

Filed by: mlaza [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=389411613&repo=mozilla-central
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/WviZOqCLTeSrcJC4AaR-PQ/runs/0/artifacts/public/logs/live_backing.log


[task 2022-09-04T10:11:38.772Z] 10:11:38     INFO - TEST-START | dom/canvas/test/webgl-conf/generated/test_2_conformance__textures__misc__tex-video-using-tex-unit-non-zero.html
[task 2022-09-04T10:11:38.854Z] 10:11:38     INFO - GECKO(1742) | [Parent 1742, Main Thread] WARNING: NS_ENSURE_TRUE(she && she->mInfo->mSharedState.Get()) failed: file /builds/worker/checkouts/gecko/docshell/shistory/SessionHistoryEntry.cpp:1127
[task 2022-09-04T10:11:39.006Z] 10:11:39     INFO - GECKO(1742) | [Child 1760, Main Thread] WARNING: NS_ENSURE_TRUE(nsContentUtils::IsJavascriptMIMEType(type)) failed: file /builds/worker/checkouts/gecko/dom/script/ScriptLoader.cpp:876
[task 2022-09-04T10:11:39.006Z] 10:11:39     INFO - GECKO(1742) | [Child 1760, Main Thread] WARNING: NS_ENSURE_TRUE(nsContentUtils::IsJavascriptMIMEType(type)) failed: file /builds/worker/checkouts/gecko/dom/script/ScriptLoader.cpp:876
[task 2022-09-04T10:11:39.113Z] 10:11:39     INFO - GECKO(1742) | [Child 1760, Main Thread] WARNING: '!scrollbar', file /builds/worker/checkouts/gecko/widget/Theme.cpp:1073
[task 2022-09-04T10:11:39.274Z] 10:11:39     INFO - GECKO(1742) | [Child 1760, Main Thread] WARNING: Attempt to __delete__ missing or closed actor: file /builds/worker/workspace/obj-build/ipc/ipdl/PWebGLChild.cpp:125
[task 2022-09-04T10:11:39.275Z] 10:11:39     INFO - GECKO(1742) | [Child 1760, Main Thread] WARNING: Attempt to __delete__ missing or closed actor: file /builds/worker/workspace/obj-build/ipc/ipdl/PWebGLChild.cpp:125
[task 2022-09-04T10:11:39.275Z] 10:11:39     INFO - GECKO(1742) | [Child 1760, Main Thread] WARNING: Attempt to __delete__ missing or closed actor: file /builds/worker/workspace/obj-build/ipc/ipdl/PWebGLChild.cpp:125
[task 2022-09-04T10:11:39.276Z] 10:11:39     INFO - GECKO(1742) | [Child 1760, Main Thread] WARNING: Attempt to __delete__ missing or closed actor: file /builds/worker/workspace/obj-build/ipc/ipdl/PWebGLChild.cpp:125
[task 2022-09-04T10:11:40.408Z] 10:11:40     INFO - GECKO(1742) | [Child 1760, Main Thread] WARNING: '!scrollbar', file /builds/worker/checkouts/gecko/widget/Theme.cpp:1073
[task 2022-09-04T10:11:40.458Z] 10:11:40     INFO - GECKO(1742) | [Child 1760, Main Thread] WARNING: '!scrollbar', file /builds/worker/checkouts/gecko/widget/Theme.cpp:1073
[task 2022-09-04T10:11:40.509Z] 10:11:40     INFO - GECKO(1742) | [Child 1760, Main Thread] WARNING: '!scrollbar', file /builds/worker/checkouts/gecko/widget/Theme.cpp:1073
[task 2022-09-04T10:11:40.599Z] 10:11:40     INFO - GECKO(1742) | [Child 1760, Main Thread] WARNING: '!scrollbar', file /builds/worker/checkouts/gecko/widget/Theme.cpp:1073
[task 2022-09-04T10:11:40.641Z] 10:11:40     INFO - GECKO(1742) | [Child 1760, Main Thread] WARNING: '!scrollbar', file /builds/worker/checkouts/gecko/widget/Theme.cpp:1073
[task 2022-09-04T10:11:40.734Z] 10:11:40     INFO - GECKO(1742) | [Child 1760, Main Thread] WARNING: '!scrollbar', file /builds/worker/checkouts/gecko/widget/Theme.cpp:1073
[task 2022-09-04T10:11:43.739Z] 10:11:43     INFO - GECKO(1742) | [Child 1760, Main Thread] WARNING: Attempt to __delete__ missing or closed actor: file /builds/worker/workspace/obj-build/ipc/ipdl/PWebGLChild.cpp:125
[task 2022-09-04T10:11:43.739Z] 10:11:43     INFO - GECKO(1742) | [Child 1760, Main Thread] WARNING: Attempt to __delete__ missing or closed actor: file /builds/worker/workspace/obj-build/ipc/ipdl/PWebGLChild.cpp:125
[task 2022-09-04T10:11:45.930Z] 10:11:45     INFO - GECKO(1742) | [Parent 1742, Main Thread] WARNING: Reject LaunchProcess() for LaunchPromise() rejection: file /builds/worker/checkouts/gecko/ipc/glue/UtilityProcessManager.cpp:205
[task 2022-09-04T10:11:45.930Z] 10:11:45     INFO - GECKO(1742) | Assertion failure: false (MOZ_ASSERT_UNREACHABLE: Failure when starting actor), at /builds/worker/checkouts/gecko/ipc/glue/UtilityProcessManager.cpp:258
[task 2022-09-04T10:11:46.018Z] 10:11:46     INFO -  Initializing stack-fixing for the first stack frame, this may take a while...
See Also: → 1769624
Assignee: nobody → lissyx+mozillians

~330 more retriggers, no repro so far

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → INCOMPLETE
Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---
Assignee: lissyx+mozillians → nobody

The huge spike since ~2 weeks ago correlates with bug 1812035 and bug 1812275 landing, and they're the only changes to Utility I see since. Andreas, can you take a look?

Flags: needinfo?(afarre)
Regressed by: 1812035
Regressed by: 1812275

Those tests changes are to be able to pick up the js utility process, that we've set up to start early. The tests doesn't change anything than expectations. I think we need to ask :gerard-majax about this.

Flags: needinfo?(afarre) → needinfo?(lissyx+mozillians)
Assignee: nobody → lissyx+mozillians
Flags: needinfo?(lissyx+mozillians)

(In reply to Andreas Farre [:farre] from comment #20)

Those tests changes are to be able to pick up the js utility process, that we've set up to start early. The tests doesn't change anything than expectations. I think we need to ask :gerard-majax about this.

You start a new process early, that does not qualify for "does not change anything", it's actually a pretty invasive change. ... According to the stack:

[task 2023-02-24T19:42:18.340Z] 19:42:18     INFO - GECKO(1412) | [Parent 1412, Main Thread] WARNING: Reject LaunchProcess() for LaunchPromise() rejection: file /builds/worker/checkouts/gecko/ipc/glue/UtilityProcessManager.cpp:222
[task 2023-02-24T19:42:18.340Z] 19:42:18     INFO - GECKO(1412) | Assertion failure: false (MOZ_ASSERT_UNREACHABLE: Failure when starting actor), at /builds/worker/checkouts/gecko/ipc/glue/UtilityProcessManager.cpp:285
[task 2023-02-24T19:42:18.385Z] 19:42:18     INFO -  Initializing stack-fixing for the first stack frame, this may take a while...
[task 2023-02-24T19:42:37.707Z] 19:42:37     INFO - GECKO(1412) | #01: mozilla::ipc::UtilityProcessManager::StartUtility<mozilla::ipc::UtilityAudioDecoderChild>(RefPtr<mozilla::ipc::UtilityAudioDecoderChild>, mozilla::ipc::SandboxingKind)::{lambda(nsresult)#1}::operator()(nsresult) const [ipc/glue/UtilityProcessManager.cpp:285]
[task 2023-02-24T19:42:37.708Z] 19:42:37     INFO - GECKO(1412) | #02: mozilla::MozPromise<bool, nsresult, false>::InvokeCallbackMethod<true, mozilla::ipc::UtilityProcessManager::StartUtility<mozilla::ipc::UtilityAudioDecoderChild>(RefPtr<mozilla::ipc::UtilityAudioDecoderChild>, mozilla::ipc::SandboxingKind)::{lambda(nsresult)#1}, RefPtr<mozilla::MozPromise<bool, nsresult, false> > (mozilla::ipc::UtilityProcessManager::StartUtility<mozilla::ipc::UtilityAudioDecoderChild>(RefPtr<mozilla::ipc::UtilityAudioDecoderChild>, mozilla::ipc::SandboxingKind)::{lambda(nsresult)#1}::*)(nsresult) const, nsresult const&, RefPtr<mozilla::MozPromise<bool, nsresult, false>::Private> >(mozilla::ipc::UtilityProcessManager::StartUtility<mozilla::ipc::UtilityAudioDecoderChild>(RefPtr<mozilla::ipc::UtilityAudioDecoderChild>, mozilla::ipc::SandboxingKind)::{lambda(nsresult)#1}*, RefPtr<mozilla::MozPromise<bool, nsresult, false> > (mozilla::ipc::UtilityProcessManager::StartUtility<mozilla::ipc::UtilityAudioDecoderChild>(RefPtr<mozilla::ipc::UtilityAudioDecoderChild>, mozilla::ipc::SandboxingKind)::{lambda(nsresult)#1}::*)(nsresult) const, nsresult const&, RefPtr<mozilla::MozPromise<bool, nsresult, false>::Private>&&) [xpcom/threads/MozPromise.h:649]
[task 2023-02-24T19:42:37.709Z] 19:42:37     INFO - GECKO(1412) | #03: mozilla::MozPromise<bool, nsresult, false>::ThenValue<mozilla::ipc::UtilityProcessManager::StartUtility<mozilla::ipc::UtilityAudioDecoderChild>(RefPtr<mozilla::ipc::UtilityAudioDecoderChild>, mozilla::ipc::SandboxingKind)::{lambda()#1}, mozilla::ipc::UtilityProcessManager::StartUtility<mozilla::ipc::UtilityAudioDecoderChild>(RefPtr<mozilla::ipc::UtilityAudioDecoderChild>, mozilla::ipc::SandboxingKind)::{lambda(nsresult)#1}>::DoResolveOrRejectInternal(mozilla::MozPromise<bool, nsresult, false>::ResolveOrRejectValue&) [xpcom/threads/MozPromise.h:861]
[task 2023-02-24T19:42:37.709Z] 19:42:37     INFO - GECKO(1412) | #04: mozilla::MozPromise<bool, nsresult, false>::ThenValueBase::ResolveOrRejectRunnable::Run() [xpcom/threads/MozPromise.h:490]
[task 2023-02-24T19:42:37.710Z] 19:42:37     INFO - GECKO(1412) | #05: mozilla::RunnableTask::Run() [xpcom/threads/TaskController.cpp:540]
[task 2023-02-24T19:42:37.710Z] 19:42:37     INFO - GECKO(1412) | #06: mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) [xpcom/threads/TaskController.cpp:852]
[task 2023-02-24T19:42:37.710Z] 19:42:37     INFO - GECKO(1412) | #07: mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) [xpcom/threads/TaskController.cpp:0]
[task 2023-02-24T19:42:37.711Z] 19:42:37     INFO - GECKO(1412) | #08: mozilla::TaskController::ProcessPendingMTTask(bool) [xpcom/threads/TaskController.cpp:462]
[task 2023-02-24T19:42:37.711Z] 19:42:37     INFO - GECKO(1412) | #09: mozilla::detail::RunnableFunction<mozilla::TaskController::InitializeInternal()::$_2>::Run() [xpcom/threads/nsThreadUtils.h:547]
[task 2023-02-24T19:42:37.711Z] 19:42:37     INFO - GECKO(1412) | #10: nsThread::ProcessNextEvent(bool, bool*) [xpcom/threads/nsThread.cpp:1229]
[task 2023-02-24T19:42:37.712Z] 19:42:37     INFO - GECKO(1412) | #11: NS_ProcessPendingEvents(nsIThread*, unsigned int) [xpcom/threads/nsThreadUtils.cpp:442]
[task 2023-02-24T19:42:37.712Z] 19:42:37     INFO - GECKO(1412) | #12: nsBaseAppShell::NativeEventCallback() [widget/nsBaseAppShell.cpp:88]
[task 2023-02-24T19:42:37.712Z] 19:42:37     INFO - GECKO(1412) | #13: nsAppShell::ProcessGeckoEvents(void*) [widget/cocoa/nsAppShell.mm:512]
[task 2023-02-24T19:42:37.964Z] 19:42:37     INFO - GECKO(1412) | #14: __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ [/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x83d52]
[task 2023-02-24T19:42:37.964Z] 19:42:37     INFO - GECKO(1412) | #15: __CFRunLoopDoSource0 [/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x83cf1]
[task 2023-02-24T19:42:37.965Z] 19:42:37     INFO - GECKO(1412) | #16: __CFRunLoopDoSources0 [/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x83b0b]
[task 2023-02-24T19:42:37.965Z] 19:42:37     INFO - GECKO(1412) | #17: __CFRunLoopRun [/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x8283a]
[task 2023-02-24T19:42:37.965Z] 19:42:37     INFO - GECKO(1412) | #18: CFRunLoopRunSpecific [/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x81e3e]
[task 2023-02-24T19:42:38.236Z] 19:42:38     INFO - GECKO(1412) | #19: RunCurrentEventLoopInMode [/System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox + 0x2fabd]
[task 2023-02-24T19:42:38.236Z] 19:42:38     INFO - GECKO(1412) | #20: ReceiveNextEventCommon [/System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox + 0x2f7d5]
[task 2023-02-24T19:42:38.237Z] 19:42:38     INFO - GECKO(1412) | #21: _BlockUntilNextEventMatchingListInModeWithFilter [/System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox + 0x2f579]
[task 2023-02-24T19:42:38.496Z] 19:42:38     INFO - GECKO(1412) | #22: _DPSNextEvent [/System/Library/Frameworks/AppKit.framework/Versions/C/AppKit + 0x41039]
[task 2023-02-24T19:42:38.496Z] 19:42:38     INFO - GECKO(1412) | #23: -[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:] [/System/Library/Frameworks/AppKit.framework/Versions/C/AppKit + 0x3f880]
[task 2023-02-24T19:42:38.496Z] 19:42:38     INFO - GECKO(1412) | #24: -[GeckoNSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] [widget/cocoa/nsAppShell.mm:175]
[task 2023-02-24T19:42:38.497Z] 19:42:38     INFO - GECKO(1412) | #25: -[NSApplication run] [/System/Library/Frameworks/AppKit.framework/Versions/C/AppKit + 0x3158e]
[task 2023-02-24T19:42:38.497Z] 19:42:38     INFO - GECKO(1412) | #26: nsAppShell::Run() [widget/cocoa/nsAppShell.mm:0]
[task 2023-02-24T19:42:38.497Z] 19:42:38     INFO - GECKO(1412) | #27: nsAppStartup::Run() [toolkit/components/startup/nsAppStartup.cpp:296]
[task 2023-02-24T19:42:38.498Z] 19:42:38     INFO - GECKO(1412) | #28: XREMain::XRE_mainRun() [toolkit/xre/nsAppRunner.cpp:5650]
[task 2023-02-24T19:42:38.498Z] 19:42:38     INFO - GECKO(1412) | #29: XREMain::XRE_main(int, char**, mozilla::BootstrapConfig const&) [toolkit/xre/nsAppRunner.cpp:5843]
[task 2023-02-24T19:42:38.498Z] 19:42:38     INFO - GECKO(1412) | #30: XRE_main(int, char**, mozilla::BootstrapConfig const&) [toolkit/xre/nsAppRunner.cpp:5899]
[task 2023-02-24T19:42:38.499Z] 19:42:38     INFO - GECKO(1412) | #31: main [browser/app/nsBrowserApp.cpp:423]

So we reject because of an error: https://searchfox.org/mozilla-central/rev/aa3ccd258b64abfd4c5ce56c1f512bc7f65b844c/ipc/glue/UtilityProcessManager.cpp#222

Likely the cause of our reject is this reject: https://searchfox.org/mozilla-central/rev/aa3ccd258b64abfd4c5ce56c1f512bc7f65b844c/ipc/glue/UtilityProcessHost.cpp#150, which means it's GeckoChildProcessHost that rejects beneath us and this could come from:

Given the issue at least reproduces on mac and Windows, and is much more present on macOS debug builds, I'd suspect the first one, race at shutdown, especially since this is something we already hit, the mochitest-gl suites ends up starting some audio/video decoders for reasons I cannot remember about, and races because it shutdowns quite quickly.

In the crash I can see that both the RDDProcessStatus and UtilityProcess status annotations contain the value Destroyed, could this be a case where we're spinning up a process too late during shutdown?

Depends on: 1819311

Utility process startup failure because of IPC mismatch are also one reason why we can hit this issue.

Assignee: lissyx+mozillians → nobody

We're not working on this until bug 1819311 makes it easier to diagnose, if this is an issue then at worst the regressing bugs can always be backed out.

So we have bug 1819311 that is landing, this will remove the assertion, so we should not get anymore of those errors. We will rather monitor telemetry and address the issues there, so I'm going to close that as FIXED.

Status: REOPENED → RESOLVED
Closed: 2 years ago2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 114 Branch
You need to log in before you can comment on or make changes to this bug.