Hanging at shutdown as mMainThreadDebuggeeEventTarget is paused and will not execute CancelingOnParentRunnable
Categories
(Core :: DOM: Workers, defect, P2)
Tracking
()
Version | Tracking | Status |
---|---|---|
firefox123 | --- | fixed |
People
(Reporter: tsmith, Assigned: jstutte)
References
(Blocks 1 open bug)
Details
(Keywords: pernosco)
Attachments
(1 file)
Found while fuzzing m-c 20240103-9ea90dc23395 (--enable-debug --enable-fuzzing)
I don't have a test case but I do have a Pernosco session: https://pernos.co/debug/BKJlE9c0OL2DIiMuKywbPw/index.html
stderr:
[Parent 2109771, Main Thread] WARNING: '!top', file /builds/worker/checkouts/gecko/dom/xul/MenuBarListener.cpp:99
[Parent 2109771, IPC I/O Parent] WARNING: Process 2110195 may be hanging at shutdown; will wait for up to 8000ms: file /builds/worker/checkouts/gecko/ipc/chromium/src/chrome/common/process_watcher_posix_sigchld.cc:184
[Parent 2109771, IPC I/O Parent] WARNING: Process 2110195 hanging at shutdown; attempting crash report (fatal error).: file /builds/worker/checkouts/gecko/ipc/chromium/src/chrome/common/process_watcher_posix_sigchld.cc:207
#0 0x70000002 (linux-vdso.so.1+0x70000002) (BuildId: 274539cc9764a41518e972bafd6e92fc673c25be)
#1 0x7fad6819bd07 in _raw_syscall /home/twsmith/code/rr/src/preload/raw_syscall.S:120
#2 0x7fad68195908 in traced_raw_syscall /home/twsmith/code/rr/src/preload/syscallbuf.c:350:10
#3 0x7fad68198d02 in sys_futex /home/twsmith/code/rr/src/preload/syscallbuf.c:2040:14
#4 0x7fad68198d02 in syscall_hook_internal /home/twsmith/code/rr/src/preload/syscallbuf.c:4134:5
#5 0x7fad6819bacb in syscall_hook /home/twsmith/code/rr/src/preload/syscallbuf.c:4311:17
#6 0x7fad6819bacb in syscall_hook /home/twsmith/code/rr/src/preload/syscallbuf.c:4295:16
#7 0x7fad68195322 in _syscall_hook_trampoline /home/twsmith/code/rr/src/preload/syscall_hook.S:308
#8 0x7fad6819538c in __morestack /home/twsmith/code/rr/src/preload/syscall_hook.S:443
#9 0x7fad681953a8 in _syscall_hook_trampoline_48_3d_00_f0_ff_ff /home/twsmith/code/rr/src/preload/syscall_hook.S:462
#10 0x7fad6816337b in futex_wait_cancelable /build/glibc-wuryBv/glibc-2.31/nptl/../sysdeps/nptl/futex-internal.h:183:13
#11 0x7fad6816337b in __pthread_cond_wait_common /build/glibc-wuryBv/glibc-2.31/nptl/pthread_cond_wait.c:508:14
#12 0x7fad6816337b in pthread_cond_wait@@GLIBC_2.3.2 /build/glibc-wuryBv/glibc-2.31/nptl/pthread_cond_wait.c:647:10
#13 0x555b24315b49 in mozilla::detail::ConditionVariableImpl::wait(mozilla::detail::MutexImpl&) /builds/worker/checkouts/gecko/mozglue/misc/ConditionVariable_posix.cpp:106:11
#14 0x555b24315c2f in mozilla::detail::ConditionVariableImpl::wait_for(mozilla::detail::MutexImpl&, mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator> const&) /builds/worker/checkouts/gecko/mozglue/misc/ConditionVariable_posix.cpp:113:5
#15 0x7fad485390e9 in mozilla::OffTheBooksCondVar::Wait(mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator>) /builds/worker/checkouts/gecko/xpcom/threads/BlockingResourceBase.cpp:534:20
#16 0x7fad48538fb0 in mozilla::OffTheBooksCondVar::Wait() /builds/worker/checkouts/gecko/xpcom/threads/BlockingResourceBase.cpp:514:21
#17 0x7fad48545afc in mozilla::TaskController::GetRunnableForMTTask(bool) /builds/worker/checkouts/gecko/xpcom/threads/TaskController.cpp:619:19
#18 0x7fad48585f04 in nsThread::ProcessNextEvent(bool, bool*) /builds/worker/checkouts/gecko/xpcom/threads/nsThread.cpp:1134:38
#19 0x7fad4858e975 in NS_ProcessNextEvent(nsIThread*, bool) /builds/worker/checkouts/gecko/xpcom/threads/nsThreadUtils.cpp:480:10
#20 0x7fad509cc7d0 in mozilla::dom::workerinternals::RuntimeService::Cleanup() /builds/worker/checkouts/gecko/dom/workers/RuntimeService.cpp:1619:14
#21 0x7fad509d3341 in mozilla::dom::workerinternals::RuntimeService::Observe(nsISupports*, char const*, char16_t const*) /builds/worker/checkouts/gecko/dom/workers/RuntimeService.cpp:1909:5
#22 0x7fad48434737 in nsObserverList::NotifyObservers(nsISupports*, char const*, char16_t const*) /builds/worker/checkouts/gecko/xpcom/ds/nsObserverList.cpp:71:19
#23 0x7fad48436be1 in nsObserverService::NotifyObservers(nsISupports*, char const*, char16_t const*) /builds/worker/checkouts/gecko/xpcom/ds/nsObserverService.cpp:288:19
#24 0x7fad48332431 in mozilla::AppShutdown::AdvanceShutdownPhaseInternal(mozilla::ShutdownPhase, bool, char16_t const*, nsCOMPtr<nsISupports> const&) /builds/worker/checkouts/gecko/xpcom/base/AppShutdown.cpp:433:21
#25 0x7fad48332a87 in mozilla::AppShutdown::AdvanceShutdownPhase(mozilla::ShutdownPhase, char16_t const*, nsCOMPtr<nsISupports> const&) /builds/worker/checkouts/gecko/xpcom/base/AppShutdown.cpp:456:3
#26 0x7fad485fa0d2 in mozilla::ShutdownXPCOM(nsIServiceManager*) /builds/worker/checkouts/gecko/xpcom/build/XPCOMInit.cpp:612:5
#27 0x7fad485f9e24 in NS_ShutdownXPCOM /builds/worker/checkouts/gecko/xpcom/build/XPCOMInit.cpp:564:10
#28 0x7fad504c0ec4 in mozilla::dom::ContentProcess::CleanUp() /builds/worker/checkouts/gecko/dom/ipc/ContentProcess.cpp:189:3
#29 0x7fad5561c916 in XRE_InitChildProcess(int, char**, XREChildData const*) /builds/worker/checkouts/gecko/toolkit/xre/nsEmbedFunctions.cpp:660:16
#30 0x7fad55631c86 in mozilla::BootstrapImpl::XRE_InitChildProcess(int, char**, XREChildData const*) /builds/worker/checkouts/gecko/toolkit/xre/Bootstrap.cpp:67:12
#31 0x555b24257f18 in content_process_main(mozilla::Bootstrap*, int, char**) /builds/worker/checkouts/gecko/browser/app/../../ipc/contentproc/plugin-container.cpp:57:28
#32 0x555b242581b9 in main /builds/worker/checkouts/gecko/browser/app/nsBrowserApp.cpp:375:18
#33 0x7fad67c12082 in __libc_start_main /build/glibc-wuryBv/glibc-2.31/csu/../csu/libc-start.c:308:16
#34 0x555b2422e0e8 in _start (/home/twsmith/workspace/browsers/m-c-20240103160634-fuzzing-noopt-debug/firefox-bin+0xce0e8) (BuildId: f0cc65e059645bdf2e7305d49c15f7207c65d749)
Comment 1 • 1 year ago (Assignee)
Thanks for that pernosco trace! I added some comments there and I think this points us to and helps us with bug 1769913.
Comment 2 • 1 year ago (Assignee)
(In reply to Jens Stutte [:jstutte] from bug 1769913 comment #6)
Bug 1872913 contains an interesting case for such a shutdown hang as of comment 0.
It seems we post the CancelingOnParentRunnable (holding the strong worker ref that blocks our shutdown) to the worker's mMainThreadDebuggeeEventTarget, but an incoming nsGlobalWindowInner::Suspend pauses that queue before it will ever be executed. I wonder if the CancelingOnParentRunnable should better be dispatched directly to the main thread queue? Or we need to drain/unpause the throttled queue on Cancel (but I fear that could always race with asynchronous events causing it to pause again)?
Let's give it a try to avoid pausing when canceling. There are potentially more/different runnables in that queue, I assume.
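To make the failure mode concrete, here is a minimal standalone sketch of the situation described above. It is a toy model only: `Worker`, `PausableQueue`, and `ShutdownCompletes` are hypothetical names invented for illustration, not Gecko classes, and the flag only models the "avoid pausing when canceling" idea, not the patch that eventually landed.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for a worker object (not Gecko's WorkerPrivate).
struct Worker {};

// Hypothetical model of a pausable event target: queued runnables only run
// while the queue is not paused.
class PausableQueue {
 public:
  explicit PausableQueue(bool aIgnorePauseWhenCanceling)
      : mIgnorePauseWhenCanceling(aIgnorePauseWhenCanceling) {}

  void Dispatch(std::function<void()> aRunnable) {
    mPending.push_back(std::move(aRunnable));
  }

  // Marks that cancellation has started (assumed hook for this sketch).
  void StartCanceling() { mCanceling = true; }

  // Models the effect of a window suspend on the queue. If the queue is
  // configured to ignore pauses while canceling, the request is dropped.
  void Pause() {
    if (mCanceling && mIgnorePauseWhenCanceling) {
      return;
    }
    mPaused = true;
  }

  // Runs every runnable that is currently allowed to run.
  std::size_t Drain() {
    std::size_t ran = 0;
    while (!mPaused && !mPending.empty()) {
      std::function<void()> runnable = std::move(mPending.front());
      mPending.pop_front();
      runnable();
      ++ran;
    }
    return ran;
  }

 private:
  std::deque<std::function<void()>> mPending;
  bool mIgnorePauseWhenCanceling;
  bool mCanceling = false;
  bool mPaused = false;
};

// Returns true if the strong worker reference held by the queued runnable
// was released, i.e. shutdown would not wait on it forever.
bool ShutdownCompletes(bool aIgnorePauseWhenCanceling) {
  PausableQueue queue(aIgnorePauseWhenCanceling);
  auto strong = std::make_shared<Worker>();
  std::weak_ptr<Worker> weak = strong;

  // Cancel: post a runnable that owns the only strong reference.
  queue.StartCanceling();
  queue.Dispatch([ref = std::move(strong)] { /* cancel on parent */ });

  // A suspend arrives before the runnable had a chance to run.
  queue.Pause();

  // Shutdown spins the event loop once, then checks the reference.
  queue.Drain();
  return weak.expired();
}
```

In this model, `ShutdownCompletes(false)` reproduces the hang condition (the runnable stays queued and keeps the worker alive), while `ShutdownCompletes(true)` lets the runnable run and release its reference. Whether other runnables in the real queue tolerate running in that state is a separate question that the sketch does not address.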
Comment 3 • 1 year ago (Assignee)
Comment 5 • 1 year ago
Backed out for crashes on iframe-append-2.https.html
Backout link: https://hg.mozilla.org/integration/autoland/rev/73cda4c569603df9d2367f96714331416c4a56ca
Log link: https://treeherder.mozilla.org/logviewer?job_id=443075873&repo=autoland&lineNumber=3344
Comment 6 • 1 year ago (Assignee)
Yes, that test revealed a logic problem with that patch, as we now run some runnables in a situation they do not expect to be run in.
Comment 8 • 1 year ago (bugherder)
Comment 9 • 1 year ago (Assignee)
Is there a reason for this bug to not have the bugmon keyword? I'd like to make it check if it is fixed for good.
Comment 10 • 1 year ago (Reporter)
(In reply to Jens Stutte [:jstutte] from comment #9)
Is there a reason for this bug to not have the bugmon keyword? I'd like to make it check if it is fixed for good.
Bugmon requires a test case and we don't have a reliable one for this issue.
Comment 11 • 1 year ago (Assignee)
(In reply to Tyson Smith [:tsmith] from comment #10)
Bugmon requires a test case and we don't have a reliable one for this issue.
Fair enough, should have known. Thanks