Bug 1874800 Comment 7 Edit History

Note: The actual edited comment in the bug view page will always show the original commenter’s name and original timestamp.

(In reply to Tyson Smith [:tsmith] from comment #4)
> (In reply to Tyson Smith [:tsmith] from comment #2)
> >  I was only able to successfully reproduce under rr with a -O2 build. Hopefully this has enough detail.
> 
> Just to clarify, this was a debug build built -O2.

OK, the `SIGABRT` I see on the blocked content process comes from [`ChildLaxReaper::CrashProcessIfHanging`](https://searchfox.org/mozilla-central/rev/1aa61dcd48e128a8cbfbe59b7ba43d31bd3c248a/ipc/chromium/src/chrome/common/process_watcher_posix_sigchld.cc#211) and happens after a hardcoded [8000 ms](https://searchfox.org/mozilla-central/rev/1aa61dcd48e128a8cbfbe59b7ba43d31bd3c248a/ipc/chromium/src/chrome/common/process_watcher_posix_sigchld.cc#36) wait when the parent's IPC IO thread ends. That means the parent's shutdown sequence was not blocked before getting to [`delete sIOThread;`](https://searchfox.org/mozilla-central/rev/1aa61dcd48e128a8cbfbe59b7ba43d31bd3c248a/xpcom/build/XPCOMInit.cpp#787), and this timer kicked in.

Apparently what happens is that the content process is able to acknowledge [and unblock its shutdown](https://searchfox.org/mozilla-central/rev/1aa61dcd48e128a8cbfbe59b7ba43d31bd3c248a/dom/ipc/ContentChild.cpp#3106) on the main thread but later hangs waiting for the blocked worker thread in `ShutdownPhase::XPCOMShutdownThreads`. So it makes total sense that the `ChildLaxReaper` timeout kicks in and not the `ContentParent::ForceKillTimerCallback`.

Looking at comment 0, that is exactly what the log was always saying here:

```
[Parent 367548, IPC I/O Parent] WARNING: Process 367730 may be hanging at shutdown; will wait for up to 8000ms: file /builds/worker/checkouts/gecko/ipc/chromium/src/chrome/common/process_watcher_posix_sigchld.cc:184
[Parent 367548, IPC I/O Parent] WARNING: Process 367730 hanging at shutdown; attempting crash report (fatal error).: file /builds/worker/checkouts/gecko/ipc/chromium/src/chrome/common/process_watcher_posix_sigchld.cc:207
UndefinedBehaviorSanitizer:DEADLYSIGNAL
```
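
For readers unfamiliar with this kind of reaper, the pattern the log shows (wait a bounded time for the child to exit, then abort it so a crash report can be captured) can be sketched roughly as below. This is only an illustration in Python, not the Gecko implementation: `reap_or_abort`, `spawn_hanging_child` and `GRACE_PERIOD_S` are hypothetical names, and the sleeping child merely stands in for a content process stuck in thread shutdown.

```python
import signal
import subprocess
import sys

# Assumption: mirrors the hardcoded 8000 ms wait mentioned in the comment.
GRACE_PERIOD_S = 8.0


def reap_or_abort(child: subprocess.Popen, grace_s: float = GRACE_PERIOD_S) -> int:
    """Wait up to grace_s seconds for child to exit; abort it if it is still alive.

    Returns the child's return code. On POSIX a negative value means the
    child was terminated by that signal number.
    """
    try:
        return child.wait(timeout=grace_s)
    except subprocess.TimeoutExpired:
        # The child is still hanging: send SIGABRT so it dies in a way
        # that a crash reporter (or core dump) can pick up.
        child.send_signal(signal.SIGABRT)
        return child.wait()


def spawn_hanging_child() -> subprocess.Popen:
    # Hypothetical stand-in for a content process that never finishes shutdown.
    return subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
```

With a short grace period, `reap_or_abort(spawn_hanging_child(), grace_s=0.2)` ends the hanging child with `SIGABRT`, whereas a child that exits on its own within the grace period is simply reaped.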

It might be interesting to look a bit more into this session to see how we came to unblock the parent's worker shutdown blocker and proceed happily with shutdown on the parent side. There might be a `WorkerRef` missing somewhere? Edit: Probably not, we actually want the worker to end.
