Closed Bug 1314816 Opened 4 years ago Closed 4 years ago
Fix shutdown crash when the GPU process is enabled
Subtle bug: top-level actors retain references to a MessageLoop, but there is no mechanism making this explicit. Forgetting to Close() or wait for all outstanding ActorDestroys can leave the I/O thread with a dangling MessageLoop pointer during shutdown. VsyncBridgeChild had this bug. It didn't usually manifest, because the severed-pipe-message arriving before shutdown is a timing thing.
Attachment #8806940 - Flags: review?(matt.woodrow) → review+
Pushed by firstname.lastname@example.org: https://hg.mozilla.org/integration/mozilla-inbound/rev/c3c0827d6c66 Fix shutdown crash in VsyncBridgeChild when the GPU process is enabled. (bug 1314816, r=mattwoodrow)
Sigh. This unfortunately exposed another, probably long-forgotten problem: Close() is inherently racy in IPDL unless channels negotiate ahead of time who is responsible for closing. Otherwise, this can happen: 1. Side A calls Close. 2. Side A sends the special "Goodbye" message, indicating that A is closed. 3. Side B, on the IO thread, receives the "Goodbye" message, and enters the ChannelClosing state. 4. Side B calls Close. If in between 3 and 4, side B has not yet received an indicator that the link is dead, then IPDL will abort because there's "no compelling reason" not to.  I have a compelling reason: The only way to decouple a MessageChannel from the event loop is to close it, and this behavior makes it really hard to Close channels. This almost certainly explains why the compositor mysteriously spins an event loop on shutdown (bug 1295391). Spinning the event loop would ensure all the MessageChannels have received a link error and therefore posted a shutdown message. I think we should fix this behavior, so that if Close() is called in the ChannelClosed state, it simply early returns. In the ChannelClosing state it should change the state to ChannelClosed in a way that doesn't break OnChannelErrorFromLink.  http://searchfox.org/mozilla-central/rev/f5c9e9a249637c9abd88754c8963ecb3838475cb/ipc/glue/MessageChannel.cpp#2271
4 years ago
Try run with a fix, assuming that is indeed the problem: https://treeherder.mozilla.org/#/jobs?repo=try&revision=74bb6f358fc41a89a4e5b893f404b64af978e147
This version fixes a problem that the monitor wasn't unlocked before calling NotifyChannelClosed.
Attachment #8807253 - Flags: review?(wmccloskey) → review+
Pushed by email@example.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/c04f84afb1bd Make it safe to close an IPC channel that has already been closed by its remote endpoint. (bug 1314816 part 2, r=billm)
4 years ago
You need to log in before you can comment on or make changes to this bug.