Closed Bug 1298938 Opened 3 years ago Closed 3 years ago

ImageBridge shutdown is very racy

Categories

(Core :: Graphics: Layers, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla51
Tracking Status
firefox51 --- fixed

People

(Reporter: dvander, Assigned: dvander)

References

(Blocks 1 open bug)

Details

Attachments

(7 files, 1 obsolete file)

ImageBridgeChild is very racy, and has lots of random booleans and state changes to try and paper over problems.

There's no real reason for this, most problems would go away if we just use proper locking and hold onto references during asynchronous operations.
An audit of IsShutdown looks okay because it happens to only be used on the ImageBridgeChild thread. But it would work better if we replaced it with the "CanSend" pattern we have on other actors, and then implemented ActorDestroy as well.

IsCreated and the singleton are more problematic with the GPU process. We have to protect the singleton with a lock, and callers must be changed to acquire a singleton with a RefPtr. Then we can move the static dispatch methods to member functions, and do away with the IsCreated: CanSend will subsume "!IsCreated() && !IsShutdown()" checks.
The goal of these next few patches is to tie ImageBridgeChild messages to |this| instead of being static methods that read sImageBridgeChildSingleton. It's much easier to reason about state when it's not global. After these patches, we should be able to remove all IsCreated() and IsShutdown() checks.

This first patch changes all the layout/compositing related tasks that ImageBridgeChild is responsible for - no functionality should be changed here.
Attachment #8787823 - Flags: review?(matt.woodrow)
This simplifies how ImageBridgeChild binds to IPDL when in the same process as the ImageBridgeParent. No functionality should be changed. (Though a missing call to SendImageBridgeThreadId was added in the GPU process case.)
Attachment #8787824 - Flags: review?(matt.woodrow)
This makes shutdown a member of ImageBridgeChild. As a result, the shutdown code no longer has to check sImageBridgeChildSingleton. It's not guaranteed that the IPDL actor is alive, but we'll tackle that in another patch.
Attachment #8787825 - Flags: review?(nical.bugzilla)
IsCreated is removed - either callers should be acquiring a reference to ImageBridgeChild, or they are guaranteed ImageBridgeChild exists because they are in the event queue and holding a reference.

IsShutDown is removed as well for the same reasons. Using either of these off the main thread was invalid anyway.

Finally, mShuttingDown is removed as well. It is replaced by mCanSend, which starts out false, is set to true when we bind to IPDL, and set to false in ActorDestroy and when we shut down. We assert that mCanSend is only accessed on the IPDL thread.

The tricky part of this patch is figuring out what all the old asserts and checks were really trying to do. They seemed to fall into a few categories:
 * Detecting when IPDL is open. mCanSend replaces this.
 * Detecting when sImageBridgeClientSingleton is null. This is no longer necessary.

In some cases, we were checking some nebulous state that existed during the middle of shutdown (like, before ActorDestroy was called, but after MarkShutDown). I just assumed this doesn't really matter, but if it does, I can re-add mShuttingDown. The only really iffy case is FlushAllImagesSync - maybe it still wants to clear images even if CanSend() is false?

Also, in many cases we assert that IPDL is open. This isn't necessarily true, but I kept those asserts and we'll have to audit them later for the GPU process.
Attachment #8787842 - Flags: review?(nical.bugzilla)
Fix two CanSend() asserts running on the wrong thread.
Attachment #8787842 - Attachment is obsolete: true
Attachment #8787842 - Flags: review?(nical.bugzilla)
Attachment #8787844 - Flags: review?(nical.bugzilla)
Since sImageBridgeChildSingleton is a RefPtr, it is not safe to read it while another thread writes to it. (RefPtr is not thread-safe). This patch wraps access to it inside a lock.
Attachment #8787846 - Flags: review?(matt.woodrow)
Finally, make sure IPDL owns a reference to ImageBridgeChild, in case for some reason ActorDestroy gets called after we release the final reference on the main thread.
Attachment #8787847 - Flags: review?(wmccloskey)
Attachment #8787823 - Flags: review?(matt.woodrow) → review+
Attachment #8787824 - Flags: review?(matt.woodrow) → review+
Attachment #8787846 - Flags: review?(matt.woodrow) → review+
Attachment #8787825 - Flags: review?(nical.bugzilla) → review+
Attachment #8787844 - Flags: review?(nical.bugzilla) → review+
Attachment #8787847 - Flags: review?(wmccloskey) → review+
Patches 1-3 changed shutdown so that the final ImageBridgeChild Release() can happen off the main thread, to avoid races where pending tasks don't know which ImageBridge they were slated for.

However, media tasks can still be running after XPCOM shutdown, and ImageBridgeChild retains some an ActiveResourceTracker, which is bound to XPCOM. So these patches introduced a shutdown crash on try.

The fix is to just listen for xpcom shutdown in ImageBridgeChild, and make sure we release these dependent resources in case the media thread is lagging behind.
Attachment #8789037 - Flags: review?(matt.woodrow)
Attachment #8789037 - Flags: review?(matt.woodrow) → review+
Pushed by danderson@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a757098353c6
Route dispatches in ImageBridgeChild through |this|, not the singleton. (bug 1298938 part 1, r=mattwoodrow)
https://hg.mozilla.org/integration/mozilla-inbound/rev/81e994db6b8c
Simplify ImageBridgeChild asynchronous initialization. (bug 1298938 part 2, r=mattwoodrow)
https://hg.mozilla.org/integration/mozilla-inbound/rev/6dd0b4b22b23
Simplify ImageBridgeChild asynchronous shutdown. (bug 1298938 part 3, r=nical)
https://hg.mozilla.org/integration/mozilla-inbound/rev/e43fc0029b90
Shutdown XPCOM-dependent resources in ImageBridgeChild at the appropriate time. (bug 1298938 part 3.1, r=mattwoodrow)
https://hg.mozilla.org/integration/mozilla-inbound/rev/3f6883401be4
Remove racy ImageBridgeChild shutdown/creation checks. (bug 1298938 part 4, r=nical)
https://hg.mozilla.org/integration/mozilla-inbound/rev/df832e32e7aa
Protect ImageBridgeChild's singleton with a StaticMutex. (bug 1298938 part 5, r=mattwoodrow)
https://hg.mozilla.org/integration/mozilla-inbound/rev/d710b5ac1e13
Ensure IPDL owns a reference to ImageBridgeChild. (bug 1298938 part 6, r=billm)
Pushed by danderson@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/975b2c736abd
Route dispatches in ImageBridgeChild through |this|, not the singleton. (bug 1298938 part 1, r=mattwoodrow)
https://hg.mozilla.org/integration/mozilla-inbound/rev/aa56a00f0329
Simplify ImageBridgeChild asynchronous initialization. (bug 1298938 part 2, r=mattwoodrow)
https://hg.mozilla.org/integration/mozilla-inbound/rev/b5b50b5ec62f
Simplify ImageBridgeChild asynchronous shutdown. (bug 1298938 part 3, r=nical)
https://hg.mozilla.org/integration/mozilla-inbound/rev/92de67e54720
Shutdown XPCOM-dependent resources in ImageBridgeChild at the appropriate time. (bug 1298938 part 3.1, r=mattwoodrow)
https://hg.mozilla.org/integration/mozilla-inbound/rev/84560a23836b
Remove racy ImageBridgeChild shutdown/creation checks. (bug 1298938 part 4, r=nical)
https://hg.mozilla.org/integration/mozilla-inbound/rev/2d2b7d0a949a
Protect ImageBridgeChild's singleton with a StaticMutex. (bug 1298938 part 5, r=mattwoodrow)
https://hg.mozilla.org/integration/mozilla-inbound/rev/77428fd9afc3
Ensure IPDL owns a reference to ImageBridgeChild. (bug 1298938 part 6, r=billm)
Leak was of course the result of a last-minute change after a successful try run. Should be better now.
Flags: needinfo?(dvander)
Blocks: 1328840
You need to log in before you can comment on or make changes to this bug.