Closed Bug 1175765 Opened 9 years ago Closed 9 years ago

[EME] Async shutdown hang in mozilla::gmp::GeckoMediaPluginService::Observe(...) - Improve timer in GMPParent

Categories

(Core :: Audio/Video, defect)

x86
Windows
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla42
Tracking Status
firefox39 --- wontfix
firefox40 --- fixed
firefox41 --- fixed
firefox42 --- fixed

People

(Reporter: mozbugz, Assigned: mozbugz)

References

(Blocks 1 open bug)

Details

(Keywords: crash)

Crash Data

Attachments

(1 file, 1 obsolete file)

Follow-up to bug 1173634, where the patch adds the list of pending GMPs to crash reports.

In this bug, a timer will be added in GMPService itself to force the hung plugin(s) to crash -- or do something else if more appropriate.
Summary: [EME] Async shutdown hang in mozilla::gmp::GeckoMediaPluginService::Observe(...) → [EME] Async shutdown hang in mozilla::gmp::GeckoMediaPluginService::Observe(...) - Add timer in GMPService
After further study, there used to be a timer in GMPService but it was moved to GMPParent to catch timeouts in non-xpcom-shutdown cases, see bug 1085168. So re-adding a timer in GMPService may help with this particular shutdown hang, but would not actually cover all possible hangs.

The current GMPParent timer is only started when async-shutdown proper starts, which is after CloseActive completes (when receiving an expected number of RecvPGMPContentChildDestroyed messages); So if this first asynchronous process stalls, there is no timeout to bail from it.

The timer should therefore be started earlier, to cover the CloseActive-RecvPGMPContentChildDestroyed part of the shutdown process.
Summary: [EME] Async shutdown hang in mozilla::gmp::GeckoMediaPluginService::Observe(...) - Add timer in GMPService → [EME] Async shutdown hang in mozilla::gmp::GeckoMediaPluginService::Observe(...) - Improve timer in GMPParent
The main goal of this patch is to start the async-shutdown watchdog timer from CloseActive(), in case we don't receive the expected RecvPGMPContentChildDestroyed's in time.

In addition:
- Some (!a||b) tests have been split to reduce (my) confusion, but more importantly to get more precise crash report annotations.
- A (rare) case is handled, where CloseActive() is called but all content children are already destroyed, in which case CloseIfUnused() is called again to give it a chance to run async-or-not shutdown processes if not already done.
Attachment #8629166 - Flags: review?(cpearce)
Comment on attachment 8629166 [details] [diff] [review]
1175765-gmpparent-closeactive-timer.patch

Review of attachment 8629166 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/gmp/GMPParent.cpp
@@ +364,5 @@
> +#endif
> +      AbortAsyncShutdown();
> +    } else if (IsUsed()) {
> +      // We're expecting RecvPGMPContentChildDestroyed's -> Start async-shutdown timer now.
> +      if (NS_FAILED(EnsureAsyncShutdownTimeoutSet())) {

We don't actually need the timer if the plugin doesn't require async shutdown, right? i.e. mAsyncShutdownRequired==false? 

So please make this:

  if (mAsyncShutdownRequired && NS_FAILED(EnsureAsyncShutdownTimeoutSet()))

Please also add an MOZ_ASSERT(mAsyncShutdownRequired) in EnsureAsyncShutdownTimeoutSet().
Attachment #8629166 - Flags: review?(cpearce) → review+
Updated as per comment 3; carrying r+.
Attachment #8629166 - Attachment is obsolete: true
Attachment #8629542 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/68602685db0f
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla42
Comment on attachment 8629542 [details] [diff] [review]
1175765-gmpparent-closeactive-timer.patch

Approval Request Comment

[Feature/regressing bug #]: EME

[User impact if declined]: We're seeing crash reports which are shutdown hangs on Beta40 waiting for either EME plugins or OpenH264 to shutdown.
This patch starts the timer as soon as the shutdown is requested (instead of later in the process, which may miss a cause of hangs); It also adds further and more precise annotations, so that we can debug and fix these shutdown hangs.
Denying uplift on this patch will make it take longer for us to fix the shutdown hangs.
Also uplift candidates bug 1183433 and bug 1185392 work on top of this patch.

[Describe test coverage new/current, TreeHerder]: We have lots of EME mochitests.

[Risks and why]: Pretty low; this just extends an existing timer to include more of the shutdown process, and adds crash reporter annotations.

[String/UUID change made/needed]: None.
Attachment #8629542 - Flags: approval-mozilla-beta?
Attachment #8629542 - Flags: approval-mozilla-aurora?
Comment on attachment 8629542 [details] [diff] [review]
1175765-gmpparent-closeactive-timer.patch

I asked rillian and jesup to review the series of patches for EME as well because we're late in the 40 cycle. Their suggestion is to take the changes as they put us in a better state that we were in previously and because waiting an additional ~2.5 weeks for 41 to hit Beta will be detrimental. I'm approving for Beta. We need to watch the beta8 build to ensure to the best of our ability that we have not regressed anything related to WebRTC or EME. Beta+ Aurora+
Attachment #8629542 - Flags: approval-mozilla-beta?
Attachment #8629542 - Flags: approval-mozilla-beta+
Attachment #8629542 - Flags: approval-mozilla-aurora?
Attachment #8629542 - Flags: approval-mozilla-aurora+
Flags: needinfo?(cpearce)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: