Closed Bug 1082203 Opened 10 years ago Closed 10 years ago

[EME] Notify JS when CDM crashes

Categories

(Core :: Audio/Video, defect)

x86_64
Windows Vista
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla37
Tracking Status
firefox37 --- fixed
firefox38 --- fixed
firefox39 --- fixed

People

(Reporter: cpearce, Assigned: jwwang)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 4 obsolete files)

We need to provide a way to notify JS when the CDM crashes. Like an error event, and we should probably resolve the MediaKeySession.closed promises.
Filed https://www.w3.org/Bugs/Public/show_bug.cgi?id=27067 to get behaviour defined in the spec.
This bug is necessary for basic playback of EME video on Windows.
Blocks: eme-m1
JW can you take this? The spec hasn't stabilized here, but I think it's close enough to that we can do something here and it'll be close.

The behaviour we want is to dispath an "error" event to the HTMLMediaElement if the GMP crashes. If the GMP crashes, we'll get a CDMProxy::gmp_Terminated() callback. So we need to plum through from the CDMProxy::gmp_Terminated() callback to the media element to fire an error event.

We should also resolve the "closed" promise on all MediaKeySessions for that CDM.

We'll need a mochitest for this; we must start up the EME session, and then we can cause the GMP to crash by toggling the pref media.gmp.plugin.crash.

The W3 bug in comment 1 mentions a dispatching a "close" event to the MediaKeySession. We can make that change once it has been added to the spec; i.e. at a later date. In the meantime, we can just dispatch "error" to the HTMLMediaElement.
Assignee: nobody → jwwang
I think we can notify HTMLMediaElement about the crash as a decode error.

CDMProxy::gmp_Shutdown() calls |job->mClient->Decrypted(NS_ERROR_ABORT, nullptr)| to notify decode errors which will go all the way to MP4Reader::Error(). But I can't find where MP4Reader calls |RequestSampleCallback::OnNotDecoded(xxx, DECODE_ERROR)| back to its listener.
Sounds fine. We'll need to hook up the OnNotDecoded() callback in MP4Reader. That's a good thing.

We need to ensure that the MediaKeys gets closed. I think it makes sense to do this from the CDMProxy::gmp_Terminated() callback, as the MediaKeys may not yet be associated with an HTMLMediaElement when it crashes; we may not yet have a decoder/MP4Reader and an HTMLMediaElement through which to report the crash.
My proposal is have MediaDecoderReader::DecodeAudioData return an enum {MORE, ERROR, FINISH} instead of a bool to tell error from EOS. Then MediaDecoderReader::RequestAudioData can check the return value of DecodeAudioData to decide whether to call OnNotDecoded(xxx, END_OF_STREAM) or OnNotDecoded(xxx, DECODE_ERROR) to notify decode error.

About the MediaKeySession.closed promise, I will have a deeper look at the code to understand how it works out.
(In reply to JW Wang [:jwwang] from comment #6)
> My proposal is have MediaDecoderReader::DecodeAudioData return an enum
> {MORE, ERROR, FINISH} instead of a bool to tell error from EOS. Then
> MediaDecoderReader::RequestAudioData can check the return value of
> DecodeAudioData to decide whether to call OnNotDecoded(xxx, END_OF_STREAM)
> or OnNotDecoded(xxx, DECODE_ERROR) to notify decode error.

Bug 1032633 landed a few moments ago which makes MP4Reader properly async... Provided this doesn't get backed out again, MP4Reader will no longer implement DecodeAudioData(), and more importantly, it now calls the OnNotDecoded(DECODE_ERROR) callback...
No longer blocks: eme-m1
Route CDMCallbackProxy::Terminated to MediaKeys::Terminated so MediaKeys has a chance to handle CDM crashes and close sessions if necessary to resolve MediaKeySession.closed promises.
Attachment #8534838 - Flags: review?(cpearce)
Per comment 4, a decode error will be raised if CDM crashes which will be solved by bug 1032633.
Comment on attachment 8534838 [details] [diff] [review]
1082203_handle_CDM_crashes-v1.patch

Review of attachment 8534838 [details] [diff] [review]:
-----------------------------------------------------------------

Looking pretty good, but we need a test here, and we need to hook up "error" events I think...

Please make a new mochitest which starts up an EME session, and then toggles the pref media.gmp.plugin.crash. This causes all running GMPs to crash. Then check that all open MediaKeySessions had their closed promise resolved, and that we dispatched an "error" event to the HTMLMediaElement.

I don't see how the "error" event gets propagated to the HTMLMediaElement with this patch... It looks to me like we need to change EMEH264Decoder::Terminated() to call mCallback->Error(), and the same for EMEAudioDecoder::Terminated() and EMEDecryptor::Terminated(). Then the error should propagate out through MP4Reader to the HTMLMediaElement. Make sense?
Attachment #8534838 - Flags: review?(cpearce) → review-
I see now. Comment 4 only addresses the cdm-decrypt case. We still need to address the cdm-decrypt-decode case.
Yeah, and it also only handles the case where we're waiting on a decrypt job; if we happen to not be waiting on a decrypt job at the exact instance the CDM crashes, we won't get an "error" event dispatched to the media element, because there won't be anything in the CDMProxy::mDecryptionJobs array to notify.
I think it is OK for that case. Chances are that we reach the end of playback and don't need to decrypt/deocde anymore. If CDM crashes after playback ended, we are still fine and don't have to raise an error. It is until we play again that decryption will fail and an error will be fired.

Should we proactively report an error on CDM crash or delay it until we ain't able to decrypt/decode due to CDM crash?
I think we should proactively report an error, as if we don't we'll get random test failures.
EMEH264Decoder::Terminated calls GmpShutdown which resets mGMP. Next time when GmpInput is called, it will see mGMP is null and call mCallback->Error() to report the error. I guess EMEH264Decoder::Terminated doesn't need to be changed. I will have a deeper look to ensure all code paths are covered for the crash case.
(In reply to Chris Pearce (:cpearce) from comment #14)
> I think we should proactively report an error, as if we don't we'll get
> random test failures.

Ok, lets call mCallback->Error() in EMEH264Decoder::Terminated.
Call call mCallback->Error() in EMEH264Decoder::Terminated(), EMEAudioDecoder::Terminated() and EMEDecryptor::Terminated(), as discussed in comment 10.
Attachment #8534838 - Attachment is obsolete: true
Attachment #8536361 - Flags: feedback?(cpearce)
Comment on attachment 8536361 [details] [diff] [review]
1082203_handle_CDM_crashes-v2-WIP.patch

Review of attachment 8536361 [details] [diff] [review]:
-----------------------------------------------------------------

This approach adds complexity to CDMProxy which maintains a list of listeners which will be called upon termination. I will prefer to take another approach which notifys the media element about CDM crash in MediaKeys::Terminated. This seems to be simpler.

::: dom/media/eme/CDMProxy.cpp
@@ +576,5 @@
> +  if (!mKeys.IsNull()) {
> +    mKeys->Terminated();
> +  }
> +
> +  // TODO: Notify listeners registered with AddListener()

This approach is weird for CDMProxy now has 2 ways to notify CDM termination, one by mKeys->Terminated, and the other by the listeners registered with AddListener().

::: dom/media/fmp4/eme/EMEDecoderModule.cpp
@@ +42,5 @@
>  #ifdef DEBUG
>      , mIsShutdown(false)
>  #endif
>    {
> +    mProxy->AddListener(this);

I find there is no way to notify EMEDecryptor when CDMProxy::Terminated is called. So I add new APIs to register listeners with CDMProxy via CDMProxy::AddListener.

@@ +54,5 @@
>      return rv;
>    }
>  
> +  virtual void Terminated() MOZ_OVERRIDE {
> +    mCallback->Error();

To propagate CDM crash.
Notify media element about the decode error in MediaKeys::Terminated when CDM terminated.
Attachment #8536395 - Flags: feedback?(cpearce)
(In reply to Chris Pearce (:cpearce) from comment #10)
> Please make a new mochitest which starts up an EME session, and then toggles
> the pref media.gmp.plugin.crash. This causes all running GMPs to crash. Then
> check that all open MediaKeySessions had their closed promise resolved, and
> that we dispatched an "error" event to the HTMLMediaElement.

The test case will fail for crash dumps left behind. Is there a way to clear the crash dumps to keep Try green when the crash is produced purposefully?
Comment on attachment 8536361 [details] [diff] [review]
1082203_handle_CDM_crashes-v2-WIP.patch

Review of attachment 8536361 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/eme/CDMProxy.h
@@ +100,5 @@
>    // processed the request.
>    void RemoveSession(const nsAString& aSessionId,
>                       PromiseId aPromiseId);
>  
> +  void AddListener(GMPCallbackBase* aListener) {}

So, we want to call the MediaKeys on the main thread, and the EMEDecryptor to be called back on its task queue.

You can make aListener an nsIRunnable instead, and pass an NS_NewRunnableMethod bound to the EMEDecryptor::Terminated() function, and then also pass the MediaTaskQueue to dispatch to the runnable to.

i.e.

void AddTerminatedListener(nsIRunnable*, MediaTaskQueue*);

Then once you've dispatched the runnable, you'll drop the reference to it, so you don't have to worry about RemoveListener.

You'll have to be careful, as this would create a reference cycle.

You could also use this same API to report termination to MediaKeys if you wanted to.
Attachment #8536361 - Flags: feedback?(cpearce) → feedback+
(In reply to JW Wang [:jwwang] from comment #20)
> (In reply to Chris Pearce (:cpearce) from comment #10)
> > Please make a new mochitest which starts up an EME session, and then toggles
> > the pref media.gmp.plugin.crash. This causes all running GMPs to crash. Then
> > check that all open MediaKeySessions had their closed promise resolved, and
> > that we dispatched an "error" event to the HTMLMediaElement.
> 
> The test case will fail for crash dumps left behind. Is there a way to clear
> the crash dumps to keep Try green when the crash is produced purposefully?

Jesup: Do you know the answer to this? Do you guys have a mochit test or somesuch to ensure that GMP crashes are reported to your code?
Flags: needinfo?(rjesup)
Comment on attachment 8536395 [details] [diff] [review]
1082203_handle_CDM_crashes-v3-WIP.patch

Review of attachment 8536395 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/html/HTMLMediaElement.cpp
@@ +2967,5 @@
>      } else {
>        NS_WARNING("Should know the source we were loading from!");
>      }
>    } else {
> +    // Now we have multiple paths calling into DecodeError, e.g.

We should probably have a flag to denote whether we've already fired an "error" event, so that we don't fire "error" twice.

::: dom/media/eme/MediaKeys.cpp
@@ +82,5 @@
> +  mKeySessions.Enumerate(&CloseSessions, nullptr);
> +
> +  // Notify the element about that CDM has terminated.
> +  if (mElement) {
> +    mElement->DecodeError();

Actually this is sufficient to send an "error" to the media element... We shouldn't need the terminated listener in CDMProxy or the EMEDecryptors.
Attachment #8536395 - Flags: feedback?(cpearce) → feedback+
Comment on attachment 8536361 [details] [diff] [review]
1082203_handle_CDM_crashes-v2-WIP.patch

Review of attachment 8536361 [details] [diff] [review]:
-----------------------------------------------------------------

We can probably just use your v3 patch, and that should shutdown the decoder too. Much simpler than this v2 version.
Attachment #8536361 - Flags: feedback+
Comment on attachment 8536395 [details] [diff] [review]
1082203_handle_CDM_crashes-v3-WIP.patch

Review of attachment 8536395 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/html/HTMLMediaElement.cpp
@@ +2967,5 @@
>      } else {
>        NS_WARNING("Should know the source we were loading from!");
>      }
>    } else {
> +    // Now we have multiple paths calling into DecodeError, e.g.

It can be done by checking |mError != nullptr| in HTMLMediaElement::Error(uint16_t). We would return immeidately if mError is not null.
Sounds good.
Notify the media element about the decode error when CDM crashes via MediaKeys::Terminated based on patch-v3.
Attachment #8536361 - Attachment is obsolete: true
Attachment #8536395 - Attachment is obsolete: true
Attachment #8537002 - Flags: review?(cpearce)
(In reply to Chris Pearce (:cpearce) from comment #22)
> (In reply to JW Wang [:jwwang] from comment #20)
> > (In reply to Chris Pearce (:cpearce) from comment #10)
> > > Please make a new mochitest which starts up an EME session, and then toggles
> > > the pref media.gmp.plugin.crash. This causes all running GMPs to crash. Then
> > > check that all open MediaKeySessions had their closed promise resolved, and
> > > that we dispatched an "error" event to the HTMLMediaElement.
> > 
> > The test case will fail for crash dumps left behind. Is there a way to clear
> > the crash dumps to keep Try green when the crash is produced purposefully?
> 
> Jesup: Do you know the answer to this? Do you guys have a mochit test or
> somesuch to ensure that GMP crashes are reported to your code?

We test the crash reporting by hand, IIRC.
Flags: needinfo?(rjesup)
Comment on attachment 8537002 [details] [diff] [review]
1082203_handle_CDM_crashes-v4.patch

Review of attachment 8537002 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. r+ with the two comments addressed.

::: dom/media/eme/CDMProxy.cpp
@@ +575,4 @@
>    NS_WARNING("CDM terminated");
> +  if (!mKeys.IsNull()) {
> +    mKeys->Terminated();
> +  }

I think you should still call gmp_Shutdown (on the GMP thread) when the CDM is terminated, so that any cycles in CMDProxy::mDecryptionJobs are broken. So you should dispatch a task in CDMProxy::Terminated() to the GMP thread to call gmp_Shutdown().

::: dom/media/eme/MediaKeys.cpp
@@ +80,5 @@
> +void
> +MediaKeys::Terminated()
> +{
> +  // Close all sessions since CDM has terminated.
> +  mKeySessions.Enumerate(&CloseSessions, nullptr);

Removing the session inside the iterator adds a new case/exception to MediaKeys::OnClosed()...

It would be good if we can reduce the number of case/exceptions to rules/contracts that we add to our code, so that it's easier to reason about and maintain.

Could you instead here enumerate over the hashtable and add the MediaKeySession entries to an nsTArray<nsRefPtr<MediaKeySession>> (which you pass to the Enumerate call), and then do a second iteration over the array to call OnClosed()? Then we don't need to add a new case to MediaKeySession::OnClosed. Thanks.
Attachment #8537002 - Flags: review?(cpearce) → review+
Comment on attachment 8537002 [details] [diff] [review]
1082203_handle_CDM_crashes-v4.patch

Review of attachment 8537002 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/eme/CDMProxy.cpp
@@ +575,4 @@
>    NS_WARNING("CDM terminated");
> +  if (!mKeys.IsNull()) {
> +    mKeys->Terminated();
> +  }

The code flow is like this: CDMProxy::Terminated -> MediaKeys::Terminated -> MediaKeys::Shutdown -> CDMProxy::Shutdown -> CDMProxy::gmp_Shutdown.

The idea is to have a single entry (MediaKeys::Shutdown) for all shutdown sequences (initiated by CDM crash or media element) to make code simplier.
Address comment 29.
Attachment #8537002 - Attachment is obsolete: true
Attachment #8537607 - Flags: review+
(In reply to Randell Jesup [:jesup] from comment #28)
> We test the crash reporting by hand, IIRC.

I guess we can't write test case for that.

I see something like:
v.addEventListener("error", bail(token + " got error event"));
in our test_eme_*.html

So with this patch landed, test cases shall not pass silently with CDM crashes. That should also fix the bug 1081251.
https://hg.mozilla.org/mozilla-central/rev/620ba2647840
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla37
Mass update firefox-status to track EME uplift.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: