[EME] Race getting CDM caps in MediaSourceReader

RESOLVED FIXED in Firefox 38

Status

()

RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: cpearce, Assigned: cpearce)

Tracking

(Blocks: 2 bugs)

unspecified
mozilla39
x86_64
Windows 8.1
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox38 fixed, firefox39 fixed)

Details

Attachments

(2 attachments, 2 obsolete attachments)

Adobe reported that playback is failing in:

http://drmtest2.adobe.com/HTML5AAXSPlayer/2/mse-access/

Turns out we have a introduced a data race in 3/24 Nightly build.

We race between the CDM reporting its capabilities and the video stack trying to start decoders. The MSReader doesn't wait for the CDMCaps to be available before leaving WAITING_FOR_CDM state, so if the video stack wins the race, we end up assuming that the CDM can't decode and we won't try to use the CDM for decoding, and so the video playback pipeline will stall.

This may be the reason why all the EME tests turned orange when I landed bug 1143278.

Also, once we make the MSReader wait for the caps to arrive, we need to kick the reader to wake it up once the caps arrive so that it stops waiting.
(Assignee)

Comment 1

4 years ago
Prevent the race by waiting for the CDM and its caps to be known in MediaSourceReader::IsWaitingOnCDMResource(), and kicking the MSDecoder/Reader when the caps actually are known.

No doubt this will help with orange.
Attachment #8583577 - Flags: review?(edwin)
(Assignee)

Comment 2

4 years ago
I discovered this while debugging this bug. MediaKeySession::SetSessionId() attempts to validate its input, but in fact it's validating its mSessionId field before setting it.
Attachment #8583578 - Flags: review?(edwin)
(Assignee)

Comment 3

4 years ago
This caused a crash in an --enable-optimize --enable-debug build when the logging tried to use this table in an error condition. Thankfully we don't ship with those build options.
Attachment #8583579 - Flags: review?(edwin)
Comment on attachment 8583577 [details] [diff] [review]
Patch 1: Prevent race

Review of attachment 8583577 [details] [diff] [review]:
-----------------------------------------------------------------

d'oh.
Attachment #8583577 - Flags: review?(edwin) → review+
Comment on attachment 8583578 [details] [diff] [review]
Patch 2: Fix MediaKeySession::SetSessionId input validation

Review of attachment 8583578 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/eme/MediaKeySession.cpp
@@ +70,2 @@
>  
> +  if (NS_WARN_IF(!aSessionId.IsEmpty())) {

Fairly certain this is meant to be |mSession|. The check is here so that we don't set a new session ID over an existing one.

Totally valid to be logging |aSession| though.

|!aSessionId.IsEmpty()| will always be hit...
Attachment #8583578 - Flags: review+ → review-
Would that explain why the decoder apparently receive a corrupted sample (Which fail to decode)?
(Assignee)

Comment 7

4 years ago
(In reply to Jean-Yves Avenard [:jya] from comment #6)
> Would that explain why the decoder apparently receive a corrupted sample
> (Which fail to decode)?

I don't think so. This should only affect whether we create a decoder in Gecko or in the CDM.
(Assignee)

Comment 8

4 years ago
(In reply to Edwin Flores [:eflores] [:edwin] from comment #5)
> Comment on attachment 8583578 [details] [diff] [review]
> Patch 2: Fix MediaKeySession::SetSessionId input validation
> 
> Review of attachment 8583578 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: dom/media/eme/MediaKeySession.cpp
> @@ +70,2 @@
> >  
> > +  if (NS_WARN_IF(!aSessionId.IsEmpty())) {
> 
> Fairly certain this is meant to be |mSession|. The check is here so that we
> don't set a new session ID over an existing one.
> 
> Totally valid to be logging |aSession| though.
> 
> |!aSessionId.IsEmpty()| will always be hit...


Ah yes. I was crashing without this, but I realise that's because the mKeys->GetPendingSession() call in CDMProxy::OnSetSessionId() was returning null. I can just nullcheck the session there.

I was crashing if I mashed the "play" button on the Adobe testcase at URL above. I assume it's because we had shutdown the session when the callback ran.
(Assignee)

Comment 9

4 years ago
Fix the incorrect return in MediaKeySession::SetSessionId() and null check the caller so we don't crash.
Attachment #8583578 - Attachment is obsolete: true
Attachment #8583624 - Flags: review?(edwin)
(Assignee)

Comment 10

4 years ago
Comment on attachment 8583579 [details] [diff] [review]
Patch 3: Add missing state to gMachineStateStr

bholley bet me to it and already landed the equivalent patch in Bug 1147495.
Attachment #8583579 - Attachment is obsolete: true
Attachment #8583624 - Flags: review?(edwin) → review+
https://hg.mozilla.org/mozilla-central/rev/71ba4f1e5d8d
https://hg.mozilla.org/mozilla-central/rev/bd65fd7d5cc0
Status: NEW → RESOLVED
Last Resolved: 4 years ago
status-firefox39: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
(Assignee)

Comment 14

4 years ago
Comment on attachment 8583577 [details] [diff] [review]
Patch 1: Prevent race

Approval Request Comment
[Feature/regressing bug #]: EME
[User impact if declined]: Potential for data race when starting up EME playback that will cause EME playback to fail
[Describe test coverage new/current, TreeHerder]: Local testing, mochitests.
[Risks and why]: Seems low, though there's potential that I got it wrong and this will make playback worse. ;)
[String/UUID change made/needed]: None
Flags: needinfo?(cpearce)
Attachment #8583577 - Flags: approval-mozilla-aurora?
(Assignee)

Comment 15

4 years ago
Comment on attachment 8583624 [details] [diff] [review]
Patch 2:v2 Fix MediaKeySession::SetSessionId input validation

Approval Request Comment
[Feature/regressing bug #]: EME
[User impact if declined]: Potential for crash when creating/shutting down lots of EME decryption sessions.
[Describe test coverage new/current, TreeHerder]: Local testing, mochitests.
[Risks and why]: Low risk, this basically just adds a null check.
[String/UUID change made/needed]: None
Attachment #8583624 - Flags: approval-mozilla-aurora?
(Assignee)

Updated

4 years ago
Flags: needinfo?(cpearce)
Attachment #8583577 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8583624 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
This depends on bug 1134434 IIUC. Otherwise, it needs rebasing.
(Assignee)

Comment 17

4 years ago
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #16)
> This depends on bug 1134434 IIUC. Otherwise, it needs rebasing.

Indeed. Testing here it turns out that 71ba4f1e5d8d fixes a regression introduced by bug 1134434. So since bug 1134434 doesn't need to be uplifted (I think, so far...) we don't need to uplift 71ba4f1e5d8d. bd65fd7d5cc0 should be uplifted though, it fixes a crash that can still happen in Fx38.
(Assignee)

Updated

4 years ago
Flags: needinfo?(cpearce)
You need to log in before you can comment on or make changes to this bug.