Make EME (encrypted & encoded) media sample bypassing MediaDecoderStateMachine for HW CDM.

RESOLVED DUPLICATE of bug 1194652

Status

RESOLVED DUPLICATE of bug 1194652

Product: Core
Component: Audio/Video
Opened: 3 years ago
Last modified: 2 years ago
People

(Reporter: kikuo, Assigned: kikuo)

Tracking

(Blocks: 1 bug)

Version: unspecified
Target Milestone: FxOS-S5 (21Aug)
Hardware: ARM
OS: Gonk (Firefox OS)
Points: ---

Firefox Tracking Flags

(feature-b2g:2.5+)

Details

Attachments

(3 attachments, 2 obsolete attachments)

(Assignee)

Description

3 years ago
1. Add a trigger point to activate the HW CDM rendering path.
2. Create an EMEBypassDecoder that passes encrypted & encoded media samples back to MediaDecoderStateMachine without decrypting/decoding them.
3. Wrap encrypted & encoded media samples in MediaData.
(Assignee)

Updated

3 years ago
Blocks: 1146791
(Assignee)

Comment 1

3 years ago
Created attachment 8592685 [details] [diff] [review]
[WIP] Add EMEBypassDecoderModule and reformat MediaRawData to AudioData/VideoData then pass to MDSM _ based_on_rv238952

This passes mochitest/test_eme_playback.html.
(Assignee)

Updated

3 years ago
Attachment #8592685 - Attachment is patch: true
(Assignee)

Updated

3 years ago
Attachment #8592685 - Attachment description: [WIP] Add EMEBypassDecoderModule and reformat MediaRawData to AudioData/VideoData then pass to MDSM → [WIP] Add EMEBypassDecoderModule and reformat MediaRawData to AudioData/VideoData then pass to MDSM _ based_on_rv238952
(Assignee)

Comment 2

3 years ago
Created attachment 8594447 [details] [diff] [review]
[WIP] Add EMEBypassDecoderModule and reformat MediaRawData to AudioData/VideoData then pass to MDSM _ based_on_rv239819

Calling the original AudioData / VideoData constructors to create MediaData that contains a MediaRawData pointer to the encrypted / encoded samples.

NOTE: AudioData will be transferred when running with MSE, so there is also a pointer transfer in AudioData::TransferAndUpdateTimestampAndDuration(...).
Attachment #8592685 - Attachment is obsolete: true
(Assignee)

Comment 3

3 years ago
Created attachment 8601328 [details]
The position of EMEBypassDecoder in simple media architecture.
(Assignee)

Comment 4

3 years ago
Created attachment 8601344 [details] [diff] [review]
[WIP] Add EMEBypassDecoderModule and reformat(encapsulate) MediaRawData into AudioData/VideoData then pass to MDSM_242264

Hello JW,

This bug should be the first step of the EME hardware rendering procedure.
Take a look at attachment 8601328 [details] (the area with the lilac background) to get a rough picture.

I'm trying to make the encrypted/encoded samples (orange line) bypass the decrypt/decode phases of EMEDecryptor/EMEDecoder and wrap them into AudioData/VideoData, so that the data-flow logic of MediaDecoderStateMachine is not affected.

So I:
1. Add a new capability in dom/media/gmp/gmp-api/gmp-decryption.h.
2. Pass the capability to PlatformDecoderModule to create the corresponding EMEDecoderModule.
3. Implement EMEBypassDecoderModule.
4. Handle the MediaRawData swap between the old and new AudioData when calling AudioData::TransferAndUpdateTimestampAndDuration().

I would appreciate any feedback on this patch; I wonder whether I'm missing something or whether there's a better way to do this.
Thank you.
Attachment #8594447 - Attachment is obsolete: true
Attachment #8601344 - Flags: feedback?(jwwang)
(Assignee)

Updated

3 years ago
Assignee: nobody → kikuo
Comment 5

3 years ago

Comment on attachment 8601344 [details] [diff] [review]
[WIP] Add EMEBypassDecoderModule and reformat(encapsulate) MediaRawData into AudioData/VideoData then pass to MDSM_242264

Review of attachment 8601344 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/MediaData.h
@@ +84,5 @@
> +  class RawDataImp : public RawDataInterface {
> +    nsRefPtr<RawDataType> mData;
> +  public:
> +    RawDataImp(RawDataType* aT) : mData(aT) {}
> +    virtual already_AddRefed<RawDataType> GetData() { return mData.forget(); }

Do you want to transfer the ownership and make it a non-idempotent method?

@@ +107,5 @@
> +      rawData = static_cast<RawDataImp<RawDataType>*>(mRawData.get())->GetData();
> +    }
> +    return rawData.forget();
> +  }
> +  UniquePtr<RawDataInterface> mRawData;

I would prefer to just store a void* and require the caller to do the type cast on their own, since they have to know what the actual type is anyway. It also makes it possible to store data that are not ref-counted.

::: dom/media/eme/EMEUtils.h
@@ +34,5 @@
>  
> +  #ifndef EME_RENDER_LOG
> +    PRLogModuleInfo* GetEMERenderLog();
> +    #define EME_RENDER_LOG(args, ...) PR_LOG(GetEMERenderLog(), PR_LOG_DEBUG,\
> +      ("\033[1;34m[EMERender]\033[m]" args, ##__VA_ARGS__))

color code?

::: dom/media/fmp4/PlatformDecoderModule.cpp
@@ +103,5 @@
>    }
>  
>    nsRefPtr<PlatformDecoderModule> pdm;
> +  if ((!cdmRendersAudio && !cdmDecodesAudio && aHasAudio) ||
> +      (!cdmRendersVideo && !cdmDecodesVideo && aHasVideo)) {

There is no way to be able to render without decoding. You can just check |!cdmDecodesAudio && aHasAudio| as before.

::: dom/media/fmp4/eme/EMEDecoderModule.cpp
@@ +182,5 @@
> +  {}
> +
> +  already_AddRefed<MediaData> Create(MediaRawData* aSample)
> +  {
> +    CheckedInt64 frames = UsecsToFrames(aSample->mDuration+1, mRate);

Why |aSample->mDuration+1| instead of |aSample->mDuration|?

@@ +195,5 @@
> +                                           nullptr,
> +                                           mChannels,
> +                                           mRate);
> +    nsRefPtr<MediaRawData> mrd = aSample->Clone();
> +    ad->SetRawData((MediaRawData*)mrd);

I fail to see the point of carrying the raw data around. Can you just return an empty AudioData, since we're not gonna play it anyway?

@@ +245,5 @@
> +  RefPtr<layers::ImageContainer> mImageContainer;
> +};
> +
> +template<class EMEBypassMediaDataCreator>
> +class EMEBypassDecoder : public MediaDataDecoder

Can you explain what EMEBypassDecoder is for and how it works?

::: dom/media/gmp/gmp-api/gmp-decryption.h
@@ +127,5 @@
>  
> +// Capability; CDM can decrypt and then decode encrypted buffers and then
> +// render these frames out, samples are not gonna be back to gecko.
> +#define GMP_EME_CAP_DECRYPT_AND_DECODE_AND_RENDER_AUDIO (uint64_t(1) << 4)
> +#define GMP_EME_CAP_DECRYPT_AND_DECODE_AND_RENDER_VIDEO (uint64_t(1) << 5)

This is really verbose. We should be able to simplify it based on the fact where canRender implies canDecode which implies canDecrypt.

::: media/gmp-clearkey/0.1/ClearKeySessionManager.cpp
@@ +82,5 @@
>  ClearKeySessionManager::Init(GMPDecryptorCallback* aCallback)
>  {
>    CK_LOGD("ClearKeySessionManager::Init");
>    mCallback = aCallback;
> +  if (CanRender()) {

Each capability level is a subset of the next level. For example, canDecrypt is a subset of canDecode which is a subset of canRender. So you should put |if (canRender())| block inside |if (canDecode())| block.
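The suggested nesting can be sketched as follows; the constants and the `BuildCaps` helper are hypothetical stand-ins for the gmp-decryption.h defines, not the actual API:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical capability bits (stand-ins for the gmp-decryption.h defines).
constexpr uint64_t CAP_DECRYPT = uint64_t(1) << 0;
constexpr uint64_t CAP_DECODE  = uint64_t(1) << 2;
constexpr uint64_t CAP_RENDER  = uint64_t(1) << 4;

// Build the capability mask with the stronger checks nested inside the
// weaker ones: canRender implies canDecode, which implies canDecrypt.
uint64_t BuildCaps(bool canDecrypt, bool canDecode, bool canRender) {
  uint64_t caps = 0;
  if (canDecrypt) {
    caps |= CAP_DECRYPT;
    if (canDecode) {        // decoding implies decrypting
      caps |= CAP_DECODE;
      if (canRender) {      // rendering implies decoding
        caps |= CAP_RENDER;
      }
    }
  }
  return caps;
}
```

With this shape, an inconsistent claim such as "render but not decode" simply leaves the render bit unset.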
Attachment #8601344 - Flags: feedback?(jwwang) → feedback-

Updated

3 years ago
Component: General → General
Product: Firefox OS → Core

Updated

3 years ago
Component: General → Video/Audio
(Assignee)

Comment 6

3 years ago
Created attachment 8602527 [details]
EMEDataFlow_Decrypt/Decode/Render_v1.png
(Assignee)

Comment 7

3 years ago
(In reply to JW Wang [:jwwang] from comment #5)
> Comment on attachment 8601344 [details] [diff] [review]
> [WIP] Add EMEBypassDecoderModule and reformat(encapsulate) MediaRawData into
> AudioData/VideoData then pass to MDSM_242264
> 
> Review of attachment 8601344 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: dom/media/MediaData.h
> @@ +84,5 @@
> > +  class RawDataImp : public RawDataInterface {
> > +    nsRefPtr<RawDataType> mData;
> > +  public:
> > +    RawDataImp(RawDataType* aT) : mData(aT) {}
> > +    virtual already_AddRefed<RawDataType> GetData() { return mData.forget(); }
> 
> Do you want to transfer the ownership and make it a non-idempotent method?
> 

Oh yes, my intention is to transfer the ownership.

I was thinking that the encrypted/encoded data would be stored in AudioData/VideoData in EMEBypassDecoder and then pushed back to the AudioQueue/VideoQueue in MediaDecoderStateMachine. After that, these data will be sent to the external decryptor/decoder/renderer to do the rest, and that's where the ownership transfer should happen.

I could rename the function to something clearer, e.g. 'DataTransfer(SOMETYPE**)'.
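The difference between an idempotent accessor and an ownership-transferring method can be sketched with std::shared_ptr standing in for nsRefPtr; all names here are hypothetical, not the patch's actual API:

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for the ref-counted raw-sample type.
struct RawDataType { int payload; };

class RawDataImp {
  std::shared_ptr<RawDataType> mData;
public:
  explicit RawDataImp(std::shared_ptr<RawDataType> aT) : mData(std::move(aT)) {}

  // Idempotent accessor: caller shares ownership; safe to call repeatedly.
  std::shared_ptr<RawDataType> GetData() const { return mData; }

  // Non-idempotent transfer: ownership moves out; later calls return null.
  std::shared_ptr<RawDataType> TakeData() { return std::move(mData); }
};
```

A `forget()`-style method behaves like `TakeData()` here: the second caller gets nothing, which is why the reviewer asks whether that is intended.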

> @@ +107,5 @@
> > +      rawData = static_cast<RawDataImp<RawDataType>*>(mRawData.get())->GetData();
> > +    }
> > +    return rawData.forget();
> > +  }
> > +  UniquePtr<RawDataInterface> mRawData;
> 
> I would prefer to just store a void* and require the caller to do type cast
> on their own since they have to know what the actual type it is anyway. It
> also makes it possible to store data that are not ref-counted.
> 

void* was originally my choice here.
Whether we use void* or a template type, the caller (who stores the data) and the retriever (who uses the data) both need to know the exact data type.
I was thinking of using a ref-counted pointer to make these data trackable, but using void* so that non-ref-counted data can be stored too is a good suggestion :)
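A minimal sketch of the void*-based storage the review suggests; the type names are assumed for illustration, not the real MediaData API, and the retriever must cast back to the concrete type and manage the lifetime itself:

```cpp
#include <cassert>

// Hypothetical stand-in for MediaData carrying an opaque raw-sample pointer.
struct MediaDataLike {
  void* mRawData = nullptr;

  void SetRawData(void* aData) { mRawData = aData; }

  // Hand the pointer back; the caller must cast it to the concrete type
  // and is responsible for its lifetime (it need not be ref-counted).
  void* TakeRawData() {
    void* p = mRawData;
    mRawData = nullptr;
    return p;
  }
};

// Hypothetical concrete sample type the retriever casts back to.
struct RawSample { int mSize; };
```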

> ::: dom/media/eme/EMEUtils.h
> @@ +34,5 @@
> >  
> > +  #ifndef EME_RENDER_LOG
> > +    PRLogModuleInfo* GetEMERenderLog();
> > +    #define EME_RENDER_LOG(args, ...) PR_LOG(GetEMERenderLog(), PR_LOG_DEBUG,\
> > +      ("\033[1;34m[EMERender]\033[m]" args, ##__VA_ARGS__))
> 
> color code?

Oh, I'll remove it. (It was just to make tracing easier in the console.)

> 
> ::: dom/media/fmp4/PlatformDecoderModule.cpp
> @@ +103,5 @@
> >    }
> >  
> >    nsRefPtr<PlatformDecoderModule> pdm;
> > +  if ((!cdmRendersAudio && !cdmDecodesAudio && aHasAudio) ||
> > +      (!cdmRendersVideo && !cdmDecodesVideo && aHasVideo)) {
> 
> There is no way to be able to render without decoding. You can just check
> |!cdmDecodesAudio && aHasAudio| as before.
> 

That's a good point!

> ::: dom/media/fmp4/eme/EMEDecoderModule.cpp
> @@ +182,5 @@
> > +  {}
> > +
> > +  already_AddRefed<MediaData> Create(MediaRawData* aSample)
> > +  {
> > +    CheckedInt64 frames = UsecsToFrames(aSample->mDuration+1, mRate);
> 
> Why |aSample->mDuration+1| instead of |aSample->mDuration|?
>

Uh, I saw this in BlankDecoderModule: it adds 1 to account for rounding errors in order to get a consistent tone. We don't need to do that here.
 
> @@ +195,5 @@
> > +                                           nullptr,
> > +                                           mChannels,
> > +                                           mRate);
> > +    nsRefPtr<MediaRawData> mrd = aSample->Clone();
> > +    ad->SetRawData((MediaRawData*)mrd);
> 
> I fail to see the point to carry the raw data around. Can you just return an
> empty AudioData since we're not gonna play it anyway.
> 

Uh, sorry, I've had second thoughts after our offline discussion yesterday.
My intent was to have a general data-flow path for both the EME and non-EME cases, passing AudioData/VideoData (whether raw or clean) to an external renderer (e.g. a 3rd-party HWComposer).
That's why I need EMEBypassDecoder to re-encapsulate the data and pass it back to MediaDecoderStateMachine.
After that, these (raw/clean) data can be transferred to the external renderer via a GMPExternalRendererAPI (to be added in another bug).

If we're focusing on the EME-only case, I think it would be easier to transfer the raw data to the CDM via EMEDecoderModule without a big architecture change, because the raw data are already inside Gecko (whether they are sent out through EMEDecoderModule or go back to MediaDecoderStateMachine).

Does that make sense to you?

> @@ +245,5 @@
> > +  RefPtr<layers::ImageContainer> mImageContainer;
> > +};
> > +
> > +template<class EMEBypassMediaDataCreator>
> > +class EMEBypassDecoder : public MediaDataDecoder
> 
> Can you explain what EMEBypassDecoder is for and how it works?
> 

Please see attachment 8602527 [details] (the line colored green). I was thinking of making EMEBypassDecoder the place to re-encapsulate raw data into AudioData/VideoData and flow them into MediaDecoderStateMachine first, and then pass these data to the CDM for decrypt/decode/render. This data path is different from EMEVideoDecoder's (the line colored blue).

> ::: dom/media/gmp/gmp-api/gmp-decryption.h
> @@ +127,5 @@
> >  
> > +// Capability; CDM can decrypt and then decode encrypted buffers and then
> > +// render these frames out, samples are not gonna be back to gecko.
> > +#define GMP_EME_CAP_DECRYPT_AND_DECODE_AND_RENDER_AUDIO (uint64_t(1) << 4)
> > +#define GMP_EME_CAP_DECRYPT_AND_DECODE_AND_RENDER_VIDEO (uint64_t(1) << 5)
> 
> This is really verbose. We should be able to simplify it based on the fact
> where canRender implies canDecode which implies canDecrypt.
> 

The current definition seems a little odd to me.
We check the capability at https://dxr.mozilla.org/mozilla-central/source/dom/media/eme/CDMCaps.cpp#192

#define GMP_EME_CAP_DECRYPT_AUDIO (uint64_t(1) << 0)
#define GMP_EME_CAP_DECRYPT_VIDEO (uint64_t(1) << 1)
#define GMP_EME_CAP_DECRYPT_AND_DECODE_AUDIO (uint64_t(1) << 2)
#define GMP_EME_CAP_DECRYPT_AND_DECODE_VIDEO (uint64_t(1) << 3)

bool
CDMCaps::AutoLock::CanDecryptAndDecodeAudio()
{
  return mData.HasCap(GMP_EME_CAP_DECRYPT_AND_DECODE_AUDIO);
}

Wouldn't it be easier to understand if we simplified the definitions like this:

#define GMP_CAP_DECRYPT_AUDIO (uint64_t(1) << 0)
#define GMP_CAP_DECRYPT_VIDEO (uint64_t(1) << 1)
#define GMP_CAP_DECODE_AUDIO (uint64_t(1) << 2)
#define GMP_CAP_DECODE_VIDEO (uint64_t(1) << 3)
#define GMP_CAP_RENDER_AUDIO (uint64_t(1) << 4)
#define GMP_CAP_RENDER_VIDEO (uint64_t(1) << 5)

#define GMP_EME_CAP_DECRYPT_AUDIO (GMP_CAP_DECRYPT_AUDIO)
#define GMP_EME_CAP_DECRYPT_VIDEO (GMP_CAP_DECRYPT_VIDEO)
#define GMP_EME_CAP_DECODE_AUDIO (GMP_CAP_DECRYPT_AUDIO + GMP_CAP_DECODE_AUDIO)
#define GMP_EME_CAP_DECODE_VIDEO (GMP_CAP_DECRYPT_VIDEO + GMP_CAP_DECODE_VIDEO)
#define GMP_EME_CAP_RENDER_AUDIO (GMP_CAP_DECRYPT_AUDIO + GMP_CAP_DECODE_AUDIO + GMP_CAP_RENDER_AUDIO)
#define GMP_EME_CAP_RENDER_VIDEO (GMP_CAP_DECRYPT_VIDEO + GMP_CAP_DECODE_VIDEO + GMP_CAP_RENDER_VIDEO)

bool
CDMCaps::AutoLock::CanDecryptAndDecodeAudio()
{
  return mData.HasCap(GMP_EME_CAP_DECODE_AUDIO);
}

With these definitions, we could also express the case where an external hwcomposer only does rendering (the non-EME case).
Does that make sense to you?
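A sketch of the proposed scheme, composing the layered capabilities with bitwise OR rather than + (OR is safe to apply repeatedly, whereas adding an already-set bit would carry into the wrong position); the names follow the proposal above but are illustrative only:

```cpp
#include <cassert>
#include <cstdint>

// Illustrative versions of the proposed capability bits.
constexpr uint64_t GMP_CAP_DECRYPT_AUDIO = uint64_t(1) << 0;
constexpr uint64_t GMP_CAP_DECODE_AUDIO  = uint64_t(1) << 2;
constexpr uint64_t GMP_CAP_RENDER_AUDIO  = uint64_t(1) << 4;

// Each EME level is the union of all weaker levels, composed with |.
constexpr uint64_t GMP_EME_CAP_DECRYPT_AUDIO = GMP_CAP_DECRYPT_AUDIO;
constexpr uint64_t GMP_EME_CAP_DECODE_AUDIO =
    GMP_CAP_DECRYPT_AUDIO | GMP_CAP_DECODE_AUDIO;
constexpr uint64_t GMP_EME_CAP_RENDER_AUDIO =
    GMP_EME_CAP_DECODE_AUDIO | GMP_CAP_RENDER_AUDIO;

// HasCap-style check: every bit of |aWanted| must be present in |aCaps|.
constexpr bool HasCap(uint64_t aCaps, uint64_t aWanted) {
  return (aCaps & aWanted) == aWanted;
}
```

Because the levels nest, `HasCap(caps, GMP_EME_CAP_RENDER_AUDIO)` automatically implies the decode and decrypt checks pass as well.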

Updated

2 years ago
Flags: needinfo?(jwwang)

Comment 8

2 years ago
Hi JW,
Could you help review comment 7 from Kilik?
Thanks!
Comment 9

2 years ago

Sounds reasonable to me. BTW, bit flags should be OR'ed rather than added.
Flags: needinfo?(jwwang)

Comment 10

2 years ago

> Wouldn't it be more easy to understand to purify the definition like
> 
> #define GMP_CAP_DECRYPT_AUDIO (uint64_t(1) << 0)
> #define GMP_CAP_DECRYPT_VIDEO (uint64_t(1) << 1)
> #define GMP_CAP_DECODE_AUDIO (uint64_t(1) << 2)
> #define GMP_CAP_DECODE_VIDEO (uint64_t(1) << 3)
> #define GMP_CAP_RENDER_AUDIO (uint64_t(1) << 4)
> #define GMP_CAP_RENDER_VIDEO (uint64_t(1) << 5)
> 
> #define GMP_EME_CAP_DECRYPT_AUDIO (GMP_CAP_DECRYPT_AUDIO)
> #define GMP_EME_CAP_DECRYPT_VIDEO (GMP_CAP_DECRYPT_VIDEO)
> #define GMP_EME_CAP_DECODE_AUDIO (GMP_CAP_DECRYPT_AUDIO +
> GMP_CAP_DECODE_AUDIO)
> #define GMP_EME_CAP_DECODE_VIDEO (GMP_CAP_DECRYPT_VIDEO +
> GMP_CAP_DECODE_VIDEO)
> #define GMP_EME_CAP_RENDER_AUDIO (GMP_CAP_DECRYPT_AUDIO +
> GMP_CAP_DECODE_AUDIO + GMP_CAP_RENDER_AUDIO)
> #define GMP_EME_CAP_RENDER_VIDEO (GMP_CAP_DECRYPT_VIDEO +
> GMP_CAP_DECODE_VIDEO + GMP_CAP_RENDER_VIDEO)

This would break backwards compatibility with existing EME CDMs in the field, i.e. with Adobe's CDM for Desktop Firefox. We should avoid breaking Adobe EME if at all possible.

Why is what's already landed (GMP_EME_CAP_RENDER_VIDEO etc) unsuitable?
(Assignee)

Comment 11

2 years ago
(In reply to Chris Pearce (:cpearce) from comment #10)
> > Wouldn't it be more easy to understand to purify the definition like
> > 
> > #define GMP_CAP_DECRYPT_AUDIO (uint64_t(1) << 0)
> > #define GMP_CAP_DECRYPT_VIDEO (uint64_t(1) << 1)
> > #define GMP_CAP_DECODE_AUDIO (uint64_t(1) << 2)
> > #define GMP_CAP_DECODE_VIDEO (uint64_t(1) << 3)
> > #define GMP_CAP_RENDER_AUDIO (uint64_t(1) << 4)
> > #define GMP_CAP_RENDER_VIDEO (uint64_t(1) << 5)
> > 
> > #define GMP_EME_CAP_DECRYPT_AUDIO (GMP_CAP_DECRYPT_AUDIO)
> > #define GMP_EME_CAP_DECRYPT_VIDEO (GMP_CAP_DECRYPT_VIDEO)
> > #define GMP_EME_CAP_DECODE_AUDIO (GMP_CAP_DECRYPT_AUDIO +
> > GMP_CAP_DECODE_AUDIO)
> > #define GMP_EME_CAP_DECODE_VIDEO (GMP_CAP_DECRYPT_VIDEO +
> > GMP_CAP_DECODE_VIDEO)
> > #define GMP_EME_CAP_RENDER_AUDIO (GMP_CAP_DECRYPT_AUDIO +
> > GMP_CAP_DECODE_AUDIO + GMP_CAP_RENDER_AUDIO)
> > #define GMP_EME_CAP_RENDER_VIDEO (GMP_CAP_DECRYPT_VIDEO +
> > GMP_CAP_DECODE_VIDEO + GMP_CAP_RENDER_VIDEO)
> 
> This would break backwards compatibility with existing EME CDMs in the
> field, i.e. with Adobe's CDM for Desktop Firefox. We should avoid breaking
> Adobe EME if at all possible.
> 
> Why is what's already landed (GMP_EME_CAP_RENDER_VIDEO etc) unsuitable?

Before Bug 1186375 landed, I was thinking of separating the wording "GMP" & "EME" and breaking these 3 capabilities (decrypt/decode/render) into 3 units, so that we could make GMP_EME_CAP_DECODE_AUDIO a combination of two smaller units.

But considering backwards compatibility, and now that Bug 1186375 has landed, I think we should keep using the current |#define NAME VALUE| form and focus on the EME capabilities first.

Updated

2 years ago
Status: NEW → ASSIGNED
Flags: needinfo?(cpearce)

Updated

2 years ago
feature-b2g: --- → 2.5+
Target Milestone: --- → FxOS-S5 (21Aug)
Comment 12

2 years ago

Josh, I don't understand why you needinfo'd me on this bug. What more input can I provide here? By my reading, Kilik agreed to keep using the existing GMP_EME capabilities pattern.
Flags: needinfo?(cpearce) → needinfo?(jocheng)
(Assignee)

Comment 13

2 years ago
Bug 1194652 will be a more proper solution. Marking this bug as a duplicate.
Status: ASSIGNED → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1194652

Updated

2 years ago
Flags: needinfo?(jocheng)