Closed Bug 1057908 Opened 10 years ago Closed 9 years ago

[e10s] [EME] GeckoMediaPluginService needs to be proxied from Content processes to parent process

Categories

(Core :: Audio/Video, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla40
tracking-b2g +
Tracking Status
e10s m5+ ---
firefox40 --- fixed

People

(Reporter: cpearce, Assigned: peterv)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

Attachments

(7 files, 12 obsolete files)

8.23 KB, patch
peterv
: review+
Details | Diff | Splinter Review
109.36 KB, patch
peterv
: review+
Details | Diff | Splinter Review
85.80 KB, patch
peterv
: review+
Details | Diff | Splinter Review
97.36 KB, patch
peterv
: review+
Details | Diff | Splinter Review
12.26 KB, patch
peterv
: review+
Details | Diff | Splinter Review
43.36 KB, patch
peterv
: review+
Details | Diff | Splinter Review
2.90 KB, patch
Details | Diff | Splinter Review
This is a spin off from bug 1035454 comment 16 and 17.

The GeckoMediaPluginService is creating child processes. In e10s Firefox this is not allowed. The parent process needs to be doing that, and the GeckoMediaPluginService in the content processes need to be routing messages to the GMP child processes via the parent (I assume).
Also note that the GeckoMediaPluginService that directly owns the GMP child processes needs to be able receive the "profile-change-teardown" notification, as EME GMPs sometimes need to do some bookkeeping on shutdown, and we initiate that from the "profile-change-teardown" notification.
This sounds like the exact same problem johns is currently working on for normal web plugins.
Note that although the GMP process will be a child of the main chrome process, you don't necessarily need to proxy all communications through the shared parent. You can set up a separate IPDL channel directly between the two processes for short-term communication. So e.g. PGMP will be chrome <-> GMP but PGMPVideoDecoder etc can be content <-> GMP.
See Also: → e10s-plugin-ipc
Assignee: nobody → ajones
Chris - do you know who might be able to work on this bug?
Flags: needinfo?(cpeterson)
Brad: is implementing content process <-> EME/GMP process communication the responsibility of the Sandboxing team or the e10s team?
Blocks: core-e10s
No longer blocks: e10s
Flags: needinfo?(cpeterson) → needinfo?(blassey.bugs)
(In reply to Chris Peterson (:cpeterson) from comment #5)
> Brad: is implementing content process <-> EME/GMP process communication the
> responsibility of the Sandboxing team or the e10s team?

This should fall on the e10s team. CC'ing a few people who should know a thing or two. 

Johns specifically, how much of what you're doing to reparent NPAPI plugins is applicable here?
Flags: needinfo?(blassey.bugs)
John: is your e10s work for plugin processes relevant for the EME/GMP sandbox process?
Flags: needinfo?(jschoenick)
Summary: [e10s] GeckoMediaPluginService needs to be proxied from Content processes to parent process → [e10s] [EME] GeckoMediaPluginService needs to be proxied from Content processes to parent process
Blocks: 1082331
See Also: → 1082331
The work we'll need to do here will be very similar to what we're doing for plugins. Unfortunately, there isn't really any actual code sharing that we can take advantage of.

I think this work should be the responsibility of the media team. The e10s team can help out with IPDL bridging, but it seems more efficient for the people familiar with the GMP protocols to handle this.
Flags: needinfo?(jschoenick)
hey jesup, is someone on your team working on e10sifying all this new media code?
Flags: needinfo?(rjesup)
Assignee: ajones → nobody
Flags: needinfo?(rjesup)
Flags: needinfo?(mreavy)
(In reply to Bill McCloskey (:billm) from comment #8)
> I think this work should be the responsibility of the media team. The e10s
> team can help out with IPDL bridging, but it seems more efficient for the
> people familiar with the GMP protocols to handle this.

I asked Chris Pearce to file this bug so we could get someone from the e10s team to work on it, or at least someone not working on media. It isn't a media issue outside of the fact that it blocks e10s because it breaks media. We just need to find someone familiar with IPDL or someone with less important things to do who is willing to become familiar.
BillM and Jesup talked about the right plan going forward.  It sounds like the work Bill is doing for the NPAPI plugins will be largely applicable to GMP, and Bill is willing to look at how much of it applies in a few days (when he's done with the NPAPI work).
Flags: needinfo?(mreavy)
Assignee: nobody → wmccloskey
Blocks: 1065117
This bug is necessary for basic playback of EME video on Windows.
Blocks: eme-m1
No longer blocks: eme-m1
Blocks: 1105816
Talked to billm IRL. The general idea here was to split the PGMP protocol into services that Chrome provides to the GMP process (GMPStorage, GMPTimer, AsyncShutdown* messages, other async shutdown messages, maybe PCrashReporter) and the protocols that need to be connected to code in the content-process (GMP*coder, GMPDecryptor). We also need to proxy the calls to GMPService from the content-process to the chrome process. We can do this in the places that we dispatch runnables to the GMPThread in the child process, since none of these block.

Shutdown is a special case; we'd need to have GMPService's profile-teardown observer run in the chrome process, and initiate shutdown from there.
Peter, do you think you'll have time to look at this? We need to restructure how IPC works for GMP. Right now the content process is trying to create its own GMP process and completely manage communications with it. We want to change things around so that the content process asks the chrome process to spawn a new GMP process and then send down a direct IPC link to that GMP process. Sending the direct link would happen via IPDL "bridging". You can find details on that in bug 564086, which has a pretty decent description in comment 0.
Flags: needinfo?(peterv)
jst says peterv is working on this bug now.
Assignee: wmccloskey → peterv
Flags: needinfo?(peterv)
hi peterv,

Is there any updated information? Thanks.
Flags: needinfo?(peterv)
I'm working on splitting the protocols without bridging for now.
Flags: needinfo?(peterv)
hi peterv,

Got it, thanks for the information. :)
Quick progress report, I have patches that work on http://people.mozilla.org/~cpearce/mse-clearkey/ with and without e10s. Still have to clean up shutdown (currently crashes on shutdown) and move some code around, but getting closer.
tracking-b2g: --- → +
(In reply to Peter Van der Beken [:peterv] from comment #22)
Awesome! The fix will be much appreciated!
Peter, any updates on EME support for e10s? The EME team would like to ship EME in Firefox 38.
Flags: needinfo?(peterv)
Have an almost green try run. There's two issues remaining, haven't been able to reproduce them locally yet so probably will need more try runs to figure them out :-/.
Flags: needinfo?(peterv)
Depends on: 1133748
Depends on: 1121676
EME support for e10s does not need to block adobe-eme.
Peter, just curious what the status is. Still working on Windows orange?
Flags: needinfo?(peterv)
Note that this doesn't actually make the getters themselves asynchronous yet, it just makes API consumers deal with potentially asynchronous getters.

The NS_ProcessNextEvent loops are sadmaking, but I don't think there's a way around it (loading the GMP plugin requires the main thread to process events).
Attachment #8565432 - Attachment is obsolete: true
Attachment #8571838 - Flags: review?(cpearce)
This doesn't actually gives us anything in e10s, but it makes sure things keep working with e10s disabled.
Let me know if you want someone else (cpearce?) to do an additional review since it's in dom/media.
Attachment #8565433 - Attachment is obsolete: true
Attachment #8571840 - Flags: review?(wmccloskey)
Again, this doesn't actually make GetNodeId asynchronous but makes callers deal if it were.
Attachment #8571841 - Flags: review?(cpearce)
Same as for part 3, let me know if you want an additional reviewer.
Attachment #8565434 - Attachment is obsolete: true
Attachment #8565436 - Attachment is obsolete: true
Attachment #8571842 - Flags: review?(wmccloskey)
(In reply to Bill McCloskey (:billm) from comment #32)
> Peter, just curious what the status is. Still working on Windows orange?

Figuring out the remaining oranges. Finally got a green tree (modulo some intermittent oranges).
Flags: needinfo?(peterv)
Patches are bitrotten now... :(
Attachment #8571840 - Attachment is obsolete: true
Attachment #8571840 - Flags: review?(wmccloskey)
Attachment #8573101 - Flags: review?(wmccloskey)
(In reply to Chris Pearce (:cpearce) from comment #39)
> Patches are bitrotten now... :(

They applied cleanly to inbound the moment I attached them. Right now just one patch has one chunk for which context changed, I've updated it.
Actually, I know what happened. I forgot about this new patch that was needed due to a new API being added recently (in bug 1124031).
Attachment #8573119 - Flags: review?(cpearce)
Do it work on both e10s and non-e10s tabs?
Comment on attachment 8573119 [details] [diff] [review]
Part 0 - Make mozIGeckoMediaPluginService::GetPluginVersionForAPI return whether we even have the plugin.

Review of attachment 8573119 [details] [diff] [review]:
-----------------------------------------------------------------

Is this patch just for saving some duplicated code?
Comment on attachment 8571838 [details] [diff] [review]
Part 2 - support asynchronous GMP API getters.

Review of attachment 8571838 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/gmp/GMPService.cpp
@@ +279,5 @@
> +
> +  void Done(GMPParent* aGMPParent)
> +  {
> +    GMPVideoDecoderParent* gmpVDP;
> +    GMPVideoDecoderProxy* videoHost;

It should be |GMPVideoHost* videoHost|.

@@ +335,5 @@
> +
> +  void Done(GMPParent* aGMPParent)
> +  {
> +    GMPVideoEncoderParent* gmpVEP;
> +    nsresult rv = aGMPParent->GetGMPVideoEncoder(gmpVEP);

nsresult rv = aGMPParent->GetGMPVideoEncoder(&gmpVEP);

@@ +337,5 @@
> +  {
> +    GMPVideoEncoderParent* gmpVEP;
> +    nsresult rv = aGMPParent->GetGMPVideoEncoder(gmpVEP);
> +    if (NS_SUCCEEDED(rv)) {
> +      videoHost = &gmpVEP->Host();

Need to declare videoHost.
Comment on attachment 8571838 [details] [diff] [review]
Part 2 - support asynchronous GMP API getters.

Review of attachment 8571838 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/eme/CDMProxy.cpp
@@ +82,5 @@
>  }
>  #endif
>  
>  void
> +CDMProxy::gmp_InitDone(GMPDecryptorProxy* aCDM, nsAutoPtr<InitData> aData)

Is it possible for this function to race with gmp_Shutdown()? Does this function guarantee to run in the GMP thread?

::: dom/media/fmp4/gmp/GMPVideoDecoder.cpp
@@ +212,5 @@
>  
> +nsresult
> +GMPVideoDecoder::Init()
> +{
> +  return mGMP ? NS_OK : NS_ERROR_FAILURE;

We should assert GMPInit() is already run.

::: dom/media/fmp4/gmp/GMPVideoDecoder.h
@@ +84,5 @@
>    virtual nsresult Flush() MOZ_OVERRIDE;
>    virtual nsresult Drain() MOZ_OVERRIDE;
>    virtual nsresult Shutdown() MOZ_OVERRIDE;
>  
> +  void GMPInit();

It is tricky to the client code to know when to call init() or GMPInit(). Can we have only one init function?

@@ +112,5 @@
> +    {
> +      mThread->Dispatch(this, NS_DISPATCH_NORMAL);
> +    }
> +
> +    bool IsDone()

Should assert this is only called in the same thread as mThread otherwise we have data race in mInitDone.

::: dom/media/gmp/GMPService.cpp
@@ +231,5 @@
> +  }
> +
> +  void Done(GMPParent* aGMPParent)
> +  {
> +    GMPAudioDecoderParent* gmpADP;

It is a good practice to always initialize the variable to avoid compiler warnings.

@@ +234,5 @@
> +  {
> +    GMPAudioDecoderParent* gmpADP;
> +    nsresult rv = aGMPParent->GetGMPAudioDecoder(&gmpADP);
> +    if (NS_FAILED(rv)) {
> +      gmpADP = nullptr;

When rv is FAILED, gmpADP should remain unchanged. Otherwise, it is a bad design for GetGMPAudioDecoder().

@@ +287,5 @@
> +    } else {
> +      gmpVDP = nullptr;
> +      videoHost = nullptr;
> +    }
> +    mCallback->Done(static_cast<GMPVideoDecoderProxy*>(gmpVDP), videoHost);

Cast is not needed here since GMPVideoDecoderParent inherits GMPVideoDecoderProxy.

::: dom/media/gmp/GMPService.h
@@ +55,4 @@
>    NS_IMETHOD GetGMPDecryptor(nsTArray<nsCString>* aTags,
>                               const nsACString& aNodeId,
> +                             nsAutoPtr<GetGMPDecryptorCallback> aCallback)
> +    MOZ_OVERRIDE;

Can we use the macro NS_DECL_MOZIGECKOMEDIAPLUGINSERVICE instead of declaring the functions one by one here?

::: dom/media/gmp/mozIGeckoMediaPluginService.idl
@@ +18,5 @@
> +template<class T>
> +class GMPGetterCallback
> +{
> +public:
> +  GMPGetterCallback() { MOZ_COUNT_CTOR(GMPGetterCallback<T>); }

Macro is expanded before template instantiation. I wonder if this can work correctly.

@@ +81,5 @@
>     */
>    [noscript]
> +  void getGMPVideoDecoder(in TagArray tags,
> +                          [optional] in ACString nodeId,
> +                          in GetGMPVideoDecoderCallback callback);

Is is possible to use rvalue reference for GetGMPVideoDecoderCallback? It would be more obvious about the ownership transfer.

::: dom/media/gtest/TestGMPCrossOrigin.cpp
@@ +35,5 @@
> +    : mFinished(false)
> +  {
> +  }
> +
> +  void AwaitFinished()

Assert this function runs in the main thread.

@@ +45,5 @@
> +  }
> +
> +  void SetFinished()
> +  {
> +    NS_DispatchToMainThread(new SetFinishedRunnable(this));

You can use NS_NewRunnableMethod to save the class SetFinishedRunnable.

@@ +116,5 @@
> +    : mMonitor(aMonitor)
> +  {
> +  }
> +
> +  static nsresult Get(const nsACString& aNodeId, nsAutoPtr<Base> aCallback)

aCallback can take rvalue reference to make ownership transfer explicit.

@@ +164,5 @@
> +public:
> +  virtual void Done(typename Base::GMPCodecType* aGMP, GMPVideoHost* aHost)
> +  {
> +    EXPECT_TRUE(aGMP);
> +    if (mStep == 2) {

Polymorphism is preferred over value based switch. Or make mStep a const member to show the behavior is determined at object construction time.

@@ +179,4 @@
>  
> +    nsAutoPtr<typename Base::GMPCallbackType> callback(
> +      new RunTestGMPCrossOrigin(Base::mMonitor, aGMP, mShouldBeEqual));
> +    Base::Get(mOrigin2, callback);

Do we need Move(callback)?

@@ +186,5 @@
> +                  const nsCString& aOrigin2)
> +  {
> +    nsAutoPtr<typename Base::GMPCallbackType> callback(
> +      new RunTestGMPCrossOrigin(aMonitor, aOrigin1, aOrigin2));
> +    Base::Get(aOrigin1, Move(callback));

Though unlikely, we should handle error and terminate test appropriately if Base::Get() fails.
Comment on attachment 8573101 [details] [diff] [review]
Part 3 - split the GMP IPDL actors in 2 parts (and use opens to open the second in non-e10s).

Review of attachment 8573101 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/gmp/GMPChild.cpp
@@ +706,5 @@
> +  }
> +
> +  GMPContentChild* child =
> +    *mGMPContentChildren.AppendElement(new GMPContentChild(this));
> +  GMPMessageLoop()->PostTask(FROM_HERE,

Why do we put aChild->Open() in the GMP thread instead of running it here?

::: dom/media/gmp/GMPChild.h
@@ +17,5 @@
>  namespace mozilla {
>  namespace gmp {
>  
> +class GMPContentChild;
> +class PGMPAudioDecoderChild;

PGMPAudioDecoderChild is not used in this header. Do we need the forward declaration here?

@@ +86,5 @@
>    virtual void ProcessingError(Result aCode, const char* aReason) MOZ_OVERRIDE;
>  
>    GMPErr GetAPI(const char* aAPIName, void* aHostAPI, void** aPluginAPI);
>  
> +  nsTArray<GMPContentChild*> mGMPContentChildren;

Since a GMPParent has only one GMPContentParent, shouldn't a GMPChild have a GMPContentChild only?

::: dom/media/gmp/GMPContentParent.h
@@ +48,5 @@
> +  virtual void CheckThread() MOZ_OVERRIDE;
> +
> +private:
> +  ~GMPContentParent();
> +  nsRefPtr<GeckoMediaPluginService> mService;

Is this member ever used?

::: dom/media/gmp/GMPParent.cpp
@@ +829,5 @@
> +PGMPContentParent*
> +GMPParent::AllocPGMPContentParent(Transport* aTransport,
> +                                  ProcessId aOtherProcess)
> +{
> +  MOZ_ASSERT(GMPThread() == NS_GetCurrentThread());

Does AllocPGMPContentParent happen on the same thread where PGMPContent::Open() is called?

@@ +860,5 @@
> +      new RunCreateContentParentCallbacks(mGMPContentParent, Move(aCallback));
> +    NS_DispatchToCurrentThread(runCallbacks);
> +  } else {
> +    mCallbacks.AppendElement(Move(aCallback));
> +    if (mCallbacks.Length() == 1 &&

Add a comment for it is not obvious |mCallbacks.Length() == 1| implies GetGMPContentParent() is called for the 1st time and we want to call PGMPContent::Open() only once.

::: dom/media/gmp/PGMP.ipdl
@@ +25,5 @@
>    async PCrashReporter(NativeThreadId tid);
>    async PGMPTimer();
>    async PGMPStorage();
>  
> +  sync SetIsUsed(bool aIsUsed);

Why is this a sync function?

::: dom/media/gmp/PGMPVideoEncoder.ipdl
@@ +36,5 @@
>    async __delete__();
>    async Encoded(GMPVideoEncodedFrameData aEncodedFrame,
>                  uint8_t[] aCodecSpecificInfo);
>    async Error(GMPErr aErr);
> +  sync Shutdown();

Why do we need sync Shutdown() when we already have __delete__()?
Comment on attachment 8571842 [details] [diff] [review]
Part 5 - use bridging for GMP in e10s.

Review of attachment 8571842 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/gmp/GMPContentParent.h
@@ +27,5 @@
>  {
>  public:
>    NS_INLINE_DECL_THREADSAFE_REFCOUNTING(GMPContentParent)
>  
> +  explicit GMPContentParent(GMPParent* aParent = nullptr);

It concerns me to see GMPContentParent references GMPParent. Since we have PGMPService, can we have it handle the discrepancy of chrome/content code and eliminate the need for GeckoMediaPluginService{Child,Parent}?

::: dom/media/gmp/PGMPService.ipdl
@@ +16,5 @@
> +
> +parent:
> +  sync LoadGMP(nsCString nodeId, nsCString api, nsCString[] tags)
> +    returns (ProcessId id);
> +  sync BridgeGMP(ProcessId id);

Can we conbine LoadGMP and BridgeGMP into a single function? By doing so, we don't need the return value from LoadGMP and it can be an async function.
Btw, I got a crash in the plugin-container process when running gtests.

#0  0x00007f513404a9bd in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f513404a854 in __sleep (seconds=0) at ../sysdeps/unix/sysv/linux/sleep.c:137
#2  0x00007f512f7690fa in ah_crap_handler (signum=11) at /media/jwwang/DATA/codebase/mozilla-central2/toolkit/xre/nsSigHandlers.cpp:101
#3  0x00007f512fcb74ca in AsmJSFaultHandler (signum=<optimized out>, info=0x7f51210fd4b0, context=0x7f51210fd380) at /media/jwwang/DATA/codebase/mozilla-central2/js/src/asmjs/AsmJSSignalHandlers.cpp:1151
#4  <signal handler called>
#5  0x00007f512dca32f4 in mozilla::ipc::IToplevelProtocol::IToplevelProtocol (this=this@entry=0x7f5133b35498, aProtoId=aProtoId@entry=PGMPContentMsgStart) at ../../dist/include/mozilla/ipc/ProtocolUtils.h:191
#6  0x00007f512deb310d in mozilla::gmp::PGMPContentParent::PGMPContentParent (this=0x7f5133b35480) at /media/jwwang/DATA/codebase/mozilla-central2/obj-x86_64-unknown-linux-gnu/ipc/ipdl/PGMPContentParent.cpp:60
#7  0x00007f512ec41043 in mozilla::gmp::GMPContentParent::GMPContentParent (this=0x7f5133b35480, aParent=0x7f51219dac00) at /media/jwwang/DATA/codebase/mozilla-central2/dom/media/gmp/GMPContentParent.cpp:40
#8  0x00007f512ec4956c in mozilla::gmp::GMPParent::AllocPGMPContentParent (this=0x7f51219dac00, aTransport=0x7f512194b540, aOtherProcess=<optimized out>)
    at /media/jwwang/DATA/codebase/mozilla-central2/dom/media/gmp/GMPParent.cpp:838
#9  0x00007f512deaf753 in mozilla::gmp::PGMPParent::OnMessageReceived (this=0x7f51219dac00, __msg=...) at /media/jwwang/DATA/codebase/mozilla-central2/obj-x86_64-unknown-linux-gnu/ipc/ipdl/PGMPParent.cpp:679
#10 0x00007f512dc0889b in mozilla::ipc::MessageChannel::DispatchAsyncMessage (this=0x7f51219dac60, aMsg=...) at /media/jwwang/DATA/codebase/mozilla-central2/ipc/glue/MessageChannel.cpp:1211
#11 0x00007f512dc0f928 in mozilla::ipc::MessageChannel::DispatchMessage (this=this@entry=0x7f51219dac60, aMsg=...) at /media/jwwang/DATA/codebase/mozilla-central2/ipc/glue/MessageChannel.cpp:1138
#12 0x00007f512dc1284f in mozilla::ipc::MessageChannel::OnMaybeDequeueOne (this=0x7f51219dac60) at /media/jwwang/DATA/codebase/mozilla-central2/ipc/glue/MessageChannel.cpp:1122
#13 0x00007f512dbebbdc in MessageLoop::RunTask (this=0x7f5133b5e640, task=0x7f51219f0040) at /media/jwwang/DATA/codebase/mozilla-central2/ipc/chromium/src/base/message_loop.cc:361
#14 0x00007f512dbf215c in MessageLoop::DeferOrRunPendingTask (this=<optimized out>, pending_task=...) at /media/jwwang/DATA/codebase/mozilla-central2/ipc/chromium/src/base/message_loop.cc:369
#15 0x00007f512dbf227f in MessageLoop::DoWork (this=0x7f5133b5e640) at /media/jwwang/DATA/codebase/mozilla-central2/ipc/chromium/src/base/message_loop.cc:447
#16 0x00007f512dc06d01 in mozilla::ipc::DoWorkRunnable::Run (this=<optimized out>) at /media/jwwang/DATA/codebase/mozilla-central2/ipc/glue/MessagePump.cpp:233
#17 0x00007f512d95a1e1 in nsThread::ProcessNextEvent (this=0x7f5121b25900, aMayWait=<optimized out>, aResult=0x7f51210fdcff) at /media/jwwang/DATA/codebase/mozilla-central2/xpcom/threads/nsThread.cpp:855
#18 0x00007f512d97b2ce in NS_ProcessNextEvent (aThread=<optimized out>, aMayWait=<optimized out>) at /media/jwwang/DATA/codebase/mozilla-central2/xpcom/glue/nsThreadUtils.cpp:265
#19 0x00007f512dc0b41b in mozilla::ipc::MessagePumpForNonMainThreads::Run (this=0x7f51219de440, aDelegate=0x7f5133b5e640) at /media/jwwang/DATA/codebase/mozilla-central2/ipc/glue/MessagePump.cpp:339
#20 0x00007f512dbee447 in MessageLoop::RunInternal (this=this@entry=0x7f5133b5e640) at /media/jwwang/DATA/codebase/mozilla-central2/ipc/chromium/src/base/message_loop.cc:233
#21 0x00007f512dbee478 in RunHandler (this=0x7f5133b5e640) at /media/jwwang/DATA/codebase/mozilla-central2/ipc/chromium/src/base/message_loop.cc:226
#22 MessageLoop::Run (this=this@entry=0x7f5133b5e640) at /media/jwwang/DATA/codebase/mozilla-central2/ipc/chromium/src/base/message_loop.cc:200
#23 0x00007f512d95a67b in nsThread::ThreadFunc (aArg=0x7f5121b25900) at /media/jwwang/DATA/codebase/mozilla-central2/xpcom/threads/nsThread.cpp:356
#24 0x00007f51338e600f in _pt_root (arg=0x7f5121bfb2c0) at /media/jwwang/DATA/codebase/mozilla-central2/nsprpub/pr/src/pthreads/ptthread.c:212
#25 0x00007f5134b65182 in start_thread (arg=0x7f51210fe700) at pthread_create.c:312
#26 0x00007f5134083fbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(In reply to JW Wang [:jwwang] from comment #51)
> Found crashes in gtests:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=a233b9c3d0e8

Might be something new, they weren't in my last try run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=1839571a0b47
Try again: https://treeherder.mozilla.org/#/jobs?repo=try&revision=0e5e4d1dcc3b

GTests didn't fail on Linux Debug this time. I suspect if it is a timing issue.

Btw, I can't repro it on my local side.
Attachment #8573119 - Flags: review?(cpearce) → review+
peterv: It would be good if you could answer JW's questions, he's been working on EME/GMP with me, and is focused on getting EME working on B2G. So he's keenly interested in ensuring these patches and getting them landed.
(In reply to JW Wang [:jwwang] from comment #53)
> Try again:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=0e5e4d1dcc3b
> 
> GTests didn't fail on Linux Debug this time. I suspect if it is a timing
> issue.
> 
> Btw, I can't repro it on my local side.

Found the cause. See bug 1141888.
Depends on: 1141888
I'm sorry this is taking me so long. It's a really big patch and I also have to understand how GMP works. I've been working on it the last few days.
Comment on attachment 8573101 [details] [diff] [review]
Part 3 - split the GMP IPDL actors in 2 parts (and use opens to open the second in non-e10s).

Review of attachment 8573101 [details] [diff] [review]:
-----------------------------------------------------------------

Some drive by comments (without having fully understood how this works yet)... I'm concerned that we're making the protocols more synchronous here...

::: dom/media/gmp/PGMP.ipdl
@@ +25,5 @@
>    async PCrashReporter(NativeThreadId tid);
>    async PGMPTimer();
>    async PGMPStorage();
>  
> +  sync SetIsUsed(bool aIsUsed);

This doesn't need to be sync as the implementation itself just does and async dispatch?

::: dom/media/gmp/PGMPAudioDecoder.ipdl
@@ +13,5 @@
>  
>  namespace mozilla {
>  namespace gmp {
>  
> +sync protocol PGMPAudioDecoder

Doesn't changing "sync" to "async" here cause all messages in this protocol to become synchronous? That seems like a bad thing.

Why does this entire protocol have to be sync? Being async seems better here, since if there's some fatal decoding error we expect to receive an Error() message from the child process. Also the video protocol has mostly async messages, so it's inconsistent to have the protocol here mostly sync.

::: dom/media/gmp/PGMPDecryptor.ipdl
@@ +14,5 @@
>  
>  namespace mozilla {
>  namespace gmp {
>  
> +sync protocol PGMPDecryptor

Doesn't changing "sync" to "async" here cause all messages in this protocol to become synchronous? That seems like a bad thing.

Why does this entire protocol have to be sync? If Shutdown() need to be sync, can you only make that sync and the rest async?
The sync annotation on a protocol is just a type checking annotation. It tells you at a glance whether any message in the protocol is sync or not. It doesn't change behavior at all.
Comment on attachment 8573101 [details] [diff] [review]
Part 3 - split the GMP IPDL actors in 2 parts (and use opens to open the second in non-e10s).

Review of attachment 8573101 [details] [diff] [review]:
-----------------------------------------------------------------

I read this over and it looks good to me. I tried really hard to find problems and I didn't see any :-). There are a few cases where you send a message on a channel that might have already had ActorDestroy called on it, but I don't think that's really an issue. It will just print a warning.

I have the same questions as Chris and JW. Namely, why are SetIsUsed and Shutdown sync messages?

I also wonder if converting all the APIs to be async is the right decision. It looks like the previous patch needs to spin a nested event loop in several places. Now I'm wondering whether it would be better to provide a sync version of PGMPContent::Open.

r+ on this since I don't think the sync messages are too big a deal.

::: dom/media/gmp/GMPChild.cpp
@@ +590,5 @@
> +  return true;
> +}
> +
> +bool
> +GMPChild::RecvCloseContent()

Should we just put this in ActorDestroy instead? Then it would also run in case the parent process crashed, which might be nice.

@@ +595,5 @@
> +{
> +  nsTArray<GMPContentChild*> contentChildren;
> +  contentChildren.SwapElements(mGMPContentChildren);
> +  for (uint32_t i = contentChildren.Length(); i > 0; i--) {
> +    MOZ_ASSERT(!contentChildren[i - 1]->IsUsed());

If you move to ActorDestroy, you should probably make this assertion conditional on normal destruction.

@@ +596,5 @@
> +  nsTArray<GMPContentChild*> contentChildren;
> +  contentChildren.SwapElements(mGMPContentChildren);
> +  for (uint32_t i = contentChildren.Length(); i > 0; i--) {
> +    MOZ_ASSERT(!contentChildren[i - 1]->IsUsed());
> +    contentChildren[i - 1]->Close();

Do we leak the GMPContentChild here? I don't think it's getting torn down anywhere.

@@ +706,5 @@
> +  }
> +
> +  GMPContentChild* child =
> +    *mGMPContentChildren.AppendElement(new GMPContentChild(this));
> +  GMPMessageLoop()->PostTask(FROM_HERE,

If we're already on the GMP thread (and I think we are), then I agree that we could just do this right away.

::: dom/media/gmp/GMPParent.h
@@ +188,5 @@
>    nsCOMPtr<nsITimer> mAsyncShutdownTimeout; // GMP Thread only.
>    // NodeId the plugin is assigned to, or empty if the the plugin is not
>    // assigned to a NodeId.
>    nsAutoCString mNodeId;
> +  nsRefPtr<GMPContentParent> mGMPContentParent;

Please comment that this is the GMP content in the parent process, and there may be others in content processes.

::: dom/media/gmp/PGMPAudioDecoder.ipdl
@@ +29,5 @@
>    InputDataExhausted();
>    DrainComplete();
>    ResetComplete();
>    Error(GMPErr aErr);
> +  sync Shutdown();

I don't understand why the Shutdown messages need to be sync.
Attachment #8573101 - Flags: review?(wmccloskey) → review+
Comment on attachment 8571842 [details] [diff] [review]
Part 5 - use bridging for GMP in e10s.

Review of attachment 8571842 [details] [diff] [review]:
-----------------------------------------------------------------

This patch is more complicated than I would have expected. Why do we need PGMPService? Couldn't you do the bridging directly through PContent?
(In reply to Bill McCloskey (:billm) from comment #59)
> I also wonder if converting all the APIs to be async is the right decision.
> It looks like the previous patch needs to spin a nested event loop in
> several places. Now I'm wondering whether it would be better to provide a
> sync version of PGMPContent::Open.

I went with the async approach since that's what you suggested at first, but I'm fine with revisiting this if it's not too much work. It means rewriting a big piece of these patches but the GMP code would end up being a lot simpler.

(In reply to Bill McCloskey (:billm) from comment #60)
> This patch is more complicated than I would have expected. Why do we need
> PGMPService? Couldn't you do the bridging directly through PContent?

PContent uses the main thread and PGMP runs on the GMP thread, you can't bridge protocols that run on different threads.


I'd started responding to the other comments and making changes to the patches, but it seems like we need to decide on the synchronous Open question first, since it affects most of the patches.
Flags: needinfo?(wmccloskey)
I talked to cpearce and bholley today about the sync versus async issue and the nested event loops. Originally I was hoping that we wouldn't need so many of them. However, Chris thinks that the media team should be able to refactor the remaining stuff to be async within 3 to 6 months. So it doesn't seem like an unreasonable strategy to do it the way you have it.

The alternative is to add an IPC mechanism where we can call PGMPContent::Open and synchronously get a PGMPContentParent*. I think it should be possible to do that, although it violates the IPDL rule about message ordering since the Open message would be dispatched before messages that might have been sent before. However, I don't think that would break anything, especially if we only did it for PGMPContent.

Right now the IPC change seems like the best way to go to me if we can make it work. I can take a look at writing a synchronous Open function tomorrow and see how feasible it is. However, if you felt strongly, Peter, I could be convinced to do it the way you have now.
Flags: needinfo?(wmccloskey)
Specifically, we could change our MediaDataDecoder interface (which is the thing calling the GMP*Decoder API objects) to have its Init() method return a MediaPromise<bool>, and then the init can be async. This is a bigger change though, so we don't need to block on it.
I'm struggling to find the time to get to reviewing this, things keep coming up, so I'll pass the review to JW Wang in Taipei. It's his job to make EME work on B2G, so he's very motivated to see this reviewed and landed. JW's a smart and thorough guy, so I'm sure he'll do a good job.
Comment on attachment 8571828 [details] [diff] [review]
Part 1 - split GeckoMediaPluginService into a part for chrome and a part for both content and chrome.

Review of attachment 8571828 [details] [diff] [review]:
-----------------------------------------------------------------

Passing review to JW so we can get this moving faster...
Attachment #8571828 - Flags: review?(cpearce) → review?(jwwang)
Comment on attachment 8571838 [details] [diff] [review]
Part 2 - support asynchronous GMP API getters.

Review of attachment 8571838 [details] [diff] [review]:
-----------------------------------------------------------------

Passing review to JW so we can get this moving faster...
Attachment #8571838 - Flags: review?(cpearce) → review?(jwwang)
Comment on attachment 8571841 [details] [diff] [review]
Part 4 - support asynchronous GetNodeId.

Review of attachment 8571841 [details] [diff] [review]:
-----------------------------------------------------------------

Passing review to JW so we can get this moving faster...
Attachment #8571841 - Flags: review?(cpearce) → review?(jwwang)
Attachment #8571828 - Flags: review?(jwwang) → review+
Comment on attachment 8571838 [details] [diff] [review]
Part 2 - support asynchronous GMP API getters.

Review of attachment 8571838 [details] [diff] [review]:
-----------------------------------------------------------------

Overall looks good to me with nits addressed. We need webrtc peers to review WebrtcGmpVideoCodec.*.
Attachment #8571838 - Flags: review?(jwwang) → review+
Comment on attachment 8571838 [details] [diff] [review]
Part 2 - support asynchronous GMP API getters.

Hi Randell,
Can you review changes to WebrtcGmpVideoCodec.*? Thanks.
Attachment #8571838 - Flags: review+ → review?(rjesup)
Attachment #8571838 - Flags: review?(jwwang)
Comment on attachment 8571838 [details] [diff] [review]
Part 2 - support asynchronous GMP API getters.

Review of attachment 8571838 [details] [diff] [review]:
-----------------------------------------------------------------

Bring r+ back.
Attachment #8571838 - Flags: review?(jwwang) → review+
Attachment #8571841 - Flags: review?(jwwang) → review+
Comment on attachment 8571838 [details] [diff] [review]
Part 2 - support asynchronous GMP API getters.

Review of attachment 8571838 [details] [diff] [review]:
-----------------------------------------------------------------

I'll note also that the changes to GMPVideoDecoder and GMPAudioDecoder are basically identical, which makes me wonder if they should be both inheriting this boilerplate from somewhere.

Some things jwwang flagged implied this code couldn't have been successfully compiled, which makes me wary of r+'ing it.

However, I'll r+ (with nits resolved, and a good try run, AND manual testing of OpenH264 via pc_test.html) if there's an immediate followup for the cleanup/refactoring discussed, or good reasons why it can't/shouldn't be.

::: dom/media/fmp4/gmp/GMPAudioDecoder.cpp
@@ +193,5 @@
>  
> +nsresult
> +GMPAudioDecoder::Init()
> +{
> +  return mGMP ? NS_OK : NS_ERROR_FAILURE;

see jwwang's comment on GMPVideoDecoder

::: dom/media/fmp4/gmp/GMPAudioDecoder.h
@@ +107,5 @@
> +  private:
> +    bool mInitDone;
> +    nsCOMPtr<nsIThread> mThread;
> +  };
> +  void GetGMPAPI(GMPInitDoneRunnable* aInitDone);

nit; blank line before this and after

::: dom/media/fmp4/gmp/GMPVideoDecoder.cpp
@@ +162,5 @@
> +                                                      initDone),
> +    NS_DISPATCH_NORMAL);
> +
> +  while (!initDone->IsDone()) {
> +    NS_ProcessNextEvent(NS_GetCurrentThread(), true);

Make it match GMPAudioDecoder.cpp

::: dom/media/fmp4/gmp/GMPVideoDecoder.h
@@ +121,5 @@
> +  private:
> +    bool mInitDone;
> +    nsCOMPtr<nsIThread> mThread;
> +  };
> +  void GetGMPAPI(GMPInitDoneRunnable* aInitDone);

same comments as for Audio

::: dom/media/gmp/GMPService.cpp
@@ +279,5 @@
> +
> +  void Done(GMPParent* aGMPParent)
> +  {
> +    GMPVideoDecoderParent* gmpVDP;
> +    GMPVideoDecoderProxy* videoHost;

same comments as jwwang for initialization, instead of setting to null on failure.  This isn't a white-hot perf path

@@ +334,5 @@
> +  }
> +
> +  void Done(GMPParent* aGMPParent)
> +  {
> +    GMPVideoEncoderParent* gmpVEP;

And again

@@ +342,5 @@
> +    } else {
> +      gmpVEP = nullptr;
> +      videoHost = nullptr;
> +    }
> +    mCallback->Done(static_cast<GMPVideoEncoderProxy*>(gmpVEP), videoHost);

is the cast needed?

@@ +388,5 @@
> +  }
> +
> +  void Done(GMPParent* aGMPParent)
> +  {
> +    GMPDecryptorParent* ksp;

and again

@@ +393,5 @@
> +    nsresult rv = aGMPParent->GetGMPDecryptor(&ksp);
> +    if (NS_FAILED(rv)) {
> +      ksp = nullptr;
> +    }
> +    mCallback->Done(static_cast<GMPDecryptorProxy*>(ksp));

is the cast needed?

@@ +397,5 @@
> +    mCallback->Done(static_cast<GMPDecryptorProxy*>(ksp));
> +  }
> +
> +private:
> +  nsAutoPtr<GetGMPDecryptorCallback> mCallback;

UniquePtr (and for the other nsAutoPtr's), unless we just want to avoid having a mix used in the file

::: dom/media/gtest/TestGMPCrossOrigin.cpp
@@ +97,5 @@
> +  virtual void Done(T* aGMP, GMPVideoHost* aHost)
> +  {
> +    EXPECT_TRUE(aGMP);
> +    EXPECT_TRUE(aHost);
> +    if (aGMP) aGMP->Close();

{}'s, newlines

@@ +621,5 @@
> +    }
> +
> +    NS_IMETHOD Run()
> +    {
> +      for (uint32_t length = mUpdates.Length(), i = 0; i < length; ++i) {

size_t (or auto if it works), or use the new hotness:
for (auto& update : mUpdates) {
  mRunner->Update(update);
}

::: media/webrtc/signaling/src/media-conduit/WebrtcGmpVideoCodec.cpp
@@ +259,5 @@
> +    nsCOMPtr<nsIRunnable> task(
> +      WrapRunnable(this,
> +                   &WebrtcGmpVideoEncoder::RegetEncoderForResolutionChange,
> +                   &aInputImage,
> +                   initDone));

Does this design allow for codecs that don't need to be re-initialized for a resolution change?  For example, in theory VP8 only needs to be reconfigured, or can pick up on the input size change.  (There's a bug in the encoder that stops us from using this right now.)

@@ +265,5 @@
> +
> +    nsCOMPtr<nsIThread> currentThread(do_GetCurrentThread());
> +    while (!initDone->IsDone()) {
> +      NS_ProcessNextEvent(currentThread, true);
> +    }

This pattern is being repeated a bunch of times... a) why does using DISPATCH_SYNC not work for this? b) if DISPATCH_SYNC doesn't work, please document that in comments, and can this pattern be collapsed into a class to avoid the repeated while (!done) ProcessNextEvent()'s all over the place?

Even having a 
InitDoneRunnable::Dispatch(thread, task) {
  thread->Dispatch(task, NS_DISPATCH_NORMAL); 
  while(!IsDone()) {
    ProcessNextEvent(...);
  }
} (roughly)
would help

@@ +297,5 @@
> +                                            aInputImage->height()));
> +
> +  // OpenH264 codec (at least) can't handle dynamic input resolution changes
> +  // re-init the plugin when the resolution changes
> +  // XXX allow codec to indicate it doesn't need re-init!

Right; we should at least create a dummy function that currently always returns false with a signature something like GMPVideoEncoder->ReconfigureSize(width, height), and file a bug (referenced in the dummy function) for updating the GMP API to the codec to add ReconfigureSize (or a more generic Reconfigure()).

Since you're touching this anyways... :-)
Attachment #8571838 - Flags: review?(rjesup) → review+
Comment on attachment 8571842 [details] [diff] [review]
Part 5 - use bridging for GMP in e10s.

Review of attachment 8571842 [details] [diff] [review]:
-----------------------------------------------------------------

Given the time pressure on this, I don't think it makes sense to make PFoo::Open be sync in this bug. We can handle that in a separate bug.

::: dom/media/gmp/GMPServiceChild.cpp
@@ +257,5 @@
> +public:
> +  NS_IMETHOD Run()
> +  {
> +    MOZ_ASSERT(NS_IsMainThread());
> +    mParent = new GMPContentParent();

Is there still a reason for this to be allocated on the main thread?
Attachment #8571842 - Flags: review?(wmccloskey) → review+
(In reply to JW Wang [:jwwang] from comment #44)
> Is this patch just for saving some duplicated code?

No, it's to avoid multiple rpc calls.

(In reply to JW Wang [:jwwang] from comment #45)

> It should be |GMPVideoHost* videoHost|.

> nsresult rv = aGMPParent->GetGMPVideoEncoder(&gmpVEP);

> Need to declare videoHost.

Ah, yes, these changes accidently got merged into the wrong patch. I've moved them and made sure patches still compile individually.

(In reply to JW Wang [:jwwang] from comment #46)
> >  void
> > +CDMProxy::gmp_InitDone(GMPDecryptorProxy* aCDM, nsAutoPtr<InitData> aData)
> 
> Is it possible for this function to race with gmp_Shutdown()?

Right. I've added a flag that Shutdown was called and bail out at the start of gmp_InitDone if it's set.

> Does this
> function guarantee to run in the GMP thread?

Yes, since it's called from a getter callback and the getter callbacks all run in the GMP thread (that is explicitly document in the .idl).

> It is tricky to the client code to know when to call init() or GMPInit().
> Can we have only one init function?

Turns out that we actually can (couldn't at some point in the past). Merged them again.

> It is a good practice to always initialize the variable to avoid compiler
> warnings.

Quelling compiler warnings seems like the wrong reason to do this :-). But I'll init it.

> Can we use the macro NS_DECL_MOZIGECKOMEDIAPLUGINSERVICE instead of
> declaring the functions one by one here?

No, since we don't implement everything in NS_DECL_MOZIGECKOMEDIAPLUGINSERVICE here.

> > +  GMPGetterCallback() { MOZ_COUNT_CTOR(GMPGetterCallback<T>); }
> 
> Macro is expanded before template instantiation. I wonder if this can work
> correctly.

Correctly in what sense? The class will just literally show up as "GMPGetterCallback<T>" in the logs. Seems correct to me.

> You can use NS_NewRunnableMethod to save the class SetFinishedRunnable.

I don't see how I could. Though I guess I could use NS_NewNonOwningRunnableMethod.

> Polymorphism is preferred over value based switch.

Changed it, though it just looks like more code for no obvious benefit to me.

> Do we need Move(callback)?

It didn't really matter, but since it now takes a rvalue reference it does now.

(In reply to JW Wang [:jwwang] from comment #47)

> > +  nsTArray<GMPContentChild*> mGMPContentChildren;
> 
> Since a GMPParent has only one GMPContentParent, shouldn't a GMPChild have a
> GMPContentChild only?

No, there can be multiple GMPContentChild objects (since there can be multiple content processes).

> Why do we put aChild->Open() in the GMP thread instead of running it here?

We could run it directly, changed.

> Does AllocPGMPContentParent happen on the same thread where
> PGMPContent::Open() is called?

Yes.

> 
> ::: dom/media/gmp/PGMPVideoEncoder.ipdl
> @@ +36,5 @@
> >    async __delete__();
> >    async Encoded(GMPVideoEncodedFrameData aEncodedFrame,
> >                  uint8_t[] aCodecSpecificInfo);
> >    async Error(GMPErr aErr);
> > +  sync Shutdown();
> 
> Why do we need sync Shutdown() when we already have __delete__()?

Shutdown does very different things than __delete__, it sends DecodingComplete which is what we need iirc.

(In reply to JW Wang [:jwwang] from comment #48)

> It concerns me to see GMPContentParent references GMPParent. Since we have
> PGMPService, can we have it handle the discrepancy of chrome/content code
> and eliminate the need for GeckoMediaPluginService{Child,Parent}?

I don't understand what you're suggesting here. We did discuss some of this on irc so this might be a misunderstanding. GMPContentParent only has access to the GMPParent in non-e10s mode or from chrome in e10s mode. Content in e10s can't access the GMPParent, since it lives in a different process.

> 
> ::: dom/media/gmp/PGMPService.ipdl
> @@ +16,5 @@
> > +
> > +parent:
> > +  sync LoadGMP(nsCString nodeId, nsCString api, nsCString[] tags)
> > +    returns (ProcessId id);
> > +  sync BridgeGMP(ProcessId id);
> 
> Can we conbine LoadGMP and BridgeGMP into a single function? By doing so, we
> don't need the return value from LoadGMP and it can be an async function.

The chrome process knows whether a GMP process for a specific nodeId/API/tags combination has been launched. The content process knows whether the bridging has happened with a specific GMP process or not. We'd either have to start storing state in the chrome process about whether bridging has happened (for every combination of content <-> GMP processes) or duplicate the state that currently lives in the chrome process about which GMP process to use for what nodeId/API/tags in every content process. Otherwise I think we can't do without the two steps of asking for the right GMP process and then bridging if still needed.

(In reply to Randell Jesup [:jesup] from comment #71)

> Some things jwwang flagged implied this code couldn't have been successfully
> compiled, which makes me wary of r+'ing it.

Yeah, I accidentally merged part of some changes into the wrong patch in the stack, the whole stack always compiled. I fixed this up now.

> Does this design allow for codecs that don't need to be re-initialized for a
> resolution change?  For example, in theory VP8 only needs to be
> reconfigured, or can pick up on the input size change.  (There's a bug in
> the encoder that stops us from using this right now.)

It looks to me like I didn't really change anything wrt this issue compared to the current code. If that's the case then I'm going to ignore this, it's out of scope for this bug.

> This pattern is being repeated a bunch of times... a) why does using
> DISPATCH_SYNC not work for this? b) if DISPATCH_SYNC doesn't work, please
> document that in comments, and can this pattern be collapsed into a class to
> avoid the repeated while (!done) ProcessNextEvent()'s all over the place?

I'm not sure how DISPATCH_SYNC would work for this, can you explain? I'm calling an async API and looping until it posts an event that it's done. DISPATCH_SYNC posts an event and loops until the event has been processed. AFAICT those are quite different operations.

> Even having a 
> InitDoneRunnable::Dispatch(thread, task) {
>   thread->Dispatch(task, NS_DISPATCH_NORMAL); 
>   while(!IsDone()) {
>     ProcessNextEvent(...);
>   }
> } (roughly)
> would help

Personally I would really dislike hiding the fact that we're running the event loop behind a helper function. It's a potential source of problems, and the place where it happens should be very clear.

> Right; we should at least create a dummy function that currently always
> returns false with a signature something like
> GMPVideoEncoder->ReconfigureSize(width, height), and file a bug (referenced
> in the dummy function) for updating the GMP API to the codec to add
> ReconfigureSize (or a more generic Reconfigure()).

I'm happy to file the bug, but I'm going to expand the scope of this bug, it's dragged on for long enough already.

(In reply to Chris Pearce (:cpearce) from comment #57)

> Doesn't changing "sync" to "async" here cause all messages in this protocol
> to become synchronous?

> Doesn't changing "sync" to "async" here cause all messages in this protocol
> to become synchronous?

No, it just means that there are some messages in the protocol that are sync.

(In reply to Bill McCloskey (:billm) from comment #59)

> Should we just put this in ActorDestroy instead? Then it would also run in
> case the parent process crashed, which might be nice.

I've moved it, but note that most of the time I don't see GMPChild::ActorDestroy being called (which might also explain why we often don't get leak stats for the GMP process).

> Do we leak the GMPContentChild here? I don't think it's getting torn down
> anywhere.

Good catch, fixed. I've also added ctor/dtor counting (though they're useless without leak stats).

(In reply to JW Wang [:jwwang] from comment #47)
> > +  sync SetIsUsed(bool aIsUsed);
> 
> Why is this a sync function?

(In reply to Bill McCloskey (:billm) from comment #59)
> I have the same questions as Chris and JW. Namely, why are SetIsUsed and
> Shutdown sync messages?

It looks like Shutdown can be async, but SetIsUsed not so much. The issue is that the parent shouldn't eagerly shut down the GMP process if we've handed out API objects (GMPDecryptor*, ...). If SetIsUsed is async the parent may very well decide to shut down the GMP process while the GMP process has handed out new API objects, because the parent hasn't processed the latest SetIsUsed message. As soon as I made SetIsUsed async the GMP gtest started failing on try again because of this issue.

I took care of most of the other review comments, and it looks like I finally might have a green tree again.
> It looks like Shutdown can be async, but SetIsUsed not so much. The issue is that the parent
> shouldn't eagerly shut down the GMP process if we've handed out API objects (GMPDecryptor*,
> ...). If SetIsUsed is async the parent may very well decide to shut down the GMP process
> while the GMP process has handed out new API objects, because the parent hasn't processed the
> latest SetIsUsed message. As soon as I made SetIsUsed async the GMP gtest started failing on
> try again because of this issue.

OK, but I'm worried that this is just covering up the failure rather than fixing it. In this situation:

Content->GMP: Create decryptor or whatever
GMP->chrome: SetIsUsed

The GMP process is blocked but the content and chrome processes are still runnin freely (unless content is spinning an event loop I guess, but we probably shouldn't count on that). What's to prevent them from deciding to shut down the GMP process?

The reason I worry about this is that we've had a lot of problems in the plugins code where sync messages caused deadlocks. Things are really complex when there are three processes involved, and especially when content and chrome can wait on each other in complex ways because of CPOWs. It would make things a lot simpler to reason about if we could simply say that the GMP process never waits on anyone.
Bill and I discussed this over irc and we've come up with a new plan:

1) We'll merge loading and bridging after all, by making the content process send over in the load call the list of GMP processes that it's bridged to already. During that load call the chrome process can synchronously bridge if the content process isn't already bridged to the selected GMP process. That way loading and bridging are one synchronous operation.
2) GMPContentChild will eagerly close itself once it doesn't have any managees (PGMPDecryptor/PGMPAudioDecoder/...) anymore.
3) The chrome process will keep a count of the open PGMPContents, by incrementing a counter during the (synchronous) load/bridge call if we actually bridge, and decrementing it when receiving an asynchronous message sent from the GMP process once a PGMPContent is closed.
4) The chrome process will use the counter to decide whether to shut down a GMP process.
Attachment #8571838 - Attachment is obsolete: true
Attachment #8586949 - Flags: review+
Attachment #8571841 - Attachment is obsolete: true
Attachment #8586953 - Flags: review+
Attachment #8571842 - Attachment is obsolete: true
Attachment #8586956 - Flags: review+
This just turns on tests that were disabled in e10s because they depended on this bug being fixed.
I'm going to try to land this tomorrow.
(In reply to Peter Van der Beken [:peterv] from comment #84)
> https://hg.mozilla.org/integration/mozilla-inbound/rev/ae9f94c0e3d8

This introduced local build bustage for me (clang 3.6, warnings-as-errors build), due to a missing 'override' annotation:
{
GMPServiceParent.h:200:16: error: 'ActorDestroy' overrides a member function but is not marked 'override' [-Werror,-Winconsistent-missing-override]
}

I pushed a followup to add the missing annotation, with the blanket r+ that ehsan granted me for fixes of this sort over in bug 1126447 comment 2:
https://hg.mozilla.org/integration/mozilla-inbound/rev/aece681acd85
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: