Closed Bug 1035653 Opened 10 years ago Closed 10 years ago

Fatal assertion creating GMPParent on GMP thread since bug 1028383 landed.

Categories

(Core :: Audio/Video, defect)

x86_64
All
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla33

People

(Reporter: cpearce, Assigned: cpearce)

References

(Blocks 2 open bugs)

Details

Attachments

(3 files)

Bug 1028383 merged a few hours ago, and now Gecko asserts when trying to create the GMPParent on a non-main thread.

This is unfortunate, as the GMP code is actually designed to *only* create GMP related objects on the GMP thread...

Based on the comments in bug 1028383 it's only safe to create GMPParent on the main thread.

So now Gecko Media Plugins (EME and OpenH264) don't work. :(

Assertion failure: NS_IsMainThread(), at c:\Users\Bob\src\mozilla\purple\objdir\dist\include\mozilla/ipc/ProtocolUtils.h:191
0[b0f140]: 16dcd680 Queuing event progress
0[b0f140]: 16dcd680 Dispatching event progress
0[b0f140]: 16dcd680 Queuing event progress
mozilla::gmp::PGMPParent::PGMPParent+0x00000027 [xul +0x00000000009732A7] (c:\users\bob\src\mozilla\purple\objdir\ipc\ipdl\pgmpparent.cpp, line 69)
0[b0f140]: 16dcd680 Dispatching event progress
mozilla::gmp::GMPParent::GMPParent+0x0000000F [xul +0x0000000001D9877F] (c:\users\bob\src\mozilla\purple\content\media\gmp\gmpparent.cpp, line 25)
mozilla::gmp::GeckoMediaPluginService::ProcessPossiblePlugin+0x00000245 [xul +0x0000000001DA05A5] (c:\users\bob\src\mozilla\purple\content\media\gmp\gmpservice.
cpp, line 552)
mozilla::gmp::GeckoMediaPluginService::RefreshPluginList+0x000001AF [xul +0x0000000001DA08DF] (c:\users\bob\src\mozilla\purple\content\media\gmp\gmpservice.cpp,
 line 590)
mozilla::gmp::GeckoMediaPluginService::SelectPluginForAPI+0x0000008D [xul +0x0000000001D9F77D] (c:\users\bob\src\mozilla\purple\content\media\gmp\gmpservice.cpp
, line 364)
mozilla::gmp::GeckoMediaPluginService::GetGMPDecryptor+0x000000FD [xul +0x0000000001D9F3FD] (c:\users\bob\src\mozilla\purple\content\media\gmp\gmpservice.cpp, l
ine 312)
mozilla::CDMProxy::gmp_Init+0x0000010D [xul +0x0000000002A1E58D] (c:\users\bob\src\mozilla\purple\content\media\eme\cdmproxy.cpp, line 83)
nsRunnableMethodImpl<void (__thiscall mozilla::CDMProxy::*)(unsigned int),unsigned int,1>::Run+0x00000021 [xul +0x0000000002A281C1] (c:\users\bob\src\mozilla\pu
rple\objdir\dist\include\nsthreadutils.h, line 368)
nsThread::ProcessNextEvent+0x0000038F [xul +0x00000000002BC42F] (c:\users\bob\src\mozilla\purple\xpcom\threads\nsthread.cpp, line 766)
NS_ProcessNextEvent+0x00000058 [xul +0x00000000001E0978] (c:\users\bob\src\mozilla\purple\xpcom\glue\nsthreadutils.cpp, line 256)
mozilla::ipc::MessagePumpForNonMainThreads::Run+0x000001EF [xul +0x00000000007AAEAF] (c:\users\bob\src\mozilla\purple\ipc\glue\messagepump.cpp, line 307)
MessageLoop::RunInternal+0x0000004E [xul +0x000000000072AD1E] (c:\users\bob\src\mozilla\purple\ipc\chromium\src\base\message_loop.cc, line 230)
MessageLoop::RunHandler+0x00000082 [xul +0x000000000072AC42] (c:\users\bob\src\mozilla\purple\ipc\chromium\src\base\message_loop.cc, line 223)
MessageLoop::Run+0x0000001D [xul +0x000000000072AB4D] (c:\users\bob\src\mozilla\purple\ipc\chromium\src\base\message_loop.cc, line 197)
nsThread::ThreadFunc+0x00000118 [xul +0x00000000002BACD8] (c:\users\bob\src\mozilla\purple\xpcom\threads\nsthread.cpp, line 353)
_PR_NativeRunThread+0x000000DB [nss3 +0x00000000001E322B] (c:\users\bob\src\mozilla\purple\nsprpub\pr\src\threads\combined\pruthr.c, line 397)
pr_root+0x00000019 [nss3 +0x00000000001CAE59] (c:\users\bob\src\mozilla\purple\nsprpub\pr\src\md\windows\w95thred.c, line 90)
endthreadex+0x0000003A [MSVCR100 +0x000000000005C556]
endthreadex+0x000000E4 [MSVCR100 +0x000000000005C600]
BaseThreadInitThunk+0x0000000E [KERNEL32 +0x000000000001919F]
RtlInitializeExceptionChain+0x00000084 [ntdll +0x000000000004A8CB]
RtlInitializeExceptionChain+0x0000005A [ntdll +0x000000000004A8A1]
Right, the reason for Bug 1028383 is that apparently since some Nuwa-related changes, top-level actors are put in linked lists (see mOpenActors in ipc/glue/ProtocolUtils.*) without any kind of lock or other mechanism to make that not race when multiple threads try to modify that list. So we had various intermittent crashes caused by this list getting corrupted as multiple threads were accessing it as top-level actors were created and destroyed on multiple threads...
We could probably use some :bent advice here. Should we back out bug 1028383?
Flags: needinfo?(bent.mozilla)
...or put some lock around mOpenActors so we can safely remove that main-thread requirement?
Kyle and/or Blake might know what we should do here, needinfo'ing.
Flags: needinfo?(mrbkap)
Flags: needinfo?(khuey)
(In reply to Benoit Jacob [:bjacob] from comment #3)
> We could probably use some :bent advice here. Should we back out bug 1028383?

No, definitely not. There's no way having a linked list mutating on several threads ends well. We need to figure out how to handle this correctly.

The other top-level actors destined to be bound to a non-main thread are all created on the main thread, right? Can we not make GMPParent do something similar?

Otherwise we could lock around the list I guess...
Flags: needinfo?(bent.mozilla)
We could just do a sync dispatch from the GMP thread to the main thread when we come to create GMPParent. I guess we need to do the same for all *Parent object creation? We don't create these frequently, so it should not adversely affect perf.
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #7)
> Otherwise we could lock around the list I guess...

This seems hard given that the list is a collection of template-added members to a bunch of objects, I guess we could add a version with a lock, but... ugh.

Can we try doing comment 8 first?
Flags: needinfo?(mrbkap)
Simple fix.
Assignee: nobody → cpearce
Status: NEW → ASSIGNED
Attachment #8453499 - Flags: review?(rjesup)
Comment on attachment 8453499 [details] [diff] [review]
Patch: Simple fix; proxy creation of GMPParent to main thread

Review of attachment 8453499 [details] [diff] [review]:
-----------------------------------------------------------------

Drive-by:

::: content/media/gmp/GMPService.cpp
@@ +527,5 @@
>  
> +class CreateGMPParentTask : public nsRunnable {
> +public:
> +  CreateGMPParentTask()
> +    : mMonitor("CreateGMPParentTask")

You don't need a monitor for this, NS_DISPATCH_SYNC'd runnables provide all the synchronization you need already.
Also, don't you have to worry about the destruction also? That will attempt to remove the actor from the linked list...
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #11)
> You don't need a monitor for this, NS_DISPATCH_SYNC'd runnables provide all
> the synchronization you need already.

I had wondered about that, thanks!


(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #12)
> Also, don't you have to worry about the destruction also? That will attempt
> to remove the actor from the linked list...

I changed GMPParent's refcounting to NS_INLINE_DECL_THREADSAFE_REFCOUNTING_WITH_MAIN_THREAD_DESTRUCTION(GMPParent), which should run the destructor on the main thread.
Comment on attachment 8453499 [details] [diff] [review]
Patch: Simple fix; proxy creation of GMPParent to main thread

Review of attachment 8453499 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with the Monitor removed, and if we're ok spinning the event loop

::: content/media/gmp/GMPService.cpp
@@ +543,5 @@
> +    return mParent.forget();
> +  }
> +
> +private:
> +  Monitor mMonitor;

Why do we need a monitor?  This is Dispatched synchronously; there's no cross-thread contention/access.

@@ +574,5 @@
>  
> +  // The GMPParent inherits from IToplevelProtocol, which must be created
> +  // on the main thread to be threadsafe. See Bug 1035653.
> +  nsRefPtr<CreateGMPParentTask> task(new CreateGMPParentTask());
> +  rv = NS_DispatchToMainThread(task, NS_DISPATCH_SYNC);

Do we care if the thread spins the event loop while we're waiting for GMPParent?  (And are are on an nsThread?  I presume so.)  If we don't want to spin the event loop, we can use SyncRunnable::DispatchToThread() (already used in GMPService.cpp)
Attachment #8453499 - Flags: review?(rjesup) → review+
Yes, we're on a nsIThread.

I think not spinning the GMP thread's event loop would be a prudent thing to do. I can imagine Bad Things happening unexpectedly if we don't. I'll update the patch to do that,and to remove the monitor.
Note that you want mozilla::SycnRunnable, not mozilla::gmp::SyncRunnable ;-)
Flags: needinfo?(khuey)
Attached patch Updated patchSplinter Review
Carrying forward r=jesup.
Attachment #8454095 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/17cd99be695f
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: