Fatal assertion creating GMPParent on GMP thread since bug 1028383 landed.

RESOLVED FIXED in mozilla33

Status

()

Core
Audio/Video
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: cpearce, Assigned: cpearce)

Tracking

(Blocks: 3 bugs)

unspecified
mozilla33
x86_64
All
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments)

(Assignee)

Description

3 years ago
Bug 1028383 merged a few hours ago, and now Gecko asserts when trying to create the GMPParent on a non-main thread.

This is unfortunate, as the GMP code is actually designed to *only* create GMP related objects on the GMP thread...

Based on the comments in bug 1028383 it's only safe to create GMPParent on the main thread.

So now Gecko Media Plugins (EME and OpenH264) don't work. :(

Assertion failure: NS_IsMainThread(), at c:\Users\Bob\src\mozilla\purple\objdir\dist\include\mozilla/ipc/ProtocolUtils.h:191
0[b0f140]: 16dcd680 Queuing event progress
0[b0f140]: 16dcd680 Dispatching event progress
0[b0f140]: 16dcd680 Queuing event progress
mozilla::gmp::PGMPParent::PGMPParent+0x00000027 [xul +0x00000000009732A7] (c:\users\bob\src\mozilla\purple\objdir\ipc\ipdl\pgmpparent.cpp, line 69)
0[b0f140]: 16dcd680 Dispatching event progress
mozilla::gmp::GMPParent::GMPParent+0x0000000F [xul +0x0000000001D9877F] (c:\users\bob\src\mozilla\purple\content\media\gmp\gmpparent.cpp, line 25)
mozilla::gmp::GeckoMediaPluginService::ProcessPossiblePlugin+0x00000245 [xul +0x0000000001DA05A5] (c:\users\bob\src\mozilla\purple\content\media\gmp\gmpservice.
cpp, line 552)
mozilla::gmp::GeckoMediaPluginService::RefreshPluginList+0x000001AF [xul +0x0000000001DA08DF] (c:\users\bob\src\mozilla\purple\content\media\gmp\gmpservice.cpp,
 line 590)
mozilla::gmp::GeckoMediaPluginService::SelectPluginForAPI+0x0000008D [xul +0x0000000001D9F77D] (c:\users\bob\src\mozilla\purple\content\media\gmp\gmpservice.cpp
, line 364)
mozilla::gmp::GeckoMediaPluginService::GetGMPDecryptor+0x000000FD [xul +0x0000000001D9F3FD] (c:\users\bob\src\mozilla\purple\content\media\gmp\gmpservice.cpp, l
ine 312)
mozilla::CDMProxy::gmp_Init+0x0000010D [xul +0x0000000002A1E58D] (c:\users\bob\src\mozilla\purple\content\media\eme\cdmproxy.cpp, line 83)
nsRunnableMethodImpl<void (__thiscall mozilla::CDMProxy::*)(unsigned int),unsigned int,1>::Run+0x00000021 [xul +0x0000000002A281C1] (c:\users\bob\src\mozilla\pu
rple\objdir\dist\include\nsthreadutils.h, line 368)
nsThread::ProcessNextEvent+0x0000038F [xul +0x00000000002BC42F] (c:\users\bob\src\mozilla\purple\xpcom\threads\nsthread.cpp, line 766)
NS_ProcessNextEvent+0x00000058 [xul +0x00000000001E0978] (c:\users\bob\src\mozilla\purple\xpcom\glue\nsthreadutils.cpp, line 256)
mozilla::ipc::MessagePumpForNonMainThreads::Run+0x000001EF [xul +0x00000000007AAEAF] (c:\users\bob\src\mozilla\purple\ipc\glue\messagepump.cpp, line 307)
MessageLoop::RunInternal+0x0000004E [xul +0x000000000072AD1E] (c:\users\bob\src\mozilla\purple\ipc\chromium\src\base\message_loop.cc, line 230)
MessageLoop::RunHandler+0x00000082 [xul +0x000000000072AC42] (c:\users\bob\src\mozilla\purple\ipc\chromium\src\base\message_loop.cc, line 223)
MessageLoop::Run+0x0000001D [xul +0x000000000072AB4D] (c:\users\bob\src\mozilla\purple\ipc\chromium\src\base\message_loop.cc, line 197)
nsThread::ThreadFunc+0x00000118 [xul +0x00000000002BACD8] (c:\users\bob\src\mozilla\purple\xpcom\threads\nsthread.cpp, line 353)
_PR_NativeRunThread+0x000000DB [nss3 +0x00000000001E322B] (c:\users\bob\src\mozilla\purple\nsprpub\pr\src\threads\combined\pruthr.c, line 397)
pr_root+0x00000019 [nss3 +0x00000000001CAE59] (c:\users\bob\src\mozilla\purple\nsprpub\pr\src\md\windows\w95thred.c, line 90)
endthreadex+0x0000003A [MSVCR100 +0x000000000005C556]
endthreadex+0x000000E4 [MSVCR100 +0x000000000005C600]
BaseThreadInitThunk+0x0000000E [KERNEL32 +0x000000000001919F]
RtlInitializeExceptionChain+0x00000084 [ntdll +0x000000000004A8CB]
RtlInitializeExceptionChain+0x0000005A [ntdll +0x000000000004A8A1]

Updated

3 years ago
Duplicate of this bug: 1035263
Right, the reason for Bug 1028383 is that apparently since some Nuwa-related changes, top-level actors are put in linked lists (see mOpenActors in ipc/glue/ProtocolUtils.*) without any kind of lock or other mechanism to make that not race when multiple threads try to modify that list. So we had various intermittent crashes caused by this list getting corrupted as multiple threads were accessing it as top-level actors were created and destroyed on multiple threads...
We could probably use some :bent advice here. Should we back out bug 1028383?
Flags: needinfo?(bent.mozilla)
...or put some lock around mOpenActors so we can safely remove that main-thread requirement?
Blocks: 948160
Kyle and/or Blake might know what we should do here, needinfo'ing.
Flags: needinfo?(mrbkap)
Flags: needinfo?(khuey)
Duplicate of this bug: 1036010
(In reply to Benoit Jacob [:bjacob] from comment #3)
> We could probably use some :bent advice here. Should we back out bug 1028383?

No, definitely not. There's no way having a linked list mutating on several threads ends well. We need to figure out how to handle this correctly.

The other top-level actors destined to be bound to a non-main thread are all created on the main thread, right? Can we not make GMPParent do something similar?

Otherwise we could lock around the list I guess...
Flags: needinfo?(bent.mozilla)
(Assignee)

Comment 8

3 years ago
We could just do a sync dispatch from the GMP thread to the main thread when we come to create GMPParent. I guess we need to do the same for all *Parent object creation? We don't create these frequently, so it should not adversely affect perf.
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #7)
> Otherwise we could lock around the list I guess...

This seems hard given that the list is a collection of template-added members to a bunch of objects, I guess we could add a version with a lock, but... ugh.

Can we try doing comment 8 first?
Flags: needinfo?(mrbkap)
(Assignee)

Comment 10

3 years ago
Created attachment 8453499 [details] [diff] [review]
Patch: Simple fix; proxy creation of GMPParent to main thread

Simple fix.
Assignee: nobody → cpearce
Status: NEW → ASSIGNED
Attachment #8453499 - Flags: review?(rjesup)
Comment on attachment 8453499 [details] [diff] [review]
Patch: Simple fix; proxy creation of GMPParent to main thread

Review of attachment 8453499 [details] [diff] [review]:
-----------------------------------------------------------------

Drive-by:

::: content/media/gmp/GMPService.cpp
@@ +527,5 @@
>  
> +class CreateGMPParentTask : public nsRunnable {
> +public:
> +  CreateGMPParentTask()
> +    : mMonitor("CreateGMPParentTask")

You don't need a monitor for this, NS_DISPATCH_SYNC'd runnables provide all the synchronization you need already.
Also, don't you have to worry about the destruction also? That will attempt to remove the actor from the linked list...
(Assignee)

Comment 13

3 years ago
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #11)
> You don't need a monitor for this, NS_DISPATCH_SYNC'd runnables provide all
> the synchronization you need already.

I had wondered about that, thanks!


(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #12)
> Also, don't you have to worry about the destruction also? That will attempt
> to remove the actor from the linked list...

I changed GMPParent's refcounting to NS_INLINE_DECL_THREADSAFE_REFCOUNTING_WITH_MAIN_THREAD_DESTRUCTION(GMPParent), which should run the destructor on the main thread.
Comment on attachment 8453499 [details] [diff] [review]
Patch: Simple fix; proxy creation of GMPParent to main thread

Review of attachment 8453499 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with the Monitor removed, and if we're ok spinning the event loop

::: content/media/gmp/GMPService.cpp
@@ +543,5 @@
> +    return mParent.forget();
> +  }
> +
> +private:
> +  Monitor mMonitor;

Why do we need a monitor?  This is Dispatched synchronously; there's no cross-thread contention/access.

@@ +574,5 @@
>  
> +  // The GMPParent inherits from IToplevelProtocol, which must be created
> +  // on the main thread to be threadsafe. See Bug 1035653.
> +  nsRefPtr<CreateGMPParentTask> task(new CreateGMPParentTask());
> +  rv = NS_DispatchToMainThread(task, NS_DISPATCH_SYNC);

Do we care if the thread spins the event loop while we're waiting for GMPParent?  (And are are on an nsThread?  I presume so.)  If we don't want to spin the event loop, we can use SyncRunnable::DispatchToThread() (already used in GMPService.cpp)
Attachment #8453499 - Flags: review?(rjesup) → review+
(Assignee)

Comment 15

3 years ago
Yes, we're on a nsIThread.

I think not spinning the GMP thread's event loop would be a prudent thing to do. I can imagine Bad Things happening unexpectedly if we don't. I'll update the patch to do that,and to remove the monitor.

Comment 16

3 years ago
Note that you want mozilla::SycnRunnable, not mozilla::gmp::SyncRunnable ;-)
Flags: needinfo?(khuey)
(Assignee)

Comment 17

3 years ago
Created attachment 8454095 [details] [diff] [review]
Updated patch

Carrying forward r=jesup.
Attachment #8454095 - Flags: review+
(Assignee)

Comment 18

3 years ago
Created attachment 8454137 [details] [diff] [review]
Patch rebased
(Assignee)

Comment 19

3 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/412b654e5cd2
Something in this push was causing leaks. Backed out.
https://hg.mozilla.org/integration/mozilla-inbound/rev/70f3b01cb736

https://tbpl.mozilla.org/php/getParsedLog.php?id=43566548&tree=Mozilla-Inbound
Blocks: 1036801
https://hg.mozilla.org/integration/mozilla-inbound/rev/17cd99be695f
Target Milestone: --- → mozilla33
https://hg.mozilla.org/mozilla-central/rev/17cd99be695f
Status: ASSIGNED → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.