Closed Bug 1162358 Opened 4 years ago Closed 4 years ago

[EME] DecodingComplete can happen during GMPVideoDecoderChild::Alloc is doing intr IPC

Categories

(Core :: Audio/Video, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla40
Tracking Status
firefox39 --- fixed
firefox40 --- fixed

People

(Reporter: cpearce, Assigned: cpearce)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Keywords: crash)

Attachments

(1 file)

Adobe reported that while a GMP is blocked in GMPVideoi420FrameImpl::CreateEmptyFrame() waiting for the intr IPC to return a frame handle, a DecodingComplete message can come in and delete the GMPVideoDecoderChild, and then all hell breaks loose.

Callstack:
  xul.dll!mozilla::gmp::GMPVideoDecoderChild::~GMPVideoDecoderChild() Line 27
C++
  xul.dll!mozilla::gmp::GMPVideoDecoderChild::`scalar deleting destructor'(unsigned int)
C++
  xul.dll!mozilla::layers::LayerTransactionParent::DeallocPLayerParent(mozilla::layers::PLayerParent * actor) Line 61
C++
  xul.dll!mozilla::gmp::PGMPChild::RemoveManagee(int aProtocolId, mozilla::ipc::IProtocol * aListener) Line 413
C++
  xul.dll!mozilla::gmp::PGMPVideoDecoderChild::Send__delete__(mozilla::gmp::PGMPVideoDecoderChild * actor) Line 73
C++
  xul.dll!mozilla::gmp::GMPVideoDecoderChild::RecvDecodingComplete() Line 199
C++
  xul.dll!mozilla::gmp::PGMPVideoDecoderChild::OnMessageReceived(const IPC::Message & msg__) Line 493
C++
  xul.dll!mozilla::gmp::PGMPChild::OnMessageReceived(const IPC::Message & msg__) Line 702
C++
  xul.dll!mozilla::ipc::MessageChannel::DispatchAsyncMessage(const IPC::Message & aMsg) Line 1213
C++
  xul.dll!mozilla::ipc::MessageChannel::DispatchMessageW(const IPC::Message & aMsg) Line 1139
C++
  xul.dll!mozilla::ipc::MessageChannel::Call(IPC::Message * aMsg, IPC::Message * aReply) Line 922
C++
  xul.dll!mozilla::gmp::PGMPVideoDecoderChild::CallNeedShmem(const unsigned int & aFrameBufferSize, mozilla::ipc::Shmem * aMem) Line 263
C++
  xul.dll!mozilla::gmp::GMPVideoDecoderChild::Alloc(unsigned int aSize, mozilla::ipc::SharedMemory::SharedMemoryType aType, mozilla::ipc::Shmem * aMem) Line 53
C++
  xul.dll!mozilla::gmp::GMPSharedMemManager::MgrAllocShmem(mozilla::gmp::GMPSharedMem::GMPMemoryClasses aClass, unsigned int aSize, mozilla::ipc::SharedMemory::SharedMemoryType aType, mozilla::ipc::Shmem * aMem) Line 45 C++
  xul.dll!mozilla::gmp::GMPPlaneImpl::MaybeResize(int aNewSize) Line 89
C++
  xul.dll!mozilla::gmp::GMPPlaneImpl::CreateEmptyPlane(int aAllocatedSize, int aStride, int aPlaneSize) Line 121
C++
  xul.dll!mozilla::gmp::GMPVideoi420FrameImpl::CreateEmptyFrame(int aWidth, int aHeight, int aStride_y, int aStride_u, int aStride_v) Line 148
C++
// [Adobe GMP Code]

I assume the same issue exists with the GMP video encoder API.

Maybe we should defer processing the DecodingComplete message until the intr IPC is has returned?
Assignee: nobody → cpearce
Defer processing GMP DecodingComplete() calls until intr CallNeedShmem()s are finished. This should mean we don't tear down the GMPVideoDecoderChild and friends until after the GMP should have stopped calling it.
Attachment #8602506 - Flags: review?(rjesup)
Comment on attachment 8602506 [details] [diff] [review]
Patch: Defer processing GMP DecodingComplete() calls until intr shmem allocs are finished.

Review of attachment 8602506 [details] [diff] [review]:
-----------------------------------------------------------------

Feel free to convert to r+ with correct locking or well-thought-out analysis in the source why it's safe.  (I strongly think it isn't safe.)

::: dom/media/gmp/GMPVideoDecoderChild.cpp
@@ +189,5 @@
>  GMPVideoDecoderChild::RecvDecodingComplete()
>  {
> +  MOZ_ASSERT(mPlugin->GMPMessageLoop() == MessageLoop::current());
> +
> +  if (mNeedShmemIntrCount) {

Under the very likely assumption this is on a different thread than ::Alloc(), this needs locking or some form of Atomic to avoid TSAN/etc races.  Do that and r+
Attachment #8602506 - Flags: review?(rjesup) → review-
Comment on attachment 8602506 [details] [diff] [review]
Patch: Defer processing GMP DecodingComplete() calls until intr shmem allocs are finished.

Review of attachment 8602506 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/gmp/GMPVideoDecoderChild.cpp
@@ +189,5 @@
>  GMPVideoDecoderChild::RecvDecodingComplete()
>  {
> +  MOZ_ASSERT(mPlugin->GMPMessageLoop() == MessageLoop::current());
> +
> +  if (mNeedShmemIntrCount) {

I assert that this and Alloc() happen on the same thread, and since Alloc() does IPC, it must be on the GMPMessageLoop thread... So I don't see how RecvDecodingComplete() and Alloc() can happen on different threads... Right?

Or do you think we need something in the code to fail in these GMPMessageLoop-thread-only functions in release builds to ensure third party plugins that won't be tested (much) in debug builds?
Comment on attachment 8602506 [details] [diff] [review]
Patch: Defer processing GMP DecodingComplete() calls until intr shmem allocs are finished.

Review of attachment 8602506 [details] [diff] [review]:
-----------------------------------------------------------------

Aha. Was tired... if they're on the same thread, no problem, and your comment makes sense.
Attachment #8602506 - Flags: review- → review+
Keywords: crash
Priority: -- → P1
https://hg.mozilla.org/mozilla-central/rev/b0ebc8d237f0
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla40
Comment on attachment 8602506 [details] [diff] [review]
Patch: Defer processing GMP DecodingComplete() calls until intr shmem allocs are finished.

Requesting approval for uplift to 39. I've set the approval-beta? flag in case this ends up not getting approved until after the m-c -> m-a merge next week.

Approval Request Comment
[Feature/regressing bug #]: EME
[User impact if declined]: Potentially the Adobe EME plugin can crash when shutting down depending on the ordering of event durin shutdown. This patch prevents that.
[Describe test coverage new/current, TreeHerder]: We have plenty of EME mochitests, but none to cover this particular edge case, as it's quite hard to reproduce.
[Risks and why]: Low; this enforces the order of events during EME plugin shutdown, so should make it more reliable, not less.
[String/UUID change made/needed]: None.
Attachment #8602506 - Flags: approval-mozilla-beta?
Attachment #8602506 - Flags: approval-mozilla-aurora?
Sounds good Chris. How can we verify this? Did we ever get steps to reproduce it from Adobe?
Flags: needinfo?(cpearce)
Comment on attachment 8602506 [details] [diff] [review]
Patch: Defer processing GMP DecodingComplete() calls until intr shmem allocs are finished.

Approved for uplift to beta (39) since this missed uplift at the end of the 39 aurora cycle.
Attachment #8602506 - Flags: approval-mozilla-beta?
Attachment #8602506 - Flags: approval-mozilla-beta+
Attachment #8602506 - Flags: approval-mozilla-aurora?
(In reply to Liz Henry (:lizzard) from comment #8)
> Sounds good Chris. How can we verify this? Did we ever get steps to
> reproduce it from Adobe?

We did, but I was unable to reproduce the crash with their STR. I will ask them to verify.
Flags: needinfo?(cpearce)
Possibly this crash report is also fixed by the patch here:
https://crash-stats.mozilla.com/report/index/d342a08f-ec18-4465-bbdc-3c4202150509
Depends on: 1164925
Duplicate of this bug: 1164740
You need to log in before you can comment on or make changes to this bug.