Closed
Bug 1282577
Opened 8 years ago
Closed 8 years ago
GMP crash in mozalloc_abort | NS_DebugBreak | mozilla::gmp::PGMPVideoDecoderChild::Write
Categories
(Core :: Audio/Video: Playback, defect, P1)
Tracking
()
RESOLVED
FIXED
mozilla50
People
(Reporter: cpeterson, Assigned: cpearce)
References
Details
(Keywords: crash)
Crash Data
Attachments
(1 file)
58 bytes,
text/x-review-board-request
|
jesup
:
review+
lizzard
:
approval-mozilla-aurora+
lizzard
:
approval-mozilla-beta+
|
Details |
@ cpearce: this looks like a new crash signature that started climbing last week and only appears to affect Firefox 47 clients. Perhaps this is related to Firefox 47 users running e10s for the first time?
This bug was filed from the Socorro interface and is
report bp-bd9d80f7-0c87-4481-91df-465512160621.
=============================================================
0 mozglue.dll mozalloc_abort(char const* const) memory/mozalloc/mozalloc_abort.cpp:33
1 xul.dll NS_DebugBreak xpcom/base/nsDebugImpl.cpp:434
2 xul.dll mozilla::gmp::PGMPVideoDecoderChild::Write(mozilla::gmp::PGMPVideoDecoderChild*, IPC::Message*, bool) obj-firefox/ipc/ipdl/PGMPVideoDecoderChild.cpp:880
3 xul.dll mozilla::gmp::PGMPVideoDecoderChild::Send__delete__(mozilla::gmp::PGMPVideoDecoderChild*) obj-firefox/ipc/ipdl/PGMPVideoDecoderChild.cpp:52
4 xul.dll mozilla::gmp::GMPVideoDecoderChild::RecvDecodingComplete() dom/media/gmp/GMPVideoDecoderChild.cpp:211
5 xul.dll MessageLoop::DoWork() ipc/chromium/src/base/message_loop.cc:459
6 xul.dll base::MessagePumpDefault::Run(base::MessagePump::Delegate*) ipc/chromium/src/base/message_pump_default.cc:34
7 xul.dll MessageLoop::RunHandler() ipc/chromium/src/base/message_loop.cc:227
8 xul.dll MessageLoop::Run() ipc/chromium/src/base/message_loop.cc:201
9 xul.dll XRE_InitChildProcess toolkit/xre/nsEmbedFunctions.cpp:637
10 plugin-container.exe content_process_main(int, char** const) ipc/contentproc/plugin-container.cpp:237
11 plugin-container.exe wmain toolkit/xre/nsWindowsWMain.cpp:131
12 plugin-container.exe __tmainCRTStartup f:/dd/vctools/crt/crtw32/startup/crt0.c:255
13 kernel32.dll BaseThreadInitThunk
14 ntdll.dll __RtlUserThreadStart
15 ntdll.dll _RtlUserThreadStart
Flags: needinfo?(cpearce)
Assignee | ||
Comment 1•8 years ago
|
||
I built the same revision and looked through the generated IPDL code. This crash appears to be a runtime abort because the PGMPVideoDecoder child actor is receiving a DecodingComplete message after it's already been shutdown.
Updated•8 years ago
|
Priority: -- → P1
Assignee | ||
Updated•8 years ago
|
Assignee: nobody → cpearce
Flags: needinfo?(cpearce)
Assignee | ||
Comment 2•8 years ago
|
||
We must be hitting the #ifndef SHMEM_ALLOC_IN_CHILD block in GMPVideoDecoderChild::Alloc() that causes us to dispatch a task to call GMPVideoDecoderChild::RecvDecodingComplete():
https://hg.mozilla.org/releases/mozilla-release/annotate/7f5abf95991b/dom/media/gmp/GMPVideoDecoderChild.cpp#l229
That was added in bug 1162358.
We must be getting multiple allocs doing intr calls at once. I think that could happen in the WidevineVideoDecoder if a Decode message comes in, and goes into a ReturnOutput(), tries to alloc a frame and has to spin on an intr message response, and another Decode message comes in and does the same, so GMPVideoDecoderChild::mNeedShmemIntrCount will be 2, and then a DecodingComplete comes in, and when the events on the stack in GMPVideoDecoderChild::Alloc() finish they both end up dispatching a task to re-call GMPVideoDecoderChild::RecvDecodingComplete().
intr, or spinning event loops in general for that matter, is evil.
Assignee | ||
Comment 3•8 years ago
|
||
This is a regression from bug 1162358.
We must be hitting the #ifndef SHMEM_ALLOC_IN_CHILD block in
GMPVideoDecoderChild::Alloc() with multiple allocs doing intr calls at once.
If this happens when a DecodingComplete() comes in, we'll end up sending one
task to re-call RecvDecodingComplete for every Alloc() blocked on an intr
response. This would result in us ending up trying to Send__delete__() in
RecvDecodingComplete() twice. Which causes the runtime abort we're seeing
here.
I think that could happen in the WidevineVideoDecoder if a Decode message comes
in, and goes into a ReturnOutput(), tries to alloc a frame and has to spin on
an intr message response, and another Decode message comes in and does the
same, so GMPVideoDecoderChild::mNeedShmemIntrCount will be 2, and then a
DecodingComplete comes in, and when two tasks on the stack in
GMPVideoDecoderChild::Alloc() finish they both end up dispatching a task to
re-call GMPVideoDecoderChild::RecvDecodingComplete(). So we end up trying to
Send__delete__() in RecvDecodingComplete() twice.
I expect the same problem exists in the GMPVideoEncoder too.
intr, or spinning event loops in general for that matter, is evil.
Review commit: https://reviewboard.mozilla.org/r/63442/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/63442/
Attachment #8769617 -
Flags: review?(rjesup)
Updated•8 years ago
|
Attachment #8769617 -
Flags: review?(rjesup) → review+
Comment 4•8 years ago
|
||
Comment on attachment 8769617 [details]
Bug 1282577 - Guard against multiple intr messages causes us to multi-delete GMP actors. .
https://reviewboard.mozilla.org/r/63442/#review60432
Pushed by cpearce@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2bfcec45187a
Guard against multiple intr messages causes us to multi-delete GMP actors. r=jesup.
Assignee | ||
Comment 6•8 years ago
|
||
Comment on attachment 8769617 [details]
Bug 1282577 - Guard against multiple intr messages causes us to multi-delete GMP actors. .
Approval Request Comment
[Feature/regressing bug #]: Widevine EME support
[User impact if declined]: The CDM process will crash, interrupting EME viewing pleasure. Firefox content/main process won't crash, just the CDM process. This will affect multiple large EME video sites, and is a significant cause of errors on those sites.
[Describe test coverage new/current, TreeHerder]: We have lots of EME mochitests.
[Risks and why]: Should be fairly low; the patch is quite simple.
[String/UUID change made/needed]: None.
Attachment #8769617 -
Flags: approval-mozilla-beta?
Attachment #8769617 -
Flags: approval-mozilla-aurora?
Assignee | ||
Comment 7•8 years ago
|
||
Note: I think the patch here also fixes bug 1274963, which is the MacOSX variant of this CDM crash.
Assignee | ||
Comment 8•8 years ago
|
||
Also looks like this fixes bug 1173632 which affects Primetime EME and we've been trying to solve for ages. :)
Blocks: 1173632
Comment 9•8 years ago
|
||
Hi Chris,
From the crash report, I don't see any crashes for 48, do you think we need to uplift this patch to 48 beta?
Flags: needinfo?(cpearce)
Comment 10•8 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
Assignee | ||
Comment 11•8 years ago
|
||
(In reply to Gerry Chang [:gchang] from comment #9)
> Hi Chris,
> From the crash report, I don't see any crashes for 48, do you think we need
> to uplift this patch to 48 beta?
Yes, I think we should uplift to 48 beta.
There are crashes here:
https://crash-stats.mozilla.com/signature/?date=%3E%3D2016-06-01&signature=mozalloc_abort%20%7C%20NS_DebugBreak%20%7C%20mozilla%3A%3Agmp%3A%3APGMPVideoDecoderChild%3A%3AWrite&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_sort=-version&_sort=-date&page=1
This is the signature search with an extra search filter of "date >= 2016-06-01".
I think we're seeing more crashes in 47 and not as many in 48, as Netfix are testing Widevine in 47 and not in Firefox 48. So Widevine isn't being used as much in 48, so it's not failing as much.
I'd guess the other Firefox versions hitting this are mostly Amazon Prime Video users.
Flags: needinfo?(cpearce)
Updated•8 years ago
|
Comment 12•8 years ago
|
||
Comment on attachment 8769617 [details]
Bug 1282577 - Guard against multiple intr messages causes us to multi-delete GMP actors. .
Looks like this will fix some Widevine and other issues. Please uplift to aurora and beta
Attachment #8769617 -
Flags: approval-mozilla-beta?
Attachment #8769617 -
Flags: approval-mozilla-beta+
Attachment #8769617 -
Flags: approval-mozilla-aurora?
Attachment #8769617 -
Flags: approval-mozilla-aurora+
Assignee | ||
Comment 13•8 years ago
|
||
bugherder uplift |
Assignee | ||
Comment 14•8 years ago
|
||
bugherder uplift |
You need to log in
before you can comment on or make changes to this bug.
Description
•