Closed
Bug 1166107
Opened 9 years ago
Closed 9 years ago
shutdown hang SharedDecoderManager::DrainComplete()/Flush() deadlock
Categories
(Core :: Audio/Video, defect)
Core
Audio/Video
Tracking
()
RESOLVED
FIXED
mozilla41
People
(Reporter: karlt, Assigned: karlt)
References
Details
(Keywords: regression)
Attachments
(3 files, 1 obsolete file)
1.94 KB,
patch
|
mozbugz
:
review+
|
Details | Diff | Splinter Review |
1.85 KB,
patch
|
mozbugz
:
review+
|
Details | Diff | Splinter Review |
2.00 KB,
patch
|
Details | Diff | Splinter Review |
https://crash-stats.mozilla.com/report/index/930399a3-9302-499f-a7b6-deb492150517#allthreads SharedDecoderManager::SetIdle() is holding its monitor while calling Flush(). http://hg.mozilla.org/releases/mozilla-aurora/annotate/b03c5f1a77a1/dom/media/fmp4/SharedDecoderManager.cpp#l173 WMFMediaDataDecoder::Flush() waits to run a task on the same queue as DrainComplete() runs. SharedDecoderManager::DrainComplete() is waiting for the monitor held by SharedDecoderManager::SetIdle(). http://hg.mozilla.org/releases/mozilla-aurora/annotate/b03c5f1a77a1/dom/media/fmp4/SharedDecoderManager.cpp#l196 Holding the lock during Flush() is a regression from http://hg.mozilla.org/releases/mozilla-aurora/rev/380b0a052ef0 I suspect there was an incomplete external Drain before SetIdle() set mWaitForInternalDrain. The wait for the internal drain has picked up the DrainComplete() from the external Drain and the DrainComplete() from the internal Drain runs after the Wait() acquires the monitor again.
Assignee | ||
Comment 1•9 years ago
|
||
[Tracking Requested - why for this release]: Recently introduced cause of video playback failure and shutdown hang.
tracking-firefox40:
--- → ?
Comment 2•9 years ago
|
||
Enabled tracking for 40 and 41 — because what Karl said. :)
tracking-firefox41:
--- → +
Assignee | ||
Comment 3•9 years ago
|
||
The DrainComplete() caught with mWaitForInternalDrain still won't necessarily be from the internal Drain(), but all we need is that one DrainComplete() is caught for the internal Drain() because one more will be generated if there is a Drain() in progress. What protecting mWaitForInternalDrain access with the monitor provides here is that the compiler won't use its address for storage of other data meaningless in the context of mWaitForInternalDrain and so, for example, two DrainComplete() calls won't unintentionally think that they are both for one internal drain. And TSan warnings.
Attachment #8609159 -
Flags: review?(gsquelart)
Assignee | ||
Updated•9 years ago
|
Assignee: nobody → karlt
Status: NEW → ASSIGNED
Assignee | ||
Comment 4•9 years ago
|
||
Attachment #8609160 -
Flags: review?(gsquelart)
Comment on attachment 8609159 [details] [diff] [review] release internal drain monitor before calling Flush() Review of attachment 8609159 [details] [diff] [review]: ----------------------------------------------------------------- r+. Optional: Fix typo if easy enough (not your typo, but we might as well correct it while we're nearby). ::: dom/media/platforms/SharedDecoderManager.cpp @@ +169,1 @@ > // We don't want to hold the lock while calling Drain() has some "has" -> "as"
Attachment #8609159 -
Flags: review?(gsquelart) → review+
Attachment #8609160 -
Flags: review?(gsquelart) → review+
Comment 6•9 years ago
|
||
Comment on attachment 8609159 [details] [diff] [review] release internal drain monitor before calling Flush() Review of attachment 8609159 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/media/platforms/SharedDecoderManager.cpp @@ +173,2 @@ > if (NS_SUCCEEDED(rv)) { > + MonitorAutoLock mon(mMonitor); Is it possible that this monitor might make DrainComplete() unable to acquire the lock and cause a dead lock in next while below?
Comment 7•9 years ago
|
||
Comment on attachment 8609159 [details] [diff] [review] release internal drain monitor before calling Flush() Review of attachment 8609159 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/media/platforms/SharedDecoderManager.cpp @@ +173,2 @@ > if (NS_SUCCEEDED(rv)) { > + MonitorAutoLock mon(mMonitor); wait() releases the lock
Comment 8•9 years ago
|
||
(In reply to Jean-Yves Avenard [:jya] from comment #7) > Comment on attachment 8609159 [details] [diff] [review] > release internal drain monitor before calling Flush() > > Review of attachment 8609159 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: dom/media/platforms/SharedDecoderManager.cpp > @@ +173,2 @@ > > if (NS_SUCCEEDED(rv)) { > > + MonitorAutoLock mon(mMonitor); > > wait() releases the lock Oh,my bad! I read it carelessly...XD
https://hg.mozilla.org/integration/mozilla-inbound/rev/8da84460924f https://hg.mozilla.org/integration/mozilla-inbound/rev/afb28a3157b3
Comment 10•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/9d1051881bfb https://hg.mozilla.org/integration/mozilla-inbound/rev/334877fc5309
Comment 11•9 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/8da84460924f https://hg.mozilla.org/mozilla-central/rev/9d1051881bfb https://hg.mozilla.org/mozilla-central/rev/334877fc5309
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
Assignee | ||
Comment 12•9 years ago
|
||
Approval Request Comment [Feature/regressing bug #]: bug 1161443 [User impact if declined]: video playback failure and shutdown hang [Describe test coverage new/current, TreeHerder]: no new tests; this is a potential cause of intermittent failure in existing tests. [Risks and why]: low risk because the deviation from existing code is very small. [String/UUID change made/needed]: none
Attachment #8611598 -
Flags: approval-mozilla-aurora?
Comment 13•9 years ago
|
||
Comment on attachment 8611598 [details] [diff] [review] 40 branch patch Shutdown hangs are a pain. Taking it.
Attachment #8611598 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 14•9 years ago
|
||
applying 1166107-8611598 failed to synchronize metadata for "dom/media/fmp4/SharedDecoderManager.cpp"
Flags: needinfo?(karlt)
Assignee | ||
Comment 15•9 years ago
|
||
In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #14) > applying 1166107-8611598 > failed to synchronize metadata for "dom/media/fmp4/SharedDecoderManager.cpp" Thanks. I guess something was being fussy about the arguments listed against diff.
Flags: needinfo?(karlt)
Updated•9 years ago
|
Attachment #8611598 -
Attachment is obsolete: true
You need to log in
before you can comment on or make changes to this bug.
Description
•