Closed
Bug 1402584
Opened 7 years ago
Closed 7 years ago
Crash in mozilla::MediaDecoderStateMachine::DecodeMetadataState::HandleSeek
Categories
(Core :: Audio/Video: Playback, defect, P2)
Tracking
()
RESOLVED
FIXED
mozilla58
Tracking | Status | |
---|---|---|
firefox-esr52 | --- | unaffected |
firefox55 | --- | unaffected |
firefox56 | --- | unaffected |
firefox57 | --- | fixed |
firefox58 | --- | fixed |
People
(Reporter: philipp, Assigned: jwwang)
References
Details
(Keywords: crash, regression)
Crash Data
Attachments
(4 files, 2 obsolete files)
59 bytes,
text/x-review-board-request
|
Details | |
59 bytes,
text/x-review-board-request
|
mozbugz
:
review+
|
Details |
59 bytes,
text/x-review-board-request
|
mozbugz
:
review+
|
Details |
1.33 KB,
patch
|
jwwang
:
review+
ritu
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is
report bp-2d4e43f5-a7fa-4905-9113-4b0dd0170922.
=============================================================
Crashing Thread (71), Name: MediaPlayback #9
Frame Module Signature Source
0 xul.dll mozilla::MediaDecoderStateMachine::DecodeMetadataState::HandleSeek(mozilla::SeekTarget) dom/media/MediaDecoderStateMachine.cpp:352
1 xul.dll mozilla::MediaDecoderStateMachine::Seek(mozilla::SeekTarget const&) dom/media/MediaDecoderStateMachine.cpp:3082
2 xul.dll mozilla::detail::ProxyRunnable<mozilla::MozPromise<bool, bool, 1>, RefPtr<mozilla::MozPromise<bool, bool, 1> > ( mozilla::MediaDecoderStateMachine::*)(mozilla::SeekTarget const&), mozilla::MediaDecoderStateMachine, StoreCopyPassByRRef<mozilla::SeekTarget> >::Run() xpcom/threads/MozPromise.h:1395
3 xul.dll mozilla::AutoTaskDispatcher::TaskGroupRunnable::Run() xpcom/threads/TaskDispatcher.h:209
4 xul.dll mozilla::TaskQueue::Runner::Run() xpcom/threads/TaskQueue.cpp:246
5 xul.dll nsThreadPool::Run() xpcom/threads/nsThreadPool.cpp:226
6 xul.dll nsThread::ProcessNextEvent(bool, bool*) xpcom/threads/nsThread.cpp:1039
7 xul.dll NS_ProcessNextEvent(nsIThread*, bool) xpcom/threads/nsThreadUtils.cpp:521
8 xul.dll mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) ipc/glue/MessagePump.cpp:338
9 xul.dll MessageLoop::RunHandler() ipc/chromium/src/base/message_loop.cc:319
10 xul.dll MessageLoop::Run() ipc/chromium/src/base/message_loop.cc:299
11 xul.dll nsThread::ThreadFunc(void*) xpcom/threads/nsThread.cpp:427
12 nss3.dll PR_NativeRunThread nsprpub/pr/src/threads/combined/pruthr.c:397
13 nss3.dll pr_root nsprpub/pr/src/md/windows/w95thred.c:95
14 ucrtbase.dll o__realloc_base
15 kernel32.dll BaseThreadInitThunk
16 ntdll.dll RtlUserThreadStart
this crash signature for content process crashes is newly showing up since 57.0a1. so far they are only from windows machines and all come with 'MOZ_RELEASE_ASSERT(false) (Can't seek while decoding metadata.)' that got added in bug 1308603.
Updated•7 years ago
|
Priority: -- → P2
Comment 2•7 years ago
|
||
Hmm......, the crash report does not talk much, and the only thing we know is that a seek operation is invoked before Metadata is read.
It's HTMLMediaElement::Seek()'s responsibility to filter out seek requests that are not valid, and it's this check which rejects seek request before Metadata loaded:
http://searchfox.org/mozilla-central/rev/f54c1723befe6bcc7229f005217d5c681128fcad/dom/html/HTMLMediaElement.cpp#2765-2769
For non-MSE cases, MediaDecoder::GetSeekable() calls MediaDecoder::GetDuration() to check if the duration is known or not, and the duration "should" ever be set only after Metadata is loaded. However, bug 1385699 refactored how MDSM calculating duration and I think it's the patch 2 (https://reviewboard.mozilla.org/r/164224/diff/1#index_header) might have something to do with this bug.
Bug 1385699 patch 2 removed MDSM::mObservedDuration which was assgined in two places: (1) MDSM::UpdatePlaybackPositionInternal() and MDSM::BufferedRangeUpdated(). MDSM::UpdatePlaybackPositionInternal() is only ever be called after Metadata loaded, but that's not the case for MDSM::BufferedRangeUpdated().
In the old mechanism, setting MDSM::mObservedDuration also invokes MDSM::RecomputeDuration() which filters out the case that MDSM::mObservedDuration is set before Metadata is loaded. However, the criterion is not kept in the new mechanism.
http://searchfox.org/mozilla-central/rev/b2911ee198ac26f9ce7d6fef8cd42f5dbd022a21/dom/media/MediaDecoderStateMachine.cpp#3107-3111
Let's have a patch for this and see if it works.
Assignee: nobody → kaku
Status: NEW → ASSIGNED
Flags: needinfo?(kaku)
Comment hidden (mozreview-request) |
Comment 4•7 years ago
|
||
Assignee | ||
Comment 5•7 years ago
|
||
(In reply to Tzuhao Kuo [:kaku] (PTO 9/27) from comment #2)
> For non-MSE cases, MediaDecoder::GetSeekable() calls
> MediaDecoder::GetDuration() to check if the duration is known or not, and
> the duration "should" ever be set only after Metadata is loaded. However,
> bug 1385699 refactored how MDSM calculating duration and I think it's the
> patch 2 (https://reviewboard.mozilla.org/r/164224/diff/1#index_header) might
> have something to do with this bug.
I don't think bug 1385699 is responsible.
http://searchfox.org/mozilla-central/rev/20e41d4a61a8f5e34c9cf357304b78b3e9bced8a/dom/html/HTMLMediaElement.cpp#2751-2755
The |mReadyState == nsIDOMHTMLMediaElement::HAVE_NOTHING| test ensures mDecoder->Seek is called only after the metadata is loaded.
I think there is something wrong to HTMLMediaElement::mReadyState which becomes HAVE_METADATA or above before metadata loaded is reported by mDecoder.
Assignee | ||
Comment 6•7 years ago
|
||
mozreview-review |
Comment on attachment 8913088 [details]
Bug 1402584 - only set MDSM::mDuration after metadata is loaded in MDSM::BufferedRangeUpdated();
https://reviewboard.mozilla.org/r/184504/#review190012
Per comment 5, I don't think this is the right fix.
Attachment #8913088 -
Flags: review?(jwwang)
Updated•7 years ago
|
Assignee: kaku → nobody
Status: ASSIGNED → NEW
Assignee | ||
Updated•7 years ago
|
Assignee: nobody → jwwang
Assignee | ||
Comment 8•7 years ago
|
||
I am not sure if I follow. There is no patch to land so far.
Flags: needinfo?(jwwang)
Comment hidden (mozreview-request) |
Assignee | ||
Updated•7 years ago
|
Attachment #8915425 -
Flags: review?(gsquelart)
Comment 10•7 years ago
|
||
mozreview-review |
Comment on attachment 8915425 [details]
Bug 1402584 - add logs for debugging. See comment 5 for the detail.
https://reviewboard.mozilla.org/r/186616/#review191680
LGTM, but please give it a good Try, to make sure you don't obviously add crashes for normal situations
::: dom/html/HTMLMediaElement.cpp:2840
(Diff revision 1)
> + MOZ_CRASH_UNSAFE_PRINTF(
> + "Metadata not loaded! readyState=%d networkState=%d",
> + mReadyState.Ref(),
> + mNetworkState);
These variables are uint16_t, so "%d" is probably incorrect. Please use PRIu16, or cast to int.
Same with the other printf below.
Attachment #8915425 -
Flags: review?(gsquelart) → review+
Comment hidden (mozreview-request) |
Assignee | ||
Comment 12•7 years ago
|
||
Thanks!
Comment 13•7 years ago
|
||
Pushed by jwwang@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/442e8020e2ac
add logs for debugging. See comment 5 for the detail. r=gerald
Assignee | ||
Updated•7 years ago
|
Keywords: leave-open
![]() |
||
Comment 14•7 years ago
|
||
bugherder |
Comment 15•7 years ago
|
||
There are 5 crashes with signature 'mozilla::dom::HTMLMediaElement::ChangeReadyState'.
The crash reasons are 'Metadata not loaded! readyState=4 networkState=2' and 'Metadata not loaded! readyState=2 networkState=2'.
Crash Signature: [@ mozilla::MediaDecoderStateMachine::DecodeMetadataState::HandleSeek] → [@ mozilla::MediaDecoderStateMachine::DecodeMetadataState::HandleSeek]
[@ mozilla::dom::HTMLMediaElement::ChangeReadyState]
![]() |
||
Updated•7 years ago
|
Flags: needinfo?(jwwang)
Assignee | ||
Comment 16•7 years ago
|
||
Per comment 5, the assertion (Nightly and early Beta only) is added for debugging. However, seeing new crash reports, I still have no idea how this could be. I will more assertions to debug the issue.
Flags: needinfo?(jwwang)
Comment hidden (mozreview-request) |
Assignee | ||
Updated•7 years ago
|
Attachment #8915425 -
Attachment is obsolete: true
Assignee | ||
Updated•7 years ago
|
Attachment #8915801 -
Flags: review?(gsquelart)
Assignee | ||
Updated•7 years ago
|
Attachment #8915801 -
Flags: review?(gsquelart)
Assignee | ||
Comment 18•7 years ago
|
||
I think I found the root cause.
1. the media element loads the URI from a source child and starts playback.
2. a decode error is encountered (assume readyState is HAVE_ENOUGH_DATA).
3. the media element continues to load the next source child without resetting the readyState.
4. now readyState is HAVE_ENOUGH_DATA before metadata is loaded for the next source child.
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Updated•7 years ago
|
Attachment #8915801 -
Attachment is obsolete: true
Assignee | ||
Updated•7 years ago
|
Attachment #8916205 -
Flags: review?(gsquelart)
Attachment #8916206 -
Flags: review?(gsquelart)
Comment 21•7 years ago
|
||
mozreview-review |
Comment on attachment 8916205 [details]
Bug 1402584. P1 - Backed out changeset 442e8020e2ac.
https://reviewboard.mozilla.org/r/187446/#review192532
Attachment #8916205 -
Flags: review?(gsquelart) → review+
Comment 22•7 years ago
|
||
mozreview-review |
Comment on attachment 8916206 [details]
Bug 1402584. P2 - reset readyState to HAVE_NOTHING before loading next source child. See comment 18 for the root cause.
https://reviewboard.mozilla.org/r/187448/#review192534
::: dom/html/HTMLMediaElement.cpp:1917
(Diff revision 1)
> if (!mIsLoadingFromSourceChildren || mShuttingDown) {
> return;
> }
>
> + if (mDecoder) {
> + // Reset readyState to HAVE_NOTHING since we're gonna load a new decoder.
"gonna" -> "going to"
Attachment #8916206 -
Flags: review?(gsquelart) → review+
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment 26•7 years ago
|
||
Pushed by jwwang@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e17425d9f940
P1 - Backed out changeset 442e8020e2ac. r=gerald
https://hg.mozilla.org/integration/autoland/rev/e80105b6353d
P2 - reset readyState to HAVE_NOTHING before loading next source child. See comment 18 for the root cause. r=gerald
![]() |
||
Comment 27•7 years ago
|
||
Backed out for frequent mda failures on Windows 10, e.g. in dom/media/test/test_seek-2.html:
https://hg.mozilla.org/integration/autoland/rev/db5526c855d07beb52b3c0f4b5b1ec769bb89fce
https://hg.mozilla.org/integration/autoland/rev/53548fbb1e5fd6ce9f973a5ed9089750dcb81296
Related to the disabled DRM notification bar?
E.g. https://treeherder.mozilla.org/logviewer.html#?job_id=135520104&repo=autoland
15:42:08 INFO - 1061 INFO TEST-PASS | dom/media/test/test_replay_metadata.html | small-shot-mp3.mp4 isEncrypted should be true if we have decryption keys
15:42:08 INFO - 1062 INFO TEST-PASS | dom/media/test/test_replay_metadata.html | No more than 1 onloadeddata event for small-shot-mp3.mp4
15:42:08 INFO - Buffered messages finished
15:42:08 ERROR - 1063 INFO TEST-UNEXPECTED-FAIL | dom/media/test/test_replay_metadata.html | Test timed out!
15:42:08 INFO - onTimeout@dom/media/test/manifest.js:1693:7
15:42:08 INFO - setTimeout handler*MediaTestManager/this.started@dom/media/test/manifest.js:1699:26
15:42:08 INFO - startTest@dom/media/test/test_replay_metadata.html:94:3
15:42:08 INFO - MediaTestManager/this.nextTest@dom/media/test/manifest.js:1748:7
15:42:08 INFO - MediaTestManager/this.finished@dom/media/test/manifest.js:1727:7
15:42:08 INFO - playbackEnded@dom/media/test/test_replay_metadata.html:86:5
15:42:08 INFO - EventListener.handleEvent*startTest@dom/media/test/test_replay_metadata.html:107:3
15:42:08 INFO - MediaTestManager/this.nextTest@dom/media/test/manifest.js:1748:7
15:42:08 INFO - MediaTestManager/this.runTests/<@dom/media/test/manifest.js:1668:7
15:42:08 INFO - 1064 INFO small-shot.mp3-2 timed out!
See also https://treeherder.mozilla.org/logviewer.html#?job_id=135528693&repo=autoland
Flags: needinfo?(jwwang)
Assignee | ||
Comment 28•7 years ago
|
||
Both test_replay_metadata.html and test_playback_rate.html don't run the code added by my patches at all. I don't think my patches are responsible for the timeouts.
Flags: needinfo?(jwwang)
Comment 29•7 years ago
|
||
Pushed by archaeopteryx@coole-files.de:
https://hg.mozilla.org/integration/autoland/rev/f7b9b00dfe17
P1 - Backed out changeset 442e8020e2ac. r=gerald
https://hg.mozilla.org/integration/autoland/rev/a4466933d251
P2 - reset readyState to HAVE_NOTHING before loading next source child. See comment 18 for the root cause. r=gerald on a CLOSED TREE
![]() |
||
Comment 30•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/f7b9b00dfe17
https://hg.mozilla.org/mozilla-central/rev/a4466933d251
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Assignee | ||
Comment 32•7 years ago
|
||
Approval Request Comment
[Feature/Bug causing the regression]:1385699
[User impact if declined]:crash in beta builds, but not release build.
[Is this code covered by automated tests?]:no
[Has the fix been verified in Nightly?]:yes
[Needs manual test from QE? If yes, steps to reproduce]: no
[List of other uplifts needed for the feature/fix]:none
[Is the change risky?]:low
[Why is the change risky/not risky?]:a simple change.
[String changes made/needed]:none
Flags: needinfo?(jwwang)
Attachment #8918155 -
Flags: review+
Attachment #8918155 -
Flags: approval-mozilla-beta?
Comment on attachment 8918155 [details] [diff] [review]
1402584_fix_beta_57.patch
Crash fix, Beta57+
Attachment #8918155 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 34•7 years ago
|
||
bugherder uplift |
Comment 35•7 years ago
|
||
(In reply to JW Wang [:jwwang] [:jw_wang] from comment #32)
> [Needs manual test from QE? If yes, steps to reproduce]: no
Per JW's assessment on manual testing needs, marking this issue as qe-verify-.
Flags: qe-verify-
You need to log in
before you can comment on or make changes to this bug.
Description
•