Crash in [@ mozilla::MediaData::HasValidTime]
Categories
(Core :: Audio/Video, defect, P3)
Tracking
()
People
(Reporter: gsvelto, Assigned: alwu)
References
(Regression)
Details
(Keywords: crash, regression)
Crash Data
Attachments
(3 files)
This bug is for crash report bp-1b12b057-3471-4449-a789-479100191222.
Top 10 frames of crashing thread:
0 XUL mozilla::MediaData::HasValidTime const mfbt/CheckedInt.h:560
1 XUL mozilla::MediaTrackDemuxer::SamplesHolder::AppendSample dom/media/MediaDataDemuxer.h:104
2 XUL mozilla::MediaSourceTrackDemuxer::DoGetSamples dom/media/mediasource/MediaSourceDemuxer.cpp:440
3 XUL mozilla::detail::ProxyRunnable<mozilla::MozPromise<RefPtr<mozilla::MediaTrackDemuxer::SamplesHolder>, mozilla::MediaResult, true>, RefPtr<mozilla::MozPromise<RefPtr<mozilla::MediaTrackDemuxer::SamplesHolder>, mozilla::MediaResult, true> > xpcom/threads/MozPromise.h:1343
4 XUL mozilla::TaskQueue::Runner::Run xpcom/threads/TaskQueue.cpp:207
5 XUL nsThreadPool::Run xpcom/threads/nsThreadPool.cpp:300
6 XUL non-virtual thunk to nsThreadPool::Run xpcom/threads/nsThreadPool.cpp
7 XUL nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1241
8 XUL <name omitted> xpcom/threads/nsThreadUtils.cpp:486
9 XUL mozilla::ipc::MessagePumpForNonMainThreads::Run ipc/glue/MessagePump.cpp:332
This looks like a NULL pointer dereference, low volume but happening on multiple platforms.
Comment 1•6 years ago
|
||
It seems that all crashes here are in MediaSourceTrackDemuxer.
Alastor, any guess? Does HasValidTime() need to be checked before AppendSample in MediaSourceTrackDemuxer is called, like what FlacDemuxer does ?
Updated•6 years ago
|
| Assignee | ||
Comment 2•6 years ago
|
||
This invalid issue is an old issue, which we spent time before but couldn't find any cues. We won't need to call HasValidTime() in every sample call, because we do have a check in [1] to ensure that returned sample time is valid. But don't know why we could still see this kind of invalid time...
Hi, Bryce,
I remember that we have traced a similar bug before, but I couldn't find the bug number. Do you remember that?
| Assignee | ||
Comment 3•6 years ago
|
||
Considering this crash doesn't have high happen rate, I would mark this one as P3.
| Reporter | ||
Comment 4•6 years ago
|
||
Since it's happening only occasionally and it shouldn't be happening at all during regular execution, could this be a narrow race?
Updated•6 years ago
|
| Assignee | ||
Comment 5•6 years ago
|
||
| Assignee | ||
Comment 6•6 years ago
|
||
The sample we got from demuxer are all guaranteed to be valid because they are returned in a SamplesHolder, which has assertion to make sure all samples appended are valid [1].
ProcessFrame() might be a possible place where we incorrectly change sample from valid to invalid, because we modify sample's time and duration there. Therefore, adding an assertion to make sure all samples are still valid.
[1] https://searchfox.org/mozilla-central/source/dom/media/MediaDataDemuxer.h#103-106
Updated•6 years ago
|
| Assignee | ||
Updated•6 years ago
|
Comment 8•6 years ago
|
||
| bugherder | ||
(In reply to Alastor Wu [:alwu] from comment #2)
Hi, Bryce,
I remember that we have traced a similar bug before, but I couldn't find the bug number. Do you remember that?
Is it bug 1545108? It's similar in a "there shouldn't be invalid data at this point" sense.
| Assignee | ||
Comment 10•6 years ago
|
||
Comment on attachment 9118077 [details]
Bug 1605699 - part1 : add an assertion to ensure 'GetSample()' always return a valid sample.
Beta/Release Uplift Approval Request
- User impact if declined: Nothing happen, the purpose of these patches is to add more assertions to help us diagnose the reason of the crash.
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: No
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Adding assertion to ensure the sample time is valid, which should be a correct assumption all the time.
- String changes made/needed:
| Assignee | ||
Updated•6 years ago
|
Comment 11•6 years ago
|
||
Comment on attachment 9118077 [details]
Bug 1605699 - part1 : add an assertion to ensure 'GetSample()' always return a valid sample.
Adds crash diagnostics. Approved for 73.0b6.
Updated•6 years ago
|
Comment 12•6 years ago
|
||
Comment on attachment 9118077 [details]
Bug 1605699 - part1 : add an assertion to ensure 'GetSample()' always return a valid sample.
These actually landed on m-c when it was still tracking 73, so Beta already has these patches.
Updated•6 years ago
|
| Assignee | ||
Comment 13•6 years ago
|
||
Ah, sorry, I didn't notice that. Do you know which beta version starts including these patches?
Thank you.
| Assignee | ||
Comment 15•6 years ago
|
||
Thank you, so apparently the patches I added didn't prevent the crash. This needs more investigation...
Comment 16•5 years ago
|
||
The leave-open keyword is there and there is no activity for 6 months.
:alwu, maybe it's time to close this bug?
Updated•5 years ago
|
Comment 17•5 years ago
|
||
The leave-open keyword is there and there is no activity for 6 months.
:alwu, maybe it's time to close this bug?
Comment 18•4 years ago
|
||
The leave-open keyword is there and there is no activity for 6 months.
:alwu, maybe it's time to close this bug?
| Assignee | ||
Comment 19•4 years ago
|
||
In D58392 and D58393, we already added some assertions to ensure that (1) sample is valid when it got added (2) sample is valid when we call GetSample().
However, the crash still exists and doesn't hit these two places. So another possible situation is that samples become invalid during modification before we return it via GetSample().
Therefore, add more assertion to see my guess is correct or not.
Updated•4 years ago
|
Updated•4 years ago
|
Comment 20•4 years ago
|
||
Comment 21•4 years ago
|
||
| bugherder | ||
Updated•3 years ago
|
Comment 22•3 years ago
|
||
Closing because no crashes reported for 12 weeks.
Comment 23•3 years ago
|
||
Since the bug is closed, the stalled keyword is now meaningless.
For more information, please visit auto_nag documentation.
Updated•3 years ago
|
Description
•