Crash in [@ mozilla::CheckedInt<T>::isValid] via ReaderProxy::OnAudioDataRequestCompleted() or mozilla::ReaderProxy::RequestVideoData()
Categories
(Core :: Audio/Video: Playback, defect, P3)
Tracking
()
People
(Reporter: cpeterson, Assigned: aosmond)
References
Details
(Keywords: crash)
Crash Data
Attachments
(1 file)
Crash report: https://crash-stats.mozilla.org/report/index/59167ec3-c62a-43fc-b65f-da6740230105
This CheckedInt<T>::isValid
crash signature is pretty generic, but this particular media::TimeUnit::IsPosInf()
stack trace appears to have started in Firefox 106 or 107.
A similar CheckedInt assertion failure was fixed in MediaData::AdjustForStartTime
bug 1540740 (way back in Firefox 68).
Reason: SIGSEGV / SEGV_MAPERR
Top 10 frames of crashing thread:
0 libxul.so mozilla::CheckedInt<long>::isValid const mfbt/CheckedInt.h:571
0 libxul.so mozilla::media::TimeUnit::IsPosInf const dom/media/TimeUnits.h:210
0 libxul.so mozilla::media::TimeUnit::IsInfinite const dom/media/TimeUnits.h:122
0 libxul.so mozilla::media::TimeUnit::operator- const dom/media/TimeUnits.h:163
1 libxul.so mozilla::media::TimeUnit::operator-= dom/media/TimeUnits.h:178
1 libxul.so mozilla::AudioData::AdjustForStartTime dom/media/MediaData.cpp:63
2 libxul.so mozilla::ReaderProxy::OnAudioDataRequestCompleted dom/media/ReaderProxy.cpp:46
3 libxul.so mozilla::MozPromise<RefPtr<mozilla::AudioData>, mozilla::MediaResult, true>::InvokeMethod<mozilla::ReaderProxy, RefPtr<mozilla::MozPromise<RefPtr<mozilla::AudioData>, mozilla::MediaResult, true> > xpcom/threads/MozPromise.h:632
3 libxul.so mozilla::MozPromise<RefPtr<mozilla::AudioData>, mozilla::MediaResult, true>::InvokeCallbackMethod<true, mozilla::ReaderProxy, RefPtr<mozilla::MozPromise<RefPtr<mozilla::AudioData>, mozilla::MediaResult, true> > xpcom/threads/MozPromise.h:648
4 libxul.so mozilla::MozPromise<RefPtr<mozilla::AudioData>, mozilla::MediaResult, true>::ThenValue<mozilla::ReaderProxy*, RefPtr<mozilla::MozPromise<RefPtr<mozilla::AudioData>, mozilla::MediaResult, true> > xpcom/threads/MozPromise.h:717
Reporter | ||
Comment 1•2 years ago
|
||
On closer inspection, this crash is a null pointer dereference, not a CheckedInt assertion failure.
Is aAudio
null here somehow? https://hg.mozilla.org/releases/mozilla-beta/file/0088ccd61ad1d2f6a7cd70acc03d39cd0087a338/dom/media/ReaderProxy.cpp#l46
Updated•2 years ago
|
Comment 2•2 years ago
|
||
The bug is linked to a topcrash signature, which matches the following criterion:
- Top 10 AArch64 and ARM crashes on beta
:jimm, could you consider increasing the severity of this top-crash bug?
For more information, please visit auto_nag documentation.
Updated•2 years ago
|
Updated•2 years ago
|
Reporter | ||
Comment 3•2 years ago
|
||
Some of these crashes have a similar but different crash stack trace through ReaderProxy::RequestVideoData
. Why are these ReaderProxy
crashes so much more common on Android than Windows or macOS?
Crash report: https://crash-stats.mozilla.org/report/index/19afffe1-c813-491d-9878-934ef0230121
Reason: SIGSEGV / SEGV_MAPERR
Top 10 frames of crashing thread:
0 libxul.so mozilla::CheckedInt<long>::isValid const mfbt/CheckedInt.h:571
0 libxul.so mozilla::media::TimeUnit::IsPosInf const dom/media/TimeUnits.h:210
0 libxul.so mozilla::media::TimeUnit::IsInfinite const dom/media/TimeUnits.h:122
0 libxul.so mozilla::media::TimeUnit::operator- const dom/media/TimeUnits.h:163
1 libxul.so mozilla::ReaderProxy::RequestVideoData const dom/media/ReaderProxy.cpp:86
1 libxul.so mozilla::MozPromise<RefPtr<mozilla::VideoData>, mozilla::MediaResult, true>::InvokeMethod<mozilla::ReaderProxy::RequestVideoData xpcom/threads/MozPromise.h:632
1 libxul.so mozilla::MozPromise<RefPtr<mozilla::VideoData>, mozilla::MediaResult, true>::InvokeCallbackMethod<true, mozilla::ReaderProxy::RequestVideoData xpcom/threads/MozPromise.h:648
1 libxul.so mozilla::MozPromise<RefPtr<mozilla::VideoData>, mozilla::MediaResult, true>::ThenValue<mozilla::ReaderProxy::RequestVideoData xpcom/threads/MozPromise.h:848
2 ? @0x0000088b8a2ed56e
3 libxul.so mozilla::AutoTaskDispatcher::TaskGroupRunnable::Run xpcom/threads/TaskDispatcher.h:230
Comment 4•2 years ago
|
||
Based on the topcrash criteria, the crash signature linked to this bug is not a topcrash signature anymore.
For more information, please visit auto_nag documentation.
Assignee | ||
Comment 5•2 years ago
|
||
I'm only going by an educated guess here, but I've been staring at this all day, and I think the only way this could happen is:
Our promise is exclusive (it is):
https://searchfox.org/mozilla-central/rev/504bd0a7ee3110ecb2b31f89b9ce54faf76e228b/dom/media/MediaFormatReader.h#88
So when we resolve it, it does a Move instead of a copy:
https://searchfox.org/mozilla-central/rev/504bd0a7ee3110ecb2b31f89b9ce54faf76e228b/xpcom/threads/MozPromise.h#718
https://searchfox.org/mozilla-central/rev/504bd0a7ee3110ecb2b31f89b9ce54faf76e228b/xpcom/threads/MozPromise.h#169
So that the second Then invocation gets a null pointer. That describes the situation we are in.
There is strong evidence that we are indeed violating the exclusive property of the promise, e.g. in the signature [@ mozilla::MediaFormatReader::RequestAudioData ]:
https://crash-stats.mozilla.org/report/index/0256b29f-ea17-4ccb-ba1f-728e50230325
and in the signature [@ mozilla::MozPromise<T>::ChainTo ]
https://crash-stats.mozilla.org/report/index/3cf83294-817c-4fca-8ec9-51e680230405
Assignee | ||
Updated•2 years ago
|
Assignee | ||
Comment 6•2 years ago
|
||
Maybe we just need to replicate these checks:
https://searchfox.org/mozilla-central/rev/504bd0a7ee3110ecb2b31f89b9ce54faf76e228b/dom/media/ExternalEngineStateMachine.cpp#822
https://searchfox.org/mozilla-central/rev/504bd0a7ee3110ecb2b31f89b9ce54faf76e228b/dom/media/ExternalEngineStateMachine.cpp#759
at the other call sites:
https://searchfox.org/mozilla-central/rev/504bd0a7ee3110ecb2b31f89b9ce54faf76e228b/dom/media/MediaDecoderStateMachine.cpp#1176
https://searchfox.org/mozilla-central/rev/504bd0a7ee3110ecb2b31f89b9ce54faf76e228b/dom/media/MediaDecoderStateMachine.cpp#3913
https://searchfox.org/mozilla-central/rev/504bd0a7ee3110ecb2b31f89b9ce54faf76e228b/dom/media/MediaDecoderStateMachine.cpp#1123
https://searchfox.org/mozilla-central/rev/504bd0a7ee3110ecb2b31f89b9ce54faf76e228b/dom/media/MediaDecoderStateMachine.cpp#3851
Assignee | ||
Comment 7•2 years ago
|
||
Crash analysis shows that we are crashing with a null pointer resolving
the promises from MediaFormatReader::RequestAudio/VideoData because
there was already an outstanding request when we issued a second. This
violates the promise invariant where we expected there would only be one
chained promise per request. Since we used the IsExclusive property of
MozPromise, this meant we moved the pointer to the runnable when
dispatching to the owner thread/task queue, which meant the second
promise would get a null pointer and trigger the crash observed in
MediaData::AdjustForStartTime.
We already check for outstanding requests in ExternalEngineStateMachine's
OnRequestVideo and OnRequestAudio and ignore it if there is a request
still pending. We assert in MediaDecoderStateMachine's RequestAudioData
and RequestVideoData but would let multiple requests pass through in
non-debug builds. We don't even assert in
MediaDecoderStateMachine::LoopingDecodingState's variants.
This patch hopes to solve the crashes by duplicating the checks in
ExternalEngineStateMachine in the other callers.
Comment 8•2 years ago
|
||
Sorry for removing the keyword earlier but there is a recent change in the ranking, so the bug is again linked to a topcrash signature, which matches the following criteria:
- Top 10 desktop browser crashes on nightly
- Top 10 AArch64 and ARM crashes on nightly
For more information, please visit BugBot documentation.
Updated•2 years ago
|
Updated•1 years ago
|
Comment 9•1 years ago
|
||
The bug is marked as blocking firefox115 (beta). However, the bug still has low priority and has low severity.
:jimm, could you please increase the priority and increase the severity for this tracked bug? If you disagree with the tracking decision, please talk with the release managers.
For more information, please visit BugBot documentation.
Comment 10•1 years ago
|
||
Spike in 115 caused by 1817997
Comment 11•1 years ago
|
||
We're handling the TimeUnit::IsValid
crashes in bug 1838256, this is unrelated.
Comment 13•1 years ago
|
||
Based on the topcrash criteria, the crash signatures linked to this bug are not in the topcrash signatures anymore.
For more information, please visit BugBot documentation.
Updated•1 year ago
|
Description
•