Diagnostic crash in [@ mozilla::MediaTrackGraphImpl::UpdateGraph]
Categories
(Core :: WebRTC: Audio/Video, defect, P1)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr115 | --- | unaffected |
firefox123 | --- | unaffected |
firefox124 | + | fixed |
firefox125 | + | fixed |
People
(Reporter: jimm, Assigned: pehrsons)
References
(Regression)
Details
(4 keywords)
Crash Data
Attachments
(6 files)
48 bytes,
text/x-phabricator-request
|
tjr
:
sec-approval+
|
Details | Review |
48 bytes,
text/x-phabricator-request
|
tjr
:
sec-approval+
|
Details | Review |
48 bytes,
text/x-phabricator-request
|
tjr
:
sec-approval+
|
Details | Review |
48 bytes,
text/x-phabricator-request
|
phab-bot
:
approval-mozilla-beta+
|
Details | Review |
48 bytes,
text/x-phabricator-request
|
phab-bot
:
approval-mozilla-beta+
|
Details | Review |
48 bytes,
text/x-phabricator-request
|
phab-bot
:
approval-mozilla-beta+
|
Details | Review |
Crash report: https://crash-stats.mozilla.org/report/index/e8129be1-e9a4-476f-b38f-8173d0240229
MOZ_CRASH Reason: MOZ_DIAGNOSTIC_ASSERT(false) (A non-ended SourceMediaTrack wasn't fed enough data by NotifyPull)
Top 10 frames of crashing thread:
0 XUL mozilla::MediaTrackGraphImpl::UpdateGraph dom/media/MediaTrackGraph.cpp:1364
0 XUL mozilla::MediaTrackGraphImpl::OneIterationImpl dom/media/MediaTrackGraph.cpp:1626
1 XUL mozilla::GraphRunner::Run dom/media/GraphRunner.cpp:141
2 XUL nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1193
3 XUL NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:480
3 XUL mozilla::ipc::MessagePumpForNonMainThreads::Run ipc/glue/MessagePump.cpp:300
4 XUL MessageLoop::RunInternal ipc/chromium/src/base/message_loop.cc:370
4 XUL MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:363
4 XUL MessageLoop::Run ipc/chromium/src/base/message_loop.cc:345
5 XUL nsThread::ThreadFunc xpcom/threads/nsThread.cpp:370
![]() |
Reporter | |
Comment 1•1 year ago
|
||
STR:
I had a set of external headphones plugged into my mac. In Meet however when joining a meeting the service defaulted to the internal speaker and mic for some unknown reason. In Meet settings, I switched mic and speaker to the external, hit the test speaker button, and noticed that the audio came out of the mac speaker. So while still in the meeting, I unplugged the headphones from the audio jack, which triggered this crash.
![]() |
Reporter | |
Updated•1 year ago
|
Updated•1 year ago
|
Comment 2•1 year ago
|
||
Set release status flags based on info from the regressing bug 1878380
Updated•1 year ago
|
Comment 3•1 year ago
•
|
||
:padenot i know :apehrsons is out until next week, do you think you could take a look?
Comment 4•1 year ago
|
||
The bug is marked as tracked for firefox124 (beta). We have limited time to fix this, the soft freeze is in 13 days. However, the bug still isn't assigned.
:jimm, could you please find an assignee for this tracked bug? Given that it is a regression and we know the cause, we could also simply backout the regressor. If you disagree with the tracking decision, please talk with the release managers.
For more information, please visit BugBot documentation.
Comment 5•1 year ago
|
||
Jim, I'm marking this an S3, but please update if you disagree.
Assignee | ||
Comment 6•1 year ago
|
||
Failing this assert is not ideal (it's a canary for a specific kind of audio distortion issues) but should be harmless in release.
Comment 7•1 year ago
|
||
The bug is linked to a topcrash signature, which matches the following criterion:
- Top 10 content process crashes on beta
:pehrsons, could you consider increasing the severity of this top-crash bug?
For more information, please visit BugBot documentation.
Assignee | ||
Comment 8•1 year ago
|
||
I cannot reproduce this crash. But I'll note we only use the type of track that can fail this assertion for fake audio tracks and remote audio tracks (i.e. audio received through RTCPeerConnection). When running meet the only such tracks in play are remote audio tracks.
I'll also note that all crash reports I looked at here have one GraphRunner thread (indicating there is one MediaTrackGraph present) and two MediaTrackGrph threads (indicating there are two SystemClockDrivers driving a MediaTrackGraph). There should never be two drivers active simultaneously for a single graph, and I think these reports indicate that is the case. P1/S2 because of this.
This seems reasonably likely related to the mThreadState == ThreadState::Wait
assertion failure in bug 1882172. Marking this as a sec bug because of that. And I'll note that having two drivers run a single graph could also have security implications, though GraphRunner should mitigate some modes.
Given that there is a low rate of this signature across platforms prior to bug 1878380, it seems like the regression is an increase in the triggering of an existing issue. The root cause of the issue is unclear at this time.
![]() |
Reporter | |
Comment 9•1 year ago
|
||
Trying to reproduce by unplugging headphones under various conditions within a Meet meeting. No luck reproducing so far.
Assignee | ||
Comment 10•1 year ago
|
||
Auditing the code I do see one way, introduced by bug 1878380, of having two system clock drivers drive the same graph, if the graph is on an audio driver with active fallback and switches driver, then before the next driver asynchronously stops the audio driver with fallback, a devicechange event comes in and starts a new fallback driver, disregarding that the fallback state is Stopped
.
Note this mode affects only macOS, the latent issue resulting in this signature would have a different root cause. If it is similar enough maybe we can catch it with an assert, otherwise it will continue to occur.
I'll try to repro this mode in a gtest tomorrow.
Assignee | ||
Comment 11•1 year ago
|
||
Assignee | ||
Comment 12•1 year ago
|
||
Assignee | ||
Comment 13•1 year ago
|
||
Assignee | ||
Comment 14•1 year ago
|
||
Comment on attachment 9389358 [details]
Bug 1882783 - Don't start a fallback driver after being asked to stop. r?padenot!
Security Approval Request
- How easily could an exploit be constructed based on the patch?: Not easily. Patches make it pretty clear what the issue is, but it requires both user interaction (changing default audio output device) and timing, and only applies to mac.
- Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: Yes
- Which branches (beta, release, and/or ESR) are affected by this flaw, and do the release status flags reflect this affected/unaffected state correctly?: beta, yes
- If not all supported branches, which bug introduced the flaw?: Bug 1878380
- Do you have backports for the affected branches?: No
- If not, how different, hard to create, and risky will they be?: trivial
- How likely is this patch to cause regressions; how much testing does it need?: Unlikely to cause further regressions, but it's speculative and we are not sure it fixes the root cause of this bug.
- Is the patch ready to land after security approval is given?: Yes
- Is Android affected?: No
Assignee | ||
Updated•1 year ago
|
Assignee | ||
Comment 15•1 year ago
|
||
Note on the sec-approval: I marked this as a sec bug because of bug 1882172, which is sec-low. That is however based on it being a shutdown assertion race. The case I am fixing here is not shutdown specific, and if it is indeed the root cause of this bug then we have a situation where a graph is driven by more than one graph driver. The security implications given that are hard to overview but I'm pretty sure we could deduce a path leading to a UAF somewhere in the graph, probably when switching driver or shutting it down.
Updated•1 year ago
|
Updated•1 year ago
|
Comment 16•1 year ago
|
||
Comment on attachment 9389358 [details]
Bug 1882783 - Don't start a fallback driver after being asked to stop. r?padenot!
Based on the limited blast radius (mac) and high barrier for triggering (unusual user interaction) I'm find with this landing with the test and not making it into 124 if relman is not confident in its stability and would like more bake time.
Assignee | ||
Comment 17•1 year ago
|
||
Original Revision: https://phabricator.services.mozilla.com/D203591
Updated•1 year ago
|
Assignee | ||
Comment 18•1 year ago
|
||
Original Revision: https://phabricator.services.mozilla.com/D203592
Updated•1 year ago
|
Assignee | ||
Comment 19•1 year ago
|
||
Original Revision: https://phabricator.services.mozilla.com/D203593
Updated•1 year ago
|
Comment 20•1 year ago
|
||
Uplift Approval Request
- String changes made/needed: None
- Is Android affected?: no
- Fix verified in Nightly: no
- Steps to reproduce for manual QE testing: (haven't been able to reproduce on demand)
- User impact if declined: We suspect there could be a sec issue in the non-diagnostic-assert variant of this bug
- Code covered by automated testing: yes
- Risk associated with taking this patch: Low
- Needs manual QE test: no
- Explanation of risk level: This patch itself is low risk, bug 1878380 I deemed low-medium risk and I'd say there's still some risk whether any regression remains with this fix.
Comment 21•1 year ago
|
||
Comment 22•1 year ago
|
||
https://hg.mozilla.org/mozilla-central/rev/d261da754cf2
https://hg.mozilla.org/mozilla-central/rev/1f0beb4ce26d
https://hg.mozilla.org/mozilla-central/rev/352c6538f922
Updated•1 year ago
|
Updated•1 year ago
|
Updated•1 year ago
|
Updated•1 year ago
|
Comment 23•1 year ago
|
||
uplift |
Updated•1 year ago
|
Updated•9 months ago
|
Updated•5 months ago
|
Description
•