Closed Bug 1293976 Opened 5 years ago Closed 5 years ago
Crash in mozilla::Media
Stream Graph::Notify Output Data since Firefox 49
This bug was filed from the Socorro interface and is report bp-c3d75891-a31e-49da-ab1c-a8ebe2160809. ============================================================= Crashing Thread (98) Frame Module Signature Source 0 xul.dll mozilla::MediaStreamGraph::NotifyOutputData(float*, unsigned int, int, unsigned int) dom/media/MediaStreamGraph.cpp:1256 1 xul.dll mozilla::AudioCallbackDriver::DataCallback(float const*, float*, long) dom/media/GraphDriver.cpp:911 2 xul.dll mozilla::AudioCallbackDriver::DataCallback_s(cubeb_stream*, void*, void const*, void*, long) dom/media/GraphDriver.cpp:772 3 xul.dll noop_resampler::fill(void*, long*, void*, long) media/libcubeb/src/cubeb_resampler.cpp:54 4 xul.dll `anonymous namespace'::refill media/libcubeb/src/cubeb_wasapi.cpp:529 5 xul.dll `anonymous namespace'::refill_callback_duplex media/libcubeb/src/cubeb_wasapi.cpp:737 6 xul.dll `anonymous namespace'::wasapi_stream_render_loop media/libcubeb/src/cubeb_wasapi.cpp:906 7 ucrtbase.dll _crt_at_quick_exit 8 kernel32.dll BaseThreadInitThunk 9 ntdll.dll __RtlUserThreadStart 10 ntdll.dll _RtlUserThreadStart this crash signature on windows and os x seems to be regressing since firefox 49 builds - on 49.0b1 it's currently making up 0.25% of all browser crashes.
e5e5 signature. Most likely mListeners needs to be locked.
Assignee: nobody → rjesup
Component: WebRTC: Audio/Video → Audio/Video: MediaStreamGraph
Which mListeners? mAudioInputs seems to always be dealt with on the MSG thread.
mAudioInputs (what I meant to say) is only modified on the Graph thread... but it's a refcounted object, and mAudioInputs isn't a RefPtr, so if the caller calls CloseAudioInput (which is async!) and then lets the listener die, mAudioInputs may still have a pointer to it and try to call through it. That's what's happening here I believe. Simple fix: make it an nsTArray<RefPtr<...>>. I also found a missing lock in the AudioInput object (for DeviceChanged()).
(In reply to Randell Jesup [:jesup] from comment #3) > mAudioInputs (what I meant to say) is only modified on the Graph thread... > but it's a refcounted object, and mAudioInputs isn't a RefPtr, so if the > caller calls CloseAudioInput (which is async!) and then lets the listener > die, mAudioInputs may still have a pointer to it and try to call through it. > That's what's happening here I believe. > > Simple fix: make it an nsTArray<RefPtr<...>>. I also found a missing lock > in the AudioInput object (for DeviceChanged()). It's async, but we keep a ref to the listener through the whole asyncness, see . Unless there's a listener implemented without the proper addref forwarding I can't see how this is the problem.  http://searchfox.org/mozilla-central/rev/d83f1528dbcb1e837d99cf5b6c36a90b03bf0e77/dom/media/MediaStreamGraph.cpp#1064
Comment on attachment 8780388 [details] [diff] [review] make mAudioInputs use RefPtrs Review of attachment 8780388 [details] [diff] [review]: ----------------------------------------------------------------- I'd be willing to r+ this, but first I think we should have a grip on what's causing this so we don't just mask it into something else.
Attachment #8780388 - Attachment is obsolete: true
The actual hole is the DispatchToMainThread without holding a ref... We should still hold a ref in the array, to guard against accidental missing the call to CloseAudioInput
Attachment #8780525 - Flags: review?(pehrson) → review+
Comment on attachment 8780525 [details] [diff] [review] make mAudioInputs use RefPtrs Approval Request Comment [Feature/regressing bug #]: full_duplex landings; Bug 1221587. full_duplex is off in 48 except for Linux. It's on for windows and mac in 49, but will be turned off on mac in 49 shortly. [Security approval request comment] How easily could an exploit be constructed based on the patch? Not easily - we're clearly adding a ref to a Dispatch, but it's a very tough hole to provoke. One might be able to flood MainThread with events to slow processing of the Dispatch, and give time for the calling thread to release the underlying object. Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? No. Which older supported branches are affected by this flaw? 46 and later, but only if full_duplex is on. full_duplex is off for all but linux in 48; 49 will go out with windows added; 50 will go out with mac as well; and android will likely be 51. If not all supported branches, which bug introduced the flaw? 48 is the earliest that has the flaw, but we have 0 hits in crash-stats for 48. I do not recommend holding 48.0.1 for this or shipping a new point release unless we believe someone has discovered a way to exploit it on Linux. If we had to update linux, we could force the pref to false instead of changing the code. Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? Trivial to apply. How likely is this patch to cause regressions; how much testing does it need? Extremely unlikely to cause regressions; at most it would leak memory and mochitests are green.
Comment on attachment 8780525 [details] [diff] [review] make mAudioInputs use RefPtrs Approvals given.
25 crashes with this signature in the last week, with 4 of them on 49 RC, 2 on Aurora and 2 on Nightly. https://crash-stats.mozilla.com/signature/?product=Firefox&signature=mozilla%3A%3AMediaStreamGraph%3A%3ANotifyOutputData This seems very low volume, so I assume it is safe to close it. If someone considers these results worrying, feel free to reopen.
Still happening with UAF signature in FF 51 and 50 and 49.0b99, such as a 50.0a2 from build 20160908004007 https://crash-stats.mozilla.com/report/index/0032d33d-dd99-4e8b-9c84-687f22160910 or a 49.0b99 (buildid 20160907153016) https://crash-stats.mozilla.com/report/index/9b1a9e00-db49-4e76-8f13-48a8a2160910 The frequency is lower, but it's not (completely) fixed. There might be a secondary cause. We did close a clear hole, and the frequency dropped sharply. When I expand the search to start July 1, and then look at the Graphs using 'version', I see the crash rate drops to almost but not quite 0 after the fix here landed. Padenot, pehrsons - any thoughts how this could still be happening?
Status: VERIFIED → REOPENED
Resolution: FIXED → ---
re-closing and remarking verified - I spun off a bug to handle the remaining crashes which likely have a slightly different source. The fix here clearly closed the primary hole.
Crash volume for signature 'mozilla::MediaStreamGraph::NotifyOutputData': - nightly (version 52): 1 crash from 2016-09-19. - aurora (version 51): 2 crashes from 2016-09-19. - beta (version 50): 46 crashes from 2016-09-20. - release (version 49): 366 crashes from 2016-09-05. - esr (version 45): 0 crashes from 2016-07-25. Crash volume on the last weeks (Week N is from 10-17 to 10-23): W. N-1 W. N-2 W. N-3 W. N-4 - nightly 0 1 0 0 - aurora 0 1 1 0 - beta 12 15 10 3 - release 91 63 128 37 - esr 0 0 0 0 Affected platform: Windows Crash rank on the last 7 days: Browser Content Plugin - nightly - aurora - beta #1315 #581 - release #807 #236 - esr
Updating overall status based on Comment 16.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.