Closed Bug 1415755 Opened 7 years ago Closed 1 year ago

Crash in mozilla::AudioMixer::FinishMixing

Categories

(Core :: Audio/Video: MediaStreamGraph, defect, P1)

56 Branch
x86
Windows 7
defect

Tracking


RESOLVED INCOMPLETE
Tracking Status
firefox-esr52 --- wontfix
firefox56 --- wontfix
firefox57 --- wontfix
firefox58 --- wontfix

People

(Reporter: jesup, Assigned: pehrsons)

References

Details

(Keywords: crash, csectype-uaf, sec-high)

Crash Data

This bug was filed from the Socorro interface and is
report bp-b2dcd15c-d03f-4830-a5d3-b7bd40171101.
=============================================================

Top 10 frames of crashing thread:

0 xul.dll mozilla::AudioMixer::FinishMixing dom/media/AudioMixer.h:66
1 xul.dll mozilla::MediaStreamGraphImpl::Process dom/media/MediaStreamGraph.cpp:1420
2 xul.dll mozilla::MediaStreamGraphImpl::OneIteration dom/media/MediaStreamGraph.cpp:1467
3 xul.dll mozilla::AudioCallbackDriver::DataCallback dom/media/GraphDriver.cpp:1028
4 xul.dll mozilla::AudioCallbackDriver::DataCallback_s dom/media/GraphDriver.cpp:897
5 xul.dll passthrough_resampler<float>::fill media/libcubeb/src/cubeb_resampler.cpp:69
6 xul.dll `anonymous namespace'::refill media/libcubeb/src/cubeb_wasapi.cpp:564
7 xul.dll `anonymous namespace'::refill_callback_output media/libcubeb/src/cubeb_wasapi.cpp:831
8 xul.dll `anonymous namespace'::wasapi_stream_render_loop media/libcubeb/src/cubeb_wasapi.cpp:966
9 ucrtbase.dll _o__CIpow 

=============================================================

UAF; crashes go back to at least Firefox 34.

Karl - any chance your recent patches might help with this?
Flags: needinfo?(karlt)
(In reply to Randell Jesup [:jesup] from comment #0)
> Karl - any chance your recent patches might help with this?

It's conceivable, but I would have expected crashes from the bugs I've fixed
to be more likely earlier in the iteration, so I suspect this bug would not
have been fixed by them.

Bug 1414510 has a slightly deeper callstack, but quite a different crash
rate graph.
Flags: needinfo?(karlt)
Flags: needinfo?(padenot)
It looks like the same issue as Bug 1406772.
Rank: 12
Flags: needinfo?(padenot)
Priority: -- → P2
(In reply to Alex Chronopoulos [:achronop] from comment #2)
> It looks like the same issue as Bug 1406772.

Unclear; this is probably crashing on mReceiver.
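
For context on frame 0: FinishMixing ends by handing the mixed samples to the registered receiver callback, so a dangling receiver would fault right at that virtual call. A simplified sketch of that shape (approximate names, not the exact code in dom/media/AudioMixer.h):

#include <cstdint>
#include <vector>

// Simplified sketch, approximate names; the real class lives in
// dom/media/AudioMixer.h.
class MixerCallbackReceiver {
 public:
  virtual ~MixerCallbackReceiver() = default;
  virtual void MixerCallback(float* aMixedBuffer, uint32_t aChannels,
                             uint32_t aFrames, uint32_t aSampleRate) = 0;
};

class AudioMixer {
 public:
  void AddCallback(MixerCallbackReceiver* aReceiver) { mReceiver = aReceiver; }

  void FinishMixing() {
    // Crash frame 0 in these reports: if the receiver (or the object that
    // owns it) was destroyed without unregistering, this virtual call reads
    // freed memory.
    mReceiver->MixerCallback(mMixedAudio.data(), mChannels, mFrames,
                             mSampleRate);
  }

 private:
  MixerCallbackReceiver* mReceiver = nullptr;  // raw, non-owning
  std::vector<float> mMixedAudio;
  uint32_t mChannels = 0;
  uint32_t mFrames = 0;
  uint32_t mSampleRate = 0;
};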
Adding karlt's bugs. Unclear if they are really related, but they could be.

In any case, bumping to P1.
Rank: 12 → 8
Priority: P2 → P1
See Also: → 1408276, 1382366
Assignee: nobody → jib
No crashes for 59 and 60. Maybe this was fixed in another issue (see comment 3, comment 4). Let's wait until 60 hits beta.
Bug 1429666 may have been a big cause here.

Still have one crash on 59.0b3:
https://crash-stats.mozilla.com/report/index/ac13e8ec-9fca-45fb-a91d-11d400180129
See Also: → 1429666
So, it's still out there despite your patches, karlt?
Who can own this bug, then? You? Paul?
Flags: needinfo?(padenot)
Flags: needinfo?(karlt)
It looks like the crash volume may be considerably lower on 59 than on 58, but yes,
it is still out there.  I'm not aware of any changes in 60 or 61 that are likely to
resolve this.

This happens with both cubeb_resampler_speex and passthrough_resampler.

On winnt, there are more crashes on x86, but there are some on amd64.
e.g.

https://crash-stats.mozilla.com/report/index/45423462-24e6-4062-af9a-b641f0180330
https://crash-stats.mozilla.com/report/index/8c9d6a79-1823-4dac-964c-ca5ca0180328

The former is refill_callback_output.  More crashes are in
refill_callback_duplex, which is probably expected.

I filed bug 1457060 in the hope of narrowing down the problem a little.
Flags: needinfo?(karlt)
Flags: needinfo?(padenot)
This aurora 60.0b7 crash with build id 20180326164103 may be related.  It
indicates that bug 1436267 was not the last path leading to two drivers running
the same graph.

https://crash-stats.mozilla.com/report/index/cce65736-d617-46ea-b16d-1f38c0180329

Crash rates are very low, which will make diagnosis difficult.
See Also: → 1457372
Bug 1457372 reports an ASan UAF that matches these reports.

That ASan UAF occurred on autoland, where I assume MOZ_DIAGNOSTIC_ASSERT()s
are compiled in.

This implies that the GraphImpl()->CurrentDriver() == aPreviousDriver
assertion in SetGraphTime(), mentioned in comment 10, does not need to fail
for this crash to occur.
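
For reference, the check being discussed has roughly this shape (a hypothetical, self-contained sketch with stand-in types; the tree uses MOZ_DIAGNOSTIC_ASSERT rather than assert):

#include <cassert>
#include <cstdint>

using GraphTime = int64_t;

class GraphDriver;

// Stand-in for MediaStreamGraphImpl: tracks which driver currently runs the
// graph's iterations.
class GraphImplStandIn {
 public:
  GraphDriver* CurrentDriver() const { return mCurrentDriver; }
  GraphDriver* mCurrentDriver = nullptr;
};

class GraphDriver {
 public:
  explicit GraphDriver(GraphImplStandIn* aGraph) : mGraph(aGraph) {}

  // Sketch of the hand-off check discussed above: the incoming driver is told
  // the outgoing driver and the graph time to resume from, and the check
  // (MOZ_DIAGNOSTIC_ASSERT in the real tree) verifies that the graph still
  // considers the outgoing driver current.
  void SetGraphTime(GraphDriver* aPreviousDriver, GraphTime aNextIterationStart) {
    assert(mGraph->CurrentDriver() == aPreviousDriver);
    mIterationStart = aNextIterationStart;
  }

 private:
  GraphImplStandIn* mGraph;
  GraphTime mIterationStart = 0;
};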
Waiting for Bug 1460346 (landed in 63) to reach Beta or Release.
See Also: → 1460346
See Also: → 1496669

This crash is still occurring; there are still clear UAFs such as https://crash-stats.mozilla.com/report/index/8ae31e2d-3335-4e56-9dd6-86e4f0181220

Something is likely going wrong with the switchover of drivers... perhaps something is not locked correctly, or something that should be atomic isn't.
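
To spell out the kind of hazard meant here (a hypothetical sketch, not the actual MediaStreamGraph/GraphDriver code): the graph's current-driver pointer is written on the control/switch path and read by the real-time audio callback, so if the two are not ordered by the same lock (or the pointer is not at least atomic), the callback can keep running against a driver or graph that is already being torn down.

#include <mutex>

// Hypothetical sketch of the hazard, not the actual MediaStreamGraph code.
class GraphDriver;

class GraphImpl {
 public:
  // Read on the real-time audio callback thread; written on the control path
  // during a driver switch. If the read and the write are not ordered by the
  // same lock (or the pointer is not at least atomic), the callback can keep
  // iterating with a driver that is being, or has already been, torn down.
  GraphDriver* CurrentDriver() const { return mDriver; }

  void SwitchDriver(GraphDriver* aNext) {
    std::lock_guard<std::mutex> lock(mMonitor);  // readers must take this too
    mDriver = aNext;
  }

 private:
  std::mutex mMonitor;
  GraphDriver* mDriver = nullptr;
};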

Added Andreas, who should be on this bug.

Flags: needinfo?(karlt)
Flags: needinfo?(apehrson)
Flags: needinfo?(karlt)

FWIW I see similar crashes for both GraphDriver and AudioStream data callbacks.

Flags: needinfo?(apehrson)

Andreas, could you, as the MSG expert, please take a close look at this?
Is it possible that any of your refactoring will fix this?

Assignee: jib → apehrson
Flags: needinfo?(apehrson)

Sure, I'll take a look - keeping the needinfo until I do. It's one of those that seem really tricky, though, so it might be a bit of a rabbit hole.

I doubt my refactoring fixes this. This seems more like an audio callback that is holding on to a raw pointer to a graph or driver that we have already destroyed.
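
A minimal sketch of that failure mode, with hypothetical names mirroring the shape of a C audio data callback that receives a void* user pointer (as cubeb's does); nothing below is the actual GraphDriver code:

// Hypothetical sketch; not the actual GraphDriver code.
class AudioCallbackDriver {
 public:
  long DataCallback(float* aOutput, long aFrames) {
    // Mix/fill aFrames frames of audio into aOutput (elided here).
    (void)aOutput;
    return aFrames;
  }

  // Registered with the audio library as a C callback plus user pointer.
  static long DataCallback_s(void* aUser, float* aOutput, long aFrames) {
    // If the driver was destroyed after the stream was handed this pointer,
    // but before the stream stopped delivering callbacks, this cast and the
    // call below are the use-after-free.
    return static_cast<AudioCallbackDriver*>(aUser)->DataCallback(aOutput,
                                                                  aFrames);
  }
};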

Status: NEW → ASSIGNED
Flags: needinfo?(apehrson)

kinetik, can you take a look at [1]?

This is a failure mode on Windows where we crash on 0xffffffffffffffff. There's another that crashes on 0x0 or somewhere close, which I assume for now is different.

This one is crashing on an AsyncCubebOperation thread after winmm_stream_init calls into winmm_refill_stream at [2]. At the same time there's thread #40 blocked in winmm_buffer_thread at [3], which is where the responsibility to call winmm_refill_stream normally lies. Can you comment on whether this is expected?

I find it surprising that AudioCallbackDriver::Init() leads to calling a DataCallback synchronously. Is this expected?
That these two threads are active at the same time doesn't seem like a coincidence with respect to reaching this crash. There's a similar pattern in the older report at [4] too. [5] crashes similarly, but it is WASAPI and the threading situation looks completely different.

[1] https://crash-stats.mozilla.com/report/index/22990879-c01c-4798-ba58-e57230190322
[2] https://hg.mozilla.org/releases/mozilla-release/annotate/164a57c0cdf0088e786e6b966e34fdd3799671d1/media/libcubeb/src/cubeb_winmm.c#l536
[3] https://hg.mozilla.org/releases/mozilla-release/annotate/164a57c0cdf0088e786e6b966e34fdd3799671d1/media/libcubeb/src/cubeb_winmm.c#l236
[4] https://crash-stats.mozilla.com/report/index/b06848a1-5d46-4c19-b7de-2942c0190319
[5] https://crash-stats.mozilla.com/report/index/46ffa3f2-8efa-4a30-88cf-fb2480190223

Flags: needinfo?(kinetik)

I wonder why we're using the WinMM backend on Windows 10 at all. That might indicate some problem with the audio hardware/drivers or OS, since we've already fallen back from the WASAPI backend. Possibly a machine with no audio hardware?

As far as winmm_refill_stream goes, it's normal for winmm_stream_init to call it, as the buffers need to be queued before starting. WinMM signals our provided event once a buffer is free, which is what drives winmm_buffer_thread to requeue buffers. It really should have been done in _start (and consistently across all backends), but that's a design bug that has been in libcubeb since the beginning.
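
For readers unfamiliar with the waveOut API: buffers have to be filled, prepared, and queued with waveOutWrite up front, and WinMM then signals the event registered at waveOutOpen time whenever a buffer completes so it can be refilled and requeued. A rough illustration of that priming step (illustrative only, not the code in cubeb_winmm.c):

#include <windows.h>
#include <mmsystem.h>  // link against winmm.lib

// Illustrative only; not the actual code in cubeb_winmm.c.
// Fill each buffer, prepare it, and queue it with waveOutWrite. After this,
// the event registered at waveOutOpen time fires whenever a buffer completes,
// and the buffer thread refills and requeues it.
bool PrimeWaveOut(HWAVEOUT aHandle, WAVEHDR* aHeaders, unsigned aCount,
                  void (*aRefill)(WAVEHDR*)) {
  for (unsigned i = 0; i < aCount; ++i) {
    aRefill(&aHeaders[i]);
    if (waveOutPrepareHeader(aHandle, &aHeaders[i], sizeof(WAVEHDR)) !=
            MMSYSERR_NOERROR ||
        waveOutWrite(aHandle, &aHeaders[i], sizeof(WAVEHDR)) !=
            MMSYSERR_NOERROR) {
      return false;
    }
  }
  return true;
}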

I wonder if the failure case here is: no audio hardware available - causes WASAPI to fail to init, but WinMM chugs on until you try to play audio, at which point it never makes progress. With this crash, maybe that occurred and then something waiting on us timed out (just guessing) and got freed, then WinMM unblocks and we hit a UAF. I can test that theory locally, will report back.

(In reply to Matthew Gregan [:kinetik] from comment #19)

> I wonder if the failure case here is: no audio hardware available - causes WASAPI to fail to init, but WinMM chugs on until you try to play audio, at which point it never makes progress. With this crash, maybe that occurred and then something waiting on us timed out (just guessing) and got freed, then WinMM unblocks and we hit a UAF. I can test that theory locally, will report back.

This doesn't seem to be the case, at least when testing in a Windows 10 VM with no audio hardware (or with audio hardware attached but disabled via mmsys.cpl). The WASAPI backend fails to initialize due to IMMDeviceEnumerator::GetDefaultAudioEndpoint returning E_NOTFOUND. The WinMM backend then fails due to waveOutGetNumDevs() returning 0 - but even with that check removed, waveOutOpen fails with MMSYSERR_BADDEVICEID during stream initialization.
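
In other words, with no usable output device both backends already refuse to initialize. Roughly, the checks involved look like this (an illustrative sketch of the Windows APIs named above, not the cubeb sources):

#include <windows.h>
#include <mmsystem.h>     // waveOutGetNumDevs
#include <mmdeviceapi.h>  // IMMDeviceEnumerator

// Illustrative sketch of the device checks named above; not the cubeb sources.
bool WasapiHasDefaultOutput(IMMDeviceEnumerator* aEnumerator) {
  IMMDevice* device = nullptr;
  // Fails with E_NOTFOUND when no default render endpoint exists.
  HRESULT hr = aEnumerator->GetDefaultAudioEndpoint(eRender, eConsole, &device);
  if (FAILED(hr)) {
    return false;
  }
  device->Release();
  return true;
}

bool WinmmHasOutput() {
  // Returns 0 when no waveform-audio output devices are present; even past
  // this check, waveOutOpen reports MMSYSERR_BADDEVICEID in that state.
  return waveOutGetNumDevs() > 0;
}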

Flags: needinfo?(kinetik)

I think with comment #20 we have run out of ideas here for now. Marking as stalled.

Keywords: stalled
Severity: critical → S2

Only 5 crashes in the last 6 months.

Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → INCOMPLETE

Since the bug is closed, the stalled keyword is now meaningless.
For more information, please visit auto_nag documentation.

Keywords: stalled
Group: media-core-security