Closed Bug 1362764 Opened 7 years ago Closed 7 years ago

Crash in CAudioClient::FinalRelease

Categories

(Core :: Audio/Video: cubeb, defect, P1)

55 Branch
x86_64
Windows
defect

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox-esr52 --- unaffected
firefox53 --- unaffected
firefox54 --- unaffected
firefox55 --- wontfix
firefox56 --- fixed
firefox57 --- fixed

People

(Reporter: philipp, Assigned: kinetik)

References

(Regression)

Details

(Keywords: crash, regression)

Crash Data

This bug was filed from the Socorro interface and is 
report bp-011b9dde-472b-4dd1-9a52-5b8620170505.
=============================================================
Crashing Thread (69)
Frame 	Module 	Signature 	Source
0 	audioses.dll 	CAudioClient::FinalRelease() 	
1 	audioses.dll 	ATL::CComObject<CAudioClient>::~CComObject<CAudioClient>() 	
2 	audioses.dll 	ATL::CComObject<CAudioClient>::`vector deleting destructor'(unsigned int) 	
3 	audioses.dll 	ATL::CComObject<CAudioClient>::Release() 	
4 	audioses.dll 	ATL::CComObject<CAudioClientClock>::~CComObject<CAudioClientClock>() 	
5 	audioses.dll 	ATL::CComObject<CAudioClientClock>::`vector deleting destructor'(unsigned int) 	
6 	audioses.dll 	ATL::CComObject<CAudioClientClock>::Release() 	
7 	xul.dll 	`anonymous namespace'::com_ptr<IAudioClient>::release 	media/libcubeb/src/cubeb_wasapi.cpp:130
8 	xul.dll 	`anonymous namespace'::close_wasapi_stream 	media/libcubeb/src/cubeb_wasapi.cpp:1877
9 	xul.dll 	`anonymous namespace'::wasapi_stream_destroy 	media/libcubeb/src/cubeb_wasapi.cpp:1908
10 	xul.dll 	cubeb_stream_destroy 	media/libcubeb/src/cubeb.c:348
11 	xul.dll 	mozilla::AudioStream::Shutdown() 	dom/media/AudioStream.cpp:471
12 	xul.dll 	mozilla::media::AudioSink::Shutdown() 	dom/media/mediasink/AudioSink.cpp:146
13 	xul.dll 	mozilla::media::AudioSinkWrapper::Stop() 	dom/media/mediasink/AudioSinkWrapper.cpp:215
14 	xul.dll 	mozilla::media::VideoSink::Stop() 	dom/media/mediasink/VideoSink.cpp:217
15 	xul.dll 	mozilla::MediaDecoderStateMachine::StopMediaSink() 	dom/media/MediaDecoderStateMachine.cpp:3156
16 	xul.dll 	mozilla::MediaDecoderStateMachine::CompletedState::Step() 	dom/media/MediaDecoderStateMachine.cpp:1881
17 	xul.dll 	mozilla::detail::RunnableMethodImpl<mozilla::detail::Listener<RefPtr<mozilla::layers::KnowsCompositor> >* const, void ( mozilla::detail::Listener<RefPtr<mozilla::layers::KnowsCompositor> >::*)(void), 1, 0>::Run() 	obj-firefox/dist/include/nsThreadUtils.h:909
18 	xul.dll 	mozilla::AutoTaskDispatcher::TaskGroupRunnable::Run() 	obj-firefox/dist/include/mozilla/TaskDispatcher.h:205
19 	xul.dll 	mozilla::TaskQueue::Runner::Run() 	xpcom/threads/TaskQueue.cpp:240
20 	xul.dll 	nsThreadPool::Run() 	xpcom/threads/nsThreadPool.cpp:225
21 	xul.dll 	nsThread::ProcessNextEvent(bool, bool*) 	xpcom/threads/nsThread.cpp:1270
22 	xul.dll 	NS_ProcessNextEvent(nsIThread*, bool) 	xpcom/threads/nsThreadUtils.cpp:393
23 	xul.dll 	mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) 	ipc/glue/MessagePump.cpp:368
24 	xul.dll 	MessageLoop::RunHandler() 	ipc/chromium/src/base/message_loop.cc:231
25 	xul.dll 	MessageLoop::Run() 	ipc/chromium/src/base/message_loop.cc:211
26 	xul.dll 	nsThread::ThreadFunc(void*) 	xpcom/threads/nsThread.cpp:501
27 	nss3.dll 	PR_NativeRunThread 	nsprpub/pr/src/threads/combined/pruthr.c:397
28 	nss3.dll 	pr_root 	nsprpub/pr/src/md/windows/w95thred.c:95
29 	ucrtbase.dll 	o__realloc_base 	
30 	kernel32.dll 	BaseThreadInitThunk 	
31 	ucrtbase.dll 	o__realloc_base 	
32 	ucrtbase.dll 	o__realloc_base 	
33 	ntdll.dll 	RtlUserThreadStart

the crash signature is showing up on nightly since build 20170421030241, so far only on 64bit versions of the browser on win 10/8.1.

this would be the changelog for the build: https://hg.mozilla.org/mozilla-central/pushloghtml?startdate=2017-04-20&tochange=dd530a59750adcaa0d48fa4f69b0cdb52715852a - out of this range bug 1357683 and bug 1346665 stick out as related to cubeb...
Rank: 15
Priority: -- → P1
Matthew - can you take a look at this?  Or achronop? (padenot is on PTO for a while)

Note address=0xfffffffffffff may or may not be a UAF due to crashreporter on windows.
Flags: needinfo?(kinetik)
Flags: needinfo?(achronop)
Bug 1346665 is Linux only, and didn't touch the main libcubeb code in any major way anyway, so we can probably exclude that.

Bug 1357683 updated the Gecko copy of libcubeb from 04826edb to 6e52314f, bringing in quite a few changes, but nothing specifically around the lifetime of IAudioClient or IAudioClock.

Looking at changes to cubeb_wasapi.cpp, the candidates for causing a regression are the addition of S16 support (2bb3e25) and changes to the mixing code (727a7a4), and maybe the introduction of auto_array_wrapper (eacf836).  I skimmed through the changes and nothing jumps out as being obviously wrong.  The changes to stm->mix_buffer sizing in 727a7a4 look correct (we were sizing the buffer in bytes but allocating floats before causing the mix buffer to be 4x too large), but it's possible that change has uncovered an existing misuse of that buffer - slightly concerned about the buffer sizing using the before_mix variant of the frames->bytes conversion.

Alex, do you have time to dig into this more?  I'm fairly busy with other stuff, but I can make time if you can't.
Flags: needinfo?(kinetik)
Extending the crash stats search back a year, there are a handful (9) of very similar crashes releasing stm->audio_clock going back to 50.1.0 (20161208153507), but it's clear they spike (46 so far) in 55.0a1, so it's possible the recent changes exacerbated an existing (rare) bug...
Paul's back and may be able to help here, too.
Flags: needinfo?(padenot)
This is only on recent versions of Windows (8+). There has been quite an population shift recently, with people moving to Windows 10 (which is the version that is affected the most, looking at the crash stats data).

This rang a bell, and I looked up the docs [0]:

> To release the IAudioClient object and free all its associated resources, the client must release all references to any service
> objects that were created by calling GetService, in addition to calling Release on the IAudioClient interface itself. The 
> client must release a service from the same thread that releases the IAudioClient object.

We're doing the opposite here: releasing the IAudioClient and _then_ the interface objects we got from it. Also (at least in MSG, not sure about AudioStream these days), we don't use the same thread for init and release.

Also doing the operations on the same thread is necessary for all interfaces that we got from the IAudioClient.

I'll reorder the calls now, and we can see about the threading issue later (since it's less obvious).

[0]: https://msdn.microsoft.com/en-us/library/windows/desktop/dd370873(v=vs.85).aspx
Flags: needinfo?(padenot)
Flags: needinfo?(achronop)
Actually, maybe this simply describes ref-counting.
Low volume crash, setting 55 fix-optional, we can still take a fix in beta.
See Also: → 1380775
Paul: any more thoughts here?  Anything we want to land to try to stop this?
Flags: needinfo?(padenot)
Yes, I need to write the patch. I want to do it soon but I have other things to do first.
Flags: needinfo?(padenot)
I can repro this one too now that I have a 64-bit debug build, so taking.
Assignee: nobody → kinetik
Status: NEW → ASSIGNED
This is a regression from bug 1357683, which imported https://github.com/kinetiknz/cubeb/commit/2bb3e25537042ad113f02f36bd1080dc8abeb3b4#diff-2f55f506bd2dd918b7b8c6ceec8aee0dR1552 from upstream.

That changeset introduced a case where libcubeb would store into a WAVEFORMATEXTENSIBLE pointer retrieved from IAudioClient::GetMixFormat without verifying that the underlying pointer was truly a WAVEFORMATEXTENSIBLE.  The documentation for GetMixFormat is slightly unclear, but seems to imply that the WAVEFORMATEX pointer returned is guaranteed to actually be a WAVEFORMATEXTENSIBLE:

    For this reason, the GetMixFormat method retrieves a format descriptor that is in the form of a WAVEFORMATEXTENSIBLE
    structure instead of a standalone WAVEFORMATEX structure. Through the ppDeviceFormat parameter, the method outputs a
    pointer to the WAVEFORMATEX structure that is embedded at the start of this WAVEFORMATEXTENSIBLE structure.

But that's not true in practice, and even MSFT's own example code handles the case where GetMixFormat returns a non-WAVEFORMATEXTENSIBLE structure, e.g. https://github.com/Microsoft/Windows-universal-samples/blob/6370138b150ca8a34ff86de376ab6408c5587f5d/Samples/WindowsAudioSession/cpp/WASAPICapture.cpp#L167

It's worth noting that we've hit this bug before (https://dxr.mozilla.org/mozilla-central/source/media/libcubeb/src/cubeb_wasapi.cpp#1401), but given the ambiguous documentation it's not surprising we hit it again.  A longer term solution probably involves moving all of the WAVEFORMATEX manipulation code to a single location where it can be more careful about the true format of the pointer.

Fix in https://github.com/kinetiknz/cubeb/pull/343, I'll use this bug to import an update from the upstream repository.
Blocks: 1357683
PR 343 is still waiting for review, NI? reviewer so it's not forgotten.
Flags: needinfo?(padenot)
oops sorry I forgot about this one. It's r+ed on github now.
Flags: needinfo?(padenot)
Fix landing via bug 1386957.
Depends on: 1386957
Fix merged to mozilla-central.
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
No longer blocks: 1357683
Regressed by: 1357683
Has Regression Range: --- → yes
You need to log in before you can comment on or make changes to this bug.