Crash in `anonymous namespace''::wasapi_stream_start

RESOLVED FIXED in Firefox 54

Status

()

defect
P1
critical
Rank:
15
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: philipp, Assigned: padenot)

Tracking

({crash, regression})

54 Branch
mozilla55
All
Windows
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox-esr45 unaffected, firefox-esr52 unaffected, firefox53 unaffected, firefox54+ fixed, firefox55 fixed)

Details

(crash signature)

Attachments

(2 attachments, 1 obsolete attachment)

(Reporter)

Description

2 years ago
This bug was filed from the Socorro interface and is 
report bp-b51a6e42-803f-49e5-9128-8c2030170422.
=============================================================
Crashing Thread (37)
Frame 	Module 	Signature 	Source
0 	xul.dll 	`anonymous namespace'::wasapi_stream_start 	media/libcubeb/src/cubeb_wasapi.cpp:1917
1 	xul.dll 	cubeb_stream_start 	media/libcubeb/src/cubeb.c:349
2 	xul.dll 	mozilla::AudioStream::InvokeCubeb<int (*)(cubeb_stream*)>(int (*)(cubeb_stream*)) 	dom/media/AudioStream.cpp:317
3 	xul.dll 	mozilla::AudioStream::Resume() 	dom/media/AudioStream.cpp:450
4 	xul.dll 	mozilla::media::AudioSink::SetPlaying(bool) 	dom/media/mediasink/AudioSink.cpp:189
5 	xul.dll 	mozilla::media::AudioSinkWrapper::SetPlaying(bool) 	dom/media/mediasink/AudioSinkWrapper.cpp:164
6 	xul.dll 	mozilla::media::VideoSink::SetPlaying(bool) 	dom/media/mediasink/VideoSink.cpp:156
7 	xul.dll 	mozilla::MediaDecoderStateMachine::MaybeStartPlayback() 	dom/media/MediaDecoderStateMachine.cpp:2919
8 	xul.dll 	mozilla::MediaDecoderStateMachine::DecodingState::Step() 	dom/media/MediaDecoderStateMachine.cpp:685
9 	xul.dll 	mozilla::AutoTaskDispatcher::TaskGroupRunnable::Run() 	obj-firefox/dist/include/mozilla/TaskDispatcher.h:205
10 	xul.dll 	mozilla::TaskQueue::Runner::Run() 	xpcom/threads/TaskQueue.cpp:232
11 	xul.dll 	nsThreadPool::Run() 	xpcom/threads/nsThreadPool.cpp:225
12 	xul.dll 	nsThread::ProcessNextEvent(bool, bool*) 	xpcom/threads/nsThread.cpp:1270
13 	xul.dll 	NS_ProcessNextEvent(nsIThread*, bool) 	xpcom/threads/nsThreadUtils.cpp:389
14 	xul.dll 	mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) 	ipc/glue/MessagePump.cpp:338
15 	xul.dll 	MessageLoop::RunHandler() 	ipc/chromium/src/base/message_loop.cc:231
16 	xul.dll 	MessageLoop::Run() 	ipc/chromium/src/base/message_loop.cc:211
17 	xul.dll 	nsThread::ThreadFunc(void*) 	xpcom/threads/nsThread.cpp:501
18 	nss3.dll 	_PR_NativeRunThread 	nsprpub/pr/src/threads/combined/pruthr.c:397
19 	nss3.dll 	pr_root 	nsprpub/pr/src/md/windows/w95thred.c:95
20 	ucrtbase.dll 	_o__CIpow 	
21 	kernel32.dll 	BaseThreadInitThunk 	
22 	ntdll.dll 	__RtlUserThreadStart 	
23 	ntdll.dll 	_RtlUserThreadStart

these crashes started (re)appearing in firefox since 54.0a1 build 20170211030205 and in a regular fashion thereafter, so it's likely related to bug 1337805 which landed the day before.
the crashes are hitting users of 32bit and 64bit versions of firefox on windows and all the report are annotated with MOZ_RELEASE_ASSERT(stm && !stm->thread && !stm->shutdown_event) added in bug 1280280. in stability data for 54.0b1 the signature currently accounts for 0.5% of browser crashes and 1.1% of content process crashes.
Flags: needinfo?(padenot)
Alex, could you take a look about comment 0?
Flags: needinfo?(achronop)
Redirect to Paul.
Flags: needinfo?(achronop)
(Assignee)

Comment 4

2 years ago
This probably happens in the following scenario:

- Audio is being played back audio for an `HTMLMediaElement`
- `pause` is called on the `HTMLMediaElement`
- cubeb can't join the audio rendering thread (as it happens sometimes), so it leaks it. At this point `stm->thread` and `stm->shutdown_event` are still valid pointers.
- cubeb returns CUBEB_OK
- `play` is called on the `HTMLMediaElement`
- `wasapi_stream_start` is called, and `stm->thread` and `stm->shutdown_event` are still valid pointers.
- Release assert crash.
Flags: needinfo?(padenot) → needinfo?(jwwang)
(Assignee)

Updated

2 years ago
Assignee: nobody → padenot
(Assignee)

Comment 5

2 years ago
No need for info from JW, I initially had another analysis but found the info I needed reading `AudioStream.cpp`.
Flags: needinfo?(jwwang)
Rank: 15
Priority: -- → P1
(Assignee)

Comment 6

2 years ago
JW, we have a thing in Windows, where sometimes a system bug means we can lose control over the audio rendering thread. We usually detect this when trying to do `cubeb_stream_stop` or `cubeb_stream_destroy`. Would you be ok with cubeb returning an error code there, so you can try to re-create a cubeb stream ?

We don't really know why it happens, we have, for now, written some code to avoid it crashing.
Flags: needinfo?(jwwang)
http://searchfox.org/mozilla-central/rev/876c7dd30586f9c6f9c99ef7444f2d73c7acfe7c/dom/media/AudioStream.cpp#428

I think it is OK. If cubeb_stream_stop returns an error, it will put the AudioStream into an error state and other functions will act accordingly.

I will file a new bug for error recovery to give better user experience. For now, we just need to raise an decode error to the media element when a cubeb error is encountered.
Flags: needinfo?(jwwang)
Paul, should we ask for tracking on this bug?
Flags: needinfo?(padenot)
(Assignee)

Comment 9

2 years ago
Absolutely.
Flags: needinfo?(padenot)
Paul, do you have any update about this crash?
[Tracking Requested - why for this release]:
This is a preventable crash.
Track 54+ as these crashes started (re)appearing in firefox since 54.
Depends on: 1362334
Fix for this landed in cubeb repor, commit:
45c7ffc Paul Adenot     Return an error if stopping a WASAPI stream fails.

It will be included in the next cubeb import.
Matthew -- Paul (padenot) is out on PTO for the rest of the month. He had been planning to deal with this bug before he left, but ran out of time.  What's the best way to handle this for Fx 54?  Thanks.
Flags: needinfo?(kinetik)
Pretty high crash rate (200-400/day).  Matthew?  Alex?  we'll need an uplift to 54.
Flags: needinfo?(achronop)
Sorry for the delay, I've been away sick.  It's pretty trivial to backport the fix (45c7ffc) and the change is low risk, so I'll prepare a patch for uplift.

Probably worth noting that since the fix only landed on m-c in the last day or so (via bug 1362334) we don't have a lot of data to confirm it yet.
Flags: needinfo?(kinetik)
Posted patch bug1358868_v0.patch (obsolete) — Splinter Review
Approval Request Comment
[Feature/Bug causing the regression]: bug 1337805
[User impact if declined]: crash when (re-)starting audio playback
[Is this code covered by automated tests?]: the code is exercised but this crash was not triggered by automated tests
[Has the fix been verified in Nightly?]: recently landed
[Needs manual test from QE? If yes, steps to reproduce]: 
[List of other uplifts needed for the feature/fix]: none
[Is the change risky?]: no
[Why is the change risky/not risky?]: n/a
[String changes made/needed]: none


:achronop, note for review that update.sh was out of date due to the uplift of an entire cubeb update, so I've fixed that.
Attachment #8866618 - Flags: review?(achronop)
Attachment #8866618 - Flags: approval-mozilla-beta?
(Reporter)

Updated

2 years ago
> :achronop, note for review that update.sh was out of date due to the uplift
> of an entire cubeb update, so I've fixed that.

I think the problem is that uplift-cubeb-f07ee6d-to-aurora.patch changed (but it should not) the commit message in README_MOZILLA file. But the rest of the changes are there so I'm not sure if we should remove the patching of that file from update.sh.
Comment on attachment 8866618 [details] [diff] [review]
bug1358868_v0.patch

Review of attachment 8866618 [details] [diff] [review]:
-----------------------------------------------------------------

I provide some comments to verify that the uplift will be in the right state. It seems that a mistake in a previous uplift created a mess here (my bad). The overlap I have with Matthew is small. Since he is aware of the comments he will take the right decisions and he can go on with the uplift. In case we need to discuss it more I'll be around.

::: media/libcubeb/README_MOZILLA
@@ +4,5 @@
>  Makefile.in build files for the Mozilla build system.
>  
>  The cubeb git repository is: git://github.com/kinetiknz/cubeb.git
>  
> +The git commit ID used was f07ee6d5c536e2106ffe7a847074d7685ca07bd3 (2017-03-09 11:46:48 +0100)

Do we want to modify the parent commit info (time here) for an uplift? I know I did it with uplift-cubeb-f07ee6d-to-aurora.patch and now it's a mess.

::: media/libcubeb/src/cubeb_wasapi.cpp
@@ -1169,5 @@
> -  // If we've already leaked the thread, just return,
> -  // there is not much we can do.
> -  if (!stm->emergency_bailout.load()) {
> -    return false;
> -  }

This change added with patch fix-crashes.patch in change set:

changeset:   392923:93b358d61a9b
user:        Paul Adenot <paul@paul.cx>
date:        Thu Mar 16 16:06:21 2017 +0100
summary:     Bug 1347944 - Uplift three cubeb crash fixes to Fx54. r=kinetik, a=lizzard

I'm not sure why you remove it.

@@ +1616,5 @@
>      // This delays the input side slightly, but allow to not glitch when no input
>      // is available when calling into the resampler to call the callback: the input
>      // refill event will be set shortly after to compensate for this lack of data.
>      // In debug, four buffers are used, to avoid tripping up assertions down the line.
> +#if !defined(DEBUG)

This change imported in gecko with Bug 1344653 which is not requested for uplift.

::: media/libcubeb/update.sh
@@ -69,3 @@
>  
>  echo "Applying a patch on top of $version"
> -patch -p3 < ./uplift-cubeb-f07ee6d-to-aurora.patch

Please see comment 19.
Attachment #8866618 - Flags: review?(achronop) → review+
Comment on attachment 8866618 [details] [diff] [review]
bug1358868_v0.patch

Clearing approval until the issues raised by achronop are resolved.
Attachment #8866618 - Flags: approval-mozilla-beta?
(In reply to Alex Chronopoulos [:achronop] from comment #19)
> I think the problem is that uplift-cubeb-f07ee6d-to-aurora.patch changed
> (but it should not) the commit message in README_MOZILLA file. But the rest
> of the changes are there so I'm not sure if we should remove the patching of
> that file from update.sh.

Yeah, I think in the case of a full cubeb update being uplifted we should treat it like a regular upstream uplift and update README_MOZILLA and remove any obsolete patches applied via update.sh.  That gives us a new, simpler baseline to uplift any further patches on top of.

(In reply to Alex Chronopoulos [:achronop] from comment #20)
> Do we want to modify the parent commit info (time here) for an uplift? I
> know I did it with uplift-cubeb-f07ee6d-to-aurora.patch and now it's a mess.

The time should be updated if the revision changes, but in this case (since the base revision isn't changing) I shouldn't touch the date, so I'll leave it at what it was when f07ee6d was uplifted.

> ::: media/libcubeb/src/cubeb_wasapi.cpp
> @@ -1169,5 @@
> > -  // If we've already leaked the thread, just return,
> > -  // there is not much we can do.
> > -  if (!stm->emergency_bailout.load()) {
> > -    return false;
> > -  }
> 
> This change added with patch fix-crashes.patch in change set:
> 
> changeset:   392923:93b358d61a9b
> user:        Paul Adenot <paul@paul.cx>
> date:        Thu Mar 16 16:06:21 2017 +0100
> summary:     Bug 1347944 - Uplift three cubeb crash fixes to Fx54.
> r=kinetik, a=lizzard
> 
> I'm not sure why you remove it.

That wasn't intentional, thanks for pointing it out.  I should've noticed before requesting review.  I assumed that everything before the uplift would've been rolled into the uplift.  Digging further:

fix-crashes is a roll up of:
c500d18dd74a2c0e8133f14c8802550ba24bce1b
7b082450d2b57a4f2c541cb69558ae28682163f2
988e9e242c69ee221470b520569cdf93b1b37244

Of those, 7b0824 is missing from f07ee6d but the others are already present.  So I'll update fix-crashes to include only the missing change.

> @@ +1616,5 @@
> >      // This delays the input side slightly, but allow to not glitch when no input
> >      // is available when calling into the resampler to call the callback: the input
> >      // refill event will be set shortly after to compensate for this lack of data.
> >      // In debug, four buffers are used, to avoid tripping up assertions down the line.
> > +#if !defined(DEBUG)
> 
> This change imported in gecko with Bug 1344653 which is not requested for
> uplift.

This one is a bit confusing.  The change is present in f07ee6 upstream (which comes after a3c012 imported in the bug you referenced), so it should already be present in the f07ee6 uplift patch?  I'm not sure why it was missing, any ideas?  I think we should just take it, otherwise we're going to be in a weird situation where part of f07ee6 is missing and we'll need to keep applying it as a patch rather than a full update.
I've split this into two patches to make it easier to work out what's happening.

This patch re-baselines against f07ee6.  After running update.sh against a cubeb tree at f07ee6, the source in m-b is the same as it was before this patch except for the addition of a small missing change (DEBUG vs NDEBUG) from the f07ee6 uplift.
Attachment #8866618 - Attachment is obsolete: true
Attachment #8867024 - Flags: review?(achronop)
And this patch is the uplift required for this bug.
Attachment #8867026 - Flags: review?(achronop)
Comment on attachment 8867026 [details] [diff] [review]
bug1358868_p1.patch

See comment 17 for the approval request.
Attachment #8867026 - Flags: approval-mozilla-beta?
Comment on attachment 8867024 [details] [diff] [review]
bug1358868_p0.patch

See comment 17 for the approval request.

Note that this patch is tidying up a previous import and doesn't change the source we build with the exception of a small missing update (DEBUG vs NDEBUG) from that previous import.
Attachment #8867024 - Flags: approval-mozilla-beta?
Attachment #8867026 - Flags: review?(achronop) → review+
(In reply to Matthew Gregan [:kinetik] from comment #22)

> That wasn't intentional, thanks for pointing it out.  I should've noticed
> before requesting review.  I assumed that everything before the uplift
> would've been rolled into the uplift.  Digging further:
> 
> fix-crashes is a roll up of:
> c500d18dd74a2c0e8133f14c8802550ba24bce1b
> 7b082450d2b57a4f2c541cb69558ae28682163f2
> 988e9e242c69ee221470b520569cdf93b1b37244
> 
> Of those, 7b0824 is missing from f07ee6d but the others are already present.
> So I'll update fix-crashes to include only the missing change.

Well the confusion made because of the change made in README_MOZILLA by uplift-cubeb-f07ee6d-to-aurora.patch. That change made "accidentally" so it was not the result of a normal rebase to that version. The intention of that patch was to uplift a single commit and not to rebase to that version.

> This one is a bit confusing.  The change is present in f07ee6 upstream
> (which comes after a3c012 imported in the bug you referenced), so it should
> already be present in the f07ee6 uplift patch?  I'm not sure why it was
> missing, any ideas?  I think we should just take it, otherwise we're going
> to be in a weird situation where part of f07ee6 is missing and we'll need to
> keep applying it as a patch rather than a full update.

That period we had massive uplifts, DEBUG vs NDEBUG commit was the only one left behind. That's why it is not included. But we can fix it now, and I agree we can take it.
Comment on attachment 8867024 [details] [diff] [review]
bug1358868_p0.patch

I realize the intention of that commit is to create a real rebase to version f07ee6 from 075329 something we missed to do earlier.
Attachment #8867024 - Flags: review?(achronop) → review+
Comment on attachment 8867024 [details] [diff] [review]
bug1358868_p0.patch

Required by bug 1358896. Beta54+. Should be in 54 beta 8.
Attachment #8867024 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #8867026 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Depends on: 1366707
You need to log in before you can comment on or make changes to this bug.