Crashes in cubeb_stream_init

RESOLVED FIXED in Firefox 36

Status

()

defect
--
critical
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: jimm, Assigned: padenot)

Tracking

({crash})

Trunk
mozilla38
All
Windows NT
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox36+ fixed, firefox37+ fixed, firefox38+ fixed)

Details

(crash signature)

Attachments

(9 attachments)

(Reporter)

Description

4 years ago
#1 top content crasher right now. Appears to be tied to playing video on youtube.

https://crash-stats.mozilla.com/report/list?product=Firefox&range_value=3&range_unit=days&date=2015-02-14&signature=RtlpWaitOnCriticalSection+|+audioses.dll%400x3ec0&version=Firefox%3A38.0a1#tab-reports

https://crash-stats.mozilla.com/report/index/7bd99c09-4bd8-48e3-9223-5f7f02150212

4 	xul.dll 	`anonymous namespace'::SafeRelease<IAudioClient>(IAudioClient*) 	media/libcubeb/src/cubeb_wasapi.cpp
5 	ntdll.dll 	RtlDeNormalizeProcessParams 	
6 	xul.dll 	`anonymous namespace'::owned_critical_section::enter() 	media/libcubeb/src/cubeb_wasapi.cpp
7 	xul.dll 	`anonymous namespace'::setup_wasapi_stream(cubeb_stream*) 	media/libcubeb/src/cubeb_wasapi.cpp
8 	xul.dll 	`anonymous namespace'::wasapi_stream_init(cubeb*, cubeb_stream**, char const*, cubeb_stream_params, unsigned int, long (*)(cubeb_stream*, void*, void*, long), void (*)(cubeb_stream*, void*, cubeb_state), void*) 	media/libcubeb/src/cubeb_wasapi.cpp
9 	xul.dll 	cubeb_stream_init 	media/libcubeb/src/cubeb.c
10 	xul.dll 	mozilla::AudioStream::OpenCubeb(cubeb_stream_params&, mozilla::AudioStream::LatencyRequest) 	dom/media/AudioStream.cpp
11 	xul.dll 	mozilla::AudioStream::Init(int, int, mozilla::dom::AudioChannel, mozilla::AudioStream::LatencyRequest) 	dom/media/AudioStream.cpp
12 	xul.dll 	mozilla::AudioSink::InitializeAudioStream() 	dom/media/AudioSink.cpp
13 	xul.dll 	mozilla::AudioSink::AudioLoop()
(Reporter)

Updated

4 years ago
Crash Signature: [@ RtlpWaitOnCriticalSection | audioses.dll@0x3ec0] → [@ RtlpWaitOnCriticalSection | audioses.dll@0x3ec0] [@ RtlpWaitOnCriticalSection | RtlpDeCommitFreeBlock | `anonymous namespace''::owned_critical_section::enter()] [@ RtlpWaitOnCriticalSection | ATL::CComObject<CAudioClient>::`vector deleting destructor…
Summary: crash in RtlpWaitOnCriticalSection | audioses.dll@0x3ec0 → Crashes in cubeb_stream_init

Updated

4 years ago
Crash Signature: [@ RtlpWaitOnCriticalSection | audioses.dll@0x3ec0] [@ RtlpWaitOnCriticalSection | RtlpDeCommitFreeBlock | `anonymous namespace''::owned_critical_section::enter()] [@ RtlpWaitOnCriticalSection | ATL::CComObject<CAudioClient>::`vector deleting destructor… → [@ RtlpWaitOnCriticalSection | audioses.dll@0x3ec0] [@ RtlpWaitOnCriticalSection | RtlpDeCommitFreeBlock | `anonymous namespace''::owned_critical_section::enter()] [@ RtlpWaitOnCriticalSection | ATL::CComObject<CAudioClient>::`vector deleting destructor…

Comment 1

4 years ago
[Tracking Requested - why for this release]:
By far the #1 crash in 36.0b9

Updated

4 years ago
Crash Signature: [@ RtlpWaitOnCriticalSection | audioses.dll@0x3ec0] [@ RtlpWaitOnCriticalSection | RtlpDeCommitFreeBlock | `anonymous namespace''::owned_critical_section::enter()] [@ RtlpWaitOnCriticalSection | ATL::CComObject<CAudioClient>::`vector deleting destructor… → [@ RtlpWaitOnCriticalSection | audioses.dll@0x3ec0] [@ RtlpWaitOnCriticalSection | RtlpDeCommitFreeBlock | `anonymous namespace''::owned_critical_section::enter()] [@ RtlpWaitOnCriticalSection | ATL::CComObject<CAudioClient>::`vector deleting destructor…

Updated

4 years ago
Duplicate of this bug: 1133306

Comment 3

4 years ago
Bug 1133306 has some good data on this:

Pushlog:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=c939ab0df211&tochange=c4500f0d25cd

Suspect: Bug 1127213 and Bug 1125804


Also two sets of STR:

(Bug 1133306 comment #0)
1. Start browser with newly created profile (e10s is blocked by IME detect) 
2. Open https://www.youtube.com/watch?v=R6KJjPqlPz4&index=3&list=PL75_HhpYGJQ1Fzv9a46FlHfiy-fJusKBZ
3. Click seek bar within buffered range

Actual Results:
Crash immediately when I click seek bar within buffered


(Bug 1133306 comment #3)
1. Start browser with newly created profile (e10s is blocked by IME detect) 
2. Open https://www.youtube.com/watch?v=R6KJjPqlPz4&index=3&list=PL75_HhpYGJQ1Fzv9a46FlHfiy-fJusKBZ
3. Click a thumnail in the right-hand play lists(black background area). And repeat Step3 if need

Updated

4 years ago
Crash Signature: [@ RtlpWaitOnCriticalSection | audioses.dll@0x3ec0] [@ RtlpWaitOnCriticalSection | RtlpDeCommitFreeBlock | `anonymous namespace''::owned_critical_section::enter()] [@ RtlpWaitOnCriticalSection | ATL::CComObject<CAudioClient>::`vector deleting destructor… → [@ RtlpWaitOnCriticalSection | audioses.dll@0x3ec0] [@ RtlpWaitOnCriticalSection | RtlpDeCommitFreeBlock | `anonymous namespace''::owned_critical_section::enter()] [@ RtlpWaitOnCriticalSection | ATL::CComObject<CAudioClient>::`vector deleting destructor…

Comment 4

4 years ago
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #3)
> Suspect: Bug 1127213 and Bug 1125804

Given that, Paul, could you please take a look? This right now is a blocker for shipping 36 or even doing an RC this upcoming week.
Flags: needinfo?(padenot)
I'm trying to repro (method 1) on an inbound debug build from a few days ago, no luck so far.  Will try again on beta... and also repro #2

Turned off e10s on inbound - seeking caused a bunch of e10s errors sometimes:

###!!! [Parent][DispatchAsyncMessage] Error: (msgtype=0xFFFB,name=???) Value error: message was deserialized, but contained an illegal value

[Parent 14704] ###!!! ASSERTION: Wrong size for this Shmem!: 'Error', file c:\mozilla\inbound2\ipc\glue\Shmem.cpp, line 459

###!!! [Parent][DispatchAsyncMessage] Error: (msgtype=0xFFFC,name=???) Payload error: message could not be deserialized


###!!! [Parent][DispatchAsyncMessage] Error: (msgtype=0xFFFB,name=???) Value error: message was deserialized, but contained an illegal value
AH, my debug build was built on a base from the 6th or so, so the suspect landings were after that.
With a current m-i debug build, I also cannot get it to crash (e10s off). Note: my video driver is blacklisted for acceleration (Lenovo W520 2011 (INtel+nvidia GPUs) with the original Intel GFX driver)
Tried on Nightly 2/15 and current Beta on Win 8.1 with a DX9 video card (NVidia 7600GT), cannot reproduce there.
And nothing on the Lenovo W520, Win7 using beta either
Alice - can you get more info about your setup and post it here?  The Troubleshooting info in particular, plus perhaps what audio system and drivers your windows machine uses.

If this is related to video somehow, it might be because my systems have 2d acceleration disabled by default.
Flags: needinfo?(alice0775)

Comment 11

4 years ago
bp-5be5a987-0851-4c78-bf05-9c1d82150215 with HWA off and non-e10s.
In this case, you might need step4.

And Beta36.0b9 did not crash in my test.
Aurora37.0 and Nightly38.0 crashes in my test.

Steps
1. Start browser with newly created profile
2. Open https://www.youtube.com/watch?v=R6KJjPqlPz4&index=3&list=PL75_HhpYGJQ1Fzv9a46FlHfiy-fJusKBZ
3. Click seek bar within buffered range 
4. And rapidly repeat step3 (should not pause the video and should not mute while testing)
Flags: needinfo?(alice0775)

Comment 12

4 years ago
(In reply to Alice0775 White from comment #11)
> bp-5be5a987-0851-4c78-bf05-9c1d82150215 with HWA off and non-e10s.

Err
bp-b3f39f2c-2b31-4ba5-b0df-448552150215 with HWA off and non-e10s.

Comment 13

4 years ago
Posted file about_support.txt
(In reply to Randell Jesup [:jesup] from comment #10)
> Alice - can you get more info about your setup and post it here?  The
> Troubleshooting info in particular, plus perhaps what audio system and
> drivers your windows machine uses.

About:support: attached.
Realtek High Definition Audio 6.0.1.5864
Stereo Speaker


> 
> If this is related to video somehow, it might be because my systems have 2d
> acceleration disabled by default.

I dint think so, because it crashes with disabled HWA.

Comment 14

4 years ago
This is connected to audio, nor video. cubeb is AFAIK an audio thing.
(In reply to Alice0775 White from comment #12)
> (In reply to Alice0775 White from comment #11)
> > bp-5be5a987-0851-4c78-bf05-9c1d82150215 with HWA off and non-e10s.
> 
> Err
> bp-b3f39f2c-2b31-4ba5-b0df-448552150215 with HWA off and non-e10s.

I'm afraid that shows almost nothing in the stack

It does not repro for me on Aurora either, on either system.  One has default W520 audio (using USB headphones, note), the other has a Creative soundcard in it (old Creative SB Audigy 2 ZS)
"By far the #1 crash in 36.0b9" is interesting, given the above.  Likely there's some timing and/or driver issue involved.  Alice's inability to repro on b9 when she can repro on nightly/aurora would also point to timing
So, it is a super-common crasher on b9.  One example:
https://crash-stats.mozilla.com/report/index/5571a4d4-ba75-42ca-8a6e-73d7d2150213
Note the crash address, freed memory likely.  Again implicating timing and locks (and this is likely the critical section (stream_reset_lock) itself that's freed somehow, or not initialized.

The stm (cubeb_stream) struct is allocated with calloc(), so it shouldn't be garbage to start.

Alice (or kairo): any chance of getting a MSVC backtrace/locals or dump so we can look at higher levels?
ALso: can you try to repro with a debug build?  There are lock assertions which might be firing in debug before it crashes.  Thanks!!
Flags: needinfo?(alice0775)

Comment 19

4 years ago
Flags: needinfo?(alice0775)

Comment 21

4 years ago
Crashing is slightly difficult on the Debug build of Nightly and Aurora.

Updated

4 years ago
Attachment #8564779 - Attachment description: log.txt → log.txt, crash on Nightly

Updated

4 years ago
Attachment #8564780 - Attachment description: log2.txt → log2.txt, crash on Aurora
Since this only just came up in the recent beta, it must be a regression from bug 1131768.

Jesup, does this diff help you debug? https://hg.mozilla.org/releases/mozilla-beta/rev/69a8d311ddd6

Another option is to just roll back the entire series of audio-device changes given that MSE isn't happening in 36.
Blocks: 1131768
Crash Signature: [@ RtlpWaitOnCriticalSection | audioses.dll@0x3ec0] [@ RtlpWaitOnCriticalSection | RtlpDeCommitFreeBlock | `anonymous namespace''::owned_critical_section::enter()] [@ RtlpWaitOnCriticalSection | ATL::CComObject<CAudioClient>::`vector deleting destructor… → [@ RtlpWaitOnCriticalSection | audioses.dll@0x3ec0] [@ RtlpWaitOnCriticalSection | RtlpDeCommitFreeBlock | `anonymous namespace''::owned_critical_section::enter()] [@ RtlpWaitOnCriticalSection | ATL::CComObject<CAudioClient>::`vector deleting destructor…
(In reply to David Major [:dmajor] (UTC+13) from comment #23)
> Another option is to just roll back the entire series of audio-device
> changes given that MSE isn't happening in 36.

Yes, I think it's time to do this.  There have been too many regressions already
Preparing a backout patch for beta and aurora now.  If we decided we want it on aurora for MSE, we can backport it again after it has stabilized.
Assignee: nobody → kinetik
Status: NEW → ASSIGNED
Flags: needinfo?(padenot)
This patch applies cleanly to beta and aurora, and reverts the following changesets:

https://hg.mozilla.org/releases/mozilla-beta/rev/20f7d44346da
https://hg.mozilla.org/releases/mozilla-beta/rev/0411d20465b4
https://hg.mozilla.org/releases/mozilla-beta/rev/9579b9ab68ca
https://hg.mozilla.org/releases/mozilla-beta/rev/69a8d311ddd6
https://hg.mozilla.org/releases/mozilla-beta/rev/69a8d311ddd6

Approval Request Comment
[Feature/regressing bug #]: bug 698079
[User impact if declined]: crashes during media playback, possibly other undiscovered regressions
[Describe test coverage new/current, TreeHerder]: n/a
[Risks and why]: low, reverting to code that we have been shipping for some time
[String/UUID change made/needed]: none
Attachment #8564804 - Flags: approval-mozilla-beta?
Attachment #8564804 - Flags: approval-mozilla-aurora?
Crash Signature: [@ RtlpWaitOnCriticalSection | audioses.dll@0x3ec0] [@ RtlpWaitOnCriticalSection | RtlpDeCommitFreeBlock | `anonymous namespace''::owned_critical_section::enter()] [@ RtlpWaitOnCriticalSection | ATL::CComObject<CAudioClient>::`vector deleting destructor… → [@ RtlpWaitOnCriticalSection | audioses.dll@0x3ec0] [@ RtlpWaitOnCriticalSection | RtlpDeCommitFreeBlock | `anonymous namespace''::owned_critical_section::enter()] [@ RtlpWaitOnCriticalSection | ATL::CComObject<CAudioClient>::`vector deleting destructor…
From alice's logs:
Nightly:
[5916] WARNING: AudioStream::OpenCubeb() fe70140 failed to init cubeb: file c:\builds\moz2_slave\m-cen-w32-d-000000000000000000\build\src\dom\media\AudioStream.cpp, line 513
[5916] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file c:\builds\moz2_slave\m-cen-w32-d-000000000000000000\build\src\dom\media\AudioStream.cpp, line 393

Aurora:
[724] WARNING: AudioStream::OpenCubeb() 16dda4e0 failed to init cubeb: file c:\builds\moz2_slave\m-aurora-w32-d-000000000000000\build\dom\media\AudioStream.cpp, line 513
[724] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file c:\builds\moz2_slave\m-aurora-w32-d-000000000000000\build\dom\media\AudioStream.cpp, line 393
[724] WARNING: Initializing AudioStream failed.: file c:\builds\moz2_slave\m-aurora-w32-d-000000000000000\build\dom\media\AudioSink.cpp, line 155

Note, this doesn't get us the local vars/etc of a debugger backtrace, but it may point to the event that laid the landmine

Updated

4 years ago
Crash Signature: [@ RtlpWaitOnCriticalSection | audioses.dll@0x3ec0] [@ RtlpWaitOnCriticalSection | RtlpDeCommitFreeBlock | `anonymous namespace''::owned_critical_section::enter()] [@ RtlpWaitOnCriticalSection | ATL::CComObject<CAudioClient>::`vector deleting destructor… → [@ RtlpWaitOnCriticalSection | audioses.dll@0x3ec0] [@ RtlpWaitOnCriticalSection | RtlpDeCommitFreeBlock | `anonymous namespace''::owned_critical_section::enter()] [@ RtlpWaitOnCriticalSection | ATL::CComObject<CAudioClient>::`vector deleting destructor…

Comment 29

4 years ago
(In reply to Matthew Gregan [:kinetik] from comment #25)
> Preparing a backout patch for beta and aurora now.  If we decided we want it
> on aurora for MSE, we can backport it again after it has stabilized.

We are trying to ship MSE in 37, that is still the plan. And 37 is on Aurora now.
Comment on attachment 8564804 [details] [diff] [review]
bug1133190_v0.patch

As Kairo said, we want to ship MSE in 37. So, we want to fix the bug itself in 37.
Attachment #8564804 - Flags: approval-mozilla-beta?
Attachment #8564804 - Flags: approval-mozilla-beta+
Attachment #8564804 - Flags: approval-mozilla-aurora?
Attachment #8564804 - Flags: approval-mozilla-aurora-
(Assignee)

Comment 31

4 years ago
So, Alice's log show that netwerk is failing at some cache operation, while instantiating nsBinaryStreams, MediaSource is failing at reading some data (possibly related), the MP4 reader is failing at reading data from cache, gfx is failing at allocating texture, and cubeb is failing at allocating a resampler.

Overall, it does not look great.
(Assignee)

Comment 33

4 years ago
The try push in comment 32 is based on current inbound, and includes various patches that fix various issues with the code that seem to cause the issues, here, but more importantly, enables more logging.

Alice, you seem to be able to reproduce easily (I've all day with tried the same driver version you mention on a dell machine and on a lenovo machine, with various audio settings, no luck), would you mind grabbing the builds [0] from the try push and giving us logs ? Also, if possible, could you enable the MediaDecoder:5 NSPR log as well?

[0]: https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/paul@paul.cx-f6aab427b0cb/
Flags: needinfo?(alice0775)

Comment 34

4 years ago
Posted file logtrybuild.zip
log
set NSPR_LOG_MODULES=timestamp,nsHttp:5,nsSocketTransport:5,nsStreamPump:5,nsHostResolver:5,MediaDecoder:5
Flags: needinfo?(alice0775)
(Assignee)

Comment 35

4 years ago
Alice, do you confirm attachment 8565135 [details] shows a crash? I would also ideally need the stdout for this process, if possible, as my custom logging is not hooked to the NSPR logging system.
Flags: needinfo?(alice0775)

Comment 36

4 years ago
(In reply to Paul Adenot (:padenot) from comment #35)
> Alice, do you confirm attachment 8565135 [details] shows a crash? 

Yes, the try OPT build crashed.

> I would
> also ideally need the stdout for this process, if possible, as my custom
> logging is not hooked to the NSPR logging system.

Is there a way to logging both stdout and NSPR log?
The following is not enough?
> set NSPR_LOG_MODULES=timestamp,nsHttp:5,nsSocketTransport:5,nsStreamPump:5,nsHostResolver:5,MediaDecoder:5
> firefox.exe

Which build do you want to logging? the try OPT build or try DEBUG build?
Flags: needinfo?(alice0775) → needinfo?(padenot)
(Assignee)

Comment 37

4 years ago
(In reply to Alice0775 White from comment #36)
> (In reply to Paul Adenot (:padenot) from comment #35)
> > Alice, do you confirm attachment 8565135 [details] shows a crash? 
> 
> Yes, the try OPT build crashed.
> 
> > I would
> > also ideally need the stdout for this process, if possible, as my custom
> > logging is not hooked to the NSPR logging system.
> 
> Is there a way to logging both stdout and NSPR log?
> The following is not enough?
> > set NSPR_LOG_MODULES=timestamp,nsHttp:5,nsSocketTransport:5,nsStreamPump:5,nsHostResolver:5,MediaDecoder:5
> > firefox.exe
> 
> Which build do you want to logging? the try OPT build or try DEBUG build?

The opt build. In fact, it should not matter, I enabled assertion and logging on both.
Flags: needinfo?(padenot)

Comment 38

4 years ago
Posted file log.txt
log.txt(Standard Output and Standard Error) crashed with try DEBUG build.

Comment 39

4 years ago
Posted file NSPR_LOG_FILE.zip
NSPR_LOG_FILE with try DEBUG build (logged at the same time as Comment 38)
Assignee: kinetik → padenot
(In reply to Sylvestre Ledru [:sylvestre] from comment #30)
> Comment on attachment 8564804 [details] [diff] [review]
> bug1133190_v0.patch
> 
> As Kairo said, we want to ship MSE in 37. So, we want to fix the bug itself
> in 37.

I think it's a mistake to keep these changes on Aurora until they have stabilized on Nightly, but okay.

Backout landed on beta: https://hg.mozilla.org/releases/mozilla-beta/rev/82339d98aa6a
Comment on attachment 8564804 [details] [diff] [review]
bug1133190_v0.patch

(In reply to Matthew Gregan [:kinetik] from comment #40)
> I think it's a mistake to keep these changes on Aurora until they have
> stabilized on Nightly, but okay.

I agree with Matthew. Our goal is to stabilize Aurora. We should backout this patch and reland a proper patch when it is ready. I am reverting the previous Aurora decision (Sylvestre and I have already discussed this) and approving this backout for Aurora.

Aurora+
Attachment #8564804 - Flags: approval-mozilla-aurora- → approval-mozilla-aurora+
Crash Signature: [@ RtlpWaitOnCriticalSection | audioses.dll@0x3ec0] [@ RtlpWaitOnCriticalSection | RtlpDeCommitFreeBlock | `anonymous namespace''::owned_critical_section::enter()] [@ RtlpWaitOnCriticalSection | ATL::CComObject<CAudioClient>::`vector deleting destructor… → [@ RtlpWaitOnCriticalSection | audioses.dll@0x3ec0] [@ RtlpWaitOnCriticalSection | RtlpDeCommitFreeBlock | `anonymous namespace''::owned_critical_section::enter()] [@ RtlpWaitOnCriticalSection | ATL::CComObject<CAudioClient>::`vector deleting destructor…
(Assignee)

Comment 43

4 years ago
Posted patch patchSplinter Review
So, the issue is that when you have some combination of hardware/driver, and try to init too much AudioClients, you get back an 0x88890017 error very often. This is WASAPI telling us the mixer process does not have enough CPU to create another stream. Since we recreate streams when we seek, we end up with a bunch of streams.

At the same time, it looks like (looking at the memory footprint of the audio mixer process in the task manager), that the system takes some time before un-registering a stream on the server side (maybe this depends on the driver, I can only see this on one machine, the others are fine), meaning we have a bunch of stream going on at the same time on the server side, and it's the server side that seem to decide whether we can create a new stream or not.

Also we do the resampling on the client side, which can be costly (it depends on the resampling ratio, and most of the time, it's 44100 -> 48000 or the opposite, which is not a simple case).

All the crashes we've seen in the past few days were caused by the auto_unlock mutex trying to re-lock on a deleted mutex, because we just went into the error handling path, and just wasapi_stream_destroy-ed the stream, and the number of crashes are hopefully explained by the above.

I filed bug 1134258 and bug 1134263 as follow ups.
Attachment #8566072 - Flags: review?(kinetik)
Attachment #8566072 - Flags: review?(kinetik) → review+
(In reply to Matthew Gregan [:kinetik] from comment #42)
> Backed landed on aurora:
> https://hg.mozilla.org/releases/mozilla-aurora/rev/e62dd2ded2fc

*Backout* landed on aurora: https://hg.mozilla.org/releases/mozilla-aurora/rev/e62dd2ded2fc
https://hg.mozilla.org/mozilla-central/rev/35b8aaad1312
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla38

Updated

4 years ago
Depends on: 1134977

Updated

4 years ago
No longer depends on: 1134977
Depends on: 1133381
You need to log in before you can comment on or make changes to this bug.