Closed Bug 945335 Opened 11 years ago Closed 6 years ago

crash in audioTrack_callBack_pullFromBuffQueue

Categories

(Core :: Audio/Video: MediaStreamGraph, defect, P3)

ARM
Gonk (Firefox OS)
defect

Tracking

()

RESOLVED WONTFIX
Tracking Status
b2g-v1.3 --- affected
b2g-v1.4 --- affected

People

(Reporter: nhirata, Unassigned)

References

Details

(Keywords: crash, steps-wanted, Whiteboard: [b2g-crash])

Crash Data

This bug was filed from the Socorro interface and is 
report bp-03c6f332-f1cc-442a-8782-e1f6b2131125.
=============================================================
Crashing Thread
Frame 	Module 	Signature 	Source
0 	libc.so 	libc.so@0xddd8 	
1 	libwilhelm.so 	audioTrack_callBack_pullFromBuffQueue 	system/media/wilhelm/src/android/AudioPlayer_to_android.cpp
2 	libmedia.so 	android::AudioTrack::processAudioBuffer 	frameworks/base/media/libmedia/AudioTrack.cpp
3 	libmedia.so 	android::AudioTrack::AudioTrackThread::threadLoop 	frameworks/base/media/libmedia/AudioTrack.cpp
4 	libutils.so 	android::Thread::_threadLoop 	frameworks/base/libs/utils/Threads.cpp
5 	libutils.so 	thread_data_t::trampoline 	frameworks/base/libs/utils/Threads.cpp
6 	libc.so 	__thread_entry 	bionic/libc/bionic/pthread.c
7 	libc.so 	pthread_create 	bionic/libc/bionic/pthread.c

More crashes : 
https://crash-stats.mozilla.com/report/list?product=B2G&signature=audioTrack_callBack_pullFromBuffQueue

Looks like some crash in the audio?
Component: Vendcom → Video/Audio
Product: Firefox OS → Core
Just hit this on my Buri device running nightly, although I do not have good STR. When I turned the phone back on the crash reporter was already on the device. Might have been playing a track from the lockscreen at some point.

Gaia   8fca2ca67e8a6022fe6ed8cb576e5d59dfb5237f
SourceStamp 1401e4b394ad
BuildID 20131206040203
Version 28.0a1
Ni on :sotaro to see if there is anything useful in the crash-stats ? Also adding steps-wanted to get concrete STR here
Flags: needinfo?(sotaro.ikeda.g)
Keywords: steps-wanted
This won't hit our QA Wanted queries, so I'm adding needinfo to Sarah to find someone to get the STR on this bug.
Flags: needinfo?(sparsons)
Going to hold off on trying to find STR on this issue until we can get some more info from Sotaro (see comment 2) 

We tried multiple scenarios to repro on both 1.3 and 1.4 and were unsuccessful. 

Once more info is provided, we can provide accurate STR.
Flags: needinfo?(sparsons)
From the crash log, it is not clear which part of audioTrack_callBack_pullFromBuffQueue() failed. The function is very stable. The function's code is almost same in android KK(4.4). From the source code, one possibility is that the crash happen by buffer deletion before memcpy.

> memcpy (pBuff->raw, pSrc, pBuff->size);

From libcubeb, buffer is enqueued to OpenSL by IBufferQueue_Enqueue(). It is called by bufferqueue_callback() in cubeb_opensl.c. The buffer is not copied, just stored in BufferQueue. It is copied to AudioTrack's buffer in audioTrack_callBack_pullFromBuffQueue(). If the buffer is deleted before the copy or the written buffer size is incorrect, the crash could happen. But it is a one possibility.

http://androidxref.com/4.0.4/xref/system/media/wilhelm/src/itf/IBufferQueue.c#46



http://androidxref.com/4.0.4/xref/system/media/wilhelm/src/android/AudioPlayer_to_android.cpp
Flags: needinfo?(sotaro.ikeda.g)
padenot, can you give us a comment about this bug?
Flags: needinfo?(paul)
Sotaro, comment 5 sounds plausible we are crashing in libc.so@0xddcc, so I grabbed a libc.so from a keon, and saw:

$ readelf -s libc-keon.so | tr -s ' ' | cut -d ' ' -f 3-9 | sort | grep -C 2 "\bmemcpy\b"
0000d8fc 320 FUNC GLOBAL DEFAULT 7 memcmp
0000db60 504 FUNC GLOBAL DEFAULT 7 __memcmp16
0000dd60 0 FUNC GLOBAL DEFAULT 7 memcpy
0000df00 0 FUNC GLOBAL DEFAULT 7 bzero
0000df08 0 FUNC GLOBAL DEFAULT 7 memset

memcpy is at 0xdd60 in the binary I have, but maybe I don't have the exact right build, so we can imagine we are crashing in memcpy here.

Quoting the SLES spec[1], page 174:

> The buffers that are queued are used in place and are not required to be copied by the
> device, although this may be implementation-dependent. The application developer
> should be aware that modifying the content of a buffer after it has been queued is
> undefined and can cause audio corruption.

So we are not required to copy, here. Then, page 343 (emphasis mine):

> All state transitions are legal. The state defaults to
> SL_PLAYSTATE_STOPPED. *Note that although the state change is
> immediate, there may be some latency between the execution of
> this method and its effect on behavior*. In this sense, a player’s
> state technically represents the application’s intentions for the
> player. 

In `opensl_stream_destroy`, we might want to set the state to STOPPED (all the examples in the spec do so). Then, we might want to wait until the state is actually STOPPED to destroy the buffers, using IPlay_GetPlayState. This will ensure we won't have a callback firing right after we free() the buffer queue. The examples don't do that, but they maybe they are meant to be simple.

[1]: http://androidxref.com/4.0.4/xref/system/media/wilhelm/doc/OpenSL_ES_Specification_1.0.1.pdf
Flags: needinfo?(paul)
Sotaro-san, please see comment 7.
Flags: needinfo?(sotaro.ikeda.g)
audioTrack_callBack_pullFromBuffQueue is called in AudioTrack callback thread. And the thread is stopped when AudioTrack::stop() is called. AudioTrack::stop() is called when OpenSL audio player is set to SL_PLAYSTATE_STOPPED state.
Flags: needinfo?(sotaro.ikeda.g)
opensl_stream_stop() does not set OpenSL audio player state to SL_PLAYSTATE_STOPPED, but to SL_PLAYSTATE_PAUSED.
http://mxr.mozilla.org/mozilla-central/source/media/libcubeb/src/cubeb_opensl.c#546
padenot, is there a reason to use SL_PLAYSTATE_PAUSED in opensl_stream_stop()?
Flags: needinfo?(paul)
libcubeb does not use SL_PLAYSTATE_STOPPED state except error handling.
libcubeb's *_stream_stop functions are in fact closer to the "pause" semantic of OpenSL: the stream should be able to be restarted. We should try to implement comment 7 and see what happens. It does not seem hard, and worst case, nothing happens, we remove the code and continue searching.
Flags: needinfo?(paul)
The particular crash happened with the comms app on top.  Most of them seem like this.
There's another crash that happened from the marketplace on top.
Another with : http://play.esviji.com/manifest.webapp 

What would cause the buffer to empty from the queue?
Maybe a telephone call?
Priority: -- → P1
This issue is still pretty active in the crash reports - but not seeing any updates or direction for where to look for STR's.  We've been trying combinations of playing music and locking and unlocking the device, rebooting the device, etc. but are not seeing this issue on the Flame at all.
Actually, I was able to repro it in the debug build of the master branch. See: https://bugzilla.mozilla.org/show_bug.cgi?id=1147386#c11.

Sotaro, would above info give an additional clue to resolving this issue?
Flags: needinfo?(sotaro.ikeda.g)
(In reply to No-Jun Park [:njpark] from comment #16)
> Actually, I was able to repro it in the debug build of the master branch.
> See: https://bugzilla.mozilla.org/show_bug.cgi?id=1147386#c11.
> 
> Sotaro, would above info give an additional clue to resolving this issue?


I do not think it is dup of bug 945335. It calls __android_log_assert() in the stack and crash reason is SIGABRT. bug 945335 does not have __android_log_assert() in crash stack and crash reason is SIGSEGV. 

https://crash-stats.mozilla.com/report/list?product=B2G&signature=audioTrack_callBack_pullFromBuffQueue#tab-reports
Flags: needinfo?(sotaro.ikeda.g)
(In reply to No-Jun Park [:njpark] from comment #16)
> Actually, I was able to repro it in the debug build of the master branch.
> See: https://bugzilla.mozilla.org/show_bug.cgi?id=1147386#c11.

Therefore it seems better to handle as a different bug.
Blocks: 1150128
Component: Audio/Video → Audio/Video: Playback
Component: Audio/Video: Playback → Audio/Video: MSG/cubeb/GMP
Does anyone know if this is still a very active crash?  If not, I'd like to drop the priority.  

Hi Sotaro -- If this is still very active, do you have any bandwidth to look at this?  Paul's recommendation in comment 7 seems the best way forward.  (Paul is incredibly busy getting ready for TPAC, and he's helping out on a nasty B2G blocker bug when he has free cycles.)  What do you think?
Flags: needinfo?(sotaro.ikeda.g)
We seems not have a crash on recent gecko.

njpark, is it correct?
Flags: needinfo?(npark)
This bug hasn't been seen recently. I think dropping the priority is a good idea.  Also Bug 1147386 is gone as well.
Flags: needinfo?(npark)
(In reply to Maire Reavy [:mreavy] (Plz needinfo me) from comment #19)
> Does anyone know if this is still a very active crash?  If not, I'd like to
> drop the priority.  
> 
> Hi Sotaro -- If this is still very active, do you have any bandwidth to look
> at this?  Paul's recommendation in comment 7 seems the best way forward. 
> (Paul is incredibly busy getting ready for TPAC, and he's helping out on a
> nasty B2G blocker bug when he has free cycles.)  What do you think?

From Comment 21, it is not a active crash and we could drop the priority.
Flags: needinfo?(sotaro.ikeda.g)
Thanks for checking and updating the bug!  I'm going to drop this to a P2 (relative to other MSG priorities).  It seems we'll want someone to try Paul's suggestion in comment 7, but we don't need to do so immediately.
Severity: critical → normal
Rank: 25
Keywords: topcrash-b2g
Priority: P1 → P2
Mass change P2->P3 to align with new Mozilla triage process.
Priority: P2 → P3
Closing because no crash reported since 12 weeks.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
Closing because no crash reported since 12 weeks.
You need to log in before you can comment on or make changes to this bug.