Closed Bug 839007 Opened 7 years ago Closed 7 years ago

[Audio] Music app will be crashed after seeking or playing next song in backend of Cubeb + OpenSLES (Not enabled on v1.x.x)

Categories

(Firefox OS Graveyard :: General, defect)

ARM
Gonk (Firefox OS)
defect
Not set

Tracking

(firefox22 fixed, b2g18 fixed, b2g18-v1.0.0 wontfix, b2g18-v1.0.1 wontfix)

RESOLVED FIXED
Tracking Status
firefox22 --- fixed
b2g18 --- fixed
b2g18-v1.0.0 --- wontfix
b2g18-v1.0.1 --- wontfix

People

(Reporter: mchen, Assigned: mchen)

References

Details

(Keywords: crash, Whiteboard: [b2g-crash])

Attachments

(3 files, 1 obsolete file)

After enabling "cubed + OpenSLES" on unagi phone, crash will be happened on music app when user try to seek or play next song.
Severity: normal → critical
Keywords: crash
Whiteboard: [b2g-crash]
hi Marco, can you please add the logcat and if the crash occurred, can you get the crash id from /data/b2g/mozilla/Crash Reports/submitted/ please?
Note that, current B2G V1.x.x doesn't enable cubed + OpenSL but Sydney + AudioTrack. So I set the Severity down to normal only.
Severity: critical → normal
Summary: [Audio] Music app will be crashed after seeking or playing next song in backend of Cubed + OpenSLES (Unagi) → [Audio] Music app will be crashed after seeking or playing next song in backend of Cubed + OpenSLES (Not enabled on v1.x.x)
The crash is happened at AudioTrack::createTrack_l() and line as below.

mCblk = static_cast<audio_track_cblk_t*>(cblk->pointer());
android_atomic_or(CBLK_DIRECTION_OUT, &mCblk->flags); <-- Crash Point

After checking by GDB, the mCblk 's point address is 0.
Attached patch Patch for timing of dlclose() v1 (obsolete) — Splinter Review
The lib of openSL ES should be closed after outMix and engine object are destroyed. This is the one issue of crash on seeking or going to next song.
Attachment #715915 - Flags: review?(kinetik)
Attachment #715915 - Flags: review?(kinetik) → review+
Summary: [Audio] Music app will be crashed after seeking or playing next song in backend of Cubed + OpenSLES (Not enabled on v1.x.x) → [Audio] Music app will be crashed after seeking or playing next song in backend of Cubeb + OpenSLES (Not enabled on v1.x.x)
To check this in to the tree, please submit a pull request to the github repo at http://github.com/kinetiknz/cubeb, and then use media/libcubeb/update.sh to update the in-tree copy.
Assignee: nobody → mchen
Status: NEW → ASSIGNED
1. Follow the update script in libcubed.
2. Add reviewer.
3. wait for try. https://tbpl.mozilla.org/?tree=Try&rev=7713985e1d33
Attachment #715915 - Attachment is obsolete: true
Attachment #715947 - Flags: review+
Try server is all green.
And please help to commit  "Patch for timing of dlclose() Checkin Version".

Thanks.
Keywords: checkin-needed
For second crash issue reported from Comment 3 and which also reported on bug 698328, I found it can be re-produced in bankend of sydney + audiotrack, but it is protected by a workaround (pleaser refer to the link)

http://mxr.mozilla.org/mozilla-central/source/media/libsydneyaudio/src/sydney_audio_gonk.cpp#211

It seems to be relationship on the binder connection between media server and content process.
In cubeb_stream_destroy(), I try to keep one playerObj (it will keep on AudioTrack) alive. Then crash is gone just like what sydney_audio_gonk do now.
https://hg.mozilla.org/mozilla-central/rev/c853ef949fd7
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Hi Ryan,

Thanks for the help here. And I reopen this bug because there is another crash issue on comment 6.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Just a small nit pick on process - in the case where you are landing something that's not fixing the bug entirely, use [leave-open] in the whiteboard.
(In reply to Jason Smith [:jsmith] from comment #13)
> Just a small nit pick on process - in the case where you are landing
> something that's not fixing the bug entirely, use [leave-open] in the
> whiteboard.

Thanks for your reminding and I will follow next time.
(In reply to Marco Chen [:mchen] from comment #3)
> Created attachment 711745 [details]
> Minidump 1: During Seeking on Music App
> 
> The crash is happened at AudioTrack::createTrack_l() and line as below.
> 
> mCblk = static_cast<audio_track_cblk_t*>(cblk->pointer());
> android_atomic_or(CBLK_DIRECTION_OUT, &mCblk->flags); <-- Crash Point
> 
> After checking by GDB, the mCblk 's point address is 0.

Refer to Bug 803471, it added code as below into b2g main function then this issue is gone.
android::ProcessState::self()->startThreadPool();
With this code, to remove workaround mentioned on comment 9 for sydney is working well too.

Without this code, the binder transaction for creating a IMemoryHeap between AudioFlinger and AudioTrack (MediaDecoder thread) will be failed when AudioFlinger tried to send back the binder reply (in binder driver).

Will find the reason then asking for committing that code.
From binder_thread_read() in binder.c, it seems that the ThreadPool for Binder is necessary for Binder ipc.

Related code is follwoing.

>	if (wait_for_proc_work) {
>		if (!(thread->looper & (BINDER_LOOPER_STATE_REGISTERED |
>					BINDER_LOOPER_STATE_ENTERED))) {
>			binder_user_error("binder: %d:%d ERROR: Thread waiting "
>				"for process work before calling BC_REGISTER_"
>				"LOOPER or BC_ENTER_LOOPER (state %x)\n",
>				proc->pid, thread->pid, thread->looper);
>			wait_event_interruptible(binder_user_error_wait,
>						 binder_stop_on_user_error < 2);
>		}
Just FYI about Binder ipc

http://elinux.org/Android_Binder
To investigate further about the binder ipc problem. It is necessary to debug about binder.c. I do not know how to build kernel of unagi device. Is there an information about it?
CCing people that might be able to answer comment 18.
Another FYI.

When the ThreadPool is enabled, app process receive Binder transaction from AudioFlinger. Following function is called.
- AudioSystem::AudioFlingerClient::ioConfigChanged()

When the ThreadPool is disabled, app process do not receive the transaction. It is ONEWAY transaction. Therefore it do not block AudioFlinger's thread. But the transaction is going to stay forever within binder driver.
One more FYI.

Following functions' return values could change depends on the AudioFlinger's state. But if the ThreadPool is not enabled. Following functions do not updated correctly. 
  - AudioSystem::getOutputSamplingRate()
  - AudioSystem::getOutputFrameCount()
  - AudioSystem::getOutputLatency()

For example, AudioTrack::latency() change depends on bluetooh on/off. It affect to Audio/Video synchronization. FirefoOS v1 do not care about the Audio/Video synchronization. But it becomes important in future FirefoxOS.
(In reply to Sotaro Ikeda [:sotaro] from comment #16)
> From binder_thread_read() in binder.c, it seems that the ThreadPool for
> Binder is necessary for Binder ipc.
> 
> Related code is follwoing.
> 

Thanks for all your information first.

1. I tried to fill the binder's debug_mask to all 1 then I saw much kernel debug messages about binder but no error message from that binder_user_error. Si it seems be not the root cause.

2. Even I only add code as below into b2g main function (not create IPC thread pool), this crash is also gone.

   android::sp<android::ProcessState> proc(android::ProcessState::self());

3. I will try to figure it out from the view of kernel binder driver by one of our partner's open repository which contained kernel code.
(In reply to Marco Chen [:mchen] from comment #22)

> 2. Even I only add code as below into b2g main function (not create IPC
> thread pool), this crash is also gone.
> 
>    android::sp<android::ProcessState> proc(android::ProcessState::self());

I confirmed it works. It is weird. ProcessState is automatically created when it is necessary. And ProcessState::ProcessState() just opens "/dev/binder" driver.
Hi all, 

After adding more debug messages into kernel, I found the exactly position of this error but still don't know why. Maybe someone can give advice here. Thanks.

Description: AudioTrack tried to ask a memory heap (share memory) from AudioFlinger. The reply of binder transaction contained the fd from AudioFlinger. Then binder driver tried to map this fd to AudioTrack's process fd table.

Issue: Please refer to the link as below. After binder got a new mapped fd, it will check whether this fd number is larger then "rlim[RLIMIT_NOFILE].rlim_cur". Then in this case I found that rlim_cur is 0 and rlim_max is -789577504.

https://www.codeaurora.org/gitweb/quic/la/?p=kernel/msm.git;a=blob;f=drivers/staging/android/binder.c;h=2d097aa8f7b92254037e9e9707e23e98a87fd3cc;hb=HEAD#l353

Note: This issue doesn't happen in the first song and in that case the rlim_cur is 1024 and rlim_max is 4096.
I found that rlim_xxx is stored in thread_info->task (task_struct). and binder will maintain a single binder_proc structure for each process and binder_proc->tsk will be assigned to first thread's thread_info->task.

In the none crash case: We used main thread as a first thread to open binder so binder_proc->tsk will be from main thread's thread info.

In the crash case: We used audiostream thread as a first thread to open binder but this thread will be exist.

So in the crash case, is it possible that the single binder_proc->tsk for music app is damaged because the audiostream thread already be gone?

Thanks for all your advice here.
(In reply to Marco Chen [:mchen] from comment #25)
> So in the crash case, is it possible that the single binder_proc->tsk for
> music app is damaged because the audiostream thread already be gone?

Yes, binder_proc->tsk became already invalid. I understand now why the problem happens. Thank for the information!

When the binder diriver is opened, "current task" is stored to proc->tsk in binder_open(). "current task" is just an application's thread running in kernel context. And the thread is the first thread that calls IPCThreadState::self() in the application. Then if the thread is killed, the proc->tsk becomes invalid.

Therefore, the thread that calls IPCThreadState::self() first time should live as long as the application's lifetime. And main thread of an application is the correct thread for it.
Please refer to comment 25 & 26 which show that binder registration should be performed on main thread.
Attachment #720561 - Flags: review?(mwu)
Comment on attachment 720561 [details] [diff] [review]
Binder Registration on Main Thread v1

It looks like sotaro has a patch which fixes this on bug 803471, so clearing review. I made a suggestion for a comment which includes the information in your comment.
Attachment #720561 - Flags: review?(mwu)
Blocks: 848581
Close this bug due to land of bug 803471.
Status: REOPENED → RESOLVED
Closed: 7 years ago7 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.