Closed Bug 924678 Opened 6 years ago Closed 6 years ago

crash in PR_EnterMonitor | android::OmxDecoder::NotifyDataArrived(char const*, unsigned int, long long)

Categories

(Firefox OS Graveyard :: General, defect, critical)

ARM
Gonk (Firefox OS)
defect
Not set
critical

Tracking

(blocking-b2g:koi+, firefox26 fixed, firefox27 fixed, b2g-v1.2 fixed)

RESOLVED FIXED
blocking-b2g koi+
Tracking Status
firefox26 --- fixed
firefox27 --- fixed
b2g-v1.2 --- fixed

People

(Reporter: nhirata, Assigned: tzimmermann)

References

Details

(Keywords: crash, Whiteboard: [b2g-crash])

Crash Data

Attachments

(4 files, 1 obsolete file)

This bug was filed from the Socorro interface and is 
report bp-abf94e3f-a1de-4c1d-b2db-cbd8b2131003.
=============================================================
This bug was filed from the Socorro interface and is 
report bp-abf94e3f-a1de-4c1d-b2db-cbd8b2131003.
=============================================================
Frame 	Module 	Signature 	Source
0 	libnss3.so 	PR_EnterMonitor 	nsprpub/pr/src/pthreads/ptsynch.c
1 	libxul.so 	android::OmxDecoder::NotifyDataArrived(char const*, unsigned int, long long) 	/builds/slave/b2g_m-aurora_hamachi_ntly-0000/build/objdir-gecko/content/media/omx/../../../dist/include/mozilla/ReentrantMonitor.h
2 	libxul.so 	mozilla::OmxDecoderNotifyDataArrivedRunnable::Run() 	content/media/omx/OmxDecoder.cpp
3 	libxul.so 	nsThread::ProcessNextEvent(bool, bool*) 	xpcom/threads/nsThread.cpp
4 	libxul.so 	NS_ProcessNextEvent(nsIThread*, bool) 	xpcom/glue/nsThreadUtils.cpp
5 	libxul.so 	nsThread::Shutdown() 	xpcom/threads/nsThread.cpp
6 	libxul.so 	mozilla::ShutdownThreadEvent::Run() 	content/media/VideoUtils.h
7 	libxul.so 	nsThread::ProcessNextEvent(bool, bool*) 	xpcom/threads/nsThread.cpp
8 	libxul.so 	NS_ProcessNextEvent(nsIThread*, bool) 	xpcom/glue/nsThreadUtils.cpp
9 	libxul.so 	mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) 	ipc/glue/MessagePump.cpp
10 	libxul.so 	mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate*) 	ipc/glue/MessagePump.cpp
11 	libxul.so 	MessageLoop::RunInternal() 	ipc/chromium/src/base/message_loop.cc
12 	libxul.so 	MessageLoop::Run() 	ipc/chromium/src/base/message_loop.cc
13 	libxul.so 	nsBaseAppShell::Run() 	widget/xpwidgets/nsBaseAppShell.cpp
14 	libxul.so 	XRE_RunAppShell 	toolkit/xre/nsEmbedFunctions.cpp
15 	libxul.so 	mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate*) 	ipc/glue/MessagePump.cpp
16 	libxul.so 	MessageLoop::RunInternal() 	ipc/chromium/src/base/message_loop.cc
17 	libxul.so 	MessageLoop::Run() 	ipc/chromium/src/base/message_loop.cc
18 	libxul.so 	XRE_InitChildProcess 	toolkit/xre/nsEmbedFunctions.cpp
19 	plugin-container 	main 	ipc/app/MozillaRuntimeMain.cpp
20 	libc.so 	__libc_init 	bionic/libc/bionic/libc_init_dynamic.c
21 		@0xb00045a9 	

Note:
1. Crash in Video app for this crash; looks like it can also happen in Music app as well
2. More reports : https://crash-stats.mozilla.com/report/list?product=B2G&signature=PR_EnterMonitor+|+android%3A%3AOmxDecoder%3A%3ANotifyDataArrived%28char+const*%2C+unsigned+int%2C+long+long%29
OS: Android → Gonk (Firefox OS)
Hardware: All → ARM
Locks like PRMonitor::owner is null, because OmxDecoder::mDecoder has been cleaned up. The source code is hard to understand, but MediaDecoderStateMachine.cpp:2127 could be the reason.
I cannot reproduce the problem, but given the cleanup of OmxDecoder::mDecoder like described in my previous comment here is an attempt to fix the problem.

The read-and-parse logic of the MP3 parser runs on the main and the I/O thread. All decoder cleanups seem to be done in nsDecoderDisposeEvent::Run, which also runs on the main thread. This problem only concerns OmxDecoder::mDecoder., the other fields are ref-counted. Usage and cleanup of OmxDecoder::mDecoder both happen on the main thread. By holding a weak reference to OmxDecoder, we can detect when OmxReader got deleted and its fields got cleared. In this case, the parser logic stops.
Attachment #814875 - Flags: feedback?(sotaro.ikeda.g)
I also think OmxDecoder::mDecoder has been cleaned up is a problem. But "Use weak pointers for OmxDecoder" is not a good way to solve the problem. OmxDecoder's life time is not a problem. The problem is AbstractMediaDecoder(MediaDecoder)'s lifetime.

OmxDecoder::mDecoder is not cleared during the OmxDecoder's lifetime. I think it is a prblem. After the MediaDecoder is deleted, OmxDecoder::mDecoder becomes a dangling pointer. OmxDecoder::mDecoder need's to be cleared before MediaDecoder is deleted. By the change, OmxDecoder could know MediaDecoder is already deleted. And in current gecko, OmxDecoderProcessCachedDataTask run in main thread, and MediaDecoder's lifetime is controlled in main thread. Therefore, MediaDecoder living check do not need to be so often.
(In reply to Sotaro Ikeda [:sotaro] from comment #3)
> and MediaDecoder's
> lifetime is controlled in main thread. Therefore, MediaDecoder living check
> do not need to be so often.

Sorry, it is not on the main thread.
(In reply to Sotaro Ikeda [:sotaro] from comment #4)
> (In reply to Sotaro Ikeda [:sotaro] from comment #3)
> > and MediaDecoder's
> > lifetime is controlled in main thread. Therefore, MediaDecoder living check
> > do not need to be so often.
> 
> Sorry, it is not on the main thread.

It seems only OmxDecoderProcessCachedDataTask.
OmxDecoderProcessCachedDataTask is triggered by OmxDecoder::ProcessCachedData(). It is called only by synchronously. So, OmxDecoderProcessCachedDataTask might be excluded from this problem.
Clearing mDecoder explicitly also seems to fix the problem.
Attachment #814875 - Attachment is obsolete: true
Attachment #814875 - Flags: feedback?(sotaro.ikeda.g)
Attachment #815280 - Flags: feedback?(sotaro.ikeda.g)
Stop parsing if the decoder got cleared, and a small bug fix.
Attachment #815281 - Flags: feedback?(sotaro.ikeda.g)
Comment on attachment 815280 [details] [diff] [review]
[01] Bug 924678: Explicitly clear OmxDecoder::mDecoder

Looks good to me.
Attachment #815280 - Flags: feedback?(sotaro.ikeda.g) → feedback+
Attachment #815281 - Flags: feedback?(sotaro.ikeda.g) → feedback+
Comment on attachment 815280 [details] [diff] [review]
[01] Bug 924678: Explicitly clear OmxDecoder::mDecoder

Great, please review.
Attachment #815280 - Flags: review?(sotaro.ikeda.g)
Attachment #815281 - Flags: review?(sotaro.ikeda.g)
In gecko's media playback framework, most classes handle low level(data delivered handling) stuff and high level(media playback control). This makes the problem and interdependency complex. Classes are stick together too much. In future, it might be better to split to low level stuff classes and high level control classes.
Comment on attachment 815280 [details] [diff] [review]
[01] Bug 924678: Explicitly clear OmxDecoder::mDecoder

In media playback framework, doublec basically reviews the code. Change the reviewer to doublec.
Attachment #815280 - Flags: review?(sotaro.ikeda.g) → review?(chris.double)
Attachment #815281 - Flags: review?(sotaro.ikeda.g) → review?(chris.double)
Attachment #815280 - Flags: review?(chris.double) → review+
Attachment #815281 - Flags: review?(chris.double) → review+
Koi?, because I think 1.2 also has this bug.
blocking-b2g: --- → koi?
Blocks: 927884
Please uplift to 1.2 for fixing bug 927242.
(In reply to Thomas Zimmermann [:tzimmermann] [:tdz] from comment #16)
> Please uplift to 1.2 for fixing bug 927242.

you are right. Device Fugu used V1.2 has this bug.
blocking-b2g: koi? → koi+
Crash Signature: [@ PR_EnterMonitor | android::OmxDecoder::NotifyDataArrived(char const*, unsigned int, long long)] → [@ PR_EnterMonitor | android::OmxDecoder::NotifyDataArrived(char const*, unsigned int, long long)] [@ mozilla::OmxDecoderNotifyDataArrivedRunnable::Run() ] [@ android::OmxDecoder::NotifyDataArrived(char const*, unsigned int, long long) ]
Duplicate of this bug: 926125
Blocks: 916996
Can we get this merged to v1.2
Flags: needinfo?(ryanvm)
Assignee: nobody → tzimmermann
Duplicate of this bug: 927242
You need to log in before you can comment on or make changes to this bug.