Closed Bug 990908 Opened 6 years ago Closed 6 years ago

[RTSP] Video app crash at android::OMXCodec::read when opening RTSP streaming

Categories

(Core :: Audio/Video, defect, critical)

ARM
Gonk (Firefox OS)
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla31
blocking-b2g 2.0+
Tracking Status
firefox30 --- unaffected
firefox31 --- fixed
b2g-v1.4 --- unaffected
b2g-v2.0 --- fixed

People

(Reporter: ethan, Assigned: bechen)

References

Details

(Keywords: crash, regression, reproducible, Whiteboard: [b2g-crash])

Crash Data

Attachments

(2 files, 3 obsolete files)

The video app would crash when we click an RTSP link, no matter the media source is video or audio-only.
The crash does not always happen, but the reproduction rate is pretty high.

The GDB backtrace log is attached.
Keywords: regression
Summary: [RTSP] Video app crash when opening RTSP streaming → [RTSP] Video app crash at android::OMXCodec::read when opening RTSP streaming
Not 100% sure, but I think this is a  regression after 2014/3/29 (according to my own unit test results).
Can you include a reduced test case showcasing the bug here?
(In reply to Jason Smith [:jsmith] from comment #2)
> Can you include a reduced test case showcasing the bug here?
We don't have automation test cases for RTSP yet.
If you want to reproduce this bug, you can follow these steps:
1. Open the browser app.
2. Go to the web page: http://goo.gl/lE2eE3
3. Click the first link on this page "AMX test file", which is an audio-only RTSP streaming source.
4. The video app would be launched and crashed soon. The bottom of the browser shows "Video just crashed".

Did I answer you question?
(In reply to Ethan Tseng [:ethan] from comment #3)
> (In reply to Jason Smith [:jsmith] from comment #2)
> > Can you include a reduced test case showcasing the bug here?
> We don't have automation test cases for RTSP yet.
> If you want to reproduce this bug, you can follow these steps:
> 1. Open the browser app.
> 2. Go to the web page: http://goo.gl/lE2eE3
> 3. Click the first link on this page "AMX test file", which is an audio-only
> RTSP streaming source.
> 4. The video app would be launched and crashed soon. The bottom of the
> browser shows "Video just crashed".
> 
> Did I answer you question?

Yes.

Adding qawanted here to get a crash report URL & to check what happens on 1.4.
Keywords: qawanted
I manually backed out this changeset, and the crash doesn't happen.
http://hg.mozilla.org/mozilla-central/rev/35180f110e44

Hi Benjamin, could you help to identify the root cause?
Flags: needinfo?(bechen)
(In reply to Ethan Tseng [:ethan] from comment #5)
> I manually backed out this changeset, and the crash doesn't happen.
> http://hg.mozilla.org/mozilla-central/rev/35180f110e44
> 
> Hi Benjamin, could you help to identify the root cause?

Yes, I'll check it now.
Flags: needinfo?(bechen)
Blocks: 631058
blocking-b2g: --- → 1.5?
QA Contact: sarsenyev
Crash reproduces on 1.5 on Buri
https://crash-stats.mozilla.com/report/index/4443c0bb-cbe0-4e4c-bb1c-84da32140402

The bug doesn't reproduce on 1.4, no any crashes occurred

1.4 Environmental Variables:
Device: Buri 1.4 MOZ
BuildID: 20140326000201
Gaia: 7e705dd4718d528974d99ac31866318d7e201152
Gecko: 4889124accfa
Version: 30.0a2
Firmware Version: v1.2-device.cfg
Keywords: qawanted
Crash Signature: [@ __libc_android_abort | __android_log_assert | android::OMXCodec::read]
Keywords: crash, reproducible
Whiteboard: [b2g-crash]
Component: RTSP → Video/Audio
Product: Firefox OS → Core
Version: unspecified → Trunk
Chris - Can you investigate this? Looks like this was caused by bug 631058.
Flags: needinfo?(cpearce)
I'm working from home today, so I don't have my b2g device to test on, it will have to wait until tomorrow.

Ethan: can you try backing out bug 631508 and testing if the bug goes away? You can quickly remove the changeset locally with the following hg command in your m-c repo:

   hg export -r 35180f110e44 | patch -p1 -R
Flags: needinfo?(cpearce) → needinfo?(ettseng)
(In reply to Chris Pearce (:cpearce) from comment #9)
> Ethan: can you try backing out bug 631508 and testing if the bug goes away?
> You can quickly remove the changeset locally with the following hg command
> in your m-c repo:
>    hg export -r 35180f110e44 | patch -p1 -R
Hi Chris,
I already did this (comment 5).
Yes. When I backed out bug 631508, this bug does away.
I used the following command:
    hg backout --merge -m "Backout changeset 35180f110e44" 35180f110e44
Flags: needinfo?(ettseng)
(In reply to sarsenyev from comment #7)
> Crash reproduces on 1.5 on Buri
> https://crash-stats.mozilla.com/report/index/4443c0bb-cbe0-4e4c-bb1c-
> 84da32140402
> The bug doesn't reproduce on 1.4, no any crashes occurred
Thanks.
I guess the results are reasonable because bug 631508 was landed only to the main trunk but not 1.4.
Although the gdb backtrace shows it crash at |CHECK(seekTimeUs >= 0);|
But my logcat shows it crash at |CHECK_EQ((int)info->mStatus, (int)OWNED_BY_US);|
(In reply to Ethan Tseng [:ethan] from comment #10)
> Yes. When I backed out bug 631508, this bug does away.
Correct bug number.
It should be bug 631058.
Ethan: The patch in bug 631058 will make us not decode and buffer so much decoded data before we start playing. Once we start playing, behaviour will be the same as without the patch. It's hard for me to imagine how this would cause problems for RTSP. Do you know why my patch causes problems for RTSP?
Flags: needinfo?(ettseng)
(In reply to Chris Pearce (:cpearce) from comment #14)
> Ethan: The patch in bug 631058 will make us not decode and buffer so much
> decoded data before we start playing. Once we start playing, behaviour will
> be the same as without the patch. It's hard for me to imagine how this would
> cause problems for RTSP. Do you know why my patch causes problems for RTSP?
Hi Chris, Benjamin is more familiar with the RTSP media part and is investigating into this issue.
Let him reply your question.
Flags: needinfo?(ettseng) → needinfo?(bechen)
Attached patch WIP-bug-990908.patch (obsolete) — Splinter Review
patch: Send play command before readmeta data.

Hi Chris:
I agree with you that your patch should not cause the problem.
It looks more like in some circumstances (timing, threading...), the rtsp doesn't use omx codec correctly. 

I still finding the root cause, and the wip patch works for me now.
Flags: needinfo?(bechen)
(In reply to Benjamin Chen [:bechen] from comment #16)
> I still finding the root cause, and the wip patch works for me now.

Can you take this bug?
Assignee: nobody → bechen
blocking-b2g: 2.0? → 2.0+
Thanks for cyu's help.
Clarify the backtrace is not correct. 
The crash point is |CHECK_EQ((int)info->mStatus, (int)OWNED_BY_US);|
OMXCodec::fillOutputBuffers() always passing buffers that satisfy "BufferInfo::mStatus == OWNED_BY_US" condition to the underlying component without checking OMXCodec::mFilledBuffers in any circumstances.

If some buffers have already been filled on the binder thread and the first OMXCodec::read() was called later, OMXCodec::fillOutputBuffers() will return all buffers which satisfy "BufferInfo::mStatus == OWNED_BY_US" condition to the underlying component. But since all indexes in OMXCodec::mFilledBuffers are not updated when OMXCodec::fillOutputBuffers() was called, OMXCodec::read() will experience some unexpected behaviour which is caused by unsynchronized BufferInfo::mStatus information.

OMXCodec should handle index information in OMXCodec::mFilledBuffers well when doing buffer management.


Kindly help try this patch to see if RTSP works fine or not.
Attachment #8407471 - Flags: feedback?(bechen)
Attached patch bug-990908.patch (obsolete) — Splinter Review
Revert the change of bug 938512, let the RTSP's code flow align to others,
that we can avoid the bug mentioned in comment 19.
Attachment #8402538 - Attachment is obsolete: true
Attachment #8408156 - Flags: feedback?(cpearce)
Comment on attachment 8408156 [details] [diff] [review]
bug-990908.patch

Review of attachment 8408156 [details] [diff] [review]:
-----------------------------------------------------------------

Looks reasonable...
Attachment #8408156 - Flags: feedback?(cpearce) → feedback+
Attachment #8408156 - Flags: review?(cpearce)
Comment on attachment 8407471 [details] [diff] [review]
bug990908_refine_mfilledbuffers.patch

Review of attachment 8407471 [details] [diff] [review]:
-----------------------------------------------------------------

Hi Bruce:
After apply your patch, it works great. No crash anymore.
Attachment #8407471 - Flags: feedback?(bechen) → feedback+
Comment on attachment 8407471 [details] [diff] [review]
bug990908_refine_mfilledbuffers.patch

Hi Sotaro, may I have your feedback about this patch?
Attachment #8407471 - Flags: feedback?(sotaro.ikeda.g)
Comment on attachment 8407471 [details] [diff] [review]
bug990908_refine_mfilledbuffers.patch

Looks good. I am surprised that OMXCodec still have a such basic level defect. It seems better to get feed back also from codeaurora and spreadtrum, because, they are managing their own OMXCodecs.
Attachment #8407471 - Flags: feedback?(sotaro.ikeda.g) → feedback+
Attachment #8407471 - Flags: feedback?(mvines)
Attachment #8407471 - Flags: feedback?(james.zhang)
Status: NEW → ASSIGNED
Comment on attachment 8407471 [details] [diff] [review]
bug990908_refine_mfilledbuffers.patch

Review of attachment 8407471 [details] [diff] [review]:
-----------------------------------------------------------------

Loop Ming.
Attachment #8407471 - Flags: feedback?(james.zhang) → feedback?(ming.li)
Attachment #8407471 - Flags: feedback?(ming.li) → feedback+
(In reply to James Zhang from comment #25)
> Comment on attachment 8407471 [details] [diff] [review]
> bug990908_refine_mfilledbuffers.patch
> 
> Review of attachment 8407471 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Loop Ming.

Any comments for this patch?
Flags: needinfo?(ming.li)
Comment on attachment 8407471 [details] [diff] [review]
bug990908_refine_mfilledbuffers.patch

Review of attachment 8407471 [details] [diff] [review]:
-----------------------------------------------------------------

::: media/libstagefright/OMXCodec.cpp
@@ +4052,3 @@
>      for (size_t i = 0; i < buffers->size(); ++i) {
>          BufferInfo *info = &buffers->editItemAt(i);
> +        if (!buffersFilled[i] && info->mStatus == OWNED_BY_US) {

eg:
1.binder thread is running at : OMXCodec::on_message [case:FILL_BUFFER_DONE]
    info->mStatus = OWNED_BY_US;  

2.decode thread is running at : OMXCodec::read

    -> fillOutputBuffers(); 
    -> if (!buffersFilled[i] && info->mStatus == OWNED_BY_US) { 
    at this right point the condition above can be fullfilled. 


Is it?
Attachment #8407471 - Flags: feedback+
Flags: needinfo?(ming.li)
(In reply to Ming from comment #27)
I see. Thanks bruce. There is a lock for OMXCodec::on_message && OMXCodec::read.
So it is looks okay. We will push our team to merge this in if needed.
Attachment #8408156 - Flags: review?(cpearce) → review+
Attached patch bug-990908.patchSplinter Review
r=cpearce
try server:
https://tbpl.mozilla.org/?tree=Try&rev=f7868a9b52e7
Attachment #8408156 - Attachment is obsolete: true
Attachment #8412340 - Flags: review+
Keywords: checkin-needed
Comment on attachment 8407471 [details] [diff] [review]
bug990908_refine_mfilledbuffers.patch

Please mark old attachments as obsolete before requesting checkin.
Attachment #8407471 - Attachment is obsolete: true
Attachment #8407471 - Flags: feedback?(mvines)
https://hg.mozilla.org/mozilla-central/rev/8d160a730493
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla31
See Also: → 1002338
add alan 

alan ,pls have a look  whether platform need this patch.
Flags: needinfo?(alan.wang)
I've created another bug 1002338 for tracking the OMXCodec patch regarding to this problem.
Depends on: 1002338
Flags: needinfo?(alan.wang)
You need to log in before you can comment on or make changes to this bug.