Closed Bug 963624 Opened 10 years ago Closed 10 years ago

Rotation crashes video app while playing 720p clip

Categories

(Core :: Audio/Video, defect)

ARM
Gonk (Firefox OS)
defect
Not set
critical

Tracking

()

RESOLVED FIXED
1.3 C3/1.4 S3(31jan)
blocking-b2g 1.3+
Tracking Status
firefox27 --- wontfix
firefox28 --- fixed
firefox29 --- unaffected
b2g-v1.3 --- fixed
b2g-v1.3T --- fixed
b2g-v1.4 --- unaffected

People

(Reporter: diego, Assigned: sotaro)

References

()

Details

(Keywords: crash, regression, Whiteboard: [caf priority: p3][b2g-crash] [CR 606415])

Attachments

(2 files, 3 obsolete files)

Attached file Sample 720p clip
STR:

1. Play shared 720p clip
2. Rotate phone several times (5 or 6).

The video app crashes with the log below.

F/OMXCodec( 2663): frameworks/av/media/libstagefright/OMXCodec.cpp:2177 CHECK_EQ( (int)bufInfo->mStatus,(int)OWNED_BY_NATIVE_WINDOW) failed: 3 vs. 2
I/InputReader(  337): Reconfiguring input devices.  changes=0x00000004
I/InputReader(  337): Device reconfigured: id=10, name='atmel_mxt_ts', size 480x800, orientation 0, mode 1, display id 0
I/Gecko   (  337): HWComposer: FPS is 0
E/OMXNodeInstance(  312): !!! Observer died. Quickly, do something, ... anything...
E/OMXNodeInstance(  312): !!! Observer died. Quickly, do something, ... anything...
I/Gecko   (  337): [Parent 337] WARNING: pipe error (173): Connection reset by peer: file /local/mnt/workspace/lnxbuild/project/release_dev_msm8610_2333603/checkout/gecko/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 446
I/Gecko   (  337): 
I/Gecko   (  337): ###!!! [Parent][MessageChannel] Error: Channel error: cannot send/recv
I/Gecko   (  337): 
I/OMXNodeInstance(  312): OMX_FreeBuffer for buffer header 0xb71bc168 successful
I/OMXNodeInstance(  312): OMX_FreeBuffer for buffer header 0xb71e1a08 successful
I/OMXNodeInstance(  312): OMX_FreeBuffer for buffer header 0xb71bc118 successful
I/OMXNodeInstance(  312): OMX_FreeBuffer for buffer header 0xb71bc0c8 successful
I/OMXNodeInstance(  312): OMX_FreeBuffer for buffer header 0xb71bc078 successful
I/OMXNodeInstance(  312): OMX_FreeBuffer for buffer header 0xb71bc028 successful
I/OMXNodeInstance(  312): OMX_FreeBuffer for buffer header 0xb71bbfd8 successful
I/OMXNodeInstance(  312): OMX_FreeBuffer for buffer header 0xb71bbf88 successful
I/OMXNodeInstance(  312): OMX_FreeBuffer for buffer header 0xb71bbf38 successful
I/OMXNodeInstance(  312): OMX_FreeBuffer for buffer header 0xb71e1858 successful
I/OMXNodeInstance(  312): OMX_FreeBuffer for buffer header 0xb71e16a8 successful
I/OMXNodeInstance(  312): OMX_FreeBuffer for buffer header 0xb71c0b70 successful
I/OMXNodeInstance(  312): OMX_FreeBuffer for buffer header 0xb71c8b60 successful
I/OMXNodeInstance(  312): OMX_FreeBuffer for buffer header 0xb71aafe0 successful
I/OMXNodeInstance(  312): OMX_FreeBuffer for buffer header 0xb71b7ab8 successful
I/Gonk    (  337): Setting nice for pid 2059 to 1
I/Gonk    (  337): Changed nice for pid 2059 from 18 to 1.
E/QC_AACDEC(  312): EventThread exiting rc[-1]                     errno[19]
I/OMXNodeInstance(  312): OMX_FreeBuffer for buffer header 0xb71c0890 successful
I/OMXNodeInstance(  312): OMX_FreeBuffer for buffer header 0xb71bbee8 successful
I/OMXNodeInstance(  312): OMX_FreeBuffer for buffer header 0xb71c1bf8 successful
I/OMXNodeInstance(  312): OMX_FreeBuffer for buffer header 0xb71c1ba8 successful
I/OMXNodeInstance(  312): OMX_FreeBuffer for buffer header 0xb71c1b58 successful
I/OMXNodeInstance(  312): OMX_FreeBuffer for buffer header 0xb71c1b08 successful
I/Gecko   (  337): -*- QCMessageManager_QC_B2G: receiveMessage: child-process-shutdown arrived from content process
E/        (  312): Destroy C2D instance
E/        (  312): Destroy C2D instance
E/QC_AACDEC(  312): component_deinit:COMPONENT DEINIT...
E/QC_AACDEC(  312): OMX AAC component destroyed
I/GeckoDump(  337): Crash reporter : Not online, postponing.
Sotaro,

The crash disappears after I back out the patch in bug 925444. Any idea what may be happening?
Flags: needinfo?(sotaro.ikeda.g)
blocking-b2g: --- → 1.3?
On the QRD 8x26, but I think it should be reproducible on the Nexus 4 too.
We do have a Nexus 4, will try to reproduce.  Making 1.3+ as it blocks the CS bug.
Assignee: nobody → sotaro.ikeda.g
I confirmed the crash happens on v1.3 nexus-4 by enabling HWC.
Flags: needinfo?(sotaro.ikeda.g)
Does this happen when the FPS display is off?
(In reply to Milan Sreckovic [:milan] from comment #6)
> Does this happen when the FPS display is off?

I can't crash it with FPS off! Well I'll be... how do you figure?
Flags: needinfo?(milan)
I have generated the crash once, after that I can not generate it.
(In reply to Sotaro Ikeda [:sotaro] from comment #8)
> I have generated the crash once, after that I can not generate it.

You mean with FPS off or in general?
(In reply to Diego Wilson [:diego] from comment #7)
> (In reply to Milan Sreckovic [:milan] from comment #6)
> > Does this happen when the FPS display is off?
> 
> I can't crash it with FPS off! Well I'll be... how do you figure?

FPS uses a separate code path; BenWa has been looking at changing that, so that FPS is "just another layer"; I cc-d him so that he can give us that patch once he has it.

Let's test a bit more, but if we can't reproduce with FPS off, would it still be a blocker?

Sotaro, see question in comment 9.
Flags: needinfo?(milan) → needinfo?(sotaro.ikeda.g)
The FPS code is going to use the compositor API so it will behave like we draw layers instead of a bunch of 'complex' GL code.
You're right, this affects devs but not the end user. So this is probably not a CS blocker.
blocking-b2g: 1.3? → ---
> 
> F/OMXCodec( 2663): frameworks/av/media/libstagefright/OMXCodec.cpp:2177
> CHECK_EQ( (int)bufInfo->mStatus,(int)OWNED_BY_NATIVE_WINDOW) failed: 3 vs. 2

Diego, can you check that which function and whichi part of the fucntion print the above log?
Assignee: sotaro.ikeda.g → nobody
blocking-b2g: --- → 1.3?
Flags: needinfo?(dwilson)
The patch mentioned in Comment 11 is in bug 963821. I haven't verified how it interacts with this issue.
I'm assuming the renom was by accident.
blocking-b2g: 1.3? → ---
On nexus-4, I could regenerate the crash again. The crash happened at the following in MXCodec::BufferInfo* OMXCodec::dequeueBufferFromNativeWindow()

>    // The native window no longer owns the buffer.
>    CHECK_EQ((int)bufInfo->mStatus, (int)OWNED_BY_NATIVE_WINDOW);
>    bufInfo->mStatus = OWNED_BY_US;

The crash log is the following.

>01-25 09:45:56.108  5385  5921 F OMXCodec: frameworks/av/media/libstagefright/OMXCodec.cpp:1911 CHECK_EQ( (int)bufInfo->mStatus,(int)OWNED_BY_NATIVE_WINDOW) failed: 3 vs. 2
Flags: needinfo?(sotaro.ikeda.g)
Flags: needinfo?(dwilson)
I regenerate the crash by using the nexus-4 by FPS off. FPS seems not related.
> >01-25 09:45:56.108  5385  5921 F OMXCodec: frameworks/av/media/libstagefright/OMXCodec.cpp:1911 CHECK_EQ( (int)bufInfo->mStatus,(int)OWNED_BY_NATIVE_WINDOW) failed: 3 vs. 2

OMXCodec thinks that MediaBuffer is still in client side, when it is dequeued from ANativeWindow.
When there is a fence object, the gralloc buffer is returned to the following. The two function is independent. Therefore there could be a status inconsistency for a short time. During the time, if OMXCodec::dequeueBufferFromNativeWindow() is called the crash in Comment 16 could happen.

> ANativeWindow::cancelBuffer(); //return gralloc buffer to ANativeWindow with fence object.
> MediaBuffer::release(); // return MediaBuffer to OMXCodec
I regenerate the crash by youtube HQ video playback with rotating the phone.
Change to OMXCodec seems to be unavoidable. Current implementation release buffers by the following sequence.
> ANativeWindow::cancelBuffer()//release gralloc buffer with fence.
> MediaBuffer::release() // with "kKeyRendered == 1"

We can not change the order as opposite like the following. If we can do it, the problem could be fixed. If we changed like the following. The calling thread fall into deadlock state at MediaBuffer::release(). MediaBuffer::release() triggers OMXCodec::signalBufferReturned() and it call ANativeWindow::dequeueBuffer() in it. If there is no free buffer in ANativeWindow, it fall into deadlock.

> MediaBuffer::release() // with "kKeyRendered == 1"
> ANativeWindow::cancelBuffer()//release gralloc buffer with fence.
There are 2 choices to the solution.
-[1] Add to MediaBuffer and OMXCodec to handle fence object especially around OMXCodec::signalBufferReturned().
-[2] Add api to block/unblock OMXCodec's ANativeWindow::dequeueBuffer() call.
By manually adding some logs, I confirmed that the Comment 19 happened.

'slot=4' buffer is returned to ANativeWindow. After that OMXCodec::signalBufferReturned() is called but it is for 'slot=6' buffer. Then slot=4 buffer is dequeued from ANativeWindow. 'slot=6' buffer seems a skipped(not rendered) video frame because of delay.

So, 'slot=4' buffer's OMXCodec::signalBufferReturned() is not called until dequeued from ANativeWindow.
Replace by more good one.
Attachment #8365573 - Attachment is obsolete: true
From Comment 23, [2] in Comment 22 seems not work correctly. [1] seems only choice.
How to co-exist default OMXCodec and b2g modified OMXCodec is also one concern.
Change to a correct component.
Component: Graphics: Layers → Video/Audio
Another idea is to release all video's MediaBuffers at OmxDecoder::ReleaseAllPendingVideoBuffersLocked(). I am going to try it.
I re-checked the source code, all video's MediaBuffers are already handled by VideoGraphicBuffers. And the crash happened when VideoGraphicBuffer's destructor is called without calling VideoGraphicBuffer::Unlock(). When it happens current code directly call MediaBuffer::release() in it. I confirmed it happens when crash happens.

It's problem is already fixed by Bug 961405. But only on master, not on b2g v1.3. It might be better to create a fix only for v1.3.
By applying the patch to nexus4, I did not saw the crash.
Diego, can you test the patch?
Flags: needinfo?(dwilson)
Comment on attachment 8365605 [details] [diff] [review]
patch - always release video's media buffer in ReleaseAllPendingVideoBuffersLocked() and same thread

Review of attachment 8365605 [details] [diff] [review]:
-----------------------------------------------------------------

I can't reproduce the issue with this patch on QRD 8x26.
Attachment #8365605 - Flags: feedback+
Renoming to v1.3 as per comment 17.
blocking-b2g: --- → 1.3?
Flags: needinfo?(dwilson)
blocking-b2g: 1.3? → 1.3+
Severity: major → critical
Keywords: crash, regression
Whiteboard: [b2g-crash]
Attachment #8365605 - Flags: review?(chris.double)
Assignee: nobody → sotaro.ikeda.g
Status: NEW → ASSIGNED
Attachment #8365605 - Flags: review?(chris.double) → review+
https://tbpl.mozilla.org/?tree=Try&rev=71287e38f321
ICS debug has a lot of orange. But it is not clear that it is caused by this patch.
Whiteboard: [b2g-crash] → [b2g-crash] [CR 606415]
(In reply to Sotaro Ikeda [:sotaro] from comment #34)
> https://tbpl.mozilla.org/?tree=Try&rev=71287e38f321
> ICS debug has a lot of orange. But it is not clear that it is caused by this
> patch.

From the crash log, it seems not related to this patch. On try server no one seems to run ICS debug on aurora.
Committable patch only for b2g v1.3. Carry "r=doublec".
Attachment #8365574 - Attachment is obsolete: true
Attachment #8365605 - Attachment is obsolete: true
Attachment #8366820 - Flags: review+
Check-in only to aurora is necessary.
Keywords: checkin-needed
Doesn't apply cleanly. Please rebase.
Keywords: checkin-needed
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #38)
> Doesn't apply cleanly. Please rebase.

Ryan, I rechecked the patch to b2g 1.3(aurora). It can be applied to aurora. Is the above comment is correct for aurora? The patch is not for master.
Flags: needinfo?(ryanvm)
Attachment #8366820 - Attachment description: patch for v1.3 - always release video's media buffer in ReleaseAllPendingVideoBuffersLocked() and same thread → patch for v1.3(aurora) - always release video's media buffer in ReleaseAllPendingVideoBuffersLocked() and same thread
Yeah, that would be the problem. I'm heading out for the day, but will get it first thing in the morning. In the future, an indication on the whiteboard or something is appreciated :)
Whiteboard: [b2g-crash] [CR 606415] → [b2g-crash] [CR 606415] [checkin-needed-aurora]
https://hg.mozilla.org/releases/mozilla-aurora/rev/27d4e681c33f
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Flags: needinfo?(ryanvm)
Resolution: --- → FIXED
Whiteboard: [b2g-crash] [CR 606415] [checkin-needed-aurora] → [b2g-crash] [CR 606415]
Target Milestone: --- → 1.3 C3/1.4 S3(31jan)
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #40)
> Yeah, that would be the problem. I'm heading out for the day, but will get
> it first thing in the morning. In the future, an indication on the
> whiteboard or something is appreciated :)

Yeah, I am going add something next time!
Flags: in-moztrap?
Flags: in-moztrap? → in-moztrap+
Whiteboard: [b2g-crash] [CR 606415] → [caf priority: p3][b2g-crash] [CR 606415]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: