Closed Bug 1036451 Opened 10 years ago Closed 10 years ago

Null buffer dereference crash in WebrtcOMXH264VideoCodec

Categories

(Core :: WebRTC: Audio/Video, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 1031500

People

(Reporter: diego, Unassigned)

References

Details

(Keywords: crash)

Crash Data

Attachments

(1 file)

Attached file crash mini dump
Gecko sometimes crashes while running a webrtc loopback test with the H.264 decoder. It seems the native window sometimes unexpectedly returns a null buffer which is then deferenced here:

https://mxr.mozilla.org/mozilla-central/source/media/webrtc/signaling/src/media-conduit/WebrtcOMXH264VideoCodec.cpp#442
Flags: needinfo?(rjesup)
gfx guys/John Lin/etc - any ideas why this would happen?  OOM?
Flags: needinfo?(rjesup)
Needinfo for gfx guys and John Lin -- See Comment 1
Flags: needinfo?(sotaro.ikeda.g)
Flags: needinfo?(jolin)
Flags: needinfo?(bjacob)
Flags: needinfo?(ayang)
This is not OOM. The crash stack include GonkBufferQueue::queueBuffer(). The queueBuffer() enqueues a buffer that is already allocated by onkBufferQueue.

The following condition might be hit. It could cause null pointer return. 

http://mxr.mozilla.org/mozilla-central/source/widget/gonk/nativewindow/GonkBufferQueueKK.cpp#884
Clearing my needinfo because Sotaro already replied better than I could have :-)
Flags: needinfo?(bjacob)
GonkNativeWindow(GonkBufferQueue) always work in sync mode. To handle this situation, it seems better to use async mode. In async mode, GonkBufferQueue keeps one latest buffer. In sync mode, GonkBufferQueue keeps all enqueued buffers.

There is a historical reason why GonkNativeWindow(GonkBufferQueue) works only on sync mode. On ICS, if GonkNativeWindow is async mode, some b2g devices' camera hal did not works correctly.

And between JB and KK, how to determine sync/async is different. Until JB, it is GonkBufferQueue's setting. Since JB, ANativeWindow's user can also set it to buffer's meta-data.

It seems better to handle sync/async problem in a different bug. I am going to create a bug for it.
(In reply to Sotaro Ikeda [:sotaro] from comment #5)
> GonkNativeWindow(GonkBufferQueue) always work in sync mode. To handle this
> situation, it seems better to use async mode. In async mode, GonkBufferQueue
> keeps one latest buffer. In sync mode, GonkBufferQueue keeps all enqueued
> buffers.
> 
> There is a historical reason why GonkNativeWindow(GonkBufferQueue) works
> only on sync mode. On ICS, if GonkNativeWindow is async mode, some b2g
> devices' camera hal did not works correctly.

On async mode, some camera hals caused crash on ICS.
Flags: needinfo?(sotaro.ikeda.g)
Bug 1036539 is created for Comment 5.
Sotaro's comment 3 suggests that there might be too many buffers in 'acquired' state when crashing. If that's the case, there should be error message([1]) in logcat.

Diego, could you please help upload the log? Thanks.

[1] http://dxr.mozilla.org/mozilla-central/source/widget/gonk/nativewindow/GonkBufferQueueKK.cpp?from=GonkBufferQueueKK.cpp&case=true#885
Flags: needinfo?(jolin)
Flags: needinfo?(dwilson)
Flags: needinfo?(ayang)
Set bug 1036539 as dependent bug. If GonkNativeWindow::onFrameAvailable() provide null buffer and GonkNativeWindow runs in sync mode, GonkNativeWindow is always going to store more than one pending rendered buffers. It is not good for camera stream.
Depends on: 1036539
(In reply to John Lin[:jolin][:jhlin] from comment #8)
> Sotaro's comment 3 suggests that there might be too many buffers in
> 'acquired' state when crashing. If that's the case, there should be error
> message([1]) in logcat.
> 
> Diego, could you please help upload the log? Thanks.

The full log is too large to share here (83 megs). But I did a quick grep and found this:

> 07-08 18:38:35.615 28081  2888 E GonkBufferQueue: acquireBuffer: max acquired buffer count reached: 3 (max=2
Flags: needinfo?(dwilson)
(In reply to Diego Wilson [:diego] from comment #10)
> (In reply to John Lin[:jolin][:jhlin] from comment #8)
> > Sotaro's comment 3 suggests that there might be too many buffers in
> > 'acquired' state when crashing. If that's the case, there should be error
> > message([1]) in logcat.
> > 
> > Diego, could you please help upload the log? Thanks.
> 
> The full log is too large to share here (83 megs). But I did a quick grep
> and found this:
> 
> > 07-08 18:38:35.615 28081  2888 E GonkBufferQueue: acquireBuffer: max acquired buffer count reached: 3 (max=2

From the log, undequeued buffer count is still 2. codeaurors side's gecko seems not include the following fix. It is the undequeued buffer count is changed to 10.

https://hg.mozilla.org/releases/mozilla-aurora/rev/75f0e9ea8629
Yeah, the crashing build just barely missed the patch. Do you think that should fix it then?
Flags: needinfo?(sotaro.ikeda.g)
(In reply to Diego Wilson [:diego] from comment #12)
> Yeah, the crashing build just barely missed the patch. Do you think that
> should fix it then?

It should fix most cases, I think.
Flags: needinfo?(sotaro.ikeda.g)
OK. I'm fine with marking this as a dup of bug 1031500 then.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: