Closed Bug 823324 Opened 12 years ago Closed 11 years ago

crash in mozilla::layers::VideoGraphicBuffer::Unlock

Categories

(Firefox OS Graveyard :: General, defect, P1)

ARM
Gonk (Firefox OS)
defect

Tracking

(blocking-b2g:tef+, blocking-basecamp:-, firefox19 wontfix, firefox20 wontfix, firefox21 fixed, b2g18 fixed, b2g18-v1.0.0 fixed)

VERIFIED FIXED
B2G C4 (2jan on)
blocking-b2g tef+
blocking-basecamp -
Tracking Status
firefox19 --- wontfix
firefox20 --- wontfix
firefox21 --- fixed
b2g18 --- fixed
b2g18-v1.0.0 --- fixed

People

(Reporter: marcia, Assigned: m1)

References

Details

(Keywords: crash, reproducible, unagi, Whiteboard: [b2g-crash])

Crash Data

Attachments

(2 files)

This bug was filed from the Socorro interface and is 
report bp-c32bde85-f4f8-45ae-aa71-db3682121219 .
============================================================= 

Seen while doing testing today on unagi using:

STR:
1. Open the Video app
2. Pan the Video app rapidly up and down

Eventually the app crashes.

On my SD card I have a few mozilla stock HTML5 videos, one MP4 file and some videos that were shot from the camera app.
Summary: crash in __libc_android_abort → crash in mozilla::layers::VideoGraphicBuffer::Unlock
Whiteboard: [b2g-crash]
Can't reproduce this issue in my local with total 7 video files(5 3gp + 2 mp4) under portrait and landscape mode.

Is it always reproducible?
If so, how many times or how long to reproduce the crash?
blocking-basecamp: --- → ?
(In reply to pchang from comment #1)
> If so, how many times or how long to reproduce the crash?

We see this in overnight stability testing.
Triage: BB- unless reproducibility is high
add qawanted to understand if this issue is reproducible by hand
blocking-basecamp: ? → -
Keywords: qawanted
Renoming.  At the point in the program hard to reproduce crashes still need to be addressed.
blocking-basecamp: - → ?
Triage: seem like gecko. change component to general
Component: Gaia::Video → General
QA Contact: mozillamarcia.knous
edwin, do you think you can take a look or find a better owner?
Assignee: nobody → edwin
blocking-basecamp: ? → +
*take*
Status: NEW → ASSIGNED
I think this is bug 812756 coming back. Thought that hack would keep this bug at bay, but evidently not. We may have to patch OMXCodec.
Actually, this could have a different cause. We don't seem to own the buffer when this happens. Maybe we're releasing the buffer more than we're reffing it.
Comment on attachment 695903 [details] [diff] [review]
Fix

Seems to make sense but I'm far from an authority in stagefright so would prefer somebody else to review.   We can certainly land this in our patch tree though.
Attachment #695903 - Flags: review?(mvines)
Attachment #695903 - Flags: review?(dwilson)
Attachment #695903 - Flags: review+
(In reply to Joe Cheng [:jcheng] from comment #5)
> Triage: seem like gecko. change component to general

But you removed the QA contact. Putting it back.
QA Contact: mozillamarcia.knous
Hey Team,
    So I was asked to help on some of these QAWANTED bugs to see if they are still valid. While attempting Marcia's repro steps on my Unagi, I got a crash which I've added the crash log for.

The build was 20121231070201.

I took about 7 videos from the phone camera. I then went into the video app and began quickly panning up and down on the list. I got the crash listed as a result. Please let me know if you need anything else.

I repro'd it 2/5 attempts.
I wish I could edit my comments :(
I think that while I was scrolling, i may have accidentally tapped with a second finger and launched a video while scrolling. Not sure if that's what caused the crash or not but seems feasible.
Keywords: qawanted
I am hitting this bug at least once a day when running the smoketest, so I can still reproduce it fairly regularly.
bp-c1fe9638-aff2-40f9-8493-5fd922121231 is the crash ID from croesch.
Priority: -- → P1
(In reply to Michael Vines [:m1] from comment #11)
> Comment on attachment 695903 [details] [diff] [review]
> Fix
> 
> Seems to make sense but I'm far from an authority in stagefright so would
> prefer somebody else to review.

Is Diego around this week?  Is there someone else who can review stagefright-related stuff?  If I'm interpreting correctly, Edwin is back at work in a few hours so we can hopefully land this very soon.
Flags: needinfo?(mvines)
Comment on attachment 695903 [details] [diff] [review]
Fix

He's back today.  Inder may be able to review as well, adding him as a possible r?
Attachment #695903 - Flags: review?(ikumar)
Flags: needinfo?(mvines)
Target Milestone: --- → B2G C4 (2jan on)
Comment on attachment 695903 [details] [diff] [review]
Fix

Review of attachment 695903 [details] [diff] [review]:
-----------------------------------------------------------------

Looks ok to me. Does it solve the panning issue?
Attachment #695903 - Flags: review?(ikumar) → review+
It looks like we needed Diego or Inder (which we have one of now).  Proceed to landing?
I'm looking it over and I'm not quite convinced yet. Can you give me an hour or two?
If I understand correctly, the patch forces OMXCodec to wait for all native window buffers before shutting down. I think this isn't really necesary. After all, in the end OMXCodec just turns around and relinquishes them back to the native window via CancelBuffer().

Could it be that somebody's trying to access an already destroyed OMXCodec if a buffer was still on its way back from rendering?

Anyway, the patch probably works as a fix for now. Worse case scenario we wait a few extra ms when video playback is stopped.
Let's just land this as is to fix the crash, and we can follow up with another patch later needed.
AFAICT ball's in the court of landing the bandaid in the chainsaw queue.  -> m1 for tracking purposes.
Assignee: edwin → mvines
(In reply to Diego Wilson from comment #23)
> Could it be that somebody's trying to access an already destroyed OMXCodec
> if a buffer was still on its way back from rendering?

No. As far as I can see, we're releasing all of the buffers before shutting down the decoder, but signalBufferReturned() is called asynchronously with respect to the decoder thread, but serially with respect to each other. So there could still be signalBufferReturned() calls queued when stop() is called. This patch just makes stop() wait for all buffers to return, and signalBufferReturned() calls to end.
(getting this landed now.  will mark the bug fixed once it shows up on CAF)
Landed.   Should show up on CAF around this time tomorrow.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Attachment #695903 - Flags: review?(dwilson) → review?
I experienced this crash in the 1/10/2013 build:
https://crash-stats.mozilla.com/report/index/45a6f317-2ecc-4bd5-90c3-8901c2130111
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attachment #695903 - Flags: review?
Michael's plate is pretty full so let's do this for v1 but not block basecamp on it.
blocking-b2g: --- → tef+
blocking-basecamp: + → -
The patch is not in our repository. This landed in CAF and will make its way to us eventually, but we can't test it for now.
Status: REOPENED → RESOLVED
Closed: 11 years ago11 years ago
Resolution: --- → FIXED
(In reply to Naoki Hirata :nhirata from comment #29)
> I experienced this crash in the 1/10/2013 build:
> https://crash-stats.mozilla.com/report/index/45a6f317-2ecc-4bd5-90c3-
> 8901c2130111

The link says that ImageContainerChild::SetIdleNow()is called when OMXCodec is in shutdown state. I faced the problem more in Bug 826829 and created attachment 700950 [details] [diff] [review].
(In reply to Sotaro Ikeda [:sotaro] from comment #33)
> (In reply to Naoki Hirata :nhirata from comment #29)
> > I experienced this crash in the 1/10/2013 build:
> > https://crash-stats.mozilla.com/report/index/45a6f317-2ecc-4bd5-90c3-
> > 8901c2130111
> 
> The link says that ImageContainerChild::SetIdleNow()is called when OMXCodec
> is in shutdown state. I faced the problem more in Bug 826829 and created
> attachment 700950 [details] [diff] [review].

attachment 700950 [details] [diff] [review] solves it without changing OMXCodec, I think.
> > The link says that ImageContainerChild::SetIdleNow()is called when OMXCodec
> > is in shutdown state. I faced the problem more in Bug 826829 and created
> > attachment 700950 [details] [diff] [review].
> 
> attachment 700950 [details] [diff] [review] solves it without changing
> OMXCodec, I think.

sorry attachment 700993 [details] [diff] [review] is final version.
attachment 695903 [details] [diff] [review] has one problem. If gecko has a defect of do not return a buffer to OMXCodec, OMXCodec::stop() never comple.
Marking status-b2g18-v1.0.0 fixed as per comments 28 and 32
Since it's been a couple of weeks, this should be verifiable.
It's #2 top crasher in B2G 18.0.
There are still crashes in recent builds, both 18.0 and 21.0a1 (see https://crash-stats.mozilla.com/report/list?signature=__libc_android_abort+|+__android_log_assert+|+android%3A%3AOMXCodec%3A%3AsignalBufferReturned).
Should we reopen it?
(In reply to Scoobidiver from comment #40)
> It's #2 top crasher in B2G 18.0.
> There are still crashes in recent builds, both 18.0 and 21.0a1 (see
> https://crash-stats.mozilla.com/report/
> list?signature=__libc_android_abort+|+__android_log_assert+|+android%3A%3AOMX
> Codec%3A%3AsignalBufferReturned).
> Should we reopen it?

This bug only cares about OMXCodec shutdown sequence. From the above crash log, it seems that it is same to Bug 840154. During OMXCodec is in seeking state, if MediaBuffer is returned to OMXCodec, OMXCodec calls assert. This type of crash is partially fixed in Bug 812756.
I am guessing by Sotaro-san's comment, we should open a new bug...
I can't seem to reproduce the bug based on comment 0 and comment 29.  Having said that, we are still seeing crashes similar to this signature, which is why bug 844918 is filed.


Gecko  http://hg.mozilla.org/releases/mozilla-b2g18/rev/f96b3fe5d05b
Gaia   539e4f57b25b839b7b3d25619c0dc53fd1d67579
BuildID 20130225070202
Version 18.0
Unagi

Gecko  http://hg.mozilla.org/releases/mozilla-b2g18_v1_0_1/rev/3a5a27992a75
Gaia   5691a16fff8e1403c75ed9d6f3a443b7e58198c6
BuildID 20130225070200
Version 18.0
Unagi

Marking this bug as verified.
Status: RESOLVED → VERIFIED
Keywords: verifyme
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: