Closed Bug 1198663 Opened 4 years ago Closed 4 years ago

crash in mozilla::layers::ImageContainer::GetCurrentSize

Categories

(Firefox for Android :: General, defect, critical)

ARM
Android
defect
Not set
critical

Tracking

()

VERIFIED FIXED
Firefox 45
Tracking Status
firefox41 --- unaffected
firefox42 + wontfix
firefox43 + fixed
firefox44 + fixed
firefox45 + verified
b2g-v2.5 --- fixed
fennec 42+ ---

People

(Reporter: csuciu, Assigned: roc)

Details

(Keywords: crash, topcrash-android-armv7, Whiteboard: [gfx-noted])

Crash Data

Attachments

(2 files)

This bug was filed from the Socorro interface and is 
report bp-49ebe05a-bc15-4864-b169-4e9622150826.
=============================================================

Nightly/Aurora crash when trying to play a video on imdb.com on 
Samsung Galaxy Tab 2 7" (4.2.2)

Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=e7e69cc8c07b&tochange=9b902b7669ae
[Tracking Requested - why for this release]: significant volume change in this signature for 42

Don't see much in the devices or graphics section that points to a single class of devices. Good distribution of devices, Android API versions and graphics hardware.
tracking-fennec: --- → ?
Tracking for 43+. 
Margaret, can you help find an owner for this bug? I'll also ask on the crashdebug list.
Flags: needinfo?(margaret.leibovic)
This is more in snorp's realm than mine.
Flags: needinfo?(margaret.leibovic) → needinfo?(snorp)
We didn't make any changes here, seems like a gfx regression?
Assignee: nobody → milan
Flags: needinfo?(snorp) → needinfo?(milan)
This would suggest that we can have non-empty array in mCurrentImages, where the image in the first element is null.
How did we get the regression range?

Do we have an STR, and if so, can we run this in the debug build?  If mCurrentImages[0].mImage is actually null, we probably would have hit an assertion earlier on (ImageContainer::SetCurrentImageInternal ?)

Anyway, based on bug 1143575, in the regression range, Rob, any further thoughts?

There are also some sound library changes in the regression range (not sure if they affect Android) and some of the user comments talk about "sound is messed up", but we probably want to first rule out bug 1143575.
Flags: needinfo?(snorp)
Flags: needinfo?(roc)
Flags: needinfo?(milan)
Assignee: milan → nobody
Whiteboard: [gfx-noted]
#3 crasher on release (fx42)
Can anybody tell why this went up in crash volume?  Is it the devices, or the number of people we have in pre-release versions, or what those users do, or something else?  Just trying to figure out why we weren't seeing it as a top crasher until 42 went out, and if we can catch things like this in the future.
(In reply to Milan Sreckovic [:milan] from comment #5)
> This would suggest that we can have non-empty array in mCurrentImages, where
> the image in the first element is null.

That sounds right.

> Do we have an STR, and if so, can we run this in the debug build?

No STR, as far as I can tell.

> If mCurrentImages[0].mImage is actually null, we probably would have hit an
> assertion earlier on (ImageContainer::SetCurrentImageInternal ?)

Only in debug builds.

> Anyway, based on bug 1143575, in the regression range, Rob, any further
> thoughts?

Almost certainly my bug.

Sprinkling some release asserts around should help us track this down.
Bug 1198663. Skip null Images in VideoSink::RenderVideoFrames instead of treating them as valid. r=jwwang
Attachment #8686244 - Flags: review?(jwwang)
I tried to check all the paths that could lead to a null image in NonOwningImage and this is the only one I could find, and it's clearly a bug.
Flags: needinfo?(roc)
Mark, how far would you want this uplifted?  Would you do a 42.0.1 for it?
Flags: needinfo?(snorp) → needinfo?(mark.finkle)
(In reply to Milan Sreckovic [:milan] from comment #11)
> Mark, how far would you want this uplifted?  Would you do a 42.0.1 for it?

Once we verify, yes, I would do a 42.0.1 for it. Especially if we can get bug 1209612 too.
Flags: needinfo?(mark.finkle)
Attachment #8686244 - Flags: review?(jwwang) → review+
Comment on attachment 8686244 [details]
MozReview Request: Bug 1198663. Tolerate null Image in Android NPAPI plugins. r=snorp

https://reviewboard.mozilla.org/r/24961/#review22523
This build has Roc's patch. Ioana would someone from your team try to reproduce comment 0 with it? http://people.mozilla.org/~kbrosnan/tmp/1198663/fennec-45.0a1.en-US.android-arm.apk
Flags: needinfo?(ioana.chiorean)
(In reply to Kevin Brosnan [:kbrosnan] from comment #14)
> This build has Roc's patch. Ioana would someone from your team try to
> reproduce comment 0 with it?
> http://people.mozilla.org/~kbrosnan/tmp/1198663/fennec-45.0a1.en-US.android-
> arm.apk

This build still crashes when playing videos on imdb.com, but I'm getting a different signature:
https://crash-stats.mozilla.com/report/index/59baad1b-11df-41e7-a1c0-0881b2151112
Flags: needinfo?(ioana.chiorean)
Assignee: nobody → roc
tracking-fennec: ? → 42+
The signature generation will not work for personal builds.
:snorp, do you have a build handy to try running with the patch and see where we crash?  Trying to save a few hours before :roc is back online.
Flags: needinfo?(snorp)
https://hg.mozilla.org/integration/mozilla-inbound/rev/bc2c76086aefc8f03c0cbec1eeb9d290415fec72
Bug 1198663. Skip null Images in VideoSink::RenderVideoFrames instead of treating them as valid. r=jwwang
https://hg.mozilla.org/mozilla-central/rev/bc2c76086aef
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 45
Can someone confirm that this bug is actually fixed?
Flags: needinfo?(catalin.suciu)
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #20)
> Can someone confirm that this bug is actually fixed?

And once we do, let's ask for the uplifts all the way.
We probably want to track this for 42 as it may be included for a 42.0.1 for mobile.
Flags: needinfo?(sledru)
Using the steps from comment #0, I'm still able to reproduce the crash on latest Nightly (2015-11-17)

https://crash-stats.mozilla.com/report/index/3674e313-bfdb-4036-b77b-1bf8d2151118
Flags: needinfo?(catalin.suciu)
Yes, I was aware of this bug. Thanks
Flags: needinfo?(sledru)
Reopening (comment #23)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Can somebody run a debug build?
Yes, debug builds have many asserts that would narrow this down a lot.
Flags: needinfo?(catalin.suciu)
Comment on attachment 8686244 [details]
MozReview Request: Bug 1198663. Tolerate null Image in Android NPAPI plugins. r=snorp

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/24961/diff/1-2/
Attachment #8686244 - Attachment description: MozReview Request: Bug 1198663. Skip null Images in VideoSink::RenderVideoFrames instead of treating them as valid. r=jwwang → MozReview Request: Bug 1198663. Tolerate null Image in Android NPAPI plugins. r=snorp
Attachment #8686244 - Flags: review?(snorp)
nsPluginInstanceOwner::GetImageContainer asserted at "MOZ_ASSERT(img);". So this must be Flash-related.

It seems we'll hit this assert whenever nsNPAPIPluginInstance::mContentTexture and mContentSurface are both null.

Note that we really shouldn't be creating a new ImageContainer every frame. Oh well.
Attachment #8686244 - Flags: review?(snorp) → review+
Comment on attachment 8686244 [details]
MozReview Request: Bug 1198663. Tolerate null Image in Android NPAPI plugins. r=snorp

https://reviewboard.mozilla.org/r/24961/#review23373
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #30)
> 
> Note that we really shouldn't be creating a new ImageContainer every frame.
> Oh well.

At the time this was written I think there was a reason for that, but I can't remember now. I filed bug 1229378 to follow up.
https://hg.mozilla.org/mozilla-central/rev/d34e6e7a22d1
Status: REOPENED → RESOLVED
Closed: 4 years ago4 years ago
Resolution: --- → FIXED
Using the steps/device from comment #0, I cannot reproduce the crash on latest Nightly (2015-12-03)

Verifying as fixed
Status: RESOLVED → VERIFIED
When we consider uplifting, remember that two patches landed:
https://hg.mozilla.org/mozilla-central/rev/bc2c76086aef
https://hg.mozilla.org/mozilla-central/rev/d34e6e7a22d1

The last one fixed the crash, but the first one was seen as a real bug too (comment 10).
Roc, should we uplift the fix to Aurora44?
Flags: needinfo?(roc)
Yes, we should get this up to Aurora. Beta too, if it's not too late.
Flags: needinfo?(roc)
Comment on attachment 8686244 [details]
MozReview Request: Bug 1198663. Tolerate null Image in Android NPAPI plugins. r=snorp

Approval Request Comment
[Feature/regressing bug #]: 1143575
[User impact if declined]: Crashes
[Describe test coverage new/current, TreeHerder]: Lots of tests test this code
[Risks and why]: low risk. We're just avoiding null derefs.
[String/UUID change made/needed]: none
Attachment #8686244 - Flags: approval-mozilla-beta?
Attachment #8686244 - Flags: approval-mozilla-aurora?
Comment on attachment 8686244 [details]
MozReview Request: Bug 1198663. Tolerate null Image in Android NPAPI plugins. r=snorp

Please uplift both patches in comment 10 to aurora and beta.
Attachment #8686244 - Flags: approval-mozilla-beta?
Attachment #8686244 - Flags: approval-mozilla-beta+
Attachment #8686244 - Flags: approval-mozilla-aurora?
Attachment #8686244 - Flags: approval-mozilla-aurora+
Using the steps/device from comment #0, I can't reproduce this on builds marked affected, so I can't mark this fixed on Firefox 43 Beta 10.
Will need to check the crash-stats after Firefox 43 Beta 10 will be on Play Store.
You need to log in before you can comment on or make changes to this bug.