crash in mozilla::layers::GLImage::GetAsSourceSurface

RESOLVED FIXED in Firefox 50

Status

()

P3
critical
RESOLVED FIXED
3 years ago
2 years ago

People

(Reporter: snorp, Assigned: jnicol)

Tracking

({crash})

unspecified
mozilla52
Unspecified
Android
crash
Points:
---

Firefox Tracking Flags

(firefox50 fixed, firefox51 fixed, firefox52 fixed)

Details

(Whiteboard: [gfx-noted], crash signature)

Attachments

(3 attachments)

This bug was filed from the Socorro interface and is 
report bp-ee540c5f-c975-4855-80dc-18fb62160203.
=============================================================
Mfg.     Model     API Level  CPU ABI      #    
samsung  GT-P5110  17 (REL)  armeabi-v7a  660  22.7%
samsung  GT-P5113  17 (REL)  armeabi-v7a  244   8.4%
samsung  GT-P5100  17 (REL)  armeabi-v7a  187   6.4%
samsung  GT-P3113  17 (REL)  armeabi-v7a  139   4.8%
samsung  SM-G900F  21 (REL)  armeabi-v7a  118   4.1%

URLs are a lot of web (flash?) games and adult video.
Jeff, any idea why we would fail to bind the FB there? Maybe the size of the texture? Does it have to allocate at the point of binding?
Flags: needinfo?(jgilbert)
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #2)
> Jeff, any idea why we would fail to bind the FB there? Maybe the size of the
> texture? Does it have to allocate at the point of binding?

Size too big, either OOM or non-OOM.

Technically RGBA/UNSIGNED_BYTE textures IIRC aren't guaranteed renderable (fb-complete-able), but we haven't actually found a driver (yet) which doesn't allow this.

It may delay allocation, but it could be in one of four or more places. Doing at-bind-time is not my first bet, but it may be the case.

Best to try to repro this and check if any glError is thrown.
Flags: needinfo?(jgilbert)
Whiteboard: [gfx-noted]

Updated

3 years ago
See Also: → bug 1272876
This still affects current versions and was reported 2550 times last week in Fennec 47.
(Assignee)

Comment 5

2 years ago
I cannot reproduce this, but I have a hunch that the framebuffer is incomplete because the texture which is attached to it has a width and height of zero.

I can trigger this assertion - https://dxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/CompositableTransactionParent.cpp#191 - fairly easily on pages with flash video.

This is due to the flash plugin sending an image with zero width and height. If such an image were to coincide with the tab thumbnail being painted, which uses basic layers, we would hit the version of this crash with this stacktrace: https://crash-stats.mozilla.com/report/index/e430f1d1-c839-4a0c-8129-731162160915.

Not all the stacktraces for this signature do occur during thumbnailing, however. Some are like this - https://crash-stats.mozilla.com/report/index/652ebe2b-b345-406d-a493-4c2652160915 - which involves html5 video. Again could this perhaps be due to zero width and height images being sent by the video decoder? I don't know as I can't reproduce.

My questions are:
1. Are zero-sized images valid to be sent over ImageContainer? Should the caller of ImageContainer::SetCurrentImagesInTransaction() check before calling that, or should the caller of ImageContainer::GetCurrentImages() check before using them?
2. And is the framebuffer incompleteness really MOZ_CRASH worthy? Other failure points in GLImage::GetAsSourceSurface() simply return nullptr. Perhaps it's more worthy after we've eliminated zero-width/height as being the cause.

Nical (or Snorp or Jeff), don't suppose you can answer either of those?
Flags: needinfo?(nical.bugzilla)
(In reply to Jamie Nicol [:jnicol] from comment #5)
> I cannot reproduce this, but I have a hunch that the framebuffer is
> incomplete because the texture which is attached to it has a width and
> height of zero.
> 
> I can trigger this assertion -
> https://dxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/
> CompositableTransactionParent.cpp#191 - fairly easily on pages with flash
> video.
> 
> This is due to the flash plugin sending an image with zero width and height.
> If such an image were to coincide with the tab thumbnail being painted,
> which uses basic layers, we would hit the version of this crash with this
> stacktrace:
> https://crash-stats.mozilla.com/report/index/e430f1d1-c839-4a0c-8129-
> 731162160915.
> 
> Not all the stacktraces for this signature do occur during thumbnailing,
> however. Some are like this -
> https://crash-stats.mozilla.com/report/index/652ebe2b-b345-406d-a493-
> 4c2652160915 - which involves html5 video. Again could this perhaps be due
> to zero width and height images being sent by the video decoder? I don't
> know as I can't reproduce.
> 
> My questions are:
> 1. Are zero-sized images valid to be sent over ImageContainer?

I don't see a good reason to send around zero-sized images. I think it'd make sense to detect that early on and not have the whole rendering pipeline handle this case.

> Should the
> caller of ImageContainer::SetCurrentImagesInTransaction() check before
> calling that, or should the caller of ImageContainer::GetCurrentImages()
> check before using them?

Yeah, you can even make it a debug assertion in ImageContainer so that if we run into it we can investigate why we are getting silly sizes in the first place (although I guess anything can come out of the video decoder if the video stream is not well formed), up to you.

> 2. And is the framebuffer incompleteness really MOZ_CRASH worthy? Other
> failure points in GLImage::GetAsSourceSurface() simply return nullptr.
> Perhaps it's more worthy after we've eliminated zero-width/height as being
> the cause.

I would make it a gfxCriticalError rather than a crash on all targets. As you said, I'd wait until we have found and eliminated the cause before turning off the assertion.
Flags: needinfo?(nical.bugzilla)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)

Comment 9

2 years ago
mozreview-review
Comment on attachment 8792557 [details]
Bug 1245959 - Check plugin has non-zero dimensions before sending image to compositor.

https://reviewboard.mozilla.org/r/79546/#review78422

::: dom/plugins/base/nsPluginInstanceOwner.cpp:228
(Diff revision 1)
>    container = LayerManager::CreateImageContainer();
> -
> +  if (r.width && r.height) {
> -  // Try to get it as an EGLImage first.
> +    // Try to get it as an EGLImage first.
> -  RefPtr<Image> img;
> +    RefPtr<Image> img;
> -  AttachToContainerAsSurfaceTexture(container, mInstance, r, &img);
> +    AttachToContainerAsSurfaceTexture(container, mInstance, r, &img);
> -
> +    printf_stderr("image size: %d, %d\n", img->GetSize().width, img->GetSize().height);

You'll have to remove the printf
Attachment #8792557 - Flags: review?(nical.bugzilla) → review+

Comment 10

2 years ago
mozreview-review
Comment on attachment 8792558 [details]
Bug 1245959 - Check video frame has non-zero dimensions before sending image to compositor.

https://reviewboard.mozilla.org/r/79548/#review78424

Comment 11

2 years ago
mozreview-review
Comment on attachment 8792558 [details]
Bug 1245959 - Check video frame has non-zero dimensions before sending image to compositor.

https://reviewboard.mozilla.org/r/79548/#review78426
Attachment #8792558 - Flags: review?(nical.bugzilla) → review+
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
(Assignee)

Comment 14

2 years ago
Updated mozreview push removes printf (and empty line deletion). 

https://treeherder.mozilla.org/#/jobs?repo=try&revision=55b31be58591dd598dcd1702f369512fbe7438a4

Let's check this in, but leave the ticket open: these patches are fairly speculative so we should wait and see if they do indeed fix it, and if other checks are needed in other places.
Keywords: checkin-needed, leave-open

Comment 15

2 years ago
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/c633db465597
Check plugin has non-zero dimensions before sending image to compositor. r=nical
https://hg.mozilla.org/integration/autoland/rev/ecd53c7be28e
Check video frame has non-zero dimensions before sending image to compositor. r=nical
Keywords: checkin-needed
(Assignee)

Comment 17

2 years ago
Comment on attachment 8792557 [details]
Bug 1245959 - Check plugin has non-zero dimensions before sending image to compositor.

Since we have basically zero crashes on aurora or nightly, and we're very early in the cycle, I'd like to uplift this to beta to see if it's worked.

Approval Request Comment
[Feature/regressing bug #]: N/A, been around a long time
[User impact if declined]: Continued crashes
[Describe test coverage new/current, TreeHerder]: On nightly, try run in comment 14
[Risks and why]: Low. These crashes might not in fact be due to zero-sized textures as I have hypothesized, In which case there will be no change. If there are zero-sized textures, however, then we will detect them earlier, do less work, and avoid a certain crash.
[String/UUID change made/needed]: N/A
Attachment #8792557 - Flags: approval-mozilla-beta?
Attachment #8792557 - Flags: approval-mozilla-aurora?
(Assignee)

Comment 18

2 years ago
Comment on attachment 8792558 [details]
Bug 1245959 - Check video frame has non-zero dimensions before sending image to compositor.

Approval Request Comment - see above comment
Attachment #8792558 - Flags: approval-mozilla-beta?
Attachment #8792558 - Flags: approval-mozilla-aurora?
Backed out for frequent timeout in mda test test_video_dimensions.html:

https://hg.mozilla.org/integration/mozilla-inbound/rev/17247ecdaf69a1412ffb327b165d190b1447d2e6
https://hg.mozilla.org/integration/mozilla-inbound/rev/2cb11a4e6f61536a8d4800da0562ec7f193200e6

Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=36218646&repo=mozilla-inbound

[task 2016-09-21T16:59:14.426714Z] 16:59:14     INFO -  2009 INFO gizmo-short.mp4 (Stream) got loadedmetadata
[task 2016-09-21T16:59:14.427303Z] 16:59:14     INFO -  2010 INFO TEST-PASS | dom/media/test/test_video_dimensions.html | gizmo-short.mp4 (Stream) should only fire loadedmetadata once
[task 2016-09-21T16:59:14.428011Z] 16:59:14     INFO -  2011 INFO TEST-PASS | dom/media/test/test_video_dimensions.html | gizmo-short.mp4 (Stream) resize event should have been emitted before loadedmetadata
[task 2016-09-21T16:59:14.428458Z] 16:59:14     INFO -  2012 INFO [finished gizmo-short.mp4-3] remaining= vp9-short.webm-2
[task 2016-09-21T16:59:14.428961Z] 16:59:14     INFO -  2013 INFO TEST-PASS | dom/media/test/test_video_dimensions.html | [finished gizmo-short.mp4-3 t=1.905] Length of array should match number of running tests
[task 2016-09-21T16:59:14.429415Z] 16:59:14     INFO -  2014 INFO TEST-UNEXPECTED-FAIL | dom/media/test/test_video_dimensions.html | Test timed out.
[task 2016-09-21T16:59:14.430016Z] 16:59:14     INFO -      reportError@SimpleTest/TestRunner.js:114:7
[task 2016-09-21T16:59:14.430609Z] 16:59:14     INFO -      TestRunner._checkForHangs@SimpleTest/TestRunner.js:135:7
[task 2016-09-21T16:59:14.431203Z] 16:59:14     INFO -      setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:156:5

Please also take a look at the Win 7 debug for the push of these patches: https://treeherder.mozilla.org/logviewer.html#?job_id=3778142&repo=autoland
It crashes when test_peerConnection_scaleResolution.html runs.
Flags: needinfo?(jnicol)

Comment 20

2 years ago
Pushed by archaeopteryx@coole-files.de:
https://hg.mozilla.org/integration/mozilla-inbound/rev/05a02a824ae2
Check plugin has non-zero dimensions before sending image to compositor. r=nical
https://hg.mozilla.org/integration/mozilla-inbound/rev/e1babcef001e
Check video frame has non-zero dimensions before sending image to compositor. r=nical
Relanded because the video dimensions intermittent persisted: https://treeherder.mozilla.org/logviewer.html#?job_id=36470508&repo=mozilla-inbound
Flags: needinfo?(jnicol)

Updated

2 years ago
status-firefox50: --- → affected
status-firefox51: --- → affected
status-firefox52: --- → fixed
Comment on attachment 8792557 [details]
Bug 1245959 - Check plugin has non-zero dimensions before sending image to compositor.

Crash fix, Aurora51+, Beta50+
Attachment #8792557 - Flags: approval-mozilla-beta?
Attachment #8792557 - Flags: approval-mozilla-beta+
Attachment #8792557 - Flags: approval-mozilla-aurora?
Attachment #8792557 - Flags: approval-mozilla-aurora+

Updated

2 years ago
Attachment #8792558 - Flags: approval-mozilla-beta?
Attachment #8792558 - Flags: approval-mozilla-beta+
Attachment #8792558 - Flags: approval-mozilla-aurora?
Attachment #8792558 - Flags: approval-mozilla-aurora+
Assignee: nobody → jnicol
Keywords: leave-open
Target Milestone: --- → mozilla52
(Assignee)

Comment 26

2 years ago
There are still crashes on 50.0b4, so looks like my hunch was incorrect and neither of the patches worked.

MOZ_CRASHing on this seems excessive to me. I will change it to a gfxCriticalError, so this should safely fail in release builds. And hopefully by still crashing in debug builds we will be able to work out the cause of this at some point.
status-firefox50: fixed → affected
status-firefox51: fixed → affected
status-firefox52: fixed → affected
Priority: -- → P3
(Assignee)

Comment 27

2 years ago
Created attachment 8799923 [details] [diff] [review]
Remove MOZ_CRASH from GLImage::GetAsSourceSurface

Can't figure out how to use mozreview, doesn't seem to like that there's already been a different patch set for this bug..

Anyway, this just replaces the MOZ_CRASH with a gfxCriticalError. This might make it harder to find the actual source of the problem, but so far I've been unable to reproduce even with the crash...
Attachment #8799923 - Flags: review?(nical.bugzilla)

Updated

2 years ago
Attachment #8799923 - Flags: review?(nical.bugzilla) → review+
(Assignee)

Comment 29

2 years ago
Comment on attachment 8799923 [details] [diff] [review]
Remove MOZ_CRASH from GLImage::GetAsSourceSurface

Approval Request Comment
[Feature/regressing bug #]: N/A
[User impact if declined]: Continued crashes
[Describe test coverage new/current, TreeHerder]: Try run https://treeherder.mozilla.org/#/jobs?repo=try&revision=e77647cbbff39b1a1665f13e22411e2f653ee8d3
[Risks and why]: Very low. replaces crash with existing error path
[String/UUID change made/needed]: N/A
Attachment #8799923 - Flags: approval-mozilla-beta?
Attachment #8799923 - Flags: approval-mozilla-aurora?
Comment on attachment 8799923 [details] [diff] [review]
Remove MOZ_CRASH from GLImage::GetAsSourceSurface

Crash fix, Aurora51+, Beta50+
Attachment #8799923 - Flags: approval-mozilla-beta?
Attachment #8799923 - Flags: approval-mozilla-beta+
Attachment #8799923 - Flags: approval-mozilla-aurora?
Attachment #8799923 - Flags: approval-mozilla-aurora+

Comment 31

2 years ago
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/c83a89713c3a
Remove MOZ_CRASH from GLImage::GetAsSourceSurface. r=nical
Keywords: checkin-needed

Comment 32

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/c83a89713c3a
Status: NEW → RESOLVED
Last Resolved: 2 years ago
status-firefox52: affected → fixed
Resolution: --- → FIXED

Comment 33

2 years ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-aurora/rev/0abdfc79d8e1
status-firefox51: affected → fixed

Comment 34

2 years ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-beta/rev/bb0b632c5045
status-firefox50: affected → fixed
You need to log in before you can comment on or make changes to this bug.