Closed Bug 1413500 Opened 2 years ago Closed 2 years ago

Crash in mozilla::gl::GLBlitHelper::BlitImage

Categories

(Firefox for Android :: Audio/Video, defect, P1, critical)

Unspecified
Android
defect

Tracking

()

RESOLVED FIXED
Firefox 59
Tracking Status
fennec + ---
firefox-esr52 --- unaffected
firefox56 --- unaffected
firefox57 --- unaffected
firefox58 - fixed
firefox59 --- fixed

People

(Reporter: calixte, Assigned: snorp)

References

(Blocks 1 open bug)

Details

(Keywords: crash, regression, topcrash, Whiteboard: [clouseau])

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is 
report bp-1331cca0-6270-431f-aa1c-dc02e0171101.
=============================================================

There are 61 crashes in nightly 58 with buildid 20171031234943. In analyzing the backtrace, the regression may have been introduced by patch [1] to fix bug 1395497.

[1] https://hg.mozilla.org/mozilla-central/rev/b6aad6761527
Flags: needinfo?(snorp)
Up to 141 crashes now for the 10-31 Nightly.
Assignee: nobody → snorp
The signature is ranked #1 in nightly 58 top-crashers for FennecAndroid (1738 crashes for 784 installs).
Keywords: topcrash
Huge crash, marking it as blocking.
Pretty sure this will be fixed by bug 1413230, which is just waiting for review.
Flags: needinfo?(snorp)
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #4)
> Pretty sure this will be fixed by bug 1413230, which is just waiting for
> review.

The fix for that bug landed several days ago, but I am still seeing crashes in this signature in 20171108111014.
Flags: needinfo?(snorp)
(In reply to Marcia Knous [:marcia - use ni] from comment #5)
> (In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #4)
> > Pretty sure this will be fixed by bug 1413230, which is just waiting for
> > review.
> 
> The fix for that bug landed several days ago, but I am still seeing crashes
> in this signature in 20171108111014.

Indeed. I'll look into it further.
Flags: needinfo?(snorp)
Whoops, found the problem.

Jeff, I need an adult. Shader fails to compile because we don't have GL_OES_EGL_image_external_essl3. Can we easily fall back to ES 2.0?

|[0][GFX1]: DrawBlitProg link failed:
progLog: Link Error: Fragment shader was not successfully compiled.

vsLog: Success.

fsLog: Compile failed.
ERROR: 0:3: Extension GL_OES_EGL_image_external_essl3 not supported
ERROR: 1 compilation errors. No code generated.
tracking-fennec: --- → +
Flags: needinfo?(jgilbert)
Priority: -- → P1
I'm have this problem too. The  last  version of nightly for Android  crash on "x" nsfw  video sites.
369 occurrences in Nightly 20171110100131, which is enormously high -- making recenty Nightly builds probably more than 20x crashier than normal.

Please fix ASAP!
Comment on attachment 8926968 [details]
Bug 1413500 - Disable video surface readback (for page thumbnail) on Android

https://reviewboard.mozilla.org/r/198188/#review204314

I don't think this does what you want it to do.

In the notes to GL_OES_EGL_image_external_essl3 it states that it's not allowed to support GL_OES_EGL_image_external_essl3 but not GL_OES_EGL_image_external. I think we should always use ESSL100 instead of ESSL300 because of this. 

Also, we should probably make this gfxCriticalNote, not gfxCriticalError, so that it doesn't crash because of this. (assuming we have a fallback?)
Attachment #8926968 - Flags: review?(jgilbert) → review-
:snorp - Would it be possible to back this out while we continue working on a fix? Since this is in 58 we are worried about the beta audience.
Flags: needinfo?(snorp)
(In reply to Marcia Knous [:marcia - use ni] from comment #12)
> :snorp - Would it be possible to back this out while we continue working on
> a fix? Since this is in 58 we are worried about the beta audience.

I think we're gonna get a fix soon. If that doesn't happen we can certainly back it out.
Flags: needinfo?(snorp)
The backout required a bit of rebasing, but it looks good on Try so far.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=611e3676c8937ef5050426bb62dec68aa953fce5

I'd say that if we have a fix in time for Monday's b5 gtb, great. Otherwise, let's backout instead. We shouldn't ship another beta with this crash still not addressed.
Comment on attachment 8926968 [details]
Bug 1413500 - Disable video surface readback (for page thumbnail) on Android

https://reviewboard.mozilla.org/r/198188/#review205546

My concern here is that it's viable for an implementation to support OES_EGL_image_external without OES_EGL_image_external_essl3.
Now, it's not likely that an implementation with only image_external would deliberately support it in ES3 essl100 shaders but not implicitly have image_external_essl3 behavior.
But there's no reason to use essl3 here at all. We should only check EGL_image_external, and then supply essl100 shaders.
Attachment #8926968 - Flags: review?(jgilbert) → review-
Comment on attachment 8926968 [details]
Bug 1413500 - Disable video surface readback (for page thumbnail) on Android

https://reviewboard.mozilla.org/r/198188/#review205560
Comment on attachment 8926968 [details]
Bug 1413500 - Disable video surface readback (for page thumbnail) on Android

https://reviewboard.mozilla.org/r/198188/#review205546

I tried that initially, but specifying "#version 100 es" didn't work on a ES3 context (which is what we get most of the time even if it's not WebGL2). I'll poke around some more.
Comment on attachment 8926968 [details]
Bug 1413500 - Disable video surface readback (for page thumbnail) on Android

https://reviewboard.mozilla.org/r/198188/#review206120
Attachment #8926968 - Flags: review?(jgilbert) → review+
Comment on attachment 8926968 [details]
Bug 1413500 - Disable video surface readback (for page thumbnail) on Android

https://reviewboard.mozilla.org/r/198188/#review205546

Ah yeah, essl100 is actually `#version 100`, even though essl300 is `#version 300 es`. (Historical reasons, but fwiw, there is no glsl100, so it all works out)
Flags: needinfo?(jgilbert)
Pushed by jwillcox@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/2fec62780f47
Use GLSL 100 for blitting on ES r=jgilbert
https://hg.mozilla.org/mozilla-central/rev/2fec62780f47
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 59
Hi :snorp,
Can you help request Beta approval for this?
Flags: needinfo?(snorp)
Hi Aryx,
Can you merge the patch to Beta directly?
Flags: needinfo?(aryx.bugmail)
Looks like this is still broken.
Status: RESOLVED → REOPENED
Flags: needinfo?(snorp)
Resolution: FIXED → ---
Hi :snorp,
We might need to back out bug 1395497 and this bug from beta?
Flags: needinfo?(snorp)
Yes we should. And we should ship another Fennec beta ASAP with that backout included. Note that comment 14 has an already-rebased patch that is green on Try.
Hi Aryx,
Can you help backout bug 1395497 & this bug from beta?
fixed on 58 by backing out bug 1395497
Target Milestone: Firefox 59 → ---
The latest patch should stop the bleeding, and fixes a problem we'd want to fix anyway.
Flags: needinfo?(snorp)
Jeff, the latest crashes all have the following shader compilation errors which I don't understand. Any ideas?

|[0][GFX1]: DrawBlitProg link failed:
progLog: L0101 All attached shaders must be compiled prior to linking
vsLog: 
fsLog: 3:2: L0001: Typename expected, found 'samplerExternalOES'
 (t=379.697)

"#extension GL_OES_EGL_image_external : require" is supposed to provide this. If that extension is missing (it's not, Android requires it in devices where SurfaceTexture is supported), surely the compiler would've complained about that line?
Flags: needinfo?(jgilbert)
Can we just buy an affected device? I don't think we can keep trying to fix it at a distance.

We also should really make this not fatal, instead using gfxCriticalNote.

I don't see how the compiler can spec-compliantly accept `#extension GL_OES_EGL_image_external : require` but not `samplerExternalOES`. I think we really need to experiment with an affected device.
Flags: needinfo?(jgilbert)
Looks like the affected devices are all Mali. I probably have one of those here, so I'll try to repro on that.
Pushed by jwillcox@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/00c7ee68e3a8
Disable video surface readback (for page thumbnail) on Android r=jgilbert
I have a Mali-400 devices and couldn't repro. WebGL doesn't even work on this thing, so the thumbnail path should be the only one we have to worry about. The patch I landed will take care of that.
https://hg.mozilla.org/mozilla-central/rev/00c7ee68e3a8
Status: REOPENED → RESOLVED
Closed: 2 years ago2 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 59
Comment on attachment 8926968 [details]
Bug 1413500 - Disable video surface readback (for page thumbnail) on Android

Approval Request Comment
[Feature/Bug causing the regression]: Bug 1395497
[User impact if declined]: Crashes
[Is this code covered by automated tests?]: Yes
[Has the fix been verified in Nightly?]: Yes
[Needs manual test from QE? If yes, steps to reproduce]: No
[List of other uplifts needed for the feature/fix]: None
[Is the change risky?]: No
[Why is the change risky/not risky?]: Been on Nightly for the whole cycle so far
[String changes made/needed]: None
Attachment #8926968 - Flags: approval-mozilla-beta?
Comment on attachment 8926968 [details]
Bug 1413500 - Disable video surface readback (for page thumbnail) on Android

Since bug 1395497 is not in 58, we don't need to take this. Beta58-.
Attachment #8926968 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
You need to log in before you can comment on or make changes to this bug.