Tabs become unresponsive after GPU crash on Android 12
Categories
(GeckoView :: General, defect)
Tracking
Version | Tracking | Status
---|---|---
firefox-esr91 | --- | unaffected
firefox98 | --- | unaffected
firefox99 | --- | unaffected
firefox100 | --- | fixed
People
(Reporter: agi, Assigned: jnicol)
Attachments
(1 file)
Steps to reproduce (latest Fenix Nightly):
- Open a few tabs.
- Type "about:crashgpu" in one of them.
- Notice that all tabs are blank (fully white or black, depending on the theme) and unresponsive.
Expected:
- Tabs behave normally after the crash.
Comment 1 (Reporter) • 3 years ago
I can reproduce on both my Samsung S10e and Pixel 3a
Comment 2 (Assignee) • 3 years ago
I tested this morning on about 20 different devices (and the emulator) running Android versions 5 through 12, including a few devices that I tested on Android 11 and then upgraded to 12. This does indeed seem to affect only Android 12.
From the logcat it appears that on Android 12 the previous EGL context isn't detached from the Surface when its process dies. Then, when we attempt to create a new EGL context, it fails because the system thinks the Surface is already attached to a context (the log shows eglCreateWindowSurface failing with EGL_BAD_ALLOC). Without an EGL context we cannot resume the compositor.
BufferQueueProducer: [SurfaceView[org.mozilla.fenix/org.mozilla.fenix.App]#11(BLAST Consumer)11](id:29bb0000000b,api:1,p:10818,c:10683) connect: already connected (cur=1 req=1)
libEGL : eglCreateWindowSurface: native_window_api_connect (win=0x7ab34a728440) failed (0xffffffea) (already connected to another API?)
libEGL : eglCreateWindowSurfaceTmpl:676 error 3003 (EGL_BAD_ALLOC)
Gecko : [GFX1-]: Failed to create EGLSurface
CompositorBridgeParent: Unable to renew compositor surface; remaining in paused state
I think the best thing to do for now is to disable the GPU process on Android 12, so that it can still ride the trains for all other Android versions. I can then look for a solution to this in parallel.
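For triaging similar reports, the failure sequence in the logcat excerpt above can be checked for mechanically. The following is a minimal Python sketch, not part of Gecko or Fenix: the function name and the choice of marker substrings are assumptions made for illustration, and the markers are copied from the log lines quoted in this comment.

```python
# Marker substrings copied from the logcat excerpt in this comment.
# Treating them as a fixed, ordered signature is an assumption of this
# sketch; other devices or OS builds may word these messages differently.
SIGNATURE = (
    "connect: already connected",
    "eglCreateWindowSurface: native_window_api_connect",
    "EGL_BAD_ALLOC",
    "Failed to create EGLSurface",
    "Unable to renew compositor surface",
)


def matches_stale_surface_signature(logcat_text: str) -> bool:
    """Return True if every marker appears in the text, in order."""
    position = 0
    for marker in SIGNATURE:
        found = logcat_text.find(marker, position)
        if found == -1:
            return False
        # Continue searching after this marker so that order is enforced.
        position = found + len(marker)
    return True
```

Calling `matches_stale_surface_signature(text)` on a captured logcat dump returns True only when all five messages occur in the order shown, which helps separate this failure from unrelated EGL surface errors.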
Comment 3 (Assignee) • 3 years ago
On Android version 12, it appears as if the EGL context does not
correctly get detached from a Surface when its process dies. This
means that subsequent attempts to create an EGL context fail, meaning
we cannot render anything.
This results in a completely unusable browser following a GPU process
crash, which is worse than the alternative of a parent process crash.
Block the GPU process on Android 12 and above until we have found a
workaround.
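The rule described in this commit message amounts to a version check. The sketch below only illustrates that logic in Python and is not the actual patch: the function and constant names are hypothetical, and the only outside fact it relies on is that Android 12 corresponds to API level 31.

```python
# Android 12 corresponds to API level 31.
ANDROID_12_API_LEVEL = 31


def gpu_process_allowed(sdk_int: int) -> bool:
    """Hypothetical policy check: block the GPU process on Android 12 and above."""
    return sdk_int < ANDROID_12_API_LEVEL
```

With this rule, a device on API level 30 (Android 11) keeps the GPU process, while API level 31 and later fall back to compositing without it.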
Comment 5 • 3 years ago
Since this bug only affects users using a GPU process, Fenix versions <= 99 are unaffected. No need to uplift changes to Beta 99.
Comment 6 • 3 years ago
bugherder
Comment 7 (Assignee) • 3 years ago
In theory, this frozen state should be recoverable by minimising and restoring the app, as that forces the system to give us a new Surface. However, the app was still unresponsive after doing so, which made me wonder whether something else was occurring that would prevent EGL contexts from being attached to brand new Surfaces.
That was due to bug 1762388, a bug in our compositor re-initialization path. It is unrelated to the GPU process (it occurs when reinitializing compositors without tearing down the GPU process) and appears to be quite long-standing. This bug gives us a really good stress-test for that code path :)
With that fixed we can indeed recover from this state by minimizing and restoring the app, as expected. (Obviously this bug still needs to be fixed, but at least it seems the explanation in comment 2 is correct and there isn't anything more complicated happening.)