Closed Bug 1868825 Opened 1 year ago Closed 1 year ago

Crash in [@ libGLESv2_samsung.so@0x2c1b6c] on s5e9925

Categories

(Fenix :: General, defect, P3)

Unspecified
Android
defect

Tracking

(firefox120 wontfix, firefox121+ fixed, firefox122+ fixed)

RESOLVED FIXED
122 Branch

People

(Reporter: aryx, Assigned: jnicol)

References

(Blocks 1 open bug)

Details

(Keywords: crash, topcrash)

Attachments

(2 files, 1 obsolete file)

596 crash reports starting from November 20. Reports by three different device models (all running Android 14):

samsung SM-S908B
samsung SM-S901B
samsung SM-S906B

Their platform versions:
0.0.0 Linux 5.10.177-android12-9-27763393-abS908BXXU6DWK4 #1 SMP PREEMPT Wed Nov 8 11:56:17 KST 2023 aarch64
0.0.0 Linux 5.10.177-android12-9-27763393-abS901BXXU6DWK4 #1 SMP PREEMPT Wed Nov 8 11:55:54 KST 2023 aarch64
0.0.0 Linux 5.10.177-android12-9-27763393-abS906BXXU6DWK4 #1 SMP PREEMPT Wed Nov 8 11:57:10 KST 2023 aarch64

Crash report: https://crash-stats.mozilla.org/report/index/360774f2-085b-4c42-bf25-e31a80231207

Reason: SIGSEGV / SEGV_MAPERR

Top 10 frames of crashing thread:

0  libGLESv2_samsung.so  libGLESv2_samsung.so@0x2c1b6c  
1  libGLESv2_samsung.so  libGLESv2_samsung.so@0x3b446c  
2  libGLESv2_samsung.so  libGLESv2_samsung.so@0x2aafec  
3  libGLESv2_samsung.so  libGLESv2_samsung.so@0x2549dc  
4  libGLESv2_samsung.so  libGLESv2_samsung.so@0x254124  
5  libGLESv2_samsung.so  libGLESv2_samsung.so@0x25555c  
6  libGLESv2_samsung.so  libGLESv2_samsung.so@0x1dfe60  
7  libGLESv2_samsung.so  libGLESv2_samsung.so@0x1e1614  
8  libEGL_samsung.so  libEGL_samsung.so@0x18840  
9  libEGL.so  libEGL.so@0x27c7c  
Flags: needinfo?(cpeterson)

Looks like we have a small number of crash reports with this Samsung signature from earlier Firefox versions, but there was a spike in Firefox 120. Perhaps there was a code regression in Firefox 120 or in a recent Samsung OS update.

Severity: -- → S3
Flags: needinfo?(cpeterson)
Priority: -- → P3
See Also: → 1866555

The bug is linked to a topcrash signature, which matches the following criterion:

  • Top 10 AArch64 and ARM crashes on release

:jonalmeida, could you consider increasing the severity of this top-crash bug?

For more information, please visit BugBot documentation.

Flags: needinfo?(jonalmeida942)
Keywords: topcrash

Looking at the crashes, they all take place in the content process rather than GPU process. We will run webgl in the content process if the GPU process has been disabled (which can occur following a number of GPU process crashes, or if manually disabled).

I can reproduce this by:

After a second or so cycle collection occurs, the webgl context is destroyed, leading to eglTerminate being called, which crashes.

If the GPU process is enabled, then the webgl context being destroyed does not lead to eglTerminate being called. I think this is because the EglDisplay is still in use for webrender.
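The lifetime hypothesis above can be illustrated with a small standalone sketch. This is not Gecko code: `FakeDisplay` and `TerminatedAfterWebglRelease` are hypothetical names, and the destructor merely stands in for the point where eglTerminate would be called. It only models the idea that a shared display is torn down when its last user lets go.

```cpp
#include <memory>

// Hypothetical stand-in for a shared EGL display wrapper. The destructor
// marks the point where a real wrapper would call eglTerminate().
struct FakeDisplay {
  explicit FakeDisplay(bool& terminatedFlag) : mTerminated(terminatedFlag) {}
  ~FakeDisplay() { mTerminated = true; }
  bool& mTerminated;
};

// Returns whether "terminate" ran as a result of the WebGL context releasing
// its reference. webrenderHoldsDisplay models the GPU-process case, where a
// second user keeps the display alive.
bool TerminatedAfterWebglRelease(bool webrenderHoldsDisplay) {
  bool terminated = false;
  std::shared_ptr<FakeDisplay> webgl = std::make_shared<FakeDisplay>(terminated);
  std::shared_ptr<FakeDisplay> webrender;
  if (webrenderHoldsDisplay) {
    webrender = webgl;  // second owner, as webrender would be
  }
  webgl.reset();  // the WebGL context is destroyed
  return terminated;
}
```

With a second owner the function returns false, matching the GPU-process behaviour described above; without one it returns true, matching the content-process case where the crash is hit.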

It looks like this is only happening when android_board = s5e9925. I checked for other crashes with a proto signature containing eglTerminate to see whether different drivers might have the same problem, but nothing that came up looked like the same issue.

Summary: Crash in [@ libGLESv2_samsung.so@0x2c1b6c] → Crash in [@ libGLESv2_samsung.so@0x2c1b6c] on s5e9925

To my knowledge the S22 family are the only devices with the Xclipse GPU, and they use ANGLE as their GL driver. So I think we want to simply not call eglTerminate() in that case. I think we only ever use the default display on Android anyway, and calling eglInitialize() multiple times for the same display is allowed.

Using the board seems like a good way to make that decision, as we have that information available to us from EglDisplay::fTerminate() (via java::sdk::Build::BOARD()). That saves us having to create a GL context and query the renderer string.
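A minimal sketch of that decision, assuming the board string has already been fetched and converted to UTF-8. `ShouldLeakEglDisplay` and `FakeEglDisplay` are hypothetical names for illustration, not the actual patch; the only value taken from this bug is the board name s5e9925.

```cpp
#include <string_view>

// Hypothetical helper: true when eglTerminate() should be skipped for the
// given android.os.Build.BOARD value. Only the board named in this bug is
// listed.
bool ShouldLeakEglDisplay(std::string_view board) {
  return board == "s5e9925";
}

// Hypothetical stand-in for the display wrapper, so the sketch can run
// without an EGL implementation.
struct FakeEglDisplay {
  bool terminated = false;

  void Terminate(std::string_view board) {
    if (ShouldLeakEglDisplay(board)) {
      return;  // deliberately leak the display instead of terminating it
    }
    terminated = true;  // real code would call eglTerminate() here
  }
};
```

Keeping the check in a single predicate means a further affected board, if one ever turns up, could be added in one place.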

Flags: needinfo?(jonalmeida942)

We are seeing crashes on the Samsung S22 family of devices in
eglTerminate after updating to Android 14. To work around this we
deliberately leak the EGLDisplay on affected devices. In practice we
only ever use the default EGLDisplay on Android, and calling
eglInitialize multiple times is allowed, so this is fine.

Note this only occurs when running webgl in the content process, which
will occur naturally following enough GPU process crashes that we
disable the GPU process. When webgl is running in the GPU process
webrender keeps the EGLDisplay alive, meaning we never terminate it.

Assignee: nobody → jnicol
Status: NEW → ASSIGNED

We are seeing crashes on the European Samsung S22 family of devices in
eglTerminate after updating to Android 14. To work around this we
deliberately leak the EGLDisplay on affected devices. In practice we
only ever use the default EGLDisplay on Android, and calling
eglInitialize multiple times is allowed, so this is fine.

Note this only occurs when running webgl in the content process, which
will occur naturally following enough GPU process crashes that we
disable the GPU process. When webgl is running in the GPU process
webrender keeps the EGLDisplay alive, meaning we never terminate it.

Original Revision: https://phabricator.services.mozilla.com/D196146

Attachment #9368374 - Flags: approval-mozilla-release?
Pushed by jnicol@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/5105ca2d1171 Leak EGLDisplays on Samsung S22 devices. r=gfx-reviewers,nical,ahale

Uplift Approval Request

  • Fix verified in Nightly: no
  • Risk associated with taking this patch: Low
  • Code covered by automated testing: no
  • Is Android affected?: yes
  • Needs manual QE test: no
  • Steps to reproduce for manual QE testing: N/A
  • User impact if declined: Frequent crashes on Samsung S22 devices after Android 14 update
  • Explanation of risk level: Just leaks a singleton EGLDisplay instead of crashing
  • String changes made/needed: None
Attachment #9368374 - Attachment is obsolete: true
Attachment #9368374 - Flags: approval-mozilla-release?
Attachment #9368374 - Attachment is obsolete: false
Attachment #9368374 - Flags: approval-mozilla-release?

Backed out for causing bp-nu bustages in GLLibraryEGL.h.

Flags: needinfo?(jnicol)
Attachment #9368374 - Attachment is obsolete: true
Attachment #9368374 - Flags: approval-mozilla-release?
Pushed by jnicol@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/8511168942e7 Leak EGLDisplays on Samsung S22 devices. r=gfx-reviewers,nical,ahale

The bug is marked as tracked for firefox121 (beta) and tracked for firefox122 (nightly). We have limited time to fix this; the soft freeze is in a day. However, the bug still has low priority and low severity.

:jonalmeida, could you please increase the priority and increase the severity for this tracked bug? If you disagree with the tracking decision, please talk with the release managers.

For more information, please visit BugBot documentation.

Flags: needinfo?(jonalmeida942)
Flags: needinfo?(jonalmeida942)
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 122 Branch

Attachment #9368618 - Flags: approval-mozilla-release?

Uplift Approval Request

  • User impact if declined: Frequent crashes for users with Samsung S22 devices after upgrading to Android 14
  • Steps to reproduce for manual QE testing: N/A
  • Needs manual QE test: no
  • Explanation of risk level: leaks a singleton EGLDisplay instance instead of calling a function that always crashes
  • String changes made/needed: None
  • Fix verified in Nightly: no
  • Is Android affected?: yes
  • Code covered by automated testing: no
  • Risk associated with taking this patch: Low
Flags: needinfo?(jnicol)
Attachment #9368618 - Flags: approval-mozilla-release? → approval-mozilla-release+

Checking in here. Are there any further reports of flickering for this after Jamie's patch?

Flags: needinfo?(aryx.bugmail)

(In reply to Bob Hood [:bhood] from comment #18)

Checking in here. Are there any further reports of flickering for this after Jamie's patch?

Flags: needinfo?(aryx.bugmail) → needinfo?(dmeehan)

(In reply to Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout) from comment #19)

(In reply to Bob Hood [:bhood] from comment #18)

Checking in here. Are there any further reports of flickering for this after Jamie's patch?

Nothing since 121.0 was released or in 122 betas

Flags: needinfo?(dmeehan)
See Also: → 1903810