Closed Bug 2014925 Opened 2 months ago Closed 1 month ago

Crash in [@ <gleam::gl::GlesFns as gleam::gl::Gl>::get_shader_info_log]

Categories

(Core :: Graphics, defect)

Unspecified
Android
defect

Tracking

RESOLVED FIXED
150 Branch
Tracking Status
firefox-esr140 --- unaffected
firefox148 --- wontfix
firefox149 --- fixed
firefox150 --- fixed

People

(Reporter: mccr8, Assigned: jnicol)

Details

(Keywords: crash)

Attachments

(2 files)

Crash report: https://crash-stats.mozilla.org/report/index/1736ddda-863b-406b-818d-316d90260205

Looks like this first showed up in the 20260118090018 build. Last seen in the 20260128215503 build so maybe it got fixed. I saw this in the crash spike report.

MOZ_CRASH Reason:

called `Result::unwrap()` on an `Err` value: FromUtf8Error { bytes: [69, 82, 82, 79, 82, 58, 32, 48, 58, 49, 50, 51, 58, 32, 39, 229, 39, 32, 58, 32, 85, 110, 107, 110, 111, 119, 110, 32, 99, 104, 97, 114, 32, 10, 73, 78, 84, 69, 82, 78, 65, 76, 32, 69, 82, 82, 79, 82, 58, 32, 110, 111, 32, 109, 97, 105, 110, 40, 41, 32, 102, 117, 110, 99, 116, 105, 111, 110, 33, 10, 69, 82, 82, 79, 82, 58, 32, 49, 32, 99, 111, 109, 112, 105, 108, 97, 116, 105, 111, 110, 32, 101, 114, 114, 111, 114, 115, 46, 32, 32, 78, 11

Top 10 frames:

0  libxul.so  MOZ_CrashSequence(void*, long)  mfbt/Assertions.h:242
0  libxul.so  MOZ_Crash(char const*, int, char const*)  mfbt/Assertions.h:375
0  libxul.so  RustMozCrash  mozglue/static/rust/wrappers.cpp:18
1  libxul.so  mozglue_static::panic_hook  mozglue/static/rust/lib.rs:99
2  libxul.so  core::ops::function::Fn::call  /builds/worker/fetches/rustc/lib/rustlib/src/rust/library/core/src/ops/function.rs:80
3  libxul.so  <alloc::boxed::Box<F, A> as core::ops::function::Fn<Args>>::call  library/alloc/src/boxed.rs:1985
3  libxul.so  std::panicking::rust_panic_with_hook  library/std/src/panicking.rs:841
4  libxul.so  std::panicking::begin_panic_handler::{{closure}}  library/std/src/panicking.rs:706
5  libxul.so  std::sys::backtrace::__rust_end_short_backtrace  library/std/src/sys/backtrace.rs:174
6  libxul.so  __rustc::rust_begin_unwind  library/std/src/panicking.rs:697

The bug is linked to a topcrash signature, which matches the following criterion:

  • Top 10 AArch64 and ARM crashes on nightly

For more information, please visit BugBot documentation.

Keywords: topcrash

Jamie, this is a topcrash, could you take a look?

Severity: -- → S3

All crashes are on a single family of devices running Android 16. It seems the Android 16 update was rolled out recently, so this is probably not caused by a change on our end.

called Result::unwrap() on an Err value: FromUtf8Error { bytes: [69, 82, 82, 79, 82, 58, 32, 48, 58, 49, 50, 51, 58, 32, 39, 229, 39, 32, 58, 32, 85, 110, 107, 110, 111, 119, 110, 32, 99, 104, 97, 114, 32, 10, 73, 78, 84, 69, 82, 78, 65, 76, 32, 69, 82, 82, 79, 82, 58, 32, 110, 111, 32, 109, 97, 105, 110, 40, 41, 32, 102, 117, 110, 99, 116, 105, 111, 110, 33, 10, 69, 82, 82, 79, 82, 58, 32, 49, 32, 99, 111, 109, 112, 105, 108, 97, 116, 105, 111, 110, 32, 101, 114, 114, 111, 114, 115, 46, 32, 32, 78, 11

The byte at index 15 (229) is invalid UTF-8. Up until that point we have ERROR: 0:123: ' and afterwards we have ' : Unknown char \nINTERNAL ERROR: no main() function!\nERROR: 1 compilation errors.  N\u{b}. I guess it's truncated in the crash annotation, since we don't have the closing ] or } for the FromUtf8Error.
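For anyone who wants to check the decoding, here is a minimal sketch using only the Rust standard library. It takes the bytes exactly as listed in the crash reason (the annotation is truncated, so this is only the visible prefix), finds where strict UTF-8 decoding stops, and shows what a lossy conversion would yield instead of a panic. The helper names `first_invalid_index` and `info_log_lossy` are made up for illustration and are not gleam APIs.

```rust
/// Bytes copied from the crash annotation (the annotation itself is truncated).
const INFO_LOG: &[u8] = &[
    69, 82, 82, 79, 82, 58, 32, 48, 58, 49, 50, 51, 58, 32, 39, 229, 39, 32, 58, 32, 85, 110,
    107, 110, 111, 119, 110, 32, 99, 104, 97, 114, 32, 10, 73, 78, 84, 69, 82, 78, 65, 76, 32,
    69, 82, 82, 79, 82, 58, 32, 110, 111, 32, 109, 97, 105, 110, 40, 41, 32, 102, 117, 110, 99,
    116, 105, 111, 110, 33, 10, 69, 82, 82, 79, 82, 58, 32, 49, 32, 99, 111, 109, 112, 105, 108,
    97, 116, 105, 111, 110, 32, 101, 114, 114, 111, 114, 115, 46, 32, 32, 78, 11,
];

/// Zero-based index of the first byte that breaks strict UTF-8 decoding,
/// or `None` if the whole buffer is valid UTF-8.
fn first_invalid_index(bytes: &[u8]) -> Option<usize> {
    String::from_utf8(bytes.to_vec())
        .err()
        .map(|e| e.utf8_error().valid_up_to())
}

/// Hypothetical non-panicking conversion: every invalid sequence is
/// replaced with U+FFFD instead of producing an `Err` to unwrap.
fn info_log_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}
```

Calling `first_invalid_index(INFO_LOG)` reports where the valid prefix ends, and `info_log_lossy(INFO_LOG)` gives a printable version of the driver's log with the bad byte replaced.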

Going to go out on a limb here and say that our shaders probably do have a main function, and it's a driver bug

Based on the topcrash criteria, the crash signature linked to this bug is not a topcrash signature anymore.

Keywords: topcrash (removed)

I got my hands on a Lenovo Legion Tab 3 (model TB321FU) and updated it to Android 16. Unfortunately I can't reproduce just by browsing the web for a short period.

In the past we've seen issues caused by not null-terminating the source string we pass to glShaderSource. So I tried passing null as the length argument to glShaderSource, and interestingly this reproduces the exact same crash we see here: I see ERROR: 0:323: '' : Unknown char in the logcat, and we panic when attempting to convert that string (which contains invalid UTF-8) to a Rust String.

This doesn't explain why the length parameter would be ignored in some cases in real life, but it does perhaps indicate that the crash occurs when the length argument does get ignored for some reason, in combination with the lack of null termination.

Seems worthwhile ensuring the shader strings are null terminated on this GPU, like we do for some other ones already. Hopefully that will be enough to avoid the crash.

On some Lenovo devices with Adreno 750 GPUs we are seeing cases of
glCompileShader failing, and the subsequent call to glGetShaderInfoLog
returning invalid UTF-8, leading to a crash in the gleam bindings when
converting to a Rust String.

We have been unable to reproduce the crash organically. However, by
passing null as the length argument to glShaderSource (whilst still
passing non-null-terminated source strings) we can reproduce the exact
same behaviour. This perhaps indicates that the lack of null
termination could be a factor in the crash, though of course it does
not explain why users in the wild are encountering this when we could
not ourselves.

This patch therefore speculatively ensures the shader source strings
we pass to the driver are always null terminated on this
GPU. Hopefully that will prevent the compilation failures and crashes.
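As a rough illustration of the idea (not the actual patch, which is in the linked commit), the sketch below copies each shader source into an owned buffer with a trailing NUL byte, while still reporting the original lengths. The point is that a driver which ignores the `length` array of glShaderSource still stops reading at the end of the source. The type `TerminatedSources` and its methods are hypothetical names; only the glShaderSource parameter layout (an array of string pointers plus an optional array of lengths) is taken from the OpenGL ES API.

```rust
use std::os::raw::{c_char, c_int};

/// Owned copies of shader sources, each followed by a trailing NUL byte.
/// Hypothetical helper for illustration; not the type used in the patch.
struct TerminatedSources {
    buffers: Vec<Vec<u8>>,
}

impl TerminatedSources {
    /// Copy every source and append a NUL terminator to each copy.
    fn new(sources: &[&[u8]]) -> Self {
        let buffers = sources
            .iter()
            .map(|src| {
                let mut buf = Vec::with_capacity(src.len() + 1);
                buf.extend_from_slice(src);
                buf.push(0);
                buf
            })
            .collect();
        Self { buffers }
    }

    /// Pointers for the `string` parameter of glShaderSource.
    /// They are only valid while `self` is alive.
    fn pointers(&self) -> Vec<*const c_char> {
        self.buffers
            .iter()
            .map(|buf| buf.as_ptr() as *const c_char)
            .collect()
    }

    /// Lengths for the `length` parameter, excluding the NUL terminator.
    fn lengths(&self) -> Vec<c_int> {
        self.buffers
            .iter()
            .map(|buf| (buf.len() - 1) as c_int)
            .collect()
    }
}
```

A caller would build a `TerminatedSources` from its source slices, then pass `pointers().as_ptr()` and `lengths().as_ptr()` to the GL binding while the struct is still in scope. Passing both the lengths and the terminator keeps well-behaved drivers on the same path as before.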

Assignee: nobody → jnicol
Status: NEW → ASSIGNED
Pushed by jnicol@mozilla.com: https://github.com/mozilla-firefox/firefox/commit/3fb626374252 https://hg.mozilla.org/integration/autoland/rev/912f5664e575 Null terminate shader source strings on Adreno 750 GPUs. r=gfx-reviewers,lsalzman
Status: ASSIGNED → RESOLVED
Closed: 1 month ago
Resolution: --- → FIXED
Target Milestone: --- → 150 Branch

firefox-beta Uplift Approval Request

  • User impact if declined: Crashes for users with Lenovo tablets
  • Code covered by automated testing: no
  • Fix verified in Nightly: no
  • Needs manual QE test: no
  • Steps to reproduce for manual QE testing:
  • Risk associated with taking this patch: low
  • Explanation of risk level: Workaround should be safe: it just adds a null terminator to the strings we pass to the OpenGL driver. We already do the same on some other devices for similar bugs. There is always a possibility this could uncover some other driver bug, but it seems unlikely.
  • String changes made/needed: N/A
  • Is Android affected?: yes
Attachment #9549362 - Flags: approval-mozilla-beta?

Original Revision: https://phabricator.services.mozilla.com/D285497

Attachment #9549362 - Flags: approval-mozilla-beta? → approval-mozilla-beta+