Open Bug 1935458 Opened 2 months ago Updated 7 days ago

Crash in [@ std::sys::pal::unix::rand::imp::getrandom_fill_bytes]

Categories

(Core :: General, defect, P1)

Unspecified
Android

Tracking

Tracking Status
firefox133 + wontfix
firefox134 + wontfix
firefox135 - fix-optional
firefox136 --- fix-optional

People

(Reporter: aryx, Assigned: towhite)

Details

(Keywords: crash)

[Tracking Requested - why for this release]:

This crash signature started with Fenix 132 (apart from isolated instances on older versions). Only older Android versions are affected, the majority on Android 6, and only a subset of devices report it, e.g. from ZTE or FiberHome, all with armeabi-v7a as the CPU architecture.

Crash report: https://crash-stats.mozilla.org/report/index/89707fe1-a07c-46b3-92d1-a59a80241205

MOZ_CRASH Reason:

unexpected getrandom error: 22

Top 10 frames:

0  libxul.so  MOZ_Crash(char const*, int, char const*)  mfbt/Assertions.h:317
0  libxul.so  RustMozCrash  mozglue/static/rust/wrappers.cpp:18
1  libxul.so  mozglue_static::panic_hook  mozglue/static/rust/lib.rs:102
1  libxul.so  core::ops::function::Fn::call  library/core/src/ops/function.rs:79
2  libxul.so  <alloc::boxed::Box<F, A> as core::ops::function::Fn<Args>>::call  library/alloc/src/boxed.rs:2084
2  libxul.so  std::panicking::rust_panic_with_hook  library/std/src/panicking.rs:808
3  libxul.so  std::panicking::begin_panic_handler::{{closure}}  library/std/src/panicking.rs:674
4  libxul.so  std::sys::backtrace::__rust_end_short_backtrace  library/std/src/sys/backtrace.rs:168
5  libxul.so  rust_begin_unwind  library/std/src/panicking.rs:665
6  libxul.so  core::panicking::panic_fmt  library/core/src/panicking.rs:74

:twhite could you take a look at this crash? (Adding an NI as Triage Owner, but feel free to redirect.)
Is there anything we can do in time for the planned Android dot release?

Flags: needinfo?(towhite)

The bug is marked as tracked for firefox133 (release), tracked for firefox134 (beta) and tracked for firefox135 (nightly). However, the bug still isn't assigned.

:towhite, could you please find an assignee for this tracked bug? If you disagree with the tracking decision, please talk with the release managers.

For more information, please visit BugBot documentation.

Flags: needinfo?(towhite)
Assignee: nobody → towhite
Flags: needinfo?(towhite)

I wondered if this could be related to bug 1816953 but it doesn't seem similar enough.

Preliminary investigation shows that

This is as far as I feel I can take it. Looks really weird, and not like it's something that application code can trigger or avoid. But I'm no Linux/Android/syscall expert.

The severity field is not set for this bug.
:towhite, could you have a look please?

Flags: needinfo?(towhite)

Donal, I've reached out to the wider Android team for awareness.

Severity: -- → S2
Flags: needinfo?(towhite)
Priority: -- → P1

Oh, I forgot to mention: This doesn't look to be Glean. This looks like something that'd happen to whoever instantiated a std::collections::HashMap (or other user of std::hash::random::RandomState) first. Glean just happens to be first. It's literally a default-constructed HashMap (created via default::Default), so there's nothing that application or library code could reasonably be expected to do about it (e.g. it's not as though it's being constructed with weird args or custom hash fns or something).
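To make the point above concrete, here is a minimal sketch of the kind of code that ends up on this path. The helper names (`default_map`, `hash_with_fresh_state`) are hypothetical and not taken from Glean or Gecko. It only illustrates that a default-constructed `HashMap` uses `RandomState`, which, as far as I understand, asks the OS for random keys the first time one is created on a thread.

```rust
use std::collections::hash_map::RandomState;
use std::collections::HashMap;
use std::hash::BuildHasher;

/// Hypothetical helper: builds a map via `Default`, with no custom hasher
/// and no arguments, like the map described in the comment above.
fn default_map() -> HashMap<&'static str, u32> {
    // `HashMap::default()` also default-constructs its `RandomState` hasher.
    HashMap::default()
}

/// Hypothetical helper: hashes `key` with a newly created `RandomState`.
fn hash_with_fresh_state(key: &str) -> u64 {
    RandomState::new().hash_one(key)
}

fn main() {
    let mut map = default_map();
    map.insert("metric", 1);
    println!("lookup: {:?}", map.get("metric"));

    // Each `RandomState` carries its own keys, so the same input is
    // expected to hash differently under two separate instances.
    println!("hash A: {:#x}", hash_with_fresh_state("metric"));
    println!("hash B: {:#x}", hash_with_fresh_state("metric"));
}
```

Nothing here passes unusual arguments or a custom hash function, which is why whichever component creates the first such map would be expected to hit the panic.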

Hi Tom, any updates on this? We're heading into our final week of Beta before 135 goes to RC.

Flags: needinfo?(towhite)

Ryan, no update so far that I'm aware of - I'll bump this to the Android team

Flags: needinfo?(towhite)

This crash signature has been observed across Fenix and Focus, and given the crash details I don't believe it's related to this specific component. The logs seem to suggest an issue further down the stack in libxul.so, potentially Gecko-related?

Component: Experimentation and Telemetry → General
Product: Fenix → Core

If I facet on Android model, it seems that we see mostly, if not only, Android-based TV set-top boxes, which might be running old Android versions. IIUC the Linux getrandom syscall is not available on all Android versions; not sure if that might be relevant here.
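For reference, errno 22 on Linux is EINVAL, whereas a missing syscall is normally reported as ENOSYS (38). The sketch below shows why that distinction matters for a "fall back to /dev/urandom" strategy. The `Action` enum and the `classify` and `urandom_bytes` functions are hypothetical names for illustration. This is not the standard library's actual code, only an assumed shape of such error handling.

```rust
use std::fs::File;
use std::io::{self, Read};

// Linux errno values (asm-generic numbering, which ARM uses).
const EPERM: i32 = 1;
const EINTR: i32 = 4;
const EINVAL: i32 = 22;
const ENOSYS: i32 = 38;

/// Hypothetical outcome of inspecting a failed getrandom call.
#[derive(Debug, PartialEq)]
enum Action {
    Retry,
    FallBackToUrandom,
    Fatal,
}

/// Assumed policy: retry on interruption, fall back when the syscall is
/// missing or blocked, and treat every other error as unexpected.
fn classify(errno: i32) -> Action {
    match errno {
        EINTR => Action::Retry,
        ENOSYS | EPERM => Action::FallBackToUrandom,
        _ => Action::Fatal,
    }
}

/// Reads `len` random bytes from /dev/urandom as the fallback source.
fn urandom_bytes(len: usize) -> io::Result<Vec<u8>> {
    let mut buf = vec![0u8; len];
    File::open("/dev/urandom")?.read_exact(&mut buf)?;
    Ok(buf)
}

fn main() -> io::Result<()> {
    for errno in [ENOSYS, EINVAL] {
        match classify(errno) {
            Action::FallBackToUrandom => {
                let bytes = urandom_bytes(16)?;
                println!("errno {errno}: fell back, read {} bytes", bytes.len());
            }
            other => println!("errno {errno}: {other:?}"),
        }
    }
    Ok(())
}
```

Under this assumed policy, a kernel that simply lacks getrandom would take the fallback, while the EINVAL seen in these reports would land in the fatal branch, which would be consistent with the "unexpected getrandom error: 22" message.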
