Crash in [@ core::result::unwrap_failed | core::result::Result<T>::expect]
Categories
(Toolkit :: Telemetry, defect)
Tracking
()
People
(Reporter: gsvelto, Assigned: chutten)
References
Details
(Keywords: crash)
Crash Data
Attachments
(1 file)
48 bytes,
text/x-phabricator-request
|
diannaS
:
approval-mozilla-release+
|
Details | Review |
Crash report: https://crash-stats.mozilla.org/report/index/e03ce9b2-ccda-46c6-8258-345c90230326
MOZ_CRASH Reason: failed to open /dev/urandom: Os { code: 13, kind: PermissionDenied, message: "Permission denied" }
Top 10 frames of crashing thread:
0 libxul.so MOZ_Crash mfbt/Assertions.h:261
0 libxul.so RustMozCrash mozglue/static/rust/wrappers.cpp:18
1 libxul.so mozglue_static::panic_hook mozglue/static/rust/lib.rs:91
2 libxul.so core::ops::function::Fn::call library/core/src/ops/function.rs:79
3 libxul.so <alloc::boxed::Box<F, A> as core::ops::function::Fn<Args>>::call library/alloc/src/boxed.rs:2002
3 libxul.so std::panicking::rust_panic_with_hook library/std/src/panicking.rs:692
4 libxul.so std::panicking::begin_panic_handler::{{closure}} library/std/src/panicking.rs:579
5 libxul.so std::sys_common::backtrace::__rust_end_short_backtrace library/std/src/sys_common/backtrace.rs:137
6 libxul.so rust_begin_unwind library/std/src/panicking.rs:575
7 libxul.so core::panicking::panic_fmt library/core/src/panicking.rs:64
This is a crash that is triggered via Glean though Glean itself is not at fault. This is a problem that's similar to what we've experienced on Windows when failing to get random values. There's two type of crashes here, either we fail to open /dev/urandom
or we block reading from it (the latter seem to happen with very old distros/kernels).
Reporter | ||
Comment 1•2 years ago
|
||
Correction, blocking on /dev/urandom
seems to happen on recent distros/kernels, not old ones. Also a fair amount of these crashes are in the rdd process and due to the non-specific stack signature there's a few that also seem unrelated.
Assignee | ||
Comment 2•2 years ago
|
||
A pity "Top 10 frames of crashing thread" doesn't go to frame number 10 as 9 and 10 are the fun ones:
9 libxul.so std::sys::unix::rand::imp::fill_bytes library/std/src/sys/unix/rand.rs:135 inlined
9 libxul.so std::sys::unix::rand::hashmap_random_keys library/std/src/sys/unix/rand.rs:5 cfi
10 libxul.so std::collections::hash::map::RandomState::new::KEYS::__init library/std/src/collections/hash/map.rs:3125 inlined
I'm not entirely certain what we can do about this short of reworking FOG's and Glean's internal storages to not use hashed collections. I'm also unsure what the best component'd be to get this looked at.
A quick scan of the source suggests that, like Firefox, it only uses urandom if getrandom's unavailable/unhappy. :gsevlto, is FOG/Glean merely stealing crashes from other urandom failures Firefox would have? (or would silently use 0 for instead)?
Reporter | ||
Comment 3•2 years ago
|
||
(In reply to Chris H-C :chutten from comment #2)
I'm not entirely certain what we can do about this short of reworking FOG's and Glean's internal storages to not use hashed collections. I'm also unsure what the best component'd be to get this looked at.
Maybe the Widget like we did with bug 1816953?
is FOG/Glean merely stealing crashes from other urandom failures Firefox would have? (or would silently use 0 for instead)?
Yes, I think it just happens to be the first piece of code to stumble upon this.
Assignee | ||
Comment 4•2 years ago
|
||
bug 1816953 (well, the hook from bug 1788004) has access to a fallback, though. /dev/urandom
is the fallback. Do we have another fallback hidden someplace? Or should we try waiting for a few millis and try again? I'm not sure that panicking when there is no source of randomness is actually wrong.
But I might be misunderstanding what it means to fail to read from /dev/urandom
Reporter | ||
Comment 5•2 years ago
|
||
(In reply to Chris H-C :chutten from comment #4)
bug 1816953 (well, the hook from bug 1788004) has access to a fallback, though.
/dev/urandom
is the fallback. Do we have another fallback hidden someplace? Or should we try waiting for a few millis and try again? I'm not sure that panicking when there is no source of randomness is actually wrong.But I might be misunderstanding what it means to fail to read from
/dev/urandom
It seems we have two cases. One is that the read()
call is returning EAGAIN
, possibly because there's not enough entropy available. In that case I think that waiting would be preferable to crashing. In the other case we hit EPERM
which would be super-odd... but it only happens in the rdd process! So maybe we've got a sandboxing issue there on older machines (it's a Debian 8 installation)?
Comment 6•2 years ago
|
||
Unless I'm missing something, the EAGAIN
s aren't about /dev/urandom
, they're when trying to start a thread. I think those are exactly what they look like: clone
failing with EAGAIN
, probably because we're over the RLIMIT_NPROC
resource limit (on Linux that limits the number of threads, not processes). In particular, disproportionately many of the crashes are from ALT Linux, which has come to our attention before for having an unusually low RLIMIT_NPROC
setting and causing crashes.
As for /dev/urandom
and RDD: it doesn't seem to be allowed in the sandbox policy. (11 is EACCES
, not EPERM
.) That's easy enough to fix. Those are all kernel 3.16, which makes sense — the getrandom
syscall was added in 3.17.
Comment 7•2 years ago
|
||
If we allow it on RDD we would need on all process types, if it's due to FOG.
Assignee | ||
Comment 9•2 years ago
|
||
(In reply to Alexandre LISSY :gerard-majax from comment #7)
If we allow it on RDD we would need on all process types, if it's due to FOG.
utility
and socket
and content
are the other sandboxed process types on Linux, and they all already have /dev/urandom
, so we're clear there.
Assignee | ||
Comment 10•2 years ago
|
||
On Linuxen without getrandom(), Rust (and Firefox broadly) uses /dev/urandom
as a fallback. Rust uses it for a few things, notably hashmaps... and will
panic if it can't use it.
Updated•2 years ago
|
Comment 11•2 years ago
|
||
Comment 12•2 years ago
|
||
bugherder |
Updated•2 years ago
|
Comment 13•2 years ago
|
||
The patch landed in nightly and beta is affected.
:chutten, is this bug important enough to require an uplift?
- If yes, please nominate the patch for beta approval.
- If no, please set
status-firefox112
towontfix
.
For more information, please visit auto_nag documentation.
Assignee | ||
Comment 14•2 years ago
|
||
Comment on attachment 9326362 [details]
Bug 1824682 - Add /dev/urandom to rdd process sandbox on Linux r?gcp!
Beta/Release Uplift Approval Request
- User impact if declined: Infrequent Linux-specific crash due to sandboxing will still occur.
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Low risk because it's Linux-only, it's adding an exception to one process type's sandbox that's present on all other snadboxed process types' exception lists.
- String changes made/needed:
- Is Android affected?: Unknown
Comment 15•2 years ago
|
||
Comment on attachment 9326362 [details]
Bug 1824682 - Add /dev/urandom to rdd process sandbox on Linux r?gcp!
Approved for 112.0rc2
Comment 16•2 years ago
|
||
bugherder uplift |
Description
•