stylo: Intermittent thread 'StyleThread#3' panicked at 'weak_rng: failed to create seeded RNG: Error { repr: Os { code: 5, message: "Input/output error" } }', /builds/worker/workspace/build/src/third_party/rust/rand/src/lib.rs:898:18

RESOLVED FIXED in Firefox 58

Status

()

Core
Layout
P2
normal
RESOLVED FIXED
a month ago
16 days ago

People

(Reporter: aryx, Assigned: manishearth)

Tracking

(Blocks: 1 bug, {intermittent-failure})

unspecified
mozilla58
Unspecified
Linux
intermittent-failure
Points:
---

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox56 unaffected, firefox57 wontfix, firefox58 fixed)

Details

Attachments

(2 attachments)

https://treeherder.mozilla.org/logviewer.html#?job_id=139160909&repo=mozilla-inbound

[task 2017-10-24T12:12:42.802Z] 12:12:42     INFO - TEST-START | /css/compositing-1/mix-blend-mode/mix-blend-mode-blended-element-interposed.html
[task 2017-10-24T12:12:42.842Z] 12:12:42     INFO - PID 2805 | ++DOCSHELL 0xc4f3d400 == 5 [pid = 2805] [id = {cdb2828c-d6b0-4af2-a46b-c33934e87afb}]
[task 2017-10-24T12:12:42.843Z] 12:12:42     INFO - PID 2805 | ++DOMWINDOW == 10 (0xc4f3d800) [pid = 2805] [serial = 10] [outer = (nil)]
[task 2017-10-24T12:12:42.843Z] 12:12:42     INFO - PID 2805 | ++DOMWINDOW == 11 (0xc4f3dc00) [pid = 2805] [serial = 11] [outer = 0xc4f3d800]
[task 2017-10-24T12:12:42.865Z] 12:12:42     INFO - PID 2805 | [Parent 2805, Main Thread] WARNING: Cannot set transparency mode on non-popup windows.: file /builds/worker/workspace/build/src/widget/gtk/nsWindow.cpp, line 4371
[task 2017-10-24T12:12:42.881Z] 12:12:42     INFO - PID 2805 | ++DOCSHELL 0xe63e6400 == 1 [pid = 2903] [id = {0e8e5e83-3bc3-41e5-977f-1e15be76f507}]
[task 2017-10-24T12:12:42.922Z] 12:12:42     INFO - PID 2805 | [Parent 2805, Main Thread] WARNING: Cannot set transparency mode on non-popup windows.: file /builds/worker/workspace/build/src/widget/gtk/nsWindow.cpp, line 4371
[task 2017-10-24T12:12:42.946Z] 12:12:42     INFO - PID 2805 | ++DOMWINDOW == 1 (0xe63e9000) [pid = 2903] [serial = 1] [outer = (nil)]
[task 2017-10-24T12:12:43.083Z] 12:12:43     INFO - PID 2805 | ++DOMWINDOW == 2 (0xe1d81400) [pid = 2903] [serial = 2] [outer = 0xe63e9000]
[task 2017-10-24T12:12:43.361Z] 12:12:43     INFO - PID 2805 | 1508847163357	Marionette	DEBUG	Register listener.js for window 4294967297
[task 2017-10-24T12:12:43.397Z] 12:12:43     INFO - PID 2805 | ++DOMWINDOW == 3 (0xe0fdc400) [pid = 2903] [serial = 3] [outer = 0xe63e9000]
[task 2017-10-24T12:12:43.438Z] 12:12:43     INFO - PID 2805 | Sandbox: Unexpected EOF, op 0 flags 02100000 path /dev/urandom
[task 2017-10-24T12:12:43.439Z] 12:12:43    ERROR - PID 2805 | thread 'StyleThread#3' panicked at 'weak_rng: failed to create seeded RNG: Error { repr: Os { code: 5, message: "Input/output error" } }', /builds/worker/workspace/build/src/third_party/rust/rand/src/lib.rs:898:18
[task 2017-10-24T12:12:43.439Z] 12:12:43     INFO - PID 2805 | stack backtrace:
[task 2017-10-24T12:12:43.459Z] 12:12:43     INFO - PID 2805 | 1508847163457	Marionette	INFO	Testing http://web-platform.test:8000/css/compositing-1/mix-blend-mode/mix-blend-mode-blended-element-interposed.html == http://web-platform.test:8000/css/compositing-1/mix-blend-mode/reference/green-square.html
[task 2017-10-24T12:12:43.724Z] 12:12:43     INFO - PID 2805 |    0: 0xf32c9f3a - std::sys::imp::backtrace::tracing::imp::unwind_backtrace::hfc7985b08e763a82
[task 2017-10-24T12:12:43.725Z] 12:12:43     INFO - PID 2805 |    1: 0xf32c4d62 - std::sys_common::backtrace::_print::h16a1db02a59ead63
[task 2017-10-24T12:12:43.725Z] 12:12:43     INFO - PID 2805 |    2: 0xf32d5d1c - std::panicking::default_hook::{{closure}}::h48ecee46f2eefc30
[task 2017-10-24T12:12:43.725Z] 12:12:43     INFO - PID 2805 |    3: 0xf32d5ac1 - std::panicking::default_hook::hb4c92ae8d005ca44
[task 2017-10-24T12:12:43.726Z] 12:12:43     INFO - PID 2805 |    4: 0xf276e687 - gkrust_shared::install_rust_panic_hook::{{closure}}::h8b31b5ba7b6976df
[task 2017-10-24T12:12:43.726Z] 12:12:43     INFO - PID 2805 |    5: 0xf32d6270 - std::panicking::rust_panic_with_hook::h25d461655d60b1a5
[task 2017-10-24T12:12:43.726Z] 12:12:43     INFO - PID 2805 |    6: 0xf32d6093 - std::panicking::begin_panic::h0f6fdd9abfd7dfb9
[task 2017-10-24T12:12:43.727Z] 12:12:43     INFO - PID 2805 |    7: 0xf32d6016 - std::panicking::begin_panic_fmt::ha31e26b280c9e878
[task 2017-10-24T12:12:43.728Z] 12:12:43     INFO - PID 2805 |    8: 0xf3270657 - rand::weak_rng::hc1689b96c6c9cd22
[task 2017-10-24T12:12:43.729Z] 12:12:43     INFO - PID 2805 |    9: 0xf32603fc - rayon_core::registry::main_loop::hb070a6087af3fa65
[task 2017-10-24T12:12:43.729Z] 12:12:43     INFO - PID 2805 |   10: 0xf3258ca6 - rayon_core::registry::Registry::new::{{closure}}::h7b2afcb252ce7aa8
[task 2017-10-24T12:12:43.730Z] 12:12:43     INFO - PID 2805 |   11: 0xf32547c0 - std::sys_common::backtrace::__rust_begin_short_backtrace::h7f3ef03104c43c4b
[task 2017-10-24T12:12:43.730Z] 12:12:43     INFO - PID 2805 |   12: 0xf325ccf0 - std::thread::Builder::spawn::{{closure}}::{{closure}}::hb68b67286a993380
[task 2017-10-24T12:12:43.731Z] 12:12:43     INFO - PID 2805 |   13: 0xf325c7f0 - <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once::h7d019497d6f9ef8f
[task 2017-10-24T12:12:43.731Z] 12:12:43     INFO - PID 2805 |   14: 0xf325502c - std::panicking::try::do_call::h39907cdd35f9e329
[task 2017-10-24T12:12:43.732Z] 12:12:43     INFO - PID 2805 |   15: 0xf32dad19 - <unknown>
[task 2017-10-24T12:12:43.732Z] 12:12:43     INFO - PID 2805 | Redirecting call to abort() to mozalloc_abort
[task 2017-10-24T12:12:43.732Z] 12:12:43     INFO - PID 2805 | 
[task 2017-10-24T12:12:43.733Z] 12:12:43     INFO - PID 2805 | Hit MOZ_CRASH() at /builds/worker/workspace/build/src/memory/mozalloc/mozalloc_abort.cpp:33
Flags: needinfo?(emilio)
Looks a lot like bug 1409444. Manish, Xidorn, any insight here? Input / Output error doesn't sound particularly enlightening...
Flags: needinfo?(xidorn+moz)
Flags: needinfo?(manishearth)
Flags: needinfo?(emilio)
(Assignee)

Comment 2

a month ago
Hmm. Looks like we need https://github.com/rust-lang-nursery/rand/issues/180 to happen after all.
Flags: needinfo?(manishearth)
I wasn't much involved in the rand issue, so no idea.
Flags: needinfo?(xidorn+moz)
(In reply to Emilio Cobos Álvarez [:emilio] from comment #1)
> Looks a lot like bug 1409444. Manish, Xidorn, any insight here? Input /
> Output error doesn't sound particularly enlightening...

We've had similar test failures reading from /dev/urandom in the past, such as Fennec bug 1140806 and sandbox permission bug 995069. We should fix this, but I doubt many real Linux users will hit this problem.

Did these /dev/urandom errors only start after updating the rand crate to version 0.3.17? IIUC, the Linux OsRng should prefer getrandom() over reading from /dev/urandom, so I don't know why we are hitting this error on Linux. Is OsRng's is_getrandom_available() broken on our Linux test machine?

https://github.com/rust-lang-nursery/rand/blob/master/src/os.rs
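
To make the expected source selection concrete, here is a minimal sketch of the "prefer getrandom(), fall back to /dev/urandom" decision. It assumes the only probe failure that means "syscall missing" is ENOSYS, and that ENOSYS is 38 (true on common Linux targets such as x86 and x86_64, not on every architecture). The names `EntropySource` and `choose_source` are hypothetical and are not taken from the rand crate.

```rust
use std::io;

// Assumption: errno value for "function not implemented" on common Linux targets.
const ENOSYS: i32 = 38;

#[derive(Debug, PartialEq)]
enum EntropySource {
    GetRandom,
    DevUrandom,
}

// Decide which source to use from the result of a zero-length getrandom() probe.
// Only ENOSYS is taken to mean that the kernel lacks the syscall; any other
// outcome (success or a different error) keeps getrandom() as the source.
fn choose_source(probe: &io::Result<()>) -> EntropySource {
    match probe {
        Err(e) if e.raw_os_error() == Some(ENOSYS) => EntropySource::DevUrandom,
        _ => EntropySource::GetRandom,
    }
}
```

Under this sketch, a read from /dev/urandom would only be attempted when the probe reports ENOSYS, which is why the "Sandbox: Unexpected EOF ... /dev/urandom" line in the log is surprising on a kernel that supports getrandom().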

(In reply to Manish Goregaokar [:manishearth] from comment #2)
> Hmm. Looks like we need https://github.com/rust-lang-nursery/rand/issues/180
> to happen after all.

I submitted a PR to fix #180 a few days ago (with Windows in mind) that should avoid this error on Linux:

https://github.com/rust-lang-nursery/rand/pull/181
status-firefox56: --- → unaffected
status-firefox57: --- → ?
status-firefox58: --- → affected
status-firefox-esr52: --- → unaffected
Blocks: 1243581
OS: Unspecified → Linux
Priority: -- → P2
Summary: Intermittent thread 'StyleThread#3' panicked at 'weak_rng: failed to create seeded RNG: Error { repr: Os { code: 5, message: "Input/output error" } }', /builds/worker/workspace/build/src/third_party/rust/rand/src/lib.rs:898:18 → stylo: Intermittent thread 'StyleThread#3' panicked at 'weak_rng: failed to create seeded RNG: Error { repr: Os { code: 5, message: "Input/output error" } }', /builds/worker/workspace/build/src/third_party/rust/rand/src/lib.rs:898:18
Alex, we are seeing some intermittent test failures on Linux because weak_rng panics after failing to read from /dev/urandom. Shouldn't the Linux OsRng prefer getrandom() over reading from /dev/urandom? I don't know why OsRng is even trying to read from /dev/urandom. Perhaps is_getrandom_available() is broken on our Linux test machines or blocked by some Firefox sandbox permission?

Also, can you please take a look at my proposed `rand` PR #181? Since we don't fully understand these Windows or Linux RNG errors, my proposed PR to seed weak_rng with the system time (as a fallback instead of panicking) might be a good safety net.

https://github.com/rust-lang-nursery/rand/pull/181
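
As an illustration of the fallback idea only (this is not the code from the PR), the sketch below tries to read a seed from an OS source and, if that fails, derives a low-quality seed from the system clock instead of panicking. The function names, the 16-byte seed size, and the `path` parameter are assumptions made for the example.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::time::{SystemTime, UNIX_EPOCH};

// Try to read a 16-byte seed from an OS entropy file such as /dev/urandom.
fn os_seed(path: &str) -> io::Result<[u8; 16]> {
    let mut seed = [0u8; 16];
    File::open(path)?.read_exact(&mut seed)?;
    Ok(seed)
}

// Low-quality fallback: build a seed from the wall clock.
// Acceptable only for a weak, non-cryptographic RNG.
fn time_seed() -> [u8; 16] {
    let (secs, nanos) = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| (d.as_secs(), u64::from(d.subsec_nanos())))
        .unwrap_or((0, 0));
    let mut seed = [0u8; 16];
    seed[..8].copy_from_slice(&secs.to_le_bytes());
    seed[8..].copy_from_slice(&nanos.to_le_bytes());
    seed
}

// Prefer the OS seed; on any I/O error fall back to the clock instead of panicking.
fn weak_seed(path: &str) -> [u8; 16] {
    os_seed(path).unwrap_or_else(|_| time_seed())
}
```

Calling `weak_seed("/dev/urandom")` in a process where the read fails with an I/O error would then return a time-derived seed rather than aborting the thread.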
Flags: needinfo?(acrichton)
Yeah I'm not sure why something like `is_getrandom_available()` is returning false for the Gecko CI machines. Maybe the kernel is actually too old to support getrandom?

I'll take a look at the PR. Do you need a release after merging?
Flags: needinfo?(acrichton)

Comment 7

26 days ago
1 failure in 912 pushes (0.001 failures/push) was associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 1

Platform breakdown:
* linux32: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1411250&startday=2017-10-23&endday=2017-10-29&tree=all
(In reply to Alex Crichton [:acrichto] from comment #6)
> I'll take a look at the PR. Do you need a release after merging?

Thanks! A new rand release would be helpful, but it is not urgent.
(In reply to Alex Crichton [:acrichto] from comment #6)
> Yeah I'm not sure why something like `is_getrandom_available()` is returning
> false for the Gecko CI machines. Maybe the kernel is actually too old to
> support getrandom?

Emilio or Manish, if either of you has a Linux dev machine handy, can you verify that is_getrandom_available() [2] works for you?

This test failure was on Ubuntu 16.04.3 LTS (Xenial Xerus), which does support getrandom() [1]. Maybe there is something about the Linux VMs that affects getrandom()? Regardless, if is_getrandom_available() works correctly on your Linux dev machine, then at least we know the code works somewhere and we probably don't need to worry too much why the test machine is trying to read /dev/urandom.

[1] http://manpages.ubuntu.com/manpages/xenial/en/man2/getrandom.2.html
[2] https://github.com/rust-lang-nursery/rand/blob/6fd1009174d3f9f544db716a013d57dd70578a12/src/os.rs#L144-L164
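
One thing that may matter when checking this: if the availability probe is run once and its answer cached for the life of the process, a single bad probe result would stick. The sketch below shows that pattern under that assumption. The function name and the injected `probe` closure are hypothetical, and the real function may differ in detail.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Once;

static CHECK: Once = Once::new();
static AVAILABLE: AtomicBool = AtomicBool::new(false);

// Run `probe` at most once per process and remember its answer.
// Later calls ignore their `probe` argument and return the cached value.
fn getrandom_available_cached<F: FnOnce() -> bool>(probe: F) -> bool {
    CHECK.call_once(|| AVAILABLE.store(probe(), Ordering::Relaxed));
    AVAILABLE.load(Ordering::Relaxed)
}
```

With this shape, whatever the first probe sees in the content process determines which source every later caller uses, so breaking on later calls alone would not show how the decision was made.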
status-firefox57: ? → wontfix
Flags: needinfo?(manishearth)
Flags: needinfo?(emilio)
When I say "works correctly on your Linux dev machine", I specifically mean in the Firefox content process sandbox. Maybe there is something about the content process that affects getrandom().
I tried to break on that function in a content process under an rr trace and couldn't, but getrandom was called a bunch of times successfully.
Flags: needinfo?(emilio)
(Assignee)

Updated

23 days ago
Flags: needinfo?(manishearth)
(In reply to Alex Crichton [:acrichto] from comment #6)
> I'll take a look at the PR. Do you need a release after merging?

Alex, can you please make a new rand release (0.3.18?) that includes the weak_rng fix some time next week? I'd like to get this fix into Firefox Nightly 58 before November 13. Thanks!
Flags: needinfo?(acrichton)
Ok I've now published 0.3.18
Flags: needinfo?(acrichton)
(In reply to Alex Crichton [:acrichto] from comment #13)
> Ok I've now published 0.3.18

Thanks!

@ Manish (or anyone who feels like it): do you mind revendoring rand so we get version 0.3.18 in Nightly 58 some time this week?
Flags: needinfo?(manishearth)
(Assignee)

Updated

18 days ago
Flags: needinfo?(manishearth)
Comment on attachment 8925700 [details]
Bug 1411250 - Bump rand crate to 0.3.18 ;

https://reviewboard.mozilla.org/r/196832/#review202056
Attachment #8925700 - Flags: review?(xidorn+moz) → review+
Comment on attachment 8925701 [details]
Bug 1411250 - Revendor deps;

https://reviewboard.mozilla.org/r/196834/#review202058
Attachment #8925701 - Flags: review?(xidorn+moz) → review+

Comment 19

18 days ago
Pushed by manishearth@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/f0fbcf42783f
Bump rand crate to 0.3.18 ; r=xidorn
https://hg.mozilla.org/integration/autoland/rev/63ebc045fa98
Revendor deps; r=xidorn

Comment 20

17 days ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/f0fbcf42783f
https://hg.mozilla.org/mozilla-central/rev/63ebc045fa98
Status: NEW → RESOLVED
Last Resolved: 17 days ago
status-firefox58: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Assignee: nobody → manishearth