Closed Bug 1746254 Opened 3 years ago Closed 3 years ago

Windows 7: application crashed [@ RustMozCrash(char const*, int, char const*)]

Categories

(Toolkit :: Telemetry, defect, P2)

defect

Tracking

()

RESOLVED FIXED
98 Branch
Tracking Status
firefox-esr91 --- unaffected
firefox96 --- wontfix
firefox97 --- fixed
firefox98 --- fixed

People

(Reporter: florian, Assigned: janerik)

References

Details

Attachments

(3 files, 1 obsolete file)

Attached file Crash stack

This crash appears with my patches from bug 1745511 that adds FOG support on the GMP process.

Attached file crash stack (obsolete) —
`couldn't generate random bytes` comes from `rand` via our attempt to create a `std::collections::HashMap`. Uh... this is beyond me. Jan-Erik?
Attachment #9255563 - Attachment is obsolete: true
Flags: needinfo?(jrediger)

Rust uses BCryptGenRandom internally.
Is the GMP process somewhat restricted and thus crashing when using that function?

Flags: needinfo?(jrediger)

Call to ZwDeviceIoControlFile within BCryptGenRandom appears to be failing.
It is not something that we appear to be able to add sandbox policy rules for.
Warming up BCryptGenRandom before starting the sandbox might work.

also from matrix:
padenot - another option is to use another RandomState for the HashMap, another one than the default one that is
padenot - https://github.com/rust-lang/rust/blob/f0448f44bcda55fd9eb71da82495ef648eedb4e4/library/std/src/collections/hash/map.rs#L2936 for reference

Blocks: 1745511

I tried naively replacing the RandomState with a non-seeded one, but the try run failed with an out of memory error: https://treeherder.mozilla.org/jobs?repo=try&revision=b414d85828d54986d4011023398ce7cc1ff39417

Not sure what's going on there.

Assignee: nobody → jrediger
Priority: -- → P2

(In reply to Jan-Erik Rediger [:janerik] from comment #5)

I tried naively replacing the RandomState with a non-seeded one, but the try run failed with an out of memory error: https://treeherder.mozilla.org/jobs?repo=try&revision=b414d85828d54986d4011023398ce7cc1ff39417

Not sure what's going on there.

Your try run ran the first opt-mochitest-browser-chrome-e10s chunk without specifying a folder of tests to run, so the tests you ran are not those I had in the try push from comment 0. And the out of memory error is a known issue of these specific tests (these tests take a screenshot of every frame that is painted to compare them and detected unexpected flicker - that uses a lot of memory and tends to fail on Win 32 bit).

Thanks!
I thought it would just run all tests. Guess I was wrong.

Try run with only your GMP commit still crashing: https://treeherder.mozilla.org/jobs?repo=try&revision=ee7ebe3ef561ea4ae35983364df7bdbefe0335d2&selectedTaskRun=RmMZtxsZRMqooG7jmlsZqA.0

Try run with my fix of replacing RandomState passes: https://treeherder.mozilla.org/jobs?repo=try&revision=777e3bd3228daf9927f6e6a443d97b2910e055c3

So this could work, but we will need to be careful to not re-introduce any hashmap acidentally. Glean internally uses a bunch of hashmaps, but luckily in child processes we never call into Glean really.

This replaces the default-used RandomState1 on the hashmaps used in
FOG's IPC with a simple wrapper around DefaultHasher.

RandomState uses DefaultHasher internally, which at the moment of
this commit, uses SipHasher13 internally.
Our HashState does the same, but with default keys of (0, 0)
(whereas RandomState uses some random keys).

This will result in the same hashes used for the same values in
different hashmaps.
But this should be fine for the use here:
The key for all those hash maps is a MetricId, which is a simple u32.
It's already a unique identifier generated from defined metrics and not
user-manipulated.
Therefore we shouldn't need to worry about HashDos-resistance.

Attachment #9257807 - Attachment description: WIP: Bug 1746254 - Use simple default hash state to avoid random seeding → Bug 1746254 - Use simple default hash state to avoid random seeding
Pushed by jrediger@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/212d135827a5 Use simple default hash state to avoid random seeding r=chutten,padenot
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
See Also: → 1751177

I am seeing a similar issue in bug 1751177 / bug 1745173, I'm wondering whether a similar workaround in gecko_logger is a right fix or if there's something else better to do?

Flags: needinfo?(jrediger)
Attached file gecko_logger_lib.patch

Try with the added patch to gecko_logger. If that works we could consider doing it.
Let's open a new bug for it then. I don't own gecko_logger, so we will have to pull in someone else for that as well.

Flags: needinfo?(jrediger)

Thanks, bug 1751177 is already there for that. I'm giving it a spin.

Should we consider uplifting this and related changes to 97? I'm seeing all of these signatures pretty prominently in both the socket and rdd processes.

Alexandre is working on a patch that will properly fix the underlying bug (by warming up the RNG), though I'm not sure if uplifting that one is achievable.
Short of that I'd be ok with uplifting the patch here.

Comment on attachment 9257807 [details]
Bug 1746254 - Use simple default hash state to avoid random seeding

Beta/Release Uplift Approval Request

  • User impact if declined: Crashes from non-main processes on Windows 7
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): It changes an only-internally used hashmap to use a deterministic hashing function. That hashmap contains only build-time-known keys and is not exposed further outside.
  • String changes made/needed:
Attachment #9257807 - Flags: approval-mozilla-beta?
See Also: → 1751094

https://www.mathies.com/mozilla/crashes/rddrelease.html suggests that this is a pretty high-volume RDD crasher on Release. Let's take this in 97.0rc1 and see how it goes.

Attachment #9257807 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Component: Glean: SDK → Telemetry
Product: Data Platform and Tools → Toolkit
Target Milestone: --- → 98 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: