Crash in [@ rust_cascade::Cascade::has]
Categories
(Core :: Security: PSM, defect, P1)
People
(Reporter: philipp, Assigned: keeler)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: crash, regression, Whiteboard: [psm-assigned][tbird crash])
Crash Data
Attachments
(1 file)
This bug is for crash report bp-530c1593-d10f-433a-b7a5-72bc30200216.
Top 10 frames of crashing thread:
0 xul.dll rust_cascade::Cascade::has third_party/rust/rust_cascade/src/lib.rs:200
1 xul.dll cert_storage::{{impl}}::allocate::GetCRLiteRevocationState security/manager/ssl/cert_storage/src/lib.rs:1094
2 xul.dll mozilla::psm::NSSCertDBTrustDomain::CheckRevocation security/certverifier/NSSCertDBTrustDomain.cpp:635
3 xul.dll mozilla::pkix::PathBuildingStep::Check security/nss/lib/mozpkix/lib/pkixbuild.cpp:254
4 xul.dll mozilla::psm::CheckCandidates security/certverifier/NSSCertDBTrustDomain.cpp:193
5 xul.dll mozilla::psm::NSSCertDBTrustDomain::FindIssuer security/certverifier/NSSCertDBTrustDomain.cpp:348
6 xul.dll mozilla::pkix::BuildForward security/nss/lib/mozpkix/lib/pkixbuild.cpp:365
7 xul.dll mozilla::pkix::BuildCertChain security/nss/lib/mozpkix/lib/pkixbuild.cpp:414
8 xul.dll mozilla::psm::BuildCertChainForOneKeyUsage security/certverifier/CertVerifier.cpp:240
9 xul.dll mozilla::psm::CertVerifier::VerifyCert security/certverifier/CertVerifier.cpp:745
This crash signature started appearing during 73.0a1 and now seems to be popping up in the beta channel as well after the transition to 74.0b.
Updated•5 years ago
Comment 1•5 years ago
In these crashes, it seems that the underlying storage for the memory-mapped file has become unreliable (with crash reasons like STATUS_DEVICE_DATA_ERROR). In other words, part of the disk died.
Is this even something that we can recover from?
Comment 3•5 years ago
I can occasionally repro on macOS (once so far in 10 tries) by making a blank volume for the profile's security_state folder, copying in the contents from an existing profile, and then force-unmounting it while Firefox is running:
hdiutil create -srcfolder /$profile/security_state/ -volname securitystate /tmp/securitystate.dmg
hdiutil attach -mountpoint /$profile/security_state /tmp/securitystate.dmg
mach run --allow-downgrade --profile /$profile
diskutil umount force /$profile/security_state
This doesn't seem like a common scenario for desktops, but I'd like to try putting a catch_unwind in place and see if we can intercept the exception signal. If we can, then that answers the question. If not, then I suppose that also answers the question.
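For reference, a minimal sketch of what a catch_unwind wrapper would look like (standalone illustration, not the actual cert_storage code). Note that catch_unwind only intercepts Rust panics; it cannot catch a hardware fault such as SIGBUS raised when an mmapped page's backing storage disappears:

```rust
use std::panic;

fn main() {
    // catch_unwind intercepts a Rust panic (unwinding), turning it into a
    // recoverable Err instead of aborting the process.
    let result = panic::catch_unwind(|| {
        panic!("simulated failure while querying the filter");
    });
    assert!(result.is_err());

    // A closure that completes normally comes back as Ok.
    let ok = panic::catch_unwind(|| 42);
    assert_eq!(ok.unwrap_or(0), 42);

    // Caveat: a SIGBUS/SIGSEGV from faulting mmapped memory is not a
    // panic, so catch_unwind would not intercept it.
    println!("panic caught and recovered");
}
```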
Comment 4•5 years ago
I thought crlite wasn't shipping yet; is it actually enabled in beta now?
[...checks...]
Looks like it's enabled in telemetry mode. Should we turn that off for release to avoid shipping this crash?
Comment 5•5 years ago
This can't possibly be that common of a crash. Numbers are low, and I strongly suspect we have something ignoring similar crashes -- because when the profile directory goes away, Firefox always crashes, somehow or other.
I need to go try and dig up the other crashes, perhaps, and figure out how to classify this the same way.
I do like the idea of trying to wrap it in a catch_unwind, but I'm not going to be able to do that anytime soon due to sudden childcare problems. Let me pass this and the above info to Dana for her take.
Assignee
Comment 6•5 years ago
If I'm reading these crash reports correctly, we're faulting when trying to read mmapped memory whose underlying storage has gone away. Since we're not panic()ing in Rust, catch_unwind won't help, unless I'm misunderstanding your suggestion. This appears to be a known issue with mmap (see e.g. https://bugs.chromium.org/p/chromium/issues/detail?id=537742). Since the filter is only ~1.4MB anyway, maybe we could load it into memory rather than mmapping it. That said, lmdb is going to have the exact same problem because it mmaps files too (we're not seeing those crashes yet because we're using rkv's safe mode for now).
All that said, with this low of crash volume, I'm not too concerned. If we do see too many crashes, we can disable this by remotely flipping a pref.
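The load-into-memory alternative could be sketched roughly as follows (the path, helper name, and error handling are hypothetical illustrations, not the actual cert_storage code):

```rust
use std::fs;
use std::io;

// Hypothetical sketch: read the whole ~1.4MB CRLite filter into heap
// memory up front. If the underlying storage has gone bad, fs::read
// returns an io::Error we can handle gracefully, instead of the process
// faulting later while dereferencing mmapped pages.
fn load_filter(path: &str) -> io::Result<Vec<u8>> {
    fs::read(path)
}

fn main() -> io::Result<()> {
    // Illustrative usage with a temp file standing in for the filter.
    let path = std::env::temp_dir().join("crlite-filter-example.bin");
    fs::write(&path, b"example filter bytes")?;
    let data = load_filter(path.to_str().unwrap())?;
    assert_eq!(data.len(), 20);
    fs::remove_file(&path)?;
    Ok(())
}
```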
Comment 8•3 years ago
Is it expected that this crash should only occur for beta versions?
per https://crash-stats.mozilla.org/signature/?signature=rust_cascade%3A%3ACascade%3A%3Ahas_internal&date=%3E%3D2022-01-20T16%3A54%3A00.000Z&date=%3C2022-02-20T16%3A54%3A00.000Z
Assignee
Comment 9•3 years ago
Well, we only process crlite filters on early beta or earlier, so it makes sense we don't see it on release/esr. Maybe the nightly population is too small to hit this?
Given the volume on beta, though, it seems like we should fix this before enabling crlite in release. I'll see about doing what I said in comment 6.
Updated•3 years ago
Comment 10•3 years ago
Comment 11•3 years ago
Comment 12•3 years ago
bugherder