some crash signatures on Windows 7 and 8.1 with unexpected function in frame #0 of crashing thread
Categories
(Toolkit :: Crash Reporting, defect)
Tracking
()
Version | Tracking | Status |
---|---|---|
firefox-esr78 | --- | unaffected |
firefox88 | --- | unaffected |
firefox89 | --- | wontfix |
firefox90 | --- | wontfix |
firefox91 | --- | wontfix |
firefox92 | --- | affected |
People
(Reporter: aryx, Assigned: gsvelto)
Attachments
(1 file)
48 bytes, text/x-phabricator-request | pascalc: approval-mozilla-beta+
New startup crashes with 89 betas, observed on Windows 7 and 8.1.
Gabriele, do you know what's going on with these signatures?
- @ mozilla::safebrowsing::VariableLengthPrefixSet::GetPrefixes, e.g. bp-b634e111-0bcd-447c-a94e-187030210503
- @ _GLOBAL__sub_I_Unified_cpp_crashreporter0.cpp, e.g. bp-a2cddd9e-3980-4738-bd4d-6df190210504
- @ mozilla::appservices::httpconfig::protobuf::Response::MergeFrom, e.g. bp-44979a9b-34d8-4062-a13c-b8ac40210503
- @ nsTypeAheadFind::PlayNotFoundSound, e.g. bp-6ab2befc-7e55-462d-b716-af5bc0210430
Comment 1•4 years ago (Assignee)
It took me a while to figure out what's going on. The crash reason for all these crashes is EXCEPTION_BREAKPOINT, which is covered by our exception handler, so it was unclear why they were caught by WER instead... until I realized they're not crashes, they're hangs. If you ignore the crashing thread and look at the main one instead, you'll see consistent stacks under those signatures.
WER also captures hangs when an application stops responding, but I didn't think those would be passed on to the runtime exception module, so I hadn't planned for them. These are interesting to us, but we're not ready to handle them, so in the short term I'll disable them; we'll re-enable their capture once we're ready to handle them.
Comment 2•4 years ago (Assignee)
I ran a bunch of tests to figure out what's going on but couldn't come up with a definitive solution. Here's what I found, though:
- I tried deliberately hanging Firefox on Windows 10 and this doesn't cause the WER module to be invoked. On Windows 10 it seems that those reports go straight to the Windows event log (or to Microsoft?). This would explain why these reports only come from Windows 7 and 8.1.
- WER seems to offer an option that sounds like it could be used to opt out. It's called WER_FAULT_REPORTING_DISABLE_SNAPSHOT_HANG and needs to be passed to WerSetFlags() in the process that registered the WER module. However, this option is not documented outside of the werapi.h header, so it's unclear if it actually does what the name suggests. Additionally, this flag was added in the Windows 8 SDK, so even assuming it allows us to opt out of hang reports, I'm not sure it's guaranteed to work on Windows 7.
- The WER exception does not include information to tell apart crashes from hangs (or at least nothing is documented as such). There seems to be a way to detect hangs anyway: when encountering a crash, the crashing thread is suspended before the WER module callbacks are invoked. In hangs, on the other hand, the crashing thread isn't suspended. We could print out the suspend count of the threads in Socorro's stackwalker and then use that information to have Socorro's signature generation flag those crashes as hangs.
The last option would be the most desirable but it's also the hardest to implement so I'll first try to disable these entirely.
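For illustration only, the suspend-count heuristic from the last bullet could look roughly like the sketch below. This is not the stackwalker's or Socorro's actual code: ThreadInfo, ReportKind and ClassifyReport are hypothetical names, and the sketch assumes the per-thread suspend count recorded in the minidump is already available to the caller.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of one thread record taken from a minidump.
// Only the two fields the heuristic needs are modelled here.
struct ThreadInfo {
  uint32_t thread_id;
  uint32_t suspend_count;
};

enum class ReportKind { Crash, Hang, Unknown };

// Assumption from the comment above: for a real crash, the crashing thread is
// suspended before the WER module callbacks run, so its suspend count is
// non-zero; for a hang it is not suspended, so the count is zero.
ReportKind ClassifyReport(const std::vector<ThreadInfo>& threads,
                          uint32_t crashing_thread_id) {
  for (const ThreadInfo& thread : threads) {
    if (thread.thread_id == crashing_thread_id) {
      return thread.suspend_count > 0 ? ReportKind::Crash : ReportKind::Hang;
    }
  }
  // The crashing thread is missing from the list; do not guess.
  return ReportKind::Unknown;
}
```

Whether a zero suspend count is a reliable hang indicator on every affected Windows version is untested, so a result like this would be better used to annotate a report than to discard it.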
Comment 3•4 years ago (Assignee)
Comment 5•4 years ago (bugherder)
Comment 6•4 years ago
The patch landed in nightly and beta is affected.
:gsvelto, is this bug important enough to require an uplift?
If not, please set status_beta to wontfix.
If yes, don't forget to request an uplift for the patches in the regression caused by this fix.
For more information, please visit auto_nag documentation.
Comment 7•4 years ago (Assignee)
Comment on attachment 9220609 [details]
Bug 1709423 - Opt-out of WER hang reports r=KrisWright
Beta/Release Uplift Approval Request
- User impact if declined: None but we'll get crash reports for things we don't know how to handle and that will make triage harder.
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: No
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): This only flips an option that informs Windows Error Reporting not to grab snapshots of application hangs. We hope this fixes the problem here but it's impossible to be sure until this gets into beta and we see the volume of the existing crashes go down (or not if it didn't work).
- String changes made/needed: none
Comment 8•4 years ago
Comment on attachment 9220609 [details]
Bug 1709423 - Opt-out of WER hang reports r=KrisWright
That sounds like an improvement we want in beta, approved for 89 beta 11, thanks.
Comment 9•4 years ago (bugherder uplift)
Comment 10•4 years ago (Assignee)
I'm not 100% sure this is fixed. The original signatures have all but disappeared, but a couple of new suspect ones have popped up:
- mozilla::devtools::DominatorTree::cycleCollection::TraverseNative
- mozilla::dom::ChromeUtils::SaveHeapSnapshotShared
These all come from very old versions of Windows, though, so we can't rule out a bug in the old WER implementations.
Comment 11•4 years ago (Assignee)
Again not 100% sure this is fixed, as I found more crashes such as these:
The volume is very low but I'll file a follow-up to address this. It will require a fair bit of surgery in the runtime exception module though, as there's no documented way of telling apart crashes from hangs in WER.
Comment 12•3 years ago (Assignee)
I can still find instances of this in recent-ish builds:
Reopening, this needs a more complex fix.
Comment 13•3 years ago
New spiking signature in 90.0: https://crash-stats.mozilla.org/signature/?signature=vp9_get_sub_block_energy
Comment 14•3 years ago (Assignee)
It seems that bug 1718226 got rid of this. I couldn't find any new instances in nightly versions following the first one with the patch applied. Let's keep this open until it rides to beta so that we're 100% sure it's fixed before closing the bug.
Comment 15•3 years ago (Assignee)
Confirmed that bug 1718226 made this issue go away, closing.