Closed
Bug 1401400
Opened 7 years ago
Closed 7 years ago
Shutdown hangs don't produce useful crash pings
Categories
(Toolkit :: Crash Reporting, enhancement)
Toolkit
Crash Reporting
Tracking
()
RESOLVED
FIXED
mozilla58
Tracking | Status | |
---|---|---|
firefox58 | --- | fixed |
People
(Reporter: agashlin, Assigned: agashlin)
Details
Attachments
(5 files, 8 obsolete files)
4.94 KB,
patch
|
gsvelto
:
review+
|
Details | Diff | Splinter Review |
2.33 KB,
patch
|
gsvelto
:
review+
|
Details | Diff | Splinter Review |
1.20 KB,
patch
|
gsvelto
:
review+
|
Details | Diff | Splinter Review |
50.67 KB,
patch
|
gsvelto
:
review+
|
Details | Diff | Splinter Review |
3.03 KB,
patch
|
agashlin
:
review+
|
Details | Diff | Splinter Review |
This is somewhat related to bug 1398830 but I think it is different from hangs in general as it deals with the main process and doesn't use the "PluginHang" annotation mechanism.
RunWatchdog crashes [1] on shutdown hang, and Socorro specifically looks for this function on the stack in order to instead report the main thread stack [2]. Crash ping analysis will likely want to do the same, but as crash pings currently (since bug 1380252) only include one thread (the crashing/requesting thread), it won't be able to look at other threads. Thus all shutdown hangs will be grouped together as far as crash pings are concerned.
I think we can fix this with a crash annotation just before the MOZ_CRASH() to indicate that we should pass --full to minidump-analyzer, which will include all thread stacks.
[1] http://searchfox.org/mozilla-central/rev/1c13d5cf85f904afb8976c02a80daa252b893fca/toolkit/components/terminator/nsTerminator.cpp#160
[2] https://github.com/mozilla-services/socorro/blob/d4b8836a8db7ef05c81dc94518e677319327dd2a/socorro/signature/signature_utilities.py#L579-L600
Assignee | ||
Updated•7 years ago
|
Assignee: nobody → agashlin
Status: NEW → ASSIGNED
Assignee | ||
Comment 1•7 years ago
|
||
Assignee | ||
Comment 2•7 years ago
|
||
Assignee | ||
Comment 3•7 years ago
|
||
Assignee | ||
Comment 4•7 years ago
|
||
Assignee | ||
Comment 5•7 years ago
|
||
As observed back in bug 1280477 comment 21 there isn't a test of the crashreporter client in order to make sure the AllThreads annotation is being handled by the crashreporter, but I have tested it manually. The test included in this patch at least ensures that we have the annotation.
Attachment #8911967 -
Attachment is obsolete: true
Assignee | ||
Comment 6•7 years ago
|
||
Fix executable mode added by accident.
Assignee | ||
Updated•7 years ago
|
Attachment #8911985 -
Attachment is obsolete: true
Assignee | ||
Comment 7•7 years ago
|
||
Attachment #8911990 -
Attachment is obsolete: true
Assignee | ||
Updated•7 years ago
|
Attachment #8911965 -
Flags: review?(gsvelto)
Assignee | ||
Updated•7 years ago
|
Attachment #8911998 -
Flags: review?(gsvelto)
Assignee | ||
Updated•7 years ago
|
Attachment #8912007 -
Flags: review?(gsvelto)
Assignee | ||
Updated•7 years ago
|
Attachment #8912033 -
Flags: review?(gsvelto)
Comment 8•7 years ago
|
||
Sorry for the delay, after inspecting the code I think that we have a better alternative to implement this:
- For process hangs it should be sufficient to gather all threads when the crash type is set CRASH_TYPE_HANG [1]. From what I can tell this covers plugin processes only, hang detection doesn't seem to cover content processes.
- For main process hangs - both shutdown and regular ones [2][3] - an easier way than to add an annotation is to provide a global flag in the crash reporter that will instruct the client to gather all stack traces from the dump. Something like SetMinidumpAnalysisAllThreads() exposed from nsExceptionHandler.h. You'll then need a way to communicate this to the client, possibly by adding an optional parameter to it (like we do with the minidump-analyzer) and then adding it when the client is invoked via LaunchProgram() [4].
Does this sound reasonable? It might require a little more code on the crashreporter client side but we'll avoid having to read the .extra file twice.
[1] https://dxr.mozilla.org/mozilla-central/rev/97efdde466f18cf580fda9673cf4c38ee21fc7b7/toolkit/components/crashes/CrashService.js#169
[2] https://dxr.mozilla.org/mozilla-central/rev/97efdde466f18cf580fda9673cf4c38ee21fc7b7/toolkit/components/terminator/nsTerminator.cpp#160
[3] https://dxr.mozilla.org/mozilla-central/rev/97efdde466f18cf580fda9673cf4c38ee21fc7b7/xpcom/threads/HangMonitor.cpp#122
[4] https://dxr.mozilla.org/mozilla-central/rev/97efdde466f18cf580fda9673cf4c38ee21fc7b7/toolkit/crashreporter/nsExceptionHandler.cpp#821
Comment 9•7 years ago
|
||
Comment on attachment 8911965 [details] [diff] [review]
Part 1. Support AllThreads crash annotation
Clearing review flags for now.
Attachment #8911965 -
Flags: review?(gsvelto)
Updated•7 years ago
|
Attachment #8911998 -
Flags: review?(gsvelto)
Updated•7 years ago
|
Attachment #8912007 -
Flags: review?(gsvelto)
Updated•7 years ago
|
Attachment #8912033 -
Flags: review?(gsvelto)
Comment 11•7 years ago
|
||
We discussed the approach described in comment 8 last week (IIRC) so Adam will pick it up again soon.
Flags: needinfo?(gsvelto)
Assignee | ||
Comment 12•7 years ago
|
||
Attachment #8911965 -
Attachment is obsolete: true
Attachment #8911998 -
Attachment is obsolete: true
Attachment #8912007 -
Attachment is obsolete: true
Attachment #8912033 -
Attachment is obsolete: true
Attachment #8920398 -
Flags: review?(gsvelto)
Assignee | ||
Comment 13•7 years ago
|
||
Attachment #8920399 -
Flags: review?(gsvelto)
Assignee | ||
Comment 14•7 years ago
|
||
Attachment #8920400 -
Flags: review?(gsvelto)
Comment 15•7 years ago
|
||
Comment on attachment 8920399 [details] [diff] [review]
Part 2. Analyze all threads for "hang" type crashes
Review of attachment 8920399 [details] [diff] [review]:
-----------------------------------------------------------------
LGTM
Attachment #8920399 -
Flags: review?(gsvelto) → review+
Comment 16•7 years ago
|
||
Comment on attachment 8920400 [details] [diff] [review]
Part 3. Process all threads when crashing for hangs
Review of attachment 8920400 [details] [diff] [review]:
-----------------------------------------------------------------
LGTM
Attachment #8920400 -
Flags: review?(gsvelto) → review+
Comment 17•7 years ago
|
||
Comment on attachment 8920398 [details] [diff] [review]
Part 1. Global setting to analyze all threads
LGMT though I'd really love to see this covered with tests. Adding a test for the CrashService functionality should be trivial in toolkit/components/crashes/tests/xpcshell/test_crash_service.js. As for actual shutdown hangs I'm not sure if we have integration tests covering those. If we do then we should extend them, if we don't then let's open a bug to write them.
Attachment #8920398 -
Flags: review?(gsvelto) → review+
Assignee | ||
Comment 18•7 years ago
|
||
Right, I've only manually tested the shutdown hangs, across Windows, Linux and Mac. We do have a test for the shutdown terminator [1] but I'm not sure how this could be extended to explicitly test the new functionality.
I'll look into adding the test for CrashService, that should be pretty similar to what I had done for the previous patch series.
[1] http://searchfox.org/mozilla-central/source/toolkit/crashreporter/test/unit/test_crash_terminator.js
Comment 19•7 years ago
|
||
(In reply to Adam Gashlin [:agashlin] from comment #18)
> Right, I've only manually tested the shutdown hangs, across Windows, Linux
> and Mac. We do have a test for the shutdown terminator [1] but I'm not sure
> how this could be extended to explicitly test the new functionality.
It should be rather easy, the `extra` parameter in the after_crash() function holds the contents of the .extra file, so it should have the StackTraces field populated. You'd have to check that `StackTraces.threads.length` is larger than 1.
> I'll look into adding the test for CrashService, that should be pretty
> similar to what I had done for the previous patch series.
Yep.
Assignee | ||
Comment 20•7 years ago
|
||
Attachment #8922492 -
Flags: review?(gsvelto)
Assignee | ||
Comment 21•7 years ago
|
||
Attachment #8922493 -
Flags: review?(gsvelto)
Updated•7 years ago
|
Attachment #8922492 -
Flags: review?(gsvelto) → review+
Comment 22•7 years ago
|
||
Comment on attachment 8922493 [details] [diff] [review]
Part 5. Test analyzing all threads of a hang through crash service
Excellent, let's land this.
Attachment #8922493 -
Flags: review?(gsvelto) → review+
Assignee | ||
Comment 23•7 years ago
|
||
Ready for checkin, try looks good with the newly augmented test: https://treeherder.mozilla.org/#/jobs?repo=try&revision=73c199795dc9c8e4bddae3ecae74123ddaf6022b&selectedJob=140000316
Keywords: checkin-needed
Comment 24•7 years ago
|
||
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/0290bb0af376
Part 1: Option to analyze all threads. r=gsvelto
https://hg.mozilla.org/integration/mozilla-inbound/rev/fa28abe68df1
Part 2: Dump all threads for "hang" type crashes. r=gsvelto
https://hg.mozilla.org/integration/mozilla-inbound/rev/e5860d5128fb
Part 3: Process all threads when crashing for hang. r=gsvelto
https://hg.mozilla.org/integration/mozilla-inbound/rev/3ccd39a02c21
Part 4: Test real crash dump, multiple threads. r=gsvelto
https://hg.mozilla.org/integration/mozilla-inbound/rev/788f5831a14d
Part 5: Test processing all threads for hang. r=gsvelto
Keywords: checkin-needed
Comment 25•7 years ago
|
||
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/19ca35d65a03 for eslint failure, https://treeherder.mozilla.org/logviewer.html#?job_id=140061817&repo=mozilla-inbound
Flags: needinfo?(agashlin)
Assignee | ||
Comment 26•7 years ago
|
||
Fixed the eslint semicolon issue, running a try suite in case I missed something else: https://treeherder.mozilla.org/#/jobs?repo=try&revision=de2618667452533090e77ec5e3cbd6d4b4371b13
Attachment #8922493 -
Attachment is obsolete: true
Flags: needinfo?(agashlin)
Attachment #8922897 -
Flags: review+
Assignee | ||
Updated•7 years ago
|
Keywords: checkin-needed
Comment 27•7 years ago
|
||
Pushed by philringnalda@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/fbd7462750c6
Part 1. Option to analyze all threads. r=gsvelto
https://hg.mozilla.org/integration/mozilla-inbound/rev/f5219bf056ec
Part 2. Dump all threads for "hang" type crashes. r=gsvelto
https://hg.mozilla.org/integration/mozilla-inbound/rev/687bbdd7afaf
Part 3. Process all threads when crashing for hang. r=gsvelto
https://hg.mozilla.org/integration/mozilla-inbound/rev/b5b0c49cc271
Part 4. Test real crash dump, multiple threads. r=gsvelto
https://hg.mozilla.org/integration/mozilla-inbound/rev/6873f266d7a6
Part 5. Test processing all threads for hang r=gsvelto
Keywords: checkin-needed
Comment 28•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/fbd7462750c6
https://hg.mozilla.org/mozilla-central/rev/f5219bf056ec
https://hg.mozilla.org/mozilla-central/rev/687bbdd7afaf
https://hg.mozilla.org/mozilla-central/rev/b5b0c49cc271
https://hg.mozilla.org/mozilla-central/rev/6873f266d7a6
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
status-firefox58:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
You need to log in
before you can comment on or make changes to this bug.
Description
•