Closed Bug 1104317 Opened 6 years ago Closed 6 years ago

Signatures for shutdown crashes


(Socorro :: General, task)

Windows 7
Not set


(Not tracked)



(Reporter: dmajor, Assigned: lars)



The shutdown crashes added by 1038342 are coming from a watchdog thread, which is uninteresting for purposes of crash bucketing and analysis. It would be more helpful to see what the main thread was doing.

If the signature matches:
[@ mozilla::`anonymous namespace''::RunWatchdog(void*)]
[@ mozilla::(anonymous namespace)::RunWatchdog(void*)]
(or maybe just some regex for RunWatchdog)

Then instead let's show the stack for thread 0, ignoring anything on the regular ignore-list as well as anything containing ProcessNextEvent. E.g. bp-569807e3-431a-4f7b-a800-0bbc62141124 would display as "shutdownhang | 
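
A minimal sketch of the rule proposed above, in Python. It is not the Socorro implementation: the function name, the regexes, and the sample frames are hypothetical, and it assumes the new signature is "shutdownhang" followed by the first frame of thread 0 that survives filtering.

```python
import re

# Loose pattern covering both spellings of the anonymous namespace
# (the "maybe just some regex for RunWatchdog" option).
WATCHDOG_RE = re.compile(r"RunWatchdog")

# Hypothetical stand-in for the regular ignore list.
IGNORE_RE = re.compile(r"^(NtWaitForSingleObject|WaitForSingleObjectEx|KiFastSystemCallRet)")

# Frames skipped only when building a shutdown-hang signature.
SKIP_FOR_HANG_RE = re.compile(r"ProcessNextEvent")


def shutdownhang_signature(signature, thread0_frames):
    """Return a thread-0 based signature when the watchdog is the crasher."""
    if not WATCHDOG_RE.search(signature):
        # Not a watchdog crash: leave the signature alone.
        return signature
    kept = [
        frame
        for frame in thread0_frames
        if not IGNORE_RE.search(frame) and not SKIP_FOR_HANG_RE.search(frame)
    ]
    # Assumption: only the first surviving frame is shown.
    return " | ".join(["shutdownhang"] + kept[:1])


# Made-up thread 0 stack, for illustration only.
frames = [
    "NtWaitForSingleObject",
    "nsThread::ProcessNextEvent(bool, bool*)",
    "nsXREDirProvider::DoShutdown()",
]
print(shutdownhang_signature(
    "mozilla::`anonymous namespace''::RunWatchdog(void*)", frames))
```

With the made-up stack above, this prints "shutdownhang | nsXREDirProvider::DoShutdown()". A signature that does not mention RunWatchdog is returned unchanged.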

This won't work well for busy-hangs like bp-b776b61d-a73a-40b1-81dd-581212141124. Taking the approach above, we'd get tons of random signatures. bsmedberg, are you ok with that, or do you want to include some kind of cleverness, like searching for a frame containing the word Shutdown?
Flags: needinfo?(benjamin)
Let's ignore the busy-hang case for now. If we want to, we can go back and instrument that case by measuring and annotating CPU usage.

In bug 1103833 I suggested that rather than using RunWatchdog as the marker, we could use an explicit annotation. But in the short term, RunWatchdog is probably good enough.
Flags: needinfo?(benjamin)
See Also: → 1103833
Blocks: 1103833
Lonnen, can you find an owner for this? It's needed in order to diagnose one of our biggest nightly crashes.
Flags: needinfo?(chris.lonnen)
Can "ProcessNextEvent" be added to the general ignore list, or must it be a special case for this signature variant only?

Adding that frame signature to the general ignore list makes this trivial to implement. Having it as a special case makes the implementation a bit more complicated.
Flags: needinfo?(dmajor)
Flags: needinfo?(chris.lonnen)
Flags: needinfo?(benjamin)
QA Contact: lars
Am I remembering correctly, that there are two lists, an "ignore" list and an "append" list?

I think it would be reasonable to include /ProcessNextEvent/ on the general append list. (I wouldn't want to strip it altogether though.)
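
To illustrate the difference between the two lists as I understand it, here is a rough Python sketch. It is not Socorro's code: `build_signature`, both regexes, and the sample frames are hypothetical. A frame on the "ignore" list is dropped altogether, while a frame on the "append" list stays in the signature and the next frame gets appended after it.

```python
import re

# Hypothetical list contents, for illustration only.
IGNORE_RE = re.compile(r"KiFastSystemCallRet|NtWaitForSingleObject")
APPEND_RE = re.compile(r"ProcessNextEvent")


def build_signature(frames):
    """Join frames into a signature using ignore/append semantics."""
    parts = []
    for frame in frames:
        if IGNORE_RE.search(frame):
            # "ignore" list: the frame never appears in the signature.
            continue
        parts.append(frame)
        if not APPEND_RE.search(frame):
            # An ordinary frame ends the signature.
            break
        # "append" list: keep the frame and continue to the next one.
    return " | ".join(parts)


# Made-up stack, for illustration only.
frames = [
    "NtWaitForSingleObject",
    "nsThread::ProcessNextEvent(bool, bool*)",
    "NS_ProcessNextEvent(nsIThread*, bool)",
    "nsXREDirProvider::DoShutdown()",
    "ScopedXPCOMStartup::~ScopedXPCOMStartup()",
]
print(build_signature(frames))
```

For this stack the wait frame is stripped, both ProcessNextEvent frames are kept, and the signature ends at nsXREDirProvider::DoShutdown(). That is why the append list avoids stripping ProcessNextEvent while still reaching the more interesting frame below it.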
Flags: needinfo?(dmajor)
Yes, that is correct, there are both "ignore" and "append" lists (see:  

I have proceeded with adding ProcessNextEvent to the "append" list.  

To see if my modifications produce what you expect, please compare these*:

from production, a crash with the target signature:

that same crash in staging, reprocessed with a new signature rule for "shutdown hangs":

Please verify that the signature in staging is correct.  On your approval, I will submit this PR and if you act quickly, I bet we can get this into production this week.  

* The original example cited in Comment #0 could not be used because its symbols have expired; reprocessing it now results in a signature that wouldn't trigger the shutdown hang rule.
Assignee: nobody → lars
Flags: needinfo?(dmajor)
QA Contact: lars
Flags: needinfo?(benjamin)
Flags: needinfo?(dmajor)
Blocks: 1123698
See Also: 1103833
Commit pushed to master at
Merge pull request #2588 from twobraids/runwatchdog

Fixes Bug 1104317 - adds SignatureRunWatchDog rule to processor
Closed: 6 years ago
Resolution: --- → FIXED
"shutdownhang..." signatures are now streaming out of the Socorro processor.  There are about 4 per minute.