Closed Bug 1035144 Opened 10 years ago Closed 8 years ago

Add AsyncShutdownTimeout to the crash signature

Categories

(Socorro :: General, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: Yoric, Unassigned, Mentored)

References

Details

(Whiteboard: [lang=python][diamond])

At the moment, all the AsyncShutdownTimeout crashes have the same signature, regardless of their cause which makes the signature useless/counter-productive (e.g. bug 944873 was about a specific crash and has completely been hijacked by other crashes with the same signature).

For these crashes, neither the native stack nor the native signature make sense, but only the contents of metadata `AsyncShutdownTimeout`, which looks as follows

{
  phase: string,
  conditions: array of [
    name: string, // This is the most important piece of information
    state: either arbitrary JSON or the string "(none)",
    // Since FF 33
    lineNumber: number,
    filename: string,
    // In FF 34+, we will probably add
    stack: string
  ]
}

I believe that it would be more productive if we had the following signature:

AsyncShutdownTimeout | name [, name, name ...]
Benjamin, what do you think of this change?
Flags: needinfo?(benjamin)
I support this change, with the minor nit that it should be |-separated

AsyncShutdownTimeout | name [| name ...]

If the AsyncShutdownTimeout annotation is not valid JSON or has a 0-length conditions field (or other unexpected errors), the signature should be "AsyncShutdownTimeout | UNKNOWN"

https://github.com/mozilla/socorro/commit/6881b36682abf0041db8841cc6743932f6bb1fdf is an recent example of how to make these kinds of signature modifications. This one should be quite simple. Lars do you have any webdev/socorro volunteers you could mentor for this?
Flags: needinfo?(benjamin) → needinfo?(lars)
As a side-note, once bug 1034726 has landed (today?), we might wish to expose the JavaScript stacks alongside with (or instead of) the C++ stacks.
Let's not make that part of this bug.
Benjamin, I unfortunately do not have a volunteer/apprentice/protege/slave eager and willing to write such code.  You are correct that this is likely to be quite simple - in your reference to the git hub commit, this new transform rule will be patterned after  he OOM signature rules found on lines 138-180 in the commit that you reference.
Flags: needinfo?(lars)
Mentor: lars
Whiteboard: [lang=python][diamond]
Hi,
I would like to work on this :)
Hi,
I would like to work on this bug,
Thanks.
Yoric, this came up again recently and I was looking through some recent crash reports. See https://crash-stats.mozilla.com/report/index/9d500c33-c15d-458c-b471-bd1ea2160823 for an example of a crash that has no active conditions in the "Places Clients shutdown" block. Does that make any sense/do you have suggestions for figuring that out?
Flags: needinfo?(dteller)
https://github.com/mozilla/socorro/pull/3443

This is an untested patch without unit tests. Can somebody from the socorro team take this across the finish line, since I don't have a dev environment?
Flags: needinfo?(peterbe)
I would love to have this fix in Socorro, so that we can better understand reports and apparent spikes in crashes like we see right now in bug 1255745.
(In reply to Benjamin Smedberg [:bsmedberg] from comment #9)
> https://github.com/mozilla/socorro/pull/3443
> 
> This is an untested patch without unit tests. Can somebody from the socorro
> team take this across the finish line, since I don't have a dev environment?

I'll write a unit test towards your PR but perhaps we can wait till Monday (next week) for Adrian to do a final review and merge.
Flags: needinfo?(peterbe)
(In reply to Benjamin Smedberg [:bsmedberg] from comment #8)
> Yoric, this came up again recently and I was looking through some recent
> crash reports. See
> https://crash-stats.mozilla.com/report/index/9d500c33-c15d-458c-b471-
> bd1ea2160823 for an example of a crash that has no active conditions in the
> "Places Clients shutdown" block. Does that make any sense/do you have
> suggestions for figuring that out?

That's a bit surprising. It should mean that the problem is in "Places Clients shutdown" itself, rather than any of its blockers, but I can't see how this can happen in this barrier.

So, in all likelihood, this indicates a bug in the state-reporting method of this barrier: `PlacesShutdownBlocker::GetState`. We should check this method and this will probably help us find out what causes the timeout.
Flags: needinfo?(dteller)
Commits pushed to master at https://github.com/mozilla/socorro

https://github.com/mozilla/socorro/commit/6f7f79ef5e57c4fbf6b0e67e10d4fc0468684219
Bug 1035144 - Modify signatures for AsyncShutdownTimeout to represent the shutdown conditions instead of the stack.

https://github.com/mozilla/socorro/commit/9498594e122431834deefbde3d9767d6b6127fb3
tests for bug 1035144

https://github.com/mozilla/socorro/commit/6a82c599769cb4d286a626b0758b599a669f55dc
Merge pull request #1 from peterbe/tests-for-bug-1035144

tests for bug 1035144

https://github.com/mozilla/socorro/commit/3aa8728b1f51c498cb39009ffb884c8b413f6a38
Merge pull request #2 from adngdb/bug1035144

Removed inheritence from SignatureGenerationRule.

https://github.com/mozilla/socorro/commit/344979da66b185fc61ba660e9848591bf6a1ef14
Bug 1035144 - Modify signatures for AsyncShutdownTimeout to represent the shutdown conditions instead of the stack. (#3443)

* Bug 1035144 - Modify signatures for AsyncShutdownTimeout to represent the shutdown conditions instead of the stack.

* tests for bug 1035144

* Removed inheritence from SignatureGenerationRule.

* review

* fix test
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.