Open Bug 1110498 Opened 7 years ago Updated 2 years ago

Add UI to display minidumps for multiple processes

Categories

(Socorro :: Webapp, task, P2)

Tracking

(Not tracked)

People

(Reporter: mconley, Unassigned)

Details

(Whiteboard: [lang=python+html])

With e10s, it's possible for us to intentionally kill the child (ContentParent::KillHard), and we want to gather minidumps from both the parent and child process.

We would like the web interface for crash-stats to let us view the stacks for both processes.

I'm pretty sure this is a dupe - bsmedberg said something like this was filed - but I couldn't find the bug, so filing this one. needinfo'ing bsmedberg for the bug # to dupe to.
Flags: needinfo?(benjamin)
I can't find the bug. So either I never filed it, or it was a comment in some other bug that got lost.

In any case, here are some details about how to fix this:

https://github.com/mozilla/socorro/blob/master/webapp-django/crashstats/crashstats/views.py#L998 is the view for report/index
https://github.com/mozilla/socorro/blob/master/webapp-django/crashstats/crashstats/templates/crashstats/report_index.html is the template

https://crash-stats.mozilla.com/report/index/8dbd0b34-4c6d-49cb-84f9-736132141215 is an example of a plugin hang report with four dumps.

My recommendation for UI would be to take the "Crashing Thread" section of the "Details" tab and wrap that in a tabbed UI with one tab for each dump. So in this case, there would be four tabs: "plugin,browser,flash1,flash2". And then show the crashing-thread UI within each tab.
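A rough sketch of how the view could prepare one tab per dump for the template. This is not the existing Socorro code: the `DumpTab` class, the `build_dump_tabs` helper, and the shape of the `dumps` mapping (dump name mapped to a dict with a `crashing_thread` list) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DumpTab:
    """One tab in the proposed tabbed "Crashing Thread" UI (hypothetical)."""
    name: str                       # dump name shown on the tab, e.g. "browser"
    tab_id: str                     # HTML id the template could use for the panel
    crashing_thread: list = field(default_factory=list)  # frames to render


def build_dump_tabs(dumps):
    """Turn a mapping of dump name -> dump data into an ordered list of tabs.

    The input shape is an assumption; the real processed crash may differ.
    """
    tabs = []
    for name, dump in dumps.items():
        tabs.append(
            DumpTab(
                name=name,
                tab_id=f"dump-{name}",
                crashing_thread=dump.get("crashing_thread", []),
            )
        )
    return tabs
```

The template would then loop over the returned list and render the existing crashing-thread markup once per tab.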
Component: General → Webapp
Flags: needinfo?(benjamin)
Whiteboard: [lang=python+html]
Schalk, would you be interested in doing this one?
I think I can help with figuring out where the data would come from etc. And if the two of us get stuck, we'll always have bsmedberg to get us unstuck.
(In reply to Peter Bengtsson [:peterbe] from comment #2)
> Schalk, would you be interested in doing this one?
> I think I can help with figuring out where the data would come from etc. And
> if the two of us get stuck, we'll always have bsmedberg to get us unstuck.

Definitely, let me know what the next steps are. Although the next steps will most likely only happen in Q1, right?
My request on IRC was for somebody to sign up as a mentor. I might be able to find a webdev volunteer to take the bug.
Is there any chance we can get cooking on this asap? We have thousands of orphaned content crash signatures showing up in crashstats with no way of correlating them with browser stacks. The browser stack is what we need to diagnose accurately. (See bug 1116884 for some detail.)
Blocks: killhard-win
Depends on: 1068349
Flags: needinfo?(schalk.neethling.bugs)
(In reply to Jim Mathies [:jimm] from comment #5)
> Is there any chance we can get cooking on this asap? We have thousands of
> orphaned content crash signatures showing up in crashstats with no way of
> correlating them with browser stacks. The browser stack is what we need to
> diagnose accurately. (See bug 1116884 for some detail.)

I believe a web-dev volunteer/contributor is going to work on this with Peter as mentor.
Flags: needinfo?(schalk.neethling.bugs)
Peter, do we have someone lined up to work on this currently?
Flags: needinfo?(peterbe)
Assignee: nobody → peterbe
(In reply to Jim Mathies [:jimm] from comment #7)
> Peter, do we have someone lined up to work on this currently?

me! :)
Flags: needinfo?(peterbe)
Status: NEW → ASSIGNED
This UI really shouldn't be blocking anyone's work, since the data is all available via the API. I hacked together a quick viewer:

http://benjamin.smedbergs.us/tests/socorroloader/multiple-minidumps.html

You can paste in the ID of a multi-dump report and it will show all the dumps side-by-side, e.g. 9a08adeb-1a87-4ba4-a641-f78882150119 (two dumps) or 8b7d2833-435e-4448-a6a9-1e8102150117 (four dumps).
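For anyone who wants to pull the same data from a script, here is a minimal sketch. It assumes the processed crash is served from an endpoint named `ProcessedCrash` that takes a `crash_id` query parameter; that endpoint name and parameter are assumptions to check against the API listing, and the helper names are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL of the crash-stats API; the endpoint name below is an assumption.
API_BASE = "https://crash-stats.mozilla.com/api/"


def processed_crash_url(crash_id, base=API_BASE):
    """Build the URL for fetching one processed crash by its ID."""
    return f"{base}ProcessedCrash/?{urlencode({'crash_id': crash_id})}"


def fetch_processed_crash(crash_id):
    """Fetch and decode the processed crash JSON (performs a network request)."""
    with urlopen(processed_crash_url(crash_id)) as response:
        return json.load(response)
```

Calling `fetch_processed_crash("9a08adeb-1a87-4ba4-a641-f78882150119")` would return the decoded JSON for that report, provided the endpoint behaves as assumed.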
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #9)
> This UI really shouldn't be blocking anyone's work, since the data is all
> available via the API. I hacked together a quick viewer:

Ahh, awesome! Is there a wiki or MDN page on this API? I searched this week for docs on a Socorro JSON API or similar but didn't find anything! I had hoped to find a temp fix for this as well.

> http://benjamin.smedbergs.us/tests/socorroloader/multiple-minidumps.html you
> can paste in the ID of a multi-dump report and it will show all the dumps
> side-by-side, e.g. 9a08adeb-1a87-4ba4-a641-f78882150119 (two dumps) or
> 8b7d2833-435e-4448-a6a9-1e8102150117 (four dumps).

thanks, will take this for a spin today.
(In reply to Jim Mathies [:jimm] from comment #10)
> (In reply to Benjamin Smedberg  [:bsmedberg] from comment #9)
> > This UI really shouldn't be blocking anyone's work, since the data is all
> > available via the API. I hacked together a quick viewer:
> 
> Ahh, awesome! Is there a wiki or mdn page on this api?? I searched this week
> for docs on a socorro json api or similar but didn't find anything! I had
> hoped to find temp fix for this as well.

It is somewhat self-documented at: https://crash-stats.mozilla.com/api/

We could probably do with better descriptions of each of the endpoints, hopefully the names are somewhat self-explanatory.

We're definitely open to suggestions for improvements and available to provide help if you can't find what you need or hit anything unexpected, either in bugs or IRC (#breakpad).
(In reply to Robert Helmer [:rhelmer] from comment #11)
> (In reply to Jim Mathies [:jimm] from comment #10)
> > (In reply to Benjamin Smedberg  [:bsmedberg] from comment #9)
> > > This UI really shouldn't be blocking anyone's work, since the data is all
> > > available via the API. I hacked together a quick viewer:
> > 
> > Ahh, awesome! Is there a wiki or mdn page on this api?? I searched this week
> > for docs on a socorro json api or similar but didn't find anything! I had
> > hoped to find temp fix for this as well.
> 
> It is somewhat self-documented at: https://crash-stats.mozilla.com/api/
> 
> We could probably do with better descriptions of each of the endpoints,
> hopefully the names are somewhat self-explanatory.
> 
> We're definitely open to suggestions for improvements and available to
> provide help if you can't find what you need or hit anything unexpected,
> either in bugs or IRC (#breakpad).

Great, thanks. I've finally had a chance to look at some of these parent stacks, and the information available is pretty limited. Knowing this, and with the API documentation in hand to do more digging on my own, I think we can downgrade the priority on this bug.
Severity: normal → enhancement
Hardware: x86 → All
I think this would go a long way toward helping engineers who are less experienced with crash-stats understand what data are available.

For example, a lot of platform developers have spun their wheels for quite a while on ShutDownKill crashes (whose signatures were just fixed in bug 1269817) without understanding that the crash reports they were looking at were the parent process killing the child process.  If paired minidump UI had been visible, people wouldn't have wasted nearly as much time on these.  (That said, that's far from the only problem that led to that situation, but I think it would have substantially reduced the time wasted.)
Assignee: peterbe → nobody
Status: ASSIGNED → NEW

Nika and Jed work on IPC issues, and for them, having the stacks of all the minidumps visible in the web UI is important because they need to see both sides of the IPC channel (I'm pretty sure that's true; if not, please comment). Since Socorro only processes one of the minidumps and shows that in the Crash Stats web UI, they have to find the crash report, download the other minidumps, process them locally, and then do their analysis on that. It's time-consuming and involves multiple steps and PII access.

If I understand this bug correctly, I think we need to make a few changes:

  1. The Socorro processor isn't processing the other minidumps; it only processes the main one. The output of processing the main minidump is stored under the json_dump key in the processed crash. We would need to figure out where to store the output of processing the other minidumps.
  2. Once we fix the processor to process all the minidumps, then we need to fix the Crash Stats web ui to show the stacks from all the minidumps.
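
One possible shape for step 1, as a sketch only. It assumes the main dump is named `upload_file_minidump`, that a stackwalker can be called once per minidump path, and that the extra output lives under a new `additional_json_dumps` key. That key, the function name, and the `run_stackwalker` callable are all hypothetical; where to store the output is exactly the open question above.

```python
# Name of the main minidump in the raw crash (assumption for this sketch).
MAIN_DUMP_NAME = "upload_file_minidump"


def process_all_minidumps(minidumps, run_stackwalker):
    """Process every minidump instead of only the main one.

    minidumps: mapping of dump name -> path to the minidump file.
    run_stackwalker: callable taking a path and returning the parsed output
    (hypothetical stand-in for whatever the processor actually runs).
    """
    processed = {}
    additional = {}
    for name, path in minidumps.items():
        output = run_stackwalker(path)
        if name == MAIN_DUMP_NAME:
            # Existing behavior: main dump output goes under json_dump.
            processed["json_dump"] = output
        else:
            # Proposed: keep the other dumps, keyed by dump name.
            additional[name] = output
    processed["additional_json_dumps"] = additional  # hypothetical key
    return processed
```

Keeping `json_dump` unchanged means existing consumers would not need to change, and the web UI could read the new key to render the extra stacks in step 2.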

Making this a P2.

Priority: -- → P2