Closed Bug 559360 Opened 14 years ago Closed 14 years ago

UI changes for report/index hang detection

Categories

(Socorro :: General, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: laura, Assigned: ozten)

References

Details

Attachments

(6 files, 3 obsolete files)

We need UI to support hang detection.  (See bug 544940 for details.)
Quick UI questions for any devs listening:

* Should the child and parent minidump signatures be listed separately in listings like Top Crashers?

* Are there circumstances when a hang minidump might have the same signature as a non-hang report (like NtWaitForSingleObject above)? Would it be useful to distinguish these different kinds of signatures? (a la bug 559174)

* Would some kind of visual cues/filtering/sorting for either report be useful? As an advanced search parameter or a custom report? (a la bug 559137)
The most important UI is a simple paragraph/link on individual reports:

"This crash report is one part of a two-part hang report. The other half of this hang report can be found <a href="otherreport">here</a>."

The signature changes from bug 559174 should take care of the second question.

As for searching and topcrash reports, I don't know exactly what I want yet; I'm open to suggestions. We could at least have a checkbox/radio system to search all/hangs-only/crashes-only. Combined with the plugin-process/browser-process/all radio box that already exists, that would give pretty good control.
Small (but noticeable) message in the top-right corner, with a link to the corresponding browser/plug-in crash.
I'm still not certain why these reports should be separate or linked to each other.  Maybe that was just the most technically expedient way get the data off the hanging user's PC and into the data base; but by the time it reaches the reporting system it seams like that when we are able to make the association between two reports we should just be showing both stacks together.
If you want to do something more complex, that's fine, but I really don't want it to get in the way of the short-term solution. Worse is better.
Two ways of specifying searches that include/exclude hang minidumps.
I like the second way personally, but obviously they solve the same problem so I'm not picky!
there is another bug, or bugs, on file the puts us on the way to a general purpose search mechanism on just about any combination of search fields (or at least the most importantant ones).

following that idea if we wanted to identify the hang crashes we would just enter


  Stack Sig    [is exactly + drop down] [ Enter signature  ]

  Product      [ drop down  ]
  Release      [ drop down  ]
  Build ID     [ drop down  ]

  Crash Reason [ drop down  ]   
  OS version   [ drop down  ]

  User Comment [ Enter Search Term Here ]
 

 EXCEPTION_BREAKPOINT would be part of drop down on Crash Reason or we could change that to "Hang"  if that's more appropriate
(In reply to comment #4)
> I'm still not certain why these reports should be separate or linked to each
> other.  Maybe that was just the most technically expedient way get the data off
> the hanging user's PC and into the data base; but by the time it reaches the
> reporting system it seams like that when we are able to make the association
> between two reports we should just be showing both stacks together.

If we present the two reports together, what kind of arrangement would be useful? Seeing the stacks/tables side-by-side? One after the other? Or one at a time? We could accomplish the latter w/ the last design I posted and some UI tweaks (e.g. remembering the last active tab). 

In addition, if we pair the two reports, how would they appear in lists of signatures/reports, like Top Crashers or search results? Would they be listed as a single row (e.g. 'BrowserSignature / PluginSignature')? Would this happen at the signature level?

Re comment #8: I'll address that in the appropriate bug(s). Is this change slated for 1.7?
Adding some icons and links to highlight hang reports. Ignore most of the visual changes (these were taken from a different bug's mocks)-- the key things are the hang icons and the paired signature data.
(In reply to comment #9)
> (In reply to comment #4)

> 
> Re comment #8: I'll address that in the appropriate bug(s). Is this change
> slated for 1.7?

no, but its been long talked about. mostly work has been held off until the new backend is in place.   that could make searching across a wide variety of parameters easier.  its definitely useful in drilling down to find the right set of correlations and reports that are associated with a specific problem.

it turns out that EXCEPTION_BREAKPOINT isn't the best way to identify the hang reports so "Crash Cause" or maybe "Has a Hang ID" or even "Is a hang report" with check boxes might be a good idea as an additional field in the short and long term.
Attached image ADU Report w/ Hang Data
Two ideas for filtering ADU reports w/ crash data:

1. Add an extra field to the query form for selecting crashes, hangs, or both. 

2. Add a selection to the chart, which filters on the same conditions, but does so in the client. Permits for easier comparisons, but adds implementation complexity. Could also be used w/ ADU charts in other reports (e.g. Version Overview).
> (In reply to comment #12)

looks good to me.  The extra complexity of option 2 would be nice but definitely not critical.  It would be fine to just regenerate the graphs as needed.

while we are mucking about on that page can we change the colors or just get rid of that light blue on the grey background of the data table.  That is pretty unreadable.   To see the problem view the middle data column of http://crash-stats.mozilla.com/daily?p=Firefox&v[]=
(In reply to comment #13)
> while we are mucking about on that page can we change the colors or just get
> rid of that light blue on the grey background of the data table.  That is
> pretty unreadable.   To see the problem view the middle data column of
> http://crash-stats.mozilla.com/daily?p=Firefox&v[]=

Ugh, yeah. The column headings should use the same colors as the chart, which right now are [ #058DC7, #ED561B, #50B432, #990099 ]. These colors should be legible on a gray background.
Target Milestone: --- → 1.6.2
(In reply to comment #14)
@Chris couple questions for you.

Is this good to go? If so I'll grab the bug.

(In reply to comment #6)
I'm start with building the second version which is Smedberg's preference in Comment #7.

(In reply to comment #12)
This may require backend API work, if so I may split this into a new backend and frontend bug for tracking.
(In reply to comment #15)
> Is this good to go? If so I'll grab the bug.

If there are no further issues or requests, these mocks are ready to go. Specifically:

  Crash Reports (attachment #1 [details] [diff] [review]):
  As-is

  Advanced Search (attachment #2 [details] [diff] [review]):
  Use the second version (separate type/process fields)

  Signature Listing (attachment #3 [details] [diff] [review]):
  Use the hang icons. The new table visual design is optional.

  ADU Version (attachment #4 [details] [diff] [review]):
  Use the first version (new Type field in the query form)
(In reply to comment #16)
Awesome, thanks!
Assignee: chowse → ozten.bugs
Blocks: 560627
Blocks: 560628
No longer blocks: 560628
I've moved the ADU updates into Bug#560627 (backend) and Bug#560627 (frontend).
(In reply to comment #18)
> I've moved the ADU updates into Bug#560627 (backend) and Bug#560627 (frontend).

bug 560628 (frontend)
(In reply to comment #10)
I mistook this for mockup for search results.

I think the mockup is actually top crashers signature listings.

Issues:
1) These signature listings are aggregate
1.1) Do we need to segment a crash signature by process type? (same signature for browser and plugin crash)
1.2) Is it possible that the same signature would show up for both hang and traditional crashes, or is this just a corner case?
2) The data for these reports is in the backend webservice, so to do this properly, we should update the output of that service (no new inputs)

2 probably can't happen for 1.6.2 so I'll look at a workaround and we can always implement 2 for a perf boost.
(In reply to comment #20)
1.3) Does a signatures hang pair always match up to 1 other signature? If not, then clicking the icon to reveal it's hang pair wouldn't make sense, or would reveal a list of possible hang pairs.
The search results are the important bit for me... if the topcrash listings don't have this data initially, it's not that hard to use advanced search to drill down and figure it out.

1.1 probably not.
1.2 Because although it is *possible* for the same signature to happen in the child or parent, in that case it's likely to be the same bug or have a similar cause.

1.3 no, definitely not. See, for instance http://office.smedbergs.us/crashes/hang-summary.html which is correlation data I've pulled out from my own database mirror. Expanding the first CallNPP_SetWindow parent stack will show hundreds of child-side stack pairs.
Updates Advanced Search form.
Adds Hang Details widgets to query/query, report/list, report/index, and topcrash/byversion.

3 of the 4 issues in Bug#561228 are fixed with this patch, since I was already in this area.

I'll attach a text file of Query Changes as this patch alters some core SQL.

Testing Notes:
Patch doesn't include images (steal them from /home/aking/breakpad/socorro-hangu/webapp-php/img/3rdparty/)
Attachment #441062 - Flags: review?(ryan)
Attachment #441062 - Flags: review?(laura)
Attachment #441062 - Flags: feedback?
In Search we now do an outer join on plugins_reports so that we can get OOPP and Hang Details in aggregate views.

No queries are changed for individual crashes, since this information comes primarily from the jsonz file.

SQL changes affect the following queries:
* soc.web common.queryTopSig
* soc.web common.totalQueryReports
* soc.web common.queryReports
* soc.web common.queryFreq
* soc.web report.getCommentsBySignature

New SQL query:
* soc.web topcrash.oop
* soc.web report uuid from hangid
* soc.web report hangpairs from uuid

Query Plans for update SQL:
http://pastebin.mozilla.org/717870
(In reply to comment #25)
Doh, I'm missing an exlusion constraint for the three queries that join against plugins_reports.

WHERE reports.date_processed BETWEEN TIMESTAMP '2010-04-23 08:47:23' - CAST('1 weeks' AS INTERVAL) AND TIMESTAMP '2010-04-23 08:47:23'

should be 
WHERE reports.date_processed BETWEEN TIMESTAMP '2010-04-23 08:47:23' - CAST('1 weeks' AS INTERVAL) AND TIMESTAMP '2010-04-23 08:47:23' AND
plugins_reports.date_processed BETWEEN TIMESTAMP '2010-04-23 08:47:23' - CAST('1 weeks' AS INTERVAL) AND TIMESTAMP '2010-04-23 08:47:23'

This will eliminate the three instances of 
 Seq Scan on plugins_reports  (cost=0.00..20.40 rows=1040 width=40)
                                            ->  Seq Scan on plugins_reports_20100125 plugins_reports  (cost=0.00..20.40 rows=1040 width=40)
                                            ->  Seq Scan on plugins_reports_20100201 plugins_reports  (cost=0.00..53.48 rows=848 width=9)
                                            ->  Seq Scan on plugins_reports_20100208 plugins_reports  (cost=0.00..284.91 rows=5391 width=9)
Attachment #441062 - Attachment is obsolete: true
Attachment #441115 - Flags: review?(ryan)
Attachment #441115 - Flags: review?(laura)
Attachment #441062 - Flags: review?(ryan)
Attachment #441062 - Flags: review?(laura)
Attachment #441062 - Flags: feedback?
I don't think pastebin likes my text, it was truncated. Attaching. (Includes plugins_reports exclusion update)
Attachment #441115 - Attachment is obsolete: true
Attachment #441128 - Flags: review?(ryan)
Attachment #441128 - Flags: review?(laura)
Attachment #441115 - Flags: review?(ryan)
Attachment #441115 - Flags: review?(laura)
Attachment #441128 - Flags: review?(ryan) → review-
Comment on attachment 441128 [details] [diff] [review]
3rd patch - includes missing hang_details.php

Good work on this one Austin.  Most of the code is solid.  I'm going to r- this, but it won't need too much work for an r+.  I've gone over most of the feedback with you in IRC, but here it is detailed out:

1. I'm concerned the error message might be a bit too vague for a user (which I've noticed on a number of pages throughout the site).  $details['pair_error'] = "Unable to load <a href='$otherUuid'>$otherUuid</a> please reload this page in a few minutes";
  
2. $row->uuid = '45f1e5e8-26af-41d7-b5a4-7e8532100419'; is hardcoded in hang_pairs()

3. The UI for the icons is unintuitive - I think the icons should be in their own column instead of being in the date column, and that the user should be able to sort on that column.

4. Mousing over an icon should change the mouse pointer to indicate it is a clickable object.  

5. After clicking icon, the "loading" status should be an ajaxy spinning icon to indicate that it is churning.
Attachment #441128 - Attachment is obsolete: true
Attachment #441219 - Flags: review?(ryan)
Attachment #441219 - Flags: review?(laura)
Attachment #441128 - Flags: review?(laura)
(In reply to comment #30)
Great feedback.

2, 3, 4, 5 fixed.

Will pass along to chowse, others for more feedback once this is staged.
Comment on attachment 441219 [details] [diff] [review]
4th patch with fixes from 1st code review

Well done Austin.  The UI still needs a bit of love, but we can tackle that at a later time.
Attachment #441219 - Flags: review?(ryan) → review+
Based on bsmedberg's feedback, the JavaScript in the query form is incorrect, as not all plugin crashes are hang crashes. I'll be fixing this shortly.

Otherwise, staging for test.

Sending        webapp-php/application/controllers/query.php
Sending        webapp-php/application/controllers/report.php
Sending        webapp-php/application/controllers/topcrasher.php
Sending        webapp-php/application/libraries/MY_SearchReportHelper.php
Sending        webapp-php/application/models/common.php
Sending        webapp-php/application/models/report.php
Sending        webapp-php/application/models/topcrashers.php
Adding         webapp-php/application/views/common/hang_details.php
Sending        webapp-php/application/views/common/list_by_signature.php
Sending        webapp-php/application/views/common/list_reports.php
Sending        webapp-php/application/views/common/list_topcrashers.php
Sending        webapp-php/application/views/common/query_form.php
Sending        webapp-php/application/views/report/index.php
Sending        webapp-php/css/screen.css
Adding         webapp-php/img/3rdparty
Adding         webapp-php/img/3rdparty/fatcow
Adding  (bin)  webapp-php/img/3rdparty/fatcow/application16x16.png
Adding  (bin)  webapp-php/img/3rdparty/fatcow/brick16x16.png
Adding  (bin)  webapp-php/img/3rdparty/fatcow/stop16x16.png
Adding         webapp-php/img/3rdparty/silk
Adding  (bin)  webapp-php/img/3rdparty/silk/chart_curve.png
Adding         webapp-php/img/3rdparty/silk/readme.txt
Adding  (bin)  webapp-php/img/ajax-loader16x16.gif
Sending        webapp-php/js/socorro/query.js
Sending        webapp-php/js/socorro/report_list.js
Sending        webapp-php/js/socorro/topcrash.js
Transmitting file data .......................
Committed revision 1983.
Updating code on stage so that hangid and process_type 'plugin' aren't directly related. Not all OOPP crashes are hangs.
Looks like I missed adding a 'default' choice for 'Process'.
This is actually labeled 'Browser' in the mockup.

I'll update the UI and common SQL.
(In reply to comment #36)
There are SQL performance issues with this change.
Attachment #441219 - Flags: review?(laura)
Hang Id for report/index will ship tonight in 1.6.2.

I've reverted all aggregate view enhancements for hangid due to bugs and to
allow 1.6.2 to ship tonight.

Search and topcrashers enhancements will ship in 1.7. I'll file a new bug to
track that work.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Blocks: 562375
Summary: UI changes for hang detection → UI changes for report/index hang detection
Verified in prod. Note: Major release crashes will lose pairs until 1.7.
Component: Socorro → General
Product: Webtools → Socorro
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: