Closed
Bug 411358
Opened 18 years ago
Closed 17 years ago
Show top crashes by URL
Categories
(Socorro :: General, task, P1)
Socorro
General
Tracking
(Not tracked)
RESOLVED
FIXED
0.6
People
(Reporter: samuel.sidler+old, Assigned: ozten)
Details
Attachments
(5 files, 2 obsolete files)
Reported by morgamic, Jul 10, 2007
Need to show crashes by operating system and/or platform on the top crasher
page and possibly the main query page.
--
Comment 5 by morgamic, Aug 03, 2007
OS was done, we need something that shows top crashes by URL now.
Comment 1•18 years ago
Do we still want to show this report despite concerns about privacy issues?
Comment 2•18 years ago (Reporter)
We need this report. If it has to be behind a secure part of the site, so be it. But we need this.
Updated•18 years ago
Assignee: morgamic → nobody
Priority: -- → P1
Target Milestone: --- → 0.5
Comment 3•18 years ago
so here's a query that will get you URLs appearing in >10 crash reports, but strips off query strings:
SELECT split_part(url, '?', 1) AS url_part, count(*) AS c
FROM reports
WHERE url IS NOT NULL AND url != ''
GROUP BY url_part
HAVING count(*) > 10;
(needs a date limiter, obviously)
The downside is that you can't actually get to individual crash reports from that.
can you limit them to crashes that happen for more than 3 users? :)
Comment 5•18 years ago
SELECT split_part(url, '?', 1) AS url_part, count(*) AS c, count(DISTINCT user_id) AS users
FROM reports
WHERE url IS NOT NULL AND url != ''
GROUP BY url_part
HAVING count(*) > 10 AND count(DISTINCT user_id) > 3;
Seems to work. I don't know how much the database will hate that query though. :)
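For anyone who wants to try the thresholds without a database, the same filter can be sketched in Python over in-memory rows. This is only an illustration and not Socorro code: the `top_urls` function, the `(url, user_id)` row layout and the sample data are assumptions; the thresholds mirror the query above (more than 10 reports, more than 3 distinct users).

```python
from collections import defaultdict


def top_urls(rows, min_reports=10, min_users=3):
    """Return {url_without_query: (report_count, distinct_users)}.

    rows is an iterable of (url, user_id) tuples (hypothetical layout).
    Mirrors: HAVING count(*) > min_reports AND count(distinct user_id) > min_users
    """
    counts = defaultdict(int)
    users = defaultdict(set)
    for url, user_id in rows:
        if not url:  # skip NULL / empty URLs, like the WHERE clause
            continue
        url_part = url.split("?", 1)[0]  # same effect as split_part(url, '?', 1)
        counts[url_part] += 1
        if user_id is not None:  # count(distinct ...) ignores NULLs
            users[url_part].add(user_id)
    return {
        u: (c, len(users[u]))
        for u, c in counts.items()
        if c > min_reports and len(users[u]) > min_users
    }


# Made-up sample: 12 reports from 4 users on one page, 11 reports from 1 user on another.
sample = [("http://example.com/a?x=%d" % i, i % 4) for i in range(12)]
sample += [("http://example.com/b", 99)] * 11
result = top_urls(sample)
```

Only the first page passes both thresholds; the second has enough reports but a single user.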
Comment 6•18 years ago
CCing morgamic as he's my DBA guru.
Updated•17 years ago
Status: NEW → ASSIGNED
Updated•17 years ago
Assignee: nobody → justin.gallardo
Status: ASSIGNED → NEW
Comment 7•17 years ago
Justin could you take a look at this? Summary of requirements:
* for a given crash signature, show a list of URLs that occurred 3 or more times
* strip off the query arguments
* start off with the assumption that you'll be doing this the aggregate way
Comment 8•17 years ago
Comment 5 contains a query that will generate a "top URLs" list containing URLs included in >10 crash reports from >3 unique users.
Comment 9•17 years ago
Sure thing. I will start working on something for this right now.
Updated•17 years ago
Status: NEW → ASSIGNED
Updated•17 years ago
Target Milestone: 0.5 → ---
Updated•17 years ago
Target Milestone: --- → 0.6
Updated•17 years ago
Assignee: justin.gallardo → nobody
Updated•17 years ago
Target Milestone: 0.6 → ---
Updated•17 years ago
Assignee: nobody → aking
Target Milestone: --- → 0.6
Comment 10•17 years ago
Can we get an update on this feature?
Comment 11•17 years ago (Assignee)
It is on my TODO list and should be shipped by 12/15.
From bug 415027:
----
Comment #10 From Austin King 2008-12-08 11:26:03 PST
Wireframe for report:
http://people.mozilla.org/~aking/Socorro/TCbyURL/TCByURL-wireframe.jpg
Comment #11 From Samuel Sidler (:ss | :sps) 2008-12-08 11:29:36 PST
Actually, comment 10 should be in bug 411358.
----
As far as I can tell from that mockup, the crash reports displayed under each URL aren't grouped by crash signature, which seems...not very useful.
It's possible--I'd even say extremely likely--that a page like myspace.com could trigger crashes in WMP/F4M, Flash, even somewhere in layout or the parser, and the current mockup looks like it just tells us "myspace.com crashes a lot" instead of "myspace.com crashes a lot in WMP/F4M, myspace.com crashes a lot in Flash, myspace.com crashes a lot in this particular function in the html parser, myspace.com crashes a lot in function A in layout, and myspace.com crashes a lot in function B in layout".
Grouping by crash signature within the URL report makes these reports much more useful for QA, who'd otherwise have to dig back through all the comments figuring out which crashes appeared, and appeared most, on a page/site. (It also happens to be how Talkback presents this information, which was one of the parts of Talkback that worked :P )
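A rough sketch of the grouping being asked for, in Python. It is not taken from Socorro or Talkback: the `signatures_by_url` function, the `(url, signature)` pair layout and the sample counts are all assumptions, and the signature labels are stand-ins for real crash signatures.

```python
from collections import Counter, defaultdict


def signatures_by_url(reports):
    """Group crash counts by URL, then by signature within each URL.

    reports is an iterable of (url, signature) pairs (hypothetical layout).
    Returns a list of (url, total, [(signature, count), ...]), largest total first.
    """
    per_url = defaultdict(Counter)
    for url, signature in reports:
        per_url[url][signature] += 1
    ranked = []
    for url, sigs in per_url.items():
        # most_common() lists this URL's signatures from most to least frequent
        ranked.append((url, sum(sigs.values()), sigs.most_common()))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Made-up counts, only to show the shape of the output.
reports = (
    [("myspace.com", "Flash")] * 5
    + [("myspace.com", "WMP/F4M")] * 3
    + [("myspace.com", "html parser")] * 1
    + [("example.org", "Flash")] * 2
)
report = signatures_by_url(reports)
```

Each entry then reads as "this URL crashes N times, of which so many are in each signature", rather than a single undifferentiated total.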
Comment 13•17 years ago (Assignee)
Thanks Smokey for c12.
Updates with breakdowns below each url for signature.
http://people.mozilla.org/~aking/Socorro/TCbyURL/TCByURL-wireframeV2.jpg
Comment 14•17 years ago
So you show the first stack frame only? Wouldn't it be nice to also have the first frame with symbols available? I ask because of the example kernel32.dll in your mockup.
Comment 15•17 years ago
It won't be the first frame, it will be the crash signature. See bug 411349 about work to make the signatures more meaningful when the top frames aren't very unique.
Comment 16•17 years ago
Thanks Benjamin. That sounds good. So forget my last comment...
Comment 17•17 years ago (Assignee)
In V2 I am aggregating by domain, then urls, then crash signatures.
Here is V3 which just starts with urls
http://people.mozilla.org/~aking/Socorro/TCbyURL/TCByURL-wireframeV3.jpg
So the myspace.com domain might show up at #3, #17, and #50 in top crashes by url.
Is V3 more useful than V2?
Background:
This Top Crashers by URL depends on two features we don't have yet, authentication ( to display full urls and link to reports ) and search by domain/url.
One of the original constraints was thinking about the page for logged-in versus non-logged-in users. Aggregating by domains would be the public view. If you were logged in, then you could drill down into URLs.
I think I like v2 better than v3, but that may just be a bias towards expecting a domain to have similar types of content/crashes across all of its pages. My only concern with v2 over v3 is that v2 could "hide" a relatively large crash on, say, not-quite-myspace-popular-but-still-important.com because of the volume of pages on myspace.com and other large sites.
That is, myspace.com in aggregate has 500 crashes, but 50 of those come from myspace.com/crashme and the other 450 are spread across 100 pages (~4.5 crashes per other page). facebook.com, flickr.com, yahoo.com, and so forth are also in the same situation. Then not-quite-myspace-popular-but-still-important.com (mail.google.com, perhaps) has 99 crashes, all on the same page (in this case by virtue of stripping query strings, etc.), but it's far down the domain list because of the number of "large-volume-of-pages" sites. I don't know how common this case might be, but it is something that came to mind when considering v2.
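The scenario above can be checked with a few lines of Python. This is a sketch under assumptions: the `rank` helper and the individual page paths are made up, and only the crash totals (500 for myspace.com with 50 on one page, 99 for mail.google.com on one page) come from the example.

```python
from collections import Counter
from urllib.parse import urlsplit


def rank(counts_by_url, by_domain):
    """Sum crash counts per domain (v2-style) or per URL (v3-style), largest first."""
    totals = Counter()
    for url, n in counts_by_url.items():
        key = urlsplit(url).netloc if by_domain else url
        totals[key] += n
    return totals.most_common()


# Totals from the example above; the page paths are hypothetical.
crashes = {"http://myspace.com/crashme": 50}
for i in range(100):
    # 450 crashes spread over 100 other pages: 50 pages x 5 + 50 pages x 4
    crashes["http://myspace.com/page%d" % i] = 5 if i < 50 else 4
crashes["http://mail.google.com/mail/"] = 99

by_domain = rank(crashes, by_domain=True)
by_url = rank(crashes, by_domain=False)
```

Ranked by domain, myspace.com (500) sits well above mail.google.com (99); ranked by URL, the single mail.google.com page (99) comes first and myspace.com/crashme (50) second, which is the "hidden" crash the comment is worried about.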
Comment 19•17 years ago (Assignee)
Attachment #352767 -
Flags: review?(morgamic)
Attachment #352767 -
Flags: review?(lars)
Comment 20•17 years ago (Assignee)
These are the CREATE statements needed for:
Dimension tables:
signaturedims
urldims
Fact table:
topcrashurlfacts
Config table:
tcbyurlconfig
TODO: I don't have all the constraints and indexes in place. This file is my working scratch file. I would clean it up or integrate it with schema.py (???)
References to productdims are from MTBF patches
https://bugzilla.mozilla.org/show_bug.cgi?id=411424
Comment 21•17 years ago (Assignee)
This code can be previewed on my dev instance
http://aking.khan.mozilla.org/reporter/topcrasher/byurl/Firefox/3.0.1
(requires VPN sorry)
Attachment #353074 -
Flags: review?(morgamic)
Comment 22•17 years ago (Assignee)
Staging notes:
Running against 8/24 which has 85K report rows across all products and
77K for Firefox 3.0.1
49K with non null url + sig
18mb 26mb 35mb 50mb 48mb
0 cpu 10 cpu 1
4:11 - 4:30
died on bad column name ( comments now user_comments )
rerunning ( will revert staging only code before checkin )
ran 8/23 with tons of logging
50k records non null url + sig
took 19.5 minutes
... Adding 3.0 to the mix Prod id 7 Firefox 3.0 ALL
ran 8/22
53k records non null url + sig
took 21 minutes
Disabled via config and
ran 8/21
Exited very quickly, no facts created
Enabled configs and
ran 8/21
Tue Dec 16 19:45:09 PST 2008
8:08
23 minutes...
So for 4 days of data...
topcrashurlfacts - 17809408 (17 MB)
topcrashurlfactsreports - 155648 (152 KB)
urldims - 13877248 (13 MB)
signaturedims - 1245184 (1.2 MB)
... Changed code to record aggregate info for facts where there are
more than 1 crash ( head of long tail )
memory - 8856 kb ( stayed under 10 MB )
Tue Dec 16 20:49:10 PST 2008
Tue Dec 16 20:50:46 PST 2008
holy crap!
Deleted all facts, urldims, signaturedims...
Ran 8/22
1 min 15 seconds
Found bug... '\n' comments should be filtered out of
topcrashurlfactsreports
Ran 8/23
1 min 9 seconds
Ran 8/24
1 min 7 seconds
Table sizes for 2 products across 4 days are now:
topcrashurlfacts - 1695744 (1.6 MB)
topcrashurlfactsreports - 24576 (24 KB)
urldims - 1056768 (1 MB)
signaturedims - 1245184 (1.2 MB)
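The "head of long tail" change noted above (only recording facts for combinations with more than 1 crash) can be illustrated roughly as follows. This is not the actual cron script: the `head_of_long_tail` function, the `(url, signature)` pair layout and the sample data are assumptions.

```python
from collections import Counter


def head_of_long_tail(pairs, min_count=2):
    """Keep only (url, signature) combinations seen at least min_count times."""
    counts = Counter(pairs)
    return {key: n for key, n in counts.items() if n >= min_count}


# Made-up day of data: one repeated combination and three one-off crashes.
pairs = [("http://example.com/a", "sig_one")] * 3 + [
    ("http://example.com/b", "sig_one"),
    ("http://example.com/c", "sig_two"),
    ("http://example.com/d", "sig_three"),
]
facts = head_of_long_tail(pairs)
```

Dropping the one-off combinations is what keeps the fact and dimension tables small; in the notes above, topcrashurlfacts went from roughly 17 MB to 1.6 MB for the same four days.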
Comment 23•17 years ago (Assignee)
The lines
18mb 26mb 35mb 50mb 48mb
0 cpu 10 cpu 1
and
memory - 8856 kb ( stayed under 10 MB )
in comment 22 are the Python process's resident (RES) memory and % CPU.
Comment 24•17 years ago (Assignee)
Working on an updated SQL script.
The CSS changes are in the MTBF patch Bug 411424
Attachment #352767 -
Attachment is obsolete: true
Attachment #353074 -
Attachment is obsolete: true
Attachment #353471 -
Flags: review?(morgamic)
Attachment #353471 -
Flags: review?(lars)
Attachment #352767 -
Flags: review?(morgamic)
Attachment #352767 -
Flags: review?(lars)
Attachment #353074 -
Flags: review?(morgamic)
Comment 25•17 years ago (Assignee)
Attachment #353592 -
Flags: review?(morgamic)
Attachment #353592 -
Flags: review?(lars)
Comment 26•17 years ago (Assignee)
Comment 27•17 years ago (Assignee)
Comment 28•17 years ago
Comment on attachment 353592 [details] [diff] [review]
Updated with Lars feedback
I've reviewed and approved the Python code based on the idea that it be revisited later for some housecleaning and refactoring.
Attachment #353592 -
Flags: review?(lars) → review+
Comment 29•17 years ago (Assignee)
2 products 8-22
First run on old partitioning scheme took 58 minutes ( instead of 67 seconds,
or 20 minutes for the pre-optimized script )
2 products 8-23 52 minutes
2008-12-17 23:26:30,592 INFO - done.
2008-12-17 22:34:26,38
3 products 8-24 timed out - could be my problem, didn't use nohup...
trying by dropping index, then rebuilding index
3 products 8-24 2 hours 38 minutes
rebuilding indexes takes 500 millis
Will continue testing. Conclusions so far:
This is in our "bad performance, but good enough to ship with" range. MTBF runs
in under a minute.
We will want to be careful with the number of builds we calculate "top
crashers by url" for, specifically major releases, which generate a lot of rows
in reports. For less-used builds it isn't an issue: running against 3.0b3 for
a day's worth of data took only 1 second.
Comment 30•17 years ago
UX/Polish
- put link in brackets - like [ link ] to space it out from the actual url
- is there a reason why the signatures are not linked? might be useful
- comment links look good! woot.
- would be cool if the signatures under a domain were indented somehow but that's minor
Code:
- PHP looks good, let's kick it out there and polish it as we get feedback
Austin - sorry I was not able to review this more closely, I ran out of time this week.
Comment 31•17 years ago
Comment on attachment 353592 [details] [diff] [review]
Updated with Lars feedback
Let's get it out the door and in front of some eyes.
Attachment #353592 -
Flags: review?(morgamic) → review+
Comment 32•17 years ago (Assignee)
I will write up some documentation but here is a soft launch for Top Crashers by URL...
http://crash-stats.mozilla.com/topcrasher/byurl/Firefox/3.0.5
http://crash-stats.mozilla.com/topcrasher/byurl/Firefox/3.1b3pre
http://crash-stats.mozilla.com/topcrasher/byurl/Firefox/3.1b2
http://crash-stats.mozilla.com/topcrasher/byurl/Firefox/3.0.6pre - not enough crashes on same url to make the report...
Details which would explain 3.0.6pre being empty are coming... to Socorro code wiki page.
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Austin, this looks good! I've filed some smaller things as follow-ups, bug 470524, bug 470525, and bug 470526 (a couple of them I stole from morgamic in comment 30).
...and bug 470527 on a random failure to show signatures when expanding URLs in the bydomain report.
Comment 35•17 years ago
Let's add all of these follow-up bugs to the dependency list. It looks really great!
Updated•17 years ago
Attachment #353471 -
Flags: review?(morgamic)
Attachment #353471 -
Flags: review?(lars)
Updated•14 years ago
Component: Socorro → General
Product: Webtools → Socorro