Show top crashes by URL



10 years ago
6 years ago


(Reporter: Samuel Sidler (old account; do not CC), Assigned: ozten)





(5 attachments, 2 obsolete attachments)

Reported by morgamic, Jul 10, 2007

Need to show crashes by operating system and/or platform on the top crasher
page and possibly the main query page.


Comment 5 by morgamic, Aug 03, 2007

OS was done, we need something that shows top crashes by URL now.
Do we still want to show this report despite concerns about privacy issues?

Comment 2

10 years ago
We need this report. If it has to be behind a secure part of the site, so be it. But we need this.
Assignee: morgamic → nobody
Priority: -- → P1
Target Milestone: --- → 0.5
so here's a query that will get you URLs appearing in >10 crash reports, but strips off query strings:

select split_part(url, '?', 1) as url_part, count(*) as c from reports where url is NOT NULL and URL != '' group by url_part HAVING count(*) > 10;

(needs a date limiter, obviously)

The downside is that you can't actually get to individual crash reports from that.
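
As a rough illustration of the query above, here is a minimal Python sketch of the same aggregation that also keeps the report IDs per stripped URL, which is one possible way around the "can't get to individual crash reports" downside. The `(report_id, url)` input shape and the function name `top_urls` are assumptions made for this example, not part of the Socorro schema or code.

```python
from collections import defaultdict


def top_urls(reports, min_reports=10):
    """Group reports by URL with the query string stripped.

    `reports` is an iterable of (report_id, url) pairs (hypothetical shape).
    Returns {url_part: [report_id, ...]} for every stripped URL that appears
    in more than `min_reports` reports.
    """
    by_url = defaultdict(list)
    for report_id, url in reports:
        if not url:  # mirrors: url IS NOT NULL AND url != ''
            continue
        url_part = url.split("?", 1)[0]  # mirrors: split_part(url, '?', 1)
        by_url[url_part].append(report_id)
    # mirrors: HAVING count(*) > 10
    return {u: ids for u, ids in by_url.items() if len(ids) > min_reports}
```

Keeping the IDs costs memory proportional to the number of reports, so in practice a date limiter would still be needed, as noted above.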

Comment 4

10 years ago
can you limit them to crashes that happen for more than 3 users? :)
select split_part(url, '?', 1) as url_part, count(*) as c, count(distinct user_id) as users from reports where url is NOT NULL and URL != '' group by url_part HAVING count(*) > 10 AND count(distinct user_id) > 3;

Seems to work. I don't know how much the database will hate that query though. :)
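
To make the two thresholds concrete, the following is a small Python sketch of the same filter, assuming reports arrive as `(url, user_id)` pairs. The function name and input shape are hypothetical and only illustrate the logic of the query above.

```python
from collections import defaultdict


def top_urls_multi_user(reports, min_reports=10, min_users=3):
    """Return {url_part: (report_count, distinct_users)} for URLs seen in
    more than `min_reports` reports from more than `min_users` users."""
    counts = defaultdict(int)
    users = defaultdict(set)
    for url, user_id in reports:
        if not url:  # skip NULL and empty URLs
            continue
        url_part = url.split("?", 1)[0]  # strip the query string
        counts[url_part] += 1
        if user_id is not None:  # count(distinct ...) ignores NULLs in SQL
            users[url_part].add(user_id)
    return {
        u: (counts[u], len(users[u]))
        for u in counts
        if counts[u] > min_reports and len(users[u]) > min_users
    }
```

A URL that one user crashes on repeatedly is dropped by the second condition, which is the point of the distinct-user limit.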
CCing morgamic as he's my DBA guru.
Assignee: nobody → justin.gallardo
Justin could you take a look at this?  Summary of requirements:
* for a given crash signature, show a list of URLs that occurred 3 or more times
* strip off the query arguments
* start off with the assumption that you'll be doing this the aggregate way
Comment 5 contains a query that will generate a "top URLs" list containing URLs included in >10 crash reports from >3 unique users.

Comment 9

9 years ago
Sure thing. I will start working on something for this right now.


9 years ago
Target Milestone: 0.5 → ---
Target Milestone: --- → 0.6
Assignee: justin.gallardo → nobody
Target Milestone: 0.6 → ---
Assignee: nobody → aking
Target Milestone: --- → 0.6
Can we get an update on this feature?

Comment 11

9 years ago
It is on my TODO list and should be shipped by 12/15.
From bug 415027:
Comment #10 From  Austin King   2008-12-08 11:26:03 PST

Wireframe for report:

Comment #11 From Samuel Sidler (:ss | :sps) 2008-12-08 11:29:36 PST

Actually, comment 10 should be in bug 411358.

As far as I can tell from that mockup, the crash reports displayed under each URL aren't grouped by crash signature, which seems...not very useful.  

It's possible--I'd even say extremely likely--that a page like could trigger crashes in WMP/F4M, Flash, even somewhere in layout or the parser, and the current mockup looks like it just tells us " crashes a lot" instead of " crashes a lot in WMP/F4M, crashes a lot in Flash, crashes a lot in this particular function in the html parser, crashes a lot in function A in layout, and crashes a lot in function B in layout".

Grouping by crash signature within the URL report makes these reports much more useful for QA, who'd otherwise have to dig back through all the comments figuring out which crashes appeared, and appeared most, on a page/site.  (It also happens to be how Talkback presents this information, which was one of the parts of Talkback that worked :P )
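
A minimal sketch of the grouping suggested here, assuming each report is reduced to a `(url, signature)` pair. The names and signatures below are placeholders for illustration, not real crash signatures.

```python
from collections import Counter, defaultdict


def signatures_by_url(reports):
    """Return {url: [(signature, count), ...]} with the most frequent
    signature first for each URL."""
    per_url = defaultdict(Counter)
    for url, signature in reports:
        per_url[url][signature] += 1
    return {url: sigs.most_common() for url, sigs in per_url.items()}


# Placeholder data: one page crashing in two different places.
example = [("http://example.com/video", "plugin_sig")] * 3
example += [("http://example.com/video", "parser_sig")]
breakdown = signatures_by_url(example)
```

With this shape, the report can show "crashes a lot in X, crashes a lot in Y" under each URL instead of a single undifferentiated count.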

Comment 13

9 years ago
Thanks Smokey for c12.
Updated with breakdowns by signature below each URL.
So you show the first stack frame only? Wouldn't it be nice to also have the first frame with symbols available? I ask because of the example kernel32.dll in your mockup.

Comment 15

9 years ago
It won't be the first frame, it will be the crash signature. See bug 411349 about work to make the signatures more meaningful when the top frames aren't very unique.
Thanks Benjamin. That sounds good. So forget my last comment...

Comment 17

9 years ago
In V2 I am aggregating by domain, then urls, then crash signatures.

Here is V3 which just starts with urls
So the domain might show up at #3, #17, and #50 in top crashes by url.

Is V3 more useful than V2?

This Top Crashers by URL depends on two features we don't have yet, authentication ( to display full urls and link to reports ) and search by domain/url.

One of the original constraints was thinking about the page for logged in versus non-logged users. Aggregating by domains would be the public view. If you were logged in, then you could drill down into URLs.
I think I like v2 better than v3, but that may just be a bias towards expecting a domain to have similar types of content/crashes across all of its pages. My only concern with v2 over v3 is that v2 could "hide" a relatively large crash on, say, because of the volume of pages on and other large sites.  

That is, in aggregate has 500 crashes, but 50 of those come from and the other 450 are spread across 100 pages (~3 crashes/other page).,,, and so forth are also in the same situation.  Then (, perhaps) has 99 crashes, all on the same page (in this case by virtue of stripping query strings, etc.), but it's far down the domain list because of the number of "large-volume-of-pages" sites.  I don't know how common this case might be, but it is something that came to mind when considering v2.
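
The concern can be checked with a small worked example. The sketch below uses made-up hosts (`big.example`, `small.example`) and the counts from the scenario above: 500 crashes on a large site, 50 of them on one page, versus 99 crashes on a single page of a small site. Everything here is illustrative data, not real crash numbers.

```python
from collections import Counter
from urllib.parse import urlsplit


def rank_by_domain(url_counts):
    """Sum crash counts per host and rank hosts by total (the v2 view)."""
    totals = Counter()
    for url, n in url_counts.items():
        totals[urlsplit(url).netloc] += n
    return totals.most_common()


def rank_by_url(url_counts):
    """Rank individual URLs by crash count (the v3 view)."""
    return Counter(url_counts).most_common()


# Large site: one page with 50 crashes, 450 more spread over 100 pages.
url_counts = {"http://big.example/hot": 50}
for i in range(100):
    url_counts["http://big.example/page%d" % i] = 5 if i < 50 else 4
# Small site: 99 crashes, all on one page.
url_counts["http://small.example/only"] = 99
```

Ranked by domain, `big.example` (500) sits above `small.example` (99); ranked by URL, the small site's single page (99) comes first and the large site's hottest page (50) second, which is the case v2 could hide.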

Comment 19

9 years ago
Created attachment 352767 [details] [diff] [review]
daily cron job for top crashers by domain
Attachment #352767 - Flags: review?(morgamic)
Attachment #352767 - Flags: review?(lars)

Comment 20

9 years ago
Created attachment 352769 [details]
top crashers by url DB tables DDL and DML (in progress)

These are the create statements needed for
Dimension tables:

Fact Table:

Config Tables:

TODO I don't have all the constraints and indexes in place. This file is my working scratch file. Would cleanup or integrate with (???)

References to productdims are from MTBF patches

Comment 21

9 years ago
Created attachment 353074 [details] [diff] [review]
php code for viewing grouped by url or domain

This code can be previewed on my dev instance

(requires VPN sorry)
Attachment #353074 - Flags: review?(morgamic)

Comment 22

9 years ago
Staging notes:
Running against 8/24 which has 85K report rows across all products and
77K for Firefox 3.0.1
49K with non null url + sig
18mb 26mb 35mb 50mb 48mb 
0 cpu 10 cpu 1
4:11 - 4:30

died on bad column name ( comments now user_comments )
rerunning ( will revert staging only code before checkin )

ran 8/23 with tons of logging
50k records non null url + sig
took 19.5 minutes

... Adding 3.0 to the mix Prod id 7 Firefox 3.0 ALL
ran 8/22 
53k records non null url + sig
took 21 minutes

Disabled via config and
ran 8/21
Exited very quickly, no facts created

Enabled configs and 
ran 8/21

Tue Dec 16 19:45:09 PST 2008
23 minutes...

So for 4 days of data...
topcrashurlfacts        - 17809408 (17 MB)
topcrashurlfactsreports - 155648 (152 KB)
urldims                 - 13877248 (13 MB)
signaturedims           - 1245184 (1 MB)

... Changed code to record aggregate info for facts where there are
more than 1 crash ( head of long tail )
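
A rough sketch of that change, assuming the facts are keyed by `(url, signature)` pairs; the actual cron job's data structures are not shown in this bug, so the names here are hypothetical.

```python
from collections import Counter


def head_of_long_tail(pairs):
    """Count crashes per (url, signature) pair and keep only pairs with
    more than 1 crash, dropping the singleton long tail."""
    counts = Counter(pairs)
    return {key: n for key, n in counts.items() if n > 1}
```

Because most pairs in a long-tailed distribution occur exactly once, dropping them removes the bulk of the rows while keeping every pair that could appear in a "top" report.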

memory - 8856 kb ( stayed under 10 MB )
Tue Dec 16 20:49:10 PST 2008
Tue Dec 16 20:50:46 PST 2008

holy crap!

Deleted all facts, urldims, signaturedims...

Ran 8/22 
1 min 15 seconds

Found bug... '\n' comments should be filtered out of

Ran 8/23
1 min 9 seconds

Ran 8/24
1 min 7 seconds

Table sizes for 2 products across 4 days are now:
topcrashurlfacts - 1695744 (1.6 MB)
topcrashurlfactsreports - 24576 (24 KB)
urldims - 1056768 (1 MB)
signaturedims - 1245184 (1.2 MB)
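
For comparison, the byte counts from the two table-size listings in these notes can be turned into shrink factors. This is only arithmetic on the numbers quoted above; the dictionary and function names are made up for the example.

```python
# Byte counts copied from the two listings in the staging notes.
before = {
    "topcrashurlfacts": 17809408,
    "topcrashurlfactsreports": 155648,
    "urldims": 13877248,
    "signaturedims": 1245184,
}
after = {
    "topcrashurlfacts": 1695744,
    "topcrashurlfactsreports": 24576,
    "urldims": 1056768,
    "signaturedims": 1245184,
}


def shrink_factors(before, after):
    """Return {table: size_before / size_after} for each table."""
    return {table: before[table] / after[table] for table in before}


factors = shrink_factors(before, after)
```

By these figures the fact table and `urldims` shrink by roughly an order of magnitude, while `signaturedims` reports the same size in both listings.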

Comment 23

9 years ago
18mb 26mb 35mb 50mb 48mb 
0 cpu 10 cpu 1

memory - 8856 kb ( stayed under 10 MB )

are about python's Res memory and % CPU.

Comment 24

9 years ago
Created attachment 353471 [details] [diff] [review]
Updates to python and php code. combined in one patch.

Working on an updated SQL script.
The CSS changes are in the MTBF patch Bug 411424
Attachment #352767 - Attachment is obsolete: true
Attachment #353074 - Attachment is obsolete: true
Attachment #353471 - Flags: review?(morgamic)
Attachment #353471 - Flags: review?(lars)
Attachment #352767 - Flags: review?(morgamic)
Attachment #352767 - Flags: review?(lars)
Attachment #353074 - Flags: review?(morgamic)

Comment 25

9 years ago
Created attachment 353592 [details] [diff] [review]
Updated with Lars feedback
Attachment #353592 - Flags: review?(morgamic)
Attachment #353592 - Flags: review?(lars)

Comment 26

9 years ago
Created attachment 353594 [details]
This is the SQL for creating new db tables

Comment 27

9 years ago
Created attachment 353595 [details]
This is the SQL for rolling back this deployment of new tables
Comment on attachment 353592 [details] [diff] [review]
Updated with Lars feedback

I've reviewed and approved the Python code based on the idea that it be revisited later for some housecleaning and refactoring.
Attachment #353592 - Flags: review?(lars) → review+

Comment 29

9 years ago
2 products 8-22
First run on old partitioning scheme took 58 minutes ( instead of 67 seconds,
or 20 minutes for the pre-optimized script )

2 products 8-23 52 minutes
2008-12-17 23:26:30,592 INFO - done.
2008-12-17 22:34:26,38

3 products 8-24 timed out - could be my problem, didn't use nohup...

trying by dropping index, then rebuilding index
3 products 8-24 2 hours 38 minutes

rebuilding indexes takes 500 millis

Will continue testing. Conclusions so far:
This is in our "bad performance, but good enough to ship with" range. MTBF runs
in under a minute.

We will want to be careful with the number of builds we calculate "top
crashers by url" for, specifically major releases, which generate a lot of rows
in reports. For less-used builds it isn't an issue: running against 3.0b3 for
a day's worth of data took only 1 second.
- put link in brackets - like [ link ] to space it out from the actual url
- is there a reason why the signatures are not linked?  might be useful
- comment links look good! woot.
- would be cool if the signatures under a domain were indented somehow but that's minor

- PHP looks good, let's kick it out there and polish it as we get feedback

Austin - sorry I was not able to review this more closely, I ran out of time this week.
Comment on attachment 353592 [details] [diff] [review]
Updated with Lars feedback

Let's get it out the door and in front of some eyes.
Attachment #353592 - Flags: review?(morgamic) → review+

Comment 32

9 years ago
I will write up some documentation but here is a soft launch for Top Crashers by URL... - not enough crashes on same url to make the report...

Details which would explain 3.0.6pre being empty are coming... to Socorro code wiki page.
Last Resolved: 9 years ago
Resolution: --- → FIXED
Austin, this looks good!  I've filed some smaller things as follow-ups, bug 470524, bug 470525, and bug 470526 (a couple of them I stole from morgamic in comment 30).
...and bug 470527 on a random failure to show signatures when expanding URLs in the bydomain report.
Let's add all of these follow-up bugs to the dependency list. It looks really great!


9 years ago
Blocks: 470561


9 years ago
Blocks: 470563
Attachment #353471 - Flags: review?(morgamic)
Attachment #353471 - Flags: review?(lars)
Component: Socorro → General
Product: Webtools → Socorro