Open
Bug 619246
Opened 15 years ago
Updated 3 years ago
Support reporting crashes per pageloads
Categories: Toolkit :: Crash Reporting (enhancement)
Status: NEW
People: Reporter: christian; Assignee: Unassigned
Crashes per pageload seems like an interesting way to view crash data (as opposed to crashes per ADU). We should track this info.
It will become more important once we have many project branches. They may not have many ADUs (tens at most?) but will likely have developers living on them. Crashes/pageloads would give a more accurate picture.
Additionally, if we have this data, we can mash it up with ADUs and potentially get MTBF failure.
(In reply to comment #0)
> Additionally, if we have this data, we can mash it up with ADUs and potentially
> get MTBF failure.
Acronym fail, sigh.
Comment 3•15 years ago
To clarify, you want to have a "number of page loads" metric reported from the client?
Right. I guess this is two parts: I want the number of pageloads between crashes to be stored somewhere, and I want that data sent back to us in some usable form (in the crash report itself would make sense, I think). We could then aggregate the pageload counts on the server and calculate the overall crashes/pageloads for a given version.
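A minimal sketch of what the client side could look like, assuming a hypothetical persisted counter whose value gets attached to the crash report as an annotation (all identifiers here are illustrative, not real Gecko APIs):

    // Sketch only -- PageloadCounter and its methods are hypothetical.
    #include <cstdint>
    #include <string>

    class PageloadCounter {
    public:
      // Called by a pageload observer on every qualifying top-level load.
      void OnPageload() {
        ++mLoadsSinceLastCrash;
        Persist();  // flush to disk so the value survives the crash itself
      }

      // Called while assembling a crash report: return the count for an
      // annotation (e.g. a "PageloadsSinceLastCrash" key), then reset it.
      std::string AnnotationValue() {
        std::string value = std::to_string(mLoadsSinceLastCrash);
        mLoadsSinceLastCrash = 0;
        Persist();
        return value;
      }

    private:
      void Persist() { /* stub: real code would write to the profile dir */ }
      uint64_t mLoadsSinceLastCrash = 0;
    };

The server side would then aggregate that annotation across reports per version, as described above.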
Comment 5•14 years ago
"pageloads since last crash" sounds interesting and probably not a privacy hazard, yes. Of course, we won't get numbers from people who don't crash at all, so the sample will be biased towards those people who crash at all.
Comment 6•14 years ago
We could probably report a "pageloads since last crash" value with telemetry. In fact, it's probably worth filing a more general bug on this, allowing us to have values that are collected via telemetry also get submitted with our crash reports, so that we don't have to write these things twice.
Comment 7•14 years ago
This seems like a feature-page-worthy report; we will need to coordinate with several teams to make it happen. I like this metric proposal.
Is there somewhere we can discuss additional metrics for tracking this? I don't want to derail anything here, but I still think hours of usage, or the number and length of sessions between crashes, feels more humanly familiar and thus more actionable. For example, we could have our user research (UR) folks run studies to figure out how long a user needs to go after a crash before saying "I can't remember the last time that happened." If we hit that level of stability, I'd be really happy, regardless of what the number is or how it compares to other browsers.
Comment 8•14 years ago
Asa, sync up with Chris Jung about this. We should definitely explore what metrics we can put into Telemetry along these lines.
Comment 9•14 years ago
Telemetry is collecting a metric called 'SHUTDOWN_OK' which is a Boolean for "Did the browser start after a successful shutdown":
http://mxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/TelemetryHistograms.h
Telemetry also has one intensity proxy (application uptime) 'simpleMeasurements.uptime' being reviewed here:
https://wiki.mozilla.org/Privacy/Reviews/Telemetry/Measurements
We're trying to think of better proxies for actual active usage (pageloads could work), as we'd like to normalize a lot of metrics by intensity of usage in addition to crashes. Once we settle on this metric, Taras can add it to the Telemetry ping.
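For reference, a sketch of what a pageload metric could look like alongside SHUTDOWN_OK, loosely in the style of the TelemetryHistograms.h entries linked above (the macro signatures and the Accumulate call may differ across versions, and PAGELOADS_SINCE_LAST_CRASH is made up):

    // Existing boolean entry (roughly as in TelemetryHistograms.h):
    HISTOGRAM_BOOLEAN(SHUTDOWN_OK, "Did the browser start after a successful shutdown")

    // Hypothetical exponential histogram for the proposed metric:
    // min 1, max 100000, 50 buckets.
    HISTOGRAM(PAGELOADS_SINCE_LAST_CRASH, 1, 100000, 50, EXPONENTIAL,
              "Number of top-level pageloads between crashes")

    // Recording it from C++ wherever crash recovery is detected:
    Telemetry::Accumulate(Telemetry::PAGELOADS_SINCE_LAST_CRASH, loadsSinceLastCrash);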
Comment 10•14 years ago
I'm not sure this is actually more useful than ADU. That is to say the metric of
(# crashes total)/(# pageloads total)
is basically going to correlate closely with the metric of
(# crashes total)/(# users total)
because the ratio of pageviews per user probably doesn't change a whole lot.
What I think would be really useful instead is a distribution of average # pageloads (or uptime) per user before a crash.
That is, a graph showing that 30% of our users can't run Firefox for more than 30 minutes (or load more than 20 pages) before they crash, 10% can run for a day or two, and 50% can run for weeks before seeing crashes. Or something.
Then we can build an effort around making it so that at least 80% of Firefox users can run for at least 2 days before crashing or something.
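A toy sketch of the aggregation that would produce that kind of graph, assuming we already have a per-user "pageloads before crashing" number from somewhere (data and thresholds are made up):

    #include <algorithm>
    #include <cstdio>
    #include <vector>

    // For each user: how many pageloads they got through before crashing.
    // A real pipeline would read these from telemetry/crash-report data.
    double FractionBelow(const std::vector<int>& loadsBeforeCrash, int threshold) {
      auto n = std::count_if(loadsBeforeCrash.begin(), loadsBeforeCrash.end(),
                             [&](int loads) { return loads < threshold; });
      return double(n) / loadsBeforeCrash.size();
    }

    int main() {
      std::vector<int> loadsBeforeCrash = {12, 450, 3, 9000, 75, 18, 22000, 640};
      std::printf("crash within 20 loads:   %.0f%% of users\n",
                  100 * FractionBelow(loadsBeforeCrash, 20));
      std::printf("crash within 1000 loads: %.0f%% of users\n",
                  100 * FractionBelow(loadsBeforeCrash, 1000));
      // A goal like "at least 80% of users go N pageloads (or 2 days)
      // without a crash" falls straight out of the same data.
    }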
Comment 11 (Reporter)•14 years ago
Note that for pageloads there is still some complexity. For example, Twitter is basically one page load with a bunch of Ajax requests, so someone who uses Twitter a lot would have their usage undercounted unless we count JS requests. I'm not sure we want to count JS requests... perhaps just those initiated by the user via a click?
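One way to scope the counter, sketched with a hypothetical load descriptor (none of these fields correspond to a real Gecko interface; real code would derive them from docshell / web-progress notifications):

    // Hypothetical description of a load.
    struct LoadInfo {
      bool isTopLevelDocument;  // full page navigation, not a subresource
      bool isUserInitiated;     // typed URL, clicked link, etc.
      bool isXhrOrFetch;        // background JS request
    };

    // Count only deliberate, top-level navigations so Ajax-heavy sites
    // like Twitter neither inflate nor dominate the metric.
    bool CountsAsPageload(const LoadInfo& load) {
      return load.isTopLevelDocument && load.isUserInitiated && !load.isXhrOrFetch;
    }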
Comment 12 (Reporter)•14 years ago
(In reply to [:Cww] from comment #10)
> I'm not sure this is actually more useful than ADU. That is to say the
> metric of
>
> (# crashes total)/(# pageloads total)
>
> is basically going to correlate closely with the metrics of
>
> (# crashes total)/(# users total)
>
> because the ratio of pageviews per user probably doesn't change a whole lot.
I don't think that is the case. Think about the same user on a weekday vs. a weekend. In both cases their crashes will count against the same 1 ADU, but the usage may vary wildly... 1 crash in 100 page loads is different than 1 crash in 10,000 page loads, yet in crashes/ADU they would be weighted the same.
Other benefits of crashes / pageload:
* Easy to weed out startup crashes or focus on them directly
* Allows us to figure out an MTBF ("for every 1000 page loads you can expect to see X crashes"), which we can't figure out via crashes/ADUs; see the sketch after this list
* Related to the above, gives us a more realistic view of what a particular end user would see, rather than averaging their experience over all Firefox users (as crashes/ADUs does)
* Allows us to see if browser usage trends affect crash rate (if weekends are crashier because of usage level or type of usage)
* Gives us a metric to directly compare against a competitor (assuming we measure the same way)
* I'm sure there are more :-)
Downsides:
* Content like Flash games will be overcounted / weigh disproportionately heavily on crashes per pageload, as there is one "load" followed by lots of activity
* Similarly, the Ajax-heavy sites mentioned above
* I'm sure there are more :-)
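For concreteness, the arithmetic behind the MTBF bullet above, as a tiny self-contained sketch (the aggregate numbers are invented):

    #include <cstdio>

    int main() {
      // Hypothetical aggregates for one version, summed server-side.
      double totalCrashes   = 52000;
      double totalPageloads = 1.3e9;

      // "For every 1000 page loads you can expect to see X crashes."
      double crashesPer1000Loads = totalCrashes / totalPageloads * 1000;
      // Equivalently, mean pageloads between failures:
      double meanLoadsBetweenCrashes = totalPageloads / totalCrashes;

      std::printf("%.3f crashes per 1000 pageloads\n", crashesPer1000Loads);
      std::printf("%.0f pageloads between crashes on average\n",
                  meanLoadsBetweenCrashes);
    }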
Comment 13•14 years ago
I think reality will be more complex and convoluted than both arguments in the recent posts, but as someone looking deeply into crash data every day, I'd like to have both per-ADU and per-pageload metrics and see what they tell us. I'm sure the results will be interesting.
Comment 14•14 years ago
(In reply to Christian Legnitto [:LegNeato] from comment #12)
Do we actually see crashes/ADU increase on weekends? I'd agree with you if that is actually what we're seeing. I don't disagree that crashes/pageload is more accurate than crashes/ADU... just that crashes/pageload isn't likely to be any more actionable.
I really don't think "for every X pageloads you should expect Y crashes" is going to make anyone happy. Either you have a usage pattern where Firefox crashes every day or many times a day (you love Facebook games), or you don't see crashes at all (you love Wikipedia). If you're in the first camp, hearing that you should expect a crash every thousand pageloads is just cruel since you're crashing every 5. If you're in the second camp, you couldn't care less about crashes. Grouping heavy users and less-heavy users together doesn't tell us anything about what they're actually doing.
I think it makes more sense to see how many of our users are in the "really frequent crash" camp and how many are in the "barely crashes at all" camp. (If you're being particular, we could weight that by pageloads/user so it's a distribution of crashes/pageload across users.)
Fundamentally, the underlying assumption to your metric is that crashes distribute evenly across pageloads/users or whatever. I think that is really far from what is actually happening.
Comment 15•14 years ago
(In reply to [:Cww] from comment #14)
> Do we actually see that crashes/ADU increase on the weekends?
Yes, we do. Still, I also agree with you. I'd be happy to have both metrics to work with, as I think they'll tell us different things, but neither tells us everything we want to hear and see.
Updated•3 years ago
Severity: normal → S3