Closed
Bug 411424
Opened 17 years ago
Closed 15 years ago
Need to have a report for MTBF per build
Categories
(Socorro :: General, task, P2)
Socorro
General
Tracking
(Not tracked)
RESOLVED
FIXED
0.7
People
(Reporter: samuel.sidler+old, Assigned: ozten)
References
()
Details
Attachments
(2 files, 2 obsolete files)
2.72 KB, text/plain | Details
23.55 KB, patch | morgamic: review+ | Details | Diff | Splinter Review
Reported by morgamic, Nov 29, 2007 Need to add a report for average mean time between failure.
Reporter | ||
Updated•17 years ago
|
Priority: -- → P2
Comment 1•17 years ago
|
||
morgamic, ok to move this to P1 now? let's hash out how we can do the computation of the number and report it.

in the past I think we have just added up all the time-of-crash minus browser-start-time numbers for each black box for a specific release to come up with the total number of hours run; then we divided that number by the number of crashes. sample size has been the last 10 days, but we could switch to two weeks if we think that has some value.

for example, a report like this would allow us to compare directly to 2.x releases:

Total blackboxes in this sample: 288999
Total unique users: 147090
MTBF for these builds is estimated at 25.625648 hours, based on 273144 reports and 6999491.939167 hours of user testing
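The calculation described above can be sketched as follows. This is a minimal Python sketch, not the code of the old reporting system; the function names `total_hours_run` and `mtbf_hours` are hypothetical, and the input is assumed to be one uptime in seconds (time of crash minus browser start time) per report.

```python
def total_hours_run(uptimes_seconds):
    """Sum per-report uptimes (crash time minus browser start time), in hours."""
    return sum(uptimes_seconds) / 3600.0


def mtbf_hours(total_hours, crash_count):
    """Total hours run divided by the number of crash reports."""
    if crash_count <= 0:
        raise ValueError("crash_count must be positive")
    return total_hours / crash_count


# Figures quoted in the sample report above.
estimate = mtbf_hours(6999491.939167, 273144)
print(f"MTBF is estimated at {estimate:.6f} hours")
```

Dividing the quoted hours by the quoted report count is a quick way to check that a report's headline number and its supporting totals agree.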
Comment 2•17 years ago
|
||
and we should do any needed sanity checking and clean up of the db and the sample before we do the calculations as in bug 422549
Comment 3•17 years ago
|
||
ken, let's work out how we want to calculate this. I think the old crash reporting system was basically using something like this:
- pull a sample for all blackboxes or a particular release (e.g. grab all the reports for windows, mac, and linux build numbers for a release like firefox 3.0 beta 4, or final)
- throw out any outliers like 0 or negative time since start up, or anything that looks like a duplicate submission
- add up all the time-since-start-up values
- divide by the number of blackboxes in the sample
jay can confirm what we used in the custom report we built for past tracking.
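The steps listed above could look roughly like this in Python. It is a sketch under assumptions: the report keys `user_id`, `crash_time` and `uptime_seconds` are hypothetical, and "looks like a duplicate" is taken to mean an identical (user, crash time, uptime) triple, which may not match what the old system did.

```python
def clean_sample(reports):
    """Drop reports with zero or negative uptime and apparent duplicates."""
    seen = set()
    kept = []
    for report in reports:
        if report["uptime_seconds"] <= 0:
            continue  # outlier: 0 or negative time since start up
        key = (report["user_id"], report["crash_time"], report["uptime_seconds"])
        if key in seen:
            continue  # assumed duplicate submission
        seen.add(key)
        kept.append(report)
    return kept


def mean_uptime_seconds(reports):
    """Add up the cleaned uptimes and divide by the number of blackboxes kept."""
    kept = clean_sample(reports)
    if not kept:
        return None  # nothing left after cleaning
    return sum(r["uptime_seconds"] for r in kept) / len(kept)
```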
Comment 4•17 years ago
|
||
sample size has been "10 days of data" previously, but there might be good reasons to move to a two week window, since we know active users drop over weekends and that might create wiggles in the reporting as a different kind of user base comes in and out on weekends.
Comment 5•17 years ago
|
||
and of course the holy grail on this is to tie it into AUS active user data. We will be eye-balling a correlation between active users and total crashes received until we can get the two tied together with automation, but that shouldn't hold us up for now.

right now the fact that we are only receiving crashes from the users that "opt in" can give us a distorted view of what is going on, but having that same distortion applied across multiple releases has yielded valuable feedback, e.g.

14 days into beta 4 MTBF was 30.3 hours
14 days into beta 5 MTBF was 35.5 hours

so we must have fixed the right set of top crashers to improve stability and not introduced any crash regressions. those are really the kind of numbers we are after here.

we also used to have graphs that aligned the releases to show the changes in MTBF over time since release. it would be cool if we could also get those going again at some point
Comment 6•17 years ago
|
||
Chofmann and I recently spoke regarding some new stats that would be useful:
(1) daily number of crashes. this is sort of a raw/ignorant look at the data, but it could be helpful.
(2) median tbf + anything else that describes the distribution of failure time. e.g., is one user responsible for all crashes?, is the tbf normally distributed?, etc.
(3) ratio of comments to crashes
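A small sketch of the distribution stats in item (2), using only the Python standard library. The function name `tbf_summary` and the report keys `user_id` and `uptime_seconds` are hypothetical; the "top user share" figure is one possible way to answer "is one user responsible for all crashes?".

```python
from collections import Counter
from statistics import mean, median


def tbf_summary(reports):
    """Summarize time between failures and how concentrated crashes are."""
    if not reports:
        raise ValueError("need at least one report")
    uptimes = [r["uptime_seconds"] for r in reports]
    crashes_per_user = Counter(r["user_id"] for r in reports)
    _, top_count = crashes_per_user.most_common(1)[0]
    return {
        "mean": mean(uptimes),
        "median": median(uptimes),
        # fraction of all reports that came from the single busiest user
        "top_user_share": top_count / len(reports),
    }
```

A median far below the mean would suggest a skewed distribution, in which case the mean-based MTBF is dominated by a few long sessions.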
Updated•16 years ago
|
Assignee: nobody → aking
Target Milestone: --- → 0.7
Comment 8•16 years ago
|
||
Chofmann wants a graph with these properties:
* x-axis: days since release
* y-axis: hours
* series: release versions only
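One way to build a series for such a graph, sketched in Python. The function name `mtbf_series` and the `(crash_date, uptime_seconds)` input shape are assumptions; the sketch computes a per-day mean uptime in hours, keyed by days since the release date, and leaves plotting aside.

```python
from collections import defaultdict
from datetime import date


def mtbf_series(release_date, reports):
    """Return sorted (days_since_release, mtbf_hours) points for one release.

    reports is an iterable of (crash_date, uptime_seconds) pairs.
    """
    buckets = defaultdict(list)
    for crash_date, uptime_seconds in reports:
        day = (crash_date - release_date).days
        if day >= 0:  # ignore reports dated before the release
            buckets[day].append(uptime_seconds)
    return [
        (day, sum(values) / len(values) / 3600.0)
        for day, values in sorted(buckets.items())
    ]
```

Calling this once per release version gives one series per release, all aligned on the same x-axis.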
Assignee | ||
Comment 9•16 years ago
|
||
Development URL: http://aking.khan.mozilla.org/reporter/mtbf/of/Firefox/major
Screenshots: http://people.mozilla.org/~aking/Socorro/mtbf.html
See the following attachments with DB schema for more context.
Attachment #351635 -
Flags: review?(morgamic)
Attachment #351635 -
Flags: review?(lars)
Assignee | ||
Comment 10•16 years ago
|
||
Assignee | ||
Comment 11•16 years ago
|
||
Cron Script: When run, startMtbf.py will populate the MTBF facts table for the previous day. The date can be overridden, say startMtbf.py -d 2008-12-01

Database Changes: To see more realistic data, look at the breakpad_aking DB on Postgres on khan.mozilla.org. That DB shows realistic values in all three tables. I don't have much data to work with, so it is 5 days of data instead of several release builds on day 1 through day 30 or 60.

TODO: I know of a couple of bugs (need indexes on tables, have a flot redisplay bug, etc.) but wanted to get a review. Thanks.
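For illustration, the "previous day unless -d is given" behaviour could be handled like this. This is not the actual startMtbf.py; `resolve_target_date` is a hypothetical helper, the long option `--date` is an assumption (only `-d` is mentioned above), and the `today` parameter exists only to make the sketch testable.

```python
import argparse
from datetime import date, datetime, timedelta


def resolve_target_date(argv, today=None):
    """Return the day to process: the -d value if given, else yesterday."""
    parser = argparse.ArgumentParser(description="hypothetical MTBF cron wrapper")
    parser.add_argument("-d", "--date", help="day to process, as YYYY-MM-DD")
    args = parser.parse_args(argv)
    if args.date:
        return datetime.strptime(args.date, "%Y-%m-%d").date()
    if today is None:
        today = date.today()
    return today - timedelta(days=1)
```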
Assignee | ||
Comment 12•16 years ago
|
||
Attachment #351635 -
Attachment is obsolete: true
Attachment #353470 -
Flags: review?(morgamic)
Attachment #353470 -
Flags: review?(lars)
Attachment #351635 -
Flags: review?(morgamic)
Attachment #351635 -
Flags: review?(lars)
Assignee | ||
Comment 14•16 years ago
|
||
I don't have a firm plan around products and versions. If you give me versions and start dates then I will set this up. Optionally you can give me end dates, or 60 days will be the default.

Example (made up data):
Thunderbird
2.0.0.19 - 12/10 - major release
2.0.0.20 - 1/10/2009 - major release
3.0a3 - 9/12 - milestone release
3.0b2pre - 11/15 - developer release
etc.

I will be getting this info for Firefox from S.S, but I don't have any other data or person for any other products yet.
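A minimal sketch of the default-window rule, assuming Python and a hypothetical helper `release_window`. The inclusive 60-day convention (end = start + 59 days) is an assumption, chosen because it reproduces the 2008-10-05 to 2008-12-03 SeaMonkey window quoted later in this bug; the real configuration may count differently.

```python
from datetime import date, timedelta

DEFAULT_WINDOW_DAYS = 60  # default from the comment above


def release_window(start, end=None):
    """Return (start, end) for a release, defaulting to a 60-day window."""
    if end is None:
        # inclusive window: day 1 is the start date, day 60 is the end date
        end = start + timedelta(days=DEFAULT_WINDOW_DAYS - 1)
    return start, end
```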
Comment 15•16 years ago
|
||
Setting this up for Thunderbird would be fantastic. I think all the data for released versions is likely to be available on the Release pages linked to from <https://wiki.mozilla.org/Releases/>. It would be great to track all the Thunderbird 3 releases there (3.0a1, 3.0a2, 3.0a3, 3.0b1). At least the last several Thunderbird 2 releases would be very helpful as well. I believe our branch nightlies are 3.0b2pre and our trunk nightlies are 3.1a1pre. gozer probably has exact start dates for those. 60 days sounds like a perfectly reasonable default to start with. Thanks!
Assignee | ||
Comment 16•16 years ago
|
||
Attachment #353470 -
Attachment is obsolete: true
Attachment #353593 -
Flags: review?(morgamic)
Attachment #353593 -
Flags: review?(lars)
Attachment #353470 -
Flags: review?(morgamic)
Attachment #353470 -
Flags: review?(lars)
Comment 17•16 years ago
|
||
Here are my comments for the reporter changes.
- the data should be listed in a table under the graph in case scaling makes it hard to interpret
- the major/milestone/development links shouldn't rotate, all three should be visible at all times
- text for top nav should be "Release type: Major Milestone Development"

More on table layout:
# Firefox 3.0- MTBF 13010 seconds based on 50103 crash reports of 32726 users (blackboxen) from period between 2008-08-01 and 2008-11-20
# Firefox 3.0.1- MTBF 250139 seconds based on 765446 crash reports of 496840 users (blackboxen) from period between 2008-08-01 and 2008-11-20
# Firefox 3.0 Win- MTBF 10119 seconds based on 39161 crash reports of 24196 users (blackboxen) from period between 2008-08-01 and 2008-11-20

Should be changed to:
Product | Version | OS | MTBF | # Reports | # Users | Start | End

That was UX stuff, looking at PHP code.
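As a rough illustration of the requested layout, here is a Python sketch that renders rows under those column headers. The reporter itself is PHP, so this is only a model of the output shape; `format_table` and the dict-per-row input are hypothetical.

```python
COLUMNS = ["Product", "Version", "OS", "MTBF", "# Reports", "# Users", "Start", "End"]


def format_table(rows):
    """Render a header line plus one pipe-separated line per row dict."""
    lines = [" | ".join(COLUMNS)]
    for row in rows:
        lines.append(" | ".join(str(row[column]) for column in COLUMNS))
    return "\n".join(lines)


# The "Firefox 3.0 Win" line from the current layout, as one table row.
example_row = {
    "Product": "Firefox", "Version": "3.0", "OS": "Win", "MTBF": 10119,
    "# Reports": 39161, "# Users": 24196,
    "Start": "2008-08-01", "End": "2008-11-20",
}
print(format_table([example_row]))
```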
Comment 18•16 years ago
|
||
Indentation is messed up in load_product_info(). Looks like there are tabs mixed in with spaces, so the code is littered with some indentation issues. Question - for the zero-case (no data) seems like some of the behavior is to show an empty white box -- is that expected? Functionally, it works for me, so let's move forward and iterate on it.
Updated•16 years ago
|
Attachment #353593 -
Flags: review?(morgamic) → review+
Assignee | ||
Comment 19•16 years ago
|
||
This code is checked in and scheduled to be released tonight. r751 with some initial configuration checked in under r753.
Status: NEW → ASSIGNED
Reporter | ||
Comment 20•16 years ago
|
||
I'm not sure how much history you have, but I'd like to do MTBF for the following builds:
* Firefox 3.0.3 (starting Sept 24)
* Firefox 3.0.4 (starting Nov 5)
* Firefox 3.0.5 (starting Dec 10)
* All Firefox 3.0.x pre builds starting with 3.0.4pre (start these when 3.0.[n-1] started; i.e., start 3.0.4pre on Sept 24)
* Firefox 3.1b1 (starting Oct 7)
* Firefox 3.1b2 (starting Dec 1)
* All Firefox 3.1pre builds starting with 3.1b2pre (starting Oct 7)

For Thunderbird, do the following builds:
* Thunderbird 3.0a3 (starting Oct 7)
* Thunderbird 3.0b1 (starting Dec 2)
* Thunderbird 3.0b1pre (starting Oct 7)
* Thunderbird 3.0b2pre (starting Nov 28)

If you have data prior to Sept 24 (when the first one of these starts), let me know and we can add more, but this is a great start.
Reporter | ||
Comment 21•16 years ago
|
||
(In reply to comment #15)
> At least the last several Thunderbird 2 releases would be very helpful as well.

Thunderbird 2 can't be done in this style since it's Socorro dependent, but you can look at MTBF for Thunderbird 2 builds at: http://talkback-public.mozilla.org/reports/thunderbird/

Simply select a release (e.g., Thunderbird 2.0.0.18) and under "Smart Analysis" on the left side, select "All Platforms". MTBF appears at the top of the smart analysis report.

Note: This isn't comparing apples to apples since the crash reporting is very different between 1.8 and 1.9.
Reporter | ||
Comment 22•16 years ago
|
||
Oh, and 60-day default is a good start. We can start specifying end-dates as needed later (I'll let you know what those are when we get there). Let's get this going! :)
Comment 23•16 years ago
|
||
What about SeaMonkey? 2.0 alpha 1 and 2 have been released by now, so I suppose the following SeaMonkey builds (or build families) could be added to the list (subject, I suppose, to some agreed-upon time-limit such as that in comment #22). 2.0a1pre 2.0a1 2.0a2pre 2.0a2 2.0a3pre Also, what about Firefox 3.2a1pre, which is already coming out in the form of nightlies? AFAIK, they're the only builds already being done based on Gecko 1.9.2. Not sure how much statistical data would be available as yet, but wouldn't it be worth while to have the MTBF reports up and rolling by the time Sm 2.0 and/or Fx 3.2 are ready for a release, or maybe even for a beta?
Reporter | ||
Comment 24•16 years ago
|
||
Austin, I filed a couple of follow ups to look at since some of this is live already. See the "Depends On" field.
Assignee | ||
Comment 25•16 years ago
|
||
(In reply to comment #23)
> What about SeaMonkey? 2.0 alpha 1 and 2 have been released by now, so I suppose
> the following SeaMonkey builds (or build families) could be added to the list
> (subject, I suppose, to some agreed-upon time-limit such as that in comment
> #22).
> 2.0a1pre
> 2.0a1
> 2.0a2pre
> 2.0a2
> 2.0a3pre
>
> Also, what about Firefox 3.2a1pre, which is already coming out in the form of
> nightlies? AFAIK, they're the only builds already being done based on Gecko
> 1.9.2.
>
> Not sure how much statistical data would be available as yet, but wouldn't it
> be worth while to have the MTBF reports up and rolling by the time Sm 2.0
> and/or Fx 3.2 are ready for a release, or maybe even for a beta?

I am happy to add these to the MTBF reports. I need start dates, which are "day 0" for calculating uptime. I will add SeaMonkey 2.0a2 and 2.0a3pre to the top crash by url reports also. As for 3.2a1pre, is the Product Minefield or Firefox?
Assignee | ||
Comment 26•16 years ago
|
||
I still need two more pieces of information for all the SeaMonkey builds.
1) major|milestone|dev
2) start and end dates (60 days)

I've taken a guess at these. Please fill in and confirm.
SeaMonkey 2.0a1pre, developer, ?? - ??
SeaMonkey 2.0a1, milestone, 2008-10-05, 2008-12-03
SeaMonkey 2.0a2pre, developer, ?? - ??
SeaMonkey 2.0a2, milestone, 2008-12-10, 2009-02-07
SeaMonkey 2.0a3pre, developer, ?? - ??
Comment 27•16 years ago
|
||
SeaMonkey 2.0a1pre, developer, 2007-07-09 - (60 days)
SeaMonkey 2.0a1, milestone, 2008-10-05, 2008-12-03
SeaMonkey 2.0a2pre, developer, 2008-09-25 - (60 days)
SeaMonkey 2.0a2, milestone, 2008-12-10, 2009-02-07
SeaMonkey 2.0a3pre, developer, 2008-12-02 - (60 days)
Assignee | ||
Comment 28•16 years ago
|
||
Adding dependency on 477914 which has the SeaMonkey update. Will schedule a push with IT after SQL, shell script review.
Depends on: 477914
Assignee | ||
Comment 29•16 years ago
|
||
SeaMonkey 2.0a1pre, developer, 2007-07-09 - (60 days) is either wrong, or we don't have data for it back in 2007. http://crash-stats.mozilla.com/?do_query=1&product=SeaMonkey&version=SeaMonkey%3A2.0a1pre&query_search=signature&query_type=contains&query=&date=2007-07-19&range_value=1&range_unit=weeks
Comment 30•16 years ago
|
||
(In reply to comment #29) > SeaMonkey 2.0a1pre, developer, 2007-07-09 - (60 days) > is either wrong, or we don't have data for it back in 2007. I got to this date by trying to find out since when SeaMonkey had crashreporter support, but it may not have worked correctly from the start. Can we find out when we got the first SeaMonkey 2.0a1pre crash reports and start the window with that? Also, we started the 2.0b1pre dev cycle on 2009-02-19 and released the 2.0a3 milestone yesterday, what's the process for getting those added?
Assignee | ||
Comment 31•15 years ago
|
||
Please open a new bug for MTBF entries.
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Assignee | ||
Updated•14 years ago
|
Attachment #353593 -
Flags: review?(lars)
Comment 32•14 years ago
|
||
not sure we are still planning to do this, but it appears that we also have values like "Install Age" 7057413 seconds (11.7 weeks) since the version was first installed. We should also integrate that into the calculation, or add a parallel metric that reports the lower value of TimeSinceLastCrash or InstallAge to produce MTBF_For_Current_Build. This would be a somewhat different number than total MTBF, but also useful for understanding the time between failures on individual builds.
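The parallel metric proposed above could be sketched like this in Python. The function names are hypothetical, and the input is assumed to be a list of `(time_since_last_crash, install_age)` pairs in seconds; how those fields are actually stored is not covered here.

```python
def uptime_for_current_build(time_since_last_crash, install_age):
    """Take the lower of the two values, so time on older builds is not counted."""
    return min(time_since_last_crash, install_age)


def mtbf_for_current_build(samples):
    """Mean of the capped uptimes, in seconds, over (tslc, install_age) pairs."""
    if not samples:
        return None
    capped = [uptime_for_current_build(tslc, age) for tslc, age in samples]
    return sum(capped) / len(capped)
```

Capping at the install age means a crash that happens shortly after an upgrade is not credited with uptime accumulated on the previous build.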
Updated•13 years ago
|
Component: Socorro → General
Product: Webtools → Socorro