Closed Bug 470622 Opened 16 years ago Closed 15 years ago

Switch to ten-day floating window for MTBF

Categories

(Socorro :: General, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: samuel.sidler+old, Unassigned)

Details

(Whiteboard: next)

The MTBF report right now is generated daily from data from the previous 24 hours. However, we should instead generate it from an *average* over a ten-day floating window.

i.e., on day 1, it's generated from 1 day of data. On day 2, it's generated from 2 days of data. On day 10, it's generated from all ten days of data. On day 11, it's generated from data from days 2-11, etc.
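
A minimal sketch of the schedule described above, assuming one MTBF value per day is already available. The function name `floating_window_average` and the sample values are hypothetical, not part of Socorro.

```python
from collections import deque


def floating_window_average(daily_values, window=10):
    """Return one average per day, taken over at most the last `window` days."""
    recent = deque(maxlen=window)  # the oldest day drops out automatically
    averages = []
    for value in daily_values:
        recent.append(value)
        # Days 1..window average everything seen so far; later days
        # average only the most recent `window` values.
        averages.append(sum(recent) / len(recent))
    return averages


# Hypothetical daily MTBF values for days 1 through 12.
daily = [float(day) for day in range(1, 13)]
averages = floating_window_average(daily)
print(averages[0], averages[9], averages[10])
```

With these made-up inputs, day 10 averages days 1-10 and day 11 averages days 2-11, matching the description.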

Doing this makes the data less sporadic and will still show increases over time, which is what we care about. We care less about MTBF shooting up one day because a hundred people crashed that had never crashed before, but then floating back down the next.

chofmann suggested this and I think he's right.
yeah the sliding ten day window provided the smoothing to make this report useful in tracking release to release.  

sorry for not being more specific about this in the other dependency bug.

here is an example of the talkback version of the report that shows the 10 day window
http://talkback-public.mozilla.org/reports/firefox/FF20018/smart-analysis.all

 This report was generated on Sun Dec 21 01:58:15 PST 2008 and contains 10 days of data. 

 Total blackboxes in this sample:   71386
 Total unique users:   34010
 MTBF For these builds is estimated at 14.499938 hours,
 based on 65567 reports and 950717.452778 hours of user testing
 from testers that have crashed and reported problems.
  (dev. builds tend to have low MTBF)

http://talkback-public.mozilla.org/reports/firefox/FF20011/smart-analysis.all

Smart Analysis FF20017 Builds - all

 This report was generated on Sun Nov 16 04:27:00 PST 2008 and contains 10 days of data. 

 Total blackboxes in this sample:  142947
 Total unique users:   65932
 MTBF For these builds is estimated at 14.391453 hours,
 based on 134767 reports and 1939492.955000 hours of user testing
 from testers that have crashed and reported problems.
  (dev. builds tend to have low MTBF)
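
The figures in both reports are consistent with MTBF being the total hours of user testing divided by the number of reports. This is inferred from the quoted numbers rather than from Talkback's code, and the helper name `mtbf_hours` is hypothetical. A quick check:

```python
def mtbf_hours(total_hours, report_count):
    """Mean time between failures: usage hours divided by crash reports."""
    return total_hours / report_count


# Figures copied from the two Talkback reports quoted above.
dec_21 = mtbf_hours(950717.452778, 65567)
nov_16 = mtbf_hours(1939492.955000, 134767)

print(f"{dec_21:.6f}")
print(f"{nov_16:.6f}")
```

Both results agree with the reported 14.499938 and 14.391453 hours to six decimal places.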


when we created the talkback version we didn't have quite the variation between weekday and weekend use that we have now, so I'd switch this to a 7 day window to avoid any one period having 1 or 2 weekends as part of the calculations.
(In reply to comment #1)
> when we created the talkback version we didn't have quite the variation between
> weekday and weekend use that we have now, so I'd switch this to a 7 day window
> to avoid any one period having 1 or 2 weekends as part of the calculations.

The weekend shouldn't matter, should it? The number of crashes should go down with the weekend as well. I think a 10-day window is good.
the purpose of the averaging over a longer period is to smooth out the data to
see the trends more easily.   having some ten day periods with two weekends, and
other ten day periods with only one weekend will introduce noise in the
smoothing.

a 7 or 14 day sliding window means we will have more uniform smoothing.  It
might also be interesting to show the individual daily calculations as points
on the graph, and then show the 7 and 14 day smoothed series as a line using
the same color for each release.
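
The weekend argument can be checked by counting Saturdays and Sundays in every sliding window of a given length. This is only a sketch: the function name `weekend_days_per_window` and the start date are arbitrary choices for illustration.

```python
from datetime import date, timedelta


def weekend_days_per_window(start, num_windows, window):
    """Return the set of weekend-day counts seen across sliding windows."""
    counts = set()
    for offset in range(num_windows):
        first = start + timedelta(days=offset)
        days = [first + timedelta(days=i) for i in range(window)]
        # weekday() is 5 for Saturday and 6 for Sunday
        counts.add(sum(1 for d in days if d.weekday() >= 5))
    return counts


start = date(2008, 12, 1)  # arbitrary start date
for window in (7, 10, 14):
    print(window, sorted(weekend_days_per_window(start, 14, window)))
```

A 7 day window always holds 2 weekend days and a 14 day window always holds 4, while a 10 day window holds anywhere from 2 to 4 depending on where it starts.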
OS: Mac OS X → All
Hardware: x86 → All
From the back end perspective, it probably makes most sense to aggregate MTBF data in (one day or less) chunks; and let the UI provide smoothing.
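
One way that split could look, as a sketch only: the back end keeps one (crashes, usage hours) pair per day, and the display layer pools whatever window it wants. The tuple layout, the function name `smoothed_mtbf`, and the sample numbers are assumptions, not Socorro's actual schema.

```python
def smoothed_mtbf(daily_chunks, window):
    """Pooled MTBF per day over the trailing `window` daily chunks.

    Each chunk is a (crashes, usage_hours) tuple for one day. Days whose
    window contains no crashes yield None rather than dividing by zero.
    """
    series = []
    for end in range(1, len(daily_chunks) + 1):
        recent = daily_chunks[max(0, end - window):end]
        crashes = sum(c for c, _ in recent)
        hours = sum(h for _, h in recent)
        series.append(hours / crashes if crashes else None)
    return series


# Hypothetical daily aggregates: (crashes, usage_hours)
chunks = [(10, 100.0), (0, 50.0), (5, 75.0)]
print(smoothed_mtbf(chunks, window=1))  # unsmoothed daily values
print(smoothed_mtbf(chunks, window=2))  # two-day pooled values
```

Pooling hours and crashes before dividing weights each day by its volume, which differs from averaging the daily MTBF values. Either can be computed from the same daily chunks.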
a 14 day floating window and time aligning the data from the first day of release is a blocker for putting the mtbf graph on the Socorro home page I think.

the current graph 
http://crash-stats.mozilla.com/mtbf/of/Firefox/major
is much too noisy and disorganized to be of any value

There really are crash spikes around Friday/Saturday that I've seen in the data I've been looking at.  

http://people.mozilla.com/~chofmann/crash-data/crashes-per-1-users.png

there are some exceptions, but peaks/upward spikes in that graph come around friday/saturday and downward spikes seem to happen around sunday/monday.
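
The time alignment mentioned at the top of this comment could be sketched as re-keying each release's daily series by days since that release shipped, so that different releases can be overlaid on one axis. The function name `align_to_release`, the dates, and the values below are hypothetical.

```python
from datetime import date


def align_to_release(daily_points, release_day):
    """Re-key {date: value} as {days since release: value}.

    Points dated before the release day are dropped.
    """
    return {
        (day - release_day).days: value
        for day, value in sorted(daily_points.items())
        if day >= release_day
    }


# Hypothetical daily MTBF points for one release.
points = {
    date(2008, 11, 30): 9.0,   # before release, dropped
    date(2008, 12, 1): 14.0,   # release day becomes day 0
    date(2008, 12, 3): 15.0,   # becomes day 2
}
print(align_to_release(points, date(2008, 12, 1)))
```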
We sort of already do this but the UI is bad.
Whiteboard: next
some radio buttons to add and remove release lines would be cool too.
This does not provide useful data compared to crashes per user.  WONTFIX.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → WONTFIX
it really would be good to be able to have both reports:

the mtbf style graph that was used for many years, and the new style crashes per 1000 users graphs.

that would allow us to get the best insight into crashing frequency and patterns across releases.
Component: Socorro → General
Product: Webtools → Socorro