Closed Bug 849879 Opened 11 years ago Closed 11 years ago

FHR User-Facing Tips for v1

Categories

(Firefox Health Report Graveyard :: Web: Health Report, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: lco, Assigned: dre)

Attachments

(1 file)

We need to decide on which tips to provide users with for v1 of FHR. This bug is primarily about the contents of the tips and the logic used to show them.  For any discussion on the design of the tips pane, see bug 832547.

The tips that are currently suggested are:

1. The user's browser is "significantly" out-of-date (define "significantly")
2. Frequent crashes (define "frequent")
3. The user's browser is doing great! 
4. Browser is "significantly" slow (define "significantly")
5. The user is a "frequent" FHR user, but isn't sharing data with Mozilla (one-time message encouraging them to share)

See https://etherpad.mozilla.org/FHR-Tips for discussion on the logic for each tip.
Priority: -- → P1
Based on our meeting today, this is what mconnor + deinspanjer feel is a reasonable list for v1:

1. High # of crashes
2. Slow Startup time - for v1 we will have a rough heuristic, will get better over time
3. Not enough data to show a meaningful graph yet

++ DEFER: Browser out of date
++ DEFER: encouraging FHR users who aren't sharing data to turn on sharing

The Metrics Team will help me figure out the logic needed for these tips, and mconnor will help figure out how to pull the info out.
This is a P1 + beta blocker, so assigning this to Daniel for part 1.
Assignee: nobody → deinspanjer
Status: NEW → ASSIGNED
For beta functionality, we have discussed possible trigger levels and came up with the following:

Tip 1 - High # of crashes
  Triggered when there have been more than 5 crashes in the last week.

Tip 2 - Slow startup time
  To be defined by Saptarshi later today after a last minute validation run.

Tip 3 - Not enough data to show meaningful graph yet
  Triggered when we do not yet have 5 dates with startup time or crash information (either because of long lived sessions or not enough usage).  We are very flexible on this.  Ultimately, the number should be the minimum at which the chart doesn't look crappy or confusing.  Once we have a chart that can display data, we might want to tweak this number.
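A minimal sketch of the Tip 1 and Tip 3 triggers as described above, not the landed implementation. The function names, the `crashes_by_day` mapping, and the reading of "last week" as the last 7 calendar days including today are assumptions made for illustration; only the two thresholds of 5 come from this comment.

```python
from datetime import date, timedelta

CRASH_THRESHOLD = 5       # Tip 1: "more than 5 crashes in the last week"
MIN_DATES_WITH_DATA = 5   # Tip 3: "5 dates with startup time or crash information"


def should_show_crash_tip(crashes_by_day, today):
    """Return True when more than CRASH_THRESHOLD crashes fall in the last week.

    crashes_by_day is a hypothetical dict mapping a date to that day's crash
    count. "Last week" is assumed to mean the 7 calendar days ending today.
    """
    week_start = today - timedelta(days=6)
    recent = sum(count for day, count in crashes_by_day.items()
                 if week_start <= day <= today)
    return recent > CRASH_THRESHOLD


def should_show_not_enough_data_tip(dates_with_data):
    """Return True while fewer than MIN_DATES_WITH_DATA distinct dates have data.

    dates_with_data is a hypothetical iterable of dates on which a startup
    time or a crash was recorded; duplicates are collapsed.
    """
    return len(set(dates_with_data)) < MIN_DATES_WITH_DATA
```

Keeping both thresholds as named constants makes the "we are very flexible on this" number easy to tweak once a real chart exists.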
Flags: needinfo?(sguha)
(In reply to Daniel Einspanjer :dre [:deinspanjer] from comment #3)
 
> Tip 2 - Slow startup time
>   To be defined by Saptarshi later today after a last minute validation run.
> 

Our recommendation: 

"Triggered when the 75th percentile first paint time for the last 10 sessions is greater than 6 seconds."

The tip will never be triggered while the installation has fewer than 10 sessions. The value 6 seconds was computed as the median of the distribution of this metric (the 75%-ile of last 10 sessions) across installations.
Flags: needinfo?(sguha)
Thanks for the recommendations for the tip logic. They look good to me and don't conflict with our strings. Adding Asa for his approval.
Flags: needinfo?(asa)
(In reply to dzeber from comment #4)
> (In reply to Daniel Einspanjer :dre [:deinspanjer] from comment #3)
>  
> > Tip 2 - Slow startup time
> >   To be defined by Saptarshi later today after a last minute validation run.
> > 
> 
> Our recommendation: 
> 
> "Triggered when the 75th percentile first paint time for the last 10
> sessions is greater than 6 seconds."
> 
> The tip will never be triggered while the installation has fewer than 10
> sessions. The value 6 seconds was computed as the median of the distribution
> of this metric (the 75%-ile of last 10 sessions) across installations.

I think I don't understand this. Where did the 6 seconds come from? That seems too low to me.
Flags: needinfo?(asa)
(In reply to Asa Dotzler [:asa] from comment #6)
> (In reply to dzeber from comment #4)
> I think I don't understand this. Where did the 6 seconds come from? That
> seems too low to me.

> > "Triggered when the 75th percentile first paint time for the last 10
> > sessions is greater than 6 seconds."
> > 
> > The tip will never be triggered while the installation has fewer than 10
> > sessions. 

Here is the logic for this trigger:
1. Put the last 10 session startup times for this installation into an array.
2. Sort the array by the start up time.
3. Take the 7th and 8th entries in that sorted array and average them together. (because the 75th entry out of 100 is equivalent to the 7.5th entry out of 10.)
4. If that averaged time is greater than or equal to 6 seconds, display the alert.

> > The value 6 seconds was computed as the median of the distribution
> > of this metric (the 75%-ile of last 10 sessions) across installations.

We decided on 6 seconds by performing steps 1 - 3 above for all the submissions we've received so far to get an array of several million 75th percentile times.  We then sorted those by time and determined that the midpoint of that set was 6 seconds.

So, out of the submissions we've received so far, 50% of all installations had a 75th percentile startup time less than 6 seconds.
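The derivation of the threshold can be sketched roughly as follows. This is an illustration under assumptions, not the Metrics Team's actual analysis code: the function names are hypothetical, each installation is assumed to be a list of startup times in seconds ordered oldest first, and installations with fewer than 10 sessions are assumed to be skipped.

```python
from statistics import median


def f75_last_10(startup_seconds):
    """Steps 1-3: 75th percentile of the last 10 startup times.

    Returns None when the installation has fewer than 10 sessions.
    """
    if len(startup_seconds) < 10:
        return None
    ordered = sorted(startup_seconds[-10:])   # fastest to slowest
    return (ordered[6] + ordered[7]) / 2      # average of 7th and 8th entries


def population_threshold(installations):
    """Midpoint (median) of the per-installation 75th percentile values.

    Returns None when no installation has enough sessions.
    """
    values = [f75_last_10(times) for times in installations]
    values = [v for v in values if v is not None]
    return median(values) if values else None
```

By construction, half of the eligible installations fall below the returned value, which is why a median-based threshold would flag roughly half of them.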
Thanks Daniel. Also, the startup time measure is 'firstpaint'. We chose firstpaint over sessionrestored because it is less affected by variations in user behavior, e.g. those who do not have 'show my window and tabs from last time' turned on will have a very different sessionrestored time than those who do. Moreover, sessionrestored could be interrupted by pop-up alerts etc., which can cause large (but not typical) sessionrestored times. FHR does not contain data to control for these sorts of contextual events.
Can we validate our early (pre-release audience) FHR data against telemetry for our release audience? Six seconds just seems like a very short period given how many of our users are on older and slower systems.

My concern is that for some potentially large number of users, a >6 second start-up time could be "normal" and even reasonable. I don't want to be telling those users that their start-up time is slow or could be made better when it's actually as fast as it's gonna get.

In an ideal world, we'd be comparing their start-up times to some average we get from all other users *with similar hardware*. Since we don't have the comparison stuff yet, I want us to be conservative in making assertions that might not be true for enough users.
(In reply to Asa Dotzler [:asa] from comment #9)
> Can we validate our early (pre-release audience) FHR data against telemetry
> for our release audience. Six seconds just seems like a very short period
> given how many of our users are on older and slower systems. 
> 

Looking at telemetry data would probably not give us anything very useful, since it is highly unrepresentative of the general population of users, especially for release. 

> My concern is that for some potentially large number of users, a >6 second
> start up time could be "normal" and even reasonable. I don't want to be
> telling those users that their start up time is slow or could be made better
> when it's actually a fast as it's gonna get.

> 
> In an ideal world, we'd be comparing their start-up times to some average we
> get from all other users *with similar hardware*. Since we don't have the
> comparison stuff yet, I want us to be conservative in making assertions that
> might not be true for enough users.

That makes sense, as the best we can do at the moment is to base our conclusions on the available data. To make a more conservative estimate, we can change the reference quantile that we use from the distribution of 75th percentiles. For a highly conservative estimate, we can replace 6 seconds with 20 seconds, the 90th percentile of that distribution. In other words, out of users with at least 10 sessions, 10% have a 75th percentile first paint time greater than 20 seconds.
Blocks: 853120
:Laura, 

I checked with :deinspanjer on this; he believes the code for this has landed and asked me to confirm with you.
QA is trying to verify this bug and it's not clear to us: do we have to wait for the user-facing design page before we can verify it?
Since Laura's speaking at a conf: Yes, it'll be in the user-facing page.
Yes, you'll want the user facing page to land before QA.
So for the startup time, are we using 6 or 20 seconds?
20 is what we settled on, as per the definition given in comment 10.
(In reply to Daniel Einspanjer :dre [:deinspanjer] from comment #7)
> (In reply to Asa Dotzler [:asa] from comment #6)
> > (In reply to dzeber from comment #4)
> > I think I don't understand this. Where did the 6 seconds come from? That
> > seems too low to me.
> 
> > > "Triggered when the 75th percentile first paint time for the last 10
> > > sessions is greater than 6 seconds."
> > > 
> > > The tip will never be triggered while the installation has fewer than 10
> > > sessions. 
> 
> Here is the logic for this trigger:
> 1. Put the last 10 session startup times for this installation into an array.
> 2. Sort the array by the start up time.
> 3. Take the 7th and 8th entries in that sorted array and average them
> together. (because the 75th entry out of 100 is equivalent to the 7.5th
> entry out of 10.)
> 4. If that averaged time is greater than or equal to 6 seconds, display the
> alert.
> 

So for point 2 above, I just want to make sure I sort it the correct way. I am thinking the sort should be from fastest to slowest, i.e. the 7th and 8th entries would then be closer to the slowest startups rather than the fastest. Correct? No? Thanks!
I've been trying to QA the crash case by forcing crashes with the crashme now extension but this doesn't seem to be working. I now have 15 crash reports if I go to about:crashes but my about:healthreport reports 0 crashes for this month, although the Last Crash statistic correctly shows 2013-04-01 as the date of my last crash.

Can someone please advise why this might be the case?
(In reply to Schalk Neethling [:espressive] from comment #16)

> 
> So for point 2 above, I just want to make sure I sort it the correct way. I
> am thinking the sort should be from fastest to slowest, i.e. the 7th and 8th
> entries would then be closer to the slowest startups rather then the
> fastest. Correct? No? Thanks@

Yes, the times should be sorted from smallest to largest. The 75th percentile is a number such that at most 75% of the values (in this case 7 out of 10) are less than or equal to it.
(In reply to Anthony Hughes, Mozilla QA (:ashughes) from comment #17)
> I've been trying to QA the crash case by forcing crashes with the crashme
> now extension but this doesn't seem to be working. I now have 15 crash
> reports if I go to about:crashes but my about:healthreport reports 0 crashes
> for this month, although the Last Crash statistic correctly shows 2013-04-01
> as the date of my last crash.
> 
> Can someone please advise why this might be the case?

Could you please attach your raw JSON? I would like to take a look at it and also use it in a unit test. Thanks!
(In reply to dzeber from comment #18)
> (In reply to Schalk Neethling [:espressive] from comment #16)
> 
> > 
> > So for point 2 above, I just want to make sure I sort it the correct way. I
> > am thinking the sort should be from fastest to slowest, i.e. the 7th and 8th
> > entries would then be closer to the slowest startups rather then the
> > fastest. Correct? No? Thanks@
> 
> Yes, the times should be sorted from smallest to largest. The 75th
> percentile is a number such that at most 75% of the values (in this case 7
> out of 10) are less than or equal to it.

So, at the moment I collect the most recent 10 startup times (paintTimes), sort them from fastest to slowest, and from this take the 7th and 8th, add those two together and divide by two to get the 75th percentile. If this value is higher than 20, I display the tip that provides information about slow performance.

Does that sound correct or am I missing something?
> > > So for point 2 above, I just want to make sure I sort it the correct way. I
> > > am thinking the sort should be from fastest to slowest, i.e. the 7th and 8th
> > > entries would then be closer to the slowest startups 


Yes, assuming paintTimes is "firstpaint" and there are at least 10 observations

0) define F75 = 75th percentile of last 10 paintTimes
1) sort last 10 paintTimes from lowest to highest.
2) F75 = (7th smallest observation + 8th smallest observation) / 2, i.e. the 7th and 8th entries of the sorted list
3) if F75 > 20, display tip else do not.
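The steps above can be sketched as a single check. This is a hedged illustration rather than the code that landed: the function and constant names are hypothetical, and `paint_times` is assumed to hold "firstpaint" values in seconds ordered oldest first.

```python
SLOW_STARTUP_THRESHOLD_SECONDS = 20   # threshold settled on in this bug
SESSIONS_REQUIRED = 10                # tip never fires with fewer sessions


def should_show_slow_startup_tip(paint_times):
    """Return True when F75 of the last 10 firstpaint times exceeds 20 seconds."""
    if len(paint_times) < SESSIONS_REQUIRED:
        return False
    # 1) sort the last 10 observations from lowest to highest
    ordered = sorted(paint_times[-SESSIONS_REQUIRED:])
    # 2) F75 = average of the 7th and 8th entries (indexes 6 and 7)
    f75 = (ordered[6] + ordered[7]) / 2
    # 3) strictly greater than the threshold, so exactly 20 does not trigger
    return f75 > SLOW_STARTUP_THRESHOLD_SECONDS
```

With this definition, an installation needs at least three of its last ten startups to be slow before the average of the 7th and 8th entries can exceed the threshold, which keeps a single outlier from triggering the tip.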
Attached file Raw Data
(In reply to Schalk Neethling [:espressive] from comment #19)
> Could you please attach you raw JSON. Would like to take a look at it and
> also use it against a unit test. Thanks!

Here you go.
Component: Metrics and Firefox Health Report → Client: Desktop
Product: Mozilla Services → Firefox Health Report
Component: Client: Desktop → about:healthreport
v1 is shipped, calling this done.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Product: Firefox Health Report → Firefox Health Report Graveyard