Last Comment Bug 836010 - When startup is determined to be slow, tell users about ways to improve their startup time
: When startup is determined to be slow, tell users about ways to improve their...
Status: RESOLVED FIXED
:
Product: Firefox
Classification: Client Software
Component: General (show other bugs)
: Trunk
: All All
: -- normal (vote)
: Firefox 21
Assigned To: Dão Gottwald [:dao]
:
:
Mentors:
Depends on: 837640 838713 868400 868711 896025
Blocks: 872172 837637 848460 1015619
  Show dependency treegraph
 
Reported: 2013-01-29 13:45 PST by Jared Wein [:jaws] (please needinfo? me)
Modified: 2014-05-24 08:50 PDT (History)
20 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
21+


Attachments
patch (5.99 KB, patch)
2013-01-30 14:40 PST, Dão Gottwald [:dao]
no flags Details | Diff | Splinter Review
Slow Start Icon - OS X (499 bytes, image/png)
2013-01-31 11:14 PST, Stephen Horlander [:shorlander]
no flags Details
Slow Start Icon - OS X - @2x (961 bytes, image/png)
2013-01-31 11:15 PST, Stephen Horlander [:shorlander]
no flags Details
Slow Start Icon - Windows (512 bytes, image/png)
2013-01-31 11:16 PST, Stephen Horlander [:shorlander]
no flags Details
Slow Start Icon - Linux (478 bytes, image/png)
2013-01-31 13:24 PST, Stephen Horlander [:shorlander]
no flags Details
patch v2 (18.78 KB, patch)
2013-02-01 09:20 PST, Dão Gottwald [:dao]
no flags Details | Diff | Splinter Review
screenshot (11.25 KB, image/png)
2013-02-01 09:22 PST, Dão Gottwald [:dao]
no flags Details
patch v2 (18.00 KB, patch)
2013-02-01 09:24 PST, Dão Gottwald [:dao]
felipc: review+
Details | Diff | Splinter Review

Description Jared Wein [:jaws] (please needinfo? me) 2013-01-29 13:45:05 PST
We should keep track of startup times locally on the machine. If a user encounters multiple slow startups, we should show a notification bar that says something along the lines of "It seems that your Firefox is slow to start up. ... Find ways to improve the startup speed of Firefox"
Comment 1 Jared Wein [:jaws] (please needinfo? me) 2013-01-29 13:48:13 PST
Clicking on "Find ways to improve the startup speed of Firefox" could either take the user to a SUMO article or to about:support. A SUMO page is probably better for l10n as well as ability to update on the fly.
Comment 2 Jared Wein [:jaws] (please needinfo? me) 2013-01-29 14:54:09 PST
I talked with Matej (CC'd) and he had these first ideas for the strings that could be a little more whimsical.

> "Firefox seems. slow. to. start."             [See how you can make it _fast_]

Where "See how you can make it _fast_" is the button to the SUMO article.
Comment 3 Matej Novak [:matej] 2013-01-30 07:33:00 PST
Some small tweaks to comment 2 and two options of how it could be formatted:

Firefox seems slow. To. Start. [Learn how to _speed it up_]

Firefox seems slow... to... start. [Learn how to _speed it up_]

What I like about this approach is that it relies on formatting rather than language. That means that it could still be localized easily without the punctuation for languages where it would make sense or wouldn't be appropriate.
Comment 4 Dão Gottwald [:dao] 2013-01-30 14:40:40 PST
Created attachment 708335 [details] [diff] [review]
patch
Comment 5 Jared Wein [:jaws] (please needinfo? me) 2013-01-31 09:14:57 PST
Comment on attachment 708335 [details] [diff] [review]
patch

Review of attachment 708335 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/locales/en-US/chrome/browser/browser.properties
@@ +445,5 @@
>  mixedContentBlocked.unblock.accesskey = D
> +
> +# LOCALIZATION NOTE - %S is brandShortName
> +slowStartup.message = %S seems slow… to… start.
> +slowStartup.helpButton.label = Learn How to Speed It Up

The casing on this feels too awkward to me. I think this should be changed to Matej's proposed "Learn how to speed it up".
Comment 6 Dão Gottwald [:dao] 2013-01-31 09:18:20 PST
We use title capitalization in buttons, so "Learn how to speed it up" would definitely be wrong.
Comment 7 Tyler Downer [:Tyler] 2013-01-31 09:19:49 PST
We should consider having an easy way to exit that notification that then lets us know not to show the warning again for awhile. Perhaps changes the threshold to the current startup speed, so if Firefox gets slower then it will warn again. I'm just worried about users having this bar every time they start Firefox and not having a way to turn it off
Comment 8 Jared Wein [:jaws] (please needinfo? me) 2013-01-31 09:25:33 PST
(In reply to Dão Gottwald [:dao] from comment #6)
> We use title capitalization in buttons, so "Learn how to speed it up" would
> definitely be wrong.

I don't think we need to be 100% strict on this. The sentence capitalization is a clear win here.
Comment 9 Dão Gottwald [:dao] 2013-01-31 09:32:54 PST
(In reply to Tyler Downer [:Tyler] from comment #7)
> We should consider having an easy way to exit that notification that then
> lets us know not to show the warning again for awhile. Perhaps changes the
> threshold to the current startup speed, so if Firefox gets slower then it
> will warn again. I'm just worried about users having this bar every time
> they start Firefox and not having a way to turn it off

We wouldn't show it every time they start Firefox. With the current patch it would be every fifth start, and we can increase that number if wanted.

(In reply to Jared Wein [:jaws] from comment #8)
> (In reply to Dão Gottwald [:dao] from comment #6)
> > We use title capitalization in buttons, so "Learn how to speed it up" would
> > definitely be wrong.
> 
> I don't think we need to be 100% strict on this. The sentence capitalization
> is a clear win here.

The clear win isn't clear to me, and I don't see why this button should be different from others.
Comment 10 Benjamin Smedberg [:bsmedberg] 2013-01-31 09:35:39 PST
This was supposed to be a feature of the Firefox health report, and so I suggest that if we're going to do this we ought to make that the main entrance point.
Comment 11 Dão Gottwald [:dao] 2013-01-31 09:36:48 PST
What do you mean by main entrance point?
Comment 12 Matt Grimes [:Matt_G] 2013-01-31 09:42:59 PST
This looks awesome guys. I agree with Tyler though. We don't want the messaging to be annoying to the end user if they choose not to do anything about slowness or worse, they try to fix it and make no progress. Even if we aren't showing it every time, there should be an opt out option.
Comment 13 Gregory Szorc [:gps] 2013-01-31 09:51:35 PST
There may be some overlap between about:healthreport and this feature. Perhaps one day the slow start detecting drops you into about:healthreport?slowStart or something. Or perhaps part of FHR is periodically asking "is this browser healthy" and that triggers an alert if not. I dunno.

I think you should blaze forward without regard for FHR at this stage and we can (possibly) merge code later. Now, if you start writing a lot of code for "am I healthy," then we should probably have a discussion on the relationship between FHR and those features so we don't accumulate too much engineering debt.

mconnor: feel free to override if I'm speaking out of ignorance.
Comment 14 Gregory Szorc [:gps] 2013-01-31 09:58:07 PST
Comment on attachment 708335 [details] [diff] [review]
patch

Review of attachment 708335 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/components/nsBrowserGlue.js
@@ +436,5 @@
> +    }
> +
> +    Services.prefs.setIntPref("browser.slowStartup.averageTime", averageTime);
> +    Services.prefs.setIntPref("browser.slowStartup.samples", samples);
> +  },

As part of FHR we implemented https://hg.mozilla.org/mozilla-central/file/default/services/datareporting/sessions.jsm. So, there is a full history of every session for the past 180 days between this API (prefs storage) and FHR (SQLite storage).

Since we now have this "session recorder" API/service, I think it makes sense for the tracking logic you added here to live in that, not in nsBrowserGlue.js.

This service is always running assuming FHR is enabled in the build (which it is for official builds). So, you can rely on it for this feature.
Comment 15 Jared Wein [:jaws] (please needinfo? me) 2013-01-31 10:04:32 PST
Comment on attachment 708335 [details] [diff] [review]
patch

Review of attachment 708335 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/components/nsBrowserGlue.js
@@ +413,5 @@
> +    if (!startupInfo.firstPaint) {
> +      Cc["@mozilla.org/timer;1"]
> +        .createInstance(Ci.nsITimer)
> +        .initWithCallback(this._trackSlowStartup.bind(this),
> +                          1000, Ci.nsITimer.TYPE_ONE_SHOT);

We should add a max-tries amount here, so that in pathological cases we don't keep checking this infinite times.
Comment 16 Stephen Horlander [:shorlander] 2013-01-31 11:14:38 PST
Created attachment 708693 [details]
Slow Start Icon - OS X
Comment 17 Stephen Horlander [:shorlander] 2013-01-31 11:15:23 PST
Created attachment 708694 [details]
Slow Start Icon - OS X - @2x
Comment 18 Stephen Horlander [:shorlander] 2013-01-31 11:16:03 PST
Created attachment 708696 [details]
Slow Start Icon - Windows
Comment 19 Dão Gottwald [:dao] 2013-01-31 11:52:53 PST
(In reply to Gregory Szorc [:gps] from comment #14)
> As part of FHR we implemented
> https://hg.mozilla.org/mozilla-central/file/default/services/datareporting/
> sessions.jsm. So, there is a full history of every session for the past 180
> days between this API (prefs storage) and FHR (SQLite storage).
> 
> Since we now have this "session recorder" API/service, I think it makes
> sense for the tracking logic you added here to live in that, not in
> nsBrowserGlue.js.

What would this buy us? Would this simplify code or make it reusable for other purposes? Skimming the JSM didn't help me answer these questions.
Comment 20 Stephen Horlander [:shorlander] 2013-01-31 13:24:45 PST
Created attachment 708766 [details]
Slow Start Icon - Linux
Comment 21 Gregory Szorc [:gps] 2013-01-31 13:31:43 PST
(In reply to Dão Gottwald [:dao] from comment #19)
 > > Since we now have this "session recorder" API/service, I think it makes
> > sense for the tracking logic you added here to live in that, not in
> > nsBrowserGlue.js.
> 
> What would this buy us? Would this simplify code or make it reusable for
> other purposes? Skimming the JSM didn't help me answer these questions.

* nsBrowserGlue.js would be smaller
* Easier test coverage of feature (xpcshell not mochitest)
* Reusable by others

It's probably not worth doing at this juncture.
Comment 22 Dão Gottwald [:dao] 2013-02-01 09:20:45 PST
Created attachment 709087 [details] [diff] [review]
patch v2

I stopped relying on nsIAppStartup's firstPaint property, as I couldn't figure out when it would become available. I'm now just using the time it takes until the first browser-delayed-startup-finished notification, which actually depends on MozAfterPaint.
Comment 23 Dão Gottwald [:dao] 2013-02-01 09:22:18 PST
Created attachment 709088 [details]
screenshot
Comment 24 Dão Gottwald [:dao] 2013-02-01 09:24:55 PST
Created attachment 709090 [details] [diff] [review]
patch v2

reverted obsolete change that I made when experimenting with how to reliably access the firstPaint property
Comment 25 :Felipe Gomes (needinfo me!) 2013-02-01 10:10:48 PST
I heard Greg mentioning yesterday that the lack of firstPaint data is a known race condition. If that is simply the case and the data always seems available on the next trial at retrieving it, we should keep using firstPaint because that metric is much more understood and has historical data for us to base which threshold to use.
A random front-end code change is much more likely to sway the timing of browser-delayed-startup-finished then of firstPaint, which may suddenly make lots of users see that dialog.

In any case, we should also add telemetry to measure this feature and keep an eye on it to make sure it doesn't explode.
Comment 26 Dão Gottwald [:dao] 2013-02-01 10:17:23 PST
(In reply to :Felipe Gomes from comment #25)
> I heard Greg mentioning yesterday that the lack of firstPaint data is a
> known race condition. If that is simply the case and the data always seems
> available on the next trial at retrieving it,

I tried that, it didn't help.

> A random front-end code change is much more likely to sway the timing of
> browser-delayed-startup-finished then of firstPaint, which may suddenly make
> lots of users see that dialog.

I can see how e.g. an add-on could break delayedStartup such that browser-delayed-startup-finished would never fire. If how ever it fires late because delayedStartup blocks the main thread for a long time, taking this into account in the time measurement seems quite reasonable to me.
Comment 27 Gregory Szorc [:gps] 2013-02-01 10:51:35 PST
sessionRestored is recorded during sessionrestore-windows-restored observer: https://mxr.mozilla.org/mozilla-central/source/toolkit/components/startup/nsAppStartup.cpp#646

firstPaint is recorded at https://mxr.mozilla.org/mozilla-central/source/view/src/nsViewManager.cpp#338. Not sure where that gets called during startup.

Search for "StartupTimeline::Record" in MXR for all the recordings.

The race condition I was referring to was sessionRestored. In FHR I am currently trying to access sessionRestored during a sessionrestore-windows-restored observer. However, it only works sometimes because observer install order isn't consistent.

If you are measuring firstPaint, I'd not do slow start detection until sessionrestore-windows-restored. firstPaint will definitely be defined then. You may also be able to get away with installing a timer in final-ui-startup. Either way, you should have code to handle the case where firstPaint == -1.

bsmedberg could probably give a more solid recommendation.
Comment 28 :Felipe Gomes (needinfo me!) 2013-02-01 10:56:04 PST
(In reply to Dão Gottwald [:dao] from comment #26)
> I can see how e.g. an add-on could break delayedStartup such that
> browser-delayed-startup-finished would never fire. If how ever it fires late
> because delayedStartup blocks the main thread for a long time, taking this
> into account in the time measurement seems quite reasonable to me.

Fair point. Is the time for firstPaint and browser-delayed-startup-finished really similar on your tests? I feel much less confindent on setting the threshold to 1min for browser-delayed-startup-finished. Is there any good reasoning to apply and reach this number (60sec)?
Comment 29 Dão Gottwald [:dao] 2013-02-01 11:01:26 PST
(In reply to :Felipe Gomes from comment #28)
> (In reply to Dão Gottwald [:dao] from comment #26)
> > I can see how e.g. an add-on could break delayedStartup such that
> > browser-delayed-startup-finished would never fire. If how ever it fires late
> > because delayedStartup blocks the main thread for a long time, taking this
> > into account in the time measurement seems quite reasonable to me.
> 
> Fair point. Is the time for firstPaint and browser-delayed-startup-finished
> really similar on your tests?

Yes, the numbers seemed reasonably close.

> I feel much less confindent on setting the
> threshold to 1min for browser-delayed-startup-finished. Is there any good
> reasoning to apply and reach this number (60sec)?

I'm not sure what exactly you're asking, but if browser-delayed-startup-finished is fired after one minute, something is definitely wrong and we can count that as a slow startup. 60 seconds is a conservative threshold and I expect that we'll carefully reduce it in the future, maybe once we've shipped this for the first time.
Comment 30 :Felipe Gomes (needinfo me!) 2013-02-01 11:49:41 PST
My point is that we based the 60s by looking at the telemetry histogram for firstPaint and noticing that 99.99% of reports are on 30s or less.
I'm trying to look at the same number for delayed-startup-finished (I didn't know there was telemetry for that).  Right now the dashboard is too slow to load so I can't see the histogram :(  But the evolution loaded and the numbers for it in the main percentile seem around 4-5sec higher than firstPaint.  So maybe 60s is also reasonable for this one but let's revisit this number again after we see what's the long tail for that measurement.

Can you add a telemetry probe reporting when we've showed the slow warning to the users?
Comment 31 Dão Gottwald [:dao] 2013-02-01 12:10:14 PST
> Can you add a telemetry probe reporting when we've showed the slow warning
> to the users?

Wouldn't this be just another (less accurate) way to catch delayed-startup-finished regressions?
Comment 32 :Felipe Gomes (needinfo me!) 2013-02-01 12:33:55 PST
Comment on attachment 709090 [details] [diff] [review]
patch v2

Review of attachment 709090 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to Dão Gottwald [:dao] from comment #31)
> > Can you add a telemetry probe reporting when we've showed the slow warning
> > to the users?
> 
> Wouldn't this be just another (less accurate) way to catch
> delayed-startup-finished regressions?

Possibly, I need to think more about that. But we can leave that for a follow-up.

::: browser/components/nsBrowserGlue.js
@@ +424,5 @@
> +      if (averageTime > Services.prefs.getIntPref("browser.slowStartup.timeThreshold"))
> +        this._showSlowStartupNotification();
> +      averageTime = 0;
> +      samples = 0;
> +    }

getting the average of the 5 runs will trigger the dialog for an unusual spike, where what we wanted is to show the dialog for when the N previous runs were all above the threshold.

The worst scenario is: samples = 0, startup takes 5 seconds for a random (non-consistent) reason. We won't show anything, and the following startups only take 50ms. We will still show the "Slow startup" warning for a really fast startup due to the previously slow one 4 runs ago.
Comment 33 Dão Gottwald [:dao] 2013-02-01 12:45:52 PST
(In reply to :Felipe Gomes from comment #32)
> getting the average of the 5 runs will trigger the dialog for an unusual
> spike, where what we wanted is to show the dialog for when the N previous
> runs were all above the threshold.
> 
> The worst scenario is: samples = 0, startup takes 5 seconds for a random
> (non-consistent) reason. We won't show anything, and the following startups
> only take 50ms. We will still show the "Slow startup" warning for a really
> fast startup due to the previously slow one 4 runs ago.

This case seems too constructed to me to really worry about it.

Even assuming it existed in the real world, surely a single 5 minute startup is bad enough that we shouldn't ignore it, so the fact that it would affect the average and trigger the notification is intended. Showing the notification later than the single slow startup happened would indeed be inconvenient but not wrong per se.
Comment 34 Dão Gottwald [:dao] 2013-02-01 12:51:56 PST
I'm open to consider different ways to decide when to show the notification, if anybody wants to propose something, but it shouldn't be significantly more complex than what we have.
Comment 35 :Felipe Gomes (needinfo me!) 2013-02-01 13:10:37 PST
I was thinking that the samples pref could be a text array with the 5 previous times, like:

"15032,28345,32053,20154,29353"

then you do a .split(","), remove the oldest and push the new value. If all values are > then the threshold, then we'd show it. What do you think?  If you prefer the current version over this suggestion please request review again.
Comment 36 Dão Gottwald [:dao] 2013-02-01 13:31:34 PST
Comment on attachment 709090 [details] [diff] [review]
patch v2

Yeah, I still prefer this, since I don't want all values to be above the threshold. As described in comment 33, if some startups are bad enough to push the average above the threshold, we should notify.
Comment 37 Jared Wein [:jaws] (please needinfo? me) 2013-02-01 13:34:25 PST
I prefer the current version because the approach that comment #35 proposes has the potential to show the warning multiple times in a row, which can get too annoying.
Comment 38 :Felipe Gomes (needinfo me!) 2013-02-01 13:40:05 PST
The telemetry data for browser-delayed-startup shows 99.85% on 30s or less, but it's worth noting there's a 0.16% report for NaN.
Comment 39 Dão Gottwald [:dao] 2013-02-01 13:50:42 PST
https://hg.mozilla.org/integration/mozilla-inbound/rev/bf1502fe8053

(In reply to Jared Wein [:jaws] from comment #37)
> I prefer the current version because the approach that comment #35 proposes
> has the potential to show the warning multiple times in a row, which can get
> too annoying.

Well, this could be avoided by removing all records when notifying. Anyway, as I said I still prefer the current approach for other reasons.
Comment 40 Phil Ringnalda (:philor) 2013-02-03 12:45:54 PST
https://hg.mozilla.org/mozilla-central/rev/bf1502fe8053
Comment 41 Johnathan Nightingale [:johnath] 2013-02-11 11:29:36 PST
Our nightly users aren't the real targets here, but has anyone checked the before/after on startup time metrics to see if this moved the needle? Is support seeing any extra traffic to the SUMO page? Should I stop spamming the bug with these questions and take it to email?
Comment 42 Tyler Downer [:Tyler] 2013-02-11 11:33:47 PST
SUMO never gets much nightly traffic, maybe 1-2 questions a week, so we haven't seen this. 

Looking on Input (which gets more nightly traffic) we see a few comments like https://input.mozilla.org/opinion/3553794, which could possibly relate to user confusion (but it honestly isn't clear enough to tell).

As this hits aurora we can plan an aurora survey around this feature, and when it hits beta we'll have more feedback as well.
Comment 43 Tyler Downer [:Tyler] 2013-02-26 15:40:13 PST
We should consider, instead of pointing to a SUMO article, pointing to the Firefox Reset dialog, which will fix this for the majority of users.
Comment 44 Dão Gottwald [:dao] 2013-02-26 15:59:26 PST
(In reply to Tyler Downer [:Tyler] from comment #43)
> We should consider, instead of pointing to a SUMO article, pointing to the
> Firefox Reset dialog, which will fix this for the majority of users.

It seems like if we want users to try resetting their profile as a first step, we should make the SUMO article say so.
Comment 45 Bram Pitoyo [:bram] 2013-02-26 18:27:18 PST
(In reply to Dão Gottwald [:dao] from comment #44)
> It seems like if we want users to try resetting their profile as a first
> step, we should make the SUMO article say so.

That’s a good point. There are many steps to take that will fix Firefox if reset doesn’t work.

At the same time, it seems like we can do better than simply asking user to read an article that could potentially solve slowness, in the same browser that was experiencing slowness. Putting a reset button in the UI will give user a step that has result s/he can feel right away.

If startup is still slow after reset, we should give user the next thing to try in the same location. My hypothesis: asking user to read an article is going to have a lower conversion rate (ie. percentage of user who will try the solution), than if we present the solution in the form of actionable buttons.

So, perhaps, a link to “Learn more…” should be given when we’ve exhausted all the possible solutions. For example, if all we have is Reset, then when startup is still slow after reset, we should show “Learn more…”
Comment 46 Stephen Horlander [:shorlander] 2013-02-27 06:27:32 PST
I have to agree. If we are telling users that Firefox is slow they probably just want us to fix it rather than us sending them on a troubleshooting deep dive. 

Although I would be a little hesitant to link directly to the Profile Reset while it is still throwing away session data.
Comment 47 Justin Dolske [:Dolske] 2013-02-27 14:53:32 PST
I don't think we need to optimize the current UI too much. The long term plan (see comment 13) is that Firefox Health Report will be the more comprehensive framework for this kind of thing. The notification bar added here in this bug is a quickie short-term solution (aka "better than nothing" :). That's not to say the current UI is firmly locked-down/frozen, just that I think we'd be better off focusing on how FHR can solve this better.
Comment 48 Verdi [:verdi] 2013-05-01 15:08:21 PDT
(In reply to Bram Pitoyo [:bram] from comment #45)
> (In reply to Dão Gottwald [:dao] from comment #44)
> > It seems like if we want users to try resetting their profile as a first
> > step, we should make the SUMO article say so.
> 
> That’s a good point. There are many steps to take that will fix Firefox if
> reset doesn’t work.
> 
Actually there isn't really anything else to do in Firefox beyond a reset. It really does talk care of most problems (except malware and hardware issues). Putting a sumo article as an inbetween step will just decrease the amount of people who successfully complete the task.
Comment 49 Verdi [:verdi] 2013-07-26 13:51:00 PDT
We until we can offer the reset prompt (bug 872172) can we please change the support article people are sent to? Right now that are sent to https://support.mozilla.org/kb/firefox-takes-long-time-start-up which tells people to go to https://support.mozilla.org/kb/reset-firefox-easily-fix-most-problems

We should send people to https://support.mozilla.org/kb/reset-firefox-easily-fix-most-problems directly. Should I reopen this bug or file a new one?
Comment 50 Dão Gottwald [:dao] 2013-07-26 15:16:57 PDT
You should file a new bug.

Note You need to log in before you can comment on or make changes to this bug.