Closed Bug 1188966 Opened 7 years ago Closed 7 years ago

We need Telemetry on the slow add-on watcher

Categories

(Toolkit :: Performance Monitoring, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla43
Tracking Status
e10s ? ---
firefox43 --- fixed

People

(Reporter: Yoric, Assigned: Yoric, NeedInfo)

References

Details

Attachments

(1 file, 7 obsolete files)

We need to know if the slow add-on watcher works as we hope to.

We need Telemetry to find out:
- how often it shows up;
- how often people click on "ignore", on "ignore permanently", on "disable", on "x";
- how often people just never click.
Yoric: take a look at PLUGIN_HANG_UI_* histograms for inspiration. Aaron wrote those for the Plugin Hang dialog
This is blocking a ux bug that should have been fixed by now. We need to make a decision here on whether we need this.
tracking-e10s: --- → ?
Assignee: nobody → dteller
Bug 1188966 - Telemetry on the slow add-on watcher;r?blassey
Attachment #8641862 - Flags: review?(blassey.bugs)
(In reply to Jim Mathies [:jimm] from comment #2)
> This is blocking a ux bug that should have been fixed by now. We need to
> make a decision here on whether we need this.

What's the alternative? For the moment, nobody from UX has looked at this bug, and 100% of the feedback I have received is negative.
Comment on attachment 8641862 [details]
MozReview Request: Bug 1188966 - Telemetry on the slow add-on watcher;r?blassey

https://reviewboard.mozilla.org/r/14583/#review13295

::: browser/components/nsBrowserGlue.js:681
(Diff revision 1)
> +      let timeout = window.setTimeout(() => done(RESPONSE_TIMEOUT));

What is the purpose of this timeout? If I'm reading it correctly it will fire immediately after the infobar is shown. Is it to count how many times the bar is shown? If so, why cancel it in the done() function?

::: browser/components/nsBrowserGlue.js:738
(Diff revision 1)
>          notificationBox.appendNotification(message, notificationId, "",

I don't think you're handling the case where the user clicks the 'x' button to close the notification.
Attachment #8641862 - Flags: review?(blassey.bugs)
Comment on attachment 8641862 [details]
MozReview Request: Bug 1188966 - Telemetry on the slow add-on watcher;r?blassey

Bug 1188966 - Telemetry on the slow add-on watcher;r?blassey
Attachment #8641862 - Flags: review?(blassey.bugs)
https://reviewboard.mozilla.org/r/14583/#review13295

> What is the purpose of this timeout? If I'm reading it correctly it will fire immediately after the infobar is shown. Is it to count how many times the bar is shown? If so, why cancel it in the done() function?

My bad, I forgot to add a delay.
Attachment #8641862 - Flags: review?(blassey.bugs) → review-
Er... why?
Flags: needinfo?(blassey.bugs)
(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #8)
> Er... why?

(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #5)
> Comment on attachment 8641862 [details]
> MozReview Request: Bug 1188966 - Telemetry on the slow add-on
> watcher;r?blassey
> 
> https://reviewboard.mozilla.org/r/14583/#review13295
> 
> ::: browser/components/nsBrowserGlue.js:681
> (Diff revision 1)
> > +      let timeout = window.setTimeout(() => done(RESPONSE_TIMEOUT));
> 
> What is the purpose of this timeout? If I'm reading it correctly it will
> fire immediately after the infobar is shown. Is it to count how many times
> the bar is shown? If so, why cancel it in the done() function?
> 
> ::: browser/components/nsBrowserGlue.js:738
> (Diff revision 1)
> >          notificationBox.appendNotification(message, notificationId, "",
> 
> I don't think you're handling the case where the user clicks the 'x' button
> to close the notification.
Flags: needinfo?(blassey.bugs)
Yes, that was version 1. Since then, I have uploaded a new version. Do you encounter a problem with MozReview that prevents you from seeing it?
Flags: needinfo?(blassey.bugs)
Can we just not use MozReview then?
Flags: needinfo?(blassey.bugs)
Attached patch Patch, v2 (obsolete) — Splinter Review
As you wish.
Attachment #8641862 - Attachment is obsolete: true
Attachment #8643066 - Flags: review?(blassey.bugs)
(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #7)
> https://reviewboard.mozilla.org/r/14583/#review13295
> 
> > What is the purpose of this timeout? If I'm reading it correctly it will fire immediately after the infobar is shown. Is it to count how many times the bar is shown? If so, why cancel it in the done() function?
> 
> My bad, I forgot to add a delay.

Could you explain why the timeout is useful? Seems like we'll loose a lot of information for the case where someone just isn't looking at their browser for 60s.
Flags: needinfo?(dteller)
Flags: needinfo?(cweiner)
Well, looking at my personal use of the slow add-on alert, I realize that I have trained myself to ignore it. Others tell me they have done the same and I believe that this is an interesting [non-]usage pattern of the notification.

I may be wrong, but I suspect that few add-ons are going to cause slowdowns that will popup while the user is not looking at the screen. Now, if you prefer, I can increase this from 60s to 10 minutes or 30 minutes or until shutdown.
Flags: needinfo?(dteller)
Alternatively, we could have telemetry on the number of times the notification is opened vs. number of times the user clicks on any button.
(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #15)
> Alternatively, we could have telemetry on the number of times the
> notification is opened vs. number of times the user clicks on any button.

That's exactly what I was going to suggest. One probe for the info bar being shown, another for it's resolution.
Attachment #8641862 - Attachment is obsolete: false
Attachment #8641862 - Flags: review- → review?(blassey.bugs)
Comment on attachment 8641862 [details]
MozReview Request: Bug 1188966 - Telemetry on the slow add-on watcher;r?blassey

Bug 1188966 - Telemetry on the slow add-on watcher;r?blassey
Attachment #8641862 - Attachment is obsolete: true
Attachment #8641862 - Flags: review?(blassey.bugs)
Comment on attachment 8643066 [details] [diff] [review]
Patch, v2

Review of attachment 8643066 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/components/nsBrowserGlue.js
@@ +691,5 @@
> +        complete = true;
> +        Services.telemetry.getHistogramById("MISBEHAVING_ADDONS_UI_USER_RESPONSE").add(response);
> +        Services.telemetry.getHistogramById("MISBEHAVING_ADDONS_UI_RESPONSE_TIME").add(Date.now() - start);
> +      };
> +      let timeout = window.setTimeout(() => done(RESPONSE_TIMEOUT), 60000 /*1 minute*/);

Please get rid of the timeout
Attachment #8643066 - Flags: review?(blassey.bugs) → review-
Could you please use MozReview for your reviews? :)
(instead of canceling my up-to-date review requests and reviewing the old ones)
Flags: needinfo?(blassey.bugs)
Attachment #8643066 - Attachment is obsolete: true
(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #19)
> Could you please use MozReview for your reviews? :)

I thought we agreed not to use moz review
Flags: needinfo?(blassey.bugs)
Here we go. Note that MozReview is quite good, despite a lack of polish, so I'd prefer using it for future bugs.
Attachment #8644476 - Flags: review?(blassey.bugs)
Comment on attachment 8644476 [details] [diff] [review]
Telemetry for the slow add-on watcher, v3

Review of attachment 8644476 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/components/nsBrowserGlue.js
@@ +729,5 @@
> +                {
> +                  label: win.gNavigatorBundle.getFormattedString("addonwatch.restart.label", [brandShortName]),
> +                  accessKey: win.gNavigatorBundle.getString("addonwatch.restart.accesskey"),
> +                  callback: function() {
> +                    done(RESPONSE_DISABLE_ADDON);

you'll miss the case where the user doesn't restart the browser through this notification. The add-on has been disabled though, so just remove the |if (addon.pendingOperations == addon.PENDING_NONE)| block and report the disabling whether or not it needs a restart.

::: toolkit/components/telemetry/Histograms.json
@@ +5124,5 @@
> +  },
> +  "SLOW_ADDON_WARNING_RESPONSE_TIME": {
> +    "expires_in_version": "never",
> +    "kind": "exponential",
> +    "high": "60000",

I'm assuming the is ms, if so 1 minute seems like an arbitrarily low limit.
Attachment #8644476 - Flags: review?(blassey.bugs) → review-
(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #24)
> Comment on attachment 8644476 [details] [diff] [review]
> Telemetry for the slow add-on watcher, v3
> 
> Review of attachment 8644476 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: browser/components/nsBrowserGlue.js
> @@ +729,5 @@
> > +                {
> > +                  label: win.gNavigatorBundle.getFormattedString("addonwatch.restart.label", [brandShortName]),
> > +                  accessKey: win.gNavigatorBundle.getString("addonwatch.restart.accesskey"),
> > +                  callback: function() {
> > +                    done(RESPONSE_DISABLE_ADDON);
> 
> you'll miss the case where the user doesn't restart the browser through this
> notification. The add-on has been disabled though, so just remove the |if
> (addon.pendingOperations == addon.PENDING_NONE)| block and report the
> disabling whether or not it needs a restart.
> 

I actually think that it's a case of incomplete interaction. So let's add one more case.

> ::: toolkit/components/telemetry/Histograms.json
> @@ +5124,5 @@
> > +  },
> > +  "SLOW_ADDON_WARNING_RESPONSE_TIME": {
> > +    "expires_in_version": "never",
> > +    "kind": "exponential",
> > +    "high": "60000",
> 
> I'm assuming the is ms, if so 1 minute seems like an arbitrarily low limit.

Fair enough.
Attachment #8644476 - Attachment is obsolete: true
Attachment #8645009 - Flags: review?(blassey.bugs)
(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #25)
> (In reply to Brad Lassey [:blassey] (use needinfo?) from comment #24)
> > Comment on attachment 8644476 [details] [diff] [review]
> > Telemetry for the slow add-on watcher, v3
> > 
> > Review of attachment 8644476 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > ::: browser/components/nsBrowserGlue.js
> > @@ +729,5 @@
> > > +                {
> > > +                  label: win.gNavigatorBundle.getFormattedString("addonwatch.restart.label", [brandShortName]),
> > > +                  accessKey: win.gNavigatorBundle.getString("addonwatch.restart.accesskey"),
> > > +                  callback: function() {
> > > +                    done(RESPONSE_DISABLE_ADDON);
> > 
> > you'll miss the case where the user doesn't restart the browser through this
> > notification. The add-on has been disabled though, so just remove the |if
> > (addon.pendingOperations == addon.PENDING_NONE)| block and report the
> > disabling whether or not it needs a restart.
> > 
> 
> I actually think that it's a case of incomplete interaction. So let's add
> one more case.

I don't understand, can you explain further?
Flags: needinfo?(dteller)
If the user clicks on "Disable the add-on" but fails to click on "Restart Firefox", this means that the user hasn't realized that he needs a second click. So that's a different result from "Disable the add-on" then clicking "Restart Firefox".
Flags: needinfo?(dteller)
(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #28)
> If the user clicks on "Disable the add-on" but fails to click on "Restart
> Firefox", this means that the user hasn't realized that he needs a second
> click. So that's a different result from "Disable the add-on" then clicking
> "Restart Firefox".

I disagree. The user could hit command Q to close Firefox, or apply an update to restart or quit though the menu, all of which start the restart immediately. They also may consciously decide to leave the the browser running with the addon disabling in the pending state for numerous reasons (need to finish composing an email in gmail, comment in bugzilla etc). Regardless the addon will b disabled after the next restart so action has been taken.
Comment on attachment 8645009 [details] [diff] [review]
Telemetry for the slow add-on watcher, v4

Review of attachment 8645009 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/components/nsBrowserGlue.js
@@ +727,5 @@
>              label: win.gNavigatorBundle.getFormattedString("addonwatch.disable.label", [addon.name]),
>              callback: function() {
>                addon.userDisabled = true;
> +              if (addon.pendingOperations == addon.PENDING_NONE) {
> +                done(STATE_USER_PICKED_DISABLE_NO_CONFIRMATION_NEEDED);

record it as done regardless of whether a restart is pending or not.
Attachment #8645009 - Flags: review?(blassey.bugs) → review-
We can decide whether it's "done" or "not done" while processing data, no need to hardcode this in the extraction.
Flags: needinfo?(blassey.bugs)
(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #31)
> We can decide whether it's "done" or "not done" while processing data, no
> need to hardcode this in the extraction.

You'll miss all of the cases I mentioned above
Flags: needinfo?(blassey.bugs)
Again, why hardcode it in Firefox, when we can decide how we count it while we are actually processing data?
cf. comment 32
Flags: needinfo?(blassey.bugs)
(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #33)
> Again, why hardcode it in Firefox, when we can decide how we count it while
> we are actually processing data?

Again, because you're not collecting the data at all.
Flags: needinfo?(blassey.bugs)
How's that? I'm just using two different constants instead of one.
Flags: needinfo?(blassey.bugs)
(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #36)
> How's that? I'm just using two different constants instead of one.

If the user doesn't click the restart button (any of the scenarios listed in comment 29), you don't record it being disabled.

Once an addon is marked to be disabled after restart, the transaction is done. If you want to separately measure people's behavior with the restart prompts, that's fine, but let's not conflate the two.
Flags: needinfo?(blassey.bugs)
I believe that we are talking past each other. Well, Bugzilla has never been a good medium for such things. How do you find this version?
Attachment #8645009 - Attachment is obsolete: true
Attachment #8648912 - Flags: review?(blassey.bugs)
(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #38)
> Created attachment 8648912 [details] [diff] [review]
> Telemetry for the slow add-on watcher, v5
> 
> I believe that we are talking past each other. Well, Bugzilla has never been
> a good medium for such things. How do you find this version?

My main concern is that we'll never call done() for a user who restarts their browser through any means other than the prompt.

I would rather call done once they've chosen to disable the add-on. Honestly, their interaction with the restart prompt is entirely uninteresting to me, but if you want to measure it I'd suggest another probe. Does that make sense?
Flags: needinfo?(dteller)
(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #39)
> (In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment
> #38)
> > Created attachment 8648912 [details] [diff] [review]
> > Telemetry for the slow add-on watcher, v5
> > 
> > I believe that we are talking past each other. Well, Bugzilla has never been
> > a good medium for such things. How do you find this version?
> 
> My main concern is that we'll never call done() for a user who restarts
> their browser through any means other than the prompt.

I'm not sure I understand how that is a problem. With the latest version, you can just count the number of hits of `STATE_USER_PICKED_DISABLE` to find out how often the user clicked on "disable". With the previous version, you had to count `STATE_USER_PICKED_DISABLE_NO_CONFIRMATION_NEEDED` + `STATE_USER_PICKED_DISABLE_CONFIRMATION_PENDING` + `STATE_USER_PICKED_DISABLE_CONFIRMATION_GIVEN`.

> I would rather call done once they've chosen to disable the add-on.
> Honestly, their interaction with the restart prompt is entirely
> uninteresting to me, but if you want to measure it I'd suggest another
> probe. Does that make sense?

Let's just move this question to a followup.
Flags: needinfo?(dteller)
Attachment #8648912 - Attachment is obsolete: true
Attachment #8648912 - Flags: review?(blassey.bugs)
Attachment #8649169 - Flags: review?(blassey.bugs)
Attachment #8649169 - Flags: review?(blassey.bugs) → review+
Needs rebasing
Keywords: checkin-needed
Rebased.
Attachment #8650480 - Attachment is obsolete: true
Attachment #8651500 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/a34f961f651b
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla43
Early results are not encouraging: http://mzl.la/1hreENx
This suggests either a back-end problem (bug 1191342), a front-end problem (bug 1124073) or both. I hope to land a fix this week that should help a lot with the back-end.
Flags: needinfo?(blassey.bugs)
Flags: needinfo?(blassey.bugs)
Duplicate of this bug: 1177863
Telemetry has improved a lot since bug 1191342, but it's still not very good. Only about 8.7% of alerts are followed by a user action, and ~50% of the actions are closing the alert.
Keywords: productwanted
You need to log in before you can comment on or make changes to this bug.