Closed Bug 864993 Opened 7 years ago Closed 7 years ago

longitudinal tracking for update system

Categories

(Firefox Health Report Graveyard :: Client: Desktop, defect)

defect
Not set

Tracking

(firefox24+ fixed, firefox25 verified)

VERIFIED FIXED
Firefox 25
Tracking Status
firefox24 + fixed
firefox25 --- verified

People

(Reporter: mconnor, Assigned: gps)

References

Details

Attachments

(2 files, 1 obsolete file)

The current Telemetry data for the updater tells us that about for 1 in 8 Windows Telemetry users the current state of the updater is ELEVATION_CANCELLED, meaning that the the updater has been denied admin permission to complete an upgrade.  That said, we don't have a clear picture of how the system is working over time, such as if this state persists for specific users, or if many users experience some delay for this reason.

As having an up to date Firefox is a key element in delivering the best and most secure browsing experience for users, there's a clear user benefit to collecting, analyzing, and comparing individual install data against for FHR.  We should collect sufficient information to identify both temporary and permanent problems, and provide users with clear steps to resolve problems.

I believe we will need to collect at least the following, on a daily basis, in order to gain a clear picture of how updates are working for individuals and users as a whole:

behaviour: 0/1 (is the updater enabled for automatic updates, automatic checking, or disabled, along with warning on incompatible add-ons)

lastCheck: int (days since last recorded update check, may be negative if last update was recorded in the future)

status: int (values correspond to http://mxr.mozilla.org/mozilla-central/source/toolkit/mozapps/update/common/errors.h#10, value from most recent update attempt)
(optional) pendingVersion: version of the pending update
(optional) timesFailed: int (number of times a pending update has failed)
(optional) userRejected: true or false if a pending update was declined manually by the user.  combines with timesFailed in cases where we re-prompt

In terms of the user-facing report, I would want to build a feature which examines the current state of the update system, and compared against "normal" values (i.e. thresholds for update checks not happening, individual updates not working), and offers some/all of the following:

* Link to download a current version and reinstall as a user, rather than admin.
* Re-prompt to apply an update
* Updates/alternatives to incompatible add-ons
* Possible local fixes (i.e. elevate a process to correct permission errors on various files)
Waiting on Metrics for data reporting requirements.
Flags: needinfo?(jjensen)
There was discussion in the weekly meeting just now over how important a basic update metric will be.

Are there any low-hanging fruits we can quickly implement? e.g. could we include a boolean "is auto update enabled" flag in the longitudinal history?
The recommendations that we want to offer users do depend significantly on whether they recently turned auto-update off: "You turned automatic updates off N days ago and your Firefox is no longer safe! [Enable Updates] [Dismiss] [Provide feedback...]

If a user has had updates turned off for a long time, we probably want to offer a different kind of interface: "Your Firefox is N days/years out of date and you are unsafe!"

And considering that the longitudinal history can be very compact since the value will rarely change, it shouldn't significantly affect overall size.
Couple of notes

(In reply to Mike Connor [:mconnor] from comment #0)
>...
> I believe we will need to collect at least the following, on a daily basis,
> in order to gain a clear picture of how updates are working for individuals
> and users as a whole:
> 
> behaviour: 0/1 (is the updater enabled for automatic updates, automatic
> checking, or disabled, along with warning on incompatible add-ons)
Already implemented via telemetry

> lastCheck: int (days since last recorded update check, may be negative if
> last update was recorded in the future)
This is being added to telemetry in bug 886545.

> status: int (values correspond to
> http://mxr.mozilla.org/mozilla-central/source/toolkit/mozapps/update/common/
> errors.h#10, value from most recent update attempt)
Already implemented via telemetry

> (optional) pendingVersion: version of the pending update
Not implemented

> (optional) timesFailed: int (number of times a pending update has failed)
Not implemented

> (optional) userRejected: true or false if a pending update was declined
> manually by the user.  combines with timesFailed in cases where we re-prompt
Last page shown in the app update UI (not the about dialog which is only for user initiated update check) infers some of this and is being added to telemetry in bug 886545
> 
> In terms of the user-facing report, I would want to build a feature which
> examines the current state of the update system, and compared against
> "normal" values (i.e. thresholds for update checks not happening, individual
> updates not working), and offers some/all of the following:
> 
> * Link to download a current version and reinstall as a user, rather than
> admin.
Not sure if this is what you had in mind but users without rights to install are already notified once per new version and there is a link to download. They can then install as a user.

> * Re-prompt to apply an update
This is already implemented though it has been a while since I verified that it is working properly.

> * Updates/alternatives to incompatible add-ons
Not implemented

> * Possible local fixes (i.e. elevate a process to correct permission errors
> on various files)
I don't think I have ever seen a bug report for Windows where this would have fixed things though I have seen permissions change on Mac after an OS X upgrade that fixing permissions would have allowed updating. The solution for this is to make it so Firefox can update on Mac similar to the Mac update mechanism where the user has to enter a password. I have already discussed adding this for Mac with a couple of the Mac devs but Lion work is currently taking priority.
(In reply to Gregory Szorc [:gps] from comment #2)
> There was discussion in the weekly meeting just now over how important a
> basic update metric will be.
> 
> Are there any low-hanging fruits we can quickly implement? e.g. could we
> include a boolean "is auto update enabled" flag in the longitudinal history?
We already report this to telemetry and it is a pref so it should be simple.


(In reply to Benjamin Smedberg  [:bsmedberg] from comment #3)
> The recommendations that we want to offer users do depend significantly on
> whether they recently turned auto-update off: "You turned automatic updates
> off N days ago and your Firefox is no longer safe! [Enable Updates]
> [Dismiss] [Provide feedback...]
> 
> If a user has had updates turned off for a long time, we probably want to
> offer a different kind of interface: "Your Firefox is N days/years out of
> date and you are unsafe!"
> 
> And considering that the longitudinal history can be very compact since the
> value will rarely change, it shouldn't significantly affect overall size.
This would be excellent
Which pref (or prefs) should I record longitudinal history for? We can have that in Nightly by the end of the week. We can implement more detailed tracking later.

Let me know if you prefer I land it here or file a new bug.
Flags: needinfo?(robert.bugzilla)
Whether it is enabled
app.update.enabled boolean

Whether updates can be downloaded without the user being notified
app.update.auto boolean


app.update.mode integer
See the following (we should always have an appVersion attribute so the last half shouldn't happen anymore):
    /**
#      From this point on there are two possible outcomes:
#      1. download and install the update automatically
#      2. notify the user about the availability of an update
#
#      Notes:
#      a) if the app.update.auto preference is false then automatic download and
#         install is disabled and the user will be notified.
#      b) if the update has a showPrompt attribute the user will be notified.
#      c) Mode is determined by the value of the app.update.mode preference.
#
#      If the update when it is first read has an appVersion attribute the
#      following behavior implemented in bug 530872 will occur:
#      Mode   Incompatible Add-ons   Outcome
#      0      N/A                    Auto Install
#      1      Yes                    Notify
#      1      No                     Auto Install
#
#      If the update when it is first read does not have an appVersion attribute
#      the following deprecated behavior will occur:
#      Update Type   Mode   Incompatible Add-ons   Outcome
#      Major         all    N/A                    Notify
#      Minor         0      N/A                    Auto Install
#      Minor         1      Yes                    Notify
#      Minor         1      No                     Auto Install
     */
Flags: needinfo?(robert.bugzilla)
I think we should also implement the lastCheck metric (check pref for timestamp, do arithmetic, record number), as it's low-hanging fruit that doesn't require active checking.  The rest is important, but definitely more involved than just pref-based reporting.
(In reply to Mike Connor [:mconnor] from comment #8)
> I think we should also implement the lastCheck metric (check pref for
> timestamp, do arithmetic, record number), as it's low-hanging fruit that
> doesn't require active checking.  The rest is important, but definitely more
> involved than just pref-based reporting.
I should have mentioned that the last update check implemented in bug 886545 is for background update checks and does not track user initiated update checks. Also, there is no record of user initiated update checks.
Talked to Daniel IRL today and he says Metrics would like to have both app.update.enabled and app.update.auto. I didn't ask about the last check timestamp because I just now read those comments.
Daniel: So far I plan to implement:

1) Last-per-day boolean for "is app update enabled"
2) Last-per-day boolean for "is auto update enabled"

I need answers on whether:

3) Last-per-day value (UNIX timestamp? days since UNIX epoch?) of when the background update check last checked.

4) Whether to increment the version number for the org.mozilla.appInfo.appinfo measurement.
Flags: needinfo?(deinspanjer)
(In reply to Gregory Szorc [:gps] from comment #11)
> Daniel: So far I plan to implement:
> 
> 1) Last-per-day boolean for "is app update enabled"
> 2) Last-per-day boolean for "is auto update enabled"
> 
> I need answers on whether:
> 
> 3) Last-per-day value (UNIX timestamp? days since UNIX epoch?) of when the
> background update check last checked.
seconds

http://mxr.mozilla.org/mozilla-central/source/toolkit/mozapps/update/nsUpdateTimerManager.js#138
(In reply to Robert Strong [:rstrong] (do not email) from comment #12)
> (In reply to Gregory Szorc [:gps] from comment #11)
> > Daniel: So far I plan to implement:
> > 
> > 1) Last-per-day boolean for "is app update enabled"
> > 2) Last-per-day boolean for "is auto update enabled"
> > 
> > I need answers on whether:
> > 
> > 3) Last-per-day value (UNIX timestamp? days since UNIX epoch?) of when the
> > background update check last checked.
> seconds
> 
> http://mxr.mozilla.org/mozilla-central/source/toolkit/mozapps/update/
> nsUpdateTimerManager.js#138

It's more a question of privacy and need. There are potential privacy concerns with exposing second granularity (e.g. correlating update server logs with FHR payloads to match up IP addresses). Plus, I'm not sure if second granularity will provide any additional data benefit over say day granularity. We should default to as coarse as possible and add granularity only if there is a sufficient benefit.
Sorry, I thought you were asking about the value of the pref and not what is going to be reported.
I can't answer the question of whether we can definitely pull the trigger on logging last update check in any form. I think we need to default to Asa and/or jjenson for that, but I can say with high confidence that we don't need anything at a higher resolution than utc date if we do log it.  The precise time the check happened would not be as interesting to us than the number of days/weeks/months since a check last happened.
Flags: needinfo?(deinspanjer)
Record daily boolean flag for app.update.enabled and app.update.auto.
Holding off on last check time because of no clear requirements.
Follow-up bug I reckon.

I decided to go with a separate measurement because (and feel free to
call YAGNI) I feel app update will eventually evolve to covering much
more and will warrant its own measurement.
Attachment #774751 - Flags: review?(rnewman)
Assignee: nobody → gps
I'm going to nominate this for tracking Aurora/24. The patch is low risk and will allow us to have update data on release channel 6 weeks sooner.
Status: NEW → ASSIGNED
Comment on attachment 774751 [details] [diff] [review]
Add app update tracking to FHR

Review of attachment 774751 [details] [diff] [review]:
-----------------------------------------------------------------

Ahhhh, I love the smell of a simple, extensible module in the morning.

::: services/healthreport/providers.jsm
@@ +393,5 @@
>    },
> +
> +  collectDailyData: function () {
> +    let m = this.getMeasurement(AppUpdateMeasurement1.prototype.name,
> +                                      AppUpdateMeasurement1.prototype.version);

Indenting.

::: services/healthreport/tests/xpcshell/test_provider_appinfo.js
@@ +230,5 @@
> +  Services.prefs.setBoolPref("app.update.enabled", true);
> +  Services.prefs.setBoolPref("app.update.auto", true);
> +  let provider = new AppInfoProvider();
> +  yield provider.init(storage);
> +  yield provider.collectDailyData();

Can we move `let now ...` to immediately prior to this line? Narrows the window for a day transition during a test run.
Attachment #774751 - Flags: review?(rnewman) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/ce7b43ed8bf1

I'll file another bug to track adding additional data.
Flags: needinfo?(jjensen)
Target Milestone: --- → Firefox 25
Blocks: 893098
https://hg.mozilla.org/mozilla-central/rev/ce7b43ed8bf1
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Docs on http://docs.services.mozilla.com/healthreport/index.html for this new measurement should be updated within the next hour.
Comment on attachment 774751 [details] [diff] [review]
Add app update tracking to FHR

[Approval Request Comment]
Bug caused by (feature/regressing bug #): New FHR probe
User impact if declined: Won't be able to tell users update is disabled for another release cycle
Testing completed (on m-c, etc.): It just landed today. Might want to have it bake a few more days.
Risk to taking this patch (and alternatives if risky): Low risk. FHR probes are isolated from each other. If one fails, it shouldn't affect others. Plus, this probe is very simple and not likely to fail.
String or IDL/UUID changes made by this patch: None
Attachment #774751 - Flags: approval-mozilla-aurora?
Waiting for it to bake a couple of days before taking it on aurora.

:gps, can we validate the probe data on nightly to get the first pass here to make sure it looks as expected while waiting for it to bake to be uplifted ? Or any other testing/verification that may be valuable here on metrics side.
We should be fine with a simple spot check. If you open a profile that's been running a current Nightly for a few days, if you go to the raw data view in about:healthreport, you should see a few "org.mozilla.appInfo.update" entries in the daily data. If you adjust the app.update.enabled and app.update.auto prefs and reload about:healthreport, the values should switch to 0 for the current UTC day.
Keywords: qawanted
(In reply to Gregory Szorc [:gps] from comment #24)
> If you open a profile that's been running a current Nightly for a few days, if you go to
> the raw data view in about:healthreport, you should see a few "org.mozilla.appInfo.update" 
> entries in the daily data. 

I'm not seeing this in my Raw Data report with Firefox 25.0a1 2013-07-17, updated from 2013-07-16 today, 2013-07-15 yesterday.
Flags: needinfo?(gps)
Uh, yeah, it's not showing up. Hmmm.

I suspect another bug is at play here. I'm looking into it.
Flags: needinfo?(gps)
No, the implementation in this bug was wrong.

1374089334004	Services.HealthReport.HealthReporter	WARN	Error collecting daily data from providers: collectPromise.then is not a function JS Stack trace: @HealthReport.jsm:981 < @HealthReport.jsm:958 < doCollection@HealthReport.jsm:4716 < TaskImpl_run@Task.jsm:192 < resolve@promise.js:122 < then@promise.js:47 < resolve@promise.js:189 < TaskImpl_run@Task.jsm:217 < @Promise.jsm:500 < @Promise.jsm:282
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
collectDailyData needs to return a single promise and can't be a
generator function. I actually don't know how the tests were passing
before. The test performs a "yield provider.collectDailyData();" I
/thought/ a single yield on a generator function emitting multiple
promises would only execute up until the first yield. I may have to
re-read how generators work in JS...
Attachment #777310 - Flags: review?(rnewman)
Comment on attachment 777310 [details] [diff] [review]
Part 2: Properly implement app update collection

I trust that you verified the fix!
Attachment #777310 - Flags: review?(rnewman) → review+
Still no automated test?
Now with proper, high-level tests. This required a bit of refactoring in
shared test code. But, I think the change is for the better.
Attachment #778012 - Flags: review?(rnewman)
Attachment #777310 - Attachment is obsolete: true
Comment on attachment 778012 [details] [diff] [review]
Part 2: Properly implement app update collection

Review of attachment 778012 [details] [diff] [review]:
-----------------------------------------------------------------

Hooray tests!
Attachment #778012 - Flags: review?(rnewman) → review+
fx-team is closed and this patch conflicts with code that just landed in inbound in bug 884421, so I rebased this and landed it on inbound. Could be an interesting merge for whoever does it.

https://hg.mozilla.org/integration/mozilla-inbound/rev/26ebafeabaf7
https://hg.mozilla.org/mozilla-central/rev/a9327bfcbd8a
Status: ASSIGNED → RESOLVED
Closed: 7 years ago7 years ago
Resolution: --- → FIXED
Comment on attachment 778012 [details] [diff] [review]
Part 2: Properly implement app update collection

[Approval Request Comment]
Bug caused by (feature/regressing bug #): FHR update system collection
User impact if declined: See part 1 comment
Testing completed (on m-c, etc.): See part 1
Risk to taking this patch (and alternatives if risky): See part 1
String or IDL/UUID changes made by this patch: None
Attachment #778012 - Flags: approval-mozilla-aurora?
needsinfo on Anthony to help retest this given this is fixed now.
Flags: needinfo?(anthony.s.hughes)
I'm now seeing org.mozilla.appInfo.update in FHR in Firefox 25.0a1. Marking as verified.
Status: RESOLVED → VERIFIED
Flags: needinfo?(anthony.s.hughes)
Keywords: qawanted
Comment on attachment 774751 [details] [diff] [review]
Add app update tracking to FHR

Good to land given QA verification.
Attachment #774751 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #778012 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Due to the mangled way these landed, I used the original patches attached (and had to adapt part 2 a bit) rather than exporting from m-c. Hopefully I got that right.

https://hg.mozilla.org/releases/mozilla-aurora/rev/e793e3a1fe61
https://hg.mozilla.org/releases/mozilla-aurora/rev/9b469e3921ce
The patch from fx-team should have applied without any conflicts.
Product: Firefox Health Report → Firefox Health Report Graveyard
You need to log in before you can comment on or make changes to this bug.