FHR: Record more application update data

VERIFIED FIXED in Firefox 32

Status

defect
VERIFIED FIXED
6 years ago
10 months ago

People

(Reporter: gps, Assigned: enndeakin)

Tracking

unspecified
Firefox 32
Dependency tree / graph
Bug Flags:
firefox-backlog +

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: p=13 s=it-32c-31a-30b.2 [qa!])

Attachments

(1 attachment, 2 obsolete attachments)

+++ This bug was initially created as a clone of Bug #864993 +++

Bug 864993 added basic "is app update enabled" and "is automatic app update downloading enabled" fields to FHR. There are many more possibilities of data that can be added. This bug will track the next wave.

See bug 864993 for specifics.
Summary: Record more application update data → FHR: Record more application update data
Depends on: fxdesktoptriage
No longer depends on: fxdesktoptriage
Operational data requested:

* count: attempts to download a partial update
* count: attempts to download a full update
* count: success and failure of update downloads

Diagnostic data:

* log: reason why an update failed to apply
Blocks: fxdesktopbacklog
No longer blocks: fxdesktoptriage
Whiteboard: p=0
Assignee: nobody → enndeakin
Status: NEW → ASSIGNED
Whiteboard: p=0 → p=13 s=it-31c-30a-29b.2
Flags: needinfo?(florin.mezei)
Whiteboard: p=13 s=it-31c-30a-29b.2 → p=13 s=it-31c-30a-29b.2 [qa+]
No longer blocks: fxdesktopbacklog
Flags: firefox-backlog+
Flags: needinfo?(florin.mezei)
QA Contact: petruta.rasa
OK, so let me summarize what I think needs to be done here.

The Metrics.storage will be used to store counts of what aspects of updates need to be recorded. Those include:
 - checking for updates (checkForUpdates)
 - starting a complete or partial download (added somewhere in Downloader:downloadUpdate)
 - finishing an update, either successfully or failure (added in onStopRequest)
 - applying the update (added near where _sendStatusCodeTelemetryPing is called)

Is it sufficient to just do that, or does something within healthreport/providers.jsm need to collect the data for sending?

I'm a bit unclear as to what the updater code (nsUpdateService.js) should be doing here when one of those situations occurs. Adding to the metrics storage itself, contacting the health report AppInfoProvider provider which currently provides some update preference info, contacting a new health report provider, or something else?
Flags: needinfo?(gps)
The goal of FHR providers is for them to live close to the code they are measuring. FHR doesn't want to be the module/owner of domain-specific code throughout the tree. We just ask that changes to the data are run through FHR people (for policy, legal, and impact on metrics reasons).

I would create a new provider for the update mechanism. Let's call it "org.mozilla.update."

The following patches can be used as a template for adding a new provider:

https://hg.mozilla.org/mozilla-central/rev/c542b74bd763
https://hg.mozilla.org/mozilla-central/rev/dc26e8d484f8

The Sync one demonstrates how you can record new values as a result of an event in your service.

I would write a second patch that would stop collecting data into the org.mozilla.appInfo.update measurement. All new data would go the new update provider/measurements and FHR's core providers would get out of the business of knowing anything about the update service.
Flags: needinfo?(gps)
Whiteboard: p=13 s=it-31c-30a-29b.2 [qa+] → p=13 s=it-31c-30a-29b.3 [qa+]
Posted patch updaterecordreport (obsolete) — Splinter Review
This is what I have so far. It seems to get updated properly when update checking is done at startup and the included test passes.

Some additional work:
 - counters for applying the update
 - some comments within the healthreport code suggest that loading is delayed at startup to save time, but updates are checked during startup, so there may be some performance impact here

Anyway, I'll be away for two weeks, so I'm putting this work in progress up for now.
Attachment #8407267 - Flags: feedback?(gps)
Comment on attachment 8407267 [details] [diff] [review]
updaterecordreport

Bumping requests to rnewman since gps is unavailable.
Attachment #8407267 - Flags: feedback?(gps) → feedback?(rnewman)
Whiteboard: p=13 s=it-31c-30a-29b.3 [qa+] → p=13 s=it-32c-31a-30b.1 [qa+]
Comment on attachment 8407267 [details] [diff] [review]
updaterecordreport

Review of attachment 8407267 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for the delay looking at this. It's been hectic.

This f+ is pretty much an r+.

::: toolkit/mozapps/update/UpdateHealthReport.jsm
@@ +4,5 @@
> +
> +"use strict";
> +
> +this.EXPORTED_SYMBOLS = [
> +  "UpdateProvider",

Maybe call this file UpdaterHealthProvider? "UpdateHealthReport" sounds like verb-noun.

@@ +29,5 @@
> +  fields: {
> +    updateCheckStartCount: DAILY_COUNTER_FIELD,
> +    updateCheckSuccessCount: DAILY_COUNTER_FIELD,
> +    updateCheckFailedCount: DAILY_COUNTER_FIELD,
> +    updateCheckFailedStatuses: DAILY_DISCRETE_NUMERIC_FIELD,

Do we need the count if we have the list of statuses? I see you're doing concatenation to get the field name, so maybe this is not worth optimizing.

@@ +35,5 @@
> +    partialUpdateStartCount: DAILY_COUNTER_FIELD,
> +    completeUpdateSuccessCount: DAILY_COUNTER_FIELD,
> +    partialUpdateSuccessCount: DAILY_COUNTER_FIELD,
> +    updateFailedCount: DAILY_COUNTER_FIELD,
> +    updateFailedStatuses: DAILY_DISCRETE_NUMERIC_FIELD,

Similarly.

::: toolkit/mozapps/update/nsUpdateService.js
@@ +936,5 @@
>    return reason;
>  }
>  
>  /**
> + * Record count in the health report

Trailing period.

@@ +960,5 @@
> +        }
> +      });
> +    }
> +  // The reporting service hasn't initialized during xpcshell tests, so just do nothing
> +  } catch (ex) { LOG("UpdateService: Could not initialize health reporter"); }

Nit: multiple lines.

@@ +3630,5 @@
>      var url = this.getUpdateURL(force);
>      if (!url || (!this.enabled && !force))
>        return;
>  
> +    recordInHealthReport("updateCheckStart", 0);

Could you lift these strings into a pseudo-enum, maybe on nsUpdateService? The sanity of recording into FHR depends on updating the measurement whenever these change, and typos would be bad.

::: toolkit/mozapps/update/nsUpdateService.manifest
@@ +12,5 @@
>  category profile-after-change nsUpdateServiceStub @mozilla.org/updates/update-service-stub;1
> +#ifdef MOZ_SERVICES_HEALTHREPORT
> +category healthreport-js-provider-default UpdateProvider resource://gre/modules/UpdateHealthReport.jsm
> +#endif
> +    
\ No newline at end of file

Trailing whitespace.
Attachment #8407267 - Flags: feedback?(rnewman) → feedback+
Posted patch updaterecordreport (obsolete) — Splinter Review
> Do we need the count if we have the list of statuses?

It is redundant, but I left it for convenience.
Attachment #8407267 - Attachment is obsolete: true
Attachment #8420130 - Flags: review?(rnewman)
Comment on attachment 8420130 [details] [diff] [review]
updaterecordreport

For updater changes. I tested this with the existing updater tests and I can see the right info being populated into about:healthreport when the tests are finished. Not sure if we want any more specific tests for this beyond what exists in this patch.
Attachment #8420130 - Flags: review?(robert.strong.bugs)
Comment on attachment 8420130 [details] [diff] [review]
updaterecordreport

Review of attachment 8420130 [details] [diff] [review]:
-----------------------------------------------------------------

Rubberstamp!
Attachment #8420130 - Flags: review?(rnewman) → review+
Whiteboard: p=13 s=it-32c-31a-30b.1 [qa+] → p=13 s=it-32c-31a-30b.2 [qa+]
Comment on attachment 8420130 [details] [diff] [review]
updaterecordreport

>diff --git a/toolkit/mozapps/update/moz.build b/toolkit/mozapps/update/moz.build
>--- a/toolkit/mozapps/update/moz.build
>+++ b/toolkit/mozapps/update/moz.build
>@@ -41,9 +41,13 @@ if CONFIG['MOZ_UPDATER']:
>         'nsUpdateService.manifest',
>     ]
> 
>     EXTRA_PP_COMPONENTS += [
>         'nsUpdateService.js',
>         'nsUpdateServiceStub.js',
>     ]
> 
>+    EXTRA_JS_MODULES += [
>+      'UpdaterHealthProvider.jsm'
>+    ]
>+
> JAR_MANIFESTS += ['jar.mn']
>\ No newline at end of file
nit: newline

>diff --git a/toolkit/mozapps/update/nsUpdateService.js b/toolkit/mozapps/update/nsUpdateService.js
>--- a/toolkit/mozapps/update/nsUpdateService.js
>+++ b/toolkit/mozapps/update/nsUpdateService.js
>...
>@@ -931,16 +944,46 @@ function getStatusTextFromCode(code, def
>     reason = gUpdateBundle.GetStringFromName("check_error-" + defaultCode);
>     LOG("getStatusTextFromCode - transfer error: " + reason +
>         ", default code: " + defaultCode);
>   }
>   return reason;
> }
> 
> /**
>+ * Record count in the health report.
>+ * @param field
>+ *        The field name to record
>+ * @param status
>+ *        Status code for errors, 0 otherwise
>+ */
>+function recordInHealthReport(field, status) {
>+#ifdef MOZ_SERVICES_HEALTHREPORT
>+  try {
>+    LOG("recordInHealthReport - " + field + " - " + status);
>+
>+    let reporter = Cc["@mozilla.org/datareporting/service;1"]
>+                      .getService().wrappedJSObject.healthReporter;
>+
>+    if (reporter) {
>+      reporter.onInit().then(function recordUpdateInHealthReport() {
>+        try {
>+          reporter.getProvider("org.mozilla.update").recordUpdate(field, status);
>+        } catch (ex) {
>+          Cu.reportError(ex);
>+        }
>+      });
>+    }
>+  // The reporting service hasn't initialized during xpcshell tests, so just
>+  // do nothing
nit: should mention "Don't fail updates if getting the heath reporter service fails" and that covers the xpcshell and other possible cases.

>+  } catch (ex) { LOG("UpdateService: Could not initialize health reporter"); }
nit: put the LOG statement on its own line and UpdateService is redundant with the LOG statement already prefixing with AUS:SVC. How about
LOG("recordInHealthReport - could not initialize health reporter");

>+#endif
>+}
>+
>+/**
>  * Get the Active Updates directory
>  * @return The active updates directory, as a nsIFile object
>  */
> function getUpdatesDir() {
>   // Right now, we only support downloading one patch at a time, so we always
>   // use the same target directory.
>   return getUpdateDirCreate([DIR_UPDATES, "0"]);
> }

>...
>@@ -4366,59 +4426,64 @@ Downloader.prototype = {
>         this._update.statusText = message;
> 
>         if (this._update.isCompleteUpdate || this._update.patchCount != 2)
>           deleteActiveUpdate = true;
> 
>         // Destroy the updates directory, since we're done with it.
>         cleanUpUpdatesDir();
>       }
>-    } else if (status == Cr.NS_ERROR_OFFLINE) {
>-      // Register an online observer to try again.
>-      // The online observer will continue the incremental download by
>-      // calling downloadUpdate on the active update which continues
>-      // downloading the file from where it was.
>-      LOG("Downloader:onStopRequest - offline, register online observer: true");
>-      shouldRegisterOnlineObserver = true;
>-      deleteActiveUpdate = false;
>-    // Each of NS_ERROR_NET_TIMEOUT, ERROR_CONNECTION_REFUSED, and
>-    // NS_ERROR_NET_RESET can be returned when disconnecting the internet while
>-    // a download of a MAR is in progress.  There may be others but I have not
>-    // encountered them during testing.
>-    } else if ((status == Cr.NS_ERROR_NET_TIMEOUT ||
>-                status == Cr.NS_ERROR_CONNECTION_REFUSED ||
>-                status == Cr.NS_ERROR_NET_RESET) &&
>-               this.updateService._consecutiveSocketErrors < maxFail) {
>-      LOG("Downloader:onStopRequest - socket error, shouldRetrySoon: true");
>-      shouldRetrySoon = true;
>-      deleteActiveUpdate = false;
>-    } else if (status != Cr.NS_BINDING_ABORTED &&
>-               status != Cr.NS_ERROR_ABORT &&
>-               status != Cr.NS_ERROR_DOCUMENT_NOT_CACHED) {
>-      LOG("Downloader:onStopRequest - non-verification failure");
>-      // Some sort of other failure, log this in the |statusText| property
>-      state = STATE_DOWNLOAD_FAILED;
>-
>-      // XXXben - if |request| (The Incremental Download) provided a means
>-      // for accessing the http channel we could do more here.
>-
>-      this._update.statusText = getStatusTextFromCode(status,
>-                                                      Cr.NS_BINDING_FAILED);
>+    }
>+    else {
nit: one line } else {


>diff --git a/toolkit/mozapps/update/tests/unit_aus_update/xpcshell.ini b/toolkit/mozapps/update/tests/unit_aus_update/xpcshell.ini
>--- a/toolkit/mozapps/update/tests/unit_aus_update/xpcshell.ini
>+++ b/toolkit/mozapps/update/tests/unit_aus_update/xpcshell.ini
>@@ -34,8 +34,9 @@ reason = custom nsIUpdatePrompt
> [uiOnlyAllowOneWindow.js]
> skip-if = toolkit == 'gonk'
> reason = custom nsIUpdatePrompt
> [uiUnsupportedAlreadyNotified.js]
> skip-if = toolkit == 'gonk'
> reason = custom nsIUpdatePrompt
> [updateRootDirMigration_win.js]
> run-if = os == 'win'
>+[updateHealthReport.js]
How do you prevent this test from running when MOZ_SERVICES_HEALTHREPORT isn't defined?

All looks good except for that last item so with that handled it will be an r+. Thanks!
Attachment #8420130 - Flags: review?(robert.strong.bugs) → review-
Attachment #8420130 - Attachment is obsolete: true
Attachment #8421760 - Flags: review?(robert.strong.bugs)
Comment on attachment 8421760 [details] [diff] [review]
Add health reporting to update, v2

Thanks!
Attachment #8421760 - Flags: review?(robert.strong.bugs) → review+
https://hg.mozilla.org/mozilla-central/rev/a55387bf473b
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 32
Depends on: 1010984
Mozilla/5.0 (X11; Linux i686; rv:32.0) Gecko/20100101 Firefox/32.0 - 20140515030202
This is what I obtained while testing Nightly 15/05.
"org.mozilla.update.update": {
          "_v": 1,
          "updateCheckFailedCount": 1,
          "updateCheckStartCount": 3,
          "updateCheckSuccessCount": 2,
          "updateCheckFailedStatuses": [
            111
          ]
        }
Is there a way to check for the log report (here I believe code 111)?
Also, are there any other useful tips to help me verifying this bug? Thank you!
No longer depends on: 1010984
Flags: needinfo?(enndeakin)
The data above suggests you tried to check for updates three times, and failed one of those times with an error code of 111 (that error means you were offline).

However, no updates were available, otherwise you should have seen some additional entries, completeUpdateStart or partialUpdateStart plus others depending on success and fail. For more complete testing, you may want to test when an update is available.
Flags: needinfo?(enndeakin)
Thanks, Neil!

Please see my results on 2014-05-16 Nightly under Win 8 64-bit, Ubuntu 13.04 32-bit and Mac OSX 10.9.2:
https://docs.google.com/spreadsheets/d/1H_GXTQ3UWyofJ2z09dpR69M2kH7hjyCy8kiW2DoUWsA/edit?usp=sharing

The tests I performed from Help-About menu were: I close Firefox before the download was completed, I interrupted the network connection while the update was downloading, I've made successful downloads..

It seems that full downloads are interpreted as partials. Also I don't see completeUpdateSuccessCount for any of the OSs.
The updateCheckFailedStatuses for Mac OS is different from the ones obtained for the other two OSs and seems to look more like updateFailedStatuses.

Please let me know what you think about all these and if there are any other tests I could do.
(In reply to Petruta Rasa [QA] [:petruta] from comment #17)
> It seems that full downloads are interpreted as partials. Also I don't see
> completeUpdateSuccessCount for any of the OSs.
> The updateCheckFailedStatuses for Mac OS is different from the ones obtained
> for the other two OSs and seems to look more like updateFailedStatuses.

I get the same error on Mac. I don't think the specific errors are too important here. They are all network error codes (cancelled, offline, dns error, connection failure, timeout, etc) and they likely differ only due to the way the os intreprets specific conditions.

The only thing that is noticeable is that some attempts to start a download never output a success or failure. I assume that these are cases where you quit while updating. If this becomes an issue with the data, we can investigate that further in another bug.

So I think this tests this well enough. Thanks!
I managed to have also completeUpdateSuccessCount while updating from Nightly 15/05 to 18/05 under Win 7 64-bit. The failed statuses below were obtained when quitting while updating.
"org.mozilla.update.update": {
          "_v": 1,
          "completeUpdateStartCount": 6,
          "completeUpdateSuccessCount": 1,
          "updateCheckStartCount": 1,
          "updateCheckSuccessCount": 1,
          "updateFailedCount": 3,
          "updateFailedStatuses": [
            2152398850,
            2152398850,
            2152398850
          ]
        },

I'm marking this issue verified as per comment 18.
Status: RESOLVED → VERIFIED
Whiteboard: p=13 s=it-32c-31a-30b.2 [qa+] → p=13 s=it-32c-31a-30b.2 [qa!]
Depends on: 1031874
Product: Firefox Health Report → Firefox Health Report Graveyard
You need to log in before you can comment on or make changes to this bug.