Closed Bug 759065 Opened 9 years ago Closed 9 years ago

Background updates no longer downloads a full update after a partial update failure

Categories

(Toolkit :: Application Update, defect)

12 Branch
x86_64
Windows 7
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla15

People

(Reporter: bbondy, Assigned: ehsan.akhgari)

References

Details

Attachments

(1 file, 3 obsolete files)

When a partial update fails, it should automatically try to download the full update.  This no longer works with background updates.

1. Install the second latest build on m-c.
2. Before opening Firefox, replace any .exe or dll file with one from Aurora.
3. Try to do an update, it will say update failed. 
4. Close browser, re-open it loops back to #2.

Actual result:
No fallback from partial -> full happens.

Expected result:
A fallback from partial -> full should happen.

Telemetry data shows us that this error and fallback can happen quite a bit, (Maybe 1%?).

This is not an extremely dangerous thing at the moment because we only do partials for 1 day. So the user will have at most 1 day of failed updates on the Nightly m-c channel.  Also we don't need to disable updates for this because ReleEng could just delete the partial update at any time to fix.
update.status and the MAR file are cleared in this case after the failed apply.
Attached patch Patch (v1) (obsolete) — Splinter Review
Assignee: nobody → ehsan
Status: NEW → ASSIGNED
Attachment #628086 - Flags: review?(robert.bugzilla)
Comment on attachment 628086 [details] [diff] [review]
Patch (v1)

rstrong asked me to take this
Attachment #628086 - Flags: review?(robert.bugzilla) → review?(netzen)
ehsan: Could you push this to Oak and do 2 Nightly builds as well? I'd like to test this at the time of reviewing.
Sure, done.
Comment on attachment 628086 [details] [diff] [review]
Patch (v1)

Looks like the pushes to Oak introduced some failures, clearing the review flag for an updated patch.
Attachment #628086 - Flags: review?(netzen)
Pushed test builds to the Oak branch.
(In reply to Ehsan Akhgari [:ehsan] from comment #8)
> Pushed test builds to the Oak branch.

Wrong bug!  I'm investigating the test failures.
Attached patch Patch (v2) (obsolete) — Splinter Review
This should pass the tests...
Attachment #628086 - Attachment is obsolete: true
Attachment #628405 - Flags: review?(netzen)
So this patch seems to work...  But there's something tricky about it.  With this, we also need to extend the About dialog and the update wizard to correctly handle the case where an update fails to get staged in the background and a full update download gets triggered.  Currently neither of these UIs can handle this correctly.  What should the correct handling look like?  Should we flip back to the downloading stage when that happens, for example?
And thinking about this some more, I am no longer sure if this actually makes sense.  But I don't know what the original reason why we do this for regular updates is.
Comment on attachment 628405 [details] [diff] [review]
Patch (v2)

Review of attachment 628405 [details] [diff] [review]:
-----------------------------------------------------------------

> we also need to extend the About dialog and the update wizard to 
> correctly handle the case where an update fails to get staged 
> in the background and a full update download gets triggered.

I don't know the about dialog code nor the update service code but the nsUpdateService.js changes look correct to me.
Attachment #628405 - Flags: review?(robert.bugzilla)
Attachment #628405 - Flags: review?(netzen)
Attachment #628405 - Flags: feedback+
> But I don't know what the original reason why we do this for regular updates is.

The full update has a higher chance of success than the partial one does. If a user's install dir gets out of sync in any way the partial will fail. 

I don't recall which error code would be returned, but it would probably be a good idea to add some telemetry to see how often this happens.  I know I've had it happen several times before.
(In reply to Ehsan Akhgari [:ehsan] from comment #11)
> So this patch seems to work...  But there's something tricky about it.  With
> this, we also need to extend the About dialog and the update wizard to
> correctly handle the case where an update fails to get staged in the
> background and a full update download gets triggered.  Currently neither of
> these UIs can handle this correctly.  What should the correct handling look
> like?  Should we flip back to the downloading stage when that happens, for
> example?
The mochitest-chrome test file test_0064_check_verifyFailPartial_successComplete.xul tests this scenario for the update wizard. iirc it just downloads the full update in this scenario and should do the same for the about dialog.
(In reply to Robert Strong [:rstrong] (do not email) from comment #15)
> (In reply to Ehsan Akhgari [:ehsan] from comment #11)
> > So this patch seems to work...  But there's something tricky about it.  With
> > this, we also need to extend the About dialog and the update wizard to
> > correctly handle the case where an update fails to get staged in the
> > background and a full update download gets triggered.  Currently neither of
> > these UIs can handle this correctly.  What should the correct handling look
> > like?  Should we flip back to the downloading stage when that happens, for
> > example?
> The mochitest-chrome test file
> test_0064_check_verifyFailPartial_successComplete.xul tests this scenario
> for the update wizard. iirc it just downloads the full update in this
> scenario and should do the same for the about dialog.

Is there any way to invoke this outside of a test?  So that I can play around and experiment with it?
It would likely be easiest to copy a real update.xml to your own web space with both a partial / complete and change the hashValue for the partial. You can then use the app.update.url.override pref to point to this update.xml
Attached patch Patch (v3) (obsolete) — Splinter Review
OK, this patches fixes both pieces of the UI interacting with this to be able to go back into the "downloading" state if the fallback to full update happens.
Attachment #628405 - Attachment is obsolete: true
Attachment #628405 - Flags: review?(robert.bugzilla)
Attachment #628972 - Flags: review?(robert.bugzilla)
I've tested this locally, and I will push it to Oak as well.
Comment on attachment 628972 [details] [diff] [review]
Patch (v3)

>diff --git a/toolkit/mozapps/update/nsUpdateService.js b/toolkit/mozapps/update/nsUpdateService.js
>index f8bc2db..c0a78c6 100644
>--- a/toolkit/mozapps/update/nsUpdateService.js
>+++ b/toolkit/mozapps/update/nsUpdateService.js
>@@ -987,6 +987,36 @@ function handleUpdateFailure(update, errorCode) {
> }
> 
> /**
>+ * Fall back to downloading a complete update in case an update has failed.
>+ *
>+ * @param update the update object that has failed to apply.
>+ */
>+function handleFallbackToFullUpdate(update) {
handleFallbackToCompleteUpdate

>+  cleanupActiveUpdate();
>+
>+  update.statusText = gUpdateBundle.GetStringFromName("patchApplyFailure");
>+  var oldType = update.selectedPatch ? update.selectedPatch.type
>+                                     : "complete";
>+  if (update.selectedPatch && oldType == "partial" && update.patchCount == 2) {
>+    // Partial patch application failed, try downloading the complete
>+    // update in the background instead.
>+    LOG("UpdateService:_postUpdateProcessing - install of partial patch " +
LOG("handleFallbackToCompleteUpdate

>+        "failed, downloading complete patch");
>+    var status = Cc["@mozilla.org/updates/update-service;1"].
>+                 getService(Ci.nsIApplicationUpdateService).
>+                 downloadUpdate(update, true);
>+    if (status == STATE_NONE)
>+      cleanupActiveUpdate();
>+  }
>+  else {
>+    LOG("UpdateService:_postUpdateProcessing - install of complete or " +
LOG("handleFallbackToCompleteUpdate

>+        "only one patch offered failed... showing error.");
Since an error isn't always shown
"only one patch offered failed.");

>+  }
>+  update.QueryInterface(Ci.nsIWritablePropertyBag);
>+  update.setProperty("patchingFailed", oldType);
>+}
>+
>+/**
>  * Update Patch
>  * @param   patch
>  *          A <patch> element to initialize this object with

>...
>@@ -2387,7 +2399,9 @@ UpdateManager.prototype = {
>     update.state = ary[0];
>     if (update.state == STATE_FAILED && ary[1]) {
>       updateSucceeded = false;
>-      handleUpdateFailure(update, ary[1]);
>+      if (!handleUpdateFailure(update, ary[1])) {
>+        handleFallbackToFullUpdate(update);
>+      }
>     }
>     if (update.state == STATE_APPLIED && shouldUseService()) {
>       writeStatusFile(getUpdatesDir(), update.state = STATE_APPLIED_SVC);
>@@ -2405,6 +2419,8 @@ UpdateManager.prototype = {
> 
>     // Send an observer notification which the update wizard uses in
>     // order to update its UI.
>+    LOG("Notifying observers that the update was staged. state: " +
LOG("UpdateManager:refreshUpdateStatus - " etc.

>+        update.state + ", status: " + status);
>     Services.obs.notifyObservers(null, "update-staged", update.state);
> 
>     // Do this after *everything* else, since it will likely cause the app
Attachment #628972 - Flags: review?(robert.bugzilla) → review+
Attachment #628972 - Attachment is obsolete: true
https://hg.mozilla.org/mozilla-central/rev/e3d3a4b05211
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla15
Is there a change to get a test for? If not we could probably cover that by our Mozmill update tests.
Flags: in-testsuite?
(In reply to Henrik Skupin (:whimboo) from comment #23)
> Is there a change to get a test for? If not we could probably cover that by
> our Mozmill update tests.

Yeah, I'm planning to write tests for a lot of this stuff soon.
Depends on: 764377
No longer depends on: 764377
You need to log in before you can comment on or make changes to this bug.