Closed Bug 831701 Opened 11 years ago Closed 11 years ago

when there is a new system update, we should remove the already downloaded part for the old system update

Categories

(Firefox OS Graveyard :: General, defect)

ARM
Gonk (Firefox OS)
defect
Not set
critical

Tracking

(blocking-b2g:tef+, firefox19 wontfix, firefox20 wontfix, firefox21 fixed, b2g18 fixed, b2g18-v1.0.0 fixed, b2g18-v1.0.1 fixed)

VERIFIED FIXED
B2G C4 (2jan on)
blocking-b2g tef+
Tracking Status
firefox19 --- wontfix
firefox20 --- wontfix
firefox21 --- fixed
b2g18 --- fixed
b2g18-v1.0.0 --- fixed
b2g18-v1.0.1 --- fixed

People

(Reporter: julienw, Assigned: dhylands)

References

Details

Attachments

(4 files, 3 obsolete files)

STR:
* start a system update download
* stop it
* trigger a new system update
* start the new system update download

Expected:
* the download starts over and eventually completes and the update gets applied

Actual:
* the download seems to resume, it completes but the update is never applied
* following tries stay blocked at 0. (which might be another bug ? ie we don't send an error if we don't start a download at all)

[1] is a complete log of one of the tries where the download stays blocked at 0.

Nominating because system updates MUST work.

[1] http://pastebin.mozilla.org/2060886
I investigated on the phone, here are some findings :

root@android:/data/local/updates/0 # ls -l
-rw-r----- root     root            5 2013-01-17 09:31 update.status
-rw-r----- root     root            5 2013-01-17 09:31 update.version
root@android:/data/local/updates/0 # cat update.status                         
null
root@android:/data/local/updates/0 # cat update.version                        
null
Here is the updates.xml file
and now the /data/local/active-update.xml in case it is useful.
I removed all update-related files (/data/local/*update*) before restarting and then trying a system update. But I may have failed removing all files :)
Major system update issue = blocker

Dave can you take a look?
Assignee: nobody → dhylands
blocking-b2g: tef? → tef+
Heh - I've been investigating exactly this problem over the last few days with jst.

Yes - I will definitely work on this.
I was able to reproduce this using local test files.

With the latest patches when the "new" update completes you wind up with an update.mar that has contents from the initial update and also from the new update.

The verification stage fails, and I see a message flash up on the phone that says:

    "There was an error while download the updates"

This shows up at the bottom of the screen for a few seconds, and then disappears. The notification area shows 1 update available.

If you then click on the notification, it gets stuck at the 0.00 bytes downloaded.

In /data/local/updates/0 I see update.status and update.version, both of which contain the string "null" (without the quotes).

Now to dig into the logic and figure out what's going on.
This patch causes a patch file which hasn't yet been applied to be removed prior to downloading a new patch.

As part of testing this, I also fixed a couple of other minor problems:
- A fully downloaded, but unapplied patch would be re-downloaded instead of just being applied
- If the user chooses to install Later, then they can never choose to apply the patch again.

A side-effect of this patch, is that it also fixes bug 821523.
Remove  trailing space.
Attachment #704759 - Attachment is obsolete: true
Attachment #704762 - Flags: review?(ehsan)
Blocks: 821523
Comment on attachment 704762 [details] [diff] [review]
Remove stale patche when starting to download a new plus a couple cleanups.

Review of attachment 704762 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry Dave, I don't really know the update service code well enough to be able to provide any useful review here.
Attachment #704762 - Flags: review?(ehsan) → review?(robert.bugzilla)
Comment on attachment 704762 [details] [diff] [review]
Remove stale patche when starting to download a new plus a couple cleanups.

I think I mentioned essentially this in a previous bug. We have always supported the ability to resume the download of an update and unless I am mistaken this patch (like the other one) will make it so a partially downloaded update will be deleted. The state you can end up in is never completely downloading an update. Does this allow continuing the download? Also, can you create tests for the GONK specific behavior?

minusing to get answers for the above.
Attachment #704762 - Flags: review?(robert.bugzilla) → review-
rob, this bug is about removing the stale update if a brand new update is available. I mean a new file.
Right and the system can get into a state where an update is never completely downloaded / applied in the case where there are nightly updates.
(In reply to Robert Strong [:rstrong] (do not email) from comment #14)
> Right and the system can get into a state where an update is never
> completely downloaded / applied in the case where there are nightly updates.
*especially* in the case of nightly updates... it could happen in other use cases as well.
so what do you suggest here? finish the old update even if there is a new one, and only then notice the user that a new one is available?
That's what we currently do on desktop for the reasons I have stated. What we'd like to do is come up with a way to detect when there have been repeated updates skipped due to always downloading a *new* one and then continue downloading the update that would otherwise be cancelled.
(In reply to Robert Strong [:rstrong] (do not email) from comment #12)
> Comment on attachment 704762 [details] [diff] [review]
> Remove stale patche when starting to download a new plus a couple cleanups.
> 
> I think I mentioned essentially this in a previous bug. We have always
> supported the ability to resume the download of an update and unless I am
> mistaken this patch (like the other one) will make it so a partially
> downloaded update will be deleted. The state you can end up in is never
> completely downloading an update. Does this allow continuing the download?
> Also, can you create tests for the GONK specific behavior?

The problem with the current "resume" of the download is that it happens at the wrong time.

In its original form, the download would resume at the _postUpdateProcessing stage which is before the B2G UI has launched.

This then slows down the launch of the UI. The phone appears dead, potentially for several minutes. I can tell its downloading by looking at the adb logs, but the user doesn't have access to these.

We also can't start downloading automatically over a cost based channel. For example, say the user starts the download over Wifi, and it gets interrupted and the system restarts. It's not acceptable to now start downloading over say 3G.

In the scenario where we have an interrupted download and a new download, what I'd really like to see is that the user is presented with both as options, and the user can choose to continue the interrupted download, or choose to go with the new one. But in any event, we need to wait for the user to make that choice.

In the market we're releasing to, nightly updates are unlikely, which is why I've taken the path I have for the time being.
I guess for v1.0 whether we should resume the interrupted download or start a new one is really a product and/or UX decision.

We only have a few days left, so we can't get too elaborate.
Flags: needinfo?(ffos-product)
(In reply to Robert Strong [:rstrong] (do not email) from comment #12)
> Comment on attachment 704762 [details] [diff] [review]
> Remove stale patche when starting to download a new plus a couple cleanups.
> 
> Also, can you create tests for the GONK specific behavior?

I filed bug 830780 to create gonk specific tests to cover the various differences for interrupted/corrupted downloads and I expect to start working on this real soon now.
I think it is the right choice for the B2G case to always prompt the user instead of resuming the download though I think it would be better to at least give users the option to always download. The main concern I have is some users ending up in a state where they never finish a download due to always cancelling the previous update and starting a new update. There is also the case where the partially downloaded update is the same as the one offered and with the proposed approach you end up downloading the entire update instead of resuming which also has implications for cost based channels.
(In reply to Dave Hylands [:dhylands] from comment #18)
>...
> In the market we're releasing to, nightly updates are unlikely, which is why
> I've taken the path I have for the time being.
With that in mind then the chances of someone having a newer update available than the one currently downloading should be extremely rare so continuing the current update download (after getting user consent) will more likely be the normal case. The main difference is that by deleting the current download and starting a new one it is possible that the system won't ever get updated.
(In reply to Robert Strong [:rstrong] (do not email) from comment #21)
> I think it is the right choice for the B2G case to always prompt the user
> instead of resuming the download though I think it would be better to at
> least give users the option to always download. The main concern I have is
> some users ending up in a state where they never finish a download due to
> always cancelling the previous update and starting a new update. There is
> also the case where the partially downloaded update is the same as the one
> offered and with the proposed approach you end up downloading the entire
> update instead of resuming which also has implications for cost based
> channels.

We only remove the "stale" update if it has a different appVersion or buildID. If those two things are the same, then we fall through and continue the download from where it left off.
(In reply to Robert Strong [:rstrong] (do not email) from comment #22)
> (In reply to Dave Hylands [:dhylands] from comment #18)
> >...
> > In the market we're releasing to, nightly updates are unlikely, which is why
> > I've taken the path I have for the time being.
> With that in mind then the chances of someone having a newer update
> available than the one currently downloading should be extremely rare so
> continuing the current update download (after getting user consent) will
> more likely be the normal case. The main difference is that by deleting the
> current download and starting a new one it is possible that the system won't
> ever get updated.

So the user does have the option to cancel a current download. So perhaps the right thing to do is that as long as an update is "active", then the UI will only present the currently "active" update, even if newer updates are available.

If the user cancels the active update, then they'll see the newer update.

Then it really boils down to what should the behavior of "Check Now" (forced check) do while there is an active update?

The only issue I see right now is that the current UI has no way to indicate that a newer update is available while some earlier update is still active. But we could also just punt that to a later version.
(In reply to Dave Hylands [:dhylands] from comment #24)
> Then it really boils down to what should the behavior of "Check Now" (forced
> check) do while there is an active update?

When you press "Check Now", there is a spot underneath the "Check Now" to indicate status.

We could update this status are to indicate that a new update is available (although I have a feeling that this might require a string change).

And the notification bar could continue to show the active update (even though a new update is available).
Answer to comment 25:

Why could we not return an error like "an update is currently happening" ? I mean, the user already knows that (he has the notification), and in Gaia's Settings app we'd display nothing (as when there is a new update), and in Gaia's System app we'd just ignore that. For v1 it could be enough, right ?

I believe this error would occur in three situations (ie there is a um.activeUpdate property set) :
- the update is currently downloading/uncompressing -> then there is the "update downloading" notification
- the update's download was canceled -> then there is the "update available" notification
- the update is ready to apply -> I agree that we don't have anything in Gaia for that. I opened Bug 833755 for that, but I'd be ok with v1.0.0.0 not having this.

About comment 24 ("If the user cancels the active update, then they'll see the newer update.") I'm not so sure about that. I agree with Robert and I think we should finish any started system update before handling a new one. And hopefully with differential updates the new one will be small anyway (not sure we have that in B2G though :) ).

About comment 18: I think the "current resume behaviour" you're talking about was changed in bug 813770 already. What this bug is originally about is that we're actually resuming a download after user consent _but_ if we have a new update we're resuming the new file on top of the old file, which gets us a corrupted file obviously.
Once the user starts to download an update, stick to that update until the user cancels the download and does a "Check Now".

If a newer update is started and a stale update is detected, remove the stale update.

If a download fails due an update being in the applied state, prompt the user to apply it.

If an update was fully downloaded, but interrupted during verification, then don't redownload.
Attachment #704762 - Attachment is obsolete: true
Removed a trailing space from the previous patch.

Once the user starts to download an update, stick to that update until the user cancels the download and does a "Check Now".

If a newer update is started and a stale update is detected, remove the stale update.

If a download fails due an update being in the applied state, prompt the user to apply it.

If an update was fully downloaded, but interrupted during verification, then don't redownload.

And bug 830780 was filed as a followup to add additional gonk-specific tests and to make sure the 3 test cases modified with this patch are dealt with properly.
Attachment #705530 - Attachment is obsolete: true
Attachment #705535 - Flags: review?(robert.bugzilla)
Manual testing performed with this patch:

- regular update succeeds.

- interrupted download of update. Restarted update and confirmed that it didn't redownload the already downloaded portion of the patch

- interrupted verification of update. Restarted update and conifirmed that it didn't redownload, that it did re-verify, and that it did reapply.

- interrupted application of update. Restarted update and confirmed that it re-verified and re-applied the patch.

- interrupted download of update. Setup a newer update. Restarted update and confirmed original update was resumed.

- interrupted download of update. Setup a newer update. Cancelled download. Did a Check Now. Confirmed that new update was downloaded and previous stale patch was removed.

- Did a regular update. Chose Later when asked to apply. Went to notification bar, selected "1 update available", and confirmed that Apply the update dialog was presented.

- Ran updater xpcshell test locally and confirmed that all passed.
(In reply to Dave Hylands [:dhylands] from comment #28)
> Created attachment 705535 [details] [diff] [review]
> Finish active updates unless the use swtiches to a newer one. Remove stale
> updates. v3
> 
> Removed a trailing space from the previous patch.
> 
> Once the user starts to download an update, stick to that update until the
> user cancels the download and does a "Check Now".
> 
> If a newer update is started and a stale update is detected, remove the
> stale update.
> 
> If a download fails due an update being in the applied state, prompt the
> user to apply it.
> 
> If an update was fully downloaded, but interrupted during verification, then
> don't redownload.
> 
> And bug 830780 was filed as a followup to add additional gonk-specific tests
> and to make sure the 3 test cases modified with this patch are dealt with
> properly.

From a product and user perspective, I think this approach makes sense.
Flags: needinfo?(ffos-product)
(In reply to Dave Hylands [:dhylands] from comment #29)
> Manual testing performed with this patch:
> 
> - interrupted download of update. Setup a newer update. Cancelled download.
> Did a Check Now. Confirmed that new update was downloaded and previous stale
> patch was removed.

I'm concerned with this one. Don't forget that the check is also automatically launched once per day (default delay).
Unless we can differentiate a forced-check by the user, and a scheduled check by the platform, I'd really stick to a specific update until it is applied.

Except that if the file is corrupted for some reason, it makes sense to be able to remove it at some point. I think we have an error for this case, we should completely remove the downloaded file if it is corruptedn right ?

> - Did a regular update. Chose Later when asked to apply. Went to
> notification bar, selected "1 update available", and confirmed that Apply
> the update dialog was presented.

Oh, that's über-nice. I think that was Bug 833755, right ?
And did you verify that clicking on "Apply" actually works ? I didn't try with your patch yet.
(In reply to Julien Wajsberg [:julienw] from comment #31)
> (In reply to Dave Hylands [:dhylands] from comment #29)
> > Manual testing performed with this patch:
> > 
> > - interrupted download of update. Setup a newer update. Cancelled download.
> > Did a Check Now. Confirmed that new update was downloaded and previous stale
> > patch was removed.
> 
> I'm concerned with this one. Don't forget that the check is also
> automatically launched once per day (default delay).
> Unless we can differentiate a forced-check by the user, and a scheduled
> check by the platform, I'd really stick to a specific update until it is
> applied.

That's a fairly minor change to the first if in the onCheckComplete function (Rmove the && isDownloading portion).

> Except that if the file is corrupted for some reason, it makes sense to be
> able to remove it at some point. I think we have an error for this case, we
> should completely remove the downloaded file if it is corruptedn right ?

I'm pretty sure that's the case. I'll test it.

> > - Did a regular update. Chose Later when asked to apply. Went to
> > notification bar, selected "1 update available", and confirmed that Apply
> > the update dialog was presented.
> 
> Oh, that's über-nice. I think that was Bug 833755, right ?
> And did you verify that clicking on "Apply" actually works ? I didn't try
> with your patch yet.

Yeah - I've never had any problems with that.

In retrospect, I could extract that portion of the patch and put it against that bug. Without the fix, the user has to know that they need to restart the phone to get the patch applied, and when they click on Download the downloader gives an error which isn't manifested in the UI.
> > Except that if the file is corrupted for some reason, it makes sense to be
> > able to remove it at some point. I think we have an error for this case, we
> > should completely remove the downloaded file if it is corruptedn right ?
> 
> I'm pretty sure that's the case. I'll test it.

So I corrupted the mar file, and had a newer update available. With the change (removing the && isDownloading), it sticks to the corrupted mar file until fully downloaded, and then gives an error message that there was a problem with the download. Doing a Check Now while it's downloading sticks with the active update.

The corrupted update stays in the list (I think that's a gaia bug), but the AUS code doesn't consider it to be active any more.

Once it's been detected as corrupted, doing a check now presents the newer update.
> The corrupted update stays in the list (I think that's a gaia bug)

I think we're keeping it because it was not applied and therefore we consider there is still an update. We can't know that the reason it wasn't applied was that it was corrupted.

Maybe we should file another bug for this, but I suspect we'd have to introduce another return value.
BTW if I talked about that, it was that on Pascal's phone, the phone didn't want to download a new update at all, even with check for updates, etc. So your patch definitely seems to make things better for that too.

Unfortunately he flashes his phone so I can't test this on his actual phone, too bad :)
Comment on attachment 705535 [details] [diff] [review]
Finish active updates unless the use swtiches to a newer one. Remove stale updates. v3

Looks good... only one nit

>diff --git a/toolkit/mozapps/update/nsUpdateService.js b/toolkit/mozapps/update/nsUpdateService.js
>--- a/toolkit/mozapps/update/nsUpdateService.js
>+++ b/toolkit/mozapps/update/nsUpdateService.js
>...
>@@ -2575,16 +2576,31 @@ UpdateService.prototype = {
>       if (update.isCompleteUpdate == this._downloader.isCompleteUpdate &&
>           background == this._downloader.background) {
>         LOG("UpdateService:downloadUpdate - no support for downloading more " +
>             "than one update at a time");
>         return readStatusFile(getUpdatesDir());
>       }
>       this._downloader.cancel();
>     }
>+#ifdef MOZ_WIDGET_GONK
>+    var um = Cc["@mozilla.org/updates/update-manager;1"].
>+             getService(Ci.nsIUpdateManager);
>+    var activeUpdate = um.activeUpdate;
>+    if (activeUpdate &&
>+        ((activeUpdate.appVersion != update.appVersion) ||
>+         (activeUpdate.buildID != update.buildID))) {
nit: unneeded / extra ( and )
Attachment #705535 - Flags: review?(robert.bugzilla) → review+
rstrong's nit addressed.
Also removed && isDownloading as per comment 33

https://hg.mozilla.org/integration/mozilla-inbound/rev/a38d0b7187aa
Grr.

Got my trees mixed up. This removes the && isDownloading that was supposed to be in the previous patch.

https://hg.mozilla.org/integration/mozilla-inbound/rev/bb36a87d41ec
https://hg.mozilla.org/mozilla-central/rev/a38d0b7187aa
https://hg.mozilla.org/mozilla-central/rev/bb36a87d41ec
Status: NEW → RESOLVED
Closed: 11 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → B2G C4 (2jan on)
Whiteboard: [needs-b2g-v1.0 uplift]
Verified Fixed on Unagi build #20130219070200
Dec 5th Kernel
Gecko  http://hg.mozilla.org/releases/mozilla-b2g18_v1_0_1/rev/98354c0298ab
Gaia   edaca00b1eb7534120b6255db5d5200fb1d86d65
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: