Closed Bug 880737 Opened 7 years ago Closed 7 years ago

[OTA] If the download of an update gets aborted due to a network timeout it cannot be continued until the device gets restarted

Categories

(Firefox OS Graveyard :: Gaia::System, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(blocking-b2g:leo+, firefox23 wontfix, firefox24 wontfix, firefox25 fixed, b2g18 fixed, b2g18-v1.0.0 wontfix, b2g18-v1.0.1 wontfix, b2g-v1.1hd fixed)

RESOLVED FIXED
1.1 QE4 (15jul)

People

(Reporter: whimboo, Assigned: schien)

Details

(Keywords: regression, Whiteboard: [apps watch list])

Attachments

(4 files)

Attached file adb logcat
Gecko  http://hg.mozilla.org/releases/mozilla-b2g18/rev/09dc1ae3b1b5
Gaia   92a6e36957145cdb2ac8867e5cdba8ecf12308fc
BuildID 20130605070207
Version 18.0

When you download an update and the network drops so that you get disconnected, the download stops as expected, but you are not able to continue it. You have to restart the phone to make it work again.

In more detail, the downloader fails to reach the update server and can't continue. Accessing the same URL in Firefox works fine. So some code in the downloader might have reached a bad state, which blocks us from successfully continuing the download of the update.

Attached you will find the full adb logcat output.
needsinfo on :whimboo to help answer :
a) If this happens over 3G or wifi
b) Are you able to successfully download once you reconnect? Does it pick up from where it was left off ?
c) Reproducibility frequency
Keywords: qawanted
(In reply to bhavana bajaj [:bajaj] from comment #1)
> needsinfo on :whimboo to help answer :
> a) If this happens over 3G or wifi

I never tried updating via the mobile connection. It's too expensive. When this happened I was in the MV office and used the flaky Mozilla guest network inside of 10 forward. 

> b) Are you able to successfully download once you reconnect? Does it pick up
> from where it was left off ?

No, it doesn't pick up the download because the update server cannot be found. You have to restart the phone to make it work again. I'm not sure whether re-enabling wifi also helps, since I can't reproduce this behavior on my own network. So we might want someone in MV to test this, in the hope that the network is still flaky. Marcia or Tony, mind trying that?

> c) Reproducibility frequency

It happened a couple of times when I was in MV last week and connected to Mozilla Guest.
QA Contact: jsmith
I ended up testing this today with a 6/7 build and was able to reproduce exactly what Henrik is describing. Here's what I did:

1. Flash a 6/7 build
2. Connect to Mozilla Guest wifi
3. Start downloading the OTA update
4. Wait until you've downloaded some amount of bytes (e.g. 420 KB downloaded)
5. Kill the network
6. Kill the OTA update download
7. Try downloading the OTA update again

Result - The OTA update gets stuck at 0.00 bytes downloaded forever. Occasionally, you'll see that there was an error downloading updates, but the download will keep going.

Another thing I noticed after step #5 was that the OTA update didn't fail gracefully when the network was lost. So I have a feeling something regressed in the area of network timeouts when the network is lost, and in ensuring that we clean up correctly.
Keywords: qawanted → regression
Jason, when the device is in such a state is it enough to disable wifi and re-enable it? Or do you really have to restart the device?
(In reply to Henrik Skupin (:whimboo) from comment #4)
> Jason, when the device is in such a state is it enough to disable wifi and
> re-enable it? Or do you really have to restart the device?

Disabling and re-enabling wifi didn't help for me. I still got stuck at 0.00 bytes downloading forever.
Ok, so my initial observation was correct. Would you mind testing whether you can manually reach the download URL via Firefox? In my case that worked when I grabbed the URL via 'adb logcat'. So I think it's the OTA downloader that gets confused by DNS entries?
blocking-b2g: leo? → leo+
Whiteboard: [apps watch list]
Assignee: nobody → dhylands
@jsmith, can you help confirm whether "adb logcat | grep AUS" stops printing onProgress log lines while you reproduce this bug?

I have a similar STR which shows that the download is still progressing but the UI doesn't get updated. I'm wondering if this bug is caused by the onProgress callback not being invoked. Here is the STR I use:
1. Flash a 6/7 build
2. Connect to Mozilla Guest wifi
3. Start downloading the OTA update
4. Wait until you've downloaded some amount of bytes (e.g. 420 KB downloaded)
5. Disable wifi
6. Cancel the ongoing OTA update download
7. Enable wifi and try downloading the OTA update again

Result: The UI displays the size downloaded before step 5 and never updates.

I also observed from logcat that the OTA download resumes right after wifi is enabled at step 7, even if the user doesn't click the download button.
Flags: needinfo?(jsmith)
Attached file Logcat
Flags: needinfo?(jsmith)
Not entirely sure what's being asked, although I think comment 7 sounds correct. The logcat indicates downloading is happening, but the UI is indicating it's 0.00 bytes that have been downloaded.
We need to stop the OTA retry timer when the user wants to pause the download. Otherwise, the ongoing download will be canceled by UpdateService._attemptResume() and Gaia will consider the download aborted by nsUpdateService.
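To illustrate the race described in this comment, here is a minimal sketch in JavaScript. It is not the actual patch and does not use the real nsUpdateService API: `SketchDownloader`, `FakeScheduler`, `runScenario` and the event names are all hypothetical, and the timer is simulated so the sequence can be run deterministically. It only models the idea that a retry timer left armed across a user pause will later cancel the download the user restarted.

```javascript
// Hypothetical stand-in for a timer service; flush() fires all pending timers.
class FakeScheduler {
  constructor() {
    this.nextId = 1;
    this.pending = new Map();
  }
  setTimeout(fn, _delayMs) {
    const id = this.nextId++;
    this.pending.set(id, fn);
    return id;
  }
  clearTimeout(id) {
    this.pending.delete(id);
  }
  flush() {
    const fns = [...this.pending.values()];
    this.pending.clear();
    fns.forEach((fn) => fn());
  }
}

// Hypothetical model of a downloader with an automatic retry timer.
// The names here are illustrative and do not come from nsUpdateService.
class SketchDownloader {
  constructor(scheduler) {
    this.scheduler = scheduler;
    this.state = "idle"; // "idle" | "downloading" | "paused"
    this.retryTimer = null;
    this.events = []; // what the UI layer would be told
  }

  start() {
    this.state = "downloading";
    this.events.push("started");
  }

  // Network dropped: report the error and schedule an automatic resume.
  onNetworkError(delayMs) {
    this.state = "idle";
    this.events.push("error");
    this.retryTimer = this.scheduler.setTimeout(
      () => this.attemptResume(),
      delayMs
    );
  }

  // Automatic retry: cancels whatever is in flight and starts over.
  attemptResume() {
    this.retryTimer = null;
    if (this.state === "downloading") {
      this.events.push("aborted-by-retry");
    }
    this.start();
  }

  // User-initiated pause. stopTimer=false models the behaviour before the fix.
  pause(stopTimer = true) {
    if (stopTimer && this.retryTimer !== null) {
      this.scheduler.clearTimeout(this.retryTimer);
      this.retryTimer = null;
    }
    this.state = "paused";
    this.events.push("paused");
  }
}

// Replays the STR: start, lose network, cancel, start again, timer fires.
function runScenario(stopTimerOnPause) {
  const scheduler = new FakeScheduler();
  const dl = new SketchDownloader(scheduler);
  dl.start();
  dl.onNetworkError(1000);
  dl.pause(stopTimerOnPause);
  dl.start();
  scheduler.flush();
  return dl.events;
}

console.log("timer left armed:", runScenario(false).join(", "));
console.log("timer stopped:   ", runScenario(true).join(", "));
```

In this model, leaving the timer armed produces an extra abort of the download the user just restarted, while clearing it on pause leaves that download untouched.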
Assignee: dhylands → schien
Attachment #767694 - Flags: review?(netzen)
Comment on attachment 767694
Stop retry timer when pause OTA download

Review of attachment 767694:
-----------------------------------------------------------------

In a followup bug, please add to toolkit\mozapps\update\test\unit\test_0030_general.js or create a new test for this.
Attachment #767694 - Flags: review?(netzen) → review+
Bug 887603 has been filed for adding a test case.
Keywords: checkin-needed
Attached patch rebase for b2g18
Rebase to b2g18, carry r+.
Attachment #768109 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/6c31bd729ed9
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Whatever has been landed here, the reported problem is not fixed for me. My initial issue is still present and I was able to reproduce it with the following build on the nightly channel:

Gecko  http://hg.mozilla.org/releases/mozilla-b2g18/rev/39607fd11f6b
Gaia   d336288e6cda1a8f974ceea41b9d7860c2367d1f
BuildID 20130706230209
Version 18.0
Please do not flip flags for landings for patches that have already landed. New bugs are filed for issues in this case.
If a bug has not been fixed by a code landing, why should it not be reopened? That's the usual practice, and what we have always done in the past. Filing a new bug is ok if parts of a fix remain. But here the patch clearly didn't fix the problem I have seen. So I do not understand why I should file a new bug for something I have reported and which is indeed not fixed.

Please verify tomorrow if the behavior has been changed at all for you with the Mozilla network in the office.
Flags: needinfo?(jsmith)
(In reply to Jason Smith [:jsmith] from comment #20)
> Please do not flip flags for landings for patches that have already landed.
> New bugs are filed for issues in this case.

FYI, you should have a look at bug 883770, which handles a situation like that correctly. That bug was not fixed, so it got reopened.
(In reply to Henrik Skupin (:whimboo) [away 06/28 - 07/07] from comment #21)
> If a bug has not been fixed by a landing of code, why it should not be
> reopened? That's the usual task which is done and we ever did in the past.
> Filing a new bug is ok if parts of a fix remain. But here the patch clearly
> didn't fix the problem I have seen. So I do not understand why I should file
> a new bug for something I have reported and which is indeed not fixed.

We do not track bugs this way in bugzilla - we track them by landings. This was discussed on dev-platform a while back when I brought this up with reasoning such as landing management, triage management, avoiding endless bugzilla comments with no resolution in site, and misconceptions of definition of not fixed, as the issue found here might . Otherwise, we end up problems such as inconsistent bug statuses, missing landings, inconsistent triage status, etc. I

> 
> Please verify tomorrow if the behavior has been changed at all for you with
> the Mozilla network in the office.

I'll check this.

(In reply to Henrik Skupin (:whimboo) [away 06/28 - 07/07] from comment #22)
> (In reply to Jason Smith [:jsmith] from comment #20)
> > Please do not flip flags for landings for patches that have already landed.
> > New bugs are filed for issues in this case.
> 
> FYI, you should have a look at bug 883770 which handles a situation like
> that correctly. Means the bug is not fixed, so it got reopened.

That bug was not handled correctly. You should not, at any point, reopen bugs for the definition of not fixed.
Flags: needinfo?(jsmith)
(In reply to Jason Smith [:jsmith] from comment #23)
> (In reply to Henrik Skupin (:whimboo) [away 06/28 - 07/07] from comment #21)
> > If a bug has not been fixed by a landing of code, why it should not be
> > reopened? That's the usual task which is done and we ever did in the past.
> > Filing a new bug is ok if parts of a fix remain. But here the patch clearly
> > didn't fix the problem I have seen. So I do not understand why I should file
> > a new bug for something I have reported and which is indeed not fixed.
> 
> We do not track bugs this way in bugzilla - we track them by landings. This
> was discussed on dev-platform a while back when I brought this up with
> reasoning such as landing management, triage management, avoiding endless
> bugzilla comments with no resolution in site, and misconceptions of
> definition of not fixed, as the issue found here might . Otherwise, we end
> up problems such as inconsistent bug statuses, missing landings,
> inconsistent triage status, etc. I

Ack. Missing some information here.

Misconceptions about the definition of "not fixed", as the issue found here might actually be a new issue not originally covered by this bug. Or the issue might have been fixed months ago, and now no longer works when tested again. Reopening in those cases usually causes confusion.

I would generally lean towards filing new bugs and marking them dependent on the original bug to show the tree of connected bugs if issues are filed as followups. There's usually a lot of confusion that can happen otherwise, especially with blocking triage, as our definition of blocking changes over time (it gets stricter as we move closer to release).
(In reply to Jason Smith [:jsmith] from comment #25)
> Here's the associated dev-platform discussion btw -
> https://groups.google.com/forum/#!searchin/mozilla.dev.platform/reopen/
> mozilla.dev.platform/UnxndrIUIL4/1trAhVaYM-4J.

If you refer to something, a link like that is always welcome. Not everyone follows the various newsgroups or discussions.

So I filed the follow-up bug 891009.