Closed
Bug 891009
Opened 11 years ago
Closed 11 years ago
[OTA] Unable to reuse Update object to trigger OTA download again after severe network error in previous download
Categories
(Firefox OS Graveyard :: General, defect)
Tracking
(blocking-b2g:leo+, firefox23 wontfix, firefox24 wontfix, firefox25 fixed, b2g18 verified, b2g18-v1.0.0 wontfix, b2g18-v1.0.1 wontfix, b2g-v1.1hd fixed)
People
(Reporter: whimboo, Assigned: schien)
References
Details
(Keywords: regression, Whiteboard: [apps watch list])
Attachments
(2 files, 2 obsolete files)
62.40 KB, text/plain
1.39 KB, patch (schien: review+)
+++ This bug was initially created as a clone of Bug #880737 +++
The issue reported in bug 880737 is still not fixed for me. I can consistently reproduce it on one of my local wifi networks with:
Gecko http://hg.mozilla.org/releases/mozilla-b2g18/rev/39607fd11f6b
Gaia d336288e6cda1a8f974ceea41b9d7860c2367d1f
BuildID 20130706230209
Version 18.0
When you download an update and the network fails so that you get disconnected, the download stops as expected, but you cannot resume it. You have to restart the phone to make it work again.
Comment 1•11 years ago
Adding qawanted as Jason is working on reliably reproducing the issue on a Leo device before we block on it.
Keywords: qawanted
Comment 2•11 years ago
As filed, I can't reproduce this bug on Mozilla Guest. I can reproduce a different issue: if the network is unstable or absent when the download starts or while it is running, the download hangs and is not killed gracefully. But if the network is reliable when the download starts, the OTA update downloads its resources. A restart of the phone is not required here, only a reliable connection.
Keywords: qawanted
QA Contact: jsmith
Comment 3•11 years ago
I did file bug 891979 for the issue found in comment 2, though.
Reporter
Comment 4•11 years ago
This is the logcat output from adb, which shows that the download stalls at around 1.7MB and does not continue. After a while it gets stopped. Retriggering the download process does nothing. As you can see, there is no gecko updater output at all.
Comment 5•11 years ago
If you still have the unstable network, then this is expected behavior. See https://bugzilla.mozilla.org/show_bug.cgi?id=891979#c1 that explains this in detail.
The use case that would be a bug is if this STR:
1. Start downloading an OTA update
2. Enter a bad/no network state
3. Stop the download
4. Enter a good network state
5. Start downloading an OTA update again
Causes the download to stall at 0.00 bytes. If you try doing step #5 on a bad network, you will eventually get the same behavior as after step #2, in line with https://bugzilla.mozilla.org/show_bug.cgi?id=891979#c1, which is expected. Testing done in comment 2 shows that this use case actually works as expected, so I do not understand what this bug is about. Are you reproducing this bug with bad networks in both cases? Only in the first case?
Reporter
Comment 6•11 years ago
It doesn't matter in which network I am after step 2, the download will never start when I manually stop and restart it. No process is shown in the adb log and in the notification area it stays at 0.00 bytes.
Comment 7•11 years ago
(In reply to Henrik Skupin (:whimboo) from comment #6)
> It doesn't matter in which network I am after step 2, the download will
> never start when I manually stop and restart it. No process is shown in the
> adb log and in the notification area it stays at 0.00 bytes.
I can't reproduce that behavior. That works fine for me on a 7/10 build.
Comment 8•11 years ago
Looking at the logs, it appears that the network starts to fail, and I see the watchdog timer go off.
I/Gecko ( 107): UpdatePrompt: Download watchdog fired
A bit later, it retries:
I/Gecko ( 107): UpdatePrompt: Download - restarting download - attempt 1
Then a bit later it gets a more severe failure:
I/Gecko ( 107): *** AUS:SVC Downloader:onStopRequest - status: 2152398878, current fail: 0, max fail: 20, retryTimeout: 30000
I/Gecko ( 107): *** AUS:SVC getStatusTextFromCode - transfer error: Update server not found (check your internet connection), code: 2152398878
and then the user should have been notified of a failure (this only flashes up for a couple of seconds):
I/Gecko ( 107): UpdatePrompt: Update error, state: download-failed, errorCode: 0
I/Gecko ( 107): UpdatePrompt: Setting gecko.updateStatus: Update server not found (check your internet connection)
Now things get slightly confusing because I see another download started sometime after this (unfortunately, the logcat doesn't have timestamps; tip: use adb logcat -v threadtime).
I suspect that this additional download may be the one that's stuck at 0 bytes.
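The status value in the onStopRequest line above can be split into its parts to see which kind of failure was reported. The following is a minimal sketch, assuming the nsresult bit layout and the constants from Gecko's nsError.h (module base offset 0x45, network module 6); the function names are made up for illustration. Under those assumptions the status decodes to network module, code 30, which as far as I know is NS_ERROR_UNKNOWN_HOST and matches the "Update server not found" text in the log.

```javascript
// Sketch: split an nsresult (as printed in the AUS:SVC log line) into its parts.
// The bit layout and both constants are assumptions taken from Gecko's nsError.h.
const NS_ERROR_MODULE_BASE_OFFSET = 0x45;
const NS_ERROR_MODULE_NETWORK = 6;

// Hypothetical helper, not part of any Gecko API.
function decodeNsresult(status) {
  const value = status >>> 0; // force unsigned 32-bit
  return {
    hex: "0x" + value.toString(16).toUpperCase(),
    failed: (value & 0x80000000) !== 0, // top bit marks a failure code
    module: ((value >>> 16) & 0x1fff) - NS_ERROR_MODULE_BASE_OFFSET,
    code: value & 0xffff,
  };
}

// Hypothetical helper: pull the numeric status out of an onStopRequest log line.
function statusFromLogLine(line) {
  const match = /onStopRequest - status: (\d+)/.exec(line);
  return match ? Number(match[1]) : null;
}

const line =
  "I/Gecko ( 107): *** AUS:SVC Downloader:onStopRequest - status: 2152398878, " +
  "current fail: 0, max fail: 20, retryTimeout: 30000";
const decoded = decodeNsresult(statusFromLogLine(line));
console.log(decoded);
```

Decoding the value this way makes it easier to tell a severe network failure apart from a verification failure when reading a long logcat.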
Reporter
Comment 9•11 years ago
OK, so here is another adb log with timestamps included and filtered by 'Gecko'. What you can see here is the following:
1. I restarted the phone so we have fresh log data
2. I started the download of the update
3. The download stalled after about 11MB and didn't continue
4. I think two retries happened, which were also unsuccessful and finally the download got stopped
5. I tried to re-download the update but it doesn't even start and keeps showing 0.00 bytes
6. I turned off wifi and tried to re-download the update after accepting the traffic warning
7. The download still keeps failing
8. I started Firefox and loaded http://www.google.de, which was working fine
9. I tried again to download the update via the mobile connection but it still failed
Not sure what those CTRL-EVENT-BSS-REMOVED messages from the wifi component are, but they were most likely stalling the download. After that we are no longer able to continue or restart the download until a system reboot.
Attachment #773508 - Attachment is obsolete: true
Assignee
Comment 10•11 years ago
Here is my STR:
1. change the value "app.update.url.override" in prefs.js to the update.xml on my server. (I use http://people.mozilla.org/~schien/update.xml )
2. reboot phone and start downloading the update.
3. While downloading the patch, rename the mar file on server. Error prompt will show on the screen.
4. rename the mar file back to its original name.
5. Try triggering download on device.
The key point to reproduce this bug is to enter a "non-verification failure". See http://mxr.mozilla.org/mozilla-central/source/toolkit/mozapps/update/nsUpdateService.js#4073
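For step 1, the entry in prefs.js would look roughly like the line below. This is only a sketch of the entry's shape: the pref name and URL are the ones quoted in the steps, and user_pref is the usual form of a prefs.js entry.

```js
// Point the updater at a custom update.xml (URL taken from step 1 above).
user_pref("app.update.url.override", "http://people.mozilla.org/~schien/update.xml");
```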
Assignee
Comment 11•11 years ago
The selected patch is not cleaned up after a non-verification failure; therefore, Downloader is unable to handle a selected patch with an unknown state. See http://dxr.mozilla.org/mozilla-central/source/toolkit/mozapps/update/nsUpdateService.js#l3678
Assignee: nobody → schien
Attachment #773897 - Flags: review?(netzen)
Assignee
Comment 12•11 years ago
Reporter
Comment 13•11 years ago
Great observation. Let's hope this is the last issue I have with the OTA updater. Would you mind updating the summary so it better reflects what's broken here? You might find a better wording than I would be able to.
Reporter
Updated•11 years ago
Status: NEW → ASSIGNED
Comment 14•11 years ago
Comment on attachment 773897 [details] [diff] [review]
remove selected patch from update request
Review of attachment 773897 [details] [diff] [review]:
-----------------------------------------------------------------
I'd rather rstrong take this one because I'm not sure of any side effects
Attachment #773897 - Flags: review?(netzen) → review?(robert.bugzilla)
Comment 15•11 years ago
Comment on attachment 773897 [details] [diff] [review]
remove selected patch from update request
I should be able to get to this by no later than Tuesday
Comment 16•11 years ago
Blocking given it's an update issue. Henrik, can you please help verify this once it lands, as you are the only one who was able to reproduce this reliably :)
blocking-b2g: leo? → leo+
Reporter
Comment 17•11 years ago
(In reply to bhavana bajaj [:bajaj] from comment #16)
> Blocking given its an update issue, Henrik can you please help verify this
> once this lands as you are the only one who was reliably able to reproduce
> this :)
That's not a question. Once it has been landed on b2g18 I will test the new behavior in that network environment.
Comment 18•11 years ago
(In reply to Shih-Chiang Chien [:schien] from comment #10)
> Here is my STR:
> 1. change the value "app.update.url.override" in prefs.js to the update.xml
> on my server. (I use http://people.mozilla.org/~schien/update.xml )
> 2. reboot phone and start download the update.
> 3. While downloading the patch, rename the mar file on server. Error prompt
> will show on the screen.
> 4. rename the mar file back to its original name.
> 5. Try triggering download on device.
>
> The key point to reproduce this bug is to enter “non-verification failure".
> see
> http://mxr.mozilla.org/mozilla-central/source/toolkit/mozapps/update/
> nsUpdateService.js#4073
Can I get updated links to the code from this comment and comment #11?
It seems like it should reach
http://dxr.mozilla.org/mozilla-central/source/toolkit/mozapps/update/nsUpdateService.js#l4408
Where this._update is set to null.
Are you able to reproduce this on desktop as well?
Assignee
Comment 20•11 years ago
The Update object is still held by Gaia and will be passed into nsUpdateService.downloadUpdate() on a manual retry. So setting this._update to null won't reset the state of the Update object.
Here is the code ref for comment #11:
http://dxr.mozilla.org/mozilla-central/source/toolkit/mozapps/update/nsUpdateService.js#l3927
I don't think we can reproduce this issue on desktop because, as far as I can see, desktop does not keep the Update object after a severe network error.
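The mechanism described in this comment can be illustrated with a toy model. This is a sketch under stated assumptions, not the code in nsUpdateService.js and not the attached patch: makeUpdate, startDownload, onSevereNetworkError and retryAfterFailure are all hypothetical names. It only shows why a caller that keeps the same object across retries gets stuck unless the selected patch is reset after the failure.

```javascript
// Toy model only: none of these names come from nsUpdateService.js.
function makeUpdate() {
  return { patches: [{ type: "complete", selected: false, state: null }] };
}

// Hypothetical downloader: refuses to proceed when the already-selected
// patch is in a state it does not know how to resume from.
function startDownload(update) {
  let patch = update.patches.find((p) => p.selected);
  if (patch && patch.state !== "downloading") {
    return false; // stale selection: nothing is started
  }
  if (!patch) {
    patch = update.patches[0];
    patch.selected = true;
  }
  patch.state = "downloading";
  return true;
}

// Hypothetical failure handler. `resetSelection` stands in for the idea of
// the fix: forget the selected patch so the next attempt starts clean.
function onSevereNetworkError(update, resetSelection) {
  const patch = update.patches.find((p) => p.selected);
  patch.state = "failed";
  if (resetSelection) {
    patch.selected = false;
    patch.state = null;
  }
}

function retryAfterFailure(resetSelection) {
  const update = makeUpdate(); // the caller keeps this object across retries
  startDownload(update);
  onSevereNetworkError(update, resetSelection);
  return startDownload(update); // manual retry with the same object
}

console.log(retryAfterFailure(false)); // retry is refused
console.log(retryAfterFailure(true)); // retry starts a download
```

In this model, clearing a field on the service side (the analogue of this._update = null) would not help, because the stale state lives on the object the caller still holds.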
Comment 21•11 years ago
Robert, given this issue may be more specific and applicable to B2G given comment #20, Triage would be willing to take an uplift here if we can get a risk evaluation from your side and if Henrik can confirm the patch fixes the issue for him.
Comment 22•11 years ago
With the info provided in comment #20 I should have time to finish the review and risk evaluation by tomorrow.
Comment 23•11 years ago
Needinfo'ing Robert to help with comment #22 before we make a blocking call on this.
Flags: needinfo?(robert.bugzilla)
Assignee
Updated•11 years ago
Summary: [OTA] If the download of an update is stopped due to an instable network it cannot be continued until the device gets restarted → [OTA] Unable to reuse Update object to trigger OTA download again after severe network error in previous download
Comment 24•11 years ago
Comment on attachment 773897 [details] [diff] [review]
remove selected patch from update request
I spent a bit too much time trying to reproduce this, regrettably without success. Sorry about that.
I'm ok with this for Gaia though it would be better to fix this in Gaia itself. We do use the updates.xml file to determine which patch was selected when it failed so please #ifdef MOZ_WIDGET_GONK and include a comment inside the ifdef that a reference to the update object is being held by B2G with a reference to this bug.
Bhavana, the patch is fairly safe, though with this being app update, manual testing would be a very good thing.
Attachment #773897 - Flags: review?(robert.bugzilla) → review+
Updated•11 years ago
Flags: needinfo?(robert.bugzilla)
Assignee
Comment 25•11 years ago
Updated according to the review comment; carrying over r+.
Attachment #773897 - Attachment is obsolete: true
Attachment #780743 - Flags: review+
Assignee
Updated•11 years ago
Component: Gaia::System → General
Keywords: checkin-needed
Comment 26•11 years ago
Keywords: checkin-needed
Comment 27•11 years ago
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Comment 28•11 years ago
Triage - Partners would like to take this given the user impact described in comment 0.
Comment 24 suggests a safe patch.
blocking-b2g: leo? → leo+
Comment 29•11 years ago
status-b2g18-v1.0.0: --- → wontfix
status-b2g18-v1.0.1: --- → wontfix
status-b2g-v1.1hd: --- → affected
status-firefox23: --- → wontfix
status-firefox24: --- → wontfix
status-firefox25: --- → fixed
Target Milestone: --- → 1.1 QE5
Comment 30•11 years ago
Reporter
Comment 31•11 years ago
That works fine now in the environment where I spotted the problem. Thanks for the fix!