Closed Bug 817821 Opened 12 years ago Closed 12 years ago

System updates run erratically, then fail on a slow connection

Categories

(Firefox OS Graveyard :: General, defect, P1)

ARM
Gonk (Firefox OS)
defect

Tracking

(blocking-basecamp:+, firefox19 wontfix, firefox20 wontfix, firefox21 fixed, b2g18 fixed)

RESOLVED FIXED
B2G C4 (2jan on)
blocking-basecamp +
Tracking Status
firefox19 --- wontfix
firefox20 --- wontfix
firefox21 --- fixed
b2g18 --- fixed

People

(Reporter: jhammink, Assigned: marshall)

References

Details

(Whiteboard: interaction, UX-P1)

Attachments

(3 files)

# Environment
Unagi
Nightly build 12/2/2012
Unagi BuildID= 20121202070809
Gecko - b7a71a10d566babce419b4fde0f6be602bc540eb
Gaia  - f48b93eafbeb3f93c9575b8bca733113ee611a52

System updates fail, with update notification going spinner going forever showing 0.0 bytes downloaded after 1 hour.

After flashing 12/2/ build:
I get a notification for a system update.   When I select and start to run it though, system updates eventually fail, with update notification spinner going forever showing "0.0 bytes downloaded" after > 1 hour.

There's a whole lot of this in logcat:
12-03 13:14:50.239: DEBUG/memalloc(109): /dev/pmem: Freeing buffer base:0x4c09c000 size:16384 offset:1687552 fd:154
12-03 13:14:50.239: DEBUG/memalloc(109): /dev/pmem: Freeing buffer base:0x4c0a0000 size:16384 offset:1703936 fd:162
12-03 13:14:50.549: DEBUG/memalloc(109): /dev/pmem: Allocated buffer base:0x4bf00000 size:16384 offset:1245184 fd:138
12-03 13:14:50.979: DEBUG/memalloc(109): /dev/pmem: Freeing buffer base:0x4c030000 size:16384 offset:1245184 fd:138
12-03 13:14:50.979: DEBUG/memalloc(109): /dev/pmem: Freeing buffer base:0x4c02c000 size:16384 offset:1228800 fd:106
12-03 13:14:59.338: INFO/ONCRPC(113): rpc_handle_rpc_call: for Xid: 84c, Prog: 31000000, Vers: fc37ad5c, Proc: 00000012
12-03 13:14:59.338: INFO/ONCRPC(113): rpc_handle_rpc_call: Find Status: 0 Xid: 84c
12-03 13:14:59.338: INFO/ONCRPC(113): oncrpc_proxy_handle_cmd_rpc_call: Dispatching xid: 84c

After a while, I see this:

12-03 14:02:41.694: DEBUG/memalloc(109):  Node 14 -> Start Address : 195584 Size 66560 Free info 1
12-03 14:02:41.694: DEBUG/memalloc(109):  Total Allocated: Total Free: 
12-03 14:02:41.694: DEBUG/memalloc(109): ----------------------------------------------
12-03 14:02:41.694: ERROR/memalloc(109): /dev/pmem: No more pmem available
12-03 14:02:41.694: WARN/memalloc(109): Falling back to ashmem
12-03 14:02:41.724: DEBUG/memalloc(109): ashmem: Allocated buffer base:0x4ca00000 size:2375680 fd:167

and this, after one hour of pretending to download updates, when filtered by AUS:SVC
This, when filtered by AUS:SVC
12-03 12:28:26.633: ERROR/GeckoConsole(109): AUS:SVC Checker: checkForUpdates, force: false
12-03 12:28:26.643: ERROR/GeckoConsole(109): AUS:SVC Checker:getUpdateURL - update URL: http://update.boot2gecko.org/nightly/update.xml
12-03 12:28:26.643: ERROR/GeckoConsole(109): AUS:SVC Checker:checkForUpdates - sending request to: http://update.boot2gecko.org/nightly/update.xml
12-03 12:28:27.364: ERROR/GeckoConsole(109): AUS:SVC Checker:onProgress - 604/604
12-03 12:28:27.364: ERROR/GeckoConsole(109): AUS:SVC Checker:onLoad - request completed downloading document
12-03 12:28:27.374: ERROR/GeckoConsole(109): AUS:SVC Checker:getUpdateURL - update URL: http://update.boot2gecko.org/nightly/update.xml
12-03 12:28:27.384: ERROR/GeckoConsole(109): AUS:SVC Checker:onLoad - number of updates available: 1
12-03 12:28:27.384: ERROR/GeckoConsole(109): AUS:SVC UpdateService:_selectAndInstallUpdate - prompting because silent install is disabled
12-03 12:30:35.359: ERROR/GeckoConsole(109): AUS:SVC Checker: checkForUpdates, force: true
12-03 12:30:35.369: ERROR/GeckoConsole(109): AUS:SVC Checker:getUpdateURL - update URL: http://update.boot2gecko.org/nightly/update.xml?force=1
12-03 12:30:35.389: ERROR/GeckoConsole(109): AUS:SVC Checker:checkForUpdates - sending request to: http://update.boot2gecko.org/nightly/update.xml?force=1
12-03 12:30:42.926: ERROR/GeckoConsole(109): AUS:SVC Checker:onProgress - 604/604
12-03 12:30:42.936: ERROR/GeckoConsole(109): AUS:SVC Checker:onLoad - request completed downloading document
12-03 12:30:42.946: ERROR/GeckoConsole(109): AUS:SVC Checker:getUpdateURL - update URL: http://update.boot2gecko.org/nightly/update.xml?force=1
12-03 12:30:42.946: ERROR/GeckoConsole(109): AUS:SVC Checker:onLoad - number of updates available: 1
12-03 13:01:33.001: ERROR/GeckoConsole(109): AUS:SVC Creating Downloader
12-03 13:01:33.001: ERROR/GeckoConsole(109): AUS:SVC UpdateService:_downloadUpdate
12-03 13:01:33.001: ERROR/GeckoConsole(109): AUS:SVC readStringFromFile - file doesn't exist: /data/local/updates/0/update.status
12-03 13:01:33.001: ERROR/GeckoConsole(109): AUS:SVC readStatusFile - status: null, path: /data/local/updates/0/update.status
12-03 13:01:33.021: ERROR/GeckoConsole(109): AUS:SVC UpdateManager:_loadXMLFileIntoArray: XML file does not exist
12-03 13:01:33.021: ERROR/GeckoConsole(109): AUS:SVC UpdateManager:_loadXMLFileIntoArray: XML file does not exist
12-03 13:01:33.101: ERROR/GeckoConsole(109): AUS:SVC Downloader:downloadUpdate - downloading from http://update.boot2gecko.org/nightly/b2g_update_2012-12-03_074825.mar?build_id=20121203070719&version=18.0 to /data/local/updates/0/update.mar
12-03 13:01:33.431: ERROR/GeckoConsole(109): AUS:SVC Downloader:onStartRequest - original URI spec: http://update.boot2gecko.org/nightly/b2g_update_2012-12-03_074825.mar?build_id=20121203070719&version=18.0, final URI spec: http://update.boot2gecko.org/nightly/b2g_update_2012-12-03_074825.mar?build_id=20121203070719&version=18.0
12-03 13:01:33.501: ERROR/GeckoConsole(109): AUS:SVC Downloader:onStopRequest - original URI spec: http://update.boot2gecko.org/nightly/b2g_update_2012-12-03_074825.mar?build_id=20121203070719&version=18.0, final URI spec: http://update.boot2gecko.org/nightly/b2g_update_2012-12-03_074825.mar?build_id=20121203070719&version=18.0, status: 2152398864
12-03 13:01:33.501: ERROR/GeckoConsole(109): AUS:SVC Downloader:onStopRequest - status: 2152398864, current fail: 0, max fail: 10, retryTimeout: 2000
12-03 13:01:33.501: ERROR/GeckoConsole(109): AUS:SVC Downloader:onStopRequest - offline, register online observer: true
12-03 13:01:33.501: ERROR/GeckoConsole(109): AUS:SVC Downloader:onStopRequest - setting state to: downloading
12-03 13:01:33.551: ERROR/GeckoConsole(109): AUS:SVC Downloader:onStopRequest - Registering online observer
12-03 13:01:33.551: ERROR/GeckoConsole(109): AUS:SVC UpdateService:_registerOnlineObserver - waiting for the network to be online, then forcing another check
Component: Gaia → Gaia::System
I just reflashed back to 20121128 nightly build and was able to successfully update to the 20121203 build. I am using a pretty good wifi connection ATM that is not shared. I can try the build John is using as well.
Reflashed back to 20121202 and also was able to complete a successful system update to 20121203.
jammink and I just chatted on IRC about the fact that not everyone will have ultra fast connections, so users will likely see this if they have slower or erratic connections.
I was finally able to get this to work on a very fast connection only.  Still a concern, as on most very slow networks, system updates of this size will probably not work and fail before completion.   I suggest we leave this open for that reason.
Summary: System updates run erratically, then fail → System updates run erratically, then fail on a slow connection
Marshall do you know what happen?
Component: Gaia::System → General
This was discussed during triage and seems important enough to fix given that we know mobile networks don't always have the most reliable speeds.
Assignee: nobody → marshall
blocking-basecamp: ? → +
From the pasted log, it looks like an error (or "download stopped") message is never sent to Gaia if something happens mid stream. We are registering for an online observer though, at least.

Etienne / Vivien, I'm not sure I would consider this an error (so platform probably shouldn't send the update-error mozChromeEvent). It might be good to have a new event, something like "update-download-stopped" to signal we aren't downloading, but we will continue once connectivity is restored..
(In reply to Marshall Culpepper [:marshall_law] from comment #7)
> Etienne / Vivien, I'm not sure I would consider this an error (so platform
> probably shouldn't send the update-error mozChromeEvent). It might be good
> to have a new event, something like "update-download-stopped" to signal we
> aren't downloading, but we will continue once connectivity is restored..

Will we get an update-download-resumend event too? Or should we just wait for the next progress event?

Concerning the UI hopefully we can come up with a simple-but-smart change to reflect this, like removing the StatusBar downloading icon and replacing the spinning arrow by a |<../.. >| icon.

Josh?
Flags: needinfo?(jcarpenter)
(In reply to Etienne Segonzac (:etienne) from comment #8)
> Will we get an update-download-resumend event too? Or should we just wait
> for the next progress event?
> 

Well, on the platform side we essentially receive 3 events:

onStartRequest (when a download starts -- could be resumed, or not?)
onStopRequest (when a download succeeds, or mid stream)
onProgress (when download progress occurs)

So given that, we could just use "update-download-started" and "update-download-stopped", and keep the existing "update-progress". Though, it might be more consistent to call this "update-download-progress".. thoughts?
(In reply to Marshall Culpepper [:marshall_law] from comment #9)
> (In reply to Etienne Segonzac (:etienne) from comment #8)
> > Will we get an update-download-resumend event too? Or should we just wait
> > for the next progress event?
> > 
> 
> Well, on the platform side we essentially receive 3 events:
> 
> onStartRequest (when a download starts -- could be resumed, or not?)
> onStopRequest (when a download succeeds, or mid stream)
> onProgress (when download progress occurs)
> 
> So given that, we could just use "update-download-started" and
> "update-download-stopped", and keep the existing "update-progress". Though,
> it might be more consistent to call this "update-download-progress"..
> thoughts?

Always in favor of keeping things in sync :)

* update-download-stopped -> we remove the status bar icon and change the notification style to reflect the "stalled" state
* update-download-started -> if in a stalled state, we switch back to an active state and put back the statusbar icon
* update-download-progress -> we update the progress indicator
* update-error (update-download-error?) -> no hope left, we remove the status bar icon and the notification

Sounds about right?
(In reply to Etienne Segonzac (:etienne) from comment #10)
> Always in favor of keeping things in sync :)
> 
> * update-download-stopped -> we remove the status bar icon and change the
> notification style to reflect the "stalled" state
> * update-download-started -> if in a stalled state, we switch back to an
> active state and put back the statusbar icon
> * update-download-progress -> we update the progress indicator
> * update-error (update-download-error?) -> no hope left, we remove the
> status bar icon and the notification
> 
> Sounds about right?

Yeah, that sounds about right. I would argue that the "stalled" icon just be "paused", since we support resuming.

I will add some additional logic on the platform side to determine if the download actually started. This way you don't get a lot of spam update-download-started / update-download-stopped events when the updater retries and fails to connect.
Can we get a priority and target milestone for this>
Priority: -- → P1
I plan on addressing this as a follow up to Bug 812584
The solution proposed seems to make sense. Typically we allow the user to tap on update notifications (eg: to initiate or cancel an update, depending on the notification). With this new "paused" state:

* Notification string should communicate that update has paused.
* Icon should communicate that update has paused.
* Progress bar should indicate that update has paused.
* Tapping on notification should open Cancellation prompt.

Does that sound about right? If so I can contact Peter to generate the new icon, and let Stas know about the new string.
Flags: needinfo?(jcarpenter)
Whiteboard: interaction, UX-P1
Target Milestone: --- → B2G C3 (12dec-1jan)
What's the status here?
Flags: needinfo?(marshall)
7 days later ping.
Target Milestone: B2G C3 (12dec-1jan) → B2G C4 (2jan on)
better update download events - v1
Attachment #698671 - Flags: review?(fabrice)
Flags: needinfo?(marshall)
Comment on attachment 698689 [details] [diff] [review]
update download events for gaia - v1

Review of attachment 698689 [details] [diff] [review]:
-----------------------------------------------------------------

Do we have a bug open to reflect the paused state in the UI?
Attachment #698689 - Flags: review?(etienne) → review+
Attachment #698671 - Attachment description: Bug 817821: Communicate update download events better to Gaia. r=fabrice → part 1: better download events - v1
Attachment #698737 - Flags: review?(dhylands)
Comment on attachment 698671 [details] [diff] [review]
part 1: better download events - v1

Review of attachment 698671 [details] [diff] [review]:
-----------------------------------------------------------------

r=me, but curious about the nit.

::: b2g/components/UpdatePrompt.js
@@ +491,5 @@
>    onProgress: function UP_onProgress(aRequest, aContext, aProgress,
>                                       aProgressMax) {
> +    if (!this._startedSent) {
> +      this.sendChromeEvent("update-download-started", {
> +        total: aProgressMax

Nit: don't we want also |progress: aProgress| for the start event? I know currently updates are big, but for small ones we could have a sizable chunk of data available in the first event.
Attachment #698671 - Flags: review?(fabrice) → review+
Attachment #698737 - Flags: review?(dhylands) → review+
https://hg.mozilla.org/mozilla-central/rev/de59ecb70385
https://hg.mozilla.org/mozilla-central/rev/f35f4d93b78c
Status: NEW → RESOLVED
Closed: 12 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: