Closed Bug 791829 Opened 7 years ago Closed 7 years ago

[OTA update] No way of resuming a stopped/partial update

Categories

(Firefox OS Graveyard :: General, defect, P1, critical)

ARM
Gonk (Firefox OS)
defect

Tracking

(blocking-basecamp:+, firefox16 wontfix, firefox17 wontfix, firefox18 fixed, firefox19 fixed, firefox-esr10 wontfix, firefox-esr17 wontfix)

RESOLVED FIXED
B2G C1 (to 19nov)
blocking-basecamp +
Tracking Status
firefox16 --- wontfix
firefox17 --- wontfix
firefox18 --- fixed
firefox19 --- fixed
firefox-esr10 --- wontfix
firefox-esr17 --- wontfix

People

(Reporter: mbrandt, Assigned: bbondy)

References

(Blocks 1 open bug)

Details

(Keywords: dogfood, Whiteboard: [ota update])

Attachments

(2 files, 3 obsolete files)

If the download is stopped mid-stream there is no way of attempting a resume. Rebooting phone starts download over again

Steps to reproduce:

1. load the 2012-09-16 EU build onto an otoro phone
2. force an OTA update using ttps://gist.github.com/7acb26790bf3dd0bd71e
3. wait ~2 minutes and the phone will begin downloading the update
4. stop the update (turn off data or reboot the phone)

Expected:
Update resumes where it left off

Actual:
The download re-starts at the beginning. The previously partial download is ignored. To restart the download the phone must be rebooted.
OS: All → Gonk
Hardware: All → ARM
Blocks: 791836
Blocks: 791902
jgriffin, the server these updates are being hosted from supports HTTP byte-range requests, right?
(In reply to Chris Jones [:cjones] [:warhammer] from comment #1)
> jgriffin, the server these updates are being hosted from supports HTTP
> byte-range requests, right?

The server is apache, with this config:

<VirtualHost *:80>
  ServerName update.boot2gecko.org
  DocumentRoot /data/update-channels
</VirtualHost>
OK, Apache enables range requests by default.  So looks like a gecko problem.
blocking-basecamp: --- → ?
blocking-basecamp: ? → +
Assignee: nobody → marshall
Blocks: 793066
Blocks: 793067
Blocks: 793070
today, was trying to update from a 9/25 -> 9/26 nightly build.   My update failed midstream twice, with these errors:

09-26 16:50:47.627: I/Gecko(4946): *** AUS:SVC Downloader:onProgress - progress: 11700000/42554639
09-26 16:50:47.627: E/GeckoConsole(4946): AUS:SVC Downloader:onProgress - progress: 11700000/42554639
09-26 16:50:47.637: I/Gecko(4946): *** AUS:SVC Downloader:onStopRequest - original URI spec: http://update.boot2gecko.org/nightly/b2g_update_2012-09-26_075218.mar, final URI spec: http://update.boot2gecko.org/nightly/b2g_update_2012-09-26_075218.mar, status: 2152398862
09-26 16:50:47.637: E/GeckoConsole(4946): AUS:SVC Downloader:onStopRequest - original URI spec: http://update.boot2gecko.org/nightly/b2g_update_2012-09-26_075218.mar, final URI spec: http://update.boot2gecko.org/nightly/b2g_update_2012-09-26_075218.mar, status: 2152398862
09-26 16:50:47.637: I/Gecko(4946): *** AUS:SVC Downloader:onStopRequest - non-verification failure
09-26 16:50:47.637: E/GeckoConsole(4946): AUS:SVC Downloader:onStopRequest - non-verification failure
09-26 16:50:47.637: I/Gecko(4946): *** AUS:SVC getStatusTextFromCode - transfer error: Connection timed out, code: 2152398862
09-26 16:50:47.637: E/GeckoConsole(4946): AUS:SVC getStatusTextFromCode - transfer error: Connection timed out, code: 2152398862
09-26 16:50:47.707: I/Gecko(4946): *** AUS:SVC Downloader:onStopRequest - setting state to: download-failed
09-26 16:50:47.707: E/GeckoConsole(4946): AUS:SVC Downloader:onStopRequest - setting state to: download-failed
09-26 16:50:47.757: I/Gecko(4946): *** AUS:SVC Downloader:onStopRequest - all update patch downloads failed
09-26 16:50:47.757: E/GeckoConsole(4946): AUS:SVC Downloader:onStopRequest - all update patch downloads failed
so i tried running this again, and stopped an update partway, by rebooting the phone.  the log below shows after restarting, AUS tried to resume, but it fails thinking my phone is offline again.  

I was doing all my testing on mozilla Guest wifi, and rebooting did persist the wifi connection.   Thoughts?


10-03 17:26:42.067: I/Gonk(106): Connected to netd
10-03 17:26:42.067: D/Vold(100): Volume sdcard state changing 3 (Checking) -> 4 (Mounted)
10-03 17:26:42.117: I/Gecko(106): *** AUS:SVC gCanCheckForUpdates - able to check for updates
10-03 17:26:42.117: E/GeckoConsole(106): AUS:SVC gCanCheckForUpdates - able to check for updates
10-03 17:26:42.127: I/Gecko(106): *** AUS:SVC gCanApplyUpdates - testing write access /data/local/update.test
10-03 17:26:42.127: E/GeckoConsole(106): AUS:SVC gCanApplyUpdates - testing write access /data/local/update.test
10-03 17:26:42.127: I/Gecko(106): *** AUS:SVC gCanApplyUpdates - able to apply updates
10-03 17:26:42.127: E/GeckoConsole(106): AUS:SVC gCanApplyUpdates - able to apply updates
10-03 17:26:42.127: I/Gecko(106): *** AUS:SVC readStatusFile - status: downloading, path: /data/local/updates/0/update.status
10-03 17:26:42.127: E/GeckoConsole(106): AUS:SVC readStatusFile - status: downloading, path: /data/local/updates/0/update.status
10-03 17:26:42.187: I/Gecko(106): *** AUS:SVC readStatusFile - status: downloading, path: /data/local/updates/0/update.status
10-03 17:26:42.187: E/GeckoConsole(106): AUS:SVC readStatusFile - status: downloading, path: /data/local/updates/0/update.status
10-03 17:26:42.197: I/Gecko(106): *** AUS:SVC UpdateService:_postUpdateProcessing - patch found in downloading state
10-03 17:26:42.197: E/GeckoConsole(106): AUS:SVC UpdateService:_postUpdateProcessing - patch found in downloading state
10-03 17:26:42.197: I/Gecko(106): *** AUS:SVC readStatusFile - status: downloading, path: /data/local/updates/0/update.status
10-03 17:26:42.197: E/GeckoConsole(106): AUS:SVC readStatusFile - status: downloading, path: /data/local/updates/0/update.status
10-03 17:26:42.197: I/Gecko(106): *** AUS:SVC Downloader:_selectPatch - found existing patch with state: downloading
10-03 17:26:42.197: E/GeckoConsole(106): AUS:SVC Downloader:_selectPatch - found existing patch with state: downloading
10-03 17:26:42.197: I/Gecko(106): *** AUS:SVC Downloader:_selectPatch - resuming download
10-03 17:26:42.197: E/GeckoConsole(106): AUS:SVC Downloader:_selectPatch - resuming download
10-03 17:26:42.197: I/Gecko(106): *** AUS:SVC Downloader:downloadUpdate - downloading from http://update.boot2gecko.org/nightly/b2g_update_2012-10-03_084019.mar to /data/local/updates/0/update.mar
10-03 17:26:42.197: E/GeckoConsole(106): AUS:SVC Downloader:downloadUpdate - downloading from http://update.boot2gecko.org/nightly/b2g_update_2012-10-03_084019.mar to /data/local/updates/0/update.mar
10-03 17:26:42.267: D/EventHub(106): No input device configuration file found for device '7k_handset'.
10-03 17:26:42.277: I/EventHub(106): New device: id=1, fd=46, path='/dev/input/event5', name='7k_handset', classes=0x81, configuration='', keyLayout='/system/usr/keylayout/7k_handset.kl', keyCharacterMap='/system/usr/keychars/Generic.kcm', builtinKeyboard=false
10-03 17:26:42.277: D/EventHub(106): No input device configuration file found for device 'compass'.
10-03 17:26:42.337: D/EventHub(106): No input device configuration file found for device 'proximity'.
10-03 17:26:42.377: D/EventHub(106): No input device configuration file found for device 'light'.
10-03 17:26:42.397: D/EventHub(106): No input device configuration file found for device 'roamer2_keypad'.
10-03 17:26:42.407: I/EventHub(106): New device: id=5, fd=49, path='/dev/input/event1', name='roamer2_keypad', classes=0x1, configuration='', keyLayout='/system/usr/keylayout/Generic.kl', keyCharacterMap='/system/usr/keychars/Generic.kcm', builtinKeyboard=false
10-03 17:26:42.407: E/filemap(106): mmap(0,4096) failed: No such device
10-03 17:26:42.417: I/EventHub(106): New device: id=6, fd=50, path='/dev/input/event0', name='atmel-touchscreen', classes=0x15, configuration='/system/usr/idc/atmel-touchscreen.idc', keyLayout='/system/usr/keylayout/Generic.kl', keyCharacterMap='/system/usr/keychars/Generic.kcm', builtinKeyboard=false
10-03 17:26:42.428: I/InputReader(106): Device reconfigured: id=6, name='atmel-touchscreen', surface size is now 320x480, mode is 1
10-03 17:26:42.428: I/InputReader(106): Device added: id=6, name='atmel-touchscreen', sources=0x00001103
10-03 17:26:42.428: I/InputReader(106): Device added: id=5, name='roamer2_keypad', sources=0x00000101
10-03 17:26:42.428: I/InputReader(106): Device added: id=1, name='7k_handset', sources=0x80000101
10-03 17:26:42.498: E/GeckoConsole(106): [JavaScript Warning: "Unknown property '-moz-align-self'.  Declaration dropped." {file: "resource://gre-resources/ua.css" line: 44}]
10-03 17:26:42.768: I/nsVolumeService(106): UpdateVolume: 'sdcard' state Idle
10-03 17:26:42.778: I/nsVolumeService(106): UpdateVolume: 'sdcard' state Checking
10-03 17:26:42.778: I/nsVolumeService(106): UpdateVolume: 'sdcard' state Mounted @ '/mnt/sdcard'
10-03 17:26:43.158: I/FMRadio(106): FMRadio is initialized.
10-03 17:26:43.198: E/profiler(258): Registering start signal
10-03 17:26:43.318: I/GeckoDump(106): Opened socket on 9999
10-03 17:26:43.559: I/Gecko(106): *** AUS:SVC Downloader:onStartRequest - original URI spec: http://update.boot2gecko.org/nightly/b2g_update_2012-10-03_084019.mar, final URI spec: http://update.boot2gecko.org/nightly/b2g_update_2012-10-03_084019.mar
10-03 17:26:43.559: E/GeckoConsole(106): AUS:SVC Downloader:onStartRequest - original URI spec: http://update.boot2gecko.org/nightly/b2g_update_2012-10-03_084019.mar, final URI spec: http://update.boot2gecko.org/nightly/b2g_update_2012-10-03_084019.mar
10-03 17:26:43.609: I/Gecko(106): *** AUS:SVC Downloader:onStopRequest - original URI spec: http://update.boot2gecko.org/nightly/b2g_update_2012-10-03_084019.mar, final URI spec: http://update.boot2gecko.org/nightly/b2g_update_2012-10-03_084019.mar, status: 2152398864
10-03 17:26:43.609: E/GeckoConsole(106): AUS:SVC Downloader:onStopRequest - original URI spec: http://update.boot2gecko.org/nightly/b2g_update_2012-10-03_084019.mar, final URI spec: http://update.boot2gecko.org/nightly/b2g_update_2012-10-03_084019.mar, status: 2152398864
10-03 17:26:43.609: I/Gecko(106): *** AUS:SVC Downloader:onStopRequest - non-verification failure
10-03 17:26:43.609: E/GeckoConsole(106): AUS:SVC Downloader:onStopRequest - non-verification failure
10-03 17:26:43.619: I/Gecko(106): *** AUS:SVC getStatusTextFromCode - transfer error: Network is offline (go online), code: 2152398864
10-03 17:26:43.619: E/GeckoConsole(106): AUS:SVC getStatusTextFromCode - transfer error: Network is offline (go online), code: 2152398864
10-03 17:26:43.729: I/Gecko(106): *** AUS:SVC Downloader:onStopRequest - setting state to: download-failed
10-03 17:26:43.729: E/GeckoConsole(106): AUS:SVC Downloader:onStopRequest - setting state to: download-failed
10-03 17:26:43.779: I/Gecko(106): *** AUS:SVC Downloader:onStopRequest - all update patch downloads failed
10-03 17:26:43.779: E/GeckoConsole(106): AUS:SVC Downloader:onStopRequest - all update patch downloads failed
(In reply to Tony Chung [:tchung] from comment #5)
> so i tried running this again, and stopped an update partway, by rebooting
> the phone.  the log below shows after restarting, AUS tried to resume, but
> it fails thinking my phone is offline again.  
> 
> I was doing all my testing on mozilla Guest wifi, and rebooting did persist
> the wifi connection.   Thoughts?
> 
> 
> 10-03 17:26:42.067: I/Gonk(106): Connected to netd
> 10-03 17:26:42.067: D/Vold(100): Volume sdcard state changing 3 (Checking)
> -> 4 (Mounted)
> 10-03 17:26:42.117: I/Gecko(106): *** AUS:SVC gCanCheckForUpdates - able to
> check for updates
> 10-03 17:26:42.117: E/GeckoConsole(106): AUS:SVC gCanCheckForUpdates - able
> to check for updates
> 10-03 17:26:42.127: I/Gecko(106): *** AUS:SVC gCanApplyUpdates - testing
> write access /data/local/update.test
> 10-03 17:26:42.127: E/GeckoConsole(106): AUS:SVC gCanApplyUpdates - testing
> write access /data/local/update.test
> 10-03 17:26:42.127: I/Gecko(106): *** AUS:SVC gCanApplyUpdates - able to
> apply updates
> 10-03 17:26:42.127: E/GeckoConsole(106): AUS:SVC gCanApplyUpdates - able to
> apply updates
> 10-03 17:26:42.127: I/Gecko(106): *** AUS:SVC readStatusFile - status:
> downloading, path: /data/local/updates/0/update.status
> 10-03 17:26:42.127: E/GeckoConsole(106): AUS:SVC readStatusFile - status:
> downloading, path: /data/local/updates/0/update.status
> 10-03 17:26:42.187: I/Gecko(106): *** AUS:SVC readStatusFile - status:
> downloading, path: /data/local/updates/0/update.status
> 10-03 17:26:42.187: E/GeckoConsole(106): AUS:SVC readStatusFile - status:
> downloading, path: /data/local/updates/0/update.status
> 10-03 17:26:42.197: I/Gecko(106): *** AUS:SVC
> UpdateService:_postUpdateProcessing - patch found in downloading state
> 10-03 17:26:42.197: E/GeckoConsole(106): AUS:SVC
> UpdateService:_postUpdateProcessing - patch found in downloading state
> 10-03 17:26:42.197: I/Gecko(106): *** AUS:SVC readStatusFile - status:
> downloading, path: /data/local/updates/0/update.status
> 10-03 17:26:42.197: E/GeckoConsole(106): AUS:SVC readStatusFile - status:
> downloading, path: /data/local/updates/0/update.status
> 10-03 17:26:42.197: I/Gecko(106): *** AUS:SVC Downloader:_selectPatch -
> found existing patch with state: downloading
> 10-03 17:26:42.197: E/GeckoConsole(106): AUS:SVC Downloader:_selectPatch -
> found existing patch with state: downloading
> 10-03 17:26:42.197: I/Gecko(106): *** AUS:SVC Downloader:_selectPatch -
> resuming download
> 10-03 17:26:42.197: E/GeckoConsole(106): AUS:SVC Downloader:_selectPatch -
> resuming download
> 10-03 17:26:42.197: I/Gecko(106): *** AUS:SVC Downloader:downloadUpdate -
> downloading from
> http://update.boot2gecko.org/nightly/b2g_update_2012-10-03_084019.mar to
> /data/local/updates/0/update.mar
> 10-03 17:26:42.197: E/GeckoConsole(106): AUS:SVC Downloader:downloadUpdate -
> downloading from
> http://update.boot2gecko.org/nightly/b2g_update_2012-10-03_084019.mar to
> /data/local/updates/0/update.mar
> 10-03 17:26:42.267: D/EventHub(106): No input device configuration file
> found for device '7k_handset'.
> 10-03 17:26:42.277: I/EventHub(106): New device: id=1, fd=46,
> path='/dev/input/event5', name='7k_handset', classes=0x81, configuration='',
> keyLayout='/system/usr/keylayout/7k_handset.kl',
> keyCharacterMap='/system/usr/keychars/Generic.kcm', builtinKeyboard=false
> 10-03 17:26:42.277: D/EventHub(106): No input device configuration file
> found for device 'compass'.
> 10-03 17:26:42.337: D/EventHub(106): No input device configuration file
> found for device 'proximity'.
> 10-03 17:26:42.377: D/EventHub(106): No input device configuration file
> found for device 'light'.
> 10-03 17:26:42.397: D/EventHub(106): No input device configuration file
> found for device 'roamer2_keypad'.
> 10-03 17:26:42.407: I/EventHub(106): New device: id=5, fd=49,
> path='/dev/input/event1', name='roamer2_keypad', classes=0x1,
> configuration='', keyLayout='/system/usr/keylayout/Generic.kl',
> keyCharacterMap='/system/usr/keychars/Generic.kcm', builtinKeyboard=false
> 10-03 17:26:42.407: E/filemap(106): mmap(0,4096) failed: No such device
> 10-03 17:26:42.417: I/EventHub(106): New device: id=6, fd=50,
> path='/dev/input/event0', name='atmel-touchscreen', classes=0x15,
> configuration='/system/usr/idc/atmel-touchscreen.idc',
> keyLayout='/system/usr/keylayout/Generic.kl',
> keyCharacterMap='/system/usr/keychars/Generic.kcm', builtinKeyboard=false
> 10-03 17:26:42.428: I/InputReader(106): Device reconfigured: id=6,
> name='atmel-touchscreen', surface size is now 320x480, mode is 1
> 10-03 17:26:42.428: I/InputReader(106): Device added: id=6,
> name='atmel-touchscreen', sources=0x00001103
> 10-03 17:26:42.428: I/InputReader(106): Device added: id=5,
> name='roamer2_keypad', sources=0x00000101
> 10-03 17:26:42.428: I/InputReader(106): Device added: id=1,
> name='7k_handset', sources=0x80000101
> 10-03 17:26:42.498: E/GeckoConsole(106): [JavaScript Warning: "Unknown
> property '-moz-align-self'.  Declaration dropped." {file:
> "resource://gre-resources/ua.css" line: 44}]
> 10-03 17:26:42.768: I/nsVolumeService(106): UpdateVolume: 'sdcard' state Idle
> 10-03 17:26:42.778: I/nsVolumeService(106): UpdateVolume: 'sdcard' state
> Checking
> 10-03 17:26:42.778: I/nsVolumeService(106): UpdateVolume: 'sdcard' state
> Mounted @ '/mnt/sdcard'
> 10-03 17:26:43.158: I/FMRadio(106): FMRadio is initialized.
> 10-03 17:26:43.198: E/profiler(258): Registering start signal
> 10-03 17:26:43.318: I/GeckoDump(106): Opened socket on 9999
> 10-03 17:26:43.559: I/Gecko(106): *** AUS:SVC Downloader:onStartRequest -
> original URI spec:
> http://update.boot2gecko.org/nightly/b2g_update_2012-10-03_084019.mar, final
> URI spec:
> http://update.boot2gecko.org/nightly/b2g_update_2012-10-03_084019.mar
> 10-03 17:26:43.559: E/GeckoConsole(106): AUS:SVC Downloader:onStartRequest -
> original URI spec:
> http://update.boot2gecko.org/nightly/b2g_update_2012-10-03_084019.mar, final
> URI spec:
> http://update.boot2gecko.org/nightly/b2g_update_2012-10-03_084019.mar
> 10-03 17:26:43.609: I/Gecko(106): *** AUS:SVC Downloader:onStopRequest -
> original URI spec:
> http://update.boot2gecko.org/nightly/b2g_update_2012-10-03_084019.mar, final
> URI spec:
> http://update.boot2gecko.org/nightly/b2g_update_2012-10-03_084019.mar,
> status: 2152398864
> 10-03 17:26:43.609: E/GeckoConsole(106): AUS:SVC Downloader:onStopRequest -
> original URI spec:
> http://update.boot2gecko.org/nightly/b2g_update_2012-10-03_084019.mar, final
> URI spec:
> http://update.boot2gecko.org/nightly/b2g_update_2012-10-03_084019.mar,
> status: 2152398864
> 10-03 17:26:43.609: I/Gecko(106): *** AUS:SVC Downloader:onStopRequest -
> non-verification failure
> 10-03 17:26:43.609: E/GeckoConsole(106): AUS:SVC Downloader:onStopRequest -
> non-verification failure
> 10-03 17:26:43.619: I/Gecko(106): *** AUS:SVC getStatusTextFromCode -
> transfer error: Network is offline (go online), code: 2152398864
> 10-03 17:26:43.619: E/GeckoConsole(106): AUS:SVC getStatusTextFromCode -
> transfer error: Network is offline (go online), code: 2152398864
> 10-03 17:26:43.729: I/Gecko(106): *** AUS:SVC Downloader:onStopRequest -
> setting state to: download-failed
> 10-03 17:26:43.729: E/GeckoConsole(106): AUS:SVC Downloader:onStopRequest -
> setting state to: download-failed
> 10-03 17:26:43.779: I/Gecko(106): *** AUS:SVC Downloader:onStopRequest - all
> update patch downloads failed
> 10-03 17:26:43.779: E/GeckoConsole(106): AUS:SVC Downloader:onStopRequest -
> all update patch downloads failed

Dietrich pointed out that it was probably the force update script, with a race condition checking with the wifi still starting up.  i changed the scan time to 30s, and it seemed to have fixed the error.

We still need to fix the original issue of not resuming updates.
I've verified that the update server does support pause / resume with desktop Firefox and the latest update MAR. Looking into what might be going on in platform now..
So this looks like 2 potential problems:

1. The updater code doesn't let the update resume if the error is NS_ERROR_NET_TIMEOUT (which is what your first error log shows). It seems we could let this error fall through like others are (i.e. NS_ERROR_ABORT) and get the result we want.

2. We should also make sure we schedule the download again if the network is offline when we try to resume, similar to the checking fix in Bug 794211.
Severity: normal → critical
Priority: -- → P1
as per an irc message, I'll take this bug.
Assignee: marshall → netzen
Blocks: 800804
Whiteboard: [ota update] → [ota update][dogfooding-blocker]
Can the update be resumed manually by restarting it. If not, we shouldn't hold dogfooding over this.
(In reply to Andreas Gal :gal from comment #10)
> Can the update be resumed manually by restarting it. If not, we shouldn't
> hold dogfooding over this.

Yes, check for updates can find the update.xml, and apply the full mar update if there's a download interruption.

Bug 801998 was also filed during this testing.

Given that, i agree this is a non-dogfood blocker.
Thanks for confirming this, removing the tag.
Whiteboard: [ota update][dogfooding-blocker] → [ota update]
I wouldn't hold dogfooding for this, but it's something we need to fix asap.  Restarting our, initially rather large, updates from scratch on every network interruption over EDGE is going to greatly reduce the number of successful updates.
Keywords: dogfood
Hrm.  that kinda would suck if we can't pause/resume/stop/restart esp for our target audience that doesn't necessarily use wifi...  If an update fails, to update properly, is there a way to restart a new download in the event it might be some sort of corruption due to download?  (should probably file a new bug if that isn't the case)....
The specific bug here, which is already a basecamp blocker, is that when a download fails because of a socket timeout, we don't attempt to resume the download from the point at which it failed.  Instead, we start over; that's not what we want to do.  That looks pretty easy to fix.

If the download fails because the file didn't pass the size check, hash check, or signature check, then the existing behavior is to restart the download from scratch.  That's what we want to do.
Attached patch Patch v0.9 (obsolete) — Splinter Review
Here's what I think we need to do for this bug. I haven't tested this at all yet, but I'll do some testing on desktop and try to get my device setup finished to test there. (Currently it fails with system image is too large to flash).
Attached patch Patch v0.95 (obsolete) — Splinter Review
still untested, but minor changes as per discussion with Marshall on IRC.
Attachment #672338 - Attachment is obsolete: true
This occurs also when you start the download on data network (in my case, EDGE) and then connect to wifi - the download progress freezes up.   Not sure if this is a separate bug or not, just thought it worth noting.
Do you have a log for the use case in Comment 18?
Switching default routes results in the socket opened through the previous interface timing out.  I've done this locally while testing.
Attached patch Patch v2. (obsolete) — Splinter Review
Lots of testing and fixes to the previous patch.  Works solid now even with multiple disconnects.

Tested disconnecting the internet many times while downloading both a partial MAR and a full MAR. 

Also tested entering offline mode (press the alt key to get a File menu, then click on Work Offline) and exiting offline mode many times while downloading both a partial and a full.

I'll push to oak after final reviews to do extra final testing.
Attachment #674237 - Flags: review?(robert.bugzilla)
Attachment #672438 - Attachment is obsolete: true
Comment on attachment 674237 [details] [diff] [review]
Patch v2.

Is it possible to add tests for this? Perhaps with a mock nsIIncrementalDownload?
I was mostly worried that anything I'd write would be susceptible to intermittent failures, but perhaps as you said a mock nsIIncrementalDownload.
You might be able to get away with using the local server used by the existing tests. It is pretty flexible.
Comment on attachment 674237 [details] [diff] [review]
Patch v2.

>diff --git a/toolkit/mozapps/update/nsUpdateService.js b/toolkit/mozapps/update/nsUpdateService.js
>--- a/toolkit/mozapps/update/nsUpdateService.js
>+++ b/toolkit/mozapps/update/nsUpdateService.js
>@@ -1534,16 +1544,26 @@ UpdateService.prototype = {
>   _incompatAddonsCount: 0,
> 
>   /**
>    * Whether or not the service registered the "online" observer.
>    */
>   _registeredOnlineObserver: false,
> 
>   /**
>+   * The current number of consecutive socket errors
>+   */
>+  _consecutiveSocketErrors: 0,
>+
>+  /**
>+   * A timer used to retry socket errors
>+   */
>+  _retryTimer: null,
hmmm... should cancel this timer on xpcom-shutdown if it is still alive

>+
>+  /**
>    * Handle Observer Service notifications
>    * @param   subject
>    *          The subject of the notification
>    * @param   topic
>    *          The notification name
>    * @param   data
>    *          Additional data
>    */
>@@ -1867,17 +1887,17 @@ UpdateService.prototype = {
> 
>     Services.obs.removeObserver(this, "network:offline-status-changed");
>     this._registeredOnlineObserver = false;
> 
>     LOG("UpdateService:_offlineStatusChanged - network is online, forcing " +
>         "another background check");
> 
>     // the background checker is contained in notify
>-    this.notify(null);
>+    this._attemptResume();
>   },
> 
>   // nsIUpdateCheckListener
>   onProgress: function AUS_onProgress(request, position, totalSize) {
>   },
> 
>   onCheckComplete: function AUS_onCheckComplete(request, updates, updateCount) {
>     this._selectAndInstallUpdate(updates);
>@@ -1912,29 +1932,58 @@ UpdateService.prototype = {
>...
>   /**
>    * Notified when a timer fires
>    * @param   timer
>    *          The timer that fired
>    */
>   notify: function AUS_notify(timer) {
>     // If a download is in progress or the patch has been staged do nothing.
>-    if (this.isDownloading || this._downloader && this._downloader.patchIsStaged)
>-      return;
>+    if (this.isDownloading ||
>+        this._downloader && this._downloader.patchIsStaged) {
>+      return; 
>+    }
> 
>     this.backgroundChecker.checkForUpdates(this, false);
>   },
> 
>+
nit: remove the added newline

>   /**
>    * Determine the update from the specified updates that should be offered.
>    * If both valid major and minor updates are available the minor update will
>    * be offered.
>    * @param   updates
>    *          An array of available nsIUpdate items
>    * @return  The nsIUpdate to offer.
>    */
>@@ -2981,18 +3031,20 @@ Checker.prototype = {
> 
> /**
>  * Manages the download of updates
>  * @param   background
>  *          Whether or not this downloader is operating in background
>  *          update mode.
document new param

>  * @constructor
>  */
>-function Downloader(background) {
>+function Downloader(background, updateService) {
>+  LOG("Creating Downloader");
>   this.background = background;
>+  this.updateService = updateService;
> }
> Downloader.prototype = {
>   /**
>    * The nsIUpdatePatch that we are downloading
>    */
>   _patch: null,
> 
>   /**
>@@ -3032,24 +3085,29 @@ Downloader.prototype = {
>            readState == STATE_APPLIED || readState == STATE_APPLIED_SVC;
>   },
> 
>   /**
>    * Verify the downloaded file.  We assume that the download is complete at
>    * this point.
>    */
>   _verifyDownload: function Downloader__verifyDownload() {
>+    LOG("Downloader:_verifyDownload called");
>     if (!this._request)
>       return false;
> 
>     var destination = this._request.destination;
> 
>     // Ensure that the file size matches the expected file size.
>-    if (destination.fileSize != this._patch.size)
>+    if (destination.fileSize != this._patch.size) {
>+      LOG("Downloader:_verifyDownload downloaded size != expected size.");
>       return false;
>+    } else {
>+      LOG("Downloader:_verifyDownload downloaded size == expected size.");
>+    }
Remove the else block since it returns early in the if block

>
>     var fileStream = Cc["@mozilla.org/network/file-input-stream;1"].
>                      createInstance(Ci.nsIFileInputStream);
>     fileStream.init(destination, FileUtils.MODE_RDONLY, FileUtils.PERMS_FILE, 0);
> 
>     try {
>       var hash = Cc["@mozilla.org/security/hash;1"].
>                  createInstance(Ci.nsICryptoHash);
>@@ -3066,17 +3124,23 @@ Downloader.prototype = {
>     } catch (e) {
>       LOG("Downloader:_verifyDownload - failed to compute hash of the " +
>           "downloaded update archive");
>       digest = "";
>     }
> 
>     fileStream.close();
> 
>-    return digest == this._patch.hashValue.toLowerCase();
>+    if (digest == this._patch.hashValue.toLowerCase()) {
>+      LOG("Downloader:_verifyDownload hashes match.");
>+      return true;
>+    } else {
>+      LOG("Downloader:_verifyDownload hashes do not match. ");
>+      return false;
>+    }
Remove the else block since it returns early in the if block

>   },
> 
>   /**
>    * Select the patch to use given the current state of updateDir and the given
>    * set of update patches.
>    * @param   update
>    *          A nsIUpdate object to select a patch from
>    * @param   updateDir
>...
>@@ -3344,17 +3410,29 @@ Downloader.prototype = {
>     if (request instanceof Ci.nsIIncrementalDownload)
>       LOG("Downloader:onStopRequest - original URI spec: " + request.URI.spec +
>           ", final URI spec: " + request.finalURI.spec + ", status: " + status);
> 
>     // XXX ehsan shouldShowPrompt should always be false here.
>     // But what happens when there is already a UI showing?
>     var state = this._patch.state;
>     var shouldShowPrompt = false;
>+    var shouldRegisterOnlineObserver = false;
>+    var shouldRetrySoon = false;
>     var deleteActiveUpdate = false;
>+    var retryTimeout = getPref("getIntPref", 
>+                               PREF_APP_UPDATE_RETRY_TIMEOUT,
>+                               DEFAULT_UPDATE_RETRY_TIMEOUT);
>+    var maxFail = getPref("getIntPref", 
>+                          PREF_APP_UPDATE_SOCKET_ERRORS,
>+                          DEFAULT_SOCKET_MAX_ERRORS);
nit: two lines for each of the above since it is under 80
    var retryTimeout = getPref("getIntPref", PREF_APP_UPDATE_RETRY_TIMEOUT,
                               DEFAULT_UPDATE_RETRY_TIMEOUT);
    var maxFail = getPref("getIntPref", PREF_APP_UPDATE_SOCKET_ERRORS,
                          DEFAULT_SOCKET_MAX_ERRORS);

>+    LOG("Downloader:onStopRequest - status: " + status + ", " +
>+        "current fail: " + this.updateService._consecutiveSocketErrors + ", " +
>+        "max fail: " + maxFail + ", " +
>+        "retryTimeout: " + retryTimeout);
nit: since it is under 80
"max fail: " + maxFail + ", retryTimeout: " + retryTimeout);

>     if (Components.isSuccessCode(status)) {
>       if (this._verifyDownload()) {
>         state = shouldUseService() ? STATE_PENDING_SVC : STATE_PENDING
> 
>         // We only need to explicitly show the prompt if this is a background
>         // download, since otherwise some kind of UI is already visible and
>         // that UI will notify.
>         if (this.background)
>@@ -3379,20 +3457,37 @@ Downloader.prototype = {
>         this._update.statusText = message;
> 
>         if (this._update.isCompleteUpdate || this._update.patchCount != 2)
>           deleteActiveUpdate = true;
> 
>         // Destroy the updates directory, since we're done with it.
>         cleanUpUpdatesDir();
>       }
>-    }
>-    else if (status != Cr.NS_BINDING_ABORTED &&
>-             status != Cr.NS_ERROR_ABORT &&
>-             status != Cr.NS_ERROR_DOCUMENT_NOT_CACHED) {
>+    } else if (status == Cr.NS_ERROR_OFFLINE) {
>+      // Register an online observer to try again.
>+      // The online observer will continue the incremental download by
>+      // calling downloadUpdate on the active update which continues
>+      // downloading the file from where it was.
>+      LOG("Downloader:onStopRequest - offline, register online observer: true");
>+      shouldRegisterOnlineObserver = true;
>+      deleteActiveUpdate = false;
>+    // Each of NS_ERROR_NET_TIMEOUT, ERROR_CONNECTION_REFUSED, and
>+    // NS_ERROR_NET_RESET are intermittently returned when disconnecting
>+    // the internet while testing updates.
I'd like to know more... more detail please.

>+    } else if ((status == Cr.NS_ERROR_NET_TIMEOUT ||
>+                status == Cr.NS_ERROR_CONNECTION_REFUSED ||
>+                status == Cr.NS_ERROR_NET_RESET) &&
>+               this.updateService._consecutiveSocketErrors < maxFail) {
>+      LOG("Downloader:onStopRequest - socket error, shouldRetrySoon: true");
>+      shouldRetrySoon = true;
>+      deleteActiveUpdate = false;
>+    } else if (status != Cr.NS_BINDING_ABORTED &&
>+               status != Cr.NS_ERROR_ABORT &&
>+               status != Cr.NS_ERROR_DOCUMENT_NOT_CACHED) {
>       LOG("Downloader:onStopRequest - non-verification failure");
>       // Some sort of other failure, log this in the |statusText| property
>       state = STATE_DOWNLOAD_FAILED;
> 
>       // XXXben - if |request| (The Incremental Download) provided a means
>       // for accessing the http channel we could do more here.
> 
>       this._update.statusText = getStatusTextFromCode(status,
>@@ -3412,20 +3507,24 @@ Downloader.prototype = {
>       um.activeUpdate = null;
>     }
>     else {
>       if (um.activeUpdate)
>         um.activeUpdate.state = state;
>     }
>     um.saveUpdates();
> 
>-    var listeners = this._listeners.concat();
>-    var listenerCount = listeners.length;
>-    for (var i = 0; i < listenerCount; ++i)
>-      listeners[i].onStopRequest(request, context, status);
>+    // Only notify listeners about the stopped state if we
>+    // aren't handling an internal retry.
>+    if (!shouldRetrySoon && !shouldRegisterOnlineObserver) {
>+      var listeners = this._listeners.concat();
>+      var listenerCount = listeners.length;
>+      for (var i = 0; i < listenerCount; ++i)
>+        listeners[i].onStopRequest(request, context, status);
nit: brackets are the new black so please them with new or changed code
nit: please use let for these declarations in the block to get rid of listeners when exiting the if block.

>+    }
> 
>     this._request = null;
> 
>     if (state == STATE_DOWNLOAD_FAILED) {
>       var allFailed = true;
>       // Check if there is a complete update patch that can be downloaded.
>       if (!this._update.isCompleteUpdate && this._update.patchCount == 2) {
>         LOG("Downloader:onStopRequest - verification of patch failed, " +

r=me with the above but I would like tests for this before landing.
Attachment #674237 - Flags: review?(robert.bugzilla) → review+
Implemented review comments.

Tests still pending, this should not land until tests are written.
Carrying forward r+.
Attachment #674237 - Attachment is obsolete: true
Attachment #674814 - Flags: review+
Attached patch Patch v1 - TestsSplinter Review
So this just uses a Mock incremental downloader to verify that connection interrupts work correctly in updater code.
The implementation of the mock incremental downloader is very simple, it simply copies the file to the destination location.  

There are 3 types of connection retry errors:
NS_ERROR_NET_TIMEOUT 
NS_ERROR_CONNECTION_REFUSED
NS_ERROR_NET_RESET

We loop through each of these via gIncrementalDownloadErrorType=0,1,2, and then when 3 is hit we return NS_OK indicating that the download was successful.  When the nsUpdateService gets any of those connection errors it re-creates a new mock incremental downloader.

nsUpdateService by default will have at most 10 of these retries. In one of the new tests we set this pref to allow at most 2, and then we verify that the actual connection error is returned in that case.

There's also a test for offline mode, and the test will fire an offline status changed to online observer event.
Attachment #678616 - Flags: review?(ehsan)
Attachment #674814 - Attachment description: Patch v3 → Patch v3 - Implementation
Comment on attachment 678616 [details] [diff] [review]
Patch v1 - Tests

Review of attachment 678616 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good.  I would put some of the description in comment 29 as comments in the code.
Attachment #678616 - Flags: review?(ehsan) → review+
Thanks, will do.
FWIW, i hit this also when changing from Wifi to Carrier network, in the middle of a OTA update.   This is on the 11-8-2012 daily unagi build.

11-08 15:43:42.985: E/GeckoConsole(108): AUS:SVC Downloader:onProgress - progress: 14769/49709211
11-08 15:43:44.266: I/IdleService(108): next timeout 1000 msec from now
11-08 15:43:44.266: I/IdleService(108): SetTimerExpiryIfBefore: next timeout 1000 msec from now
11-08 15:43:44.266: I/IdleService(108): reset timer expiry to 1009 msec from now
11-08 15:43:44.266: I/IdleService(108): Reset idle timeout: tell observer 47cf9e10 user is back
11-08 15:43:44.487: I/wpa_supplicant(393): wlan0: CTRL-EVENT-DISCONNECTED bssid=00:0f:61:f7:21:30 reason=3
11-08 15:43:44.497: I/Gecko(108): *** AUS:SVC Downloader:onProgress - progress: 0/49709211
11-08 15:43:44.497: E/GeckoConsole(108): AUS:SVC Downloader:onProgress - progress: 0/49709211
11-08 15:43:44.507: I/Gecko(108): *** AUS:SVC Downloader:onStopRequest - original URI spec: http://update.boot2gecko.org/nightly/b2g_update_2012-11-08_120747.mar?build_id=20121108112613&version=18.0a2, final URI spec: http://update.boot2gecko.org/nightly/b2g_update_2012-11-08_120747.mar?build_id=20121108112613&version=18.0a2, status: 2152398862
11-08 15:43:44.507: E/GeckoConsole(108): AUS:SVC Downloader:onStopRequest - original URI spec: http://update.boot2gecko.org/nightly/b2g_update_2012-11-08_120747.mar?build_id=20121108112613&version=18.0a2, final URI spec: http://update.boot2gecko.org/nightly/b2g_update_2012-11-08_120747.mar?build_id=20121108112613&version=18.0a2, status: 2152398862
11-08 15:43:44.507: I/Gecko(108): *** AUS:SVC Downloader:onStopRequest - non-verification failure
11-08 15:43:44.507: E/GeckoConsole(108): AUS:SVC Downloader:onStopRequest - non-verification failure
11-08 15:43:44.507: I/Gecko(108): *** AUS:SVC getStatusTextFromCode - transfer error: Connection timed out, code: 2152398862
11-08 15:43:44.507: E/GeckoConsole(108): AUS:SVC getStatusTextFromCode - transfer error: Connection timed out, code: 2152398862
11-08 15:43:44.517: I/Gecko(108): *** AUS:SVC Downloader:onStopRequest - setting state to: download-failed
11-08 15:43:44.517: E/GeckoConsole(108): AUS:SVC Downloader:onStopRequest - setting state to: download-failed
11-08 15:43:44.597: I/wpa_supplicant(393): wlan0: CTRL-EVENT-TERMINATING 
11-08 15:43:44.647: I/Gecko(108): *** AUS:SVC Downloader:onStopRequest - all update patch downloads failed
11-08 15:43:44.647: E/GeckoConsole(108): AUS:SVC Downloader:onStopRequest - all update patch downloads failed
Yup that's just error NS_ERROR_NET_TIMEOUT and it is already covered by the patch.  This is ready to land but I'm away until Monday so I don't want to land it before I go away in case there are any issues.
This is unfinished feature work that needs to be completed in the C1 timeframe.
Target Milestone: --- → B2G C1 (to 19nov)
https://hg.mozilla.org/mozilla-central/rev/5f17f8908bd4
https://hg.mozilla.org/mozilla-central/rev/2886c065d702
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
When this is ready to land on aurora, please post updated patches. These do not apply cleanly.
I requested that Bug 801855 lands first on Aurora.  
If that gets approval, then we're good to go with Bug 801855 patches first, and this bug's patches second.
Blocks: 808297
Blocks: 812545
Brian, I believe this is the only test that exercised nsIIncrementalDownload and with the mock implementation I believe there aren't any tests that do. Could you file a bug for nsIIncrementalDownload tests in the networking component? Thanks.
It looks like there is at least one test, but I'm not sure if it gives full coverage.  See here:
http://dxr.mozilla.org/mozilla-central/source/netwerk/test/TestIncrementalDownload.cpp

Did you want me to post anyway?
Ya it seems to just be a manually run test. I'll post to add an xpcshell test or to use that. Probably the former.
Posted bug 886061
You need to log in before you can comment on or make changes to this bug.