Last Comment Bug 517102 - App update slow, progressbar works poorly on Windows CE
: App update slow, progressbar works poorly on Windows CE
Status: RESOLVED FIXED
[nv]
:
Product: Toolkit
Classification: Components
Component: Application Update (show other bugs)
: Trunk
: ARM Windows CE
: -- normal (vote)
: ---
Assigned To: Justin Dolske [:Dolske]
:
: Robert Strong [:rstrong] (use needinfo to contact me)
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2009-09-16 17:06 PDT by Justin Dolske [:Dolske]
Modified: 2009-10-09 14:37 PDT (History)
5 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
beta1-fixed


Attachments
Patch A v.1 (Logging cleanup) (1.10 KB, patch)
2009-09-19 20:18 PDT, Justin Dolske [:Dolske]
robert.strong.bugs: review+
Details | Diff | Splinter Review
Patch B v.1 (Progress ignores missing files) (2.42 KB, patch)
2009-09-19 20:21 PDT, Justin Dolske [:Dolske]
robert.strong.bugs: review+
Details | Diff | Splinter Review
Patch C v.1 (Smoother progress bar) (1.75 KB, patch)
2009-09-19 20:25 PDT, Justin Dolske [:Dolske]
robert.strong.bugs: review+
Details | Diff | Splinter Review
Patch D v.1 (bigger copy_file buffer) (1.47 KB, patch)
2009-09-19 20:26 PDT, Justin Dolske [:Dolske]
no flags Details | Diff | Splinter Review
Patch E v.1 (bigger decompress buffers) (4.56 KB, patch)
2009-09-19 20:29 PDT, Justin Dolske [:Dolske]
no flags Details | Diff | Splinter Review
Patch D v.2 (bigger copy_file buffer) (2.75 KB, patch)
2009-10-04 19:49 PDT, Justin Dolske [:Dolske]
robert.strong.bugs: review+
Details | Diff | Splinter Review
Patch E v.2 (bigger decompress buffers) (4.79 KB, patch)
2009-10-04 19:50 PDT, Justin Dolske [:Dolske]
no flags Details | Diff | Splinter Review
Patch E v.3 (bigger decompress buffers) (4.80 KB, patch)
2009-10-04 21:04 PDT, Justin Dolske [:Dolske]
robert.strong.bugs: review+
Details | Diff | Splinter Review

Description Justin Dolske [:Dolske] 2009-09-16 17:06:20 PDT
Just updated from the last nightly to the currently nightly, so should have picked up a small partial update.

The updater took 4-5 minutes to apply the update, and except for the last few seconds the process bar only moved 4 ticks (looking to be around the 10% mark).

I've seen desktop Firefox updates update the progress bar like that before (especially on OS X), so that might just be an existing problem. But it seems surprising that a nightly-to-nightly update would take so long.
Comment 1 Marcia Knous [:marcia - use ni] 2009-09-16 17:08:30 PDT
This is consistent with what I have seen all along in my nightly updates. I think I did mention to rstrong at one point and he mentioned something about the component registration.
Comment 2 Robert Strong [:rstrong] (use needinfo to contact me) 2009-09-16 17:09:07 PDT
It currently increments with any intelligence by number of files and since we've added xul.dll it has done this when updating xul.dll though it isn't that noticeable on systems with reasonable file IO.
Comment 3 Justin Dolske [:Dolske] 2009-09-16 18:23:35 PDT
Yeah, the progress bar is based on the number of items in update.manifest, 99% of which are the entries generated from the removed-files list.

I had also noticed a delay of a few seconds at the beginning of the update, looks like the updater doesn't update progress during the Prepare() stage, to that explains that short pause.

The patching happens first in the Execute() state [according to my log, at least], so this is probably where all the time is spent. A little extra debug logging would confirm that. It would be a good idea to look through the code that executes here and see if it can be optimized at all.

I'm actually surprised that all the access() calls for the removed files happens within a few seconds, I was expecting it to be longer.

We should probably move the process display to be byte-based (with some fudge-factor to account for removed-files) to at least get an accurate progress display, especially now that everything is in big jars.
Comment 4 Justin Dolske [:Dolske] 2009-09-16 18:24:31 PDT
Oh, and it wouldn't hurt to buffer the output logging (and maybe shorten it -- 141K of log isn't free to write!).
Comment 5 Robert Strong [:rstrong] (use needinfo to contact me) 2009-09-16 18:34:10 PDT
access is rather simplified for WinCE in the updater
http://mxr.mozilla.org/mozilla-central/source/toolkit/mozapps/update/src/updater/updater_wince.cpp#124

btw: there is also a delay before displaying the ui.

I suspect tracking bytes will be a PITA and that doing so will mostly only be of benefit for systems with file IO like shitake mushrooms. Not saying it shouldn't be done by any means.

I believe the output logging was done this way initially so there would be a partial log for troubleshooting in case of a crash. Do you know what the actual cost is here?

It definitely can be shortened.
Comment 6 Nick Thomas [:nthomas] 2009-09-16 18:45:21 PDT
If the prepare phase is doing the file existence checks for the removed files, as the log indicates with the skipping comments, then perhaps can they be excluded from counting towards the progress bar of execute.

While all those entries in removed files don't apply to WinCE, and that list has grown unchecked since 1.5 days, how would we go about shortening it ?
Comment 7 Robert Strong [:rstrong] (use needinfo to contact me) 2009-09-16 18:52:27 PDT
(In reply to comment #6)
> If the prepare phase is doing the file existence checks for the removed files,
> as the log indicates with the skipping comments, then perhaps can they be
> excluded from counting towards the progress bar of execute.
> 
> While all those entries in removed files don't apply to WinCE, and that list
> has grown unchecked since 1.5 days, how would we go about shortening it ?
Either via ifdef or a separate file. I prefer the ifdef to avoid having to update multiple files.
Comment 8 Justin Dolske [:Dolske] 2009-09-16 19:03:50 PDT
(In reply to comment #5)

> I believe the output logging was done this way initially so there would be a
> partial log for troubleshooting in case of a crash. Do you know what the actual
> cost is here?

It's probably just a small win. Maybe a second or two, clearly swamped by the actual patching.

(In reply to comment #6)

> While all those entries in removed files don't apply to WinCE, and that list
> has grown unchecked since 1.5 days, how would we go about shortening it ?

If we wanted to be really clever, partial updates could determine what's changed in removed-files between the old and new versions, and only remove those changes. [Might need a few special cases, if any of those files can reappear by other means.]
Comment 9 Justin Dolske [:Dolske] 2009-09-16 19:04:27 PDT
(To be clear, that last bit would happen server-side when generating the mar)
Comment 10 Justin Dolske [:Dolske] 2009-09-19 20:17:19 PDT
So, there were a few things going on.

1) The progress bar didn't show progress in a meaningful way because it counted patching the file and removing a file as equal. A typical partial mar has maybe a dozen or so files to patch, and hundreds of files to remove (basically the contents of browser/installer/removed-files.in). Conveniently, we check to see if the removed files exist before starting the progress bar, so the simple fix is to ignore non-existent files when determining progress.

2) The slow updates are a result of doing short reads followed by short writes. 1K on WinCE, probably 4-8K on other platforms (whatever BUFSIZ is). Increasing the size of the buffers to 256K helped out tremendously. I also tried 1M buffers, but there was little difference (< 5 seconds, in the noise).

3) Probably to an extremely minor degree was the amount of output being written to the update log. It was overly verbose, so I've trimmed a bit of useless output.

I'm attaching a series of patches here to clean up these things.

End result is that updates are 3x faster on Windows CE, and the progress bar moves much more smoothly. The progress bar still freeze in a couple of points for a few seconds (due to extracting and patching our ginormous libxul), but it's not too bad. Fixing this would require more invasive changes to update the progressbar while patching (counting bytes), and that probably isn't worth the effort.

Some data on WinCE:

Updating with a complete mar (10MB) to today's nightly took 3:15 with the old updater, 1:00 with these patches. [3.3x faster]

Updating with a partial mar (345K) to today's nightly took 2:10 with the old updater, 0:45 with these patches. [2.9x faster]

I expect desktop updates would show more modest gains (if any), but the progress bar should at least work better (I've often noticed the same problem on OS X).
Comment 11 Justin Dolske [:Dolske] 2009-09-19 20:18:50 PDT
Created attachment 401679 [details] [diff] [review]
Patch A v.1 (Logging cleanup)

Trivial cleanup of logging output.
Comment 12 Justin Dolske [:Dolske] 2009-09-19 20:21:05 PDT
Created attachment 401680 [details] [diff] [review]
Patch B v.1 (Progress ignores missing files)

This updates the accounting of the progress bar, adding a value so that the cost of the remove action for non-existent files can be 0.
Comment 13 Justin Dolske [:Dolske] 2009-09-19 20:25:08 PDT
Created attachment 401681 [details] [diff] [review]
Patch C v.1 (Smoother progress bar)

Previously, only the Execute phase updated the progress bar. The Prepare phase really needs to as well, because it's extracting all the compressed patches from the mar (could take 20 seconds on WinCE in some cases). The Finish phase is fairly quick, but I made it update anyway. The PROGRESS_xxx_SIZE #defines allow scaling these 3 phases relative to the total progress.
Comment 14 Justin Dolske [:Dolske] 2009-09-19 20:26:48 PDT
Created attachment 401682 [details] [diff] [review]
Patch D v.1 (bigger copy_file buffer)

Increases the size of the buffer used when copying a file (eg, for backups).
Comment 15 Justin Dolske [:Dolske] 2009-09-19 20:29:04 PDT
Created attachment 401683 [details] [diff] [review]
Patch E v.1 (bigger decompress buffers)

Increases the size of the buffers used when decompressing a patch. Also s/BUFSUZ/BUFFER_SIZE/, since we don't use the system size now.
Comment 16 Justin Dolske [:Dolske] 2009-09-19 20:32:57 PDT
(In reply to comment #10)
> Some data on WinCE:

Forgot to mention, just copying the Firefox folder from an SD card to the device takes about 20 seconds. Since we're essentially copying data twice during update (file backup + patched output), this means things should now be in the ballpark of being limited by the filesystem speed.
Comment 17 Justin Dolske [:Dolske] 2009-10-04 19:49:48 PDT
Created attachment 404539 [details] [diff] [review]
Patch D v.2 (bigger copy_file buffer)

Adds fallback to be more robust in low-memory situations.
Comment 18 Justin Dolske [:Dolske] 2009-10-04 19:50:30 PDT
Created attachment 404540 [details] [diff] [review]
Patch E v.2 (bigger decompress buffers)

Ditto.
Comment 19 Justin Dolske [:Dolske] 2009-10-04 21:04:01 PDT
Created attachment 404544 [details] [diff] [review]
Patch E v.3 (bigger decompress buffers)

Refinement of Patch E v.2...

This part of archivereader.cpp had been throwing my brain for a spin:

154     if (strm.avail_out < BUFSIZ) {
155       if (fwrite(outbuf, BUFSIZ - strm.avail_out, 1, fp) != 1) {
156         ret = WRITE_ERROR;
157         break;
158       }
159     }

The thing that had confused me was that strm.avail_in (elsewhere in the function) is the amount of data in the input buffer. But strm.avail_out isn't the amount of data in the output buffer -- it's the available space remaining in the output buffer. Introducing a new "outlen" that's semantically the same as inlen (again, elsewhere in the function) makes this bit of code easier to understand.
Comment 20 Robert Strong [:rstrong] (use needinfo to contact me) 2009-10-05 23:23:15 PDT
Comment on attachment 401679 [details] [diff] [review]
Patch A v.1 (Logging cleanup)

>diff --git a/toolkit/mozapps/update/src/updater/updater.cpp b/toolkit/mozapps/update/src/updater/updater.cpp
>--- a/toolkit/mozapps/update/src/updater/updater.cpp
>+++ b/toolkit/mozapps/update/src/updater/updater.cpp
>@@ -674,7 +674,7 @@ RemoveFile::Prepare()
>   // We expect the file to exist if we are to remove it.
>   int rv = NS_taccess(mDestFile, F_OK);
>   if (rv) {
>-    LOG(("file cannot be removed because it does not exist; skipping\n"));
>+    LOG(("file does not exist; skipping\n"));
With this we'll end up with a bunch of "file does not exist; skipping" in the log file. At the very least it should have the name of the file but I'm tempted to say that this should just be removed.

r=me with that
Comment 21 Robert Strong [:rstrong] (use needinfo to contact me) 2009-10-06 10:59:21 PDT
(In reply to comment #20)
> (From update of attachment 401679 [details] [diff] [review])
> >diff --git a/toolkit/mozapps/update/src/updater/updater.cpp b/toolkit/mozapps/update/src/updater/updater.cpp
> >--- a/toolkit/mozapps/update/src/updater/updater.cpp
> >+++ b/toolkit/mozapps/update/src/updater/updater.cpp
> >@@ -674,7 +674,7 @@ RemoveFile::Prepare()
> >   // We expect the file to exist if we are to remove it.
> >   int rv = NS_taccess(mDestFile, F_OK);
> >   if (rv) {
> >-    LOG(("file cannot be removed because it does not exist; skipping\n"));
> >+    LOG(("file does not exist; skipping\n"));
> With this we'll end up with a bunch of "file does not exist; skipping" in the
> log file. At the very least it should have the name of the file but I'm tempted
> to say that this should just be removed.
> 
> r=me with that
I take this back... please leave it and thanks, the progress bar is MUCH smoother with these changes
Comment 23 Mike Beltzner [:beltzner, not reading bugmail] 2009-10-07 05:42:52 PDT
a192=beltzner for all patches
Comment 25 David Craig 2009-10-09 08:49:11 PDT
Does the progress bar still calculate the time remaining immediately and give gigantic times? Mine always starts out at "4 hours remaining" then it goes down to 30 seconds after 2-3 seconds of progress. I would suggest not displaying time remaining until at least a few seconds have elapsed.
Comment 26 Justin Dolske [:Dolske] 2009-10-09 11:29:57 PDT
The progressbar this bug is about doesn't show any time estimates; you're thinking of the download not the actual processing of the update.

Note You need to log in before you can comment on or make changes to this bug.