Investigate small writes from MBS_ApplyPatch (partial updates cause fragmentation)

RESOLVED FIXED in mozilla2.0b8

Status

()

Toolkit
Application Update
RESOLVED FIXED
7 years ago
6 years ago

People

(Reporter: Dolske, Assigned: rstrong)

Tracking

(Blocks: 1 bug)

Trunk
mozilla2.0b8
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite -
in-litmus -

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 2 obsolete attachments)

(Reporter)

Description

7 years ago
From bug 522375 comment 5:

"I think that the delta from "started starting it" to "first line of main" will
be relatively constant for a given user, compared to sqlite evolution,
fragmentation due to update patching (oh god, I just thought through that a
bit, oh god)"

My first thought was that this wasn't a problem, because we patch in memory. Upon looking at the code again, we actually just read the source file into memory, then trickle out fwrites() as we patch it. :-/

My second thought was that this was still ok, because the current updater is overzealous about copying files when it doesn't need to (to be fixed in bug 529464). Upon looking at the code again, the output from the patching isn't copied, it's patched directly to the source location. :-/

So, yeah, this is a potential problem. In addition to fragmenting the patched files, this could also cause the updater to be slower. I fixed some of these problems in bug 517102 and bug 520491, but missed these writes... Might be that these stdio functions are doing buffering, such that the DTrace output I was looking at didn't show them as small writes? In any case, worth investigating.
For Windows I suspect that any effect from this is at least somewhat mitigated by PreFetch (Windows XP) and SuperFetch (Vista and above). Also, on the release channel we only provide a complete update for a major update which does a file copy. Definitely worth looking into though.
Just took a quick look at my top file fragmentation and for Firefox found the following
# of fragments        file                        file size
302                   urlclassifier3.sqlite       26.2 MB (27,557,888 bytes)
241                   places.sqlite               6.63 MB (6,959,104 bytes)
240                   Cache\_CACHE_003_           30.4 MB (31,930,591 bytes)
136                   xul.dll                     11.7 MB (12,333,056 bytes)
109                   Cache\_CACHE_002_           6.52 MB (6,844,736 bytes)

Other fragmented files (didn't include images, text, JS, etc. files) just from the installation directory that didn't make the top fragmented files list
# of fragments        file                        file size
12                    mozsqlite3.dll              748 KB (765,952 bytes)
4                     components/browser.xpt      375 KB (384,871 bytes)
4                     mozjs.dll                   1.18 MB (1,241,088 bytes)
2                     chrome/toolkit.jar          2.91 MB (3,051,671 bytes)
2                     mozalloc.dll                8.50 KB (8,704 bytes)
2                     plc4.dll                    14.5 KB (14,848 bytes)
2                     plugin-container.exe        9.00 KB (9,216 bytes)
4                     components/browsercomps.dll 128 KB (131,072 bytes)

The update log only contains the last 10 updates and all of the last 10 updates for this installation of Minefield were partial updates.

I don't have diskkeeper scheduled to auto defrag (which will also mitigate this for Windows users) and instead manually defrag every couple of months.

My other Firefox installations which have been updated with complete mar files did not show up in the top list of fragmented files.
After performing a complete defrag after which there were no fragmented files in the install directory and then performing a partial update I got the following

# of fragments        file
96                    xul.dll
14                    mozjs.dll
8                     mozsqlite3.dll
5                     browser.xpt
2                     mozalloc.dll
2                     plugin-container.exe
2                     toolkit.jar
2                     en-US.jar
2                     browser.jar

Updated

7 years ago
Blocks: 572459
I have a Windows patch in progress so stealing.
Assignee: dolske → robert.bugzilla
The following number are from a Win7 system

After 1 complete update
# of fragments        file                        file size
2                     components\browsercomps.dll 132 KB (135,168 bytes)
2                     mozalloc.dll                8.50 KB (8,704 bytes)
2                     mozcpp19.dll                696 KB (712,704 bytes)
2                     mozcrt19.dll                696 KB (712,704 bytes)
2                     mozsqlite3.dll              768 KB (786,432 bytes)
2                     plc4.dll                    14.5 KB (14,848 bytes)
2                     plugin-container.exe        9.50 KB (9,728 bytes)
5                     omni.jar                    3.80 MB (3,992,661 bytes)
5                     xul.dll                     14.7 MB (15,425,536 bytes)

After 3 partial updates:
# of fragments        file                        file size
2                     mozalloc.dll                8.50 KB (8,704 bytes)
2                     plc4.dll                    14.5 KB (14,848 bytes)
2                     plugin-container.exe        9.50 KB (9,728 bytes)
5                     components\browsercomps.dll 132 KB (135,168 bytes)
7                     mozsqlite3.dll              768 KB (786,432 bytes)
9                     omni.jar                    3.80 MB (3,992,661 bytes)
46                    xul.dll                     14.7 MB (15,425,536 bytes)

I am fairly certain that the partial numbers have improved due to bug 466778 making it so we don't write over an existing file.

With the patch I am working on after 2 partial updates using nightly mar files the only files that have been fragmented were en-US.dic and the default theme's preview.png both of which had 2 fragments which I didn't bother reporting the fragments for previously.

I'm going to finish the Mac / Linux partial mar implementations and then look at improving the complete mar update.
Created attachment 492786 [details] [diff] [review]
Windows patch rev1

We might want to take the Windows fix early since it will be a week or two before I can work on the Linux / Mac implementation.
Attachment #492786 - Flags: review?(dolske)
(In reply to comment #5)
>...
> I am fairly certain that the partial numbers have improved due to bug 466778
> making it so we don't write over an existing file.
Verified this is the case. After performing an update after OS startup the fragmentation of xul.dll was rather high as shown in comment #5 and when performing an update after things have settled down xul.dll only had 5 fragments.
Created attachment 493084 [details] [diff] [review]
patch rev2

Handles systems with HAVE_POSIX_FALLOCATE along with Windows / Mac OS X.

Taras, could you take a look at this? iirc you said that the XP_UNIX non HAVE_POSIX_FALLOCATE case wasn't worth doing.
Attachment #492786 - Attachment is obsolete: true
Attachment #493084 - Flags: review?(tglek)
Attachment #492786 - Flags: review?(dolske)

Comment 9

7 years ago
Comment on attachment 493084 [details] [diff] [review]
patch rev2


>+  AutoFile ofile = ensure_open(mDestFile, shouldTruncate ? NS_T("wb+") : NS_T("rb+"), ss.st_mode);

I think if you do  shouldTruncate ? NS_T("wb") // no plus
you don't need 
> 
>+#ifdef XP_WIN
>+  if (!shouldTruncate) {
>+    fseek(ofile, 0, SEEK_SET);
>+  }
>+#endif

other than that looks good.
Attachment #493084 - Flags: review?(tglek) → review+
(In reply to comment #9)
> Comment on attachment 493084 [details] [diff] [review]
> patch rev2
> 
> 
> >+  AutoFile ofile = ensure_open(mDestFile, shouldTruncate ? NS_T("wb+") : NS_T("rb+"), ss.st_mode);
> 
> I think if you do  shouldTruncate ? NS_T("wb") // no plus
> you don't need 
> > 
> >+#ifdef XP_WIN
> >+  if (!shouldTruncate) {
> >+    fseek(ofile, 0, SEEK_SET);
> >+  }
> >+#endif
> 
> other than that looks good.
!shouldTruncate is for the rb+ case. It requires the + to open for both reading and writing so the file isn't truncated on open.
Attachment #493084 - Flags: review?(dolske)
Created attachment 493139 [details] [diff] [review]
patch rev3 - fix typo
Attachment #493084 - Attachment is obsolete: true
Attachment #493139 - Flags: review+
Attachment #493084 - Flags: review?(dolske)
Attachment #493139 - Flags: review?(dolske)
Comment on attachment 493139 [details] [diff] [review]
patch rev3 - fix typo

I sent this patch to try and all platforms look good.
Attachment #493139 - Flags: review?(dtownsend)
Attachment #493139 - Flags: approval2.0?
Attachment #493139 - Flags: review?(dtownsend)
Attachment #493139 - Flags: review?(dolske)
Attachment #493139 - Flags: review+
Attachment #493139 - Flags: approval2.0?
Attachment #493139 - Flags: approval2.0+
fix for partial updates pushed to mozilla-central
http://hg.mozilla.org/mozilla-central/rev/da02b204f064

File fragmentation on partial updates is by far the worst culprit and since it will take significant changes to do this properly for complete updates. I've filed bug 616390 to evaluate whether it is worth doing something similar for complete updates / added files.

The files created / modified by updates are tested but since there is no reasonable way to test fragmentation I don't think fragmentation should be tested... at least at this time.
Status: NEW → RESOLVED
Last Resolved: 7 years ago
Flags: in-testsuite-
Flags: in-litmus-
Resolution: --- → FIXED
Target Milestone: --- → mozilla2.0b8
Windows has APIs that expose the on-disk layout of files, as used by defragmentation programs.  Should be straightforward to get the number of fragments, by my reading:

http://msdn.microsoft.com/en-us/library/aa364572%28v=VS.85%29.aspx
then check the ExtentCount in the returned RETRIEVAL_POINTERS_BUFFER.
(it also exposes related APIs to defragment in-place, if we want to do that instead of copying)
I first worked with those when Win2K was in beta and considered adding tests using these APIs. It would be fairly straightforward to get the fragments and though it could be used by a test I just don't think spending the time on such a test is all that valuable at this point in the release cycle especially since there are idiosyncrasies with the OS and VMs to contend with when it comes to writing out files.

It doesn't perform a copy. The process now is rename the file instead of copy to make a backup of it / get it out of the way (bug 466778) and create the file with the correct size. In my testing if there is available / contiguous space the file is created without any fragmentation. I haven't verified 100% that the following is the case but I believe if the current implementation fails to allocate a contiguous file there is a good chance that files other than ours would need to be defragmented to free up additional contiguous space. So, I don't think the additional complexity would buy us anything while it would add time to perform the update and quite possibly bugs that could break the update process.
To also verify with nightly builds, I started off a couple of days ago after defragmenting my hard drive and measured the fragmentation after applying a partial update.

The two updates previous to this patch had the following fragmentation for a partial update on Win7.

                             Fragmentation
File                   2 days prior   1 day prior
xul.dll                     43           164
mozjs.dll                    0            12
mozsqlite3.dll               9            10
omni.jar                     9             6
components\browsercomps.dll  0             4
mozalloc.dll                 2             2
plc4.dll                     2             2
plugin-container.exe         2             2

The first update that included this patch made all of the files listed above contiguous (e.g. no fragments)... can't get better than that!

One new file (libGLESv2.dll 668 KB) that was added (e.g. not patched like the above files) and was in 2 fragments. The next time this file is patched it will likely be contiguous as well. Other new files were contiguous (libEGL.dll 136 KB and the usual .chk files).
Created attachment 496440 [details] [diff] [review]
1.9.2 patch

HAVE_POSIX_FALLOCATE isn't available on 1.9.2 and since Linux doesn't typically fragment files badly I don't think it is worth backporting the fallocate detection.
Depends on: 466778
Attachment #496440 - Flags: review+
Attachment #496440 - Flags: approval1.9.2.14?
Comment on attachment 496440 [details] [diff] [review]
1.9.2 patch

Approved for 1.9.2.14, a=dveditz for release-drivers
Attachment #496440 - Flags: approval1.9.2.14? → approval1.9.2.14+
Pushed to mozilla-1.9.2
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/0b38c0bb138d
status1.9.2: --- → .14-fixed
Is the only way to test this fix to set up some hardware in a corner and measure fragmentation rates on the hard disk over time?
That wouldn't really test it very well actually. What this does is fix the partial update case and it can be tested somewhat well / easily by getting a nightly from a couple of days ago, performing a full update, measuring the fragmentation for the complete, performing a partial update to the next nightly on the next day, and measuring the fragmentation for the partial. This can be done on both Mac and Windows though it is slightly easier to do on Windows using the contig program. Also, the 3.6.x patch doesn't contain the fix for Linux.

btw: I have verified the fix on both Mac and Linux already.
Backed out of 1.9.2 to investigate if this caused bug 633869
status1.9.2: .14-fixed → ---

Updated

6 years ago
Summary: investigate small writes from MBS_ApplyPatch → Investigate small writes from MBS_ApplyPatch (partial updates cause fragmentation)
You need to log in before you can comment on or make changes to this bug.