Closed Bug 391958 Opened 17 years ago Closed 16 years ago

Improve partial patch generation performance

Categories

(Release Engineering :: General, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mtschrep, Assigned: coop)

References

()

Details

Attachments

(4 files, 3 obsolete files)

When I saw the computer time required to complete an entire build process I got a little upset so I decided to look into it a bit.  The patch generation time looked to me to be the largest single entry so I focused there.

Running patcher2.pl on my local system it became obvious that partial patch generation was the single largest operation.  

Our current process looks roughly like this (leaving out lots of edge cases):

1) for each pair of builds (e.g. 2004-mac-af, 2005-mac-af)
2) unpack from full mar (2004-mac-af.mar) into a temp dir 
	mar -x 2004-mac-af.mar
	bunzip each entry in tmp from dir
3) unpack to mar (2005-mac-af.mar) into a temp dir
	mar -x 2005-mac-af.mar
	bunzip each entry in tmp to dir
4) Compare the from/to dirs:
	For files in both:
		Calculate patch file using mbsdiff
		bzip patch file
		bzip original file
		if patch file is smaller
			add patch file to manifest
		else
			add full file to manifest
	For files only in to:
		bzip file and add to manfiest
	For files only in from:
		add removed instruction to manifest
	Look for removed-files file in to dir
		if exists add each entry to manifest
5) bzip update.manifest
6) build mar of all files from step 4 + update.manifest

the process with make_incremental_updates.py looks like this:

1) for each pair of builds (e.g. 2004-mac-af, 2005-mac-af)
2) unpack from full mar (2004-mac-af.mar) into a temp dir 
	mar -x 2004-mac-af.mar
3) unpack to mar (2005-mac-af.mar) into a temp dir
	mar -x 2005-mac-af.mar
4) Compare the from/to dirs:
	For files in both:
		compute tuple (from.digest,to.digest) composed of sha digest of 
                              from/to file (in bzip form)
		if I've seen this tuple before 
			use previously generated patch
		else
			bunzip from/to files
			Calculate patch file using mbsdiff
			bzip patch file
			bzip original file
			if patch file is smaller
				add patch file to manifest
			else
				add full file to manifest
			add results to cache with key (from.digest,to.digest)
	For files only in to:
		add file to manifest 
	For files only in from:
		add removed instruction to manifest
	Look for removed-files file in to dir
		if exists add each entry to manifest
5) bzip update.manifest
6) build mar of all files from step 4 + update.manifest

The two primary optimizations are:
	a) We never run mbsdiff on the same pair of files twice.   Since the large binary files on the same platform (e.g. 2004-mac-firefox-bin, 2005-mac-firefox-bin) are the same for 40+ locales this is a *huge* savings
	b) We compare files using a sha.digest of the bziped version of them.  This means skip bunziping files that are the same in from/to
	
On my MBP it takes 2h 40Minutes to generate all 130 partial patches for 2004->2005 using the existing system. 

The new way takes 8minutes.

There is a third optimization which is possible by implementing a mar file reader in python.  This lets me calculate the sha digests in-place without extracting the mar into a tmp dir.   This takes the runtime down from 8mins to about 6mins - but I'm not super-thrilled about having yet another mar impl (even though it is a very simple format) and a 20x speedup is not too shabby to start. Building patches on each mac/win/linux builder is likely a better optimization.   Threading it doesn't seem to be much help (running 3 instances goes about 5% faster) at this point.

Other fun stuff I discovered along the way:

1) make_incremental_update.sh relies on sort to compare files in from/to.  The sort order in our production patches made no sense at all to me (and was different than on my local system).  Turns out: 

http://www.gnu.org/software/fileutils/doc/faq/core-utils-faq.html#Sort%20does%20not%20sort%20in%20normal%20order!

looks to be true on the production system used to build patches.  It results in a manifest file in a nonsense order.

2) unwrap_full_update.pl doesn't seem to work on MacOS without some hacks around case-sensitivity
3) Removed the .5s delay in the inner loop of CreatePartialPatches
	select(undef, undef, undef, 0.5);
4) Not sure why we are using /dev/shm/tmp as our TMPDIR, changed to /tmp in my local config
5) There are two separate places in make_incremental_update.sh where we exclude processing of certain files.  In one place a filename (e.g. "update.manifest") will be excluded if it is anywhere in the dir hierarchy.  In the other it will be excluded only at the top dir level.  I tried to rationalize this in the new impl
6) make_incremental_updates.sh does not properly strip spaces in files list in  removed-files 
7) There are a surprising number of special cases (searchplugins/extensions/etc) in incremental update generation.

So this patch includes:

a) make_incremental_updates.py as a replacement for:
	make_incremental_update.sh
	unwrap_full_update.pl
	parts of patcher2.pl

b) patches to patch2.pl to optionally use make_incremental_updates.py
c) A test directory with a set of specially constructed mar files to attempt to execute every code path and special case in mar construction.   Also includes some hacky utils to compare and build mars.

I've rebuild the 2004-2005 patches on my system using both the old and new, and run through the unit tests.  make_incremental_update.py produces the same results as existing tools in both cases with the following exceptions:
	a) manifest file is slightly different order
	b) make_incremental_updates.py properly strips spaces from removed files
	
My python mojo is about a 2/10 (perl=1/10) so be nice since I'm sure I've made some humorous errors here :-).
Added .zip of test dir at the URL (too big to attach).

A few final notes:

a) I re-wrote make_incremental_update.sh because in order to implement the first optimization I needed to carry state from one patch generation to the next.  The easiest way to do this was to have a single program process all partial patches.

b) This is designed to drop into the existing systems.  Configuration and processing is still the same in patcher so none of the 'which patches do I generate' logic has to be re-tested.

c) The only aspect I haven't tested/implemented is forced updates.  It is trivial to implement in miu.py - but i didn't have a sample patcher.cfg that had them to test it end-to-end. 


Assignee: build → mtschrep
Attachment #276398 - Attachment mime type: text/x-python → text/plain
Comment on attachment 276398 [details]
New incremental update implementation [checked in]

Since Darin wrote make_incremental_updates.sh not sure who should review this...
Attachment #276398 - Flags: review?(benjamin)
Comment on attachment 276397 [details] [diff] [review]
Patch to patcher2.pl to allow it to optionally call make_incremental_updates.py

integration with patcher a quick hack - so I'm sure there is a better way to do this...
Attachment #276397 - Flags: review?(preed)
(In reply to comment #4)
> (From update of attachment 276397 [details] [diff] [review])
> integration with patcher a quick hack - so I'm sure there is a better way to do
> this...

Couple quick things:

I want to set this up on the machines we do the patch generation on for releases, so we can make sure it'll work (I don't have any reason to believe it won't just want to test :-)

Also, I think we may be able to integrate this a bit cleaner by tweaking MozAUSLib::CreatePartialMar() so it just Does The Right Thing.

I'll poke around how difficult that would be when I do the first part.

This is cool!
Comment on attachment 276398 [details]
New incremental update implementation [checked in]

>    def __init__(self, work_dir, file_exclusion_list, path_exclusion_list):
>        self.work_dir=work_dir
>        self.archive_files=[]
>        self.manifest=[]
>        self.file_exclusion_list=file_exclusion_list
>        self.path_exclusion_list=path_exclusion_list

Nit: most places you use spaces around =, which IMO is more readable. Can you do that here as well?

>        if filename.startswith("extensions/"):
>            testdir = "extensions/"+filename.split("/")[1]  # Dir immediately following extensions is used for the test
>            self.manifest.append('add-if "'+testdir+'" "'+filename+'"')
>        else:
>            self.manifest.append('add "'+filename+'"')

I think that here and throughout the script this would be more readable using the string-formatting operator:

self.manifest.append('add "%s"' % filename)

>def exec_shell_cmd(cmd):
>    """Execs shell cmd and raises an exception if the cmd fails"""
>    if (os.system(cmd)):
>        raise Exception, "cmd failed "+cmd

Do we need to support python 2.3 with this script? If not, please use subprocess and check_call. See http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/tools/cvs2hg-import.py&rev=1.9&mark=15-26 for something that works in python 2.4+, and you can just "from subprocess import check_call" if you only need to support python 2.5+

>def bzip_file(filename):
>    """ Bzip's the file in place.  The original file is replaced with a bzip'd version of itself
>        assumes the path is absolute"""
>    exec_shell_cmd('bzip2 -z9 "' + filename+'"')
>    os.rename(filename+".bz2",filename)

Do you want to do this instead of using the python builtin bz2 module? I see you using bz2 elsewhere in this script.

>def extract_mar(filename, work_dir): 
>    """ Extracts the marfile intot he work_dir
>        assumes work_dir already exists otherwise will throw osError"""
>    print "Extracting "+filename+" to "+work_dir
>    saved_path = os.getcwd()
>    try:
>        os.chdir(work_dir)
>        exec_shell_cmd("mar -x "+filename)    
>    finally:
>        os.chdir(saved_path)

I'd like the "mar" command to be configurable, so that we don't have to put mar in the PATH: we could pick up MAR from the environment:

mar = os.environ.get('MAR', 'mar');

or from the commandline (e.g. --mar=/path/to/mar)

r=me with nits fixed... would be good to have a second-review from pickone('luser', 'preed', 'rhelmer')
Attachment #276398 - Flags: review?(benjamin) → review+
Comment on attachment 276397 [details] [diff] [review]
Patch to patcher2.pl to allow it to optionally call make_incremental_updates.py

After playing around with this a bit more, I don't think it's worth it to move this code around... patcher2 is due for a rework soon anyway.

I attached a take on this; I ran patcher2.pl with the attached patch here, and then ran patcher2 with the patch I attached, and diffed the result; they're the same.

The only changes I made relate to some variable renaming, so it was clearer when the fast-patcher option was being used, and using some perl-idioms, to match the style of the rest of the patcher2 code.
Attachment #276397 - Flags: review?(preed) → review+
Ugh... last version of this patch had a lot of garbage in it that wasn't relevant to the patch. This version is just the relevant stuff.

diff had -w on it, too, to ignore the indentation changes from Schrep's original patch.
Attachment #278639 - Attachment is obsolete: true
Reassigning to build, per a request from joduinn and an ok from schrep.
Assignee: mtschrep → build
Assignee: build → nobody
QA Contact: mozpreed → build
Assignee: nobody → ccooper
Priority: P3 → P2
Status: NEW → ASSIGNED
Comment on attachment 278645 [details] [diff] [review]
patcher2.pl patch to use make_incremental_updates.py, my take II [checked in]

Some minor comment typos, and I would consider including Data::Dumper at the outset (and possibly using it elsewhere to log data structures), but otherwise looks good.
Attachment #278645 - Flags: review+
Comment on attachment 276398 [details]
New incremental update implementation [checked in]

Checked in on trunk.
Attachment #276398 - Attachment description: New incremental update implementation → New incremental update implementation [checked in]
Attachment #276397 - Attachment is obsolete: true
Comment on attachment 278645 [details] [diff] [review]
patcher2.pl patch to use make_incremental_updates.py, my take II [checked in]

Checked in on trunk.
Attachment #278645 - Attachment description: patcher2.pl patch to use make_incremental_updates.py, my take II → patcher2.pl patch to use make_incremental_updates.py, my take II [checked in]
The patcher changes are causing a build process failure during the 2.0.0.11 release. I've reverted the change and am trying to diagnose.
Planning to land this again on Friday when rhelmer is back around to help out.
The patch has been re-landed along with a fix to make_incremental_updates.py to fix the atchlist_file / patchlist_filename discrepancy. Getting ready to test it out in the automation staging env.
Just a note that the URL field as a link to a directory used for unit testing.  It was too big to attach directly.
Depends on: 411235
Patch has been backed out (again) pending new version/fixes in bug 375415.
Depends on: 375415
Let's make patcher support generating the patchlist file for make_incremental_update.py; it already has all needed info in patcher.cfg, we shouldn't duplicate this by generating it elsewhere and then passing it to patcher. 

patcher.cfg files are currently the authoritative source for AUS configuration history, so we'd need to parse them in any case.

Once that's done and the blockers are resolved, we can land and start using this right away. I don't think there's much point in landing the current patch, as it's missing the patchlist file generator, and no other system has enough info to supply this.
(In reply to comment #20)
> Let's make patcher support generating the patchlist file for
> make_incremental_update.py; it already has all needed info in patcher.cfg, we
> shouldn't duplicate this by generating it elsewhere and then passing it to
> patcher. 

hrm, ok actually reading the latest patch, looks like it does the generation of the patchlist file for make_incremental_updates.py:

https://bugzilla.mozilla.org/attachment.cgi?id=278645&action=diff#mozilla/tools/patcher/patcher2.pl_sec3

coop, now that the patch in bug 375415 landed, let's re-land this and test it. I can set the staging server to pull a version that will work. We need to test in both fast-patcher and old modes.
Landed once again.

Checking in MozAUSConfig.pm;
/cvsroot/mozilla/tools/patcher/MozAUSConfig.pm,v  <--  MozAUSConfig.pm
new revision: 1.16; previous revision: 1.15
done
Checking in MozAUSLib.pm;
/cvsroot/mozilla/tools/patcher/MozAUSLib.pm,v  <--  MozAUSLib.pm
new revision: 1.11; previous revision: 1.10
done
Checking in patcher2.pl;
/cvsroot/mozilla/tools/patcher/patcher2.pl,v  <--  patcher2.pl
new revision: 1.28; previous revision: 1.27
done
Seems to work fine in the fallback mode, which is great (we can stop backing this patch out :P).

Found some problems we need to clear up with fast mode:

1) make_incremental_updates.py skipped win32, even though it's in the patchlist file (!). Investigating.
2) the last field in each line in the patchlist file (forced files) is "$VAR1 = []", because of Data::Dumper... probably not what we want in that field
3) turns out I broke 1.8 branch in the metadata collection feature in bug 410806 (no application.ini to get buildid from, I'll just pull it from talkback's master.ini most likely).

I'll work on a patch for 1 and 2 here, and fix 3 over in bug 410806. It would actually be a really cool enhancement to make patcher take advantage of the feature from 3, we could make sure that the expected build ID matches the actual build ID, for instance.
(In reply to comment #23)
> 1) make_incremental_updates.py skipped win32, even though it's in the patchlist
> file (!). Investigating.
> 2) the last field in each line in the patchlist file (forced files) is "$VAR1 =
> []", because of Data::Dumper... probably not what we want in that field
> 3) turns out I broke 1.8 branch in the metadata collection feature in bug
> 410806 (no application.ini to get buildid from, I'll just pull it from
> talkback's master.ini most likely).

4) make_incremental_updates.py expects mar/mbsdiff to be on the PATH, but they aren't. Should probably just do this in patcher before calling make_incremental_updates.py
5) all of the partials are coming out the same, per OS :/ I know this doesn't happen when calling make_incremental_updates.py and the patchlist file looks ok, investigating..
(In reply to comment #25)
> 5) all of the partials are coming out the same, per OS :/ I know this doesn't
> happen when calling make_incremental_updates.py and the patchlist file looks
> ok, investigating..

oops, false alarm, this was the case for the actual release too :) Only win32 had slight differences, there were no l10n changes between .10 and .11 so a bunch of partials were identical.
(In reply to comment #23)
> 1) make_incremental_updates.py skipped win32, even though it's in the patchlist
> file (!). Investigating.

Couldn't reproduce this one.. I noticed some problems on the test machine, I think it actually may have run out of memory or crashed in some other way which wasn't caught by patcher. I'll take a closer look.
(In reply to comment #27)
> (In reply to comment #23)
> > 1) make_incremental_updates.py skipped win32, even though it's in the patchlist
> > file (!). Investigating.
> 
> Couldn't reproduce this one.. I noticed some problems on the test machine, I
> think it actually may have run out of memory or crashed in some other way which
> wasn't caught by patcher. I'll take a closer look.

Ah, I see, RunShellCommand is timing out, but that's not being checked. I'll work up a patch for this too.

This patch adds a function run_shell_command(), based on Bootstrap::Step::Shell() but without the OO and Log method. It doesn't provide a way to skip output or ignore exit values, as this isn't something any of patcher's three shell calls need to do currently.

On the plus side, it logs and produces timestamps for every command it runs, and checks every form of error that RunShellCommand() is known to report. We could tack all of these checks on after each run_shell_command(), but I think that would be even more of a copy/paste sin than this.
Attachment #296471 - Flags: review?
Comment on attachment 296471 [details] [diff] [review]
wrap RunShellCommand for better logging/error handling

Hm, something goes wrong with --build-tools; I'll post a better one in a bit.
Attachment #296471 - Flags: review?
Comment on attachment 296471 [details] [diff] [review]
wrap RunShellCommand for better logging/error handling

My mistake; it was my test environment that was the problem.
Attachment #296471 - Flags: review?(ccooper)
Comment on attachment 296471 [details] [diff] [review]
wrap RunShellCommand for better logging/error handling

Two nits:

* you declare %args twice at the start of run_shell_command
* consider a more descriptive name for the function. I don't know what that might be without getting into very long function names, but just a suggestion.

Otherwise, fine.
Attachment #296471 - Flags: review?(ccooper) → review+
(In reply to comment #32)
> (From update of attachment 296471 [details] [diff] [review])
> Two nits:
> 
> * you declare %args twice at the start of run_shell_command

Er, yeah, removed :)

> * consider a more descriptive name for the function. I don't know what that
> might be without getting into very long function names, but just a suggestion.

Hmm, any suggestions? Not sure what's more descriptive (run_shell_command_with_error_handling() ? :)).. I'm going to go ahead and check in as-is but it's easy to change if anyone has suggestions!
Works fine as far as I can tell, I'm going to run a bunch of tests on this over the weekend in both modes and both 1.8/1.9 branches, and start using fast-patcher mode for staging so I'll see if anything shakes loose.

Checking in patcher2.pl;
/cvsroot/mozilla/tools/patcher/patcher2.pl,v  <--  patcher2.pl
new revision: 1.29; previous revision: 1.28
done
Attachment #296471 - Attachment is obsolete: true
Whiteboard: testing
rhelmer: You were going to summarize the outstanding issues here. In comment 23 you talk about the forced file issue, but you also mention you're already working on a patch.

Just let me know what still needs to be addressed.
(In reply to comment #35)
> rhelmer: You were going to summarize the outstanding issues here. In comment 23
> you talk about the forced file issue, but you also mention you're already
> working on a patch.
> 
> Just let me know what still needs to be addressed.

So far I've focused on testing that this didn't break the old fallback mode; I've done some testing of the new "fast mode" but I wouldn't call it comprehensive.

The only outstanding issue I know of with "fast mode" is that we don't support forced files. I haven't started working on a patch, but I think we should add this feature to make_incremental_updates.py as it's a feature that we do sometimes use.

I am ok if you want to file a followup and close this one though.
(In reply to comment #36)
> I am ok if you want to file a followup and close this one though.

Filed bug 414560 as a followup.
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Testing this on staging now (for use in the next round of releases), ran into one issue - mozilla/dist/host/bin needs to be on the PATH or make_incremental_updates.py can't find it. 
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
This wfm on staging, what do you think coop?
Attachment #309341 - Flags: review?(ccooper)
Comment on attachment 309341 [details] [diff] [review]
[checked in] set the ENV to include mozilla/dist/host/bin in fast mode

Makes sense.
Attachment #309341 - Flags: review?(ccooper) → review+
Comment on attachment 309341 [details] [diff] [review]
[checked in] set the ENV to include mozilla/dist/host/bin in fast mode

Checking in patcher2.pl;
/cvsroot/mozilla/tools/patcher/patcher2.pl,v  <--  patcher2.pl
new revision: 1.31; previous revision: 1.30
done
Attachment #309341 - Attachment description: set the ENV to include mozilla/dist/host/bin in fast mode → [checked in] set the ENV to include mozilla/dist/host/bin in fast mode
Status: REOPENED → RESOLVED
Closed: 17 years ago16 years ago
Resolution: --- → FIXED
Whiteboard: testing
Were the unit tests from this ever checked in anywhere?
Product: mozilla.org → Release Engineering
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: