Closed Bug 470146 Opened 16 years ago Closed 15 years ago

Speed up signing of windows dll's and exe's

Categories

(Release Engineering :: General, defect)

x86
Windows Server 2003
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: nthomas, Assigned: catlee)

References

Details

Attachments

(2 files, 2 obsolete files)

Recently, we've been waiting up to 4 hours for signing of the internals of windows installers to complete. For some releases (eg Fx 3.0.x) the number of locales has increased compared to historical precedent, but anecdotal evidence suggests that keymaster has gotten slower in general.

There are several things we can try to improve this situation, so this bug might turn into a meta or tracking bug.
* instrument the scripts to find out where time is being consumed
* convert shell scripts that have to do a lot of I/O to a more performant  language, mozilla/tools/update-packaging/make_full_update.sh is the prime target here
* investigate the health of the signing machine (fragmentation etc)
* investigate disabling access time checks on the NTFS partition we sign on

I'd like to propose we work on this in Q1 2009.
Attached patch New signing tools (obsolete) — Splinter Review
The new scripts are smarter about caching results, contents on MAR files in particular are not bunzipped/bzipped again unless required.

It's also possible to split up the signing into N jobs via a -j parameter to the sign-release.py script.
Assignee: nobody → catlee
Status: NEW → ASSIGNED
Attachment #384936 - Flags: review?(nthomas)
Attachment #384936 - Flags: review?(bhearsum)
Component: Release Engineering: Future → Release Engineering
Comment on attachment 384936 [details] [diff] [review]
New signing tools

A couple of overall comments first:
Please don't hardcode firefox into this. I'm sure Thunderbird would love to make use of this, and maybe SeaMonkey, too. I really really don't want to regress this.
More comments, please. Some high level descriptions would be helpful in many places, especially many of the functions/methods in sign-release.py
A few suggestions about logging:
* Can you build up a string in Signer.signPackage and then print the whole thing at the end of that function? I think this would be better than all the locales intermingled. (I don't mind the PID being there either way, fwiw)
* At the start of each locale it would be nice to have a line that is more recognizable. Maybe "START OF SIGNING: $locale", or even just a line without the PID prepended would help when scanning the log.

Another thing that popped into my head is one-off locale signing and partner signing. By the looks of it there's no way to do an "already signed en-US" signing (like the -p option for sign-release.pl). This is totally a follow-up issue it doesn't work already, A couple of overall comments first:
Please don't hardcode firefox into this. I'm sure Thunderbird would love to make use of this, and maybe SeaMonkey, too.
More comments, please. Some high level descriptions would be helpful in many places, especially many of the functions/methods in sign-release.py

Another thing that popped into my head is one-off locale signing and partner signing. By the looks of it there's no way to do an "already signed en-US" signing (like the -p option for sign-release.pl). If we can't do that already please file a follow-up on it.


>+            # Repack it
>+            log.info("Packing %s", dstfile)
>+            packfile(dstfile, tmpdir)
>+            # Sign exe's
>+            if dstfile.endswith('.exe'):
>+                signfile(dstfile, self.keydir, self.fake)
>+            return nFiles, cacheHits, nSigned

I'm really happy to see the internal and installer signing finally getting integrated fully.

>+        finally:
>+            shutil.rmtree(tmpdir)
>+
>+    def signDir(self, files, dstdir, firstLocale='en-US'):
>+        start = time.time()
>+
>+        if len(files) == 1 and os.path.isdir(files[0]):
>+            files = findfiles(files[0])
>+        for f in files[:]:
>+            ext = os.path.splitext(f)[1]
>+            if ext not in ('.exe', '.mar') or 'win32' not in f:
>+                files.remove(f)

Can we log a warning if we find something that isn't a mar, exe, or xpi inside of win32/? This probably will expand the logic here a bit.

>+        # Sign the firstLocale files
>+        for f in files[:]:
>+            ext = os.path.splitext(f)[1]
>+            if fileInfo(f)['locale'] == firstLocale:
>+                files.remove(f)
>+                self.signPackage(f, dstdir, True, ext == '.mar')
>+

The last argument to signPackage is a little dense, it took me a couple tries to realize it was a comparison and not an assignment. Can you assign it to a variable, or at least but parentheses around it to call it out a bit more? 

>+if __name__ == "__main__":
>+    os.umask(000)

We're on cygwin so this is probably fine....but why? And on the topic of permissions, have you tested what the permissions are when things are pushed back to stage? They should be 644 for files and 755 for dirs.


>+    parser.add_option("-o", "--dest", dest="dest", help="destination directory")
>+    parser.add_option("", "--first-locale", dest="first_locale", help="first locale to sign")
>+    parser.add_option("-j", "--concurrency", dest="concurrency", help="concurrency", type="int")
>+    parser.add_option("", "--keep-cache", dest="keep_cache", action="store_true", help="don't delete cache first")

Can you spell out that this is about cache between releases? It's not immediately obvious to me.

>+    options, args = parser.parse_args()
>+    if not options.dest:
>+        if len(args) < 2:
>+            parser.error("Must specify a destination directory")
>+        options.dest = args[-1]
>+        args = args[:-1]

Is there a specific reason to support the dstdir being an option or an arg? It seems unnecessary, and confusing in the case where both are passed.


>+def makePath(filepath):
>+    info = fileInfo(filepath)
>+    if info['pathstyle'] == 'short':
>+        return os.path.basename(filepath)
>+    elif info['pathstyle'] == 'long':
>+        if info['format'] == 'mar':
>+            return "update/%(platform)s/%(locale)s/%(product)s-%(version)s.%(contents)s.mar" % info
>+        elif info['format'] == 'exe':
>+            return "%(platform)s/%(locale)s/%(product)s Setup %(version)s.exe" % info
>+    else:
>+        raise ValueError("Whaaa?")

I appreciate the humour, but let's make this error message a bit more useful :-).

>+def packexe(exefile, srcdir):
>+    exefile = cygpath(os.path.abspath(exefile))
>+    appbundle = exefile + ".app.7z"
>+    files = os.listdir(srcdir)
>+    # These are the original parameters
>+    SEVENZIP_ARGS = ['-r', '-t7z', '-mx', '-m0=BCJ2', '-m1=LZMA:d24',
>+            '-m2=LZMA:d19', '-m3=LZMA:d19', '-mb0:1', '-mb0s1:2', '-mb0s2:3']
>+
>+    # These parameters give better results
>+    SEVENZIP_ARGS = ['-r', '-t7z', '-mx', '-m0=BCJ2', '-m1=LZMA:d27', '-m2=LZMA:d19:mf=bt2', '-m3=LZMA:d19:mf=bt2', '-mb0:1', '-mb0s1:2', '-mb0s2:3', '-m1fb=128', '-m1lc=4']

If we're going to use the new parameters we need to update:
toolkit/mozapps/installer/windows/nsis/makensis.mk
browser/locales/Makefile.in

and maybe other places with them, too.

Maybe we should just downgrade 7z on the new keymaster so we get the same results from the same parameters? Whichever we choose, let's drop the other array definition from here.

It might be nice to make it expect the tag files in the same place as sign-release.pl, to make switching between the scripts easier, but I'm not hung up on that.

Tons of awesome here, just a few things to clean-up (firefox hardcoding, logging improvements, better comments mainly).
Attachment #384936 - Flags: review?(bhearsum) → review-
Still left to do here:

- Final (hopefully) round of approvals of patches.  New patches should be ready tomorrow.

- QA testing new builds later this week.
Attached patch New signing tools (obsolete) — Splinter Review
Attachment #384936 - Attachment is obsolete: true
Attachment #389691 - Flags: review?(nthomas)
Attachment #389691 - Flags: review?(bhearsum)
Attachment #384936 - Flags: review?(nthomas)
Comment on attachment 389691 [details] [diff] [review]
New signing tools


> SEVENZIP        = $(WORKDIR)/7-Zip/7z.exe
> UPX_BIN         = $(WORKDIR)/upx/upx.exe
>+CONCURRENCY    ?= 4
>+KEYDIR	       ?= d:/2008-keys

You've got a hard tab here, please get rid of it.


>+postsign: upload upload-log verify-signatures2
>+

What's the rationale behind running verify-signatures2 after uploading? I think you probably want a verify2 target that does the right verification for USE_NEW.

I'll have a look at the .py files in a bit.
(In reply to comment #5)
> (From update of attachment 389691 [details] [diff] [review])
> 
> > SEVENZIP        = $(WORKDIR)/7-Zip/7z.exe
> > UPX_BIN         = $(WORKDIR)/upx/upx.exe
> >+CONCURRENCY    ?= 4
> >+KEYDIR	       ?= d:/2008-keys
> 
> You've got a hard tab here, please get rid of it.
> 
> 
> >+postsign: upload upload-log verify-signatures2
> >+
> 
> What's the rationale behind running verify-signatures2 after uploading? I think
> you probably want a verify2 target that does the right verification for
> USE_NEW.

We're still doing 'verify' before uploading when USE_NEW is true.  The verify-signatures2 step takes around 30 minutes, so I figured we could start that after uploading the results.  If any problems are found, we can re-sign, and re-upload.
(In reply to comment #6)
> We're still doing 'verify' before uploading when USE_NEW is true.  The
> verify-signatures2 step takes around 30 minutes, so I figured we could start
> that after uploading the results.  If any problems are found, we can re-sign,
> and re-upload.

Hrm, ok. That's pretty consistent with what we already do with test channel updates. WFM.
Comment on attachment 389691 [details] [diff] [review]
New signing tools

I ran through a 3.5.1 signing without major issue. verify-signatures2 bailed out right away saysing that a DLL in the af complete MAR had a bad signature. I think that's due to it being done with the staging key though. catlee - can you confirm that?

assuming that, r=bhearsum with the following conditions:
* fix the $(TOTAL_FILES) count in verify-asc to include *SUMS files, othrewise the target fails
* update the CombinedSigning instructions to talk about the new scripts/Makefile targets. (Maybe we should have a section for the old and the new way, for now)


Really fantastic work, Chris, this is such a huge improvement.
Attachment #389691 - Flags: review?(bhearsum) → review+
Attachment #389691 - Flags: review?(nthomas) → review-
Comment on attachment 389691 [details] [diff] [review]
New signing tools

I have lots of comments here but the majority of it is suggestions or quick fixes like comments/typos. The only things I'd really like fixed asap are
* sorting the right update manifest in verify-signatures.py
* making sure appbundle doesn't exist before calling 7z
* pulling the 7z stub & app.tag on the release tag from the appropriate repo

It's going to be so great to be able to sign fast with this code.

>diff --git a/release/signing/Makefile b/release/signing/Makefile
>+stubs:
>+	if [ ! -d checkouts/stubs ]; then mkdir -p checkouts/stubs; fi
>+	(cd checkouts/stubs; cvs -d:ext:cltbld@cvs.mozilla.org:/cvsroot co -d 7z mozilla/other-licenses/7zstub/$(PRODUCT))
>+	(cd checkouts/stubs; cvs -d:ext:cltbld@cvs.mozilla.org:/cvsroot co -d tagfile mozilla/$(PRODUCT_DIR)/installer/windows/app.tag)
>+	rm -f checkouts/stubs/7z/7zSD.sfx.compressed
>+	$(UPX_BIN) --best -o checkouts/stubs/7z/7zSD.sfx.compressed checkouts/stubs/7z/7zSD.sfx

I'm uncomfortable assuming that 7zSD.sfx and app.tag will never change, and that we can use whatever is on the tip of the Fx3.0 code. Couldn't we ifdef based on $VERSION between cvs and hg-over-http(s) ? I would also like to explicitly remove checkout/stubs each time the stubs runs, to ensure it's fresh.

Which reminds me, what's the plan for Fx 3.0.x releases ? Pick and choose the Makefile targets and pass --unsigned-installers to sign-release.py ? Perhaps it makes sense to have two Makefiles to keep them simpler.

> download:
> 	# Download the builds.
>-	rsync -av --exclude '*.zip' -e "ssh -i $(SSH_KEY)" \
>+	rsync -av --exclude-from download-exclude.list -e "ssh -i $(SSH_KEY)" \

Reminder: document that download-exclude.list et al should be copied from the hg checkout along with Makefile.

>diff --git a/release/signing/sign-release.py b/release/signing/sign-release.py
>+def _signLocale(obj, dstdir, files, remember=None):

I wondered why this guy is not in the Signer class. Something to do with making concurrency work ?

>+    def signPackage(self, pkgfile, dstdir, remember=False, compressed=False):
>+            for f in findfiles(tmpdir):
>+                h = sha1sum(f)

I did a little test here - hashing only the executables doesn't win anything time-wise.

>+            # Repack it
>+            logs.append("Packing %s" % dstfile)
>+            packfile(dstfile, tmpdir)
>+            # Sign exe's

Nit: "# Sign installer" would be less ambiguous.

>+        if pool:
>+            # Clean up the pool of child processes afterwars

Nit: typo in afterwards

>diff --git a/release/signing/signing.py b/release/signing/signing.py
>+def makePath(srcpath, dstdir):

I keep thinking this function is going to create a directory, rather than convert a path from unsigned to signed directory. getPath/mungePath/convertPath maybe.

>+def fileInfo(filepath, product):
>+    'pathstyle' is either 'short' or 'long', and refer

Hanging sentence in this comment.

>+            m = re.search("update/(win32|linux-i686|mac)/([-a-zA-Z]+)/(%s)-(\w+\.\w+(?:\.\w+)?(?:-\w+\.\w+)?)\.(complete|partial)\.mar" % product, filepath)

Do you need to match partial files with the (?:-\w+\.\w+)? at the end of the version block ? Looks like that would fail on 3.0.x/3.5.x.

>+            m = re.search("(win32)/([-a-zA-Z]+)/((?i)%s) Setup ([\d\.]+(?:\ \w+\ \d+)?)\.exe" % product, filepath)
>+            ret['product'] = m.group(3)

One day we might care that this returns 'Firefox' for installers and 'firefox' for updates, but it seems that day is not today. The (?i) makes the whole regexp case insensitive too, but I don't think that's going to matter either.

>+def unpackexe(exefile, destdir):
>+        log.exception("Error unpacking exe %s to %s", exefile, destdir)
>+	raise

Stray tab/indent problem here.

>+def packexe(exefile, srcdir):
>+        check_call([SEVENZIP, 'a'] + SEVENZIP_ARGS + [appbundle] + files,
>+                cwd=srcdir, stdout=stdout, preexec_fn=_noumask)

Could we please ensure that appbundle doesn't exist before making this call. It could be bad if we left it around after a failed run and then appended.

>+        log.exception("Error packing exe %s from %s", exefile, srcdir)
>+	raise

Stray tab/indent problem here.

>diff --git a/release/signing/verify-signature.py b/release/signing/verify-signature.py
>+#!/usr/bin/python
>+# Verifies that a .exe or complete .mar have been signed correctly

Nit: Verifies directories rather than individual files.

>+        # Make sure the list of files are the same
>+        if signed_files != unsigned_files:
>+            for f in signed_files:
>+                if f not in unsigned_files:
>+                    print f

Looks like this will report files added to signed_files but not those missing from signed_files.

>+            return False, "List of files differs"

If we hit this error, I think it's going to say something like:
nonlocalized/components/foo.xpt
...
signed-build1/win32/af/Firefox Setup 3.5.2.exe False List of files differs

It would be friendlier if it said:
FAIL  signed-build1/win32/af/Firefox Setup 3.5.2.exe  List of files differs:
  Added:   nonlocalized/components/foo.xpt
  Missing: nonlocalized/xul.dll
   
>+        # And the list of directories too
>+        if signed_dirs != unsigned_dirs:
>+            return False, "List of directories differs"

This could have similar output to the file test.

>+        # Check the directory modes
>+        for d in unsigned_dirs:
>+            ud = os.path.join(unsigned_dir, d)
>+            sd = os.path.join(signed_dir, d)
>+            if os.stat(sd).st_mode != os.stat(ud).st_mode:
>+                return False, "Mode mismatch (%o != %o) in %s" % (os.stat(ud).st_mode, os.stat(sd).st_mode, sd)

Maybe use d instead of sd as the final arg here. The absolute path to the temp dir is less interesting.

>+            if os.stat(sf).st_mode != os.stat(uf).st_mode:
>+                return False, "Mode mismatch (%o != %o) in %s" % (os.stat(uf).st_mode, os.stat(sf).st_mode, sf)

Similarly, just f for the last argument here.

>+            # Skip this checking for freebl3.dll, softokn3.dll, or if we're in fake mode
>+            if ext in ('.dll', '.exe') and not fake_signatures and b not in ('freebl3.dll','softokn3.dll'):

I'd suggest using a constant in signing.py to keep track of binaries we don't sign, so that it's trivial to keep the signing and verification scripts in sync.

>+                if 0 != call(['chktrust', '-q', b], cwd=d, stdout=nullfd):
>+                    return False, "Bad signature %s in %s (%s)" % (sf, signed, cygpath(sf))

f instead of sf here too

>+                # Check the hashes
>+                if f == "update.manifest":
>+                    sf_lines = sorted(open(sf).readlines())
>+                    uf_lines = sorted(open(sf).readlines())

Reading the signed manifest both times!

>+                elif sha1sum(sf) != sha1sum(uf):
>+                    return False, "%s != %s" % (sf, uf)

(Here I go again...) Not sure if the full path would be useful in these messages. Actually, it's definitely not given the rmtree's just below here. So just "update.manifest differs" and |sha1sum on %s differs" % f| would be fine.

>+if __name__ == "__main__":
>+    import sys, os, logging

Nit: sys is already imported at the top of the file.

>+    # Check that MAR and SEVENZIP are executable
>+    null = open(os.devnull, "w")
>+    try:
>+        call([MAR, '-h'], stdout=null)
>+    except OSError:
>+        parser.error("mar must be in your $PATH, or set via $MAR")
>+    try:
>+        call([SEVENZIP, '-h'], stdout=null)
>+    except OSError:
>+        parser.error("7z must be in your $PATH, or set via $SEVENZIP")
>+    null.close()

Might be worth moving these checks into signing.py as this is the second time this fragment appears in the patch.

>+    options, args = parser.parse_args()
>+    if len(args) != 2:
>+        parser.error("Must specify two arguments")

Nit: Wouldn't hurt to say that they are the unsigned and signed directories.

>+        result, msg = check_repack(uf, sf, options.fake)
>+        print sf, result, msg
>+        if not result:
>+            sys.exit(1)
>+            break

So we exit on the first error rather than find them all. I think I'd prefer to know the full scope of any problem than fix things iteratively. 

It would be really great if we could verify that the files in the installer are the same as the mar, although I guess update verify does that too.
Attachment #389691 - Attachment is obsolete: true
Attachment #390995 - Flags: review?(nthomas)
Attachment #390995 - Flags: review?(bhearsum)
(In reply to comment #9)
> Which reminds me, what's the plan for Fx 3.0.x releases ? Pick and choose the
> Makefile targets and pass --unsigned-installers to sign-release.py ? Perhaps it
> makes sense to have two Makefiles to keep them simpler.

Bug 506695 is now tracking this.  I think the Makefile should be able to handle
3.0.x releases, and it shouldn't be that hard to get it working.

> >diff --git a/release/signing/sign-release.py b/release/signing/sign-release.py
> >+def _signLocale(obj, dstdir, files, remember=None):
> 
> I wondered why this guy is not in the Signer class. Something to do with making
> concurrency work ?

Yes, I was having issues calling methods on the Signer instance from child
processes.

> >+    def signPackage(self, pkgfile, dstdir, remember=False, compressed=False):
> >+            for f in findfiles(tmpdir):
> >+                h = sha1sum(f)
> 
> I did a little test here - hashing only the executables doesn't win anything
> time-wise.

Good check, the result is a bit surprising.  Maybe all the files are cached at
that point?
Comment on attachment 390995 [details] [diff] [review]
New signing tools

Looks great to me, except I think this change

>diff --git a/release/signing/Makefile b/release/signing/Makefile
> verify-asc: TOTAL_FILES=$(strip $(shell find $(WORKDIR)/signed-build$(BUILD)/ \
>                           -not -wholename '*update*' -not -name "*.xpi" -not -name "*.asc" \
>-                          -not -name "*.txt" -not -name "*.log"  -not -name '*SUMS' \
>+                          -not -name "*.txt" -not -name "*.log" \

is a result of tests pulling down *SUMS from old candidates directories, and create-sigs making asc's for them. A new release won't have them yet, as create-sigs runs before checksum-files. If that sounds correct then jettison this change and put *SUMS into download-exclude.list.
Attachment #390995 - Flags: review?(nthomas) → review+
Comment on attachment 390995 [details] [diff] [review]
New signing tools

>+verify-signatures2:
>+	if ./verify-signature.py unsigned-build$(BUILD) signed-build$(BUILD) > verify-signatures-build${BUILD}.log 2>&1; then \
>+	    blat -to $(EMAIL) \
>+		 -subject "Signature verification succeeded"\
>+		 -body "EOM"; \
>+	else \
>+	    blat verify-signatures-build${BUILD}.log -to $(EMAIL) \
>+		 -subject "Signature verification failed"; \
>+	fi

Awesome, I love this improvement.

Looks good to me.
Attachment #390995 - Flags: review?(bhearsum) → review+
Comment on attachment 390995 [details] [diff] [review]
New signing tools

changeset:   327:43153de7f5ce
Attachment #390995 - Flags: checked‑in+ checked‑in+
I ran through a 3.0.12 and 3.5.2 with the new signing instructions at:
https://intranet.mozilla.org/Build:Signing
https://intranet.mozilla.org/Build:CombinedSigning

Everything worked fine, I'm happy with these scripts + the instructions.
Notes on doing a test Tb2.0.0.22 signing with the Build:Signing instructions:
* had to use stubs-cvs target & compressed stub target directly because this line fails for tb:
    81 ifneq ($(filter 3.0.%,$(VERSION)),)
* had to pass TAG=${TAG}, PRODUCT=${PRODUCT} and PRODUCT_DIR=${PRODUCT_DIR} to the Makefile when doing stubs-cvs
* had to pass --product to sign-release.py

I've updated the docs to work around these issues - we should fix the stubs targets, though. We should also fix the tag default to use uc($PRODUCT) instead of hardcoding FIREFOX.

Using the above information we should be able to use the new signing scripts to sign tb2.0.0.23, at least.
Having signed *SUMS files would be nice, and doing the checksum-files here avoids problems in verify-asc when the *SUMS files don't have .asc files.

Alternatively we could restore the old verify-asc behaviour, and add the *SUMS files to the download exclusion list.  But then we'd still have a problem if you re-ran signing in the same working directory.
Attachment #391704 - Flags: review?(nthomas)
Attachment #391704 - Flags: review?(bhearsum)
Comment on attachment 391704 [details] [diff] [review]
Run checksum-files before create-sigs

Seems fine.
Attachment #391704 - Flags: review?(bhearsum) → review+
Attachment #391704 - Flags: review?(nthomas) → review+
Comment on attachment 391704 [details] [diff] [review]
Run checksum-files before create-sigs

changeset:   335:2b176212dd81
Attachment #391704 - Flags: checked-in+
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
...and the fans cheered wildly! :-D Very nice work, catlee!!
Product: mozilla.org → Release Engineering
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: