Closed Bug 780561 (new-packager) Opened 12 years ago Closed 11 years ago

Overhaul the packager

Categories

(Firefox Build System :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla21

People

(Reporter: glandium, Assigned: glandium)

References

(Depends on 1 open bug, Blocks 2 open bugs)

Details

(Keywords: qawanted, Whiteboard: [metro-it2][completed-elm])

Attachments

(12 files, 22 obsolete files)

4.37 KB, patch
ted
: review+
Details | Diff | Splinter Review
899 bytes, patch
ted
: review+
Details | Diff | Splinter Review
1.12 KB, patch
ted
: review+
Details | Diff | Splinter Review
30.27 KB, patch
ted
: review+
Details | Diff | Splinter Review
59.74 KB, patch
ted
: review+
Details | Diff | Splinter Review
139.68 KB, patch
gps
: review+
Details | Diff | Splinter Review
75.44 KB, patch
gps
: review+
Details | Diff | Splinter Review
90.23 KB, patch
Details | Diff | Splinter Review
8.14 KB, patch
gps
: review+
Details | Diff | Splinter Review
4.77 KB, patch
ted
: review+
Details | Diff | Splinter Review
2.13 KB, text/plain
Details
5.16 KB, patch
gps
: review+
Details | Diff | Splinter Review
I've started working on a complete revamp of the packager over the week-end. At this point, it's not any useful, but it's near a point where it could be. Here are some features it has/will have:
- Automatic inclusion of anything that is registered in chrome.manifest (although at the moment, dist/bin/chrome.manifest also includes things like httpd.js, which should not be packaged, so either we mandate that dist/bin/chrome.manifest needs to only contain packaged stuff (which, arguably, would be better, because running dist/bin/firefox should probably not have extra components and chrome), or we can add exclusion rules to the package manifest)
- Manifest validation of style, overlay and override chrome urls, resource urls, component contracts and cids.
- Manifest validation of chrome content/locale/skin. (This found two registered chrome urls that point nowhere)
- (optional) Manifest simplification according to flags (e.g. remove os!=WINNT lines when building for windows)
- XPT linkage
- Manifest "linkage"
- Preprocessor is integrated within the packager, so no need to preprocess the package manifest file beforehand.
- Starts from flat chrome format and packages as flat, jar or omnijar.
No longer blocks: metro-build
AIUI, Joey is already doing some of this work over in bug 755724. Is this a duplication of that effort, or this for changes above and beyond?

Can you guys discuss this in IRC or skype and hammer out the details, please?
(In reply to Chris Cooper [:coop] from comment #1)
> AIUI, Joey is already doing some of this work over in bug 755724. Is this a
> duplication of that effort, or this for changes above and beyond?

Both.
(In reply to Mike Hommey [:glandium] from comment #2)
> (In reply to Chris Cooper [:coop] from comment #1)
> > AIUI, Joey is already doing some of this work over in bug 755724. Is this a
> > duplication of that effort, or this for changes above and beyond?
> 
> Both.

Let me know when you have time to talk about this because we have some duplication of effort going on.
Depends on: 783751
Blocks: metro-build
IRC discussion 

[2012-08-20 09:00:22] -->| YOU (joey) have joined #pymake[2012-08-20 09:00:22] =-= Topic for #pymake is ``http://hg.mozilla.org/projects/build-system | http://tbpl.mozilla.org/?tree=Build-System | open for business, sheriffed by ted and khuey, merges every now and then to and from m-c | also pymake once in a while''
[2012-08-20 09:00:22] =-= Topic for #pymake was set by gavin on Wednesday, May 23, 2012 8:11:25 PM
[2012-08-20 11:01:21] <glandium> joey: ping
[2012-08-20 11:04:11] <joey> glandium: pong
[2012-08-20 11:04:38] <glandium> joey: so what do you want to know about 780561?
[2012-08-20 11:05:30] <joey> why open and direction the bug is headed in ?
[2012-08-20 11:06:26] <glandium> joey: everything is in the bug description. i opened the bug because i started working on it, and featureset is in the description too
[2012-08-20 11:06:54] <glandium> i think i'll also extend to unification on osx, strip and shlibsign
[2012-08-20 11:07:14] <glandium> and probably populate_startupcache
[2012-08-20 11:08:06] <glandium> i left validation out for now
[2012-08-20 11:08:14] <glandium> (manifest validation, that is)
[2012-08-20 11:09:37] <glandium> if you think config.mk/rules.mk was awful, don't ever look at build/core/*.mk in android
[2012-08-20 11:10:34] <joey> I'm confused why a duplicate bug was opened for this when bug 755724 exists and is actively being worked on.
[2012-08-20 11:11:10] <joey> also I thought we were trying to move away from make syntax but the approach for 780561 will be make based
[2012-08-20 11:11:29] <joey> are unit tests being added with this effort as well ?
[2012-08-20 11:12:28] <glandium> where did i write this will be make based?
[2012-08-20 11:12:32] <glandium> it's entirely in python
[2012-08-20 11:12:59] <glandium> and 755724 is not the same thing. It's something that 780561 facilitates
[2012-08-20 11:16:49] <joey> 755724 does cover the packager logic and the answers used are also in python.
[2012-08-20 11:18:01] <joey> [ps] have not reviwed patches for 780561 yet but from the discussions in pymake as of late the answer seemed make based { arrays, shell quoting, pymake interactions, etc }
[2012-08-20 11:19:23] <glandium> these discussions had nothing to do with 780561
[2012-08-20 11:26:10] <glandium> anyways, you can look at it this way: 780561 /can/ be the infrastructure to fix 755724.
[2012-08-20 11:28:52] <glandium> a big chunk of 755724 is to change the way browser/ installs its files under dist/bin
[2012-08-20 11:29:04] <glandium> that's /not/ covered by 780561
[2012-08-20 11:30:38] <joey> a niaeve read for this is the install targets/code paths should all be the same.
[2012-08-20 11:30:56] <joey> installing manifests, binaries, etc via FINAL_TARGET 
[2012-08-20 11:31:41] <joey> does not seem far removed from the basics of 755724 of we need a way to condionally install metro targets in different directories
[2012-08-20 11:31:57] <joey> start by porting some of the internals to python
[2012-08-20 11:33:01] <glandium> metro stuff can be installed in dist/bin like everything else, with FINAL_TARGET
[2012-08-20 11:33:08] <glandium> no need for python here
Depends on: 785871
Blocks: 526333
Depends on: 787165
Attached patch PoC (obsolete) — Splinter Review
Comment on attachment 656979 [details] [diff] [review]
PoC

This implements points 1,5,6,7,8 of comment 0. This also handles stripping and elfhack, but not shlibsign.
Attachment #656979 - Flags: feedback?(ted.mielczarek)
(In reply to Mike Hommey [:glandium] from comment #7)
> Comment on attachment 656979 [details] [diff] [review]
> PoC
> 
> This implements points 1,5,6,7,8 of comment 0. This also handles stripping
> and elfhack, but not shlibsign.

It also doesn't handle optimizejars nor populate_startupcache. These will come later.
(In reply to Mike Hommey [:glandium] from comment #8)
> It also doesn't handle optimizejars nor populate_startupcache. These will
> come later.

Other know issues: it breaks universal builds and zipfile being feature-limited, it also compresses some files it shouldn't.
Attached patch PoC (obsolete) — Splinter Review
This fixes an issue on windows
Attachment #656979 - Attachment is obsolete: true
Attachment #656979 - Flags: feedback?(ted.mielczarek)
Attachment #657063 - Flags: feedback?(ted.mielczarek)
Attached patch PoC (obsolete) — Splinter Review
With a little update to the package manifest
Attachment #657063 - Attachment is obsolete: true
Attachment #657063 - Flags: feedback?(ted.mielczarek)
Attachment #658546 - Flags: feedback?(ted.mielczarek)
Depends on: 789481
Depends on: 792050
Depends on: 794393
Blocks: 773171
Comment on attachment 658546 [details] [diff] [review]
PoC

I'll attach a new version next week.
Attachment #658546 - Flags: feedback?(ted)
Alias: new-packager
Depends on: 809798
Depends on: 809802
Depends on: 809803
Depends on: 809806
Depends on: 818255
No longer depends on: 818255
Depends on: 789529, 789582, 792580
Whiteboard: [metro-it2]
Attached patch WIP (obsolete) — Splinter Review
This is a code dump. It's almost final, it just needs some reformatting, and a lot of comments.
Attachment #658546 - Attachment is obsolete: true
Attached patch Unit tests (obsolete) — Splinter Review
There are some parts of the code that have no coverage yet.
Attached patch Integration and cleanup (obsolete) — Splinter Review
This will be split in more parts in the future.
With these three patches, we have a full replacement for the current packager, that is, contrary to the original PoC which used a different manifest, this iteration just reuses the current manifests. Transition to the new manifest format will be done later. This allows to remove the old packaging code entirely, or almost (build/macosx/universal/unify doesn't go away, for instance, until test packaging is dealt with as well).
Comment on attachment 689366 [details] [diff] [review]
WIP

Review of attachment 689366 [details] [diff] [review]:
-----------------------------------------------------------------

Quick drive-by on Python nits.

If you wait a few days, you should be able to import OrderedDict unconditionally.

I prefer relative imports from within packages. But, I know others maintain other preferences.

I also prefer the 1 line per import style:

  from .foo import (
      bar,
      biz,
      baz,
  )

I personally find it more readable and easier to maintain since you can yank whole lines.

There's a lot of code using Python's struct module. Someone told me once that the overhead of compiling the format strings can be non-trivial and that the struct.Struct class should be used where performance is an issue. I haven't measured this myself and am not sure if it is still an issue with Python or would be an issue in any of this code.

I'd feel better if the files in toolkit/mozapps/installer lived inside /python/mozbuild and/or minimized the size of main(). IMO, main() should just collect CLI arguments and pass them into another routine. They seem to be doing a little bit more than this now.

Please also run this through flake8 and clean up some of the more major style issues.

::: python/mozbuild/mozpack/executables.py
@@ +53,5 @@
> +
> +def strip(path):
> +    strip = substs['STRIP']
> +    flags = substs['STRIP_FLAGS'].split()
> +    subprocess.call([strip] + flags + [path])

Exit code ignored.

@@ +59,5 @@
> +def may_elfhack(path):
> +    return substs['USE_ELF_HACK'] and path.endswith(substs['DLL_SUFFIX'])
> +
> +def elfhack(path):
> +    subprocess.call([os.path.join(topobjdir, 'build/unix/elfhack/elfhack'), path])

Ditto.

::: python/mozbuild/mozpack/files.py
@@ +3,5 @@
> +# file, You can obtain one at http://mozilla.org/MPL/2.0/.
> +
> +import os
> +import shutil
> +from mozpack.executables import *

Nit: Don't import *

@@ +4,5 @@
> +
> +import os
> +import shutil
> +from mozpack.executables import *
> +from cStringIO import StringIO

There is something funky about cStringIO or someone in the StringIO family. I can't remember what it is right now. Cross platform compat maybe?

@@ +19,5 @@
> +
> +    def read(self, length=-1):
> +        if self.mode != 'r':
> +            self.file = open(self.path, 'rb')
> +            self.mode = 'r'

I'd make file a @property that auto-opens the file.

::: python/mozbuild/setup.py
@@ +9,5 @@
>  setup(
>      name='mozbuild',
>      description='Mozilla build system functionality.',
>      license='MPL 2.0',
> +    packages=['mach', 'mozbuild', 'mozpack'],

Please kill "mach" while you are here - it shouldn't exist any more.
(In reply to Gregory Szorc [:gps] from comment #17)
> > +    subprocess.call([strip] + flags + [path])
> 
> Exit code ignored.

On purpose.
 
> > +    subprocess.call([os.path.join(topobjdir, 'build/unix/elfhack/elfhack'), path])
> 
> Ditto.

Likewise.
 
> There is something funky about cStringIO or someone in the StringIO family.
> I can't remember what it is right now. Cross platform compat maybe?

Besides not being able to inherit from cStringIO, I've had no problems with it. The performance difference between the two is noticeable (and that's an understatement).
> Please also run this through flake8 and clean up some of the more major style issues.

I did that midway, but the result was really awful formatting (IMHO). Like cutting lines between a ( and a ).
(In reply to Gregory Szorc [:gps] from comment #17)
> > +
> > +    def read(self, length=-1):
> > +        if self.mode != 'r':
> > +            self.file = open(self.path, 'rb')
> > +            self.mode = 'r'
> 
> I'd make file a @property that auto-opens the file.

This would be more error-prone, imho, because of the different modes.
Attached patch WIP (obsolete) — Splinter Review
Only small changes. I found a few more issues that I'd like to fix. I'm however going to land this on elm so that peripheral changes don't rely on the previous iteration that's already on elm.
Attachment #689366 - Attachment is obsolete: true
Split out from the previous integration and cleanup patch.
Attached patch Integration and cleanup (obsolete) — Splinter Review
Attachment #689368 - Attachment is obsolete: true
Attached patch WIP (obsolete) — Splinter Review
With an adjustments for startupcache in l10n repacks.
Attachment #690884 - Attachment is obsolete: true
Pretty darn close to green builds. Win opt on Elm ran into one problem though - 

Traceback (most recent call last):
  File "e:/builds/moz2_slave/elm-w32/build/toolkit/mozapps/installer/l10n-repack.py", line 100, in <module>
    main()
  File "e:/builds/moz2_slave/elm-w32/build/toolkit/mozapps/installer/l10n-repack.py", line 97, in main
    repack(sys.argv[1], sys.argv[2])
  File "e:/builds/moz2_slave/elm-w32/build/toolkit/mozapps/installer/l10n-repack.py", line 68, in repack
    errors.error("Missing file: %s" % os.path.join(l10n, path))
  File "e:\builds\moz2_slave\elm-w32\build\python\mozbuild\mozpack\errors.py", line 54, in error
    errors._handle(errors.ERROR, msg)
  File "e:\builds\moz2_slave\elm-w32\build\python\mozbuild\mozpack\errors.py", line 44, in _handle
    raise ErrorMessage(msg)
mozpack.errors.ErrorMessage: Error: Missing file: ../../dist/xpi-stage/locale-x-test\browser/chrome/x-test/locale/browser/about.dtd
program finished with exit code 2
elapsedTime=23.322000

https://tbpl.mozilla.org/php/getParsedLog.php?id=17844869&tree=Elm&full=1
(In reply to Jim Mathies [:jimm] from comment #25)
> Pretty darn close to green builds. Win opt on Elm ran into one problem
> though - 

I know, I'm already testing a fix on try.
Depends on: 780448
Let's start requesting review on the easy parts.
Attachment #693882 - Flags: review?(ted)
Attachment #690886 - Attachment is obsolete: true
Attachment #693887 - Flags: review?(ted)
Attachment #693888 - Flags: review?(ted)
Attachment #690888 - Attachment is obsolete: true
Attachment #693892 - Flags: review?(ted)
Attachment #693887 - Attachment is obsolete: true
Attachment #693887 - Flags: review?(ted)
Attachment #693894 - Flags: review?(ted)
Attachment #693888 - Attachment is obsolete: true
Attachment #693888 - Flags: review?(ted)
Attachment #693883 - Flags: review?(ted) → review+
Comment on attachment 693885 [details] [diff] [review]
Fix the buildconfig python module handling of environment variables

Review of attachment 693885 [details] [diff] [review]:
-----------------------------------------------------------------

::: build/buildconfig.py
@@ +23,5 @@
>      setattr(sys.modules[__name__], var, value)
> +
> +for var in os.environ:
> +    if var != 'SHELL' and var in substs:
> +        substs[var] = os.environ[var]

This doesn't look right, you've changed from config.substs to just substs, which isn't defined anywhere.
Attachment #693885 - Flags: review?(ted) → review-
Comment on attachment 693885 [details] [diff] [review]
Fix the buildconfig python module handling of environment variables

It could look that way, but using config.substs would actually fail. config is actually the config.status module, and replacing the values there would just do that: replacing the values there. But the intent is to have the overridden values exported from the buildconfig module. The for var in config.__all__ look copies the config.status values in the buildconfig namespace. Which makes substs exist in buildconfig. When the values were set before that loop, they could be set in the config.status namespace and copied over in the buildconfig namespace.
Attachment #693885 - Flags: review- → review?(ted)
Comment on attachment 693885 [details] [diff] [review]
Fix the buildconfig python module handling of environment variables

Review of attachment 693885 [details] [diff] [review]:
-----------------------------------------------------------------

Can you put a comment to that effect above the loop over config.__all__?
Attachment #693885 - Flags: review?(ted) → review+
One review will be enough, but flagging more people to have more chances of it being reviewed quickly. This is only a part of the code dump. The rest will follow when it will be properly commented.
Attachment #694507 - Flags: review?(ted)
Attachment #694507 - Flags: review?(gps)
Attachment #690974 - Attachment is obsolete: true
Comment on attachment 694507 [details] [diff] [review]
Import new packager code, part 1.

Adding khuey to the party, if he can get there before the others.
Attachment #694507 - Flags: review?(khuey)
Comment on attachment 694507 [details] [diff] [review]
Import new packager code, part 1.

Review of attachment 694507 [details] [diff] [review]:
-----------------------------------------------------------------

Partial review. I still haven't gone over copier.py, files.py, and unify.py.

I see tests in another patch on this bug. Could they get merged into this patch or could you request review from me?

I want to believe that a lot of the code from flags.py could be reused outside the packager. What are your thoughts on this?

I'm wondering whether manifest.py should perform more validation. e.g. CIDs could be fed into uuid.UUID().

There are some general Python 3 compatibility issues. I highly encourage you to make this code work with Python 3 now rather than later.

My review for mozjar.py was more or less "yeah, I guess if it works, it works." I trust you know what you are doing. And, this code is kinda important, so my guess is issues will be found rather quickly.

I'm not intimately familiar with the details of the manifests. My review was mostly "sure, this seems to be fine."

::: python/mozbuild/README.rst
@@ +11,5 @@
>  * mozbuild.compilation -- Functionality related to compiling. This
>    includes managing compiler warnings.
>  
> +* mozpack -- Functionality related to packaging builds. See
> +  mozpack/README.rst for more details.

I originally conceived "mozbuild" to be the name of a monolithic Python package providing everything build related. Here, we have a new "mozpack" package. On one hand, I'd like to see things under mozbuild.packaging (e.g. /python/mozbuild/mozbuild/packaging). On the other, meh. I don't really have strong feelings on this. Unless someone else does, I'm content with stuffing all the build code under /python/mozbuild/{moz*}.

::: python/mozbuild/mozpack/chrome/flags.py
@@ +7,5 @@
> +from mozpack.errors import errors
> +try:
> +    from collections import OrderedDict
> +except ImportError:
> +    from simplejson import OrderedDict

If you wait a few weeks or land this on build-system, no need to try since we require 2.7 \o/

@@ +24,5 @@
> +        '''
> +        self.name = name
> +        self.value = None
> +
> +    def add_def(self, definition):

add_definition()

Or, perhaps set_definition() or set_value() might be more appropriate. "add" implies "append" to me.

@@ +32,5 @@
> +        if definition == self.name:
> +            self.value = True
> +            return
> +        assert(definition.startswith(self.name))
> +        if definition[len(self.name)] != '=':

Possible IndexError if definition == self.name

@@ +87,5 @@
> +            self.values.append(('==', value[1:]))
> +        elif value.startswith('!='):
> +            self.values.append(('!=', value[2:]))
> +        else:
> +            return errors.fatal('Malformed flag: %s' % definition)

Perhaps you should explicitly check for value == '' (value == self.name)

@@ +110,5 @@
> +            return True
> +        for comparison, val in self.values:
> +            if eval('value %s val' % comparison):
> +                return True
> +        return False

I tend to avoid eval() out of principle. If you don't need eval(), I would just perform the == or != inline.

@@ +154,5 @@
> +        elif value[0] in ['<', '>']:
> +            if value[1] == '=':
> +                self.values.append((value[0:2], LooseVersion(value[2:])))
> +            else:
> +                self.values.append((value[0], LooseVersion(value[1:])))

IIRC, LooseVersion will accept "obviously" non-version strings like "foo.bar". All your examples are using integer version numbers. Do you want any extra validation?

@@ +156,5 @@
> +                self.values.append((value[0:2], LooseVersion(value[2:])))
> +            else:
> +                self.values.append((value[0], LooseVersion(value[1:])))
> +        else:
> +            return errors.fatal('Malformed flag: %s' % definition)

Again, please be more robust about len(value) being too short.

@@ +222,5 @@
> +           flags = Flags('contentaccessible=yes', 'appversion>=3.5')
> +        '''
> +        OrderedDict.__init__(self)
> +        for f in flags:
> +            name = re.split(r'([!<>=]+)', f)

Please use re.compile for regular expressions used multiple times.

@@ +235,5 @@
> +    def __str__(self):
> +        '''
> +        Serialize the set of flags.
> +        '''
> +        return ' '.join(str(self[k]) for k in self)

Does this work?! I thought you needed to enclose the comprehension inside an explicit [] or ().

::: python/mozbuild/mozpack/chrome/manifest.py
@@ +5,5 @@
> +import re
> +import os
> +from urlparse import urlparse
> +import mozpack.path
> +from flags import Flags

from .flags import Flags

(as written won't work in Python 3 because all non-.-prefixed imports in Python 3 are absolute).

@@ +9,5 @@
> +from flags import Flags
> +from mozpack.errors import errors
> +
> +
> +def rebase_path(oldbase, base, relpath):

Move to path.py?

@@ +38,5 @@
> +          data.
> +        - type: the manifest entry type (e.g. 'content' in
> +          'content global content/global/')
> +        - allowedFlags: a set of flags allowed to be defined for the given
> +          manifest entry type.

You may consider using some of the features in the abc package to make an abstract base class.

@@ +71,5 @@
> +    def serialize(self, *args):
> +        '''
> +        Serialize the manifest entry.
> +        '''
> +        entry = [self.type] + list(args)

Is args not already a list?

@@ +78,5 @@
> +            entry.append(flags)
> +        return ' '.join(entry)
> +
> +    def __eq__(self, other):
> +        return str(self) == str(other) and self.base == other.base

I'd make the base comparison before serializing for performance reasons (not that it matters that much).

Per http://docs.python.org/2/reference/datamodel.html#object.__lt__, if you define __eq__, you should also define __ne__.

@@ +343,5 @@
> +    Parse a line from a manifest file with the given base directory and
> +    return the corresponding ManifestEntry instance.
> +    '''
> +    # Remove comments
> +    cmd = re.sub(r'\s*#.*$', '', line).strip().split()

Save this regular expression with re.compile()

::: python/mozbuild/mozpack/errors.py
@@ +8,5 @@
> +class ErrorMessage(Exception):
> +    '''
> +    Exception type raised from errors.error() and errors.fatal()
> +    '''
> +    pass

"pass" is not needed if you have a docstring.

I also prefer not wasting the extra lines for the triple quotes. But, meh.

@@ +18,5 @@
> +    '''
> +    pass
> +
> +
> +class errors(object):

This is an interesting class. I like the overall concept. But, I don't care much for the singleton pattern. Would you mind removing all the @staticmethod, turning this into class ErrorCollector (or somthing) and then exporting an instance as "errors"? That way, if we ever want to use this class elsewhere (I could see a few use cases), we won't be confined by the singleton.

@@ +131,5 @@
> +                errors._context.append((self._file, self._line))
> +
> +        def __exit__(self, type, value, tb):
> +            if self._file and self._line:
> +                errors._context.pop()

Could you use contextlib.contextmanager here?

@contextmanager
def context(file, line):
    errors._context.append((file, line))
    yield
    errors._context.pop()

@@ +142,5 @@
> +        def __exit__(self, type, value, tb):
> +            count = errors._count
> +            errors._count = None
> +            if count:
> +                raise AccumulatedErrors()

Ditto.

::: python/mozbuild/mozpack/executables.py
@@ +30,5 @@
> +        - the file extension on OS/2
> +        - the file signature on OS/X and ELF systems (GNU/Linux, Android, BSD,
> +          Solaris)
> +
> +    As this function is intendeda for use to choose between the ExecutableFile

"intendeda"

@@ +40,5 @@
> +        return False
> +
> +    if substs['OS_ARCH'] == 'OS2':
> +        return path.lower().endswith(substs['DLL_SUFFIX']) or \
> +            path.lower().endswith(substs['BIN_SUFFIX'])

It's possible to pass a tuple to endswith:

  return path.lower().endswith((substs['DLL_SUFFIX'], substs['BIN_SUFFIX']))

@@ +49,5 @@
> +            return False
> +        signature = struct.unpack('>L', signature)[0]
> +        if signature in executable_signatures:
> +            return True
> +        if signature != 0xcafebabe:  # mach-o FAT binary

Please constify.

@@ +62,5 @@
> +        num = f.read(4)
> +        if len(num) < 4:
> +            return False
> +        num = struct.unpack('>L', num)[0]
> +        return num < 20

What are the chances of new Mach-O architectures being rolled out? I would either compare against the lowest Java class format version or the current number of Macho-O architectures.

@@ +89,5 @@
> +    Return whether elfhack() should be called
> +    '''
> +    # elfhack only supports libraries. We should check the ELF header for
> +    # the right flag, but checking the file extension works too.
> +    return substs['USE_ELF_HACK'] and path.endswith(substs['DLL_SUFFIX'])

Any reason why this doesn't use is_executable?

::: python/mozbuild/mozpack/mozjar.py
@@ +51,5 @@
> +    JarStruct subclasses instances can be either initialized from existing data
> +    (deserialized), or with empty fields.
> +    '''
> +
> +    TYPE_MAPPING = {'uint32': ('I', 4), 'uint16': ('H', 2)}

Can you store a struct.Struct instance in here? Apparently feeding formatting strings into struct.[un]pack() is inefficient because the formatting string needs to be reparsed all the time.

@@ +59,5 @@
> +        Create an instance at the given offset within the given data. Both data
> +        and offset may be omitted to create an instance with empty fields.
> +        Omitting only one of data or offset is not allowed.
> +        '''
> +        assert self.MAGIC and self.STRUCT

Perhaps some abc magic could be used here?

@@ +61,5 @@
> +        Omitting only one of data or offset is not allowed.
> +        '''
> +        assert self.MAGIC and self.STRUCT
> +        if data:
> +            self._init_data(data, offset)

Perhaps offset should be checked before calling _init_data() since _init_data() will assert if None.

@@ +77,5 @@
> +        if offset < 0:
> +            offset = len(data) + offset
> +        self.signature, size = JarStruct.get_data('uint32', data, offset)
> +        if self.signature != self.MAGIC:
> +            raise RuntimeError('Bad magic')

I'd create a JarReaderError class and use that.

@@ +90,5 @@
> +            else:
> +                size = sizes[t]
> +                value = data[offset:offset + size]
> +            if not name in sizes:
> +                setattr(self, name, value)

Instead of setting an attribute directly on the instance, I would instead keep a separate dict and define __getitem__ to facilitate access. This way, there is no potential collision between a built-in or method/property and the field names.

@@ +125,5 @@
> +        '''
> +        Serialize the data structure according to the data structure definition
> +        from self.STRUCT.
> +        '''
> +        serialized = struct.pack('<I', self.signature)

Please store a struct.Struct somewhere.

@@ +151,5 @@
> +            if type in JarStruct.TYPE_MAPPING:
> +                size += JarStruct.TYPE_MAPPING[type][1]
> +            else:
> +                size += len(getattr(self, name))
> +        return size

I'm not sure overloading __len__ is appropriate here, especially since [] access doesn't work on this class. I'd just define a regular method name or property.

@@ +199,5 @@
> +        ('filecomment', 'filecomment_size'),
> +    ]
> +
> +    def __hash__(self):
> +        return hash(self.filename)

This seems dangerous to me. Why do we need __hash__ implemented?

@@ +286,5 @@
> +        if hasattr(self, '_uncompressed_data'):
> +            return self._uncompressed_data
> +        data = self.compressed_data
> +        if self.compressed:
> +            data = zlib.decompress(data, -15)

That's an interesting magic number.

@@ +288,5 @@
> +        data = self.compressed_data
> +        if self.compressed:
> +            data = zlib.decompress(data, -15)
> +        if len(data) != self.uncompressed_size:
> +            raise RuntimeError('Corrupted file? %s' % self.filename)

Use a custom error class.

@@ +308,5 @@
> +            self._data = open(file, 'rb').read()
> +        else:
> +            self._data = file.read()
> +        # Scan for the End of Central Directory Record signature
> +        offset = -22

Magic number?

@@ +372,5 @@
> +            if e.offset < preload:
> +                self._lastPreloaded = e.filename
> +            else:
> +                break
> +        return self._lastPreloaded

Perhaps lastPreloaded could be computed as entries is populated?

::: python/mozbuild/mozpack/path.py
@@ +93,5 @@
> +        'foo/bar' matches 'foo/**/bar', or '**/bar'
> +    '''
> +    if not pattern:
> +        return True
> +    if isinstance(path, basestring):

basestring isn't Python 3 compatible.

@@ +96,5 @@
> +        return True
> +    if isinstance(path, basestring):
> +        path = split(path)
> +    if isinstance(pattern, basestring):
> +        pattern = split(pattern)

What if not isinstance? Perhaps you should raise?
(In reply to Gregory Szorc [:gps] from comment #41)
> Comment on attachment 694507 [details] [diff] [review]
> Import new packager code, part 1.
> 
> Review of attachment 694507 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Partial review. I still haven't gone over copier.py, files.py, and unify.py.
> 
> I see tests in another patch on this bug. Could they get merged into this
> patch or could you request review from me?
> 
> I want to believe that a lot of the code from flags.py could be reused
> outside the packager. What are your thoughts on this?

A lot of code from different files in the packager code can be reused outside the packager. How about caring when we come to that?

> > +    def add_def(self, definition):
> 
> add_definition()
> 
> Or, perhaps set_definition() or set_value() might be more appropriate. "add"
> implies "append" to me.

Apart from binary flags, add_def *does* append.

> @@ +32,5 @@
> > +        if definition == self.name:
> > +            self.value = True
> > +            return
> > +        assert(definition.startswith(self.name))
> > +        if definition[len(self.name)] != '=':
> 
> Possible IndexError if definition == self.name

There's a if definition == self.name branch above that prevents that.

> @@ +87,5 @@
> > +            self.values.append(('==', value[1:]))
> > +        elif value.startswith('!='):
> > +            self.values.append(('!=', value[2:]))
> > +        else:
> > +            return errors.fatal('Malformed flag: %s' % definition)
> 
> Perhaps you should explicitly check for value == '' (value == self.name)
> 
> @@ +110,5 @@
> > +            return True
> > +        for comparison, val in self.values:
> > +            if eval('value %s val' % comparison):
> > +                return True
> > +        return False
> 
> I tend to avoid eval() out of principle. If you don't need eval(), I would
> just perform the == or != inline.

For =/!=, sure, but for <, <=, >=, >, =, it's a lot of complication for not much benefit. Especially considering the only variable is the comparison operator, and it's controlled. And then, using eval for =/!= as well makes the code the same in both places.
 
> @@ +154,5 @@
> > +        elif value[0] in ['<', '>']:
> > +            if value[1] == '=':
> > +                self.values.append((value[0:2], LooseVersion(value[2:])))
> > +            else:
> > +                self.values.append((value[0], LooseVersion(value[1:])))
> 
> IIRC, LooseVersion will accept "obviously" non-version strings like
> "foo.bar". All your examples are using integer version numbers. Do you want
> any extra validation?

Probably.

> @@ +235,5 @@
> > +    def __str__(self):
> > +        '''
> > +        Serialize the set of flags.
> > +        '''
> > +        return ' '.join(str(self[k]) for k in self)
> 
> Does this work?! I thought you needed to enclose the comprehension inside an
> explicit [] or ().

It works when the function takes an iterable.

> @@ +9,5 @@
> > +from flags import Flags
> > +from mozpack.errors import errors
> > +
> > +
> > +def rebase_path(oldbase, base, relpath):
> 
> Move to path.py?

Makes sense.

> @@ +71,5 @@
> > +    def serialize(self, *args):
> > +        '''
> > +        Serialize the manifest entry.
> > +        '''
> > +        entry = [self.type] + list(args)
> 
> Is args not already a list?

It's a tuple.

> @@ +18,5 @@
> > +    '''
> > +    pass
> > +
> > +
> > +class errors(object):
> 
> This is an interesting class. I like the overall concept. But, I don't care
> much for the singleton pattern. Would you mind removing all the
> @staticmethod, turning this into class ErrorCollector (or somthing) and then
> exporting an instance as "errors"? That way, if we ever want to use this
> class elsewhere (I could see a few use cases), we won't be confined by the
> singleton.

WFM
 
> @@ +131,5 @@
> > +                errors._context.append((self._file, self._line))
> > +
> > +        def __exit__(self, type, value, tb):
> > +            if self._file and self._line:
> > +                errors._context.pop()
> 
> Could you use contextlib.contextmanager here?
> 
> @contextmanager
> def context(file, line):
>     errors._context.append((file, line))
>     yield
>     errors._context.pop()

oh.

> @@ +62,5 @@
> > +        num = f.read(4)
> > +        if len(num) < 4:
> > +            return False
> > +        num = struct.unpack('>L', num)[0]
> > +        return num < 20
> 
> What are the chances of new Mach-O architectures being rolled out? I would
> either compare against the lowest Java class format version or the current
> number of Macho-O architectures.

I just ported the code from build/macosx/universal/unify.

> @@ +89,5 @@
> > +    Return whether elfhack() should be called
> > +    '''
> > +    # elfhack only supports libraries. We should check the ELF header for
> > +    # the right flag, but checking the file extension works too.
> > +    return substs['USE_ELF_HACK'] and path.endswith(substs['DLL_SUFFIX'])
> 
> Any reason why this doesn't use is_executable?

elfhack doesn't support programs, only libraries.

> ::: python/mozbuild/mozpack/mozjar.py
> @@ +51,5 @@
> > +    JarStruct subclasses instances can be either initialized from existing data
> > +    (deserialized), or with empty fields.
> > +    '''
> > +
> > +    TYPE_MAPPING = {'uint32': ('I', 4), 'uint16': ('H', 2)}
> 
> Can you store a struct.Struct instance in here? Apparently feeding
> formatting strings into struct.[un]pack() is inefficient because the
> formatting string needs to be reparsed all the time.
> 
> @@ +61,5 @@
> > +        Omitting only one of data or offset is not allowed.
> > +        '''
> > +        assert self.MAGIC and self.STRUCT
> > +        if data:
> > +            self._init_data(data, offset)
> 
> Perhaps offset should be checked before calling _init_data() since
> _init_data() will assert if None.

Is there a point double checking it?

> @@ +125,5 @@
> > +        '''
> > +        Serialize the data structure according to the data structure definition
> > +        from self.STRUCT.
> > +        '''
> > +        serialized = struct.pack('<I', self.signature)
> 
> Please store a struct.Struct somewhere.

I'm not convinced it matters. I haven't seen struct.pack anywhere significant in profiles.

> @@ +199,5 @@
> > +        ('filecomment', 'filecomment_size'),
> > +    ]
> > +
> > +    def __hash__(self):
> > +        return hash(self.filename)
> 
> This seems dangerous to me. Why do we need __hash__ implemented?

Mmmm it was needed in earlier iterations, but it looks like it's no used anymore.

> @@ +286,5 @@
> > +        if hasattr(self, '_uncompressed_data'):
> > +            return self._uncompressed_data
> > +        data = self.compressed_data
> > +        if self.compressed:
> > +            data = zlib.decompress(data, -15)
> 
> That's an interesting magic number.

It's -MAX_WBITS. Is that any more useful?

> @@ +308,5 @@
> > +            self._data = open(file, 'rb').read()
> > +        else:
> > +            self._data = file.read()
> > +        # Scan for the End of Central Directory Record signature
> > +        offset = -22
> 
> Magic number?

It's the size of an end of central directory record.

> @@ +96,5 @@
> > +        return True
> > +    if isinstance(path, basestring):
> > +        path = split(path)
> > +    if isinstance(pattern, basestring):
> > +        pattern = split(pattern)
> 
> What if not isinstance? Perhaps you should raise?

It's meant to accept iterables. But there is also no proper test whether something is iterable.
Attached patch Import new packager code (obsolete) — Splinter Review
This addresses some of gps's comments, and adds the remainder of the code.
Attachment #695771 - Flags: review?(ted)
Attachment #695771 - Flags: review?(gps)
Attachment #694507 - Attachment is obsolete: true
Attachment #694507 - Flags: review?(ted)
Attachment #694507 - Flags: review?(khuey)
Attachment #694507 - Flags: review?(gps)
Attachment #695772 - Flags: review?(ted)
Attachment #695772 - Flags: review?(gps)
Attachment #689367 - Attachment is obsolete: true
Remove some more.
Attachment #695773 - Flags: review?(ted)
Attachment #693894 - Attachment is obsolete: true
Attachment #693894 - Flags: review?(ted)
Comment on attachment 695771 [details] [diff] [review]
Import new packager code

Review of attachment 695771 [details] [diff] [review]:
-----------------------------------------------------------------

Random nits:

::: python/mozbuild/mozpack/chrome/flags.py
@@ +15,5 @@
> +    '''
> +    Class for flags in manifest entries in the form:
> +        "flag"   (same as "flag=true")
> +        "flag=yes|true|1"
> +        "flag="no|false|0"

Odd number of quotes

@@ +48,5 @@
> +        '''
> +        if value in ['yes', 'true', '1', True]:
> +            return self.value in ['yes', 'true', '1', True]
> +        if value in ['no', 'false', '0', False]:
> +            return self.value in ['no', 'false', '0', None]

False/None looks strange... Might warrant a comment. Also, tuples, maybe?

@@ +59,5 @@
> +        '''
> +        if self.value is None:
> +            return ''
> +        elif self.value is True:
> +            return self.name

No else-after-return, please

@@ +150,5 @@
> +        assert(definition.startswith(self.name))
> +        value = definition[len(self.name):]
> +        if value.startswith('='):
> +            self.values.append(('==', LooseVersion(value[1:])))
> +        elif value[0] in ['<', '>'] and len(value) > 1:

Possible index out of range

::: python/mozbuild/mozpack/mozjar.py
@@ +369,5 @@
> +            # are the FAT attributes.
> +            xattr = entry['external_attr']
> +            # Skip directories
> +            if host == 0 and xattr & 0x10 or host == 3 and \
> +                    xattr & (040000 << 16):

Parens around and-in-or (or is it or-in-and?) for clarity
Attached patch Import new packager code (obsolete) — Splinter Review
Refreshed with Ms2ger comments, and updated support for NO_PKG_FILES.
Attachment #695969 - Flags: review?(ted)
Attachment #695969 - Flags: review?(gps)
Attachment #695771 - Attachment is obsolete: true
Attachment #695771 - Flags: review?(ted)
Attachment #695771 - Flags: review?(gps)
Update to support MOZ_PKG_FATAL_WARNINGS=0 (0 was necessary to properly disable from the command line, before)
Attachment #695970 - Flags: review?(ted)
Attachment #693892 - Attachment is obsolete: true
Attachment #693892 - Flags: review?(ted)
Context change after change to the packager.mk patch.
Attachment #695972 - Flags: review?(ted)
Attachment #695773 - Attachment is obsolete: true
Attachment #695773 - Flags: review?(ted)
Blocks: 825872
Comment on attachment 695969 [details] [diff] [review]
Import new packager code

Review of attachment 695969 [details] [diff] [review]:
-----------------------------------------------------------------

Another partial review. Only got to the files "left of" mozjar.py in the splinter list. I assume splinter is in patch order?? It's getting late. Hopefully I'll get to the rest tomorrow.

I would /really/ like for you to invest some time in writing high-level docs for all this code. Even if it is a simple README in python/mozbuild/mozpack detailing what each module is for and how they interact with each other, I think that would go a long way. It definitely would have saved me a lot of time exploring before figuring out where to start review!

If you're not going to address Python 3 compatibility, we should probably invent a comment keyword for known issues to aid ourselves in the future. "# FUTURE PY3K"? .iteritems() and basestring are the big Python 3 compatibility issues. These are pretty obvious for anyone porting Python 3, so maybe it's not worth calling each out?

::: python/mozbuild/mozpack/copier.py
@@ +15,5 @@
> +except ImportError:
> +    from simplejson import OrderedDict
> +
> +
> +def ensureParentDir(file):

Nit: Underscore, not camel case for function names.

@@ +19,5 @@
> +def ensureParentDir(file):
> +    '''Ensures the directory parent to the given file exists'''
> +    dir = os.path.dirname(file)
> +    if dir and not os.path.exists(dir):
> +        try:

I prefer early return as a general style.

if not dir or os.path.exists(dir):
    return

try:

@@ +21,5 @@
> +    dir = os.path.dirname(file)
> +    if dir and not os.path.exists(dir):
> +        try:
> +            os.makedirs(dir)
> +        except OSError, error:

except OSError as error:

@@ +45,5 @@
> +    def add(self, path, content):
> +        '''
> +        Add a BaseFile instance to the container, under the given path.
> +        '''
> +        assert isinstance(content, BaseFile) or content is None

Perhaps you should "content=None" in the function declaration?

@@ +71,5 @@
> +            return self.paths()
> +        if not pattern in self._files:
> +            return [p for p in self.paths()
> +                    if mozpack.path.basedir(p, [pattern]) == pattern]
> +        return [pattern]

Please document returning [pattern].

@@ +111,5 @@
> +        '''
> +        Return the BaseFile instance stored in the container for the given
> +        path.
> +        '''
> +        return self._files[path]

It's kinda weird for __contains__ -> to not imply __getitem__ will be successful. Python sometimes magically implements some magic methods in terms of what you have implemented (using what it can when available). I worry that different semantics for __contains__ and __getitem__ will lead to weirdness.

@@ +120,5 @@
> +            for path, file in registry:
> +                (...)
> +        '''
> +        for path, file in self._files.iteritems():
> +            yield path, file

return self._files.iteritems()

Or for Python 3 compatibility:

return self._files.items()

@@ +182,5 @@
> +        a jar file.
> +        If the destination jar file exists, its (compressed) contents are used
> +        instead of the registered BaseFile instances when appropriate.
> +        '''
> +        class DeflaterDest(Dest):

I generally think nested classes are silly and just make reading code harder.

@@ +213,5 @@
> +
> +            def exists(self):
> +                return self.deflater is not None
> +
> +        if isinstance(dest, basestring):

py3k basestring

@@ +216,5 @@
> +
> +        if isinstance(dest, basestring):
> +            dest = Dest(dest)
> +        else:
> +            assert isinstance(dest, Dest)

drop the else

@@ +218,5 @@
> +            dest = Dest(dest)
> +        else:
> +            assert isinstance(dest, Dest)
> +
> +        from mozpack.mozjar import JarWriter, JarReader

module-level imports, please.

@@ +222,5 @@
> +        from mozpack.mozjar import JarWriter, JarReader
> +        try:
> +            old_jar = JarReader(dest)
> +        except:
> +            old_jar = []

bareword except is evil. Please trap only what you need (I assume some kind of IOError for missing file?)

::: python/mozbuild/mozpack/files.py
@@ +73,5 @@
> +
> +        can_skip_content_check = False
> +        if not dest.exists():
> +            can_skip_content_check = True
> +        elif hasattr(self, 'path') and hasattr(dest, 'path'):

What if self.path is None? Perhaps getattr(self, 'path', None) would be more robust?

@@ +76,5 @@
> +            can_skip_content_check = True
> +        elif hasattr(self, 'path') and hasattr(dest, 'path'):
> +            if int(os.path.getmtime(self.path) * 1000) \
> +                    <= int(os.path.getmtime(dest.path) * 1000):
> +                return False

This seems dangerous to me. Convince me we shouldn't perform stronger size/content based validation here.

@@ +101,5 @@
> +                break
> +            # If the read content differs between origin and destination,
> +            # write what was read up to now, and copy the remainder.
> +            if len(dest_content) != len(src_content) or \
> +                    dest_content != src_content:

Python is smart enough to short-circuit deep compare if the lengths are different.

@@ +107,5 @@
> +                shutil.copyfileobj(src, dest)
> +                break
> +        if hasattr(self, 'path') and hasattr(dest, 'path'):
> +            shutil.copystat(self.path, dest.path)
> +        return True

I worry about not closing dest when finished. e.g. in ExecutableFile.copy(), you try an os.remove() after a File.Copy(). I'm pretty sure some filesystems don't like removing a file that is open somewhere.

@@ +115,5 @@
> +        Return a file-like object allowing to read() the content of the
> +        associated file. This is meant to be overloaded in subclasses to return
> +        a custom file-like object.
> +        '''
> +        assert self.path

assert self.path is not None

@@ +133,5 @@
> +    File class for executable and library files on OS/2, OS/X and ELF systems.
> +    (see mozpack.executables.is_executable documentation).
> +    '''
> +    def copy(self, dest):
> +        assert isinstance(dest, basestring)

basestring doesn't exist in Python 3.

@@ +135,5 @@
> +    '''
> +    def copy(self, dest):
> +        assert isinstance(dest, basestring)
> +        if File.copy(self, dest):
> +            try:

if not File.copy(self, dest):
    return

try:

@@ +162,5 @@
> +    File class for members of a jar archive. DeflatedFile.copy() effectively
> +    extracts the file from the jar archive.
> +    '''
> +    def __init__(self, file):
> +        from mozpack.mozjar import JarFileReader

Do this at module scope, please.

@@ +178,5 @@
> +    (using the add() and remove() member functions), and links them at copy()
> +    time.
> +    '''
> +    def __init__(self):
> +        self._files = set()

Does using an unordered data structure impact us at all? Should you perhaps throw a sorted(self._files) in somewhere to ensure consistent ordering?

@@ +203,5 @@
> +        destination given as a string or a Dest instance. Avoids an expensive
> +        XPT linking if the interfaces in an existing destination match those of
> +        the individual XPTs to link.
> +        '''
> +        if isinstance(dest, basestring):

basestring and py3k

@@ +206,5 @@
> +        '''
> +        if isinstance(dest, basestring):
> +            dest = Dest(dest)
> +        else:
> +            assert isinstance(dest, Dest)

No need for the else

@@ +220,5 @@
> +                     if i.iid != Interface.UNRESOLVED_IID)
> +            identical = True
> +            for f in self._files:
> +                typelib = Typelib.read(f.open())
> +                for i in typelib.interfaces:

Typelib vs typelibs vs typelib make this somewhat difficult to read!

@@ +225,5 @@
> +                    if i.iid != Interface.UNRESOLVED_IID and \
> +                            not (i.name in dest_interfaces and
> +                                 i == dest_interfaces[i.name]):
> +                        identical = False
> +                        break

Do you think splitting this out into separate tests + continue would be easier to read?

@@ +227,5 @@
> +                                 i == dest_interfaces[i.name]):
> +                        identical = False
> +                        break
> +            if identical:
> +                return

Base type's copy() returns a bool. Why no return here?

@@ +230,5 @@
> +            if identical:
> +                return
> +        s = StringIO()
> +        xpt_link(typelibs).write(s)
> +        dest.write(s.getvalue())

Ugh, inconsistent behavior for similarly named methods. Your code is correct. xpt_link is wrong.

@@ +289,5 @@
> +        '''
> +        Iterate over entries in the manifest file.
> +        '''
> +        for e in self._entries:
> +            yield e

return iter(self._entries)

@@ +355,5 @@
> +            generator = self._find_dir(pattern)
> +        else:
> +            generator = self._find_file(pattern)
> +        for p, f in generator:
> +            yield p, f

return generator

@@ +414,5 @@
> +                        yield p_, f
> +        else:
> +            for p, f in self._find_glob(mozpack.path.join(base, pattern[0]),
> +                                        pattern[1:]):
> +                yield p, f

I'll confess to only glancing over the find* functions.

::: python/mozbuild/mozpack/packager/formats.py
@@ +62,5 @@
> +    '''
> +    Formatter for the flat package format.
> +    '''
> +    def __init__(self, copier):
> +        self.copier = copier

assert isinstance(copier, ...)?
(In reply to Gregory Szorc [:gps] from comment #50)
> If you're not going to address Python 3 compatibility, we should probably
> invent a comment keyword for known issues to aid ourselves in the future. "#
> FUTURE PY3K"? .iteritems() and basestring are the big Python 3 compatibility
> issues. These are pretty obvious for anyone porting Python 3, so maybe it's
> not worth calling each out?

Yeah i doubt it's any useful. Others include StringIO and some other stuff. The way to handle strings in a compatible way between py2 and py3 is just horrible and error prone, and since at the very least mozunit and xpt are not py3 ready (and far from it), i couldn't validate that everything works properly, so I just skipped python 3 compatibility altogether.

> > +        '''
> > +        Return the BaseFile instance stored in the container for the given
> > +        path.
> > +        '''
> > +        return self._files[path]
> 
> It's kinda weird for __contains__ -> to not imply __getitem__ will be
> successful. Python sometimes magically implements some magic methods in
> terms of what you have implemented (using what it can when available). I
> worry that different semantics for __contains__ and __getitem__ will lead to
> weirdness.

What would you expect then? return [self.match(path)] ? or something that returns a list when there is more than one item, but just the item when there is only one ? or something that returns a list when the path matches a directory or a glob match (whatever number of results there are) and just the item when path is self._files ?

> @@ +182,5 @@
> > +        a jar file.
> > +        If the destination jar file exists, its (compressed) contents are used
> > +        instead of the registered BaseFile instances when appropriate.
> > +        '''
> > +        class DeflaterDest(Dest):
> 
> I generally think nested classes are silly and just make reading code harder.

I'm not a fan of exposing every kind of details at the module level.

> @@ +218,5 @@
> > +            dest = Dest(dest)
> > +        else:
> > +            assert isinstance(dest, Dest)
> > +
> > +        from mozpack.mozjar import JarWriter, JarReader
> 
> module-level imports, please.

The reason to not import at module level here is to avoid importing mozjar or xpt when not dealing with jars and xpts, which the code may be used for.
 
> @@ +222,5 @@
> > +        from mozpack.mozjar import JarWriter, JarReader
> > +        try:
> > +            old_jar = JarReader(dest)
> > +        except:
> > +            old_jar = []
> 
> bareword except is evil. Please trap only what you need (I assume some kind
> of IOError for missing file?)

There are *many* ways JarReader() can fail, all of which can be handled the same way.

> > +            can_skip_content_check = True
> > +        elif hasattr(self, 'path') and hasattr(dest, 'path'):
> > +            if int(os.path.getmtime(self.path) * 1000) \
> > +                    <= int(os.path.getmtime(dest.path) * 1000):
> > +                return False
> 
> This seems dangerous to me. Convince me we shouldn't perform stronger
> size/content based validation here.

That's not any more dangerous than make rules. Size/content based validation also would make us recopy and restrip all the libraries every time (and it's not exactly fast to do so)

> @@ +107,5 @@
> > +                shutil.copyfileobj(src, dest)
> > +                break
> > +        if hasattr(self, 'path') and hasattr(dest, 'path'):
> > +            shutil.copystat(self.path, dest.path)
> > +        return True
> 
> I worry about not closing dest when finished. e.g. in ExecutableFile.copy(),
> you try an os.remove() after a File.Copy(). I'm pretty sure some filesystems
> don't like removing a file that is open somewhere.

I'll check that. ISTR I had problems with dest.close() causing problems with some StringIOs (because closing a StringIO does bad things)
 
> @@ +178,5 @@
> > +    (using the add() and remove() member functions), and links them at copy()
> > +    time.
> > +    '''
> > +    def __init__(self):
> > +        self._files = set()
> 
> Does using an unordered data structure impact us at all? Should you perhaps
> throw a sorted(self._files) in somewhere to ensure consistent ordering?

order doesn't matter much.

> @@ +227,5 @@
> > +                                 i == dest_interfaces[i.name]):
> > +                        identical = False
> > +                        break
> > +            if identical:
> > +                return
> 
> Base type's copy() returns a bool. Why no return here?

Leftover from when the base type's copy() didn't return anything.
I chatted with jimm on IRC and we decided it's best to wait until 21 to land this. Reasons include:

* I am extremely busy trying to land 718066 in 20
* Sweeping changes late in the release cycle are somewhat risky
* It's only delayed for a few days and according to jimm is "no biggie"

I will still try to perform the reviews ASAP. However, if I have to choose between this and bug 718066, 718066 wins.
Whiteboard: [metro-it2] → [metro-it2][completed-elm]
Comment on attachment 695969 [details] [diff] [review]
Import new packager code

Review of attachment 695969 [details] [diff] [review]:
-----------------------------------------------------------------

Only really got to mozjar.py in this review. I'll finish this eventually...

::: python/mozbuild/mozpack/chrome/flags.py
@@ +7,5 @@
> +from mozpack.errors import errors
> +try:
> +    from collections import OrderedDict
> +except ImportError:
> +    from simplejson import OrderedDict

We're on Python 2.7 now. No more simplejson!

::: python/mozbuild/mozpack/mozjar.py
@@ +18,5 @@
> +JAR_STORED = ZIP_STORED
> +JAR_DEFLATED = ZIP_DEFLATED
> +
> +
> +class JarReaderError(RuntimeError):

RuntimeError doesn't seem appropriate given the description at http://docs.python.org/2/library/exceptions.html#exceptions.RuntimeError. I think this should be a child of Exception.

@@ +57,5 @@
> +    JarStruct subclasses instances can be either initialized from existing data
> +    (deserialized), or with empty fields.
> +    '''
> +
> +    TYPE_MAPPING = {'uint32': ('I', 4), 'uint16': ('H', 2)}

I /think/ everywhere you use the format field you pass the result into struct.{pack,unpack}. I'd consider storing a pair of struct.Struct instances for performance reasons (I've left this review comment before and am too lazy to see if you left a rebuttal).

@@ +114,5 @@
> +            if not name in sizes:
> +                if t in JarStruct.TYPE_MAPPING:
> +                    self._values[name] = 0
> +                else:
> +                    self._values[name] = ''

for name, t in self.STRUCT.iteritems():
    if name in sizes:
        continue

    self._values[name] = 0 if t in JarStruct.TYPE_MAPPING else ''

@@ +123,5 @@
> +        Deserialize a single field of given type (must be one of
> +        JarStruct.TYPE_MAPPING) at the given offset in the given data.
> +        '''
> +        assert type in JarStruct.TYPE_MAPPING
> +        assert data is not None and offset is not None

I'd use separate asserts to ease debugging.

@@ +132,5 @@
> +        '''
> +        Serialize the data structure according to the data structure definition
> +        from self.STRUCT.
> +        '''
> +        serialized = struct.pack('<I', self.signature)

Using strings here seems like an obvious perf loss. I would either use [c]StringIO or would append strings to a list then join together on empty string at the end.

@@ +141,5 @@
> +                format, size = JarStruct.TYPE_MAPPING[t]
> +                if name in sizes:
> +                    value = len(self[sizes[name]])
> +                else:
> +                    value = self[name]

value = len(self[sizes[name]]) if name in sizes else self[name]

Oh, hmm. That's somewhat hard to read. Your call.

@@ +159,5 @@
> +            if type in JarStruct.TYPE_MAPPING:
> +                size += JarStruct.TYPE_MAPPING[type][1]
> +            else:
> +                size += len(self[name])
> +        return size

This is a candidate for caching. Would need to invalidate on mutate of course.

@@ +167,5 @@
> +
> +    def __setitem__(self, key, value):
> +        if not key in self.STRUCT:
> +            raise KeyError(key)
> +        self._values[key] = value

Should we allow setting of the *_size keys?

@@ +174,5 @@
> +        return key in self._values
> +
> +    def __iter__(self):
> +        for key, value in self._values.iteritems():
> +            yield key, value

return self._values.iteritems()

@@ +240,5 @@
> +        ('filename_size', 'uint16'),
> +        ('extra_field_size', 'uint16'),
> +        ('filename', 'filename_size'),
> +        ('extra_field', 'extra_field_size'),
> +    ])

I didn't verify these match the old implementation. I assume things will blow up if you got this wrong :)

@@ +248,5 @@
> +    '''
> +    File-like class for use by JarReader to give access to individual files
> +    within a Jar archive.
> +    '''
> +    def __init__(self, header, data, offset):

Instead of shuffling data-offset pairs around, you may consider using the built-in "buffer" or "memoryview" types. These give you the convenience of not having to worry about offsets without sacrificing perf of new allocations.

@@ +306,5 @@
> +        if hasattr(self, '_uncompressed_data'):
> +            return self._uncompressed_data
> +        data = self.compressed_data
> +        if self.compressed:
> +            data = zlib.decompress(data, -15)

Magic number to module or class-level "constant" please.

@@ +326,5 @@
> +        '''
> +        if isinstance(file, basestring):
> +            self._data = open(file, 'rb').read()
> +        else:
> +            self._data = file.read()

I would use separate named arguments to denote behavior.

@@ +328,5 @@
> +            self._data = open(file, 'rb').read()
> +        else:
> +            self._data = file.read()
> +        # Scan for the End of Central Directory Record signature
> +        offset = -22

Document magic number. Move to module or class scope.

@@ +335,5 @@
> +            if signature == JarCdirEnd.MAGIC:
> +                break
> +            if offset == -len(self._data):
> +                raise JarReaderError('Not a jar?')
> +            offset -= 1

If we have to scan on opening, that seems like a deficiency in the encoding format. Probably not a significant perf concern or something that could be changed easily. Oh well.

@@ +355,5 @@
> +        if hasattr(self, '_entries'):
> +            return self._entries
> +        preload = 0
> +        if self.isOptimized:
> +            preload, _ = JarStruct.get_data('uint32', self._data, 0)

That _ is too Erlang! While I'm cool with it, the Python way is:

preload = JarStruct.get_data('uint32', self._data, 0)[0]

@@ +378,5 @@
> +        self._entries = entries
> +        return entries
> +
> +    @property
> +    def isOptimized(self):

is_optimized

@@ +389,5 @@
> +        return self._cdir_end['cdir_offset'] == \
> +            JarStruct.TYPE_MAPPING['uint32'][1]
> +
> +    @property
> +    def lastPreloaded(self):

last_preloaded

@@ +398,5 @@
> +        if hasattr(self, '_lastPreloaded'):
> +            return self._lastPreloaded
> +        self._lastPreloaded = None
> +        self.entries
> +        return self._lastPreloaded

We really need a @CachedProperty decorator in the tree if we don't already. So useful...

@@ +408,5 @@
> +        '''
> +        header = JarLocalFileHeader(self._data, entry['offset'])
> +        for key, value in entry:
> +            if key in header and header[key] != value:
> +                raise JarReaderError('Corrupted archive?')

Why might it be corrupt? More useful error message, please.

@@ +450,5 @@
> +        '''
> +        if isinstance(file, basestring):
> +            self._data = open(file, 'wb')
> +        else:
> +            self._data = file

Again, I prefer separate named arguments.

@@ +468,5 @@
> +        Context manager __exit__ method for JarWriter.
> +        '''
> +        self.close()
> +
> +    def close(self):

I confess to not looking at this function too hard.

How about renaming this to "finish"? "close" has different meaning to me.

@@ +564,5 @@
> +        '''
> +        if name in self._contents:
> +            raise JarWriterError("File %s already in JarWriter" % name)
> +        if compress is None:
> +            compress = self._compress

compress = self._compress if compress is None

@@ +591,5 @@
> +            entry['min_version'] = 10  # Version 1.0 for stored streams
> +            entry['general_flag'] = 0
> +            entry['compression'] = JAR_STORED
> +        # January 1st, 2010
> +        entry['lastmod_date'] = ((2010 - 1980) << 9) | (1 << 5) | 1

Huh?

@@ +631,5 @@
> +        '''
> +        self._data = cStringIO.StringIO()
> +        self.compress = compress
> +        if compress:
> +            self._deflater = zlib.compressobj(9, zlib.DEFLATED, -15)

There's that magic number again!

@@ +640,5 @@
> +        Append a buffer to the Deflater.
> +        '''
> +        self._data.write(data)
> +        if self.compress:
> +            if hasattr(self, '_deflater'):

Instead of hasattr, I'd just store None in __init__ and check for truthiness.
(In reply to Gregory Szorc [:gps] from comment #53)
> > +    TYPE_MAPPING = {'uint32': ('I', 4), 'uint16': ('H', 2)}
> 
> I /think/ everywhere you use the format field you pass the result into
> struct.{pack,unpack}. I'd consider storing a pair of struct.Struct instances
> for performance reasons (I've left this review comment before and am too
> lazy to see if you left a rebuttal).

I replied it didn't show up anywhere in profiling.

> @@ +132,5 @@
> > +        '''
> > +        Serialize the data structure according to the data structure definition
> > +        from self.STRUCT.
> > +        '''
> > +        serialized = struct.pack('<I', self.signature)
> 
> Using strings here seems like an obvious perf loss. I would either use
> [c]StringIO or would append strings to a list then join together on empty
> string at the end.

I don't think it matters.

> @@ +159,5 @@
> > +            if type in JarStruct.TYPE_MAPPING:
> > +                size += JarStruct.TYPE_MAPPING[type][1]
> > +            else:
> > +                size += len(self[name])
> > +        return size
> 
> This is a candidate for caching. Would need to invalidate on mutate of
> course.

Likewise.
 
> @@ +328,5 @@
> > +            self._data = open(file, 'rb').read()
> > +        else:
> > +            self._data = file.read()
> > +        # Scan for the End of Central Directory Record signature
> > +        offset = -22
> 
> Document magic number. Move to module or class scope.

It's not a magic number, it's the size of the structure.

> @@ +335,5 @@
> > +            if signature == JarCdirEnd.MAGIC:
> > +                break
> > +            if offset == -len(self._data):
> > +                raise JarReaderError('Not a jar?')
> > +            offset -= 1
> 
> If we have to scan on opening, that seems like a deficiency in the encoding
> format. Probably not a significant perf concern or something that could be
> changed easily. Oh well.

The structure may contain comments at the end, so has a variable-size, thus the scanning.

> > +        # January 1st, 2010
> > +        entry['lastmod_date'] = ((2010 - 1980) << 9) | (1 << 5) | 1
> 
> Huh?

Inherited from bug 592369 ; I think it can go away now, but i didn't want to risk it.
FHR appears to be landing tonight! So, I should be unblocked to focus on this review in earnest. I will be in Tahoe this weekend and won't have my computer with me. I hope to finish this review Monday. Please hold me to it.
Comment on attachment 695969 [details] [diff] [review]
Import new packager code

Review of attachment 695969 [details] [diff] [review]:
-----------------------------------------------------------------

I got to everything except for l10n-repack.py and packager.py. I /might/ finish tonight. Will need to find some energy though (skiing fatigue is catching up to me!).

::: python/mozbuild/mozpack/chrome/flags.py
@@ +7,5 @@
> +from mozpack.errors import errors
> +try:
> +    from collections import OrderedDict
> +except ImportError:
> +    from simplejson import OrderedDict

We're on Python 2.7 now. No more simplejson!

::: python/mozbuild/mozpack/mozjar.py
@@ +18,5 @@
> +JAR_STORED = ZIP_STORED
> +JAR_DEFLATED = ZIP_DEFLATED
> +
> +
> +class JarReaderError(RuntimeError):

RuntimeError doesn't seem appropriate given the description at http://docs.python.org/2/library/exceptions.html#exceptions.RuntimeError. I think this should be a child of Exception.

@@ +57,5 @@
> +    JarStruct subclasses instances can be either initialized from existing data
> +    (deserialized), or with empty fields.
> +    '''
> +
> +    TYPE_MAPPING = {'uint32': ('I', 4), 'uint16': ('H', 2)}

I /think/ everywhere you use the format field you pass the result into struct.{pack,unpack}. I'd consider storing a pair of struct.Struct instances for performance reasons (I've left this review comment before and am too lazy to see if you left a rebuttal).

@@ +114,5 @@
> +            if not name in sizes:
> +                if t in JarStruct.TYPE_MAPPING:
> +                    self._values[name] = 0
> +                else:
> +                    self._values[name] = ''

for name, t in self.STRUCT.iteritems():
    if name in sizes:
        continue

    self._values[name] = 0 if t in JarStruct.TYPE_MAPPING else ''

@@ +123,5 @@
> +        Deserialize a single field of given type (must be one of
> +        JarStruct.TYPE_MAPPING) at the given offset in the given data.
> +        '''
> +        assert type in JarStruct.TYPE_MAPPING
> +        assert data is not None and offset is not None

I'd use separate asserts to ease debugging.

@@ +132,5 @@
> +        '''
> +        Serialize the data structure according to the data structure definition
> +        from self.STRUCT.
> +        '''
> +        serialized = struct.pack('<I', self.signature)

Using strings here seems like an obvious perf loss. I would either use [c]StringIO or would append strings to a list then join together on empty string at the end.

@@ +141,5 @@
> +                format, size = JarStruct.TYPE_MAPPING[t]
> +                if name in sizes:
> +                    value = len(self[sizes[name]])
> +                else:
> +                    value = self[name]

value = len(self[sizes[name]]) if name in sizes else self[name]

Oh, hmm. That's somewhat hard to read. Your call.

@@ +159,5 @@
> +            if type in JarStruct.TYPE_MAPPING:
> +                size += JarStruct.TYPE_MAPPING[type][1]
> +            else:
> +                size += len(self[name])
> +        return size

This is a candidate for caching. Would need to invalidate on mutate of course.

@@ +167,5 @@
> +
> +    def __setitem__(self, key, value):
> +        if not key in self.STRUCT:
> +            raise KeyError(key)
> +        self._values[key] = value

Should we allow setting of the *_size keys?

@@ +174,5 @@
> +        return key in self._values
> +
> +    def __iter__(self):
> +        for key, value in self._values.iteritems():
> +            yield key, value

return self._values.iteritems()

@@ +240,5 @@
> +        ('filename_size', 'uint16'),
> +        ('extra_field_size', 'uint16'),
> +        ('filename', 'filename_size'),
> +        ('extra_field', 'extra_field_size'),
> +    ])

I didn't verify these match the old implementation. I assume things will blow up if you got this wrong :)

@@ +248,5 @@
> +    '''
> +    File-like class for use by JarReader to give access to individual files
> +    within a Jar archive.
> +    '''
> +    def __init__(self, header, data, offset):

Instead of shuffling data-offset pairs around, you may consider using the built-in "buffer" or "memoryview" types. These give you the convenience of not having to worry about offsets without sacrificing perf of new allocations.

@@ +306,5 @@
> +        if hasattr(self, '_uncompressed_data'):
> +            return self._uncompressed_data
> +        data = self.compressed_data
> +        if self.compressed:
> +            data = zlib.decompress(data, -15)

Magic number to module or class-level "constant" please.

@@ +326,5 @@
> +        '''
> +        if isinstance(file, basestring):
> +            self._data = open(file, 'rb').read()
> +        else:
> +            self._data = file.read()

I would use separate named arguments to denote behavior.

@@ +328,5 @@
> +            self._data = open(file, 'rb').read()
> +        else:
> +            self._data = file.read()
> +        # Scan for the End of Central Directory Record signature
> +        offset = -22

Document magic number. Move to module or class scope.

@@ +335,5 @@
> +            if signature == JarCdirEnd.MAGIC:
> +                break
> +            if offset == -len(self._data):
> +                raise JarReaderError('Not a jar?')
> +            offset -= 1

If we have to scan on opening, that seems like a deficiency in the encoding format. Probably not a significant perf concern or something that could be changed easily. Oh well.

@@ +355,5 @@
> +        if hasattr(self, '_entries'):
> +            return self._entries
> +        preload = 0
> +        if self.isOptimized:
> +            preload, _ = JarStruct.get_data('uint32', self._data, 0)

That _ is too Erlang! While I'm cool with it, the Python way is:

preload = JarStruct.get_data('uint32', self._data, 0)[0]

@@ +378,5 @@
> +        self._entries = entries
> +        return entries
> +
> +    @property
> +    def isOptimized(self):

is_optimized

@@ +389,5 @@
> +        return self._cdir_end['cdir_offset'] == \
> +            JarStruct.TYPE_MAPPING['uint32'][1]
> +
> +    @property
> +    def lastPreloaded(self):

last_preloaded

@@ +398,5 @@
> +        if hasattr(self, '_lastPreloaded'):
> +            return self._lastPreloaded
> +        self._lastPreloaded = None
> +        self.entries
> +        return self._lastPreloaded

We really need a @CachedProperty decorator in the tree if we don't already. So useful...

@@ +408,5 @@
> +        '''
> +        header = JarLocalFileHeader(self._data, entry['offset'])
> +        for key, value in entry:
> +            if key in header and header[key] != value:
> +                raise JarReaderError('Corrupted archive?')

Why might it be corrupt? More useful error message, please.

@@ +450,5 @@
> +        '''
> +        if isinstance(file, basestring):
> +            self._data = open(file, 'wb')
> +        else:
> +            self._data = file

Again, I prefer separate named arguments.

@@ +468,5 @@
> +        Context manager __exit__ method for JarWriter.
> +        '''
> +        self.close()
> +
> +    def close(self):

I confess to not looking at this function too hard.

How about renaming this to "finish"? "close" has different meaning to me.

@@ +564,5 @@
> +        '''
> +        if name in self._contents:
> +            raise JarWriterError("File %s already in JarWriter" % name)
> +        if compress is None:
> +            compress = self._compress

compress = self._compress if compress is None

@@ +591,5 @@
> +            entry['min_version'] = 10  # Version 1.0 for stored streams
> +            entry['general_flag'] = 0
> +            entry['compression'] = JAR_STORED
> +        # January 1st, 2010
> +        entry['lastmod_date'] = ((2010 - 1980) << 9) | (1 << 5) | 1

Huh?

@@ +631,5 @@
> +        '''
> +        self._data = cStringIO.StringIO()
> +        self.compress = compress
> +        if compress:
> +            self._deflater = zlib.compressobj(9, zlib.DEFLATED, -15)

There's that magic number again!

@@ +640,5 @@
> +        Append a buffer to the Deflater.
> +        '''
> +        self._data.write(data)
> +        if self.compress:
> +            if hasattr(self, '_deflater'):

Instead of hasattr, I'd just store None in __init__ and check for truthiness.

::: python/mozbuild/mozpack/packager/__init__.py
@@ +1,3 @@
> +# This Source Code Form is subject to the terms of the Mozilla Public
> +# License, v. 2.0. If a copy of the MPL was not distributed with this
> +# file, You can obtain one at http://mozilla.org/MPL/2.0/.

I'm not sure how I feel about so many things existing in an __init__.py for a package that already has a bunch of modules. Meh.

@@ +45,5 @@
> +        '''
> +        # Remove comments
> +        str = re.sub(r'\s*;.*$', '', str).strip()
> +        if not str:
> +            return

I would just:

str = str.strip()
if str.startswith(';'):
    return

@@ +50,5 @@
> +        if len(str) >= 2 and str[0] == '[' and str[-1] == ']':
> +            if len(str) == 2 or re.search(r'[\[\]\s]', str[1:-1]):
> +                errors.fatal('Malformed manifest')
> +            else:
> +                self._component = str[1:-1]

I /think/ if you special case the str == '[]' case this would be easier to read. I'm also tempted to recommend .startswith and .endswith instead of [0] and [-1] but meh.

@@ +52,5 @@
> +                errors.fatal('Malformed manifest')
> +            else:
> +                self._component = str[1:-1]
> +        elif str[0] == '-':
> +            str = str[1:].strip()

You already performed a .strip() above. I guess this is to support "- arg" as well as "-arg". If only 1 is used in the wild, I'd nuke support for the other one. I like strict formats :)

@@ +160,5 @@
> +            if b.endswith('/' + path) or b == path:
> +                base = os.path.normpath(b[:-len(path)])
> +        for e in parse_manifest(base, path, file.open()):
> +            if e and not isinstance(e, Manifest) and \
> +                    not isinstance(e, ManifestInterfaces):

The 2nd argument to isinstance can be a tuple of types to match against. I /think/ you could put None in this tuple and remove the "if e and".

@@ +161,5 @@
> +                base = os.path.normpath(b[:-len(path)])
> +        for e in parse_manifest(base, path, file.open()):
> +            if e and not isinstance(e, Manifest) and \
> +                    not isinstance(e, ManifestInterfaces):
> +                # e.mode(e.base) just returns a clone of the entry

"e.move". Add period to all comments.

@@ +194,5 @@
> +    '''
> +    Parser sink for "simple" package manifests. Simple package manifests use
> +    the format described in the PackageManifestParser documentation, but don't
> +    support file removals, and require manifests, interfaces and chrome data to
> +    be explicitely listed.

English.

@@ +246,5 @@
> +        paths = [mozpack.path.dirname(m) for m in self._manifests]
> +        path = mozpack.path.dirname(mozpack.path.commonprefix(paths))
> +        for p, f in self._finder.find(mozpack.path.join(path,
> +                                      '**', 'chrome.manifest')):
> +            if not p in self._manifests:

if p not in self._manifests

::: python/mozbuild/mozpack/packager/formats.py
@@ +244,5 @@
> +            if mozpack.path.relpath(path, base) in copier:
> +                return True
> +        return False
> +
> +    def isResource(self, path):

is_resource

@@ +252,5 @@
> +        '''
> +        base = self._get_base(path)
> +        path = mozpack.path.split(mozpack.path.relpath(path, base))
> +        if path[0] == 'chrome':
> +            return len(path) == 1 or path[1] != 'icons'

Perhaps you should assert on len(path)?

@@ +267,5 @@
> +            'modules',
> +            'greprefs.js',
> +            'hyphenation',
> +            'update.locale',
> +        ] or path[0] in STARTUP_CACHE_PATHS

Since you already have part of the set in a module-level variable, I'd just move the rest.

::: python/mozbuild/mozpack/packager/unpack.py
@@ +27,5 @@
> +from urlparse import urlparse
> +try:
> +    from collections import OrderedDict
> +except ImportError:
> +    from simplejson import OrderedDict

We have Python 2.7 now!

@@ +74,5 @@
> +                                self.files.add(path, m)
> +                            continue
> +                        else:
> +                            self.files.add(path, DeflatedFile(j))
> +                    continue

Do you mean to fall through to the if below if chrome.manifest isn't in jar?

@@ +113,5 @@
> +                if p in self.files:
> +                    continue
> +                f = m
> +            if not p in jars:
> +                self.files.add(p, f)

There's a few big blocks in this loop (FileFinder.find). Would you mind splitting them into methods for readability?

@@ +123,5 @@
> +        '''
> +        jar = JarReader(file.open())
> +        if jar.isOptimized:
> +            self.optimizedjars = True
> +        if jar.lastPreloaded:

I assume I already made a comment about not using camelCase variables.

@@ +151,5 @@
> +        jar, relpath = urlparse(relpath).path.split('!', 1)
> +        return mozpack.path.join(base, jar), \
> +            entry.rebase(mozpack.path.join(base, 'jar:%s!' % jar)) \
> +            .move(mozpack.path.join(base, mozpack.path.splitext(jar)[0])) \
> +            .rebase(base)

Please save that long expression to a variable then do the return in 1 line.

::: python/mozbuild/mozpack/unify.py
@@ +21,5 @@
> +import subprocess
> +try:
> +    from collections import OrderedDict
> +except ImportError:
> +    from simplejson import OrderedDict

Python 2.7!

@@ +59,5 @@
> +            for p in [self.path1, self.path2]:
> +                fd, f = mkstemp()
> +                os.close(fd)
> +                shutil.copyfile(p, f)
> +                shutil.copystat(p, f)

I swore there was a single function in the stdlib that did this. Meh.

@@ +62,5 @@
> +                shutil.copyfile(p, f)
> +                shutil.copystat(p, f)
> +                strip(f)
> +                tmpfiles.append(f)
> +            subprocess.call(['lipo', '-create'] + tmpfiles + ['-output', dest])

The back of my mind is saying "the build system should be able to run without $PATH [and that 'lipo' should come from configure]." We're so far from this today that I'm not too bothered by it.

@@ +129,5 @@
> +            for line in unified_diff(file1.open().readlines(),
> +                                     file2.open().readlines(),
> +                                     os.path.join(self._base1.base, path),
> +                                     os.path.join(self._base2.base, path)):
> +                sys.stderr.write(line)

I'm not a huge fan of writing to std{out,err} unless from main() or unless one was passed into the function. If nothing else, paramaterizing the fh makes it easier to unit test. I'm not sure if I care /that/ much.

@@ +169,5 @@
> +            return GeneratedFile(''.join(
> +                content1[:content1.index('</body>\n')] +
> +                ['<hr> </hr>\n'] +
> +                content2[content2.index('<h1>about:buildconfig</h1>\n') + 1:]
> +            ))

This warrants a comment.

::: toolkit/mozapps/installer/find-dupes.py
@@ +20,5 @@
> +    md5s = OrderedDict()
> +    for p, f in UnpackFinder(source):
> +        content = f.open().read()
> +        m = hashlib.md5(content).digest()
> +        if not m in md5s:

if m not in md5s:

@@ +22,5 @@
> +        content = f.open().read()
> +        m = hashlib.md5(content).digest()
> +        if not m in md5s:
> +            md5s[m] = (len(content), [])
> +        md5s[m][1].append(p)

If you don't think it is too unreadable:

  md5s.setdefault(m, (len(content), []))[1].append(p)
Comment on attachment 695969 [details] [diff] [review]
Import new packager code

Review of attachment 695969 [details] [diff] [review]:
-----------------------------------------------------------------

I think I'm done - huzzah!

Again, I want to iterate that there is a *ton* of code in this patch that is very difficult to test for correctness without having a full understanding of how the packager works. Given the importance of the packager, I'm assuming differences between the old and new versions will be blatantly obvious.

::: toolkit/mozapps/installer/packager.py
@@ +97,5 @@
> +    def copy(self, dest):
> +        assert isinstance(dest, basestring)
> +        if os.path.exists(dest) and \
> +                int(os.path.getmtime(self.path) * 1000) <= \
> +                int(os.path.getmtime(dest) * 1000):

The * 1000 isn't necessary. Doesn't os.path.getmtime always return an int?

@@ +145,5 @@
> +            path = f.filename[f.filename.index(resource) + len(resource):]
> +            if path in formatter:
> +                formatter.add(f.filename, GeneratedFile(f.read()))
> +    jar.close()
> +    os.remove(cache)

You probably want this in a finally block. Alternatively, you could use the NamedTemporaryFile clone in mozbase. I think there was another use of mkstemp I missed this for...

@@ +171,5 @@
> +    '''
> +    if '=' in define:
> +        name, value = define.split('=', 1)
> +        if re.match('^\d+$', value):
> +            value = int(value)

The Pythonic way is:

try:
    value = int(value)
except ValueError:
    pass

@@ +183,5 @@
> +    '''
> +    def __init__(self, formatter, has_manifest):
> +        assert 'NO_PKG_FILES' in os.environ
> +        self._formatter = formatter
> +        self._files = os.environ['NO_PKG_FILES'].split()

Ugh. I generally don't like classes looking at ENV as these are effectively global variables and we can generally agree that globals are a bad practice. Alas, this file has a main(), so I won't block on this.

@@ +212,5 @@
> +
> +
> +def main():
> +    parser = OptionParser(usage='Usage: %prog [options] [manifest] ' +
> +                          'source destination')

We have 2.7, so you could use argparse if you were so inclined. Should be able to just drop it in!

@@ +261,5 @@
> +        formatter = JarFormatter(copier, optimize=options.optimizejars)
> +    elif options.format == 'omni':
> +        formatter = OmniJarFormatter(copier,
> +                                     buildconfig.substs['OMNIJAR_NAME'],
> +                                     optimize=options.optimizejars)

I'm pretty sure there's a better way to do this type of mapping with argparse. I'm not sure it is worth it though.
Attachment #695969 - Flags: review?(gps) → feedback+
Comment on attachment 695772 [details] [diff] [review]
Unit tests for the new packager code

Review of attachment 695772 [details] [diff] [review]:
-----------------------------------------------------------------

I love test coverage! This looks pretty good except for some non-robust use of try..except instead of assertRaises.

Out of curiosity, have you ran code coverage for this? I'm curious what areas are lacking code coverage. I'd like to hit those a little harder with my review, if possible. If not, I'm sure things will come out in bugs and fixes will entail more tests.

::: Makefile.in
@@ +53,5 @@
>  endif
>  tier_base_dirs += \
>    mozglue \
>    memory/mozalloc \
> +  python \

This or the equivalent is already checked in.

::: python/Makefile.in
@@ +1,3 @@
> +# This Source Code Form is subject to the terms of the Mozilla Public
> +# License, v. 2.0. If a copy of the MPL was not distributed with this
> +# file, You can obtain one at http://mozilla.org/MPL/2.0/.

We already have this file checked in.

::: python/mozbuild/mozpack/test/test_chrome_flags.py
@@ +40,5 @@
> +        flag.add_definition('flag')
> +        self.assertEqual(str(flag), 'flag')
> +        self.assertFalse(flag.matches('false'))
> +        self.assertTrue(flag.matches('true'))
> +        self.assertFalse(flag.matches(False))

Would you mind adding some empty lines here to separate logical tests?

::: python/mozbuild/mozpack/test/test_errors.py
@@ +47,5 @@
> +        try:
> +            with errors.accumulate():
> +                errors.error('1')
> +        except AccumulatedErrors:
> +            self.assertEquals(self.getOutput(), ['Error: 1'])

For these try..except tests, you should use self.assertRaises as a context manager. The exception will be captured by the context manager and you will guarantee the exception is thrown. As it stands, if you don't throw, this will still pass. This pattern is prevalent throughout this file.

@@ +52,5 @@
> +
> +    def test_error_loop(self):
> +        try:
> +            with errors.accumulate():
> +                for i in xrange(3):

range for py3k compat. xrange is only useful if there are lots of items.

::: python/mozbuild/mozpack/test/test_mozjar.py
@@ +8,5 @@
> +    JarWriter,
> +    Deflater,
> +    OrderedDict,
> +)
> +from test_files import MockDest

Please use absolute import.

::: python/mozbuild/mozpack/test/test_packager.py
@@ +35,5 @@
> +        '; comment',
> +        '#ifdef baz',
> +        '[baz]',
> +        'baz@SUFFIX@',
> +        '#endif']])

I'd just do a triple quoted string at module scope and .replace(os.linesep, '\n') (if necessary)

::: python/mozbuild/mozpack/test/test_unify.py
@@ +61,5 @@
> +                            '<body>',
> +                            '<h1>about:buildconfig</h1>',
> +                            '<div>foo</div>',
> +                            '</body>',
> +                            '</html>',

This HTML mucking seems really fragile. Whatevs.
Attachment #695772 - Flags: review?(gps) → feedback+
(In reply to Gregory Szorc [:gps] from comment #58)
> Comment on attachment 695772 [details] [diff] [review]
> Unit tests for the new packager code
> 
> Review of attachment 695772 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I love test coverage! This looks pretty good except for some non-robust use
> of try..except instead of assertRaises.

Guess why. I was targetting python 2.6 when I wrote this code. And assertRaises as a context manager is python 2.7.

> Out of curiosity, have you ran code coverage for this? I'm curious what
> areas are lacking code coverage. I'd like to hit those a little harder with
> my review, if possible. If not, I'm sure things will come out in bugs and
> fixes will entail more tests.

I haven't run code coverage, but I could. I know a bunch of error conditions are not tested.

> 
> ::: Makefile.in
> @@ +53,5 @@
> >  endif
> >  tier_base_dirs += \
> >    mozglue \
> >    memory/mozalloc \
> > +  python \
> 
> This or the equivalent is already checked in.

This patch predates that landing ;)
(In reply to Gregory Szorc [:gps] from comment #56)
> @@ +194,5 @@
> > +    '''
> > +    Parser sink for "simple" package manifests. Simple package manifests use
> > +    the format described in the PackageManifestParser documentation, but don't
> > +    support file removals, and require manifests, interfaces and chrome data to
> > +    be explicitely listed.
> 
> English.

?

> @@ +74,5 @@
> > +                                self.files.add(path, m)
> > +                            continue
> > +                        else:
> > +                            self.files.add(path, DeflatedFile(j))
> > +                    continue
> 
> Do you mean to fall through to the if below if chrome.manifest isn't in jar?

Yes, if the jar isn't an omnijar, it needs to be added verbatim, which the fall through does.

> If you don't think it is too unreadable:
> 
>   md5s.setdefault(m, (len(content), []))[1].append(p)

Eww
(In reply to Gregory Szorc [:gps] from comment #58)
> I love test coverage! This looks pretty good except for some non-robust use
> of try..except instead of assertRaises.
> 
> Out of curiosity, have you ran code coverage for this? I'm curious what
> areas are lacking code coverage. I'd like to hit those a little harder with
> my review, if possible. If not, I'm sure things will come out in bugs and
> fixes will entail more tests.

BTW, I'm fairly confident in the current tests. I've done pretty massive refactorings with their help and broke nothing in the packager as a result.
(In reply to Gregory Szorc [:gps] from comment #57)
> Comment on attachment 695969 [details] [diff] [review]
> Import new packager code
> 
> Review of attachment 695969 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I think I'm done - huzzah!
> 
> Again, I want to iterate that there is a *ton* of code in this patch that is
> very difficult to test for correctness without having a full understanding
> of how the packager works. Given the importance of the packager, I'm
> assuming differences between the old and new versions will be blatantly
> obvious.
> 
> ::: toolkit/mozapps/installer/packager.py
> @@ +97,5 @@
> > +    def copy(self, dest):
> > +        assert isinstance(dest, basestring)
> > +        if os.path.exists(dest) and \
> > +                int(os.path.getmtime(self.path) * 1000) <= \
> > +                int(os.path.getmtime(dest) * 1000):
> 
> The * 1000 isn't necessary. Doesn't os.path.getmtime always return an int?

Unfortunately, no. It returns a value in seconds. And IIRC, it can have a precision to the microsecond, but shutil.copystat only copies to the millisecond.

> @@ +145,5 @@
> > +            path = f.filename[f.filename.index(resource) + len(resource):]
> > +            if path in formatter:
> > +                formatter.add(f.filename, GeneratedFile(f.read()))
> > +    jar.close()
> > +    os.remove(cache)
> 
> You probably want this in a finally block. Alternatively, you could use the
> NamedTemporaryFile clone in mozbase. I think there was another use of
> mkstemp I missed this for...
> 
> @@ +171,5 @@
> > +    '''
> > +    if '=' in define:
> > +        name, value = define.split('=', 1)
> > +        if re.match('^\d+$', value):
> > +            value = int(value)
> 
> The Pythonic way is:
> 
> try:
>     value = int(value)
> except ValueError:
>     pass
> 
> @@ +183,5 @@
> > +    '''
> > +    def __init__(self, formatter, has_manifest):
> > +        assert 'NO_PKG_FILES' in os.environ
> > +        self._formatter = formatter
> > +        self._files = os.environ['NO_PKG_FILES'].split()
> 
> Ugh. I generally don't like classes looking at ENV as these are effectively
> global variables and we can generally agree that globals are a bad practice.
> Alas, this file has a main(), so I won't block on this.
> 
> @@ +212,5 @@
> > +
> > +
> > +def main():
> > +    parser = OptionParser(usage='Usage: %prog [options] [manifest] ' +
> > +                          'source destination')
> 
> We have 2.7, so you could use argparse if you were so inclined. Should be
> able to just drop it in!
> 
> @@ +261,5 @@
> > +        formatter = JarFormatter(copier, optimize=options.optimizejars)
> > +    elif options.format == 'omni':
> > +        formatter = OmniJarFormatter(copier,
> > +                                     buildconfig.substs['OMNIJAR_NAME'],
> > +                                     optimize=options.optimizejars)
> 
> I'm pretty sure there's a better way to do this type of mapping with
> argparse. I'm not sure it is worth it though.
(In reply to Gregory Szorc [:gps] from comment #57)
> @@ +183,5 @@
> > +    '''
> > +    def __init__(self, formatter, has_manifest):
> > +        assert 'NO_PKG_FILES' in os.environ
> > +        self._formatter = formatter
> > +        self._files = os.environ['NO_PKG_FILES'].split()
> 
> Ugh. I generally don't like classes looking at ENV as these are effectively
> global variables and we can generally agree that globals are a bad practice.
> Alas, this file has a main(), so I won't block on this.

FWIW, this is set to be removed later on, when other changes to the packager will have been made.
FYI, test coverage gives:
Name                                                 Stmts   Miss  Cover
------------------------------------------------------------------------
python/mozbuild/mozpack/__init__                         0      0   100%
python/mozbuild/mozpack/chrome/__init__                  0      0   100%
python/mozbuild/mozpack/chrome/flags                   114      5    96%
python/mozbuild/mozpack/chrome/manifest                152     22    86%
python/mozbuild/mozpack/copier                         122     12    90%
python/mozbuild/mozpack/errors                          55      0   100%
python/mozbuild/mozpack/executables                     43     20    53%
python/mozbuild/mozpack/files                          209     19    91%
python/mozbuild/mozpack/mozjar                         308     18    94%
python/mozbuild/mozpack/packager/__init__              122      6    95%
python/mozbuild/mozpack/packager/formats               138      9    93%
python/mozbuild/mozpack/path                            65      1    98%
python/mozbuild/mozpack/unify                           89     30    66%
------------------------------------------------------------------------
TOTAL                                                 1417    142    90%
(In reply to Gregory Szorc [:gps] from comment #53)
> > +        if compress is None:
> > +            compress = self._compress
> 
> compress = self._compress if compress is None

SyntaxError: invalid syntax
(In reply to Gregory Szorc [:gps] from comment #56)
> > +        if not m in md5s:
> 
> if m not in md5s:

While this form sounds more english, i've always hated it because it makes things different with negative tests ("if not something").
(In reply to Gregory Szorc [:gps] from comment #56)
> @@ +252,5 @@
> > +        '''
> > +        base = self._get_base(path)
> > +        path = mozpack.path.split(mozpack.path.relpath(path, base))
> > +        if path[0] == 'chrome':
> > +            return len(path) == 1 or path[1] != 'icons'
> 
> Perhaps you should assert on len(path)?

Huh? That would reject perfectly valid paths.
Attachment #693882 - Flags: review?(ted) → review+
Comment on attachment 695772 [details] [diff] [review]
Unit tests for the new packager code

Review of attachment 695772 [details] [diff] [review]:
-----------------------------------------------------------------

::: python/mozbuild/mozpack/test/test_errors.py
@@ +12,5 @@
> +import sys
> +from cStringIO import StringIO
> +
> +
> +class TestErrors(unittest.TestCase):

You probably don't actually want this to inherit from TestCase, since it's not really a testcase. You should just make it a mixin and have the other tests inherit from this and TestCase.

::: python/mozbuild/mozpack/test/test_files.py
@@ +401,5 @@
> +    b'\x80\x00\x00\x70\x57\xF2\xAA\xFD\xC2\x45\x59\xAB\xDE\x08\xD9\x39' +
> +    b'\xF7\xE8\x0D\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x09\x00' +
> +    b'\x66\x6F\x6F\x00\x66\x6F\x6F\x00\x00\x00\x00\x01\x00\x00\x00\x00' +
> +    b'\x05\x00\x80\x06\x00\x00\x00'
> +)

Can you add a 1-line comment explaining what's in these?
Attachment #695772 - Flags: review?(ted) → review+
Comment on attachment 695972 [details] [diff] [review]
Remove now unused bits of the packaging scripts

Review of attachment 695972 [details] [diff] [review]:
-----------------------------------------------------------------

This might be the happiest patch I've ever reviewed.

::: build/Makefile.in
@@ -256,5 @@
> -          echo "TEST-UNEXPECTED-FAIL | build/ | unify failed to unify files with differing line ordering!"; \
> -          false; \
> -        else \
> -          echo "TEST-PASS | build/ | unify unified files with differing line ordering!"; \
> -        fi

\o/

::: toolkit/mozapps/installer/Packager.pm
@@ -1,4 @@
> -#!perl -w
> -# This Source Code Form is subject to the terms of the Mozilla Public
> -# License, v. 2.0. If a copy of the MPL was not distributed with this
> -# file, You can obtain one at http://mozilla.org/MPL/2.0/.

\o/

::: toolkit/mozapps/installer/xptlink.pl
@@ -1,5 @@
> -#!/usr/bin/perl -w
> -# 
> -# This Source Code Form is subject to the terms of the Mozilla Public
> -# License, v. 2.0. If a copy of the MPL was not distributed with this
> -# file, You can obtain one at http://mozilla.org/MPL/2.0/.

\o/
Attachment #695972 - Flags: review?(ted) → review+
(In reply to Mike Hommey [:glandium] from comment #51)
> > It's kinda weird for __contains__ -> to not imply __getitem__ will be
> > successful. Python sometimes magically implements some magic methods in
> > terms of what you have implemented (using what it can when available). I
> > worry that different semantics for __contains__ and __getitem__ will lead to
> > weirdness.
> 
> What would you expect then? return [self.match(path)] ? or something that
> returns a list when there is more than one item, but just the item when
> there is only one ? or something that returns a list when the path matches a
> directory or a glob match (whatever number of results there are) and just
> the item when path is self._files ?

I would not use magic methods if the behavior is different depending on the contents. I have nothing against has_* or get_* functions in this case.
 
> > @@ +218,5 @@
> > > +            dest = Dest(dest)
> > > +        else:
> > > +            assert isinstance(dest, Dest)
> > > +
> > > +        from mozpack.mozjar import JarWriter, JarReader
> > 
> > module-level imports, please.
> 
> The reason to not import at module level here is to avoid importing mozjar
> or xpt when not dealing with jars and xpts, which the code may be used for.

Fine by me.
 
> > @@ +222,5 @@
> > > +        from mozpack.mozjar import JarWriter, JarReader
> > > +        try:
> > > +            old_jar = JarReader(dest)
> > > +        except:
> > > +            old_jar = []
> > 
> > bareword except is evil. Please trap only what you need (I assume some kind
> > of IOError for missing file?)
> 
> There are *many* ways JarReader() can fail, all of which can be handled the
> same way.

The problem with bare "except" is it also traps things like KeyboardInterrupt and SystemExit, effectively swallowing C-c, etc. See http://docs.python.org/2/whatsnew/2.5.html#pep-352-exceptions-as-new-style-classes for the proper way to do this. There are numerous StackOverflow questions on the practice.

> > > +            can_skip_content_check = True
> > > +        elif hasattr(self, 'path') and hasattr(dest, 'path'):
> > > +            if int(os.path.getmtime(self.path) * 1000) \
> > > +                    <= int(os.path.getmtime(dest.path) * 1000):
> > > +                return False
> > 
> > This seems dangerous to me. Convince me we shouldn't perform stronger
> > size/content based validation here.
> 
> That's not any more dangerous than make rules. Size/content based validation
> also would make us recopy and restrip all the libraries every time (and it's
> not exactly fast to do so)

If it has worked for us before, then I guess it's good enough!

> > @@ +178,5 @@
> > > +    (using the add() and remove() member functions), and links them at copy()
> > > +    time.
> > > +    '''
> > > +    def __init__(self):
> > > +        self._files = set()
> > 
> > Does using an unordered data structure impact us at all? Should you perhaps
> > throw a sorted(self._files) in somewhere to ensure consistent ordering?
> 
> order doesn't matter much.

OK. But, when someone runs into a problem that requires diffing output and the order is all mixed up because of a lack of sort, don't tell me I didn't warn you :) I just tend to prefer consistent operation unless perf requires otherwise.


(In reply to Mike Hommey [:glandium] from comment #54)

> > @@ +132,5 @@
> > > +        '''
> > > +        Serialize the data structure according to the data structure definition
> > > +        from self.STRUCT.
> > > +        '''
> > > +        serialized = struct.pack('<I', self.signature)
> > 
> > Using strings here seems like an obvious perf loss. I would either use
> > [c]StringIO or would append strings to a list then join together on empty
> > string at the end.
> 
> I don't think it matters.

Fair enough. It's easy enough to change if it shows up as a hot spot. Out of curiosity, how long does all this code take to run during builds? How does it compare with the old Perl packager?

> > > +        # January 1st, 2010
> > > +        entry['lastmod_date'] = ((2010 - 1980) << 9) | (1 << 5) | 1
> > 
> > Huh?
> 
> Inherited from bug 592369 ; I think it can go away now, but i didn't want to
> risk it.

Please link that bug in the code.
(In reply to Gregory Szorc [:gps] from comment #70)
> Fair enough. It's easy enough to change if it shows up as a hot spot. Out of
> curiosity, how long does all this code take to run during builds? How does
> it compare with the old Perl packager?

On my Mac, it's as fast as the perl packager when there is no preexisting dist/firefox, and about twice as fast when there is one. It could be even faster on incremental builds if startupcache wasn't compiled each time.
Attached patch Import new packager code (obsolete) — Splinter Review
Attachment #703357 - Flags: review?(gps)
Attachment #695969 - Attachment is obsolete: true
Attachment #695969 - Flags: review?(ted)
Forgot to fold a few more fixes
Attachment #703361 - Flags: review?(gps)
Attachment #703357 - Attachment is obsolete: true
Attachment #703357 - Flags: review?(gps)
Attachment #703366 - Flags: review?(gps)
Attachment #695772 - Attachment is obsolete: true
(In reply to Mike Hommey [:glandium] from comment #74)
> Created attachment 703366 [details] [diff] [review]
> Unit tests for the new packager code

Interdiff works for this one.

(In reply to Mike Hommey [:glandium] from comment #73)
> Created attachment 703361 [details] [diff] [review]
> Import new packager code

but doesn't for this one, so here is a diff between the two diffs, which can almost be read as an interdiff.
Comment on attachment 695970 [details] [diff] [review]
Use new packager code for packaging (packager.mk)

Review of attachment 695970 [details] [diff] [review]:
-----------------------------------------------------------------

Not much to complain about here, just a few questions.

::: b2g/installer/Makefile.in
@@ +50,5 @@
>  ifdef MOZ_PKG_MANIFEST_P
>  MOZ_PKG_MANIFEST = package-manifest
>  endif
>  
> +MOZ_PACKAGER_MINIFY=1

Why does everything but desktop Firefox want this?

::: build/macosx/universal/flight.mk
@@ +27,3 @@
>  	mkdir -p $(DIST_UNI)/$(MOZ_PKG_APPNAME)
>  	rm -f $(DIST_ARCH_2)/universal
>  	ln -s $(call core_abspath,$(DIST_UNI)) $(DIST_ARCH_2)/universal

Do you have a plan to handle this with the new packager code at some point? (presumably we could just use the new packager code in the package-tests target).

::: configure.in
@@ +2418,5 @@
>      RCFLAGS='-n'
>      MOZ_USER_DIR="Mozilla"
>      ZIP="$ZIP -X"
> +    STRIP=lxlite
> +    STRIP_FLAGS="/yua /ydd /yxd /ynl /anp /b- /cs+ /d /i- /ml1 /mr2 /mf2 /r+ /u+ /x- /zs:0 /zx /zd"

Is anyone still maintaining the OS/2 port?

::: toolkit/mozapps/installer/packager.mk
@@ +553,5 @@
>  ifeq (gonk,$(MOZ_WIDGET_TOOLKIT))
>  ELF_HACK_FLAGS = --fill
>  endif
>  
> +ifndef MOZ_PKG_MANIFEST

Does anything define MOZ_PKG_MANIFEST directly? Maybe we can just get rid of this.

@@ +575,5 @@
> +		$(if $(JARLOG_DIR),--jarlogs $(JARLOG_DIR_AB_CD)) \
> +		$(if $(OPTIMIZEJARS),--optimizejars) \
> +		$(addprefix --unify ,$(UNIFY_DIST)) \
> +		$(MOZ_PKG_MANIFEST) $(DIST) $(DIST)/$(STAGEPATH)$(MOZ_PKG_DIR)
> +	$(PYTHON) $(topsrcdir)/toolkit/mozapps/installer/find-dupes.py $(DIST)/$(MOZ_PKG_DIR)

Can this find-dupes just be rolled into packager.py's work instead of being a separate invocation?

::: toolkit/mozapps/installer/precompile_cache.js
@@ +79,5 @@
>    return Services.io.newURI(rph.resolveURI(uri), null, null);
>  }
>  
> +function precompile_startupcache(uri) {
> +  load_modules_under(uri, resolveResource(uri));

I don't understand this change.
Attachment #695970 - Flags: review?(ted) → review+
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #76)
> Comment on attachment 695970 [details] [diff] [review]
> Use new packager code for packaging (packager.mk)
> 
> Review of attachment 695970 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Not much to complain about here, just a few questions.
> 
> ::: b2g/installer/Makefile.in
> @@ +50,5 @@
> >  ifdef MOZ_PKG_MANIFEST_P
> >  MOZ_PKG_MANIFEST = package-manifest
> >  endif
> >  
> > +MOZ_PACKAGER_MINIFY=1
> 
> Why does everything but desktop Firefox want this?

Good question, I'm merely making things on par with what we have currently.

> ::: build/macosx/universal/flight.mk
> @@ +27,3 @@
> >  	mkdir -p $(DIST_UNI)/$(MOZ_PKG_APPNAME)
> >  	rm -f $(DIST_ARCH_2)/universal
> >  	ln -s $(call core_abspath,$(DIST_UNI)) $(DIST_ARCH_2)/universal
> 
> Do you have a plan to handle this with the new packager code at some point?
> (presumably we could just use the new packager code in the package-tests
> target).

Yes, I plan to make that use the new packager code, but that needed more work so I skipped it for now.

> ::: configure.in
> @@ +2418,5 @@
> >      RCFLAGS='-n'
> >      MOZ_USER_DIR="Mozilla"
> >      ZIP="$ZIP -X"
> > +    STRIP=lxlite
> > +    STRIP_FLAGS="/yua /ydd /yxd /ynl /anp /b- /cs+ /d /i- /ml1 /mr2 /mf2 /r+ /u+ /x- /zs:0 /zx /zd"
> 
> Is anyone still maintaining the OS/2 port?

IIRC some people from IBM do. I've seen patches for OS/2 not so long ago.
 
> ::: toolkit/mozapps/installer/packager.mk
> @@ +553,5 @@
> >  ifeq (gonk,$(MOZ_WIDGET_TOOLKIT))
> >  ELF_HACK_FLAGS = --fill
> >  endif
> >  
> > +ifndef MOZ_PKG_MANIFEST
> 
> Does anything define MOZ_PKG_MANIFEST directly? Maybe we can just get rid of
> this.

It's a transition thing. In the end MOZ_PKG_MANIFEST is what is going to be defined, and MOZ_PKG_MANIFEST_P will disappear. Mainly, the goal is to attempt a smooth transition for c-c and other third parties (mainly c-c, really)
 
> @@ +575,5 @@
> > +		$(if $(JARLOG_DIR),--jarlogs $(JARLOG_DIR_AB_CD)) \
> > +		$(if $(OPTIMIZEJARS),--optimizejars) \
> > +		$(addprefix --unify ,$(UNIFY_DIST)) \
> > +		$(MOZ_PKG_MANIFEST) $(DIST) $(DIST)/$(STAGEPATH)$(MOZ_PKG_DIR)
> > +	$(PYTHON) $(topsrcdir)/toolkit/mozapps/installer/find-dupes.py $(DIST)/$(MOZ_PKG_DIR)
> 
> Can this find-dupes just be rolled into packager.py's work instead of being
> a separate invocation?

I think it's just simpler if it's not, avoids adding more options to the packager script, and allows not to worry about other uses of the packager that would not need find-dupes.

> ::: toolkit/mozapps/installer/precompile_cache.js
> @@ +79,5 @@
> >    return Services.io.newURI(rph.resolveURI(uri), null, null);
> >  }
> >  
> > +function precompile_startupcache(uri) {
> > +  load_modules_under(uri, resolveResource(uri));
> 
> I don't understand this change.

With the old packager, there was one invocation of the startupcache compilation handling resource://app and resource://gre. With the new packager, there is one invocation per app (currently, only webapprt, but in metro world, webapprt, metro and browser), and another one for the gre. Doing everything in one invocation doesn't work due to the multiplicity of applications.
Comment on attachment 703361 [details] [diff] [review]
Import new packager code

Review of attachment 703361 [details] [diff] [review]:
-----------------------------------------------------------------

I looked at the interdiff and you seemed to touch on most of the points I made in previous reviews. Did you hit all of them? I'm not sure. I've said it plenty of times already in this bug: this is a lot of code. I expect I missed things during review. Potentially even glaring things. But, the test coverage is thorough and I have trust in you. The nature of this code is such that errors and changes in behavior should be highly visible.

My biggest complaint is I suspect that porting to Python 3 will be made more difficult because of the lack of b'' strings. As mentioned in the comments, getting full Python 3 compat would be difficult because of external module imports. Oh well. We'll cross this bridge when we get to it.

I do have one final request. I'd like to know what the plan is for verifying this patch. Will you be comparing packager output for old and new? How will this be done? Will there be independent review? What about a sign-off by QA, etc that new state and/or the delta is sane?

In conclusion, awesome work!
Attachment #703361 - Flags: review?(gps) → review+
Attachment #703366 - Flags: review?(gps) → review+
(In reply to Gregory Szorc [:gps] from comment #78)
> Comment on attachment 703361 [details] [diff] [review]
> Import new packager code
> 
> Review of attachment 703361 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I looked at the interdiff and you seemed to touch on most of the points I
> made in previous reviews. Did you hit all of them?

I /think/ I hit them all.

> I do have one final request. I'd like to know what the plan is for verifying
> this patch. Will you be comparing packager output for old and new?
> How will this be done? Will there be independent review? What about a sign-off by QA,
> etc that new state and/or the delta is sane?

I've done such comparisons in the first iteration. And an intermediate iteration of the code is in production on elm (has been for a while) and I'm going to land the final version there first.

I'm planning to push to try with everything but:
  - Use new packager code for packaging (packager.mk)
  - Remove now unused bits of the packaging scripts

and add code to packager.mk to trigger both packagers targetting different directories, then run unpack.py in each and compare. Except for maybe manifest ordering, they should be identical. I want to also do the same on the l10n-check run.

Once everything is in the clear, I want to land on m-c close to a nightly so that we get nightlies and, more importantly, l10n builds. I think the ideal position would be to land just after a nightly and retrigger one on the landing, then compare the nightlies and l10n builds. For good measure, a QA sign-off would be good, but if the builds are strictly identical, there should be no worry.

Then there's the issue of c-c, which needs bug 825872 landed at the same time (provided it's enough)

I'll also post on dev-platform a few notes, mostly for third parties.
This plan sounds good to me!
Looking at the coverage, it seemed to me the chrome.manifest coverage was suspiciously low, as I thought I had explicit tests for all of it, and when I prepared the elm landing, I figured I had, in fact, mistakenly removed them. So here are the missing tests.
Attachment #703816 - Flags: review?(gps)
Depends on: 832202
Depends on: 832220
This is required to make comm-central happier, without any changes (although bug 832202 is required on m-c side). I was able to package Thunderbird with this with only few warnings. I'd expect Seamonkey to break, though, because it uses MOZ_PKG_FATAL_WARNINGS=1, and removed-files.in triggers an error for components/components.manifest.
Attachment #703844 - Flags: review?(ted)
Thanks for the answers to my review questions, that all sounds good.
Attachment #703844 - Flags: review?(ted) → review+
Comment on attachment 703816 [details] [diff] [review]
Missing unit tests

Review of attachment 703816 [details] [diff] [review]:
-----------------------------------------------------------------

Yay tests!

::: python/mozbuild/mozpack/test/test_chrome_manifest.py
@@ +137,5 @@
> +                    list(parse_manifest(os.curdir, 'manifest'))
> +            out = self.get_output()
> +            # Expecting 2 errors
> +            self.assertEqual(len(out), 2)
> +            path = os.path.abspath(os.path.join(os.curdir, 'manifest'))

I /think/ this is equivalent to "os.path.abspath('manifest')" (abspath will use current directory if argument is relative).
Attachment #703816 - Flags: review?(gps) → review+
Attached file For reference, diff script (obsolete) —
This is the diff script I'm using to double check the differences between builds with the old vs. the new packager. So far, so good.
Attachment #704541 - Attachment mime type: text/x-python → text/plain
Attached patch Two additional fixes (obsolete) — Splinter Review
This unbreaks l10n-repacks after the changes from attachment 703369 [details] [diff] [review], and fixes --jarlogs.
Attachment #704623 - Flags: review?(gps)
Here's the interesting thing so far: while I've found a few issues in the new packager code (fixed in that last patch), I now also found issues with the old packager code:
- non omnijar builds don't have NSS libraries signed with shlibsign because SIGN_NSS is not defined because MOZ_CAN_RUN_PROGRAMS is only defined when MOZ_OMNIJAR is defined.
- in MOZ_CHROME_FILE_FORMAT=jar builds, optimizejars.py adds jar files from dist/bin that are *not* in package-manifest. That only affects debug builds, with reftest.jar, layoutdebug.jar and xslt-qa.jar. None of them are actually reachable from within Firefox, since there's no corresponding manifest ; they're just dead-weight.
Attachment #704541 - Attachment is obsolete: true
(In reply to Mike Hommey [:glandium] from comment #87)
> - in MOZ_CHROME_FILE_FORMAT=jar builds, optimizejars.py adds jar files from
> dist/bin that are *not* in package-manifest. That only affects debug builds

In fact, that also affects non-debug builds, for reftest.jar and xslt-qa.jar.
This adds another fix on top of the previous one: it generates the precomplete file during l10n-repacks, too (and unpack skips it, too).
Attachment #704835 - Flags: review?(gps)
Attachment #704623 - Attachment is obsolete: true
Attachment #704623 - Flags: review?(gps)
(In reply to Mike Hommey [:glandium] from comment #87)
> Here's the interesting thing so far: while I've found a few issues in the
> new packager code (fixed in that last patch), I now also found issues with
> the old packager code:

I'm trying very hard to be shocked that our rat's nest of existing packager code has issues. :)
QA Wanted: to verify and find any bugs prior to landing; see Mike's email request.
Keywords: qawanted
(In reply to Naoki Hirata :nhirata from comment #92)
> QA Wanted: to verify and find any bugs prior to landing; see Mike's email
> request.

That would be after landing.
Comment on attachment 704835 [details] [diff] [review]
Three additional fixes

Review of attachment 704835 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM.
Attachment #704835 - Flags: review?(gps) → review+
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla21
Depends on: 833848
Depends on: 833882
Depends on: 833946
Depends on: 834022
Depends on: 834176
Depends on: 835150
Depends on: 835164
Depends on: 835309
QA Contact: jbecerra
Depends on: 834337
Depends on: 844785
Depends on: 848472
Depends on: 855223
Depends on: 855227
Depends on: 871080
We've got problem with this change with separate Firefox and xulrunner build on Fedora.
I get:
Executing /usr/lib64/xulrunner-devel-21.0/bin/xpcshell -g /usr/lib64/xulrunner-devel-21.0 -a /home/jhorak/fedora/firefox/firefox-21.0/mozilla-release/dist/bin/browser -f ../../toolkit/mozapps/installer/precompile_cache.js -e precompile_startupcache("resource://app/");
failed to get nsJSRuntimeService!
Traceback (most recent call last):
  File "../../toolkit/mozapps/installer/packager.py", line 366, in <module>
    main()
  File "../../toolkit/mozapps/installer/packager.py", line 358, in main
    args.source, gre_path, base)
  File "../../toolkit/mozapps/installer/packager.py", line 140, in precompile_cache
    errors.fatal('Error while running startup cache precompilation')
  File "/home/jhorak/fedora/firefox/firefox-21.0/mozilla-release/python/mozbuild/mozpack/errors.py", line 101, in fatal
    self._handle(self.FATAL, msg)
  File "/home/jhorak/fedora/firefox/firefox-21.0/mozilla-release/python/mozbuild/mozpack/errors.py", line 96, in _handle
    raise ErrorMessage(msg)
mozpack.errors.ErrorMessage: Error: Error while running startup cache precompilation
make[1]: *** [stage-package] Error 1

It looks fine when -g /usr/lib64/xulrunner-devel-21.0 is replaced by -g /usr/lib64/xulrunner-devel-21.0/bin
File a new bug, please.
Depends on: 872439
No longer depends on: 848830
Depends on: 910660
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.