Closed Bug 890097 Opened 6 years ago Closed 6 years ago

Use manifests for managing simple file installation

Categories

(Firefox Build System :: General, defect)

defect
Not set

Tracking

(Not tracked)

RESOLVED FIXED
mozilla25

People

(Reporter: gps, Assigned: gps)

References

Details

Attachments

(3 files, 9 obsolete files)

8.42 KB, patch
glandium
: review+
Details | Diff | Splinter Review
9.22 KB, patch
glandium
: review+
Details | Diff | Splinter Review
14.55 KB, patch
glandium
: review+
Details | Diff | Splinter Review
+++ This bug was initially created as a clone of Bug #884587 +++

In bug 884587 we added support for file purging to blow away parts of dist/ at the top of the build. We realized in that bug it would only be a temporary solution and that a more robust solution is to just define file copies/installs in manifests and have purging be a side-effect of that.

In this bug, I'd like to establish the same base infrastructure for installs as we did for purging. The ultimate goal is the effective elimination of the recursive |make export| tier.

I think we'll employ the same general approach as purge manifests: the build backend writes out multiple files in a directory, each describing a specific domain of interest in the objdir. At various points during the build, the manifests are loaded and basic I/O is performed.

Install manifests will effectively be build rules for a limited known set of build actions (e.g. file copy) that have no dependencies. i.e. the build action can be performed any time, even on a fresh tree. I can think of the following actions that need to be expressed in a manifest:

* File copy
* File symlink (with fallback to copy where symlinks aren't supported)
* Preprocessing

In addition, we may want to express an action for "exists" to denote files we know will be created by "later" build steps but aren't currently provided. This will prevent manifest application via mozpack.copier.FileCopier from deleting untracked files in the destination directory. Alternatively, I suppose we could add an argument to FileCopier to not unlink untracked files.

In my ideal world, every file ending up in manifest-tracked files can be identified at moz.build read time. Unfortunately, due to the presence of 3rd party build systems not using moz.build, this will not be possible. We may need to eventually figure out a way to merge the output of 3rd party build systems into other manifests so we know about these files and don't delete them because they are untracked. But, I think this is purely a performance optimization and can be implemented as a followup.

For an initial implementation, I was thinking of implementing an "install manifest" for defining the basic actions I described above. I may possible omit preprocessed files for the initial version to keep it simple. The install manifests would run at the top of the build during the export target/tier (although they'd have to run first or they may accidentally remove files not yet tracked by the manifests). For every directory tracked by an install manifest, we can remove the corresponding purge manifest (because install manifests will also purge untracked files).

I would like to have this in place before bug 850380 is tackled so xpidl generation can be agnostic about the .idl install foo (because solving that problem is difficult with just make rules - read that bug).
This patch adds an "install manifest" to mozpack. The manifest format
currently supports copying, symlinking, and marking a file as "present."

I think preprocessing can be tackled in a follow-up.
Attachment #772829 - Flags: review?(mh+mozilla)
Assignee: nobody → gps
I added __ior__ to facilitate manifest merging.
Attachment #772864 - Flags: review?(mh+mozilla)
Attachment #772829 - Attachment is obsolete: true
Attachment #772829 - Flags: review?(mh+mozilla)
So as I'm implementing the build backend pieces, I'm running into a few hurdles:

1) Some existing EXPORTS in moz.build reference generated files. e.g. cairo-features.h, xpcom-config.h, js-config.h, necko-config.h, and TelemetryHistogramEnums.h. I may move some things from moz.build back into Makefile.in until the corresponding "rules" are moved to moz.build.

2) We'll either need to figure out a clever way of merging install manifests together or incrementally applying manifests to a destination directory (read: no delete mode). Maybe we combine them and do a purge at the end of the build? (Although this would require us to have every single file accounted for. I want to eventually attain this, but it's likely difficult in the short term.)
Depends on: 891626
Depends on: 891632
Depends on: 870401
Here is what I have so far. I'm creating an install manifest for
"dist/include" from EXPORTS. This seems to work on my OS X machine.
Although, it requires a number of prerequisite patches. I'm mildly
surprised at how fast the manifest is processed. It literally installs
1000+ files with the blink of an eye. But that's also on an SSD with
populated page cache.

There are a few shortcomings that need addressed:

* Tests are missing and old tests are broken.
* Dependency issue with js/src (we pull in the manifest before checking
  if its moz.build files are up-to-date).
* Need to figure out recursive story (what happens when you type |make|
  in a leaf directory).
* Need Windows testing.
* Possibly fix newly-introduced make rules to suck less (use
  INSTALL_TARGETS?)
Comment on attachment 772864 [details] [diff] [review]
Part 1: Manifest for managing file installs

Review of attachment 772864 [details] [diff] [review]:
-----------------------------------------------------------------

::: python/mozbuild/mozpack/files.py
@@ +242,5 @@
>          os.rename(temp_dest, dest)
>          return True
>  
>  
> +class PresentFile(BaseFile):

ExistingFile ?

::: python/mozbuild/mozpack/manifests.py
@@ +128,5 @@
> +    @staticmethod
> +    def from_path(path):
> +        """Construct an instance from a path on the filesystem."""
> +        with open(path, 'rb') as fh:
> +            return InstallManifest.from_fileobj(fh)

I keep seeing this pattern in your patches and I must say I don't like it (as subjective as that is). Why not simply def __init__(path, fileobj), and call InstallManifest(path=foo) or InstallManifest(fileobj=bar) ?

@@ +147,5 @@
> +
> +        for line in fh:
> +            line = line.rstrip()
> +
> +            fields = line.split('\x1f')

SEPARATOR = '\x1f', and use SEPARATOR here and further below.

@@ +154,5 @@
> +
> +            record_type = int(fields[0])
> +
> +            if record_type == InstallManifest.SYMLINK:
> +                assert len(fields) == 3

might as well do:
source, dest = fields[1:]
(which will throw if len(fields) != 3)

This has the advantage of making clearer what kind of values are expected.

@@ +197,5 @@
> +            self._add_entry(dest, other._dests[dest])
> +
> +        return self
> +
> +    def write_file(self, path):

Some remark as with from_*.

@@ +220,5 @@
> +        dest will be symlinked to source.
> +        """
> +        self._add_entry(dest, (self.SYMLINK, source))
> +
> +    def add_copy(self, dest, source):

the dest, source order is kind of counter intuitive (and is opposite to that of cp/ln).

@@ +240,5 @@
> +
> +    def get_copier(self):
> +        """Obtain a FileCopier instance built from this manifest."""
> +
> +        c = FileCopier()

I think it would be better if you passed the FileCopier to the function instead of creating it here. So that you can use anything that provides the FileRegistry interface (which could be a FileCopier, a FilePurger, or anything else we might add in the future).
Attachment #772864 - Flags: review?(mh+mozilla) → feedback+
Blocks: 892644
Blocks: 892747
I inserted a new patch in the series: the PurgeManifest API is now more
Pythonic per glandium's earlier review comments.
Attachment #778765 - Flags: review?(mh+mozilla)
I believe I hit all points of feedback.

Rebased on top of symlink landing. Even added some test coverage.
Attachment #778766 - Flags: review?(mh+mozilla)
Attachment #772864 - Attachment is obsolete: true
Pretty trivial patch to add more detailed tracking to FileCopier
actions. I can't remember exactly why I wrote this - I believe Ms2ger
was asking for it in IRC so he could get an idea of what files are
unaccounted for. Whatever the inspiration, we'll need this eventually
because we'll eventually want to track down the long tail of files
deleted during no-op builds (I'd argue there should be 0).
Attachment #778768 - Flags: review?(mh+mozilla)
Comment on attachment 778765 [details] [diff] [review]
Part 1: Use more Pythonic API for PurgeManifest

Review of attachment 778765 [details] [diff] [review]:
-----------------------------------------------------------------

::: python/mozbuild/mozpack/manifests.py
@@ +36,5 @@
> +        if fh:
> +            self._read_from_fh(fh)
> +
> +        if path:
> +            fh.close()

I think

if path:
    with open(path, 'rt') as fh:
        self._read_from_fh(fh)
elif fh:
    self._read_from_fh(fh)

reads better.
Make more readable per comment.
Attachment #779477 - Flags: review?(mh+mozilla)
Attachment #778765 - Attachment is obsolete: true
Attachment #778765 - Flags: review?(mh+mozilla)
Ditto.
Attachment #779480 - Flags: review?(mh+mozilla)
Attachment #778766 - Attachment is obsolete: true
Attachment #778766 - Flags: review?(mh+mozilla)
Comment on attachment 779477 [details] [diff] [review]
Part 1: Use more Pythonic API for PurgeManifest

Review of attachment 779477 [details] [diff] [review]:
-----------------------------------------------------------------

Only nits.

::: python/mozbuild/mozpack/manifests.py
@@ +25,5 @@
>  
>      Don't be confused by the name of this class: entries are files that are
>      *not* purged.
>      """
> +    def __init__(self, relpath='', path=None, fh=None):

I can be wrong, but I think python lib functions that take a file object use "fileobj". Might as well use the same idioms.

@@ +62,3 @@
>  
> +        if not path and not fh:
> +            raise Exception('Must specify either path or fh.')

Might as well use an assert for both tests (and I think assert takes an explanation string, even though we don't use that much).

@@ +63,5 @@
> +        if not path and not fh:
> +            raise Exception('Must specify either path or fh.')
> +
> +        if path:
> +            fh = open(path, 'wt')

Maybe a nice way around this pattern is an intermediate function, like:

def foo(path=None, fh=None, mode='r'):
    if fh:
        return fh
    return open(path, mode)

with foo(path=path, fh=fh, 'wt') as fh:
    #do stuff
Attachment #779477 - Flags: review?(mh+mozilla) → review+
Comment on attachment 779480 [details] [diff] [review]
Part 2: InstallManifest class for managing file installs

Review of attachment 779480 [details] [diff] [review]:
-----------------------------------------------------------------

Nits

::: python/mozbuild/mozpack/files.py
@@ +260,5 @@
> +    '''
> +    File class that represents that a file exists.
> +
> +    If asked to copy, nothing is performed because nothing is known about the
> +    file.

This sounds a bit misleading. As I take it, nothing is performed because an instance of this class indicates that the file is supposed to exist at the copy destination. Which, by the way, means it should probably error out (errors.fatal) when the file doesn't exist.

::: python/mozbuild/mozpack/manifests.py
@@ +101,5 @@
> +class UnreadableInstallManifest(Exception):
> +    """Raised when an invalid install manifest is parsed."""
> +
> +
> +FIELD_SEPARATOR = '\x1f'

Why not as a class variable?

@@ +135,5 @@
> +        If path is defined, the manifest will be populated with data from the
> +        file path.
> +
> +        If fh is defined, the manifest will be populated with data read
> +        from the specified file object.

Same remark as the other patch, it should probably by fileobj instead of fh.

@@ +137,5 @@
> +
> +        If fh is defined, the manifest will be populated with data read
> +        from the specified file object.
> +
> +        Both path and fileobj cannot be defined.

Look, it's already fileobj here :)

@@ +140,5 @@
> +
> +        Both path and fileobj cannot be defined.
> +        """
> +        if path and fh:
> +            raise Exception('Only 1 of path and fh may be defined.')

assert?

@@ +148,5 @@
> +        if path:
> +            with open(path, 'rb') as fh:
> +                self._load_from_fh(fh)
> +        elif fh:
> +            self._load_from_fh(fh)

See other patch comment.

@@ +195,5 @@
> +        return not self.__eq__(other)
> +
> +    def __ior__(self, other):
> +        if not isinstance(other, InstallManifest):
> +            raise ValueError('Can only |= with another instance of InstallManifest.')

assert?

Isn't __ior__ for | too (not for |= only)?

@@ +214,5 @@
> +        if path and fh:
> +            raise Exception('Only 1 of path or fh may be specified.')
> +
> +        if not path and not fh:
> +            raise Exception('You must specified 1 of path or fh.')

asserts?

@@ +217,5 @@
> +        if not path and not fh:
> +            raise Exception('You must specified 1 of path or fh.')
> +
> +        if path:
> +            fh = open(path, 'wb')

See other patch comment.
Attachment #779480 - Flags: review?(mh+mozilla) → review+
Attachment #778768 - Flags: review?(mh+mozilla) → review+
Heads up: I'm about to clone this bug to decouple the implementation and support of install manifests in the build system from actually hooking up EXPORTS to them. The reason is the EXPORTS patch is a little more involved than I'd like. The install manifests themselves unblock a lot of other work (including XPIDL refactoring) that I'd like to see moving in parallel.
Blocks: 896797
No longer blocks: 892644
Attachment #772994 - Attachment description: Part 3: Use install manifests for managing dist/include → Part 3: Use install manifests for managing dist/include
Attachment #772994 - Attachment is obsolete: true
Added updated_files and existing_files to FileCopyResult class. I figure
we'll want these eventually, so why not add them now?
Attachment #779516 - Flags: review?(mh+mozilla)
Attachment #778768 - Attachment is obsolete: true
Attachment #779516 - Flags: review?(mh+mozilla) → review+
(In reply to Mike Hommey [:glandium] from comment #12)
> ::: python/mozbuild/mozpack/manifests.py
> @@ +25,5 @@
> >  
> >      Don't be confused by the name of this class: entries are files that are
> >      *not* purged.
> >      """
> > +    def __init__(self, relpath='', path=None, fh=None):
> 
> I can be wrong, but I think python lib functions that take a file object use
> "fileobj". Might as well use the same idioms.

It's wildly inconsistent. I see fileobj and fp as common names. fh is only used in like 1 place, so I'll just use fileobj from now on.

> @@ +62,3 @@
> >  
> > +        if not path and not fh:
> > +            raise Exception('Must specify either path or fh.')
> 
> Might as well use an assert for both tests (and I think assert takes an
> explanation string, even though we don't use that much).

A problem with asserts is they will be ignored by python -O.

I tend to use asserts for internal implementation validation and exceptions for everything else.
I added an auto_fileobj() helper context manager. Just want to run this
by you to ensure we're on the same page.
Attachment #779527 - Flags: review?(mh+mozilla)
Attachment #779477 - Attachment is obsolete: true
Hit points of feedback. Renamed ExistingFile to OptionalExistingFile.
Naming is hard. But this leaves the door open for RequiredExistingFile
and the potential for an error to be raised if files of that class are
not present.
Attachment #779537 - Flags: review?(mh+mozilla)
Attachment #779480 - Attachment is obsolete: true
(In reply to Gregory Szorc [:gps] from comment #16)
> I tend to use asserts for internal implementation validation and exceptions
> for everything else.

Me too, and I see those as implementation validation.
Comment on attachment 779527 [details] [diff] [review]
Part 1: Use more Pythonic API for PurgeManifest

Review of attachment 779527 [details] [diff] [review]:
-----------------------------------------------------------------

Nits.

::: python/mozbuild/mozpack/manifests.py
@@ +10,5 @@
>  import mozpack.path as mozpath
>  
>  
> +@contextmanager
> +def auto_fileobj(path, fileobj, mode='r'):

Probably not the best place to put this helper. Would be better in some common tooling module.

@@ +15,5 @@
> +    if path and fileobj:
> +        raise Exception('Only 1 of path or fileobj may be defined.')
> +
> +    if not path and not fileobj:
> +        raise Exception('Must specified 1 of path or fileobj.')

cf. comment 19.

@@ +50,5 @@
>          self.relpath = relpath
>          self.entries = set()
>  
> +        if not path and not fileobj:
> +            return

Shouldn't that be an error?
Attachment #779527 - Flags: review?(mh+mozilla) → review+
Comment on attachment 779537 [details] [diff] [review]
Part 2: InstallManifest class for managing file installs

Review of attachment 779537 [details] [diff] [review]:
-----------------------------------------------------------------

Not convinced we need the optional variant, but meh.
Attachment #779537 - Flags: review?(mh+mozilla) → review+
(In reply to Mike Hommey [:glandium] from comment #20)
> Comment on attachment 779527 [details] [diff] [review]
> Part 1: Use more Pythonic API for PurgeManifest
> 
> Review of attachment 779527 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Nits.
> 
> ::: python/mozbuild/mozpack/manifests.py
> @@ +10,5 @@
> >  import mozpack.path as mozpath
> >  
> >  
> > +@contextmanager
> > +def auto_fileobj(path, fileobj, mode='r'):
> 
> Probably not the best place to put this helper. Would be better in some
> common tooling module.

I don't know where else to put it! I'll drop a comment in the code that it should be defined elsewhere and _prefix it to mark it internal. We can move it later.

> @@ +50,5 @@
> >          self.relpath = relpath
> >          self.entries = set()
> >  
> > +        if not path and not fileobj:
> > +            return
> 
> Shouldn't that be an error?

No. There are both undefined if creating an empty, in-memory manifest.
s/OptionalExistingFile/RequiredExistingFile/. And copy() now errors if
the destination path doesn't exist. Includes tests for this.
Attachment #779982 - Flags: review?(mh+mozilla)
Attachment #779537 - Attachment is obsolete: true
Comment on attachment 779982 [details] [diff] [review]
Part 2: InstallManifest class for managing file installs

Review of attachment 779982 [details] [diff] [review]:
-----------------------------------------------------------------

::: python/mozbuild/mozpack/files.py
@@ +273,5 @@
> +        else:
> +            assert isinstance(dest, Dest)
> +
> +        if not dest.exists():
> +            raise ErrorMessage("Required existing file doesn't exist: %s" %

errors.fatal()
Attachment #779982 - Flags: review?(mh+mozilla) → review+
No longer depends on: 870401
https://hg.mozilla.org/integration/mozilla-inbound/rev/a9714e589779

Part 3 broke unit tests due to my stupidity.
See Also: → 892747
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.