Generate test archives without staging files

RESOLVED FIXED in Firefox 44

Status

defect
RESOLVED FIXED
4 years ago
Last year

People

(Reporter: gps, Assigned: gps)

Tracking

unspecified
mozilla44
Dependency tree / graph

Firefox Tracking Flags

(firefox44 fixed)

Details

Attachments

(19 attachments, 1 obsolete attachment)

40 bytes, text/x-review-board-request
glandium
: review+
Details
40 bytes, text/x-review-board-request
glandium
: review+
Details
40 bytes, text/x-review-board-request
glandium
: review+
Details
40 bytes, text/x-review-board-request
glandium
: review+
Details
40 bytes, text/x-review-board-request
glandium
: review+
Details
40 bytes, text/x-review-board-request
glandium
: review+
Details
40 bytes, text/x-review-board-request
glandium
: review+
Details
40 bytes, text/x-review-board-request
glandium
: review+
Details
40 bytes, text/x-review-board-request
glandium
: review+
Details
40 bytes, text/x-review-board-request
glandium
: review+
Details
40 bytes, text/x-review-board-request
glandium
: review+
Details
40 bytes, text/x-review-board-request
glandium
: review+
Details
40 bytes, text/x-review-board-request
glandium
: review+
Details
40 bytes, text/x-review-board-request
glandium
: review+
Details
40 bytes, text/x-review-board-request
glandium
: review+
Details
40 bytes, text/x-review-board-request
glandium
: review+
Details
40 bytes, text/x-review-board-request
glandium
: review+
Details
40 bytes, text/x-review-board-request
glandium
: review+
Details
39.46 KB, image/png
Details
We currently copy test files into a staging directory so we can archive them. We also use plain zip to produce the archives, which leaks mtimes and makes output non-deterministic.

Let's plug in our mozpack JarWriter and produce test archives without the overhead of staging files first.
Bug 1208320 - INCOMPLETE allow FileFinder to find dot files

Previously, we skipped over files beginning with a ".". This commit adds
an option to include them.
Bug 1208320 - Produce mozharness test archive via mozpack; r?glandium

Test archive generation currently copies a bunch of files into a staging
area then runs `zip` to produce ZIP files. There are 2 concerns with
this approach:

1) We incur a lot of extra I/O to copy files so everything is
   rooted in a single tree so the `zip` invocation and paths are
   simple.
2) ZIP files inherit properties from the local filesystem (including
   mtime), making ZIP files non-deterministic.

This commit introduces a new mozbuild action for producing test
archives. It does so using the mozpack file finder and JAR writer,
which are used throughout the build to deterministically
produce ZIP/JAR files from files in multiple source directories.

We implement support for producing the mozharness archive. This archive
does not involve files that are staged, so no I/O is saved. In fact,
the switch from `zip` to Python likely makes this slightly slower.
However, we do have deterministic archives now.

Additional archives will be ported over in subsequent commits.
Bug 1208320 - Produce xpcshell archive without staging test files; r?glandium

This commit produces the xpcshell test archive without staging 5000+
xpcshell test files first.

The test_archive action has been taught new arguments to find files in
new base locations. We also teach it to ignore .mkdir.done files.

The xpcshell Makefile.in still stages some files. This is less than
ideal. However, it is a small handful of files and shouldn't add too
much overhead.

This appears to not impact overall CPU usage significantly on my
machine, despute using Python instead of `zip`. It does reduce I/O
by ~25MB by avoiding the staging copy.
Bug 1208320 - Produce mochitest test archive without staging test files; r?glandium

This is very similar to what we did for xpcshell. Like xpcshell, there
are still some staged files. However, about 73MB of copies are
eliminated with this change. On my machine, overall execution time of
test packaging appears to decrease, although CPU usage is up slightly.
Bug 1208320 - Produce web-platform test archive without staging; r?glandium

The web-platform test archive now builds without any staging at all.
This saves ~103 MB of file copies on my machine.

The testing/web-platform/Makefile.in serves no purpose after this
change, so it and all references to it have been removed.
The patches have r?glandium, but mozreview didn't set any flag, is that expected?
Depends on: 1208475
(In reply to Mike Hommey [:glandium] from comment #7)
> The patches have r?glandium, but mozreview didn't set any flag, is that
> expected?

I removed you as a reviewer before publishing. Not a bug.
Comment on attachment 8665776 [details]
MozReview Request: Bug 1208320 - Allow FileFinder to find dot files; r=glandium

Bug 1208320 - Allow FileFinder to find dot files; r?glandium

Previously, we always skipped over files beginning with a ".". This
commit adds an option to include them.

This is needed to support test package generation via Python / mozpack.
Attachment #8665776 - Attachment description: MozReview Request: Bug 1208320 - INCOMPLETE allow FileFinder to find dot files → MozReview Request: Bug 1208320 - Allow FileFinder to find dot files; r?glandium
Attachment #8665776 - Flags: review?(mh+mozilla)
Comment on attachment 8665777 [details]
MozReview Request: Bug 1208320 - Produce mozharness test archive via mozpack; r?glandium

Bug 1208320 - Produce mozharness test archive via mozpack; r?glandium

Test archive generation currently copies a bunch of files into a staging
area then runs `zip` to produce ZIP files. There are 2 concerns with
this approach:

1) We incur a lot of extra I/O to copy files so everything is
   rooted in a single tree so the `zip` invocation and paths are
   simple.
2) ZIP files inherit properties from the local filesystem (including
   mtime), making ZIP files non-deterministic.

This commit introduces a new mozbuild action for producing test
archives. It does so using the mozpack file finder and JAR writer,
which are used throughout the build to deterministically
produce ZIP/JAR files from files in multiple source directories.

We implement support for producing the mozharness archive. This archive
does not involve files that are staged, so no I/O is saved. In fact,
the switch from `zip` to Python likely makes this slightly slower.
However, we do have deterministic archives now.

Additional archives will be ported over in subsequent commits.
Attachment #8665777 - Flags: review?(mh+mozilla)
Comment on attachment 8665778 [details]
MozReview Request: Bug 1208320 - Produce xpcshell archive without staging test files; r?glandium

Bug 1208320 - Produce xpcshell archive without staging test files; r?glandium

This commit produces the xpcshell test archive without staging 5000+
xpcshell test files first.

The test_archive action has been taught new arguments to find files in
new base locations. We also teach it to ignore .mkdir.done files.

The xpcshell Makefile.in still stages some files. This is less than
ideal. However, it is a small handful of files and shouldn't add too
much overhead.

This appears to not impact overall CPU usage significantly on my
machine, despute using Python instead of `zip`. It does reduce I/O
by ~25MB by avoiding the staging copy.
Attachment #8665778 - Flags: review?(mh+mozilla)
Comment on attachment 8665779 [details]
MozReview Request: Bug 1208320 - Produce mochitest test archive without staging test files; r=glandium

Bug 1208320 - Produce mochitest test archive without staging test files; r?glandium

This is very similar to what we did for xpcshell. Like xpcshell, there
are still some staged files. However, about 73MB of copies are
eliminated with this change. On my machine, overall execution time of
test packaging appears to decrease, although CPU usage is up slightly.
Attachment #8665779 - Flags: review?(mh+mozilla)
Attachment #8665780 - Flags: review?(mh+mozilla)
Comment on attachment 8665780 [details]
MozReview Request: Bug 1208320 - Produce web-platform test archive without staging; r=glandium

Bug 1208320 - Produce web-platform test archive without staging; r?glandium

The web-platform test archive now builds without any staging at all.
This saves ~103 MB of file copies on my machine.

The testing/web-platform/Makefile.in serves no purpose after this
change, so it and all references to it have been removed.
Bug 1208320 - Produce talos test archive without staging files; r?glandium

This is pretty straightforward. This saves ~26 MB of file copies.
Attachment #8666210 - Flags: review?(mh+mozilla)
Bug 1208320 - Produce common tests archive via Python; r?glandium

This doesn't change I/O or copy behavior at all. But it does remove a
one-off make rule.
Attachment #8666211 - Flags: review?(mh+mozilla)
Bug 1208320 - Produce cppunittest and reftest packages via Python; r?glandium

With this change, all test ZIP archives are now generated via Python and
mozpack.

This change does not change I/O or file copy behavior at all. There is
still a lot of room for eliminating extra file copies.
Attachment #8666212 - Flags: review?(mh+mozilla)
Bug 1208320 - Do not stage JIT test files before archiving; r?glandium

This avoids copying 5000+ files consuming ~37 MB on my build
configuration.
Attachment #8666213 - Flags: review?(mh+mozilla)
Bug 1208320 - Do not stage some reftest support files before archiving; r?glandium

After this, only reftest files themselves are staged. Those will be
addressed in a subsequent commit.
Attachment #8666214 - Flags: review?(mh+mozilla)
Bug 1208320 - Do not stage reftest test files before archiving; r?glandium

This is slightly more involved than earlier changes because reftests
have a one-off mechanism for finding files. Essentially, the master
reftest manifest is loaded, directories are discovered, and every file
in those directories is packaged.

We add support to our test archive generation tool to read sources from
reftest manifests and tell it where the reftest manifests are.
print-manifest-dirs.py was only being used for staging reftest files.
Since we don't do that any more, the functionality doesn't need to exist
in a standalone file, so it has been moved inline into test_archive.py.

This change avoids copying ~26,000 tests consuming 131 MB during test
packaging. This is a majority of the file count that was remaining in
the stage directory at this point. On my machine (which hasn't typically
seen major wall time wins from not staging files due to its fast SSD),
this change made test packaging ~20% faster, reducing wall time from
~50s to ~40s!

A Try push seemed to indicate drastic results with the series up to this
point. Including the already landed changes to generate test archives
concurrently, test packaging times on OS X builders dropped from ~18:40
to 6:29! Times on Linux x64 remained about the same (~2:46). This is
possibly due to these machines already having SSDs and due to normal
variance in performance of builders and EC2 instances.
Attachment #8666215 - Flags: review?(mh+mozilla)
In case you missed it in the commit message, this series appears to drop 12 whole minutes from test packaging time on our OS X builders!

The crazy thing is there is some performance is still on the table: we still have 333 MB spread over ~10,000 files in the test-stage directory. Most of the size is binaries for C++ tests. Most of the files are jsreftests, which I just found out use a relatively crude mechanism for tracking that doesn't easily lend itself to the kinds of optimizations I've employed so far: they will need a bit more code to convert over.
Bug 1208320 - Do not stage TPS files before archiving; r?glandium

This saves copying of ~100 files comprising ~1 MB. Not significant. But
it gets us a little closer to no staging.
Attachment #8666308 - Flags: review?(mh+mozilla)
Bug 1208320 - Do not stage JS test modules before archiving; r?glandium

Saves 400 KB over 40 files on my machine.
Attachment #8666309 - Flags: review?(mh+mozilla)
Bug 1208320 - Do not stage mozbase files before archiving; r?glandium

This prevents copying of 447 files adding to ~4 MB.
Attachment #8666310 - Flags: review?(mh+mozilla)
Bug 1208320 - Do not stage some C++ unit test support files before archiving; r?glandium

Won't impact performance much. But fewer make foo makes porting the C++
unit tests (which are the largest remaining tests) to the Python
archiver easier to grok.
Attachment #8666311 - Flags: review?(mh+mozilla)
Bug 1208320 - Do not stage C++ unit tests unless needed; r?glandium

C++ unit tests may or may not be stripped. If they aren't being
stripped, it is possible to produce the test archive without staging
them. If they are being stripped, we still need to strip them before
producing the test archive.

The resulting code is a bit complex. Arguably binary stripping should be
performed at binary generation time. This is slightly more complicated
and beyond the scope of the quick hacks we're employing to make test
archive generation more optimal.

On my OS X build, test archive generation when stripping is not
performed results in ~330 MB of file copies not being performed. It's
likely that many of these binaries will be in the page cache from just
being built. However, if they aren't and the host doesn't have fast I/O,
this change could result in a drastic decrease in test archiving time.

After this change, the staging directory is only ~55 MB in my local
build. Most of that space is taken up by jsreftests.
Attachment #8666312 - Flags: review?(mh+mozilla)
Bug 1208320 - Print message when done with archiving; r?glandium

Metrics are nice. Adding this output clearly demonstrates that C++ unit
tests are the long pole by far: they take ~95% of wall execution time
to archive (~30s total). The next longest archive only takes ~11s to
produce. This will be important if we ever want to reduce archive time
further on optimal hardware.

FWIW, disabling compression will produce a C++ unit test archive in
1.0s. Archives with more files take longer, despite the significantly
smaller sizes.
Attachment #8666313 - Flags: review?(mh+mozilla)
Bug 1208320 - Support configuring zlib compression level; r?glandium

An upcoming commit will introduce a caller that doesn't want the maximum
compression level. This commit introduces arguments to control the
compression level inside written archives.
Attachment #8666314 - Flags: review?(mh+mozilla)
Bug 1208320 - Decrease compression level of test zip archives; r?glandium

Compressing C++ unit tests is a long pole when writing test archives.
Experimenting with various levels of compression revealed that
compression level 9 was providing minimal space savings for
significantly longer archiving times and greater CPU usage.

Results of our experimentation of `make -sj8 package-tests` on OS X
with various levels of compression are below. Note: these numbers were
accidentally obtained without JS tests being archived. This skews the
results a little but doesn't impact the analysis below.

ARCHIVE        SIZE       WALL   CPU

(L=9)
cppunittest  76,806,629  30.6s
mochitest    61,276,928   9.4s
reftest      31,204,396  11.0s
ALL         228,146,761  31.2s  75.9s

(L=8)
cppunittest  76,851,593  24.1s
mochitest    61,279,322   8.9s
reftest      31,207,867  10.4s
ALL         228,228,096  24.9s  64.7s

(L=7)
cppunittest  77,102,292  14.3s
mochitest    61,305,147   8.2s
reftest      31,260,359   9.4s
ALL         228,717,803  15.0s  49.1s

(L=6)
cppunittest  77,321,408  11.5s
mochitest    61,336,539   8.2s
reftest      31,303,604   9.2s
ALL         229,123,307  12.2s  44.7s

(L=5)
cppunittest  78,226,404   8.2s
mochitest    61,483,804   7.6s
reftest      31,509,349   8.8s
ALL         230,725,600   9.6s  39.7s

(L=4)
cppunittest  79,733,669   6.3s
mochitest    61,825,519   7.6s
reftest      31,924,171   8.4s
ALL         233,669,991   9.0s  36.4s

(L=3)
cppunittest  82,380,731   5.8s
mochitest    62,554,431   7.1s
reftest      32,696,415   8.1s
ALL         239,180,168   8.9s  34.6s

Levels lower than 3 resulted in larger archives with no decreae in
wall time and marginal decrease in CPU time.

As we can see, lowering the compression level reduces archiving time by
>3x while only increasing total archive size by ~2.5 MB or ~1% for
compression level 5.

Total time hits a plateau around levels 4 and 5. After that, file size
increases faster for little decrease in wall time. I suspect that we're
hitting Python limits from having to process thousands of files: there's
only so fast Python can do I/O and make function calls.

I think choosing 4 or 5 for the new compression level are acceptable.
I went with 5 because the wall time savings from 5 to 4 are marginal and
the archive size does start to increase a bit faster at 4. That being
said, 4 does consume 10% less CPU. I could easily just 4 as well. 5 is
more conservative. We can always change to 4 after seeing results in the
wild.

The end result of this change is `make package-tests` is much faster:

Before:  228,146,761 bytes;  31.2s wall;  75.9s CPU
After:   230,725,600 bytes;  11.4s wall;  45.0s CPU
Delta:    +2,578,839 bytes; -19.8s wall; -30.9s CPU

When you take the whole series into consideration:

Before:  44.2s wall; 84.6s CPU
After:   11.4s wall; 45.0s CPU

Lowering CPU is impressive considering we switched from the C `zip`
implementation to Python!

Keep in mind we were at ~78s wall before e87b74b3db43 introduced
concurrent archive generation!

And we still haven't eliminated the staging of JS tests, which are
several thousand files and a few dozen MB!
Attachment #8666315 - Flags: review?(mh+mozilla)
https://reviewboard.mozilla.org/r/20461/#review18393

::: python/mozbuild/mozbuild/action/test_archive.py:47
(Diff revision 1)
> +            'dest': 'jit-test',

Should this be `jit-test/tests/`?
https://reviewboard.mozilla.org/r/20519/#review18395

::: testing/testsuite-targets.mk:436
(Diff revision 1)
> +	MOZ_DISABLE_STARTUPCACHE=$$(MOZ_DISABLE_STARTUPCACHE) \

Can we make this an argument? Environment variables are annoying action-at-a-distance.
https://reviewboard.mozilla.org/r/20335/#review18397

While the latest try push failed on Linux, it looks like test archive generation on OS X is now down to ~5:15. It started in the 18 minute range.

The changes I cranked out in the latest submission didn't appear to move the needle that much in automation, despite a significant reduction on my local machine. I suspect I/O wait is such a massive issue on these machines that most of the time is spent obtaining the original copy of all the files going into the archive. Avoiding staging helps, certainly. But the cost of reading the file the first time appears to outweigh a significant amount of the costs of the extra staging copy. My reasoning here is that the stage will put the files in the page cache, so subsequent packaging shouldn't involve much read I/O. On my local machine where I/O is relatively cheap due to a SSD, the overhead of copying tens of thousands of files matters. All that being said, we should take my newer changes because they do result in less work. It's just too bad OS X builders can't remonstrate that. (If Linux builds completed, I bet wall time would have decreased.)
https://reviewboard.mozilla.org/r/20335/#review18399

Looks like I/O on Windows builders is horrible as well (although I don't have anything to compare it against):

  Wrote 15979 files in 18012347 bytes to firefox-44.0a1.en-US.win32.common.tests.zip in 22.45s
  Wrote 102 files in 1967414 bytes to firefox-44.0a1.en-US.win32.cppunittest.tests.zip in 0.40s
  Wrote 18315 files in 61541646 bytes to firefox-44.0a1.en-US.win32.mochitest.tests.zip in 197.99s
  Wrote 26119 files in 31509425 bytes to firefox-44.0a1.en-US.win32.reftest.tests.zip in 98.03s
  Wrote 338 files in 11131856 bytes to firefox-44.0a1.en-US.win32.talos.tests.zip in 11.79s
  Wrote 17065 files in 25949799 bytes to firefox-44.0a1.en-US.win32.web-platform.tests.zip in 64.39s
  Wrote 6405 files in 10536287 bytes to firefox-44.0a1.en-US.win32.xpcshell.tests.zip in 42.15s

Over 3 minutes to produce a mochitest archive that my MBP can do in under 10s with a fresh page cache (15-20s otherwise).

This is why we need SSDs everywhere (and especially on the builders).
As part of profiling Mercurial over the weekend, I discovered that CloseFile() is apparently insanely slow on Windows: times measured in ~1ms as opposed to ~1us for WriteFile(). For tens of thousands of synchronous file copies, this likely adds up to several seconds of delay. What I'm not sure of yet is whether moving CloseFile() to a background thread will speed things up. Things to consider for followup work.
https://reviewboard.mozilla.org/r/20461/#review18393

> Should this be `jit-test/tests/`?

No.

The previous behavior was: `cp -RL $(topsrcdir)/js/src/tests/ecma_6 $(PKG_STAGE)/jit-test/tests/`. This results in `$(PKG_STAGE)/jit-test/tests/ecma_6`.

The way the mozpack finder works is it returns paths relative to the base, including leading directories in the pattern. So files coming off the matcher look like `tests/ecma_6/X`. We join paths with `dest` resulting in `join(dest, 'tests/ecma_6/X')` which produces `jit-test/tests/ecma_6/X`, just like the `cp` behavior before.
Comment on attachment 8665776 [details]
MozReview Request: Bug 1208320 - Allow FileFinder to find dot files; r=glandium

Bug 1208320 - Allow FileFinder to find dot files; r?glandium

Previously, we always skipped over files beginning with a ".". This
commit adds an option to include them.

This is needed to support test package generation via Python / mozpack.
Comment on attachment 8665777 [details]
MozReview Request: Bug 1208320 - Produce mozharness test archive via mozpack; r?glandium

Bug 1208320 - Produce mozharness test archive via mozpack; r?glandium

Test archive generation currently copies a bunch of files into a staging
area then runs `zip` to produce ZIP files. There are 2 concerns with
this approach:

1) We incur a lot of extra I/O to copy files so everything is
   rooted in a single tree so the `zip` invocation and paths are
   simple.
2) ZIP files inherit properties from the local filesystem (including
   mtime), making ZIP files non-deterministic.

This commit introduces a new mozbuild action for producing test
archives. It does so using the mozpack file finder and JAR writer,
which are used throughout the build to deterministically
produce ZIP/JAR files from files in multiple source directories.

We implement support for producing the mozharness archive. This archive
does not involve files that are staged, so no I/O is saved. In fact,
the switch from `zip` to Python likely makes this slightly slower.
However, we do have deterministic archives now.

Additional archives will be ported over in subsequent commits.
Comment on attachment 8665778 [details]
MozReview Request: Bug 1208320 - Produce xpcshell archive without staging test files; r?glandium

Bug 1208320 - Produce xpcshell archive without staging test files; r?glandium

This commit produces the xpcshell test archive without staging 5000+
xpcshell test files first.

The test_archive action has been taught new arguments to find files in
new base locations. We also teach it to ignore .mkdir.done files.

The xpcshell Makefile.in still stages some files. This is less than
ideal. However, it is a small handful of files and shouldn't add too
much overhead.

This appears to not impact overall CPU usage significantly on my
machine, despute using Python instead of `zip`. It does reduce I/O
by ~25MB by avoiding the staging copy.
Comment on attachment 8665779 [details]
MozReview Request: Bug 1208320 - Produce mochitest test archive without staging test files; r=glandium

Bug 1208320 - Produce mochitest test archive without staging test files; r?glandium

This is very similar to what we did for xpcshell. Like xpcshell, there
are still some staged files. However, about 73MB of copies are
eliminated with this change. On my machine, overall execution time of
test packaging appears to decrease, although CPU usage is up slightly.
Comment on attachment 8665780 [details]
MozReview Request: Bug 1208320 - Produce web-platform test archive without staging; r=glandium

Bug 1208320 - Produce web-platform test archive without staging; r?glandium

The web-platform test archive now builds without any staging at all.
This saves ~103 MB of file copies on my machine.

The testing/web-platform/Makefile.in serves no purpose after this
change, so it and all references to it have been removed.
Comment on attachment 8666210 [details]
MozReview Request: Bug 1208320 - Produce talos test archive without staging files; r=glandium

Bug 1208320 - Produce talos test archive without staging files; r?glandium

This is pretty straightforward. This saves ~26 MB of file copies.
Comment on attachment 8666211 [details]
MozReview Request: Bug 1208320 - Produce common tests archive via Python; r?glandium

Bug 1208320 - Produce common tests archive via Python; r?glandium

This doesn't change I/O or copy behavior at all. But it does remove a
one-off make rule.
Comment on attachment 8666212 [details]
MozReview Request: Bug 1208320 - Produce cppunittest and reftest packages via Python; r=glandium

Bug 1208320 - Produce cppunittest and reftest packages via Python; r?glandium

With this change, all test ZIP archives are now generated via Python and
mozpack.

This change does not change I/O or file copy behavior at all. There is
still a lot of room for eliminating extra file copies.
Comment on attachment 8666213 [details]
MozReview Request: Bug 1208320 - Do not stage JIT test files before archiving; r=glandium

Bug 1208320 - Do not stage JIT test files before archiving; r?glandium

This avoids copying 5000+ files consuming ~37 MB on my build
configuration.
Comment on attachment 8666214 [details]
MozReview Request: Bug 1208320 - Do not stage some reftest support files before archiving; r=glandium

Bug 1208320 - Do not stage some reftest support files before archiving; r?glandium

After this, only reftest files themselves are staged. Those will be
addressed in a subsequent commit.
Comment on attachment 8666215 [details]
MozReview Request: Bug 1208320 - Do not stage reftest test files before archiving; r?glandium

Bug 1208320 - Do not stage reftest test files before archiving; r?glandium

This is slightly more involved than earlier changes because reftests
have a one-off mechanism for finding files. Essentially, the master
reftest manifest is loaded, directories are discovered, and every file
in those directories is packaged.

We add support to our test archive generation tool to read sources from
reftest manifests and tell it where the reftest manifests are.
print-manifest-dirs.py was only being used for staging reftest files.
Since we don't do that any more, the functionality doesn't need to exist
in a standalone file, so it has been moved inline into test_archive.py.

This change avoids copying ~26,000 tests consuming 131 MB during test
packaging. This is a majority of the file count that was remaining in
the stage directory at this point. On my machine (which hasn't typically
seen major wall time wins from not staging files due to its fast SSD),
this change made test packaging ~20% faster, reducing wall time from
~50s to ~40s!

A Try push seemed to indicate drastic results with the series up to this
point. Including the already landed changes to generate test archives
concurrently, test packaging times on OS X builders dropped from ~18:40
to 6:29! Times on Linux x64 remained about the same (~2:46). This is
possibly due to these machines already having SSDs and due to normal
variance in performance of builders and EC2 instances.
Comment on attachment 8666308 [details]
MozReview Request: Bug 1208320 - Do not stage TPS files before archiving; r=glandium

Bug 1208320 - Do not stage TPS files before archiving; r?glandium

This saves copying of ~100 files comprising ~1 MB. Not significant. But
it gets us a little closer to no staging.
Comment on attachment 8666309 [details]
MozReview Request: Bug 1208320 - Do not stage JS test modules before archiving; r=glandium

Bug 1208320 - Do not stage JS test modules before archiving; r?glandium

Saves 400 KB over 40 files on my machine.
Comment on attachment 8666310 [details]
MozReview Request: Bug 1208320 - Do not stage mozbase files before archiving; r=glandium

Bug 1208320 - Do not stage mozbase files before archiving; r?glandium

This prevents copying of 447 files adding to ~4 MB.
Comment on attachment 8666311 [details]
MozReview Request: Bug 1208320 - Do not stage some C++ unit test support files before archiving; r?glandium

Bug 1208320 - Do not stage some C++ unit test support files before archiving; r?glandium

Won't impact performance much. But fewer make foo makes porting the C++
unit tests (which are the largest remaining tests) to the Python
archiver easier to grok.
Comment on attachment 8666313 [details]
MozReview Request: Bug 1208320 - Print message when done with archiving; r=glandium

Bug 1208320 - Print message when done with archiving; r?glandium

Metrics are nice. Adding this output clearly demonstrates that C++ unit
tests are the long pole by far: they take ~95% of wall execution time
to archive (~30s total). The next longest archive only takes ~11s to
produce. This will be important if we ever want to reduce archive time
further on optimal hardware.

FWIW, disabling compression will produce a C++ unit test archive in
1.0s. Archives with more files take longer, despite the significantly
smaller sizes.
Comment on attachment 8666314 [details]
MozReview Request: Bug 1208320 - Support configuring zlib compression level; r=glandium

Bug 1208320 - Support configuring zlib compression level; r?glandium

An upcoming commit will introduce a caller that doesn't want the maximum
compression level. This commit introduces arguments to control the
compression level inside written archives.
Comment on attachment 8666315 [details]
MozReview Request: Bug 1208320 - Decrease compression level of test zip archives; r=glandium

Bug 1208320 - Decrease compression level of test zip archives; r?glandium

Compressing C++ unit tests is a long pole when writing test archives.
Experimenting with various levels of compression revealed that
compression level 9 was providing minimal space savings for
significantly longer archiving times and greater CPU usage.

Results of our experimentation of `make -sj8 package-tests` on OS X
with various levels of compression are below. Note: these numbers were
accidentally obtained without JS tests being archived. This skews the
results a little but doesn't impact the analysis below.

ARCHIVE        SIZE       WALL   CPU

(L=9)
cppunittest  76,806,629  30.6s
mochitest    61,276,928   9.4s
reftest      31,204,396  11.0s
ALL         228,146,761  31.2s  75.9s

(L=8)
cppunittest  76,851,593  24.1s
mochitest    61,279,322   8.9s
reftest      31,207,867  10.4s
ALL         228,228,096  24.9s  64.7s

(L=7)
cppunittest  77,102,292  14.3s
mochitest    61,305,147   8.2s
reftest      31,260,359   9.4s
ALL         228,717,803  15.0s  49.1s

(L=6)
cppunittest  77,321,408  11.5s
mochitest    61,336,539   8.2s
reftest      31,303,604   9.2s
ALL         229,123,307  12.2s  44.7s

(L=5)
cppunittest  78,226,404   8.2s
mochitest    61,483,804   7.6s
reftest      31,509,349   8.8s
ALL         230,725,600   9.6s  39.7s

(L=4)
cppunittest  79,733,669   6.3s
mochitest    61,825,519   7.6s
reftest      31,924,171   8.4s
ALL         233,669,991   9.0s  36.4s

(L=3)
cppunittest  82,380,731   5.8s
mochitest    62,554,431   7.1s
reftest      32,696,415   8.1s
ALL         239,180,168   8.9s  34.6s

Levels lower than 3 resulted in larger archives with no decreae in
wall time and marginal decrease in CPU time.

As we can see, lowering the compression level reduces archiving time by
>3x while only increasing total archive size by ~2.5 MB or ~1% for
compression level 5.

Total time hits a plateau around levels 4 and 5. After that, file size
increases faster for little decrease in wall time. I suspect that we're
hitting Python limits from having to process thousands of files: there's
only so fast Python can do I/O and make function calls.

I think choosing 4 or 5 for the new compression level are acceptable.
I went with 5 because the wall time savings from 5 to 4 are marginal and
the archive size does start to increase a bit faster at 4. That being
said, 4 does consume 10% less CPU. I could easily just 4 as well. 5 is
more conservative. We can always change to 4 after seeing results in the
wild.

The end result of this change is `make package-tests` is much faster:

Before:  228,146,761 bytes;  31.2s wall;  75.9s CPU
After:   230,725,600 bytes;  11.4s wall;  45.0s CPU
Delta:    +2,578,839 bytes; -19.8s wall; -30.9s CPU

When you take the whole series into consideration:

Before:  44.2s wall; 84.6s CPU
After:   11.4s wall; 45.0s CPU

Lowering CPU is impressive considering we switched from the C `zip`
implementation to Python!

Keep in mind we were at ~78s wall before e87b74b3db43 introduced
concurrent archive generation!

And we still haven't eliminated the staging of JS tests, which are
several thousand files and a few dozen MB!
Attachment #8666312 - Attachment is obsolete: true
Attachment #8666312 - Flags: review?(mh+mozilla)
https://reviewboard.mozilla.org/r/20519/#review18395

> Can we make this an argument? Environment variables are annoying action-at-a-distance.

Fixed in latest version.
https://reviewboard.mozilla.org/r/20335/#review18523

I dropped more intelligent staging of C++ test files from the series because it caused failures in automation. I'm content with leaving this to a follow-up bug.
Comment on attachment 8665776 [details]
MozReview Request: Bug 1208320 - Allow FileFinder to find dot files; r=glandium

https://reviewboard.mozilla.org/r/20337/#review18527

::: python/mozbuild/mozpack/files.py:834
(Diff revision 3)
> +        self.find_dotfiles = find_dotfiles

nit: could move this line one or two lines up for a nicer order.
Attachment #8665776 - Flags: review?(mh+mozilla) → review+
Comment on attachment 8665777 [details]
MozReview Request: Bug 1208320 - Produce mozharness test archive via mozpack; r?glandium

https://reviewboard.mozilla.org/r/20339/#review18529

::: python/mozbuild/mozbuild/action/test_archive.py:5
(Diff revision 3)
> +# This action is used to produce test archives.

First things first, besides the fact that this action is used to produce test archives, there is nothing inherently limited to producing test archives about it. In fact, it could be viewed as an enhanced install-manifest handler that copies into a zip archive instead of a directory.

Which leads me to the second point: I'm not a big fan of the ARCHIVE_FILES dict way of doing this. I think we would be better served with some sort of (preprocessed?) manifest files.

Can we discuss this before I go through the rest of this long queue?
Attachment #8665777 - Flags: review?(mh+mozilla)
Blocks: 1209628
https://reviewboard.mozilla.org/r/20339/#review18625

::: python/mozbuild/mozbuild/action/test_archive.py:53
(Diff revision 3)
> +    parser.add_argument('topsrcdir', help='Where input files are located')

I'd rather import buildconfig and use buildconfig.topsrcdir.
Comment on attachment 8665778 [details]
MozReview Request: Bug 1208320 - Produce xpcshell archive without staging test files; r?glandium

https://reviewboard.mozilla.org/r/20341/#review18627

::: python/mozbuild/mozbuild/action/test_archive.py:38
(Diff revision 3)
> +            'source': 'stage',

I think it would be reasonable to skip "stage" entirely, and hardcode it as topobjdir/test-stage. Which, combined with the use of buildconfig for topobjdir and topsrcdir allows to keep the number of necessary command line arguments low.

::: python/mozbuild/mozbuild/action/test_archive.py:62
(Diff revision 3)
>          if source == 'topsrcdir':
>              finder = FileFinder(os.path.join(topsrcdir, base),
> -                                find_executables=False,
> -                                find_dotfiles=True)
> +                                **common_kwargs)
> +        elif source == 'topobjdir':
> +            finder = FileFinder(os.path.join(topobjdir, base),
> +                                **common_kwargs)
> +        elif source == 'stage':
> +            finder = FileFinder(os.path.join(stage, base),
> +                                **common_kwargs)

You could use buildconfig.topsrcdir/topobjdir, directly in ARCHIVE_FILES, so this conditional block would just be a single call.
Attachment #8665778 - Flags: review?(mh+mozilla)
Comment on attachment 8665779 [details]
MozReview Request: Bug 1208320 - Produce mochitest test archive without staging test files; r=glandium

https://reviewboard.mozilla.org/r/20343/#review18655
Attachment #8665779 - Flags: review?(mh+mozilla) → review+
Comment on attachment 8665780 [details]
MozReview Request: Bug 1208320 - Produce web-platform test archive without staging; r=glandium

https://reviewboard.mozilla.org/r/20345/#review18657
Attachment #8665780 - Flags: review?(mh+mozilla) → review+
Comment on attachment 8666210 [details]
MozReview Request: Bug 1208320 - Produce talos test archive without staging files; r=glandium

https://reviewboard.mozilla.org/r/20455/#review18659
Attachment #8666210 - Flags: review?(mh+mozilla) → review+
Comment on attachment 8666211 [details]
MozReview Request: Bug 1208320 - Produce common tests archive via Python; r?glandium

https://reviewboard.mozilla.org/r/20457/#review18629

::: python/mozbuild/mozbuild/action/test_archive.py:29
(Diff revision 2)
> +                'cppunittest/**',
> +                'mochitest/**',
> +                'reftest/**',
> +                'talos/**',
> +                'web-platform/**',
> +                'xpcshell/**',

This smells like something that would not be updated when a new suite is added. Can you add a check that nothing that is defined in ARCHIVE_FILES is missing from that list?
Attachment #8666211 - Flags: review?(mh+mozilla)
Attachment #8666212 - Flags: review?(mh+mozilla) → review+
Comment on attachment 8666212 [details]
MozReview Request: Bug 1208320 - Produce cppunittest and reftest packages via Python; r=glandium

https://reviewboard.mozilla.org/r/20459/#review18631

::: testing/testsuite-targets.mk:439
(Diff revision 2)
>  define python_test_archive

Please make this package_archive again.
Comment on attachment 8666213 [details]
MozReview Request: Bug 1208320 - Do not stage JIT test files before archiving; r=glandium

https://reviewboard.mozilla.org/r/20461/#review18633

::: python/mozbuild/mozbuild/action/test_archive.py:47
(Diff revision 2)
> +            'dest': 'jit-test',

jit-test/tests

::: python/mozbuild/mozbuild/action/test_archive.py:53
(Diff revision 2)
> +            'dest': 'jit-test',

likewise

::: python/mozbuild/mozbuild/action/test_archive.py:59
(Diff revision 2)
> +            'dest': 'jit-test',

likewise
Attachment #8666213 - Flags: review?(mh+mozilla) → review+
Comment on attachment 8666214 [details]
MozReview Request: Bug 1208320 - Do not stage some reftest support files before archiving; r=glandium

https://reviewboard.mozilla.org/r/20463/#review18661
Attachment #8666214 - Flags: review?(mh+mozilla) → review+
Comment on attachment 8666215 [details]
MozReview Request: Bug 1208320 - Do not stage reftest test files before archiving; r?glandium

https://reviewboard.mozilla.org/r/20465/#review18647

::: python/mozbuild/mozbuild/action/test_archive.py:96
(Diff revision 2)
> -            'source': 'stage',
> -            'base': '',
> -            'pattern': 'reftest/**',
> +            'source': 'reftest',
> +            'manifests': [
> +                'layout/reftests/reftest.list',
> +                'testing/crashtest/crashtests.list',
> +            ],

Somehow, I'd prefer something that fills ARCHIVE_FILES['reftest'] with "normal" entries based on the manifests. Then you don't need the special behavior in find_files.
Attachment #8666215 - Flags: review?(mh+mozilla)
Comment on attachment 8666308 [details]
MozReview Request: Bug 1208320 - Do not stage TPS files before archiving; r=glandium

https://reviewboard.mozilla.org/r/20513/#review18663
Attachment #8666308 - Flags: review?(mh+mozilla) → review+
Comment on attachment 8666309 [details]
MozReview Request: Bug 1208320 - Do not stage JS test modules before archiving; r=glandium

https://reviewboard.mozilla.org/r/20515/#review18665
Attachment #8666309 - Flags: review?(mh+mozilla) → review+
Comment on attachment 8666310 [details]
MozReview Request: Bug 1208320 - Do not stage mozbase files before archiving; r=glandium

https://reviewboard.mozilla.org/r/20517/#review18667
Attachment #8666310 - Flags: review?(mh+mozilla) → review+
Comment on attachment 8666311 [details]
MozReview Request: Bug 1208320 - Do not stage some C++ unit test support files before archiving; r?glandium

https://reviewboard.mozilla.org/r/20519/#review18651

::: python/mozbuild/mozbuild/action/test_archive.py:300
(Diff revision 2)
> +            })

I'm tempted to say that these two files don't hurt if they're in the archive even when startupcache is disabled. Plus, disabled startupcache shouldn't be common, I don't see the benefit of code complexity here for something harmless. Just always copy the files.
Attachment #8666311 - Flags: review?(mh+mozilla)
Comment on attachment 8666313 [details]
MozReview Request: Bug 1208320 - Print message when done with archiving; r=glandium

https://reviewboard.mozilla.org/r/20523/#review18653

::: python/mozbuild/mozbuild/action/test_archive.py:310
(Diff revision 2)
>              for p, f in res:

You could do:
for file_count, (p, f) in enumerate(res, 1):
Attachment #8666313 - Flags: review?(mh+mozilla) → review+
Comment on attachment 8666314 [details]
MozReview Request: Bug 1208320 - Support configuring zlib compression level; r=glandium

https://reviewboard.mozilla.org/r/20525/#review18669
Attachment #8666314 - Flags: review?(mh+mozilla) → review+
Comment on attachment 8666315 [details]
MozReview Request: Bug 1208320 - Decrease compression level of test zip archives; r=glandium

https://reviewboard.mozilla.org/r/20527/#review18671
Attachment #8666315 - Flags: review?(mh+mozilla) → review+
https://reviewboard.mozilla.org/r/20465/#review18647

> Somehow, I'd prefer something that fills ARCHIVE_FILES['reftest'] with "normal" entries based on the manifests. Then you don't need the special behavior in find_files.

I agree that it is dirty.

I don't want to have to parse reftest manifests on every process invocation. I'll see about finding a slightly better way to factor this out.
https://reviewboard.mozilla.org/r/20519/#review18651

> I'm tempted to say that these two files don't hurt if they're in the archive even when startupcache is disabled. Plus, disabled startupcache shouldn't be common, I don't see the benefit of code complexity here for something harmless. Just always copy the files.

Looking at the commit that added the guard, I think you are correct.
Comment on attachment 8665776 [details]
MozReview Request: Bug 1208320 - Allow FileFinder to find dot files; r=glandium

Bug 1208320 - Allow FileFinder to find dot files; r=glandium

Previously, we always skipped over files beginning with a ".". This
commit adds an option to include them.

This is needed to support test package generation via Python / mozpack.
Attachment #8665776 - Attachment description: MozReview Request: Bug 1208320 - Allow FileFinder to find dot files; r?glandium → MozReview Request: Bug 1208320 - Allow FileFinder to find dot files; r=glandium
Comment on attachment 8665777 [details]
MozReview Request: Bug 1208320 - Produce mozharness test archive via mozpack; r?glandium

Bug 1208320 - Produce mozharness test archive via mozpack; r?glandium

Test archive generation currently copies a bunch of files into a staging
area then runs `zip` to produce ZIP files. There are 2 concerns with
this approach:

1) We incur a lot of extra I/O to copy files so everything is
   rooted in a single tree so the `zip` invocation and paths are
   simple.
2) ZIP files inherit properties from the local filesystem (including
   mtime), making ZIP files non-deterministic.

This commit introduces a new mozbuild action for producing test
archives. It does so using the mozpack file finder and JAR writer,
which are used throughout the build to deterministically
produce ZIP/JAR files from files in multiple source directories.

We implement support for producing the mozharness archive. This archive
does not involve files that are staged, so no I/O is saved. In fact,
the switch from `zip` to Python likely makes this slightly slower.
However, we do have deterministic archives now.

Additional archives will be ported over in subsequent commits.
Attachment #8665777 - Flags: review?(mh+mozilla)
Comment on attachment 8665778 [details]
MozReview Request: Bug 1208320 - Produce xpcshell archive without staging test files; r?glandium

Bug 1208320 - Produce xpcshell archive without staging test files; r?glandium

This commit produces the xpcshell test archive without staging 5000+
xpcshell test files first.

We teach the archiver to ignore .mkdir.done files.

The xpcshell Makefile.in still stages some files. This is less than
ideal. However, it is a small handful of files and shouldn't add too
much overhead.

This appears to not impact overall CPU usage significantly on my
machine, despute using Python instead of `zip`. It does reduce I/O
by ~25MB by avoiding the staging copy.
Attachment #8665778 - Flags: review?(mh+mozilla)
Attachment #8665779 - Attachment description: MozReview Request: Bug 1208320 - Produce mochitest test archive without staging test files; r?glandium → MozReview Request: Bug 1208320 - Produce mochitest test archive without staging test files; r=glandium
Comment on attachment 8665779 [details]
MozReview Request: Bug 1208320 - Produce mochitest test archive without staging test files; r=glandium

Bug 1208320 - Produce mochitest test archive without staging test files; r=glandium

This is very similar to what we did for xpcshell. Like xpcshell, there
are still some staged files. However, about 73MB of copies are
eliminated with this change. On my machine, overall execution time of
test packaging appears to decrease, although CPU usage is up slightly.
Comment on attachment 8665780 [details]
MozReview Request: Bug 1208320 - Produce web-platform test archive without staging; r=glandium

Bug 1208320 - Produce web-platform test archive without staging; r=glandium

The web-platform test archive now builds without any staging at all.
This saves ~103 MB of file copies on my machine.

The testing/web-platform/Makefile.in serves no purpose after this
change, so it and all references to it have been removed.
Attachment #8665780 - Attachment description: MozReview Request: Bug 1208320 - Produce web-platform test archive without staging; r?glandium → MozReview Request: Bug 1208320 - Produce web-platform test archive without staging; r=glandium
Comment on attachment 8666210 [details]
MozReview Request: Bug 1208320 - Produce talos test archive without staging files; r=glandium

Bug 1208320 - Produce talos test archive without staging files; r=glandium

This is pretty straightforward. This saves ~26 MB of file copies.
Attachment #8666210 - Attachment description: MozReview Request: Bug 1208320 - Produce talos test archive without staging files; r?glandium → MozReview Request: Bug 1208320 - Produce talos test archive without staging files; r=glandium
Comment on attachment 8666211 [details]
MozReview Request: Bug 1208320 - Produce common tests archive via Python; r?glandium

Bug 1208320 - Produce common tests archive via Python; r?glandium

This doesn't change I/O or copy behavior at all. But it does remove a
one-off make rule.
Attachment #8666211 - Flags: review?(mh+mozilla)
Comment on attachment 8666212 [details]
MozReview Request: Bug 1208320 - Produce cppunittest and reftest packages via Python; r=glandium

Bug 1208320 - Produce cppunittest and reftest packages via Python; r=glandium

With this change, all test ZIP archives are now generated via Python and
mozpack.

This change does not change I/O or file copy behavior at all. There is
still a lot of room for eliminating extra file copies.
Attachment #8666212 - Attachment description: MozReview Request: Bug 1208320 - Produce cppunittest and reftest packages via Python; r?glandium → MozReview Request: Bug 1208320 - Produce cppunittest and reftest packages via Python; r=glandium
Attachment #8666213 - Attachment description: MozReview Request: Bug 1208320 - Do not stage JIT test files before archiving; r?glandium → MozReview Request: Bug 1208320 - Do not stage JIT test files before archiving; r=glandium
Comment on attachment 8666213 [details]
MozReview Request: Bug 1208320 - Do not stage JIT test files before archiving; r=glandium

Bug 1208320 - Do not stage JIT test files before archiving; r=glandium

This avoids copying 5000+ files consuming ~37 MB on my build
configuration.
Comment on attachment 8666214 [details]
MozReview Request: Bug 1208320 - Do not stage some reftest support files before archiving; r=glandium

Bug 1208320 - Do not stage some reftest support files before archiving; r=glandium

After this, only reftest files themselves are staged. Those will be
addressed in a subsequent commit.
Attachment #8666214 - Attachment description: MozReview Request: Bug 1208320 - Do not stage some reftest support files before archiving; r?glandium → MozReview Request: Bug 1208320 - Do not stage some reftest support files before archiving; r=glandium
Comment on attachment 8666215 [details]
MozReview Request: Bug 1208320 - Do not stage reftest test files before archiving; r?glandium

Bug 1208320 - Do not stage reftest test files before archiving; r?glandium

This is slightly more involved than earlier changes because reftests
have a one-off mechanism for finding files. Essentially, the master
reftest manifest is loaded, directories are discovered, and every file
in those directories is packaged.

We add support to our test archive generation tool to read sources from
reftest manifests and tell it where the reftest manifests are.
print-manifest-dirs.py was only being used for staging reftest files.
Since we don't do that any more, the functionality doesn't need to exist
in a standalone file, so it has been moved inline into test_archive.py.

This change avoids copying ~26,000 tests consuming 131 MB during test
packaging. This is a majority of the file count that was remaining in
the stage directory at this point. On my machine (which hasn't typically
seen major wall time wins from not staging files due to its fast SSD),
this change made test packaging ~20% faster, reducing wall time from
~50s to ~40s!

A Try push seemed to indicate drastic results with the series up to this
point. Including the already landed changes to generate test archives
concurrently, test packaging times on OS X builders dropped from ~18:40
to 6:29! Times on Linux x64 remained about the same (~2:46). This is
possibly due to these machines already having SSDs and due to normal
variance in performance of builders and EC2 instances.
Attachment #8666215 - Flags: review?(mh+mozilla)
Attachment #8666308 - Attachment description: MozReview Request: Bug 1208320 - Do not stage TPS files before archiving; r?glandium → MozReview Request: Bug 1208320 - Do not stage TPS files before archiving; r=glandium
Comment on attachment 8666308 [details]
MozReview Request: Bug 1208320 - Do not stage TPS files before archiving; r=glandium

Bug 1208320 - Do not stage TPS files before archiving; r=glandium

This saves copying of ~100 files comprising ~1 MB. Not significant. But
it gets us a little closer to no staging.
Comment on attachment 8666309 [details]
MozReview Request: Bug 1208320 - Do not stage JS test modules before archiving; r=glandium

Bug 1208320 - Do not stage JS test modules before archiving; r=glandium

Saves 400 KB over 40 files on my machine.
Attachment #8666309 - Attachment description: MozReview Request: Bug 1208320 - Do not stage JS test modules before archiving; r?glandium → MozReview Request: Bug 1208320 - Do not stage JS test modules before archiving; r=glandium
Comment on attachment 8666310 [details]
MozReview Request: Bug 1208320 - Do not stage mozbase files before archiving; r=glandium

Bug 1208320 - Do not stage mozbase files before archiving; r=glandium

This prevents copying of 447 files adding to ~4 MB.
Attachment #8666310 - Attachment description: MozReview Request: Bug 1208320 - Do not stage mozbase files before archiving; r?glandium → MozReview Request: Bug 1208320 - Do not stage mozbase files before archiving; r=glandium
Comment on attachment 8666311 [details]
MozReview Request: Bug 1208320 - Do not stage some C++ unit test support files before archiving; r?glandium

Bug 1208320 - Do not stage some C++ unit test support files before archiving; r?glandium

Won't impact performance much. But fewer make foo makes porting the C++
unit tests (which are the largest remaining tests) to the Python
archiver easier to grok.

This conversion did change behavior slightly. Previously, startup
cache files weren't being packaged if startup cache was disabled. Now,
we always package them since their presence in the test archive should
be harmless. The original change to guard their inclusion in
ee82e0ae5488 was probably unnecessary.
Attachment #8666311 - Flags: review?(mh+mozilla)
Comment on attachment 8666313 [details]
MozReview Request: Bug 1208320 - Print message when done with archiving; r=glandium

Bug 1208320 - Print message when done with archiving; r=glandium

Metrics are nice. Adding this output clearly demonstrates that C++ unit
tests are the long pole by far: they take ~95% of wall execution time
to archive (~30s total). The next longest archive only takes ~11s to
produce. This will be important if we ever want to reduce archive time
further on optimal hardware.

FWIW, disabling compression will produce a C++ unit test archive in
1.0s. Archives with more files take longer, despite the significantly
smaller sizes.
Attachment #8666313 - Attachment description: MozReview Request: Bug 1208320 - Print message when done with archiving; r?glandium → MozReview Request: Bug 1208320 - Print message when done with archiving; r=glandium
Comment on attachment 8666314 [details]
MozReview Request: Bug 1208320 - Support configuring zlib compression level; r=glandium

Bug 1208320 - Support configuring zlib compression level; r=glandium

An upcoming commit will introduce a caller that doesn't want the maximum
compression level. This commit introduces arguments to control the
compression level inside written archives.
Attachment #8666314 - Attachment description: MozReview Request: Bug 1208320 - Support configuring zlib compression level; r?glandium → MozReview Request: Bug 1208320 - Support configuring zlib compression level; r=glandium
Attachment #8666315 - Attachment description: MozReview Request: Bug 1208320 - Decrease compression level of test zip archives; r?glandium → MozReview Request: Bug 1208320 - Decrease compression level of test zip archives; r=glandium
Comment on attachment 8666315 [details]
MozReview Request: Bug 1208320 - Decrease compression level of test zip archives; r=glandium

Bug 1208320 - Decrease compression level of test zip archives; r=glandium

Compressing C++ unit tests is a long pole when writing test archives.
Experimenting with various levels of compression revealed that
compression level 9 was providing minimal space savings for
significantly longer archiving times and greater CPU usage.

Results of our experimentation of `make -sj8 package-tests` on OS X
with various levels of compression are below. Note: these numbers were
accidentally obtained without JS tests being archived. This skews the
results a little but doesn't impact the analysis below.

ARCHIVE        SIZE       WALL   CPU

(L=9)
cppunittest  76,806,629  30.6s
mochitest    61,276,928   9.4s
reftest      31,204,396  11.0s
ALL         228,146,761  31.2s  75.9s

(L=8)
cppunittest  76,851,593  24.1s
mochitest    61,279,322   8.9s
reftest      31,207,867  10.4s
ALL         228,228,096  24.9s  64.7s

(L=7)
cppunittest  77,102,292  14.3s
mochitest    61,305,147   8.2s
reftest      31,260,359   9.4s
ALL         228,717,803  15.0s  49.1s

(L=6)
cppunittest  77,321,408  11.5s
mochitest    61,336,539   8.2s
reftest      31,303,604   9.2s
ALL         229,123,307  12.2s  44.7s

(L=5)
cppunittest  78,226,404   8.2s
mochitest    61,483,804   7.6s
reftest      31,509,349   8.8s
ALL         230,725,600   9.6s  39.7s

(L=4)
cppunittest  79,733,669   6.3s
mochitest    61,825,519   7.6s
reftest      31,924,171   8.4s
ALL         233,669,991   9.0s  36.4s

(L=3)
cppunittest  82,380,731   5.8s
mochitest    62,554,431   7.1s
reftest      32,696,415   8.1s
ALL         239,180,168   8.9s  34.6s

Levels lower than 3 resulted in larger archives with no decreae in
wall time and marginal decrease in CPU time.

As we can see, lowering the compression level reduces archiving time by
>3x while only increasing total archive size by ~2.5 MB or ~1% for
compression level 5.

Total time hits a plateau around levels 4 and 5. After that, file size
increases faster for little decrease in wall time. I suspect that we're
hitting Python limits from having to process thousands of files: there's
only so fast Python can do I/O and make function calls.

I think choosing 4 or 5 for the new compression level are acceptable.
I went with 5 because the wall time savings from 5 to 4 are marginal and
the archive size does start to increase a bit faster at 4. That being
said, 4 does consume 10% less CPU. I could easily just 4 as well. 5 is
more conservative. We can always change to 4 after seeing results in the
wild.

The end result of this change is `make package-tests` is much faster:

Before:  228,146,761 bytes;  31.2s wall;  75.9s CPU
After:   230,725,600 bytes;  11.4s wall;  45.0s CPU
Delta:    +2,578,839 bytes; -19.8s wall; -30.9s CPU

When you take the whole series into consideration:

Before:  44.2s wall; 84.6s CPU
After:   11.4s wall; 45.0s CPU

Lowering CPU is impressive considering we switched from the C `zip`
implementation to Python!

Keep in mind we were at ~78s wall before e87b74b3db43 introduced
concurrent archive generation!

And we still haven't eliminated the staging of JS tests, which are
several thousand files and a few dozen MB!
Comment on attachment 8665777 [details]
MozReview Request: Bug 1208320 - Produce mozharness test archive via mozpack; r?glandium

https://reviewboard.mozilla.org/r/20339/#review18793
Attachment #8665777 - Flags: review?(mh+mozilla) → review+
Comment on attachment 8665778 [details]
MozReview Request: Bug 1208320 - Produce xpcshell archive without staging test files; r?glandium

https://reviewboard.mozilla.org/r/20341/#review18795
Attachment #8665778 - Flags: review?(mh+mozilla) → review+
Attachment #8666211 - Flags: review?(mh+mozilla) → review+
Comment on attachment 8666211 [details]
MozReview Request: Bug 1208320 - Produce common tests archive via Python; r?glandium

https://reviewboard.mozilla.org/r/20457/#review18797

::: python/mozbuild/mozbuild/action/test_archive.py:98
(Diff revisions 2 - 3)
> +# Verify nothing sneaks into ARCHIVE_FILES without a corresponding exlusion

typo: exclusion

::: python/mozbuild/mozbuild/action/test_archive.py:101
(Diff revisions 2 - 3)
> +    if k == 'common':
> +        continue
> +
> +    # Skip mozharness because it isn't staged.
> +    if k == 'mozharness':
> +        continue

might as well group both
if k in ('common', 'mozharness')

::: python/mozbuild/mozbuild/action/test_archive.py:108
(Diff revisions 2 - 3)
> +    ignores = set()
> +    for e in ARCHIVE_FILES['common']:
> +        ignores |= set(e.get('ignore', []))

could be written as
set(chain(e.get('ignore', []) for e in ARCHIVE_FILES['common']))
Comment on attachment 8666215 [details]
MozReview Request: Bug 1208320 - Do not stage reftest test files before archiving; r?glandium

https://reviewboard.mozilla.org/r/20465/#review18799
Attachment #8666215 - Flags: review?(mh+mozilla) → review+
Comment on attachment 8666311 [details]
MozReview Request: Bug 1208320 - Do not stage some C++ unit test support files before archiving; r?glandium

https://reviewboard.mozilla.org/r/20519/#review18801

::: python/mozbuild/mozbuild/action/test_archive.py:104
(Diff revisions 2 - 3)
> +        # These files aren't present if startup cache is disabled, which is

Technically, the files are always there, since they are in the source tree. We just don't care to not skip them when startupcache is disabled. Please update the comment accordingly.

::: python/mozbuild/mozbuild/action/test_archive.py:224
(Diff revision 3)
> -def find_files(archive):
> -    for entry in ARCHIVE_FILES[archive]:
> +def find_files(entries):
> +    for entry in entries:

You don't need to change the function signature anymore.
Attachment #8666311 - Flags: review?(mh+mozilla) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/62cade67affe5f6989e7ca414cbc533b7e42b94a
Bug 1208320 - Allow FileFinder to find dot files; r=glandium

https://hg.mozilla.org/integration/mozilla-inbound/rev/3c91cbd21682ef1c095535d353af795e22a378fa
Bug 1208320 - Produce mozharness test archive via mozpack; r=glandium

https://hg.mozilla.org/integration/mozilla-inbound/rev/1c9cf4ccedaaa1a237ae8f7a46bf6ea297442efc
Bug 1208320 - Produce xpcshell archive without staging test files; r=glandium

https://hg.mozilla.org/integration/mozilla-inbound/rev/5545643ec0424036732c21056b8afce2ac24c1cf
Bug 1208320 - Produce mochitest test archive without staging test files; r=glandium

https://hg.mozilla.org/integration/mozilla-inbound/rev/39e7998e8434f4fb587ddc04eb8a78997732d169
Bug 1208320 - Produce web-platform test archive without staging; r=glandium

https://hg.mozilla.org/integration/mozilla-inbound/rev/ca68c9413276cba0f9c934fde56c9d4c5f7883b5
Bug 1208320 - Produce talos test archive without staging files; r=glandium

https://hg.mozilla.org/integration/mozilla-inbound/rev/6d4afcd9ba64eaa8c2eec65a201957db02690511
Bug 1208320 - Produce common tests archive via Python; r=glandium

https://hg.mozilla.org/integration/mozilla-inbound/rev/8957d96ac8f87af30b329532f13bc35df439d449
Bug 1208320 - Produce cppunittest and reftest packages via Python; r=glandium

https://hg.mozilla.org/integration/mozilla-inbound/rev/a35d20fb25d97ab1594faf27510a1f8f26d0f4ec
Bug 1208320 - Do not stage JIT test files before archiving; r=glandium

https://hg.mozilla.org/integration/mozilla-inbound/rev/6416acd3ff9b35b0960f5c9d91fe8e571f440e4f
Bug 1208320 - Do not stage some reftest support files before archiving; r=glandium

https://hg.mozilla.org/integration/mozilla-inbound/rev/07a13667f8e0c4811170e24e1ee9f8d205ed0a7b
Bug 1208320 - Do not stage reftest test files before archiving; r=glandium

https://hg.mozilla.org/integration/mozilla-inbound/rev/c9611e5f49357042766c092e16cfb59cb2e8fbd2
Bug 1208320 - Do not stage TPS files before archiving; r=glandium

https://hg.mozilla.org/integration/mozilla-inbound/rev/d4d89dc22e33b463cc72ec6e5e5d55673a81de5a
Bug 1208320 - Do not stage JS test modules before archiving; r=glandium

https://hg.mozilla.org/integration/mozilla-inbound/rev/7bf7734c34bf6bccfd0a65db82db71d2d9dc4482
Bug 1208320 - Do not stage mozbase files before archiving; r=glandium

https://hg.mozilla.org/integration/mozilla-inbound/rev/2af206d9404d23cfbf53a97b1cb3094d6b95fbe7
Bug 1208320 - Do not stage some C++ unit test support files before archiving; r=glandium

https://hg.mozilla.org/integration/mozilla-inbound/rev/2d5a9234f7804a5898671dea4cdbfc8c59f2cb2c
Bug 1208320 - Print message when done with archiving; r=glandium

https://hg.mozilla.org/integration/mozilla-inbound/rev/a23a1cbaf062634ef2bf028d00a8c662a7e6593a
Bug 1208320 - Support configuring zlib compression level; r=glandium

https://hg.mozilla.org/integration/mozilla-inbound/rev/4a19c58da5928924ab276092de5e00fabcff28b8
Bug 1208320 - Decrease compression level of test zip archives; r=glandium
Looks like test archive generation on Linux dropped from ~172s to ~82s. And here I thought I didn't see much of a win on Linux from my Try pushes!
See Also: → 1215331
Depends on: 1228674
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.