Closed
Bug 1359965
Opened 8 years ago
Closed 8 years ago
Produce .tar.gz test archive for WPT to work around 2**16 file limit
Categories
(Firefox Build System :: General, defect)
Firefox Build System
General
Tracking
(firefox55 fixed)
RESOLVED
FIXED
mozilla55
Tracking | Status | |
---|---|---|
firefox55 | --- | fixed |
People
(Reporter: jgraham, Assigned: gps)
References
(Blocks 1 open bug)
Details
Attachments
(2 files)
After addign css tests and associated metadata to web-platform-tests I have a build error in package-tests:
0:25.96 Traceback (most recent call last):
0:25.96 File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
0:25.96 "__main__", fname, loader, pkg_name)
0:25.96 File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
0:25.96 exec code in run_globals
0:25.96 File "/home/james/develop/gecko/python/mozbuild/mozbuild/action/test_archive.py", line 642, in <module>
0:25.96 sys.exit(main(sys.argv[1:]))
0:25.96 File "/home/james/develop/gecko/python/mozbuild/mozbuild/action/test_archive.py", line 632, in main
0:25.96 file_count += 1
0:25.96 File "/home/james/develop/gecko/python/mozbuild/mozpack/mozjar.py", line 492, in __exit__
0:25.96 self.finish()
0:25.96 File "/home/james/develop/gecko/python/mozbuild/mozpack/mozjar.py", line 570, in finish
0:25.96 self._data.write(end.serialize())
0:25.96 File "/home/james/develop/gecko/python/mozbuild/mozpack/mozjar.py", line 146, in serialize
0:25.96 serialized += struct.pack('<' + format, value)
0:25.96 struct.error: ushort format requires 0 <= number <= USHRT_MAX
0:26.29 /home/james/develop/gecko/testing/testsuite-targets.mk:161: recipe for target 'package-tests-web-platform' failed
I think this is a Zip32 limit of 2**16-1 files in an archive. Note that mochtitest-plain currently has 53775 files, which is 82% of the limit, so it seems likely to be affected in the near future.
mozjar ought to support Zip64.
Assignee | ||
Comment 1•8 years ago
|
||
The archives produced by mozjar need to be decoded by libjar in Gecko. So before we change this, we should verify libjar can decode zip64 archives. Alternatively (and this might be a good idea to reduce the size of omni.ja), we could have mozjar "upgrade" to zip64 when N>=2^16 entries are present. I think nearly all consumers of mozjar populate archives by first obtaining a list of entries. So this could be a run-time decision.
FWIW, Python's built-in zip implementation doesn't support zip64. mozjar's code either relies heavily on the Python implementation or it borrowed heavily from its code. So there may be a bit of work here to support zip64.
Assignee | ||
Comment 2•8 years ago
|
||
The other question I want to ask is: why are we putting this many files in a single archive? Compressing zip files is inefficient because each file is compressed in its own compression context. Contrast with tar files, where you put the raw content in a single container and compress that.
IMO we should be using tar + compression (preferably zstandard) everywhere in automation. An exception to this are where we absolutely need zip as an interchange format. Another exception is when we only extract a subset of files. But in the case where we extract a subset of files, I'd rather produce a standalone archive that can be extracted wholesale than a zip file because shipping around megabytes of unused content in zip files is just slow. Redundantly compressing files into multiple archives with zstandard will likely offset producer-side performance drawbacks from the redundant compression. And storage is cheap.
Given the question of "y u no tar," do we still need this?
Comment 3•8 years ago
|
||
We really should just fix bug 1286900 instead. Putting that many files into a zip so we can upload and then unzipping them is just silly.
Reporter | ||
Comment 4•8 years ago
|
||
The Python stdlib does support zip64 I think [1]
I agree that putting this many files in a zip is silly :) But I'm not clear how much work is involved in changing away from zip for packaging tests; bug 1286900 ran into a roadblock and I don't know if there's any specific dependencies on zip other than mozharness which I guess is not too hard to update.
[1] https://docs.python.org/2/library/zipfile.html
Reporter | ||
Comment 5•8 years ago
|
||
So in https://treeherder.mozilla.org/#/jobs?repo=try&revision=f4f9732692123863cd4bc10dbb46ff077e5211b6 I just replaced mozjar with zipfile for packaging tests. Anything else looked pretty daunting. So far nothing terrible seems to have happened (perf. looked similar, archive sizes looked similar).
Comment 6•8 years ago
|
||
For reference, the zip64 format is described here:
https://www.loc.gov/preservation/digital/formats/digformatspecs/APPNOTE%2820120901%29_Version_6.3.3.txt
The short version is that there needs to be two zip64-specific end-of-central-directory related sections, and zip64 extended info headers added to each file entries that require one.
I don't expect this to be a lot of work for minimal support, but I'm not entirely convinced we should do it either. Greg, please assign this bug to me if you feel we should do it.
Flags: needinfo?(gps)
Assignee | ||
Comment 7•8 years ago
|
||
Like Ted, I feel we should do bug 1286900. However, I/O on Windows testers is still horrible and I'm afraid this slows down tests too much.
I'm going to spend ~30 minutes trying to make tar.gz test archives work. The code in mozharness should automagically handle different archive extensions. So as long as things aren't hard coding references to .zip, it should "just work." I'll take the conservative approach and only convert the web platform archive by default. If it doesn't pan out, I'm fine with jgraham's patch to use zipfile directly. Although we may want to conditionally do that if there are >2^16 files (because I like mozjar's determinism).
Flags: needinfo?(gps)
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment 11•8 years ago
|
||
mozreview-review |
Comment on attachment 8865704 [details]
Bug 1359965 - Support mozpack.file.BaseFile in create_tar_from_files;
https://reviewboard.mozilla.org/r/137332/#review140840
::: python/mozbuild/mozpack/archive.py:38
(Diff revision 1)
> + ti.mode = f.mode or (stat.S_IRUSR | stat.S_IWUSR |
> + stat.S_IRGRP | stat.S_IROTH)
stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IROTH
needs a lot of mental effort to figure out what it means, while 0644 or 0o644 is the same thing and plain obvious to most people.
::: python/mozbuild/mozpack/archive.py:65
(Diff revision 1)
>
> # Set mtime to a constant value.
> ti.mtime = DEFAULT_MTIME
>
> - with open(fs_path, 'rb') as fh:
> + if isinstance(f, BaseFile):
> + data = f.read()
f.open() should always return a file handle that tf.addfile should be able to take.
::: python/mozbuild/mozpack/files.py:500
(Diff revision 1)
> + def read(self):
> + return self.content
This effectively fixes bug 1170329, but you actually don't need it here.
Attachment #8865704 -
Flags: review?(mh+mozilla)
Comment 12•8 years ago
|
||
mozreview-review-reply |
Comment on attachment 8865704 [details]
Bug 1359965 - Support mozpack.file.BaseFile in create_tar_from_files;
https://reviewboard.mozilla.org/r/137332/#review140840
> This effectively fixes bug 1170329, but you actually don't need it here.
Err no, it doesn't, it's GeneratedFile, not BaseFile...
Comment 13•8 years ago
|
||
mozreview-review |
Comment on attachment 8865705 [details]
Bug 1359965 - Support and generate tar.gz WPT archive;
https://reviewboard.mozilla.org/r/137334/#review140852
Attachment #8865705 -
Flags: review?(mh+mozilla) → review+
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment 16•8 years ago
|
||
mozreview-review |
Comment on attachment 8865704 [details]
Bug 1359965 - Support mozpack.file.BaseFile in create_tar_from_files;
https://reviewboard.mozilla.org/r/137332/#review143952
::: python/mozbuild/mozpack/archive.py:64
(Diff revision 2)
>
> # Set mtime to a constant value.
> ti.mtime = DEFAULT_MTIME
>
> - with open(fs_path, 'rb') as fh:
> + if isinstance(f, BaseFile):
> + ti.size = f.read()
f.read() is not going to return a size. It's a bummer that TarInfo.addfile wants a size in self.size :-/
Can we add a size() method to BaseFile instead of reading the thing? File could use stat(), GeneratedFile could use len(self.content)
Attachment #8865704 -
Flags: review?(mh+mozilla)
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment 19•8 years ago
|
||
mozreview-review |
Comment on attachment 8865704 [details]
Bug 1359965 - Support mozpack.file.BaseFile in create_tar_from_files;
https://reviewboard.mozilla.org/r/137332/#review145298
Attachment #8865704 -
Flags: review?(mh+mozilla) → review+
Assignee | ||
Comment 20•8 years ago
|
||
More accurate summary.
Assignee: nobody → gps
Status: NEW → ASSIGNED
Summary: mozjar only supports 2**16 - 1 files in a zip → Produce .tar.gz test archive for WPT to work around 2**16 file limit
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment 23•8 years ago
|
||
Pushed by gszorc@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ae8bce278626
Support mozpack.file.BaseFile in create_tar_from_files; r=glandium
https://hg.mozilla.org/integration/autoland/rev/a0e257e346cc
Support and generate tar.gz WPT archive; r=glandium
Comment 24•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/ae8bce278626
https://hg.mozilla.org/mozilla-central/rev/a0e257e346cc
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
status-firefox55:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Backed out from m-c in https://hg.mozilla.org/mozilla-central/rev/1ce55499e8193d53aeff91546b16e0c580ebd6f1 at callek's request for likely breaking tc nightlies.
Status: RESOLVED → REOPENED
Flags: needinfo?(gps)
Resolution: FIXED → ---
Target Milestone: mozilla55 → ---
Comment 26•8 years ago
|
||
From IRC:
3:45 PM <Callek_cloud9> KWierso: gps: ping -- so I think we need to backout https://bugzilla.mozilla.org/show_bug.cgi?id=1359965
3:45 PM <firebot> Bug 1359965 — FIXED, gps@mozilla.com — Produce .tar.gz test archive for WPT to work around 2**16 file limit
3:46 PM <Callek_cloud9> https://hg.mozilla.org/mozilla-central/rev/a0e257e346cc#l2.2 --> https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/transforms/beetmover.py#18
3:46 PM <Callek_cloud9> so two issues, one is the beetmover script code won't support the new artifact, second is that its likely not `.tar.zip`
Assignee | ||
Comment 27•8 years ago
|
||
Looks like it was a trivial typo of "tar.zip" instead of "tar.gz". I'll reland.
Flags: needinfo?(gps)
Comment 28•8 years ago
|
||
Pushed by gszorc@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a872303fd084
Support mozpack.file.BaseFile in create_tar_from_files; r=glandium
https://hg.mozilla.org/integration/autoland/rev/d10f5ccd882b
Support and generate tar.gz WPT archive; r=glandium
Assignee | ||
Comment 29•8 years ago
|
||
I missed the bit about having to update beetmover script code before this lands. I just submitted https://github.com/mozilla-releng/beetmoverscript/pull/63. If that gets deployed before we trigger nightlies with the reland, we should be OK. If not, we'll get the same failure.
I should have read comment #26 in more detail before relanding. In my defense, I wasn't expecting there to be an out-of-tree reference to this filename because most work in the past few years has been about consolidating references in tree so these types of failures don't occur :/
I won't be offended if I'm backed out because beetmover can't be deployed in time.
Comment 30•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/a872303fd084
https://hg.mozilla.org/mozilla-central/rev/d10f5ccd882b
Status: REOPENED → RESOLVED
Closed: 8 years ago → 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Comment 31•8 years ago
|
||
https://hg.mozilla.org/projects/cedar/rev/ae8bce278626bc84914063f93292ac5e825eec36
Bug 1359965 - Support mozpack.file.BaseFile in create_tar_from_files; r=glandium
https://hg.mozilla.org/projects/cedar/rev/a0e257e346ccf3c1db332ec5903241f4eeb9a7ee
Bug 1359965 - Support and generate tar.gz WPT archive; r=glandium
https://hg.mozilla.org/projects/cedar/rev/a872303fd08497bbde0e3b4cea09a88a4182810e
Bug 1359965 - Support mozpack.file.BaseFile in create_tar_from_files; r=glandium
https://hg.mozilla.org/projects/cedar/rev/d10f5ccd882b965fcad39914f7c3c930d1301a41
Bug 1359965 - Support and generate tar.gz WPT archive; r=glandium
Comment 32•8 years ago
|
||
Backed out from central again -- https://hg.mozilla.org/mozilla-central/rev/ab0dcc11686cfaa56aab1c7b879fecbe8c1f0d0c
https://hg.mozilla.org/mozilla-central/rev/23a341e9b53d04f80ea6a66ced2d72cdc17afffb
Flags: needinfo?(gps)
Updated•8 years ago
|
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 33•8 years ago
|
||
Comment 34•8 years ago
|
||
Copy-pasting comment from PR, for verbosity and FYI:
But please do keep in mind that changing here is only half-way to solving the problem. You'd also need to change the artifact in the beetmover in-tree task, specifically, all occurrences of target.web-platform.tests in https://hg.mozilla.org/mozilla-central/file/tip/taskcluster/taskgraph/transforms/beetmover.py and land that to inbound/central/beta.
Until we have those patches landed, I can't land this as it will break the current existing nightly beetmover jobs.
Assignee | ||
Comment 35•8 years ago
|
||
RyanVM: are you the right person to ask for help about coordinating an uplift? A try push with backported patches is at https://treeherder.mozilla.org/#/jobs?repo=try&revision=a47888d0b6b206bf98e1a66b0ebfb724d79e5c5b.
There should be no impact to a Firefox distribution: either automation is happy or it isn't. Biggest risk is we screw up beetmover on beta and jeopardize a beta build.
Flags: needinfo?(gps) → needinfo?(ryanvm)
Comment 36•8 years ago
|
||
Sure, I can uplift it later today with some other approvals if you get the beetmover stuff taken care of.
Flags: needinfo?(ryanvm)
Whiteboard: [checkin-needed-beta]
Assignee | ||
Comment 37•8 years ago
|
||
AFAICT the original patches already modified beetmover in tree. We just need to coordinate the deployment of upstream beetmover with the patches in this bug.
Comment 38•8 years ago
|
||
If we want to modify anything related to the artifacts beetmover is supposed to upload, path is the following:
0. land in-tree + uplift all build changes that are changing the format of the web-platform.tests from `zip` to `tar.gz`
1. change, land + uplift beetmover artifacts that are being downloaded. That is changing all occurrrences from this[1] file of `target.web-platform.tests.zip` to `target.web-platform.tests.tar.gz`
2. land https://github.com/mozilla-releng/beetmoverscript/pull/63/ and roll-out a new beetmoverscript package to the beetmoverworkers
I know it's not ideal that we have to change in two different places for (1) and (2). We've gotten bug 1331141 to single-source that. I will take care of (2) once (1) is landed + uplifted. I can help with patches for (1) if you want me to.
[1]: https://hg.mozilla.org/mozilla-central/file/tip/taskcluster/taskgraph/transforms/beetmover.py
Comment 39•8 years ago
|
||
(In reply to Mihai Tabara [:mtabara]⌚️GMT from comment #38)
> If we want to modify anything related to the artifacts beetmover is supposed
> to upload, path is the following:
>
> 0. land in-tree + uplift all build changes that are changing the format of
> the web-platform.tests from `zip` to `tar.gz`
> 1. change, land + uplift beetmover artifacts that are being downloaded. That
> is changing all occurrrences from this[1] file of
> `target.web-platform.tests.zip` to `target.web-platform.tests.tar.gz`
> 2. land https://github.com/mozilla-releng/beetmoverscript/pull/63/ and
> roll-out a new beetmoverscript package to the beetmoverworkers
>
> I know it's not ideal that we have to change in two different places for (1)
> and (2). We've gotten bug 1331141 to single-source that. I will take care of
> (2) once (1) is landed + uplifted. I can help with patches for (1) if you
> want me to.
>
> [1]:
> https://hg.mozilla.org/mozilla-central/file/tip/taskcluster/taskgraph/
> transforms/beetmover.py
Ok, heres my comprehension on things, and is why I'm also still confused, mihai can you confirm/deny this more optimal workflow:
0. land to beetmover + deploy the *addition* of the new format of an artifact that should get uploaded. (similar to the PR 63, but without removing the old).
1. Land to in-tree code to change the filename and specify the new filename in taskcluster.
2. Once patch from 1 is on all trees where we expect to beetmove that prior artifact name (e.g. central and project branches where that format is to be used on Nightlies, and is in place for beta/release/esr stuff where we also beetmove) we can delete the old version of the filename from beetmover on github, and deploy.
The reason I come to these steps as a plan is that it's my understanding that if your step 0 and 1 happens then we have the issue that beetmover would reject those filenames, and thus nightlies would fail.
if we go with my listed plan, then we have success with beetmover (and much less of a chicken and egg) since it won't care if a file is missing from its template transforms.
Flags: needinfo?(mtabara)
Comment 40•8 years ago
|
||
We're going to hold off on landing this on Beta today to avoid adding risk to the 54b13 gtb.
Whiteboard: [checkin-needed-beta]
Comment 41•8 years ago
|
||
Bug 1359965 - Update web platform test archive name
https://github.com/mozilla-releng/beetmoverscript/commit/5e651f54c99f827ee16cb12912c9e999b21bcc56
Bug 1359965 - bump beetmoverscript version in beetmoverworkers. r=versionbump
https://hg.mozilla.org/build/puppet/rev/793eed2dc811
Comment 42•8 years ago
|
||
Pushed by gszorc@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0a94b9164a64
Support mozpack.file.BaseFile in create_tar_from_files; r=glandium
https://hg.mozilla.org/integration/autoland/rev/11ba6213be5a
Support and generate tar.gz WPT archive; r=glandium
Comment 43•8 years ago
|
||
(In reply to Justin Wood (:Callek) from comment #39)
> (In reply to Mihai Tabara [:mtabara]⌚️GMT from comment #38)
> > If we want to modify anything related to the artifacts beetmover is supposed
> > to upload, path is the following:
> >
> > 0. land in-tree + uplift all build changes that are changing the format of
> > the web-platform.tests from `zip` to `tar.gz`
> > 1. change, land + uplift beetmover artifacts that are being downloaded. That
> > is changing all occurrrences from this[1] file of
> > `target.web-platform.tests.zip` to `target.web-platform.tests.tar.gz`
> > 2. land https://github.com/mozilla-releng/beetmoverscript/pull/63/ and
> > roll-out a new beetmoverscript package to the beetmoverworkers
> >
> > I know it's not ideal that we have to change in two different places for (1)
> > and (2). We've gotten bug 1331141 to single-source that. I will take care of
> > (2) once (1) is landed + uplifted. I can help with patches for (1) if you
> > want me to.
> >
> > [1]:
> > https://hg.mozilla.org/mozilla-central/file/tip/taskcluster/taskgraph/
> > transforms/beetmover.py
>
> Ok, heres my comprehension on things, and is why I'm also still confused,
> mihai can you confirm/deny this more optimal workflow:
>
> 0. land to beetmover + deploy the *addition* of the new format of an
> artifact that should get uploaded. (similar to the PR 63, but without
> removing the old).
> 1. Land to in-tree code to change the filename and specify the new filename
> in taskcluster.
> 2. Once patch from 1 is on all trees where we expect to beetmove that prior
> artifact name (e.g. central and project branches where that format is to be
> used on Nightlies, and is in place for beta/release/esr stuff where we also
> beetmove) we can delete the old version of the filename from beetmover on
> github, and deploy.
>
> The reason I come to these steps as a plan is that it's my understanding
> that if your step 0 and 1 happens then we have the issue that beetmover
> would reject those filenames, and thus nightlies would fail.
>
> if we go with my listed plan, then we have success with beetmover (and much
> less of a chicken and egg) since it won't care if a file is missing from its
> template transforms.
You're absolutely right. This is the better aproach by far and we're currently following that.
Flags: needinfo?(mtabara)
Comment 44•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/0a94b9164a64
https://hg.mozilla.org/mozilla-central/rev/11ba6213be5a
Status: REOPENED → RESOLVED
Closed: 8 years ago → 8 years ago
Resolution: --- → FIXED
Updated•7 years ago
|
Product: Core → Firefox Build System
You need to log in
before you can comment on or make changes to this bug.
Description
•