Closed Bug 1058923 Opened 10 years ago Closed 9 years ago

Bundle mach in tests.zip and allow test targets to be run from there

Categories

(Firefox Build System :: General, defect)

defect
Not set
normal

Tracking

(firefox39 fixed, firefox40 fixed, firefox41 fixed, firefox-esr38 fixed)

RESOLVED FIXED
mozilla41
Tracking Status
firefox39 --- fixed
firefox40 --- fixed
firefox41 --- fixed
firefox-esr38 --- fixed

People

(Reporter: jgriffin, Assigned: ahal)

References

(Blocks 1 open bug)

Details

(Whiteboard: [ateam_harness_work])

Attachments

(1 file, 1 obsolete file)

We want to make it easier for developers to run tests from a tests package (tests.zip).  Currently doing so requires some special knowledge to construct a suitable environment and pass the correct command-line args.

It would be better to make existing mach targets operable from within a tests package.  This will require us to bundle mach and relevant mach targets in the test package; we'll likely need to modify some or all of them to be able to function correctly outside of an objdir.
This is going to be a lot of work, FWIW, but we've talked about something like this since we introduced the concept of test packages.
Blocks: 524130
is there a chance that by adding mach to tests.zip and making it streamlined ends up turning mozharness into something that only downloads binaries and runs a simple command.  This would remove a lot of the setup work that mozharness scripts do.
Yeah, it seems like the end game here is being able to write something like

./mach job-name --buildbot

for any job, and have it do the right thing. Mozharness would then just know how to set up the environment and how to interpret the output; one can imagine needing no out-of-tree harness-specific code at all.

Outside of the mozharness environment the above command would presumably find the packaged files in the objdir (allowing for overrides as extra options) and run the command as if in automation.
Personally I would prefer to move mozharness scripts in-tree and have mach call mozharness (the Talos mach command already does this). Mach targets are hard to maintain (just look at testing/mochitest/mach_commands.py) and are designed to be used by humans, not machines. If we do this, then every time we need a harness to run with a slightly different configuration we'll need to add a new command line option for it. This seems like it will get messy quickly.

Mozharness on the other hand is well suited for running various configurations with.
Just to expand on that last comment. Aki had started working on the ability for mozharness scripts to invoke other mozharness scripts as an action before he left. Our plan was for buildbot/setup/download related steps to continue living in mozharness, and the steps related to running the test harnesses would live in an in-tree mozharness script.

Then both buildbot and mach could call the same in-tree mozharness script (mach_commands.py would almost be empty, just dispatching to another command like it's supposed to).
I think we just need to pick our poison and get on with it. On one hand, developers are already familiar with the mach commands, they're what developers use for running tests locally. Expanding them to cover the "run from packaged tests" case means the cognitive load for developers is smaller. On the other hand, we already have mozharness scripts that know how to run all the tests from the test package, so we could adapt those to run from the objdir as well. Either way we're adding a layer of complexity to one of the scenarios in order to make them both similar.

Running mozharness from inside mach doesn't make a lot of sense to me, since we need mozharness to deal with the setup of downloading the packages etc. The mach commands expect to have everything they need already there.

Presumably for the use case of alternate configurations we'd *want* to add either new mach commands or new options to mach commands to allow developers to replicate the tests that run in production. I agree that the current mach commands have a bunch of hacks, mostly because they were tossed together by a disparate group of people in order to get something working. We could no doubt do better with a little bit of refactoring.
You're absolutely right. And we don't have to use mozharness for it, I mostly wanted to use it because it already exists, and has nice configuration/command line integration. But we could make something that works similarly without depending on all of mozharness.

My main concern about mozharness calling mach is that we still need to distill all those configuration files down into a single command line. It seems silly to do a bunch of work in a mozharness script to do this, and then do a bunch of work in the test harness to translate the command line back into configuration.
Assignee: nobody → dminor
Status: NEW → ASSIGNED
Depends on: 1087567
I'm not actively working on this at the moment, unassigning myself in case someone else wants to pick it up.
Assignee: dminor → nobody
Status: ASSIGNED → NEW
I've recently given a lot of thought to this matter.

I believe the right way of solving this in the long term is by dropping the use of tests.zip altogether.

The way this would work is by doing this:
* make the test slaves checkout gecko/gaia
* run mach the same way developers do _but_ we pass a url to the firefox binaries it needs to download

We either check-in mozharness or we teach mach to check it out (or even use shared check-ins).

This would remove the need to generate tests.zip file and keep storing them since we can generate the tests from the tree.

This would also make everything much closer to what developers run.
+1, that makes a lot of sense to me.

The ability to run tests locally against downloaded builds seems like something we should do anyway. As usual, this would be a massive undertaking :).
> The way this would work is by doing this:
> * make the test slaves checkout gecko/gaia

With the downside of massive mercurial/git update time. Some builds spend more than 10 minutes just for that part. That would make test results take much longer to be available. This would also add a *lot* of load to the mercurial and git servers.

> * run mach the same way developers do _but_ we pass a url to the firefox binaries it needs to download

tests.zip contains more than files that are in the tree. It contains files that are also compiled as part of the build. This will become even more true when gtest will move there.
Even if there wasn't that, running tests locally requires to essentially run a full build, currently. Even if that's improved in the future, there's a lot of file copying involved, and that has a big overhead on windows.
(In reply to Mike Hommey [:glandium] from comment #11)
> With the downside of massive mercurial/git update time. Some builds spend
> more than 10 minutes just for that part. That would make test results take
> much longer to be available. This would also add a *lot* of load to the
> mercurial and git servers.

These clones would have to live outside the working directory that gets clobbered. So tests slaves would either do an incremental pull or an hg update. That doesn't seem like it would be any slower than downloading/extracting tests.zip..

Maybe the extra load on hg.m.o would be a non-starter though.

> tests.zip contains more than files that are in the tree. It contains files
> that are also compiled as part of the build. This will become even more true
> when gtest will move there.

Yes, we'd probably still need something similar to tests.zip, though hopefully renamed and much smaller.

> Even if there wasn't that, running tests locally requires to essentially run
> a full build, currently. Even if that's improved in the future, there's a
> lot of file copying involved, and that has a big overhead on windows.

Also true. I think it's a shame that running tests require a build. I should be able to pip install mochitest and run against my nightly if I wanted to. Making the harnesses less build-centric would probably be a pre-requisite to this.

But yes, this would all be a massive amount of work. We would really need to be sure that it is worth it.
(In reply to Andrew Halberstadt [:ahal] from comment #12)
> (In reply to Mike Hommey [:glandium] from comment #11)
> These clones would have to live outside the working directory that gets
> clobbered. So tests slaves would either do an incremental pull or an hg
> update.

Theoretically, that's what build slaves do too... except when they don't, which happens an awful lot.
Note that we're trying to get rid of the test package in bug 1059943 in favor of uploading all the files individually to S3 and uploading a JSON manifest that points to them. That will certainly make some things easier. There's still sort of a chicken-and-egg problem here where you need to have *something* to download the files to kick the test off.

I know I've been pretty resistant to adding another layer to the stack, but if we made mach invoke the mozharness scripts under the hood I guess we could support this fairly easily. By default if you just did `mach mochitest` or similar it would skip the download/install steps and use the local binaries. If you instead did `mach mochitest --build http://...` it would download those binaries and test files and invoke those.
As gross as having mach call into mozharness sounds, I'm guessing the typical use case for running tests from the test package is investigating test machine only failures, so we should probably do whatever we can to ensure the tests are run in the same fashion as on the test machines.
These days there is little opposition to moving mozharness into tree, at least for test scripts. There are still some bootstrapping issues that need to be figured out, but my hope is that soon (i.e this year) the test harness and the mozharness scripts will be the same thing. Instead of investing effort into having mach download and run mozharness, I'd prefer to invest effort moving mozharness scripts into the tree (and merging them with the harnesses).
FWIW I think this is a reasonable plan i.e. move mozharness into the tree and make it the generic wrapper layer for test jobs in mach. Then make the infrastructure run the mach command rather than the mozharness command directly.
I have a solution to test this on try.
It is a big cheat just to get the piping worked out before making the releng required changes.

* Using pinning to mozharness, we pin a try push to a new repository which only hosts one script (not a mozharness one).
* Inside of that repo we create one symlink per each mozharness script in use to the script I mentioned
* This script knows how to grab the test installer and unzip the mozharness repo which we commit in gecko
* Once we unzip mozharness from the tests.zip we can simply pass the options that our scripts was called with to the mozharness scripts that have been extracted
** These in-tree mozharness scripts will not have to download a tests.zip since it is already in disk
* Everything else should be the same

buildbot -> thinks that is calling mozharness
new repo script -> downloads test.zip, extract mozharness and passes parameters from buildbot
the unzipped mozharness scripts will run almost as normal.
How about using `pip wheel` to bundle .whl files for Python packages inside the test archive? We can include a special script in the archive that knows how to reconstruct a virtualenv from the path to a .zip file (which contains all the packages necessary to run content inside). Self-bootstrapping :)

I think this is similar to what Armen suggested - just using proper Python packaging as opposed to zip files, which will almost certainly be fragile when you are talking about Python packages.
Whiteboard: [ateam_harness_work]
This is going to be tackled by ahal in Q2.
Assignee: nobody → ahalberstadt
Status: NEW → ASSIGNED
It seems like a lot (most?) of the discussion here is out of scope from the title of the bug. My plan for the title as stated is:

1) package mach core and driver into test-stage
2) create a new bootstrap script that is test-stage aware (can add things like modules-dir and cert-path to the context).
3) modify mach driver to detect the new bootstrap script.

By the end of this bug, mach will exist in the tests.zip and it will be possible to add mach_commands.py. In future bugs I'll start adding mach_commands.py scripts that can run the tests.
Component: mach → Build Config
Attached file MozReview Request: bz://1058923/ahal (obsolete) —
/r/8043 - Bug 1058923 - Package mach in tests.zip; create empty bootstrap script for test package

Pull down this commit:

hg pull -r bafdb30bf4b8ba584aad63a90a26086334f6fccf https://reviewboard-hg.mozilla.org/gecko/
The above change has the bonus side effect of making the Gecko mach driver compatible with the B2G mach bootstrap, which goes 90% of the way towards fixing bug 920181.
Comment on attachment 8600403 [details]
MozReview Request: bz://1058923/ahal

/r/8043 - Bug 1058923 - Package mach in tests.zip; create empty bootstrap script for test package

Pull down this commit:

hg pull -r bafdb30bf4b8ba584aad63a90a26086334f6fccf https://reviewboard-hg.mozilla.org/gecko/
Attachment #8600403 - Flags: review?(gps)
Comment on attachment 8600403 [details]
MozReview Request: bz://1058923/ahal

https://reviewboard.mozilla.org/r/8041/#review7169

Ship It!
Attachment #8600403 - Flags: review?(gps) → review+
Attachment #8600403 - Flags: review+
Comment on attachment 8600403 [details]
MozReview Request: bz://1058923/ahal

/r/8043 - Bug 1058923 - Package mach in tests.zip; create bootstrap script for test package, r=gps

Pull down this commit:

hg pull -r 4bbaa0c188da1d04192aeea71b4e5dc0b6602dca https://reviewboard-hg.mozilla.org/gecko/
Comment on attachment 8600403 [details]
MozReview Request: bz://1058923/ahal

I'm not entirely sure how, but it looks like the reference to the b2g bootstrap script in the mach driver was causing that bustage, so I removed it:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=957723f77ae9

Carrying r+ forward.
Attachment #8600403 - Flags: review+
Somehow this broke valgrind builds too. Weren't any sheriffs around so backed myself out:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5a72e3728487
The valgrind bustage happened because I removed the 'build' directory from sys.path here:
https://dxr.mozilla.org/mozilla-central/source/mach#25

Adding it back in mach_bootstrap.py's SEARCH_PATHS worked for me locally. New patch + try run coming shortly.
Comment on attachment 8600403 [details]
MozReview Request: bz://1058923/ahal

/r/8043 - Bug 1058923 - Package mach in tests.zip; create bootstrap script for test package, r=gps

Pull down this commit:

hg pull -r 9f988171182a0abd54724e0cc2906fe678da318f https://reviewboard-hg.mozilla.org/gecko/
Attachment #8600403 - Flags: review+
Comment on attachment 8600403 [details]
MozReview Request: bz://1058923/ahal

Valgrind fixed by adding 'build' to SEARCH_PATHS in the mach bootstrap:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=5bdca885cd19

Carry r+ forward.
Attachment #8600403 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/fec6ff864e3c
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
Attachment #8600403 - Attachment is obsolete: true
Attachment #8618283 - Flags: review+
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.