Closed Bug 1198341 Opened 9 years ago Closed 7 years ago

Convert |mach try| to use mozci as the backend.

Categories

(Testing :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: jgraham, Unassigned)

So one of the goals for |mach try| is to allow developers to run tests under certain directories that are believed to be relevant to their patch. For example one might write

mach try -b do -p windows,linux layout

and run all tests under layout on windows and linux. At present this works by trying to run all the tests in that directory as a single chunk, usually the first chunk in the set. For something like mochitests this works OK because there is always a mochitest-1 chunk, so we can send try -u mochitest-1. For reftests, however, the situation is different: reftests are chunked on linux but not on windows, so the same try syntax will run different things on the two platforms.

To solve this |mach try| should be based on mozci so we can select the exact jobs required.
Blocks: 1198342
Depends on: 1198360
bug 1194264 will add the ability to submit something like this:
{
  "Linux build": ["Ubuntu 64 reftest-1"],
  "Windows build": ["Windows 8 64 reftest"]
}

What would mach try need to determine the right builders to trigger?

We also can generate data like this (in case mach try would need to filter through):
http://people.mozilla.org/~armenzg/permanent/graph.json

Bug 1198360 will also be a blocker, since the ability gained through TaskCluster leads us to a different security model.
Could you please give an example of each different flow for |mach try|?
With UI and job selection, without UI, with tags...
IIUC, instead of using mozci to schedule jobs for you, you're interested in being able to pick up the correct, up-to-date try syntax, which will schedule the jobs for you; is this right?
Try syntax isn't flexible enough for us -- I can't think of a way to say "run reftest-1 where it exists, or just reftest if it doesn't" to get one chunk on all platforms.
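
To make the "reftest-1 where it exists, or just reftest if it doesn't" rule concrete, here is a minimal sketch of that selection done outside of try syntax. The platform names, the job lists, and the helper names are all made up for illustration; in practice the available jobs would have to come from mozci or from the task graph.

```python
# Hypothetical per-platform job lists; these names are assumptions for the example.
AVAILABLE_JOBS = {
    "linux": ["reftest-1", "reftest-2", "mochitest-1"],
    "windows": ["reftest", "mochitest-1"],
}


def pick_single_chunk(suite, jobs):
    """Return '<suite>-1' if the suite is chunked, the bare suite name if it
    is not chunked, and None if the suite does not run on this platform."""
    first_chunk = f"{suite}-1"
    if first_chunk in jobs:
        return first_chunk
    if suite in jobs:
        return suite
    return None


def select_jobs(suite, available):
    """Pick exactly one job for the suite on every platform."""
    return {
        platform: pick_single_chunk(suite, jobs)
        for platform, jobs in available.items()
    }


selection = select_jobs("reftest", AVAILABLE_JOBS)
print(selection)
```

The point is only that the fallback is trivial once the tool knows which jobs exist per platform, which is exactly the information try syntax does not carry.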

Not being up to date is just one limitation of try syntax; I think we want an alternative way to schedule the jobs.
If we do bug 1198430, we can improve the syntax.

If we do this bug without increasing the flexibility of the try syntax, we will need to:
1) let the user choose the jobs they want through |mach try|
2) create a try commit message which does not schedule any buildbot jobs, and push
3) submit a task graph through mozci

For #3, we will have to create a web API (somewhere) to submit a task graph (bug 1198360) or include the complete graph inside of the commit message.
So I think based on our earlier conversation, what I would like is for try syntax to disappear as a form of data transport. Instead we would need the following things:

1) A way to build and submit a graph of test jobs to run
2) A way to specify, for each test job, extra arguments to pass into the harness (this would be needed to control things like tags and test paths to run)
3) A way to provide extra global arguments for the run (like --no-retry)

Then "try syntax" would only exist as command line arguments to |mach try|, which would be internally converted into a TC graph to submit.
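
As a rough sketch of what "try syntax only as command line arguments" could look like, the snippet below parses a few try-style flags and turns them into a plain data structure with per-job harness arguments and global arguments (points 2 and 3 above). The flag handling is modelled on the example earlier in this bug and is an assumption, as are the function names and the output layout; none of this is the real |mach try| implementation or a TaskCluster graph format.

```python
import argparse

# Assumed meaning of the -b letters, following existing try syntax usage.
BUILD_TYPES = {"d": "debug", "o": "opt"}


def parse_try_args(argv):
    """Parse a small, illustrative subset of try-style arguments."""
    parser = argparse.ArgumentParser(prog="mach try")
    parser.add_argument("-b", dest="builds", default="do")
    parser.add_argument("-p", dest="platforms", default="")
    parser.add_argument("-u", dest="suites", default="")
    parser.add_argument("--no-retry", action="store_true")
    parser.add_argument("paths", nargs="*")
    return parser.parse_args(argv)


def build_request(args):
    """Expand the parsed arguments into one entry per job to run."""
    jobs = []
    for platform in filter(None, args.platforms.split(",")):
        for flag in args.builds:
            for suite in filter(None, args.suites.split(",")):
                jobs.append({
                    "platform": platform,
                    "build": BUILD_TYPES[flag],
                    "suite": suite,
                    # Extra arguments for the harness, here just the test paths.
                    "harness_args": list(args.paths),
                })
    return {"jobs": jobs, "global_args": {"no_retry": args.no_retry}}


request = build_request(parse_try_args(
    ["-b", "do", "-p", "windows,linux", "-u", "reftest", "--no-retry", "layout"]
))
print(len(request["jobs"]), request["global_args"])
```

A real frontend would then hand this structure to whatever builds and submits the graph, so the command-line form never has to leave the developer's machine.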
#1 is the most work if we decide to not pass the scheduling information with the push.
We would need to build a system to submit the graphs (which would have the privileges to submit to TC).

For #2, are we passing extra args through the commit message, or can we pass them as properties to the jobs?

For #3, we can set retries to 0 for tasks (the value is the maximum number of retries allowed)

We could totally put the graph inside of the commit message :P
No longer depends on: 1194264
I just had a chat with Armen about this and he clarified a lot. The big takeaway is that scheduling taskcluster jobs locally on a pre-existing push is likely a non-starter. This will be very hard to implement properly without totally breaking taskcluster's security/permissions model. This means that using mozci to schedule jobs via |mach try| locally won't work (or at least is more effort than it will be worth). The scheduling information must somehow be attached to the push.

But all is not lost! We can still leverage taskcluster's decision task and get rid of try syntax at the same time. Here is the workflow I propose:

1. |mach try| generates a list of paths to a task.yaml, as well as any related configuration for that task. E.g:

[
  ('path/to/task1.yaml', {'foo': 'bar'}),
  ('path/to/task2.yaml', {'A': 'B'})
]

Note the data structure used isn't really important; we could make this mimic the existing dependency graph format if we wanted to. How this list gets generated is also irrelevant; there could be multiple frontends (e.g. one for classic try syntax, one for a curses ui, one that uses mozbuild.testing.TestResolver, etc).

2. The data structure gets stored in the commit metadata using the namespace api. I guess there's nothing stopping us from keeping it in the commit message like before, but I think it would be better to keep it hidden. Though we may need to keep using the commit message in order to support git, I'm not sure.

3. A new module under testing/taskcluster/taskcluster_graph reads the data structure and formats it into a full taskcluster graph. This module is the equivalent of the existing try_test_parser.py. It will be invoked in a similar way (as part of the decision task, only on try) and will produce a similar output.

4. Buildbot jobs will be a special case. But we could invent a special syntax to denote a buildbot job. The module from 3) could then use mozci to schedule these like normal and omit them from the task graph. I guess the |mach try| command could also just do it locally, but I'd prefer if there was a single point of entry for scheduling everything. It also avoids a local dependency on mozci and pushes more work to AWS rather than wasting developer cycles.
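
Steps 1 to 3 of the workflow above amount to a round trip: the frontend serialises its list, the push carries it, and the decision task reads it back. A minimal sketch of that round trip follows, using JSON as the wire format. The function names and the JSON layout are assumptions for illustration; how the string is actually attached to the push (commit metadata or commit message) is left out.

```python
import json


def encode_selection(tasks):
    """Serialise (task_path, config) pairs into a JSON string that could be
    attached to a push."""
    return json.dumps(
        [{"task": path, "config": config} for path, config in tasks],
        sort_keys=True,
    )


def decode_selection(payload):
    """Inverse of encode_selection, as the decision task side would run it."""
    return [(entry["task"], entry["config"]) for entry in json.loads(payload)]


tasks = [
    ("path/to/task1.yaml", {"foo": "bar"}),
    ("path/to/task2.yaml", {"A": "B"}),
]
payload = encode_selection(tasks)
print(payload)
print(decode_selection(payload) == tasks)
```

Keeping the payload as plain JSON also makes it easy to inspect after the push when debugging.
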
Adding dustin; tl;dr this year we're aiming to make scheduling on try smarter and easier; getting rid of the try syntax will be a side effect (we might support it for backward compatibility).

> 2. The data structure gets stored in the commit metadata using the namespace
> api. I guess there's nothing stopping us from keeping it in the commit
> message like before, but I think it would be better to keep it hidden.
> Though we may need to keep using the commit message in order to support git,
> I'm not sure.
> 
Post push, can we inspect what was added to the namespace so that we can debug any issues?

> 3. A new module under testing/taskcluster/taskcluster_graph reads the data
> structure and formats it into a full taskcluster graph. This module is the
> equivalent of the existing try_test_parser.py. It will be invoked in a
> similar way (as part of the decision task, only on try) and will produce a
> similar output.
> 
As I mentioned in the meeting, I would like dustin to give us the stamp of approval. Having a prototype would give us something explicit to look at and help us understand each other.

Note that a new implementation of task scheduling is going to be added in Q2:
https://github.com/djmitche/taskcluster-in-tree-taskgraph
I can see a few interesting points to plug in here.

First, in the new implementation, the decision task will be calling the queue and scheduler APIs directly.  It's possible that `mach try` could just do the same, based on the same taskgraph support code.  In other words, a try push wouldn't require anything unique fed to hg -- just push your commits there, then create a task graph using API calls from your desktop, with your taskcluster clientId, to do the tasks you want for it.  No need for a decision task.  And anyone who can push to try already has the scopes required to do this (and we could configure the API calls such that any *additional* scopes they possess wouldn't be used, avoiding the chance of, e.g., accidentally pulling production secrets onto a try builder).

Second, task graphs are generated by creating a full graph of all possible tasks, then filtering that using a query of some sort.  The full graph is recorded as an artifact.  This has the nice advantage that try-extender can just read the full graph and call createTask with any additional tasks necessary, all pre-packaged for it.  Right now, it looks like that query will be a Python expression, but that's not set in stone yet.  My plan was to write a parser that will convert the existing command-line-switch try format into a query expression.

So including a list of YAML files won't really work, but including an expression in the query language would.
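
To illustrate the "full graph filtered by a query" idea, here is a small sketch in which the query is a Python expression evaluated against each task's attributes. The task attributes, the labels, and the filter_graph helper are invented for the example, and the real query language was explicitly not settled at this point. Evaluating expressions this way is also not a safe sandbox, so treat it purely as an illustration.

```python
# Hypothetical "full graph": every possible task, with made-up attributes.
FULL_GRAPH = [
    {"label": "linux64-reftest-1", "platform": "linux64", "suite": "reftest"},
    {"label": "win8-reftest", "platform": "win8", "suite": "reftest"},
    {"label": "linux64-mochitest-1", "platform": "linux64", "suite": "mochitest"},
]


def filter_graph(full_graph, query):
    """Keep the tasks for which the query expression is true.

    Each task's attributes are exposed to the expression as local names.
    Builtins are removed, but this is NOT a security boundary.
    """
    code = compile(query, "<try-query>", "eval")
    return [
        task for task in full_graph
        if eval(code, {"__builtins__": {}}, dict(task))
    ]


selected = filter_graph(
    FULL_GRAPH, "suite == 'reftest' and platform.startswith('linux')"
)
print([task["label"] for task in selected])
```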

Also, every task has a label, so it would be possible to just include a list of labels that should be built.  So basically `mach try` would do the whole taskgraph generation process, then include [t.label for t in taskgraph] in the commit metadata.  The decision task would repeat the taskgraph generation process and then use that list as a filter.
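
The label-list variant can be sketched as two halves that share the same graph generation: one run locally by `mach try`, one run by the decision task. Everything here (the generate_full_graph stand-in, the labels, the helper names) is hypothetical; the only assumption carried over from the description is that both sides generate the same full graph for a given revision.

```python
def generate_full_graph():
    """Stand-in for in-tree taskgraph generation. It has to be deterministic
    for a given revision so that both sides see the same labels."""
    return [
        {"label": "linux64-reftest-1", "suite": "reftest"},
        {"label": "win8-reftest", "suite": "reftest"},
        {"label": "linux64-mochitest-1", "suite": "mochitest"},
    ]


def labels_for_push(wanted):
    """Local side: pick tasks and return only their labels, which is what
    would be stored with the push."""
    return sorted(t["label"] for t in generate_full_graph() if wanted(t))


def decision_task(labels):
    """Decision-task side: regenerate the graph and use the labels as a filter."""
    wanted = set(labels)
    full_graph = generate_full_graph()
    unknown = wanted - {t["label"] for t in full_graph}
    if unknown:
        raise ValueError(f"unknown task labels: {sorted(unknown)}")
    return [t for t in full_graph if t["label"] in wanted]


pushed_labels = labels_for_push(lambda t: t["suite"] == "reftest")
scheduled = decision_task(pushed_labels)
print(pushed_labels)
```

Rejecting unknown labels in the decision task is a design choice in this sketch: it surfaces the case where the local checkout and the pushed revision disagree about which tasks exist.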

Finally, we want to get to the point of scheduling all jobs -- buildbot included -- in-tree.  So I don't think you need to consider buildbot jobs as a special case.
(In reply to Dustin J. Mitchell [:dustin] from comment #10)
> First, in the new implementation, the decision task will be calling the
> queue and scheduler APIs directly.  It's possible that `mach try` could just
> do the same, based on the same taskgraph support code.  In other words, a
> try push wouldn't require anything unique fed to hg -- just push your
> commits there, then create a task graph using API calls from your desktop,
> with your taskcluster clientId, to do the tasks you want for it.  No need
> for a decision task.  And anyone who can push to try already has the scopes
> required to do this (and we could configure the API calls such that any
> *additional* scopes they possess wouldn't be used, avoiding the chance of
> e.g., accidentally pulling production secrets onto a try builder)

Awesome, we originally wanted to schedule jobs directly from the mach command, but apparently it was hard to get the scopes right. Are you saying that the new system for scheduling jobs will make this easier? What is a taskcluster clientId and how does Joe Schmo developer get one?


> Second, task graphs are generated by creating a full graph of all possible
> tasks, then filtering that using a query of some sort.  The full graph is
> recorded as an artifact.  This has the nice advantage that try-extender can
> just read the full graph and call createTask with any additional tasks
> necessary, all pre-packaged for it.  Right now, it looks like that query
> will be a Python expression, but that's not set in stone yet.  My plan was
> to write a parser that will convert the existing command-line-switch try
> format into a query expression.

You'll likely still need to write this converter, as we'll likely want to keep running the old try_test_parser.py alongside this new system for a bit. And this new system will take a while to implement anyway. The good news is that if you write the converter, the |mach try| command can steal it for supporting legacy try syntax for those who still want to use it.


> So including a list of YAML files won't really work, but including an
> expression in the query language would.

This should work just fine.


> Also, every task has a label, so it would be possible to just include a list
> of labels that should be built.  So basically `mach try` would do the whole
> taskgraph generation process, then include [t.label for t in taskgraph] in
> the commit metadata.  The decision task would repeat the taskgraph
> generation process and then use that list as a filter.

Good idea, assuming we don't go the route of calling the scheduler APIs directly.


> Finally, we want to get to the point of scheduling all jobs -- buildbot
> included -- in-tree.  So I don't think you need to consider buildbot jobs as
> a special case.

So if I understand you correctly, this new taskcluster scheduler will also schedule buildbot jobs? \o/

Thanks for replying!
(In reply to Dustin J. Mitchell [:dustin] from comment #10)
> And anyone who can push to try already has the scopes
> required to do this (and we could configure the API calls such that any
> *additional* scopes they possess wouldn't be used, avoiding the chance of
> e.g., accidentally pulling production secrets onto a try builder)

Would *all* scopes to show jobs on Treeherder plus access to the routes be available?

> What is a taskcluster clientId and how does Joe Schmo developer get one?

mach try would open a TC web page and Joe would grant access to the local script.

>> Finally, we want to get to the point of scheduling all jobs -- buildbot
>> included -- in-tree.  So I don't think you need to consider buildbot jobs as
>> a special case.
>
>So if I understand you correctly, this new taskcluster scheduler will also schedule buildbot jobs? \o/

We need to stop scheduling jobs through Buildbot and have BBB/TC schedule them instead.

I speak about this in:
http://armenzg.blogspot.ca/2015/09/the-benefits-of-moving-per-push.html
> Would *all* scopes to show jobs on Treeherder plus access to the routes be available?

yes
We have try fuzzy these days.
I will be deprecating mozci soon.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX