Closed Bug 1720715 Opened 3 years ago Closed 3 years ago

Productionize `taskgraph-diff` by integrating it into taskgraph

Categories

(Firefox Build System :: Task Configuration, task, P2)

task

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ahal, Assigned: ahal)

References

Details

Attachments

(8 files)

The taskgraph-diff module is just a couple scripts and a bunch of parameters files. We want to make this tool easier to use and discover.

After putting some thought into this, I’d like to incorporate it into taskgraph itself. My rough plan here is:

  1. Build taskgraph-gen directly into taskgraph

    • Add ability to specify multiple --parameters flags
    • Add ability to specifiy a dir of params files (like in braindump)
    • Resolve a list of parameters + identifiers for each
    • If more than one, generate them all in parallel (like taskgraph-gen does currently)
    • If more than one, write taskgraph logs to log files (one per params file)
    • If more than one, taskgraph output files will be prefixed by params identifier (if specified), by default will still print to stdout (with log lines denoting which params identifier each one is)
  2. Turn taskgraph-diff into a submodule of the taskgraph package

    • We can still provide a taskgraph-diff entrypoint which gets invoked the same way it is now
    • We can also add a taskgraph --diff convenience flag that can automatically generate two sets of parameters files on different VCS revisions and then call taskgraph-diff on them
      • Updating vcs feels icky, but since implementing it for m-c it’s honestly been one of the biggest time savers I’ve ever made. Plus we have mozversioncontrol which handles abstractions across hg + git.
      • This could be follow-up work, but imo it is the missing link to making taskgraph-diff useful for laypeople
  3. Move actual dummy parameters files into the repos that they are for. E.g, the parameters in the params-fenix dir would go in the actual fenix repo under taskcluster/test/params or somewhere similar.

So presuming this exists in mozilla-central, the common usage of this tool would be to run something like:

$ ./mach taskgraph full --diff -p project=autoland -p project=mozilla-central --fast --target-kind=build

This would generate the taskgraphs on two separate revisions (current and base like it does currently). For each revision it would also generate two graphs, one for autoland and one for central. After the four total graphs are generated, they'd get fed into taskgraph-diff and normal diff output would be displayed. In all four taskgraphs, only the build kind (and dependencies) are generated.

If working on this I'd likely incorporate it into standalone taskgraph first and then try to sync it to m-c after.

(In reply to Andrew Halberstadt [:ahal] from comment #0)

The taskgraph-diff module is just a couple scripts and a bunch of parameters files. We want to make this tool easier to use and discover.

After putting some thought into this, I’d like to incorporate it into taskgraph itself. My rough plan here is:

  1. Build taskgraph-gen directly into taskgraph

As long as this means both standalone taskgraph and gecko taskgraph, this sounds good.

* Add ability to specify multiple `--parameters` flags
* Add ability to specify a dir of params files (like in braindump)
* Resolve a list of parameters + identifiers for each
* If more than one, generate them all in parallel (like `taskgraph-gen` does currently)

We did a multiprocessing hack here because we were still in py2; this means that a ctrl-C doesn't actually kill things. I'd lean towards asyncio here instead.

* If more than one, write taskgraph logs to log files (one per params file)
* If more than one, taskgraph output files will be prefixed by params identifier (if specified), by default will still print to stdout (with log lines denoting which params identifier each one is)

Sure, this sounds like it will help debug without having to deal with interleaved logs.

  1. Turn taskgraph-diff into a submodule of the taskgraph package
    • We can still provide a taskgraph-diff entrypoint which gets invoked the same way it is now
    • We can also add a taskgraph --diff convenience flag that can automatically generate two sets of parameters files on different VCS revisions and then call taskgraph-diff on them
      • Updating vcs feels icky, but since implementing it for m-c it’s honestly been one of the biggest time savers I’ve ever made. Plus we have mozversioncontrol which handles abstractions across hg + git.

Hm, l10n-cross-channel does something with hglib where it's able to inspect files in a given revision. I wonder if we can do something with that, or if that's way uglier given how many files are in taskcluster/ and how many files we reference outside of it.

We could do something like requiring a clean tree before updating vcs, or maybe hg does things cleanly.

    * This could be follow-up work, but imo it is the missing link to making `taskgraph-diff` useful for laypeople
  1. Move actual dummy parameters files into the repos that they are for. E.g, the parameters in the params-fenix dir would go in the actual fenix repo under taskcluster/test/params or somewhere similar.

+1, and docs on how to curate these.
Something that makes these harder to maintain is the lack of default parameter values; we may want to raise that in priority.

(In reply to Aki Sasaki [:aki] (he/him) (UTC-7) from comment #2)

  1. Build taskgraph-gen directly into taskgraph

As long as this means both standalone taskgraph and gecko taskgraph, this sounds good.

Yes, I'd sync it to both. Though the cli is pretty divergent already, so it might end up being "implement it for both"

We did a multiprocessing hack here because we were still in py2; this means that a ctrl-C doesn't actually kill things. I'd lean towards asyncio here instead.

+1

Hm, l10n-cross-channel does something with hglib where it's able to inspect files in a given revision. I wonder if we can do something with that, or if that's way uglier given how many files are in taskcluster/ and how many files we reference outside of it.

We could do something like requiring a clean tree before updating vcs, or maybe hg does things cleanly.

Yeah, this came up in review when I implemented it. Mitch suggested the same thing but ultimately we don't know exactly which files we need (other than globs in the sparse-profile definition). At the time we concluded it was more effort than any potential benefit warranted.

Though I also know the moz-phab team was looking into a way of solving this as well. If they get it working, we could try and copy their approach, though again I'm not 100% sure it will translate well to taskgraph.

Move actual dummy parameters files into the repos that they are for. E.g, the parameters in the params-fenix dir would go in the actual fenix repo under taskcluster/test/params or somewhere similar.

How confident are we that these will stay up to date? The ones in braindump tend to get out of date and cause invalid testing. Personally, I try to use -p task-id or -p [project] these days, to pull a recent one that I'm certain will give me comparable results to m-c, etc.

Other than that, which is not even a real objection, I'm a big +1 on all of this.

Hm, l10n-cross-channel does something with hglib where it's able to inspect files in a given revision. I wonder if we can do something with that, or if that's way uglier given how many files are in taskcluster/ and how many files we reference outside of it.

Ah, we have to support git if we're in standalone taskgraph.

(In reply to bhearsum@mozilla.com (:bhearsum) from comment #4)

Move actual dummy parameters files into the repos that they are for. E.g, the parameters in the params-fenix dir would go in the actual fenix repo under taskcluster/test/params or somewhere similar.

How confident are we that these will stay up to date? The ones in braindump tend to get out of date and cause invalid testing. Personally, I try to use -p task-id or -p [project] these days, to pull a recent one that I'm certain will give me comparable results to m-c, etc.

Other than that, which is not even a real objection, I'm a big +1 on all of this.

I'm confident they'll go out of date, but as long as we have something for every required parameter, it works. Changing parameter files means we can't diff across taskgraph-gen runs, and using -p [project] doesn't include cron or relpro.

Yeah, -p project=autoland has real chance of picking up different parameters, but maybe we can make taskgraph-diff smart enough to re-use whatever parameters it used the first time around.

(In reply to Aki Sasaki [:aki] (he/him) (UTC-7) from comment #5)

I'm confident they'll go out of date, but as long as we have something for every required parameter, it works. Changing parameter files means we can't diff across taskgraph-gen runs, and using -p [project] doesn't include cron or relpro.

Maybe we can add -p task-index=<index> as well so we can include things like cron and relpro. We could make a little manifest that lists all the indices we care about and figure out a way to pass them all into taskgraph. That way we wouldn't need the static files as much anymore.

Edit: though not all contexts have a latest index, so I guess it's still not perfect.

One way might be to publish the parameters used along with the taskgraph-gen output, as artifacts. (I'm not 100% sure if that will be useful or not, but that would allow us to reference and potentially debug.)

We did a multiprocessing hack here because we were still in py2; this means that a ctrl-C doesn't actually kill things. I'd lean towards asyncio here instead.

I was thinking about this part, and I'm not sure it's the right choice here. These are CPU intensive tasks AFAIK, which means asyncio probably doesn't speed things up much (we're still stuck in one process). I think we want a `concurrent.futures' instead? https://stackoverflow.com/questions/51271451/how-to-parallelize-computation-with-asyncio

Good call. There's a fair amount of IO involved in taskgraph generation (reading config files, reading moz.build files, etc), but yeah.. I think we should continue using a separate subprocess for each generation here. Haven't looked at what the "hack" is, but switching to a concurrent.futures.ProcessPoolExecutor might still be an improvement.

I'd love to investigate using asyncio within a single invocation of taskgraph such that we can process kinds concurrently (as long as their kind dependencies are already processed), but that's a whole separate project.

Assignee: nobody → ahal
Status: NEW → ASSIGNED
Depends on: 1725404

Rather than implement the --diff feature into two distinct cli's, I decided to block on bug 1725404 and try to unify the two first.

Depends on: 1726017
No longer depends on: 1726017

We will soon allow multiple parameter files on the same cli. In anticipation of
this we need to add identifiers (the hash of the parameters) as well as
specifications (how the parameters were designated by the user). This will make
it easier to trach which logs / output belong to which taskgraph generation
from the user's perspective.

Depends on D122991

This is a WIP stack that mostly solves step 1 from comment 0. Still todo here are:

  1. Ability to specify a dir of parameters
  2. Solve the log issue (interleaved logs from multiple generations). Will likely just redirect these to files.

After that is solving the diff portion of the bug. I'd like to try and have all of this work in a single invocation (i.e taskgraph --diff vs taskgraph && hg up other && taskgraph && taskgraph-diff). Though to do this will mean messing around with version control which isn't everyone's cup of tea. For context we already do this in mozilla-central and some safe guards we use include bailing out if the working dir is dirty and trying to update as close as possible back to the original state (e.g ensure bookmarks are activated again with hg).

If there aren't any major objections to this, I'd like to publish Gecko's mozversioncontrol library to Pypi and make taskgraph depend on that rather than re-inventing the wheel there.

(In reply to Andrew Halberstadt [:ahal] from comment #16)

This is a WIP stack that mostly solves step 1 from comment 0. Still todo here are:

  1. Ability to specify a dir of parameters
  2. Solve the log issue (interleaved logs from multiple generations). Will likely just redirect these to files.

After that is solving the diff portion of the bug. I'd like to try and have all of this work in a single invocation (i.e taskgraph --diff vs taskgraph && hg up other && taskgraph && taskgraph-diff). Though to do this will mean messing around with version control which isn't everyone's cup of tea. For context we already do this in mozilla-central and some safe guards we use include bailing out if the working dir is dirty and trying to update as close as possible back to the original state (e.g ensure bookmarks are activated again with hg).

If there aren't any major objections to this, I'd like to publish Gecko's mozversioncontrol library to Pypi and make taskgraph depend on that rather than re-inventing the wheel there.

I'm on board with all of this. I am one of the folks that's not a fan of messing with the VCS, but it's worth it for the enhanced UX IMO. (Unless you want to consider copying the repo to a tempdir or something, and never touching the original - but that's awfully expensive.)

Heh, that might work for some of the smaller repos.. but it's gonna be a bad time with m-c :p.

I'm definitely open to exploring alternatives down the road (bearing in mind we need to support hg and git), but I don't want to block any of the work currently scoped out for H2 on it. If messing with VCS was a deal breaker, I'd advocate the 4 step method as the next best thing (and promptly write myself a script to avoid it).

(In reply to Andrew Halberstadt [:ahal] from comment #18)

Heh, that might work for some of the smaller repos.. but it's gonna be a bad time with m-c :p.

I'm definitely open to exploring alternatives down the road (bearing in mind we need to support hg and git), but I don't want to block any of the work currently scoped out for H2 on it. If messing with VCS was a deal breaker, I'd advocate the 4 step method as the next best thing (and promptly write myself a script to avoid it).

Just to be 100% clear...I am not against checking in the VCS stuff. It's not an ideal solution, but it's the best available one. If we can come up with something better later, great, but if not I'm happy for it to live in perpetuity.

This flag was previously being ignored.

Depends on D122991

Depends on: 1714597
Blocks: 1726573
Pushed by ahalberstadt@mozilla.com: https://hg.mozilla.org/ci/taskgraph/rev/70cf69f7927a Improve '--parameters' help text, r=taskgraph-reviewers,bhearsum https://hg.mozilla.org/ci/taskgraph/rev/644b0b15ef3a Fix '--verbose' in show taskgraph commands, r=taskgraph-reviewers,bhearsum
Depends on: 1726831

Passing --parameters=<dir> will search <dir> for any .json or .yml therein and
run multiple taskgraph generations in parallel (as if passing multiple -p
flags).

This is primarily intended to be used with the sets of parameters defined in
taskgraph-diff:
https://hg.mozilla.org/build/braindump/file/61ad54b07b06ee98cdd5e1177f423dda62b3ac3b/taskcluster/taskgraph-diff

Depends on D122994

This also refactors the tests to use parametrization.

Depends on D123563

Attachment #9236883 - Attachment description: WIP: Bug 1720715 - Create and log parameter IDs / specifications → Bug 1720715 - Create and log parameter IDs / specifications, r?#taskgraph-reviewers!
Attachment #9236884 - Attachment description: WIP: Bug 1720715 - Allow passing multiple parameters on the same command line → Bug 1720715 - Allow passing multiple parameters on the same command line, r?#taskgraph-reviewers!
Attachment #9236885 - Attachment description: WIP: Bug 1720715 - Run generations with multiple parameters in parallel → Bug 1720715 - Run generations with multiple parameters in parallel, r?#taskgraph-reviewers!
Attachment #9237871 - Attachment description: WIP: Bug 1720715 - Support directories with the --parameters flag → Bug 1720715 - Support directories with the --parameters flag, r?#taskgraph-reviewers!
Attachment #9237872 - Attachment description: WIP: Bug 1720715 - Refactor taskgraph.util.vcs to use an interface → Bug 1720715 - Refactor taskgraph.util.vcs to use an interface, r?#taskgraph-reviewers!
Attachment #9237873 - Attachment description: WIP: Bug 1720715 - Support a '--diff' flag in taskgraph's cli → Bug 1720715 - Support a '--diff' flag in taskgraph's cli, r?#taskgraph-reviewers!
Pushed by ahalberstadt@mozilla.com: https://hg.mozilla.org/ci/taskgraph/rev/da7774fc0416 Create and log parameter IDs / specifications, r=taskgraph-reviewers,aki https://hg.mozilla.org/ci/taskgraph/rev/4eb4d8bec730 Allow passing multiple parameters on the same command line, r=taskgraph-reviewers,aki
Blocks: 1729060
Pushed by ahalberstadt@mozilla.com: https://hg.mozilla.org/ci/taskgraph/rev/5b985cdabe1f Run generations with multiple parameters in parallel, r=taskgraph-reviewers,aki https://hg.mozilla.org/ci/taskgraph/rev/dc5010d02eb0 Support directories with the --parameters flag, r=taskgraph-reviewers,jmaher,aki https://hg.mozilla.org/ci/taskgraph/rev/f7e395a422e2 Refactor taskgraph.util.vcs to use an interface, r=taskgraph-reviewers,aki https://hg.mozilla.org/ci/taskgraph/rev/b1debb683854 Support a '--diff' flag in taskgraph's cli, r=taskgraph-reviewers,aki
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: