Closed Bug 1277665 Opened 4 years ago Closed 4 years ago

Make 'mach taskgraph' usable for local debugging like 'mach taskcluster-graph --dry-run'

Categories

(Firefox Build System :: Task Configuration, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: mshal, Unassigned)

Details

Attachments

(1 file)

Now that MOZ_BUILD_DATE exists in the taskcluster-graph data, it would be handy if this was stubbed out to a constant value when using --dry-run to get an accurate diff.
I think for the most part this mach target isn't used anymore in favor of the new `mach taskcluster` commands that dustin has added.
Oh, I wasn't aware of that. What's the correct way to dump out the tasks now and compare them? Should I be looking at artifacts/full-task-graph.json?
Flags: needinfo?(dustin)
Yes, if you're looking at decision tasks, that's correct.  If you want to run commands locally, you will need to hack together a version of bug 1277417 which isn't complete yet (but it's easy to just add a print(json.dumps(..))
Flags: needinfo?(dustin)
Attachment #8759318 - Flags: review?(garndt)
The workflow I used to use with taskcluster-graph was something like this:

1) Copy the 'mach taskcluster-graph ...' command out of the gecko decision task
2) Run the command, which prints the task information as JSON to stdout (so I'd usually redirect to a file)
3) If I want to compare results, I run the command with '--dry-run' and redirect to before/after files and then diff them.

This was nice because I didn't really have to figure out the details of the command, since it was available in the decision task and the same command ran locally. The --dry-run functionality stubbed out the variable stuff (like taskIds and dates) so that running it twice in a row produced identical output.

Is there a way we can get taskgraph to have this same behavior? As it is now, it seems like a big step backward. Here's what I've tried so far:

 - Copy/paste the 'mach taskgraph decision' command from the decision task: This runs for a while and then dies trying to connect to taskcluster for some reason. It at least produces the full-task-graph.json file, which is almost what I want. However, since there is no dry-run functionality, the diff is unusable. Additionally, the output seems to be somewhat arbitrarily ordered, so the same tasks are not in the same place in the output file.

 - Instead of 'decision', use one of the other options like 'full': This just complains about a missing "parameters" argument, which I'm not sure how to get working.
Flags: needinfo?(dustin)
Summary: taskcluster-graph --dry-run should stub out MOZ_BUILD_DATE → Make 'mach taskgraph' usable for local debugging like 'mach taskcluster-graph --dry-run'
The right way to do this now is to compare the outputs of two different `mach taskgraph target-graph` or `mach taskgraph optimized` commands.

Rather than copy-pasting the decision-task command (which also depends on a bunch of environment variables), just get the parameters.yml artifact from a relevant decision task, tweak it if necessary, and pass that to `mach taskgraph`.  The docs do talk about parameters.yml files :)

The graphs currently have a lot of differences.  Bug 1229178 is outstanding to make them more comparable.  That said, `jq` is a useful tool for cutting down the tasks to just the interesting bits, at which point a simple diff is useful.

Maybe this bug should be dup'd to bug 1229178?  And I could certainly use some help with bug 1229178 if you want to.
Flags: needinfo?(dustin)
What environment variables does the decision task depend on? I tried adding --dry-run functionality back in, and it seems to work fine for my purposes (though the docker image tasks still don't stub out the taskIds, but that can be done as a followup). Is there something that would be missing or wrong with the locally generated parameters without these variables?

I don't think having to download a parameters file and using special json processing tools makes for a smooth workflow. Even with that workflow, it's still helpful to have something to stub out the taskIds and dates to make for a usable diff.
Duplicate of this bug: 1229178
Comment on attachment 8759318 [details]
Bug 1277665 - add --dry-run functionality to mach taskgraph;

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/57296/diff/1-2/
Attachment #8759318 - Attachment description: Bug 1277665 - Stub out MOZ_BUILD_DATE for taskcluster-graph dry-run; → Bug 1277665 - add --dry-run functionality to mach taskgraph;
Attachment #8759318 - Flags: review?(dustin)
Comment on attachment 8759318 [details]
Bug 1277665 - add --dry-run functionality to mach taskgraph;

https://reviewboard.mozilla.org/r/57296/#review57920

Aside from some serious problems with this implementation, I'm not convinced this is necessary, not least because as written this makes `mach taskgraph decision --dry-run` roughly equivalent to `mach taskgraph tasks`.  Perhaps you can spell out your use-case better?

The rough diagram for how things work right now is this:

        |--tasks--|--target--|--target-graph--|--optimized--|
    |------------------------decision---------------------------|

that is, the decision task spans just a little more than the `mach taskgraph` utility commands.  At the beginning, it processes command-line arguments, environment variables, defaults, etc. into an explicit set of inputs (and records these as parameters.yml).  At the end, it calls `queue.createTask` for you.  Everything else is available in easy-to-use form from the utility commands.

I have a hard time seeing the process of gathering all the command-line and environment variable inputs to the decision task as being easier than clicking on a relevant parameters.yml file to download it.  But maybe I misunderstand?

The results aren't perfect right now.  Please know that things *will* get better once the legacy kind is replaced.  For one thing, labels (not taskIds) will be constant from run to run.  I'm also working on a tool to diff taskgraphs that can handle otherwise-spurious differences (particularly in datestamps) in bug 1229178 (which I just re-opened).

For what it's worth, I'm actually in favor of adding a --dry-run option to the decision task, but one that would only prevent it from calling `queue.createTask` at the end.  That way, we can run decision tasks for mozilla-central, etc. in --dry-run mode from try, avoiding the increasingly common phenomenon of task misconfigurations that work in try but fail immediately on inbound or, worse, beta.  So if you want to factor this patch down to that point, I'm open to r+'ing it :)

::: taskcluster/taskgraph/decision.py:65
(Diff revision 2)
>          root_dir=options['root'],
>          parameters=parameters,
>          target_tasks_method=target_tasks_method)
>  
> +    if options['dry_run']:
> +        print(json.dumps(tgg.full_task_graph.to_json(), indent=4, sort_keys=True))

This is not the graph you want, though.  The full task graph does have stable labels (once we replace the legacy kind), although not stable timestamps.  However, it is just the first stage in the decision task output, and omits all of the target set selection and optimization that are probably of interest when dry-running.

::: taskcluster/taskgraph/kind/docker_image.py:31
(Diff revision 2)
>  
>  
>  class DockerImageKind(base.Kind):
>  
>      def load_tasks(self, params):
> +        if params['dry_run']:

I don't think we want to have every kind special-casing dry runs.  Instead, we should have some kind of post-processing that strips the variable bits of the graphs.

::: taskcluster/taskgraph/kind/docker_image.py:40
(Diff revision 2)
> +            )
> +        else:
> +            from taskgraph.util.time import (
> +                current_json_time,
> +                json_time_from_now,
> +            )

I think that conditional imports are not the best way to implement this.  A better choice would be some kind of object to produce times and other dynamic attributes.  But "better" here does not overcome my concerns with the overall approach!
Attachment #8759318 - Flags: review?(dustin) → review-
(In reply to Dustin J. Mitchell [:dustin] from comment #11)
> Aside from some serious problems with this implementation, I'm not convinced
> this is necessary, not least because as written this makes `mach taskgraph
> decision --dry-run` roughly equivalent to `mach taskgraph tasks`.  Perhaps
> you can spell out your use-case better?

Ahh, ok. I put it in the decision part because it used only the command-line without requiring a separate parameters.yml file. It does sound like 'mach taskgraph tasks' is the right place to put this, though. I'll see if I can get that to support --dry-run without a parameters.yml file.

The use-case is spelled out a little more in https://bugzilla.mozilla.org/show_bug.cgi?id=1229178#c8

> The results aren't perfect right now.  Please know that things *will* get
> better once the legacy kind is replaced.  For one thing, labels (not
> taskIds) will be constant from run to run.  I'm also working on a tool to
> diff taskgraphs that can handle otherwise-spurious differences (particularly
> in datestamps) in bug 1229178 (which I just re-opened).

Ahh, if the taskIds are going away that will help. Then I think we'll just need to stub out the date fields.

Can you clarify on what your tool to diff taskgraphs will do? IMO just plain "diff" works great for this, when the output was consistent from run to run (hence --dry-run).

> For what it's worth, I'm actually in favor of adding a --dry-run option to
> the decision task, but one that would only prevent it from calling
> `queue.createTask` at the end.  That way, we can run decision tasks for
> mozilla-central, etc. in --dry-run mode from try, avoiding the increasingly
> common phenomenon of task misconfigurations that work in try but fail
> immediately on inbound or, worse, beta.  So if you want to factor this patch
> down to that point, I'm open to r+'ing it :)

That sounds like a different use-case from diffing the result of task definition changes, which is what the feature was originally intended for. Perhaps this should be a different flag? 

> 
> ::: taskcluster/taskgraph/decision.py:65
> (Diff revision 2)
> >          root_dir=options['root'],
> >          parameters=parameters,
> >          target_tasks_method=target_tasks_method)
> >  
> > +    if options['dry_run']:
> > +        print(json.dumps(tgg.full_task_graph.to_json(), indent=4, sort_keys=True))
> 
> This is not the graph you want, though.  The full task graph does have
> stable labels (once we replace the legacy kind), although not stable
> timestamps.  However, it is just the first stage in the decision task
> output, and omits all of the target set selection and optimization that are
> probably of interest when dry-running.

Ahh, thanks for the information! I'm still trying to understand all the different pieces here.

> ::: taskcluster/taskgraph/kind/docker_image.py:31
> (Diff revision 2)
> >  
> >  
> >  class DockerImageKind(base.Kind):
> >  
> >      def load_tasks(self, params):
> > +        if params['dry_run']:
> 
> I don't think we want to have every kind special-casing dry runs.  Instead,
> we should have some kind of post-processing that strips the variable bits of
> the graphs.

You mean going through the json and stripping stuff out before printing it? Or how do you envision this post-processing happening?

> ::: taskcluster/taskgraph/kind/docker_image.py:40
> (Diff revision 2)
> > +            )
> > +        else:
> > +            from taskgraph.util.time import (
> > +                current_json_time,
> > +                json_time_from_now,
> > +            )
> 
> I think that conditional imports are not the best way to implement this.  A
> better choice would be some kind of object to produce times and other
> dynamic attributes.  But "better" here does not overcome my concerns with
> the overall approach!

Yeah, it is quite ugly. I'll try to make it a separate class. What is your primary objection to adding the flag? Is it that it was in 'taskgraph decision' at all?
Assignee: mshal → nobody
dustin mentioned on IRC that he has more changes planned - I may revisit this after that's settled.
Just to answer the last question here, yes, my objection is that this is in 'taskgraph decision', and I don't think anyone should be running that directly, outside of a task environment.

There's been some more talk in bug 1229178; basically, I think we can produce stable JSON output without any post-processing required.  Instead, we'd post-process that stable output (by generating real timestamps) just before calling queue.createTask.  If that's the case, then diffing JSON outputs is just a matter of using `diff`.  In other words, aside from the timestamp work in bug 1229178, I don't think there's much to do here.
The JSON output is now stable and diff-able.  Is this sufficient, Mike?
Flags: needinfo?(mshal)
Yep, thanks!
Status: NEW → RESOLVED
Closed: 4 years ago
Flags: needinfo?(mshal)
Resolution: --- → WORKSFORME
Product: TaskCluster → Firefox Build System
You need to log in before you can comment on or make changes to this bug.