1302786 - Support filtering output from task-graph generation

Reporter

Description

•

8 years ago

The `./mach taskgraph <subcommand>` commands generally spit out all of the tasks they are creating, but maybe you only want to see one.  For example:

  ./mach taskgraph tasks -p parameters.yml -J --tasks 'source-check-mozlint-eslint'

might output the JSON for only that one task.  The `--tasks` option could take a regular expression to make it a little more general.

HAMMAD AKHTAR

Assignee

Comment 1

•

8 years ago

Hi Dustin, I am a newbie here. I would like to take up this bug. It will be really helpful if you can guide me on how tackle this bug.

Thank you

HAMMAD AKHTAR

Assignee

Comment 2

•

8 years ago

Hi Dustin, I am a newbie here. I would like to take up this bug. It will be really helpful if you can guide me on how tackle this bug.

Thank you

Flags: needinfo?(dustin)

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 3

•

8 years ago

Sure! The first thing is to get a checkout of the Firefox source code. There is some useful material on that topic on http://developer.mozilla.org/

You can stop at the part where the getting-started guide says to run `./mach build`. For this bug, you don't need to build, just run mach commands.

Once you have that running, you'll want to get set up to hack taskgraphs,
http://gecko.readthedocs.io/en/latest/taskcluster/taskcluster/how-tos.html#hacking-task-graphs

You can find a parameters.yml file by visiting https://treeherder.mozilla.org/#/jobs?repo=mozilla-central, finding a green "D", and then downloading the parameters.yml artifact that appears in the panel at the bottom.

Once you've done that, try running
./mach taskgraph tasks -p parameters.yml -J

You will see that it thinks for a while, then produces a LOT of JSON output. The goal of this bug is to add one or more command-line options that can filter that output down to something more reasonable. The code that runs when you type `./mach taskgraph` is all under `taskcluster/`, starting with `taskcluster/mach_commands.py`. I'll leave you to learn about that source code. A few hints:

- docs: http://gecko.readthedocs.io/en/latest/taskcluster/taskcluster/
- read the source to learn more -- follow the execution from place to place
- add "print" statements here and there to see what happens in what order
- concentrate on the parts you most need to understand - in this case, the command-line arguments and the code that shows the output.

If you get stuck learning, I'm happy to answer specific questions either here or in IRC (you can find me in the #tc-contributors channel on Mozilla IRC)

Assignee: nobody → hammad13060

Flags: needinfo?(dustin)

HAMMAD AKHTAR

Assignee

Comment 4

•

8 years ago

Attached file mach_commands.py — Details

Hi Dustin, I have made some changes to functions of MachCommands class found in mach_commands.py. I have decorated taskgraph_tasks function with command argument '--tasks'. I have added a check for json format in show_taskgraph function, then show_taskgraph_json method is called (if format is json). Is my approach correct ?

Flags: needinfo?(dustin)

Attachment #8810340 - Flags: feedback?(dustin)

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 5

•

8 years ago

quick note -- if you set "feedback?" you don't need to also set "needinfo?"

Flags: needinfo?(dustin)

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Updated

•

8 years ago

Attachment #8810340 - Attachment mime type: text/x-python-script → text/plain

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 6

•

8 years ago

Comment on attachment 8810340 [details]
mach_commands.py

Good start!

the `--tasks` option should be added to all subcommands that call `show_taskgraph`, which means it should be part of `ShowTaskGraphSubcommand`.

Also, `--tasks` should apply to all show methods, not just JSON.  Probably the easiest way to do that is to filter the taskgraph in `show_taskgraph` before passing it to the show method.

Finally, it's difficult for me to look at changes in a Python file like this, since I must visually compare it against the original.  Providing a patch, either as a patch attachment or using mozreview (https://mozilla-version-control-tools.readthedocs.io/en/latest/mozreview-user.html) will make it a lot easier for me to help.

Attachment #8810340 - Flags: feedback?(dustin) → feedback+

Comment hidden (mozreview-request)

Review commit: https://reviewboard.mozilla.org/r/92968/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/92968/

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 8

•

8 years ago

mozreview-review

Comment on attachment 8810680 [details]
Bug 1302786 - graph.named_links_dict changed to  graph.named_links_dict[key];

https://reviewboard.mozilla.org/r/92968/#review93126

This is looking great.  The "r-" just means I do not want to land the patch as-is.

I'm open to being convinced that the to/from_json is the only option.

::: taskcluster/mach_commands.py:238
(Diff revision 1)
>  
>              show_method = getattr(self, 'show_taskgraph_' + (options['format'] or 'labels'))
> -            show_method(tg)
> +
> +            filtered_tg_dict = self.filter_tasks(tg, options['tasks_regex'])
> +
> +            tg = TaskGraph.from_json(filtered_tg_dict)

Why go `to_json` and then back with `from_json` here?  I would prefer to just filter the TaskGraph object itself.

::: taskcluster/mach_commands.py:239
(Diff revision 1)
>              show_method = getattr(self, 'show_taskgraph_' + (options['format'] or 'labels'))
> -            show_method(tg)
> +
> +            filtered_tg_dict = self.filter_tasks(tg, options['tasks_regex'])
> +
> +            tg = TaskGraph.from_json(filtered_tg_dict)
> +            

some extra tab or space characters here

Attachment #8810680 - Flags: review?(dustin) → review-

HAMMAD AKHTAR

Assignee

Comment 9

•

8 years ago

Hey Dustin, in the code for TaskGraph and Graph classes it is mentioned that objects are immutable. Hence I am thinking of adding a new class method to TaskGraph which generates a new filtered TaskGraph object and returns it. Am I heading in the right direction.

Thank you

Flags: needinfo?(dustin)

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 10

•

8 years ago

Close!  Probably better to follow the pattern of `taskcluster/taskgraph/generator.py`, in particular "Generating target task set", which does a similar thing (filtering a taskgraph).  That creates a new TaskGraph object directly, using dictionary and/or list comprehensions to create the content.

Flags: needinfo?(dustin)

Comment hidden (mozreview-request)

Comment on attachment 8810680 [details]
Bug 1302786 - graph.named_links_dict changed to  graph.named_links_dict[key];

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/92968/diff/1-2/

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 12

•

8 years ago

mozreview-review

Comment on attachment 8810680 [details]
Bug 1302786 - graph.named_links_dict changed to  graph.named_links_dict[key];

https://reviewboard.mozilla.org/r/92968/#review93192

I'm liking this, just please move the `get_filtered_taskgraph` method into `MachCommands`, to match how derived `TaskGraph` instances are created in other places.

Attachment #8810680 - Flags: review?(dustin) → review-

Comment hidden (mozreview-request)

Comment on attachment 8810680 [details]
Bug 1302786 - graph.named_links_dict changed to  graph.named_links_dict[key];

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/92968/diff/2-3/

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 14

•

8 years ago

mozreview-review

Comment on attachment 8810680 [details]
Bug 1302786 - graph.named_links_dict changed to  graph.named_links_dict[key];

https://reviewboard.mozilla.org/r/92968/#review93494

Yay, looks good!

Actually, mozreview is showing this diff as deleting code from taskgraph.py, but that code isn't actually *in* taskgraph.py, so that's confusing.  I think that is a mozreview bug, but we'll see.  Have you run this is try (now that you have try access)?

Attachment #8810680 - Flags: review?(dustin) → review+

HAMMAD AKHTAR

Assignee

Comment 15

•

8 years ago

mozreview-review-reply

Comment on attachment 8810680 [details]
Bug 1302786 - graph.named_links_dict changed to  graph.named_links_dict[key];

https://reviewboard.mozilla.org/r/92968/#review93494

Thank you Dustin for helping me. This is my first time contributing to an open source project, so I am not aware of various terms. I am trying my best to learn things fast. It will be great if you can share some reference on "ship it".

HAMMAD AKHTAR

Assignee

Comment 16

•

8 years ago

mozreview-review-reply

Comment on attachment 8810680 [details]
Bug 1302786 - graph.named_links_dict changed to  graph.named_links_dict[key];

https://reviewboard.mozilla.org/r/92968/#review93494

Dustin I have not run my code on try server. It will be great if you can share some reference on that too.

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 17

•

8 years ago

Thanks for asking -- "ship it" means it's ready to be committed to the version control system.  It's sort of a Mozilla term.

There's some help on pushing to try here:
  https://wiki.mozilla.org/Build:TryServerAsBranch#How_to_push_to_try

give it a shot -- it's a good way to make sure the job will not cause problems when we commit it.  A good try syntax is "try: -b d -p linux64 -t none -u none" which will do a linux64 build but not run any tests.

HAMMAD AKHTAR

Assignee

Comment 18

•

8 years ago

(In reply to Dustin J. Mitchell [:dustin] from comment #17)
> Thanks for asking -- "ship it" means it's ready to be committed to the
> version control system.  It's sort of a Mozilla term.
> 
> There's some help on pushing to try here:
>   https://wiki.mozilla.org/Build:TryServerAsBranch#How_to_push_to_try
> 
> give it a shot -- it's a good way to make sure the job will not cause
> problems when we commit it.  A good try syntax is "try: -b d -p linux64 -t
> none -u none" which will do a linux64 build but not run any tests.

So how do I know which tests are for my patch ?

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 19

•

8 years ago

Good question!  It takes a bit of learning.  In this case, none of the unit tests or talos tests (-u and -t) test this code; instead, the python tests will evaluate it, and those aren't controlled through try syntax.  Don't think too hard about it, it doesn't make that much sense!  In general, as you work on this or that part of Gecko, people already working on it can suggest what to run in try.

HAMMAD AKHTAR

Assignee

Comment 20

•

8 years ago

(In reply to Dustin J. Mitchell [:dustin] from comment #19)
> Good question!  It takes a bit of learning.  In this case, none of the unit
> tests or talos tests (-u and -t) test this code; instead, the python tests
> will evaluate it, and those aren't controlled through try syntax.  Don't
> think too hard about it, it doesn't make that much sense!  In general, as
> you work on this or that part of Gecko, people already working on it can
> suggest what to run in try.

Okay so basically when someone will write tests for this piece of code only then it can be tested. If there are not tests for a code the most we can do is try building it on different platforms.

I ran the code on try server at it failed for one job. So this piece of code still needed to be shipped ? Can you elaborate on shipping ?

Thank you

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 21

•

8 years ago

You're right that there aren't tests for this bit of code, but in this case if it runs OK by hand, then that should be enough.

Can you link to the try server failure?  If it failed then we may need to make more changes.

HAMMAD AKHTAR

Assignee

Comment 22

•

8 years ago

try server failure - https://archive.mozilla.org/pub/firefox/try-builds/hammad13060@iiitd.ac.in-191db05218fbcba255b85c26dfa65562f99345de/

I am not able to access it.

HAMMAD AKHTAR

Assignee

Comment 23

•

8 years ago

(In reply to HAMMAD AKHTAR from comment #22)
> try server failure -
> https://archive.mozilla.org/pub/firefox/try-builds/hammad13060@iiitd.ac.in-
> 191db05218fbcba255b85c26dfa65562f99345de/
> 
> I am not able to access it.

(In reply to Dustin J. Mitchell [:dustin] from comment #21)
> You're right that there aren't tests for this bit of code, but in this case
> if it runs OK by hand, then that should be enough.
> 
> Can you link to the try server failure?  If it failed then we may need to
> make more changes.

try server failure - https://archive.mozilla.org/pub/firefox/try-builds/hammad13060@iiitd.ac.in-191db05218fbcba255b85c26dfa65562f99345de/

I am not able to access it.

Flags: needinfo?(dustin)

HAMMAD AKHTAR

Assignee

Comment 24

•

8 years ago

(In reply to Dustin J. Mitchell [:dustin] from comment #21)
> You're right that there aren't tests for this bit of code, but in this case
> if it runs OK by hand, then that should be enough.
> 
> Can you link to the try server failure?  If it failed then we may need to
> make more changes.
 link to treeherder : https://treeherder.mozilla.org/#/jobs?repo=try&revision=191db05218fbcba255b85c26dfa65562f99345de

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 25

•

8 years ago

You don't need to post messages twice, with a needinfo on the second one - I can read it just fine the first time :)

The more useful link is

  https://treeherder.allizom.org/#/jobs?repo=try&revision=471eaa285bcd7a1d58c69ce112c52bd9491484aa

but you used -p all, which was quite wasteful -- that's several *days* worth of computation and many GB of storage space used.  It's not the end of the world, but please do be careful what jobs you run in try - we spend millions of dollars every year running try jobs, and every bit of savings helps.

Of all the jobs in your try push, the one that is problematic is:

  https://treeherder.allizom.org/#/jobs?repo=try&revision=471eaa285bcd7a1d58c69ce112c52bd9491484aa&selectedJob=29620972

and in particular:

TEST-UNEXPECTED-ERROR | /home/worker/checkouts/gecko/taskcluster/mach_commands.py:218:9 | 'taskgraph.taskgraph.TaskGraph' imported but unused (F401)

so, please remove that import.  Also, I figured out why mozreview was displaying things strangely -- there are two changesets with the same commit message in your push:

 dc79b735fafb HA Bug 1302786 - Support filtering output from task-graph generation; r?dustin
 9d857c040a96 HA Bug 1302786 - Support filtering output from task-graph generation; r?dustin

You can use `hg histedit` to combine those into a single changeset, remove the import, and push to review again, and we'll see how that works :)

Comment hidden (mozreview-request)

Comment on attachment 8810680 [details]
Bug 1302786 - graph.named_links_dict changed to  graph.named_links_dict[key];

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/92968/diff/3-4/

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 27

•

8 years ago

That is a nice-looking try push!

However, on trying it manually:

./mach taskgraph full -p parameters.yml --tasks 'build-linux.*'
 0:00.12 Loading kinds
 0:00.26 Generating full task set
 0:00.31 Generated 4 tasks for kind android-stuff
 0:00.31 Generated 1 tasks for kind artifact-build
 0:00.49 Generated 23 tasks for kind build
 0:00.50 Generated 1 tasks for kind build-signing
 0:04.54 Generated 700 tasks for kind desktop-test
 0:04.70 Generated 6 tasks for kind docker-image
 0:04.71 Generated 2 tasks for kind hazard
 0:04.72 Generated 3 tasks for kind l10n
 0:04.72 Generated 1 tasks for kind marionette-harness
 0:04.77 Generated 6 tasks for kind source-check
 0:04.81 Generated 12 tasks for kind spidermonkey
 0:04.82 Generated 2 tasks for kind static-analysis
 0:04.84 Generated 4 tasks for kind toolchain
 0:04.85 Generated 3 tasks for kind upload-symbols
 0:04.85 Generated 1 tasks for kind valgrind
 0:06.14 Generated 220 tasks for kind android-test
 0:06.14 Generating full task graph
Traceback (most recent call last):
  File "/home/dustin/p/m-c/taskcluster/mach_commands.py", line 235, in show_taskgraph
    tg = self.get_filtered_taskgraph(tg, options["tasks_regex"])
  File "/home/dustin/p/m-c/taskcluster/mach_commands.py", line 269, in get_filtered_taskgraph
    if regexprogram.match(dep):
TypeError: expected string or buffer

Printing `dep` there gives

{u'docker-image': u'build-docker-image-desktop-test', u'build': u'build-linux64/debug'}

I think there's an issue with the logic there -- maybe it changed when moving out of TaskGraph?  Can you take another look?  Sorry to keep going around and around on this one!

Flags: needinfo?(dustin)

HAMMAD AKHTAR

Assignee

Comment 28

•

8 years ago

(In reply to Dustin J. Mitchell [:dustin] from comment #27)
> That is a nice-looking try push!
> 
> However, on trying it manually:
> 
> ./mach taskgraph full -p parameters.yml --tasks 'build-linux.*'
>  0:00.12 Loading kinds
>  0:00.26 Generating full task set
>  0:00.31 Generated 4 tasks for kind android-stuff
>  0:00.31 Generated 1 tasks for kind artifact-build
>  0:00.49 Generated 23 tasks for kind build
>  0:00.50 Generated 1 tasks for kind build-signing
>  0:04.54 Generated 700 tasks for kind desktop-test
>  0:04.70 Generated 6 tasks for kind docker-image
>  0:04.71 Generated 2 tasks for kind hazard
>  0:04.72 Generated 3 tasks for kind l10n
>  0:04.72 Generated 1 tasks for kind marionette-harness
>  0:04.77 Generated 6 tasks for kind source-check
>  0:04.81 Generated 12 tasks for kind spidermonkey
>  0:04.82 Generated 2 tasks for kind static-analysis
>  0:04.84 Generated 4 tasks for kind toolchain
>  0:04.85 Generated 3 tasks for kind upload-symbols
>  0:04.85 Generated 1 tasks for kind valgrind
>  0:06.14 Generated 220 tasks for kind android-test
>  0:06.14 Generating full task graph
> Traceback (most recent call last):
>   File "/home/dustin/p/m-c/taskcluster/mach_commands.py", line 235, in
> show_taskgraph
>     tg = self.get_filtered_taskgraph(tg, options["tasks_regex"])
>   File "/home/dustin/p/m-c/taskcluster/mach_commands.py", line 269, in
> get_filtered_taskgraph
>     if regexprogram.match(dep):
> TypeError: expected string or buffer
> 
> Printing `dep` there gives
> 
> {u'docker-image': u'build-docker-image-desktop-test', u'build':
> u'build-linux64/debug'}
> 
> I think there's an issue with the logic there -- maybe it changed when
> moving out of TaskGraph?  Can you take another look?  Sorry to keep going
> around and around on this one!

I think that line "for depname, dep in named_links_dict.iteritems()" needs to be changes to "for depname, dep in named_links_dict[key].iteritems()", because graph.named_links_dict() returns returns a two-level dictionary mapping each node to a dictionary mapping edge names to labels.

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 29

•

8 years ago

sounds right to me!

HAMMAD AKHTAR

Assignee

Comment 30

•

8 years ago

okay so I will try it on try server and then share the changes with you :)

Flags: needinfo?(dustin)

Comment hidden (mozreview-request)

Comment on attachment 8810680 [details]
Bug 1302786 - graph.named_links_dict changed to  graph.named_links_dict[key];

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/92968/diff/4-5/

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 32

•

8 years ago

I hate to nag about communication stuff, but please stop setting the needinfo flag

Flags: needinfo?(dustin)

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 33

•

8 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/d607a23fc3dc64956e68b588e53772b9aef48f4f
Bug 1302786 - Support filtering output from task-graph generation; r=dustin

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 34

•

8 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/add1700c24ca4e854f9c4e66bf433cc8c6696f2d
Bug 1302786: add help message for mach taskgraph --tasks-regex r=me

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 35

•

8 years ago

mozreview-review

Comment on attachment 8810680 [details]
Bug 1302786 - graph.named_links_dict changed to  graph.named_links_dict[key];

https://reviewboard.mozilla.org/r/92968/#review94054

The try job looks good (the decision ran without issue) and it works for me on the command line!  Yay!

I see there are two commits in the hg history now, and neither has a commit message that reflects the bug (they just reflect changes made in review).  I've squashed those two commits together and used a different commit message.  I also added a help message to the --tasks option (I didn't see that it was missing before or I would I have mentioned it).

Here's the result:

https://hg.mozilla.org/integration/mozilla-inbound/rev/d607a23fc3dc64956e68b588e53772b9aef48f4f
https://hg.mozilla.org/integration/mozilla-inbound/rev/add1700c24ca4e854f9c4e66bf433cc8c6696f2d

Hooray, shipped!  Can I help you find the next thing to hack on?

HAMMAD AKHTAR

Assignee

Comment 36

•

8 years ago

Thank you Dustin for helping me. Even I knew that I had to add help message to --tasks command but completely forgot about it. Thanks again :)

Yes it will be great if you can suggest next thing to work on.

Flags: needinfo?(dustin)

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 37

•

8 years ago

How about bug 1311810?  I haven't heard back from Haard, so it's open to work on.

Status: NEW → RESOLVED

Closed: 8 years ago

Flags: needinfo?(dustin)

Resolution: --- → FIXED

Carsten Book [:Tomcat]

Comment 38

•

8 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/d607a23fc3dc
https://hg.mozilla.org/mozilla-central/rev/add1700c24ca

areya

Comment 39

•

8 years ago

https://hg.mozilla.org/mozilla-central/rev/d607a23fc3dc
https://hg.mozilla.org/mozilla-central/rev/add1700c24ca

BMO Automation

Updated

•

6 years ago

Product: TaskCluster → Firefox Build System

mach_commands.py 8 years ago HAMMAD AKHTAR 12.65 KB, text/plain	dustin : feedback+	Details
Bug 1302786 - graph.named_links_dict changed to graph.named_links_dict[key]; 8 years ago HAMMAD AKHTAR 58 bytes, text/x-review-board-request	dustin : review+	Details