Closed Bug 1281062 Opened 9 years ago Closed 9 years ago

Create Action Tasks to schedule new jobs

Categories

(Firefox Build System :: Task Configuration, task)

Product:

Component:

Type:

task

Priority:

Not set

Severity:

normal

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: martianwars, Assigned: martianwars, Mentored)

References

Details

Attachments

(1 file, 5 obsolete files)

action_tasks.patch 9 years ago Kalpesh Krishna [:martianwars] 6.49 KB, patch	dustin : feedback+ armenzg : feedback+	Details \| Diff \| Splinter Review
action_tasks.patch 9 years ago Kalpesh Krishna [:martianwars] 12.73 KB, patch	dustin : feedback+	Details \| Diff \| Splinter Review
action_tasks.patch 9 years ago Kalpesh Krishna [:martianwars] 13.61 KB, patch		Details \| Diff \| Splinter Review
action_tasks.patch 9 years ago Kalpesh Krishna [:martianwars] 15.94 KB, patch		Details \| Diff \| Splinter Review
action_tasks.patch 9 years ago Kalpesh Krishna [:martianwars] 23.87 KB, patch		Details \| Diff \| Splinter Review
Bug 1281062 - Create Action Tasks to schedule new jobs. 9 years ago Kalpesh Krishna [:martianwars] 58 bytes, text/x-review-board-request	dustin : review+	Details

Kalpesh Krishna [:martianwars]

Assignee

Description

•

9 years ago

Proposed plan of action :- 1. I create a JSON file having the task template for an action task. Maybe this could be added to the full-task-graph.json file itself. 2. On receiving a pulse message, pulse action downloads this file and fills in the missing blanks for action tasks. 3. An action task is created using the createTask API (https://docs.taskcluster.net/reference/platform/queue/api-docs#createTask). pulse_actions has a function which performs this using the TaskCluster python client. 4. This action task appears as an "A" on a push. It uses the full-task-graph of the gecko decision task in the same push and is effectively sent this information (https://public-artifacts.taskcluster.net/ZWYYxOFvQd6XEXShxzUXDg/0/public/target-tasks.json) and uses optimize.py on these task labels. I plan on using https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/decision.py#75-79 right after this file is created. How does this sound? Questions :- 1. What's the best way to go about doing #1? Which are the relevant parts of the code here?ter/taskgraph/decision.py#59) seems to be needing me to run 2. After step #4, I'm done right? Or am I missing something?

Kalpesh Krishna [:martianwars]

Assignee

Updated

•

9 years ago

Blocks: 1254325

Kalpesh Krishna [:martianwars]

Assignee

Updated

•

9 years ago

Flags: needinfo?(dustin)

Flags: needinfo?(armenzg)

Dustin J. Mitchell [:dustin] (he/him)

Updated

•

9 years ago

Assignee: nobody → kalpeshk2011

Flags: needinfo?(dustin)

Armen [:armenzg]

Comment 1

•

9 years ago

Sounds about right. To help you focus forget for now about pulse_actions. Your main focus should be "I want to see a job show up on Treeherder which schedules other tasks". There are mainly two parts that need developed for this project. 1 - A task definition which once scheduled it will make a call to a mach command 2 - The mach command receives what to schedule and schedules it For a moment, forget about the task definition and focus on the mach command. You will have to modify mach_commands.py under taskcluster/. Once you give it some input parameters it should be able schedule all the jobs you specificied. If you ignore the task definition of the "action task" and pulse_actions you will have this: * From the gecko checkout you can make a |mach something| call; you will need to pass task labels or something to say what you want scheduled * The |mach call| should do all the work and schedule as many jobs as required * You should see the jobs you want (not the "A"/action symbol) show up on Treeherder Once you have the working mach command, you can start focusin on creating a task definition. At that point you would be able to modify this script [1] to help you schedule the "action task". The "action task" would show up on Treeherder and will run the mach command you developed in the previous step. [1] https://github.com/armenzg/TC_developer_scheduling_experiments/blob/master/schedule_linux64_task.py

Flags: needinfo?(armenzg)

Kalpesh Krishna [:martianwars]

Assignee

Comment 2

•

9 years ago

Attached patch action_tasks.patch (obsolete) — Details — Splinter Review

This is a very rough patch, and I just wanted a feedback on the general direction. It seems to be working for me locally, though I don't know the best way to test it. > ./mach taskgraph action-task --task-labels="TaskLabel==A-JT3sTNQMSvayONPiHU6A,TaskLabel==AWMGHageQh-L_nkBPsQOvA" --decision-id="TADoGLV5RUWQYivg1ReyTw" is a sample way to run this.

Attachment #8764057 - Flags: feedback?(dustin)

Attachment #8764057 - Flags: feedback?(armenzg)

Dustin J. Mitchell [:dustin] (he/him)

Comment 3

•

9 years ago

Comment on attachment 8764057 [details] [diff] [review] action_tasks.patch Review of attachment 8764057 [details] [diff] [review]: ----------------------------------------------------------------- This is definitely off to a nice start! ::: taskcluster/taskgraph/action.py @@ +40,5 @@ > + > + parameters = get_action_parameters(options) > + > + # write out the parameters used to generate this graph > + write_artifact('parameters.yml', dict(**parameters)) I don't think these parameters should be written out as `parameters.yml`, as they are not the same as the parameters to the decision task. @@ +82,5 @@ > + """ > + This code is used to generate the full task graph using the decision > + task id. The file is first downloaded and then converted to the > + TaskGraph format. > + """ I'd rather see this implemented as a class method of TaskGraph -- something like TaskGraph.from_full_task_graph() @@ +83,5 @@ > + This code is used to generate the full task graph using the decision > + task id. The file is first downloaded and then converted to the > + TaskGraph format. > + """ > + url = "https://public-artifacts.taskcluster.net/" + decision_task_id + "/0/public/full-task-graph.json" This should use the queue getLatestArtifact endpoint, rather than pointing to an S3 bucket that we might move. https://docs.taskcluster.net/reference/platform/queue/api-docs#getLatestArtifact

Attachment #8764057 - Flags: feedback?(dustin) → feedback+

Armen [:armenzg]

Comment 4

•

9 years ago

Comment on attachment 8764057 [details] [diff] [review] action_tasks.patch Review of attachment 8764057 [details] [diff] [review]: ----------------------------------------------------------------- It looks like you're in the right direction. For the final review please ask dustin directly as I really don't have much expertise in this area. I will, however, do a drive-by review. ::: taskcluster/taskgraph/action.py @@ +21,5 @@ > +from .optimize import optimize_task_graph > +from .types import ( > + Task, > + TaskGraph > +) Sort it alphabetically. @@ +25,5 @@ > +) > + > +logger = logging.getLogger(__name__) > + > +logger = logging.getLogger(__name__) Duplicated. @@ +48,5 @@ > + target_tasks = set(parameters['task_labels'].split(',')) > + > + target_task_set = TaskGraph( > + {l: all_tasks[l] for l in target_tasks}, > + Graph(target_tasks, set())) Could you please be explicit of the signature of this call? > TaskGraph(param1=value1, param2=value2)

Attachment #8764057 - Flags: feedback?(armenzg) → feedback+

Kalpesh Krishna [:martianwars]

Assignee

Comment 5

•

9 years ago

Attached patch action_tasks.patch (obsolete) — Details — Splinter Review

So here is the relevant push with the code in action. https://treeherder.mozilla.org/#/jobs?repo=try&revision=88f8d3edac627665a2dc6df5294ccd53cb53621c What I don't understand is that I am seeing docker images in my push, which shouldn't be happening.

Attachment #8764057 - Attachment is obsolete: true

Attachment #8765019 - Flags: feedback?(dustin)

Kalpesh Krishna [:martianwars]

Assignee

Comment 6

•

9 years ago

Obvious from the above link, but the "A" is the action task created. Here is the Task Inspector window for the green "A" https://tools.taskcluster.net/task-inspector/#dxZ73-meQ-WR4WIghphVpg/0

Dustin J. Mitchell [:dustin] (he/him)

Comment 7

•

9 years ago

Comment on attachment 8765019 [details] [diff] [review] action_tasks.patch Review of attachment 8765019 [details] [diff] [review]: ----------------------------------------------------------------- This is looking good! ::: taskcluster/taskgraph/action.py @@ +26,5 @@ > +) > + > +logger = logging.getLogger(__name__) > + > +logger = logging.getLogger(__name__) duplicated @@ +53,5 @@ > + target_task_graph = TaskGraph( > + {l: all_tasks[l] for l in target_graph.nodes}, > + target_graph) > + > + optimized_task_graph, label_to_taskid = optimize_task_graph(target_task_graph, set()) So, instead of taking the transitive closure and optimizing here, I'd suggest doing something different: * Download the decision task's task-graph.json * Iterate over the target tasks, following dependencies *until* you come to a task which is in task-graph.json, and when you do, keep its taskId but stop iterating You'll end up with a set of tasks that are required for the action task, but have not already been run by the decision task. You'll need to do some work to stitch together the dependencies (depend on the taskId from task-graph.json when you find a task there, otherwise create a new taskId for a task that you will need to create). @@ +73,5 @@ > + task = task_graph.tasks[label].task > + task["created"] = datetime.datetime.now().isoformat() + "Z" > + task["deadline"] = (datetime.datetime.now() + datetime.timedelta(days=1)).isoformat() + "Z" > + task["expires"] = (datetime.datetime.now() + datetime.timedelta(days=14)).isoformat() + "Z" > + for key, value in task["payload"]["artifacts"].iteritems(): Note that not all artifacts are a dictionary; sadly the generic worker uses a list. So this will need to be a little flexible. @@ +109,5 @@ > + for depname, dep in value['dependencies'].iteritems(): > + edges.add((key, dep, depname)) > + full_task_set = TaskGraph(all_tasks, Graph(set(all_tasks), set())) > + full_task_graph = TaskGraph(all_tasks, > + Graph(full_task_set.graph.nodes, edges)) why construct full_task_set here? Why not just full_task_graph = TaskGraph(all_tasks, Graph(set(all_tasks), edges)) ::: taskcluster/taskgraph/action.yml @@ +32,5 @@ > + GECKO_HEAD_REF: '{{head_ref}}' > + GECKO_HEAD_REV: '{{head_rev}}' > + > + cache: > + level-{{level}}-{{project}}-test-workspace: /home/worker/workspace This is not a good cache for a decision/action task; see .taskcluster.yml for a better choice. ::: taskcluster/taskgraph/decision.py @@ +68,5 @@ > > + action_params = get_action_parameters(parameters) > + with open('taskcluster/taskgraph/action.yml', 'r') as f: > + c = pystache.render(f.read(), action_params) > + result = yaml.load(c) This should probably go into a separate function. The mach command isn't necessarily run from the root of the source directory, so the relative path won't work -- there are GECKO constants defined in a few other taskgraph files that you can use to find the root of the GECKO source tree and make an absolute path. Also, for consistency, this should probably use the `taskgraph.util.templates` support to generate the action.yml file. In particular, this will let you make the time offsets without putting them in the get_action_parameters method. @@ +103,5 @@ > 'pushlog_id', > 'owner', > 'level', > 'target_tasks_method', > + 'revision_hash', why was this added??

Attachment #8765019 - Flags: feedback?(dustin) → feedback+

Kalpesh Krishna [:martianwars]

Assignee

Comment 8

•

9 years ago

Attached patch action_tasks.patch (obsolete) — Details — Splinter Review

So I was forgetting about the "kind" parameters in my previous patch and that's why docker images were showing up. I have rebased this patch and used the new DockerImageTask and LegacyTask classes and they seem to be working quite nicely. Here is a push https://treeherder.mozilla.org/#/jobs?repo=try&revision=f1c9018f8f95472ce23af875a782e737625e532b In this I scheduled the Android jobs and like you can see, there isn't any image loading. If you look at the Action Task logs, https://tools.taskcluster.net/task-inspector/#X8UxLQLXTPGEC00UnZ0fTQ/0, we can see a statement saying "optimizing build-docker-image-desktop-build" and "optimizing build-docker-image-desktop-test". The next part of this patch hopes to optimize Action Tasks further by using the jobs in the existing push. I propose a slightly different approach, but still on the lines of what Dustin said. 1. optimization is carried out like in this patch. This is so that we can use the LegacyTask optimizations in the future without changing optimize.py 2. I download the label-to-taskid file from the Gecko-Decision-Task and all the existing action tasks(not doing this for now) and I scan my task graph for dependencies which can be further optimized. 3. On finding such a dependency, I remove that key-value from the task json and update the taskid everywhere in the action task label-to-taskid file and task graph file. Does this sound okay? I'll push a patch with this approach tomorrow or something.

Attachment #8765019 - Attachment is obsolete: true

Flags: needinfo?(dustin)

Attachment #8766554 - Flags: feedback?(dustin)

Kalpesh Krishna [:martianwars]

Assignee

Comment 9

•

9 years ago

Attached patch action_tasks.patch (obsolete) — Details — Splinter Review

This is the final patch with optimizations. I have followed the approach outlined above. It works for two cases locally. I can't show you a try push until the trees open :(

Attachment #8766554 - Attachment is obsolete: true

Attachment #8766554 - Flags: feedback?(dustin)

Flags: needinfo?(dustin)

Attachment #8766772 - Flags: review?(dustin)

Dustin J. Mitchell [:dustin] (he/him)

Comment 10

•

9 years ago

Comment on attachment 8766772 [details] [diff] [review] action_tasks.patch Review of attachment 8766772 [details] [diff] [review]: ----------------------------------------------------------------- So, first, let me say that this looks good - a nice, solid implementation. However, it seems to do a lot of work to avoid refactoring existing code. I think that needs to happen! First, the bit I describe below about `from_json` would be a good refactor. Ideally that would be done in a separate hg commit that can be reviewed independently. Second, we've discussed several times the wisdom of performing optimization on the new target task graph before eliminating tasks that have already been created by the decision task. I suspect that this is because optimization is curently the only way to get from a label-based graph to one with taskIds in it. That seemed like a reasonable design at the time, but for your uses it wasn't. So I think a refactor is called for here, too. A key observarion that you've already made is this: the process of eliminating tasks already created by the decision task is *very* similar to optimization. So I would suggest refactoring the optimization process so that it can optimize on an arbitrary basis. For decision tasks, that basis would be the tasks' optimize() method; for action tasks that basis would be what tasks have already been performed in the decision task. All of this code is still young and will exist and be hacked on for years, so we need to make sure it's the highest quality possible! ::: taskcluster/taskgraph/action.py @@ +26,5 @@ > + > + > +def taskgraph_action(options): > + """ > + Run the action task. This function implements `mach taskgraph action`, `mach taskgraph action-task`, actually @@ +35,5 @@ > + """ > + > + parameters = get_action_parameters(options) > + > + # write out the full graph for reference s/write out/read in/ @@ +67,5 @@ > + for label in task_graph.graph.nodes: > + task = task_graph.tasks[label].task > + task["created"] = current_json_time() > + task["deadline"] = json_time_from_now("1 day") > + task["expires"] = json_time_from_now("14 day") The deadline and expires should be adjusted relative to the created timestamp. That is, if you changed the created timestamp by 17 hours and 20 minutes, you should also add 17 hours and 20 minutes to all of the other timestamps. They are not always set to 1 and 14 days' offset. @@ +107,5 @@ > + del task_graph[current_task_id] > + # Converting it to a string and replacing all references > + json_graph = json.dumps(task_graph) > + json_graph = json_graph.replace(current_task_id, optimum_task_id) > + task_graph = json.loads(json_graph) I know I did this in legacy.py, but that was because I'm going to delete that file soon. For code we expect to last, we should be doing things more reliably. @@ +115,5 @@ > + .format(label, optimum_task_id)) > + # Relax all the dependencies before creating a TaskGraph > + for key, value in task_graph.iteritems(): > + dep = value["dependencies"] > + task_graph[key]["dependencies"] = {k: v for k, v in dep.items() if v in task_graph} This strips all dependencies which are not within the set of tasks that are about to be created. This is too many -- if there are tasks created by the original decision task that are not completed, this may allow the new tasks to run before the old are complete, causing errors. ::: taskcluster/taskgraph/taskgraph.py @@ +61,5 @@ > def __eq__(self, other): > return self.tasks == other.tasks and self.graph == other.graph > + > + @classmethod > + def from_artifact(cls, decision_task_id, artifact_path): I like adding a method to re-create a task graph from a JSON object. However, I don't think this class should be doing fetches from artifacts. Instead, the action-task code should be performing that fetch, and passing the result to a `from_json` method similar to `from_task_json`. @@ +81,5 @@ > + tasks = {} > + edges = set() > + for key, value in tasks_dict.iteritems(): > + if value['attributes']['kind'] == 'legacy': > + tasks[key] = LegacyTask(kind='legacy', We may have an arbitrary number of kind classes like this -- we won't be able to special-case them. One possible fix to this is to use the `kind` attribute to look up the relevant kind.yml file and look at its implementation key. However, even then, the class constructors may take different numbers of arguments, so you wouldn't be able to call the constructors directly. Another approach is to just use the base class (suitably modified to no longer be an abstract base class; or define a simple subclass). You really only need these Task objects to contain a task definition, label, dependencies, and attributes. In particular, the optimize method won't be needed, and get_dependencies can just return a static value. You could probably implement this with a `from_json` class method on `taskgraph.kind.base.Task`. Actually, you can probably combine the two approaches: each kind class implements its own `from_json` method, and the taskgraph code does the kind.yml / implementation lookup I described above. Then the resulting task graph will be functionally identical to the original. @@ +96,5 @@ > + result = index_path_regex.search(route) > + if result is None: > + continue > + index_paths.append(result.group(1)) > + index_paths.append(result.group(1).replace(result.group(2), 'mozilla-central')) Yikes -- the taskgraph code definitely shouldn't have stuff this specific to the docker kind in it!

Attachment #8766772 - Flags: review?(dustin)

Dustin J. Mitchell [:dustin] (he/him)

Comment 11

•

9 years ago

One more proposed refactor: as you've noted, we're currently putting a bunch of timestamps into the tasks. This definitely causes issues with action tasks, but if the decision task were to take longer than 15 minutes, we'd have the same problem there! Also, it's difficult for developers to compare JSON files when they contain different datestamps. One fix I've been considering is to replace all of those timestamps with relative times. Then create_tasks would apply those relative times to the current time just before calling createTask. So we would have something like: { created: "datestamp offset=0 seconds", expires: "datestamp offset=14 days", ... } The tricky bit here is to figure out how to represent this in a way that create_tasks can reliably replace, since it doesn't know the location of all of the timestamps in a task definition (as you've seen, for artifacts the location can move around a bit). What I've written above, "datestamp offset=NN UNIT", would work, but what if someone uses a string like that somewhere else in a task, without intending it to be a datestamp? Maybe we should use something similar to {'task-reference': ..}, perhaps {'relative-datestamp': "14 days"}? That would also be a little faster to search for, since we can just look for object.keys() == ['relative-datestamp'] instead of applying a regex to every string. I'd love to see this refactor land for many reasons. In fact, it might be good to break it out into another bug.

Kalpesh Krishna [:martianwars]

Assignee

Updated

•

9 years ago

Depends on: 1284005

Kalpesh Krishna [:martianwars]

Assignee

Comment 12

•

9 years ago

> This strips all dependencies which are not within the set of tasks that are about to be created. > This is too many -- if there are tasks created by the original decision task that are not completed, > this may allow the new tasks to run before the old are complete, causing errors. I don't quite agree with this. In my latest push, https://treeherder.mozilla.org/#/jobs?repo=try&revision=db11e6d84418f14af18d072efa96613a675831c7 I had this situation and it worked since the dependencies used were the array inside ["task"]["dependencies"], which is still populated. The dictionary inside ["dependencies"] seems like it's built from the `edges` attribute. In any case, I've refactored this to a much simpler implementation which I'll show you in my next patch. :) Other than that, I more or less like the approach you outlined and excited to complete it! I've broken off into another bug to fix the timestamps issue.

Dustin J. Mitchell [:dustin] (he/him)

Comment 13

•

9 years ago

You are absolutely right about the dependencies. Thanks for correcting me! The timestamps fix is something :mshal has been looking for, too, so I'm excited to see that.

Kalpesh Krishna [:martianwars]

Assignee

Comment 14

•

9 years ago

Attached patch action_tasks.patch (obsolete) — Details — Splinter Review

This is the corrected patch. I've not accounted for timestamps since I'm waiting for the other bug to merge. Here is the push https://treeherder.mozilla.org/#/jobs?repo=try&revision=42fed0f75f6ea6e6ea4cecae6e872e5dca42555e

Attachment #8766772 - Attachment is obsolete: true

Attachment #8767481 - Flags: review?(dustin)

Dustin J. Mitchell [:dustin] (he/him)

Comment 15

•

9 years ago

If you can get mozreview working, one of the big advantages is that reviewers can see the "interdiff" - what has changed from one patch to the next. That would be very helpful in this sort of bug!

Kalpesh Krishna [:martianwars]

Assignee

Comment 16

•

9 years ago

Attached file Bug 1281062 - Create Action Tasks to schedule new jobs. — Details

Review commit: https://reviewboard.mozilla.org/r/61968/diff/#index_header See other reviews: https://reviewboard.mozilla.org/r/61968/

Attachment #8767503 - Flags: review?(dustin)

Armen [:armenzg]

Updated

•

9 years ago

Blocks: 1284911

Armen [:armenzg]

Updated

•

9 years ago

No longer blocks: 1254325

Dustin J. Mitchell [:dustin] (he/him)

Comment 17

•

9 years ago

Comment on attachment 8767503 [details] Bug 1281062 - Create Action Tasks to schedule new jobs. https://reviewboard.mozilla.org/r/61968/#review59404 This is really close, and I like your design! I'd like to see a few things here: * Make the to_json/from_json pair nice and robust, at least adding tests that exercise a round-trip for each Task subclass. This would probably make a good separate commit from the rest of this work, but it's OK to leave them in the same patch if that's a lot of trouble. * Make the optimization process use the decision-task's existing tasks *first*, before trying the more ordinary optimization process ::: taskcluster/taskgraph/action.py:49 (Diff revision 1) > + target_graph = full_task_graph.graph.transitive_closure(target_tasks) > + target_task_graph = TaskGraph( > + {l: all_tasks[l] for l in target_graph.nodes}, > + target_graph) > + > + # We don't want to optimize target tasks since they have been requested by user This is true, but I think the comment is in the wrong place? ::: taskcluster/taskgraph/kind/docker_image.py:157 (Diff revision 1) > + index_path_regex = re.compile(INDEX_REGEX) > + result = index_path_regex.search(route) > + if result is None: > + continue > + index_paths.append(result.group(1)) > + index_paths.append(result.group(1).replace(result.group(2), 'mozilla-central')) Why not just take all of the routes as index_paths? ::: taskcluster/taskgraph/optimize.py:17 (Diff revision 1) > > logger = logging.getLogger(__name__) > TASK_REFERENCE_PATTERN = re.compile('<([^>]+)>') > > > -def optimize_task_graph(target_task_graph, do_not_optimize): > +def optimize_task_graph(target_task_graph, do_not_optimize, decision_task_jobs=None): As far as optimization is concerned, I think this is a dictionary of existing tasks, keyed by label. So maybe `existing_tasks`? That might make it less purpose-specific. The fact that these come from the decision task is not the important characteristic at this point. Let's steer clear of the word "job", as it means something different from "task". ::: taskcluster/taskgraph/optimize.py:98 (Diff revision 1) > optimized = False > # otherwise, examine the task itself (which may be an expensive operation) > else: > optimized, replacement_task_id = task.optimize() > - > + if optimized is False and decision_task_jobs is not None: > + # The task still couldn't be optimized, let's use decision_task_jobs Why try to optimize the task first, and only then fall back to `decision_task_jobs`? At best, optimizing is a much slower way to find the same taskId that you already have in `decision_task_jobs`, right? ::: taskcluster/taskgraph/taskgraph.py:63 (Diff revision 1) > > def __eq__(self, other): > return self.tasks == other.tasks and self.graph == other.graph > + > + @classmethod > + def from_json(cls, tasks_dict): It would be great to have some tests for this method to ensure that it can accurately re-create a taskgraph (that is, `tg == TaskGraph.from_json(tg.to_json())`). ::: taskcluster/taskgraph/taskgraph.py:73 (Diff revision 1) > + tasks = {} > + edges = set() > + for key, value in tasks_dict.iteritems(): > + # We try to find the task implementation using the kind > + kind = value['attributes']['kind'] > + file_path = os.path.join(GECKO, "taskcluster/ci/" + kind + "/kind.yml") This `taskcluster/ci` should not be hardcoded here. It's an option (`--root`) to the decision task, and that should probably be reflected into the action task, too, and passed as an argument to `from_json`. Actually, now that I think of it, you could save yourself this trouble by changing the `to_json` support to store the kind implementation path in the JSON, then just use that directly (rather than looking it up via the kind name). ::: taskcluster/taskgraph/taskgraph.py:79 (Diff revision 1) > + with open(file_path, 'r') as f: > + implementation = yaml.load(f)['implementation'] > + module, class_name = implementation.split(':') > + # Loading the module and creating a Task from a dictionary > + m = importlib.import_module(module) > + task_kind = getattr(m, class_name) This isn't quite enough to load a class reliably. I actually factored this out in a way that might be useful for you, in https://reviewboard.mozilla.org/r/61882/diff/2#index_header ::: taskcluster/taskgraph/taskgraph.py:84 (Diff revision 1) > + task_kind = getattr(m, class_name) > + tasks[key] = task_kind.from_json(value) > + if 'task_id' in value: > + tasks[key].task_id = value['task_id'] > + for depname, dep in value['dependencies'].iteritems(): > + # Removing edges which are not completely part of this task graph Does this happen in one of the graphs published by the decision task?

Attachment #8767503 - Flags: review?(dustin)

Dustin J. Mitchell [:dustin] (he/him)

Comment 18

•

9 years ago

Comment on attachment 8767481 [details] [diff] [review] action_tasks.patch (replaced by mozreview request)

Attachment #8767481 - Attachment is obsolete: true

Attachment #8767481 - Flags: review?(dustin)

Kalpesh Krishna [:martianwars]

Assignee

Comment 19

•

9 years ago

@Dustin, I'm not so sure about two of the comments here. > Why not just take all of the routes as index_paths? I'm not so sure here. I just followed https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/kind/docker_image.py#93 since that's what garndt and I found appropriate. If I just take all routes, I won't get `mozilla-central` in the index routes. > This `taskcluster/ci` should not be hardcoded here. Is it okay to just follow this? https://dxr.mozilla.org/mozilla-central/source/taskcluster/mach_commands.py#107 Or do you want me to keep the path in JSON?

Flags: needinfo?(dustin)

Dustin J. Mitchell [:dustin] (he/him)

Comment 20

•

9 years ago

(In reply to Kalpesh Krishna [:martianwars] from comment #19) > @Dustin, > I'm not so sure about two of the comments here. > > Why not just take all of the routes as index_paths? > I'm not so sure here. I just followed > https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/kind/ > docker_image.py#93 since that's what garndt and I found appropriate. If I > just take all routes, I won't get `mozilla-central` in the index routes. Ah, good point -- the task's routes are only for the tree it's building, whereas index_routes needs to be all the routes where the task might be found during optimization. So this is OK. > > This `taskcluster/ci` should not be hardcoded here. > Is it okay to just follow this? > https://dxr.mozilla.org/mozilla-central/source/taskcluster/mach_commands. > py#107 > Or do you want me to keep the path in JSON? I prefer the solution of including the kind implementation in the JSON.

Flags: needinfo?(dustin)

Kalpesh Krishna [:martianwars]

Assignee

Comment 21

•

9 years ago

Dustin, So this is a JSON file with the kind path in it https://public-artifacts.taskcluster.net/MPazfbWNRlK9oC_gClMgPg/0/public/task-graph.json. I'm not so sure I like this due to the `/workspace/gecko` which precedes the path (since I suppose this may not be always true). Just like in the old code, I forced to find the kind, and append `taskcluster/ci` to the GECKO path and add `kind.yml`. What do you think? I think adding a --root option sounds better, UNLESS some other code wants to use this kind path.

Flags: needinfo?(dustin)

Kalpesh Krishna [:martianwars]

Assignee

Comment 22

•

9 years ago

Comment on attachment 8767503 [details] Bug 1281062 - Create Action Tasks to schedule new jobs. Review request updated; see interdiff: https://reviewboard.mozilla.org/r/61968/diff/1-2/

Attachment #8767503 - Attachment description: Bug 1281062 - Create Action Tasks to schedule new jobs. → Bug 1281062 - Adding a from_json function to TaskGraph and each Task subclass.

Attachment #8767503 - Flags: review?(dustin)

Kalpesh Krishna [:martianwars]

Assignee

Comment 23

•

9 years ago

Comment on attachment 8767503 [details] Bug 1281062 - Create Action Tasks to schedule new jobs. Review request updated; see interdiff: https://reviewboard.mozilla.org/r/61968/diff/2-3/

Attachment #8767503 - Attachment description: Bug 1281062 - Adding a from_json function to TaskGraph and each Task subclass. → Bug 1281062 - Create Action Tasks to schedule new jobs.

Kalpesh Krishna [:martianwars]

Assignee

Updated

•

9 years ago

Depends on: 1285755

Kalpesh Krishna [:martianwars]

Assignee

Comment 24

•

9 years ago

Phew! Unfortunately it seems like I'm not pushing to MozReview correctly. When I pushed the patch with just the from_json changes, it's didn't appear as a different changeset on MozReview, and created a diff with my old action_tasks.patch. As a result, I've filed a new bug Bug 1285755, with a MozReview request of all the from_json changes. This patch requires https://reviewboard.mozilla.org/r/61882/diff/2#index_header to be merged for it to work. In this bug, I've updated the action_tasks.patch and pushed the latest to MozReview. You might see two diff in MozReview, but one is to be ignored for the above reason. ^

Kalpesh Krishna [:martianwars]

Assignee

Comment 25

•

9 years ago

https://reviewboard.mozilla.org/r/61968/#review60182 ::: taskcluster/mach_commands.py:226 (Diff revisions 1 - 3) > def show_taskgraph_labels(self, taskgraph): > for label in taskgraph.graph.visit_postorder(): > print(label) > > def show_taskgraph_json(self, taskgraph): > - # JSON output is a sequence of JSON objects, rather than a single object, so > + print(json.dumps(taskgraph.to_json(), I suppose this has come due to a rebasing (this is on a 1-3 diff) https://dxr.mozilla.org/mozilla-central/source/taskcluster/mach_commands.py#202

Kalpesh Krishna [:martianwars]

Assignee

Comment 26

•

9 years ago

Here is a push with both patches https://treeherder.mozilla.org/#/jobs?repo=try&revision=0429e27c5df5ddb179a5e86cd982f9f0f096a605

Dustin J. Mitchell [:dustin] (he/him)

Comment 27

•

9 years ago

(In reply to Kalpesh Krishna [:martianwars] from comment #21) > So this is a JSON file with the kind path in it > https://public-artifacts.taskcluster.net/MPazfbWNRlK9oC_gClMgPg/0/public/ > task-graph.json. I'm not so sure I like this due to the `/workspace/gecko` > which precedes the path (since I suppose this may not be always true). Just Good instinct! What I meant was the Python path to the implementation, so something like "kind_implementation": "taskgraph.kind.test:TestTask",

Flags: needinfo?(dustin)

Kalpesh Krishna [:martianwars]

Assignee

Comment 28

•

9 years ago

Comment on attachment 8767503 [details] Bug 1281062 - Create Action Tasks to schedule new jobs. Review request updated; see interdiff: https://reviewboard.mozilla.org/r/61968/diff/3-4/

Kalpesh Krishna [:martianwars]

Assignee

Comment 29

•

9 years ago

Comment on attachment 8767503 [details] Bug 1281062 - Create Action Tasks to schedule new jobs. Review request updated; see interdiff: https://reviewboard.mozilla.org/r/61968/diff/4-5/

Dustin J. Mitchell [:dustin] (he/him)

Updated

•

9 years ago

Attachment #8767503 - Flags: review?(dustin) → review+

Dustin J. Mitchell [:dustin] (he/him)

Comment 30

•

9 years ago

Comment on attachment 8767503 [details] Bug 1281062 - Create Action Tasks to schedule new jobs. https://reviewboard.mozilla.org/r/61968/#review60322 This looks great! It occurs to me that we may want to add an action name to the task, once we support more than one action. But we can add that later.

Kalpesh Krishna [:martianwars]

Assignee

Comment 31

•

9 years ago

Comment on attachment 8767503 [details] Bug 1281062 - Create Action Tasks to schedule new jobs. Review request updated; see interdiff: https://reviewboard.mozilla.org/r/61968/diff/5-6/

Kalpesh Krishna [:martianwars]

Assignee

Comment 32

•

9 years ago

Here is the push on try https://treeherder.mozilla.org/#/jobs?repo=try&revision=bd1b478964d5bdc16f96b49aed3c0590937cf8df

Dustin J. Mitchell [:dustin] (he/him)

Comment 33

•

9 years ago

https://reviewboard.mozilla.org/r/61968/#review61030

Kalpesh Krishna [:martianwars]

Assignee

Updated

•

9 years ago

Keywords: checkin-needed

Comment 34

•

9 years ago

Pushed by kwierso@gmail.com: https://hg.mozilla.org/integration/autoland/rev/d223b3cdee66 Create Action Tasks to schedule new jobs. r=dustin

Keywords: checkin-needed

Carsten Book [:Tomcat]

Comment 35

•

9 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/d223b3cdee66

Status: NEW → RESOLVED

Closed: 9 years ago

Resolution: --- → FIXED

Kalpesh Krishna [:martianwars]

Assignee

Updated

•

9 years ago

Blocks: 1286813

Updated

•

7 years ago

Product: TaskCluster → Firefox Build System

You need to log in before you can comment on or make changes to this bug.