Closed Bug 1274611 Opened 3 years ago Closed 3 years ago

Implement task-graph optimization

Categories

(Firefox Build System :: Task Configuration, task)

task
Not set

Tracking

(Not tracked)

RESOLVED FIXED
mozilla50

People

(Reporter: dustin, Assigned: dustin)

References

Details

Attachments

(5 files)

My first pass at this left a lot to be desired.  It was focused on finding tasks in the index based on directory hashes, but that's probably a minority of the use-cases for optimization.

We want to be able to optimize some tasks away completely, based only on information in the push.  For example, if the push contains no changes to the set of files contributing to a test, then that test does not need to be run.  These will generally be "leaf" tasks, meaning that nothing depends on them.

Non-leaf tasks will generally need to be replaced with something else.  For example, docker image build tasks need to be replaced with the existing task that contains the built image.

However, some non-leaf tasks will optimize away all their downstream tasks as well.  For example, if it is determined that a build task is not required, then neither are any of the downstream test tasks.
* Implement & document optimization (although legacy kind doesn't do much of it)
 * Introduce `optimize_target_tasks` parameter to control whether tasks in the
   target set can be optimized (no for try, yes for most other branches)
 * Refactor to include resolved taskIds in the optimized task graph
 * Include a `label-to-taskid.json` artifact.
 * Introduce {'task-reference': '... <dependency-name> ...'} for referring to
   parent tasks' taskId.

Review commit: https://reviewboard.mozilla.org/r/55956/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/55956/
Attachment #8757518 - Flags: review?(jopsen)
Attachment #8757518 - Flags: review?(gps)
Attachment #8757519 - Flags: review?(gps)
Attachment #8757520 - Flags: review?(garndt)
OK, this is another big code-drop.  I tried to compromise between slathering another coat of hackiness on the legacy kind (which might have been a smaller patch) and the risk of re-implementing everything (which would end up much larger!).  Anyway, sorry about that.

jonasfj: I'd like your feedback on the overall approach to graph generation and optimization here, and in particular on the more algorithmic aspects.  I think some of my graph operations are O(n**2), and I call them a lot.  If you see some low-hanging fruit here that might avert multi-hour decision tasks down the road, I'm all ears!  I'd also like to see what you think of how I'm interacting with the queue and index (some of which is in the optimization bits of this patch, and some of which is in taskcluster/taskgraph/create.py from previous patches).

gps: I'm interested in your feedback on this approach to optimization and how it might interact with the various kinds of efficiency improvements you'd like to see in-tree.  In particular, I know you're keen to look at files that have changed in a push, and that is certainly an option here but only for tasks which have no other tasks depending on them.  In any other case, we need a replacement taskId for downstream tasks to depend on, so hashing files and searching the index is the only option I see.  I'm quite confident there will be a few rounds of review here, so let's steer clear of detailed code review for now...

garndt: I flagged you for review on the docker-image kind, in part because you were the original author of that functionality.  But of course, to understand how that works, you'll need to understand the other stuff too.

There are a number of TODO items in here, or in some cases I omitted that (I really hate seeing "# TODO" in committed code), but made notes on my gist (https://gist.github.com/djmitche/9eb2ecf77020d7e86085). So if you see places to comment "why don't you just..", well, I'll probably just put them on the gist.

I'm away at PyCon next week, so I'd love to get review early in the week and re-draft in my spare time around the conference, hoping to land the whole thing Monday the 6th.  We can get into line-by-line on the second round.
Flags: needinfo?(jopsen)
Flags: needinfo?(gps)
Flags: needinfo?(garndt)
(In reply to Dustin J. Mitchell [:dustin] from comment #2)
>  * Introduce `optimize_target_tasks` parameter to control whether tasks in
> the
>    target set can be optimized (no for try, yes for most other branches)

Is the branch the only criteria for flipping this bit?
We're considering having the nightlies run through the decision task, and we may want to optimize CI builds but not nightlies on a given branch.  We don't necessarily need to add that logic now; just mentioning that we may need it in the future.
That would need to be added when support for periodic / nightly builds was added to the decision task.  Mihai is already thinking about some of that in order to implement periodic PGO builds.
https://reviewboard.mozilla.org/r/55956/#review53296

::: taskcluster/taskgraph/decision.py:36
(Diff revision 1)
> +        'optimize_target_tasks': False,
>      },
>  
>      # the default parameters are used for projects that do not match above.
>      'default': {
> -        'target_tasks_method': 'all_tasks',
> +        'target_tasks_method': 'all_builds_and_tests',

fyi, I think the default target_tasks_method in mach_command will need to be updated.
Sorry, please ignore comment 9
Comment on attachment 8757520 [details]
Bug 1274611: implement docker image builds as a distinct kind;

https://reviewboard.mozilla.org/r/55960/#review53188

I'm still checking this out, but am having difficulties running the local task graph commands.  I get various errors that a key is missing (related to image building).

::: taskcluster/ci/docker-image/kind.yml:18
(Diff revision 1)
> +images:
> +  - desktop-test
> +  - desktop-build
> +  - builder
> +  - lint
> +  - android-gradle-lint

shouldn't tester be part of this white list?

::: taskcluster/taskgraph/kind/docker_image.py:76
(Diff revision 1)
> +
> +            image_artifact_path = "public/decision_task/image_contexts/{}/context.tar.gz".format(image_name)
> +            if os.environ.get('TASK_ID'):
> +                destination = os.environ['HOME'] + "/artifacts/decision_task/image_contexts/{}/context.tar.gz".format(image_name)
> +                image_parameters['context_url'] = ARTIFACT_URL.format(os.environ['TASK_ID'], image_artifact_path)
> +                self.create_context_tar(context_path, destination, name)

I think 'name' should be 'image_name' here.

::: testing/taskcluster/tasks/test.yml:19
(Diff revision 1)
>  
>    scopes:
>      - 'docker-worker:feature:allowPtrace'
>  
>    payload:
> -    image: '{{#docker_image}}tester{{/docker_image}}'
> +    image:

nice, didn't realize this wasn't moved over.
Attachment #8757520 - Flags: review?(garndt)
Attachment #8757518 - Flags: review?(gps)
Comment on attachment 8757518 [details]
Bug 1274611: Implement task-graph optimization;

https://reviewboard.mozilla.org/r/55956/#review53326

I only skimmed the tests, so not much feedback on them.

I like most of the approach here. I feel like I'll need to read more implementations of optimization functions to figure out if this will work for us. But I like how this is generic and you can write your own optimization functions depending on the task type. Lots of future flexibility here.

The part that strikes me as a bit odd is that `task-reference`s in the YAML are parameterized URLs with `queue.taskcluster.net` in them. That feels a bit odd. I would expect some kind of identifier or path fragment, not a full blown, fully qualified URL. Could we strip off the `https://queue.taskcluster.net/v1/task/` prefix?

::: taskcluster/taskgraph/kind/base.py:57
(Diff revision 1)
> +        error to return no taskId for a task on which other tasks depend.
>  
> -        The returned task definition will be modified before being passed to
> +        The default never optimizes.
> -        `queue.createTask`.
>          """
> +        return (False, None)

Nit: don't need parens on return statement when returning tuples.

::: taskcluster/taskgraph/kind/legacy.py:440
(Diff revision 1)
>          taskdict = self.tasks_by_label[task.label]
>          return [(label, label) for label in taskdict.get('requires', [])]
>  
> -    def get_task_optimization_key(self, task, taskgraph):
> -        pass
> +    def optimize_task(self, task, taskgraph):
> +        # no optimization for the moment
> +        return (False, None)

Nit: drop parens

::: taskcluster/taskgraph/optimize.py:14
(Diff revision 1)
> +from .graph import Graph
> +from .types import TaskGraph
> +from slugid import nice as slugid
> +
> +logger = logging.getLogger(__name__)
> +TASK_REFERENCE_PATTERN = re.compile('<([^>]*)>')

Should this be a `+` instead of `*` to avoid matching on `<>`?

::: taskcluster/taskgraph/optimize.py:45
(Diff revision 1)
> +            # handle escaping '<'
> +            if key == '<':
> +                return key
> +            raise KeyError("task '{}' has no dependency with label '{}'".format(label, key))
> +
> +    def iter(val):

iter() is a built-in function. Please use a different name.

::: taskcluster/taskgraph/optimize.py:50
(Diff revision 1)
> +    def iter(val):
> +        if isinstance(val, list):
> +            return [iter(v) for v in val]
> +        elif isinstance(val, dict):
> +            if val.keys() == ['task-reference']:
> +                return TASK_REFERENCE_PATTERN.sub(repl, val['task-reference'])

TIL you can pass a function to re.sub()!

::: taskcluster/taskgraph/optimize.py:58
(Diff revision 1)
> +def annotate_task_graph(target_task_graph, do_not_optimize, named_links_dict, label_to_taskid):
> +    """
> +    Annotate each task in the graph with .optimized (boolean) and .task_id
> +    (possibly None), following the rules for optimization and calling the task
> +    kinds' `optimize_task` method.
> +
> +    As a side effect, label_to_taskid is updated with labels for all optimized
> +    tasks that are replaced with existing tasks.
> +    """

I'm slightly worried about updating task instances in place. Once this code runs, task IDs will be changed for anyone that has a reference to the unoptimized graph. That feels wrong.

::: taskcluster/taskgraph/optimize.py:76
(Diff revision 1)
> +        for t in dependencies:
> +            if t.optimized and not t.task_id:
> +                raise Exception("task {} was optimized away, but {} depends on it".format(t.label, label))

Do you need to do this check again at the end of the function? I think the answer is yes because `optimized` can get set later.

::: taskcluster/taskgraph/optimize.py:82
(Diff revision 1)
> +            if t.optimized and not t.task_id:
> +                raise Exception("task {} was optimized away, but {} depends on it".format(t.label, label))
> +
> +        # if this task is blacklisted, don't even consider optimizing
> +        replacement_task_id = None
> +        if label in do_not_optimize:

Is `do_not_optimize` a list or a set? If a list, there may be performance issues here as list membership testing can quickly be a performance suck.

::: taskcluster/taskgraph/test/test_generator.py:37
(Diff revision 1)
>              return [('t-{}'.format(i - 1), 'prev')]
>          else:
>              return []
>  
> +    def optimize_task(self, task, dependencies):
> +        return (False, None)

Nit: drop parens

::: taskcluster/taskgraph/test/test_optimize.py:24
(Diff revision 1)
> +    def do(self, input, output):
> +        taskid_for_edge_name = {'edge%d' % n: 'tid%d' % n for n in range(1, 4)}
> +        self.assertEqual(resolve_task_references('subject', input, taskid_for_edge_name), output)
> +
> +    def test_in_list(self):
> +        "resolve_task_references resolves task references in a list"

What does this compile as? A bareword string? Shouldn't this be a docstring?
Comment on attachment 8757519 [details]
Bug 1274611: handle empty error messages properly;

https://reviewboard.mozilla.org/r/55958/#review53340
Attachment #8757519 - Flags: review?(gps) → review+
Regarding optimization graphs, I would go with a file-based approach because that is really the only option available to us. You may want to revisit bug 1249091 to see what I had in mind several months ago.

You'll almost certainly want to hash files somehow. However, hashes aren't enough because backouts can restore original hashes and possibly not retrigger things we want them to retrigger. E.g. if you have commits X -> Y -> Z and Z is a backout of X, do hashes alone retrigger everything we want? It depends. Gets complicated quickly.

We could potentially be hashing a lot of files. You'll want to use SHA-1 or lower because SHA-256 and others can be noticeably slower. This isn't crypto, so we shouldn't be worried about cryptographic strength.

Let me think a bit more about things. This might be the kind of thing where we have a Vidyo meeting and think aloud...
Flags: needinfo?(gps)
https://reviewboard.mozilla.org/r/55960/#review53410

::: taskcluster/ci/docker-image/kind.yml:18
(Diff revision 1)
> +images:
> +  - desktop-test
> +  - desktop-build
> +  - builder
> +  - lint
> +  - android-gradle-lint

android-gradle-build is also missing
https://reviewboard.mozilla.org/r/55956/#review53326

We do refer to tasks in other ways and other forms of URLs.  For example:

```
      taskId:
        task-reference: "<docker-image>"
```
The alternative is to support something like `{"artifact-reference": "$dependency-label", "artifact": "public/foo/bar.tgz"}`, but I worry that doing so invites invention of lots of new substitutions that also aren't strictly necessary (`image-reference`? `installer-and-tests-zip-reference`?), which ends up decreasing readability.  I like the simplicity of this model, and once we migrate away from using these godawful YAML files these references will generally be created by Python code so we can suitably abstract away the queue URL there.  This has the advantage of being the minimal solution to the problem of needing to embed data (task IDs) which isn't yet available at the time the data structure is generated.

> I'm slightly worried about updating task instances in place. Once this code runs, task IDs will be changed for anyone that has a reference to the unoptimized graph. That feels wrong.

Before this, there were no task id's (just labels).  So this is modifying instances in place, but I don't think it's particularly harmful as it's just adding more information.  Unless I misunderstand your comment.

> What does this compile as? A bareword string? Shouldn't this be a docstring?

I don't know what a bareword string is.  Are you referring to the use of single quotes?  There's nothing that says that docstrings have to be triple-quoted, and using single quotes saves 4 characters on a one-liner.
https://reviewboard.mozilla.org/r/55960/#review53188

> shouldn't tester be part of this white list?

As you noted below, tester isn't generated in-tree.  I think there's some private data involved, but I only glanced at it and left it alone.
(In reply to Gregory Szorc [:gps] from comment #14)
> Regarding optimization graphs, I would go with a file-based approach because
> that is really the only option available to us. You may want to revisit bug
> 1249091 to see what I had in mind several months ago.

Yes, I think https://bugzilla.mozilla.org/show_bug.cgi?id=1249091#c7 was a good model.

> You'll almost certainly want to hash files somehow. However, hashes aren't
> enough because backouts can restore original hashes and possibly not
> retrigger things we want them to retrigger. E.g. if you have commits X -> Y
> -> Z and Z is a backout of X, do hashes alone retrigger everything we want?
> It depends. Gets complicated quickly.

Is there a case where, after a backout, we would definitely want to re-generate things that had already been generated from the same inputs?  Are those cases common enough that a re-trigger wouldn't work?

> Let me think a bit more about things. This might be the kind of thing where
> we have a Vidyo meeting and think aloud...

Yeah, that's a good idea.  I can talk this Thurs/Fri from PyCon if that's useful.  If we can get this bug more-or-less landed, then maybe we can also discuss further in London.
Flags: needinfo?(garndt)
https://reviewboard.mozilla.org/r/55956/#review53326

> Do you need to do this check again at the end of the function? I think the answer is yes because `optimized` can get set later.

No, because we visit nodes in postorder, [t.optimized for t in dependencies] is fixed.

> Is `do_not_optimize` a list or a set? If a list, there may be performance issues here as list membership testing can quickly be a performance suck.

It's either a list or a dictionary.
https://reviewboard.mozilla.org/r/55956/#review53326

> It's either a list or a dictionary.

Sorry, it's either a *set* or a dictionary.  Either one is a hash table lookup.
Comment on attachment 8757518 [details]
Bug 1274611: Implement task-graph optimization;

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/55956/diff/1-2/
Attachment #8757518 - Flags: review?(gps)
Attachment #8757520 - Flags: review?(garndt)
Comment on attachment 8757519 [details]
Bug 1274611: handle empty error messages properly;

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/55958/diff/1-2/
Comment on attachment 8757520 [details]
Bug 1274611: implement docker image builds as a distinct kind;

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/55960/diff/1-2/
(In reply to Dustin J. Mitchell [:dustin] from comment #17)
> https://reviewboard.mozilla.org/r/55960/#review53188
> 
> > shouldn't tester be part of this white list?
> 
> As you noted below, tester isn't generated in-tree.  I think there's some
> private data involved, but I only glanced at it and left it alone.

it wasn't generated in-tree prior to your changes, but now it looks to be using the task-reference format the other in-tree images are using (at least looking at the changes to test.yml). phone-tester is the only private image with secret bits that I'm aware of.
https://reviewboard.mozilla.org/r/55956/#review53570

::: taskcluster/taskgraph/generator.py:183
(Diff revision 2)
> -        # optimization is not yet implemented
> -
>          logger.info("Generating optimized task graph")
> -        yield 'optimized_task_graph', target_task_graph
> +        do_not_optimize = set()
> +        if not self.parameters.get('optimize_target_tasks', True):
> +            do_not_optimize = target_task_set.graph.tasks

should this be target_task_set.tasks? Graph has edges and nodes, but the TaskGraph has tasks.
Comment on attachment 8757520 [details]
Bug 1274611: implement docker image builds as a distinct kind;

https://reviewboard.mozilla.org/r/55960/#review53592

::: taskcluster/taskgraph/kind/docker_image.py:28
(Diff revision 2)
> +    current_json_time,
> +)
> +
> +logger = logging.getLogger(__name__)
> +GECKO = os.path.realpath(os.path.join(__file__, '..', '..', '..', '..'))
> +IMAGE_BUILD_TASK = os.path.join(GECKO, 'testing', 'taskcluster', 'tasks', 'image.yml')

This is not using the newly created taskcluster/ci/docker-image/image.yml

::: taskcluster/taskgraph/kind/legacy.py:421
(Diff revision 2)
>          taskdict = self.tasks_by_label[task.label]
> -        return [(label, label) for label in taskdict.get('requires', [])]
> +        deps = [(label, label) for label in taskdict.get('requires', [])]
> +
> +        # add a dependency on an image task, if needed
> +        if 'docker-image' in taskdict:
> +            deps.append(('build-docker-image-{docker-image}'.format(**taskdict), 'docker-image'))

I haven't wrapped my head around it yet, but it appears that in the graph there is a depedendencies list that contains label/task-id pairings but in this case it would be a kind/task-id pairing, wouldn't it?

for instance:
    "dependencies": [
      [
        "docker-image",
        "CWwCIIQFToe28bwQ5TEZNw"
      ],
      [
        "TaskLabel==b8ha06IWRZez69Se-vkdvQ",
        "E_dSOpx7RHyDkod4BtYeQw"
      ]
    ]
Attachment #8757520 - Flags: review?(garndt)
https://reviewboard.mozilla.org/r/55956/#review53570

> should this be target_task_set.tasks? Graph has edges and nodes, but the TaskGraph has tasks.

`taskgraph.graph.tasks` is the set of node names in the graph, which is what we need here.  `taskgraph.tasks` is a dictionary of task objects, the keys of which should match `taskgrpah.graph.tasks`.  It's a bit confusing, but keeping the two separate allos the graph algorithms to operate without touching the tasks objects.
https://reviewboard.mozilla.org/r/55960/#review53592

> I haven't wrapped my head around it yet, but it appears that in the graph there is a depedendencies list that contains label/task-id pairings but in this case it would be a kind/task-id pairing, wouldn't it?
> 
> for instance:
>     "dependencies": [
>       [
>         "docker-image",
>         "CWwCIIQFToe28bwQ5TEZNw"
>       ],
>       [
>         "TaskLabel==b8ha06IWRZez69Se-vkdvQ",
>         "E_dSOpx7RHyDkod4BtYeQw"
>       ]
>     ]

The dependencies are (name, label) pairs, where `name` describes the dependency.  So later we might have `("build", "build-win32-pgo-opt")`.  For the legacy type, the dependency name is the same as the label, and the labels aren't stable from run to run, both of which don't follow this model.  But it's the legacy type so it's going away.

Also confusing is that the kind and the dependency name will probably be the same in most cases, just because it's hard to think of two names for the same thing.

Does that answer the question?  Thanks for looking so deeply!
> Does that answer the question?  Thanks for looking so deeply!

Yup, it does, thanks!
Comment on attachment 8757518 [details]
Bug 1274611: Implement task-graph optimization;

https://reviewboard.mozilla.org/r/55956/#review54582
Attachment #8757518 - Flags: review?(gps) → review+
Duplicate of this bug: 1275014
Jonas and I talked on Friday.  He has some good ideas that I will take into account, but they don't block landing this.

I think that I have r+ for everything, but mozreview disagrees.  At any rate, the last few review comments have caused a few minor tweaks, and I need to rebase on the latest inbound and account for any updated task definitions.  So I will push another change to mozreview.
Flags: needinfo?(jopsen)
Whoops, I don't have r+ from garndt yet..
Review commit: https://reviewboard.mozilla.org/r/57808/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/57808/
Attachment #8757518 - Attachment description: MozReview Request: Bug 1274611: Implement task-graph optimization; r?jonasfj r?gps → Bug 1274611: Implement task-graph optimization;
Attachment #8757519 - Attachment description: MozReview Request: Bug 1274611: handle empty error messages properly; r?gps → Bug 1274611: handle empty error messages properly;
Attachment #8757520 - Attachment description: MozReview Request: Bug 1274611: implement docker image builds as a distinct kind; r?garndt → Bug 1274611: implement docker image builds as a distinct kind;
Attachment #8760082 - Flags: review?(garndt)
Attachment #8757518 - Flags: review?(jopsen) → review?(garndt)
Attachment #8757520 - Flags: review?(garndt)
Comment on attachment 8757518 [details]
Bug 1274611: Implement task-graph optimization;

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/55956/diff/2-3/
Comment on attachment 8757519 [details]
Bug 1274611: handle empty error messages properly;

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/55958/diff/2-3/
Comment on attachment 8757520 [details]
Bug 1274611: implement docker image builds as a distinct kind;

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/55960/diff/2-3/
https://reviewboard.mozilla.org/r/55954/#review54630

::: testing/taskcluster/tasks/tests/fx_linux64_firefox_ui_functional.yml
(Diff revisions 2 - 3)
>  task:
>    payload:
>      command:
>        - {"task-reference": "--installer-url=https://queue.taskcluster.net/v1/task/<{{build_slugid}}>/artifacts/{{build_location}}"}
>        - {"task-reference": "--test-packages-url=https://queue.taskcluster.net/v1/task/<{{build_slugid}}>/artifacts/{{test_packages_location}}"}
> -      - --download-symbols=ondemand

Changes in this file and the next are artifacts of the rebase: whimboo removed `--download-symbols=ondemand` and made them tier-2 upstream.
Comment on attachment 8757518 [details]
Bug 1274611: Implement task-graph optimization;

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/55956/diff/3-4/
Comment on attachment 8757519 [details]
Bug 1274611: handle empty error messages properly;

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/55958/diff/3-4/
Comment on attachment 8757520 [details]
Bug 1274611: implement docker image builds as a distinct kind;

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/55960/diff/3-4/
Comment on attachment 8760082 [details]
Bug 1274611: spell target-tasks.json with `-`;

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/57808/diff/1-2/
I discovered the last patch when pushing a downstream cset to try that modified a docker image.  The docker-image task was created first, with schedulerId '-', followed by a build task with schedulerId 'task-graph-scheduler', and the queue called me on my error:

[34m 0:28.53(B[m Task group QPwJQKVCRGyEqlxN4C9ZkQ contains tasks with
schedulerId -. You are attempting
to include tasks from schedulerId task-graph-scheduler,
which is not permitted.
All tasks in the same task-group must have the same schedulerId.
Comment on attachment 8757520 [details]
Bug 1274611: implement docker image builds as a distinct kind;

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/55960/diff/4-5/
Comment on attachment 8760082 [details]
Bug 1274611: spell target-tasks.json with `-`;

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/57808/diff/2-3/
https://reviewboard.mozilla.org/r/55954/#review54786

In try, I was unable to re-trigger a docker image build.  That may be due to the last cset regarding schedulerId.  So I may revise.
Attachment #8757520 - Flags: review?(wcosta)
Attachment #8760082 - Flags: review?(garndt) → review?(wcosta)
Attachment #8760100 - Flags: review?(garndt) → review?(wcosta)
Comment on attachment 8757520 [details]
Bug 1274611: implement docker image builds as a distinct kind;

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/55960/diff/5-6/
Attachment #8760100 - Attachment description: Bug 1274611: do not try to set schedulerId on any tasks; → Bug 1274611: set schedulerId for image task, too;
Comment on attachment 8760082 [details]
Bug 1274611: spell target-tasks.json with `-`;

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/57808/diff/3-4/
Comment on attachment 8760100 [details]
Bug 1274611: set schedulerId for image task, too;

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/57816/diff/1-2/
https://reviewboard.mozilla.org/r/57816/#review54802

::: taskcluster/ci/docker-image/image.yml:15
(Diff revision 2)
>    tags:
>      createdForUser: {{owner}}
>  
>    workerType: taskcluster-images
>    provisionerId: aws-provisioner-v1
> +  schedulerId: task-graph-scheduler

It turns out if `schedulerId` is not `task-graph-scheduler`, mozilla-taskcluster won't touch the task.  So rather than remove schedulerId from all tasks, this just adds schedulerId to the image task.

With this in place, I've been able to re-trigger tasks :)
https://reviewboard.mozilla.org/r/55960/#review54874

::: taskcluster/taskgraph/kind/docker_image.py:73
(Diff revision 6)
> +            image_parameters['artifact_path'] = 'public/image.tar'
> +            image_parameters['image_name'] = image_name
> +
> +            image_artifact_path = "public/decision_task/image_contexts/{}/context.tar.gz".format(image_name)
> +            if os.environ.get('TASK_ID'):
> +                destination = os.environ['HOME'] + "/artifacts/decision_task/image_contexts/{}/context.tar.gz".format(image_name)

nit: prefer os.path.join here

::: taskcluster/taskgraph/kind/docker_image.py:120
(Diff revision 6)
> +                # branches have been tried
> +                request = urllib2.Request(ARTIFACT_URL.format(existing_task['taskId'], 'public/image.tar'))
> +                request.get_method = lambda: 'HEAD'
> +                urllib2.urlopen(request)
> +
> +                # HEAD success on the artifact is enough

What if response isn't 200?

::: taskcluster/taskgraph/kind/docker_image.py:128
(Diff revision 6)
> +                pass
> +
> +        return False, None
> +
> +    def create_context_tar(self, context_dir, destination, image_name):
> +        ''' Creates a tar file of a particular context directory '''

docstring nit: one liner should end with a dot. No spaces after/before "'".

::: taskcluster/taskgraph/kind/docker_image.py:138
(Diff revision 6)
> +        with tarfile.open(destination, 'w:gz') as tar:
> +            tar.add(context_dir, arcname=image_name)
> +
> +    def generate_context_hash(self, image_path):
> +        '''
> +        Generates a sha256 hash for context directory used to build an image.

docstring nit: first comment line should be on the same line of the opening '''.

::: taskcluster/taskgraph/kind/docker_image.py:157
(Diff revision 6)
> +
> +        for filename in sorted(files):
> +            relative_filename = filename.replace(GECKO, '')
> +            with open(filename, 'rb') as f:
> +                file_hash = hashlib.sha256()
> +                while True:

The while is not needed. Documentation of read guarantees return of all data if size is not given and file is in non-blocking mode.

::: taskcluster/taskgraph/test/test_kind_docker_image.py:35
(Diff revision 6)
> +    def test_create_context_tar(self):
> +        image_dir = os.path.join(docker_image.GECKO, 'testing', 'docker', 'image_builder')
> +        tarball = tempfile.mkstemp()[1]
> +        self.kind.create_context_tar(image_dir, tarball, 'image_builder')
> +        self.failUnless(os.path.exists(tarball))
> +        os.unlink(tarball)

How about assign the result of exists to a temp variable, then unlink and then call failUnless? This way we make sure unlink is always called.
Comment on attachment 8760082 [details]
Bug 1274611: spell target-tasks.json with `-`;

https://reviewboard.mozilla.org/r/57808/#review54908
Attachment #8760082 - Flags: review?(wcosta) → review+
Attachment #8760100 - Flags: review?(wcosta) → review+
https://reviewboard.mozilla.org/r/55960/#review54874

> What if response isn't 200?

urllib.urlopen raises an error for 4xx's.
Comment on attachment 8757520 [details]
Bug 1274611: implement docker image builds as a distinct kind;

https://reviewboard.mozilla.org/r/55960/#review54926
Attachment #8757520 - Flags: review?(wcosta) → review+
Comment on attachment 8757520 [details]
Bug 1274611: implement docker image builds as a distinct kind;

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/55960/diff/6-7/
Attachment #8760100 - Flags: review?(dustin)
Comment on attachment 8760082 [details]
Bug 1274611: spell target-tasks.json with `-`;

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/57808/diff/4-5/
Comment on attachment 8760100 [details]
Bug 1274611: set schedulerId for image task, too;

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/57816/diff/2-3/
Attachment #8760100 - Flags: review?(dustin)
Pushed by dmitchell@mozilla.com:
https://hg.mozilla.org/mozilla-central/rev/ec20b463c04f
fix indentation in mulet_simulator.yml; r=pmoore a=Tomcat
Duplicate of this bug: 1278466
Comment on attachment 8757518 [details]
Bug 1274611: Implement task-graph optimization;

Removing flag, was already landed and I believe wcosta reviewed these also.
Attachment #8757518 - Flags: review?(garndt)
Comment on attachment 8757520 [details]
Bug 1274611: implement docker image builds as a distinct kind;

Removing flag, was already landed and I believe wcosta reviewed these also.
Attachment #8757520 - Flags: review?(garndt)
Product: TaskCluster → Firefox Build System
You need to log in before you can comment on or make changes to this bug.