Closed
Bug 1374940
Opened 7 years ago
Closed 7 years ago
Add taskgraph support for toolchain definitions
Categories
(Taskcluster :: General, enhancement)
Taskcluster
General
Tracking
(Not tracked)
RESOLVED
FIXED
mozilla56
People
(Reporter: glandium, Assigned: glandium)
References
Details
Attachments
(4 files)
No description provided.
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Comment 3•7 years ago
|
||
mozreview-review |
Comment on attachment 8879841 [details] Bug 1374940 - Cleanup how toolchain names are derived from task labels. https://reviewboard.mozilla.org/r/151246/#review156072 ::: taskcluster/taskgraph/transforms/job/toolchain.py:25 (Diff revision 1) > from taskgraph import GECKO > > > TOOLCHAIN_INDEX = 'gecko.cache.level-{level}.toolchains.v1.{name}.{digest}' > > +TOOLCHAIN_ARTIFACTS = {} I'm not a big fan of this, and how it's used in next patch, and by other things down the road, but I don't have a better idea. I'm open to suggestions.
Comment hidden (mozreview-request) |
Assignee | ||
Comment 5•7 years ago
|
||
This is the kind of change that makes everything happen after this bug: https://hg.mozilla.org/try/diff/462eee9cdf68/taskcluster/ci/build/linux.yml I wonder if I should enforce the kind-dependency on toolchains... (and if so, how... that's not the kind of thing available in the transform)
Comment 6•7 years ago
|
||
mozreview-review |
Comment on attachment 8879841 [details] Bug 1374940 - Cleanup how toolchain names are derived from task labels. https://reviewboard.mozilla.org/r/151246/#review156260 There is, indeed, a race condition here: if TOOLCHAIN_ARTIFACTS is not populated by the time the downstream (build) tasks are generated, you'll get an error. That kind ordering is probably fairly deterministic, but could change when a new kind is added. Adding an explicit `kind_dependency` would fix that particular problem, but the use of mutable shared global state like this is still not great design. I'd prefer to see this implemented as follows: * toolchain jobs define a `toolchain_artifact` attribute * build tasks have a `kind_dependency` on toolchains * the build task toolchain transform looks up the existing toolchain task by its label (prepending the kind name, etc.) and examines its `toolchain_artifact` attribute to calculate the env var contents. This keeps all the state in the taskgraph. It should be relatively straightforward to calculate the label given the values in the `toolchains` property. ::: taskcluster/taskgraph/transforms/job/toolchain.py:66 (Diff revision 1) > # The script > files.append('taskcluster/scripts/misc/{}'.format(run['script'])) > > label = taskdesc['label'] > subs = { > - 'name': label.replace('toolchain-', '').split('/')[0], > + 'name': label.replace('%s-' % config.kind, ''), Did you intend to drop the `split('/')[0]` bit?
Attachment #8879841 -
Flags: review?(dustin) → review-
Comment 7•7 years ago
|
||
mozreview-review |
Comment on attachment 8879842 [details] Bug 1374940 - Allow transforms to access info about tasks from kind dependencies of the current kind. https://reviewboard.mozilla.org/r/151248/#review156264 ::: taskcluster/taskgraph/transforms/toolchain.py:39 (Diff revision 2) > + f = os.path.basename(TOOLCHAIN_ARTIFACTS[t]) > + if f in filenames: > + raise Exception('%s-%s cannot use both %s and %s toolchains: ' > + 'they both have the same artifact name %s' > + % (config.kind, job['name'], filenames[f], > + t, f)) If I understand, this is a problem only because the downstream task would download both artifacts to the same filename?
Attachment #8879842 -
Flags: review?(dustin) → review+
Assignee | ||
Comment 8•7 years ago
|
||
(In reply to Dustin J. Mitchell [:dustin] from comment #7) > If I understand, this is a problem only because the downstream task would > download both artifacts to the same filename? Indeed. (In reply to Dustin J. Mitchell [:dustin] from comment #6) > Comment on attachment 8879841 [details] > Bug 1374940 - Add artifact paths to toolchain jobs definitions. > > https://reviewboard.mozilla.org/r/151246/#review156260 > > There is, indeed, a race condition here: if TOOLCHAIN_ARTIFACTS is not > populated by the time the downstream (build) tasks are generated, you'll get > an error. That kind ordering is probably fairly deterministic, but could > change when a new kind is added. Adding an explicit `kind_dependency` would > fix that particular problem, but the use of mutable shared global state like > this is still not great design. > > I'd prefer to see this implemented as follows: > > * toolchain jobs define a `toolchain_artifact` attribute Sure. > * build tasks have a `kind_dependency` on toolchains See the link in comment 5, that's already how it's done. But I was wondering if it should be enforced (and if it can) > * the build task toolchain transform looks up the existing toolchain task > by its label (prepending the kind name, etc.) and examines its > `toolchain_artifact` attribute to calculate the env var contents. But how? By accumulating all jobs in a dict first? That would lose the generator nature of the whole chain. Plus, the transforms are called once per kind, with a different TransformConfig for each, so all I see I can do is to accumulate when the kind is toolchains, and reuse when the kind is different, which then means I still have a global state tracking that... > ::: taskcluster/taskgraph/transforms/job/toolchain.py:66 > (Diff revision 1) > > # The script > > files.append('taskcluster/scripts/misc/{}'.format(run['script'])) > > > > label = taskdesc['label'] > > subs = { > > - 'name': label.replace('toolchain-', '').split('/')[0], > > + 'name': label.replace('%s-' % config.kind, ''), > > Did you intend to drop the `split('/')[0]` bit? Yes and no, I guess this should go in a different patch. That part comes from when the toolchain jobs had /opt in their name.
Comment 9•7 years ago
|
||
Ah, sorry, the missing link is that loader functions get a copy of the taskgraph as generated so far. This is how tests find the builds they depend on, for example. So you'd need to write a custom loader based on the existing loader, but doing a little more work around dependencies. The test loader might offer some good examples. What I would do in the loader is this: pre-calculate a dictionary mapping toolchain to the necessary metadata (toolchain task label, artifact name, at least). Then attach that dictionary to each job object before yielding it, as something like build['toolchain-info']. Then in the build transform, combine build['toolchains'] with build['toolchain-info'] to generate the proper dependencies and environment variables.
Assignee | ||
Comment 10•7 years ago
|
||
This appears to not be a small change: I can have a loader for toolchain jobs that collects their artifacts, and fills a dict, but then, to have that dict attached to build jobs, build jobs need to use the same loader... but there are multiple loaders, most of which are independent from the others (only push_apk reuses the transform loader). So the dict would need to be shared somehow, and that would still be global state. Even if they were all using the same loader, the loader invocations are still independent, so we'd still have global state. Each invocation could be building its own dict, but then we'd be scanning the entire loaded_tasks for each kind. For the kinds that happen last, that can mean scanning thousands of tasks (btw, I wouldn't be surprised if most loaders scanning loaded_tasks is part of the performance problem running taskgraph) All things considered, global state doesn't seem too bad to me, and it seems simpler not done at the loader level. Any better ideas?
Flags: needinfo?(dustin)
Assignee | ||
Comment 11•7 years ago
|
||
(and the sharing between loaders is not hypothetical, l10n jobs, which use a different loader from build jobs, will eventually need toolchain dependencies)
Comment 12•7 years ago
|
||
Ah, I didn't realize that more than builds needed toolchains. The objections to the global-state implementation are a few: keeping any global state; keeping mutable global state in an UPPERCASE_VARIABLE_NAME; and modifying that global state as a side-effect of something that is otherwise purely functional. How about a memoized function that all relevant loaders can call, passing loaded_tasks along. The loaders can then stick that value into `config['toolchains'] where it's available for use by transforms. Adding it to config will be a one-liner for each loader, and simpler than attaching it to each of the yielded dictionaries. It's still global state (a memoized function), but at least it looks functional! We could make that a little more general (maybe a way for a kind to specify extra metadata it needs, or a way for a kind to supply extra metadata to kinds that depend on it), but YAGNI probably applies. There are only a dozen or so kinds, most of which do not scan -- and iterating over a few thousand tasks is not an especially expensive operation. Last anyone profiled it, it was the schema validation and the transforms that were the big time consumers.
Flags: needinfo?(dustin)
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Updated•7 years ago
|
Attachment #8879842 -
Flags: review+ → review?(dustin)
Assignee | ||
Updated•7 years ago
|
Attachment #8879841 -
Flags: review?(dustin)
Attachment #8879842 -
Flags: review?(dustin)
Assignee | ||
Updated•7 years ago
|
Attachment #8879841 -
Flags: review?(dustin)
Assignee | ||
Comment 17•7 years ago
|
||
Comment on attachment 8879842 [details] Bug 1374940 - Allow transforms to access info about tasks from kind dependencies of the current kind. Please diregard anything mozreview says about r+, this is really r?
Attachment #8879842 -
Flags: review?(dustin)
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Comment 20•7 years ago
|
||
Had to change the format of MOZ_TOOLCHAINS because msys doesn't like :: on the command line, and that busted windows toolchain jobs. Never noticed that before because I didn't have toolchain dependencies on windows jobs yet...
Comment 21•7 years ago
|
||
mozreview-review |
Comment on attachment 8879841 [details] Bug 1374940 - Cleanup how toolchain names are derived from task labels. https://reviewboard.mozilla.org/r/151246/#review164184
Attachment #8879841 -
Flags: review?(dustin) → review+
Comment 22•7 years ago
|
||
mozreview-review |
Comment on attachment 8879842 [details] Bug 1374940 - Allow transforms to access info about tasks from kind dependencies of the current kind. https://reviewboard.mozilla.org/r/151248/#review164186 I like this :)
Attachment #8879842 -
Flags: review?(dustin) → review+
Comment 23•7 years ago
|
||
mozreview-review |
Comment on attachment 8887787 [details] Bug 1374940 - Add artifact paths to toolchain jobs definitions. https://reviewboard.mozilla.org/r/158708/#review164190 ::: taskcluster/taskgraph/transforms/job/toolchain.py:189 (Diff revision 2) > ' '.join(hg_command), > # do something intelligent. > r'{} -c ./build/src/taskcluster/scripts/misc/{}'.format(bash, run['script']) > ] > > + taskdesc['toolchain-artifact'] = run['toolchain-artifact'] It would be better to just add this to `taskdesc.setdefault('attributes', {})` here. Then `task.py` doesn't need to know anything about it. (same for `docker_worker_toolchain` above)
Attachment #8887787 -
Flags: review?(dustin) → review+
Comment 24•7 years ago
|
||
mozreview-review |
Comment on attachment 8887788 [details] Bug 1374940 - Allow to define a list of toolchains to use for a given TC job. https://reviewboard.mozilla.org/r/158710/#review164194 This feels unfinished -- I don't see the use_toolchains transform actually used anywhere (that is, referenced from `kind.yml`), and I don't see `task.py` using the `toolchains` property defined in its schema. I think the key question here is, where do you want to handle this property and convert it to env vars? It does make sense at the task-description level -- then any task including tests could require a toolchain. In this case, please add a check to payload builders for worker types that can't support toolchains: `if task['toolchains']: raise Exception('Worker type xx does not support toolchains')` Another option is to define it at the job level, for one or a few values of `run.using` (probably just `mozharness` for various worker types?). ::: taskcluster/taskgraph/transforms/task.py:157 (Diff revision 2) > Required('needs-sccache', default=False): bool, > > # Path to the artifact produced by the toolchain job > Optional('toolchain-artifact'): basestring, > > + # The list of toolchains to use for the job Wherever this ends up, please add some of the commentary from the commit message here (especially about the env var)
Attachment #8887788 -
Flags: review?(dustin) → review-
Assignee | ||
Comment 25•7 years ago
|
||
(In reply to Dustin J. Mitchell [:dustin] from comment #24) > Comment on attachment 8887788 [details] > Bug 1374940 - Allow to define a list of toolchains to use for a given TC job. > > https://reviewboard.mozilla.org/r/158710/#review164194 > > This feels unfinished -- I don't see the use_toolchains transform actually > used anywhere (that is, referenced from `kind.yml`), and I don't see > `task.py` using the `toolchains` property defined in its schema. That's because it's going to be setup in another bug. You can see the taskcluster/ci changes in https://hg.mozilla.org/try/rev/847b5e0230422be85d500391a1caa1cff7c77bc1 for an example. > I think the key question here is, where do you want to handle this property > and convert it to env vars? It does make sense at the task-description > level -- then any task including tests could require a toolchain. In this > case, please add a check to payload builders for worker types that can't > support toolchains: `if task['toolchains']: raise Exception('Worker type xx > does not support toolchains')` > > Another option is to define it at the job level, for one or a few values of > `run.using` (probably just `mozharness` for various worker types?). I'm not sure what you're asking here.
Assignee | ||
Comment 26•7 years ago
|
||
Also, the previous version of largely the same patch had r+...
Assignee | ||
Comment 27•7 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #26) > Also, the previous version of largely the same patch had r+... (comment 7)
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment 30•7 years ago
|
||
mozreview-review |
Comment on attachment 8887788 [details] Bug 1374940 - Allow to define a list of toolchains to use for a given TC job. https://reviewboard.mozilla.org/r/158710/#review164650 Awesome, I like it!
Attachment #8887788 -
Flags: review?(dustin) → review+
Comment 31•7 years ago
|
||
We're sorry, Autoland could not rebase your commits for you automatically. Please manually rebase your commits and try again. hg error in cmd: hg rebase -s 8bc75f03b8a0 -d 3e61e9a0fd86: rebasing 408227:8bc75f03b8a0 "Bug 1374940 - Cleanup how toolchain names are derived from task labels. r=dustin" rebasing 408228:9cac1a34f3cb "Bug 1374940 - Allow transforms to access info about tasks from kind dependencies of the current kind. r=dustin" rebasing 408229:52c80bfe3557 "Bug 1374940 - Add artifact paths to toolchain jobs definitions. r=dustin" merging taskcluster/ci/toolchain/linux.yml merging taskcluster/ci/toolchain/windows.yml merging taskcluster/taskgraph/transforms/task.py warning: conflicts while merging taskcluster/ci/toolchain/linux.yml! (edit, then use 'hg resolve --mark') warning: conflicts while merging taskcluster/ci/toolchain/windows.yml! (edit, then use 'hg resolve --mark') unresolved conflicts (see hg resolve, then hg rebase --continue)
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment 36•7 years ago
|
||
Pushed by mh@glandium.org: https://hg.mozilla.org/integration/autoland/rev/b5390e7949af Cleanup how toolchain names are derived from task labels. r=dustin https://hg.mozilla.org/integration/autoland/rev/eac3bc21f370 Allow transforms to access info about tasks from kind dependencies of the current kind. r=dustin https://hg.mozilla.org/integration/autoland/rev/2023ee951ea6 Add artifact paths to toolchain jobs definitions. r=dustin https://hg.mozilla.org/integration/autoland/rev/ba64040f39ad Allow to define a list of toolchains to use for a given TC job. r=dustin
Comment 37•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/b5390e7949af https://hg.mozilla.org/mozilla-central/rev/eac3bc21f370 https://hg.mozilla.org/mozilla-central/rev/2023ee951ea6 https://hg.mozilla.org/mozilla-central/rev/ba64040f39ad
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
You need to log in
before you can comment on or make changes to this bug.
Description
•