Closed Bug 1555988 Opened 1 year ago Closed 1 year ago

Proposal for new Taskcluster YML Format

Categories

(Taskcluster :: General, enhancement)

enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: mhentges, Unassigned)

References

Details

Proposal for new Taskcluster YML format

I have two main pain points that I'm encountering with .taskcluster.yml v1:

  1. At the beginning of all my .taskcluster.yml files, I need to parse out the same git values (user, revision, branch, repository).
    Since all the Github events have this information in different spots, there's a lot of mapping. Here's an example in a simple project
  2. JSON-e adds additional complexity that might not be necessary
    1. Devs working with .taskcluster.yml need to lean this additional tool
    2. Due to increased complexity, debugging issues within .taskcluster.yml is harder
    3. The features of JSON-e solve many of the same problems that the "decision task" handles
    4. JSON-e increases Taskcluster's attack surface (e.g.: billion laughs)

My proposal for solves these problems by:

  1. Provide variables with commonly-available values so they don't need to be specifically mapped for each type of Github event
  2. Remove support for JSON-e - let projects with complex task graphs represent them in the decision task.

Examples:

Mobile project example:

version: 2
policy:
	pullRequests: public

onPullRequest:
	# <snip>
	payload:
		command:
			- >-
				git fetch ${repository} ${ref}
				&& git checkout FETCH_HEAD
				&& python decision_task.py pull-request ${pull_request_title}

onRelease:
	# <snip>
	payload:
		command:
			- >-
				git fetch https://github.com/mozilla-mobile/fenix refs/tags/${tag}
				&& git checkout FETCH_HEAD
				&& python decision_task.py release ${release_name} {tag}

onMasterPush:
	# <snip>
	payload:
		command:
			- >-
				git fetch https://github.com/mozilla-mobile/fenix refs/heads/master
				&& git checkout FETCH_HEAD
				&& python decision_task.py master-push

# Custom triggers, taskcluster-github won't ever populate these
onCustomTrigger:
	nightly:
		# <snip>
	raptor:
		# <snip>

Project with same task for different triggers example:

version: 2
policy:
	pullRequests: public


baseTask: &baseTask
	workerType: github-worker
	payload:
		image: python:3.7
		command:
			- /bin/bash
			- -c
			- git clone ${repository} redo && cd redo && git checkout ${revision} && python decision_task.py
	metadata:
		owner: ${user_email}
		source: ${repository}/raw/${revision}/.taskcluster.yml


onPullRequest:
	<<: *baseTask
	metadata:
		name: Redo Pull Request
		description: Redo Pull Request

onMasterPush:
	<<: *baseTask
	metadata:
		name: Redo Master Push
		description: Redo Master Push

Specifics:

Specific triggers with contextual variables
Rather than evaluating a provided tasks_for variable, this proposal recommends building the different trigger types directly into the YAML.
Then, when interpolating within these properties, provide contextual variables, rather than passing through the raw Github event object.
The three taskcluster-github trigger types should be:

  • onPullRequest (triggered when a PR is opened or updated). This has the additional variables of:
    • branch: the name of the git branch associated with the PR
    • pr_title: the title of the Github PR
    • pr_number: the identifier of the Github PR
    • pr_url: URL to Github to view the PR
  • onRelease (triggered by Github releases)
    • release_name: The name of the release on Github
    • tag: The name of the tag the release is associated with
  • onMasterPush (triggered when the master branch in the primary repository is pushed to)
    • (no extra variables)

Notes:

  • If you use context variables outside of their associated context (e.g.: an onPullRequest-specific variable outside of the onPullRequest block) you should get an error since the value doesn't exist.
  • We'll need to lazily evaluate values after the task is chosen to avoid "value doesn't exist" errors for unused-for-current-event triggers.

There's a few benefits to providing variables rather than the github events:

  • We can simplify access to values (e.g.: tag instead of event.release.tag_name)
  • .taskcluster.yml users don't need to learn about Github events (though, they need to learn taskcluster-specific skills instead, so this is a :shrug:)
  • When github changes the structure of events, we only need to update taskcluster-github, rather than every project that has a .taskcluster.yml

Remove JSON-e
JSON-e performs two purposes for us with v1: it allows us to disambiguate between trigger types (PR, release, master-push), and it allows us to have reuse between task definitions.
However, the specific triggers listed above handle the first case.

The second purpose of JSON-e (enabling re-use between task definitions) overlaps significantly with the decision task and taskgraph.
Even within .taskcluster.yml, if simple reuse is needed between the different trigger tasks, you can use YAML's anchor-and-extend functionality (see baseTask above in the "Simple project example").

Since JSON-e has significant costs (listed at the top), removing it could significantly benefit our .taskcluster.yml files.

Custom triggers
Sometimes, we need to trigger a task for non-Github-related reasons. Yet, for the sake of the structure of the projects, we still want the specially-triggered task to be defined beside the other tasks.
A good example of this is Fenix's Raptor trigger.

To handle this, allow projects to define custom tasks in the onCustomTrigger object. taskcluster-github shouldn't use this value, but it can be used by projects if they want to
manually interpret .taskcluster.yml and schedule custom tasks with custom values.


Notes:

  • If I missed use cases that this is incompatible with, then I'd love to learn more context :)
  • There may be use cases in which spawning a decision task is less ideal than representing tasks entirely in JSON-e. However, if we agree that "there's overlap between JSON-e and decision tasks", then I believe we should solve that problem (decision task is significantly harder than .taskcluster.yml), rather than supporting decision-task-y functionality twice.
  • My "billion laughs" example for JSON-e might not be the most apt, since standalone YAML is also susceptible to "Billion laughs" :/
  • Some projects may use a branch other than master as their primary branch: we could call the trigger onPrimaryPush instead, with a top-level primary branch config value?

CoT verification rebuilds the decision task to make sure that the task is not similar-to, but different-than, a valid decision task. Removing json-e and moving it all to the code removes this capability.

CoT verification rebuilds the decision task to make sure that the task is not similar-to, but different-than, a valid decision task. Removing json-e and moving it all to the code removes this capability.

FWIW, this would simplify CoT verification, since JSON-e is no longer needed. CoT would need the MOBILE_HEAD_... values (or equivalent) passed forward (as normal), then would download .taskcluster.yml, grab the correct task (e.g.: from onRelease), then compare.
Though, this doesn't matter because:


Dustin and I talked directly via Zoom.
This proposal would not be effective for two reasons:

This proposal would greatly increase the friction for any non-trivial tasks:
Removing JSON-e would force the declation of tasks that could be in the .taskcluster.yml to be moved to a decision task, which has three downsides:

  • Performance: we now have the delay of firing up a docker worker before the decision task runs
  • It requires project devs to have knowledge of taskcluster + docker, since they have to configure their image and run their script in it
  • Any variables that are used by the decision task now need to be piped to it, so there's boilerplate variable-passing in the payload.command[]

There's plans to add more triggers to .taskcluster.yml
This is more theoretical, but it could multiply the issues described above^

Other notes

  • We agree that reading and writing JSON-e (usually) isn't as ergonomic as representing logic in a real programming language
  • Running anything dynamic in .taskcluster.yml needs to be safe, which is why JSON-e was chosen (running something like python adds additional security challenges to guard against fs/network/etc access)
  • We'll talk more at all-hands to see how .taskcluster.yml ergonomics can be addressed
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → INVALID
You need to log in before you can comment on or make changes to this bug.