Bug 1459705 (cotv3)

support the new action-hook in cot

RESOLVED FIXED

Status

enhancement
RESOLVED FIXED
Last year
Last year

People

(Reporter: aki, Assigned: aki)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments)

This will be cotv2.5 or cotv3.
Blocks: 1415868
WIP is here: https://github.com/escapewindow/scriptworker/commits/action-hook-cot
I'm thinking of refactoring things along with desupporting cotv1; this will effectively be cotv3.

cotv4? will be possibly be openssl certs instead of gpg keys; we'd have to add support in docker-worker, taskcluster-worker, and generic-worker, and update the AMIs. We probably have to support verifying the gpg keys + openssl keys for some window of time.
WIP is actually here: https://github.com/mozilla-releng/scriptworker/compare/master...escapewindow:cotv3?expand=1

Status:
- I'm able to verify Dustin's action task!
- The other, non-hook action task still verifies correctly, as do the various integration tests.
- I killed cotv1 in the same patchset. If we need to land action-hook support before killing cotv1, I can split that out to a different patchset.

I had to do two things that I think are ugly:
- I had to hardcode `repo_scope`. I'd love to get this pre-populated in actions.json or task.extra.action. Since a decision task is specific to a repo, I don't think it will be bad to add it to actions.json.
- By convention, action tasks have a taskGroupId that matches the decision task's taskId. This is not true of the kind=hook task, RUSih0YTT1uhfyKyiuNmXA, which:
  - has a taskGroupId of RUSih0YTT1uhfyKyiuNmXA, which should be ezXvG6gdQIO4CB0lhzLPUg
  - has an env ACTION_TASK_GROUP_ID is set to "RUSih0YTT1uhfyKyiuNmXA", and should be ezXvG6gdQIO4CB0lhzLPUg
  - has an env ACTION_TASK_ID set to "\"TqmY1K5tS2WKKPM3MhmJXA\"" (with backslashes); I believe this should be set to RUSih0YTT1uhfyKyiuNmXA
  - has a task.extra.action.context.taskGroupId of ezXvG6gdQIO4CB0lhzLPUg, which is correct but internally inconsistent,
  - and has a task.extra.action.context.taskId of TqmY1K5tS2WKKPM3MhmJXA, which I think is the task we want to retrigger.

I have hacks in cotv3 to address the taskGroupId weirdness above, but I think we shouldn't land them as-is.
(In reply to Aki Sasaki [:aki] from comment #2)
> - I had to hardcode `repo_scope`. I'd love to get this pre-populated in
> actions.json or task.extra.action. Since a decision task is specific to a
> repo, I don't think it will be bad to add it to actions.json.

Yeah, I hate this variable.  I would rather not change it until we have eliminated all of the non-hook actions, just to avoid the risk of breaking them.  However, we do need a way to calculate that value -- not all tasks are "generic"!  And this is where I get a little confused: the value is actually in task.scopes[0].  I don't think we have access to more trustworthy data than that. I suspect that the best plan is to take that value and check that it has one of the forms

  assume:repo:<repo>:action:generic
  assume:repo:<repo>:action:<task.payload.env.ACTION_CALLBACK>

`task.scopes` is essentially pre-checked for you: if someone cheats by running a task with a less-powerful scope than they need, the action will fail; if someone tries to run a task with a more-powerful scope than they have permission to use, then they will fail to create the task.  So I think just sanity-checking it is enough.

Once we eliminate that context variable, the scopes will be calculated inside .taskcluster.yml, so it will still make sense to sanity-check task.scopes, but that value won't need to get fed back into the JSON-e context.

> - By convention, action tasks have a taskGroupId that matches the decision
> task's taskId. This is not true of the kind=hook task,
> RUSih0YTT1uhfyKyiuNmXA, which:

So, those come from:

A - task.taskGroupId = taskId invented by the hooks service
B - task.payload.env.ACTION_TASK_GROUP_ID = ownTaskId, the task ID invented by the hooks service
C - task.payload.env.ACTION_TASK_ID = JSON.stringify(taskId on which the action was run)
D - task.extra.action.context.taskGroupId = taskGroupId of the decision task (as fixed in actions.json)
C - task.extra.action.context.taskId = taskId on which the action was run

A is definitely wrong; it comes from
  https://github.com/taskcluster/taskcluster-hooks/blob/master/src/taskcreator.js#L37
which we should convert to only set that value if it's not already set.  The JSON templates *try* to set that to D, which is what you want (ezXvG6gdQIO4CB0lhzLPUg).

The templates should set task.payload.env.ACTION_TASK_GROUP_ID correctly, too -- that's supposed to be D.

C (task.payload.env.ACTION_TASK_ID) is correct as-is - that's the taskId of the target action.  I agree these variables were not especially well-named.  I have no idea why we JSON-encode strings!

So I have two actions:
 - fix hooks to allow setting taskGroupId (bug 1462811)
 - fix bug 1415868 to set ACTION_TASK_GROUP_ID correctly
Hm, regarding the latter:

This is in the upstream, already-committed-and-running .taskcluster.yml:

https://dxr.mozilla.org/mozilla-central/source/.taskcluster.yml#107
                ACTION_TASK_GROUP_ID: '${ownTaskId}'

where https://docs.taskcluster.net/manual/using/actions/spec specifies that ownTaskId is the taskId of the action task being created.  So that's not even correct for kind=task actions.  I can change that, but I'm a little worried it will break existing CoT verification.  It looks like only cancel_all actually cares about this value, and I'm quite confident it doesn't work as-is, so aside from CoT verification I don't think there's much to worry about.  I'll flag you for review of the patch.
Alias: cotv3
Keywords: leave-open
Comment on attachment 8981599 [details]
bug 1459705 - bump scriptworker to 12.0.0.

https://reviewboard.mozilla.org/r/247724/#review253722
Attachment #8981599 - Flags: review?(bhearsum) → review+
Comment on attachment 8981598 [details]
bug 1459705 - bump treescript deps for scriptworker 12.0.0

https://reviewboard.mozilla.org/r/247722/#review253990

Mostly a stamp=me, I don't have pyup on treescript yet and have not testing these updates. I think its all fine based on my memory of updates to addonscript though.

Can you submit a PR to treescript as well though please.
Attachment #8981598 - Flags: review?(bugspam.Callek) → review+
Comment on attachment 8981597 [details]
bug 1459705 - remove the docker sha allowlist override.

https://reviewboard.mozilla.org/r/247720/#review254066
Attachment #8981597 - Flags: review?(mozilla) → review+
Pushed by asasaki@mozilla.com:
https://hg.mozilla.org/build/puppet/rev/7a1e1f955f66
remove the docker sha allowlist override. r=tomprince
https://hg.mozilla.org/build/puppet/rev/646b828bdc35
bump treescript deps for scriptworker 12.0.0 r=callek
https://hg.mozilla.org/build/puppet/rev/d452766c6aae
bump scriptworker to 12.0.0. r=bhearsum
Status: NEW → RESOLVED
Closed: Last year
Resolution: --- → FIXED
Pushed by asasaki@mozilla.com:
https://hg.mozilla.org/build/puppet/rev/b1b3c1999b37
bump scriptworker to 12.0.1 to fix mobile. r=versionbump
You need to log in before you can comment on or make changes to this bug.