scriptworker signed chain of trust artifact

RESOLVED FIXED

Status

defect
RESOLVED FIXED
3 years ago
Last year

People

(Reporter: aki, Assigned: aki)

Tracking

unspecified
Dependency tree / graph

Firefox Tracking Flags

(firefox52 fixed)

Details

Attachments

(8 attachments, 2 obsolete attachments)

Assignee

Description

3 years ago
Both validation and creation.
Assignee

Comment 1

3 years ago
The code to generate chain of trust artifacts in scriptworker is here: https://github.com/mozilla-releng/scriptworker/pull/11
Validation is coming soon.

However, we don't just need the code.  We need:

Generation
==========
* a gpg keypair for each scriptworker, to sign their chain of trust artifacts
* for these to be somewhat secure, I'd like bug 1290261 and bug 1298105 addressed

I may want to add a pref to conditionally not generate the chain of trust artifact, so we can merge the PR now and flip the pref later.  We can make it mandatory later.

We also need an answer here for non- signing scriptworkers; balrog, beetmover, and play store scriptworkers are on the radar.  Most likely I'll write a tool for generating these keys and putting them in puppet.

Validation
==========
* code to validate firefox chain of trust
* gpg pubkeys for each worker AMI, and a process to add them to the list of good keys.  We can use a keyserver, a git repo, or other.  We can either whitelist everything directly, or verify by a trusted signature.
* a way to revoke old keys.  a keyserver allows for this, but is additional infrastructure overhead.  If there's a path for revoked keys in the git repo, that could work as well.

We can keep refreshing keys in an existing keychain, or regenerate the keychain every time they change.

* enable the generateCertificate flag in the graph for the appropriate tasks.
Depends on: 1290261
Assignee

Comment 2

3 years ago
This patch:
* bumps all the versions of outdated python modules for signing scriptworker
* adds the python-gnupg dependency
* adds the new config.json options for scriptworker 0.5.0
Attachment #8786114 - Flags: review?(rail)
Comment on attachment 8786114 [details] [diff] [review]
add_cot_generation.diff

Review of attachment 8786114 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm
Attachment #8786114 - Flags: review?(rail) → review+
Assignee

Comment 6

3 years ago
For signing keys.
Attachment #8791830 - Flags: review?(rail)

Updated

3 years ago
Attachment #8791830 - Flags: review?(rail) → review+
Assignee

Comment 8

3 years ago
Posted file gpg.py
This is the script I used to generate the gpg keypairs for the signing scriptworkers.  It creates a gpg homedir at ./gpg and generates the keys, then exports them to ./cltsign@FQDN.{pub,sec}

Then I imported the pubkeys, signed them with my key, and re-exported them, and put the signed pubkeys and my pubkey in https://github.com/escapewindow/cot-gpg-keys/tree/master/scriptworker valid/ and trusted/, respectively.

The pub and sec keys then went into puppet hiera.
Assignee

Comment 10

3 years ago
f? for date
Attachment #8795442 - Flags: feedback?(bugspam.Callek)
Attachment #8795442 - Flags: feedback?(bugspam.Callek) → feedback+
Comment hidden (mozreview-request)
Assignee

Comment 12

3 years ago
Posted patch fix_signing_deps.diff (obsolete) — Splinter Review
f? for date.
This patch:
* adds the docker-image task as an additional dependency for the signing task.  Between this, and allowing for the taskGroupId as an acceptable taskId to download from, allows the signing task to download artifacts from the build, docker-image task, and decision task.
* enables chain of trust generation in the docker-image task
* enables chain of trust generation in the build tasks
Attachment #8795497 - Flags: feedback?(kmoir)
Attachment #8795497 - Flags: feedback?(jlund)
Comment on attachment 8795497 [details] [diff] [review]
fix_signing_deps.diff

lgtm, it might have to be refactored along with the work in bug 

https://bugzilla.mozilla.org/show_bug.cgi?id=1277579#c71

once I have patches there
Attachment #8795497 - Flags: feedback?(kmoir) → feedback+
Comment on attachment 8795497 [details] [diff] [review]
fix_signing_deps.diff

Review of attachment 8795497 [details] [diff] [review]:
-----------------------------------------------------------------

overall patch looks good. has this been ran against some `mach taskgraph` runs with various parameters.yml?

f- for now until issues raised are commented on or new patch is up.

::: taskcluster/ci/build-signing/android-signing.yml
@@ +26,5 @@
>      artifacts:
>        - "public/build/target.apk"
>        - "public/build/en-US/target.apk"
> +  additional_dependencies:
> +    - ["build-docker-image-desktop-build", "docker-image"]

blocker: do these need this dependency? iiuc, we already depend on the unsigned-task (the nightly build) and that itself depends on docker-image.

::: taskcluster/ci/docker-image/image.yml
@@ +34,5 @@
>        HEAD_REV: '{{head_rev}}'
>        HEAD_REF: '{{head_ref}}'
>      features:
>        dind: true
> +      chainOfTrust: true

blocker: granted this is just for date, but this defaults to true for all tasks that use this image?

::: taskcluster/taskgraph/task/signing.py
@@ +20,5 @@
>  class SigningTask(base.Task):
>  
>      def __init__(self, kind, name, task, attributes):
>          self.unsigned_artifact_label = task['unsigned-task']['label']
> +        self.additional_dependencies = task.get('additional_dependencies', [])

non-blocker: I guess if this has a default value of [] it's fine. Maybe we should include a default value in from_json too so we never have to worry about it breaking in the future: https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/task/signing.py#58

::: taskcluster/taskgraph/transforms/task.py
@@ +300,5 @@
>              "type": "task-image",
>          }
>  
> +    features = {
> +        'chainOfTrust': True

blocker: also here, I wonder if we should just condition this now so it is more 'central ready'
Attachment #8795497 - Flags: feedback?(kmoir)
Attachment #8795497 - Flags: feedback?(jlund)
Attachment #8795497 - Flags: feedback-
Attachment #8795497 - Flags: feedback+
Assignee

Comment 17

3 years ago
(In reply to Jordan Lund (:jlund) from comment #16)
> Comment on attachment 8795497 [details] [diff] [review]
> fix_signing_deps.diff
> 
> Review of attachment 8795497 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> overall patch looks good. has this been ran against some `mach taskgraph`
> runs with various parameters.yml?
> 
> f- for now until issues raised are commented on or new patch is up.
> 
> ::: taskcluster/ci/build-signing/android-signing.yml
> @@ +26,5 @@
> >      artifacts:
> >        - "public/build/target.apk"
> >        - "public/build/en-US/target.apk"
> > +  additional_dependencies:
> > +    - ["build-docker-image-desktop-build", "docker-image"]
> 
> blocker: do these need this dependency? iiuc, we already depend on the
> unsigned-task (the nightly build) and that itself depends on docker-image.

Yes, we need this dependency.
For chain of trust verification, we need to download artifacts from the build, the decision task, and the docker-image task.  download_artifacts() makes sure we do not download from a task that is not a dependency.  I munged that check to allow for the taskGroupId to be a valid taskId, and that currently corresponds to the decision task.  So the final dependency I need to add is the docker-image task for the build.

> ::: taskcluster/ci/docker-image/image.yml
> @@ +34,5 @@
> >        HEAD_REV: '{{head_rev}}'
> >        HEAD_REF: '{{head_ref}}'
> >      features:
> >        dind: true
> > +      chainOfTrust: true
> 
> blocker: granted this is just for date, but this defaults to true for all
> tasks that use this image?

Yup.  I need it turned on.  It's relatively harmless: it means it creates a chain of trust json file and signs it and uploads that with the other artifacts, so if we don't check it for a specific task, we waste a bit of compute time and disk space.  However, if we try to sign an artifact and its docker-image task doesn't have a CoT artifact, signing will bail, and we have a broken nightly.

We never know if a docker-image task is going to be used in a nightly, since the optimization step points us at the latest docker-image artifact rather than spawning a new one.

Later, when we're signing windows and mac, dep builds will need cot turned on because some tests don't work without a signed build, and signing will bail if we can't trace the chain of trust back to the tree.

> ::: taskcluster/taskgraph/task/signing.py
> @@ +20,5 @@
> >  class SigningTask(base.Task):
> >  
> >      def __init__(self, kind, name, task, attributes):
> >          self.unsigned_artifact_label = task['unsigned-task']['label']
> > +        self.additional_dependencies = task.get('additional_dependencies', [])
> 
> non-blocker: I guess if this has a default value of [] it's fine. Maybe we
> should include a default value in from_json too so we never have to worry
> about it breaking in the future:
> https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/task/
> signing.py#58

I think we should require additional_dependencies to be set for signing; I just put this in in case, for now.  We should remove it later.

> ::: taskcluster/taskgraph/transforms/task.py
> @@ +300,5 @@
> >              "type": "task-image",
> >          }
> >  
> > +    features = {
> > +        'chainOfTrust': True
> 
> blocker: also here, I wonder if we should just condition this now so it is
> more 'central ready'

It will be perma-on on all branches, so it is central-ready.
(In reply to Aki Sasaki [:aki] from comment #17)
> (In reply to Jordan Lund (:jlund) from comment #16)
> > Comment on attachment 8795497 [details] [diff] [review]
> > fix_signing_deps.diff
> > 
> > Review of attachment 8795497 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 

> > blocker: do these need this dependency? iiuc, we already depend on the
> > unsigned-task (the nightly build) and that itself depends on docker-image.
> 
> Yes, we need this dependency.

okay. I'll leave the implementation method up to you, dustin, and jonas.


> > blocker: granted this is just for date, but this defaults to true for all
> > tasks that use this image?
> 
> Yup.  I need it turned on. 

okay sounds good :)


> > 
> > non-blocker: I guess if this has a default value of [] it's fine. Maybe we
> > should include a default value in from_json too so we never have to worry
> > about it breaking in the future:
> > https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/task/
> > signing.py#58
> 
> I think we should require additional_dependencies to be set for signing; I
> just put this in in case, for now.  We should remove it later.

if it will be required, then we should definitely add it to signing.py#58 now so we don't forget.


> > blocker: also here, I wonder if we should just condition this now so it is
> > more 'central ready'
> 
> It will be perma-on on all branches, so it is central-ready.

rgr. wfm :)
Comment on attachment 8795497 [details] [diff] [review]
fix_signing_deps.diff

f+ with addition of SigningTask.from_json patch @ signing.py#58

I think I will likely refactor ['unsigned-task']['label'] and ['additional_dependencies'] and merge them into a single ['dependencies']. But I'll do that with the general cleanup and it shouldn't block you.
Attachment #8795497 - Flags: feedback- → feedback+
Assignee

Updated

3 years ago
Attachment #8795497 - Attachment is obsolete: true
Attachment #8795497 - Flags: feedback?(kmoir)
Attachment #8795497 - Flags: feedback+

Comment 20

3 years ago
mozreview-review
Comment on attachment 8795446 [details]
bug 1298553 - enable chain of trust generation in the decision task.

https://reviewboard.mozilla.org/r/81486/#review80410
Attachment #8795446 - Flags: review?(dustin) → review+

Comment 21

3 years ago
Pushed by asasaki@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e809e1f8304b
enable chain of trust generation in the decision task. r=dustin
Assignee

Updated

3 years ago
Depends on: 1304582
Assignee

Comment 22

3 years ago
Enabling build and docker-image CoT blows up in Try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=9f8cd9e86c77&selectedJob=28138394

Waiting until the AMI deployment in bug 1304582 finishes.
Assignee

Comment 23

3 years ago
Chain of Trust dependency traversal writeup: https://gist.github.com/escapewindow/a6a6944f51d4219d08284ededc65aa30

Does this match what we discussed in today's meeting and irc chat?
Flags: needinfo?(jopsen)
Flags: needinfo?(dustin)

Comment 24

3 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/e809e1f8304b
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Indeed, nice summary!
Flags: needinfo?(dustin)
Assignee

Comment 26

3 years ago
Oops, marking leave-open.
Status: RESOLVED → REOPENED
Keywords: leave-open
Resolution: FIXED → ---
Sounds fairly reasonable.
I don't get why you need a task.extra.chainOfTrust.inputs
I would just traverse task.dependencies recursively, and discover tasks that way.

Or look at the extra data in the chainOfTrust artifact as it should specify the taskIds of its input.
(I suspect you might have to use the chainOfTrust artifact for trust issues)
Flags: needinfo?(jopsen)
Assignee

Comment 28

3 years ago
(In reply to Jonas Finnemann Jensen (:jonasfj) from comment #27)
> Sounds fairly reasonable.
> I don't get why you need a task.extra.chainOfTrust.inputs
> I would just traverse task.dependencies recursively, and discover tasks that
> way.

This is because certain tasks will be at the bottom of the graph.  We don't care about tasks like tests in phase 1, so we're going to either have to explicitly state which tasks we do care about at decision time, or we have to guess which tasks we care about at scriptworker task time.  Explicit > implicit, so we're adding info in the task definitions.

> Or look at the extra data in the chainOfTrust artifact as it should specify
> the taskIds of its input.
> (I suspect you might have to use the chainOfTrust artifact for trust issues)

The taskId of the *build* task will be there, but not the taskId of the docker-image build.  To find that out, or other dependent tasks not explicitly listed, we're going to have to hardcode task definition logic that should be in-tree.
Just to answer Jonas's questions with my understanding, to check that it matches Aki's:

1. task.depenencies does not contain information about what to expect at each dependency: was it the docker-image-build task, or the build task, or ..?  Trying to "guess" from the shape of the dependent task's definition seems like a bad idea, so this lets us label the dependency.  Also, it means that we can list only the dependencies we care to validate here, while task.dependencies can include more (such as signing depending on some tests completing, but not incorporating any outputs from those tests)

2. the chainOfTrust artifact is basically just a signature over the task and a few other things, so using task.extra.chainOfTrust is a way of embedding arbitrary data in a format that the worker will sign.
Assignee

Comment 30

3 years ago
(In reply to Dustin J. Mitchell [:dustin] from comment #29)
> Just to answer Jonas's questions with my understanding, to check that it
> matches Aki's:

I think the answers match.  I definitely want to avoid the situation where I have to slog through dozens or hundreds of upstream tasks, and I have to guess which ones I care about.
Assignee

Comment 33

3 years ago
Posted patch cot_image (obsolete) — Splinter Review
I seem unable to re-push this to autoreview...
Attachment #8797671 - Flags: review?(dustin)
Comment on attachment 8797671 [details] [diff] [review]
cot_image

Review of attachment 8797671 [details] [diff] [review]:
-----------------------------------------------------------------

::: taskcluster/taskgraph/transforms/task.py
@@ +314,5 @@
>              "type": "task-image",
>          }
>  
> +    features = {
> +        'chainOfTrust': True

This could use a comment to say why we set this for all gecko tasks.
Attachment #8797671 - Flags: review?(dustin) → review+
Assignee

Comment 37

3 years ago
I think this should just enable docker-worker build + docker-image task chain of trust.
Attachment #8797671 - Attachment is obsolete: true
Attachment #8797791 - Flags: review?(dustin)
Comment on attachment 8797791 [details] [diff] [review]
enable_cot.diff

Review of attachment 8797791 [details] [diff] [review]:
-----------------------------------------------------------------

It's a think of beauty!!
Attachment #8797791 - Flags: review?(dustin) → review+
Assignee

Comment 39

3 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/77d637164bfdbf3ab7c30fe01f0bafae36c792c9
bug 1298553 - enable cot in docker-image and docker-worker build tasks. r=dustin
Assignee

Comment 41

3 years ago
Posted patch puppet.diffSplinter Review
This patch:

* populates the signing scriptworkers with individual gpg keypairs
* points the signing scriptworkers at the chain of trust gpg key git repo, which is cloned at first puppet
* adds my gpg pubkey to the list of valid keys to sign the chain of trust gpg key git repo
* renames config.json to scriptworker.json
* adds a cot_config.json for chain of trust config
* runs create_initial_gpg_homedirs script to create ~cltsign/.gnupg and all the worker-type gpg homedirs in /builds/scriptworker/gpg
* enables chain of trust artifact signing

I was able to successfully run a linux and android signing task with these, and we have signed chain of trust artifacts (first runs failed, but I fixed and retriggered green):

https://tools.taskcluster.net/push-inspector/#/IKbffHyKQJKtM3brHGm3gA/V5_IryB5TP2JBM-QWOoZNw?_k=w22js0
https://tools.taskcluster.net/push-inspector/#/bJvcVXncQ9OMYszPe-hnWQ/MCN1mvuyS9qH7nz8LhgFJw?_k=wnv34z

We may want to have the pubkeys in a shared scriptworker area; if you want to block on that, any tips?
Attachment #8799588 - Flags: review?(bugspam.Callek)
Comment on attachment 8799588 [details] [diff] [review]
puppet.diff

Review of attachment 8799588 [details] [diff] [review]:
-----------------------------------------------------------------

Nothing overtly wrong here. Did not review the gpg-creation-script referenced, as it appears its part of the scriptworker deployment itself.

::: modules/signing_scriptworker/manifests/init.pp
@@ +91,5 @@
>          '/root/certs.sh':
>              ensure => absent;
> +        "${signing_scriptworker::settings::root}/config.json":
> +            ensure => absent;
> +        "/home/${users::signer::username}/.gnupg":

Personally I'm a bigger fan of specifying the gnupg directory than relying on ~/.gnupg (since there is a chance that manual runs of gpg or stuff could affect this by accident) but I don't consider that blocking.

@@ +103,5 @@
> +            owner       => "${users::signer::username}",
> +            group       => "${users::signer::group}";
> +        "/home/${users::signer::username}/privkey":
> +            mode        => 600,
> +            content     => $config::signing_scriptworker_gpg_private_keys[$fqdn],

Can't say I'm a huge fan of something secret being in ::config (makes it harder to introspect that this is a problem if used/accessed elsewhere) -- but I'm not sure how else to do it since the secret function (afaik) doesn't easily support hiera hashing.

@@ +113,5 @@
> +            mode        => 700,
> +            owner       => "${users::signer::username}",
> +            group       => "${users::signer::group}",
> +            source      => 'puppet:///modules/signing_scriptworker/git_pubkeys',
> +            recurse     => true;

I've seen recurse=>true on directories being a cpu/io drain in the past, and I forget if it also purges old entries (maybe purge=>true?). :dustin or :dividehex may know more on this piece.

::: modules/signing_scriptworker/templates/cot_config.json.erb
@@ +11,5 @@
> +        },
> +        "generic-worker": {
> +            "type": "flat",
> +            "ignore_suffixes": [".md"]
> +        },

docs in git repo for {docker,generic}-worker say to use subdirs, so is "type": "flat" right here?
Attachment #8799588 - Flags: review?(bugspam.Callek) → review+
Assignee

Comment 43

3 years ago
(In reply to Justin Wood (:Callek) from comment #42)
> Nothing overtly wrong here. Did not review the gpg-creation-script
> referenced, as it appears its part of the scriptworker deployment itself.

That's correct.

> ::: modules/signing_scriptworker/manifests/init.pp
> @@ +91,5 @@
> >          '/root/certs.sh':
> >              ensure => absent;
> > +        "${signing_scriptworker::settings::root}/config.json":
> > +            ensure => absent;
> > +        "/home/${users::signer::username}/.gnupg":
> 
> Personally I'm a bigger fan of specifying the gnupg directory than relying
> on ~/.gnupg (since there is a chance that manual runs of gpg or stuff could
> affect this by accident) but I don't consider that blocking.

We only use ~/.gnupg for git commit verification; other gpg homedirs are put in /builds/scriptworker/gpg/{docker-,generic-,script}worker/ .
We could potentially override ~/.gnupg for git, but we'd probably have to set an env var before calling git.

> @@ +103,5 @@
> > +            owner       => "${users::signer::username}",
> > +            group       => "${users::signer::group}";
> > +        "/home/${users::signer::username}/privkey":
> > +            mode        => 600,
> > +            content     => $config::signing_scriptworker_gpg_private_keys[$fqdn],
> 
> Can't say I'm a huge fan of something secret being in ::config (makes it
> harder to introspect that this is a problem if used/accessed elsewhere) --
> but I'm not sure how else to do it since the secret function (afaik) doesn't
> easily support hiera hashing.

Yeah.  I followed this same pattern for the signing server ssl certs.  If we figure out a better way, we can change both.

> @@ +113,5 @@
> > +            mode        => 700,
> > +            owner       => "${users::signer::username}",
> > +            group       => "${users::signer::group}",
> > +            source      => 'puppet:///modules/signing_scriptworker/git_pubkeys',
> > +            recurse     => true;
> 
> I've seen recurse=>true on directories being a cpu/io drain in the past, and
> I forget if it also purges old entries (maybe purge=>true?). :dustin or
> :dividehex may know more on this piece.

It looks like I want recurse => remote or recurse => true, purge => true https://docs.puppet.com/puppet/latest/reference/types/file.html#file-attribute-recurse
I can use recurselimit as well, though I don't expect to see any subdirectories in here.
Good catch!

> ::: modules/signing_scriptworker/templates/cot_config.json.erb
> @@ +11,5 @@
> > +        },
> > +        "generic-worker": {
> > +            "type": "flat",
> > +            "ignore_suffixes": [".md"]
> > +        },
> 
> docs in git repo for {docker,generic}-worker say to use subdirs, so is
> "type": "flat" right here?

Yeah.  It's flat vs signed, and doesn't affect whether we recurse into the source tree for pubkeys.  Flat means one layer of pubkeys, all signed by the ultimate privkey, so they're all valid: flat as in not nested gpg structure.  Signed means we import, trust, and sign the trusted keys, which already have signed the valid keys.  The valid keys we import but don't sign.  It's not flat because there's a layer of valid keys below the layer of trusted keys.
https://github.com/escapewindow/scriptworker/blob/cot/scriptworker/gpg.py#L1031
Assignee

Comment 44

3 years ago
> > @@ +113,5 @@
> > > +            mode        => 700,
> > > +            owner       => "${users::signer::username}",
> > > +            group       => "${users::signer::group}",
> > > +            source      => 'puppet:///modules/signing_scriptworker/git_pubkeys',
> > > +            recurse     => true;
> > 
> > I've seen recurse=>true on directories being a cpu/io drain in the past, and
> > I forget if it also purges old entries (maybe purge=>true?). :dustin or
> > :dividehex may know more on this piece.
> 
> It looks like I want recurse => remote or recurse => true, purge => true
> https://docs.puppet.com/puppet/latest/reference/types/file.html#file-
> attribute-recurse
> I can use recurselimit as well, though I don't expect to see any
> subdirectories in here.
> Good catch!

I touched a bogus file in the dir, client side, then tried both of these.
Looks like recurse => remote doesn't clean up unmanaged files.
|recurse => true, recurselimit => 1, purge => true| appears to be what we want.
Assignee

Comment 45

3 years ago
https://hg.mozilla.org/build/puppet/rev/60eaaa1901befc4c3a11b603540dfeefab7b87d3
bug 1298553 - enable signed chain of trust in signing scriptworkers. r=callek
Assignee

Updated

3 years ago
Blocks: 1309293
Assignee

Comment 46

3 years ago
Modifying this bug to only track uploading the signed chain of trust artifact, and opened bug 1309293 for graph chain of trust verification.
Status: REOPENED → RESOLVED
Closed: 3 years ago3 years ago
Resolution: --- → FIXED
Summary: scriptworker chain of trust support → scriptworker signed chain of trust artifact
Assignee

Updated

3 years ago
Blocks: 1317789
Component: Tools → General
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open
You need to log in before you can comment on or make changes to this bug.