add chain of trust verification support to generic-worker
Categories
(Taskcluster :: Workers, enhancement)
Tracking
(Not tracked)
People
(Reporter: mozilla, Unassigned)
References
(Blocks 1 open bug)
Details
Comment 1•6 years ago
Reporter
Comment 2•6 years ago
Comment 3•6 years ago
Reporter
Comment 4•6 years ago
Assignee
Updated•6 years ago
Comment 5•5 years ago
Hi Aki,
I like the idea of generic-worker verifying signed artifact signatures when mounting artifacts. Is there an authoritative source for the public key of a given worker type? I'm wondering how this would look for redeployable taskcluster environments. Perhaps worker manager should have an endpoint to return the public key of a given worker type?
Reporter
Comment 6•5 years ago
(In reply to Pete Moore [:pmoore][:pete] from comment #5)
> I like the idea of generic-worker verifying signed artifact signatures when mounting artifacts.
For completeness, we should verify the entire chain back to the tree as well. Verifying the artifact sha and signature means that the artifact was created in a trusted workerType. Verifying the chain means that the request (task definition) came from a trusted tree.
I'm currently thinking we should port scriptworker.cot.verify to a standalone module/tool that can both verify the chain and download artifacts with verification. Golang is a likely candidate so we can import it in generic-worker and distribute standalone binaries for others to use, rather than requiring they install a python virtualenv.
> Is there an authoritative source for the public key of a given worker type? I'm wondering how this would look for redeployable taskcluster environments. Perhaps worker manager should have an endpoint to return the public key of a given worker type?
Currently that's here. We should pull that out of scriptworker as well.
I really like the idea of exposing the CoT public keys in worker manager. At some point we may have the capability of having public keys per workerType, not just per worker implementation, reducing the fallout if a key is compromised. However, I have misgivings about using it as the source of truth: if we can update our CoT trust through scopes, then CoT is no longer a second factor to scopes.
I gave this some thought and now I'm thinking that worker manager could have information about runtime CoT public keys per worker implementation or workerType (possibly even per-worker?). But something like ci-configuration would have the definitive trusted set of public keys: by using vcs, we have an audit log, history, and reviews. (If we have standalone CoT verification, we probably have the entire CoT verification config landed there.)
Comment 7•5 years ago
A move of CoT into the platform should take the form of a proposal that leads to an RFC, and be considered on its own merits (that is, without much weight given to "that's how we do it now"). I think that would be a real strength for Taskcluster!
That's also a pretty substantial project, so we should think about the schedule for such a thing -- I would guess this could happen after docker-worker is deprecated and after we have migrated Firefox CI production, at least.
Comment 8•5 years ago
I've chatted with :aki about this some, and I'm not sure that I agree that full chain-of-trust verification should move into tree.
I do think it would be valuable to verify that mounted task artifact hashes match the hash recorded in chain-of-trust.json. I'm not sure if the workers should do any more work around verifying chain-of-trust.
Reporter
Comment 9•5 years ago
(In reply to Tom Prince [:tomprince] from comment #8)
> I've chatted with :aki about this some, and I'm not sure that I agree that full chain-of-trust verification should move into tree.
> I do think it would be valuable to verify that mounted task artifact hashes match the hash recorded in chain-of-trust.json. I'm not sure if the workers should do any more work around verifying chain-of-trust.
I think this is analogous to verifying an SSL connection is signed, but not verifying that it's chained to a trusted CA, and then verifying all previous connections when we do something sensitive like see a password field or a credit card field. It's less secure overall: it opens up potential backdoor ways to add malicious traffic when we don't do the full check. It complicates the full check further, because we have to verify previous traffic as well as the current traffic. If the check happens in a library, you're essentially telling it to run a partial check most of the time, and then a greater-than-normal check at the most sensitive times; overall this will result in a more complex library. I believe that running the full check of the current task's inputs and request is the most sane way forward, and I agree that means we'll need to craft and RFC a proposal to make CoT part of the platform.
Updated•5 years ago
Comment 10•5 years ago
(In reply to Aki Sasaki [:aki] (he/him) (UTC-7) from comment #9)
> we'll need to craft and RFC a proposal to make CoT part of the platform.
Aki: given recent discussions around this, is this bug still valid?
Reporter
Comment 11•5 years ago
This is still a want from my side to enable end-to-end CoT verification in firefoxci taskgraphs. I could see potentially adding generic-worker hooks to allow for pre-task calls, and then adding CoT checks to firefoxci pools.
As long as we're tracking it somehow, I'm open to whatever we do with this bug.