Closed Bug 1301382 Opened 8 years ago Closed 5 years ago

Cloning from interactive loaners should pin hg.mozilla.org fingerprint

Categories

(Firefox Build System :: Task Configuration, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: ahal, Unassigned)

References

(Blocks 1 open bug)

Details

This is a spin-off of bug 1298947. It is likely lower priority, as cloning via the run-wizard script is a rarer occurrence. See the patch in the other bug for what needs to be done here:
https://dxr.mozilla.org/mozilla-central/rev/ab70808cd4b6c6ad9a57a9f71cfa495fcea0aecd/taskcluster/scripts/tester/run-wizard#78
Andrew, is this something you could knock out, and if not, maybe a mentor-able bug?
Flags: needinfo?(ahalberstadt)
Sure, I can mentor it. I probably shouldn't do the review, but can help get a patch put up.
Mentor: ahalberstadt
Flags: needinfo?(ahalberstadt)
Testing this requires the ability to get a one-click loaner, though. Can volunteer contributors with level 1 commit access but no LDAP do that?
Level 1 comes with LDAP, so shouldn't be an issue.
Once this fingerprint is available without TC credentials, I think this will be a non-issue.
Assignee: nobody → dustin
It occurs to me that it's pretty silly to connect to one https service (secrets, in this case) without pinning the cert, just to pin the cert for another service.  The solution I suggested in comment 5 makes it even sillier: fetch the fingerprint for hg.mozilla.org from https://hg.mozilla.org/build/ci-configuration.

Maybe we should just bake this into the docker images?
If we bake it into the Docker images, then tasks start failing on those images whenever we install a new certificate. We need a mechanism that resolves the fingerprint at run-time so certificate swaps can be conducted without breaking e.g. esr builds.

Yes, fetching from one https:// service to fetch the fingerprint for another is kinda crazy. The chain is as strong as its weakest link. You have to anchor trust somewhere. I'd anchor trust with the secrets service. Then I'd figure out the most secure way possible to get data from that service into a task. Relying on the default trusted root CA bundle is probably not good enough. I'd use a self-issued certificate and pin that in the worker images. Or have the provisioner pass that fingerprint into the worker at spawn time. In the rare case where we need to rotate the certificate for the secrets service, we can mass respawn workers. Shouldn't be a big deal considering how ephemeral they are.
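
For readers following along, here is a minimal sketch of what pinning at run-time amounts to once a fingerprint has been obtained from whatever source ends up being trusted. The helper names (sha256_fingerprint, hg_pin_args) are hypothetical and are not taken from run-wizard or run-task; the only real interface assumed is Mercurial's hostsecurity.<host>:fingerprints config option.

```python
import hashlib


def sha256_fingerprint(der_cert: bytes) -> str:
    """Format the SHA-256 digest of a DER-encoded certificate as
    'sha256:AA:BB:...' (hypothetical helper for illustration)."""
    digest = hashlib.sha256(der_cert).hexdigest().upper()
    pairs = [digest[i:i + 2] for i in range(0, len(digest), 2)]
    return "sha256:" + ":".join(pairs)


def hg_pin_args(host: str, fingerprint: str) -> list:
    """Build the extra `hg` arguments that pin `host` to `fingerprint`
    via the hostsecurity config section (hypothetical helper)."""
    return ["--config", "hostsecurity.%s:fingerprints=%s" % (host, fingerprint)]


# Example: the pinned value must come from the trusted source discussed
# above, not from the server being verified.
# args = ["hg"] + hg_pin_args("hg.mozilla.org", pinned) + ["clone", url, dest]
```

The fingerprint only has value if it arrives over a channel that is itself trusted, which is the weakest-link concern raised in this comment.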
Well, then you're in the same situation as you rejected in your first paragraph -- tasks start failing on those images whenever we change the secrets service certificate.  So we could make another facts-about-secrets service, maybe, and promise to never change its certificate?  It feels like a lot of work for one string that will hardly ever change.

If we baked the fingerprint into an image, we could just land a patch with the new cert before changing it on the servers.  That would leave anything before that patch unable to build once we change the servers.

So both of those are not solutions.  In fact, TLS already provides a solution: CAs.  Generate a self-signed CA cert with an air-gapped privkey, generate certs for hg.m.o, and burn the CA pubkey (and associated CRL) into the docker images.  When the server private key gets disclosed, just revoke it and generate a new one, putting the new CRL in the new docker image.  Then configure the workers to verify the CA and CRL.  Older docker images will still be willing to talk to a host with the revoked cert until it expires, but that's almost unavoidable (those images *legitimately* talked to such hosts at one point!)
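
A rough sketch of the client side of that idea, using Python's standard ssl module rather than anything in the existing worker code. The function name and the file arguments are hypothetical; in practice they would point at the CA certificate and CRL burned into the docker image.

```python
import ssl


def build_private_ca_context(ca_file=None, crl_file=None):
    """Return a client TLS context that trusts only the given CA file
    and, if a CRL is supplied, rejects revoked leaf certificates.
    (Hypothetical helper; paths are assumptions, not real image paths.)"""
    # PROTOCOL_TLS_CLIENT enables hostname checking and requires a
    # valid certificate; no system CA bundle is loaded by default.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)
    if crl_file:
        # load_verify_locations also accepts CRLs in PEM format.
        ctx.load_verify_locations(cafile=crl_file)
        ctx.verify_flags |= ssl.VERIFY_CRL_CHECK_LEAF
    return ctx
```

Because the context never loads the default root bundle, only certificates issued by the private CA would verify, and rotating the server certificate would not require touching older images.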

I'm not going to turn off the current fetch-from-secrets thing, but I don't think it's worth further work to support it.  Hopefully the work to put mirrors in AWS will include a proper certificate hierarchy and we can retire the fetch-from-secrets approach, fixing this bug.  In the interim, I'll leave it open for mentoring since it sounds like the patch is pretty simple.
Assignee: dustin → nobody
For hg.m.o in AWS, a self-signed CA is probably a good idea. The service will be private, so we don't need to establish a trust root with a public CA.
Mentor: ahalberstadt
Product: TaskCluster → Firefox Build System

Cloning is now mostly done via run-task, which does pin the fingerprint (though it still suffers from fetching the secret via a different HTTPS service). run-task is also used in interactive tasks.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME