Closed Bug 945751 Opened 11 years ago Closed 10 years ago

Eliminate use of shared k/v store by signing

Categories

(Release Engineering :: General, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Assigned: dustin)

References

Details

Attachments

(1 file)

In some discussions of the security of signing, :kang, :catlee, and I noted that:

* Tokens could easily be self-contained: cleartext data + signature.  With this arrangement, any signing server could validate a token produced by any signing server, without any need for communication or shared storage

* Nonces do require shared storage, as they must be invalidated when used.  However, nonces do not provide a substantial level of additional security above that provided by tokens:
  * the initial nonce is a well known value ("")
  * nonces aren't revealed during signing, so assuming the secrecy of the token, the nonce provides no additional security

So, the proposal is:
 * expand tokens to contain their contents in cleartext, as well as the existing signature.  This just increases the length of the token.
 * validate tokens in-place, without reference to a k/v store
 * do not validate nonces any longer, but continue to accept and generate them to avoid protocol incompatibilities
 * remove use of a k/v store

This doesn't present any protocol changes for the masters or slaves -- it's a change entirely internal to the signing system.

:kang, what do you think?
Flags: needinfo?(gdestuynder)
The particular urgency for this bug is that buildapi and signing currently share the same redis DB, meaning that a compromise of buildapi could lead to, at least, DOS or replay attacks on signing.
Blocks: 846332
Blocks: 945927
if i summarize, this basically means we're removing the nonce validity check.

For the nonces, the risk perspective is IMO: even if you could replay nonces, it doesn't change much. all build slaves will be able to generate nonces at one point or another, and thus able to build+sign stuff.

as per our previous discussion, it would also be interesting to use standard key management libraries (such as openssl) for the token, generating a certificate, instead of using a hash (easier to plug a HSM in later on, well understood behavior of the mechanism)

In any case this needs to be well documented. My current understanding is (please correct it if that's not right!):

- master reqs a signing token to signing server
- signing server generates one and signs it (this is where it could sign with openssl or even gpg)
- master passes token to slave
- slave builds, push binary, and tells signing server "sign this binary, here's my token"
** with nonce removal, slave can potential ask many signs at once here, until token is invalidated
- signing server verifies token validity (signature, time window check - this could be the "expire" setting if using gpg/openssl)
- signing server signs the builds
- token eventually expire, so can't be reused after a period of time

- signing server revoke signature on token?
Flags: needinfo?(gdestuynder)
Thanks!

Using OpenSSL or something like it wouldn't address the immediate need, which is to not depend on a k/v store.

This is pretty thoroughly documented - based on my conversation with you - at https://mana.mozilla.org/wiki/display/IT/Product+Signing.  That matches your description.

No, the signing server does not revoke the signature.  As we discussed, revocations require shared storage, and this bug is to eliminate use of shared storage.

So, I think we're good to go here from a security perspective.
This is probably a little off topic at this point, but it's worth noting that we generate a single token, pass it to the slave, and then make many signing requests to the server without generating a new token for each one. If I'm understanding correctly, then token revocation wouldn't work unless we allow the client to tell the server "I'm done" at some point.
Yep, and this bug is intentionally not modifying the master or slave sides of the protocol at all (although it makes the nonces in that protocol moot)
Assignee: nobody → dustin
Attached patch bug945751.patchSplinter Review
This removes redis, removes nonces entirely except for 'UNKNOWN' in the protocol, removes all storage of tokens or nonces, and converts tokens from standalone signatures to data!signature, verifying the signature on every signing request.  This also adds a test to simulate a master and slave talking to the service.  The test passed before (without the token=token.replace bit which didn't make sense), and it passes now.
Attachment #8347419 - Flags: review?(catlee)
Catlee, ping?
Comment on attachment 8347419 [details] [diff] [review]
bug945751.patch

Review of attachment 8347419 [details] [diff] [review]:
-----------------------------------------------------------------

This looks OK to me, but it would be good to have catlee's eyes on it too.

We should wait until Monday to land this. I don't want to be restarting signing servers on a Friday, and I'm on buildduty Monday anyways.
Attachment #8347419 - Flags: review?(catlee) → review+
Rail and I just talked about to properly deploy this. We determined that we need to restart all but one of the signing servers (of each type), then wait awhile (probably an hour or two) before restart the last ones. This is because there will be some builds that have requested a token but not finished signing when the bulk of the servers are upgraded. These will fail to sign when they talk to upgraded servers, but because of the retries we have in place they'll eventually fall back to the one with the old code. Once we're reasonably sure that all of the builds with old-style tokens have finished signing, we can restart the final servers.

This probably doesn't apply to release key servers, and maybe not nightly ones (if we time it correctly).
Comment on attachment 8347419 [details] [diff] [review]
bug945751.patch

https://hg.mozilla.org/build/tools/rev/426b20e6484e

Ben's going to follow the deployment process described in the previous comment.
Attachment #8347419 - Flags: checked-in+
Release signing servers are restarted.

mac-signing1, 2, 4 and signing4 and 5 both have their dep and nightly servers restarted. mac-signing3's and signing6's dep and nightly instances will be restarted in an hour or so - after builds with an existing token have time to complete.
All of the signing servers have been restarted. I realized there may be a few failures after all - any jobs that were in between getting a token from mac-signing3 or signing6 and finishing their signing will fail. We'll have to live with that.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Component: General Automation → General
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: