1625322 - Let's Encrypt: Failure to revoke key-compromised certificates within 24 hours

Reporter

Description

•

6 years ago

Reported by Matt Palmer on mozilla.dev.security.policy on 2020-03-20.

These two certificates:

Were issued by Let's Encrypt more than 24 hours ago, and remain unrevoked,
despite the revocation of the below two certificates, which use the same
private key, for keyCompromise prior to the above two certificates being
issued:

As per recent discussions here on m.d.s.p, I believe this is a breach of BR
s4.9.1.1.

Flags: needinfo?(jaas)

Ryan Sleevi

Reporter

Comment 1

•

6 years ago

Josh: Could you please provide an incident report for Let's Encrypt?

Note the related discussion in the Let's Encrypt community.

Ryan Sleevi

Reporter

Updated

•

6 years ago

Status: NEW → ASSIGNED

Andrew Gabbitas

Comment 2

•

6 years ago

Summary:

Our current procedure is that when a key compromise is reported to us via cert-prob-reports@letsencrypt.org, the contact method in sections 1.5.2 and 4.9.3 of our CPS, we add the key to a list of blocked keys for future issuance. We also revoke other certificates that were issued with the same compromised key. When a Subscriber revokes a certificate via the ACME API, we do not currently block that certificate's key for future issuance or revoke other certificates issued with the same key.

The certificates in this incident report were revoked via the ACME API, and therefore we did not block the corresponding keys for future issuance.

We have been treating our obligation to block keys as being triggered by reporting via the methods listed in our CPS. However, the substance of Matt's report, that it should be easier to block future issuance of compromised keys in an automated way (for instance via our ACME API), is sensible, and so we plan to implement support for that in our API by May 14, 2020. We have a relevant issue filed at:

https://github.com/letsencrypt/boulder/issues/4712

Incident Report:

How your CA first became aware of the problem?

On 2020-03-19 Matt Palmer started a topic on our community support forums. https://community.letsencrypt.org/t/pre-issuance-checking-of-previously-revoked-private-keys/116762

A timeline of the actions your CA took in response.

2020-03-19T05:16:45Z: Certificate 03cef2bf92b26e53d9df899bc0d3733dfbff issued.
2020-03-19T06:12:43Z: Certificate 039b4d9fb7bd5380bd1f112a39a3c92196a6 issued.
2020-03-19T10:03:47Z: Certificate 039b4d9fb7bd5380bd1f112a39a3c92196a6 revoked via ACME API.
2020-03-19T10:04:13Z: Certificate 03cef2bf92b26e53d9df899bc0d3733dfbff revoked via ACME API.
2020-03-19T11:31:00Z: Matt Palmer posted to the Let’s Encrypt community forum https://community.letsencrypt.org/t/pre-issuance-checking-of-previously-revoked-private-keys/116762
2020-03-19T18:44:38Z: Certificate 03b69639064dad953ddbf063d8d6544ec8c0 issued.
2020-03-19T22:57:12Z: Certificate 04c1ed2562da2f7671a04f0ce9c204960552 issued.
2020-03-20T22:25:45Z: Matt Palmer opened a thread on MDSP https://groups.google.com/forum/#!msg/mozilla.dev.security.policy/Iz4lOBclSuk/tlDcS0fbAgAJ .
2020-03-22T05:24:04Z: Certificate 04c1ed2562da2f7671a04f0ce9c204960552 revoked via ACME API.
2020-03-22T05:35:01Z: Certificate 03b69639064dad953ddbf063d8d6544ec8c0 revoked via ACME API.
2020-03-26T22:22:00Z: Ryan Sleevi opened this Bugzilla report and assigned it to Josh Aas.
2020-03-31T01:32:00Z: Let’s Encrypt blocked issuance for the two compromised keys.
2020-03-31T01:43:33Z: Let’s Encrypt revoked certificates issued with the compromised keys: 03cc5bbfecd73cf2cee58a06281aaee2dfc3, 037408261c612a3d12baa68bdff4337d34d9

Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem.

All certificates with the same SubjectPublicKeyInfo as those reported have been revoked and added to our Key Compromise blocklist. This will prevent any new certificates from being issued with the same private key. We have not yet implemented the changes described to our revocation API.

A summary of the problematic certificates.

Six certificates issued between 2020-03-19 and 2020-03-30 were affected.

The complete certificate data for the problematic certificates.

Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

See summary text at the top of this post.

List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.

We plan to implement support for adding compromised keys revoked via our API to a blocklist in our API by May 14, 2020. We have a relevant issue filed at: https://github.com/letsencrypt/boulder/issues/4712

Josh Aas

Assignee

Updated

•

6 years ago

Flags: needinfo?(jaas)

mpalmer

Comment 3

•

6 years ago

Our current procedure is that when a key compromise is reported to us via cert-prob-reports@letsencrypt.org, the contact method in sections 1.5.2 and 4.9.3 of our CPS, we add the key to a list of blocked keys for future issuance. We also revoke other certificates that were issued with the same compromised key. When a Subscriber revokes a certificate via the ACME API, we do not currently block that certificate's key for future issuance or revoke other certificates issued with the same key.

For what reason was the procedure for routinely blocking issuance using keys reported as compromised via e-mail developed and deployed?

We have been treating our obligation to block keys as being triggered by reporting via the methods listed in our CPS.

This is a disconcerting statement. Is it Let's Encrypt's position that revocation via the ACME API does not need to comply with the BRs, because it is not a contact method listed in Let's Encrypt's CPS?

I note that the BRs do not say that CAs are required to revoke keys for which evidence was obtained by a contact method listed in the CA's CPS, but merely that "evidence was obtained", without limit as to the means by which the evidence came into the CA's possession. What does Let's Encrypt believe a revocation request sent to Let's Encrypt's ACME API, signed by the key used in a certificate, with a revocation reason of "key compromise" is, exactly, if not evidence of key compromise?

Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem.

All certificates with the same SubjectPublicKeyInfo as those reported have been revoked and added to our Key Compromise blocklist. This will prevent any new certificates from being issued with the same private key.

Is this a "yes" or a "no" to the question in the heading? I cannot tell.

We plan to implement support for adding compromised keys revoked via our API to a blocklist in our API by May 14, 2020.

What steps is Let's Encrypt taking to prevent further misissuance in the interim, until the technical mitigations are in place?

Has Let's Encrypt done anything to identify if any other certificates have been impacted by this problem, taken steps to revoke those certificates and prevent further issuance using those keys?

Flags: needinfo?(agabbitas)

Andrew Gabbitas

Comment 4

•

6 years ago

For what reason was the procedure for routinely blocking issuance using keys reported as compromised via e-mail developed and deployed?

Because a certain number of key compromise cases require human judgement, our contact methods for key compromise need to include a way of contacting a human. Email was the suitable and common choice here.

However, I think your underlying question is: Why did we not additionally implement a workflow where Subscribers can block keys using the ACME revocation API? Part of the answer is that ACME did not actually specify revocation reasons until sometime after initial implementation: https://github.com/ietf-wg-acme/acme/pull/140. The next obvious question is: At the time we implemented revocation reason, why didn’t we implement key blocking when a Subscriber specifies reason = keyCompromise? Our reasoning at the time was anti-abuse: Because anybody can issue a certificate and immediately revoke it as key compromised, and the obligation to block keys is indefinite, we were concerned about people abusing the self-service API to indefinitely inflate our list of blocked keys with keys they had just generated. So we wanted a human in the loop for such decisions, to evaluate the question of whether the key was really compromised.

However, re-examining that decision now, we’ve realized that it’s far better to find and block such abuse patterns at issuance time rather than at revocation time, since of course that hypothetical user could trivially make sure each key they issued was immediately compromised, by publishing it.

Is it Let's Encrypt's position that revocation via the ACME API does not need to comply with the BRs, because it is not a contact method listed in Let's Encrypt's CPS?

Naturally everything we do is governed by the BRs and our CPS.

What does Let's Encrypt believe a revocation request sent to Let's Encrypt's ACME API, signed by the key used in a certificate, with a revocation reason of "key compromise" is, exactly, if not evidence of key compromise?

As described above, it could also be evidence of spam or accidental misuse of the API. However, based on this discussion, we’re updating our approach to this: it’s better to be over-inclusive than under-inclusive, and as you’ve said there are large practical benefits to doing key blocking via the API rather than via human intervention.

Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem.
All certificates with the same SubjectPublicKeyInfo as those reported have been revoked and added to our Key Compromise blocklist. This will prevent any new certificates from being issued with the same private key.
Is this a "yes" or a "no" to the question in the heading? I cannot tell.

Yes to the narrow question: the particular public keys reported in this incident report. No to the broader question: We have not yet implemented automated key blocking by Subscribers in our ACME API.

What steps is Let's Encrypt taking to prevent further misissuance in the interim, until the technical mitigations are in place?
Has Let's Encrypt done anything to identify if any other certificates have been impacted by this problem, taken steps to revoke those certificates and prevent further issuance using those keys?

(Answering both questions at once): As of 2020-04-05T00:00 we’ve generated a list of Subscriber-revoked certificates with reasonCode = keyCompromise, based on our last year of revocations, and blocked the keys used to create them. We’re currently generating a list of valid certificates using those same keys, and will notify Subscribers and revoke said certificates once that list has been compiled.

We will be checking every 24 hours for new reasonCode = keyCompromise revocations and manually revoking the affected certificates until we implement automated key blocking in our ACME API.

Flags: needinfo?(agabbitas)

mpalmer

Comment 5

•

6 years ago

(In reply to Andrew Gabbitas from comment #4)

However, I think your underlying question is: Why did we not additionally implement a workflow where Subscribers can block keys using the ACME revocation API? Part of the answer is that ACME did not actually specify revocation reasons until sometime after initial implementation: https://github.com/ietf-wg-acme/acme/pull/140.

I'm not sure what to make of this statement. The Let's Encrypt ACME API does accept a revocation reason, which appears to be correctly propagated into the OCSP response. Why does the precise timeline of the requirement being added to the specification, prior to its publication as an RFC, have a bearing on Let's Encrypt's operations, several years later?

The next obvious question is: At the time we implemented revocation reason, why didn’t we implement key blocking when a Subscriber specifies reason = keyCompromise? Our reasoning at the time was anti-abuse: Because anybody can issue a certificate and immediately revoke it as key compromised, and the obligation to block keys is indefinite, we were concerned about people abusing the self-service API to indefinitely inflate our list of blocked keys with keys they had just generated. So we wanted a human in the loop for such decisions, to evaluate the question of whether the key was really compromised.

However, re-examining that decision now, we’ve realized that it’s far better to find and block such abuse patterns at issuance time rather than at revocation time, since of course that hypothetical user could trivially make sure each key they issued was immediately compromised, by publishing it.

While I can understand the desire to have anti-abuse mechanisms in place (as the operator of a large store of compromised keys, the issue of someone deciding to generate umpty-zillion keys for the lulz does weigh on my mind), it feels like Let's Encrypt didn't really think this particular abuse channel through to its conclusion. If a miscreant wants to abuse the Let's Encrypt revocation service, they could just as easily request revocation of a lot of certificates by e-mailing those revocation requests, along with the private keys (or proof thereof). As that processing involves a human, it would seem to be an even more effective method of consuming Let's Encrypt's time and energy than automatically revoking the certificates.

At any rate, requiring "proof" of compromise, rather than an assertion of compromise, is an odd standard to maintain, and one which I don't think is supported by the BRs or Mozilla Policy, and is certainly not a standard that I'd like to see encouraged. It places the bar to request revocation much higher than it needs to be, and encourages the further dissemination of compromised keys, which is a really bad idea. If someone who can demonstrate control of the private key (by generating a signature with the key, or decrypting a string encrypted with the public key) asserts that the key is compromised, that has to be sufficient evidence of compromise.

Requiring that the reporter "prove" that it "really really really is compromised", and not just someone having a laugh, is dangerous and counterproductive. Many keys are only exposed briefly, such that by the time the certificate problem report is processed, the initial publication location no longer returns the private key. However, the key is still compromised, because it could have been retrieved by anyone during the time it was exposed (as, indeed, it was, otherwise I wouldn't have it). If I say "well, it was published at this URL", you can't verify that -- I could very easily be making that up. Short of re-publishing the key somewhere else, how do I prove, to any reasonable standard of evidence, that the key is actually compromised, and I'm not just making up a story about where I got it from?

On the upside, now that we know that at least one CA is making up extra requirements for demonstrating key compromise, Mozilla can do something to make all CAs aware of the reasonable requirements for validating key compromise reports.

Is it Let's Encrypt's position that revocation via the ACME API does not need to comply with the BRs, because it is not a contact method listed in Let's Encrypt's CPS?

Naturally everything we do is governed by the BRs and our CPS.

Could you square this reassuring assertion with your earlier statement, that "We have been treating our obligation to block keys as being triggered by reporting via the methods listed in our CPS."? Let's Encrypt recognises an "obligation to block keys", but only when reported via the methods listed in its CPS. From where does Let's Encrypt believe this obligation arises, and how does this obligation only apply to reporting made via the methods listed in its CPS, and not reporting made via the ACME API?

On a closely related point, does Let's Encrypt have any intention of including the ACME API in its CPS as a method of contact, as other CAs have committed to doing?

What does Let's Encrypt believe a revocation request sent to Let's Encrypt's ACME API, signed by the key used in a certificate, with a revocation reason of "key compromise" is, exactly, if not evidence of key compromise?

As described above, it could also be evidence of spam or accidental misuse of the API.

Is it not possible that a certificate problem report sent to Let's Encrypt's problem reporting address, providing equivalent evidence to that used to make a validated revocation request via the ACME API, could be evidence of spam or accidental misuse of SMTP?

Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem.
All certificates with the same SubjectPublicKeyInfo as those reported have been revoked and added to our Key Compromise blocklist. This will prevent any new certificates from being issued with the same private key.
Is this a "yes" or a "no" to the question in the heading? I cannot tell.

Yes to the narrow question: the particular public keys reported in this incident report. No to the broader question: We have not yet implemented automated key blocking by Subscribers in our ACME API.

Thank you for this clarification.

What steps is Let's Encrypt taking to prevent further misissuance in the interim, until the technical mitigations are in place?
Has Let's Encrypt done anything to identify if any other certificates have been impacted by this problem, taken steps to revoke those certificates and prevent further issuance using those keys?

(Answering both questions at once): As of 2020-04-05T00:00 we’ve generated a list of Subscriber-revoked certificates with reasonCode = keyCompromise, based on our last year of revocations, and blocked the keys used to create them. We’re currently generating a list of valid certificates using those same keys, and will notify Subscribers and revoke said certificates once that list has been compiled.

I presume that list of additionally affected certificates will be provided to this incident report?

We will be checking every 24 hours for new reasonCode = keyCompromise revocations and manually revoking the affected certificates until we implement automated key blocking in our ACME API.

That seems too infrequent, given the requirement for the revocation to have been published within 24 hours of the notification of compromise. Does this procedure also include adding the keys of the affected certificates to your key compromise blocklist, so as to prevent further use of those keys in new certificates? Because otherwise it seems like there's still a gap here, whereby a key used in a certificate revoked for key compromise later this week could be used again next week for a new certificate, without being noticed and revoked within 24 hours of issuance.

Flags: needinfo?(agabbitas)

Andrew Gabbitas

Comment 6

•

6 years ago

As that processing involves a human, it would seem to be an even more effective method of consuming Let's Encrypt's time and energy than automatically revoking the certificates.

Agreed. Handling these reports via email is time-consuming, and automating this process is a top priority for our organization.

On a closely related point, does Let's Encrypt have any intention of including the ACME API in its CPS as a method of contact, as other CAs have committed to doing?

This is a good idea. We will begin an internal dialogue and add clarification to our CPS once we reach an agreement.

I presume that list of additionally affected certificates will be provided to this incident report?

Yes. We have added the SPKI hash of our Subscriber-revoked certificates to our key compromise blocklist and revoked any active certificates using those keys.

We will be checking every 24 hours for new reasonCode = keyCompromise revocations and manually revoking the affected certificates until we implement automated key blocking in our ACME API.

That seems too infrequent, given the requirement for the revocation to have been published within 24 hours of the notification of compromise.

You are correct, this is not frequent enough. We recognized that as well, and have adjusted our check frequency.

Does this procedure also include adding the keys of the affected certificates to your key compromise blocklist, so as to prevent further use of those keys in new certificates?

Yes.

Additional remediation steps that have been implemented:

We have built automation to check our logs for any keyCompromise revocations every 15 minutes and add any compromised keys to our blocklist. This will remain in place until changes to our API add the functionality to our revocation pipeline.
We are scanning new keyCompromise revocations against active certificates in our database to determine any non-revoked certificates, and are revoking as necessary.
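The effect of moving from a daily check to a 15-minute one can be put in numbers. The sketch below is illustrative only: the function name and the one-hour processing time are hypothetical and are not figures from Let's Encrypt. It computes the worst-case delay between a keyCompromise revocation and the revocation of other certificates sharing that key, assuming the revocation arrives just after a polling run.

```python
from datetime import timedelta

# The 24-hour revocation deadline discussed in this bug (BR 4.9.1.1).
DEADLINE = timedelta(hours=24)


def worst_case_delay(poll_interval: timedelta, processing: timedelta) -> timedelta:
    """Worst case: the revocation arrives just after a poll, so it waits a
    full interval before being noticed, then needs `processing` to act on."""
    return poll_interval + processing


# Hypothetical processing time, chosen only for illustration.
processing = timedelta(hours=1)

daily = worst_case_delay(timedelta(hours=24), processing)
frequent = worst_case_delay(timedelta(minutes=15), processing)

print(daily <= DEADLINE)     # a daily check can overrun the deadline
print(frequent <= DEADLINE)  # a 15-minute check leaves a wide margin
```

With any non-zero processing time, a 24-hour polling interval cannot guarantee revocation inside a 24-hour window, which is the gap raised in comment 5.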

Flags: needinfo?(agabbitas)

Andrew Gabbitas

Comment 7

•

6 years ago

Hi Folks,

An update regarding our commitment to provide automatic keyCompromise block and revoke via the ACME API:

List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.

We plan to implement support for adding compromised keys revoked via our API to a blocklist in our API by May 14, 2020. We have a relevant issue filed at: https://github.com/letsencrypt/boulder/issues/4712

This item was on track to be delivered by May 14, but we found a bug before deployment that needed to be fixed. In the interim we have continued to audit the keyCompromise revoked certificates and revoke any additional certificates that matched the SPKI hash within 24 hours of the original keyCompromise revocation.

The bug was resolved and as of today (2020-05-19 21:20Z) all new revocations via the ACME API with reason keyCompromise automatically add the SPKI hash to a blocked keys table. This also triggers a service to find all valid certificates with the same key hash. The found certificates are automatically revoked and subscribers that have provided contact information are notified.
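The flow described above can be sketched roughly as follows. This is not Boulder's code: the `Cert` record, the in-memory `CA` class, and its method names are all hypothetical, and SHA-256 over the DER-encoded SubjectPublicKeyInfo is an assumed choice of key hash. Subscriber notification is left out.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Cert:
    serial: str
    spki_der: bytes      # DER-encoded SubjectPublicKeyInfo (assumed available)
    revoked: bool = False


def spki_hash(spki_der: bytes) -> str:
    """Assumed key identifier: SHA-256 of the SubjectPublicKeyInfo bytes."""
    return hashlib.sha256(spki_der).hexdigest()


class CA:
    """Toy in-memory stand-in for the issuance and revocation pipeline."""

    def __init__(self):
        self.blocked_keys = set()   # hashes of keys reported as compromised
        self.certs = {}             # serial -> Cert

    def issue(self, serial: str, spki_der: bytes) -> Cert:
        # Issuance-time check: refuse any key already on the blocklist.
        if spki_hash(spki_der) in self.blocked_keys:
            raise ValueError("key is on the compromise blocklist")
        cert = Cert(serial, spki_der)
        self.certs[serial] = cert
        return cert

    def revoke(self, serial: str, reason: str) -> list:
        """Revoke one certificate. For keyCompromise, also block the key and
        revoke every other unrevoked certificate that shares it.
        Returns the serials revoked by this call."""
        cert = self.certs[serial]
        cert.revoked = True
        revoked = [serial]
        if reason == "keyCompromise":
            key = spki_hash(cert.spki_der)
            self.blocked_keys.add(key)
            for other in self.certs.values():
                if not other.revoked and spki_hash(other.spki_der) == key:
                    other.revoked = True
                    revoked.append(other.serial)
        return revoked


ca = CA()
ca.issue("01", b"example-spki-A")
ca.issue("02", b"example-spki-A")   # same key as "01"
ca.issue("03", b"example-spki-B")
swept = ca.revoke("01", "keyCompromise")
```

After the `revoke` call, a further `ca.issue(...)` with the same key bytes raises `ValueError`, while the certificate with a different key is untouched.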

We do plan to update our CPS to add the API as an approved way to report keyCompromise, but that is not yet complete.

We now encourage folks to use the API for keyCompromise revocations, if possible, to relieve our security team of the burden of manual revocations via reports to our cert-prob-reports email address.

Thanks for the report and feedback; our service is better for it.

Andrew Gabbitas

Comment 8

•

6 years ago

One additional update:

We have updated our CPS to clearly state that keys will be blocked and affected certificates revoked when the ACME API is used to revoke certificates with reason keyCompromise.

https://letsencrypt.org/documents/isrg-cps-v2.8/#493-procedure-for-revocation-request

Wayne Thayer

Comment 9

•

6 years ago

It appears that all questions have been answered and remediation is complete.

Status: ASSIGNED → RESOLVED

Closed: 6 years ago

Resolution: --- → FIXED

Nobody; OK to take it and work on it

Updated

•

3 years ago

Product: NSS → CA Program

David Lawrence [:dkl]

Updated

•

3 years ago

Whiteboard: [ca-compliance] [delayed-revocation-leaf] → [ca-compliance] [leaf-revocation-delay]

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

People

(Reporter: ryan.sleevi, Assigned: jaas)

References

Details

(Whiteboard: [ca-compliance] [leaf-revocation-delay])

Crash Data

Security

(public)
