Let's Encrypt: Failure to revoke key-compromised certificates within 24 hours
Categories
(CA Program :: CA Certificate Compliance, task)
Tracking
(Not tracked)
People
(Reporter: mpalmer, Assigned: jaas)
Details
(Whiteboard: [ca-compliance] [leaf-revocation-delay])
Steps to reproduce:
Between 2020-04-05 07:51:00 and 2020-04-05 07:51:21 (all times UTC), a total of 12 revocation requests were sent to cert-problem-reports@letsencrypt.org, notifying Let's Encrypt of a total of 42 certificates and precertificates which had been issued with compromised private keys. Each of these reports included a link to a crt.sh query listing the known-impacted certificates, as well as a link to a PKCS#10 format attestation of key compromise.
Actual results:
I received auto-acknowledgement e-mails from Let's Encrypt for these reports dated between 2020-04-05 07:51:04 and 2020-04-05 07:51:25.
The revocationTime for all of these certificates, in the OCSP responses I am currently seeing, is either 2020-04-06 08:19:09 or 2020-04-06 08:19:10 (primarily the latter). Both times are more than 24 hours after the time the certificate problem reports were submitted, and also more than 24 hours after the time of the auto-acknowledgements I received.
Expected results:
Section 4.9.1.1 of the BRs requires a CA to revoke a certificate within 24 hours once the CA has obtained evidence of key compromise.
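The arithmetic behind the complaint can be checked with a short sketch. It takes the most favourable pairing for the CA from the figures above (the latest auto-acknowledgement time and the earliest revocationTime) and measures how far past the 24-hour window the revocation landed. The function names are illustrative only, and all timestamps are assumed to be UTC, as stated in the report.

```python
from datetime import datetime, timedelta, timezone

# 24-hour key-compromise revocation window from BRs 4.9.1.1.
REVOCATION_WINDOW = timedelta(hours=24)


def parse_utc(ts: str) -> datetime:
    """Parse a 'YYYY-MM-DD HH:MM:SS' timestamp, treating it as UTC."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)


def overrun(received: str, revoked: str) -> timedelta:
    """Return how long after the deadline the revocation happened.

    A negative result would mean the revocation was on time.
    """
    return parse_utc(revoked) - parse_utc(received) - REVOCATION_WINDOW


# Latest auto-acknowledgement vs. earliest revocationTime from the report.
late_by = overrun("2020-04-05 07:51:25", "2020-04-06 08:19:09")
print(late_by)
```

Even with the pairing most generous to the CA, the result is positive, so every certificate in the batch was revoked after the window had closed.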
Comment 1•5 years ago
Matt: The revocation time expressed in the CRL or OCSP response is not, by itself, sufficient here. Your report does not indicate whether or not you recorded evidence of a failure to revoke.
Here is a table listing, for each certificate revoked at around that time:
- The CA-provided revocationTime;
- The last time at which The Revokinator received a validated OCSP response where certStatus=0 ("good"); and
- The first time at which The Revokinator received a validated OCSP response where certStatus=1 ("revoked").
revocationtime | lastunrevokedresponse | firstrevokedresponse
---------------------+----------------------------+----------------------------
2020-04-06 08:19:10 | 2020-04-06 08:16:40.762038 | 2020-04-06 08:20:21.363634
2020-04-06 08:19:10 | 2020-04-06 08:16:42.103358 | 2020-04-06 08:20:24.731198
2020-04-06 08:19:09 | 2020-04-06 08:16:38.208337 | 2020-04-06 08:20:10.782942
2020-04-06 08:19:09 | 2020-04-06 08:16:40.162595 | 2020-04-06 08:20:20.060133
2020-04-06 08:19:10 | 2020-04-06 08:16:42.203639 | 2020-04-06 08:20:24.823135
2020-04-06 08:19:10 | 2020-04-06 08:16:40.066229 | 2020-04-06 08:20:19.538995
2020-04-06 08:19:10 | 2020-04-06 08:16:42.011732 | 2020-04-06 08:20:24.630944
2020-04-06 08:19:09 | 2020-04-06 08:16:38.09652 | 2020-04-06 08:20:10.107563
2020-04-06 08:19:10 | 2020-04-06 08:16:42.31963 | 2020-04-06 08:20:24.918158
2020-04-06 08:19:10 | 2020-04-06 08:16:39.378991 | 2020-04-06 08:20:17.437362
2020-04-06 08:19:10 | 2020-04-06 08:16:39.749835 | 2020-04-06 08:20:18.792142
2020-04-06 08:19:09 | 2020-04-06 08:16:42.41533 | 2020-04-06 08:20:25.014022
2020-04-06 08:19:10 | 2020-04-06 08:16:41.137739 | 2020-04-06 08:20:23.199512
2020-04-06 08:19:10 | 2020-04-06 08:16:40.949092 | 2020-04-06 08:20:22.11979
2020-04-06 08:19:10 | 2020-04-06 08:16:41.916499 | 2020-04-06 08:20:24.535931
2020-04-06 08:19:10 | 2020-04-06 08:16:39.46686 | 2020-04-06 08:20:18.100656
2020-04-06 08:19:10 | 2020-04-06 08:16:37.672196 | 2020-04-06 08:20:07.567897
2020-04-06 08:19:10 | 2020-04-06 08:16:39.841186 | 2020-04-06 08:20:19.194164
2020-04-06 08:19:10 | 2020-04-06 08:16:39.179098 | 2020-04-06 08:20:16.681526
2020-04-06 08:19:10 | 2020-04-06 08:16:39.970563 | 2020-04-06 08:20:19.285912
2020-04-06 08:19:10 | 2020-04-06 08:16:37.892081 | 2020-04-06 08:20:08.492301
2020-04-06 08:19:10 | 2020-04-06 08:16:40.857936 | 2020-04-06 08:20:21.880174
2020-04-06 08:19:10 | 2020-04-06 08:16:39.554435 | 2020-04-06 08:20:18.474003
2020-04-06 08:19:10 | 2020-04-06 08:16:38.00034 | 2020-04-06 08:20:08.993678
2020-04-06 08:19:10 | 2020-04-06 08:16:38.51691 | 2020-04-06 08:20:12.586751
2020-04-06 08:19:10 | 2020-04-06 08:16:39.287211 | 2020-04-06 08:20:17.200604
2020-04-06 08:19:10 | 2020-04-06 08:16:38.341327 | 2020-04-06 08:20:11.821462
2020-04-06 08:19:10 | 2020-04-06 08:16:38.424878 | 2020-04-06 08:20:12.341504
2020-04-06 08:19:10 | 2020-04-06 08:16:37.766368 | 2020-04-06 08:20:07.969171
2020-04-06 08:19:10 | 2020-04-06 08:16:39.65476 | 2020-04-06 08:20:18.565025
2020-04-06 08:19:10 | 2020-04-06 08:16:41.72914 | 2020-04-06 08:20:23.786257
2020-04-06 08:19:10 | 2020-04-06 08:16:41.045493 | 2020-04-06 08:20:22.663187
2020-04-06 08:19:10 | 2020-04-06 08:16:42.507066 | 2020-04-06 08:20:25.142946
2020-04-06 08:19:10 | 2020-04-06 08:16:41.824195 | 2020-04-06 08:20:24.023708
2020-04-06 08:19:10 | 2020-04-06 08:16:41.228692 | 2020-04-06 08:20:23.290947
2020-04-06 08:19:10 | 2020-04-06 08:16:42.603717 | 2020-04-06 08:20:25.242163
2020-04-06 08:19:10 | 2020-04-06 08:16:38.608051 | 2020-04-06 08:20:13.098002
2020-04-06 08:19:10 | 2020-04-06 08:16:41.337058 | 2020-04-06 08:20:23.382173
2020-04-06 08:19:10 | 2020-04-06 08:16:40.383251 | 2020-04-06 08:20:20.793199
2020-04-06 08:19:10 | 2020-04-06 08:16:38.704899 | 2020-04-06 08:20:13.773825
2020-04-06 08:19:10 | 2020-04-06 08:16:38.796125 | 2020-04-06 08:20:14.427767
2020-04-06 08:19:10 | 2020-04-06 08:16:40.290732 | 2020-04-06 08:20:20.28141
2020-04-06 08:19:10 | 2020-04-06 08:16:40.653907 | 2020-04-06 08:20:21.129899
2020-04-06 08:19:10 | 2020-04-06 08:16:39.088019 | 2020-04-06 08:20:16.138927
2020-04-06 08:19:10 | 2020-04-06 08:16:38.900108 | 2020-04-06 08:20:14.951427
2020-04-06 08:19:10 | 2020-04-06 08:16:41.631862 | 2020-04-06 08:20:23.690599
2020-04-06 08:19:10 | 2020-04-06 08:16:38.995766 | 2020-04-06 08:20:15.470681
2020-04-06 08:19:10 | 2020-04-06 08:16:40.561811 | 2020-04-06 08:20:21.013109
In each case, as you can see, I received a "good" response from OCSP more than 24 hours after the certificate problem report was submitted, which, if nothing else, I believe violates 4.9.5's requirement for "published revocation" within 24 hours of receipt of the problem report.
While we're on 4.9.5 violations, I also did not receive a preliminary report from the CA within 24 hours of submitting the certificate problem reports, as required by the first paragraph. Those preliminary reports were received starting from approximately 2020-04-06 09:07. I don't like to harp on those unnecessarily, because the important thing is that the certs get revoked, but I mention it in the interests of full disclosure.
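For readers wondering how the last two columns of the table can be derived, here is a minimal sketch. It is not The Revokinator's actual code, which is not shown in this thread; the `Observation` type and `bracket_revocation` function are hypothetical names. The idea is simply to keep every validated OCSP observation for a certificate and take the latest "good" and the earliest "revoked".

```python
from datetime import datetime
from typing import Iterable, NamedTuple, Optional, Tuple


class Observation(NamedTuple):
    """One validated OCSP response for a single certificate (hypothetical type)."""
    seen_at: datetime
    cert_status: int  # 0 = good, 1 = revoked


def bracket_revocation(
    observations: Iterable[Observation],
) -> Tuple[Optional[datetime], Optional[datetime]]:
    """Return (last 'good' observation time, first 'revoked' observation time)."""
    obs = list(observations)
    good = [o.seen_at for o in obs if o.cert_status == 0]
    revoked = [o.seen_at for o in obs if o.cert_status == 1]
    return (max(good) if good else None, min(revoked) if revoked else None)


# The two later timestamps come from the first table row; the earliest one is
# an invented earlier poll, added only to show that the latest "good" wins.
samples = [
    Observation(datetime(2020, 4, 6, 8, 10, 0), 0),
    Observation(datetime(2020, 4, 6, 8, 16, 40, 762038), 0),
    Observation(datetime(2020, 4, 6, 8, 20, 21, 363634), 1),
]
last_good, first_revoked = bracket_revocation(samples)
```

The true moment of published revocation lies somewhere between the two returned times, which is why a "good" response observed after the deadline is direct evidence of a late revocation.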
Comment 3•5 years ago
Summary
12 successive reports of key compromise arrived at our cert-problem-reports email address. Investigation, revocation, and key blocking were completed 28 minutes after the 24-hour revocation deadline set out in 4.9.1.1.
How your CA first became aware of the problem.
12 emails were sent to our cert-problem-reports@letsencrypt.org reporting address at or around 2020-04-05T07:51:00Z. Each email indicated that a key had been compromised and contained a link to a PKCS#10 file signed using the reported key.
A timeline of the actions your CA took in response.
- 2020-04-05T07:51:00Z: 12 emails arrived at our cert-problem-reports email address, containing evidence of key compromise for 12 keys.
- 2020-04-05T19:04: Email notification sent to the SRE team about an unassigned ticket that had not yet been addressed.
- 2020-04-05T21:30:00Z: Let’s Encrypt team started investigation of the batch of compromised key reports.
- 2020-04-06T07:51:00Z: 24 hours had elapsed since the initial report.
- 2020-04-06T08:19:00Z: 24 serials associated with the reported compromised keys were found and revoked.
Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem.
Yes. At the time the reported certificates were revoked the compromised keys were also added to our compromised keys blocklist to prevent future issuance using those keys.
A summary of the problematic certificates.
A total of 24 certificates were revoked as part of the aggregated email reports.
The complete certificate data for the problematic certificates.
- https://crt.sh/?serial=04e069cd90a028ef659acc8a81e79d15d035
- https://crt.sh/?serial=03ff8ab20e0c85c4758536fd4cd3fdf9715c
- https://crt.sh/?serial=0347e5edb2fbb18a4970e6cf5c66af33d015
- https://crt.sh/?serial=03ae1ba7dee30c65ee023fb3ada1a109af08
- https://crt.sh/?serial=03e5ec7559e780c6813bc02ed16a0983cb17
- https://crt.sh/?serial=030cf455709fe33e70dd0150399507a7f0af
- https://crt.sh/?serial=03b444cd53d404203f4fe7ee0125ca28d002
- https://crt.sh/?serial=03340fc8c8808889b808cd353509dcad0d3a
- https://crt.sh/?serial=03aa5d88930dd35c586a25a2ed2510b04e7c
- https://crt.sh/?serial=039c885b9f691ad81fdfcb3572afca33344c
- https://crt.sh/?serial=048ecc550af34dc2572c74ab7a794f686dfe
- https://crt.sh/?serial=03ff9d93cb08f23bc7bdf36796ec522a7595
- https://crt.sh/?serial=038e3b4f64b2c2530a736cd95512987e324b
- https://crt.sh/?serial=0313a676ed46ade1c21198fa7403725655d7
- https://crt.sh/?serial=04ed3aaa7d6e06e865090d67d0a96e6d3154
- https://crt.sh/?serial=033307fb7604f9627ead7a3b314fcaf1f6d3
- https://crt.sh/?serial=03912fb07106bc344bb7040c3b9fb912d9fa
- https://crt.sh/?serial=04dabf800227dcf4bbdd91907430058b6587
- https://crt.sh/?serial=04a057e9be46a0304d00c47f51848dfeffa1
- https://crt.sh/?serial=03e9478910c7bbd5ec0923f027aa89ad88e8
- https://crt.sh/?serial=043fb261390861447115eb020a947f7255b3
- https://crt.sh/?serial=031d97b82442016665a1f5a8e7df3b446bec
- https://crt.sh/?serial=037d12638a51c591f27624e2120a0779773a
- https://crt.sh/?serial=04ae54202edc08e8eba72f17f381fae2d6b4
Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.
Our current process for handling key compromise reports is that our SRE and security teams receive a notification and begin their investigation. If an email goes unanswered for 11 hours, a secondary notification email is triggered. In this case, the initial email reports were discovered, and an investigation begun, only after the secondary notification came in.
We were working on tools and automation for https://bugzilla.mozilla.org/show_bug.cgi?id=1625322 when the reports came in. We decided to bundle the work from the new reports with what we were doing there, for efficiency. We planned to finish the bundled work by the deadline, but we miscalculated the time required and missed the deadline by 28 minutes. We have now identified this as a place where we need additional monitoring and alerting.
Some of the tooling and automation we were working on:
- Verification of each certificate reported as compromised against the PKCS#10 bundle supplied as evidence of private key control.
- Searching all non-revoked, non-expired certificates in our database to determine which are affected by a report of key compromise (whether received via our cert-problem-reports email or via API revocation).
- Automatically detecting when a revocation with the keyCompromise reason happens via our API and adding the key to our blocklist.
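The search and blocklist items above could be sketched roughly as follows. This is an illustration, not Let's Encrypt's implementation: it assumes keys are identified by the SHA-256 hash of their DER-encoded SubjectPublicKeyInfo, that the certificate database can be viewed as a mapping from serial to SPKI bytes, and every name in it (`key_fingerprint`, `KeyBlocklist`, `affected_serials`) is hypothetical. Verifying the PKCS#10 signature itself would need an X.509 library and is left out.

```python
import hashlib
from typing import Dict, List, Set


def key_fingerprint(spki_der: bytes) -> str:
    """Identify a public key by the SHA-256 of its DER-encoded SPKI (assumed scheme)."""
    return hashlib.sha256(spki_der).hexdigest()


class KeyBlocklist:
    """In-memory set of compromised key fingerprints (hypothetical)."""

    def __init__(self) -> None:
        self._blocked: Set[str] = set()

    def block(self, spki_der: bytes) -> None:
        self._blocked.add(key_fingerprint(spki_der))

    def is_blocked(self, spki_der: bytes) -> bool:
        return key_fingerprint(spki_der) in self._blocked


def affected_serials(active_certs: Dict[str, bytes], spki_der: bytes) -> List[str]:
    """Return serials of unexpired, unrevoked certificates that use the given key.

    `active_certs` maps serial -> SPKI DER bytes and stands in for a database query.
    """
    target = key_fingerprint(spki_der)
    return sorted(s for s, key in active_certs.items() if key_fingerprint(key) == target)


# Placeholder byte strings stand in for real DER-encoded keys.
blocklist = KeyBlocklist()
blocklist.block(b"compromised-key")
hits = affected_serials(
    {"02": b"compromised-key", "01": b"compromised-key", "03": b"other-key"},
    b"compromised-key",
)
```

Indexing certificates by key fingerprint rather than scanning them is one way such a lookup could be made fast, though the thread does not say how the speed-up was achieved.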
List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.
- We previously had our cert-problem-reports ticketing system send emails to the team when a ticket had been unassigned for over 11 hours, but there was no on-call page alert associated. As a remediation we have added an on-call page in addition to this alert. In addition, we have added a 14 hour on-call page to additional team members as a backup.
- We have sped up our systems for finding all certificates that use a specific public key.
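The escalation policy described in the first step can be expressed as a small table of thresholds. The sketch below is an assumption about how such a rule might be encoded, not a description of the actual ticketing system; the action strings and the `due_escalations` function are invented for illustration. Only the 11-hour and 14-hour thresholds come from the text above.

```python
from datetime import timedelta
from typing import List

# (time a ticket may sit unassigned, action taken once that time is exceeded)
ESCALATION_STEPS = [
    (timedelta(hours=11), "email team and page on-call"),
    (timedelta(hours=14), "page additional team members"),
]


def due_escalations(unassigned_for: timedelta) -> List[str]:
    """Return every escalation action whose threshold has been exceeded."""
    return [action for threshold, action in ESCALATION_STEPS if unassigned_for > threshold]


print(due_escalations(timedelta(hours=12)))
```

Both thresholds fall well inside the 24-hour window, leaving at least ten hours for investigation and revocation after the final page.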
Comment 4•5 years ago
Ben: While Comment #3's response to Question 7 does not include a timeline for the changes, both appear to be stated in the past tense, meaning they have already been implemented.