Google Trust Services: 63 bit serial numbers in some certificates
Categories
(CA Program :: CA Certificate Compliance, task)
Tracking
(Not tracked)
People
(Reporter: ryan_hurst, Assigned: ryan_hurst)
Details
(Whiteboard: [ca-compliance] [ov-misissuance])
Attachments
(2 files)
Summary
Some certificates issued by GTS utilize EJBCA and as a result had serial numbers with an effective entropy of 63 bits. These serial numbers were created from a 64 bit CSPRNG output and were believed to be in compliance with Section 7.1 of the Baseline Requirements. Upon closer investigation we learned that EJBCA’s logic for serial number generation only selected output numbers having a leading 0 bit, which reduced their effective entropy to 63 bits.
Though GTS agrees that issuing certificates with the above behavior qualifies as misissuance, we also believe that this issue does not represent a material security risk to the community.
To ensure that all of our certificates comply with the community’s interpretation of the Baseline Requirements (BR) we have updated the associated EJBCA CAs to mitigate the problematic behaviour.
At this time approximately 95% of the affected certificates have been replaced and revoked. The remaining certificates expire over the next 3 months.
We are actively working with the subscribers of these remaining certificates to facilitate a replacement with the goal of minimizing disruption of services. Should this not be possible, we will revoke these certificates no later than 2019-03-31.
Certificates issued from non-EJBCA CAs have been checked and are not affected.
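The entropy loss described in the Summary can be illustrated with a small sketch. It is not EJBCA code; the function names are hypothetical, and the only behaviour modelled is the one stated above: a value is drawn from a CSPRNG and kept only if its leading bit is 0. An 8 bit width stands in for 64 bits so the accepted space can be counted by brute force.

```python
import math
import secrets


def draw_serial_leading_zero(bits: int = 64) -> int:
    """Draw `bits` bits from the CSPRNG, resampling until the top bit is 0.

    Hypothetical stand-in for the behaviour described above, not EJBCA code.
    """
    while True:
        candidate = secrets.randbits(bits)
        if candidate >> (bits - 1) == 0:  # keep only values with a leading 0 bit
            return candidate


def accepted_space_size(bits: int) -> int:
    """Count, by brute force, how many `bits`-wide values have a leading 0 bit."""
    return sum(1 for value in range(2 ** bits) if value >> (bits - 1) == 0)


# Brute force is only feasible for a small width; 8 bits stands in for 64 here.
small_width = 8
entropy_bits = math.log2(accepted_space_size(small_width))
print(f"{small_width}-bit draw with leading 0 bit: {entropy_bits:.0f} bits of entropy")
```

Half of the possible outputs are discarded, so a draw of n bits carries n - 1 bits of entropy; for n = 64 that is the 63 bits reported here.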
Incident Report
- How your CA first became aware of the problem
We have been following the thread discussing Dark Matter’s root inclusion request. When concerns regarding the EJBCA serial number generation logic were raised, we analyzed the behaviour of our EJBCA installations and found that they were affected as well.
- A timeline of the actions your CA took in response.
2019-02-22 - A thread on m.d.s.p. mentions serial entropy issue of Dark Matter certificates.
2019-02-26 - GTS begins reviewing the serial number generation behaviour of its CAs.
2019-02-27 - A third-party reports that serial numbers in all certificates issued from a specific GTS CA have a leading bit of 0 and suggests that we may have the same issue as Dark Matter.
2019-02-27 - GTS requests clarification from PrimeKey. It is confirmed that EJBCA serial generation logic causes the issue and that in order to create compliant serial numbers, the logic has to be replaced.
2019-02-27 - The associated CAs used an earlier version of EJBCA where the serial number logic was not configurable. As a result, code from a newer version of EJBCA that supports configurable serial number length is backported and configured to use 16 byte serials.
2019-02-28 - The backported code is deployed to production.
2019-02-28 - Ongoing discussion on m.d.s.p. revolves around interpretation of Section 7.1 BR. A consensus emerges that the affected certificates must be considered to have been misissued.
2019-02-28 - We inventory the number of certificates issued since Section 7.1 BR went into effect in September 2016, the number that were currently valid as well as their validity period. The results are provided in the section on remediation actions below.
2019-03-01 - GTS decides to replace and revoke all affected certificates. Customers are contacted to work out revocation plans.
2019-03-01 - Issuance of replacement certificates begins.
2019-03-02 - A first notification is posted to m.d.s.p.
2019-03-04 - Certificate revocation begins.
2019-03-05 - An update on progress is posted to m.d.s.p.
2019-03-05 - This post mortem is posted to m.d.s.p.
- Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem.
GTS has stopped using the incorrect serial number generation logic. As of 2019-02-28, all GTS certificates issued from its EJBCA CAs have serial numbers with at least 64 bits of entropy.
- A summary of the problematic certificates.
All certificates issued from GIAG3 (https://crt.sh/?id=109354897, https://crt.sh/?id=158511650) between 2016-09-30 and 2019-02-28 were affected.
- The complete certificate data for the problematic certificates.
Given the large number of certificates (> 100k) they are not enumerated here. All certificates issued from the CAs mentioned above are affected. The CA certificates themselves are not affected. All affected certs were logged to multiple CT logs.
- Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.
GTS relied on the affected EJBCA CA to generate 64 bit long certificate serial numbers as per its configuration and relied on EJBCA to perform this serial number generation in a BR compliant manner.
Since the serial numbers had the expected length, the leading bit of 0 was not recognized as limiting the serial number space.
Regular internal audits examined the technical profile of samples of issued certificates but did not detect the issue, because the audit was performed on a certificate-by-certificate basis. A comparison of serial numbers across a larger certificate population would have been required to identify it. Such tests were not implemented for the affected EJBCA CA because it is a legacy system scheduled for decommissioning later this year.
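A population-level comparison of the kind mentioned above could look roughly like the following sketch. It assumes the serial numbers are available as plain hexadecimal strings and that the intended width is 64 bits; the function name `stuck_bits` and the sample data are hypothetical and not part of any GTS tooling.

```python
import secrets


def stuck_bits(serials_hex, width_bits=64):
    """Return {bit_position: value} for positions that never vary across serials.

    Position 0 is the least significant bit; position width_bits - 1 is the
    leading bit.
    """
    serials = [int(s, 16) for s in serials_hex]
    stuck = {}
    for pos in range(width_bits):
        values = {(s >> pos) & 1 for s in serials}
        if len(values) == 1:  # every serial has the same value at this position
            stuck[pos] = values.pop()
    return stuck


# Hypothetical sample: 1000 serials whose leading bit is always 0.
sample = [format(secrets.randbits(63), "016x") for _ in range(1000)]
print(stuck_bits(sample))
```

A single certificate cannot reveal the problem, since a leading 0 bit is expected in about half of all random serials. Only a bit position that stays constant across many certificates points to a reduced serial number space.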
- List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future.
The issue only affects certificates from one of our legacy EJBCA CAs. Most of the GTS issuance volume has previously been migrated to a bespoke CA that is not affected.
Since the affected CA was patched with updated logic, it is issuing with 16 byte serials. At the end of August 2019, it will be decommissioned.
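As a rough illustration of why a longer serial leaves ample margin, the sketch below generates a positive serial of up to 16 bytes. It is an assumption-laden example, not the backported EJBCA code: the function name `new_serial` is hypothetical, and the leading bit is deliberately left at 0 so that the value stays positive when encoded as a signed ASN.1 INTEGER of that width.

```python
import secrets


def new_serial(num_bytes: int = 16) -> int:
    """Return a non-zero positive serial that fits in `num_bytes` bytes.

    The top bit is left at 0, so 8 * num_bytes - 1 bits come from the CSPRNG.
    The encoded value may be shorter than num_bytes if leading bytes are zero.
    """
    while True:
        value = secrets.randbits(8 * num_bytes - 1)
        if value != 0:  # serial numbers must be positive, so reject zero
            return value


serial = new_serial()
print(f"{serial:032x}")
```

With `num_bytes=16` this draws 127 random bits, well above the 64 bit minimum. With `num_bytes=8` the same approach would draw only 63 bits, which is the situation this report describes.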
An inventory of the affected certificates provided the following:
Total number issued since September 2016 - 100,836
Total valid as of 2019-02-28 - 7,171
Expiring within 90 days - 7,137
Revocation Schedule
Revocation was performed with the objective of meeting the timeline prescribed for cases that fall under the second paragraph of Section 4.9.1 BR (5 days).
by 2019-03-04 : 6,199 (86.5%)
by 2019-03-05 : 575 (8%)
The remaining 397 certificates (5.5%) will expire or be revoked no later than 2019-03-31.
For 86.5% of the affected certificates the BR revocation timeline could be met.
954 certificates could not be replaced within the five day target without causing outages or other business issues. Revoking these certificates without providing adequate time for the associated subscribers to replace them would have caused significant business damage to the respective subscribers. Our goal is to revoke all certificates as quickly as possible while minimizing disruption to the subscribers and the users of their services.
We are actively working with remaining subscribers who have affected certs which have not been revoked to replace their certificates in the shortest time window possible. The 397 remaining certificates will expire or be revoked by 2019-03-31.
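The counts in the revocation schedule can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures reported above; the variable names are illustrative, and the comparison allows a tolerance of 0.1 percentage points because the reported shares are rounded.

```python
# Figures taken from the inventory and revocation schedule above.
total_valid = 7171
schedule = {
    "by 2019-03-04": 6199,
    "by 2019-03-05": 575,
    "remaining": 397,
}
reported_percent = {
    "by 2019-03-04": 86.5,
    "by 2019-03-05": 8.0,
    "remaining": 5.5,
}

# The three groups should account for every certificate valid on 2019-02-28.
assert sum(schedule.values()) == total_valid

shares = {label: 100 * count / total_valid for label, count in schedule.items()}
for label, share in shares.items():
    # Reported values are rounded, so compare with a small tolerance.
    assert abs(share - reported_percent[label]) < 0.1
    print(f"{label}: {schedule[label]} certificates, {share:.2f}% of {total_valid}")
```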
Comment 1 (Assignee) • 5 years ago
This file contains a list of all certificates that were issued after ballot 164 with 63 bit serial numbers.
Comment 2 (Assignee) • 5 years ago
This file contains a list of all certificates that were issued after ballot 164 with 63 bit serial numbers that have not yet been revoked.
Comment 3 (Assignee) • 5 years ago
Small correction, the 'msdp-64b.misissued.notrevoked.csv' list contains all certs that were not revoked at the end of deadline.
Comment 5 • 5 years ago
Emailed POCs on 2019-07-04 regarding this issue, highlighting https://wiki.mozilla.org/CA/Responding_To_An_Incident#Keeping_Us_Informed
Comment 6 • 5 years ago
There are no updates to provide beyond comment #0 and the data subsequently provided.
Comment 7 • 5 years ago
Andy:
Please carefully re-review the response. Comment #3 identifies a series of certificates that were not revoked, with Comment #0 proposing a revocation schedule for 397 certificates to be revoked by 2019-03-31. Since then, there has been no communication by Google Trust Services as to whether it met that deadline and that all certificates were revoked.
In terms of remediation, the answer to question 7 does not identify what steps are being taken to ensure timely compliance with the BRs going forward. It also does not provide the details requested in https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation regarding that known non-compliance.
Please carefully review that section for the expectations of CAs, and provide an update for Comment #0. For example, Ryan's update in https://groups.google.com/d/msg/mozilla.dev.security.policy/-RB8ovYgOHE/pZ7TcPN3CQAJ suggests that between Comment #0 on 2019-03-05 and 2019-03-11, 371 of the 397 certificates were revoked.
Comment 8 (Assignee) • 5 years ago
Dear m.d.s.p,
Apologies for the delay in providing a final update.
Since our last update, we worked intensively with our subscribers to help them rotate their certificates as soon as possible while at the same time avoiding a business impact from service interruptions.
As a result of that work, on 8 April all remaining affected certificates were revoked.
At this same point our remediation actions were all completed.
Comment 9 • 5 years ago
Does that mean that Google Trust Services has done nothing, and plans to do nothing, to prevent future delays of revocation?
I again direct your attention to Comment #7, which includes the link to https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation , and which provides expectations for CAs. Could you help me understand what part of those expectations were confusing or ambiguous, or whether it was oversight of both that and Comment #7 that caused you to fail to:
perform an analysis to determine the factors that prevented timely revocation of the certificates, and include a set of remediation actions in the final incident report that aim to prevent future revocation delays.
Comment 10 • 5 years ago
Sorry we failed to provide full information previously. We had an internal communication breakdown, which caused the responses to be delayed and incomplete. We have added a recurring sync to ensure that all open items are reviewed, discussed and closed out within specified timelines.
In terms of the remediation actions and analysis, we performed an analysis to determine the factors that prevented timely revocation. Revoking the concerned certificates within the 5 day deadline would have caused significant harm to affected customers who were using the certificates in business critical systems such as payment applications. In most cases, these customers had to work with other partners to coordinate the certificate rotation and avoid breakage.
To prevent future revocation delays we informed our customers that they need to be able to rotate their certificates within the BR timeline and updated documentation to reinforce that expectation. Where necessary, they need to work with any dependent partners to ensure they are capable of meeting the timeline. A factor that caused some delayed revocation is that the concerned customers used manual certificate request and issuance processes. One of the goals of Google Trust Services is to reduce manual request scenarios to as close to zero as possible. As a side-effect, this incident was actually beneficial, in that many clients used the incident as an opportunity to switch from manual to automated requests and issuance. While the work to reduce the manual flows continues and will continue, we made very solid progress in this regard.
If this incident were to happen again, we would be able to revoke more quickly and have fewer customers with business impacts requiring evaluation and special handling.