KIR: Delayed revocation within seven (7) days for bug 1921598
Categories
(CA Program :: CA Certificate Compliance, task)
Tracking
(Not tracked)
People
(Reporter: piotr.grabowski, Assigned: piotr.grabowski)
Details
(Whiteboard: [ca-compliance] [ca-revocation-delay])
Incident Report
This is a preliminary report.
Summary
KIR issued the SZAFIR Trusted CA3 intermediate CA certificate without the Reserved Certificate Policy Identifiers that indicate adherence to and compliance with the S/MIME BR, as described in https://bugzilla.mozilla.org/show_bug.cgi?id=1921598
According to SBR [https://cabforum.org/uploads/CA-Browser-Forum-SMIMEBR-1.0.6.pdf] section 4.9.1.2 - The Issuing CA SHALL revoke a Subordinate CA Certificate within seven (7) days.
This has not been completed. A full incident report will be provided no later than Friday October 11th 2024.
Comment 1•4 months ago (Assignee)
Incident Report
Summary
KIR issued the SZAFIR Trusted CA3 intermediate CA certificate without the Reserved Certificate Policy Identifiers that indicate adherence to and compliance with the S/MIME BR, as described in https://bugzilla.mozilla.org/show_bug.cgi?id=1921598
According to SBR [https://cabforum.org/uploads/CA-Browser-Forum-SMIMEBR-1.0.6.pdf] section 4.9.1.2 - The Issuing CA SHALL revoke a Subordinate CA Certificate within seven (7) days.
Impact
One intermediate CA certificate, issued on Oct 11, 2023 – 10:49 UTC.
https://crt.sh/?caid=278655
Timeline
Sep 28, 2024 – 09:36 UTC – Incident report https://bugzilla.mozilla.org/show_bug.cgi?id=1921598 was posted (KIR: Intermediate CA - SZAFIR Trusted CA3 - Certificate Policies extension - non-compliance).
Sep 28, 2024 – 10:00 UTC – First inspection, assessment and forwarding of the information.
Sep 30, 2024 – 11:00 UTC – Remediation plan was initialized.
Oct 02, 2024 – 11:00 UTC – End-user certificates from affected subCA were grouped by systems, usage.
Oct 03, 2024 – 11:00 UTC – Management as well as personnel responsible for the affected certificates were informed about the potential severity of the problem.
Oct 03, 2024 – 16:12 UTC – Preliminary report was posted (KIR: Delayed revocation within seven (7) days for bug 1921598), indicating that the full incident report will be provided no later than Friday October 11th 2024.
Oct 08, 2024 – 08:30 UTC – Issuance of new CA SZAFIR Trusted CA5 with new keys.
Oct 10, 2024 – 11:00 UTC – Remediation plan was communicated.
Oct 11, 2024 – 16:40 UTC – Post of this Incident Report.
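As a rough check of the timeline against the seven-day requirement quoted in the Summary, the deadline can be worked out as follows. This is only a sketch: it assumes the seven-day clock starts when bug 1921598 was posted, which the report does not state, and the variable and function names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the clock starts when bug 1921598 was posted (first timeline entry).
REPORTED = datetime(2024, 9, 28, 9, 36, tzinfo=timezone.utc)
REVOCATION_WINDOW = timedelta(days=7)  # SBR section 4.9.1.2, as quoted in the Summary


def revocation_deadline(reported: datetime) -> datetime:
    """Return the latest revocation time under a seven-day window."""
    return reported + REVOCATION_WINDOW


deadline = revocation_deadline(REPORTED)

# Two later timeline entries, copied from the list above.
new_ca_issued = datetime(2024, 10, 8, 8, 30, tzinfo=timezone.utc)
report_posted = datetime(2024, 10, 11, 16, 40, tzinfo=timezone.utc)

print("deadline:", deadline.isoformat())
print("new CA issued after deadline:", new_ca_issued > deadline)
print("full report posted after deadline:", report_posted > deadline)
```

Under that assumption the deadline falls on Oct 05, 2024 at 09:36 UTC, before the replacement CA was issued.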
Root Cause Analysis
First of all this issue has been prioritized at the highest level within KIR.
The main root cause of the delayed revocation is that revoking the certificate within the required period would cause outages in critical infrastructure.
The root cause can be divided into two issues:
a) Issues with new chain in back-end systems.
The first issue is the complex and lengthy process required for banks, government entities and other institutions to distrust the affected subCA that should be revoked and to trust the new subCA instead. Deploying a new subCA certificate in their back-end systems requires approval and coordinated effort from multiple internal departments, which makes the process slow. The back-end systems use their own limited truststores with custom implementations. In many cases third-party entities must be involved to plan and handle such a change. In many instances the entities have maintenance windows for deploying changes that are announced in advance, and they usually do not plan to deploy any changes at the end of the year.
b) Issues with subscribers' certificates
Regardless of the issues with the new chain in the back-end systems' truststores, replacing end-user certificates is a separate problem that blocks use of the new subCA.
In most cases subscribers are in critical industries where replacement is usually manual, especially on devices (mostly HSMs) without automation. Due to the risk of error, this requires more time and change control. A rapid replacement risks causing an outage that could directly impact the Subscriber’s end-users and introduce potential security issues, especially in some of the industries where these certificates are installed. Some other installations require a restart, which even automated deployments are unable to perform outside of their change management window.
8355 end-user certificates are affected by the potential subCA revocation.
End-user certificates issued from the affected subCA are used in the following systems:
System | Share of certificates |
---|---|
Secure data exchange financial system | 84.5% |
Clearing settlement system | 3.5% |
Electronic identity system | 1.0% |
Others | 11.0% |
Our user migration plan:
until 2024-11-01: 10%
until 2024-12-01: 40%
until 2025-01-01: 65%
until 2025-02-01: 87%
until 2025-03-07: 100%
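To make the percentages above more concrete, the following sketch converts them into approximate certificate counts and shows how many days after the original report each migration milestone falls. The figures are taken from this report; the rounding, the choice of 2024-09-28 as the reference date, and all names are assumptions made for illustration.

```python
from datetime import date

TOTAL_CERTS = 8355                 # affected end-user certificates (from the report)
REPORT_DATE = date(2024, 9, 28)    # assumption: count days from the original bug report

# Share of certificates per system, in percent.
USAGE_SHARE = {
    "Secure data exchange financial system": 84.5,
    "Clearing settlement system": 3.5,
    "Electronic identity system": 1.0,
    "Others": 11.0,
}

# Migration milestones: (target date, planned percentage migrated).
MIGRATION_PLAN = [
    (date(2024, 11, 1), 10),
    (date(2024, 12, 1), 40),
    (date(2025, 1, 1), 65),
    (date(2025, 2, 1), 87),
    (date(2025, 3, 7), 100),
]


def approx_count(percent: float, total: int = TOTAL_CERTS) -> int:
    """Approximate number of certificates behind a percentage (rounded)."""
    return round(total * percent / 100)


def days_since_report(target: date) -> int:
    """Whole days between the report date and a milestone."""
    return (target - REPORT_DATE).days


for system, share in USAGE_SHARE.items():
    print(f"{system}: ~{approx_count(share)} certificates")

for target, percent in MIGRATION_PLAN:
    print(f"by {target}: ~{approx_count(percent)} migrated, "
          f"{days_since_report(target)} days after the report")
```

With these inputs the final milestone falls 160 days after the report date.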
Lessons Learned
What went well
On October 08, 2024 the new CA "SZAFIR Trusted CA5" was put into operation, and active user migration started on October 10, 2024.
The remediation plan has been accepted and communicated.
What didn't go well
Where we got lucky
Action Items
Action Item | Kind | Due Date |
---|---|---|
SZAFIR Trusted CA5 was put into operation | mitigate | 2024-10-08 (completed) |
Remediation plan was already accepted and communicated | mitigate | 2024-10-10 (completed) |
We will raise awareness among subscribers, educate them, and actively support them in migrating from custom truststore implementations in their back-end systems to more flexible solutions that allow simple and secure certificate management, or to public truststores | prevent | 2024-12-30 |
We will educate subscribers, improve their understanding of immediate revocation requirements, and help them prepare a swift certificate replacement process, so that certificates can be replaced within the revocation deadline when needed. | prevent | 2024-12-30 |
For situations that require special treatment, subscribers will also be advised to move off publicly trusted certificates or to consider using private PKI, and to prepare other contingency plans for enforced certificate revocation to minimize disruption to their systems | prevent | 2024-12-30 |
Although the incident does not directly affect TLS/SSL end-user certificates, we will also consider adding the ARI extension to our ACME protocol implementation | prevent | 2025-03-01 |
With the implementation of these actions, we are confident that we will be able to meet the BR requirements for timely revocation going forward.
Appendix
Details of affected certificates
Based on Incident Reporting Template v. 2.0
From 28th September 2024 to 7th March 2025 I count 161 days. This is slightly longer than the 7 days required.
The action items are to give yourselves 3 months to provide some information to customers. None of those ensure that this will not happen again. On the contrary reading your report on the RCA it seems like pinning these intermediates is by design and no one is in a rush to do anything at all.
What aspect of your operations requires the use of public trust stores at this point? I can see no attempt to entertain any adherence to compliance within reasonable timescales.
Comment 3•4 months ago
(In reply to Piotr Grabowski from comment #1)
8355 end-user certificates are affected by the potential subCA revocation.
End-user certificates issued from affected subCA are used in following systems
Secure data exchange financial system 84,5%
Clearing settlement system 3,5%
Electronic identity system 1,0%
Others 11,0%
Could you give some specific examples of how these certificates are used in secure data exchange financial systems? You said that there would be outages, so could you also provide some examples of how these systems interact with your OCSP service and published CRL, and provide details about the behaviour of these systems in that scenario?
Comment 4•4 months ago (Assignee)
(In reply to Wayne from comment #2)
From 28th September 2024 to 7th March 2025 I count 161 days. This is slightly longer than the 7 days required.
The action items are to give yourselves 3 months to provide some information to customers. None of those ensure that this will not happen again. On the contrary reading your report on the RCA it seems like pinning these intermediates is by design and no one is in a rush to do anything at all.
What aspect of your operations requires the use of public trust stores at this point? I can see no attempt to entertain any adherence to compliance within reasonable timescales.
Dear Wayne,
Thank you for raising your concerns. We acknowledge the severity of the situation, particularly regarding the timeline and how our actions are perceived. As noted in the incident report, management and personnel were informed about the gravity of the problem early on (Oct 03, 2024 – 11:00 UTC – Management as well as personnel responsible for the affected certificates were informed about the potential severity of the problem.) Since then, KIR has been actively collaborating with subscribers, their management, and third-party stakeholders to expedite certificate replacement while taking into account the criticality of the affected systems.
Regarding the design of intermediate certificate pinning: during the first stages of collaboration we found that in most subscriber backend systems pinning is indeed by design, to ensure security and stability. However, KIR has already started actively supporting subscribers and third parties in finding a more agile and efficient approach to managing root and intermediate certificates in their applications.
We are helping them identify the best approaches to redesign, development in several technologies and programming languages, and further maintenance. We are fully committed to the unpinning process, as indicated by our action items.
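One way to picture a more agile approach to pinned intermediates is a truststore that can hold the old and the new subCA side by side during a transition window, and then drop the old one. The sketch below is purely illustrative and does not describe any subscriber's actual system: the `PinnedTrustStore` class is hypothetical, and the byte strings are placeholders rather than real certificates.

```python
import hashlib


def fingerprint(cert_der: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, as a hex string."""
    return hashlib.sha256(cert_der).hexdigest()


class PinnedTrustStore:
    """Hypothetical truststore that pins issuing CAs by fingerprint."""

    def __init__(self) -> None:
        self._pins: set[str] = set()

    def add(self, cert_der: bytes) -> None:
        self._pins.add(fingerprint(cert_der))

    def remove(self, cert_der: bytes) -> None:
        self._pins.discard(fingerprint(cert_der))

    def is_trusted(self, issuer_der: bytes) -> bool:
        return fingerprint(issuer_der) in self._pins


# Placeholder bytes standing in for the real DER certificates (assumption).
old_ca = b"placeholder DER for SZAFIR Trusted CA3"
new_ca = b"placeholder DER for SZAFIR Trusted CA5"

store = PinnedTrustStore()
store.add(old_ca)

# Overlap phase: both issuers are accepted while end-user certificates are replaced.
store.add(new_ca)
print(store.is_trusted(old_ca), store.is_trusted(new_ca))

# Cut-over: once migration is complete, the old issuer is unpinned.
store.remove(old_ca)
print(store.is_trusted(old_ca), store.is_trusted(new_ca))
```

The design point is that the pin set can be updated as data, so adding or removing an issuer does not require an application redesign.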
To address your question about the necessity of public trust stores: KIR subscribers usually operate in highly secure environments where trust store implementations are limited and heavily focused on security. So far this approach has ensured that only specific, trusted certificates are accepted, but it has also made the unpinning process more complex. Nonetheless, we are working closely with subscribers to navigate this challenge.
Lastly, regarding the compliance timelines, we understand that there appears to be a delay in addressing this issue. However, KIR adopted this strategy in its remediation plan to ensure a proper balance between operational security and compliance requirements. We took into consideration the complexity, approval processes, coordinated efforts, third parties involved, maintenance windows and other important factors in order to handle the migration as quickly as possible.
We appreciate your feedback and assure you that every entity involved is working diligently and without undue delay to resolve this issue as quickly as possible, and we hope to execute the revocation even earlier than the given estimate.
Comment 5•3 months ago (Assignee)
(In reply to Mathew Hodson from comment #3)
(In reply to Piotr Grabowski from comment #1)
8355 end-user certificates are affected by the potential subCA revocation.
End-user certificates issued from affected subCA are used in following systems
Secure data exchange financial system 84,5%
Clearing settlement system 3,5%
Electronic identity system 1,0%
Others 11,0%
Could you give some specific examples of how these certificates are used in secure data exchange financial systems? You said that there would be outages, so could you also provide some examples of how these systems interact with your OCSP service and published CRL and provide details about the behaviour of these systems in that scenario?
Dear Mathew,
In these cases, the subscribers' certificates issued from the affected CA are mainly used for mutual authentication in a distributed financial system that ensures the secure exchange of data and information between banks and institutions authorized to obtain information about their customers. Authentication and authorization are a two-step process: at the frontend level, basic certificate data and its status (OCSP, CRL) are checked; at the backend level, full validation of the certificate as well as its issuer takes place.
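The two-step process described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual implementation: the `ClientCert` type, the function names, and the use of plain sets in place of real OCSP/CRL lookups and truststores are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ClientCert:
    serial: str
    issuer: str
    not_after: datetime


def frontend_check(cert: ClientCert, revoked_serials: set, now: datetime) -> bool:
    """Step 1: basic certificate data and status (stand-in for OCSP/CRL)."""
    return now <= cert.not_after and cert.serial not in revoked_serials


def backend_check(cert: ClientCert, trusted_issuers: set, revoked_issuers: set) -> bool:
    """Step 2: validation of the issuer against the backend truststore."""
    return cert.issuer in trusted_issuers and cert.issuer not in revoked_issuers


def authenticate(cert, revoked_serials, trusted_issuers, revoked_issuers, now) -> bool:
    return (frontend_check(cert, revoked_serials, now)
            and backend_check(cert, trusted_issuers, revoked_issuers))


now = datetime(2024, 10, 15, tzinfo=timezone.utc)
cert = ClientCert("01", "SZAFIR Trusted CA3", datetime(2025, 10, 1, tzinfo=timezone.utc))

# Before subCA revocation: the end-user certificate is accepted.
print(authenticate(cert, set(), {"SZAFIR Trusted CA3"}, set(), now))

# After subCA revocation: every certificate from that issuer fails step 2,
# even though the end-user certificate itself was never revoked.
print(authenticate(cert, set(), {"SZAFIR Trusted CA3"}, {"SZAFIR Trusted CA3"}, now))
```

In this model, revoking the issuer rejects all of its end-user certificates at once, which is the outage scenario raised in the question.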
Comment 6•3 months ago (Assignee)
We keep our migration plan on track and under control.
Comment 7•3 months ago (Assignee)
We keep our migration plan on track and under control.
Comment 8•2 months ago (Assignee)
We have migrated 50% of the affected end-user certificates.
Comment 9•2 months ago (Assignee)
We keep our migration plan on track and under control.
Comment 10•18 days ago (Assignee)
We keep our migration plan on track and under control.