Open Bug 1884532 Opened 2 months ago Updated 6 days ago

ACCV: Certificates issued with cRLIssuer in CDP extension

Categories: CA Program :: CA Certificate Compliance (task)
Tracking: Not tracked
Status: ASSIGNED
People: Reporter: jamador, Assigned: jamador
Details: Whiteboard: [ca-compliance] [ov-misissuance]
Attachments: 1 file (List of certificates)

Incident Report

Summary

Our team was notified that 10 or more certificates were issued with the cRLIssuer field in the CRL Distribution Points extension. This field must not be included in the certificate profile according to section “7.1.2.11.2 CRL Distribution Points” of the Baseline Requirements for the Issuance and Management of Publicly-Trusted TLS Server Certificates (BR), version 2.0.1.
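
To make the affected field concrete, here is a minimal sketch of how the presence of cRLIssuer could be checked in the DER encoding of a CRL Distribution Points extension value. It relies on the RFC 5280 definition, in which cRLIssuer is the context-specific constructed tag [2] (byte 0xA2) directly inside each DistributionPoint SEQUENCE. The function names are illustrative, the parser assumes well-formed DER with single-byte tags, and extracting the extension value from a certificate is out of scope here.

```python
CRL_ISSUER_TAG = 0xA2  # [2], context-specific and constructed, per RFC 5280
SEQUENCE_TAG = 0x30


def tlv(tag: int, body: bytes) -> bytes:
    """Encode a short-form DER element; only used to build small samples."""
    if len(body) > 127:
        raise ValueError("sample helper supports short-form lengths only")
    return bytes([tag, len(body)]) + body


def read_tlv(buf: bytes, pos: int) -> tuple[int, bytes, int]:
    """Return (tag, value, next_position) for the DER element starting at pos."""
    tag = buf[pos]
    length = buf[pos + 1]
    pos += 2
    if length & 0x80:  # long form: the low 7 bits give the number of length octets
        count = length & 0x7F
        length = int.from_bytes(buf[pos:pos + count], "big")
        pos += count
    return tag, buf[pos:pos + length], pos + length


def children(value: bytes):
    """Yield (tag, value) for each element directly contained in value."""
    pos = 0
    while pos < len(value):
        tag, inner, pos = read_tlv(value, pos)
        yield tag, inner


def has_crl_issuer(cdp_value: bytes) -> bool:
    """True if any DistributionPoint in the extension value carries cRLIssuer."""
    outer_tag, points, _ = read_tlv(cdp_value, 0)
    if outer_tag != SEQUENCE_TAG:
        raise ValueError("expected SEQUENCE OF DistributionPoint")
    for point_tag, point in children(points):
        if point_tag != SEQUENCE_TAG:
            raise ValueError("expected DistributionPoint SEQUENCE")
        if any(tag == CRL_ISSUER_TAG for tag, _ in children(point)):
            return True
    return False
```

Only the direct children of each DistributionPoint are inspected, so names nested inside the distributionPoint field cannot trigger a false positive.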

Impact

We have detected 837 active certificates affected by the incident (with the cRLIssuer field included), issued after 15 September 2023 when the BR 2.0.1 profiles came into force. As of today, certificates are no longer issued with this field.

Timeline

All times are UTC.

2023-09-15:

  • BR for TLS 2.0.0 became effective

2024-03-09:

  • 08:30 After a routine review of the incidents received and referred to the compliance office, a warning is detected involving incorrectly issued certificates.

  • 11:00 Urgent meeting of the technical committee to evaluate the incident and the scope.

  • 13:10 Mis-issuance confirmed. A report of impacted certificates is requested, and it is verified that certificates are now being issued correctly.

  • 16:30 The technical committee meets again to initiate contact with users, establish mechanisms to prevent a recurrence of these incidents, and draft the incident report.

  • 19:13 Publication of this incident report

Root Cause Analysis

Changes introduced by new versions of the BR are verified manually, and until now no matrix of certificate profile fields had been created to verify that each field complies with the requirements. After the revision of BR version 2.0.1, several changes were detected, but not the removal of the cRLIssuer field.

The root causes detected are:

  • Changes introduced by new versions of the BR are verified manually, and so far no certificate profile field matrix has been created to verify that each field complies with the requirements. After the revision of BR version 2.0.1, several changes were detected, but not the removal of the cRLIssuer field. The change was masked because the field does not appear explicitly in the certificate profile; it is inherited from the corresponding field in the Certification Authority configuration (this is how it appears in our EJBCA software). As a result, the cRLIssuer field appeared in certificates through the inherited extension.

  • Our systems validate each pre-certificate generated with the latest version of Zlint before sending it to CTLogs and in our test environment no alarms were raised.
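
The masking effect described in the first root cause can be illustrated with a small sketch. It uses a hypothetical, simplified model of profile configuration (plain dictionaries and an invented `INHERIT` marker), not EJBCA's actual data model: a review that looks only at what the profile states explicitly finds nothing, while a review of the resolved profile finds the inherited field.

```python
INHERIT = "inherit-from-ca"  # hypothetical marker meaning "use the CA's settings"


def effective_profile(profile: dict, ca_config: dict) -> dict:
    """Resolve inherited extensions so the result matches what would be issued."""
    resolved = {}
    for extension, fields in profile.items():
        if fields == INHERIT:
            resolved[extension] = dict(ca_config[extension])
        else:
            resolved[extension] = dict(fields)
    return resolved


def forbidden_present(profile: dict, forbidden: set) -> list:
    """List (extension, field) pairs that are configured but not allowed."""
    found = []
    for extension, fields in profile.items():
        if not isinstance(fields, dict):
            continue  # an unresolved inheritance marker hides its fields
        for field in fields:
            if (extension, field) in forbidden:
                found.append((extension, field))
    return sorted(found)


# Illustrative values only; none of these come from ACCV's configuration.
ca_config = {
    "cRLDistributionPoints": {
        "fullName": "http://crl.example.test/ca.crl",
        "cRLIssuer": "CN=Example CA",
    }
}
profile = {"cRLDistributionPoints": INHERIT, "keyUsage": {"digitalSignature": True}}
forbidden = {("cRLDistributionPoints", "cRLIssuer")}

explicit_findings = forbidden_present(profile, forbidden)
resolved_findings = forbidden_present(effective_profile(profile, ca_config), forbidden)
```

In this model `explicit_findings` is empty and `resolved_findings` contains the cRLIssuer entry, which is why checking the resolved profile (or an issued test certificate) is the safer review target.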

In the days following the entry into force of BR version 2.0.1, a sample of the certificates generated was made, showing that:

  • They complied with the detected changes.
  • Zlint, x509lint and certlint reported no errors for them.

This led us to close the procedure for updating to the latest BR version.

Lessons Learned

What went well

  • The response time from the detection of the incident until decisions were taken to correct it was adequate. The affected Certificates have been detected and the communication has been sent to the users in due time and form.

What didn't go well

  • The procedure for reviewing changes to certificate profiles has proven to be inadequate, as it does not establish a STRICT comparison between each field of the profile and the different sections of the BR.

  • The latest versions of linters that we were using have not detected this issue.

Where we got lucky

Action Items

| Action Item | Kind | Due Date |
| ----------- | ---- | -------- |
| The protocol for reviewing documentation associated with changes to certificate profiles will be improved. We will establish separate reviews by various team members following a matrix of certificate profile fields to check. The outcome of these reviews will be jointly evaluated by the compliance office. | Prevent | 2024-04-05 |
| In addition to continuing to use ZLint, include Pkilint as a complementary tool. This tool seems to be updated more frequently to ensure compliance. | Detect | 2024-04-15 |

Appendix

Details of affected certificates

See attached file

Based on Incident Reporting Template v. 2.0

Assignee: nobody → jamador
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Whiteboard: [ca-compliance] [ov-misissuance]

Hi,
All affected certificates have been revoked.
We continue to work on the implementation of the actions that have been put in place to prevent such incidents.
Thank you very much
Regards

Hi Jose,

Thanks for filing this bug.

General Comments:

  • When submitting future incident reports, please be sure to closely follow the incident reporting guidelines available on CCADB.org. For example, the initial report fails to describe whether ACCV ceased issuance or whether it was intended to be a preliminary or final report. While many of these questions were answered later in the report, community members should be able to expect and rely upon a consistent reporting format.

  • The initial report provided lacks detail and fails to meet the community expectations described on CCADB.org and those described in the Baseline Requirements. Specific updates and answers to questions are requested below.

Updates Requested:

A. The Impact Section does not describe whether issuance ceased during the incident after ACCV was first notified via its problem reporting address.

B. The Timeline Section does not describe when the certificate problem report was received or when ACCV provided a preliminary report on its findings to both the affected Subscribers and the entity that filed the report (BRs 4.9.5). You should also add specific line items for when certificate profile changes took place to align with the BRs (ending misissuance), and when revocation for the misissued certificates was expected (as required by the BRs).

C. The Root Cause Analysis generally describes “human error” as the root cause of this issue, but lacks detail surrounding the specific factors that contributed to this incident. You could try the “5 Whys” methodology observed in 1878106.

D. The What Went Well list indicates that ACCV responded to the certificate problem report in a timely manner. However, as the reporter of this issue, I did not receive any communication from ACCV after mailing the problem reporting address. It’s also unclear to me when specifically the subscribers affected by this incident were contacted. Given these concerns, describing the response to the third-party problem report as “what went well” feels like a mischaracterization. A separate bug should be opened due to failing to respond to a Certificate Problem Report in a complete and timely manner.

E. The Action Items list does not offer sufficient detail to adequately evaluate how they will address the elements of the “What didn’t go well” list. Specifically, can you describe how the proposed changes are intended to improve ACCV’s ability to uphold its commitments to conform to the BRs when compared to existing processes? Why are additional reviews by additional team members expected to prevent similar issues in the future? Providing more details, including methods for the community to quantify whether each remediation tactic was successful, would be helpful.

F. A separate bug should be opened focusing on the delayed revocation of these misissued certificates (BRs 4.9.1.1: “With the exception of Short-lived Subscriber Certificates, the CA SHOULD revoke a certificate within 24 hours and MUST revoke a Certificate within 5 days and use the corresponding CRLReason (see Section 7.2.2) if one or more of the following occurs:” … “The CA is made aware that the Certificate was not issued in accordance with these Requirements or the CA’s Certificate Policy or Certification Practice Statement (CRLReason #4, superseded).”) For example, https://crt.sh/?sha256=2c4b6875b99de342cec51dc862fd9f392c4097dd8f0d9e403326926ad5963e34 was revoked about ten days after the problem report was submitted.
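
As a worked illustration of the BRs 4.9.1.1 windows quoted in item F, the sketch below computes the 24-hour SHOULD and 5-day MUST deadlines from the moment a CA is made aware of a problem. The dates are the problem report date (2024-03-04) and the revocation time ACCV later gave in its timeline (2024-03-14 08:30 UTC); the time of day of the report is not stated anywhere in this bug, so midnight UTC is an assumption, and the function names are illustrative.

```python
from datetime import datetime, timedelta, timezone

SHOULD_REVOKE_WITHIN = timedelta(hours=24)
MUST_REVOKE_WITHIN = timedelta(days=5)


def revocation_window(aware_at: datetime) -> tuple[datetime, datetime]:
    """Return the (SHOULD, MUST) revocation deadlines for a given awareness time."""
    return aware_at + SHOULD_REVOKE_WITHIN, aware_at + MUST_REVOKE_WITHIN


def overdue_by(aware_at: datetime, revoked_at: datetime) -> timedelta:
    """How far past the MUST deadline the revocation happened (zero if on time)."""
    _, must_deadline = revocation_window(aware_at)
    return max(revoked_at - must_deadline, timedelta(0))


# Assumption: the report arrived at 00:00 UTC; only the date is known.
reported = datetime(2024, 3, 4, tzinfo=timezone.utc)
revoked = datetime(2024, 3, 14, 8, 30, tzinfo=timezone.utc)

elapsed = revoked - reported          # time from problem report to revocation
late = overdue_by(reported, revoked)  # time beyond the 5-day MUST deadline
```

Under that assumption the revocation took place a little over ten days after the report, more than five days past the MUST deadline.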

Questions:

Q1) The initial report does not describe requirements or plans related to certificate revocation, despite the BRs specifying this as an expected outcome. While I appreciate this was discussed in Comment 1, can you explain why this was omitted from the initial report?

Q2) The Timeline Section describes, “08:30 After a routine review of the incidents received and referred to the compliance office, a warning is detected involving incorrectly issued certificates.”? This action took place nearly five days after the problem report was submitted. What circumstances led to this delay?

Q3) The “Root Cause Analysis” Section describes, “After the revision of BR version 2.0.1, several changes were detected, but not the removal of the cRLIssuer field.” Can you explain why profile changes were only identified after the release of BR Version 2.0.1 (4 months after Version 2.0 was released, where these changes were introduced)? Version 2.0.0 represented a significant update to the BRs, with conversations of changes serving as the focal point of several Validation Subcommittee and F2F meeting agendas (e.g., https://lists.cabforum.org/pipermail/validation/attachments/20220608/ea4bb526/attachment-0001.pdf) for several years.

Q4) Can you describe whether ACCV uses pre-issuance linting and why that was not considered part of the remediation of this incident? Relying solely on post-issuance linting seems to leave an opportunity for future incidents, whereas pre-issuance linting presents an opportunity to prevent them.

Q5) Can you describe how ACCV evaluates linting tools to fully comprehend each one's scope, capabilities, and limitations, including as updates are made available?

Q6) Can you describe how ACCV validates linting tools are working as expected?

Q7) Beyond adopting pkilint, is ACCV planning to enhance the existing linting tools it relies on to better protect against this issue and others like it in the future (e.g., filing an enhancement request or contributing to open source tools)?

Flags: needinfo?(jamador)

Hi, Ryan

Thank you very much for the input and queries. We will prepare an update to answer your questions by the 21st at the latest.

Flags: needinfo?(jamador)

Incident Report

This is an update of the preliminary report.

Summary

Our team was notified that 10 or more certificates were issued with the cRLIssuer field in the CRL Distribution Points extension. This field must not be included in the certificate profile according to section “7.1.2.11.2 CRL Distribution Points” of the Baseline Requirements for the Issuance and Management of Publicly-Trusted TLS Server Certificates (BR), version 2.0.1.

Impact

We have detected 837 active certificates affected by the incident (with the cRLIssuer field included), issued after 15 September 2023 when the BR 2.0.1 profiles came into force.

Once the information regarding the issuance error was received, processed and confirmed, the technical team reviewed the most recently issued certificates and found that the cRLIssuer field has not appeared in certificates since 2024-02-22. This is due to a planned configuration change that, as a side effect, broke the automatic inheritance mechanism in the certificate profiles. That mechanism propagated the configuration of the signing CA and was responsible for the field appearing. The change was introduced so that the OCSP server URL can be changed independently of the CA's value, and even removed if it becomes optional in future versions of the BR (as seems to be the trend in the CA/Browser Forum discussions). For this reason, the issuance of certificates was stopped for only two hours.

Timeline

All times are UTC.

2023-09-15:

  • CAs MUST use the updated Certificate Profiles defined in BR Version 2.0.1

2024-02-21:

  • An update of our systems causes certificates to be issued without the cRLIssuer. This change was introduced in order to be able to change the URL of the OCSP server independently of the value of the CA and even remove it if it is set as optional in future versions of the BR (as seems to be the trend).

2024-03-04:

  • An external observer sent a personal e-mail to the account accv@accv.es indicating a possible problem with the issuance. This email was not prioritised as urgent and was passed on for routine review by the support team.

2024-03-09:

  • 08:30 After a routine review of the incidents received and referred to the compliance office, a warning is detected involving incorrectly issued certificates.
  • 11:00 Urgent meeting of the technical committee to evaluate the incident and the scope. ACCV stops issuing certificates while the checks are being carried out.
  • 13:10 Mis-issuance confirmed. A report of impacted certificates is requested, and it is verified that certificates are now being issued correctly. ACCV resumes TLS certificate generation.
  • 16:30 The technical committee meets again to initiate contact with users, establish mechanisms to prevent a recurrence of these incidents, and draft the incident report.
  • 19:13 Publication of the first incident report. It was intended as a preliminary report that provided all the information available, but the type of report was not indicated.

2024-03-14:

  • 08:30 All affected certificates are revoked.

2024-03-21:

Root Cause Analysis

Changes introduced by new versions of the BR are verified manually, and until now no matrix of certificate profile fields had been created to verify that each field complies with the requirements. After the revision of BR version 2.0.1 (and previous version 2.0.0), several changes were detected, but not the removal of the cRLIssuer field.

The root causes detected are:

  • Changes introduced by new versions of the BR are verified manually and by a single person, and so far no certificate profile field matrix has been created to verify that each field complies with the requirements. After the revision of BR version 2.0.1, several changes were detected, but not the removal of the cRLIssuer field. The change was masked because the field does not appear explicitly in the certificate profile; it is inherited from the corresponding field in the Certification Authority configuration (this is how it appears in our EJBCA software). As a result, the cRLIssuer field appeared in certificates through the inherited extension, in the same way that it stopped being included once the inheritance was removed.
  • Our systems validate each pre-certificate generated with the latest version of Zlint (currently version 3.6.1) in a pre-issuance mode. If an error is detected at this point the pre-certificate is revoked and if it is correct it is sent to CTLogs. In no case has Zlint returned a pre-issuance error.
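
The pre-issuance gate described in the second bullet can be sketched as follows. This is an illustration, not ACCV's implementation: it assumes each linter's report has been normalised to a JSON object keyed by lint name with a `result` field (the shape ZLint's JSON output is assumed to have here), and the lint names in the example are hypothetical.

```python
import json

BLOCKING_RESULTS = {"error", "fatal"}


def blocking_lints(report_json: str) -> list:
    """Return the names of lints whose result should stop issuance."""
    report = json.loads(report_json)
    return sorted(
        name
        for name, outcome in report.items()
        if outcome.get("result") in BLOCKING_RESULTS
    )


def may_issue(reports: dict) -> bool:
    """Allow issuance only if no linter reports a blocking result."""
    return not any(blocking_lints(report) for report in reports.values())


# Hypothetical reports: the first linter has no rule for the field and passes,
# the second linter has a rule and flags it.
first_report = '{"e_hypothetical_other_check": {"result": "pass"}}'
second_report = '{"hypothetical_cdp_crl_issuer_present": {"result": "error"}}'

single_linter_ok = may_issue({"first": first_report})
two_linters_ok = may_issue({"first": first_report, "second": second_report})
```

A gate of this kind can only block what its linters have rules for: with the first report alone issuance proceeds, which mirrors the situation where no pre-issuance error was ever returned.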

In the days following the entry into force of BR version 2.0.1, a sample of the certificates generated was made, showing that:

  • They complied with the detected changes.
  • Zlint, x509lint and certlint reported no errors for them.

This led us to close the procedure for updating to the latest BR version.

Here we present the results of the “5 whys” root cause methodology that was followed:

Why was there a problem?

Because we were issuing certificates with the cRLIssuer field when it was not allowed.

Why were we issuing certificates with this field?

Because the non-compliance was not detected when the profiles were revised to adapt them to BR 2.0.1.

Why was this non-compliance not detected?

Because this field was automatically inherited from the CA, it was not taken into account in the revision by the responsible person (which included many other changes) and the problem was not detected with the lint tools used. In other words, there were no automatic tools and human error occurred.

Why did this human error occur?

Because the review was carried out by only one person.

Why was the review carried out by only one person?

Until the change to BR 2.0.0, a single reviewer had been sufficient. However, that version introduced many modifications, and fields that had been in place since we began issuing certificates are no longer allowed. This incident has allowed us to detect this point of failure and to begin putting measures in place to address it.

Lessons Learned

What went well

  • The response time from the detection of the incident (when non-compliance was confirmed 2024-03-09) until decisions were taken to correct it was adequate. The affected Certificates have been detected and the communication has been sent to the users in due time and form.

What didn't go well

  • The use of the generic ACCV e-mail in communications for reporting problems causes delays in notifying the responsible staff. To avoid this, we are going to change it to a specific email for problem reports.
  • The procedure for reviewing changes to certificate profiles has proven to be inadequate, as it does not establish a STRICT comparison between each field of the profile and the different sections of the BR.
  • The latest versions of linters that we were using have not detected this issue.

Where we got lucky

Action Items

| Action Item | Kind | Due Date |
| ----------- | ---- | -------- |
| The protocol for reviewing documentation associated with changes to certificate profiles will be improved. We will establish separate reviews by various team members following a matrix of certificate profile fields to check. The outcome of these reviews will be jointly evaluated by the compliance office. | Prevent | 2024-04-05 |
| In addition to continuing to use ZLint, include Pkilint as a complementary tool for pre-lint. This tool seems to be updated more frequently to ensure compliance. Other lints that may help in the early detection of mis-issuances will also be assessed periodically. | Detect | 2024-04-15 |
| Create an e-mail address only for communications of certificate issues that goes directly to the compliance and technical team responsible for these issues. | Mitigate | 2024-03-25 |
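
One way the matrix of certificate profile fields in the first action item could be represented is sketched below. The layout and names are hypothetical, not ACCV's actual matrix. Only the cRLIssuer row is taken from this report; the other two rows reflect our reading of BR section 7.1.2.11.2 and should be verified against the BR text before use.

```python
# Each profile field maps to the BR section that governs it and its status.
MATRIX = {
    ("cRLDistributionPoints", "distributionPoint"): ("7.1.2.11.2", "MUST"),
    ("cRLDistributionPoints", "reasons"): ("7.1.2.11.2", "MUST NOT"),
    ("cRLDistributionPoints", "cRLIssuer"): ("7.1.2.11.2", "MUST NOT"),
}


def review(profile_fields: set, matrix: dict) -> dict:
    """Compare the fields a profile emits against the matrix, field by field."""
    violations = sorted(
        field for field in profile_fields
        if field in matrix and matrix[field][1] == "MUST NOT"
    )
    missing = sorted(
        field for field, (_, status) in matrix.items()
        if status == "MUST" and field not in profile_fields
    )
    # Fields with no matrix row have not been reviewed against any BR section.
    unmapped = sorted(field for field in profile_fields if field not in matrix)
    return {"violations": violations, "missing": missing, "unmapped": unmapped}


# Illustrative profile: includes the forbidden field and one unreviewed field.
emitted = {
    ("cRLDistributionPoints", "distributionPoint"),
    ("cRLDistributionPoints", "cRLIssuer"),
    ("authorityInformationAccess", "ocsp"),
}
result = review(emitted, MATRIX)
```

Reporting unmapped fields separately makes gaps in the review visible, so a field that nobody compared against the BR cannot pass silently.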

Appendix

With regard to the questions asked:

Q1) The initial report does not describe requirements or plans related to certificate revocation, despite the BRs specifying this as an expected outcome. While I appreciate this was discussed in Comment 1, can you explain why this was omitted from the initial report?

We took it as understood that revocation was necessary and did not make it explicit. This was an error on our part. We have corrected this in the update of the report.

Q2) The Timeline Section describes, “08:30 After a routine review of the incidents received and referred to the compliance office, a warning is detected involving incorrectly issued certificates.”? This action took place nearly five days after the problem report was submitted. What circumstances led to this delay?

The email was picked up by our support system from the generic communication mailbox. Because there was no precedent of similar incidents and the problem did not affect issuance, its severity was not understood, and support held it for the next routine review of emails, which in this case was on the 9th of March. We have already taken measures by separating the problem reporting email address from the generic one, and a Bugzilla bug will be opened to report the delay in the response.

Q3) The “Root Cause Analysis” Section describes, “After the revision of BR version 2.0.1, several changes were detected, but not the removal of the cRLIssuer field.” Can you explain why profile changes were only identified after the release of BR Version 2.0.1 (4 months after Version 2.0 was released, where these changes were introduced)? Version 2.0.0 represented a significant update to the BRs, with conversations of changes serving as the focal point of several Validation Subcommittee and F2F meeting agendas (e.g., https://lists.cabforum.org/pipermail/validation/attachments/20220608/ea4bb526/attachment-0001.pdf) for several years.

The report refers to 2.0.1 because that was the version in force on 15 September. We have clarified this in the updated report.

Q4) Can you describe whether ACCV uses pre-issuance linting and why that was not considered part of the remediation of this incident? Relying solely on post-issuance linting seems to leave an opportunity for future incidents, whereas pre-issuance linting presents an opportunity to prevent them.

Yes, we use pre-issuance linting for all certificate issuance where applicable, including all TLS certificates. We use zlint in the latest version and as part of the solution we are going to use the pkilint tool for pre-issuance linting as well. We have tried to clarify this in the report update.

Q5) Can you describe how ACCV evaluates linting tools to fully comprehend each one's scope, capabilities, and limitations, including as updates are made available?

We tested on a recent set of certificates to see the associated messages by category. We reviewed the changes in the Github project and tested the latest changes according to the release notes. We are on the zlint mailing list to see what the future changes are and when they will be in production.

Q6) Can you describe how ACCV validates linting tools are working as expected?

We validate the operation of the versions before going into production on our test platform, where we can test the pre-linting associated with the issuance process, as well as test with a fresh set of certificates.

Q7) Beyond adopting pkilint, is ACCV planning to enhance the existing linting tools it relies on to better protect against this issue and others like it in the future (e.g., filing an enhancement request or contributing to open source tools)?

As it is such an important tool, we are looking at how we can help or collaborate.

Thank you very much

ACCV has opened a case in CCADB for the modification of the email address associated with the problem report mechanism from accv@accv.es to problem_reporting@accv.es. The change is pending confirmation from the root store reviewer.

Update on actions.

ACCV has changed its internal protocols to improve response to changes and incidents. Different roles have been added to the compliance office and the number of people receiving notifications from the dedicated mailbox for reporting errors and incidents has been increased to six. ACCV has initiated a training process to familiarise new managers with the documents and data sources involved in the compliance office. We hope that these improvements will avoid cases like the latest compliance problems.

We continue to work towards adopting pkilint in the pre-linting process, in addition to zlint. We are currently testing and expect to move a version of the issuance application that includes it to production within the deadline.

The reviewer has confirmed the mechanism for reporting problems in the CCADB. The new e-mail address is problem_reporting@accv.es

Action Items

| Action Item | Kind | Status | Due Date |
| ----------- | ---- | ------ | -------- |
| The protocol for reviewing documentation associated with changes to certificate profiles will be improved. We will establish separate reviews by various team members following a matrix of certificate profile fields to check. The outcome of these reviews will be jointly evaluated by the compliance office. | Prevent | Done | 2024-04-05 |
| In addition to continuing to use ZLint, include Pkilint as a complementary tool for pre-lint. This tool seems to be updated more frequently to ensure compliance. Other lints that may help in the early detection of mis-issuances will also be assessed periodically. | Detect | In progress | 2024-04-15 |
| Create an e-mail address only for communications of certificate issues that goes directly to the compliance and technical team responsible for these issues. | Mitigate | Done | 2024-03-21 |

Update on actions.

After a two-week testing process, ACCV has deployed to production the version of the issuing system that introduces pkilint (v0.10.0) as a pre-linting mechanism. ACCV is currently using pkilint and zlint as pre-lint tools. We hope that this will help us prevent future errors.

Action Items

| Action Item | Kind | Status | Due Date |
| ----------- | ---- | ------ | -------- |
| The protocol for reviewing documentation associated with changes to certificate profiles will be improved. We will establish separate reviews by various team members following a matrix of certificate profile fields to check. The outcome of these reviews will be jointly evaluated by the compliance office. | Prevent | Done | 2024-04-05 |
| In addition to continuing to use ZLint, include Pkilint as a complementary tool for pre-lint. This tool seems to be updated more frequently to ensure compliance. Other lints that may help in the early detection of mis-issuances will also be assessed periodically. | Detect | Done | 2024-04-15 |
| Create an e-mail address only for communications of certificate issues that goes directly to the compliance and technical team responsible for these issues. | Mitigate | Done | 2024-03-21 |

No further action is pending. We are monitoring this bug for further comments or questions.

Can you clarify when you say you lint the certs before sending them to CT:

When you’re sending them to the linters, have you signed them with the actual intermediate or are you signing them with something else at that point?

Hi, Amir

For the pre-lint we are sending the data to the linters in the pre-sign certificate phase. According to the supplier's instructions:

The validator is run on a certificate signed with a hardcoded dummy key, before any CT pre-certificate or final certificate has been produced. Used to validate certificate contents before the CAs private key is used. This is useful since signing a CT pre-certificate counts as issuance, and revocation is needed even if the CT pre-certificate has not been submitted to logs. Using a pre-sign certificate allows validation before any requirements are put on a CT pre-certificate or final certificates. The pre-sign certificate has the same contents as the final certificate except for the authorityKeyId, which is for the hardcoded dummy key (and the signature is different of course).
In order to use the same signature algorithm as the CA issued certificate, the dummy key is of the same key algorithm as the CAs signing key;

When we tested the different alternatives offered by the supplier this option seemed the cleanest because it did not involve the production keys in the process.

Furthermore, as an additional detection mechanism, the issued certificate is sent to the linters again once issuance has finished, so it passes through the linters twice: before and after issuance.
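
The ordering described in this comment (lint a pre-sign certificate made with a dummy key, sign with the CA key only if it is clean, then lint the issued certificate again) can be sketched as below. All names are hypothetical, and `sign` and the linters are stand-ins supplied by the caller rather than EJBCA or linter APIs.

```python
class LintRejected(Exception):
    """Raised when the pre-sign certificate fails linting."""


def issue(tbs: dict, sign, linters: list):
    """Lint before the CA key is used, then lint the issued certificate again."""
    presign = sign(tbs, "dummy")  # same contents, signed with a dummy key
    findings = [finding for lint in linters for finding in lint(presign)]
    if findings:
        raise LintRejected(findings)  # the CA key has not been used at this point
    certificate = sign(tbs, "ca")
    post_findings = [finding for lint in linters for finding in lint(certificate)]
    return certificate, post_findings


# Stand-ins for illustration: a signer that records which key was used and a
# single lint that flags the cRLIssuer field.
keys_used = []


def fake_sign(tbs: dict, key: str) -> dict:
    keys_used.append(key)
    return {"tbs": tbs, "key": key}


def lint_no_crl_issuer(certificate: dict) -> list:
    return ["cRLIssuer present"] if "cRLIssuer" in certificate["tbs"] else []


try:
    issue({"cRLIssuer": "CN=Example CA"}, fake_sign, [lint_no_crl_issuer])
    rejected = False
except LintRejected:
    rejected = True
keys_after_rejection = list(keys_used)
```

In this sketch the rejected request leaves `keys_after_rejection` containing only the dummy key, which is the property that makes pre-sign linting attractive: nothing counts as issued when the lint fails.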

Thank you for your questions
