Open Bug 1885132 Opened 2 months ago Updated 6 hours ago

TWCA: TLS certificates with non-critical basicConstraints

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

ASSIGNED

People

(Reporter: hcli, Assigned: hcli, NeedInfo)

Details

(Whiteboard: [ca-compliance] [ov-misissuance] [ev-misissuance] Next update 2024-05-30)

Attachments

(1 file)

Steps to reproduce:

Incident Report

This is a preliminary report.

Summary

During the investigation of a previous bug (https://bugzilla.mozilla.org/show_bug.cgi?id=1883620), we ran pkilint against our certificates and discovered that some certificates did not have the basicConstraints extension marked as critical, which does not comply with BR Section 7.1.2.7.
We have confirmed that recently issued TLS certificates are not affected.
We are still investigating the issue and looking for affected certificates.
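
To make the finding concrete: an X.509 extension is encoded as a SEQUENCE holding an OID, an optional critical BOOLEAN (DEFAULT FALSE), and an OCTET STRING value. The following is a minimal, standard-library-only sketch of reading that flag for basicConstraints (OID 2.5.29.19). It is not pkilint, ZLint, or TWCALinter code; the helper names and the hand-built sample bytes are assumptions made for illustration.

```python
# Illustrative sketch only: reads the "critical" flag of one DER-encoded
# X.509 Extension. Helper names and sample bytes are assumptions made for
# this example, not taken from pkilint, ZLint, or TWCALinter.

BASIC_CONSTRAINTS_OID = bytes.fromhex("551d13")  # DER content of OID 2.5.29.19


def read_tlv(data: bytes, offset: int):
    """Return (tag, value, next_offset) for the DER TLV starting at offset."""
    tag = data[offset]
    first = data[offset + 1]
    if first & 0x80:  # long-form length: low 7 bits give the number of length bytes
        n = first & 0x7F
        length = int.from_bytes(data[offset + 2:offset + 2 + n], "big")
        start = offset + 2 + n
    else:  # short-form length
        length = first
        start = offset + 2
    return tag, data[start:start + length], start + length


def extension_criticality(ext_der: bytes):
    """Return (oid_bytes, is_critical) for a single Extension SEQUENCE."""
    tag, body, _ = read_tlv(ext_der, 0)
    if tag != 0x30:
        raise ValueError("Extension must be a SEQUENCE")
    tag, oid, pos = read_tlv(body, 0)
    if tag != 0x06:
        raise ValueError("Extension must start with an OBJECT IDENTIFIER")
    tag, value, _ = read_tlv(body, pos)
    # The BOOLEAN is omitted when it has its default value (FALSE).
    is_critical = tag == 0x01 and value != b"\x00"
    return oid, is_critical


# basicConstraints with an empty value (cA defaults to FALSE), marked critical.
CRITICAL_BC = bytes.fromhex("300c0603551d130101ff04023000")
# Same extension with the critical BOOLEAN omitted, i.e. non-critical.
NON_CRITICAL_BC = bytes.fromhex("30090603551d1304023000")

for name, der in [("critical", CRITICAL_BC), ("non-critical", NON_CRITICAL_BC)]:
    oid, critical = extension_criticality(der)
    if oid == BASIC_CONSTRAINTS_OID and not critical:
        print(f"{name}: basicConstraints is not marked critical")
    else:
        print(f"{name}: ok")
```

The two sample encodings differ only in the three bytes `01 01 ff`, the critical BOOLEAN. A production check would parse the full certificate with a maintained ASN.1 library rather than a hand-rolled reader.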

Timeline

All times are UTC+8.

2023-04-22:

  • TLS BR 2.0.0 was published.

2023-08-31:

  • We have reviewed the certificate profile changes in TLS BR 2.0.0 and prepared new certificate profiles.

2023-09-15:

  • 08:00 TLS BR 2.0.0 has become effective.

2024-03-05:

2024-03-13:

  • 12:00 As part of the investigation of the previous bug, we ran pkilint against our certificates and discovered that some certificates triggered a warning because basicConstraints was not set as critical. We confirmed this is a separate incident.
  • 13:00 We have confirmed that the recently issued TLS certificates are not affected by this issue. We are still investigating when and for what reasons the affected certificates were issued.
  • 20:30 Posting this preliminary report.
Assignee: nobody → hcli
Status: UNCONFIRMED → ASSIGNED
Type: defect → task
Ever confirmed: true
Whiteboard: [ca-compliance]

Before this incident, were any linting tools used at all?

(In reply to amir from comment #1)

Before this incident, were any linting tools used at all?

We currently use ZLint and our self-developed TWCALinter tool for pre-issuance checks of SSL certificates. The preliminary investigation of the version information is as follows.

  • 2019/06/26: First adopted TWCALinter v4.4.2.4.
  • 2020/07/21: First adopted ZLint v2.0.0 and TWCALinter v4.4.2.7.
  • 2020/10/28: Adopted ZLint v2.2.0 and TWCALinter v4.4.2.9.
  • 2021/10/28: Adopted ZLint v3.2.0 and TWCALinter v4.4.3.0.
  • 2022/06/10: Adopted ZLint v3.3.1 and TWCALinter v4.4.7.0.
  • 2023/12/31: Adopted ZLint v3.5.0 and TWCALinter v5.0.0.0.
Attached file Affected certificates

Incident Report

Summary

During the investigation of a previous bug (https://bugzilla.mozilla.org/show_bug.cgi?id=1883620), we ran pkilint against our certificates and discovered that some certificates did not have the basicConstraints extension marked as critical, which does not comply with BR Section 7.1.2.7.
We have confirmed that recently issued TLS certificates are not affected.

Impact

Since 2023-09-15, 75 EV TLS certificates and 16406 OV TLS certificates were issued with non-critical basicConstraints.
According to the BRs, these certificates must be revoked within 5 days.

Timeline

All times are UTC+8.

2023-04-22:

  • TLS BR 2.0.0 was published.

2023-08-31:

  • We have reviewed the certificate profile changes in TLS BR 2.0.0 and prepared new certificate profiles.

2023-09-15:

  • 08:00 TLS BR 2.0.0 has become effective.

2024-03-05:

2024-03-13:

  • 12:00 As part of the investigation of the previous bug, we ran pkilint against our certificates and discovered that some certificates triggered a warning because basicConstraints was not set as critical. We confirmed this is a separate incident.
  • 13:00 We have confirmed that the recently issued TLS certificates are not affected by this issue. We are still investigating when and for what reasons the affected certificates were issued.
  • 13:30 We have begun an investigation into the scope of the impact.
  • 14:00 TWCA started contacting customers for certificate replacement.
  • 20:30 Posting this preliminary report.

2024-03-14:

  • 12:00 We have calculated the exact number of certificates affected and are continuing with the certificate replacement operations.

2024-03-15:

  • 16:20 Uploaded the list of affected certificates.

Incident Report

Summary

During the investigation of a previous bug (https://bugzilla.mozilla.org/show_bug.cgi?id=1883620), we ran pkilint against our certificates and discovered that some certificates did not have the basicConstraints extension marked as critical, which does not comply with BR Section 7.1.2.7.

Impact

Since 2023-09-15, 75 EV TLS certificates and 16406 OV TLS certificates were issued with non-critical basicConstraints.
Roughly 85% of the affected certificates were revoked or had expired within 5 days. We have created separate bugs for the delayed revocation of the remaining EV and OV certificates.

Timeline

All times are UTC+8.

2023-04-22:

  • TLS BR 2.0.0 was published.

2023-08-31:

  • We have reviewed the certificate profile changes in TLS BR 2.0.0 and prepared new certificate profiles. We expected the profile to be configured before 09-15.

2023-09-15:

  • TLS BR 2.0.0 has become effective.
  • We validated the configuration with our existing linters and no errors were reported. Therefore, we did not detect that the profile changes had not been fully configured.

2024-02-05:

  • The operational personnel discovered the configuration failure and rectified it, but did not raise an incident.

2024-03-05:

2024-03-13:

  • 12:00 As part of the investigation of the previous bug, we ran pkilint against our certificates and discovered that some certificates triggered a warning because basicConstraints was not set as critical. We confirmed this is a separate incident.
  • 13:00 We confirmed that the recently issued TLS certificates are not affected by this issue. We were still investigating when and for what reasons the affected certificates were issued.
  • 20:30 Posting a preliminary report.

2024-03-14:

  • 12:00 We calculated the exact number of affected certificates and continued with the certificate replacement operations.
  • 15:30 We confirmed that the cause of mis-issuance was incorrect profile configuration, and we discovered that the configuration was rectified on 2/5.

2024-03-15:

  • 15:42 Uploaded the list of affected certificates.

Root Cause Analysis

Although we were aware of the TLS BR changes and provided the certificate profile in advance, the operational team failed to configure it within the expected timeframe. Moreover, the validation mechanism for profile changes proved inadequate, leading to the issuance of certificates with incorrect formatting.

Another contributing factor to our failure to detect this issue earlier was our heavy reliance on linting tools for issue identification. Specifically, we primarily rely on zlint, which did not report any errors because the TLS BR 2.0.0 changes were only implemented in a new version released months after TLS BR 2.0.0 became effective, and we had not completed testing and integration of that version until now.

Furthermore, the operational personnel independently discovered the configuration failure and rectified it without grasping the implications of this change. The team solely depended on the linting tool's results, which remained normal both before and after the profile change. Consequently, they failed to recognize it as an incident and neglected to report it to the compliance team. As a result, the mis-issuance went undetected until now.

Lessons Learned

What went well

N/A

What didn't go well

  • The validation mechanism for profile changes was inadequate.
  • Delays occurred before the linter could be updated to check for the new requirement.
  • We did not update the linter frequently enough to detect the issue earlier.
  • The operational team did not recognize the error in profile configuration as a compliance incident.

Where we got lucky

N/A

Action Items

Action Item | Kind | Due Date
Enhance training for operational personnel to ensure they understand the meaning of each profile change, enabling them to verify the configuration of each change during operations. | Prevent | 2024-04-30
Add pkilint to the certificate issuance process. | Prevent/Detect | 2024-06-30
Establish a procedure to check at least once a week whether the certificate linter tools have been updated, and upon discovering an update, initiate a series of self-checks and tool update operations. This procedure will be documented in the TWCA internal ISMS documents. | Prevent/Detect | 2024-04-30
Reinforce communication across departments to ensure that all operations and events potentially related to compliance are reported to the compliance team. | Detect | 2024-03-30

Appendix

Details of affected certificates

See Comment#3

Whiteboard: [ca-compliance] → [ca-compliance] [ov-misissuance] [ev-misissuance]

We continue to monitor the incident and complete action items as planned.

Hi Chya-Hung and Hao-Chun Li,

Thanks for filing this report, the timely updates, and for proactively evaluating your certificate corpus for the potential of other profile non-conformance issues in response to Bug 1883620.

Comment #1: Bug 1875820 was filed on January 22, 2024, and demonstrated the same mis-issuance we observe in this report (i.e., basicConstraints not marked critical).

Question #1: How is TWCA monitoring Bugzilla such that it is able to identify potential issues of its own and remediate them in a timely manner?

Question #2: Can you help us understand the circumstances that resulted in TWCA marking basicConstraints as critical? Based on the incident timeline (February 5, 2024 entry), it seems operational personnel discovered the issue and updated the relevant issuance profiles, but it’s not clear what led to the issue’s discovery and remediation.

Question #3: You mention a heavy reliance on zlint. Given you are now adopting pkilint, how do you intend to avoid the same mistake?

Question #4: Can you describe how TWCA evaluates linting tools to fully comprehend each one's scope, capabilities, and limitations, including as updates are made available - including tools that might not yet be in use today (e.g., pkilint)?

Question #5: Can you describe how TWCA validates that linting tools are working as expected?

Thanks,
Ryan

Flags: needinfo?(hcli)

(In reply to Ryan Dickson from comment #7)

Hi Chya-Hung and Hao-Chun Li,

Thanks for filing this report, the timely updates, and for proactively evaluating your certificate corpus for the potential of other profile non-conformance issues in response to Bug 1883620.

Comment #1: Bug 1875820 was filed on January 22, 2024, and demonstrated the same mis-issuance we observe in this report (i.e., basicConstraints not marked critical).

Question #1: How is TWCA monitoring Bugzilla such that it is able to identify potential issues of its own and remediate them in a timely manner?

We will set up a schedule to regularly retrieve incidents created in Bugzilla from the previous week and send a summary to the relevant personnel. These will then be discussed in the weekly meeting, with the meeting content including self-assessment and review.

Question #2: Can you help us understand the circumstances that resulted in TWCA marking basicConstraints as critical? Based on the incident timeline (February 5, 2024 entry), it seems operational personnel discovered the issue and updated the relevant issuance profiles, but it’s not clear what led to the issue’s discovery and remediation.

Originally, we expected to adopt the new profile by 8/31, but due to an oversight by the maintenance staff when applying the settings, the configuration was not completed. It was not until 2/5, when the new version of the CA software was launched and the system settings were checked, that the settings were found to be incomplete. The maintenance staff responded by configuring the settings directly. After checking that there were no anomalies in issuance, they concluded that the delayed change had not caused any impact and that no further notification or action was needed.

Question #3: You mention a heavy reliance on zlint. Given you are now adopting pkilint, how do you intend to avoid the same mistake?

  • We believe that tools serve as aids, and everything is still guided by the BRs as the highest principle. Regarding the tools, we will set up a daily scheduled job that checks the targets we monitor (including the GitHub sites of the lint tools) and sends the results to the relevant personnel. If a monitored target has a new version, related assessments will be conducted, including checks on issued certificates and update evaluations.
  • If the lint tool has known issues that it does not yet cover, we will add those checks to our self-developed lint tool.
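
The version check described above can be sketched as follows. This is a hypothetical illustration, not TWCA's implementation: the function names are invented, fetching the latest tag from a release page is left out, and the "latest" values are placeholders. Only the deployed versions come from the adoption history given earlier in this bug.

```python
# Hypothetical sketch: decide whether a monitored lint tool has a newer release
# than the version currently deployed. Fetching the latest tag (e.g. from the
# tool's release page) is left out; only the comparison step is shown.

def parse_version(tag: str) -> tuple:
    """Turn a tag such as 'v3.5.0' into a comparable tuple (3, 5, 0)."""
    return tuple(int(part) for part in tag.lstrip("vV").split("."))


def needs_review(deployed: str, latest: str) -> bool:
    """True when the latest published tag is newer than the deployed one."""
    return parse_version(latest) > parse_version(deployed)


# Deployed versions are taken from the adoption history in this bug;
# the "latest" tags below are placeholders for illustration.
deployed = {"ZLint": "v3.5.0", "TWCALinter": "v5.0.0.0"}
latest = {"ZLint": "v3.6.0", "TWCALinter": "v5.0.0.0"}

for tool, version in deployed.items():
    if needs_review(version, latest[tool]):
        print(f"{tool}: {version} -> {latest[tool]}, start assessment")
    else:
        print(f"{tool}: {version} is current")
```

Comparing parsed tuples rather than raw strings avoids ordering "v3.10.0" before "v3.5.0". Pre-release tags such as "-rc1" are not handled by this sketch.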

Question #4: Can you describe how TWCA evaluates linting tools to fully comprehend each one's scope, capabilities, and limitations, including as updates are made available - including tools that might not yet be in use today (e.g., pkilint)?

  • When we notice a new tool, we will conduct tests to verify the tool's usability and evaluate whether to incorporate it. However, we cannot guarantee the extent of its coverage in terms of checks.
  • We believe that using a variety of tools can mitigate the issue of insufficient detection by a single tool.
  • We look forward to having a lint tool released and maintained under the CA/B Forum's name for all CAs to use, which would be updated synchronously with any changes to the BRs.

Question #5: Can you describe how TWCA validates that linting tools are working as expected?

Before the system goes live, we will conduct tests using erroneous certificates for validation to ensure that the lint tool can detect these anomalies. After the system is launched, our internal audit will perform regular checks to ensure that the certificate issuance process works as expected. These checks include random inspections of transaction logs to ensure that each certificate is checked by the tool before being uploaded to the CT log server.
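
A minimal sketch of the negative-fixture validation described above, under stated assumptions: certificates are modelled as plain dicts, and the toy linter, rule name, and fixture are all hypothetical. It illustrates the idea that a linter which lacks a rule will pass a bad certificate, and that a self-test with known-bad fixtures exposes that gap.

```python
# Hypothetical sketch of a linter self-test: every known-bad fixture must be
# flagged, otherwise the linter setup itself is treated as broken. The dict
# layout, rule names, and fixtures are assumptions made for this example.

def lint(cert: dict) -> list:
    """Toy linter: returns the names of the rules that the certificate violates."""
    findings = []
    bc = cert.get("extensions", {}).get("basicConstraints")
    if bc is not None and not bc.get("critical", False):
        findings.append("basic_constraints_not_critical")
    return findings


# Each negative fixture names the finding the linter is expected to report.
NEGATIVE_FIXTURES = [
    (
        "non-critical basicConstraints",
        {"extensions": {"basicConstraints": {"critical": False, "ca": False}}},
        "basic_constraints_not_critical",
    ),
]


def self_test(linter, fixtures) -> list:
    """Return descriptions of the fixtures that the linter failed to flag."""
    return [
        description
        for description, cert, expected in fixtures
        if expected not in linter(cert)
    ]


missed = self_test(lint, NEGATIVE_FIXTURES)
print("undetected fixtures:", missed)

# A linter without the rule (as with a version predating the requirement)
# passes the bad certificate, and the self-test surfaces that gap.
missed_by_old = self_test(lambda cert: [], NEGATIVE_FIXTURES)
print("undetected by rule-less linter:", missed_by_old)
```

Adding a fixture for each new BR requirement before its effective date means the self-test fails until the deployed linter actually covers it.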

Thanks,
Ryan

We have completed an operational checklist that covers the procedures for regulatory confirmation as well as the review processes with the compliance team, ensuring that the operations team and the compliance team have established horizontal communication channels.

Action Items

Action Item | Kind | Due Date
Reinforce communication across departments to ensure that all operations and events potentially related to compliance are reported to the compliance team. | Detect | Done
Enhance training for operational personnel to ensure they understand the meaning of each profile change, enabling them to verify the configuration of each change during operations. | Prevent | 2024-04-30
Establish a procedure to check at least once a week whether the certificate linter tools have been updated, and upon discovering an update, initiate a series of self-checks and tool update operations. This procedure will be documented in the TWCA internal ISMS documents. | Prevent/Detect | 2024-04-30
Add pkilint to the certificate issuance process. | Prevent/Detect | 2024-06-30

(In reply to chtsai from comment #2)

(In reply to amir from comment #1)

Before this incident, were any linting tools used at all?

We currently use ZLint and our self-developed TWCALinter tool for pre-issuance checks of SSL certificates.

Is there any public documentation for TWCALinter, particularly in regard to the list of checks that it performs? Is the code open-source?

(In reply to Rob Stradling from comment #10)

(In reply to chtsai from comment #2)

(In reply to amir from comment #1)

Before this incident, were any linting tools used at all?

We currently use ZLint and our self-developed TWCALinter tool for pre-issuance checks of SSL certificates.

Is there any public documentation for TWCALinter, particularly in regard to the list of checks that it performs? Is the code open-source?

Hi Rob,
TWCALinter is a tool we developed in 2019 that has not been made public. Because it was developed early on, many of its checks have gradually been covered by ZLint, and with ZLint's continuous updates our tool is no longer as complete as ZLint. It has received little maintenance after 2023 and is currently used mainly for local business logic checks. If you are interested, we can provide the source code privately.

The current draft of our internal ISMS documents has been completed and is now undergoing the review and issuance process. We request to set the next update for April 30, 2024. Thank you.

Progress Update (synchronized with Bug 1883620):

  • The internal procedures have been established, relevant ISMS documents have been issued, and the company is currently complying with the standards in its operations.
  • We use automated scheduling to monitor Bugzilla and lint tools. This schedule sends information via email to internal colleagues at daily and weekly intervals:
    • Daily monitoring: An email is sent out every day to monitor standards (such as BRs) or new release information of lint tools, and version change information is marked in the email.
    • Weekly monitoring: An email is sent every Monday summarizing incidents created on Bugzilla from the previous week, which are then reviewed in weekly meetings to analyze the causes of incidents and whether similar situations have occurred within the company.
  • The implementation of pkilint is progressing smoothly; software development has been completed and is currently in the testing phase, expected to be completed on schedule.
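
The weekly Bugzilla summary described above could start from a REST search query like the one sketched here. This is an assumption-laden illustration, not TWCA's tooling: the function name is hypothetical, the product and component names are taken from this bug's header, and the search parameters should be confirmed against the Bugzilla REST documentation before use. Sending the request and e-mailing the results are left out.

```python
# Hypothetical sketch: build a Bugzilla REST search URL for bugs filed in the
# CA compliance component during the previous week. The product and component
# names come from this bug's header; the rest of the workflow (sending the
# request, e-mailing the summary) is intentionally left out.
from datetime import date, timedelta
from urllib.parse import urlencode

BUGZILLA_REST = "https://bugzilla.mozilla.org/rest/bug"


def weekly_query_url(today: date) -> str:
    """Return a search URL for bugs created in the 7 days before `today`."""
    since = today - timedelta(days=7)
    params = {
        "product": "CA Program",
        "component": "CA Certificate Compliance",
        "creation_time": since.isoformat(),  # bugs created at or after this date
        "include_fields": "id,summary,creation_time",
    }
    return f"{BUGZILLA_REST}?{urlencode(params)}"


# Example for a Monday run; the date is arbitrary.
print(weekly_query_url(date(2024, 3, 18)))
```

Keeping URL construction separate from the network call makes the query easy to review and to test without contacting Bugzilla.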

Action Items

Action Item | Kind | Due Date
Reinforce communication across departments to ensure that all operations and events potentially related to compliance are reported to the compliance team. | Detect | Done
Enhance training for operational personnel to ensure they understand the meaning of each profile change, enabling them to verify the configuration of each change during operations. | Prevent | Done
Establish a procedure to check at least once a week whether the certificate linter tools have been updated, and upon discovering an update, initiate a series of self-checks and tool update operations. This procedure will be documented in the TWCA internal ISMS documents. | Prevent/Detect | Done
Add pkilint to the certificate issuance process. | Prevent/Detect | 2024-06-30

We request to set the next update for May 30, 2024. Thank you.

Whiteboard: [ca-compliance] [ov-misissuance] [ev-misissuance] → [ca-compliance] [ov-misissuance] [ev-misissuance] Next update 2024-05-30