Closed Bug 1792111 Opened 2 years ago Closed 2 years ago

IdenTrust: Expired CRLs

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: roots, Assigned: roots)

Details

(Whiteboard: [ca-compliance] [crl-failure])

Steps to reproduce:

  1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.

On 09/16/2022 our monitoring alerted us that the CRL issued by the “IdenTrust Commercial Root CA 1” root had expired. The expired CRL remained published for approximately 90 minutes, during which it was downloaded about 137K times.
This is a violation of section 4.10.2 of the CA/B Forum Baseline Requirements (quotes added):
The CA SHALL maintain an online 24x7 Repository that application software can use to automatically check the current status of “all unexpired Certificates issued by the CA.”

  2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.

2022-09-14 13:30 MDT: New CRL created to replace the CRL expiring on 2022-09-16
2022-09-16 14:00 MDT: “IdenTrust Commercial Root CA 1” root CRL expired
2022-09-16 14:07 MDT: Received alert from external monitor for expired CRL at http://validation.identrust.com/crl/commercialrootca1.crl
2022-09-16 15:33 MDT: Completed replacement of the expired CRL
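
As a quick cross-check of the "approximately 90 minutes" figure, the intervals implied by the timeline above can be computed directly. This is a minimal sketch that assumes MDT is UTC-6 and uses only the timestamps listed; the variable and function names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumption: MDT (Mountain Daylight Time) is UTC-6.
MDT = timezone(timedelta(hours=-6))

# Timestamps taken from the timeline above.
expired = datetime(2022, 9, 16, 14, 0, tzinfo=MDT)
alerted = datetime(2022, 9, 16, 14, 7, tzinfo=MDT)
replaced = datetime(2022, 9, 16, 15, 33, tzinfo=MDT)


def minutes_between(start: datetime, end: datetime) -> int:
    """Whole minutes elapsed from start to end."""
    return int((end - start).total_seconds() // 60)


time_to_alert = minutes_between(expired, alerted)     # expiry -> external alert
exposure_window = minutes_between(expired, replaced)  # expiry -> replacement

print(f"time to alert: {time_to_alert} min")
print(f"expired CRL served for: {exposure_window} min")
```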

  3. Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.
    Not applicable

  4. A summary of the problematic certificates. For each problem: number of certs, and the date the first and last certs with that problem were issued.
    Not applicable

  5. The complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem.
    Not applicable

  6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.
    As part of the remediation actions of #1709192, we had deployed a monitor to alert us to impending CRL expirations; once triggered, it would give us sufficient time to publish a new CRL before the expiration. On 9/16/2022 this monitor detected the upcoming CRL expiration, but the notification process failed to alert us, and as a result we did not replace the expiring CRL with a valid one. We are investigating the root cause of the notification failure.

  7. List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.
    We will provide an update on or before September 30, 2022 on our investigation of the root cause. Once we have identified the root cause, we will also communicate the steps we will take to avoid recurrence.

Assignee: bwilson → roots
Status: UNCONFIRMED → ASSIGNED
Type: defect → task
Ever confirmed: true
Whiteboard: [ca-compliance]

Updates:
6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.
As part of the remediation actions of #1709192, we deployed a monitor to alert us to impending CRL expirations; once triggered, it would give us sufficient time to publish a new CRL before the expiration. On 9/16/2022 this monitor detected the upcoming CRL expiration, but the notification process failed to alert us, and as a result we did not replace the expiring CRL with a valid one. The notification was delivered via a mail system that was not functioning as expected due to an incorrect configuration. The incorrect configuration was applied by our automated configuration management platform during our last system update.

  7. List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.
    The mail system has been updated with the correct configuration. In addition, the automated configuration management platform has been updated to retain the correct configuration during future system updates. We have no further actions and consider this issue resolved.
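
The failure mode described in this update (the monitor detected the upcoming expiration, but its only notification channel was silently broken) can be illustrated with a minimal sketch. This is not IdenTrust's monitor; it is a hypothetical Python example under stated assumptions: the CRL's nextUpdate time has already been parsed by some other tool, and `Notifier`, `crl_needs_alert`, `notify_all`, and `check_crl` are made-up names. The idea shown is that a detected condition should try every channel and report explicitly when none of them delivered.

```python
from datetime import datetime, timedelta
from typing import Callable, Iterable

# Hypothetical type: any callable that delivers a message or raises on failure.
Notifier = Callable[[str], None]


def crl_needs_alert(next_update: datetime, now: datetime, lead_time: timedelta) -> bool:
    """True once the CRL is within lead_time of nextUpdate, or already past it."""
    return now >= next_update - lead_time


def notify_all(message: str, notifiers: Iterable[Notifier]) -> int:
    """Try every channel and return how many accepted the message."""
    delivered = 0
    for send in notifiers:
        try:
            send(message)
            delivered += 1
        except Exception:
            # One broken channel (for example a misconfigured mail relay)
            # must not prevent the remaining channels from being tried.
            continue
    return delivered


def check_crl(next_update: datetime, now: datetime,
              lead_time: timedelta, notifiers: Iterable[Notifier]) -> str:
    """Return 'ok', 'alerted', or 'alert-undelivered' for one monitoring pass."""
    if not crl_needs_alert(next_update, now, lead_time):
        return "ok"
    message = f"CRL nextUpdate at {next_update.isoformat()} is approaching or has passed"
    delivered = notify_all(message, notifiers)
    return "alerted" if delivered else "alert-undelivered"
```

In this sketch, an "alert-undelivered" result is the case that matters for this incident: the caller can treat it as a failure in its own right (for example by exiting non-zero so a scheduler flags the run) instead of letting a detected expiration go unreported.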

Thank you for the incident report.

Related to #6:

  • How was the incorrect configuration applied? (i.e., can you share more about the specific circumstances that allowed the incorrect configuration to be applied?)
  • The remediation for 1709192 appears to have been implemented in 2021; we are curious to know when the system update occurred in the timeline of actions provided.
  • Before the system update, was the notification feature working as expected? Do you suspect there were other similar incidents in the past?

Related to #7:

  • Has the automated configuration management platform been tested to ensure the updated configuration is retained for future system updates? If so, can you please describe how? (i.e., have you designed both positive and negative test cases for the CRL monitoring solution to determine it’s working properly?)

We acknowledge receipt and will provide an answer no later than 10/10/2022.

Related to #6:
• How was the incorrect configuration applied? (i.e., can you share more about the specific circumstances that allowed the incorrect configuration to be applied?)

The mail server configuration is designed to be applied via a configuration management tool. The master configuration management software runs on a centralized "master" device, and several "client" servers each run a "client" configuration service. Synchronization is bidirectional: the configuration can be updated either at the centralized "master" or at the "client" running on an individual server. In the situation reported in bug 1792111, the master did not have the same mail server configuration as the client machine running the mail service. We believe a more up-to-date mail server configuration was running on the client server before the update, and the client and master configurations were not synced beforehand. The client machine was updated and then restarted. After the restart, the client-side configuration was no longer available on the mail server, so the client retrieved the most recent copy from the master device. As a result, an old configuration was applied to the mail server, which broke email delivery.
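
The missing step described above (syncing the client and master configurations before the update and restart) amounts to a drift check. The following is a minimal sketch in Python, not the configuration management tool actually used; the function names and the sample `relayhost` values are hypothetical and serve only to illustrate comparing the two copies before a restart is allowed to overwrite one with the other.

```python
import difflib
import hashlib


def digest(config_text: str) -> str:
    """SHA-256 of a configuration file's contents."""
    return hashlib.sha256(config_text.encode("utf-8")).hexdigest()


def in_sync(master_text: str, client_text: str) -> bool:
    """True only if the master and client copies are byte-for-byte identical."""
    return digest(master_text) == digest(client_text)


def drift_report(master_text: str, client_text: str) -> list[str]:
    """Unified diff (master -> client) for a human to review before syncing."""
    return list(difflib.unified_diff(
        master_text.splitlines(),
        client_text.splitlines(),
        fromfile="master",
        tofile="client",
        lineterm="",
    ))


# Hypothetical contents: the client carries a newer setting than the master.
master_copy = "relayhost = old-relay.example\n"
client_copy = "relayhost = new-relay.example\n"

if not in_sync(master_copy, client_copy):
    # Block the update/restart until the difference is reconciled.
    print("\n".join(drift_report(master_copy, client_copy)))
```

Running a check like this before the restart would surface the difference while the newer client-side copy still exists, which is the point at which it can be pushed back to the master.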

• The remediation for 1709192 appears to have been implemented in 2021; we are curious to know when the system update occurred in the timeline of actions provided.

The mail server system update was applied on 8/16/2022.

• Before the system update, was the notification feature working as expected? Do you suspect there were other similar incidents in the past?

Before the mail server system update, the notification feature was working as expected. To our knowledge, there were no similar mail server incidents prior to this incident.

Related to #7:
• Has the automated configuration management platform been tested to ensure the updated configuration is retained for future system updates? If so, can you please describe how? (i.e., have you designed both positive and negative test cases for the CRL monitoring solution to determine it’s working properly?)

Automated configuration management has been eliminated in the new design and is no longer applicable going forward. The notification system has been thoroughly tested, and mail notifications are flowing through as expected.
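
One way to express the positive and negative test cases asked about above is a heartbeat check on the notification path itself. This is a hypothetical sketch in Python, not a description of the tests IdenTrust ran: it assumes the monitor sends a periodic test message and that the receiving side records when the last one arrived. The function name and the 24-hour threshold are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone


def notification_path_healthy(last_heartbeat_received: datetime,
                              now: datetime,
                              max_silence: timedelta) -> bool:
    """True if a test notification arrived within the allowed silence window."""
    return now - last_heartbeat_received <= max_silence


now = datetime(2022, 10, 10, 12, 0, tzinfo=timezone.utc)
max_silence = timedelta(hours=24)  # assumed threshold

# Positive case: a heartbeat arrived one hour ago, so the path is working.
positive = notification_path_healthy(now - timedelta(hours=1), now, max_silence)

# Negative case: nothing has arrived for three days, so the path is broken.
negative = notification_path_healthy(now - timedelta(days=3), now, max_silence)

print(positive, negative)
```

The negative case is the one that would have caught a mail outage like the one in this incident, because silence from the notification path is treated as a failure instead of as "nothing to report".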

This issue does not require any additional remediation.

Flags: needinfo?(bwilson)

I'll close this on or about next Wed. 2-Nov-2022.

Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Flags: needinfo?(bwilson)
Resolution: --- → FIXED
Product: NSS → CA Program
Summary: IdenTrust Expired CRLs → IdenTrust: Expired CRLs
Whiteboard: [ca-compliance] → [ca-compliance] [crl-failure]