Closed Bug 1794050 Opened 2 years ago Closed 2 years ago

DigiCert: Org information issue in new validation workflow

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jeremy.rowley, Assigned: jeremy.rowley)

Details

(Whiteboard: [ca-compliance] [ov-misissuance] [ev-misissuance])

Attachments

(2 files, 1 obsolete file)

Attached image UX change.png

Steps to reproduce:

  1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.

On September 29th, 2022, the DigiCert validation team detected an anomaly in an issued certificate. Specifically, we detected one certificate with “Name: [org name]” in the O field of the certificate. Compliance and engineering began investigating. On September 30th, the validation team detected an issue with the same system where a DBA was included in an EV certificate without adequate validation agent approval. The issues related to enhancements we made to our validation system and were detected and reported by DigiCert staff.

  2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.

On Sep 29, 2022, the DigiCert validation team detected an anomaly in an issued certificate. Specifically, we detected one certificate with “Name: [org name]” in the O field of the certificate. Compliance and Engineering teams began investigating. Engineering and Support determined the root cause of the mis-issued certificate related to a UI issue where the validation staff could select more than the company name while validating information. DigiCert staff searched for any additional certificates.

On Sep 30, 2022, the DigiCert team detected another issued cert with a DBA that was not approved for inclusion in the certificate. We concluded that the issue was related to the new validation experience but was slightly different from the issue on the 29th. The issue with the second certificate was with an improved validation proxy adding a slim name to the certificate as a DBA before the certificate is passed to the CA for issuance. We found the issue and fixed the primary issue with the validation proxy late on Sept 30th by updating the code to not include a DBA unless the DBA was validated and approved by the validation staff.

On Oct 3, 2022, we continued to investigate the issue and potential related issues. We did not find any related issues, but we found that cached validation information remained intact. This caused issuance of additional certificates with non-verified DBA information. Late on Oct 3rd, we fixed this DBA issue by clearing the cached results, updating the validation information, and confirming correct issuance of impacted accounts.

On Oct 4, 2022, we monitored the certificate issuance system for additional anomalies. We did not detect any additional anomalies.

On Oct 5, 2022, Compliance performed a final sweep to confirm the list of impacted certificates, determined which customers needed to replace certificates, and initiated the revocation process.

On Oct 6, 2022, while preparing the incident report, we reviewed the validation information and code, verifying the information. We found that our high-volume issuing system (compared to the regular issuing system) contained DBA information in a cached copy of the validation information. We remediated by clearing the high-volume certificate cache.

  3. Whether your CA has stopped, or has not yet stopped, certificate issuance or the process giving rise to the problem or incident. A statement that you have stopped will be considered a pledge to the community; a statement that you have not stopped requires an explanation.

We have located and fixed the issues by updating our UI, updating the code of the proxy passing information to the CA, and by clearing all cached information.
We are looking at additional unit tests, sanity checks, and UI changes we can make to prevent similar issues.

  4. In a case involving certificates, a summary of the problematic certificates. For each problem: the number of certificates, and the date the first and last certificates with that problem were issued. In other incidents that do not involve enumerating the affected certificates (e.g. OCSP failures, audit findings, delayed responses, etc.), please provide other similar statistics, aggregates, and a summary for each type of problem identified. This will help us measure the severity of each problem.

The first bad certificate issued Sep 14, 2022. The last certificate issued Oct 3, 2022. We deployed the changes that caused the “Name:” issue on Aug 8th, 2022. Although the code was the same, EV certificates were not sent through the proxy before Sep 14, 2022; we switched EV certificate issuance to the proxy as a test on that date.

There were two primary root causes. First, the UI made the exact org name selected for validation potentially confusing. In this case, the validation agent selected “Name: [org name]” in the government record. The record showed up in the validation case as “Name: [org name]”, which made it look like an updated UI. This did not raise red flags as we are making improvements to the validation interface to highlight potential risks and streamline validation processes. I have attached a screenshot of the UI before and after to help illustrate what happened.

The second root cause related to the first because both resulted from changes made to the validation engine to improve usability, reduce the potential for errors, and streamline the validation procedure. The second issue included a proxy that tied the validation information back to the CA. This agent enhances the efficiency of our validation system with integration with the new UI. The proxy was adding shortened company names to EV certs as a DBA. This addition of the DBA met the rules associated with EV but the addition of the DBA was not approved by our validation staff.

  5. In a case involving certificates, the complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem. In other cases not involving a review of affected certificates, please provide other similar, relevant specifics, if any.

See attached spreadsheet.

  6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

The mistake was two-fold. First, the UI provided insufficient warning that the org name included non-org name information. The layout made the issue difficult to detect during validation. Therefore, when a validation agent selected the word “Name” as part of the org name, the mistake went undetected during subsequent review. Second, the proxy had insufficient unit testing to detect whether what was sent from the validation tool matched exactly what was sent to the CA. Although we have unit testing in place for the system, the unit tests involved passed because this was a DBA and an org name, just not an approved DBA.
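To illustrate the kind of exact-match check described above, here is a minimal sketch. It is not DigiCert's actual code: the `SubjectInfo` type, its fields, and `proxy_output_matches` are hypothetical names chosen for the example, and it assumes the approved validation data and the data forwarded to the CA can be compared field by field.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SubjectInfo:
    """Hypothetical subset of the subject fields handled by the proxy."""
    org_name: str
    dba: Optional[str] = None


def proxy_output_matches(approved: SubjectInfo, sent_to_ca: SubjectInfo) -> bool:
    """Return True only if the proxy forwarded exactly what was approved."""
    return approved == sent_to_ca


# Validation staff approved the org name only; the proxy added a DBA.
approved = SubjectInfo(org_name="Example Corporation")
sent = SubjectInfo(org_name="Example Corporation", dba="Example")
print(proxy_output_matches(approved, sent))  # False: the DBA was never approved
```

A test that only asks "is this a well-formed DBA and org name?" would pass for `sent`; comparing against the approved record is what exposes the extra field.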

  7. List of steps your CA is taking to resolve the situation and ensure that such situation or incident will not be repeated in the future, accompanied with a binding timeline of when your CA expects to accomplish each of these remediation steps.

We are changing the UI as shown in the attached images. This will resolve the user agent and detection issue. Second, we are enhancing our unit tests so they test to make sure any extra information in a user’s account is not included in a certificate except where inclusion is specifically approved by the validation team.

Attached file List of impacted certs (obsolete) —
Assignee: bwilson → jeremy.rowley
Status: UNCONFIRMED → ASSIGNED
Type: defect → task
Ever confirmed: true
Whiteboard: [ca-compliance]

We revoked all impacted certificates and updated the UX per the attachment. We also added additional highlighting to the UI for any information that doesn't match exactly what the customer entered. This highlighting will help alert second approvers to changes made after submission of the cert request. Finally, we added one additional unit test that checks to see if an OV cert has a DBA. In our system, OV should never pass a DBA to the CA - just the org name.
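The OV rule stated above ("OV should never pass a DBA to the CA") could be expressed as a guard along the following lines. This is a sketch under assumptions, not the unit test that was actually added: the function name `check_no_dba_for_ov` and the string-based certificate type are hypothetical.

```python
from typing import Optional


def check_no_dba_for_ov(cert_type: str, dba: Optional[str]) -> None:
    """Raise ValueError if an OV request would pass a DBA to the CA."""
    if cert_type.upper() == "OV" and dba:
        raise ValueError("OV request must not pass a DBA to the CA")


check_no_dba_for_ov("OV", None)        # passes: org name only
check_no_dba_for_ov("EV", "Example")   # passes: this rule covers OV only
try:
    check_no_dba_for_ov("OV", "Example")
except ValueError as err:
    print(f"rejected: {err}")
```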

Any additional questions before this bug is closed?

I'll check back here in a week (Oct. 26) and close it, unless there are additional questions.

Flags: needinfo?(bwilson)

Hi Jeremy. After reviewing your incident report in comment 0, the Sectigo team has some observations and questions regarding the timeline.

13 precertificates do not seem to be fully accounted for

In comment 0 you wrote,

The first bad certificate issued Sep 14, 2022. The last certificate issued Oct 3, 2022

However, your List of impacted certs contains 13 precertificates that were apparently issued later than "Oct 3, 2022", based on the notBefore timestamps and the earliest corresponding SCT timestamps:

crt.sh ID    notBefore            Earliest SCT timestamp
7688495093   2022-10-06 00:00:00  2022-10-06 00:55:28.639
7689612238   2022-10-06 00:00:00  2022-10-06 06:47:49.945
7689808066   2022-10-06 00:00:00  2022-10-06 08:13:11.922
7689936456   2022-10-06 00:00:00  2022-10-06 09:13:07.783
7685093057   2022-10-06 00:00:00  2022-10-06 10:00:07.265
7690220777   2022-10-06 00:00:00  2022-10-06 11:41:09.234
7690622550   2022-10-06 00:00:00  2022-10-06 14:17:03.055
7690708430   2022-10-06 00:00:00  2022-10-06 14:47:36.514
7690751798   2022-10-06 00:00:00  2022-10-06 14:58:27.601
7690841284   2022-10-06 00:00:00  2022-10-06 15:35:01.045
7690973707   2022-10-06 00:00:00  2022-10-06 16:33:16.945
7690980879   2022-10-06 00:00:00  2022-10-06 16:35:59.848
7691044637   2022-10-06 00:00:00  2022-10-06 17:02:06.961
  • Can you explain this discrepancy?

Comment 0 also states,

On Oct 5, 2022, Compliance performed a final sweep to confirm the list of impacted certificates, determined which customers needed to replace certificates, and initiated the revocation process.

If those 13 precertificates were issued on October 6, then that "final sweep" on October 5 would not have caught them, which would mean that there is no information in comment 0 regarding the discovery and revocation of those 13 precertificates.

  • Can you clarify when those 13 precertificates were discovered to have been misissued, and when you "initiated the revocation process" for them?

Any misissuance on October 4 or 5?

We surmise from the timeline that stale information in your "high-volume certificate cache" led to those 13 precertificates being misissued on October 6; however, given the presumed "high-volume" of issuance, we are a little surprised that attachment 9297504 [details] contains no certificates or precertificates that were issued on October 4 or 5.

  • Can you confirm that no misissuance relating to this incident occurred on those two days?

Timestamps lack the expected precision

None of the timestamps indicate the time of day or even the timezone, and so the timeline falls short of being a "date-and-time-stamped sequence of all relevant events". Other CAs have been criticised in the recent past for this same lack of precision - for example, see bug 1740493 comment 2.

  • Can you post a revised timeline that contains suitably precise timestamps?

Suspected failure to revoke within 5 days

The timeline says that you "initiated the revocation process" on "Oct 5, 2022". Since this incident is a misissuance event, BR 4.9.1.1 requires revocation within 5 days. Given the lack of expected precision in the timestamp, it's hard for us to know exactly when the 5 day counter began; but if we assume it began at 2022-10-05T23:59:59Z, and if we discount the 13 precertificates that apparently were not yet issued at that time, then we count 22 precertificates disclosed in attachment 9297504 [details] for which the revocation timestamps in the corresponding CRL entries indicate that the 5 day deadline was missed by over 21 hours.

  • Can you review the data and either confirm our analysis or clarify the time periods that indicate revocation occurred within 5 days?
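The deadline arithmetic in the paragraph above can be sketched as follows. The start time is the assumed value stated there (2022-10-05T23:59:59Z); the revocation timestamp in the example is illustrative only and is not taken from the CRL data. The function name `overshoot` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

REVOCATION_WINDOW = timedelta(days=5)  # 5-day window cited from BR 4.9.1.1


def overshoot(start: datetime, revoked_at: datetime) -> timedelta:
    """Time by which revocation exceeded the window (zero if on time)."""
    deadline = start + REVOCATION_WINDOW
    return max(revoked_at - deadline, timedelta(0))


start = datetime(2022, 10, 5, 23, 59, 59, tzinfo=timezone.utc)
# Illustrative revocation time, not a value read from a CRL entry.
revoked_at = datetime(2022, 10, 11, 21, 26, 0, tzinfo=timezone.utc)

print(start + REVOCATION_WINDOW)      # 2022-10-10 23:59:59+00:00
print(overshoot(start, revoked_at))   # 21:26:01
```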

Weekly update missed

The gap of 11 days between comment 1 and comment 2 falls short of Mozilla's expectation that you "should also provide updates at least every week giving your progress". A Chrome representative has also expressed that same expectation to us previously.

Ambiguous identification of (pre)certificates

Finally, we noticed that attachment 9297504 [details] contains a mixture of crt.sh ?id= and ?serial= links. We wanted to remind you that Mozilla recommends that you 'use this form in your list "https://crt.sh/?sha256=[sha256-hash]", unless circumstances dictate otherwise'. In bug 1736064, which contains the discussion that led to that recommendation, several commenters pointed out that serial numbers alone are insufficient to identify certificates unambiguously.

Flags: needinfo?(jeremy.rowley)
Attachment #9297504 - Attachment is obsolete: true

Hi Rob - questions answered by section:

13 precertificates do not seem to be fully accounted for

We started preparing the Bugzilla post before the last certs were accounted for. We missed updating the 5th Oct entry after discovering the certificates on Oct 6th. The original post should have said Oct 6th instead of Oct 3rd. I missed that during my final updates to the draft document.

Any misissuance on October 4 or 5?

No mis-issuance occurred on Oct 4 or Oct 5. The new validation was cached and started being used for issuance on Oct 6th, which is what alerted us to the issue. If issuance had started on Oct 4 or Oct 5, we would have noticed the caching issue then.

Timestamps lack the expected precision

All times are MDT

Day of Sep 29, 2022 - the DigiCert validation team detected an anomaly in an issued certificate. Specifically, we detected one certificate with “Name: [org name]” in the O field of the certificate. Compliance and Engineering teams began investigating. Engineering and Support determined the root cause of the mis-issued certificate related to a UI issue where the validation staff could select more than the company name while validating information. DigiCert staff searched for any additional certificates.

16:02 Sep 29 2022 – We deployed an emergency patch to stop new enrolments through the updated validation system with the issue while we investigated what happened. Unfortunately, some of the bad validations had escaped notice and were already sent from the DB prior to the patch and awaited issuance.

Day of Sep 30, 2022 - the DigiCert team detected a cert with a DBA that was not approved for inclusion in the certificate from the same validation system update. We concluded that the issue was related to the new validation experience, even if the actual issue was slightly different than the issue on the 29th. The issue with the second certificate was with an improved validation proxy from the same system that added a slim name to the certificate as a DBA. The proxy added this information as the cert was passed from validation to the CA for issuance. We found the issue and fixed this issue on Sept 30th. The fix was to prevent inclusion of a DBA unless the DBA was validated and approved by the validation staff.

Day of Oct 3, 2022 - we continued to investigate the issue and look for other potential issues related to this validation deployment. We also ran scans collecting the cert information. We did not find any related issues, but we found that bad validation information previously detected remained cached. The cache allowed additional issuance of certificates with non-verified DBA information.

18:30 Oct 3rd - We fixed this DBA issue by clearing the cached results, updating the validation information, and confirming correct issuance of impacted accounts.

Day of Oct 4, 2022 - We monitored the certificate issuance system for additional anomalies. We did not detect any additional anomalies.

Day of Oct 5, 2022 – Draft of the incident report completed.

09:11 Oct 5, 2022 - Compliance performed a sweep to determine the final list of impacted certificates with a count of 118.

15:52 Oct 5, 2022 – We reconciled the cert contact information with the cert data and submitted the cert problem report, kicking off the revocation process for the 118 certs. See Bug 1797165 for this delay.

16:16 Oct 6, 2022 - We found that our high-volume issuing system (compared to the regular issuing system) contained DBA information in a cached copy of the validation information. We remediated by clearing the high-volume certificate cache. A final scan was run and included the additional 35 certificates issued from the high-volume system. The certs were submitted via a certificate problem report to the support team for revocation.

16:39 Oct 6, 2022 - This bug was lodged.

14:24 Oct 10, 2022 - First batch of 118 certs revoked.

15:26 Oct 11, 2022 - Final certs revoked.
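Because the timeline above is given in MDT while CRL and SCT timestamps are normally read in UTC, a small conversion sketch may help when comparing the two. It assumes MDT is a fixed UTC-6 offset (true for these October 2022 dates) and that timestamps are written as "HH:MM Mon D, YYYY"; the helper name `mdt_to_utc` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

MDT = timezone(timedelta(hours=-6), "MDT")  # assumed fixed offset, UTC-6


def mdt_to_utc(text: str) -> datetime:
    """Parse 'HH:MM Mon D, YYYY' as MDT and return the UTC equivalent."""
    local = datetime.strptime(text, "%H:%M %b %d, %Y").replace(tzinfo=MDT)
    return local.astimezone(timezone.utc)


for stamp in (
    "15:52 Oct 5, 2022",
    "16:16 Oct 6, 2022",
    "14:24 Oct 10, 2022",
    "15:26 Oct 11, 2022",
):
    print(stamp, "MDT ->", mdt_to_utc(stamp).isoformat())
```

For example, 15:26 Oct 11, 2022 MDT corresponds to 2022-10-11T21:26:00+00:00.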

Suspected failure to revoke within 5 days

Please see above for time stamped events. Also see Bug 1797165

Weekly update missed

For some reason our playbook has updates every 2 weeks as we have mentioned multiple times previously:

https://bugzilla.mozilla.org/show_bug.cgi?id=1684442
https://bugzilla.mozilla.org/show_bug.cgi?id=1639801
https://bugzilla.mozilla.org/show_bug.cgi?id=1550645

We updated this playbook to ensure weekly updates are made instead.

Ambiguous identification of (pre)certificates

I agree that SHA2 links are stated as the preferred format. The person pulling the data thought the crt.sh link alone was sufficient and included that instead of the crt.sh SHA256 link. Attached is a new CSV.

Flags: needinfo?(jeremy.rowley)

I have no further updates.

I'll close this on or about Wed. 2-Nov-2022 unless further discussion is needed.

Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Flags: needinfo?(bwilson)
Resolution: --- → FIXED
Product: NSS → CA Program
Whiteboard: [ca-compliance] → [ca-compliance] [ov-misissuance] [ev-misissuance]