- How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.
As part of our post-issuance review of a key ceremony, we discovered that an issuing CA had been created that did not meet the Baseline Requirements. Specifically, the profile lacked the CRL Distribution Points (CDP) extension. After discovering the error, we ran a scan over all CAs capable of issuing TLS certificates and found 9 additional ICAs with errors resulting from improper profiles during key ceremonies. We are including all of these in this incident report. The nine additional issuing CAs comprise one ECDSA CA with an incorrect chain of signatures (violating the Mozilla policy adopted Jan 2020) and 8 that use anyPolicy for an external, unaffiliated party (which is not allowed under BR section 184.108.40.206).
- A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.
21-Feb-20 Ceremony to create ECC ICA (with incorrect ECDSA signatures)
20-May-20 Ceremony to create MS CAs with anyPolicy
18-Jul-20 Ceremony to create a new issuing CA (missing CRL)
18-Jul-20 ICA failed post-creation linting (missing CRL)
18-Jul-20 ICA uploaded to CCADB (missing CRL)
21-Jul-20 ICA revoked in next ceremony (missing CRL)
22-Jul-20 Review of previous Ceremonies
23-Jul-20 Additional ICAs found in scope (ECC ICA and MS ICAs)
29-Jul-20 Additional ICAs scheduled for revocation within the 7-day requirement
- Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.
DigiCert has stopped all new ceremonies for publicly trusted certificates until the key ceremony process contains more automation. We have some automation already, but a human operator can still flag an issue as a false positive and continue a key ceremony. We have attached a diagram showing the key ceremony process, where we already have automation, and what we are currently building. The green boxes are all tools that already exist. However, we don’t yet have an automated flow through those tools that terminates the key ceremony if there is an error. For example, in the case of the ICA missing the CRL, the linter did catch the error, but the key ceremony team failed to abort and continued the key ceremony. The anyPolicy use was not flagged by a linter (it is allowed for hosted ICAs but not for non-affiliated ICAs). The ceremony with the bad ECC ICA was held before zlint finished updating, so the ceremony tool did not flag it as bad.
We plan to keep issuance stopped until the part of the process shown in green is automated, making it impossible to proceed with or complete a key ceremony after a linter error. Once flow through the automated tools is mandatory and overrides are prevented, we will resume key ceremonies while we work on the blue items. The purple items (CCADB related) are subject to Salesforce integration. We’d like to automate those and will do so immediately after receiving approval from Mozilla.
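To illustrate the hard-stop behavior described above, here is a minimal sketch of a gate with no override path. It is not our ceremony tool. The result shape (a mapping from lint name to a `result` level) is modeled on zlint-style JSON output and is an assumption, as are the names `gate_ceremony`, `CeremonyAborted`, and the example lint name.

```python
# Result levels that must stop the ceremony. The level names follow the
# style of zlint's JSON output; treating exactly these two as blocking is
# an assumption made for this sketch.
BLOCKING_LEVELS = {"error", "fatal"}


class CeremonyAborted(Exception):
    """Raised when a lint finding blocks the ceremony."""


def blocking_findings(lint_results):
    """Return the sorted names of lints whose result level is blocking."""
    return sorted(
        name
        for name, outcome in lint_results.items()
        if outcome.get("result") in BLOCKING_LEVELS
    )


def gate_ceremony(lint_results):
    """Abort on any blocking finding.

    There is deliberately no override or "false positive" parameter:
    the only way past the gate is a clean lint run.
    """
    findings = blocking_findings(lint_results)
    if findings:
        raise CeremonyAborted(
            "ceremony halted by lint findings: " + ", ".join(findings)
        )
    return True


if __name__ == "__main__":
    # Hypothetical lint name, used only for illustration.
    results = {"e_hypothetical_missing_cdp": {"result": "error"}}
    try:
        gate_ceremony(results)
    except CeremonyAborted as exc:
        print(exc)
```

The point of the design is that a suspected false positive has to be resolved in the linter or the profile, not waved through by the ceremony team.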
- A summary of the problematic certificates. For each problem: number of certs, and the date the first and last certs with that problem were issued.
Ten ICA certificates were issued between 21-Feb-20 and 18-Jul-20:
ECDSA Cert (issue 1)
Microsoft CA with Any Policy (issue 2)
Missing CRL (issue 3)
- The complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem.
See #4 above.
- Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.
The ECDSA CA was created before the relevant linting check was in place. During the ceremony, we did not detect that the signature was not one permitted under the Mozilla policy. We detected the issue only through the recent scan using an updated zlint.
We already have developers who contribute to zlint. To expand the scope of this, we are adding linters to the key ceremony for DigiCert-specific items, like the anyPolicy issue. We are also regularly running zlint over the corpus of intermediates to detect issues as zlint is updated. We are syncing the whole process so that any error hard-blocks the cert creation, even if it is a false positive.
The Microsoft CAs were created on legacy naming documents, similar to this bug https://bugzilla.mozilla.org/show_bug.cgi?id=1647084. The naming docs for internal CAs use anyPolicy for TLS. We are building into the linter a requirement that each TLS cert include all four TLS issuance OIDs. This will prevent the need for anyPolicy going forward.
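As a rough sketch of the kind of DigiCert-specific policy lint described above, the check below flags anyPolicy and reports any missing TLS policy OIDs. It assumes that "all four TLS issuance OIDs" refers to the CA/Browser Forum reserved policy identifiers for EV, DV, OV, and IV. The function name and the list-of-strings input are hypothetical, and the check does not distinguish hosted from non-affiliated ICAs.

```python
ANY_POLICY = "2.5.29.32.0"  # X.509 anyPolicy OID

# Assumption: the four TLS issuance OIDs are the CA/Browser Forum
# reserved certificate policy identifiers.
REQUIRED_TLS_POLICY_OIDS = {
    "2.23.140.1.1",    # extended validation
    "2.23.140.1.2.1",  # domain validated
    "2.23.140.1.2.2",  # organization validated
    "2.23.140.1.2.3",  # individual validated
}


def lint_tls_ica_policies(policy_oids):
    """Return a list of findings for a TLS ICA's certificate policies.

    An empty list means the policy set passes this (hypothetical) lint.
    """
    present = set(policy_oids)
    findings = []
    if ANY_POLICY in present:
        findings.append("anyPolicy present")
    missing = sorted(REQUIRED_TLS_POLICY_OIDS - present)
    if missing:
        findings.append("missing TLS policy OIDs: " + ", ".join(missing))
    return findings


if __name__ == "__main__":
    # A profile like the affected ones: anyPolicy only.
    print(lint_tls_ica_policies([ANY_POLICY]))
```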
In the case of the missing CRL, the naming documents actually included a CDP field. However, the naming documents failed to transfer to the key ceremony tool properly. The linter flagged the cert when a private test cert was created and caught the issue again when run on the completed certificate. However, the team failed to recognize the error message and proceeded with key creation regardless. We scheduled revocation for the next ceremony.
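The transfer failure described above could also be caught before any certificate is created, by comparing the naming document against the profile loaded into the ceremony tool. The sketch below assumes both can be exported as flat field-to-value mappings. The field names and the example URL are hypothetical and do not reflect our actual document format.

```python
def profile_transfer_gaps(naming_doc, ceremony_profile):
    """List fields from the naming document that were lost or changed
    when the profile was loaded into the ceremony tool."""
    gaps = []
    for field in sorted(naming_doc):
        expected = naming_doc[field]
        if field not in ceremony_profile:
            gaps.append(f"{field}: missing from ceremony profile")
        elif ceremony_profile[field] != expected:
            gaps.append(
                f"{field}: expected {expected!r}, "
                f"found {ceremony_profile[field]!r}"
            )
    return gaps


if __name__ == "__main__":
    # Hypothetical field names and values, for illustration only.
    naming_doc = {
        "subject_cn": "Example Issuing CA",
        "crl_distribution_points": "http://crl.example.com/root.crl",
    }
    loaded = {"subject_cn": "Example Issuing CA"}  # CDP dropped in transfer
    for gap in profile_transfer_gaps(naming_doc, loaded):
        print(gap)
```

Any non-empty result would be treated the same way as a linter error: the ceremony does not proceed.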
- List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.
First, we have stopped all key ceremonies while we finish our key ceremony automation process. The updated process will automate most steps, excluding parts like the key ceremony itself, where network separation is required. We expect phase 1 to be completed next week. Phase 2 will be scoped and started after the release of phase 1. The new process is nearly identical to the current process, except that the system moves the request through the process automatically and hard-stops the key ceremony on any linting error.