Closed Bug 1625715 Opened 4 years ago Closed 4 years ago

Sectigo: Failure to revoke certificate with previously-compromised key within 24 hours

Product: CA Program
Component: CA Certificate Compliance
Type: task
Priority: Not set
Severity: normal
Tracking: Not tracked
Status: RESOLVED FIXED
Reporter: wthayer
Assignee: Robin.Alden
Whiteboard: [ca-compliance] [leaf-revocation-delay]

Matt Palmer reported the following on 26-March 2020 to the mozilla.dev.security.policy list:

At 2020-03-20 03:02:43 UTC, I sent a notification to sslabuse@sectigo.com
that certificate https://crt.sh/?id=1659219230 was using a private key with
SPKI fingerprint
4c67cc2eb491585488bab29a89899e4e997648c7047c59e99a67c6123434f1eb, which was
compromised due to being publicly disclosed. My e-mail included a link to a
PKCS#10 attestation of compromise, signed by the key at issue. An MX server
for sectigo.com accepted this e-mail at 2020-03-20 03:02:50 UTC.

This certificate was revoked by Sectigo, with a revocation timestamp of
2020-03-20 19:37:48 UTC.

Subsequently, certificate https://crt.sh/?id=2614798141 was issued by
Sectigo, and uses a private key with the same SPKI as that previously
reported. This certificate has a notBefore of Mar 23 00:00:00 2020 GMT, and
embeds two SCTs issued at 2020-03-23 05:55:53 UTC. At the time of writing,
the crt.sh revocation table does not show this certificate as revoked either
via CRL or OCSP:

Mechanism  Provider  Status       Revocation Date  Last Observed in CRL  Last Checked (Error)
OCSP       The CA    Good         n/a              n/a                   2020-03-27 06:27:23 UTC
CRL        The CA    Not Revoked  n/a              n/a                   2020-03-27 04:44:26 UTC

Based on previous discussions on m.d.s.p, I believe Sectigo's failure to
revoke this certificate within 24 hours of its issuance is a violation of
the BRs, and hence Mozilla policy.
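
The SPKI fingerprint cited in the report is 64 hex digits long, which is consistent with a SHA-256 digest over the DER-encoded SubjectPublicKeyInfo of the certificate. A minimal sketch of that calculation follows, assuming the SPKI bytes have already been extracted from the certificate (the extraction needs an X.509 parser and is not shown). The function name and the placeholder input are illustrative only.

```python
import hashlib


def spki_fingerprint(spki_der: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of a DER-encoded SubjectPublicKeyInfo."""
    return hashlib.sha256(spki_der).hexdigest()


# Placeholder bytes standing in for a real DER-encoded SPKI structure.
example_spki = bytes.fromhex("3000")
fingerprint = spki_fingerprint(example_spki)
print(len(fingerprint), fingerprint)
```

Two certificates with the same fingerprint share a key pair, which is how the re-issued certificate can be tied to the earlier compromise report.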

Robin: please provide an incident report.

Flags: needinfo?(Robin.Alden)
  1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.

A post was made to m.d.s.p. giving details of a certificate that had been issued over a subscriber key that had previously been reported to us as compromised.

Date & time (UTC)   Event
Fri 27-Mar 06:34    Post to m.d.s.p. describing the issue

  2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.

Date & time (UTC)   Event
Fri 20-Mar 03:02    We received a report that 125 private keys for Sectigo-issued TLS certificates had been discovered to have been compromised. The report included proof of compromise.
Fri 20-Mar 17:25    We acknowledged receipt of the report and began the process of verifying the proofs of compromise and mapping the keys to certificates for revocation.
Fri 20-Mar 18:03    Due to the number of keys involved we escalated the request internally to get a bulk processing script written.
Fri 20-Mar 19:37    164 certificates were revoked, including certificates for *.feelway.com and *.merkator.com.
Fri 20-Mar 20:16    We responded to the reporter that the compromised certificates had been revoked.
Mon 23-Mar 05:55    A new certificate for *.feelway.com was issued using the key identified above as compromised.
Thu 26-Mar 08:42    A new certificate for *.merkator.com was issued using the key identified above as compromised.
Fri 27-Mar 06:34    The original reporter made a post to m.d.s.p. describing the re-use of a compromised key for *.feelway.com.
Fri 27-Mar 19:25    A database script was run to identify for rejection certificate requests that used compromised subscriber keys and to identify for revocation certificates that used compromised subscriber keys that were both unrevoked and unexpired. Two certificates were identified as requiring revocation.
Fri 27-Mar 21:07    The certificate for *.feelway.com was revoked.
Fri 27-Mar 21:12    The certificate for *.merkator.com was revoked.

  3. Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.

We have put interim measures in place so that new certificates with compromised keys will be revoked within 24 hours.

  4. A summary of the problematic certificates. For each problem: number of certs, and the date the first and last certs with that problem were issued.

Date & time (UTC)   Event
Mon 23-Mar 05:55    A new certificate for *.feelway.com was issued using the key identified above as compromised.
Thu 26-Mar 08:42    A new certificate for *.merkator.com was issued using the key identified above as compromised.

  5. The complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem.

As listed in the summary above.

  6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

There are two things going on here.
a) Our CA systems haven't been checking for compromised keys being re-used.
As has been discussed here and here it is not yet a policy requirement that we check each certificate request against a list of compromised keys, but it is evidently a very good idea to do so.

b) When we receive reports of certificates having compromised keys, we haven't been consistently searching for other certificates with the same key to also revoke.
BRs 4.9.1.1 bullet 3 says "The CA SHALL revoke a Certificate within 24 hours if .. The CA obtains evidence that the Subscriber's Private Key corresponding to the Public Key in the Certificate suffered a Key Compromise".
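
The 24-hour requirement quoted above can be checked against the timeline in this report. The sketch below assumes, as the reporter argued, that the clock starts at issuance because the key had already been reported as compromised. The year 2020 and the helper names are assumptions added for illustration; the timestamps are the ones given in the timeline.

```python
from datetime import datetime, timedelta, timezone

REVOCATION_DEADLINE = timedelta(hours=24)


def utc(text: str) -> datetime:
    """Parse a 'YYYY-MM-DD HH:MM' string as a UTC timestamp."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M").replace(tzinfo=timezone.utc)


# (issued, revoked) pairs taken from the timeline in this report.
certificates = {
    "*.feelway.com": (utc("2020-03-23 05:55"), utc("2020-03-27 21:07")),
    "*.merkator.com": (utc("2020-03-26 08:42"), utc("2020-03-27 21:12")),
}


def hours_to_revocation(issued: datetime, revoked: datetime) -> float:
    """Return the elapsed time between issuance and revocation in hours."""
    return (revoked - issued).total_seconds() / 3600


for name, (issued, revoked) in certificates.items():
    elapsed = revoked - issued
    late = elapsed > REVOCATION_DEADLINE
    print(f"{name}: revoked after {elapsed}, exceeded 24 hours: {late}")
```

Under that assumption both certificates were revoked later than 24 hours after issuance.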

  7. List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.

In the short term, we will revoke newly issued certificates that use compromised keys within 24 hours of issuance by regularly running our script to identify certificate requests and certificates that use compromised keys.

We already have our revocation portal available at https://secure.sectigo.com/products/RevocationPortal and an updated CPS will shortly be published that includes reference to that revocation portal.
Keys proved to be compromised through the revocation portal will automatically be included in our list of compromised keys.

We have identified a development task to:

a) Completely automate checking against the list of compromised keys at the points of certificate issuance and of acceptance of certificate request so that requests may be rejected and certificates will not be signed where a compromised key is used.

b) Implement an ACME revokeCert API that anybody can call to prove the compromise of a private key corresponding to the public key in any certificate we've issued. Rob proposed this previously as a general solution for any/all CAs (see https://www.mail-archive.com/dev-security-policy@lists.mozilla.org/msg13045.html).

c) Update our CPS to include the ACME revokeCert API.
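
Item a) describes a check that runs both when a certificate request is accepted and again when the certificate is signed. A minimal sketch of such a check follows, assuming the list of compromised keys is held as a set of SHA-256 SPKI fingerprints. All names here are hypothetical and the byte strings are placeholders, not real keys; nothing in this thread describes how Sectigo stores its list.

```python
import hashlib
from typing import Set


class CompromisedKeyError(Exception):
    """Raised when a request or certificate uses a key known to be compromised."""


def spki_sha256(spki_der: bytes) -> str:
    """Fingerprint a DER-encoded SubjectPublicKeyInfo with SHA-256."""
    return hashlib.sha256(spki_der).hexdigest()


def ensure_key_not_compromised(spki_der: bytes, compromised: Set[str]) -> None:
    """Intended to be called at request acceptance and again just before signing."""
    if spki_sha256(spki_der) in compromised:
        raise CompromisedKeyError("public key is on the compromised-key list")


# Placeholder bytes stand in for real DER-encoded SubjectPublicKeyInfo values.
reported_key = b"placeholder SPKI for a reported key"
fresh_key = b"placeholder SPKI for an unreported key"
compromised_keys = {spki_sha256(reported_key)}


def accept_request(spki_der: bytes) -> bool:
    """Return True if the request may proceed, False if it must be rejected."""
    try:
        ensure_key_not_compromised(spki_der, compromised_keys)
    except CompromisedKeyError:
        return False
    return True


print(accept_request(fresh_key), accept_request(reported_key))
```

Running the same check at both points covers a key that is reported as compromised after the request was accepted but before the certificate is signed.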

(In reply to Robin Alden from comment #2)

> A database script was run to identify for rejection certificate requests that used compromised subscriber keys and to identify for revocation certificates that used compromised subscriber keys that were both unrevoked and unexpired. Two certificates were identified as requiring revocation.

Was this script run for all keys previously reported to Sectigo as compromised, or only the 125 SPKIs reported in the batch that was sent to you that you mentioned in this incident report?

> 3. Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.
>
> We have put interim measures in place so that new certificates with compromised keys will be revoked within 24 hours.

Can you provide any further details as to the nature of these measures? Are they technical or procedural, for instance?

> 6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.
>
> There are two things going on here.
> a) Our CA systems haven't been checking for compromised keys being re-used.
> As has been discussed here and here it is not yet a policy requirement that we check each certificate request against a list of compromised keys, but it is evidently a very good idea to do so.
>
> b) When we receive reports of certificates having compromised keys, we haven't been consistently searching for other certificates with the same key to also revoke.
> BRs 4.9.1.1 bullet 3 says "The CA SHALL revoke a Certificate within 24 hours if .. The CA obtains evidence that the Subscriber's Private Key corresponding to the Public Key in the Certificate suffered a Key Compromise".

Why was this searching not consistently done? Did Sectigo not believe it was a requirement to revoke certificates using compromised keys, or some other reason? The use of the word "consistently" makes me think that perhaps such searching was done on an ad-hoc basis; if that is correct, what were the circumstances in which searches might have been done in the past?

> a) Completely automate checking against the list of compromised keys at the points of certificate issuance and of acceptance of certificate request so that requests may be rejected and certificates will not be signed where a compromised key is used.

Does this action item mean that, at present, checking against a list of compromised keys currently may involve one or more manual steps?

> b) Implement an ACME revokeCert API that anybody can call to prove the compromise of a private key corresponding to the public key in any certificate we've issued.

Yay! It's not going to use acmevoke by any chance, is it?

> c) Update our CPS to include the ACME revokeCert API.

This would seem to be a very good idea.

> Was this script run for all keys previously reported to Sectigo as compromised, or only the 125 SPKIs reported in the batch that was sent to you that you mentioned in this incident report?

There have already been several versions of that script. The first included literal certificate identifiers and was added to as earlier certificate compromise reports (i.e. before your 125) were processed. The 125 you reported were only added to that list in response to the email from you that triggered this incident response and report. That was an oversight on our part as a result of a rushed response from us to your report of 125 subjectPublicKeyInfo values. We did not have a procedure for dealing with SPKIs in revocation reports so relating those to our issued certificates had to be done by different people across different systems.
There is no implied criticism of your report here. If we could not handle your report we should have said so.
The script has since been rewritten so that the driving list is the record of public keys for TLS certificate revocations for which 'key compromise' was indicated as the reason, and that driving list is re-evaluated each time the script is run.
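
The rewritten script described above can be sketched as a single query whose driving list is recomputed on every run. The table layout, column names, and sample rows below are hypothetical and exist only to make the example runnable with Python's built-in sqlite3 module; they are not the actual Sectigo schema.

```python
import sqlite3

# Hypothetical schema: one row per issued certificate.
SCHEMA = """
CREATE TABLE certificates (
    id INTEGER PRIMARY KEY,
    common_name TEXT NOT NULL,
    spki_sha256 TEXT NOT NULL,
    not_after TEXT NOT NULL,      -- ISO 8601 UTC
    revoked_at TEXT,              -- NULL while unrevoked
    revocation_reason TEXT
);
"""

# The inner SELECT is the driving list: every key that has ever been
# revoked for key compromise. It is re-evaluated each time the query runs.
FIND_CERTS_TO_REVOKE = """
SELECT id, common_name FROM certificates
WHERE revoked_at IS NULL
  AND not_after > :now
  AND spki_sha256 IN (
      SELECT DISTINCT spki_sha256 FROM certificates
      WHERE revocation_reason = 'keyCompromise'
  )
ORDER BY id;
"""


def certs_to_revoke(conn: sqlite3.Connection, now: str) -> list:
    """Return unrevoked, unexpired certificates that use a compromised key."""
    return conn.execute(FIND_CERTS_TO_REVOKE, {"now": now}).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO certificates "
    "(common_name, spki_sha256, not_after, revoked_at, revocation_reason) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("*.example.test", "aa11", "2021-03-20T00:00:00Z", "2020-03-20T19:37:48Z", "keyCompromise"),
        ("*.example.test", "aa11", "2021-03-23T00:00:00Z", None, None),    # re-issued on the same key
        ("www.example.test", "bb22", "2021-01-01T00:00:00Z", None, None),  # unrelated key
        ("old.example.test", "aa11", "2019-01-01T00:00:00Z", None, None),  # same key, already expired
    ],
)
print(certs_to_revoke(conn, "2020-03-27T19:25:00Z"))
```

Because the driving list is derived from the revocation records rather than maintained by hand, a newly processed compromise report is picked up on the next run without editing the script.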

> Can you provide any further details as to the nature of these measures? Are they technical or procedural, for instance?

Both. The script has been revised as described above and it is being run twice a day. The procedural part is that since this script does not prevent new issuance we will need to reach out to the subscriber for each new certificate issued to notify them that it must be revoked and replaced within 24 hours of issuance.

> Why was this searching not consistently done? Did Sectigo not believe it was a requirement to revoke certificates using compromised keys, or some other reason? The use of the word "consistently" makes me think that perhaps such searching was done on an ad-hoc basis; if that is correct, what were the circumstances in which searches might have been done in the past?

The script that checked for uses of compromised keys other than in the certificate being revoked was not run as an integral part of the process for dealing with revocation reports. It was more often run in response to a bulk notification of compromised keys. It was less often run in response to a report of a single compromised key either from a report by a security researcher or from a report of compromise by a subscriber.

> Does this action item mean that, at present, checking against a list of compromised keys currently may involve one or more manual steps?

That item means that we do not currently automatically or manually check against a list of compromised keys at the point we accept a certificate request or at the point we issue a certificate.

> Yay! It's not going to use acmevoke by any chance, is it?

We saw that, and it is great to see solutions provided, but no: it will use our own RFC 8555 implementation.
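
For context, RFC 8555 section 7.6 defines the revokeCert request body as a JSON object with a `certificate` field (the base64url-encoded DER certificate) and an optional `reason` code, and allows the request to be signed with the certificate's own private key. The sketch below builds only that unsigned payload. The JWS signing, nonce handling, and endpoint URL are omitted, the helper names are illustrative, and nothing here reflects how Sectigo's implementation works.

```python
import base64
import json

KEY_COMPROMISE = 1  # CRLReason code for keyCompromise (RFC 5280)


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as ACME requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def revocation_payload(cert_der: bytes, reason: int = KEY_COMPROMISE) -> dict:
    """Build the unsigned revokeCert payload for a DER-encoded certificate."""
    return {"certificate": b64url(cert_der), "reason": reason}


# Placeholder bytes; not a real certificate.
placeholder_cert = b"\x30\x03\x02\x01\x01"
print(json.dumps(revocation_payload(placeholder_cert)))
```

A request signed with the compromised key itself demonstrates possession of that key, which is what lets anyone holding it prove the compromise.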

Flags: needinfo?(Robin.Alden)

Thanks for your clarifications, Robin. All my questions have been answered.

Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Product: NSS → CA Program
Whiteboard: [ca-compliance] → [ca-compliance] [leaf-revocation-delay]