Closed Bug 1824257 Opened 2 years ago Closed 2 years ago

WISeKey: Pre-certificates revoked with certificateHold reason

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: pfuentes, Assigned: pfuentes)

Details

(Whiteboard: [ca-compliance] [crl-failure])

Steps to reproduce:

We received a report about four certificates revoked with the certificateHold reason.

After an initial analysis, our preliminary understanding of the incident is as follows:

  • These are pre-certificates, for which no certificate was issued
  • The problem was introduced by a background service in EJBCA that automatically revokes pre-certificates that don't become certificates (e.g. due to insufficient CT-Log signatures or other potential causes that prevent issuance). Because of a bug in EJBCA, the service specified the wrong revocation reason. This bug has been fixed in a more recent version of EJBCA.

We will publish a full incident report in the appropriate format as soon as we conclude the investigation, but we are opening this bug now to disclose the issue. It is not related to a bad certificate revocation practice but to a change management issue: the bug was not detected, and the patch that fixes it was not applied as it should have been.

In the interim, we checked for other occurrences and found none, and we have temporarily disabled the automated revocation service for "orphan" pre-certificates in EJBCA until the patch is applied.

1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.

A person from DigiCert notified us via our incident reporting email.

2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.

All times are CET
23/March/23 18:34 - WISeKey is notified about 4 certificates revoked with the "certificateHold" reason. We acknowledge the incident to the reporter and the investigation starts.
23/March/23 20:30 - The PKI team determines that the incident affects pre-certificates that were revoked via an automated background process in EJBCA that revokes pre-certificates that don't correspond to final certificates.
23/March/23 20:45 - The PKI team confirms that the wrong revocation reason is due to an apparent bug in EJBCA, and the affected service is disabled. Investigation continues to confirm why the corresponding final certificates were not issued; the lack of sufficient CT-Log signatures is the obvious cause, but not the only possible one.
23/March/23 22:00 - The PKI team modifies (in CRL and OCSP) the revocation reason of these pre-certificates to "superseded".
24/March/23 10:00 - The Compliance team meets and determines the corrective actions to prevent similar issues related to our change management process.

3. Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.

This incident is not related to a mis-issuance, and no final certificates are involved. No subscriber or certificate consumer has been affected by the presence of these revoked pre-certificates in our CRL.
WISeKey has put measures in place to prevent the situation from recurring.

4. A summary of the problematic certificates. For each problem: number of certs, and the date the first and last certs with that problem were issued.

Four pre-certificates were affected, as listed in item 5.

5. The complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem.

Affected pre-certificates:
https://crt.sh/?q=4d7bf78e97b5f9277f233430b050fc582ecf1d13c2440eacd1bcb6483951216a
https://crt.sh/?q=727ca58ff4b3206bb5a72f2de460d1e5c9df0c72c9abd4091b851d307c1f5d4a
https://crt.sh/?q=7cc58bf31b585301f0e52c917793b81c227677453ee7047d0ce2c9a27b90ff46
https://crt.sh/?q=bbc98f077df315932d6ce6e2e6d13905fe3e3cd27396978727070bd85ab10d64

6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

EJBCA implements a facility to automatically revoke pre-certificates that, for any reason, don't correspond to a final certificate. The most common case, though not the only one, is an insufficient number of SCTs obtained while issuing a TLS certificate. EJBCA recommends activating a built-in background service to revoke those incomplete pre-certificates, but the version we use (EJBCA 7.10.0.1) revokes them with the "certificateHold" revocation reason. We have observed that this changed in EJBCA version 7.11.0, which revokes these certificates without providing any specific revocation reason. [1]

WISeKey has measures in place to control the revocation reasons used by operators: the certificateHold reason is disabled in the certificate management platform. However, we didn't actively monitor the automated revocation of pre-certificates.
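
A monitoring check of the kind that was missing could look roughly like the following sketch. It assumes the revocation entries have already been extracted from the CRL into (serial, reason code) pairs; the function name, the sample serials and the set of disallowed reasons are illustrative assumptions, not part of WISeKey's or EJBCA's tooling. The numeric codes are the CRLReason values from RFC 5280.

```python
# CRLReason codes as defined in RFC 5280, section 5.3.1 (value 7 is unused).
CRL_REASONS = {
    0: "unspecified",
    1: "keyCompromise",
    2: "cACompromise",
    3: "affiliationChanged",
    4: "superseded",
    5: "cessationOfOperation",
    6: "certificateHold",
    8: "removeFromCRL",
    9: "privilegeWithdrawn",
    10: "aACompromise",
}

# Assumption: reasons this hypothetical monitor treats as not allowed.
DISALLOWED = {"certificateHold"}


def find_disallowed_revocations(entries):
    """Return (serial, reason name) for entries whose reason is disallowed.

    `entries` is an iterable of (serial, reason_code) pairs; reason_code is
    None when the CRL entry carries no reasonCode extension.
    """
    flagged = []
    for serial, code in entries:
        if code is None:
            continue  # no reason given, nothing to check
        name = CRL_REASONS.get(code, f"unknown({code})")
        if name in DISALLOWED:
            flagged.append((serial, name))
    return flagged


# Made-up sample data, not the affected pre-certificates.
sample = [("0a01", None), ("0a02", 6), ("0a03", 4)]
print(find_disallowed_revocations(sample))  # [('0a02', 'certificateHold')]
```

Running a check like this against every published CRL, including entries produced by automated services, would flag a disallowed reason regardless of which component performed the revocation.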

WISeKey considers this incident a flaw in our change management process. WISeKey monitors the vendor's release notes to decide whether to apply updates or patches, but this monitoring has proven not to be systematic enough, as it didn't detect the need to patch the pre-certificate revocation service.
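
As an illustration of a more systematic check, the sketch below compares a deployed version against the release in which a fix first appears. The helper names are hypothetical, and treating 7.11.0 as the first release with the changed behaviour is taken from the observation reported above, not from a vendor statement.

```python
def parse_version(text):
    """Turn a dotted version such as '7.10.0.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def predates_fix(deployed, fixed_in):
    """True if the deployed version is older than the release with the fix."""
    a, b = parse_version(deployed), parse_version(fixed_in)
    # Pad with zeros so that '7.11' and '7.11.0' compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


print(predates_fix("7.10.0.1", "7.11.0"))  # True
```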

7. List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.

We have modified our process to actively monitor releases of new versions, including a four-eyes review of each release note and requiring the consensus of at least two specialists on whether or not to apply a patch or update. We expect this measure to improve our change management process and reduce the risk of not applying (or applying) updates that could impact PKI compliance.

[1] https://doc.primekey.com/ejbca/ejbca-release-information/ejbca-release-notes/ejbca-7-11-release-notes

Assignee: nobody → pfuentes
Status: UNCONFIRMED → ASSIGNED
Type: defect → task
Ever confirmed: true
Whiteboard: [ca-compliance] [crl-failure]

(In reply to Pedro Fuentes from comment #1)

2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.

All times are CET
23/March/23 18:34 - WISeKey is notified about 4 certificates revoked with "certificateHold" reason. We acknowledge to the reporter the incident and investigation starts.

As indicated, the timeline should include events before the incident was reported. When did the incident start? Has WISeKey always revoked with certificateHold reason? Did this become an incident when a requirement changed? If so, when did that requirement take effect? Was the incorrect behaviour introduced in a particular EJBCA version? If so, when was the version deployed?

(In reply to Mathew Hodson from comment #2)

Hi Mathew,
thanks for your comment. You're right, we didn't give enough detail about the earlier events.

The WISeKey certificate management platform has not allowed certificateHold to be specified as a reason for many years, to avoid a breach of BR 7.2.2.

To our knowledge, EJBCA introduced the automated pre-certificate revocation feature in version 7.7.1, which was released around September 2021. We can't say exactly when right now, because external access to their Jira was recently disabled due to some security incident or vulnerability.

WISeKey enabled the automated revocation of "orphan" pre-certificates on 15 March 2023. This effectively allowed the incident to happen, as any pre-certificates subsequently revoked by that EJBCA service would show certificateHold as the revocation reason.

(In reply to Pedro Fuentes from comment #3)

WISeKey enabled the automated revocation of "orphan" pre-certificates the 15th of March 2023, and this effectively allowed the incident to happen, as further revoked pre-certificates processed by that EJBCA service would show certificateHold as revocation reason.

Hi Pedro,

I am curious, why did you decide to auto-revoke those pre-certificates? Did you have any compliance or security issue associated with those pre-certificates? Revoking certificates without a good security reason will only increase your CRL size (if you use CRLs). Issuing a pre-certificate and not the final certificate doesn't automatically require revoking the pre-certificate.

(In reply to Dimitris Zacharopoulos from comment #4)

Hi Dimitris,
this is a good question.
If you check the EJBCA admin GUI, this is something that PrimeKey/Keyfactor encourages you to enable [1][2], but as you say, there's no compliance requirement, and there was no security issue. At that time we had a couple of cases of certificates not being issued due to a lack of enough CT-Log signatures, caused by a spurious communications issue with external causes.
In principle these non-issuances should be rare, so we weren't worried about the potential growth of the CRL; we just thought that it would be cleaner not to leave those "zombie" pre-certificates active... but this quest for cleanliness has backfired on us... so not a good idea after all.

[1] https://doc.primekey.com/ejbca/ejbca-operations/ejbca-ca-concept-guide/services/pre-certificate-revocation-service
[2] Text in the Admin GUI:
Recommended: Add Revocation Service
When CT submission succeeds for some logs, but the overall certificate issuance fails, it is desirable to revoke the certificate. Press the button to add a periodic service that revokes incompletely issued certificates.

Indeed, this should not be a recommended practice. In fact, this looks like a SHOULD NOT practice as it provides zero-to-very-low security benefits and only causes performance impact. I assume you will contact Keyfactor to update the documentation and the Admin GUI text.

Regardless of EJBCA's recommendations, a CA should evaluate any software vendor's options carefully based on the applicable standards, policies and experience, and decide how to configure the behavior of the CA system. That came out as an important takeaway from the "serial number 63-entropy" incidents (CAs are ultimately responsible for the configuration of their CA system and should always re-evaluate "defaults" or other recommendations from a software vendor). This comes from a CA that was among the ones impacted by that incident :-)

Pedro,
Are there any remaining remediation items still open?
Thanks,
Ben

Flags: needinfo?(pfuentes)

Hello Ben,
with respect to the root cause (a change management issue), we consider that we have put measures in place to prevent this situation, so remediation is done.

With respect to the discussion about whether revoking "orphan" certificates is needed or not, we had some discussions with Keyfactor and confirmed that there are hypothetical situations where it could still be useful.

In particular, there could be a (very unlikely) case where some unexpected issue occurs right after EJBCA submits the first pre-certificate to CT-Logs, e.g. a power outage or an HSM disconnection. In this case the pre-certificate would be visible in the CT-Logs, but in EJBCA it remains in a temporary table (IncompleteIssuanceJournalData) and is only moved to the certificate table (and revoked) if the "Pre-Certificate Revocation Service Worker" is enabled. If the service is not enabled and the pre-certificate is left in the temporary table, an OCSP query for the pre-certificate could return "unauthorised" or "unknown" (depending on the configuration)... so this could lead to a compliance issue.
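
The scenario can be modelled with a small sketch. This is a simplified illustration of the behaviour as described in this comment, not EJBCA code: the set and the dictionary stand in for the temporary journal table and the certificate table, and the function names and the "unknown" fallback are assumptions.

```python
def ocsp_status(serial, certificate_table):
    """Simplified responder: it answers only from the certificate table."""
    record = certificate_table.get(serial)
    if record is None:
        return "unknown"  # could be "unauthorized", depending on configuration
    return "revoked" if record["revoked"] else "good"


def run_revocation_service(journal, certificate_table):
    """Move orphan pre-certificates out of the journal and revoke them."""
    for serial in list(journal):
        journal.remove(serial)
        certificate_table[serial] = {"revoked": True}


journal = {"0b01"}       # pre-certificate logged to CT, issuance interrupted
certificate_table = {}   # the responder knows nothing about it yet

print(ocsp_status("0b01", certificate_table))  # unknown
run_revocation_service(journal, certificate_table)
print(ocsp_status("0b01", certificate_table))  # revoked
```

In this model, a variant of the service that stored the record with `revoked` set to False would make the responder answer "good" without adding an entry to the CRL.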

We discussed with Keyfactor whether a different approach could avoid this situation without revoking the pre-certificate and growing the CRL, by just moving the pre-certificate to the certificate table so that an OCSP query responds properly as non-revoked. They are studying this.

Flags: needinfo?(pfuentes)
Flags: needinfo?(bwilson)

In case our last comment was unclear... The issue is considered remediated and we don't expect further action.

The additional insights were included in case they are of interest to other EJBCA users, as there was debate about this pre-certificate revocation functionality.

Thanks,
Pedro

Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Flags: needinfo?(bwilson)
Resolution: --- → FIXED