SwissSign: CRL/OCSP revocation time mismatch
Categories
(CA Program :: CA Certificate Compliance, task)
Tracking
(Not tracked)
People
(Reporter: roman.fischer, Assigned: roman.fischer)
Details
(Whiteboard: [ca-compliance] [crl-failure] [ocsp-failure] Next update 2023-04-30)
Attachments
(1 file: 1002 bytes, application/vnd.ms-excel)
- How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.
While implementing a change to improve the backdating of certificate revocation (for key-compromise cases where evidence shows that the compromise happened significantly before it was reported to us), a developer discovered a discrepancy in how the CRL generator and the OCSP responder in our OLD CA platform determine the date and time at which a certificate was revoked. Currently, the OCSP responder reads a db field that is updated by a db trigger when the certificate status changes to 'revoked'. The CRL generator, on the other hand, reads a db field that the CA backend sets to the revocation date.
Latencies and rare connection issues between the CA backend and the DB server can lead to situations where these two db fields differ by more than 1s, which results in different revocation times being reported on the CRL and via OCSP.
Our investigation found 1446 db entries of revoked certificates where the two db fields differ by more than 1s.
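The kind of check behind this count can be sketched as follows. This is a minimal illustration, not the query SwissSign ran: the report does not name the two db fields, so `trigger_revoked_at`, `backend_revoked_at` and the record type are hypothetical, and the sample data is invented.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


# Hypothetical record layout; the report does not name the two db fields.
@dataclass(frozen=True)
class RevocationRecord:
    serial: str
    trigger_revoked_at: datetime  # set by the db trigger, read by the OCSP responder
    backend_revoked_at: datetime  # set by the CA backend, read by the CRL generator


def find_mismatches(records, tolerance=timedelta(seconds=1)):
    """Return the records whose two revocation timestamps differ by more than `tolerance`."""
    return [
        r for r in records
        if abs(r.trigger_revoked_at - r.backend_revoked_at) > tolerance
    ]


# Invented sample data for illustration only.
t0 = datetime(2023, 2, 7, 9, 0, 0, tzinfo=timezone.utc)
records = [
    RevocationRecord("01", t0, t0),                         # identical timestamps
    RevocationRecord("02", t0 + timedelta(seconds=1), t0),  # exactly 1s apart: tolerated
    RevocationRecord("03", t0 + timedelta(seconds=5), t0),  # 5s apart: flagged
]

for record in find_mismatches(records):
    print(record.serial, record.trigger_revoked_at - record.backend_revoked_at)
```

The tolerance mirrors the "more than 1s" threshold used in this report; a stricter control could set it to zero.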
- A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.
2023-02-07 10:00 (UTC+1): Developer detects the bug and informs compliance team
2023-02-07 12:30 (UTC+1): Bug confirmed, classified as compliance incident, incident process started
2023-02-07 15:00 (UTC+1): Scope of problem investigated: 1446 db records affected. 12 revoked non-expired TLS certificates affected.
2023-02-07 15:30 (UTC+1): Management board and audit bodies informed
2023-02-07 16:30 (UTC+1): Bugzilla posted, root store programs informed
- Whether your CA has stopped, or has not yet stopped, certificate issuance or the process giving rise to the problem or incident. A statement that you have stopped will be considered a pledge to the community; a statement that you have not stopped requires an explanation.
Certificate issuance is not affected by this bug. All information in the certificates is correct and compliant.
As the availability of the status information services (CRL, OCSP) is of utmost importance, we decided to continue serving the incorrect times in the CRL until a fix can be deployed. The alternative, to stop serving CRLs, would be an even bigger violation of regulation and would pose a bigger risk to the ecosystem than tolerating the wrong information until an emergency fix can be deployed.
- In a case involving certificates, a summary of the problematic certificates. For each problem: the number of certificates, and the date the first and last certificates with that problem were issued. In other incidents that do not involve enumerating the affected certificates (e.g. OCSP failures, audit findings, delayed responses, etc.), please provide other similar statistics, aggregates, and a summary for each type of problem identified. This will help us measure the severity of each problem.
The problem does not affect certificates; only CRLs are affected. Although the bug affects only the revocation time and not the certificate content, we have attached the list of the 12 affected non-expired TLS certificates to this post. The rest of the affected records are S/MIME and expired TLS certificates.
- In a case involving certificates, the complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem. In other cases not involving a review of affected certificates, please provide other similar, relevant specifics, if any.
See point 4 above.
- Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.
During internal and external audits, spot-checks comparing CRL and OCSP responses found no discrepancy (for reference: the relevant db table currently holds ~392'000 entries, of which only 1446 are wrong, i.e. less than 0.4%). Code reviews missed the discrepancy in the use of the db fields, probably because the code for CRL generation and the code for OCSP responses were developed independently of each other.
- List of steps your CA is taking to resolve the situation and ensure that such situation or incident will not be repeated in the future, accompanied with a binding timeline of when your CA expects to accomplish each of these remediation steps.
- A hotfix will be developed, tested and rolled out so that the OCSP responder uses the same db field as the CRL generator. Target date: 20.02.2023
- The OLD CA platform is planned to stop issuing certificates by 31.12.2023 (the new CA platform is already in operation and migration is ongoing).
- SwissSign will implement a new control that compares all entries on all active CRLs against OCSP responses to detect deviations. Target date: 30.04.2023
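A control of the kind described in the last step could look roughly like the sketch below. It is an assumption-laden illustration, not SwissSign's implementation: CRL entries are modelled as a plain mapping from serial number to revocation time, and `ocsp_lookup` is a hypothetical callable standing in for a real OCSP query that returns the revocation time, or `None` when OCSP does not report the certificate as revoked.

```python
from datetime import datetime, timedelta, timezone


def compare_crl_to_ocsp(crl_entries, ocsp_lookup, tolerance=timedelta(0)):
    """Compare every CRL entry with the OCSP answer for the same serial.

    crl_entries: dict mapping serial -> revocation time listed on the CRL.
    ocsp_lookup: hypothetical callable, serial -> revocation time, or None
                 if OCSP does not report the certificate as revoked.
    Returns a list of (serial, problem) tuples, ordered by serial.
    """
    deviations = []
    for serial, crl_time in sorted(crl_entries.items()):
        ocsp_time = ocsp_lookup(serial)
        if ocsp_time is None:
            deviations.append((serial, "not revoked in OCSP"))
        elif abs(ocsp_time - crl_time) > tolerance:
            deviations.append((serial, "revocation time differs"))
    return deviations


# Invented sample data; a dict lookup stands in for real OCSP queries.
t0 = datetime(2023, 2, 7, 9, 0, 0, tzinfo=timezone.utc)
crl = {"0a": t0, "0b": t0, "0c": t0}
ocsp = {"0a": t0, "0c": t0 + timedelta(seconds=3)}

for serial, problem in compare_crl_to_ocsp(crl, ocsp.get):
    print(serial, problem)
```

Checking every CRL entry, rather than a sample, matters here: with fewer than 0.4% of records affected, spot-checks are unlikely to hit a deviating entry.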
Comment 1 (Assignee) • 3 years ago
Points 8, 9 and 10 should be sub-points to 7... sorry for the formatting error.
Comment 2 (Assignee) • 3 years ago
2023-02-13 15:03 (UTC+) We have implemented, tested and deployed the hotfix mentioned above as nr. 8. The listed certificates now show the same revocation time on CRL and OCSP.
Comment 3 • 2 years ago
Thanks. Unless there are any comments or concerns to be addressed here, I will close this bug on or about Wed. 19-Apr-2023.