Closed Bug 1879529 Opened 2 years ago Closed 1 year ago

D-Trust: "unknown" OCSP response for issued certificates

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: enrico.entschew, Assigned: enrico.entschew)

Details

(Whiteboard: [ca-compliance] [ocsp-failure])

Preliminary Incident Report

Summary

This is a preliminary incident report.

For 3 OV certificates D-Trust’s OCSP validation service is responding “unknown” instead of “good”. The certificates were issued on January 26, 2024. This matter was brought to our attention by Ben Wilson via email.

We are investigating the situation and will come back with a detailed incident report within the next 7 days.

Impact

For 3 OV certificates D-Trust’s OCSP validation service is responding “unknown” instead of “good”.

Timeline

2024-02-08:

  • 05:03 Email from Ben Wilson
  • 05:57 Acknowledgement of email and content

2024-02-09:

  • 06:10 Start of internal analysis
  • 13:10 Informing Conformity Assessment Body about the issue

Root Cause Analysis

We are still investigating.

Lessons Learned

What went well

We are still investigating.

What didn't go well

We are still investigating.

Where we got lucky

We are still investigating.

Action Items

Action Item | Kind | Due Date
Regular manual check of https://sslmate.com/labs/ocsp_watch/ | Detect | 2024-02-09

Appendix

Details of affected certificates

List of all affected certificates
https://crt.sh/?sha256=0e25534e64d9bf31c7ed7e3814e19fd05f578e000c0cc71b72d9cbe029c8c8c4
https://crt.sh/?sha256=1a8068655af3ac388a1dacedcbd7f75c369410e44152c35f2122a29e6441107d
https://crt.sh/?sha256=4c05fcb339e55298f737c4774a73574422ca30b83feea6e613e5b211fa4b18e2

Based on Incident Reporting Template v. 2.0

Assignee: nobody → enrico.entschew
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Whiteboard: [ca-compliance] [ocsp-failure]

Incident Report

Summary

For 3 OV certificates D-Trust’s OCSP validation service responded “unknown” instead of “good”. The certificates were issued on January 26, 2024.

Impact

For a period of 14 days, the OCSP validation service of D-Trust responded with "unknown" instead of "good" for 3 OV certificates.
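The 14-day figure can be checked against the dates given in this report: the certificates were issued on 2024-01-26, and the timeline records the first "good" responses on 2024-02-09. The sketch below works with calendar dates only, since the time of day of issuance is not stated here.

```python
from datetime import date

issued = date(2024, 1, 26)      # issuance date from the Summary
first_good = date(2024, 2, 9)   # OCSP begins to respond "good" (Timeline)

# Whole calendar days between issuance and the corrected response.
affected_days = (first_good - issued).days
print(affected_days)  # 14
```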

Timeline

2024-01-25

  • 18:00 Updating the affected CA system

2024-01-26

  • 17:50 Start of isolated irregularities in the CA software (limited availability and individual failures)

2024-01-28

  • 14:10 Start of troubleshooting because the irregularities in the CA system persisted

  • 14:25 Restart of the CA system

2024-02-08:

  • 05:03 Email from Ben Wilson
  • 05:57 Acknowledgement of email and content

2024-02-09:

  • 06:10 Start of internal analysis
  • 12:30 Informing Conformity Assessment Body about the issue
  • 14:47 OCSP validation service begins to respond with “good” for the affected OV certificates

2024-02-15

  • 14:00 End of analysis

Root Cause Analysis

There were isolated irregularities in the CA software, with reduced availability and individual failures. Overall, however, the system appeared to be issuing certificates without errors. The individual failures nevertheless disrupted the interaction between the CA software and the component responsible for publishing the certificate status, meaning that 3 certificates could still be produced but were not published to the OCSP system.
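One generic way to catch this class of failure is to reconcile the set of issued certificates against the set of certificates the OCSP system knows about. The following is only a minimal sketch of that idea, not a description of D-Trust's systems: the function name `find_unpublished` and the serial numbers are hypothetical, and how each list is obtained is left open.

```python
def find_unpublished(issued_serials, ocsp_known_serials):
    """Return serials that were issued but are not known to the OCSP system."""
    return sorted(set(issued_serials) - set(ocsp_known_serials))


# Hypothetical serial numbers, for illustration only.
issued = ["0a01", "0a02", "0a03", "0a04"]
published = ["0a01", "0a03"]

missing = find_unpublished(issued, published)
print(missing)  # ['0a02', '0a04']
```

Any serial in the result would be a certificate for which an OCSP responder could be expected to answer "unknown" rather than "good".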

The problem occurred after an update of the CA system. The update had already been successfully tested several times in various test systems before being installed in the affected production system. It has been used without errors in several other production systems since December 2023.

The manufacturer of the CA system has been contacted about the problem of irregularities in this particular production system and has now provided a patch. We are currently testing the patch. However, this problem is not the main reason why the incorrect response of the OCSP system remained undetected.

D-Trust has a monitoring system for cases of incorrect responses from the OCSP system. This monitoring system also successfully recorded the error. However, in this particular case, the monitoring system did not pass on the necessary information. There was no escalation that would have resulted in manual or automated intervention.
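The gap described here, a check that records an error without escalating it, can be illustrated with a small sketch. This is an assumption-laden illustration rather than D-Trust's actual monitoring: the names `Observation`, `needs_escalation`, and `process` are hypothetical, and the escalation channel is passed in as a plain callback. The three status values are the certificate statuses defined for OCSP in RFC 6960.

```python
from dataclasses import dataclass
from enum import Enum


class CertStatus(Enum):
    # Certificate status values defined for OCSP responses (RFC 6960).
    GOOD = "good"
    REVOKED = "revoked"
    UNKNOWN = "unknown"


@dataclass
class Observation:
    serial: str          # hypothetical identifier of an issued certificate
    revoked: bool        # True if the CA's own records show it as revoked
    status: CertStatus   # status returned by the OCSP responder


def needs_escalation(obs: Observation) -> bool:
    """Escalate when the responder disagrees with the CA's own records."""
    expected = CertStatus.REVOKED if obs.revoked else CertStatus.GOOD
    return obs.status is not expected


def process(observations, escalate):
    """Record every observation and escalate each mismatch; return the log."""
    log = []
    for obs in observations:
        mismatch = needs_escalation(obs)
        log.append((obs.serial, obs.status.value, mismatch))
        if mismatch:
            escalate(obs)  # the alerting transport is deliberately left open
    return log


# Hypothetical observations: one healthy, one "unknown" for an issued certificate.
alerts = []
log = process(
    [
        Observation("0a01", False, CertStatus.GOOD),
        Observation("0a02", False, CertStatus.UNKNOWN),
    ],
    alerts.append,
)
print([obs.serial for obs in alerts])  # ['0a02']
```

The point of the design is that recording and escalating happen in the same code path, so a mismatch cannot be logged without also reaching the escalation callback.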

Lessons Learned

What went well

The error was detected by our monitoring system.

What didn't go well

The escalation chain did not work. For this particular case, there was monitoring but no escalation.

Where we got lucky

Only 3 TLS certificates were affected. All other certificates were successfully produced and published to the OCSP system.

Action Items

Action Item | Kind | Due Date
Regular manual check of https://sslmate.com/labs/ocsp_watch/ | Detect | 2024-02-09
Extending monitoring rules to ensure escalation in the reporting chain for this specific error case | Detect | 2024-03-14

Appendix

Details of affected certificates

List of all affected certificates
https://crt.sh/?sha256=0e25534e64d9bf31c7ed7e3814e19fd05f578e000c0cc71b72d9cbe029c8c8c4
https://crt.sh/?sha256=1a8068655af3ac388a1dacedcbd7f75c369410e44152c35f2122a29e6441107d
https://crt.sh/?sha256=4c05fcb339e55298f737c4774a73574422ca30b83feea6e613e5b211fa4b18e2

Quick update:
Since 2024-02-19, 09:00, the extended monitoring rules that ensure escalation in the reporting chain for this specific error case have been in place.

It appears that this case has been completed, and I'll schedule this for closure on Friday, 5-Apr-2024.

Flags: needinfo?(bwilson)
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Flags: needinfo?(bwilson)
Resolution: --- → FIXED