Open Bug 1916478 Opened 20 days ago Updated 16 days ago

emSign PKI Services: Delayed Revocation of SSL/TLS Certificates

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

ASSIGNED

People

(Reporter: naveen.ml, Assigned: naveen.ml)

References

(Blocks 1 open bug)

Details

(Whiteboard: [ca-compliance] [leaf-revocation-delay])

Incident Report

Summary

An external researcher notified emSign CA of a potential key compromise involving four SSL certificates belonging to a customer. Certificates with compromised keys must be revoked within 24 hours, but revocation was delayed by an internal process failure: the report arrived in the general support queue and was not routed to the appropriate team in a timely manner. A smaller part of the delay can be attributed to the affected customer's lack of response to contact attempts. Once responsibility for the incident was routed to the correct team and contact with the customer was established, the certificates were revoked, resolving the issue. This incident underscores the importance of training and re-training support staff to triage incoming support requests appropriately, and of prompt customer communication so that security concerns are mitigated in time.

Impact

The delay in revoking the compromised SSL certificates posed a potential risk, as the certificates remained valid beyond the required 24-hour revocation window. No incidents of misuse were reported during this period, which limited the potential impact.

Timeline

All times are IST.

2024-08-31 23:19 - emSign PKI received a notification in the general support queue from an external researcher regarding the potential compromise of four certificates. The staff monitoring the queue processed it under the general standard operating procedure and failed to route it to the appropriate team in a timely manner.

2024-09-01 10:30 - The emSign PKI team became aware of the issue and began investigating the accessibility of the private key on their website. The earlier mis-categorization of the support ticket introduced a delay of approximately 11 hours in its processing.

2024-09-01 11:15 - The emSign PKI team confirmed that the private key had been compromised and raised an internal ticket with the PKI support team to initiate contact with the affected customer.

2024-09-01 12:45 - The SSL support team attempted to contact the customer using the registered contact information but did not receive any response to calls or emails.

2024-09-01 16:30 - The SSL support team made another attempt to contact the customer, but again, there was no response.

2024-09-02 09:15 - The SSL support team confirmed to emSign PKI that the customer had not responded.

2024-09-02 09:40 - The emSign PKI team acknowledged the situation and raised an internal incident (Incident No. EMINCPKI0019) with the compliance group, noting the delay in revocation due to the customer's lack of response.

2024-09-02 10:10 - The SSL support team successfully contacted the customer, informed them of the private key compromise, and notified the emSign PKI team to revoke the four affected SSL certificates.

2024-09-02 10:24 - The emSign PKI team revoked the four SSL certificates compromised due to the private key exposure.

Root Cause Analysis

The delay in revocation occurred because the support staff monitoring the general queue followed the general standard operating procedure and failed to route the report to the appropriate team in a timely manner. NOTE: there is a separate, dedicated queue for revocation requests, but this request came into the general support queue. In addition, the affected customer did not respond to multiple attempts to inform them of the compromise of their SSL private keys (somewhat expected over a weekend). The emSign PKI team should have set its revocation cutoff time based on the timestamp of the original communication, rather than the time the case was eventually routed to the team. This failure to follow due process resulted in the revocation exceeding the required timeframe.
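The cutoff calculation described above can be illustrated with a short sketch. It uses only the timestamps from the Timeline and the 24-hour window from the Summary. The function and variable names are hypothetical and do not describe emSign's actual tooling, and treating the 10:30 entry as the routing time is an assumption made for illustration.

```python
from datetime import datetime, timedelta, timezone

# All timeline entries are in IST (UTC+05:30).
IST = timezone(timedelta(hours=5, minutes=30))

# 24-hour revocation window for key compromise, as stated in the Summary.
KEY_COMPROMISE_WINDOW = timedelta(hours=24)


def revocation_deadline(first_received: datetime) -> datetime:
    """Deadline counted from when the report was first received (hypothetical helper)."""
    return first_received + KEY_COMPROMISE_WINDOW


# Timestamps taken from the Timeline section.
reported = datetime(2024, 8, 31, 23, 19, tzinfo=IST)  # arrived in general support queue
routed = datetime(2024, 9, 1, 10, 30, tzinfo=IST)     # PKI team became aware (assumed routing time)
revoked = datetime(2024, 9, 2, 10, 24, tzinfo=IST)    # certificates revoked

deadline = revocation_deadline(reported)
overrun = revoked - deadline

print("Deadline from original report:", deadline.isoformat())
print("Revocation overran deadline by:", overrun)

# Counting from the routing time instead would make the revocation look timely,
# which is the misalignment the root cause analysis describes.
print("Within 24h of routing time:", revoked <= revocation_deadline(routed))
```

Anchoring the deadline to the original report puts the cutoff at 2024-09-01 23:19 IST, so the revocation at 2024-09-02 10:24 missed it by 11 hours 5 minutes, even though it fell within 24 hours of the time the PKI team became aware of the report.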

Lessons Learned

This incident highlights a critical need to train and re-train support staff to recognize and triage revocation requests appropriately, even when they are submitted via an alternative channel. The team responsible for taking the revocation action (in the case of key compromise) should measure the revocation deadline from the timestamp on the original request, not from when the request was received into its own queue. Additionally, this incident highlights the benefit to the industry of customers adopting more robust, automated processes for managing certificates, such as ACME (Automated Certificate Management Environment), which could allow for immediate replacement of compromised certificates without requiring manual intervention.

What went well

The external researcher quickly identified and reported the compromised certificates, giving emSign CA the opportunity to initiate the revocation process as soon as possible. Internal teams acted promptly once the support request was routed appropriately.

What didn't go well

The revocation process was delayed because support staff failed to recognize a revocation request in the general queue and subsequently mis-categorized the support ticket. The inability to reach the affected customer added further, unnecessary delay, since the revocation could have proceeded regardless. Together, these factors prevented the timely revocation of the compromised certificates, leading to a potential security vulnerability.

Where we got lucky

Despite the delay, there were no reported incidents of misuse of the compromised certificates. The prompt action taken by the external researcher allowed the vulnerable certificates to be replaced.

Appendix

Details of affected certificates

The following certificates were affected by the private key exposure:

https://crt.sh/?id=13600202644
https://crt.sh/?id=14318127190
https://crt.sh/?id=14329256062
https://crt.sh/?id=14330892679

Next Steps

Moving forward, the focus will be on training support staff to recognize the importance of certain types of requests that require expedited processing, irrespective of which channel the request is received through. Periodic validation of support staff competency will be reinforced.

Beyond addressing internal process failures, we will also focus on encouraging customers to implement automated certificate management solutions such as ACME, which allow for immediate certificate replacement in the event of a compromise. Additionally, we will continue to educate customers on the importance of responsiveness during security incidents to ensure swift and effective mitigation.

Based on Incident Reporting Template v. 2.0

Assignee: nobody → naveen.ml
Blocks: 1911183
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Whiteboard: [ca-compliance] [leaf-revocation-delay]