Closed Bug 1611241 Opened 5 years ago Closed 5 years ago

Entrust: Compromised Private Key was not Revoked in Less than 24 Hours

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dathan.demone, Assigned: dathan.demone)

Details

(Whiteboard: [ca-compliance] [leaf-revocation-delay])

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:72.0) Gecko/20100101 Firefox/72.0

Steps to reproduce:

  1. How your CA first became aware of the problem

On January 20th at approximately 6:39 am UTC, we received a notification from a third party that one of our customer’s private keys had been exposed. As such, we were required to revoke the certificate due to key compromise within 24 hours, in accordance with BR 4.9.1.1.

On January 20th at approximately 3:31 pm UTC, Entrust Datacard was notified by another third party regarding an exposed private key for a certificate that we had issued to one of our customers. This was the same certificate that was reported to us at 6:39 am UTC.

A third-party incident report has been published here:
https://gist.github.com/nstarke/a611a19aab433555e91c656fe1f030a9

The original notification from 6:39 am UTC was not escalated to management as per our process. Instead, the 3:31 pm UTC notification was used as our original incident report time.

As a result, the certificate was revoked 8 hours and 52 minutes past the 24-hour deadline.

  2. A timeline of the actions your CA took in response

Jan 20, 2020, 6:39 am UTC – We received a notification from the first third party via email to our technical support team that a customer private key was compromised.

Jan 20, 2020, 7:40 am UTC – An agent took ownership of the incident.

Jan 20, 2020, 7:53 am UTC – The agent informed the customer that the private key was exposed and requested that the customer revoke the certificate. The agent failed to escalate this notification to management as per our compromised certificate revocation process.

Jan 20, 2020, 3:31 pm UTC – Another email was sent to Entrust Technical Support from a different third party informing us that a customer private key was exposed. This turned out to be the same certificate/private key from the notification at 6:39 am UTC. A total of 12 reports for this issue (the first two are referred to in this timeline) were sent in via email by third parties on January 20th.

Jan 20, 2020, 6:26 pm UTC – An agent took ownership of the case, contacted a verified subscriber administrator associated with the certificate, and informed them that they needed to revoke within 24 hours of 3:31 pm UTC.

Jan 20, 2020, 7:30 pm UTC – The agent set the proper case label and escalated to management as per our compromised certificate revocation process.

Jan 20, 2020, 10:03 pm UTC – Entrust support provided an internal update to key stakeholders in compliance and security based on our process for compromised certificate revocation.

Jan 20, 2020, 10:26 pm UTC – Entrust hosted an internal conference call to review the data and ensure that the revocation deadline would be met. During this call, it was determined that the revocation should occur before Jan 21, 2020, 3:31 pm UTC, based on the date and time Entrust support received what was thought to be the initial incident notification.

Jan 21, 2020, 2:53 pm UTC – Support management called to follow up with the subscriber administrators to ensure that they were ready to revoke the certificate before the Jan 21, 2020, 3:31 pm UTC deadline. The subscriber confirmed that they would revoke the certificate before Jan 21, 2020, 3:25 pm UTC; otherwise, it would be revoked by Entrust.

Jan 21, 2020, 3:23 pm UTC – The subscriber contacted Entrust Support and confirmed that the certificate had been revoked. The official revocation time was 3:21 pm UTC.

Jan 21, 2020, 3:29 pm UTC – Entrust Certificate Services Support Management confirms with internal teams that the certificate has been revoked.

Jan 21, 2020, 6:43 pm UTC – A Support Agent discovered that the original case that was created from the third party email we received on Jan 20, 2020, 6:39 am UTC was mislabeled and was never escalated according to our process. Management held an internal meeting and determined that we missed the 24-hour revocation deadline, as the first contact occurred at Jan 20, 2020, 6:39 am UTC and not at Jan 20, 2020, 3:31 pm UTC.
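
The deadline arithmetic behind this timeline can be checked with a short sketch. The three timestamps are taken from the timeline; the variable names are illustrative only. With these inputs, the 3:21 pm UTC revocation falls 10 minutes inside the deadline that was assumed from the second report and 8 hours 42 minutes after the deadline implied by the first report, while the gap between the two reports, and therefore between the two deadlines, is 8 hours 52 minutes.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
REVOCATION_WINDOW = timedelta(hours=24)  # key-compromise window per BR 4.9.1.1

# Times taken from the timeline above.
first_report = datetime(2020, 1, 20, 6, 39, tzinfo=UTC)
second_report = datetime(2020, 1, 20, 15, 31, tzinfo=UTC)
revoked_at = datetime(2020, 1, 21, 15, 21, tzinfo=UTC)

# Deadline that actually applied (first report) versus the one that was
# assumed during the incident (second report).
actual_deadline = first_report + REVOCATION_WINDOW
assumed_deadline = second_report + REVOCATION_WINDOW

late_by = revoked_at - actual_deadline              # time past the real deadline
margin_vs_assumed = assumed_deadline - revoked_at   # time to spare vs. assumed deadline
deadline_shift = assumed_deadline - actual_deadline  # gap between the two reports

print("actual deadline: ", actual_deadline.isoformat())
print("assumed deadline:", assumed_deadline.isoformat())
print("late by:         ", late_by)
print("margin (assumed):", margin_vs_assumed)
print("deadline shift:  ", deadline_shift)
```

Anchoring the deadline to the earliest report for a given certificate, rather than to the first report that happened to be escalated, is what keeps the 24-hour window from shifting in this way.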

  3. Confirmation that your CA has stopped issuing TLS/SSL certificates with the problem
    This was a subscriber private key exposure issue, which is documented here: https://gist.github.com/nstarke/a611a19aab433555e91c656fe1f030a9

  4. A summary of the problematic certificates
    The subscriber exposed their private key by including the private key file on device firmware for a commercially available router that they sell. This was discovered and posted in the above report.

According to BR 4.9.1.1, Certification Authorities are required to revoke a certificate within 24 hours of being notified of a key compromise. Due to the issues with determining the time of the initial third-party notification, the certificate was revoked 8 hours and 52 minutes past the 24-hour deadline.

  5. The complete certificate data for the problematic certificates
    https://crt.sh/?id=1955992027&opt=ocsp

  6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

When an email is sent to Entrust Technical Support, a case is automatically created in our CRM system so that it can be actioned by one of our agents.

The original email to notify us of this exposed private key was sent at 6:39 am UTC and a case was created. The case was picked up by an agent at 7:40 am UTC but was not properly labeled or escalated as per our process. Our current process requires that incoming revocation requests due to compromise be labeled as “Misuse” and the agent must notify compliance management immediately so that the issue can be investigated further and so that appropriate action can be taken. Because this case was not labeled and escalated properly, our incident notification time was not set to 6:39 am UTC, which is the major reason why revocation did not occur within 24 hours.

The second email we received to notify us of the issue came in at 3:31 pm UTC and was properly labeled and escalated as per our process. We treated this time as our initial incident notification time and aimed to revoke before 3:31 pm UTC the following day, to stay within the 24-hour time period.

We received 10 more notifications for this issue after 3:31 pm UTC that were all properly actioned, labeled, and escalated.

Based on this summary, the incident was caused by a lack of automation in our case labeling and escalation process for these types of issues. The current labeling and escalation process is human-driven, which could lead to failure in executing the process promptly.

  7. List of steps your CA is taking to resolve the situation

We are making changes to our support CRM system to simplify the labeling of compliance-related cases and enabling automated notification to compliance management when there is content that may indicate a potential certificate compromise.

To reduce the risk of human error in labeling cases under the compliance category (the label that triggers the automated email to compliance management), we will implement a system that checks every email sent to support for specific keywords that may relate to a compliance report, such as a compromised private key. If a keyword is detected, an automated email will be sent to a compliance management distribution list so that the case/email can be reviewed.
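
As a rough illustration of the keyword check described above, here is a minimal sketch. The keyword patterns, the `SupportEmail` type, and the function names are assumptions made for the example; the report does not say which terms or which CRM features are actually used.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword patterns; the actual list is not given in this report.
COMPLIANCE_PATTERNS = [
    r"private\s+key",
    r"key\s+compromise",
    r"compromised",
    r"mis-?issu",
    r"misuse",
    r"revo(?:ke|cation)",
]
_COMPILED = re.compile("|".join(COMPLIANCE_PATTERNS), re.IGNORECASE)


@dataclass
class SupportEmail:
    sender: str
    subject: str
    body: str


def compliance_hits(email: SupportEmail) -> list[str]:
    """Return the distinct keyword matches found in the subject and body."""
    text = f"{email.subject}\n{email.body}"
    return sorted({m.group(0).lower() for m in _COMPILED.finditer(text)})


def needs_escalation(email: SupportEmail) -> bool:
    """True if the email should also be sent to the compliance distribution list."""
    return bool(compliance_hits(email))


report = SupportEmail(
    sender="reporter@example.org",
    subject="Compromised certificate private keys",
    body="The private key for one of your certificates is in public firmware.",
)
question = SupportEmail(
    sender="customer@example.org",
    subject="Installation question",
    body="How do I install my certificate on my web server?",
)

print(needs_escalation(report), compliance_hits(report))
print(needs_escalation(question), compliance_hits(question))
```

A filter like this only catches reports that use the expected wording, so it works as a backstop to manual labeling rather than a replacement for it.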

We will provide an update when we have confirmed when these improvements will be implemented.

Assignee: wthayer → dathan.demone
Status: UNCONFIRMED → ASSIGNED
Type: defect → task
Ever confirmed: true
Whiteboard: [ca-compliance] [delayed-revocation-leaf]

Hi Dathan,

Could you clarify a bit more about this reporting? That is, when it came in to your problem reporting, did it come in via the general support e-mail listed in Section 1.5.2, or the problem reporting / key compromise reports in 1.5.2? I'm basing this on the CPS dated 2019-09-30

If the latter, it seems like those should naturally be auto-escalated/auto-tagged, without having to do further content filtering, but perhaps I misunderstood?

Flags: needinfo?(dathan.demone)

Hello, I believe I am the first third party mentioned in the report, based on the timestamps.

I reported the issue to the two emails listed at https://crt.sh/?id=1955992027&opt=problemreporting (ecs.support@entrustdatacard.com and abuse@affirmtrust.com), which I believe are sourced from the CCADB.

One detail I can add to the timeline above: after the other reports, but before the revocation time, I received the following reply.

Date: Tue, 21 Jan 2020 01:23:56 +0000 (GMT)
From: ECS Support <ecs.support@entrust.com>
Subject: RE: [Email Loop Protection] [EXTERNAL]Compromised certificate
 private keys    [ ref:_00D301H7DR._5001O1gWNKC:ref ]

Hi,
Thank you for reporting this incident to us. We had reached out to the subscribers and are working with them to ensure the certificate gets revoked.

Regards,
[REDACTED]
Tech Support Agent
Entrust Datacard

Filippo - Yes, you were the first to report this problem according to our timeline. Thank you for sharing these details.

Ryan - As Filippo mentioned in his post, he did send his notification to the correct contact person listed in section 1.5.2 of our CPS. Emails sent to this address are received and triaged by our support team. At the time of this incident, there was no automated system in place to filter out key compromise notifications from the other emails we receive at this address. As I mentioned in the timeline, the first email we received from Filippo was not properly escalated to compliance management due to a human error. To address this, we are putting 2 new measures in place:

  1. When a support agent receives an email that they believe may be related to a compliance issue, they will tag the support case under the compliance category. We have made changes that will go live today to our CRM system to simplify how cases are tagged and to automatically send out a notification to a compliance management distribution list when the compliance tag is used.

  2. In order to reduce the risk of relying on a human to properly identify and tag the support case, we are implementing email content scanning for every support case to look for keywords that may indicate compliance items, such as private key compromise, certificate misuse, or certificate mis-issuance. When a keyword is found, an email will be sent to the compliance management distribution list for further investigation.

Item 1 will be implemented as of today.

We will perform some additional testing on item 2 this week and provide an update by Friday, Jan 31 to confirm that we have successfully implemented the changes.

Flags: needinfo?(dathan.demone)

We are still working on item 2 that I described in the last post and it has not been fully implemented. We will provide another update on Friday, February 7th.

Thanks for the progress update, Dathan.

Perhaps I'm overthinking things, but it sounds like you're using the e-mail in 1.5.2 for issues other than compliance, is that correct? It seems like having an e-mail clearly indicated for compliance issues would allow you to ensure that compliance management is notified, without any reliance on content scanning or human categorization.

I mention this, because something like the above has been part of the remediation for other CAs.

Put differently: Is there a reason every e-mail is not escalated to compliance for review? Is there a path or set of options that could make that a possibility? It seems like getting messages in front of the 'right people' as quickly as possible is the goal, and while I appreciate the novelty and complexity of the solutions outlined in Comment #3, I'm wondering whether there is something simpler, with fewer moving parts, that could provide even greater assurance.

That said, if there are problems or challenges I'm overlooking, this is a great time to share them, to help build an understanding about them :)
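
To make the simpler alternative concrete, here is a minimal sketch of address-based routing, where anything sent to a dedicated compliance address reaches compliance management regardless of content. Both addresses and the queue names are hypothetical and are not taken from any CPS.

```python
# Hypothetical addresses, for illustration only.
COMPLIANCE_ADDRESS = "compliance-reports@ca.example"
SUPPORT_ADDRESS = "support@ca.example"


def route(recipients: list[str]) -> list[str]:
    """Return the queues a message lands in, based only on its recipients."""
    normalized = {addr.strip().lower() for addr in recipients}
    queues = []
    # Mail to the dedicated address always reaches compliance management;
    # no content scanning or human categorization is involved.
    if COMPLIANCE_ADDRESS in normalized:
        queues.append("compliance-management")
    # Everything else (and anything also copied to support) goes to support.
    if SUPPORT_ADDRESS in normalized or not queues:
        queues.append("support")
    return queues


print(route(["Compliance-Reports@ca.example"]))
print(route(["support@ca.example"]))
print(route(["support@ca.example", "compliance-reports@ca.example"]))
```

The routing decision depends on a single field, so there are fewer places for a report to be misclassified than with keyword matching or manual tagging.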

Flags: needinfo?(dathan.demone)

Ryan, The contact email address listed in section 1.5.2 is used for more than just compliance. This email address is also used for technical support. This is our most actively monitored mailbox.

We did originally consider creating an email address specifically for compliance issues as part of our plan to address this incident. However, we determined that it may increase our risk, as we would have another mailbox to monitor. Even if we publish a compliance specific email address in our CPS and on our website, we have no way of controlling how someone will submit a compliance report to us. In fact, with this specific incident report, we noticed that some people contacted us using a variety of email addresses, including some that have been removed from our website.

We feel that using our most frequently monitored mailbox is our best path forward. The addition of the automated content scanning and new case tagging rules/automated email escalation will help to improve our response time and ensure that the right people are notified when we receive reports for potential compliance issues. I can confirm that the content scanning and tagging enhancements that we described have been implemented. Please let us know if you have any other feedback or questions.

Flags: needinfo?(dathan.demone)

Wayne: Setting N-I, as it sounds like all changes proposed have been implemented.

Dathan: Thanks for the added explanation. I'm not sure I fully understand or appreciate your concerns regarding compliance. For example, if your CP/CPS stated a method to report compliance issues, and someone did not use that reporting mechanism, but instead did something like send a self-addressed stamped envelope, or found someone at Entrust to send a LinkedIn InMail to, or left a comment on someone's Facebook page, I think those are situations where we might all reasonably conclude that was not a reasonable method of reporting non-compliance, and a failure for a CA to respond timely to that LinkedIn message would not represent a CP/CPS violation or a BR violation.

I think the same would be said about a dedicated compliance mail, which comparatively, a number of CAs have adopted. This ensures that messages to this dedicated compliance address are given the highest urgency and attention, and as few systems as possible are interleaved in a way that might cause non-compliance.

I can appreciate the confidence Entrust Datacard has in its system, and I suppose time will tell, but it does seem like such systems are error prone, have caused issues in the past (although I would need to dig up specific references, I do not believe Entrust is the first to propose or use such a system), and simpler solutions may be available.

At the end of the day, however, that's your decision, as the expectation of compliance remains the same.

Flags: needinfo?(wthayer)

It appears that all questions have been answered and remediation is complete.

Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Flags: needinfo?(wthayer)
Resolution: --- → FIXED
Product: NSS → CA Program
Whiteboard: [ca-compliance] [delayed-revocation-leaf] → [ca-compliance] [leaf-revocation-delay]