Closed Bug 1651481 Opened 5 years ago Closed 5 years ago

Entrust: Late Revocation due to SHA-256 hash algorithm

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bruce.morton, Assigned: bruce.morton)

Details

(Whiteboard: [ca-compliance] [leaf-revocation-delay])

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36

Steps to reproduce:

Certificates are being revoked after the 5 day requirement.

  1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.

On 17 June 2020, the Entrust Datacard compliance team discovered that we were issuing SSL certificates which were signed using an ECC P-384 key, but were hashed using SHA-256. This is not allowed per Mozilla Policy. The following incident report was posted: https://bugzilla.mozilla.org/show_bug.cgi?id=1648472.
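
For illustration only, the rule at issue can be sketched as a lookup from the signing key's curve to the hash it must be paired with. This is not the crt.sh lint or any Entrust code; the function name and the string labels are hypothetical, and the curve/hash pairs reflect Mozilla Policy section 5.1.2 as understood here.

```python
from typing import List

# Assumed pairing per Mozilla Policy section 5.1.2 (ECDSA):
# the signature hash must match the strength of the signing key's curve.
REQUIRED_HASH = {
    "P-256": "SHA-256",
    "P-384": "SHA-384",
}


def check_ecdsa_signature(signing_curve: str, signature_hash: str) -> List[str]:
    """Return a list of findings; an empty list means no problem was found.

    Both arguments are plain labels (hypothetical); a real linter would
    read them from the issuer key and the certificate's signature algorithm.
    """
    required = REQUIRED_HASH.get(signing_curve)
    if required is None:
        return [f"curve {signing_curve} is not covered by this sketch"]
    if signature_hash != required:
        return [
            f"{signing_curve} signing key requires {required}, "
            f"found {signature_hash}"
        ]
    return []


# The combination described in this report is flagged:
print(check_ecdsa_signature("P-384", "SHA-256"))
```

A check of this shape could run before issuance as well as after, which is the distinction between pre-issuance and post-issuance linting.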

  2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.

17 June 2020: Issue discovered using crt.sh linting software.
17 June 2020: Investigation started, which determined that the problem affected two ECC CAs.
24 June 2020: Last CA configured to support SHA-384 signing.
25 June 2020: Incident Report submitted, in which the original plan was to allow the active certificates to expire. This was based on the reasoning that there was no security issue within the remaining validity period of the certificates.
26 June 2020: Plan changed to revoke all certificates and file the late revocation incident report.
29 June 2020: All Subscribers were notified to reissue their certificates to allow the non-compliant certificates to be revoked.

  3. Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.

Entrust stopped issuing certificates using the incorrect hash algorithm on 24 June 2020.

  4. A summary of the problematic certificates. For each problem: number of certs, and the date the first and last certs with that problem were issued.

As of 25 June 2020, 606 certificates with the incorrect hash algorithm were unexpired and unrevoked.

  5. The complete certificate data for the problematic certificates.

Certificates are listed in this Incident Report, https://bugzilla.mozilla.org/show_bug.cgi?id=1648472.

  6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

Entrust originally took the position that there was no immediate security issue, nor a security issue within the remaining validity period of the certificates. After filing the Incident Report and reviewing with Mozilla, it was determined that the certificates should be revoked. Since revocation is taking place after the 5 day period, this Incident Report has been generated.

There were 606 certificates associated with 142 global customers. Entrust primarily issues certificates to enterprise customers, which include banks, governments and other large financial and commercial customers. It was considered that immediately revoking all certificates would negatively impact the Subscribers and their Relying Parties. As such, we are working through a process to provide notice and confirmation. This process causes a delay to revocation, but also provides a better experience for the users of the SSL ecosystem, by not producing errors when there is no urgent security issue. Entrust understands that we are responsible for the consequences of late certificate revocation.

  7. List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.

Notification provided to all customers.
Reissue and revocation started shortly after notification.
Follow up with all non-responsive customers to confirm revocation and provide time to reissue certificates.
Will follow up on completion timeline.

Status: To date, 376 unexpired certificates are not revoked.

Assignee: bwilson → bruce.morton
Status: UNCONFIRMED → ASSIGNED
Type: defect → task
Ever confirmed: true
Whiteboard: [ca-compliance] [delayed-revocation-leaf]

Bruce,

I worry this dangerously skirts close to what's called out in https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation , particularly "Responses similar to “we do not deem this non-compliant certificate to be a security risk” are not acceptable."

I encourage you to re-review that section, as well as reports from other CAs. For example, consider the example set by DigiCert in bugs like Bug 1516561, Bug 1517617, Bug 1515788, Bug 1516453, Bug 1516599, Bug 1519572, and Bug 1516545. These are closer to the model expected, as documented.

I'm also concerned that this incident so far seems to largely dismiss this consideration:

You will perform an analysis to determine the factors that prevented timely revocation of the certificates, and include a set of remediation actions in the final incident report that aim to prevent future revocation delays.

That is, I think this response:

Entrust primarily issues certificates to enterprise customers which includes banks, governments and other large financial and commercial customers. It was considered that immediately revoking all certificates would negatively impact the Subscribers and their Relying Parties.

Runs dangerously close to "We aren't, nor do we plan on in the future, meeting the requirements". In the spirit of finding productive solutions, I think understanding what Entrust is planning to do so that immediate revocation has zero negative impact is the concrete goal, or, should that be determined difficult, what Entrust is doing to make sure such Subscribers are aware of any negative impacts.

Flags: needinfo?(bruce.morton)

(In reply to Ryan Sleevi from comment #2)

Runs dangerously close to "We aren't, nor do we plan on in the future, meeting the requirements". In the spirit of finding productive solutions, I think understanding what Entrust is planning to do so that immediate revocation has zero negative impact is the concrete goal, or, should that be determined difficult, what Entrust is doing to make sure such Subscribers are aware of any negative impacts.

Mozilla policy https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation states: "Mozilla recognizes that in some exceptional circumstances, revoking the affected certificates within the prescribed deadline may cause significant harm, such as when the certificate is used in critical infrastructure and cannot be safely replaced prior to the revocation deadline, or when the volume of revocations in a short period of time would result in a large cumulative impact to the web. However, Mozilla does not grant exceptions to the BR revocation requirements. It is our position that your CA is ultimately responsible for deciding if the harm caused by following the requirements of BR section 4.9.1 outweighs the risks that are passed on to individuals who rely on the web PKI by choosing not to meet this requirement."
 
It is our position that revoking the 605 affected certificates within the prescribed deadline would cause significant harm to the Subscriber's website and Relying Party use. It was our decision that the harm caused by revoking in accordance with BR 4.9.1 outweighed the risks passed on to the Subscribers and Relying Parties. For this incident Entrust takes responsibility for this decision.
 
For previous issues, Entrust has been transparent when we have found incidents and has worked hard to revoke certificates within the required timeline. When we have missed the timeline, we have implemented practices to try to prevent similar issues in the future. If there are future incidents (which we are trying hard to prevent), we still plan to revoke within the timeline provided in the BRs. However, we do believe that there are some incidents where quick revocation will not help. Per Mozilla’s statement above, we understand that Mozilla also understands this position.
 
We also reviewed another incident report https://bugzilla.mozilla.org/show_bug.cgi?id=1619179 where it was stated that “Subscribers are not able to replace certificates on the BR-mandated timelines.” Subscribers have configuration management practices which must be followed. They also provide certificate services and hosting for other customers which mandate practices or must also be involved in communication. With the number of certificates and our perception of the impact to the Subscribers and Relying Parties, we thought delayed revocation would be the best solution for the ecosystem.
 
We do know that we made an incorrect initial decision, where we planned to NOT revoke. This made our response and action plan late. In the future our plan will be:

  • Publish an Incident Report within 1 business day of detection of the incident
  • Publish a late revocation Incident Report within the required 24 hours or 5 days requirement, as applicable
  • Advise Subscribers to revoke within 24 hours or 5 days as applicable or provide an explanation why they cannot revoke
  • Update the late revocation Incident Report with the Subscriber status and the timeline for closure
  • Ensure that the Incident(s) are listed in the annual audit report
Flags: needinfo?(bruce.morton)

Status: To date, 297 unexpired certificates are not revoked.

I appreciate you quoting the page, because it shows you've read it. Unfortunately, I don't see any evidence of adherence to the expectations here, and I think it's reasonable to push back to say that your handling of this is not sufficient.

You cannot cherry-pick parts of the policy while ignoring the expectations that result from following it. I appreciate the response in Comment #3, but it doesn't provide any substantively new information from Comment #2, and I think this remains deeply concerning for Entrust in the future.

Unfortunately, Comment #3 is actually quite counter to the expectations set forth in https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation , as was Comment #1, and this is quite concerning.

For example, within that same section, the following expectations are clearly and unambiguously laid out, which Entrust has failed to adhere to in this incident:

  • The rationale must include an explanation for why the situation is exceptional.
  • When revocation is delayed at the request of specific Subscribers, the rationale must be provided on a per-Subscriber basis.
  • Any decision to not comply with the timeline specified in the Baseline Requirements must also be accompanied by a clear timeline describing if and when the problematic certificates will be revoked or expire naturally, and supported by the rationale to delay revocation.
  • You will perform an analysis to determine the factors that prevented timely revocation of the certificates, and include a set of remediation actions in the final incident report that aim to prevent future revocation delays.

Each of these bullet points has clearly not been met within this report. More concerning, your reply continues to be quite contrary to that expectation, in particular:

It is our position that revoking the 605 affected certificates within the prescribed deadline would cause significant harm to the Subscriber's website and Relying Party use. It was our decision that the harm caused by revoking in accordance with BR 4.9.1 outweighed the risks passed on to the Subscribers and Relying Parties.

This provides no analysis or details demonstrating a thoughtful evaluation that supports this conclusion on a per-Subscriber basis.

However, we do believe that there are some incidents where quick revocation will not help.

This seems contrary to the goal of ensuring "a set of remediation actions in the final incident report that aim to prevent future revocation delays." I see nothing to that effect on this bug. Worse, this seems to directly counter the view that circumstances are exceptional, by arguing that they are, in fact, to be expected, and that Entrust continues to expect similar incidents in the future.

If you are going to appeal to the policy, you need to actually adhere to it. Put bluntly: the goal of https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation is not to give CAs carte blanche to arbitrarily violate the Baseline Requirements based on their judgement. It's intended to be a learning exercise for truly exceptional situations. There is nothing, within this bug, that can help either Relying Parties, Browsers, or CAs on how to do better. And when CAs fail to demonstrate a commitment to improving, both individually and as an ecosystem, that has serious long-term consequences. My hope is that you will carefully re-evaluate the response, and ensure that you are providing all the necessary and required details. An incomplete incident report is a worrying sign for the continued trust in the CA, and so it is essential that a proper incident report, with all of the required information, be provided.

Flags: needinfo?(bruce.morton)

(In reply to Ryan Sleevi from comment #5)

Unfortunately, Comment #3 is actually quite counter to the expectations set forth in https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation , as was Comment #1, and this is quite concerning.

For example, within that same section, the following expectations are clearly and unambiguously laid out, which Entrust has failed to adhere to in this incident:

  • The rationale must include an explanation for why the situation is exceptional.

When we discovered the issue, we had it reviewed with our security team to determine the risk to the customers. Security agreed that the Mozilla requirement is applicable and provided a statement from a NIST document "A hash function that provides a lower security strength than the security strength associated with the bit length of n ordinarily should not be used, since this would reduce the security strength of the digital signature process to a level no greater than that provided by the hash function." However, they also stated that SHA-256 would not present a security issue in the remaining certificate validity periods. As such, we considered this to be an exceptional circumstance where we believed that it would be better for the Subscribers and Relying Parties if we did not revoke without providing notification and time to the Subscribers to update the certificates, as we considered that there was no immediate security issue.

We understand that you may not consider this to be exceptional, but that was our thought when the decision was made.

  • When revocation is delayed at the request of specific Subscribers, the rationale must be provided on a per-Subscriber basis.

As stated above, the delay was not based on per Subscriber basis.

However, in reviewing the issue with Subscribers, we have had pushback explaining why they need revocation to be delayed: third parties must be coordinated to change certificates, strict change management systems slow change, staff shortages due to COVID-19, time needed to update critical systems, a single certificate distributed on many systems, and blackout periods. These may be excuses, but they do reflect that Subscribers in enterprise environments have complex systems and processes and do need time to properly execute.

We do have one customer which is migrating its system to a cloud-based architecture. They currently run their system for a third party which has a strict/slow change management process. Since the current system will not be used after September 2020, they have asked us to delay revocation until the end of September. If we accept this request, we will submit an Incident Report.

  • Any decision to not comply with the timeline specified in the Baseline Requirements must also be accompanied by a clear timeline describing if and when the problematic certificates will be revoked or expire naturally, and supported by the rationale to delay revocation.

  • 26 June 2020 - decision to revoke certificates
  • 29 June 2020 - notify all Subscribers with a request to revoke by 7 July 2020
  • 29 June to 7 July 2020 - contact all Subscribers to support the reason for the issue and discuss delays
  • 27 July 2020 - revoke all certificates that had not yet been revoked

  • You will perform an analysis to determine the factors that prevented timely revocation of the certificates, and include a set of remediation actions in the final incident report that aim to prevent future revocation delays.

  • We will not make the decision not to revoke.
  • We will plan to revoke within the 24 hours or 5 days as applicable for the incident.
  • We will provide notice to our customers of our obligations to revoke and recommend action within 24 hours or 5 days based on the BR requirements.
  • We will recommend to our customers to implement automation of certificate management.
  • We will increase our ability for correct implementation and testing to ensure that certificate profiles will meet the latest CA/Browser Forum or root program requirements.
  • We will monitor the Mozilla incidents and the discussion list to discover problems which other CAs have experienced and how they were resolved. This will allow us to review and react if required to our own implementation. This will also help to minimize the number of mis-issued certificates, which will reduce the risk of late revocation.
  • We will manage and update our pre-issuance and post-issuance linting to discover or prevent the problem early.
Flags: needinfo?(bruce.morton)

(In reply to Bruce Morton from comment #6)

When we discovered the issue, we had it reviewed with our security team to determine the risk to the customers. Security agreed that the Mozilla requirement is applicable and provided a statement from a NIST document "A hash function that provides a lower security strength than the security strength associated with the bit length of n ordinarily should not be used, since this would reduce the security strength of the digital signature process to a level no greater than that provided by the hash function." However, they also stated that SHA-256 would not present a security issue in the remaining certificate validity periods. As such, we considered this to be an exceptional circumstance where we believed that it would be better for the Subscribers and Relying Parties if we did not revoke without providing notification and time to the Subscribers to update the certificates, as we considered that there was no immediate security issue.

We understand that you may not consider this to be exceptional, but that was our thought when the decision was made.

Correct, I do not consider this to be exceptional, and in fact consider it contrary to the intent of this section. I want to make sure Entrust is aware of this, because it would be quite concerning if such a consideration is used again, and it is deeply concerning that this was Entrust's rationale at the time.

This statement is, in effect, "We considered it exceptional because we do not deem this non-compliant certificate to be a security risk". The latter half of that statement is explicitly addressed at https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation

However, the overall response, highlighted above, seems to ignore an entire paragraph within https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation : the discussion of "exceptional" there concerns circumstances in which revocation would cause substantial, demonstrable harm. To date, Entrust has not provided any such explanation or evidence, despite repeated inquiries.

Without the necessary, expected, and required data, it's appropriate to treat this incident as not complying with https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation and instead as an intentional and flagrant violation of the Baseline Requirements. It would be ideal to avoid such a serious incident. Given the concerns elsewhere about Entrust understanding what's expected of them, such as demonstrated in Bug 1648472, I'm quite concerned going forward.

  • You will perform an analysis to determine the factors that prevented timely revocation of the certificates, and include a set of remediation actions in the final incident report that aim to prevent future revocation delays.

  • We will not make the decision not to revoke.
  • We will plan to revoke within the 24 hours or 5 days as applicable for the incident.
  • We will provide notice to our customers of our obligations to revoke and recommend action within 24 hours or 5 days based on the BR requirements.
  • We will recommend to our customers to implement automation of certificate management.
  • We will increase our ability for correct implementation and testing to ensure that certificate profiles will meet the latest CA/Browser Forum or root program requirements.
  • We will monitor the Mozilla incidents and the discussion list to discover problems which other CAs have experienced and how they were resolved. This will allow us to review and react if required to our own implementation. This will also help to minimize the number of mis-issued certificates, which will reduce the risk of late revocation.
  • We will manage and update our pre-issuance and post-issuance linting to discover or prevent the problem early.

I think there's another element here, and I'm concerned that it may be lurking and viewed as partially addressed by the above. As discussed above, Entrust demonstrated an impaired understanding of the expectations for revocation, which, as a result, led to an incident. Further, despite this impaired understanding, it appears the plan is still to arrive at the same result.

I don't think we can say the conclusion, which is still not to revoke all of these certificates immediately, is supported by the data Entrust has provided. My hope is that Entrust can provide suitable data to demonstrate why this is acceptable, by understanding the expectations. However, if it is the situation that the data simply doesn't support the delay in revocation, then the expectation is that Entrust will act to revoke these certificates.

"Expectation" here may not mean reality. As noted in https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation , only Entrust is capable of revoking certificates, and may still decide that, despite the expectation, and the lack of data to support their decision, Entrust still feels the most appropriate step is to not revoke. That would be quite concerning and problematic, and seem to be an intentional dismissal both of the Baseline Requirements and root store expectations. So, as much as possible, I want to encourage you to either revoke or provide a suitable, thoughtful analysis about the impact that would be had by revoking, and the steps Entrust is taking to address that impact holistically.

That said, I'm encouraged by Entrust's commitment that they will never, in any circumstance, delay revocation again, for any reason. If that's not what is meant by Comment #6, however, then I think a more careful analysis of the factors that may cause Entrust to delay revocation, and the specific steps being taken to address those, is necessary. However, as the goal should be to get to a point where there are never any delays again, and given Entrust's confusion of the expectations, I wanted to make sure that, as currently stated in this bug, Entrust is committing to never having an exceptional situation again.

Flags: needinfo?(bruce.morton)

Status: To date, 162 unexpired certificates are not revoked.

Flags: needinfo?(bruce.morton)

Ben: I'm not sure if you have questions you'd like to add.

I'm concerned about the "We considered the policy, but we decided it doesn't matter" response, and there's not really more data at this point to support things. This would appear a willful, intentional, and flagrant violation. I realize that some no doubt consider the "SHA-384 vs SHA-256" to be minor, and think a large reaction is unjustified. However, on a substantive level, it's a question about whether the CA will follow Root Program requirements, even if they don't understand or agree with them. This is telling, because we've seen in the past CAs ignore requirements they don't understand or agree with, such as "Don't issue a MITM certificate" (e.g. Trustwave). It also raises concerns whether the CA is willing and is capable of revoking in response to security and compliance incidents.

Flags: needinfo?(bwilson)

(In reply to Ryan Sleevi from comment #9)

Ben: I'm not sure if you have questions you'd like to add.

I'm concerned about the "We considered the policy, but we decided it doesn't matter" response, and there's not really more data at this point to support things. This would appear a willful, intentional, and flagrant violation. I realize that some no doubt consider the "SHA-384 vs SHA-256" to be minor, and think a large reaction is unjustified. However, on a substantive level, it's a question about whether the CA will follow Root Program requirements, even if they don't understand or agree with them. This is telling, because we've seen in the past CAs ignore requirements they don't understand or agree with, such as "Don't issue a MITM certificate" (e.g. Trustwave). It also raises concerns whether the CA is willing and is capable of revoking in response to security and compliance incidents.

I’m hoping to address the concerns.

Entrust has been a CA for over 20 years and has always designed to meet the Root Program Requirements. We have also tried to comply with the transparency requirement of providing Incident Reports where we did not meet the requirement. And when an auditor found an incident, we have followed up with incident reports. In addition, we disclose our incidents to our auditors before audit.

Prior to this incident, we have filed 3 late revocation reports: https://bugzilla.mozilla.org/show_bug.cgi?id=1520876, https://bugzilla.mozilla.org/show_bug.cgi?id=1521520 and https://bugzilla.mozilla.org/show_bug.cgi?id=1636339. In all cases we planned to revoke the certificates, but were late due to process or technical reasons. We have put in practices to close out non-conformance and late revocation issues.

In this case, we considered that we had a large volume of customers and certificates and since the incident was not a security issue, we disclosed that we did not plan to revoke. In retrospect, based on https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation, we should have immediately filed a late revocation report and addressed this requirement, “Any decision to not comply with the timeline specified in the Baseline Requirements must also be accompanied by a clear timeline describing if and when the problematic certificates will be revoked or expire naturally, and supported by the rationale to delay revocation.” We were planning to allow the certificates to expire naturally, but did not cover this correctly in an incident report. We will ensure that incident reports and revocation are properly addressed in our policy, processes and procedures.

Although the Mozilla guidance does not address our specific situation, it does state the following, “such as when the certificate is used in critical infrastructure and cannot be safely replaced prior to the revocation deadline, or when the volume of revocations in a short period of time would result in a large cumulative impact to the web.” We did believe that since we issued our certificates to large enterprises, financial institutions and governments that revoking may cause an issue for critical infrastructure and could have a large cumulative impact to the web. We did not evaluate this issue on a per subscriber basis, but evaluated this on the types and names of the subscribers impacted.

We are capable of revoking a certificate and have no issues with revoking a certificate due to security reasons. We believe that we have responded with revocation for previous incidents and when requirement changes have occurred. We do hope that Mozilla will continue to allow CAs to explore other options than revoking in the BR time of 24 hours or 5 days, when there is no security issue and when we judge that it will impact a critical infrastructure or there will be a large cumulative impact to the web.

Your further guidance would be appreciated.

Status: To date, 35 unexpired certificates are not revoked.

Can you provide some additional information with regard to the 35 that are still unrevoked, e.g. what issues are faced, what industries are they for, and what is the timeline for revoking them?

Flags: needinfo?(bwilson)

I am concerned that, despite Comment #5, Comment #6 fails to substantively address this, as highlighted by Comment #7, and Comment #10 fails to substantively address the concerns. I appreciate the depth of what Entrust has written, but it lacks substance and does not meet any of the requirements set forth here, as previously remarked in Comment #7.

The replies continue to cherry-pick statements while ignoring the surrounding expectations, despite these being repeatedly highlighted. Whether this is ignorance or intentional is largely irrelevant, as both deeply undercut the trustworthiness of Entrust, as called out in Comment #9. For example, I am extremely skeptical that the actions set forth in Comment #6 meaningfully address Entrust’s decision-making issues here, nor do they seem to meaningfully address the set of challenges customers face. At best, they seem to be “do more of the same, but more,” when we know there are systemic flaws in how Entrust understands and operates against the expectations placed on it.

I think this failure is most pronounced by Comment #11 showing that, despite the commitment in Comment #6 to revoke, and overall to improve their incident handling, we still see we have to try to squeeze out information that is expected, as a baseline, in Comment #12.

Flags: needinfo?(bruce.morton)

(In reply to Ben Wilson from comment #12)

Can you provide some additional information with regard to the 35 that are still unrevoked, e.g. what issues are faced, what industries are they for, and what is the timeline for revoking them?

Ben, the current status is 17 unexpired certificates are not revoked. The certificates are all issued to the same enterprise customer, which has a complex configuration management process. The certificates are all issued internally and are not used by external relying parties. The service which they are supporting is being deprecated. The target is to get these final certificates replaced and revoked by 7 August 2020.

Flags: needinfo?(bruce.morton)

Status: The final 17 certificates were revoked on 7 August 2020. All certificates signed with the incorrect hashing algorithm have now expired or have been revoked.
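
As a rough illustration of how late the final revocation was, the dates in this bug can be put through simple date arithmetic. This sketch assumes the 5-day window is counted from the 17 June 2020 discovery date; that starting point is an assumption made for the example, not a determination made anywhere in this bug.

```python
from datetime import date, timedelta

# Dates taken from this bug; the choice of clock start is an assumption.
DISCOVERED = date(2020, 6, 17)        # issue discovered via crt.sh linting
FINAL_REVOCATION = date(2020, 8, 7)   # last 17 certificates revoked
REVOCATION_WINDOW = timedelta(days=5) # the "5 day requirement" cited above

deadline = DISCOVERED + REVOCATION_WINDOW
days_late = (FINAL_REVOCATION - deadline).days

print(f"Assumed revocation deadline: {deadline.isoformat()}")
print(f"Final revocation was {days_late} days after that deadline")
```

Under that assumption the deadline falls on 22 June 2020, and the final revocation came 46 days later.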

Flags: needinfo?(bwilson)

I'll admit at the outset that the standard incident reporting form doesn't provide the best outline for CAs to follow in the case of delayed revocations; however, CAs need to read very carefully about delayed revocation and craft their reports accordingly. See https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation.

I think that before we can close this bug, we will need the following to be further addressed by Entrust:
(1) please provide a deeper objective analysis supporting Entrust's decision to delay revocation--as opposed to general statements of assumptions based on industry sectors (include specifics on a per subscriber basis); and
(2) please provide Entrust's analysis documenting the factors that affect/affected Entrust's timely revocation of the certificates.
Then, based on the foregoing, it may be advisable for Entrust to revisit its response in Comment 6 and restate a set of remediation actions that Entrust will take which are aimed at preventing future revocation delays.

Flags: needinfo?(bwilson)

(In reply to Ben Wilson from comment #16)

I'll admit at the outset that the standard incident reporting form doesn't provide the best outline for CAs to follow in the case of delayed revocations; however, CAs need to read very carefully about delayed revocation and craft their reports accordingly. See https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation.

I think that before we can close this bug, we will need the following to be further addressed by Entrust:
(1) please provide a deeper objective analysis supporting Entrust's decision to delay revocation--as opposed to general statements of assumptions based on industry sectors (include specifics on a per subscriber basis); and

Here is the timeline of the events resulting in the late revocation of the certificates:

• 17 June 2020: Issue discovered using crt.sh linting software.
• 17 June 2020: Reviewed requirements behind crt.sh lint and confirmed the Mozilla Policy section 5.1.2 requires “If the signing key is P-384, the signature MUST use ECDSA with SHA-384” see https://www.mozilla.org/en-US/about/governance/policies/security-group/certs/policy/#512-ecdsa
• 17 June 2020: Investigation indicated that there were problems with two ECC CAs, where one CA already had the problem resolved and the other CA was scheduled to have the problem resolved on 24 June 2020.
• 18 June 2020: Analysis shows 142 customers with 606 certificates impacted.
• 23 June 2020: Security review indicated that although the SHA-256 hash algorithm was used there was no security issue as SHA-256 is the minimum hash algorithm which is currently permitted.
• 24 June 2020: Review of Mozilla incident reports found https://bugzilla.mozilla.org/show_bug.cgi?id=1527423, where revocation was not part of the corrective action plan or discussed in the incident.
• 24 June 2020: Based on the security review and the similarity to another incident report, it was decided that the corrective action would not include revocation.
• 25 June 2020: Posted bug https://bugzilla.mozilla.org/show_bug.cgi?id=1648472, listed the affected certificates, and started review of comments to complete and finalize the bug.
• 26 June 2020: Based on incident review it was decided to revoke all certificates. A plan was created to revoke the certificates after the required deadline, based on the number of customers impacted, the profile of the customers, the number of large sites impacted, and reduced staffing around Canada Day (July 1) and Independence Day (July 4). Customers were requested to revoke by 7 July 2020; an extension to 27 July 2020 was allowed based on a justified request.
• 29 June 2020: All Subscribers were notified to reissue their certificates to allow the non-compliant certificates to be revoked.
• 1-10 July 2020: All requests for extensions were reviewed and rejected or approved. Extensions were based on the following reasons: third parties must be coordinated to change certificates, strict change management system slows change, short staff due to COVID-19, need time to update critical system, single certificate distributed on many systems, and blackout period.
• 7-10 July 2020: All non-responsive customers were contacted to ensure revocation was not done without contact.
• 27 July 2020: All certificates, except those of one customer, were revoked. That customer was discussed in comment 14.
• 7 August 2020: All certificates were confirmed to be revoked or expired.
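
The requirement confirmed on 17 June 2020 can be illustrated with a small check. This is only a minimal sketch, not the crt.sh lint or any Entrust tooling: the function name `lint_ecdsa_signature` and the table `REQUIRED_HASH` are hypothetical, and the P-256 entry is an assumption taken from the same policy section rather than from this report, so it should be verified against the policy text.

```python
from typing import List

# Required signature hash per ECDSA signing-key curve.
# The P-384 entry is the rule quoted in the timeline above; the P-256 entry
# is an assumption based on the same policy section and should be verified.
REQUIRED_HASH = {
    "P-256": "SHA-256",
    "P-384": "SHA-384",
}


def lint_ecdsa_signature(issuer_curve: str, signature_hash: str) -> List[str]:
    """Return a list of findings; an empty list means the pairing is acceptable."""
    required = REQUIRED_HASH.get(issuer_curve)
    if required is None:
        # Curves outside the table are reported rather than silently accepted.
        return [f"unrecognized issuer curve {issuer_curve!r}"]
    if signature_hash != required:
        return [f"{issuer_curve} key must sign with {required}, found {signature_hash}"]
    return []


# The case in this incident: a P-384 CA key signing with SHA-256.
print(lint_ecdsa_signature("P-384", "SHA-256"))
```

A check of this kind only compares the issuing key's curve with the hash named in the signature algorithm; extracting those two values from a real certificate is left out of the sketch.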

(2) please provide Entrust's analysis documenting the factors that affect/affected Entrust's timely revocation of the certificates.
Then, based on the foregoing, it may be advisable for Entrust to revisit its response in Comment 6 and restate a set of remediation actions that Entrust will take which are aimed at preventing future revocation delays.

The first factor is whether the mis-issued certificate poses a high security risk to the Relying Parties. High security risk will be determined by internal security review and/or consultation with our root embedding partners. If there is a high security risk, then it is our intention to revoke the certificates within the timelines specified in the BRs.

If there is no high security risk, then we need to assess the impact to the websites and the Relying Parties using those websites. We do not see a positive outcome to the SSL ecosystem by rejecting users of a site with revoked certificates or having Relying Parties working around revocation indications. The issue of significant harm is addressed in Mozilla’s guidance and has also been an issue in other Incident Reports.

In the future, for each new version of the Mozilla Root Store Policy, other browser or operating system policy, or update to CA/Browser Forum requirements, we will improve our existing system-change compliance analysis to ensure we understand the policy change, the policy change has been articulated to an owner, and the policy change has been tested to confirm implementation.

To remediate certificate revocation issues:
• We will follow the guidance provided by Mozilla, https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation, for mis-issued certificates.
• We will plan to revoke within 24 hours or 5 days, as applicable for the incident.
• We will provide notice to our customers of our obligations to revoke and recommend action within 24 hours or 5 days based on the BR requirements.
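
To make the 5-day window concrete, the dates from the timeline above can be run through a short calculation. This is a rough sketch under stated assumptions: the clock is assumed to start on the discovery date of 17 June 2020, only calendar dates are used because the report gives no times of day, and the helper names `revocation_deadline` and `days_late` are hypothetical.

```python
from datetime import date, timedelta


def revocation_deadline(aware_on: date, window_days: int) -> date:
    """Latest revocation date, assuming the clock starts on the awareness date."""
    return aware_on + timedelta(days=window_days)


def days_late(revoked_on: date, deadline: date) -> int:
    """Whole days past the deadline; 0 if revocation was on time."""
    return max(0, (revoked_on - deadline).days)


aware_on = date(2020, 6, 17)           # issue discovered (from the timeline)
deadline = revocation_deadline(aware_on, 5)
final_revocation = date(2020, 8, 7)    # last certificates revoked (from the timeline)

print(deadline.isoformat(), days_late(final_revocation, deadline))
# prints: 2020-06-22 46
```

Under these assumptions the final revocations fell 46 days after the 5-day deadline; a 24-hour window would be handled the same way with `window_days=1`.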

To mitigate the incorrect implementation of policy, we will follow the root cause analysis of the original incident https://bugzilla.mozilla.org/show_bug.cgi?id=1648472:
• We will improve our implementation and testing processes to ensure that certificate profiles meet the latest CA/Browser Forum and root program requirements.
• We will monitor the Mozilla incidents and the discussion list to discover problems which other CAs have experienced and how they were resolved. This will allow us to review our own implementation and react if required. This will also help to minimize the number of mis-issued certificates, which will reduce the risk of late revocation.
• We will manage and update our pre-issuance and post-issuance linting to discover or prevent the problem early.

Flags: needinfo?(bwilson)

Thanks for this response. I don't have any further questions and will close this bug on or about 9-October-2020 unless anyone else has issues or questions to raise.

Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Flags: needinfo?(bwilson)
Resolution: --- → FIXED
Product: NSS → CA Program
Whiteboard: [ca-compliance] [delayed-revocation-leaf] → [ca-compliance] [leaf-revocation-delay]