Closed Bug 1651828 Opened 4 years ago Closed 4 years ago

DigiCert: Delay of revocation for EV audit inconsistency incident

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: brenda.bernal, Assigned: brenda.bernal, NeedInfo)

Details

(Whiteboard: [ca-compliance] [leaf-revocation-delay] [covid-19])


Incident Report – Mozilla Policy Violation

  1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.

We filed the bug https://bugzilla.mozilla.org/show_bug.cgi?id=1650910 for inconsistent EV audits.

  2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.

06-July-2020: Bug filed for EV inconsistent audits.
07-July-2020: Impacted customers were notified.
07-July through 09-July 2020: Had discussions with customers on operational risk and impact for revocations to be actioned within the 5-day window.
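As a quick check of the dates in this timeline, the following minimal sketch computes the end of the 5-day window. It assumes the window is counted from the 06-July-2020 filing of bug 1650910; the variable names are illustrative only.

```python
from datetime import date, timedelta

# Assumption: the 5-day revocation window starts on the day the
# EV audit inconsistency bug (1650910) was filed.
BUG_FILED = date(2020, 7, 6)
REVOCATION_WINDOW = timedelta(days=5)

deadline = BUG_FILED + REVOCATION_WINDOW

# Print the deadline and its day of the week.
print(deadline.isoformat(), deadline.strftime("%A"))
```

Under that assumption the result is 11 July 2020, a Saturday, which matches the deadline referred to in this report.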

We have gathered the feedback so far from all impacted customers and are filing this bug for a delay in revocation. We wanted to file this early given the reasons for the delay and to get the thoughts of the browsers about the additional compliance issue (of failing to revoke). For a majority of the customers, we are able to execute on the Saturday, July 11th deadline, but the infrastructure and use of certificates by others make the ecosystem impact significant enough that we felt it would be appropriate to take another compliance hit by stating our case for the delay. We have provided the reasons and certificate estimates in 6) below.

  3. Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.

As noted in our prior filed bug (1650910): DigiCert has halted all EV issuance across CAs omitted from an EV audit report. All EV issuance from these are on ICAs that will be covered in an EV audit.

  4. A summary of the problematic certificates. For each problem: number of certs, and the date the first and last certs with that problem were issued.

Based on the details provided, revocation is expected and required within the timeframe defined, as an industry-wide requirement.

We refined our data and our search for impacted certificates and found that the approximate number in the previous bug was incorrect, as it included all revoked and expired certificates. The total number of valid certificates is roughly 35,000. We are revoking most of these by the deadline, but the ones that support critical infrastructure dealing with the pandemic or that fall under banking regulations need more time.

Scheduled to revoke on July 11th: ~26,000 certificates
Scheduled to revoke on July 16th: ~4,000 certificates
Scheduled to revoke on July 30th: ~4,000 certificates
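The schedule above can be tallied against the stated total with a short sketch. All figures are the approximate numbers from this report; the dictionary keys and variable names are illustrative, and the report does not say how the remainder is accounted for.

```python
# Approximate figures quoted in this report (all marked "~" or "roughly").
total_valid = 35_000
scheduled = {
    "2020-07-11": 26_000,  # within the 5-day window
    "2020-07-16": 4_000,
    "2020-07-30": 4_000,
}

scheduled_total = sum(scheduled.values())
remainder = total_valid - scheduled_total
on_time_share = scheduled["2020-07-11"] / total_valid

print(f"scheduled: {scheduled_total}")
print(f"not covered by the three dates: {remainder}")
print(f"share revoked by the deadline: {on_time_share:.1%}")
```

On these rounded numbers the three dates cover about 34,000 certificates, leaving roughly 1,000 unaccounted for, and about three quarters of the total are revoked by the deadline. Because every input is an estimate, the remainder may simply reflect rounding.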

There are several that are asking for 45 days from the initial event. We have explained that these requests violate the industry requirements that we adhere to, but given the extraordinary situation of 2020, we thought we’d like to raise it here. Most of the 45-day requests involve a point-of-sale device, an ATM, or similar device that is using the WebPKI for non-web purposes. We recognize the need to separate non-web from web application and are exploring how to do that better. For now, we have recommended to customers using non-web devices that they replace their certificate with a separate root of trust.

  5. The complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem.

We plan on posting the full list of serial numbers that will not meet the revocation deadline on Saturday. It is unlikely this number will be zero.

  6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

There are several factors that led to us not meeting the revocation deadline of five days. The sheer volume is probably the number one reason. We realize that is not a sufficient reason alone, and we have not made decisions on delayed revocation based solely on the volume of certificates. The primary reasons are based on operational risks and impact as follows:

  1. More time to coordinate with third parties. Some of these are not operated by the requester or are distributed to third parties. Several payment and banking institutions require lead time to do testing, approval, and sign off. This reason is cited in the previous bug related to the underscore issue. Although we have mentioned this is essentially a SEV1 situation, the sign off process prevents a rapid replacement of installed certificates.
  2. COVID and ongoing lockdown restrictions. Accessing the data center and getting staff available at key locations has taken longer than expected simply because people are not available or are unable to travel at this time. A third of the certificates in this group are actively being used to monitor COVID patients and can’t have an outage.
  3. Regulatory bodies. Mobile banking applications, payment systems processing salaries, supplier payments, and online services run by financial institutions require some form of change process controlled by regulatory bodies. We are researching where these regulations may require more than 5 days. If this is true, then it would conflict with the policy and be something to present to the CAB forum as a conflict with local law. More likely the regulations just involve sufficient burden that it makes 5-day replacements impractical rather than impossible. Changing a cert on a mobile browser does require significant effort to have the update pushed to end users.
  4. US Government operations. Several of these certificates are key to the operations of the US government. These require approval. Although they have not stated that COVID is delaying approval, the approval process looks like it is taking longer than the five days.
  5. Pinned applications. A majority of the delayed revocation relates to pinned certificates where they need to figure out how to change the pinning of the cert. These are the ones we’d like to revoke on July 30. They have mentioned the pins don’t expire by then. We are working with them to see what we can do to still revoke within July.
  6. Tax season impact. With tax season deadline this year delayed, several tax organizations are on a freeze until July 15th.

Although we haven’t listed numbers here, we are gathering how many certificates fall into each reason and will provide the serial numbers based on reason category on or before July 11th.

  7. List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.

We acknowledge and accept that the requirement is 5 days, and we are committed to upholding industry-mandated timelines. However, we couldn’t make it happen for the reasons stated above. Although we couldn’t manage to comply with the requirements on this particular issue, we do feel that delaying the revocation in a few cases is justifiable considering the severe impact. We’re hoping for a response from the browsers about the next steps and their thoughts on the delay. We will continue to provide updates and are working to still accelerate revocation and replacement over the next few days. As part of this remediation, we would like to take an active role in separating out non-web PKI from the web and are looking for recommendations on how to better accomplish that task. We are also promoting use of automation as a key part of any certificate solution, including investing more in improving our ACME and other certificate automation tools.

Thank you for your consideration.

Assignee: bwilson → brenda.bernal
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Whiteboard: [ca-compliance] [delayed-revocation-leaf]

(In reply to Brenda Bernal from comment #0)

We have gathered the feedback so far from all impacted customers and are filing this bug for a delay in revocation. We wanted to file this early given the reasons for the delay and to get the thoughts of the browsers about the additional compliance issue (of failing to revoke).

I mean, browsers can't and don't grant exceptions. Any failure to revoke would be a serious compliance issue that would impact trust in a CA going forward, even if it may not lead to immediate distrust. As we've seen from CAs that have a series of compliance issues, continuing to have compliance issues can ultimately result in loss of trust of the CA entirely.

For a majority of the customers, we are able to execute on the Saturday, July 11th deadline, but the infrastructure and use of certificates by others make the ecosystem impact significant enough that we felt it would be appropriate to take another compliance hit by stating our case for the delay.

To be clear: This is not about "taking a hit". This is about demonstrating what makes this situation exceptional, which in this context generally means "unprecedented, new, or novel". This is touched on at https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation with the expectations, and I don't think this incident really rises to that.

As noted in our prior filed bug (1650910): DigiCert has halted all EV issuance across CAs omitted from an EV audit report. All EV issuance from these are on ICAs that will be covered in an EV audit.

I think there's concern about whether or not we can retroactively consider an EV audit having resolved this, so revocation is indeed the right answer. The response for dealing with such audit inconsistencies has, historically, been to revoke the entire intermediate and start fresh, due to the loss of trust and confidence. Revoking only the affected certificates appears to be a less-impactful option, and it's up to DigiCert to demonstrate that would sufficiently mitigate concerns, versus revoking the entire intermediate.

There are several that are asking for 45 days from the initial event. We have explained that these requests violate the industry requirements that we adhere to, but given the extraordinary situation of 2020, we thought we’d like to raise it here. Most of the 45-day requests involve a point-of-sale device, an ATM, or similar device that is using the WebPKI for non-web purposes.

The traditional response for dealing with such "non-web purposes" has been to revoke the intermediate via OneCRL. While Mozilla has seen this as an appropriate remediation, other browsers, including Google, have recognized that it fails to protect the broader ecosystem. While this may offer a viable alternative to revoking only the EV certificates, by revoking the intermediate in Firefox, I suspect that despite customer assertions, they are likely using them in the Web.

This specific use case is, as DigiCert knows, not one unfamiliar.

We recognize the need to separate non-web from web application and are exploring how to do that better. For now, we have recommended to customers using non-web devices that they replace their certificate with a separate root of trust.

This is not a new response for DigiCert, so I fail to see how these circumstances are exceptional. The same response was provided in the past - e.g. Bug 156561 - and DigiCert is no stranger to the concerns around exceptional events, such as those from Bug 1516453, Bug 1517617, Bug 1516599, or Bug 1516545. While DigiCert provided the experience in these past issues, I think the same response would be applicable regardless of the CA, as all CAs use these incident reports to better inform their policies and practices and ensure their Subscribers are apprised of the risks.

  1. More time to coordinate with third parties. Some of these are not operated by the requester or are distributed to third parties. Several payment and banking institutions require lead time to do testing, approval, and sign off. This reason is cited in the previous bug related to the underscore issue. Although we have mentioned this is essentially a SEV1 situation, the sign off process prevents a rapid replacement of installed certificates.

In the past, DigiCert had committed that this would not be an issue going forward. I'd be concerned if the steps DigiCert took a year ago, to move such customers off, as all publicly-trusted CAs were expected to do, were insufficient.

  2. COVID and ongoing lockdown restrictions. Accessing the data center and getting staff available at key locations has taken longer than expected simply because people are not available or are unable to travel at this time. A third of the certificates in this group are actively being used to monitor COVID patients and can’t have an outage.

In the past, DigiCert had committed that datacenter access issues were not going to be an issue going forward, by deploying and promoting automation solutions. It seems these Subscribers made an intentional choice not to adopt these practices to mitigate the risk?

  3. Regulatory bodies. Mobile banking applications, payment systems processing salaries, supplier payments, and online services run by financial institutions require some form of change process controlled by regulatory bodies. We are researching where these regulations may require more than 5 days. If this is true, then it would conflict with the policy and be something to present to the CAB forum as a conflict with local law. More likely the regulations just involve sufficient burden that it makes 5-day replacements impractical rather than impossible. Changing a cert on a mobile browser does require significant effort to have the update pushed to end users.

I think browsers would take a particularly dim view of such an approach. I think a CA that either failed to do their due diligence in ascertaining whether customers posed such risk, or actively courted such customers, would be an existential risk to the continued trust of that CA. The reason is that the same reasoning could be seen as actively promoting the issuance of MITM certificates, which is also explicitly forbidden, by attempting to use "local law" as a justification.

The clauses with respect to 9.16.3 are meant to be exceptional, and disclosed beforehand, which is explicitly noted as a SHALL requirement. By issuing the certificate, the severability clause was not exercised.

  4. US Government operations. Several of these certificates are key to the operations of the US government. These require approval. Although they have not stated that COVID is delaying approval, the approval process looks like it is taking longer than the five days.
  5. Pinned applications. A majority of the delayed revocation relates to pinned certificates where they need to figure out how to change the pinning of the cert. These are the ones we’d like to revoke on July 30. They have mentioned the pins don’t expire by then. We are working with them to see what we can do to still revoke within July.

This is unacceptable. There has been sufficient communication, to CAs, about the risk of pinning. CAs cannot be seen as responsible for their Subscribers misusing that CA's certificates, nor can the community be seen as having to bear that risk.

  6. Tax season impact. With tax season deadline this year delayed, several tax organizations are on a freeze until July 15th.

DigiCert, and the CAs they acquired, such as Symantec, have used this explanation in the past. This would be a serious regression, and if it turned out these customers were previous Symantec customers, this would pose significant risk to the continued trust of DigiCert, who committed to the broader community that the set of issues that ultimately led to the loss of trust in Symantec, some of which were intentional choices by Symantec to prioritize customers over behaving in a trustworthy fashion, would not similarly plague DigiCert.

We’re hoping for a response from the browsers about the next steps and their thoughts on the delay. We will continue to provide updates and are working to still accelerate revocation and replacement over the next few days. As part of this remediation, we would like to take an active role in separating out non-web PKI from the web and are looking for recommendations on how to better accomplish that task. We are also promoting use of automation as a key part of any certificate solution, including investing more in improving our ACME and other certificate automation tools.

DigiCert already committed to this, a year ago. I think it's disheartening to think that there has been no progress. I understand that the relationship between DigiCert and its customers is complex, but it's precisely because these were things DigiCert already committed to in light of past incidents that it should have been able to highlight to current customers that there could be no future incidents without impacting DigiCert's trust.

I realize that customers of DigiCert will be directed to this, and perhaps chime in on why it's unreasonable to have expectations at all, and shouldn't everything be treated bespoke. We have requirements to treat CAs consistently, fairly, and to have an objective standard of trustworthiness. There is nothing inherent to a CA that prevents these obligations from being met: they are design trade-offs a CA knowingly makes. These expectations have been stable for years, and CAs that have struggled to meet the expectation in the past, such as DigiCert, have been given time to design systems to better align. Certificates that cannot be replaced within the time specified, by their very existence, pose a serious threat to the ability to respond. Just like we don't purchase fire insurance once our house is on fire, and expect everything to be OK, we don't wait to implement good practices until the serious security incident is here. The ecosystem has already faced a number of significant security events, from DigiNotar to Heartbleed to Symantec's distrust, and each of these reveals inadequacies in the system that CAs are supposed to be working to correct.

The choice on whether to revoke or not is, ultimately, DigiCert's, and that's called out in https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation . But I think DigiCert is very aware of the serious concerns that have come up, both with their CA and the broader trend of CAs that have attempted to have "the ecosystem" pick up the tab for CAs' and customers' decisions, and we have to move past that. When customers make a decision to use a publicly trusted CA, that CA is expected to explain their policies, which they do, as part of the Subscriber Agreement and CP/CPS. If the customer still uses the product, in a way that it's not intended for nor safe to use, it cannot be seen as "someone else's fault" when things go awry.

The policies that ensure prompt revocation keep all sites safe. They're the same policies that help protect your sites from hackers or abuse, by ensuring that if there were certificates you did not authorize, they are promptly revoked. The reason we expect consistency, regardless of severity, is precisely because the moment "severity" becomes an area for determining policy, we encourage CAs to adopt selectively-lax policies, which increases the risk, for everyone, of more high-severity incidents. This isn't theory, it's the exact system browsers have been trying to fix for the past 15 years, after the well-published "PKI race to the bottom" that ultimately resulted in DigiNotar: a series of "minor" issues that created an opportunity for a major, company-ending issue.

If DigiCert decides not to revoke, the burden rests on demonstrating why this is exceptional. Analyzing past incident reports, from DigiCert and other CAs, to see whether similar issues were encountered or similar commitments were made, is essential. This is because DigiCert will need to demonstrate, per-Subscriber, why the situation is exceptional, and have a comprehensive plan to prevent this underlying issue from happening again. Not ideas, but a timeline with concrete plans. For those customers that need additional time, it'll be important to identify specifically who they are and what steps they're taking, in order to ensure they don't simply shift to a new CA and cause the same compliance issues at a new CA.

I realize COVID-19 is truly exceptional here, and this understandably creates a host of challenges. The only information to determine whether something is reasonable or not is based on what's provided in these incidents, the analysis and comparison to past incidents, and the commitments going forward. I remain deeply concerned on this issue, with the information available, because it feels like either "We don't know how to solve this" or "We aren't willing to solve this and hoping browsers will be the 'bad guy' on this" or, worse, "our customers don't understand how harmful to the security of all users what they're asking for is". I suspect that, if there is any delay, specific commitments from those customers, which can help better improve the security for all users, would be a minimum bar to expect.

Whiteboard: [ca-compliance] [delayed-revocation-leaf] → [ca-compliance] [delayed-revocation-leaf] [covid-19]

We have a definitive plan that we are working towards for revocation. We are explaining why the timing must be extended, given the information collected about the challenges that COVID-19 and the associated operational risk have placed on these organizations. DigiCert remains committed to upholding the standards set forth for us as a CA and is not requesting delays for non-exceptional circumstances. The examples of operational risk, compounded by the timing constraints of the pandemic, are noted below.
For those planned to be revoked and replaced by July 30th, here are some of the key reasons why revocations for about 10% of impacted organizations will happen later than we would like:

927 are from a large global banking entity (subject of certificate serial number 0474137bf2bdb2d9a4f6a101721b5f67) that services ATMs and payment terminals as well as US Federal government-facing applications. These are required to go through a stringent change management process and are impacted by the availability of personnel to perform these changes due to COVID.

1,068 relate to a multinational bank (subject of certificate serial number 09f3088bed65f0a4875d01be76d6499c) that is in lockdown and can’t apply the changes fast enough. The majority of their certificates are used on mobile banking applications.

269 are for a specific case of a national tax services firm (subject of certificate serial number 040B0C1B2601AB81A03AEEFE853F1F78) working with COVID-impacted organizations, especially those that are delayed in completing their government-required paperwork. Those organizations are the primary users of these apps, which are in a freeze in order to serve this category of customers, many of which were focused on life-saving activities.

136 are related to a Federal Financial Institution (subject of certificate serial number 0E389CADFDFA0866D405FB74614B51BF) that is also impacted by the need to work with third-party institutions; revocation would have a severe impact during the tax season deadline.

182 are for certificates (subject of certificate serial number 010F5E5FE3E12685FAFB44476AE525A9) impacting almost half a million mobile banking customers that need access to do banking online during shelter-in-place/lockdown.

89 certificates are from a global banking entity (subject of certificate serial number 0DCE177ABC35A92E36F7676ECADA2F13) that has mobile banking applications, payment systems processing salaries, supplier payments, online services for clients to manage their finances etc. and will impact many clients in US, Europe and Asia Pacific. They are aggressively working to replace the browser facing applications in the coming weeks but require more time for a small number of systems (3 certificates) that communicate peer-to-peer to propagate the change. Their operation is heavily impacted by COVID-19 restrictions for onsite personnel who belong to specific risk groups to make this change in the short period of time required. Revoking the certificate will have material impact on the banking eco-system and the bank’s role to support the various governments in comprehensive package of measures including their ability to service loans (SBA and other equivalent COVID-19 Bridging Loans) for businesses in the US and Europe and bring hardship to thousands of people in the form of missed payments and inability to make short term loans during the ongoing COVID-19 pandemic.

482 certificates are for various customers that have cited delays due to COVID restrictions in getting to data centers to facilitate the replacement. Roughly a quarter of these are for a customer whose systems monitor COVID patients who are quarantined at home (subject of certificate serial number 0147512CD4014943BE99B99138A88AF0).

We have another customer that is requiring about 45 days to complete the replacement and revocation given they are handling POS terminals that cannot be cycled out in rapid fashion given their operational risk and impact for supporting customers during COVID.
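For reference, the per-Subscriber counts listed above can be summed with a short sketch. The counts are taken from this comment; the labels are shorthand introduced here for illustration, and the ~35,000 total is the estimate from the original report. The 45-day customer is left out because no count was given for it.

```python
# Certificate counts per Subscriber group, as listed in this comment.
# The keys are shorthand labels, not names used by DigiCert.
delayed_counts = {
    "global bank (ATMs, payment terminals)": 927,
    "multinational bank (mobile banking)": 1_068,
    "national tax services firm": 269,
    "federal financial institution": 136,
    "mobile banking (half a million customers)": 182,
    "global banking entity (peer-to-peer systems)": 89,
    "various (data center access)": 482,
}

# Estimate of valid impacted certificates from the original report.
total_valid = 35_000

listed_total = sum(delayed_counts.values())
share_of_total = listed_total / total_valid

print(f"listed delayed certificates: {listed_total}")
print(f"share of all valid impacted certificates: {share_of_total:.1%}")
```

The listed groups add up to 3,153 certificates, about 9% of the estimated total. Since these are described as only some of the key cases, the sum is a lower bound on the delayed set rather than a complete count.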

We hope the examples provided demonstrate, through these Subscribers’ cases, why the situation is exceptional and why it merits giving them more than the 5 days.

We recognize that browsers do not grant exceptions, and this is entirely up to the CA based on a risk analysis given exceptional circumstances. These circumstances are what we want to highlight in our response and notice of delay. Given the once-in-a-century situation we have with this pandemic, more people are transacting business online (whether it is healthcare or financial-related). The restrictions and lockdowns that have been imposed continue to severely impact the maintenance windows for changes for many of our customers.

We also recognize that under Mozilla policy, exceptional circumstances may be weighed in the balance of revoking certificates. We’ve established in the bug 1650910 Incident Report that the issue raised, the inconsistent EV audits, was unprecedented. We sought to remedy the situation by following through on revoking the EV certificates issued under the impacted ICAs. However, given the novel situation of the EV audit gap in the ICA listings in our audits (the ICAs were covered under the WebTrust for CA and BR audits), we have to balance the true risk of the underlying issue against the significant harm caused by revoking a subset of the certificates and the risk to people who would have to travel to sites in lockdown to carry out these tasks.

For clarity, we are not planning to retroactively rely on an EV audit to resolve this. We are revoking the certificates, but we are posting this delay notice given the circumstances, in which revocation would cause significant harm to critical infrastructure whose certificates cannot be replaced safely by the 5-day deadline. We agree with you that adding to OneCRL is not the best path forward, which is why we are pursuing the revocation of the EV certificates as the course of action.

We have made good progress on automation over the last year and are continuing to build on it. We have ACME and other automation solutions and are working on simplifying the process of installation of the clients. One area we are focusing on is figuring out how to get enterprises to use the tools built in providing certificates. We welcome any suggestions about how customers can be encouraged to adopt automation faster.

We have made good progress on automation over the last year and are continuing to build on it. We have ACME and other automation solutions and are working on simplifying the process of installation of the clients. One area we are focusing on is figuring out how to get enterprises to use the tools built in providing certificates. We welcome any suggestions about how customers can be encouraged to adopt automation faster.

I want to focus on this a little. Had DigiCert implemented the solutions it'd previously committed to, then it seems reasonable that the once-in-a-century nature of the global pandemic would be irrelevant, for the same reason that, say, Covid-19 hasn't brought down global DNS. If we look at systemic root causes, we see that there isn't really novelty or previously unknown situations here; the situations provided in Comment #2 are explanations that DigiCert and other CAs have provided in the past, and even CAs like Symantec have used.

My concern is that these incident reports are an opportunity to learn and improve, and I worry that, so far, this issue doesn't provide any substantively new details that we can learn from, nor does it even propose any actionable or concrete improvements. At best, they're described in the abstract, and at worst, it can be read as "We plan to keep doing the same as we've been doing", which implicitly means "we plan to keep having these incidents until it's no longer commercially viable".

I appreciate the added detail that helps share why you believe the risk in revocation is significant. What I don't see is anything concrete in how to reduce that risk going forward, nor any discussion about why that risk wasn't already mitigated, as had previously been committed to. The purpose of past incident reports with DigiCert has been to systemically drive down this risk, but with incidents such as this, it's objectively hard to determine whether anything has been or can be learned.

Ultimately, when incident reports no longer provide new information or new understanding, they begin to appear as if the CA is saying "There's nothing more we can do to meet expectations", and that's when questions about trust necessarily have to come in. As much as possible, I'd like to avoid that.

I think a good example would be if, say, a DigiCert customer were to obtain a certificate, and then install it at a nuclear reactor as the basis for some management tool, or, say, install it on some medical screening device. If they do that, they're being unconscionably negligent and reckless, putting their users at risk, presumably on the basis of some convenience to them. The same as if I were to go install a bunch of nails in the tires of my car because they sparkle more and I like sparkles. That doesn't mean it's the tire manufacturer's fault for not supporting sparkly nails, and it's certainly no basis to go yell and blame the tire manufacturer for not giving me replacements/not selling me nail-proof tires because, like an idiot, I stuck some nails in my tire and didn't think about the consequences.

In such an incident report, it'd be reasonable to ask if DigiCert was telling folks to put nails in their tires. If they weren't doing that, did DigiCert at least let their customers know that putting nails in their tires is a bad idea? Did they offer sparkly tires, and the customer decide not to get them, because they thought tires with nails would look better? If, as a good faith gesture, the manufacturer offers to replace those tires, and then they go stick a bunch of nails in the new ones and demand replacements again, isn't it reasonable to tell them "wat"?

The analogy is tortured, admittedly, and I think we can stick a nail in it, because it's done. But I think there's a reasonably large concern that, to date, the details on this largely amount to "it's hard", and the plan forward is a bit "lol idk". I can understand that there's a partnership here, but I don't think the solution should have to come from the browsers. You made a promise to the community you would design your systems to do one thing, that any customers you accepted would understand that you would do that one thing, and now when you're asked to do the one thing, it's exceptional and unforeseen.

Now that I've said mostly nice things, I want to be specific: publicly trusted certificates being used on ATMs and payment terminals is gross incompetence and negligence. I have zero sympathy for this, because since 2015, the consistent, loud, resounding message has been "STOP DOING THAT". I have zero sympathy for a CA that says "We didn't know they were doing that, oopsies", because DigiCert knew the expectations here, especially in light of the Symantec acquisition, in which DigiCert knew this would be unacceptable going forward. This goes doubly when at least one of these customers is a repeat offender. It should be simply unacceptable for this to keep happening.

I am beyond frustrated and disappointed by DigiCert's response, particularly Comment #2, because I fail to see how anything has changed from Symantec's responses, which ultimately contributed to them being distrusted. This is the opposite of a desirable pattern, and will eventually have to lead to the same result if it continues. I would be deeply troubled if DigiCert failed to revoke, based on the information presently shared, not because I'm not incredibly sympathetic or understanding to the truly once-in-a-lifetime (hopefully) global pandemic, but because DigiCert has put zero credible plan forward to address either these customers, or their broader customer base, going forward. And that's on DigiCert.

Here is what I just posted in Bug 1650910:

Mozilla's position is not just stated in the Root Store Policy, but also in https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation - "the question about when this should be done, particularly if it's not possible to contact the customer immediately, or if they are unable to replace their certificate quickly..." And "Mozilla recognizes that in some exceptional circumstances, revoking the affected certificates within the prescribed deadline may cause significant harm, such as when the certificate is used in critical infrastructure and cannot be safely replaced prior to the revocation deadline, or when the volume of revocations in a short period of time would result in a large cumulative impact to the web." "The decision and rationale for delaying revocation will be disclosed to Mozilla in the form of a preliminary incident report immediately; preferably before the BR-mandated revocation deadline. The rationale must include an explanation for why the situation is exceptional. Responses similar to “we do not deem this non-compliant certificate to be a security risk” are not acceptable. When revocation is delayed at the request of specific Subscribers, the rationale must be provided on a per-Subscriber basis. Any decision to not comply with the timeline specified in the Baseline Requirements must also be accompanied by a clear timeline describing if and when the problematic certificates will be revoked or expire naturally, and supported by the rationale to delay revocation."

So, I think we recognize that this is a situation where all certificates cannot be revoked within the BR timeframe. We would like DigiCert to present a well-explained plan in [this] delayed revocation bug. We would much prefer a well-thought-out plan over something that is rushed through.

Thank you, Ryan and Ben, for your comments. As always, we do appreciate the feedback and insights provided. I think, in general, we are on the same page, despite the stance in comment #2. I think we could have framed that better. The short answer is we agree with many of your comments about what qualifies as an exceptional case and the issues we are seeing. I thought I’d share some thoughts on the list along with some specific initial commitments we are making to identify these problems well before revocation looms. Going through the list,

We do agree that you can’t fix the audit issue and that revocation is the only option. Brenda’s mention of the audit timing was to express when the next audit occurs so we can verify that we have implemented a policy that all ICAs capable of issuing TLS are EV audited.

We do agree that pinning shouldn’t be used and that we need to make a better effort to discourage the practice. Although it was supported (and even encouraged) in 2015, that was five years ago. The last four years have seen active discouragement of the practice. I think we can do better as a CA at identifying where keys are pinned and preventing it. I shared a commitment on it below.

We do agree that certificates for the webPKI and non-webPKI shouldn’t be mixed. Anyone who uses a TLS cert trusted by the browsers must abide by the browsers’ policy requirements. There’s no exception for those who choose to use one for non-web purposes, including financial devices, non-web devices, and apps. This is already in our subscriber agreement. We already offer alternatives to browser trust and encourage people to use those for non-browser applications. Failing to segment means those customers and DigiCert assume all risks associated with the revocation periods required by the baseline requirements.

We do agree that automation is the solution. We have offered it for a while, including ACME, and are continuing to develop it. We haven’t seen much adoption. This isn’t an excuse – only information from what we’ve seen. Tangentially, we’ve been researching why there isn’t broader adoption and the biggest reasons from the surveyed users are: 1) Do not believe the client is secure, 2) Cannot open ports, 3) Do not believe automation enhances security. Without commenting on the legitimacy of these (since we don’t think these are blockers), we are trying to address at least the third concern using a solution that is essentially a proxy for ACME.

We agree that business operational efficiencies are not an acceptable reason to delay revocation, including such things as lengthy approval processes and third-party sign offs. These are self-inflicted wounds, and the entire industry shouldn’t be barred from security improvements by any one company’s internal corporate policy.

Regulatory requirements. The primary ones Brenda referred to are new ones related to COVID, usually requiring a mandatory shut down in place. Another one we’ve heard but cannot confirm is that the various embargos in place (again related to COVID) are restricting access to international servers. We know there are restrictions that are highly impacting the normal course of business. We are gathering data on this and will provide it as part of the delayed revocation report due to COVID.

US government entities are simply customers and, like any customer ordering certs trusted by the browsers, they are required to abide by the baseline requirements. Having a government customer can’t be an exceptional case or else the requirements wouldn’t apply to any of the qualified CAs in the EU. We do have a separate hierarchy for US government entities that is not publicly trusted, and we moved many to that chain. Any of the remaining uses probably are for government websites which are standard TLS certificates (and have the standard requirements).

Tax seasons, black-out periods, etc. are only an example of self-imposed operational hurdles. I can’t think of too many reasons that a cert roll-over couldn’t be accomplished except if the entity has a purely policy reason for restricting action.

In summary, of the original post and things we were hearing, we do agree that none of the above are exceptional circumstances and that they do not warrant a delay in revocation. The only circumstance I believe is exceptional and should delay immediate revocation (the delay ending no later than July 30) is when COVID is preventing a speedier replacement process.

As a result of the analysis, revocation of all certificates not impacted by COVID remains on the original timeline. I will post the certs impacted by COVID and the specific reason that COVID is causing issues after this post.

One point:

“I realize that customers of DigiCert will be directed to this, and perhaps chime in on why it's unreasonable to have expectations at all, and shouldn't everything be treated bespoke.“

I can’t tell what this was supposed to mean, but I thought I’d address our policy on escalations. Although I do discourage customers from reaching out directly to Google (since this is a DigiCert issue), I do encourage them to post to the Mozilla forum or on this bug. I like people posting here because it provides transparency and gets the community involved. I think open dialogue, transparency, and discussion are all good things, and I think there needs to be more subscriber and relying party participation across the board, including here and the CAB Forum. All incidents seem like good opportunities to encourage direct communication in an open forum about issues instead of trying to do things behind closed doors.

That said, we acknowledge there is an issue with timing and us, as a CA, needing to do more to prevent these exceptional circumstances from occurring. For example, although we may not know the exact reason a customer wants to use a certificate, we can take actions during the cert issuance and lifetime to discourage non-browser use and to mitigate potential third-party policy impacts.

A few immediate actions come to mind:

Starting next week, we will begin rotating intermediates on a six-month rolling basis. Although this will not solve for already installed certificates, I think this should alert us where pinning is happening. We will use that opportunity to identify and remove pinned customers. We can then roll the ICA again to ensure it’s done.

For automation, we can identify which certificates were issued using automation compared to those that aren’t. The easiest way to make this more public is to include an extension in the certificate that identifies where automation is being used. However, I know the CAB Forum discussions around default deny may interfere with this approach. CAB Forum policy currently allows extensions provided the CA understands the extensions. We could use this section to implement the improvement unilaterally, but I’d like to understand the impact better with what is going on in the CAB Forum before committing resources to achieving this with a specific timeline.

Although this is not a technical control, within five days, we will update our subscriber agreement to make it clear that pinning and non-browser use cases aren’t allowed. I realize this hasn’t worked with the five-day revocation requirement, but it’s part of a larger effort to drive solutions on stopping pinning.

Looking at the root cause of this and the similar issues, we need better automation around audits and CCADB. Obviously our past, manual controls simply aren’t sufficient. I know Mozilla has API access on its roadmap so longer term we can automate the checks and CCADB upload. Shorter term, within four weeks, we will open our internal database of CA certificates to public review and start building tools that publicly replicate some of the internal functionality of the ALV reports and match the ICAs to the types of certs being issued. I realize this sounds like the intermediate disclosures on crt.sh, but the recent changes left it somewhat inaccessible with many false positive results. This will give better transparency around the key ceremonies at DigiCert and what is issuing from each intermediate. This will also allow the community to see some of the efforts to shift non-web PKI customers to non-browser roots.
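To illustrate why the first action above (rotating intermediates) would surface pinning, here is a minimal sketch. It assumes a client that pins a base64 SHA-256 hash of an intermediate's public key, and that each rotation uses a new key pair. The byte strings are placeholders standing in for real DER-encoded SubjectPublicKeyInfo values, and the function names are hypothetical rather than part of any DigiCert tooling.

```python
import base64
import hashlib
from typing import Iterable, Set


def spki_pin(spki_der: bytes) -> str:
    """Return a pin value: base64 of SHA-256 over the public key bytes."""
    return base64.b64encode(hashlib.sha256(spki_der).digest()).decode("ascii")


def chain_matches_pins(chain_spkis: Iterable[bytes], pinned: Set[str]) -> bool:
    """A pinning client accepts the chain only if some key in it is pinned."""
    return any(spki_pin(spki) in pinned for spki in chain_spkis)


# Placeholder bytes (assumption): real code would use DER-encoded SPKI data.
ROOT = b"root-spki"
OLD_ICA = b"old-intermediate-spki"
NEW_ICA = b"new-intermediate-spki"
OLD_LEAF = b"leaf-issued-by-old-ica"
NEW_LEAF = b"leaf-issued-by-new-ica"

# A client that pinned the old intermediate.
client_pins = {spki_pin(OLD_ICA)}

accepted_before = chain_matches_pins([OLD_LEAF, OLD_ICA, ROOT], client_pins)
accepted_after = chain_matches_pins([NEW_LEAF, NEW_ICA, ROOT], client_pins)

print("before rotation:", accepted_before)
print("after rotation:", accepted_after)
```

Under these assumptions the pinned client fails as soon as its replacement certificate chains to the new intermediate, which is the signal the rotation is meant to produce. A client that pinned the root key instead would keep working, so rotation would not reveal it.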

We are still reviewing other short-term solutions and will post them after giving them additional thought and research. We will continue to follow the Mozilla policy on COVID-related delays and post the list of revoked certificates, along with specifics for each customer about the impact of COVID on replacement and their plan, whether immediate or near term, to de-risk their PKI so that revocations can be completed within the BR timeline requirements.

Thanks again,

Jeremy

Jeremy, thank you very much for this update.

Since we’ve received some inquiries directly from DigiCert customers, we wanted to take the opportunity to clarify some points:

Google is not taking any action that would result in site breakage today, this week, or even this month. When determining our response to Web PKI incidents, Google Chrome adheres to industry baseline requirements, which are developed jointly with other browsers and with the feedback and participation of CAs.

Google’s response to incidents is measured, balancing security and reliability, and made with consideration of the global context at hand. Ultimately, our goal is to protect users and improve security on the Internet. Google’s goal is not to break sites or critical infrastructure.

DigiCert, as a CA, is in full control of how they act on the same industry baseline requirements (i.e. revoking or not). The deadline for DigiCert to act (revoke or not) was 12PM PDT today; this too is specified by the industry baseline requirements.

When a CA chooses to not revoke as a result of non-compliance with the industry baseline requirements and browser root programs, separate from the original non-compliance issue itself, Google Chrome and Mozilla Firefox require that they follow up with a public incident report. The goal of this public incident report is to bring transparency to any issues and postmortem impact, resolution, and future mitigations. This process helps establish the global context, and to collaboratively identify solutions that can ensure consistency going forward, while minimizing impact to sites or critical infrastructure.

Quick point of clarification:
The deadline is today, not necessarily noon. The noon time frame came from a customer communication that was posted to Bugzilla. We've pushed the time back, but the revocation is still today.

Ryan,
I am just a newbie user so I hope you can enlighten me on some points.

You state "a DigiCert customer were to obtain a certificate, and then install it at a nuclear reactor as the basis for some management tool".
Are CAs expected to know and keep track of how their customers use every single certificate? Can you point me to the CAB policy on that?
If the answer is yes, how? When applying for a certificate, do we need to indicate what platform we will use it on and for what purpose? Or are CAs expected to track via CT logs every certificate used and determine platform and use via some algorithm and then either alert the user or revoke the certificate in the event it is installed on, say, Windows XP?

I sort of understand why CAB policies feel that they are protecting the "Internet security ecosystem" but do Mozilla and Google require the same level of user interaction for their browsers? Using your same analogy, what if a user were using Chrome 80, which is vulnerable to CVE-2020-6418, and has Chrome as the basis for some nuclear reactor or in some international banking app or some hospital Covid-19 app and can't update to Chrome 83 within 5 days? Would we expect some CAB type forum to come out and say "If Google can't manage how its browser is used by its users, or can't force all its users to upgrade Chrome within 5 days, then we need to revoke Chrome from all systems"? Or do we feel that browsers are any less a part of the overall "Internet security ecosystem" and don't need to be held to as high a standard as an SSL certificate?

I have eaten, breathed and slept Internet security for close to the past 35 years. I am far from someone who couldn't care less about Internet security. I believe that if a CA has a severe security breach, 5 days is an eternity and should probably be 48 hours, even on weekends and during Covid-19. But when an audit fails or there is some minor issue (open for discussion as to the definition of what is minor and what is major), and 50,000 or 500,000 certificates need to be revoked (not 100 - which should always be 5 days), then 90% should be revoked within 5 days and another 5% within 10 days and 3% in 15 days and the remaining 2% in 20 days. Because there will always be some nuclear reactor or some hospital apps among 50,000 certificates that need a bit more time than 5 days.

Regards,
Hank

Note: the views expressed above are my own and do not necessarily reflect the views of my employer
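For readers who want the arithmetic behind the staged proposal in the comment above, here is a minimal sketch. The percentages and day counts are taken from that comment; the function name and the handling of rounding remainders are assumptions made for illustration. This models the commenter's suggestion only, not any requirement in the Baseline Requirements.

```python
from typing import List, Tuple

# (deadline in days, percent of affected certificates) from the comment above.
PHASES: List[Tuple[int, int]] = [(5, 90), (10, 5), (15, 3), (20, 2)]


def phased_schedule(total: int, phases: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Split `total` certificates across phases by percentage.

    Any remainder from integer division is assigned to the last phase
    (an assumption; the comment does not say how to round).
    """
    if sum(pct for _, pct in phases) != 100:
        raise ValueError("phase percentages must sum to 100")
    counts = [total * pct // 100 for _, pct in phases]
    counts[-1] += total - sum(counts)
    return [(day, count) for (day, _), count in zip(phases, counts)]


for day, count in phased_schedule(50_000, PHASES):
    print(f"within {day} days: {count} certificates")
```

For 50,000 certificates this gives 45,000 within 5 days, then 2,500, 1,500 and 1,000 in the later windows.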

Shorter term, within four weeks, we will open our internal database of CA certificates to public review and start building tools that replicate publicly some of the internal functionality of the ALV reports and matches the ICAs to the types of certs being issued. I realize this sounds like the intermediate disclosures on crt.sh, but the recent changes left it somewhat inaccessible with many false positive results.

Jeremy, I'm not aware of any bugs with https://crt.sh/mozilla-disclosures that would cause it to display "many false positive results". If you believe that it's not accurately representing the disclosure requirements of the Mozilla Root Store Policy, then please do discuss this with me in more detail, either publicly or privately. (So as not to distract the focus of this bug, feel free to start a thread on https://crt.sh/forum, or open an issue at https://github.com/crtsh/certwatch_db/issues/new, or send me an email). Thanks.

I am curious about the separation of web and non-web PKI hierarchies. Is this an EV thing or do the Browsers apply that policy (recommendation?) across all cert types?

Cheers Leif

(In reply to leifj from comment #10)

I am curious about the separation of web and non-web PKI hierarchies. Is this an EV thing or do the Browsers apply that policy (recommendation?) across all cert types?

Cheers Leif

I'm asking because the letsencrypt FAQ seems to contradict what Ryan was saying about mixing web and non-web PKI. From https://letsencrypt.org/docs/faq/:

"Does Let’s Encrypt issue certificates for anything other than SSL/TLS for websites?

Let’s Encrypt certificates are standard Domain Validation certificates, so you can use them for any server that uses a domain name, like web servers, mail servers, FTP servers, and many more."

So again I'm wondering what I'm missing.

(In reply to Jeremy Rowley from comment #5)

Although this is not a technical control, within five days, we will update our subscriber agreement to make it clear that pinning and non-browser use cases aren’t allowed. I realize this hasn’t worked with the five-day revocation requirement, but it’s part of a larger effort to drive solutions on stopping pinning.

You probably want to consider this very carefully, because there are potentially bad implications depending on what exact action is taken. In particular,

  1. Forbidding non-browser use cases is probably not what you want. There is a large class of applications which use http/https over the public internet in ways which we probably would consider acceptable. Desktop or mobile apps which are using the platform's native TLS stack talking rest/jsonrpc/xmlrpc/soap/... to a web stack on a provider's website is one obvious example. Another is git clone https://github.com/a/b . In fact, GitHub is currently using a DigiCert cert (SN 0557C80B282683A17B0A114493296B79) for that.

  2. Forbidding non-https use cases is also probably not what you want. For example, using a WebPKI certificate for an IMAP server is accepted. For better or worse, non-browser not-provider-specific clients tend to use the same cert store, and that seems unlikely to change soon. A quick censys query shows approximately 43k certs issued by one DigiCert intermediate I arbitrarily chose being used for SMTP/IMAP STARTTLS.

  3. Actually forbidding (versus just discouraging) anything along those lines risks leading to a giant mess. If a subscriber uses a cert for an IMAP server, or sticks a git repo on their website, does that mean they're violating the subscriber agreement, or misusing a certificate? If so, BR 4.9.1.1 says the CA MUST revoke if the "CA obtains evidence that the Certificate was misused" or "a Subscriber has violated one or more of its material obligations under the Subscriber Agreement or Terms of Use", and I imagine nobody particularly wants to deal with 43k problem reports (and that's just for SMTP and IMAP!)

I appreciate the desire for fast action and the desire to get things like ATMs out of the WebPKI. However, it's probably better to make any policy changes along these lines slowly (i.e. over longer than 5 days) to consider the wording and implications of those changes before they become effective.

Please find attached the list of certificates that we promised to post as we’ve indicated in 5) of the above incident report. This includes the full serial numbers, date when the certificate will be revoked and the reason for the delay, primarily due to COVID impact.

Also, please find attached to this bug a file of certificates that have already been revoked or have expired as of Monday, July 13th and that were not already in our revoked list, posted in Bug https://bugzilla.mozilla.org/show_bug.cgi?id=1650910.

We will provide a weekly update on the status of revocation up through July 30th, which is the final date of all certificates to be revoked for this incident.

Attached file Revoked as of July 13

I think there's some confusion here with regards to "forbidding" and expectations.

It's absolutely expected that, for any Subscriber that obtains a certificate from a CA used by the Web PKI, they are informed that the CA can and will revoke within the timelines defined by the Web PKI. There are no exceptions to this process, whether based on hardship or complexity or alignment of the planets or emotional state of mind at the time of revocation or any other countless explanations that can be given. The CA has an obligation to the Subscriber to inform them of this, and the Subscriber, by accepting the legally-binding Subscriber agreement, acknowledges that they understand and accept this.

If a Subscriber does not wish to accept these terms, countless other solutions exist, such as privately managed PKIs or federated PKIs within constrained environments, and so forth.

Updating a Subscriber agreement to specifically call out these cases is a positive and useful development, and one that should be encouraged. Several other CAs have done so, in light of similar incidents and delays, and it's even been proposed as an addition to the Baseline Requirements, to help better raise attention of the expectations/needs. The concerns raised in this issue apply to all certificates issued from browser/OS trusted hierarchies: they adhere to the expectations set by those hierarchies, and any divergence from those expectations is not itself a justification or rationale for exception.

Separately from this, when I talk about "no exceptions" above, I do mean it: the goal is to move us closer to the point where the expectations are consistently adhered to. If DigiCert, recognizing other efforts have failed, wants to explicitly prohibit their certificates from being used in that use case, that's fully within DigiCert's purview. If it turns out that it does reduce the set of "exceptional" circumstances, then it's not unreasonable to think this would be a best practice for all CAs to adopt, and potentially become required. Other cases of machine-to-machine/server-to-server authentication can evolve their own trust frameworks suitable for their specific use cases. For use cases like ATMs, POS, and Payment terminals, you can see progress is already being made by ASC X9, in which DigiCert is involved. This is a positive development.

To respond to specific comments:
Comment #10 asks:

Is this an EV thing or do the Browsers apply that policy (recommendation?) across all cert types?

This recommendation applies across all certificate types (i.e. it applies to DV as well). Specifically, using such a certificate, including Let's Encrypt certificates, will be subject to the same set of common rules across browser/OS, such as validation requirements and revocation requirements. If, as a Subscriber, you use a certificate in a way that makes that difficult, this is unfortunate, but largely self-inflicted. Provided you're informed of the trade-offs within the legally-enforceable Subscriber Agreement, or even that you've merely acknowledged them, you assign any risk to yourself if things go wrong. CAs actively prohibiting this is a useful approach, one of several available. If all other attempts to communicate the risk have failed, it may be appropriate. The goal is to ensure CAs consistently follow their stated policies, and industry expectations, without introducing fictions like "exceptions" which don't exist and have never existed.

Comment #12 states:

Desktop or mobile apps which are using the platform's native TLS stack talking rest/jsonrpc/xmlrpc/soap/... to a web stack on a provider's website is one obvious example.

That's not necessarily a good thing. This has already seen ample discussion within the community in the past, e.g. when considering the appropriate steps to take regarding removing trust in Symantec. If you're talking such protocols, you should be using a dedicated endpoint, independent from the set of "user facing" services. This is already identified as best practice, when looking at the impact to media streaming/set-top boxes that were impacted by the Symantec turn-down, or to use a less-browser-initiated action, the recent expiration of the Sectigo AddTrust root or the transition off TLS1.0/SHA-1/insecure protocols in general. It's already well-known that you want separate endpoints based on client capabilities and agility ability; e.g. an endpoint to represent the ever changing "state of the art" that browsers implement, and an endpoint that can reflect your limited client capabilities. Such a design, already being necessary without involving CAs, also becomes trivial to adapt with custom CAs. We see this in security-focused products already; e.g. Signal's use of a custom trust anchor for their API server.

Just as one would not complain that jsonrpc does not work with their protobuf compiler, since they're quite different formats and use cases, the same applies in the selection of CAs.
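As a rough illustration of the "separate endpoints, separate trust" design described above, the sketch below builds two client-side TLS contexts with Python's standard `ssl` module: one that uses the platform's default trust store for user-facing services, and one that trusts only a private CA for an API endpoint. The function names and the `private-ca.pem` path are hypothetical, and this is not a description of how Signal or any other product implements it.

```python
import ssl
from typing import Optional


def user_facing_context() -> ssl.SSLContext:
    """Context for browser-style endpoints: uses the platform's default CAs."""
    return ssl.create_default_context()


def api_context(private_ca_pem: Optional[str] = None) -> ssl.SSLContext:
    """Context for a dedicated API endpoint that trusts only a private CA.

    PROTOCOL_TLS_CLIENT starts with an empty trust store and enables
    certificate verification and hostname checking by default.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if private_ca_pem is not None:
        # Hypothetical path to the operator's own root, e.g. "private-ca.pem".
        ctx.load_verify_locations(cafile=private_ca_pem)
    return ctx


# Without a CA file, the API context trusts nothing at all, so publicly
# trusted certificates are not accepted on this endpoint.
api = api_context()
print(api.cert_store_stats())
```

With this split, a revocation or distrust event in the public hierarchy only touches the user-facing endpoint, while the API endpoint's replacement schedule is set by whoever operates the private CA.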

For better or worse, non-browser not-provider-specific clients tend to use the same cert store, and that seems unlikely to change soon

This is for worse, and there's nothing wrong with CAs emphasizing precisely this. Again, using the example of removing trust from a number of CAs, many decisions were made based on whether the certificates in use were used by browsers, and that will no doubt continue. While this is obviously complex for those that offer combined OS/Browser stores versus those that offer only OS-stores, at least with respect to Mozilla this is clearly documented, and a similar approach is taken by Google/Chrome.

The statement is not that these use cases are absolutely exclusionary, although CAs can choose to do so with their certificates, but that any such uses will be subject to the same requirements, expectations, and demands, and any failure outside of that is neither compliant nor acceptable. CAs need to work to reduce such risks (e.g. by migrating customers to only automated solutions), although they can certainly take an interim approach of outright forbidding the use of their certificates this way.

If a CA chose to adopt such a policy, and did revoke those 43K certificates, that would not necessitate 43K problem reports to browsers, provided they were revoked in time. However, failing to revoke in time would necessitate an incident report, detailing the steps the CA is taking to prevent such delays going forward. In that regard, a quick policy change would be inadvisable, because it would seem any delay was driven by the CA's decisions and could have been foreseen (as they have been, on this issue).

Jeremy: I want to clarify that while Comment #5 is greatly appreciated, there's still a number of things pending from DigiCert here, and I'm hoping this week will provide the opportunity to take a thoughtful, measured approach to resolving this. Given DigiCert's past delayed revocation incidents, and the extensive discussion with DigiCert leadership, including yourself, I would have expected these to be fairly well-understood expectations by now.

To recap the expectations:

  • Data about revocation delays belong in this bug, not Bug 1650910.
    • Bug 1650910 is expected to discuss the audit issues, which have unfortunately been a pernicious issue with DigiCert for several years now, especially considering that issue itself was detected due to a separate audit issue.
    • This bug, Bug 1651828, is expected to discuss the delayed revocation issues, which have also unfortunately been a pernicious issue with DigiCert.
  • Please work carefully to ensure you've supplied the necessary information, as documented in Revocation. As mentioned
    • Please consider the experiences with Bug 1515788, Bug 1516453, Bug 1516545, Bug 1516561, Bug 1516599, Bug 1517617, Bug 1519572 .
      • This ensures you provide the answer to: "When revocation is delayed at the request of specific Subscribers, the rationale must be provided on a per-Subscriber basis."
    • To save time, and as mentioned in the past directly, please don't do things like "Major Pharmacy Benefits Manager" as an obfuscation tactic for the bug. Please focus on the Organization identity, as reflected in the certificate, to avoid creating additional work for reviewers.
    • If you decide to provide this information on a single bug (specifically, this bug), you still need to ensure the per-Subscriber organization of this information. The CSVs attached, and comments like Comment #2, don't provide that necessary breakdown and organization that helps us learn from and improve the overall ecosystem, since the per-Subscriber challenges are lost in aggregate.
      • This means per-Subscriber identifying of root causes. If you do use an organized set of root causes, you need to be able to link each Subscriber to specific root causes. I would be concerned trying to see that every Subscriber had the exact same set of root causes.
      • This means per-Subscriber timelines. As with the past bugs, identifying what certificates belong to what timeline is important.
    • When discussing mitigations, please link them back to the root causes. Consider this model, shared with Entrust after an insufficiently detailed incident report, as an example to build a clear understanding of challenges and fixes.
      • Please consider the previous bugs (such as those included above), both from DigiCert and other CAs, in thinking about what patterns, trends, or issues are identified, and what can be done here.

As noted in Comment #6, the goal of browsers is to learn and to improve, and help all CAs benefit. For example, it may be that the solution is to separate out the organization information used in OV/EV from the domain validation data present in all certificates, such as has been proposed by Apple, Google, Microsoft, and Mozilla for eIDAS certificates. This might allow reducing cross-system dependencies, while still providing a binding between domain and organizational identity that can be used beyond TLS, and potentially in other, non-browser TLS protocols such as those identified above.

Flags: needinfo?(jeremy.rowley)

Thank you Ryan. Although Brenda provided a preliminary report today, we are working on a version that makes it easy to identify the entity, reason for delay, and all associated serial numbers. We plan to post the file tomorrow.

As mentioned above, we rejected all non-COVID reasons for delayed revocation except in two cases. We asked anyone experiencing hardships due to COVID to specify how COVID impacted certificate replacement and to provide a timeline that ended on or before July 30th, with the expectation that there will be revocations along the way. Finally, we combed through the COVID-impacted requests and rejected any that didn't relate directly to certificate replacement issues or indicate that the certificate was being used in critical infrastructure supporting the treatment of COVID.

After we post the file, we will work on preparing an incident report using the new model and post the trends we see and what can be done. Appreciate the feedback. This was very helpful.

This file contains org names based on a lookup on crt.sh that I did using one or two certificates from each group.

As requested and required, attached is current subscriber data in a format that I hope facilitates transparency while keeping the information digestible. I appreciate Ben organizing the data for us. I organized mine slightly differently to show the organization doing the work to replace the certificate. This should make it easier to review the reasons provided. I’ve also updated the list of revoked certificates and timelines.

Although more information is provided on a per subscriber basis in the attached document and being examined, I summarized the reasons that we saw as follows:

  1. Death/Illness - 2 orgs / 122 certs

  2. Lockdown in region - 28 orgs / 987 certs

  3. Quarantine/Social distancing rules - 35 orgs / 3,180 certs

  4. Hospitals/Medical treatment - 6 orgs / 336 certs

  5. Other/Critical infrastructure - 6 orgs / 305 certs
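For anyone who wants to total the summary above, a minimal sketch follows. The dictionary and variable names are illustrative only, the figures are copied from the list as posted, and the org total assumes each organization is counted under exactly one reason, which the list itself does not confirm.

```python
# Counts copied from the summary list above: reason -> (orgs, certs).
# Assumption: no organization appears under more than one reason.
reasons = {
    "Death/Illness": (2, 122),
    "Lockdown in region": (28, 987),
    "Quarantine/Social distancing rules": (35, 3180),
    "Hospitals/Medical treatment": (6, 336),
    "Other/Critical infrastructure": (6, 305),
}

total_orgs = sum(orgs for orgs, _ in reasons.values())
total_certs = sum(certs for _, certs in reasons.values())

print(f"{total_orgs} orgs / {total_certs} certs")
```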

The root causes are mostly the same because everyone is experiencing similar issues – lockdowns and treatment of COVID are resulting in restricted access to facilities or restricting the personnel who would otherwise go to those facilities. I did find it interesting that none of the customers in the underscore delay required an extension for COVID.

All certificates will be revoked by July 30th. I think this date is supported by the reasons listed in the document. Some reasons are admittedly better than others, but we did reject a lot of COVID reasons that only seemed tangentially related to replacement of certificates. The current biggest risk to hitting the dates specified in the document is the continued rise of deaths from COVID.

Our next BR audit is scheduled to start in August. Per the policy, this will be listed in the audit statement.

Here is our weekly update for this delayed revocation incident:
- 1,231 total certs revoked/expired since our last update (see attached file)
- 3,600 left to revoke. Final revocation will be completed by July 30th.

1,231 - revoked/expired - serials + crt.sh links included

As an update on the additional tasks:

  1. Last week, we started the key rotation project and created several issuing CAs. We've also come up with a standardized way to do this, basically naming each CA for its period of use. We've been working on automating the key ceremony to support a turnover this large.

  2. We published a blog on the dangers of key pinning, available here: https://www.digicert.com/blog/certificate-pinning-what-is-certificate-pinning. We also are sending a mass communication to customers explaining the dangers of pinning browser-trusted certificates.

  3. We updated both our subscriber agreement and CPS to reflect that there is no such thing as non-browser browser-trusted certificates and that all are subject to the strict requirements of the Baseline Requirements and EV guidelines.

  4. We added both audit and key ceremony automation to our roadmaps as urgent tasks and are working on them. The key ceremony automation is primarily about automating the process of using the tools we already have. The audit automation will generate reports for WebTrust auditors on certificates covered by WebTrust, the BRs, or the EV Guidelines.

  5. We've started looking at how to make the CA information more readily transparent. This may be harder than I first thought when I posted earlier. We're still hoping to hit the four-week timeframe, but it may be delayed while we figure out some of the logistics.

Here is our weekly update:
2,651 - Revoked/expired since our last update on July 21st.
949 - Remaining active certs. Final revocation is still on target for July 30th.
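As a quick consistency check, the remaining count reported here can be compared against the previous weekly update. This is only a sketch; the variable names are illustrative and the figures are copied from the two updates as posted.

```python
# Figures copied from the two weekly updates in this bug.
remaining_after_first_update = 3600  # "left to revoke" in the earlier update
revoked_since_first_update = 2651    # revoked/expired since July 21st
remaining_reported = 949             # remaining active certs in this update

remaining_computed = remaining_after_first_update - revoked_since_first_update
print(remaining_computed == remaining_reported)
```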

Please find attached our final revocation list for our delayed EV revocations (as of July 30th).

I intend to close this bug on or after 10-Aug-2020 unless other issues or questions are raised.

Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Product: NSS → CA Program
Whiteboard: [ca-compliance] [delayed-revocation-leaf] [covid-19] → [ca-compliance] [leaf-revocation-delay] [covid-19]