The decision and rationale for delaying revocation will be disclosed to Mozilla in the form of a preliminary incident report immediately
We believe that revoking the full set of affected certificates by the deadline would have had a large cumulative impact on the web. We revoked 1,711,396 certificates by the deadline (56% of the total affected), based on our evaluation that they had been replaced, were not in use, or currently had CAA records forbidding issuance by Let's Encrypt. Of the remaining 1,336,893 certificates, most (65%) were still in use based on our Internet scans, and the remainder were of undetermined status based on initial scans.
We believe that revoking those actively in-use certificates would have harmed the web because many users, upon encountering the revocation errors, would look up instructions on how to bypass revocation checks. For instance, the top Google result for [SEC_ERROR_REVOKED_CERTIFICATE firefox] right now is https://support.mozilla.org/en-US/questions/856276, which says 'You can uncheck "Use the OCSP to confirm the current validity"'. Those users are unlikely to re-enable revocation checks once they are done using the affected sites. This would prevent those users from receiving future warnings about revoked certificates.
Also, the experience of bypassing many revoked certificate warnings would likely contribute to such users' “warning blindness,” causing them to ignore future errors. The class of errors users start ignoring could extend beyond revocation errors to include similar types of errors, like certificate mismatch or certificate expiration. That would expose such users to a risk of their communications being intercepted with trivial attacks using non-browser-trusted certificates.
We acknowledge that this reasoning applies to future instances where millions of certificates need to be revoked at once. As such, we plan to develop systemic improvements so that millions of certificates can be automatically replaced by Subscribers within the BR-mandated deadlines. See below for details.
Any decision to not comply with the timeline specified in the Baseline Requirements must also be accompanied by a clear timeline
Since the deadline, we have revoked an additional 295,799 certificates, for a total of 2,007,195 revoked, plus 37,499 that expired before we revoked them. That leaves 1,003,596 still to be revoked or expire.
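The running totals can be cross-checked from the figures quoted in this report. The short Python sketch below does only that; the variable names are illustrative, and the total number of affected certificates is derived here as the sum of the two deadline figures rather than quoted directly.

```python
# Figures quoted in this report.
revoked_by_deadline = 1_711_396
remaining_at_deadline = 1_336_893
revoked_since_deadline = 295_799

# Derived values (assumption: total affected = revoked + remaining at the deadline).
total_affected = revoked_by_deadline + remaining_at_deadline
total_revoked = revoked_by_deadline + revoked_since_deadline
share_revoked_by_deadline = revoked_by_deadline / total_affected

print(f"total revoked so far: {total_revoked:,}")
print(f"share revoked by deadline: {share_revoked_by_deadline:.0%}")
```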
Over the next 83 days we will continue to work with our Subscribers to get certificates replaced, and will continue to revoke certificates as they are replaced. Specifically, we will check at least twice a week for certificates that have been replaced, and revoke those that have. Additionally, when Subscribers with large numbers of certificates notify us that their replacement process is complete, we will revoke those certificates.
After 83 days, all affected certificates will have expired, due to our 90-day certificate lifetime.
I've attached a file listing, for each of the next 83 days, how many currently-unrevoked certificates will expire on that day. Dates are in UTC. Note that the number for 2020-03-07 is slightly higher because it represents the value as of 2020-03-07 00:00 UTC, while the numbers described above are as of 01:10 UTC.
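As an illustration only, here is a minimal Python sketch of how a per-day expiration count of this kind could be tallied, and why the snapshot time changes the first day's number. It is not the tooling used to produce the attached file. The function name and the sample timestamps are made up, and the example assumes both snapshot times fall on the same UTC day.

```python
from collections import Counter
from datetime import datetime, timezone

UTC = timezone.utc

def expirations_per_day(not_after_times, as_of):
    """Count certificates expiring on each UTC date, at or after `as_of`.

    All datetimes must be timezone-aware.
    """
    counts = Counter()
    for not_after in not_after_times:
        not_after_utc = not_after.astimezone(UTC)
        if not_after_utc >= as_of:
            counts[not_after_utc.date()] += 1
    return dict(sorted(counts.items()))

# Hypothetical expiration times for three unrevoked certificates.
sample = [
    datetime(2020, 3, 7, 0, 30, tzinfo=UTC),
    datetime(2020, 3, 7, 12, 0, tzinfo=UTC),
    datetime(2020, 3, 8, 9, 0, tzinfo=UTC),
]

# A snapshot at 00:00 UTC still counts the certificate expiring at 00:30;
# a snapshot at 01:10 UTC no longer does.
at_midnight = expirations_per_day(sample, datetime(2020, 3, 7, 0, 0, tzinfo=UTC))
at_0110 = expirations_per_day(sample, datetime(2020, 3, 7, 1, 10, tzinfo=UTC))
print(at_midnight)
print(at_0110)
```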
The issue will need to be listed as a finding in your CA's next BR audit statement.
We will make sure this happens.
Your CA will work with your auditor (and supervisory body, as appropriate) and the Root Store(s) that your CA participates in to ensure your analysis of the risk and plan of remediation is acceptable.
We will also do this.
That you will perform an analysis to determine the factors that prevented timely revocation of the certificates, and include a set of remediation actions in the final incident report that aim to prevent future revocation delays.
Having reviewed previous incident reports and analyzed our current situation, we find that a common root cause of failure to revoke on time is that Subscribers are not able to replace certificates on the BR-mandated timelines (24 hours or 5 days, depending on the issue).
Most Subscribers are not able to field round-the-clock incident response, so improving the speed of manual replacement processes cannot be the answer. Increasing public acceptance of revoked certificate errors also cannot be the answer, because that would undermine public faith in the web PKI. Reducing the incidence and scope of CA errors is an important part of the solution, and we have laid out some plans to that effect at https://bugzilla.mozilla.org/show_bug.cgi?id=1619047. However, responsible systems design requires layered responses, and it is possible that we, or another CA, will have a similar-sized incident in the future despite our best practices and best efforts.
Therefore, our conclusion is that we need to develop a protocol to notify Subscribers' systems of imminent certificate revocation, so those Subscribers can automate the process of replacing affected certificates before the deadline. We plan to design this protocol publicly, in collaboration with the PKI community, so that any CA and any Subscriber can implement it. We will also collaborate directly with popular ACME clients to integrate and test such automated replacement.
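To make the idea concrete, here is a purely hypothetical Python sketch of the client-side decision such a protocol would enable. No message format or API has been designed yet. The `RevocationNotice` shape, its field names, and the `replacement_queue` function are assumptions made up for illustration, and the sketch deliberately leaves out how notices would be delivered or authenticated.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class RevocationNotice:
    # Hypothetical message shape; nothing like this is specified in this report.
    serial: str          # serial number of the certificate to be revoked
    revoke_at: datetime  # planned revocation time (timezone-aware)

def replacement_queue(notices, installed_serials):
    """Return installed serials that have a notice, soonest revocation first."""
    affected = [n for n in notices if n.serial in installed_serials]
    return [n.serial for n in sorted(affected, key=lambda n: n.revoke_at)]

# Made-up notices: the client only holds certificates "aa01" and "cc03".
UTC = timezone.utc
notices = [
    RevocationNotice("aa01", datetime(2020, 3, 10, 0, 0, tzinfo=UTC)),
    RevocationNotice("bb02", datetime(2020, 3, 8, 0, 0, tzinfo=UTC)),
    RevocationNotice("cc03", datetime(2020, 3, 9, 0, 0, tzinfo=UTC)),
]
queue = replacement_queue(notices, {"aa01", "cc03"})
print(queue)  # ['cc03', 'aa01']
```

An automated client could then request replacements in that order, so the certificates closest to revocation are renewed first.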