(In reply to Jeremy Rowley from comment #4)
Are you asking for more details about Microsoft since it's the long pole? There are details on that one, but we can get more from them. For the rest:
IEXTCA-SSL.ibechtel.com - https://crt.sh/?id=9400462
Because the parent of this is already revoked and we don't have a relationship with this entity, we are working through how to even do this one. We've reached out to the key holder to talk about key destruction (and whether they still have this one). We don't have a good ETA on this because we haven't heard back from the key holder on when key destruction can occur.
Right, in this case, the impact of that sub-sub-CA is only to affect that sub-CA, so that seems mitigated.
KPN Class 2 CA - https://crt.sh/?id=341594698
We are trying to pull this in to an earlier date and are creating a replacement CA now. October is the latest for key destruction. We're going to try to revoke the ICA in September. Similar to PKI Overheid, they control several issuing sub-CAs under them that issue S/MIME. We are talking to the sub-CAs under the KPN root and working through how to revoke them and get key destruction.
Thanks. That's Bug 1649964, for future reference.
See above for the reasons. If that wasn't sufficient explanation, we can have them provide more information.
Of the five, the first four are the ones we'd like to accelerate the most (even with the short timeframes) since they pose the biggest security risks. Microsoft, on the other hand, is already a trusted CA within Mozilla, Microsoft, and Apple, meaning they share responsibility for correct operations. Although this does not mitigate the unaccepted compounded risk that if one CA is compromised then the whole CA is compromised, we weren't sure what additional time demands this placed on ICA replacement.
Isn't it correct that these are two separate teams? The fact that it's "not the same" has been used in the past to excuse issues (e.g. Bug 1604124 / Bug 1424305), so I'm trying to square that here.
Since you mentioned PKIoverheid's response, compare that set of proposed controls (still ongoing discussion) with what DigiCert has offered here, and there's a notable lack of detail.
I would say with the detail you've provided, the timeline is unacceptably long. That's why I asked, and continue to ask, for more detail, to better understand if the timeline proposed is 'safe' and 'reasonable'. The goal of these reports is to understand how, in this situation, the CA is planning to show nothing has gone wrong, nothing can go wrong, and nothing will go wrong, for whatever time period they're proposing. Similarly, to understand what steps are being taken to mitigate and prevent future delays. We've seen in the past, for example with low-entropy serial numbers, that a number of CAs were able to quickly replace millions of certificates, in an order less than O(months). Looking to understand why that doesn't work here is important to understanding how this will be fixed going forward.
To put it differently: If the argument is that the volume of certs is large, then establishing details comparing this to other similarly large-or-larger volumes of certs in the past is useful to understand. Did the CA fail to learn from those past experiences? Were they in the process of implementing solutions, like automation? If so, what was the (old) timeline, why wasn't it completed by now, and what's being done going forward? If this is seen as different from past incidents, sharing details about why it's different is relevant. If it's the same as past incidents, sharing details about why they weren't learned from is relevant. Sharing details about how, in the future, CAs can avoid this being an incident, and specifically how this CA is avoiding a repeat, is relevant.
On the details of this incident report alone, one might conclude that if there were any future issues with intermediates, it would also take 7 months to remediate, rather than 7 days, because no detail is given here about the delay and its mitigation. That would be unacceptable then, and that's why it should be unacceptable now. Detail is what helps inform the balanced tradeoff of saying "We accept the risk now, because in the future, we believe there's a clear and viable strategy to reduce the risk, and we can see this as an opportunity to learn".