Bug 1965612 Comment 14 Edit History

Note: The actual edited comment in the bug view page will always show the original commenter’s name and original timestamp.

[In response to Comment [12](https://bugzilla.mozilla.org/show_bug.cgi?id=1965612#c12).]

Thank you for providing answers to the questions posted in Comment [8](https://bugzilla.mozilla.org/show_bug.cgi?id=1965612#c8) and for providing the Full Incident Report in Comment [9](https://bugzilla.mozilla.org/show_bug.cgi?id=1965612#c9).

We have a few additional questions:

(1) Can you please describe the TLS server authentication certificate automation solution(s) in place to help us understand what is and is not considered in scope of the solutions available to subscribers? Statements in Comment [6](https://bugzilla.mozilla.org/show_bug.cgi?id=1965612#c6) indicate “_While many subscribers adopt renewed certificates within 24–48 hours, some do not._” 

We interpret that to mean that, while the process of requesting a certificate (which may include key generation and performing domain control validation) is automated for some subscribers, the retrieval and installation of the corresponding certificate might **not** be in scope for the automation solution.

(2) Can you help us understand the percentage of affected certificates that rely on “Azure Key Vault” or “internal vaults” where “_Microsoft has the capability to centrally renew all certificates issued by specific Issuers_”? For example: “_XX% of the affected certificates can automatically be renewed and automatically configured for use due to Microsoft certificate lifecycle management solutions._”

(3) The context of a “leak” as presented below is unclear to us. Can you please explain this to us in a different way?

> We suspect, but have not yet confirmed, that the population is high due to subscriber implementation issues, aka a “leak”.

Is this referencing scenarios where certificates were requested and issued, but then later abandoned by subscribers without requesting revocation? 

(4) Comment [9](https://bugzilla.mozilla.org/show_bug.cgi?id=1965612#c9) states: 

> Our plan is to revoke certificates in batches on a weekly basis, maintaining a CRL size which does not negatively impact clients, and leaving room for additional revocations in case other incidents occur.

Can you please share:
- (a) The criteria used for determining which certificates will be included in each week’s “batch”?
- (b) The CRL size being targeted to accomplish the stated goal of using this batch strategy?
- (c) How Microsoft PKI Services concluded the target size described immediately above will not negatively impact clients?

(6) We understand that Microsoft PKI Services was aware of its “CRL bloat” concerns related to mass revocation events in [February 2025](https://docs.google.com/spreadsheets/d/1Wjf7jFvI4C2MC1wBRrGT9MqFL_uKOiFBniz_IG-mHKg/edit?gid=2065781624#gid=2065781624), and presumably earlier. Can you help us understand why, given the existence of this concern and the community’s emphasis on improving response to mass revocation events over the past year, Microsoft PKI Services did not move forward with planning (minimally) or implementing partitioned CRLs sooner?

(7) Comment [6](https://bugzilla.mozilla.org/show_bug.cgi?id=1965612#c6) includes:  

> Microsoft has the capability to centrally renew all certificates issued by specific Issuers managed in Key Vault and internal vaults—a process we've successfully executed in the past and can repeat if necessary.

Can you share which DCV method(s) are being relied upon during these types of renewals?

(8) Comment [6](https://bugzilla.mozilla.org/show_bug.cgi?id=1965612#c6) includes:

> Furthermore, as part of our effort to reduce the certificate lifetime we have already reduced most of our certificate lifetime to 6 months by default, with a goal to meet or exceed the industry lifetime requirements. 

Given 90% of the impacted time-valid certificates were found to no longer be in use, and when considered against the degree of automation we understand to be in place, could the default validity be decreased further to reduce the likelihood of “[stale](https://zanema.com/papers/imc23_stale_certs.pdf)” or unused TLS certificates? 

(9) Related to the above, has Microsoft considered the use of short-lived certificates, as defined by the TLS BRs, for these subscriber use cases?

[In response to Comment [12](https://bugzilla.mozilla.org/show_bug.cgi?id=1965612#c12).]

Thank you for providing answers to the questions posted in Comment [8](https://bugzilla.mozilla.org/show_bug.cgi?id=1965612#c8) and for providing the Full Incident Report in Comment [9](https://bugzilla.mozilla.org/show_bug.cgi?id=1965612#c9).

We have a few additional questions:

(1) Can you please describe the TLS server authentication certificate automation solution(s) in place to help us understand what is and is not considered in scope of the solutions available to subscribers? Statements in Comment [6](https://bugzilla.mozilla.org/show_bug.cgi?id=1965612#c6) indicate “_While many subscribers adopt renewed certificates within 24–48 hours, some do not._” 

We interpret that to mean that, while the process of requesting a certificate (which may include key generation and performing domain control validation) is automated for some subscribers, the retrieval and installation of the corresponding certificate might **not** be in scope for the automation solution.

(2) Can you help us understand the percentage of affected certificates that rely on “Azure Key Vault” or “internal vaults” where “_Microsoft has the capability to centrally renew all certificates issued by specific Issuers_”? For example: “_XX% of the affected certificates can automatically be renewed and automatically configured for use due to Microsoft certificate lifecycle management solutions._”

(3) The context of a “leak” as presented below is unclear to us. Can you please explain this to us in a different way?

> We suspect, but have not yet confirmed, that the population is high due to subscriber implementation issues, aka a “leak”.

Is this referencing scenarios where certificates were requested and issued, but then later abandoned by subscribers without requesting revocation? 

(4) Comment [9](https://bugzilla.mozilla.org/show_bug.cgi?id=1965612#c9) states: 

> Our plan is to revoke certificates in batches on a weekly basis, maintaining a CRL size which does not negatively impact clients, and leaving room for additional revocations in case other incidents occur.

Can you please share:
- (a) The criteria used for determining which certificates will be included in each week’s “batch”?
- (b) The CRL size being targeted to accomplish the stated goal of using this batch strategy?
- (c) How Microsoft PKI Services concluded the target size described immediately above will not negatively impact clients?

(5) We understand that Microsoft PKI Services was aware of its “CRL bloat” concerns related to mass revocation events in [February 2025](https://docs.google.com/spreadsheets/d/1Wjf7jFvI4C2MC1wBRrGT9MqFL_uKOiFBniz_IG-mHKg/edit?gid=2065781624#gid=2065781624), and presumably earlier. Can you help us understand why, given the existence of this concern and the community’s emphasis on improving response to mass revocation events over the past year, Microsoft PKI Services did not move forward with planning (minimally) or implementing partitioned CRLs sooner?

(6) Comment [6](https://bugzilla.mozilla.org/show_bug.cgi?id=1965612#c6) includes:  

> Microsoft has the capability to centrally renew all certificates issued by specific Issuers managed in Key Vault and internal vaults—a process we've successfully executed in the past and can repeat if necessary.

Can you share which DCV method(s) are being relied upon during these types of renewals?

(7) Comment [6](https://bugzilla.mozilla.org/show_bug.cgi?id=1965612#c6) includes:

> Furthermore, as part of our effort to reduce the certificate lifetime we have already reduced most of our certificate lifetime to 6 months by default, with a goal to meet or exceed the industry lifetime requirements. 

Given 90% of the impacted time-valid certificates were found to no longer be in use, and when considered against the degree of automation we understand to be in place, could the default validity be decreased further to reduce the likelihood of “[stale](https://zanema.com/papers/imc23_stale_certs.pdf)” or unused TLS certificates? 

(8) Related to the above, has Microsoft considered the use of short-lived certificates, as defined by the TLS BRs, for these subscriber use cases?
