Microsoft PKI Services: "unknown" OCSP response for issued certificates
Categories
(CA Program :: CA Certificate Compliance, task)
Tracking
(Not tracked)
People
(Reporter: agwa-bugs, Assigned: johnmas, NeedInfo)
Details
(Whiteboard: [ca-compliance] [ocsp-failure])
Attachments
(2 files)
Microsoft's OCSP responder (http://oneocsp.microsoft.com/ocsp) is responding with a status of "unknown" for the following certificates (presumed to exist based on the existence of a precertificate):
https://crt.sh/?sha256=04d0efae64fbdea822782c6384a00622fa98d6965bdfd498643879afaab6c539
https://crt.sh/?sha256=0e57e703b2812af9ceda5b5b1aabe3298e3568ac254c3a713b7fdb4a808ce329
https://crt.sh/?sha256=0f41f43cd6795ed2df5559c8c57c0d5cd8ce39b6fd1411ca43ec6a0f6325e571
https://crt.sh/?sha256=121e59b5cf8a4beee92770627e8e30406304f94d1eb6fca6c98a88fe11eed16c
https://crt.sh/?sha256=3efd6cd42c63202c1aec613b61922c7875dc23accf287654de7f3fc048c93bae
https://crt.sh/?sha256=44fa6f4a9226c8dc562be2590809cf10bc42d621b10cc684eeb22fa3f3aee8ef
https://crt.sh/?sha256=89ef71f2db5eb8dbd54f81f8beef9fdfdf4d4bf8ceebbd746ef1ea37e09a2dde
https://crt.sh/?sha256=8acf67a4fe3b867f432ba4459d2b50dba081d75a4a52a34949df39838ddf542b
https://crt.sh/?sha256=a3b6a31daa99f478c420cb31217d76a8f1adca68ae5adad6753144d622dd1464
https://crt.sh/?sha256=ce9d3b9446bdb826a7aa059c033bebd9c61d996743f419c8cf4a6b5a8de237b7
I have attached signed OCSP responses as evidence.
Comment 1•2 years ago
Acknowledging. We will provide a preliminary report later today.
Comment 2•2 years ago
Below is a preliminary incident report. We expect to add more detail within 7 days as we investigate.
How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in the MDSP mailing list, a Bugzilla bug, or internal self-audit), and the time and date.
- The Microsoft PKI Services (MS PKI) team became aware of this problem when this bug was assigned on 2022-10-03 08:06 PDT. The initial investigation determined that there was an issue that prevented publishing to OCSP that impacted all the reported certificates. Further investigation is ongoing and a full report is expected within the next 7 days.
A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.
- 2022-08-08 16:20 PDT: OCSP publishing issues started for the impacted CA server and region
- 2022-08-08 16:55 PDT: First reported certificate was created
- 2022-08-08 18:45 PDT: Final reported certificate was created
- 2022-08-08 19:44 PDT: OCSP publishing issues ended
- 2022-10-03 08:06 PDT: This bug was assigned to Microsoft
- 2022-10-03 09:57 PDT: Reported certificates were manually published to OCSP, mitigating the specific certificates from the initial report
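As a quick sanity check on the timeline above, the intervals can be worked out directly from the listed timestamps. This is only an illustrative sketch: the variable names are made up here, and PDT is modelled as a fixed UTC-7 offset.

```python
from datetime import datetime, timedelta, timezone

# PDT treated as a fixed UTC-7 offset (assumption for this sketch).
PDT = timezone(timedelta(hours=-7), "PDT")


def pdt(text: str) -> datetime:
    """Parse a 'YYYY-MM-DD HH:MM' timestamp from the timeline as PDT."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M").replace(tzinfo=PDT)


# Timestamps copied from the timeline in the preliminary report.
outage_start = pdt("2022-08-08 16:20")
first_cert = pdt("2022-08-08 16:55")
last_cert = pdt("2022-08-08 18:45")
outage_end = pdt("2022-08-08 19:44")
bug_assigned = pdt("2022-10-03 08:06")
manual_publish = pdt("2022-10-03 09:57")

# How long OCSP publishing was failing.
outage_duration = outage_end - outage_start
# Whether every reported certificate was created during the outage.
certs_inside_outage = outage_start <= first_cert <= last_cert <= outage_end
# How long the gap went unnoticed after publishing recovered.
undetected_for = bug_assigned - outage_end
# How long it took to publish the reported certificates once the bug was assigned.
time_to_mitigate = manual_publish - bug_assigned

print(outage_duration, certs_inside_outage, undetected_for.days, time_to_mitigate)
```

The figures show that the reported certificates all fall inside the publishing outage, and that the missing OCSP entries went unnoticed for roughly eight weeks until the external report.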
Whether your CA has stopped, or has not yet stopped, certificate issuance or the process giving rise to the problem or incident. A statement that you have stopped will be considered a pledge to the community; a statement that you have not stopped requires an explanation.
- Microsoft PKI Services did not issue any of the reported certificates to Subscribers. Our automation prevents issuance to Subscribers for any failure, including OCSP publishing failures.
- We have already implemented new monitoring and a human process to mitigate future issues where certificates are not published to OCSP. We are investigating automation solutions to monitor for this scenario. We expect to have a plan for this monitoring and automated remediation completed by next week.
In a case involving certificates, a summary of the problematic certificates. For each problem: the number of certificates, and the date the first and last certificates with that problem were issued. In other incidents that do not involve enumerating the affected certificates (e.g. OCSP failures, audit findings, delayed responses, etc.), please provide other similar statistics, aggregates, and a summary for each type of problem identified. This will help us measure the severity of each problem.
- The first reported certificate was created 2022-08-08 16:55 PDT and the final reported certificate was created 2022-08-08 18:45 PDT. Please see below for the links to the certificates that were not published to OCSP.
In a case involving TLS server certificates, the complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem. It is also recommended that you use this form in your list "https://crt.sh/?sha256=[sha256-hash]", unless circumstances dictate otherwise. When the incident being reported involves an SMIME certificate, if disclosure of personally identifiable information in the certificate may be contrary to applicable law, please provide at least the certificate serial number and SHA256 hash of the certificate. In other cases not involving a review of affected certificates, please provide other similar, relevant specifics, if any.
- The following certificates were provided in the bug:
- https://crt.sh/?sha256=04d0efae64fbdea822782c6384a00622fa98d6965bdfd498643879afaab6c539
- https://crt.sh/?sha256=0e57e703b2812af9ceda5b5b1aabe3298e3568ac254c3a713b7fdb4a808ce329
- https://crt.sh/?sha256=0f41f43cd6795ed2df5559c8c57c0d5cd8ce39b6fd1411ca43ec6a0f6325e571
- https://crt.sh/?sha256=121e59b5cf8a4beee92770627e8e30406304f94d1eb6fca6c98a88fe11eed16c
- https://crt.sh/?sha256=3efd6cd42c63202c1aec613b61922c7875dc23accf287654de7f3fc048c93bae
- https://crt.sh/?sha256=44fa6f4a9226c8dc562be2590809cf10bc42d621b10cc684eeb22fa3f3aee8ef
- https://crt.sh/?sha256=89ef71f2db5eb8dbd54f81f8beef9fdfdf4d4bf8ceebbd746ef1ea37e09a2dde
- https://crt.sh/?sha256=8acf67a4fe3b867f432ba4459d2b50dba081d75a4a52a34949df39838ddf542b
- https://crt.sh/?sha256=a3b6a31daa99f478c420cb31217d76a8f1adca68ae5adad6753144d622dd1464
- https://crt.sh/?sha256=ce9d3b9446bdb826a7aa059c033bebd9c61d996743f419c8cf4a6b5a8de237b7
Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.
- We are still investigating these details and expect to have a complete answer within 7 days. Part of this investigation involves comparing every currently valid certificate generated by our CA software against OCSP to determine if there are any additional impacted certificates.
List of steps your CA is taking to resolve the situation and ensure that such situation or incident will not be repeated in the future, accompanied with a binding timeline of when your CA expects to accomplish each of these remediation steps.
- We are still investigating these details and expect to have a complete answer within 7 days.
Comment 3•2 years ago
While our initial investigation pointed to an isolated issue with OCSP publishing during a brief period, Microsoft PKI Services performed an exhaustive review of all CAs to compare certificates generated by the CA to the certificates in OCSP. We determined the root cause was that the provisioning workflow could fail between generating a certificate and successfully publishing it to OCSP. The workflow would end after exhausting its retry count, and no additional action was taken for the generated certificate. There were no specific alerts or automated processes in place to enforce OCSP publishing in this case.
It is worth noting that in mid-August 2022, we moved the OCSP publishing step much earlier in the provisioning workflow to publish pre-certificates to OCSP. This resulted in drastically fewer opportunities for workflow failures to occur before OCSP publishing. However, it still left two potential failure points that can result in a certificate not being published to OCSP. As an immediate measure, we mitigated this problem by adding alerting, as described in our previous response, for these failures and updated the processes to manually publish to OCSP. We are investigating more robust automated solutions to remove the human element from this mitigation process.
Microsoft PKI Services identified 2221 total certificates that did not meet the requirements in Section 5.4 of M.R.S.P. Version 2.8. These are all published to OCSP now and responding with a “good” response. We identified 2208 Non-expired Final Certificates that were not in OCSP due to this issue. None of these final certificates were provided to Subscribers as we only provide them if all workflow steps are successful. We also identified 13 Non-expired Pre-Certificates that were generated after 2022-09-30 that were not published to OCSP.
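The review described above amounts to a set difference between what the CA generated and what the OCSP responder knows about. A minimal sketch of that comparison follows, assuming serial numbers are available from both systems; `IssuedRecord`, `find_unpublished`, and `summarize` are hypothetical names, not Microsoft's tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class IssuedRecord:
    """One certificate or precertificate generated by the CA (hypothetical shape)."""
    serial: str
    not_after: datetime
    is_precertificate: bool


def find_unpublished(issued: Iterable[IssuedRecord],
                     ocsp_serials: Iterable[str],
                     now: datetime) -> List[IssuedRecord]:
    """Return non-expired records whose serial the OCSP responder does not know."""
    known = set(ocsp_serials)
    return [r for r in issued if r.not_after > now and r.serial not in known]


def summarize(missing: List[IssuedRecord]) -> Dict[str, int]:
    """Split the missing records into final certificates and precertificates."""
    pre = sum(1 for r in missing if r.is_precertificate)
    return {"final": len(missing) - pre, "precertificates": pre, "total": len(missing)}


if __name__ == "__main__":
    now = datetime(2022, 10, 10, tzinfo=timezone.utc)
    valid = datetime(2023, 1, 1, tzinfo=timezone.utc)
    sample = [IssuedRecord("01", valid, False), IssuedRecord("02", valid, True)]
    print(summarize(find_unpublished(sample, {"01"}, now)))
```

Applied to the figures in this comment, the same split is 2208 final certificates plus 13 pre-certificates, for 2221 in total.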
Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.
- This is an implementation gap in handling an edge case scenario where any failures in publishing certificates to the OCSP provider after the certificate is generated are retried a fixed number of times before marking the request as failed. The design did not have a provision to publish these certificates to the OCSP provider after the failure.
List of steps your CA is taking to resolve the situation and ensure that such situation or incident will not be repeated in the future, accompanied with a binding timeline of when your CA expects to accomplish each of these remediation steps.
- As stated above, Microsoft PKI Services recently changed the workflow process that drastically reduced the potential failures publishing to OCSP. For the rare cases where OCSP publishing could still fail, we have implemented alerts and manual mitigation steps. This mitigates the problems that caused this incident and will prevent future incidents.
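The gap and its mitigation can be sketched as follows. In the failure mode described above, the workflow simply ended once retries were exhausted; the sketch instead records the serial for later remediation and raises an alert. All names (`publish_with_fallback`, `drain_pending`, the `pending` list) are hypothetical, and a real system would presumably use a durable queue rather than an in-memory list.

```python
import logging
from typing import Callable, List

log = logging.getLogger("provisioning")


def publish_with_fallback(serial: str,
                          publish: Callable[[str], None],
                          pending: List[str],
                          max_attempts: int = 3) -> bool:
    """Try to publish a serial to OCSP; on exhaustion, queue it instead of dropping it."""
    for attempt in range(1, max_attempts + 1):
        try:
            publish(serial)
            return True
        except Exception as exc:  # broad on purpose: any failure counts as a retry
            log.warning("OCSP publish attempt %d for %s failed: %s", attempt, serial, exc)
    # Retries exhausted: the certificate exists, so its serial must not be forgotten.
    pending.append(serial)
    log.error("ALERT: serial %s generated but not published to OCSP", serial)
    return False


def drain_pending(pending: List[str], publish: Callable[[str], None]) -> List[str]:
    """Retry every queued serial and return those that still fail."""
    still_failing = []
    for serial in pending:
        try:
            publish(serial)
        except Exception:
            still_failing.append(serial)
    return still_failing
```

The point of the design is that a failed workflow still leaves a record: the manual (or later automated) remediation step works from `pending` rather than relying on someone noticing an "unknown" response.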
Comment 4•2 years ago (Reporter)
Here are some more certificates whose status is "unknown":
https://crt.sh/?sha256=af722ba0c92bba20cb3379b03c37a7cd84756843771dc8d6c228c3a829256ab8&opt=ocsp
https://crt.sh/?sha256=643acb5b17a9d0a45175822a097d15c1ae2e43a0f466841416cc5b85889443d8&opt=ocsp
https://crt.sh/?sha256=0dbd2141bdd794ca5407e020d9477e57a4f985c403cd499ee038a691c1b99409&opt=ocsp
Why were these not identified during the remediation described in Comment 3?
Comment 5•2 years ago (Assignee)
Hi Andrew,
These three certificates are all pre-certificates generated before October 1, 2022 that do not have a final certificate. We did identify them in our review, but since there was no requirement to publish these to OCSP at the time we intentionally filtered them out of our impacted certificates count.
Regards,
John Mason
Comment 6•2 years ago
Could you explain why you think that those certificates - presumed to be issued before October 1, 2022, based on existing pre-certificates - do not need the OCSP services expected to exist for BR-compliant certificates?
The only special reference to October 1, 2022 that I can find is in MRSP s5.5:
[...]
Effective October 1, 2022,
- a CA MUST be able to revoke a certificate presumed to exist, if revocation of the certificate is required under this policy, even if the final certificate does not actually exist; and
- a CA MUST provide CRL and OCSP services and responses in accordance with this policy for all certificates presumed to exist based on the presence of a precertificate, even if the certificate does not actually exist.
(emphasis mine)
That does not exclude pre-certificates older than that date from the policy; it only means that the explicit requirement for the CRL and OCSP services went into effect on 2022-10-01. There is no text in that section that grandfathers certificates which we presume to be issued before the policy went into effect, so I'm not sure why you assumed that you don't need to provide the required services for those certificates.
Could you provide the full certificate data for question 5? It seems like only those already public in this issue were linked, while you mentioned that there were 2221 certificates affected in Comment 3.
Comment 7•2 years ago
Microsoft PKI Services had been involved in the draft language of this requirement (Section 5.4 of M.R.S.P. Version 2.8). Our understanding the entire time has been from the perspective of pre-certificate or final certificate issuance. With that in mind, we implemented changes to our issuance workflow with the expectation that OCSP publishing occurs for all newly issued certificates, starting October 1, 2022.
We acknowledge that there can be multiple interpretations of this requirement. We would ask for clarity from Mozilla in this case if the requirement includes all pre-certificates issued before October 1, 2022.
Comment 9•2 years ago
(In reply to Dustin Hollenback from comment #7)
> Microsoft PKI Services had been involved in the draft language of this requirement (Section 5.4 of M.R.S.P. Version 2.8). Our understanding the entire time has been from the perspective of pre-certificate or final certificate issuance. With that in mind, we implemented changes to our issuance workflow with the expectation that OCSP publishing occurs for all newly issued certificates, starting October 1, 2022.
That is a reasonable interpretation, because our effective dates for policy changes typically apply to certificates issued after the effective date.
> We acknowledge that there can be multiple interpretations of this requirement. We would ask for clarity from Mozilla in this case if the requirement includes all pre-certificates issued before October 1, 2022.
The requirement does not include pre-certificates issued before October 1, 2022.
All pre-certificates issued on October 1, 2022, or later must satisfy the requirement.
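Under the reading given in this comment, the scoping rule reduces to a date comparison on the pre-certificate's issuance time. A minimal sketch, assuming the boundary is midnight UTC (the thread does not state a time zone) and using a hypothetical function name:

```python
from datetime import datetime, timezone

# Effective date from the policy text; midnight UTC is an assumption of this sketch.
EFFECTIVE = datetime(2022, 10, 1, tzinfo=timezone.utc)


def precert_requires_ocsp(precert_issued_at: datetime) -> bool:
    """True if a pre-certificate with no final certificate falls under the requirement.

    Expects a timezone-aware datetime; comparing a naive one raises TypeError.
    """
    return precert_issued_at >= EFFECTIVE


print(precert_requires_ocsp(datetime(2022, 9, 30, 23, 59, tzinfo=timezone.utc)))  # False
print(precert_requires_ocsp(datetime(2022, 10, 1, tzinfo=timezone.utc)))          # True
```

A CA that wants to be safe against the stricter, retroactive reading raised later in this thread could simply skip the filter and publish OCSP status for every unexpired pre-certificate.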
Comment 10•2 years ago
Thank you for the clarification, Kathleen.
That was the last concern that I was aware of related to this bug. If there are no remaining concerns, we ask that this bug be resolved at this time.
Comment 11•2 years ago
Out of curiosity (this is not a question that should in any way delay the resolution of this ticket) -- you stated that Microsoft has already moved the generation of OCSP much earlier in the issuance workflow, to create fewer opportunities for workflow failure before the first OCSP response is issued.
Has Microsoft considered moving the creation of the first OCSP response to be before the creation of even the precertificate? All of the necessary inputs -- the serial number and the issuing CA -- can be available before the precertificate is issued.
Comment 12•2 years ago
(In reply to Aaron Gable from comment #11)
> Has Microsoft considered moving the creation of the first OCSP response to be before the creation of even the precertificate? All of the necessary inputs -- the serial number and the issuing CA -- can be available before the precertificate is issued.
Hi Aaron.
"Before the precertificate is issued" means that even if a serial number is "available" it is still considered to be "unused" in the context of BR 4.9.10, which also says:
'If the OCSP responder receives a request for the status of a certificate serial number that is “unused”, then the responder SHOULD NOT respond with a “good” status.'
Since "good" is frowned upon, which CertStatus value would you propose to include in a first OCSP response that is created before the precertificate is issued?
Comment 13•2 years ago
The section that Rob references is stronger than a "SHOULD NOT" for CAs which are not Technically Constrained: it's actually a "MUST NOT".
While 4.9.10 speaks to the required parameters for responding to OCSP requests, it is seemingly mum on whether it is prohibited to sign a definitive response for an "unused" serial number and merely not distribute it to OCSP clients/Relying Parties.
Comment 14•2 years ago
Hi Corey. Point taken about 4.9.10. What do you make of 4.9.9?
"OCSP responses MUST either:
1. Be signed by the CA that issued the Certificates whose revocation status is being checked, or
2. Be signed by an OCSP Responder whose Certificate is signed by the CA that issued the Certificate whose revocation status is being checked."
If the intended meaning is that "...that issued the Certificate[s]" (note the past tense) has to occur before it's possible for a compliant OCSP response to exist, then this would imply that the CA is not permitted to sign a definitive OCSP response for an "unused" serial number.
Alternatively, if "whose revocation status is being checked" and the word "checking" in the section title mean that 4.9.9 is only intended to apply to OCSP responses that are actually distributed to relying parties, then ISTM that there are no rules whatsoever for OCSP responses whilst they remain undistributed. Even the "MUST conform to RFC6960 and/or RFC5019" requirement would not apply.
Comment 15•2 years ago
Hi Rob,
> Alternatively, if "whose revocation status is being checked" and the word "checking" in the section title mean that 4.9.9 is only intended to apply to OCSP responses that are actually distributed to relying parties, then ISTM that there are no rules whatsoever for OCSP responses whilst they remain undistributed. Even the "MUST conform to RFC6960 and/or RFC5019" requirement would not apply.
Your reading that I quoted above most closely matches my interpretation of section 4.9.9. My understanding of the intent behind 4.9.9 is to prohibit CAs from providing OCSP responses which are unusable/not able to be verified unless RP software has access (via local policy, etc.) to other certificates and/or trust anchors besides those used for the TLS connection; it is not necessarily a restriction on what can be signed. Additionally, if section 4.9.9 restricts CAs on which OCSP responses they can sign, then that section would prohibit pre-production of OCSP responses until the CA actually receives an OCSP request. I suppose one could argue that the CA could internally issue an OCSP request to fulfill that obligation, but that seems contrived.
Thanks,
Corey
Comment 16•2 years ago
(In reply to Rob Stradling from comment #12)
> Since "good" is frowned upon, which CertStatus value would you propose to include in a first OCSP response that is created before the precertificate is issued?
As Corey pointed out, BR 4.9.10 says that it is unacceptable to respond with a "good" status, not that it is unacceptable to produce an OCSP response with the "good" status and store it for use when an OCSP request arrives.
The issue is that signing a precertificate is a binding intent to sign a final certificate, even if that precertificate is never publicly shared or logged in CT. Therefore it is risky to sign a precertificate, then sign an OCSP response, and finally make both available to the public: if the creation of the OCSP response fails, then there is a precertificate for which no OCSP response is available. (This can, of course, be mitigated with the ability to live-sign new OCSP responses as requests come in.)
Ever since https://bugzilla.mozilla.org/show_bug.cgi?id=1577652, Let's Encrypt's approach has been to first sign the OCSP response, then sign the precertificate, then persist both to the database in a single transaction (with additional automation to recover both from the audit logs if the transaction fails). This way, if signing the precertificate fails, the OCSP response is dropped and never served, which is in line with both 4.9.9 and 4.9.10.
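The ordering described above (sign the OCSP response, sign the precertificate, persist both atomically) can be sketched roughly as below. This is not Let's Encrypt's actual code: the SQLite schema, the function names, and the signer callables are all assumptions made for illustration, and the audit-log recovery step is omitted.

```python
import sqlite3
from typing import Callable, Optional

Signer = Callable[[str], bytes]


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with one table per artifact (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE ocsp (serial TEXT PRIMARY KEY, response BLOB NOT NULL)")
    db.execute("CREATE TABLE precerts (serial TEXT PRIMARY KEY, der BLOB NOT NULL)")
    return db


def issue(db: sqlite3.Connection, serial: str,
          sign_ocsp: Signer, sign_precert: Signer) -> bool:
    """Sign the OCSP response first, then the precertificate, then store both together."""
    try:
        ocsp_response = sign_ocsp(serial)   # step 1: nothing is binding yet
        precert = sign_precert(serial)      # step 2: binding intent to issue
    except Exception:
        # If either signature fails, the OCSP response is dropped and never stored.
        return False
    # Step 3: the connection context manager commits both rows or rolls both back.
    with db:
        db.execute("INSERT INTO ocsp VALUES (?, ?)", (serial, ocsp_response))
        db.execute("INSERT INTO precerts VALUES (?, ?)", (serial, precert))
    return True


def stored_response(db: sqlite3.Connection, serial: str) -> Optional[bytes]:
    """Return the stored OCSP response, or None for a serial that was never issued."""
    row = db.execute("SELECT response FROM ocsp WHERE serial = ?", (serial,)).fetchone()
    return row[0] if row else None
```

Because a response is only ever served from the store, a serial whose precertificate was never signed has nothing to serve, so no "good" status is returned for an unused serial.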
Comment 17•2 years ago
(In reply to Kathleen Wilson from comment #9)
> (In reply to Dustin Hollenback from comment #7)
> > Microsoft PKI Services had been involved in the draft language of this requirement (Section 5.4 of M.R.S.P. Version 2.8). Our understanding the entire time has been from the perspective of pre-certificate or final certificate issuance. With that in mind, we implemented changes to our issuance workflow with the expectation that OCSP publishing occurs for all newly issued certificates, starting October 1, 2022.
> That is a reasonable interpretation, because our effective dates for policy changes typically apply to certificates issued after the effective date.
> > We acknowledge that there can be multiple interpretations of this requirement. We would ask for clarity from Mozilla in this case if the requirement includes all pre-certificates issued before October 1, 2022.
> The requirement does not include pre-certificates issued before October 1, 2022.
> All pre-certificates issued on October 1, 2022, or later must satisfy the requirement.
Respectfully, this community has had a different interpretation of such cases and this was discussed when SC31 required a reasonCode (not unspecified) for the revocation of CA Certificates. Although the requirement became effective 2020-09-30, the expectation was that CAs had to add revocation reasons for all past revocations, because a CRL issued after 2020-09-30 had to contain a reason for all CA Certificate revocations. The discussion was conducted in m.d.s.p. and is available at https://groups.google.com/g/mozilla.dev.security.policy/c/7z6dqwdc16o/m/TVHevphhCwAJ.
The safest interpretation of such requirements is to retroactively check if the requirement applies and fix accordingly.
Comment 18•2 years ago
I've opened a conversation on m-d-s-p with the hope of resolving some of the underlying policy issues discussed in the comments made here thus far. See https://groups.google.com/a/mozilla.org/g/dev-security-policy/c/x3sRo8tALr0/m/cjLHyFQOBAAJ .
Thanks,
Ben