WISeKey: OCSP responding "Unauthorized" for a TLS certificate
Categories
(CA Program :: CA Certificate Compliance, task)
Tracking
(Not tracked)
People
(Reporter: pfuentes, Assigned: pfuentes)
Details
(Whiteboard: [ca-compliance] [ocsp-failure])
Steps to reproduce:
We are opening this incident as a placeholder and we will publish a full incident report early next week.
Today our internal monitoring system raised an alarm because a TLS certificate was listed in OCSP Watch with the error "error from server: unauthorized".
SSLMate was pointing to a different certificate, not under our Root, so we assumed it was a false positive and reported the issue. SSLMate confirmed that there was indeed a problem with a certificate, although the reported link was incorrect.
The certificate (https://crt.sh/?id=13445735643) is not revoked (and it must not be), and we are investigating the root cause of this situation.
We will write a full incident report, in the appropriate format, in the coming days, once our investigation is further along.
No other certificates seem to be affected. Certificate issuance and revocation services are working as usual.
Comment 1•3 months ago (Assignee)
Hello,
just a heads-up about this.
We are finalising our investigation into the root cause of the problem (in summary, an error prevented the final certificate from being fully issued). Given that the issue affects only a pre-certificate, we are considering requesting that this bug be closed as INVALID.
The rationale for this request, if we do make it, would be based on:
- The BRs make it optional to respond for reserved serial numbers (Section 4.9.10, "The OCSP responder MAY provide definitive responses about “reserved” certificate serial numbers, as if there was a corresponding Certificate that matches the Precertificate [RFC6962].")
- Similar bugs such as https://bugzilla.mozilla.org/show_bug.cgi?id=1580393#c8
We may need a bit more time before posting here again. If in the meantime anyone wants to chime in on the issue (an unauthorised response for a pre-certificate whose certificate doesn't exist), comments are welcome!
Comment 2•3 months ago
Hi Pedro,
I believe bug 1580393 was quite a few years ago, before some changes were made to the Mozilla Policy.
Specifically, Section 5.4 states that
a CA MUST provide CRL and OCSP services and responses in accordance with this policy for all certificates presumed to exist based on the presence of a precertificate, even if the certificate does not actually exist.
I believe the BR language about the reserved status would fulfill this requirement, but an unauthorized response would not.
Comment 3•3 months ago
I agree with Martijn's analysis of the MRSP. Additionally, the OCSP carve-out for Reserved serials in the BRs is ineffective (and I filed this bug to remove it a year ago) because Section 7.1.2.9 also says
The existence of a signed Precertificate can be treated as evidence of a corresponding Certificate also existing.
Since all Precertificates are evidence of the corresponding Certificate existing, then all serials used in precertificates have evidence that they are actually Assigned, not just Reserved, and therefore OCSP responders MUST provide responses for them.
Comment 4•3 months ago (Assignee)
Thanks, Martijn and Aaron, for your comments.
Incident Report
Summary
A certificate issuance error occurred due to a race condition in EJBCA when a client application simultaneously requested two certificates for the same user. This resulted in one certificate request not being fully processed and not correctly persisted in our CA and OCSP database.
Impact
The impacted certificate was not initially persisted correctly, although it was stored as a precertificate in the EJBCA IncompleteIssuanceJournalData table and later manually published. Additionally, the certificate was not automatically published into the OCSP database, so the OCSP service was responding "unauthorized" for this certificate.
Timeline
All times are UTC.
2024-06-19
- 17:16 Certificate issuance attempted but affected by a concurrency issue.
2024-06-20
- 02:02 A notification was sent from our monitoring tool indicating changes in OISTE's OCSP Watch feed.
- 04:39 A report was sent to SSLMate indicating that a URL in the entry that appeared in our feed was incorrect and we (incorrectly) assumed this was a false positive.
- 11:46 SSLMate responded acknowledging that the URL was indeed incorrect but the problem still existed for an OISTE certificate.
- 12:18 Investigation started.
2024-06-21
- 02:09 Precertificate manually published to the CA and OCSP database.
2024-06-22
- 02:02 A notification was sent from our monitoring tool indicating recovery in OISTE's OCSP Watch feed.
2024-06-24
- 17:37 Automated monitoring implemented for the IncompleteIssuanceJournalData EJBCA table to get an alert if any precertificate gets registered there.
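As a quick arithmetic check on the timeline above, the following sketch computes two intervals from the listed UTC timestamps. It assumes that the window without a definitive OCSP response began at the failed issuance attempt and ended at the manual publication; the variable names are illustrative only.

```python
from datetime import datetime, timezone


def utc(ts: str) -> datetime:
    """Parse a 'YYYY-MM-DD HH:MM' timestamp from the timeline as UTC."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M").replace(tzinfo=timezone.utc)


# Timestamps copied from the timeline.
issuance_attempt = utc("2024-06-19 17:16")
first_alert = utc("2024-06-20 02:02")
manual_publication = utc("2024-06-21 02:09")

# Assumption: the responder had no record of the serial for this whole span.
unanswered_window = manual_publication - issuance_attempt
alert_to_fix = manual_publication - first_alert

print(unanswered_window)  # 1 day, 8:53:00
print(alert_to_fix)       # 1 day, 0:07:00
```

Under that assumption, the certificate lacked a definitive OCSP response for roughly 33 hours, of which about 24 hours elapsed after the first monitoring alert.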
Root Cause Analysis
The root cause was a race condition in EJBCA triggered by simultaneous certificate requests for the same EJBCA user from the same client application. EJBCA is apparently not equipped to handle such concurrent operations.
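One generic way for a client application to avoid sending concurrent requests for the same user is to serialize them behind a per-user lock. The sketch below illustrates that idea only; it is not the change that was actually deployed, `PerUserSerializer` and `fake_request` are hypothetical names, and an in-process lock would not help if several client instances talk to the CA at once.

```python
import threading
import time
from collections import defaultdict


class PerUserSerializer:
    """Runs at most one request at a time for any given user name."""

    def __init__(self):
        self._guard = threading.Lock()              # protects the lock table
        self._locks = defaultdict(threading.Lock)   # one lock per user name

    def run(self, username, request_fn):
        with self._guard:
            lock = self._locks[username]
        with lock:  # a second request for the same user waits here
            return request_fn()


def demo(n_threads=4):
    """Fire several simultaneous requests for one user; return peak overlap."""
    serializer = PerUserSerializer()
    counter_lock = threading.Lock()
    active = 0
    max_active = 0

    def fake_request():  # stand-in for a real certificate request
        nonlocal active, max_active
        with counter_lock:
            active += 1
            max_active = max(max_active, active)
        time.sleep(0.01)
        with counter_lock:
            active -= 1
        return "issued"

    threads = [
        threading.Thread(target=serializer.run, args=("user-1", fake_request))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return max_active
```

Calling `demo()` returns the highest number of requests that were in flight for the same user at once, which stays at 1 when the serializer is in place.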
Lessons Learned
What went well
- Our monitoring systems detected the issue and manual intervention allowed for the certificate to be published.
What didn't go well
- Our client application could not prevent a race condition between concurrent requests for the same EJBCA user, which resulted in a certificate that failed to be fully issued and published to the OCSP database.
Where we got lucky
- The incident did not escalate to become a violation of Mozilla Root Store Policy due to our automated monitoring of the OCSP Watch feed for OISTE, which allowed for timely detection and manual correction within the allowed 4-day window required by https://www.mozilla.org/en-US/about/governance/policies/security-group/certs/policy/:
- Revocation
...
For end entity certificates, if the CA provides revocation information via an Online Certificate Status Protocol (OCSP) service:
- it MUST update that service at least every four days;
For the previous reason and according to our understanding this would not warrant an Incident Report and should be closed as INVALID, but we are providing the Incident Report here anyway to allow other CAs to learn from our experience.
Action Items
Action Item | Kind | Due Date |
---|---|---|
The precertificate was manually published to the CA and OCSP database. | Correct | 2024-06-21 |
Implement monitoring for the IncompleteIssuanceJournalData EJBCA table to get an alert if any precertificate gets registered there. | Prevent | 2024-06-24 |
Improve the EJBCA client application to avoid issues related to concurrency | Prevent | 2024-07-25 |
Study and possibly enable the EJBCA Pre-Certificate Maintenance Service (https://docs.keyfactor.com/ejbca/latest/pre-certificate-maintenance-service), which can automatically publish to the OCSP database precertificates whose issuance did not complete. | Prevent | 2024-07-25 |
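
The table-monitoring action item could look roughly like the sketch below. It is only an illustration: it uses an in-memory SQLite database as a stand-in for the CA database, and the column names `serialNumber` and `startTime` are hypothetical, not taken from the EJBCA schema. Only the table name comes from this report.

```python
import sqlite3


def find_incomplete_issuances(conn):
    """Return rows left behind in the journal table, oldest first."""
    rows = conn.execute(
        "SELECT serialNumber, startTime "      # hypothetical column names
        "FROM IncompleteIssuanceJournalData "
        "ORDER BY startTime"
    ).fetchall()
    return [{"serial": r[0], "started": r[1]} for r in rows]


def check_and_alert(conn, alert=print):
    """Send one alert per journal entry and return how many were found."""
    pending = find_incomplete_issuances(conn)
    for entry in pending:
        alert(f"Incomplete issuance for serial {entry['serial']}")
    return len(pending)


def make_demo_db():
    """Build a throwaway database with one stuck precertificate entry."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE IncompleteIssuanceJournalData "
        "(serialNumber TEXT, startTime INTEGER)"
    )
    conn.execute(
        "INSERT INTO IncompleteIssuanceJournalData VALUES ('0A1B', 1718817360)"
    )
    return conn
```

In practice such a check would run on a schedule and hand `alert` a function that pages an operator, so that any entry remaining in the table is investigated promptly.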
Appendix
Details of affected certificates
Comment 5•3 months ago
BR 4.10.2:
The CA SHALL maintain an online 24x7 Repository that application software can use to automatically check the current status of all unexpired Certificates issued by the CA.
MRSP 6 paragraph 1:
CA operators MUST maintain an online 24x7 repository mechanism whereby application software can automatically check online the current status of all unexpired certificates issued by the CA.
MRSP 6 paragraph 3:
responses MUST have a defined value in the nextUpdate field
Unauthorized OCSP responses are error responses that contain neither a nextUpdate field nor the status of the certificate. Therefore they cannot satisfy the requirement of operating a 24x7 service to check the current status of a certificate.
The four day window applies when updating from one definitive response to another (e.g. good to revoked).
Therefore, this was a compliance violation.
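The distinction between error responses and definitive responses can be sketched as follows. The status codes are the `OCSPResponseStatus` values defined in RFC 6960; the helper name `can_carry_cert_status` is made up for this illustration.

```python
from enum import IntEnum


class OCSPResponseStatus(IntEnum):
    """OCSPResponseStatus values from RFC 6960 (value 4 is not used)."""
    SUCCESSFUL = 0
    MALFORMED_REQUEST = 1
    INTERNAL_ERROR = 2
    TRY_LATER = 3
    SIG_REQUIRED = 5
    UNAUTHORIZED = 6


def can_carry_cert_status(status: int) -> bool:
    """Only a 'successful' response includes responseBytes, and therefore a
    certificate status (good / revoked / unknown) with thisUpdate/nextUpdate.
    Every other status is a bare error with no certificate information."""
    return status == OCSPResponseStatus.SUCCESSFUL


for status in OCSPResponseStatus:
    print(f"{status.name}: definitive response possible = "
          f"{can_carry_cert_status(status)}")
```

An `UNAUTHORIZED` response therefore tells a relying party nothing about the certificate itself, which is the basis of the argument above.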
Similar incidents: Bug 1753123, Bug 1758372
The race condition is interesting, and the fact that it impacts EJBCA implies that this could be a common configuration mistake. Do you have more details on that? I'll note that it seems you are awaiting a vendor fix and are still impacted, allowing mis-issuances to continue. I also see no mention of issuance being stopped during the investigation, before the band-aid for detection was applied.
I'm still not quite sure what you mean by the initial certificate not being fully processed. A pre-certificate is mentioned, so presumably at least one well-formed and signed certificate was created for linting and CT purposes. Given it hit the OCSP cache, I presume it was signed with an intermediary and thus was considered a valid certificate albeit with the poison extension applied?
(In reply to Pedro Fuentes from comment #4)
Where we got lucky
- The incident did not escalate to become a violation of Mozilla Root Store Policy due to our automated monitoring of the OCSP Watch feed for OISTE, which allowed for timely detection and manual correction within the allowed 4-day window required by https://www.mozilla.org/en-US/about/governance/policies/security-group/certs/policy/:
- Revocation
...
For end entity certificates, if the CA provides revocation information via an Online Certificate Status Protocol (OCSP) service:
- it MUST update that service at least every four days;
For the previous reason and according to our understanding this would not warrant an Incident Report and should be closed as INVALID, but we are providing the Incident Report here anyway to allow other CAs to learn from our experience.
This is an interestingly unique interpretation of MRSP policy. How are you reconciling the acknowledgement of Comment 2 and Comment 3 specifically on MRSP interpretation with this view?
Comment 7•3 months ago (Assignee)
Hello @Andrew, @Wayne,
We aren't challenging the need to treat this as an incident, even if I'm still unsure about the interpretation of the 4-day window: IMHO it may also be read as applying to the initial load of information into the OCSP responder DB. (To avoid misinterpretation of this comment: our systems update the OCSP information in real time; this particular case was the exception, because of the incident being disclosed here.)
Anyhow, we opened this Bugzilla proactively from our side because we thought it was the right thing to do, so we don't have any issue with having this incident going through its natural process. Any comment that is received in the meantime is always enriching and an opportunity to learn.
Next week we will publish a progress report. As we said, issuance systems are working as expected and we have already implemented a first set of countermeasures that control the risk of recurrence of this problem.
Comment 8•3 months ago
(In reply to Pedro Fuentes from comment #1)
The rationale for this request, if we finally do it, would be based on:
- The BR set as optional the need to respond to reserved serial numbers (Section 4.9.10, "The OCSP responder MAY provide definitive responses about “reserved” certificate serial numbers, as if there was a corresponding Certificate that matches the Precertificate [RFC6962].")
- Similar bugs such as https://bugzilla.mozilla.org/show_bug.cgi?id=1580393#c8
That comment in the bug says, "Given the outcome of the discussion on the mozilla.dev.security.policy list," and if you follow the link to the discussion, you will see that in September 2019, Mozilla added the following policy, and previous incidents were closed only because they happened before the finalization of the policy: "A CA must provide OCSP services and responses in accordance with Mozilla policy for all certificates presumed to exist based on the presence of a Precertificate, even if the certificate does not actually exist." Ballot SC23: Precertificates was passed so that the BRs didn't conflict with Mozilla policy.
You can see this policy still exists in version 2.9 of the Mozilla Root Store Policy as Martijn said in comment #2.
Comment 9•3 months ago (Assignee)
A heads-up on the progress...
The action item "Improve the EJBCA client application to avoid issues related to concurrency" that was planned for 25-July has been already implemented.
The pending item "Study and possibly enable the EJBCA Pre-Certificate Maintenance Service..." wouldn't be needed, but most likely we'd still complete it as a backup countermeasure.
We are closely monitoring all systems.
Comment 10•3 months ago
For cross-reference, see https://bugzilla.mozilla.org/show_bug.cgi?id=1905419#c3
Comment 11•3 months ago (Assignee)
No update this week. The pending item may or may not be executed, depending on the outcome of some tests.
Regarding the link with the GoDaddy incident: our case is different because it was caused by a "bug" that needed correction and not by architecture design. Still, I think it would be beneficial to clarify the accepted lead times for updating information in the OCSP responder, both for the initial update after certificate issuance and for updates after revocation status changes. In particular, I think the applicability of the four-day period stated in the Mozilla Policy should be clarified.
Comment 12•3 months ago
I'm wondering whether a 15-minute latency for publication of OCSP responses for pre-certificates would be something that should be adopted either in the Mozilla Root Store Policy or by the CA/B Forum. I filed an issue in GitHub for this: https://github.com/mozilla/pkipolicy/issues/280.
Comment 13•3 months ago (Assignee)
No update this week. We will report next week whether we have decided to execute the pending optional action item.
Regarding the proposed change in the MRSP, I think this should be discussed with a broader perspective (not only precerts) at the CABF because, the way I see it, there's an inconsistency with the 24-hour CRL update period when there's a revocation status change.
Comment 14•3 months ago
In the CA/B Forum - here is a draft proposal - https://github.com/cabforum/servercert/pull/535.
Comment 15•2 months ago (Assignee)
We have completed all the planned tasks, including the last optional action (enable the EJBCA Pre-Certificate Maintenance Service).
We don't foresee other actions related to this issue, and we request that its closure be planned for next week, to allow time for any further comments.
Comment 16•2 months ago
I intend to close this on or about next Wed. 2024-07-31, unless there are additional questions or comments.