Entrust: OCSP response signed with SHA-1
Categories
(CA Program :: CA Certificate Compliance, task)
Tracking
(Not tracked)
People
(Reporter: bruce.morton, Assigned: bruce.morton)
Details
(Whiteboard: [ca-compliance] [ocsp-failure])
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36
Assignee
Comment 1•1 year ago
Summary
After learning about OCSP Watch (https://sslmate.com/labs/ocsp_watch/), our Operations team identified that two root CA OCSP responders were signing responses using SHA-1. We immediately scheduled a change so that the OCSP responders sign with SHA-256.
Impact
OCSP responses providing the status for some CA certificates were signed with SHA-1 and not SHA-256 as required by the TLS BR 7.1.3.2.1.
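As a rough illustration of the check involved: the signature algorithm of an OCSP response is identified by an OID, so SHA-1 signatures can be flagged with a simple lookup. The OIDs below are the standard PKCS #1 and ANSI X9.62 identifiers. The function name, and the assumption that some other tool has already extracted the dotted OID string from the response, are hypothetical and do not describe Entrust's or OCSP Watch's tooling.

```python
# Signature algorithm OIDs mapped to the hash function they use.
SIGNATURE_OID_HASH = {
    "1.2.840.113549.1.1.5": "sha1",     # sha1WithRSAEncryption
    "1.2.840.113549.1.1.11": "sha256",  # sha256WithRSAEncryption
    "1.2.840.113549.1.1.12": "sha384",  # sha384WithRSAEncryption
    "1.2.840.113549.1.1.13": "sha512",  # sha512WithRSAEncryption
    "1.2.840.10045.4.1": "sha1",        # ecdsa-with-SHA1
    "1.2.840.10045.4.3.2": "sha256",    # ecdsa-with-SHA256
    "1.2.840.10045.4.3.3": "sha384",    # ecdsa-with-SHA384
    "1.2.840.10045.4.3.4": "sha512",    # ecdsa-with-SHA512
}


def uses_sha1(signature_oid: str) -> bool:
    """Return True if the OID names a SHA-1 based signature algorithm.

    Unknown OIDs raise KeyError so that they are reviewed rather than
    silently treated as acceptable.
    """
    return SIGNATURE_OID_HASH[signature_oid] == "sha1"
```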
Timeline
2024-02-03
- 17:00 UTC - Operations reviewed OCSP Watch and saw an error reported for the Entrust.net Certification Authority (2048) (https://crt.sh/?caid=32) root OCSP responder. Operations then reviewed the OCSP monitoring system and found that the Entrust Root Certification Authority (https://crt.sh/?caid=99) root OCSP responder was also affected. No other errors were found.
2024-02-05:
- 18:00 UTC – Confirmed authorization to fix the error.
- 19:00 UTC – Tested fix in Staging environment and investigated the root cause.
- 20:30 UTC – Submitted change request for approval.
2024-02-06
- 21:00 UTC – Applied the fix to production.
- 21:15 UTC – OCSP monitoring was updated to ensure the OCSP responders correctly sign with SHA-256.
Root Cause Analysis
Both root CAs have self-signed certificates that are signed using SHA-1. By default, the OCSP responder signs with the algorithm that was used to sign the CA certificate. A year before the OCSP SHA-1 sunset date, the offline root components were updated to sign with SHA-256. The online OCSP responders were not updated, so they did not meet the eventual OCSP SHA-1 sunset date.
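The default behaviour described here can be modelled in a few lines. This is only a sketch of the inheritance rule as stated in the root cause analysis; `effective_ocsp_hash` and its `override` parameter are hypothetical names, not settings of any particular OCSP responder product.

```python
from typing import Optional


def effective_ocsp_hash(ca_cert_hash: str, override: Optional[str] = None) -> str:
    """Hypothetical model: without an explicit override, the responder
    inherits the hash algorithm used to sign the CA certificate."""
    return override if override is not None else ca_cert_hash


# A root whose self-signed certificate uses SHA-1 keeps producing
# SHA-1 signed responses until an explicit override is configured.
before_fix = effective_ocsp_hash("sha1")
after_fix = effective_ocsp_hash("sha1", override="sha256")
```

Under this model, hierarchies whose CA certificates were already signed with SHA-256 never needed an override, which is consistent with only the two SHA-1 signed roots being affected.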
Lessons Learned
What went well
- Since Entrust had moved most CAs to a minimum of SHA-256, the default configuration setting did not result in SHA-1 signing in most instances.
What didn't go well
- Configuration settings for all OCSP responders were not reviewed before the sunset date. The monitoring software's signing-algorithm checks had not changed since monitoring was originally set up. The monitoring software does not check against third-party rules.
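The gap between the two kinds of check can be sketched as follows. A baseline check only confirms that a responder still does what the monitor was told to expect, while a policy check compares the observation against the rule in force on a given date. The sunset date comes from this thread; the function names and the simplified list of permitted hashes are assumptions for illustration.

```python
from datetime import date

# SC-53 effective date, as given in this thread.
SHA1_OCSP_SUNSET = date(2022, 6, 1)


def baseline_check(observed: str, expected: str) -> bool:
    """Passes whenever the responder matches the monitor's stored expectation."""
    return observed == expected


def policy_check(observed: str, on: date) -> bool:
    """Passes only if the observed hash is permitted on the given date."""
    if observed == "sha1":
        return on < SHA1_OCSP_SUNSET
    return observed in {"sha256", "sha384", "sha512"}
```

With the monitor still expecting SHA-1, `baseline_check("sha1", "sha1")` passes, while `policy_check("sha1", date(2024, 2, 3))` fails.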
Where we got lucky
Action Items
Action Item | Kind | Due Date |
---|---|---|
OCSP responders reconfigured to sign with SHA-256 | Correction | 2024-02-06 |
OCSP responder monitoring configuration update to expect SHA-256 | Detection | 2024-02-06 |
Add OCSP Watch to daily monitoring | Detection | 2024-02-29 |
We will follow up if we decide to take any actions to prevent a similar error for a future ballot.
Appendix
Details of affected certificates
No certificates were affected.
Assignee
Comment 2•1 year ago
Monitoring using OCSP Watch was added on 2024-02-09. Action items listed have been completed.
Comment 3•11 months ago
The OCSP SHA-1 sunset date was 2022-06-01, over 1.5 years before this incident was detected. It sounds to me like the root cause here is not one of monitoring (the monitors were successfully confirming that OCSP was being signed with the expected algorithm) but one of human process: keeping track of requirements and sunset dates, being aware of all systems potentially affected by changing requirements, and re-confirming compliance with requirements on a regular basis. Does Entrust plan to take any action items to ensure that similar requirements dates are not missed in the future?
Assignee
Comment 4•11 months ago
We do have a system to track ballots and effective dates. In this case, we proactively implemented the change, which was completed for our offline roots. The online OCSP responders were purposely held back from the proactive tasks to continue to support some SHA-1 hierarchies. Unfortunately, we failed to set a reminder to end the delay. We have updated the procedures for our Operational Authority to ensure requirement dates are not missed and ballot actions are closed when all tasks are complete.
Comment 5•11 months ago
The timeline should include the relevant dates for the action items and reviews or missed reviews that led up to the incident. For example, when were the offline root components updated to sign with SHA-256, when was the date that the requirements came into effect, when was the first response that didn't follow the requirements and therefore the start of the incident.
Mozilla policy now requires CAs to perform a Compliance Self-Assessment annually. Had Entrust previously done a self-assessment either following the CCADB framework or your own? What steps does Entrust take to confirm current operations meet requirements?
Assignee
Comment 6•10 months ago
Apologies for the delay. We will update the timeline for this incident.
Assignee
Comment 7•10 months ago
(In reply to Mathew Hodson from comment #5)
The timeline should include the relevant dates for the action items and reviews or missed reviews that led up to the incident. For example, when were the offline root components updated to sign with SHA-256, when was the date that the requirements came into effect, when was the first response that didn't follow the requirements and therefore the start of the incident.
Here is an update to the timeline:
Timeline (all times in UTC)
2020-09
- Updated root and subordinate CAs to SHA-256 OCSP signing, with some exceptions, including Entrust.net Certification Authority (2048) (https://crt.sh/?caid=32) and Entrust Root Certification Authority (https://crt.sh/?caid=99), kept for backwards compatibility.
2022-01-24
- CA/Browser Forum ballot SC-53 Sunset for SHA-1 OCSP Signing was passed by voting.
2022-02-16
- Policy Authority meeting discussed CA/Browser Forum ballots and recorded the status of SC-53 as open.
2022-03
- Updated OCSP signing algorithm for a private trust and for the AffirmTrust Networking root and the associated subordinate CA. Discovered the OCSP signing algorithm was not based on the delegated OCSP signing certificate, but on the signing algorithm associated with the CA certificate. The playbooks were updated, but the Entrust.net Certification Authority (2048) and Entrust Root Certification Authority were missed.
2022-05-18
- Policy Authority meeting discussed CA/Browser Forum ballots and recorded the status of SC-53 as closed.
2022-05-31
- Compliance team confirmed we would no longer be using SHA-1 to sign OCSP responses. This was interpreted as Compliance confirming the ballot requirements; however, implementation was not verified.
2022-06-01
- 00:00 – CA/Browser Forum ballot SC-53, stating “CAs MUST NOT sign OCSP responses using the SHA-1 hash algorithm”, became effective. From this point, Entrust.net Certification Authority (2048) and Entrust Root Certification Authority were non-compliant.
2024-02-03
- 17:00 - Operations reviewed OCSP Watch and saw an error reported for the Entrust.net Certification Authority (2048) root OCSP responder. Operations then reviewed the OCSP monitoring system and found that the Entrust Root Certification Authority root OCSP responder was also affected. No other errors were found.
2024-02-05:
- 18:00 – Confirmed authorization to fix the error.
- 19:00 – Tested fix in Staging environment and investigated the root cause.
- 20:30 – Submitted change request for approval.
2024-02-06
- 21:00 – Applied the fix to production.
- 21:15 – OCSP monitoring was updated to ensure the OCSP responders correctly sign with SHA-256.
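For reference, the intervals implied by this timeline can be worked out directly from the timestamps above. This is a small arithmetic sketch; the variable names are illustrative only.

```python
from datetime import datetime, timezone

UTC = timezone.utc
sunset = datetime(2022, 6, 1, 0, 0, tzinfo=UTC)     # SC-53 became effective
detected = datetime(2024, 2, 3, 17, 0, tzinfo=UTC)  # OCSP Watch reviewed
fixed = datetime(2024, 2, 6, 21, 0, tzinfo=UTC)     # fix applied to production

undetected = detected - sunset     # time the issue went unnoticed
non_compliant = fixed - sunset     # total time out of compliance
time_to_fix = fixed - detected     # detection to production fix

print(undetected.days)                       # 612
print(non_compliant.days)                    # 615
print(time_to_fix.total_seconds() / 3600)    # 76.0
```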
Mozilla policy now requires CAs to perform a Compliance Self-Assessment annually. Had Entrust previously done a self-assessment either following the CCADB framework or your own? What steps does Entrust take to confirm current operations meet requirements?
Entrust performs an annual self-assessment based on the CCADB framework. With regards to the TLS BRs, the framework ensures the policy is stated in the disclosed documents and is not a technical assessment. We also do similar assessments for requirements and policies not covered by the CCADB self-assessment, such as Apple policy, Microsoft policy and Adobe Approved Trust List (AATL) requirements.
Our Operations teams perform quality checks when changes are made, such as a ballot change. Monitoring would also be updated to continually check many requirements.
Comment 8•10 months ago
The playbooks were updated, but the Entrust.net Certification Authority (2048) and Entrust Root Certification Authority were missed.
This seems like it should be the focus of your actual root cause analysis. Why were these CAs treated differently? What happens during your Policy Authority updates that could cause elision of whole hierarchies?
Assignee
Comment 9•10 months ago
(In reply to honest_enteropneust from comment #8)
The playbooks were updated, but the Entrust.net Certification Authority (2048) and Entrust Root Certification Authority were missed.
This seems like it should be the focus of your actual root cause analysis. Why were these CAs treated differently?
Those roots were treated differently because they supported Code Signing and Time-stamping subordinate CAs, and Entrust was trying to extend SHA-1 support for legacy relying parties up to the sunset date. The issue was that some customers had code-signed applications embedded in POE/Edge devices that could not easily be updated. The goal was to extend support for SHA-1 as close to the sunset date as possible; however, the exception was not closed before the deadline.
What happens during your Policy Authority updates that could cause elision of whole hierarchies?
The Policy Authority's purpose was to get a status of all open ballots, and the status was deemed to be complete. We will follow up with an action for the Operational Authority to ensure the ballot update is complete, including any temporary exceptions.
Assignee
Comment 10•10 months ago
We are adding the following action to ensure granted exceptions are addressed and ballot deadlines are followed.
Action Item | Kind | Due Date |
---|---|---|
Ticketing system for the PKI operational team will be updated to track exceptions and deadlines for ballots and browser policy compliance. All exceptions will be ticketed and tracked through completion. | Prevention | 2024-05-31 |
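A minimal sketch of what tracking exceptions against ballot deadlines could look like, assuming a simple in-memory record per exception. The class, field names and `overdue` helper are hypothetical and are not a description of the ticketing system Entrust uses; the example data reuses the two responders and the SC-53 date from this incident.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class BallotException:
    ballot: str
    system: str
    deadline: date
    closed_on: Optional[date] = None  # None while the exception is still open


def overdue(exceptions: List[BallotException], today: date) -> List[BallotException]:
    """Return exceptions that are still open on or after their deadline."""
    return [e for e in exceptions if e.closed_on is None and today >= e.deadline]


tracked = [
    BallotException("SC-53", "Entrust.net Certification Authority (2048) OCSP responder", date(2022, 6, 1)),
    BallotException("SC-53", "Entrust Root Certification Authority OCSP responder", date(2022, 6, 1)),
]
still_open = overdue(tracked, today=date(2022, 6, 1))
```

Running a check like this on a schedule, rather than only when a ballot is discussed, is what would surface an exception that was granted but never closed.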
Assignee
Comment 11•10 months ago
If there are no other comments, we request that the next update be set to 3 May 2024.
Comment 12•10 months ago
So, I ran a couple of very basic searches:
There have been multiple bugs open here in the past mentioning this tool. Why did it take you until 2024-02-03, to notice it? Am I right to assume that you do not have any procedures in place to monitor incidents involving other CAs to learn from, and apply those lessons learned to your own CA?
Once you noticed this issue, why did it take you until 2024-02-09 to file this incident?
An initial report should be filed within 72 hours of the CA Owner being made aware of the incident. If a full incident report is not yet ready, CA Owners should provide a preliminary report containing an executive summary of the incident and a date by which the full report will be posted.
Your timeline also fails to mention the 2024-02-09 date.
Based on this, I think there is another incident here for failing to provide the initial report within 72 hours.
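The 72-hour window can be checked against the dates in the thread. This sketch assumes the CA became aware at 2024-02-03 17:00 UTC, the time Operations reviewed OCSP Watch according to the timeline; only the date of filing (2024-02-09) is known, so the comparison is made on dates.

```python
from datetime import date, datetime, timedelta, timezone

aware = datetime(2024, 2, 3, 17, 0, tzinfo=timezone.utc)  # from the timeline
initial_report_due = aware + timedelta(hours=72)
filed_on = date(2024, 2, 9)  # only the filing date is given in the thread

print(initial_report_due.isoformat())        # 2024-02-06T17:00:00+00:00
print(filed_on > initial_report_due.date())  # True
```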
Assignee
Comment 13•10 months ago
(In reply to amir from comment #12)
Once you noticed this issue, why did it take you until 2024-02-09 to file this incident?
Based on this, I think there is another incident here for failing to provide the initial report within 72 hours.
Hi Amir, thank you for the questions. We will address and follow up.
Bruce.
Assignee
Comment 14•10 months ago
(In reply to amir from comment #12)
So, I ran a couple of very basic searches:
There have been multiple bugs open here in the past mentioning this tool. Why did it take you until 2024-02-03, to notice it? Am I right to assume that you do not have any procedures in place to monitor incidents involving other CAs to learn from, and apply those lessons learned to your own CA?
We do have procedures in place to monitor incidents from other CAs. Unfortunately, in the case of OCSP, we did not proactively test our endpoints with the listed tools as we considered those issues already addressed. Going forward, we will ensure that our operations team is informed about tools that have successfully identified issues for other CAs, enabling us to conduct testing ourselves and, whenever feasible, ongoing monitoring.
Once you noticed this issue, why did it take you until 2024-02-09 to file this incident?
An initial report should be filed within 72 hours of the CA Owner being made aware of the incident. If a full incident report is not yet ready, CA Owners should provide a preliminary report containing an executive summary of the incident and a date by which the full report will be posted.
Your timeline also fails to mention the 2024-02-09 date.
Based on this, I think there is another incident here for failing to provide the initial report within 72 hours.
After consulting with CCADB Support, we've clarified the necessary timeline for initial incident reports. While the CCADB Policy recommends filing incident reports within 72 hours, it doesn't mandate CAs to do so within 72 hours of the CA Owner's awareness of the incident. We've complied with CCADB and root program requirements by submitting a comprehensive incident report within two weeks. However, we acknowledge the value of providing an initial report promptly for transparency. To address this, we'll assess and establish guidelines for when to issue initial incident reports and incorporate them into our procedures.
Comment 15•10 months ago
We do have procedures in place to monitor incidents from other CAs.
That's great! Can you please provide the triage logs that Entrust did for the following incidents?
- https://bugzilla.mozilla.org/show_bug.cgi?id=1765800
- https://bugzilla.mozilla.org/show_bug.cgi?id=1879529
- https://bugzilla.mozilla.org/show_bug.cgi?id=1879552
- https://bugzilla.mozilla.org/show_bug.cgi?id=1763203
- https://bugzilla.mozilla.org/show_bug.cgi?id=1844514
What I'm looking for in the logs:
- Date that the incident was triaged
- Findings from the triage process
- If any action items came from the triage
Comment 16•10 months ago
After consulting with CCADB Support, we've clarified the necessary timeline for initial incident reports.
I'm trying to see where this discussion happened. I do not see any threads in https://groups.google.com/a/ccadb.org/g/public, nor do I see anything here: https://groups.google.com/a/mozilla.org/g/dev-security-policy
It would help future CAs to also know what the clear language from CCADB is, and for transparency's sake it will help if we know who gave you such guidance.
Comment 17•10 months ago
The referenced language is from https://www.ccadb.org/cas/incident-report, which states in part:
An initial report should be filed within 72 hours of the CA Owner being made aware of the incident. If a full incident report is not yet ready, CA Owners should provide a preliminary report containing an executive summary of the incident and a date by which the full report will be posted. The full incident report must be posted within two weeks of the incident.
For transparency, I responded to Entrust as the on-rotation CCADB Support representative including the following (emphasis mine):
the CCADB Policy imposes no requirement upon CAs to file an initial report nor to do so within 72 hours of the CA Owner being made aware of the incident. However, this is intended as guidance which is compatible with, but not authoritative over and above, individual Root Store requirements of Root Store Operators participating in the CCADB.
This response limits itself to pertinent information from the guidance published by the CCADB as it is not the CCADB's goal nor purpose to mediate between CAs and individual Root Programs. Noting that, the CCADB Steering Committee very much welcomes community input (e.g. via Bugzilla: https://bugzilla.mozilla.org/enter_bug.cgi?product=CA+Program&component=Common+CA+Database).
Comment 18•10 months ago
Thanks Clint for the transparency in information. Personal opinion, but in the future I hope CAs try to have these conversations on the public mailing lists to reduce the confusion in such areas.
I'm not sure I agree with Entrust's assertion that this happened per the policies of root program requirements:
We've complied with CCADB and root program requirements by submitting a comprehensive incident report within two weeks.
When a CA operator fails to comply with any requirement of this policy - whether it be a misissuance, a procedural or operational issue, or any other variety of non-compliance - the event is classified as an incident and MUST be reported to Mozilla as soon as the CA operator is made aware.
Emphasis: "as soon as the CA operator is made aware.", which, well, I guess depends on the definition of "as soon as", but I don't think ~6 days would fall into that definition.
Beyond this, there's really nothing on Mozilla's root program that states an incident should be made within 2 weeks, other than linking to https://www.ccadb.org/cas/incident-report
In that:
An initial report should be filed within 72 hours of the CA Owner being made aware of the incident. If a full incident report is not yet ready, CA Owners should provide a preliminary report containing an executive summary of the incident and a date by which the full report will be posted.
I read this as "An initial report SHOULD..." (RFC lingo). I know the BRs use SHOULD/MUST/MAY language as well, so maybe this page can also be drafted as such. But I do understand that this guidance is not meant to override what the root programs intend.
Assignee
Comment 19•10 months ago
(In reply to amir from comment #15)
We do have procedures in place to monitor incidents from other CAs.
That's great! Can you please provide the triage logs that Entrust did for the following incidents?
- https://bugzilla.mozilla.org/show_bug.cgi?id=1765800
- https://bugzilla.mozilla.org/show_bug.cgi?id=1879529
- https://bugzilla.mozilla.org/show_bug.cgi?id=1879552
- https://bugzilla.mozilla.org/show_bug.cgi?id=1763203
- https://bugzilla.mozilla.org/show_bug.cgi?id=1844514
What I'm looking for in the logs:
- Date that the incident was triaged
- Findings from the triage process
- If any action items came from the triage
We do not have triage logs to share. Triage of the above incidents would not have triggered a concern related to our OCSP deprecation incident. We do agree that testing using OCSP Watch would have found our incident earlier.
We have added monitoring using OCSP Watch. We will also review our CA incident review process, and as stated above, we will ensure that our operations team is informed about tools that have successfully identified issues for other CAs.
Comment 20•10 months ago
If you do not have any logs for the triage, then how exactly are you monitoring Bugzilla for incidents?
Is there formality to the monitoring? How are new topics discovered and assigned?
Assignee
Comment 21•10 months ago
(In reply to amir from comment #20)
If you do not have any logs for the triage, then how exactly are you monitoring Bugzilla for incidents?
Is there formality to the monitoring? How are new topics discovered and assigned?
We are monitoring Bugzilla through a module subscription similar to how we are registered to the Mozilla dev-security-policy mailing list and the CCADB public mailing list. When topics are discovered in monitoring, they are assigned for follow up as appropriate.
Comment 22•10 months ago
they are assigned for follow up as appropriate.
Were the topics I linked ever assigned out?
How do you make sure that the work of assigning out actually happens? Is there any system to followup to ensure a topic has been seen?
Assignee
Comment 23•9 months ago
(In reply to amir from comment #22)
they are assigned for follow up as appropriate.
Were the topics I linked ever assigned out?
Of the bugs you posted, 4 of 5 were triaged and deemed as previously addressed or not necessary for further follow up.
The fifth -- https://bugzilla.mozilla.org/show_bug.cgi?id=1763203 -- was reviewed 25 May 2022, and should have been assigned out for investigation, but was not.
How do you make sure that the work of assigning out actually happens? Is there any system to follow up to ensure a topic has been seen?
If an incident has been triaged for investigation, then a request is made to the appropriate team and followed up by the compliance team.
We have a process to triage the incidents from other CAs, but currently do not have a programmatic system. We are investigating a new system which may help us to track the lifecycle, assignment and follow up of CA incidents.
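For illustration, a programmatic triage log could be as small as one record per reviewed bug, carrying the three items asked for earlier in the thread (triage date, findings, action items) plus an owner. The class, field names, helper and the findings and action text below are hypothetical placeholders; only the bug URL and the review date are taken from this thread.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class TriageEntry:
    bug_url: str
    triaged_on: date
    findings: str
    action_items: List[str] = field(default_factory=list)
    assigned_to: str = ""  # empty until someone owns the follow-up


def unassigned_actions(log: List[TriageEntry]) -> List[TriageEntry]:
    """Return entries that have action items but no owner yet."""
    return [e for e in log if e.action_items and not e.assigned_to]


log = [
    TriageEntry(
        bug_url="https://bugzilla.mozilla.org/show_bug.cgi?id=1763203",
        triaged_on=date(2022, 5, 25),
        findings="(placeholder) needs investigation",
        action_items=["(placeholder) assign for investigation"],
    ),
]
needs_owner = unassigned_actions(log)
```

A periodic report of `unassigned_actions` would flag a bug that was reviewed and judged to need investigation but was never assigned out.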
Assignee
Comment 24•9 months ago
We have no updates for this week and will continue to monitor the bug.
Assignee
Comment 25•9 months ago
We have no updates for this week and will continue to monitor the bug.
Comment 26•9 months ago
Per comment #10, setting next action for May 31, 2024.
Assignee
Comment 27•9 months ago
Action Item | Kind | Due Date |
---|---|---|
Ticketing system for the PKI operational team will be updated to track exceptions and deadlines for ballots and browser policy compliance. All exceptions will be ticketed and tracked through completion. | Prevention | Done |
All actions are complete. Requesting to close this incident.
Assignee
Comment 28•7 months ago
All actions are complete. Request that this incident is closed.
Comment 29•7 months ago
I'll close this on or about Friday, 19-July-2024.