Closed Bug 1588001 Opened 5 years ago Closed 4 years ago

Apple: OCSP responders return responses with incorrect issuer

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: certification_authority, Assigned: certification_authority)

Details

(Whiteboard: [ca-compliance] [ocsp-failure])

On 03-October-2019 at 13:52 PT, we were notified via a problem report submitted to our Problem Reporting Mechanism that our OCSP responders were returning signed responses with incorrect issuer. Based on an initial investigation, we’ve determined that in some cases when the OCSP service receives a request it cannot process, it signs the response with a default OCSP responder (our OCSP service processes requests for multiple CAs). We are investigating a fix so that responses are not signed by an incorrect OCSP responder.

Further details will be provided no later than 17-October-2019.

Thanks for reporting.

Considering the Baseline Requirements, Section 4.9.5, requires the CA to provide a report within 24 hours of a problem report to the reporter, can you please attach that report here, prior to your expected 17-October update?

I want to highlight that the incident reporting policy defines the absolute upper bound for CAs to disclose. In general, responsible CAs that have nothing to hide can and do provide prompt, detailed, and thorough reports, as they focus on leading by example, to help the entire ecosystem improve, rather than doing only the minimum required of them. Sharing details about the analysis to date, particularly for an incident like this, shows how a CA can be transparent and trustworthy, helping the entire ecosystem improve.

You can read a bit more about what makes a good incident report.

Flags: needinfo?(certification_authority)
Assignee: wthayer → certification_authority
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Whiteboard: [ca-compliance]

CC'ing folks from Sectigo and DigiCert: the scope is presently unclear, but they're both the issuing roots. DigiCert/Sectigo folks: I filed https://github.com/mozilla/pkipolicy/issues/193 to help bring clarity about expectations for root CAs when dealing with sub-CA incident reports. Your input is welcome there (either on the issue or when it eventually comes up for policy discussion on-list).

Incident Report

  1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.

    On 03-October-2019 at 13:51 PT, we were notified via a problem report submitted to our Problem Reporting Mechanism that our OCSP responders were returning signed responses with incorrect issuer.

  2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.

    • 03-October-2019 at 13:51 PT - We were notified via a problem report submitted to our Problem Reporting Mechanism that our OCSP responders were returning signed responses with incorrect issuer.

    • 03-October-2019 at 13:52 PT - The compliance team read the problem report and began pulling the appropriate people together to begin the investigation.

    • 03-October-2019 at 14:15 PT - Investigation began to confirm the reported behavior.

    • 03-October-2019 at 16:15 PT - Based on an initial investigation, we determined that in some cases when the OCSP service receives a request it can’t process, it returns a status of unknown signed with a default OCSP responder, which is not always signed by the CA that issued the certificate whose revocation status is being checked.

    • 03-October-2019 at 19:21 - Notified DigiCert (Root vendor).

    • 03-October-2019 at 20:10 PT - We responded to the reporter with an initial acknowledgement and a commitment to investigate further and respond with more details within 24 hours of the report being submitted.

    • 04-October-2019 at 10:10 PT - Notified Sectigo (Root vendor).

    • 04-October-2019 at 11:26 PT - Notified Ernst & Young (WebTrust assessors).

    • 04-October-2019 at 13:48 PT - We provided a preliminary report on our findings to the individual who filed the problem report.

    • 07-October-2019 at 14:15 PT - We began rolling out a fix to our OCSP service. The fix was to disable the default OCSP responder so that the responses are always signed by the CA that issued the certificate whose revocation status is being checked. Disabling the default OCSP responder ensures that the responder will reply ‘unauthorized’ (as per RFC 6960) for all unknown issuers. The issuer may be unknown if the OCSP service cannot identify the issuer, such as when an OCSP client uses a hash algorithm for CertID that the OCSP service does not support or when the request indicates an unrecognized issuer that is not served by our OCSP service.

    • 10-October-2019 at 19:38 PT - Posted initial incident report to Bugzilla.

    • 17-October-2019 - The fix has been pushed out to the majority of our production OCSP service and is scheduled for completion by 18-October-2019.

  3. Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.

    No non-compliant certificates were issued. The OCSP fix has been pushed out to the majority of our production OCSP service and is scheduled for completion by 18-October-2019.

  4. A summary of the problematic certificates. For each problem: number of certs, and the date the first and last certs with that problem were issued.

    No non-compliant certificates were issued.

  5. The complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem.

    No non-compliant certificates were issued.

  6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

    When the OCSP service was first set up in 2012, the OCSP service software did not allow the default OCSP responder to be disabled. We began issuing publicly trusted TLS certificates in 2014 and the default OCSP responder was not disabled. We did not identify this as an issue because our test cases did not address scenarios in which our OCSP service could not identify the issuer and thus signs the response with the default OCSP responder.

    The default OCSP responder is only used when the OCSP service cannot identify the issuer. This may occur when an OCSP client uses a hash algorithm for CertID that the OCSP service does not support or when the request indicates an unrecognized issuer that is not served by our OCSP service.

    We were aware of section 4.9.9 of the Baseline Requirements that states that OCSP responses must “Be signed by an OCSP Responder whose Certificate is signed by the CA that issued the Certificate whose revocation status is being checked.” but we were unaware that our OCSP responder configuration was violating that requirement when the issuer could not be identified due to the default OCSP responder.

  7. List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.

    1. As mentioned in the timeline above, we are in the process of changing our OCSP service configurations to ensure we respond with an unsigned OCSP error of ‘unauthorized’ for all unknown issuers by disabling the default OCSP responder. This is scheduled for completion by 18-October-2019.

    2. We are enhancing our OCSP service test cases to include additional OCSP request scenarios to ensure responses are compliant with Baseline Requirements. This is scheduled for completion by 22-November-2019.
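The behavior described in the incident report above, and the effect of disabling the default responder, can be modeled roughly as follows. This is a minimal sketch, not the EJBCA implementation: the names `Reply`, `respond`, `violates_br_4_9_9`, and the CA labels are hypothetical, and the revocation lookup is replaced by a placeholder. Only the `unauthorized` status code (6) comes from RFC 6960.

```python
from dataclasses import dataclass
from typing import Optional, Set

# OCSPResponseStatus values defined in RFC 6960
SUCCESSFUL = 0
UNAUTHORIZED = 6


@dataclass(frozen=True)
class Reply:
    response_status: int           # outer OCSPResponseStatus
    cert_status: Optional[str]     # "good", "revoked", "unknown"; None for unsigned errors
    signer_issuer: Optional[str]   # CA that issued the signing responder's certificate; None if unsigned


def respond(hash_alg: str, issuer: str, supported_algs: Set[str],
            served_issuers: Set[str], default_responder_ca: Optional[str]) -> Reply:
    """Hypothetical model of how the service picks a signer for one request."""
    if hash_alg in supported_algs and issuer in served_issuers:
        # Issuer identified: a responder delegated by that CA signs.
        # The real revocation lookup is elided; "good" is a placeholder.
        return Reply(SUCCESSFUL, "good", issuer)
    if default_responder_ca is not None:
        # Behavior before the fix: signed "unknown" from the default responder.
        return Reply(SUCCESSFUL, "unknown", default_responder_ca)
    # Behavior after the fix: unsigned "unauthorized" error.
    return Reply(UNAUTHORIZED, None, None)


def violates_br_4_9_9(reply: Reply, cert_issuer: str) -> bool:
    """A signed reply must chain to the CA that issued the certificate being checked."""
    return reply.signer_issuer is not None and reply.signer_issuer != cert_issuer


served = {"CA-A", "CA-B"}
before = respond("sha256", "CA-B", {"sha1"}, served, default_responder_ca="CA-A")
after = respond("sha256", "CA-B", {"sha1"}, served, default_responder_ca=None)
print(violates_br_4_9_9(before, "CA-B"))  # True
print(violates_br_4_9_9(after, "CA-B"))   # False
```

A check like `violates_br_4_9_9` is one way to express the test objective in step 2: for every request scenario, including ones the service cannot process, any signed response must come from a responder tied to the correct issuing CA.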

Flags: needinfo?(certification_authority)

We cannot attach the Certificate Problem Report email because it is an internal Apple email and Apple's corporate and legal policies prevent such disclosures. However, we can provide a date-and-time-stamped sequence of all relevant events (also included in the full incident report) and our preliminary report on our findings provided to the individual who filed the report. We have redacted confidential information such as names, project timelines, and technology details.

  • 03-October-2019 at 13:51 PT - We were notified via a problem report submitted to our Problem Reporting Mechanism that our OCSP responders were returning signed responses with incorrect issuer.

  • 03-October-2019 at 13:52 PT - The compliance team read the problem report and began pulling the appropriate people together to begin the investigation.

  • 03-October-2019 at 20:10 PT - We responded to the reporter with an initial acknowledgement and a commitment to investigate further and respond with more details within 24 hours of the report being submitted.

  • 04-October-2019 at 13:48 PT - We provided a preliminary report on our findings to the individual who filed the problem report.

The preliminary report is as follows:

Hello [REDACTED],

Please see the following report of our preliminary investigation for your submitted notification:

    * We confirmed the behavior - our OCSP responders are producing incorrect responses to requests where the hashAlgorithm is SHA-256.

    * The investigation suggests that our OCSP responder is unable to process those requests, resulting in “unknown” responses signed by the configured default OCSP responder.

    * Because our Validation Authority system supports multiple PKIs, the default OCSP responder does not always correspond to the signer of the certificate being validated.

    * [REDACTED]

    * [REDACTED]

    * Coincidentally, we were already scheduled to begin our upgrade to [REDACTED] on Monday with a phased rollout across data centers scheduled to be completed by the 18th.

    * Once the upgrade is complete, the issue will be remediated.

    * We have notified both root vendors (DigiCert and Sectigo).

    * We have notified our WebTrust auditors.

    * We have begun drafting the initial incident response.

Regards, 
Apple PKI

We understand your concern is making sure that there is prompt and sufficient information for reports.

Our internal review process for public postings involves several levels of review and approval taking major events into consideration. However, we are working on modifying our processes to provide more prompt postings and replies, in particular the initial communication that an incident has occurred. We also intend to modify our processes to notify the root programs directly in addition to the root vendors within the first 24 hours.

Thanks.

You're correct that part of my concern is about the level of detail in the initial report in order to assess the scope, and in ensuring prompt responses, both to external problem reports and overall to incident reports. If I'm correctly interpreting Comment #4, the Problem Report came from an Apple employee (but not someone on the CA team), hence the redactions and additional disclosures that might not otherwise be included in a problem report if it was externally reported. That's understandable for this situation, and I think that the proposed remediation - ensuring that root programs and the root CAs are notified with sufficient detail - is a good path to avoiding it in the future. It's hopefully not necessary to emphasize that, as a publicly trusted CA, Apple may also receive external problem reports that require timely investigation and response, and so making sure the processes are robust to handle that, both publicly and with root programs, is of paramount importance.

With respect to the original report, and the request for the incident report, the concern was over this phrase: "we’ve determined that in some cases when the OCSP service receives a request it cannot process". My concern is that the original message does not help build a sense of the scope of the problem, and the impact to Relying Parties, which is critical to understanding the timeliness of the response. Providing more details about the cases you're aware of, which seem clear in the problem report shared with the reporter, is a critical part in understanding the issue and the risk/severity/how the CA is treating it. That doesn't mean it should be used to excuse delays ("fire? What fire? This is fine!"), but it is a key piece in being confident that the CA has everything in control and is responding appropriately.

I want to call out: had Apple included those initial three bullets from their problem report in the report shared with browsers, there would have been better understanding about the scope and nature of the issue, and thus understanding about the possible delays or challenges. Throw in an explanation about what steps were being taken to dig in / investigate further, and that would have been totally in line with the initial problem reports that help scope, while a CA works on a more formal problem report with investigation (and, presumably, approvals).

That said, there's one thing that stands out to me: Apple is one of the CAs that, like many others, was affected by the EJBCA issue. The description makes it sound like SHA-256 was not supported by the responder, but EJBCA has supported SHA-256 in the CertID since version 6.2.2, Released 3 September 2014. From the description, I've parsed that when an unrecognized algorithm is used for the CertID, it triggers the "unknown issuer" logic (documented in 6.12 over here), while if it recognizes the issuer, it'd provide authoritative responses. The change, to not have a default responder, will result in "Unauthorized" (per that documentation and your description)

This would suggest that Apple is either running a bespoke OCSP responder, was running a significantly out-of-date OCSP responder, or that I've completely misunderstood the underlying root cause. I'm hoping it's the latter - I looked through the EJBCA change logs to try to understand if it was a bug that was fixed / configuration not supported, but I entirely admit I'm not familiar there.

I ask, because it seems important for the overall ecosystem, particularly those that may rely on EJBCA, to understand a bit more about the underlying issue and how Apple's resolved it, since that helps make sure all CAs are able to learn and similarly examine and remediate their systems.

Similarly, in terms of "How do we help the ecosystem grow", and in line with improving the OCSP test cases, it'd be useful if Apple could share the OCSP test cases it has / what it tests (i.e. the test objective, not necessarily the test itself). From an ecosystem perspective, this may identify other test cases that should be added, or, alternatively, it may highlight Apple's good practices here, and be a model for other CAs to examine their own systems as holistically. Both of these are in the spirit of learning together and making the ecosystem better.

Flags: needinfo?(certification_authority)

Please see our responses below.

> You're correct that part of my concern is about the level of detail in the initial report in order to assess the scope, and in ensuring prompt responses, both to external problem reports and overall to incident reports. If I'm correctly interpreting Comment #4, the Problem Report came from an Apple employee (but not someone on the CA team), hence the redactions and additional disclosures that might not otherwise be included in a problem report if it was externally reported. That's understandable for this situation, and I think that the proposed remediation - ensuring that root programs and the root CAs are notified with sufficient detail - is a good path to avoiding it in the future. It's hopefully not necessary to emphasize that, as a publicly trusted CA, Apple may also receive external problem reports that require timely investigation and response, and so making sure the processes are robust to handle that, both publicly and with root programs, is of paramount importance.

Yes, you are correct in interpreting Comment #4, in that the Problem Report came from an Apple employee (but not someone on the CA team), hence the redactions and additional disclosures that might not otherwise be included in a problem report if it was externally reported. We are aware that problem reports may come externally and we have one consistent process for handling all reports regardless of the source.

> With respect to the original report, and the request for the incident report, the concern was over this phrase: "we’ve determined that in some cases when the OCSP service receives a request it cannot process". My concern is that the original message does not help build a sense of the scope of the problem, and the impact to Relying Parties, which is critical to understanding the timeliness of the response. Providing more details about the cases you're aware of, which seem clear in the problem report shared with the reporter, is a critical part in understanding the issue and the risk/severity/how the CA is treating it. That doesn't mean it should be used to excuse delays ("fire? What fire? This is fine!"), but it is a key piece in being confident that the CA has everything in control and is responding appropriately.

> I want to call out: had Apple included those initial three bullets from their problem report in the report shared with browsers, there would have been better understanding about the scope and nature of the issue, and thus understanding about the possible delays or challenges. Throw in an explanation about what steps were being taken to dig in / investigate further, and that would have been totally in line with the initial problem reports that help scope, while a CA works on a more formal problem report with investigation (and, presumably, approvals).

We acknowledge the level of detail in our initial incident report was not enough to help root programs or relying parties understand the full scope of the problem. We did not specify what CAs or types of certificates were impacted nor provide initial information that would help other CAs understand if they were also impacted by the same issue. We also acknowledge that including additional detail (such as the initial three bullets in our preliminary problem report) as well as what steps we were taking to further investigate would have helped the community have a better understanding about the scope and nature of the issue. As mentioned in Comment #4, we are working internally to modify our processes to provide more prompt postings and replies.

> That said, there's one thing that stands out to me: Apple is one of the CAs that, like many others, was affected by the EJBCA issue. The description makes it sound like SHA-256 was not supported by the responder, but EJBCA has supported SHA-256 in the CertID since version 6.2.2, Released 3 September 2014. From the description, I've parsed that when an unrecognized algorithm is used for the CertID, it triggers the "unknown issuer" logic (documented in 6.12 over here), while if it recognizes the issuer, it'd provide authoritative responses. The change, to not have a default responder, will result in "Unauthorized" (per that documentation and your description)

Prior to our planned upgrade that began on 07-October-2019 and was completed on 18-October-2019, Apple’s OCSP service was on version 4.0.14 of EJBCA and therefore did not support requests that used SHA-256 in the CertID, nor did it allow us to disable the default OCSP responder so that the responder would respond ‘unauthorized’ for all unknown issuers. The problem report initially alerted us to the fact that our OCSP service was responding to any OCSP request that used a hash algorithm other than SHA-1 (e.g., SHA-256) for CertID with an ‘unknown’ response signed by a default responder. But more importantly, it alerted us to the fact that we were non-compliant with the Baseline Requirements section 4.9.9 whenever the OCSP service could not identify the issuer (as we were not signing the response with a certificate signed by the CA that issued the certificate whose revocation status was being checked). The fix was to complete the planned upgrade of EJBCA running on our OCSP service and disable the default responder. A positive side-effect is that now we can also support SHA-256 for CertID which is, as far as we know, neither required nor forbidden by any requirements or policy.
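To illustrate why a CertID hashed with an unsupported algorithm can end up on the "unknown issuer" path, here is a minimal sketch. Per RFC 6960, a CertID carries a hash of the issuer's name and a hash of the issuer's public key, both computed with the algorithm the client chose. The lookup table design, the function names, and the placeholder byte strings are assumptions made for illustration; they are not taken from EJBCA.

```python
import hashlib
from typing import Dict, Iterable, Tuple

CertIdKey = Tuple[str, bytes, bytes]


def cert_id_key(hash_alg: str, issuer_name_der: bytes, issuer_key_bits: bytes) -> CertIdKey:
    """Compute (algorithm, issuerNameHash, issuerKeyHash) for one issuer."""
    name_hash = hashlib.new(hash_alg, issuer_name_der).digest()
    key_hash = hashlib.new(hash_alg, issuer_key_bits).digest()
    return (hash_alg, name_hash, key_hash)


def build_index(issuers: Dict[str, Tuple[bytes, bytes]],
                algs: Iterable[str]) -> Dict[CertIdKey, str]:
    """Hypothetical responder index: one entry per issuer per supported algorithm."""
    index: Dict[CertIdKey, str] = {}
    for label, (name_der, key_bits) in issuers.items():
        for alg in algs:
            index[cert_id_key(alg, name_der, key_bits)] = label
    return index


# Placeholder bytes standing in for the DER-encoded issuer name and public key.
issuers = {"Example CA": (b"placeholder-issuer-name", b"placeholder-issuer-key")}

sha1_only = build_index(issuers, ["sha1"])
both = build_index(issuers, ["sha1", "sha256"])

request = cert_id_key("sha256", *issuers["Example CA"])
print(sha1_only.get(request))  # None
print(both.get(request))       # Example CA
```

In this model the SHA-1-only index never matches a SHA-256 request, even though the issuer is one the service actually serves, so the request is handled as if the issuer were unknown.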

> This would suggest that Apple is either running a bespoke OCSP responder, was running a significantly out-of-date OCSP responder, or that I've completely misunderstood the underlying root cause. I'm hoping it's the latter - I looked through the EJBCA change logs to try to understand if it was a bug that was fixed / configuration not supported, but I entirely admit I'm not familiar there.

As mentioned above, we were running on version 4.0.14 of EJBCA prior to the upgrade that began on 07-October-2019 and was completed on 18-October-2019. The software upgrade was tested, planned, and scheduled before this incident was identified. A separate bug will be opened with more details.

> I ask, because it seems important for the overall ecosystem, particularly those that may rely on EJBCA, to understand a bit more about the underlying issue and how Apple's resolved it, since that helps make sure all CAs are able to learn and similarly examine and remediate their systems.

We think an important lesson that other CAs can take away from this incident is that if using EJBCA for their OCSP service they should a) disable the default responder and b) be running version 6.2.0 or above to ensure that the responder will reply ‘unauthorized’ (as per RFC 6960) for all unknown issuers.

> Similarly, in terms of "How do we help the ecosystem grow", and in line with improving the OCSP test cases, it'd be useful if Apple could share the OCSP test cases it has / what it tests (i.e. the test objective, not necessarily the test itself). From an ecosystem perspective, this may identify other test cases that should be added, or, alternatively, it may highlight Apple's good practices here, and be a model for other CAs to examine their own systems as holistically. Both of these are in the spirit of learning together and making the ecosystem better.

We hope sharing this additional information will prove informative to other CAs, relying parties, and root programs. In addition, after we’ve finished updating our OCSP test cases per Comment 3, we’ll share the test objectives with the community in the spirit of learning together to help the ecosystem grow.

Flags: needinfo?(certification_authority)

Wayne: I'm setting N-I to see if you have any questions beyond those listed, and if the Next-Update for 22-Nov-19 works.
Apple: I'm setting the N-I for the questions below about the disclosure timeline and additional details about patch management practices.

(In reply to certification_authority from comment #6)

> Yes, you are correct in interpreting Comment #4, in that the Problem Report came from an Apple employee (but not someone on the CA team), hence the redactions and additional disclosures that might not otherwise be included in a problem report if it was externally reported. We are aware that problem reports may come externally and we have one consistent process for handling all reports regardless of the source.
> ...
> As mentioned in Comment #4, we are working internally to modify our processes to provide more prompt postings and replies.

This is now the second incident in which an Apple employee, not on the CA team, detected and reported a compliance issue. While this speaks very positively towards Apple and the ability of the employees to proactively spot things, it would be a shame if problems reported internally provide less detail than problems reported externally.

It sounds like, from Comment #4, steps are being taken to address that through changes to business procedures, to allow more responsive and detailed reports, in line with the expectations for all publicly-trusted CAs. In line with #7 from Responding to an Incident, I don't think we have a clear timeline for when Apple expects that to be complete, and such a timeline is expected.

I'm sympathetic to the challenges of communicating while working for a large tech company, and the difficulty in getting all of the appropriate approvals, but I think having a clearly defined timeline or objective, along with the regular weekly progress updates to determine if things are going off track/getting complicated, is how to build that transparency and trust that progress is being made.

> > This would suggest that Apple is either running a bespoke OCSP responder, was running a significantly out-of-date OCSP responder, or that I've completely misunderstood the underlying root cause. I'm hoping it's the latter - I looked through the EJBCA change logs to try to understand if it was a bug that was fixed / configuration not supported, but I entirely admit I'm not familiar there.

> As mentioned above, we were running on version 4.0.14 of EJBCA prior to the upgrade that began on 07-October-2019 and was completed on 18-October-2019. The software upgrade was tested, planned, and scheduled before this incident was identified. A separate bug will be opened with more details.

Thanks for including sufficient detail as to understand the problem.

EJBCA 4.0.14 was released on 2013-02-15. The CA/Browser Forum Network and Certificate System Security Requirements requires that CAs:

> Apply recommended security patches to Certificate Systems within six (6) months of the
> security patch’s availability, unless the CA documents that the security patch would
> introduce additional vulnerabilities or instabilities that outweigh the benefits of applying
> the security patch

There have been multiple security fixes to EJBCA over the years since. Has Apple maintained a process to ensure this requirement has been met?

It was unclear if it was only the OCSP responders (which, as unfortunate as it may be, may meet all of the requirements of the BRs at present), or if it was the overall CA system. The latter would, admittedly, be more concerning, particularly when a number of the security fixes were to mitigate privilege escalation issues within various administrative interfaces.

I'm trying to be cautious here, because I don't want the result of this incident to be discouraging the use of COTS software (since the release dates and notes can be determined), nor do I want it to be about discouraging sharing versions information (which bespoke software would, unfortunately, not have), even though I believe this likely represents a serious oversight and failing by Apple, based on the limited information provided so far. What I am trying to build is a picture that shows that either Apple did have reasonable software management practices in place, or, in light of this, has adopted changes to the operations and controls to ensure that, going forward, there are reasonable software management practices in place.

> We hope sharing this additional information will prove informative to other CAs, relying parties, and root programs. In addition, after we’ve finished updating our OCSP test cases per Comment 3, we’ll share the test objectives with the community in the spirit of learning together to help the ecosystem grow.

That sounds like 22-Nov-19, based on Comment #3.

Flags: needinfo?(wthayer)
Flags: needinfo?(certification_authority)

Oh, I realized I missed this part

> A separate bug will be opened with more details.

with respect to the patching issue.

So I think the only thing needing N-I from Apple for this issue is the timeline to expect improvements to the disclosure/reporting process, so that we have a measurable objective, as well as the opportunity for Apple to provide regular updates if it turns out not to be achievable.

(In reply to Ryan Sleevi from comment #7)

> Wayne: I'm setting N-I to see if you have any questions beyond those listed, and if the Next-Update for 22-Nov-19 works.

I'm primarily interested in understanding more about the reason the OCSP responder software hadn't been updated:

> There have been multiple security fixes to EJBCA over the years since. Has Apple maintained a process to ensure this requirement has been met?

This has been a factor in a few recent incidents, and I'd like to better understand what leads CAs to not regularly update COTS software and what might be done to improve this.

Flags: needinfo?(wthayer)

Regarding improvements to the disclosure/reporting process, at this point, we have already documented a formal communication plan that identifies who should be contacted and by when. We also adopted message templates to simplify and accelerate information gathering and message creation. The final part of our plan allows for varying levels of approval depending on the nature of the content. Our current process requires several levels of review for all posts regardless of the content. This new approach to reviewing content has verbal backing from management, but still needs to be formally documented and approved. We expect the new approach for reviewing content to be approved and implemented by 22-November-2019.

In response to Comment #5, we had previously thought to open a separate bug. However, since additional questions were asked in this bug, we decided to answer here.

Regarding compliance with the CA/Browser Forum Network and Certificate System Security Requirements, we’ve determined our practices and controls for both the OCSP software and CA software are compliant.

So, why did we choose not to update the OCSP software until mid October? There were multiple factors that we describe below (the CA service is run on separate infrastructure, and is running a current version of the software).

First, although EJBCA can be considered off-the-shelf software, it is highly customizable. We determined early on in our use of EJBCA that thorough testing in our own environment based on our specific customizations was essential before applying any updates in production.

Major architectural changes were introduced in EJBCA version 5. EJBCA version 5 was the first version to be Common Criteria certified, which introduced significant core software changes, and for our use cases unacceptably impacted performance. Working with the vendor we determined the best path to resolving these issues was to skip version 5 and wait for a future version.

Additionally, the OCSP software in version 4 was purpose-built for validation authority functions (a slimmed down version of EJBCA), and did not include any GUI-based interface. In versions 5 and beyond, the OCSP software is the same as the CA software. Our understanding of this architectural shift was, in part, to enable tighter integration between the CA and OCSP services and facilitate more automation and easier management of services. However, for our use cases, this architectural change introduced new security risks, which required additional review and testing to ensure an upgrade would not introduce new vulnerabilities. The shift in software architecture also required we make changes to underlying infrastructure.

Multiple versions of EJBCA were tested over the years and we worked closely with the vendor to resolve issues as they were identified. We were not aware of security risks with EJBCA 4.0.14 that outweighed the risk that upgrading would introduce new security threats and/or negatively impact the stability of the environment.

Prior to this incident, we had tested, planned, and scheduled an upgrade to EJBCA version 7.3.0. This upgrade introduced new features required for our use cases. As mentioned in a previous comment (Comment 3), we did not identify the need to upgrade sooner because our test cases did not address scenarios in which our OCSP service cannot identify the issuer and thus signs the response with the default OCSP responder.
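To illustrate the kind of test case that was missing, here is a minimal sketch in Python. It is a toy model, not EJBCA's or Apple's implementation: the names `Reply`, `answer_with_default`, `answer_strict`, `lint_reply`, and `RESPONDERS` are hypothetical, and issuers are represented as plain strings rather than the hashed issuer name and key carried in a real OCSP request. It assumes the desired behavior for an unrecognized issuer is an unsigned "unauthorized" error instead of a response signed by another CA's responder.

```python
from typing import Dict, List, NamedTuple, Optional


class Reply(NamedTuple):
    status: str            # "successful" or "unauthorized"
    signer: Optional[str]  # responder that signed; None for an unsigned error


def answer_with_default(issuer: str, responders: Dict[str, str], default: str) -> Reply:
    # Faulty pattern (hypothetical): an unrecognized issuer still gets a
    # response, signed by whatever responder is configured as the default.
    return Reply("successful", responders.get(issuer, default))


def answer_strict(issuer: str, responders: Dict[str, str]) -> Reply:
    # Assumed fix: never fall back; return an unsigned error instead.
    signer = responders.get(issuer)
    if signer is None:
        return Reply("unauthorized", None)
    return Reply("successful", signer)


def lint_reply(issuer: str, reply: Reply, responders: Dict[str, str]) -> List[str]:
    # Flag replies whose signer does not belong to the requested issuer.
    findings = []
    expected = responders.get(issuer)
    if reply.signer is not None and reply.signer != expected:
        findings.append(
            f"response for {issuer!r} signed by {reply.signer!r}, expected {expected!r}"
        )
    if expected is None and reply.status == "successful":
        findings.append(f"successful response returned for unknown issuer {issuer!r}")
    return findings


# Hypothetical configuration: one responder per issuing CA.
RESPONDERS = {"issuer-a": "responder-a", "issuer-b": "responder-b"}
```

Running `lint_reply` against the output of `answer_with_default` for an issuer that is not in `RESPONDERS` reports findings, while the output of `answer_strict` passes cleanly for both known and unknown issuers.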

Flags: needinfo?(certification_authority)

Regarding compliance with the CA/Browser Forum Network and Certificate System Security Requirements, we’ve determined our practices and controls for both the OCSP software and CA software are compliant.

I believe the response in comment #10 is asserting compliance with the NCSSRs based on the following statement: "...unless the CA documents that the security patch would introduce additional vulnerabilities or instabilities that outweigh the benefits of applying the security patch." Is that correct?

If so, did Apple document such concerns? Was this done for each version of the OCSP software that Apple chose not to deploy? If it was determined that some/all new versions didn't meet the definition of "security patch", what was the process for that? Were Apple's auditors reviewing the required documentation for these exceptions?

From this line of questioning, it is probably apparent that I'm skeptical that remaining on version 4 for 5 years is compliant with the NCSSRs, unless PrimeKey continued to support that version with security patches that Apple deployed. If Apple didn't evaluate each release to determine if it included a "security patch" and document "additional vulnerabilities or instabilities that outweigh the benefits" for each such patch, I don't think Apple has complied. I can understand that Apple may not have been able to upgrade to version 5 for the reasons stated, but I also can't accept that 5 years is a reasonable timeline for remediating the problem.

The quoted exception in the NCSSRs should not create a loophole that allows a CA to indefinitely neglect patching of COTS software, and if it does that's a problem we should correct.

Flags: needinfo?(certification_authority)

Thank you for your thoughtful comments. We expect to reply no later than this Wednesday.

Flags: needinfo?(certification_authority)

Our key control to determine if security patches should be applied is a process that leverages automated vulnerability scans. Identified vulnerabilities are documented and tracked to resolution. As a key control, it is provided to the auditors for their assessments. We also have multiple non-key controls. For example, we review EJBCA release notes and critical security notifications from PrimeKey.

Based on all of the above considerations, we did not identify security risks with version 4.0.14 that outweighed the risk that upgrading would introduce new security threats and/or negatively impact the stability of the environment. Additionally, based on a current review of CVEs (https://cve.mitre.org/) for EJBCA 4.0.14 (running in VA mode) and its dependent libraries, we have not identified any known exploits. VA mode is a slimmed down version of EJBCA purpose-built for validation authority functions.

However, in an effort to ensure our conclusion about our compliance was correct, we investigated further and determined that while it’s possible the vulnerability scanner would detect a vulnerability with EJBCA, it is not likely. Also, since our review of EJBCA releases and decisions about upgrading is not a key control, it was not provided to the auditors.

Therefore, although we believe our overall practices for managing the OCSP software meet the spirit of the requirements, we’ve come to the conclusion that our key control does not go far enough to meet the requirements specified in Section 1.l of the Network and Certificate System Security Requirements.

We will open a separate incident report with more details.

Regarding our improved approach to reviewing content before it’s posted publicly and the OCSP test objectives, we’ve needed to shuffle some project timelines to address competing priorities. We’re making good progress and are on track to complete these tasks and provide an update as early as 20-December-2019.

Whiteboard: [ca-compliance] → [ca-compliance] - Next Update - 20-December 2019

Regarding improvements to the disclosure/reporting process, the new approach to reviewing content has been formally documented and approved by management.

Regarding the OCSP test cases, the process for sharing this type of information is taking longer than expected and we cannot accurately determine a new deadline. However, if we haven’t been able to complete the process by end of March 2020, we’ll at least provide a comment about the status.

Trying to summarize the discussion to date with the deliverables, and setting N-I for Apple for the Unclear parts that were previously committed to by specific dates, to try and confirm those dates were met.

Apple Actions

Completed

Unclear if completed

  • 2019-10-18: (Comment #3) Fix completely rolled out to all OCSP servers
  • 2019-11-22: (Comment #3) Updated test cases (internal) for OCSP services

Not yet completed

Other Actions

Related Bugs

  • Bug 1605372 from GlobalSign determined that GlobalSign was also affected by a similar misconfiguration of EJBCA.
Flags: needinfo?(certification_authority)
Whiteboard: [ca-compliance] - Next Update - 20-December 2019 → [ca-compliance] - Next Update - 31-March 2020

Due to the year end holiday in the US, we expect to reply to Comment 16 the week of January 6th.

Flags: needinfo?(certification_authority)

All “Unclear if completed” actions were completed by the previously committed dates. The fix was rolled out to all OCSP servers by 2019-10-18 and our internal OCSP test cases were updated by 2019-11-22.

We are making progress on obtaining approval to share our OCSP test cases. We have learned this could take a few more months. If we haven’t been able to complete the process by end of June 2020, we’ll provide an update about the status at that time.

Whiteboard: [ca-compliance] - Next Update - 31-March 2020 → [ca-compliance] - Next Update - 01-June 2020

We will not have approval to share our OCSP test cases by 01-June 2020. Can you update the Next Update to 30-June 2020 per our previous comment?

QA Contact: wthayer → bwilson
Whiteboard: [ca-compliance] - Next Update - 01-June 2020 → [ca-compliance] - Next Update - 30-June 2020

We obtained approval to share our OCSP test cases. Please see the attached document. This completes all open tasks for this incident.

This list of OCSP lints appears to be very thorough. Thanks. And these were implemented in November 2019?

Correct.

Then I don't see any reason why this bug cannot be closed.

Agreed. I’m not thrilled with how long it took to share, as a general rule for all CAs, but I also want to acknowledge that it’s an incredibly valuable contribution to the body of knowledge for CAs.

Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Summary: Apple OCSP responders return responses with incorrect issuer → Apple: OCSP responders return responses with incorrect issuer
Product: NSS → CA Program
Whiteboard: [ca-compliance] - Next Update - 30-June 2020 → [ca-compliance] [ocsp-failure]