Closed Bug 1522975 Opened 5 years ago Closed 5 years ago

Google Trust Services: Improper OCSP response for intermediate certificate

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ryan.sleevi, Assigned: kluge)

Details

(Whiteboard: [ca-compliance] [ocsp-failure])

Google Trust Services has reported an incident related to OCSP generation at https://groups.google.com/forum/#!topic/mozilla.dev.security.policy/8V4uiqJCMfA

The full message is included below, with some formatting adjustments for readability.

Summary

During a signing ceremony in October 2018, Google Trust Services generated OCSP responses for five of its subordinate CAs and published them afterwards. On 11 January 2019 it was discovered that one of these responses was not created accurately.

The incorrect OCSP response did not have an impact on subscribers or relying parties because the CA concerned, GTSY1, is not yet in operation. By publishing this report we nevertheless want to share our lessons learned and hope that they will help other CAs improve their systems and processes.

Cause and Detection

A key ceremony tool was used to generate the OCSP responses based on a config file which specifies all relevant parameters of the output files to be created.
The config file for the October ceremony was prepared, reviewed and tested following established procedures to ensure quality and conformity with the Baseline Requirements. Subsequently, it was submitted to a version control system. In a later review, a CA engineer discovered that some of the serial numbers in the file were in decimal and others in hex format. To make the number formats consistent across the file, they submitted a change list replacing the decimal numbers with their hex representation. The change list contained a paste error, which assigned the GTSY1 OCSP response a serial number which does not correspond to the serial number of the CA certificate.

The change list was reviewed before it was submitted but the review did not catch the mismatch. On 11 January 2019 a CA engineer identified it while doing work to prepare for use of the previously generated certificate.
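The report does not show the config format or how the conversion was done. As a rough illustration only, a mechanical decimal-to-hex conversion with a round-trip check could avoid the kind of paste error described above. The function names and the `(value, base)` layout below are hypothetical and are not taken from the ceremony tool.

```python
def to_hex_serial(value: str, base: int) -> str:
    """Return the serial as upper-case hex without a prefix.

    `base` states how `value` is written (10 or 16). Requiring it
    explicitly is an assumption of this sketch; a bare digit string
    such as "10" is otherwise ambiguous.
    """
    if base not in (10, 16):
        raise ValueError("base must be 10 or 16")
    return format(int(value, base), "X")


def convert_serials(entries: dict[str, tuple[str, int]]) -> dict[str, str]:
    """Convert every serial and confirm the integer value is unchanged."""
    converted = {}
    for name, (value, base) in entries.items():
        hex_serial = to_hex_serial(value, base)
        # Round-trip check: the hex form must denote the same integer.
        if int(hex_serial, 16) != int(value, base):
            raise ValueError(f"{name}: conversion changed the serial")
        converted[name] = hex_serial
    return converted


# Hypothetical entries, not real GTS serial numbers.
print(convert_serials({"ca_a": ("4660", 10), "ca_b": ("1A2B", 16)}))
```

Because each value is derived from the original rather than typed or pasted, a reviewer only needs to check the script, not every number.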

Timeline

2018-10-18 Config file is created and submitted to version control system
2018-10-18 Change list is submitted against config file (convert decimal to hex serial numbers)
2018-10-29 OCSP responses are created and signed during a ceremony
2018-11-03 OCSP responses are published
2019-01-11 Serial mismatch is discovered and the root cause investigated. An impact assessment finds that relying parties and subscribers are not affected.
2019-01-14 Generation of new OCSP response is planned and tested
2019-01-17 New OCSP response is generated
2019-01-18 New OCSP response is published

Analysis and Findings

The serial number mismatch was the result of a human error made when composing the change list. As a control against such errors we enforce the performance of code reviews before change lists can be submitted. The reviews are supported by a set of linters for various purposes. Tests for serial number consistency were not implemented at the time of the error, but have been added since that time.
Additionally, OCSP responders are monitored and alerts are generated if certain conditions are met. CA specific alerting for GTSY1 was not configured when the OCSP response was published because the CA was not in an issuing state.

Remediation

  • Correct OCSP response for GTSY1 has been prepared and published.
  • CA specific alerting was configured on the OCSP responder for all newly created subCAs regardless of their operating status.
  • Instructions for generating subCAs were improved to update the OCSP monitoring configuration without waiting for the subCA to be productionized.
  • Additional linters are being implemented which test the accuracy and plausibility of ceremony config files.
Assignee: wthayer → kluge
Whiteboard: [ca-compliance]

Note also the template https://wiki.mozilla.org/CA/Responding_To_An_Incident#Incident_Report

While this does not follow that format, it does appear to substantially address these questions, with the exception of Item #5. While this was an OCSP response, can you provide the details for GTSY1 and the associated responder URL?

Flags: needinfo?(kluge)
Status: NEW → ASSIGNED

Thank you Ryan.

The concerned CA is GTSY1
Fingerprint: CD88FA9DCA572C5B8C3EED3DA2E2624575463F30
crt.sh ID: 912361743
OCSP Responder URL: http://ocsp.pki.goog/gtsr1

Flags: needinfo?(kluge)

Timeline of remediation actions:

2019-01-18 Correct OCSP response for GTSY1 published
2019-01-24 CA specific alerting configured for all newly created subCAs regardless of their operating status
2019-01-25 Instructions for generating subCAs improved to update the OCSP monitoring configuration without waiting for the subCA to be productionized
2019-01-25 Additional linters implemented which test the accuracy and plausibility of ceremony config files

Wayne, I think this goes to you.

Flags: needinfo?(wthayer)

David: when will the additional linters described as the final remediation step be completed?

Flags: needinfo?(wthayer) → needinfo?(kluge)

Hi Wayne,
on 2019-01-25 a linter was implemented which would have caught the error. Further rules have been added today (2019-01-30).
As with all tooling we will continuously make improvements but as a remediation action this is complete now.

Flags: needinfo?(kluge)

I have one question related to this, in light of other incident discussions about linters being deployed. Can you say more about which existing checks would have caught this (but were not enabled broadly until 2019-01-24 per Comment #3) and which additional linters were enabled on 2019-01-25?

An example of how other CAs have responded can be found at https://bugzilla.mozilla.org/show_bug.cgi?id=1428877#c4 or https://bugzilla.mozilla.org/show_bug.cgi?id=1390978#c3

Flags: needinfo?(kluge)

In addition I'd like to share an updated procedure for creating a new subCA (the improved instructions mentioned above), as I feel this might be beneficial in some form for other CAs as well:


  1. Prepare for a physical ceremony, i.e. choose place, people (auditors if
    needed), date, travel arrangements and approvals, permits, etc.

  2. Create ceremony script; make sure to read through scripts, exceptions and
    retrospectives of previous ceremonies to avoid past mistakes.

  3. Create ceremony tool config for the new subCA. Do not forget to add
    creation of revocation data for the new subCA.

    Tip: check whether more revocation data will soon need to be created for
    other existing CAs.

  4. Test the configs by running the ceremony tool to produce data signed by a
    test root, so you can show all the fields and extensions in the form of an
    X.509 certificate. Submit the configs to version control before creating
    the application.

  5. Create a subCA creation application explaining why a new subCA is needed;
    show all the fields, values and extensions of the proposed certificate,
    and get it validated and approved by another CA Engineer, the CA Policy
    Authority and the CA Product Manager.

    If reviews lead to any changes to the proposed certificate, update and
    submit the configs, and re-run the ceremony tool to get a new test cert to
    show.

  6. Run the actual ceremony to create the new subCA certificate. Update version
    control with the ceremony output.

  7. Upload the new subCA to Mozilla CCADB within 7 days of ceremony.

  8. Update CPS with the new subCA.

  9. Update the CA repository, i.e. index.html, the CPS and the new certificate
    (in DER format).

  10. Upload produced OCSP response to be served by OCSP responders.

  11. Optional: upload new certificate to Certificate Transparency logs (if the
    new subCA is publicly trusted).

  12. Update OCSP prober config to start monitoring for the new subCA revocation
    data immediately.


The last step used to be part of bringing up a new CA (i.e. one of the steps after importing the cert into EJBCA, for instance), which previously happened within a week or two of creating a subCA cert. This was the first time we waited almost half a year between the two. Having this step in this procedure instead makes it possible to catch future errors almost immediately.

Notes:

The ceremony tool is a custom tool that creates keys, CSRs, certificates, CRLs and OCSP responses from a fairly human-readable config file. This lets us abstract the underlying nitty-gritty of X.509 structures and HSM communication away from what is really important, i.e. the values of fields, and also helps make sure that we only perform approved operations on root keys.

The OCSP prober is a custom tool that periodically queries OCSP responders for the serial numbers of all our subCAs, and raises alerts if any responses are missing or expired.
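The prober itself is not published, so the following is only a minimal sketch of the check it is described as performing. It stands in for the responder with a plain dictionary from serial number to the response's nextUpdate time; the function name, data shapes and serial values are all assumptions, and no real OCSP request is made.

```python
from datetime import datetime, timedelta, timezone


def probe(published: dict[str, datetime],
          expected_serials: dict[str, str],
          now: datetime) -> list[str]:
    """Return one alert per subCA whose OCSP response is missing or expired.

    `published` maps a serial number to the nextUpdate time of the response
    the (simulated) responder would return for it.
    """
    alerts = []
    for ca_name, serial in expected_serials.items():
        next_update = published.get(serial)
        if next_update is None:
            # A response signed for the wrong serial ends up here: the
            # responder has nothing to return for the serial being asked for.
            alerts.append(f"{ca_name}: no OCSP response for serial {serial}")
        elif next_update <= now:
            alerts.append(f"{ca_name}: OCSP response for serial {serial} has expired")
    return alerts


# Hypothetical data: CA-B's response was generated for serial 0C, not 0B.
now = datetime(2019, 1, 11, tzinfo=timezone.utc)
published = {"0A": now + timedelta(days=30), "0C": now + timedelta(days=30)}
for alert in probe(published, {"CA-A": "0A", "CA-B": "0B"}, now):
    print(alert)
```

The sketch also shows why the check only helps for CAs that are in the prober's configuration: a subCA missing from `expected_serials` is never queried.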

There are two checks which would have caught the issue.

  1. Monitoring
    The OCSP monitoring service detects OCSP serial mismatches for all configured CAs because the OCSP responder will not find and return the expected response to the prober if its serial number does not match that in the request. Instead an alert is generated.

  2. Linting
    The new linters check certain semantic properties of the config file before it is submitted to the version control system. In particular they check that for each OCSP response in the tested configuration file, its serial number corresponds to that of an existing CA listed in a designated file in the version control system.
    If the OCSP response is generated for a CA that has not been created yet, the linter verifies that the CA is included in the configuration file being tested and that the serial numbers match.
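The linting rule described above can be sketched roughly as follows. The actual linters and config format are not shown in this report, so the entry layout (`type`, `name`, `ca`, `serial`), the function name and the example CA names are hypothetical. Serial numbers are assumed to be hex strings and are compared as integers.

```python
def lint_ocsp_serials(config_entries: list[dict],
                      known_ca_serials: dict[str, str]) -> list[str]:
    """Check that each OCSP response's serial matches its CA's serial.

    `known_ca_serials` stands in for the designated file of existing CAs.
    CAs created by the same config file are looked up in the config itself.
    """
    planned = {e["name"]: e["serial"]
               for e in config_entries if e["type"] == "ca_certificate"}
    errors = []
    for entry in config_entries:
        if entry["type"] != "ocsp_response":
            continue
        ca = entry["ca"]
        expected = known_ca_serials.get(ca, planned.get(ca))
        if expected is None:
            errors.append(f"{ca}: OCSP response refers to an unknown CA")
        elif int(entry["serial"], 16) != int(expected, 16):
            errors.append(
                f"{ca}: OCSP serial {entry['serial']} does not match "
                f"CA serial {expected}"
            )
    return errors


# Hypothetical config: the second OCSP response carries a mismatched serial.
known = {"EXAMPLE-CA-1": "01A2"}
config = [
    {"type": "ocsp_response", "ca": "EXAMPLE-CA-1", "serial": "1a2"},
    {"type": "ca_certificate", "name": "EXAMPLE-CA-2", "serial": "0B"},
    {"type": "ocsp_response", "ca": "EXAMPLE-CA-2", "serial": "0C"},
]
for error in lint_ocsp_serials(config, known):
    print(error)
```

Comparing parsed integers rather than strings keeps differences in case or leading zeros from producing false alarms.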

Flags: needinfo?(kluge)

Wayne: I think this slipped through the cracks and goes back to you for review.

Flags: needinfo?(wthayer)

It appears that all questions have been answered and remediation is complete.

Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Flags: needinfo?(wthayer)
Resolution: --- → FIXED
Summary: Google: Improper OCSP response for intermediate certificate → Google Trust Services: Improper OCSP response for intermediate certificate
Product: NSS → CA Program
Whiteboard: [ca-compliance] → [ca-compliance] [ocsp-failure]