Closed Bug 1652581 Opened 4 years ago Closed 3 years ago

Google Trust Services: digitalSignature KeyUsage not set

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: awarner, Assigned: awarner, NeedInfo)

Details

(Whiteboard: [ca-compliance] [ca-misissuance])

Attachments

(6 files)

1.87 KB, application/x-x509-ca-cert
1.87 KB, application/x-x509-ca-cert
765 bytes, application/x-x509-ca-cert
765 bytes, application/x-x509-ca-cert
1.32 KB, application/x-x509-ca-cert
704 bytes, application/x-x509-ca-cert

  1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.

While reviewing the initial post and responses to the "SECURITY RELEVANT FOR CAs: The curious case of the Dangerous Delegated Responder Cert" thread starting on July 1st 2020, we quickly determined we were not affected by the issue as we do not delegate responses, but the intricacies of interplay between KU / EKU settings had us re-reviewing all of our root and sub-CA profiles. During this time, we received a note from Paul van Brouwershaven who runs revocationcheck.com enquiring about the fact that our root CAs do not have the digitalSignature KeyUsage bit set. After a quick email thread, we convened members of the CA Policy Authority and our engineering team for further discussion and reached consensus that our root CAs need to have the digitalSignature bit set to meet the requirement in section 7.1.2.1 BR because they are used to sign OCSP responses.
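For readers who want to see what the missing bit looks like on the wire, here is a minimal sketch that decodes a DER-encoded KeyUsage BIT STRING using only the Python standard library. The two hex values are illustrative encodings of "keyCertSign + cRLSign" with and without digitalSignature; they are not extracted from the certificates in this bug, and the function name is hypothetical.

```python
# Bit names in RFC 5280 order; bit 0 is the most significant bit
# of the first content octet of the BIT STRING.
KEY_USAGE_BITS = [
    "digitalSignature", "nonRepudiation", "keyEncipherment",
    "dataEncipherment", "keyAgreement", "keyCertSign",
    "cRLSign", "encipherOnly", "decipherOnly",
]


def decode_key_usage(der: bytes) -> set:
    """Decode a DER KeyUsage BIT STRING (short-form length only)."""
    if len(der) < 3 or der[0] != 0x03:
        raise ValueError("not a DER BIT STRING")
    length = der[1]
    if length >= 0x80 or length != len(der) - 2:
        raise ValueError("unexpected length encoding")
    unused = der[2]  # number of padding bits in the last octet
    if unused > 7:
        raise ValueError("invalid unused-bits count")
    payload = der[3:]
    total_bits = len(payload) * 8 - unused
    names = set()
    for i, name in enumerate(KEY_USAGE_BITS):
        if i >= total_bits:
            break
        if payload[i // 8] & (0x80 >> (i % 8)):
            names.add(name)
    return names


# Illustrative encodings (assumed, not taken from the affected roots).
without_ds = decode_key_usage(bytes.fromhex("03020106"))
with_ds = decode_key_usage(bytes.fromhex("03020186"))
print(sorted(without_ds))
print(sorted(with_ds))
print("digitalSignature" in without_ds, "digitalSignature" in with_ds)
```

The only difference between the two encodings is the top bit of the content octet (0x06 versus 0x86), which is why the reissued certificates could be described as a minimal change.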

  2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.

2016-May/June: GTS prepares the acquisition of GlobalSign Roots R2 and R4
2016-06-22: GTS issues its own Root CAs modeled after the GlobalSign R2 and R4 Roots.
2016-06-22: GTS starts serving OCSP responses signed using the Root CA Private Keys.
2020-07-01: "SECURITY RELEVANT FOR CAs: The curious case of the Dangerous Delegated Responder Cert" thread is started on m.d.s.p and GTS determines there was no exposure to the specific issues raised. The interplay of KU/EKU settings led us to start a discussion about possible other issues.
2020-07-06: Paul van Brouwershaven notifies GTS of the missing digitalSignature bit
2020-07-07: Representatives from different GTS teams including the CA Policy Authority and technical representatives meet and conclude that the digitalSignature bit is required. Remediation options are examined, and it is agreed that the affected root CA certificates should be reissued.
2020-07-13: This Bugzilla bug is filed.

  3. Whether your CA has stopped, or has not yet stopped, certificate issuance or the process giving rise to the problem or incident.

We are preparing a ceremony to re-issue the affected CA certificates with the digitalSignature bit set. The ceremony will take place before 2020-09-30. We will update the action plan in point 7 shortly with the exact date.

In the meantime we will continue serving OCSP responses from the current root CAs because the missing digitalSignature bit does not have an immediate security or compatibility impact for subscribers or relying parties.

  4. In a case involving certificates, a summary of the problematic certificates. For each problem: the number of certificates, and the date the first and last certificates with that problem were issued. In other incidents that do not involve enumerating the affected certificates (e.g. OCSP failures, audit findings, delayed responses, etc.), please provide other similar statistics, aggregates, and a summary for each type of problem identified. This will help us measure the severity of each problem.

Six roots are affected: GlobalSign R2 and R4, as well as GTS R1-R4. All will be reissued with the digitalSignature KeyUsage bit set.

GS R2 was created December 15, 2006
GS R4 was created November 12, 2012
GTS R1-R4 were created June 21, 2016

  5. In a case involving certificates, the complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem. In other cases not involving a review of affected certificates, please provide other similar, relevant specifics, if any.

The relevant crt.sh links are:
crt.sh/?id=14
crt.sh/?id=8644166
crt.sh/?id=139646520
crt.sh/?id=139646522
crt.sh/?id=139646519
crt.sh/?id=139646525

  6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

Google Trust Services (GTS) acquired GlobalSign R2 and R4 in 2016. When GTS took over the roots, we also assumed responsibility for serving the revocation information for two subordinate CAs under R2 and R4 that were still operated by GlobalSign. To this end, we transitioned serving of revocation responses to our own infrastructure.

We modeled the technical profile of GTS roots R1-4 on GlobalSign’s R2 and R4 to avoid potential client compatibility issues in the future. GlobalSign originally defined the profiles for R2 and R4, which did not need to include the digitalSignature bit because OCSP responses were served by delegated responders. Our review of the modeled GTS R1-4 profiles did not identify the digitalSignature bit as missing because it overlooked that we were going to serve non-delegated responses.

We ran these profiles past a number of experts, both within Google and external. None of our reviewers identified the issue, because the settings are fine if delegated OCSP responses are used. None of the internal or external reviewers, across both policy and technical reviews, flagged that we were shifting from delegated responses to root-signed responses.

In the following years the mistake was not identified because all major user agents validated the OCSP responses without generating errors. Prior to Paul van Brouwershaven’s email, GTS had never received reports complaining of invalid OCSP responses due to this issue. Thus a direct need to revalidate the certificate profiles did not arise after we put the roots into operation.

  7. List of steps your CA is taking to resolve the situation and ensure that such situation or incident will not be repeated in the future, accompanied with a binding timeline of when your CA expects to accomplish each of these remediation steps.

All of our CA application templates have been updated to ensure that the digitalSignature bit is set for all CA types.

The affected root CA certificates will be re-issued in a ceremony before 2020-08-31.

In contrast to the other OCSP related issues we reported previously in this forum, we believe that the root cause of this issue lies in the process we applied for preparing and reviewing the root CA profiles rather than the design of our OCSP system infrastructure.

We want to thank Paul for pointing us to this issue. In addition to the remediation actions listed above, we have decided to contract with an independent expert group to conduct adversarial review of our OCSP infrastructure from both a technical and compliance perspective.

We will share an update on this assessment within 2 weeks of completion.

Type: defect → task
Assignee: bwilson → awarner
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Whiteboard: [ca-compliance]

(In reply to Andy Warner from comment #0)

In the meantime we will continue serving OCSP responses from the current root CAs because the missing digitalSignature bit does not have an immediate security or compatibility impact for subscribers or relying parties.

This is a statement that doesn't have a lot of explanatory detail attached. What sort of analysis have you conducted to conclude this? How would GTS know if a client rejected an OCSP response from GTS because of this?

Flags: needinfo?(awarner)

Fair question, we should have provided more detail on that point. The classic Martin Rees line "absence of evidence is not evidence of absence" comes to mind. Our assertion that there is no immediate security or compatibility impact rests on two data points. 1. GTS has done extensive client testing, including response handling, and never detected an issue due to the absence of digitalSignature. It is not possible to test all clients exhaustively, but based on testing of the clients we have access to and observe in non-trivial volume, we are not aware of any issues. 2. We've been operating this way for 4 years, and no reports sent to our public contact addresses have indicated that clients had issues of any type due to the missing digitalSignature bit. Neither of these is fully conclusive, but together they provide a reasonable basis for the claim.

Flags: needinfo?(awarner)

Weekly update - We are still on track for re-issuance before 2020-08-30.

Weekly update - We are still on track for re-issuance before 2020-08-30.

Weekly update - We are still on track for re-issuance before 2020-08-30.

Weekly update - We are still on track for re-issuance before 2020-08-30.

Whiteboard: [ca-compliance] → [ca-compliance] Next Update - 30-August 2020

The ceremony to rectify this issue took place on 2020-08-13 without issues. Submissions to CCADB and other updates are mostly complete. Remaining administrative updates and confirmations are on track to be completed this week.

Attached file gtsr1.pem
Attached file gtsr2.pem
Attached file gtsr3.pem
Attached file gtsr4.pem
Attached file gsr2.pem
Attached file gsr4.pem

Attached are the revised root certificates; they are now also published on our repository page at https://pki.goog. We made the minimal changes necessary to address the associated issue. Last week we also reached out offline to Mozilla, Apple and Microsoft to provide them with the background and origin of this change.

I will review this to see if it can be closed.

Flags: needinfo?(bwilson)
Flags: needinfo?(bwilson)
Whiteboard: [ca-compliance] Next Update - 30-August 2020 → [ca-compliance]

Andy/Ryan: Setting aside that, of course, hindsight is 20/20 and this will surely be a thing to look out for, I'm curious whether any other changes to the design and review for CA ceremonies would have caught this. For example, IIRC, some linters warn that you won't be able to sign OCSP responses, and understanding whether or not you use those linters, or how you review them, is useful. Similarly, it would be useful to know whether any analysis has been conducted as to the comprehensiveness of the linters that Google employs for these ceremonies.

This is certainly an interesting issue, in highlighting how an operational design element necessitates modification of certificate profile, and I'm curious to know what controls (now?) exist for things of this nature.

Flags: needinfo?(awarner)

Ryan, we continually run the latest version of zlint, keep it up to date, and run it prior to all ceremonies in a test environment to see the implications of issuing the particular certificate. At the time of initial creation, zlint did not have a digitalSignature use check. The warning you mentioned was added by commit fd40f579253ea1ebfb18a585ab5cd8e7dcde61aa, dated Feb 14 2020, which is later than our issuance dates. At the time the missing lint didn't seem like a problem, since digitalSignature is only mandated when signing OCSP responses with the CA key, which is something that the linter cannot be aware of.

We have also evaluated other linters and run manual checks with them from time to time to see how other linters are progressing and consider their use. To date, zlint has been the most useful and our primary certificate linting tool. Most of the linters focus on highly specific checks and tend not to cover the cases where multiple options could be valid depending on the use.

It would be interesting to explore whether it is viable to have linters which get additional context, such as delegated OCSP responders or full chains to end-entity certificates so they have full context to make decisions on the trust and validity of a particular certificate.

We recently introduced a form-based certificate application that we will always use to generate new certificate applications going forward. The form has a number of logical controls and checks to ensure that the profile includes and excludes options appropriately and that all of the options are valid in combination. One of the checks is an explicit check for digitalSignature being set or not set based on intended use.
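To illustrate the kind of intended-use check described above, here is a minimal sketch in Python. It is not the GTS form: the `RootProfile` type, the `ocsp_mode` field and the messages are all hypothetical. The rule it encodes for direct signing follows the Baseline Requirements section cited earlier in this bug (keyCertSign and cRLSign always, digitalSignature when the CA key signs OCSP responses); flagging digitalSignature under delegated signing is an assumption about how "not set based on intended use" might be handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RootProfile:
    """Hypothetical profile record as it might come out of an application form."""
    name: str
    key_usage: frozenset
    ocsp_mode: str  # "direct" (CA key signs responses) or "delegated"


def check_root_profile(profile: RootProfile) -> list:
    """Return a list of problems; an empty list means the profile passes."""
    if profile.ocsp_mode not in ("direct", "delegated"):
        return [f"{profile.name}: unknown ocsp_mode {profile.ocsp_mode!r}"]

    problems = []
    for bit in ("keyCertSign", "cRLSign"):
        if bit not in profile.key_usage:
            problems.append(f"{profile.name}: {bit} must be set")

    has_ds = "digitalSignature" in profile.key_usage
    if profile.ocsp_mode == "direct" and not has_ds:
        problems.append(
            f"{profile.name}: digitalSignature must be set because "
            "the CA key signs OCSP responses"
        )
    if profile.ocsp_mode == "delegated" and has_ds:
        # Assumption: treated as a prompt for review, not a requirement.
        problems.append(
            f"{profile.name}: digitalSignature is set but OCSP signing "
            "is delegated; confirm the intended use"
        )
    return problems


copied = RootProfile("example-root", frozenset({"keyCertSign", "cRLSign"}), "direct")
for problem in check_root_profile(copied):
    print(problem)
```

The point of making `ocsp_mode` a mandatory input is that the profile cannot be evaluated without stating how OCSP responses will be signed, which is the context a certificate-only linter does not have.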

Flags: needinfo?(awarner)

Thanks Andy.

I'm having trouble making sense of some of your reply though, and hoping you can carefully check.

Looking at the commit you reference, for example, shows it was just a structural layout in ZLint - https://github.com/zmap/zlint/commit/fd40f579253ea1ebfb18a585ab5cd8e7dcde61aa . Did you perhaps blame, but not check the commit message and/or the previous version?

If you look through the Git history from the commit immediately prior, you'll see that ZLint has been generating a "NOTICE" about this since the very first version, as you can see at https://github.com/zmap/zlint/blame/53441bdd36c98f9d85ece445effe95d9c283f1c8/lints/cabf_br/lint_ca_digital_signature_not_set.go

This is why I'm trying to understand the procedures here, as it seems like there's a chance here that Notices are being overlooked. This could be an opportunity to improve how to handle such situations. Similarly, certlint (which, importantly, checks for a number of ASN.1 encoding issues) also creates a notice for this, and has for five years: https://github.com/certlint/certlint/commit/8e1f33daa551d3e106447a94a4e3b27fb2dd6f23

I'm hoping you can re-examine your root cause analysis, and look to see if there are further opportunities for improvement.

Flags: needinfo?(awarner)

Ryan, I was wrong about the PR in question. You are right, our summary of the history of the behavior was incorrect, sorry about that.

While we believe we had robust processes in place at the time of the original error, we have learned a lot in the last 4+ years and as such our tooling, processes and procedures have evolved significantly in that time.

These profiles, which were created when we began the process of establishing Google Trust Services, underwent Zlint checks using test certificates, but it appears that Zlint was run in a mode that would not have yielded notices such as these. The process we use today still involves the use of Zlint, however, procedures now explicitly specify that linters should be run in their most verbose mode and all info / notice level output will be reviewed. As such, current procedures would have helped.

We did not, and still do not, rely exclusively on tools for such reviews; we do several manual internal and external reviews of profiles in an attempt to catch issues before ceremonies. Unfortunately, none of the reviewers at the time caught this issue. We have since changed the review process in several ways, including using a more structured process with several stages of review, and believe these new processes would catch such an issue even if something slipped past tooling in the future.

At the time of profile definition for these roots we had not finished all aspects for the design of the OCSP service, which may have been a contributing factor to the manual review failure.

Some key learnings we take away from this incident that may be useful for others include:

  • When using third-party linters such as Zlint, always use verbose output and require an explicit sign-off on each item identified by the tooling, rather than assuming the behavior of the linter and the completeness of the manual review.
  • Ensure that all reviewers have an opportunity to not only review the profiles, but to review the invocation and output of any associated linters so that such issues have a greater chance of being caught.
  • Having a formal multi-stage review process with formal checklists and dedicated, trained reviewers is key to a robust review process.
  • When acquiring or divesting materials, it is critical that all functionality be re-reviewed in the context of the new use and operating environment.
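As a rough illustration of the first two points, the sketch below gates a ceremony on every info/notice-or-higher lint result having a recorded sign-off. It assumes linter output in the form of a JSON object mapping each lint name to `{"result": "<level>"}`; the sample lint names, the sign-off record and the function name are hypothetical, so adjust them to whatever your linter actually emits.

```python
import json

# Result levels that a reviewer must explicitly acknowledge.
REVIEW_LEVELS = ("info", "notice", "warn", "error", "fatal")


def findings_needing_signoff(lint_json: str, signoffs: dict) -> list:
    """Return (lint_name, level) pairs that have no recorded sign-off."""
    results = json.loads(lint_json)
    pending = []
    for lint_name, outcome in sorted(results.items()):
        level = outcome.get("result")
        if level in REVIEW_LEVELS and lint_name not in signoffs:
            pending.append((lint_name, level))
    return pending


# Hypothetical linter output for a test certificate.
sample_output = json.dumps({
    "n_ca_digital_signature_not_set": {"result": "notice"},
    "e_ca_key_usage_missing": {"result": "pass"},
    "e_ca_key_cert_sign_not_set": {"result": "pass"},
})

# Hypothetical sign-off record: lint name -> reviewer and rationale.
signoffs = {}

pending = findings_needing_signoff(sample_output, signoffs)
for lint_name, level in pending:
    print(f"unreviewed {level}: {lint_name}")
if pending:
    print("ceremony prep blocked until every finding has a sign-off")
```

Keeping the sign-off record next to the raw linter output also gives later reviewers a way to see both how the linter was invoked and why each notice was accepted.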

Please let us know if you have further ideas.

Flags: needinfo?(awarner)

The process we use today still involves the use of Zlint, however, procedures now explicitly specify that linters should be run in their most verbose mode and all info / notice level output will be reviewed.

Can you clarify when that change was made? That seems relevant to square out the timeline, and seems important for any future incidents.

Beyond that, I don't think I have further questions here. Assigning to Ben for any questions he has.

Flags: needinfo?(bwilson)

The procedure change to use verbose output for certificate applications / ceremony prep was made during our recurring compliance review held on 2020-09-23.

IIUC from earlier comments in this bug, gsr2.pem (https://crt.sh/?id=3448659678) was issued on 2020-08-13. It's self-signed and not yet in the Mozilla trust store, but since valid paths to other Mozilla built-in roots also exist (see https://crt.sh/?caid=10&opt=mozilladisclosure) ISTM that the Mozilla Root Store Policy does apply to it.

https://crt.sh/?id=3448659678 has a sha1WithRSAEncryption (self-)signature, but I'm struggling to see how Mozilla's strict requirements for the permitted use of SHA-1 (as detailed in https://www.mozilla.org/en-US/about/governance/policies/security-group/certs/policy/#513-sha-1) have been met.

Bending the definition of "intermediate certificate" doesn't help, because https://crt.sh/?id=3448659678 doesn't have "a new key" (as evidenced by https://crt.sh/?caid=10), so I think that ultimately https://crt.sh/?id=3448659678 falls foul of this rule:

"CAs MUST NOT sign SHA-1 hashes over other data"

Does this seem like the correct interpretation of the policy, or have I missed something?

Thanks Rob for flagging this, and apologies for overlooking this.

Note that the Mozilla Policy has some complex interplay with the Baseline Requirements (at least, prior to SC31). SC31 didn't become effective until 2020-08-20, so the version of the Baseline Requirements in force at the time was version 1.7.0. BRs 1.7.0, Section 7.1.3 states

CAs MUST NOT issue any Subscriber certificates or Subordinate CA certificates using the SHA-1 hash algorithm. CAs MAY issue Root CA Certificates or Subordinate CA Certificates that are Cross Certificates using the SHA-1 hash algorithm.

CAs MAY continue to use their existing SHA-1 Root Certificates.

Subscriber certificates SHOULD NOT chain up to a SHA-1 Subordinate CA Certificate

In the Baseline Requirements, Root CA is defined as:

Root CA: The top level Certification Authority whose Root Certificate is distributed by Application Software Suppliers and that issues Subordinate CA Certificates

Thus, at the time, the Mozilla Policy was more strict than the Baseline Requirements. This was changed in SC31 and BRs 1.7.1, with the new Section 7.1.3.2.1 RSA capturing these requirements from Mozilla.

Flags: needinfo?(awarner)

Here is a link to the root inclusion case information listed in the CCADB - https://ccadb-public.secure.force.com/mozilla/PrintViewForCase?CaseNumber=00000666

Assuming there are no further comments, I will close this bug on or about Dec. 11.

Ben: It may be good to clarify for Mozilla, given Comment #22.

Ben: Poking on this

Thanks. I'll work on an email to the mdsp list that outlines the issue and requests feedback from the community.

Here is the email sent to the mdsp list that outlines the issue in Comment #22 and requests feedback from the community. https://groups.google.com/a/mozilla.org/g/dev-security-policy/c/ds3BLfZvRjg/m/xkLZDBYrAgAJ

Flags: needinfo?(bwilson)

An incident report is being prepared with full details.

We have created an incident report for the SHA-1 element of this modification; you can find it here: https://bugzilla.mozilla.org/show_bug.cgi?id=1709223

Flags: needinfo?(bwilson)

This particular issue "digitalSignature KeyUsage not set" was already resolved with the re-signing noted in Comment #7. Is there any reason why this bug should remain open?

I'll schedule this to be closed on next Wed. 19-May-2021.

Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Flags: needinfo?(bwilson)
Resolution: --- → FIXED
Summary: Google Trust Services digitalSignature KeyUsage not set → Google Trust Services: digitalSignature KeyUsage not set
Product: NSS → CA Program
Whiteboard: [ca-compliance] → [ca-compliance] [ca-misissuance]