Closed Bug 1751984 Opened 2 years ago Closed 2 years ago

Let's Encrypt: TLS Using ALPN TLS Version and OID

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: aaron, Assigned: aaron)

Details

(Whiteboard: [ca-compliance] [dv-misissuance])

Attachments

(10 files)

This is a preliminary incident report.

At 16:48 UTC on Tuesday Jan 25, 2022, a third party informed Let’s Encrypt / ISRG that, while examining the Boulder codebase, they had noticed two irregularities in our implementation of the “TLS Using ALPN” validation method (BRs 3.2.2.4.20, RFC 8737). The client used to perform the “acme-tls/1” handshake and protocol negotiation did not enforce that the minimum negotiated TLS version was 1.2 or higher, and the validation code accepted either the id-pe-acmeIdentifier OID or a different OID which was used in earlier drafts of RFC 8737.

We have confirmed both of these bugs. We have temporarily disabled use of the TLS-ALPN-01 challenge type for new validations. We are working on invalidating all current TLS-ALPN-01 authorizations so that they cannot be used to validate new certificate issuances. We have landed fixes[1][2] in the Boulder codebase for both issues, and are in the process of deploying those fixes now.

We will post further information in this ticket by EOD Wednesday, Jan 26.

[1] https://github.com/letsencrypt/boulder/pull/5905
[2] https://github.com/letsencrypt/boulder/pull/5906

Assignee: bwilson → aaron
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Whiteboard: [ca-compliance]

I see the announcement here: https://community.letsencrypt.org/t/2022-01-25-issue-with-tls-alpn-01-validation-method/170450

It says '5 days to revoke', but Baseline Requirements 4.9.1.1, item 5, says that domain authorization errors require revocation within 24 hours.
You say revocation will 'start' more than 2 days later.

Will a second bug be opened to explain the delay and the missed 24-hour revocation window?

This update provides additional information about this incident but does not yet constitute a full incident report.

An initial timeline of actions taken to investigate and resolve this incident follows. All times are UTC.

Tuesday Jan 25, 2022
16:48 Bug report sent to ISRG engineering
17:41 TLS-ALPN-01 challenge type disabled in Staging
17:57 Fix to require TLS1.2+ landed in Boulder
18:10 Fix to disallow old OID landed in Boulder
18:16 TLS-ALPN-01 challenge type disabled in Prod
19:00 New version of Boulder deployed to Staging
19:40 Preliminary incident report posted to Bugzilla
19:49 New version of Boulder deployed to Prod
21:18 All unexpired TLS-ALPN-01 authorizations revoked in Staging
22:25 TLS-ALPN-01 challenge type re-enabled in Staging
23:05 Testing confirms Staging rejects validation for clients using TLS1.1 handshakes

Wednesday, Jan 26, 2022
00:13 All unexpired TLS-ALPN-01 authorizations revoked in Prod
00:48 TLS-ALPN-01 challenge type re-enabled in Prod
00:54 Testing confirms Staging rejects validation for clients using the old OID
00:54 Testing confirms Prod rejects validation for clients using the old OID
00:56 Testing confirms Prod rejects validation for clients using TLS1.1 handshakes
01:01 Policy review group determines that revocation of affected certs is necessary within 5 days
04:40 All affected subscribers with valid contact addresses notified

Disabling the TLS-ALPN-01 challenge type prevented any new authorizations based on the affected version of the Boulder code from being created. Revoking all existing unexpired authorizations which had been validated using the TLS-ALPN-01 challenge type prevented any new certificates based on affected authorizations from being issued. Fixing the underlying bugs and deploying a new version of the ACME service allowed us to re-enable the TLS-ALPN-01 challenge type to restore service to Subscribers.

Our Policy Management Authority group has determined that these two bugs – allowing ACME clients to negotiate TLS1.1 when performing the TLS-ALPN-01 challenge; and allowing ACME clients to include the SHA-256 digest of the key authorization when performing the TLS-ALPN-01 challenge – do not constitute “evidence that the validation of domain authorization or control… should not be relied upon”. Neither provides an attack vector by which a client would be able to fulfill the TLS-ALPN-01 challenge for an identifier they do not control. Therefore, revocation within 24 hours of notification is not required. However, these certificates were “not issued in accordance with” the Baseline Requirements and our own CP/CPS, so we will be revoking all affected certificates no later than 16:48 UTC on Sunday, Jan 30, 2022.

Our initial count of affected certificates indicates approximately 2 million unexpired certificates for which at least one identifier was validated using TLS-ALPN-01 prior to the authorizations being revoked and the fix being deployed. We will provide full details for these certificates in our full incident report, which we intend to publish here on Friday, Jan 28, 2022.

Apologies, a small update to the above. The aside in the first sentence of the second-to-last paragraph should have read (emphasis added):

-- allowing ACME clients to negotiate TLS1.1 when performing the TLS-ALPN-01 challenge; and allowing ACME clients to include the SHA-256 digest of the key authorization under the obsolete OID when performing the TLS-ALPN-01 challenge --

Our Policy Management Authority group has determined that these two bugs – allowing ACME clients to negotiate TLS1.1 when performing the TLS-ALPN-01 challenge; and allowing ACME clients to include the SHA-256 digest of the key authorization when performing the TLS-ALPN-01 challenge – do not constitute “evidence that the validation of domain authorization or control… should not be relied upon”

I'm a little surprised by this, and hoping this can be expanded upon. I suppose the question is - let's say that a hypothetical CA uses an "Any Other Method" (the now-removed 3.2.2.4.10), using a bespoke method, such as signature-via-carrier-pigeon. Would that trigger this 24 hour revocation? Likewise, to use a recent CA example (Bug 1748634), what if a CA was able to retroactively determine authorization, would that thus extend the time period from 24 hours to 5 days, by virtue of obtaining (additional) evidence?

Basically, I'm trying to understand if there are examples that the Policy Management Authority could elaborate that would trigger that clause. I had always understood that clause to cover exactly this scenario: an inappropriate validation method (whether not blessed or not implemented correctly). It sounds like there's a dispute as to whether this scenario represents said "evidence", and if that's the case, I'm hoping to understand what would remain. Alternatively, it may be that the current language is seen as discretionary to the CA's interpretation, in which case, if the intent is indeed to ensure that "Failure to properly validate" is said evidence, making that explicitly clear through an additional bullet point.

This may seem like I'm trying to aggressively push for 24 hour revocation, but there's a reason I think this interpretation is particularly important to unpack: If I understand the current argument correctly (and I may be misunderstanding it), then the logical conclusion seems to be that if a CA misissues a certificate for mozilla.org, but does so by not validating the domain at all, then the CA could argue that none of the 24 hour revocation rules apply. Logically, this would be because clause 5 is read to apply only to "positive" validations, and the lack of validation is simply a violation of the 5-day #5 - the CP/CPS violation that Let's Encrypt is asserting applies here.

Could you share more of the Policy Management Authority's analysis about the bounds and scope of the "evidence" clause? I realize that a thorough analysis may take more time than the more pressing issue of helping such customers rotate certificates, and that does seem the right priority, but I wanted to flag the consequence of this interpretation for further discussion.

Flags: needinfo?(aaron)

(In reply to Ryan Sleevi from comment #4)

I'm a little surprised by this, and hoping this can be expanded upon.

We intend to do so in our full incident report, which we will post tomorrow.

Flags: needinfo?(aaron)

Could you disclose all mis-issued/affected certificates with S/N or crt.sh entries here?

Flags: needinfo?(aaron)

(In reply to Charles Wang from comment #6)

Could you disclose all mis-issued/affected certificates with S/N or crt.sh entries here?

Although S/N would still be appreciated, as I mentioned in bug 1736064 comment 5 (and was later further clarified in comment 7 by Ryan) S/N is not a good identity for sharing problematic certificate data. The short of the issue is that CAs can issue certificates that share a serial number (and have done so in the past), effectively blocking serial numbers from being reliably unique identifiers of signed certificates.

(In reply to Matthias from comment #7)

(In reply to Charles Wang from comment #6)

Could you disclose all mis-issued/affected certificates with S/N or crt.sh entries here?

Although S/N would still be appreciated, as I mentioned in bug 1736064 comment 5 (and was later further clarified in comment 7 by Ryan) S/N is not a good identity for sharing problematic certificate data. The short of the issue is that CAs can issue certificates that share a serial number (and have done so in the past), effectively blocking serial numbers from being reliably unique identifiers of signed certificates.

Even so, providing valid links to the problematic certificates is a necessary step when reporting a CA incident; it helps improve transparency.

(In reply to Charles Wang from comment #6)

Could you disclose all mis-issued/affected certificates with S/N or crt.sh entries here?

Charles, I see that there's a gzipped CSV linked from https://letsencrypt.org/tlsalpnrevocation that contains all ~2.7MM of the affected certificate serial numbers.

Flags: needinfo?(aaron)

Summary

At 2022-01-25 16:48 UTC Let’s Encrypt was made aware of two instances of non-compliance in our implementation of the TLS-ALPN-01 challenge type (RFC 8737), which is the basis of the TLS Using ALPN validation method (BRs Section 3.2.2.4.20).

RFC 8737, Section 4, states “ACME servers that implement “acme-tls/1” MUST only negotiate TLS 1.2 or higher when connecting to clients for validation”. Our TLS-ALPN-01 client code was not setting a specific minimum TLS version, and was therefore using Go’s default minimum TLS version, which is TLS 1.0. While it is likely that many if not most validations were performed over TLS 1.2 or higher, Let’s Encrypt does not log the negotiated TLS version as part of the validation data, so we must assume that all validations conducted using this method could have been affected.
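To illustrate the class of fix, here is a minimal sketch of a validation client configuration that pins the TLS floor explicitly instead of inheriting the library default. It is not the code from the Boulder pull requests; the function name `tlsALPNConfig` and the choice to skip chain verification (because the challenge certificate is self-signed and inspected separately) are assumptions made for this example.

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// acmeTLS1Protocol is the ALPN protocol ID defined by RFC 8737.
const acmeTLS1Protocol = "acme-tls/1"

// tlsALPNConfig returns a client config for a TLS-ALPN-01 validation
// handshake. The name and shape of this helper are hypothetical.
func tlsALPNConfig(serverName string) *tls.Config {
	return &tls.Config{
		// Set the minimum version explicitly rather than relying on
		// whatever default the TLS library ships with.
		MinVersion: tls.VersionTLS12,
		NextProtos: []string{acmeTLS1Protocol},
		ServerName: serverName,
		// Assumption for this sketch: the challenge certificate is
		// self-signed, so chain verification is skipped and the
		// validator inspects the presented certificate itself.
		InsecureSkipVerify: true,
	}
}

func main() {
	cfg := tlsALPNConfig("example.com")
	fmt.Println(cfg.MinVersion == tls.VersionTLS12, cfg.NextProtos)
}
```

With a floor like this in place, a handshake with a peer that only offers TLS 1.0 or 1.1 fails during negotiation, before any challenge certificate is examined.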

RFC 8737, Section 3 defines the id-pe-acmeIdentifier OID to be 1.3.6.1.5.5.7.1.31. This OID is used to identify the acmeIdentifier extension in the certificate used by the domain to fulfill the TLS-ALPN-01 challenge. In earlier drafts of the RFC, the OID for this extension was instead 1.3.6.1.5.5.7.1.30.1. Our challenge validation code was willing to accept both OIDs. Although Let’s Encrypt does have metrics on how many validations used the old OID, those metrics are not tied to individual validations, so we must assume that all validations conducted using this method could have been affected.
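As a sketch of what strict OID matching can look like, the following compares an extension OID against only the final id-pe-acmeIdentifier value, so the draft-era OID is rejected. The variable and function names are hypothetical and this is not the Boulder implementation.

```go
package main

import (
	"encoding/asn1"
	"fmt"
)

// idPeAcmeIdentifier is the OID assigned in RFC 8737, Section 3.
var idPeAcmeIdentifier = asn1.ObjectIdentifier{1, 3, 6, 1, 5, 5, 7, 1, 31}

// isAcmeIdentifierOID reports whether oid is exactly the final
// id-pe-acmeIdentifier OID. Hypothetical helper for illustration.
func isAcmeIdentifierOID(oid asn1.ObjectIdentifier) bool {
	return oid.Equal(idPeAcmeIdentifier)
}

func main() {
	draftOID := asn1.ObjectIdentifier{1, 3, 6, 1, 5, 5, 7, 1, 30, 1}
	fmt.Println(isAcmeIdentifierOID(idPeAcmeIdentifier)) // final OID
	fmt.Println(isAcmeIdentifierOID(draftOID))           // draft-era OID
}
```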

Both issues have been fixed. All unexpired certificates which contain identifiers validated using the TLS-ALPN-01 challenge type prior to the fix will be revoked by 2022-01-30 16:48 UTC, five days from when we were made aware that they were not issued in accordance with the Baseline Requirements.

Incident Report

How we first became aware of the problem.

Let’s Encrypt received a bug report mentioning both issues described above at 16:48 UTC on 2022-01-25. We immediately began incident response, and had confirmed the presence of both bugs within an hour.

It is worth noting that the Let’s Encrypt codebase contained a comment explicitly documenting that the validation process was accepting both OIDs for the acmeIdentifier extension. The comment explains that this was done on purpose during the time prior to RFC 8737’s standardization in order to avoid breaking early-adopter clients which had incorporated support for the TLS-ALPN-01 challenge and had not been updated to use the correct OID. We should have returned to the comment and corrected it after the standardization. As many software engineers can attest, the presence of a code comment does not always imply that the humans maintaining that code are aware of its presence.

Timeline of incident and actions taken in response.

All times are UTC. Some events are listed only at date granularity because timestamps are not available for RFC and BR publication events.

2018-02-22

  • First draft of RFC 8737 posted, with the old OID and no requirement for TLS 1.2+

2018-06-06

2018-08-13

  • Draft of RFC 8737 updated to specify the new OID

2018-08-20

2019-10-01

  • Draft of RFC 8737 updated to specify TLS 1.2+

2020-02-29

2020-09-22

  • Ballot SC33, which references RFC 8737, becomes effective in the Baseline Requirements. (Incident Begins)

2022-01-25

  • 16:48 Bug report sent to ISRG
  • 17:08 Let’s Encrypt engineers confirm the presence of both bugs
  • 17:41 TLS-ALPN-01 challenge type disabled in Staging
  • 17:57 Fix to require TLS1.2+ merged into Boulder
  • 18:10 Fix to disallow old OID merged into Boulder
  • 18:16 TLS-ALPN-01 challenge type disabled in Production
  • 19:00 New version of Boulder deployed to Staging
  • 19:40 Preliminary incident report posted to Bugzilla
  • 19:49 New version of Boulder deployed to Production
  • 21:18 All unexpired TLS-ALPN-01 authorizations revoked in Staging
  • 22:25 TLS-ALPN-01 challenge type re-enabled in Staging
  • 23:05 Testing confirms Staging rejects validation for clients using TLS <1.2 handshakes

2022-01-26

  • 00:13 All unexpired TLS-ALPN-01 authorizations revoked in Production (Incident Ends)
  • 00:48 TLS-ALPN-01 challenge type re-enabled in Production
  • 00:54 Testing confirms Staging rejects validation for clients using the old OID
  • 00:54 Testing confirms Production rejects validation for clients using the old OID
  • 00:56 Testing confirms Production rejects validation for clients using TLS <1.2 handshakes
  • 01:01 Policy review group determines that revocation of affected certs is necessary within 5 days
  • 04:40 All affected subscribers with valid contact addresses notified

2022-01-28

  • 18:15 Revocation of 826,073 known-replaced affected certificates begins
  • 21:25 Revocation of known-replaced affected certificates complete
  • 22:57 Revocation of all remaining affected certificates begins

Whether we have stopped the process giving rise to the problem or incident.

We are no longer conducting TLS-ALPN-01 validations that are affected by the issues described above. We are no longer issuing certificates based on authorizations that were affected by the issues described above.

During the initial incident response, we disabled the TLS-ALPN-01 challenge so that no further validations could be performed using the affected code path while we investigated and fixed the issue. We revoked all unexpired authorizations which had been validated using TLS-ALPN-01 so that no further certificates could be issued based on affected validations. We then fixed both issues and re-enabled TLS-ALPN-01 validation.

Summary of the affected certificates.

All Let’s Encrypt certificates containing at least one identifier which was validated using the “TLS Using ALPN” method since its formal adoption in the Baseline Requirements are affected. During that period, the TLS-ALPN-01 challenge has consistently been the least used challenge type for validating domain control.

At the time that the issues were fixed on 2022-01-26, there were 2,710,533 unexpired affected certificates. Some of these certificates will naturally expire while we perform revocations.

Complete certificate data for the affected certificates.

The attached files contain crt.sh urls in the format https://crt.sh/?sha256=<cert fingerprint>. The files are compressed using zstd, and sharded to fit in Bugzilla’s attachment file size limit. Together they cover all affected certificates which were unexpired at the time the issue was resolved. For the majority of unique serial numbers, the url points to the final certificate; for those serial numbers which never had a final certificate issued, the url points to the precertificate.
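For readers who want to reproduce a URL in this format from a certificate they hold, a minimal sketch follows. It assumes the fingerprint is the SHA-256 digest of the certificate's DER encoding, rendered as hex; the function name `crtShURL` is hypothetical.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// crtShURL builds a crt.sh lookup URL from a DER-encoded certificate.
// Assumption: the fingerprint is the SHA-256 digest of the DER bytes.
func crtShURL(der []byte) string {
	return fmt.Sprintf("https://crt.sh/?sha256=%x", sha256.Sum256(der))
}

func main() {
	// Placeholder bytes; in practice this would be the raw DER of a
	// parsed certificate (for example, the Raw field of x509.Certificate).
	fmt.Println(crtShURL([]byte("placeholder, not a real certificate")))
}
```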

Explanation of how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

Both bugs were introduced via essentially the same mechanism: code was written to match a draft specification, but then not updated when the draft specification changed, was finalized, or was incorporated into the Baseline Requirements.

In the past year, particularly as part of our remediation for Bug 1715455, we have established a robust process for reviewing all changes to individual root program requirements, the Baseline Requirements, and the documents (often RFCs) which they mention. This review process includes examining our code which implements those requirements and making any appropriate changes. This review process was not yet in place in 2020, when RFC 8737 was published and when the TLS Using ALPN method was incorporated into the BRs.

Also in the past year, we have conducted a retrospective review of all CA Certificate Compliance issues reported in Bugzilla between 2019-06-01 and 2021-06-11 (and a corresponding ongoing review of all new and updated incident tickets). We examine our own practices, documents, and code in relation to these incidents. No other CA has reported an incident related to the TLS Using ALPN verification method specifically or TLS minimum versions in general during that period, so this review also did not cause us to re-examine this section of our codebase.

These bugs avoided detection because we did not have comprehensive processes for reviewing changes at the time that the relevant standards changed, and because no relevant changes or other CA incidents have occurred since such processes were put in place.

List of steps we are taking to resolve the situation and ensure that such situation or incident will not be repeated in the future, accompanied with a binding timeline of when your CA expects to accomplish each of these remediation steps.

For remediation we will be revoking all of the affected certificates and conducting a review specifically of all of our validation code paths to ensure that they are fully compliant with the Baseline Requirements validation methods that they implement.

Revocation of all affected certificates will be completed by 2022-01-30 16:48 UTC, 5 days after we were made aware of the issue. Revocation on a 5-day timeline is required under BRs Section 4.9.1.1 (7): “The CA… MUST revoke a Certificate within 5 days if… [t]he CA is made aware that the Certificate was not issued in accordance with these Requirements…”.

We considered that revocation may be required within 24 hours, under BRs Section 4.9.1.1 (4): “The CA SHALL revoke a Certificate within 24 hours if… [t]he CA obtains evidence that the validation of domain authorization or control for any Fully-Qualified Domain Name or IP address in the Certificate should not be relied upon.”. The determining factor would be whether or not the validations of domain control which were produced by the affected code can be relied upon.

The use of the incorrect OID for the acmeIdentifier extension does not affect the reliability of the challenge. The purpose of the acmeIdentifier extension is to carry the SHA-256 digest of the key authorization for the challenge. The key authorization is a combination of a unique token (provided to the authenticated subscriber at an earlier stage in the ACME protocol) and a thumbprint of the subscriber’s account key. This value can be computed independently by both the ACME client and server, and only by the ACME client and server, and domain control is validated when the client presents the correctly computed value to the server. Whether this value is contained in an extension identified by the old OID or the correct OID does not reflect the client’s ability to compute the correct value, and therefore the validation of domain control can be relied upon.
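The computation described above can be sketched as follows. This is an illustration rather than Boulder's code: it assumes the key authorization is the token and the account key thumbprint joined by a period (RFC 8555) and that the extension value is the DER OCTET STRING wrapping of the SHA-256 digest (RFC 8737). The function names are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/asn1"
	"fmt"
)

// expectedExtensionValue returns the DER OCTET STRING containing the
// SHA-256 digest of the key authorization for the given token and
// account key thumbprint.
func expectedExtensionValue(token, thumbprint string) ([]byte, error) {
	keyAuthorization := token + "." + thumbprint
	digest := sha256.Sum256([]byte(keyAuthorization))
	// asn1.Marshal encodes a []byte as an OCTET STRING.
	return asn1.Marshal(digest[:])
}

// matchesChallenge reports whether the presented extension value equals
// the value the server computed independently.
func matchesChallenge(presented []byte, token, thumbprint string) bool {
	want, err := expectedExtensionValue(token, thumbprint)
	if err != nil {
		return false
	}
	return subtle.ConstantTimeCompare(presented, want) == 1
}

func main() {
	value, err := expectedExtensionValue("example-token", "example-thumbprint")
	if err != nil {
		panic(err)
	}
	fmt.Println(len(value), matchesChallenge(value, "example-token", "example-thumbprint"))
}
```

Note that nothing in this comparison depends on which OID labels the extension or which TLS version carried the certificate; only the digest value is compared.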

Both TLS 1.0 and TLS 1.1 are widely recognized as insecure protocols which are vulnerable to MITM attacks both for eavesdropping and for tampering with the contents of the connection. Again, however, the connection is simply used to transmit the key authorization, which can only be computed by the ACME server and the correct subscriber. The ability of an attacker to eavesdrop on or even modify the content of the connection can only result in a valid challenge attempt becoming unsuccessful, not in a malicious challenge attempt becoming successful. Whether the key authorization is transmitted via TLS 1.0, TLS 1.1, or TLS 1.2+ does not reflect the client’s ability to compute the correct value, and therefore the validation of domain control can be relied upon. (It is worth noting that other accepted validation methods have no protection around the transport of their Request Token or Random Value.)

Because we have concrete evidence that the validation of domain control can be relied upon despite not being conducted in accordance with the Requirements, it is not necessary to revoke within 24 hours.

For our comprehensive review of all three validation methods used by Let’s Encrypt, we will specifically do the following for each method:

  • Gather a group consisting of at least three PKI Staff and PKI Administrators and one member of the ISRG Technical Advisory Board (to provide a fresh perspective).
  • Re-read the subsection of the Baseline Requirements which defines the validation method.
  • Re-read any other documents (such as RFC 8555 or RFC 8737 and related errata) which are referenced by that section of the BRs.
  • Re-read all Boulder code responsible for generating unique random values, providing them to the ACME Subscriber, retrieving a token from the domain in question, and validating that token.

We commit to complete this review in the next six weeks, allowing two weeks for comprehensive review of each validation method.

| Remediation | Status | Due Date |
| --- | --- | --- |
| Revoke all affected unexpired certificates | Started | 2022-01-30 16:48 UTC |
| Comprehensive review of “Agreed-Upon Change to Website - ACME” validation | Not yet started | 2022-03-11 |
| Comprehensive review of “DNS Change” validation | Not yet started | 2022-03-11 |
| Comprehensive review of “TLS Using ALPN” validation | Not yet started | 2022-03-11 |

(In reply to Ryan Sleevi from comment #4)

Could you share more of the Policy Management Authority's analysis about the bounds and scope of the "evidence" clause?

As noted in the incident report above, we are not simply asserting that this is a general case of “there is no evidence that the validation should not be relied upon, therefore the 24-hour timeline does not apply”. Rather, we are asserting that, based on the specific features of the TLS Using ALPN validation method and the other validation methods sanctioned by the Baseline Requirements, there is positive evidence that the validation of domain control can be relied upon.

We do not hold that this analysis is necessarily generalizable to other validation methods or circumstances.

I had always understood that clause to cover exactly this scenario: an inappropriate validation method (whether not blessed or not implemented correctly). It sounds like there's a dispute as to whether this scenario represents said "evidence", and if that's the case, I'm hoping to understand what would remain. Alternatively, it may be that the current language is seen as discretionary to the CA's interpretation, in which case, if the intent is indeed to ensure that "Failure to properly validate" is said evidence, making that explicitly clear through an additional bullet point.

We would be supportive of a modification to the requirements that makes it explicitly clear that “failure to properly validate” in and of itself constitutes evidence that the validation of domain control should not be relied upon, regardless of the other properties of the validation, if that is the intent of the BRs.

As one additional note: we are conscious of the fact that, when justifying decisions not to revoke certificates within a timeframe required by the BRs, responses similar to “we do not deem this to be a security risk” are unacceptable. This is not the case in this instance: we are justifying a determination of which timeframe is mandated by the BRs and then revoking certificates within that time frame.

As of early Saturday, Jan 29, UTC, revocation of all affected certificates was complete. We have started our comprehensive review of the specification for the TLS Using ALPN validation method, and of our code implementing that method.

2022-01-29

  • 00:44 Automatic revocation of all affected certificates complete, less 29 errors
  • 01:17 Revocation of 27 straggler certificates complete
  • 03:43 Revocation of final 2 certificates complete
  • 06:44 Propagation of revocation through caches / CDNs complete

2022-01-31

  • 16:00 Began comprehensive review of TLS Using ALPN specification and implementation

We will provide a progress update on our comprehensive reviews no later than Friday, Feb 18. We will respond to questions and comments before that time but do not intend to provide daily updates, if the root program managers concur.

| Remediation | Status | Date |
| --- | --- | --- |
| Revoke all affected unexpired certificates | Complete | 2022-01-29 06:44 UTC |
| Comprehensive review of “Agreed-Upon Change to Website - ACME” validation | Not yet started | 2022-03-11 |
| Comprehensive review of “DNS Change” validation | Not yet started | 2022-03-11 |
| Comprehensive review of “TLS Using ALPN” validation | Started | 2022-03-11 |

We have completed our review of the code and specifications related to the TLS Using ALPN validation method. Our review included BRs Section 3.2.2.4.20 (TLS Using ALPN), RFC 8737 (ACME TLS-ALPN-01), RFC 5280 (Web PKI Profile), RFC 7301 (TLS ALPN), and RFCs 5246 (TLS 1.2) and 8446 (TLS 1.3). Our review group included eight members associated with Let’s Encrypt and five members providing external perspectives. Our findings are detailed below.

RFC 8737, Section 3 states “The client prepares for validation by constructing a self-signed certificate…”. Boulder’s implementation of the TLS-ALPN-01 method was not checking that the challenge certificate presented by the ACME Client (or its delegate) was self-signed. Although this is not phrased as a normative requirement, we have nevertheless added checks to the Boulder codebase which confirm that the presented certificate has the same Issuer and Subject bytes, and that the signature over the cert validates with the public key contained in the cert.
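A minimal sketch of the two checks described above (matching Issuer and Subject bytes, and a signature that verifies under the certificate's own key) might look like this. The helper names are hypothetical, and `issueCert` exists only to produce throwaway certificates for the demonstration.

```go
package main

import (
	"bytes"
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// checkSelfSigned returns an error unless cert names itself as issuer and
// its signature verifies under its own public key.
func checkSelfSigned(cert *x509.Certificate) error {
	if !bytes.Equal(cert.RawIssuer, cert.RawSubject) {
		return errors.New("issuer and subject bytes differ")
	}
	err := cert.CheckSignature(cert.SignatureAlgorithm, cert.RawTBSCertificate, cert.Signature)
	if err != nil {
		return fmt.Errorf("signature does not verify under the certificate's own key: %w", err)
	}
	return nil
}

// issueCert builds a throwaway certificate for demonstration only. When
// selfSign is false, a second, unrelated key signs the certificate.
func issueCert(subjectCN, issuerCN string, selfSign bool) (*x509.Certificate, error) {
	leafKey, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	signer := leafKey
	if !selfSign {
		if signer, err = ecdsa.GenerateKey(elliptic.P256(), rand.Reader); err != nil {
			return nil, err
		}
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: subjectCN},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
	}
	parent := &x509.Certificate{Subject: pkix.Name{CommonName: issuerCN}}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, parent, &leafKey.PublicKey, signer)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	cert, err := issueCert("example.com", "example.com", true)
	if err != nil {
		panic(err)
	}
	fmt.Println("self-signed check error:", checkSelfSigned(cert))
}
```

Checking the name bytes alone would not be enough, since a certificate can carry identical Issuer and Subject names while being signed by a different key; the signature check covers that case.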

RFC 5280, Section 4.2, states “A certificate MUST NOT include more than one instance of a particular extension.”. The Go standard library’s crypto/x509 package does not enforce this constraint when parsing a DER-encoded certificate; instead, if it encounters a certificate extension twice, the latter instance of that extension “wins”. Although the RFC does not place a requirement on certificate consumers to reject such certificates (c.f. the requirement to reject if the certificate contains an unrecognized critical extension), we have nevertheless added checks to the Boulder codebase which confirm that no extension OID is encountered twice. We have also reported this behavior upstream and it will be resolved in a future release of Go.
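A duplicate-extension check of the kind described above can be sketched as a single pass over the extension OIDs. The function names here are hypothetical and the sketch is independent of how any particular Go release parses certificates.

```go
package main

import (
	"crypto/x509/pkix"
	"encoding/asn1"
	"fmt"
)

// duplicateOID returns the first OID that appears more than once.
func duplicateOID(oids ...asn1.ObjectIdentifier) (asn1.ObjectIdentifier, bool) {
	seen := make(map[string]struct{}, len(oids))
	for _, oid := range oids {
		key := oid.String()
		if _, dup := seen[key]; dup {
			return oid, true
		}
		seen[key] = struct{}{}
	}
	return nil, false
}

// duplicateExtension applies duplicateOID to a certificate's extensions,
// for example the Extensions field of a parsed x509.Certificate.
func duplicateExtension(exts []pkix.Extension) (asn1.ObjectIdentifier, bool) {
	oids := make([]asn1.ObjectIdentifier, len(exts))
	for i, ext := range exts {
		oids[i] = ext.Id
	}
	return duplicateOID(oids...)
}

func main() {
	acmeID := asn1.ObjectIdentifier{1, 3, 6, 1, 5, 5, 7, 1, 31}
	san := asn1.ObjectIdentifier{2, 5, 29, 17}
	exts := []pkix.Extension{{Id: san}, {Id: acmeID}, {Id: acmeID}}
	oid, dup := duplicateExtension(exts)
	fmt.Println(dup, oid)
}
```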

RFC 5246, Section 7.4.1.4 states “There MUST NOT be more than one extension of the same type.”. Similarly, RFC 8446, Section 4.2 states “There MUST NOT be more than one extension of the same type in a given extension block.” The Go standard library’s crypto/tls package does not enforce this constraint when parsing a serverHello message; instead, if it encounters a TLS extension twice, the latter instance of that extension “wins”. Although the RFC does not place a requirement on TLS participants to abort the handshake if the other participant sends duplicate extensions, we have nevertheless reported this behavior upstream.

RFC 5246, Section 7.4.1.4 states “An extension type MUST NOT appear in the ServerHello unless the same extension type appeared in the corresponding ClientHello.”. Similarly, RFC 8446, Section 4.2 states “Implementations MUST NOT send extension responses if the remote endpoint did not send the corresponding extension requests... Upon receiving such an extension, an endpoint MUST abort the handshake with an unsupported_extension alert.”. The Go standard library’s crypto/tls package does not enforce this constraint when a client is validating the server’s hello message and completing the handshake. The TLS 1.2 implementation does return the alertUnsupportedExtension error, but only when unexpected ALPN protocols are presented, not if any unexpected extensions at all are presented. We have reported this behavior upstream.

For the sake of clarity, we reiterate that the latter two findings have not been mitigated. We do not intend to mitigate them directly in the Boulder codebase; we will instead wait for the upstream Golang project to determine what, if any, change should be incorporated into the Go standard library, and then incorporate that change ourselves when it is published as part of a Go release. We understand that CAs are required to carefully vet their dependencies. We stand by our decision to rely on the Go standard libraries for these functions, as evidenced by the speed with which they addressed the duplicate certificate extensions issue. Rolling our own crypto and TLS libraries would inevitably result in more issues than these.

We are reporting all of the above items out of a desire for transparency and sharing knowledge with the community. As noted above, the first three findings are not violations by the CA of the requirements in their respective RFCs; rather they are places where the CA can help check that the Subscriber (or their delegate) is conformant. The final finding does not have any impact on whether the final domain validation can be relied upon. We do not intend to revoke any certificates as a result of any of these findings.

We intend to begin our comprehensive reviews of the "Agreed-Upon Change to Website - ACME" and "DNS Change" methods soon. Please set the Next Update field to 2022-03-11, as previously committed.

Whiteboard: [ca-compliance] → [ca-compliance] Next update 2022-03-11
Summary: Let's Encrypt TLS Using ALPN TLS Version and OID → Let's Encrypt: TLS Using ALPN TLS Version and OID

We have completed our comprehensive reviews of the "Agreed-Upon Change to Website - ACME" (HTTP-01) and "DNS Change" (DNS-01) validation methods.

For the HTTP-01 method, we have no findings. We are making a few minor changes to ensure that we do not become non-compliant in the future:

  • Explicitly filter for the allowed redirect HTTP status codes, rather than relying on the go stdlib to do this
  • Check that the final HTTP status code is 2XX before reading the accompanying body
  • Trim all forms of unicode whitespace from the key authorization, rather than just ASCII whitespace
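The three changes listed above can be sketched as small standalone helpers. These are illustrations, not the Boulder changes themselves: the set of redirect codes (301, 302, 307, 308) reflects our reading of the BRs' redirect language and should be treated as an assumption, and all function names are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
	"unicode"
)

// isAllowedRedirect reports whether status is one of the redirect codes
// assumed to be permitted for HTTP-01 validation in this sketch.
func isAllowedRedirect(status int) bool {
	switch status {
	case http.StatusMovedPermanently, // 301
		http.StatusFound,             // 302
		http.StatusTemporaryRedirect, // 307
		http.StatusPermanentRedirect: // 308
		return true
	}
	return false
}

// isSuccess reports whether the final response status is in the 2XX range.
func isSuccess(status int) bool {
	return status >= 200 && status <= 299
}

// trimKeyAuthorization strips leading and trailing Unicode whitespace,
// not only ASCII whitespace, from a response body.
func trimKeyAuthorization(body string) string {
	return strings.TrimFunc(body, unicode.IsSpace)
}

func main() {
	fmt.Println(isAllowedRedirect(303), isSuccess(204))
	fmt.Printf("%q\n", trimKeyAuthorization("\u00a0token.thumbprint\u2003\n"))
}
```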

For the DNS-01 method, we have one note. BRs Section 3.2.2.4.7 states "If a Random Value is used, the CA SHALL provide a Random Value unique to the Certificate request...". Our infrastructure guarantees that the Random Values used in challenges are globally unique for at least their BRs-specified maximum lifetime (30 days). They are also statistically unique (they contain 256 bits of entropy). We believe that this satisfies all reasonable definitions of "unique".
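For context on the "256 bits of entropy" property, here is a minimal sketch of generating such a value from the operating system's cryptographic random source. It is not Boulder's token generation code; the function name and the base64url encoding are assumptions for this example.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

// newRandomValue returns 32 bytes (256 bits) from the system CSPRNG,
// encoded as unpadded base64url. Hypothetical helper for illustration.
func newRandomValue() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(buf), nil
}

func main() {
	value, err := newRandomValue()
	if err != nil {
		panic(err)
	}
	fmt.Println(len(value), value)
}
```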

This completes our committed remediation items. If there are no further questions, we ask that this issue be closed.

I'll close this tomorrow 2022-03-04, unless there are questions or issues still to discuss.

Flags: needinfo?(bwilson)
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Flags: needinfo?(bwilson)
Resolution: --- → FIXED
Product: NSS → CA Program
Whiteboard: [ca-compliance] Next update 2022-03-11 → [ca-compliance] [dv-misissuance]