Closed Bug 1872371 Opened 1 year ago Closed 6 months ago

Buypass: Using an external DNS Resolver for DNS lookups

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mads.henriksveen, Assigned: mads.henriksveen)

Details

(Whiteboard: [ca-compliance] [ov-misissuance])

Attachments

(2 files)

Incident Report

Summary

Buypass has issued TLS certificates for which the DNS lookups were performed through an external DNS Resolver. An external DNS Resolver can be considered a Delegated Third Party (DTP), and the Baseline Requirements (BR) do not allow a DTP to be used for domain validation.

Impact

177 060 still-active TLS certificates are affected. Approx. 20% of them were issued based on dns-01 (method 7), while the rest were issued based on http-01 (method 19). Because DNS CAA lookups were done through the same external DNS Resolver, the http-01 based certificates are also affected.
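
For a sense of scale, the split between the two validation methods can be estimated from the figures above. This is only a rough sketch: the 20% share is stated as approximate, so the resulting counts are estimates, and the function name `split_by_method` is chosen here for illustration.

```python
# Figures from the Impact section; the 20% share is approximate.
TOTAL_AFFECTED = 177_060
DNS01_SHARE = 0.20


def split_by_method(total, dns01_share):
    """Return (dns-01 count, http-01 count) estimated from an approximate share."""
    dns01 = round(total * dns01_share)
    return dns01, total - dns01


dns01_count, http01_count = split_by_method(TOTAL_AFFECTED, DNS01_SHARE)
print(f"dns-01: ~{dns01_count}, http-01: ~{http01_count}")
```

Under these assumptions, roughly 35 400 certificates were validated with dns-01 and roughly 141 600 with http-01.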

Timeline

All times are UTC+1.

2017-06 Buypass ACME in production (supporting dns-01 and http-01), DNS lookup using internal DNS Resolvers

2017-10:

  • Added support for external DNS resolvers (i.e. ISP and Google)
  • Switched from internal DNS Resolver to external DNS Resolver for Buypass ACME to improve scalability and coverage

2023-06-16:

  • 09:00: We became aware that using external DNS tools like https://toolbox.googleapps.com/apps/dig/ for manual domain validation is not allowed as this is a DTP
  • 09:35: We stopped using this tool for manual domain validation

2023-12-14: We completed automation and removed all manual domain validation steps

2023-12-22:

  • 09:00: We became aware that using external DNS Resolvers is considered to be using a DTP and is not allowed for domain validation
  • 10:00: We started investigation and concluded that Buypass ACME only used external DNS resolvers
  • 11:26: We stopped issuing certificates
  • 23:00: Switched Buypass ACME from using external DNS Resolvers to an internal DNS Resolver
  • 23:15: Resumed certificate issuance. We currently don't allow for reusing domain validations for Buypass ACME (a temporary measure)

2023-12-23: Started to identify affected certificates and subscribers

2023-12-28: Notified the first set of subscribers and encouraged them to renew their certificates immediately

2023-12-29: We continue to notify subscribers

Root Cause Analysis

The root cause of the incident was that Buypass did not understand that using external DNS Resolvers is considered using a Delegated Third Party (DTP). This is similar to the root cause of bug https://bugzilla.mozilla.org/show_bug.cgi?id=1839305.

Buypass has used external DNS Resolvers in Buypass ACME since October 2017. External DNS Resolvers have been extensively used for DNS lookups for non-TLS use, and the team responsible for implementing DNS lookups for the ACME solution was not aware of the specific TLS/BR-requirements and thus didn't consider this to be a problem.

Our interpretation of the main objective of the DTP provisions in BR has been that the CA must ensure compliance with the DTP requirements before delegating delegable functions (which exclude domain validation) to a DTP. We have learned that the converse interpretation is also important, i.e. any external service used for domain validation may be considered to be provided by a DTP and is therefore not allowed.

We understand that for DNS this is quite clear to DNS domain experts in the community, but it is still not very clear to us which types of external services have to be considered as candidates for being provided by a DTP. We will engage in the CA/Browser Forum to clarify these requirements in BR, so that all CAs in the ecosystem have a common understanding and follow the same auditable requirements.

The incident described in bug https://bugzilla.mozilla.org/show_bug.cgi?id=1839305 made us aware of the issue, but our main focus was on removing manual domain validation steps in order to avoid using external web services for manual domain validation. Unfortunately, we didn't focus on identifying similar external services used in our automated solutions.

Lessons Learned

What went well

What didn't go well

  • We didn't understand that using an external DNS Resolver is considered using a DTP
  • We didn't have proper focus on compliance consequences when introducing external DNS Resolvers

Where we got lucky

Action Items

Action Item | Kind | Due Date
Engage in CABF to clarify requirements related to DTP in BR | Prevent | 2024-01-15
Update internal policies with the proper definition of a DTP to ensure that all employees working in this domain are aware of this interpretation | Prevent | 2024-01-15
Update internal processes to ensure that external services considered for use in the TLS domain are assessed against the definitions of a DTP | Prevent | 2024-01-15

Appendix

Details of affected certificates

Affected certificates are certificates that had not expired as of 2023-12-29 12:00 UTC+1.

See attachments.

Assignee: nobody → mads.henriksveen
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Whiteboard: [ca-compliance] [ov-misissuance]

Hello,

please note that the notification to subscribers about this issue has been delegated to a third party: Microsoft Corporation.

The e-mail from no-reply at buypass dot no with subject "Urgent, immediate action required: Renew Buypass ACME (Go SSL) certificates" has been sent through the outlook/office365 servers, and not directly from Buypass.

So, you may also want to check if the third party delegation of e-mail is allowed or not.

Thank you, best regards

(In reply to Vitt G from comment #3)

> Hello,
>
> please note that the notification to subscribers about this issue has been delegated to a third party: Microsoft Corporation.
>
> The e-mail from no-reply at buypass dot no with subject "Urgent, immediate action required: Renew Buypass ACME (Go SSL) certificates" has been sent through the outlook/office365 servers, and not directly from Buypass.
>
> So, you may also want to check if the third party delegation of e-mail is allowed or not.
>
> Thank you, best regards

Hi Vitt, this is not an issue as long as it's not being directly used in domain validation.


Mads, I'm not entirely sure if that's a root cause. Not knowing what would count as a DTP is definitely a cause, but I do not think that not knowing something counts as the root cause.

Could you please further explain why this knowledge gap happened? For example, how does the process of reading and understanding the requirements work at Buypass?

Furthermore, I think that the action items do not prevent an issue like this in the future. My suggestion on this other incident: https://bugzilla.mozilla.org/show_bug.cgi?id=1865368#c4 still applies here. Specifically:

Mads, I would like a commitment from Buypass to:

  • Monitor & Triage every incident report from other CAs that are opened in this bugzilla component. This means reading these incident reports & triaging them and writing why or why not they are relevant to your operations.
  • Triage 2022, and 2023's list of incidents in the same bugzilla component to both increase the understanding on the policy side, and to potentially find more areas that you can improve upon with technical solutions to these problems.

Finally, I believe since this problem was discovered more than a week ago, you should file another incident for the delayed revocation of the impacted certificates.

crt.sh has not been updated since December 13th because its database is in read-only mode:

https://crt.sh/monitored-logs
https://groups.google.com/g/crtsh/c/vmuEQq7xHgM/m/yYwXK0pCAgAJ

I guess it is possible that Cert Spotter just mirrors/uses the crt.sh database.

Buypass ACME dns-01 validation currently seems to be broken for delegated domains (where the validation record is a CNAME), so people may actually be unable to renew their certificates, as per Buypass customer case CSM-56211.

Thanks for the info about crt.sh. However, Cert Spotter is independent of crt.sh and is up-to-date with all Chrome/Apple-recognized logs so that's not the issue.

We were not able to revoke affected certificates within 5 days - see https://bugzilla.mozilla.org/show_bug.cgi?id=1872738.
We will give an update in this bug later this week.

The expectation for this class of incident is not 5 days, it is 24 hours: https://github.com/cabforum/servercert/blob/main/docs/BR.md#4911-reasons-for-revoking-a-subscriber-certificate

The CA obtains evidence that the validation of domain authorization or control for any Fully-Qualified Domain Name or IP address in the Certificate should not be relied upon (CRLReason #4, superseded).
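
To make the two windows under discussion concrete, here is a minimal sketch of the deadline arithmetic. It assumes the clock starts at 2023-12-22 09:00 UTC+1, the time the Timeline gives for Buypass becoming aware; that starting point and the helper name `revocation_deadline` are assumptions for illustration, not something stated in the BR text quoted above.

```python
from datetime import datetime, timedelta, timezone

# The Timeline states that all times are UTC+1.
TZ = timezone(timedelta(hours=1))

# Assumption: the revocation clock starts when Buypass became aware.
aware_at = datetime(2023, 12, 22, 9, 0, tzinfo=TZ)


def revocation_deadline(start, window):
    """Return the latest revocation time for a given start and window."""
    return start + window


deadline_24h = revocation_deadline(aware_at, timedelta(hours=24))
deadline_5d = revocation_deadline(aware_at, timedelta(days=5))

print("24-hour deadline:", deadline_24h.isoformat())
print("5-day deadline:  ", deadline_5d.isoformat())
```

Under that assumption the 24-hour window would have ended on 2023-12-23 at 09:00 UTC+1, and a 5-day window on 2023-12-27 at 09:00 UTC+1.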

Flags: needinfo?(mads.henriksveen)

Buypass has used external DNS Resolvers in Buypass ACME since October 2017. External DNS Resolvers have been extensively used for DNS lookups for non-TLS use, and the team responsible for implementing DNS lookups for the ACME solution was not aware of the specific TLS/BR-requirements and thus didn't consider this to be a problem.

7 months ago Andrew Ayer made it clear that using third-party DNS providers is not permitted: https://bugzilla.mozilla.org/show_bug.cgi?id=1838421#c8

You filed an incident because of that: https://bugzilla.mozilla.org/show_bug.cgi?id=1839305

In that incident you said:

The root cause of this incident was that Buypass did not understand that using externally operated DNS tools must be considered using a Delegated Third Party (DTP). We acknowledge that this is the case and have stopped using externally operated DNS tools.
Buypass has used such tools occasionally when performing manual domain validations using information from DNS.
We understand that this has been described as problematic in previous bugs and acknowledge that we should have learned by following the conversations in these bugs.

First, how is the root cause in that bug materially different from the root cause in this bug?

Second, you were told this is a problem, and then went ahead and made the fix by just repeating the same mistake again. Could you please explain what you are doing to stop these misinterpretations of domain validation & CA operation?


Finally, I want to hear if you have any concerns with committing to what I requested earlier in this thread: https://bugzilla.mozilla.org/show_bug.cgi?id=1872371#c4

Specifically:

Mads, I would like a commitment from Buypass to:

  • Monitor & Triage every incident report from other CAs that are opened in this bugzilla component. This means reading these incident reports & triaging them and writing why or why not they are relevant to your operations.
  • Triage 2022, and 2023's list of incidents in the same bugzilla component to both increase the understanding on the policy side, and to potentially find more areas that you can improve upon with technical solutions to these problems.

This is not an unreasonable request. See when Let's Encrypt did this: https://bugzilla.mozilla.org/show_bug.cgi?id=1715455#c37

7 months ago it was about externally operated DNS tools.
This is about externally operated DNS recursion.

To me personally it is not clear how DNS recursion is equivalent to "domain validation" and, as such, why external DNS recursion must be considered a DTP, unless the authoritative DNS operations (domain, TLD and root DNS servers) are also DTPs, in which case no CA could ever use DNS.

Hi Lukas!

The expectation is that CAs:

  1. Ask the authoritative nameserver for a given DNS address for the DNS information (e.g. do not use externally operated recursive DNS resolvers).
  2. Validate DNSSEC, if available, as part of the DNS responses.

The issue here is that initially Buypass was using https://toolbox.googleapps.com/apps/dig/ to perform DNS validation for some of the domains. The fix that Buypass made was to move that validation to using 8.8.8.8 directly through the DNS protocol.

These are materially the same thing. Arguably the toolbox method is even a bit safer since the responses come over https, and not over the plaintext DNS protocol.

The problem with Buypass's response to this is that they did not actually solve the root of the problem there - which was the use of externally operated DNS lookup tools since that makes them a DTP. They just simply switched from one form of it to another.

This topic has come up a couple of times in the past, and the expectation is that CAs monitor incidents in Bugzilla so they can learn from them and apply what they learn to their own environments.
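
As a rough illustration of the first expectation above, a configuration check along the following lines could flag resolver addresses that are not operated by the CA itself. This is a sketch under stated assumptions: `CA_OPERATED_NETWORKS`, the 10.20.0.0/16 range and the function name are hypothetical, and 8.8.8.8 is the public resolver mentioned in this comment. It only classifies configured addresses; it performs no DNS lookups and no DNSSEC validation.

```python
import ipaddress

# Hypothetical: networks that the CA itself operates.
# Not taken from the incident report.
CA_OPERATED_NETWORKS = [ipaddress.ip_network("10.20.0.0/16")]


def externally_operated(resolver_ips):
    """Return the configured resolver addresses outside CA-operated networks."""
    external = []
    for ip in resolver_ips:
        addr = ipaddress.ip_address(ip)
        if not any(addr in net for net in CA_OPERATED_NETWORKS):
            external.append(ip)
    return external


# Example configuration: one hypothetical internal resolver, one public resolver.
configured = ["10.20.0.53", "8.8.8.8"]
print(externally_operated(configured))
```

Any address this reports would need to be assessed against the DTP definition before being used for domain validation or CAA lookups.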

Hello Amir,

thanks for elaborating. I'm wondering if forcing CAs to revoke certificates based on loose understandings in the community is a good and scalable approach. Can't those understandings be documented in the BR before being enforced? This would make it a lot clearer for everyone involved, I'd think. By clarity I mean less back and forth and less text to go through in other incidents.

But alas the status quo is what it is, so let's see what other CAs are doing with regard to DNS resolution.

I will start with:

Google Trust Services: uses "DNSSec-mostly" and DTPs for DNS resolution
https://bugzilla.mozilla.org/show_bug.cgi?id=1873739

Lukas, I don't control the speed at which the BRs update. It's actually not even possible for me to really participate in the CABForum process as I am neither a browser nor a CA.

However, despite that, CAs are also told to monitor Bugzilla, MDSP, etc for various discussions that do come up. My concern here is that the fix for the initial mistake was repeating the same mistake again, in another form.

Thank you for finding another potentially misbehaving CA!

(In reply to amir from comment #15)

> Hi Lukas!
>
> The expectation is that CAs:
>
>   1. Ask the authoritative nameserver for a given DNS address for the DNS information (e.g. do not use externally operated recursive DNS resolvers).
>   2. Validate DNSSEC, if available, as part of the DNS responses.
>
> The issue here is that initially Buypass was using https://toolbox.googleapps.com/apps/dig/ to perform DNS validation for some of the domains. The fix that Buypass made was to move that validation to using 8.8.8.8 directly through the DNS protocol.
>
> These are materially the same thing. Arguably the toolbox method is even a bit safer since the responses come over https, and not over the plaintext DNS protocol.
>
> The problem with Buypass's response to this is that they did not actually solve the root of the problem there - which was the use of externally operated DNS lookup tools since that makes them a DTP. They just simply switched from one form of it to another.
>
> This topic has come up a couple of times in the past, and the expectation is that CAs monitor incidents in Bugzilla so they can learn from them and apply what they learn to their own environments.

In the first incident we were made aware that using an externally operated web service like https://toolbox.googleapps.com/apps/dig/ for manual domain validations is to be considered a DTP and not allowed for domain validation. Although we had missed these discussions in Bugzilla, we acknowledged that this was the case and stopped using the web service immediately, thus solving the issue in the first incident.

In the second incident we were made aware that using a recursive DNS Resolver also is to be considered a DTP and thus not allowed. This was not so obvious as we considered using a recursive DNS Resolver for DNS lookups as a common way of doing DNS lookups. Based on the comments received, in particular the precise description from Andrew Ayer, we acknowledged that this also is a DTP.

However, even though this is a common understanding among DNS experts in the community, it's not obvious to everyone else, including us. This is why I raised this as an issue in https://bugzilla.mozilla.org/show_bug.cgi?id=1839305#c15, i.e. this should be clarified by the CA/Browser Forum. This must be understood and handled equally by all CAs in the ecosystem and therefore we need precise and auditable requirements in the Baseline Requirements.

I will emphasize that we didn't switch from using one DTP (the externally operated web service) to another when we were made aware of this in the first incident. This second incident is about the use of recursive DNS Resolvers in Buypass ACME since 2017.

In retrospect, we see that we should have understood that using a recursive DNS Resolver is the same type of DTP as using an external web service. But this is the core of the root cause in our case: we did not connect these different types of external services to the same issue.

(In reply to amir from comment #17)

> Lukas, I don't control the speed at which the BRs update. It's actually not even possible for me to really participate in the CABForum process as I am neither a browser nor a CA.
>
> However, despite that, CAs are also told to monitor Bugzilla, MDSP, etc for various discussions that do come up. My concern here is that the fix for the initial mistake was repeating the same mistake again, in another form.
>
> Thank you for finding another potentially misbehaving CA!

Anybody may attend the CA/Browser Forum as an Interested Party - see https://cabforum.org/interested-parties/.

This is an update on the Action Items:

  • We are engaging in CABF Server Certificate Working Group (SCWG) to clarify requirements in BR related to DTP.

  • We have updated internal policies with the proper definition of a DTP to ensure that all employees working in the TLS domain are aware of this interpretation.

  • We have updated internal processes to ensure that external services considered for use in the TLS domain are assessed against the definition of a DTP.

Internal policies and processes will be further updated based on clarifications in BR.

We have no new information in this bug.

We kindly request the NextUpdate field be set to 2024-04-03 - the same as for https://bugzilla.mozilla.org/show_bug.cgi?id=1872738 as these bugs are related.

Whiteboard: [ca-compliance] [ov-misissuance] → [ca-compliance] [ov-misissuance] Next update 2024-04-03
Whiteboard: [ca-compliance] [ov-misissuance] Next update 2024-04-03 → [ca-compliance] [ov-misissuance] Next update 2024-05-06
Whiteboard: [ca-compliance] [ov-misissuance] Next update 2024-05-06 → [ca-compliance] [ov-misissuance] Next update 2024-07-15

We have no new information in this bug and all action items have been closed. If there are no more comments or questions, I suggest we close this bug.

> We have no new information in this bug and all action items have been closed. If there are no more comments or questions, I suggest we close this bug.

I'll close this bug on or about Friday, 19-July-2024.

Flags: needinfo?(bwilson)
Flags: needinfo?(bwilson)
Whiteboard: [ca-compliance] [ov-misissuance] Next update 2024-07-15 → [ca-compliance] [ov-misissuance]
Status: ASSIGNED → RESOLVED
Closed: 6 months ago
Resolution: --- → FIXED