GoDaddy : CAA checks did not properly handle issuewild tag allowing FQDN SANs to be added to wildcard certs
Categories
(CA Program :: CA Certificate Compliance, task)
Tracking
(Not tracked)
People
(Reporter: star, Assigned: star)
Details
(Whiteboard: [ca-compliance] [ov-misissuance] [dv-misissuance])
Attachments
(1 file)
Preliminary Incident Report
Summary
On 6/23/2024, GoDaddy received a certificate problem report alerting us to potential concerns with CAA checking. While investigating the report, GoDaddy identified a software bug in the CAA validation process. When a domain's CAA records listed godaddy.com or starfieldtech.com in the 'issuewild' tag and NOT in the 'issue' tag, the bug allowed the FQDN of that domain name to be included as a SAN on a wildcard certificate. This is a violation of the Baseline Requirements for the Issuance and Management of Publicly-Trusted TLS Server Certificates, section 3.2.2.8, which states a "CA MUST retrieve and process CAA records in accordance with RFC 8659 for each dNSName in the subjectAltName extension." Specifically, it violates RFC 8659, section 4.3, which states that "Each issuewild Property MUST be ignored when processing a request for an FQDN that is not a Wildcard Domain Name."
A full incident report will be published by Friday, 7/5/2024.
Comment 1•7 months ago
We observed and reported this bug alongside bug 1904749 as part of our regular scanning of CT logs, in which we check the CAA records for all SANs in published certificates.
Observation
GoDaddy issues certificates containing both FQDN and wildcard names even though the relevant CAA records authorize GoDaddy only for wildcards, i.e., GoDaddy is included only as a value of the issuewild tag.
Examples
Certificate Digest | SAN | CAA* |
---|---|---|
e1f111cd270afe68c96d7477cda2c55801d148c7 | [*.gamesalad.com, gamesalad.com] | {"issue":["amazonaws.com", "letsencrypt.org", "amazontrust.com", "awstrust.com", "amazon.com"], "issuewild":["starfieldtech.com", "godaddy.com"]} |
c7c9d519ed96de2a040c310e09197d22481c3ffd | [*.carisls.com, carisls.com] | {"issue":["letsencrypt.org", "godaddy.com", "amazon.com"],"issuewild":["letsencrypt.org"]} |
* CAA issue and issuewild values are shortened here to the issuer-domain-names only.
Assignee
Comment 2•7 months ago
Assignee
Comment 3•7 months ago
Incident Report
Summary
GoDaddy issues TLS certificates and must adhere to the CA/Browser Forum Baseline Requirements, including BR 3.2.2.8, which states:
As part of the Certificate issuance process, the CA MUST retrieve and process CAA records in accordance with RFC 8659 for each dNSName in the subjectAltName extension that does not contain an Onion Domain Name.
On 2024-06-23 we were informed in a Certificate Problem Report (CPR) that a researcher believed they had identified a bug in our implementation of RFC 8659 Section 4.3. The CPR suggested that we may not be conforming to the requirement that:
Each issuewild Property MUST be ignored when processing a request for an FQDN that is not a Wildcard Domain Name.
The reporter provided two examples of certificates they believed were affected by this issue.
Upon examination of the two examples included in the report, we determined that our code was mistakenly taking the presence of the issuewild tag alone as permission to issue both the wildcard domain and the wildcard’s Fully Qualified Domain Name (FQDN). This could be permissible if no CAA records with the issue tag were present on the FQDN of the wildcard domain. However, in the case where the wildcard domain had a CAA issuewild permitting GoDaddy to issue the wildcard domain, but a CAA issue record forbidding us from issuing the FQDN of the wildcard domain existed, our code did not correctly process the issue tag and did not prevent issuance of the FQDN of the wildcard domain.
As an example, suppose a certificate was requested for both the site.com and *.site.com SANs, and the CAA records were:
site.com. 3600 IN CAA 0 issue "myotherca.com"
site.com. 3600 IN CAA 0 issuewild "godaddy.com"
In the case described above, a GoDaddy certificate would have been issued for both the FQDN of the wildcard (site.com) and the wildcard domain name (*.site.com). Because an issue record exists for another CA and none exists for GoDaddy, only the wildcard domain name is permitted.
This violates RFC 8659, section 4.3, which states: "Each issuewild Property MUST be ignored when processing a request for an FQDN that is not a Wildcard Domain Name."
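The expected outcome for the site.com example can be checked with a minimal Python sketch of the RFC 8659 rule. It is an illustration only and not GoDaddy's code: the function name `ca_may_issue`, the record representation and the issuer-domain set are assumptions made here, and the sketch omits tree climbing, the critical flag and issuer parameters.

```python
# Simplified sketch of RFC 8659 processing for a single SAN.
# Assumption: `records` is the Relevant Resource Record Set already located
# for the name, given as (tag, issuer_domain) pairs.

def ca_may_issue(san, records, ca_domains):
    """Return True if a CA identified by `ca_domains` may issue for `san`."""
    issue = [value for tag, value in records if tag == "issue"]
    issuewild = [value for tag, value in records if tag == "issuewild"]

    if san.startswith("*.") and issuewild:
        # Wildcard name: issuewild, when present, takes precedence over issue.
        relevant = issuewild
    else:
        # Non-wildcard name: issuewild is ignored.
        # Wildcard name without issuewild: fall back to issue.
        relevant = issue

    if not relevant:
        # No applicable property, so CAA does not restrict issuance.
        return True
    return any(value in ca_domains for value in relevant)


records = [("issue", "myotherca.com"), ("issuewild", "godaddy.com")]
godaddy = {"godaddy.com", "starfieldtech.com"}

for name in ("site.com", "*.site.com"):
    print(name, ca_may_issue(name, records, godaddy))
# site.com False
# *.site.com True
```

Evaluating each SAN separately is what makes the difference: the base domain is judged only against the issue records, so the issuewild entry for godaddy.com cannot authorize it.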
Impact
We believe the first problematic certificate was issued on 2017-10-06 and the last was issued on 2024-06-26 prior to the code fix being deployed. During that period, we surfaced 847 active certificates (DV 805, OV 42) that had this problem. We surfaced an additional 2398 certificates which would have required revocation but were already revoked or expired. Nine certificates with the identified problem were issued between the time the CPR was received and the time we confirmed the existence of the problem and applied the bug fix. During the period from 2017-10-06 to 2024-06-26, we issued approximately 126.0M certificates overall.
By the time the revocation occurred on 2024-06-28, four of the initially surfaced active certificates had expired, so on the day of the revocation we revoked 843 certificates for this issue (see Appendix for detailed listing).
Timeline
All times are UTC.
2013-01-01 00:00 - RFC 6844 defining CAA is published
2015-01-15 00:00 - GoDaddy publishes CP/CPS 3.6 explaining in section 4.1.1 that CAA records are currently not checked under RFC 6844, but may be checked in the future
2017-08-09 00:42 - GoDaddy deploys CAA logic into main application
2017-08-15 00:00 - GoDaddy publishes CP/CPS 3.12, details how CAA records are checked in Section 4.1.1, and enables CAA logic in main application
2017-09-08 00:00 - CAB Forum Releases TLS BR 1.5.8 which specifies CAs must check for CAA Records in section 3.2.2.8
2017-10-06 18:41 - GoDaddy issues first certificate with improper check of issuewild record (Serial: 6ed1bfab66a67d05 )
2019-08-27 00:53 - GoDaddy refactors elements of CAA checking functionality into a microservice
2019-11-01 00:00 - RFC 8659 published, which included some clarifications to RFC 6844
2024-06-23 13:54 - GoDaddy receives CPR email from researcher
2024-06-23 16:23 - GoDaddy Registration Authority (RA) Administrator flags the email/ticket as a CPR concerning CAA records and requests supervisor support
2024-06-23 16:44 - GoDaddy RA Supervisor contacts the PKI Development Team, alerting them to possible TLS BR violations
2024-06-23 17:24 - GoDaddy responds to CPR reporter, confirming that an investigation has started based on the information provided
2024-06-24 15:08 - GoDaddy PKI Development Team starts reviewing issue and impact
2024-06-24 18:46 - GoDaddy PKI Development Team completes review and confirms the issue raised is a valid compliance issue
2024-06-24 19:59 - GoDaddy PKI Development Team presents preliminary findings confirming the presence of the bug identified in the CPR
2024-06-24 22:00 - GoDaddy PKI Leadership meets and assigns tasks for response and remediation
2024-06-25 00:23 - GoDaddy contacts customers whose certificates appeared in the examples provided in the CPR about pending revocation and steps to update CAA records to permit issuance
2024-06-25 16:00 - GoDaddy runs initial queries to identify potentially affected certificates
2024-06-26 02:12 - GoDaddy creates bug PRB1904748 and posts a Preliminary Incident Report to Bugzilla
2024-06-26 14:33 - GoDaddy issues last certificate with issuewild problem (Serial: 00c60581f4069f17f6)
2024-06-26 22:56 - GoDaddy PKI Development Team deploys first part of fix for issuewild problem, stopping problematic requests from entering the system
2024-06-27 05:55 - GoDaddy PKI Development Team deploys second part of fix for issuewild problem, handling any requests that were already in the system beforehand, had an expired CAA TTL, or made it past the first part of the fix
2024-06-27 18:00 - GoDaddy confirms no additional problematic certificates issued post-fix
2024-06-27 20:28 - GoDaddy confirms final list of problematic active certificates (see appendix below)
2024-06-28 03:51 - GoDaddy begins contacting customers whose certificates were identified during investigation via email and/or phone to inform them of pending revocations
2024-06-28 12:00 - GoDaddy revokes problematic certificates included in the initial CPR (within 5 days of receipt of the CPR on 2024-06-23 13:54)
2024-06-28 22:08 - GoDaddy revokes problematic certificates surfaced during investigation (within 5 days of confirming the bug in the system on 2024-06-24 19:59)
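The two five-day windows cited in the last timeline entries can be verified with simple date arithmetic. The sketch below uses only the UTC timestamps listed in the timeline; the helper name `within_five_days` is made up for this illustration.

```python
from datetime import datetime, timedelta

FIVE_DAYS = timedelta(days=5)

def within_five_days(start, revoked):
    """Return True if `revoked` falls no later than five days after `start`."""
    return revoked - start <= FIVE_DAYS

# (label, start of the window, time of revocation), all UTC, from the timeline.
windows = [
    ("CPR examples", datetime(2024, 6, 23, 13, 54), datetime(2024, 6, 28, 12, 0)),
    ("Investigation results", datetime(2024, 6, 24, 19, 59), datetime(2024, 6, 28, 22, 8)),
]

for label, start, revoked in windows:
    print(label, revoked - start, within_five_days(start, revoked))
# CPR examples 4 days, 22:06:00 True
# Investigation results 4 days, 2:09:00 True
```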
Root Cause Analysis
Background
GoDaddy’s original implementation of the CAA record check was written to follow the requirements outlined in RFC 6844 and deployed in August 2017. As noted in other Bugzilla reports as well as CA/Browser Forum discussions of RFC 6844, there was some ambiguity in the original RFC. In our design process we reviewed the requirements surrounding the issuewild property in Section 5.3 of RFC 6844:
issuewild properties MUST be ignored when processing a request for a domain that is not a wildcard domain.
If at least one issuewild property is specified in the relevant CAA record set, all issue properties MUST be ignored when processing a request for a domain that is a wildcard domain.
Based on our interpretation of the above, we implemented the requirement for the entire wildcard certificate, rather than the specific wildcard domain, meaning that an issuewild record would allow for both the wildcard domain itself and the FQDN of the wildcard domain to be issued. In investigating our ticket tracking system, it appears the requirement was mistakenly captured in both 2013 and 2017 as:
“issuewild - supersedes issue if the cert type [emphasis added] is a wildcard, if found, follow same logic as issue property for wildcards.” This drove the initial design of the CAA system.
When we moved some of the logic to a microservice in August of 2019, the publication of RFC 8659 was still two months away, so nothing had changed in authoritative RFC 6844 at the time of the microservice migration to signal that we should re-evaluate that logic. Unfortunately, it appears that once the RFC was published, our logic was not re-evaluated.
Additionally, one feature of our non-ACME certificates is that we automatically attach the FQDN of the wildcard to the certificate request, to enable a wildcard certificate to function not only on arbitrary subdomains, but on the FQDN of the wildcard domain as well (e.g., mail.example.com and example.com). Our code handled this well for domain validation, but due to the misinterpretation of issuewild, our code did not perform a separate CAA check on the FQDN of the wildcard. If an issuewild record was present, it overrode the issue record for the whole certificate, rather than applying only to the wildcard name.
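The interaction between the two behaviors described above can be illustrated with a hypothetical reconstruction in Python. This is not GoDaddy's code: `add_base_domain` and `flawed_cert_check` are invented names that only mimic the described behavior, namely adding the base domain to a wildcard request and then making a single CAA decision per certificate type rather than per name.

```python
# Hypothetical reconstruction of the flaw, for illustration only.

def add_base_domain(sans):
    """Mimic the described non-ACME feature: add the base domain of each wildcard."""
    result = list(sans)
    for san in sans:
        if san.startswith("*."):
            base = san[2:]
            if base not in result:
                result.append(base)
    return result


def flawed_cert_check(sans, records, ca_domains):
    """One decision for the whole certificate, applied to every SAN."""
    issue = [value for tag, value in records if tag == "issue"]
    issuewild = [value for tag, value in records if tag == "issuewild"]
    is_wildcard_cert = any(san.startswith("*.") for san in sans)

    # Flaw: for a wildcard certificate, issuewild supersedes issue for all
    # names, including the non-wildcard base domain.
    relevant = issuewild if (is_wildcard_cert and issuewild) else issue
    allowed = not relevant or any(value in ca_domains for value in relevant)
    return {san: allowed for san in sans}


records = [("issue", "myotherca.com"), ("issuewild", "godaddy.com")]
sans = add_base_domain(["*.site.com"])
print(flawed_cert_check(sans, records, {"godaddy.com", "starfieldtech.com"}))
# {'*.site.com': True, 'site.com': True}
```

Under RFC 8659 the entry for site.com should be False here, because the only issue record names another CA.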
There was one developer who primarily worked on this code. Our code peer review (PR) process at the time was to have approval from one other developer. Starting September 9, 2022 we began to expand that to have PRs be reviewed by multiple developers, and have additional controls for PRs relating to sensitive code. If we had used this practice then, we may have caught these errors, but all the code was long committed by that point.
Finally, the infrequency of the situation made this problem less apparent. Most of our customers have no CAA records on their domains at all, and of those that do, most of them are properly accepted or rejected by GoDaddy. The case where a customer would want to authorize GoDaddy to issue the exact wildcard domain, but by the presence of CAA issue records for CAs not including GoDaddy, prohibit GoDaddy from issuing to the FQDN of the wildcard is rare.
Missed Requirement
Misinterpretation of the scope of the issuewild tag in the original RFC 6844 Section 5.3 and subsequent missed opportunity to reevaluate when clarity was added in RFC 8659 Section 4.3 in 2019.
Not fully understanding the CAA implications of defaulting non-ACME wildcard certificate requests to include both the exact wildcard domain and the FQDN of the wildcard as a feature.
Requirement Fix
The bug fix included changes which brought our main application and microservice into compliance with RFC 8659. The issuewild meaning was scoped correctly. For certificates containing wildcard domains, explicit CAA checks are now done for the FQDN of the wildcard rather than relying on the CAA checks for the wildcard alone.
Deployment
The fix was rolled out in two phases. The first deployment on 2024-06-26 22:56 expanded the checks that we do when we are accepting a certificate request from a requestor. A fast-follow at 2024-06-27 05:55 added the same logic for existing requests, so that if the TTL of a request's initial CAA check had expired and the check needed to be repeated prior to issuance, the repeated check would also adhere to RFC 8659. The CAA record audit data before and after the incident was then analyzed for additional violations using database surfacing queries.
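One way to picture the second phase: a stored CAA result is reused only while it is still fresh, and otherwise the check is repeated before issuance. The sketch below assumes the reuse window from BR 3.2.2.8 (the TTL of the CAA record or 8 hours, whichever is greater). The function and parameter names are illustrative and do not describe GoDaddy's system.

```python
from datetime import datetime, timedelta

# Assumed floor for reusing a CAA result, per BR 3.2.2.8.
MIN_REUSE_WINDOW = timedelta(hours=8)

def caa_check_is_fresh(checked_at, record_ttl_seconds, now):
    """Return True if a CAA result obtained at `checked_at` may still be used."""
    window = max(timedelta(seconds=record_ttl_seconds), MIN_REUSE_WINDOW)
    return now - checked_at <= window


checked_at = datetime(2024, 6, 26, 12, 0)

# A record with a one-hour TTL is still covered by the 8-hour floor.
print(caa_check_is_fresh(checked_at, 3600, datetime(2024, 6, 26, 19, 0)))   # True
# After the window has passed, the check has to be repeated before issuance.
print(caa_check_is_fresh(checked_at, 3600, datetime(2024, 6, 26, 21, 0)))   # False
```

A request that fails this freshness test is the case the second deployment addressed: the repeated check has to apply the same per-name logic as the check at request intake.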
Lessons Learned
What went well
We have well-developed bulk revocation capabilities which enabled us to complete the needed certificate revocations quickly and within timelines.
We persisted adequate CAA data to surface affected certificates, even if it was somewhat difficult to parse.
While it took time to ascertain the issue and surface the problem, the CPR process ultimately worked and drove positive change to our system.
What didn't go well
It took time to confirm the affected certificates from a larger pool of potentially affected certificates. We developed scripts to filter through the results and will have them available if another event occurs, but putting them together took additional time due to the novelty of the issue.
Because CAA records are not widely understood, it was more difficult to communicate to customers how they could resolve the situation.
Where we got lucky
This was reported to us via a certificate problem report.
Less than 1% of our certificates under management cover domains that have non-empty CAA records. This helped limit the negative effects of the mis-issuance.
Action Items
| Action Item | Kind | Due Date |
| ----------- | ---- | -------- |
| Add additional unit tests to check CAA records scenarios | Prevent | 2024-06-26 |
| Add synthetic monitor tests to validate our system is correctly detecting CAA records which prevent issuance | Prevent | 2024-10-10 |
Comment 4•7 months ago
Did GoDaddy at any point stop issuance? The Impact and Timeline sections imply that issuance continued despite full knowledge of the issue.
Assignee
Comment 5•7 months ago
Thank you for your question. No, we did not stop issuance. Our investigation determined that this specific issue was extremely rare and that it was likely that a very limited number or zero certificates would be affected between the time we confirmed the issue and applied the fix. Our assessment was, in fact, correct: only nine certificates were affected, out of 1.37M issued (or 0.0007%), during the period from receipt of the CPR to deployment of the first part of the fix (2024-06-23 13:54 to 2024-06-26 22:56).
Once we isolated the bug and assessed the issue was extremely rare, we focused efforts on the bug fix and subsequently helping affected customers through the rekey process in a compressed time window. We were confident in our ability to expeditiously identify and revoke any that may have been issued between the time we confirmed the bug and the time the bug was fixed. We elected to continue issuing certificates during this period, because stopping issuance would have been extremely impactful and disruptive to our customers whose certificates were not affected. Ultimately, we selected the path that would result in the least disruption, while still being compliant with the requirements of the BRs and our CP/CPS.
Comment 6•5 months ago
Please provide an update on "Add synthetic monitor tests to validate our system is correctly detecting CAA records which prevent issuance".
Assignee
Comment 7•5 months ago
Thank you for the question, Ben. We are actively working on adding the synthetic monitor tests to validate detection of CAA records which prevent issuance. We expect the rollout to be ahead of our defined timeline in early October and will update the incident once complete.
Assignee
Comment 8•4 months ago
As of 9/3/2024, synthetic monitoring has been deployed and is operating as expected. All action items related to this incident have been completed.
Comment 9•3 months ago
As there have been no questions or comments, and it appears that all action items related to this incident have been completed, I will close this matter on or about Wed. 30-Oct-2024.