Closed Bug 1462844 Opened 6 years ago Closed 5 years ago

GoDaddy: Improper DER results in failure to comply with RFC 5280 - Invalid characters in PrintableString

Categories

(CA Program :: CA Certificate Compliance, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ryan.sleevi, Assigned: dreynolds)

Details

(Whiteboard: [ca-compliance] [ev-misissuance])

Example certs:
https://crt.sh/?id=250008707&opt=cablint,x509lint,zlint,ocsp
https://crt.sh/?id=49843724&opt=zlint,cablint,x509lint,ocsp

This certificate fails to parse, as it contains a Subject attribute with a serialNumber field, and the serialNumber contains a backslash ("\"), which is not a valid character for a PrintableString.
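
To illustrate the check involved, here is a minimal sketch of validating a string against the PrintableString alphabet (letters, digits, space, and the punctuation ' ( ) + , - . / : = ?). The function names are hypothetical and this is not how any particular linter implements it; it only shows why a backslash is rejected.

```python
import string

# PrintableString alphabet: A-Z, a-z, 0-9, space, and ' ( ) + , - . / : = ?
PRINTABLE_STRING_CHARS = frozenset(
    string.ascii_letters + string.digits + " '()+,-./:=?"
)


def invalid_printable_chars(value: str) -> list[str]:
    """Return the distinct characters in value not allowed in a PrintableString."""
    return sorted({ch for ch in value if ch not in PRINTABLE_STRING_CHARS})


def is_printable_string(value: str) -> bool:
    """True if every character of value is in the PrintableString alphabet."""
    return not invalid_printable_chars(value)


if __name__ == "__main__":
    # Hypothetical serialNumber values, for illustration only.
    for candidate in ["1234-5678", "12\\34"]:
        print(repr(candidate), is_printable_string(candidate),
              invalid_printable_chars(candidate))
```

A value that fails this check would need to be encoded as a different string type (for example UTF8String) where the relevant profile allows it, or be rejected before issuance.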

It's further unclear if this represents a potential violation of 7.1.4.2(j), as to whether such a backslash is a "metadata" field, but I believe it reasonably can be argued as such.
Whiteboard: [ca-compliance]
Daymion: This appears to be a misissued certificate. Please provide an incident report, as described here:
https://wiki.mozilla.org/CA/Responding_To_A_Misissuance#Incident_Report
The incident report should be posted to the mozilla.dev.security.policy forum and added to this bug.
Flags: needinfo?(dreynolds)
Given https://bugzilla.mozilla.org/show_bug.cgi?id=1391429#c6 , Google Chrome has requested that GoDaddy examine 100% of its non-revoked, unexpired certificates for non-conformities, and report these as part of its incident report.
Update
7520919390119884311 was revoked on May 19, 2018 at 3:20 PM AZ time, inside of 24 hours
4549563154035492128 was revoked on May 19, 2018 at 3:09 PM AZ time, inside of 24 hours

We ran through the entire list of certificates and will have the results as part of our incident report.
Flags: needinfo?(dreynolds)
Assignee: wthayer → dreynolds
CA first became aware:

We first became aware of the malformed certificates https://crt.sh/?id=250008707&opt=cablint,x509lint,zlint,ocsp & https://crt.sh/?id=49843724&opt=zlint,cablint,x509lint,ocsp  via a Bugzilla bug report on 5/18 and an email to practices@.

Timeline of the actions: 

5/18 1am UTC: Upon reviewing and verifying that the certs did indeed have a defect, we started our revocation and rekey process by contacting the certificate owners. The owners were not immediately reachable and/or needed time to perform the certificate swap.
5/19 10pm UTC: Certificates were revoked after owner contact, well within the required 24-hour period.

CA has stopped issuing defect: 

The identified certificates were defective due to a bug that dated back to pre-2015. This defect rarely occurred. In February 2018 the issue recurred, but was caught/prevented by the linter. We corrected the defect on 2/8/2018.

Summary of the problematic certificates & Complete certificate data:

The printable string defect was found in the following certificates:

This bug:

https://crt.sh/?id=250008707&opt=cablint,x509lint,zlint,ocsp
https://crt.sh/?id=49843724&opt=zlint,cablint,x509lint,ocsp

Additionally, upon scanning our certificate store we identified:

https://crt.sh/?id=167970618&opt=cablint,zlint
https://crt.sh/?id=246757501&opt=cablint,x509lint,zlint 

All certificates with the defect were revoked within 24hrs following identification.

Explanation about how and why the mistakes were made or bugs introduced:

The defect was caused by improper handling of extended Unicode characters.

List of steps your CA is taking to resolve the situation:

Certificates were revoked and rekeyed. In November 2017, linting was added to the provisioning pipeline to prevent future occurrences.
Did you not scan certificates for the issue when fixing the defect in February?
We did scan. As to why these additional certs did not surface is currently unknown. We have since scanned every certificate under management with multiple certificate linting tools and posted to the Mozilla list our findings. We have also added zlint alongside the pre-existing certlint in our provisioning process, so we now use two linters.
(In reply to Daymion Reynolds from comment #6)
> We did scan. As to why these additional certs did not surface is currently
> unknown. We have since scanned every certificate under management with
> multiple certificate linting tools and posted to the Mozilla list our
> findings.

It's difficult to understand how these two statements can be seen as compatible. Without knowing why these additional certificates were not detected, why is there confidence that every certificate has been scanned?

If the issue is one of technical failure, then that suggests that it may still be an issue, and the fix on 2/8/2018 may have been incomplete.
If the issue is one of procedural failure, then that suggests that there is opportunity to improve the process for future incident events.

One of the greatest areas of concern in the CA ecosystem is that, upon detecting misissuance and without something like CT in place, the ecosystem is almost entirely reliant on the CA to adequately detect and scan its own systems. The accuracy of statements such as "We scanned all of our certificates" can be difficult to determine, particularly if and when counter-examples are detected.

One way to improve the process is, when scanning all certificates, to make some form of structured 'commitment' as to which certificates were scanned. For example, attaching a list of every serial number for every issuer that was examined can help determine whether or not a given certificate was scanned at the time of detection. If that certificate was not scanned, its absence will be noted. If the certificate was scanned, then it can be determined that the method of scanning / detection is itself deficient, and that there are process opportunities to improve there.
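
As a rough sketch of what such a commitment could look like, the following builds a sorted manifest of (issuer, serial) pairs and a SHA-256 digest over it. The function names, the tab-separated entry format, and the idea of publishing the digest alongside the list are all assumptions for illustration, not an established practice or anything GoDaddy has described.

```python
import hashlib


def build_scan_manifest(scanned):
    """Build a manifest and digest from (issuer_name, serial_hex) pairs.

    Entries are de-duplicated, normalized to lowercase serials, and sorted
    so that the same set of certificates always yields the same digest.
    """
    entries = sorted(
        {f"{issuer}\t{serial.lower()}" for issuer, serial in scanned}
    )
    digest = hashlib.sha256("\n".join(entries).encode("utf-8")).hexdigest()
    return entries, digest


def was_scanned(entries, issuer, serial_hex):
    """Check whether a given certificate appears in a published manifest."""
    return f"{issuer}\t{serial_hex.lower()}" in entries


if __name__ == "__main__":
    # Hypothetical issuer name and serial numbers, for illustration only.
    scan = [("Example Issuing CA", "0A1B"), ("Example Issuing CA", "0c2d")]
    entries, digest = build_scan_manifest(scan)
    print(digest)
    print(was_scanned(entries, "Example Issuing CA", "0A1B"))
    print(was_scanned(entries, "Example Issuing CA", "ffff"))
```

If a later counter-example appears in the manifest, the scanning method missed it; if it is absent, the scan's input set was incomplete. Either answer narrows down where the process failed.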

In post-mortems, sometimes the answer is, unfortunately, "We don't know". However, the steps for the post-mortem should lay out "But here's how we'll improve in the future", so that the next time something unfortunately happens, the data is there. In the CA ecosystem, it's further necessary to explore ways in which that data can be shared/publicized or otherwise publicly committed to, since a system based on trust is one that requires checks and balances.
Ryan: are you waiting for a response from GoDaddy, or can this be closed?
Flags: needinfo?(ryan.sleevi)
I'm still concerned that it's not clear why they were not originally detected. I'm not entirely confident that the new process is better than the old process, even though the new process clearly caught these new certs, because it's not clear what changed (if anything).

I'd feel better with a mitigation being steps to make sure that issues with future scans can be detected/understood, as a process change, and documented here. I'm more concerned that certs were missed during the initial scan than the bug/issue itself, and even though it was caught subsequently, am not sure if that's due to luck or due to improved processes not mentioned here.

However, I'm OK with closing this out in the absence of it, and treating it more seriously if it were to happen again.
Flags: needinfo?(ryan.sleevi)

Resolving per Ryan's comments.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Product: NSS → CA Program
Whiteboard: [ca-compliance] → [ca-compliance] [ev-misissuance]