Closed Bug 1711432 Opened 7 months ago Closed 6 months ago

Telekom Security: Certificate with invalid FQDN

Categories

(NSS :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: Arnold.Essing, Assigned: Arnold.Essing)

Details

(Whiteboard: [ca-compliance])

We have issued a certificate with an FQDN starting with a hyphen.
A detailed incident report will follow.

Assignee: bwilson → Arnold.Essing
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Whiteboard: [ca-compliance]

1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in the MDSP mailing list, a Bugzilla bug, or internal self-audit), and the time and date.
Our internal QA includes a periodic check of https://crt.sh/?lint=1+week for errors and warnings. On 2021-05-16 09:30, this check revealed that a certificate (https://crt.sh/?id=4522571849&opt=cablint) had been issued with an invalid FQDN (the FQDN starts with a hyphen) in the commonName and the SubjectAlternativeName.

2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.
(Times in UTC)
2019:
2019-03: An update for the relevant CA is provided which, among other things, is meant to prevent “incorrect” FQDNs from being accepted as valid input. This functionality is tested successfully (including the rule that an FQDN must not begin or end with a hyphen).
2019-08: Another update (next version) is deployed. It introduces a bug that allows the software to accept FQDNs starting with a hyphen.

2021-05-14 (Friday):
2021-05-14 10:37:18: An Enterprise RA issues a number of certificates for several subdomains, some of which have the format “<some name>-wru.schnelltestportal.de”. In one of these certificates, the Enterprise RA mistakenly (human error) entered only “-wru.schnelltestportal.de”, which neither the software itself nor the activated zLint detected as an invalid FQDN. Therefore, issuance was not prevented.

2021-05-16 (Sunday):
2021-05-16 09:30: The internal QA checks https://crt.sh/?lint=1+week and detects the ERROR. The responsible personnel are informed.
2021-05-16 11:01: Confirmation of mis-issuance by the responsible personnel. As a first reaction, further issuance of certificates under the affected CA is stopped and revocation of the erroneous certificate is decided upon.
2021-05-16 13:15: Issuance of certificates under the corresponding CA is stopped.
2021-05-16 19:32: Customer revokes affected certificate.

2021-05-17 (Monday):
2021-05-17 06:49: Our software vendor is informed about the unintended behaviour of the software.
2021-05-17 07:00: Management Call – Further evaluation of the problem. It is confirmed that the software does not behave as intended and does not block FQDNs with a hyphen at the beginning. Apart from that, the software works as intended (blocks other special characters, spaces etc.). In order to resume issuance in a timely manner, it is decided to activate certLint for the relevant templates as an immediate measure.
2021-05-17 09:19: Software vendor confirms that the software has a bug and that this bug has existed since 2019.
2021-05-17 09:38: Opening this bug.
2021-05-17 11:59: The relevant templates are updated to contain certlint and x.509-lint.
2021-05-17 12:22: Tests confirm that issuance of certificates with invalid FQDNs (especially those with a hyphen at the beginning) is now successfully blocked.
2021-05-17 12:45: Management Call – The decision to resume issuance is made. Additional measures are evaluated.
2021-05-17 13:05: Issuance of certificates is resumed.

3. Whether your CA has stopped, or has not yet stopped, certificate issuance or the process giving rise to the problem or incident. A statement that you have stopped will be considered a pledge to the community; a statement that you have not stopped requires an explanation.
Issuance of certificates under the corresponding CA was stopped on 2021-05-16 13:15 UTC and was resumed on 2021-05-17 13:05 UTC, after the templates had been updated to contain the necessary linters.

4. In a case involving certificates, a summary of the problematic certificates. For each problem: the number of certificates, and the date the first and last certificates with that problem were issued. In other incidents that do not involve enumerating the affected certificates (e.g. OCSP failures, audit findings, delayed responses, etc.), please provide other similar statistics, aggregates, and a summary for each type of problem identified. This will help us measure the severity of each problem.
The affected certificate was issued to an internal customer and is the only occurrence of mis-issuance as described in this bug (other occurrences have not been found). It was issued on 2021-05-14 10:37:18 UTC and was revoked on 2021-05-16 19:32:02 UTC.

5. In a case involving certificates, the complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem. In other cases not involving a review of affected certificates, please provide other similar, relevant specifics, if any.
The complete certificate data can be seen at: https://crt.sh/?id=4522571849

6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.
In our opinion, the mistake was possible for two main reasons. On the one hand, the software itself was supposed to prevent submission of invalid FQDNs entirely. Surprisingly to us, it did not. On the other hand, this mistake could also have been avoided if certLint had been active (zLint did not detect this mistake).
Software failure: In early 2019, an update was provided which, among other features, ensured that only “correct” FQDNs were accepted. This functionality was tested successfully, including the rule that FQDNs do not start with a hyphen (the available test protocols allowed us to verify that this had indeed been tested). In August 2019, a new version was provided which introduced a bug that, in some scenarios, allowed FQDNs to begin with a hyphen (exactly how this bug came about is unclear). Since the FQDN functionality had been tested successfully for the prior version, it was not tested again for the new version, and the bug was not detected.
Inactive certLint: Similar to our other bug (https://bugzilla.mozilla.org/show_bug.cgi?id=1703528), the configuration of linters for this CA is part of the templates, and the relevant templates only included zlint (which did not detect the mistake).
The bug stayed undetected until now because the testing of new versions no longer covered FQDN validation. Also, there have not been any occurrences of this bug in the past.
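
For illustration only, the "no hyphen at the beginning or end" rule discussed above can be sketched as a per-label check. This is not the vendor's code or any linter's implementation; the function name invalid_labels, the 63/253 length limits applied here, and the handling of a leading "*." wildcard are assumptions made for this sketch.

```python
import re
from typing import List

# One DNS label: 1-63 characters from letters, digits and hyphens,
# with no hyphen in the first or last position.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def invalid_labels(fqdn: str) -> List[str]:
    """Return the labels of fqdn that break the rule (empty list means OK).

    Hypothetical helper for illustration; it is not taken from any CA software.
    """
    name = fqdn[:-1] if fqdn.endswith(".") else fqdn  # tolerate a trailing root dot
    if name.startswith("*."):                         # assumption: skip a wildcard label
        name = name[2:]
    if not name or len(name) > 253:
        return [fqdn]
    return [label for label in name.split(".") if not _LABEL.fullmatch(label)]


if __name__ == "__main__":
    for candidate in ("abc-wru.schnelltestportal.de", "-wru.schnelltestportal.de"):
        print(candidate, "->", invalid_labels(candidate) or "OK")
```

A check of this kind, kept in the regression suite for every new software version, would have flagged the name from this incident because its first label begins with a hyphen.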

7. List of steps your CA is taking to resolve the situation and ensure that such situation or incident will not be repeated in the future, accompanied with a binding timeline of when your CA expects to accomplish each of these remediation steps.
Immediate measure (already implemented): The relevant templates have been configured to include not only zlint but also certlint and x.509lint. This way, further mis-issuances with invalid FQDNs should no longer be possible. Issuance of certificates was resumed shortly after the change had been implemented and successfully tested.
Further measures (to be implemented soon):
Bugfix: A patch will be deployed to fix the bug that was introduced in 2019. It is expected that this hotfix will be provided by our software vendor within the next two days and that it will be deployed to the production environment at the beginning of June (after testing has been performed).
Configuration of Linters: As mentioned in https://bugzilla.mozilla.org/show_bug.cgi?id=1703528, an update is currently being tested that will centralize the linting configuration. It is still expected that this update will be deployed to the production environment in week 21 or 22.

We will provide an update on the progress next week at the latest.

Thank you for this incident report, and your swift response up to now.

Unrelated to subject, but related to the certificate that needed to be revoked:

I noticed that the revoked certificate asserts subject:serialNumber = 1. According to your CPS, that means that this field is included to ensure that the subject information is unique across certificates, but this seems like arbitrarily inserted data that cannot be validated or supplied by anyone other than the CA.

Could you help me understand your decision making process as to why you've included a field that effectively contains only metadata [1] in your certificate?

[1] I chose to use the word "metadata" here, as it indicates that a subject:serialNumber of N is probably the N-th requestor of such certificate, which is information only available to and useful for the CA. For unique addressing of the certificate itself, the certificateSerialNumber should be sufficient. I think that this might, but also might not, qualify as metadata under BR s7.1.4.2. I'm on the fence, but leaning towards 'metadata as specified in s7.1.4.2'.

Flags: needinfo?(Arnold.Essing)

It might be worth noting that zlint (or at least the version on crt.sh) does not detect this issue, so other CAs might also have it and not detect it.

Hello Matthias,
you raised an interesting point. First, we would like to clarify that, in this case, the serialNumber is not used to differentiate between “actually different” subjects. Instead, follow-up certificates (renewal, re-issue…) for the same subject are given an incremented serialNumber. On the one hand, this means that (in our opinion) third parties do not need to worry about being able to distinguish “different” subjects since they are all the same subject and, to our understanding, this data does not fit the definition of “metadata” from the baseline requirements. On the other hand, we agree that this data is probably unnecessary/expendable and, to be on the safe side, should be removed from the subjectDN. We currently plan to do so in a timely manner, probably within the upcoming week.

Flags: needinfo?(Arnold.Essing)

The hotfix to prevent hyphens at the beginning of FQDNs is planned to be deployed in week 23. Along with it, the CA software update to centralize the configuration of linters for this PKI service will be deployed (as mentioned in https://bugzilla.mozilla.org/show_bug.cgi?id=1703528). We therefore do not expect to have any new information in the upcoming week and plan to provide the next update in week 23 at the latest.

Also, in regard to Matthias' point, we removed the serialNumber from the subjectDN today.

By "week 23", I'm assuming somewhere between 2021-06-07 and 2021-06-13. Just making sure we're measuring weeks from the same starting point ;)
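
For anyone checking the date range, here is a minimal sketch, assuming "week 23" refers to ISO 8601 week numbering (weeks run Monday to Sunday). The helper name iso_week_range is made up for this illustration.

```python
from datetime import date, timedelta
from typing import Tuple


def iso_week_range(year: int, week: int) -> Tuple[date, date]:
    """Return (Monday, Sunday) of the given ISO 8601 week."""
    monday = date.fromisocalendar(year, week, 1)  # available since Python 3.8
    return monday, monday + timedelta(days=6)


if __name__ == "__main__":
    start, end = iso_week_range(2021, 23)
    print(start, end)  # 2021-06-07 2021-06-13
```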

Flags: needinfo?(Arnold.Essing)

Yes, that assumption is correct. More precisely, the change will be between 2021-06-07 and 2021-06-09.

Flags: needinfo?(Arnold.Essing)

The hotfix to prevent hyphens at the beginning of FQDNs was deployed yesterday (2021-06-07). Along with it, the CA software update to centralize the configuration of linters (as mentioned in https://bugzilla.mozilla.org/show_bug.cgi?id=1703528) was also deployed for this PKI service.
Please let us know if more information is needed. Otherwise, we would consider this incident as resolved.

I will call this up for resolution / closure on or about this Friday, 11-June-2021, unless there is more information needed/requested.

Flags: needinfo?(bwilson)
Status: ASSIGNED → RESOLVED
Closed: 6 months ago
Flags: needinfo?(bwilson)
Resolution: --- → FIXED