Open Bug 1524733 Opened 10 months ago Updated 3 months ago

CFCA: invalid dnsNames

Categories

(NSS :: CA Certificate Compliance, task)

task
Not set

Tracking

(Not tracked)

ASSIGNED

People

(Reporter: jonathan, Assigned: jonathansshn)

Details

(Whiteboard: [ca-compliance] - Next Update - 1-October 2019)

CFCA issued these two (pre-)certificates with invalid dnsNames. I reported them to CFCA's problem reporting email address and didn't receive a response; however, both were revoked the same day.

https://crt.sh/?opt=zlint&id=741188922 (reported 2019-01-28 04:08 UTC)
https://crt.sh/?opt=zlint&id=862381016 (reported 2019-01-28 04:15 UTC)

CFCA: Please provide an incident report, as per https://wiki.mozilla.org/CA/Responding_To_An_Incident

I've informed CFCA of this bug via email.

Whiteboard: [ca-compliance]

Hi, this is Jonathan Sun from CFCA. The following is our report for Bugzilla bug 1524733:
This bug concerns two certificates that were issued with an invalid dnsName and an invalid wildcard.
After Jonathan Rudenberg reported these two certificates by email on January 28th, we reported the issue and required these certificates to be revoked. The revocations were completed on January 28th, and you can check them at https://crt.sh/?id=741188922&opt=zlint and https://crt.sh/?id=862381016&opt=zlint .
We have stopped issuing such incorrect certificates. The misissuance was due to a lack of operator review after the RA system decoded the CSRs. We now require the operator and an internal auditor to check the decoded results to ensure the content is correct.
An auto-correct and warning function has been submitted to our R&D department so that this will not happen again.

Jonathan: thank you for this incident report.

(In reply to Jonathan Sun from comment #2)

Hi, this is Jonathan Sun from CFCA. The following is our report for Bugzilla bug 1524733:
This bug concerns two certificates that were issued with an invalid dnsName and an invalid wildcard.
After Jonathan Rudenberg reported these two certificates by email on January 28th, we reported the issue and required these certificates to be revoked. The revocations were completed on January 28th, and you can check them at https://crt.sh/?id=741188922&opt=zlint and https://crt.sh/?id=862381016&opt=zlint .

As described at [1], please provide a timeline of the actions CFCA took in response. Here is an example [2].

We have stopped issuing such incorrect certificates. The misissuance was due to a lack of operator review after the RA system decoded the CSRs. We now require the operator and an internal auditor to check the decoded results to ensure the content is correct.

Please explain why your system does not automatically block this problem? What other checks rely on an operator's review?

An auto-correct and warning function has been submitted to our R&D department so that this will not happen again.

When will these functions be deployed?
Will CFCA be deploying other functions to prevent misissuance?
Does CFCA perform post-issuance linting? If so, why were these certificates not detected?
Does CFCA plan to implement pre- and post-issuance linting? If so, when?

[1] https://wiki.mozilla.org/CA/Responding_To_An_Incident#Incident_Report
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=1448986#c1

Assignee: wthayer → jonathansshn
Flags: needinfo?(jonathansshn)

Have you linted your currently valid certificates to find others that have invalid dnsNames?

QA Contact: kwilson → wthayer
  1. Problem Report:
    CFCA became aware of these two problematic certificates through two email reports from Jonathan Rudenberg on January 28, 2019.
  2. Timeline:
    January 28, 2019: After receiving his emails, we immediately checked the CA database and contacted the customers. They explained that the certificates had not been deployed on servers, and CFCA revoked both certificates the same day.
  3. Statement:
    CFCA has stopped issuing certificates with this problem.
  4. Summary:
    There are two certificates with this problem.
    The first certificate, with the wrong dnsName *yhjypt.com, was issued on August 13, 2018.
    The second certificate, with the wrong dnsName * .conac.cn, was issued on September 13, 2018.
  5. Certificate Data:
    Please visit https://crt.sh/?id=862381016&opt=zlint and https://crt.sh/?id=741188922&opt=zlint to check the data.
  6. Explanation:
    This problem was due to the lack of a "hard fail" detection mechanism and to relying too much on the discipline and skill of employees. As a result, these two issues were not found until we were informed of them.
  7. Steps:
    • Update the system with a hard fail mechanism. This was finished on February 27, 2019.
    • Monthly training for employees on the BR requirements.
    • Monthly internal audit of the CFCA EV Root CA to prevent future problems.
    The following are our responses to Wayne:
    • As described at [1], please provide a timeline of the actions CFCA took in response.
    Please see the "Timeline" section above.
    • Please explain why your system does not automatically block this problem? What other checks rely on an operator's review?
    The system lacked a "hard fail" detection mechanism, which has now been added. Internal audit is the other method CFCA uses to perform checks.
    • When will these functions be deployed?
    These functions were deployed on February 27, 2019.
    • Will CFCA be deploying other functions to prevent misissuance?
    We plan to deploy additional detection software that applies the BR and EV requirements within the system to prevent misissuance.
    • Does CFCA perform post-issuance linting? If so, why were these certificates not detected?
    So far, CFCA has performed post-issuance linting manually, as part of the internal audit, and errors such as a stray blank space or a missing dot in a dnsName are hard to spot by eye.
    • Does CFCA plan to implement pre- and post-issuance linting? If so, when?
    The new version of the detection mechanism was deployed on February 27, 2019 to implement pre-issuance linting. A post-issuance linting tool will be deployed this year for internal auditors to prevent misissuance.
    Thanks again to Wayne, Jonathan, and Kathleen for pointing out this problem.
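
For readers following along, here is a rough sketch of what a "hard fail" dnsName check of the kind described above might look like. It is not CFCA's implementation and not zlint; the function names `dnsname_errors` and `hard_fail_check` are hypothetical, and the rules are a simplified subset (whitespace, wildcard form, label syntax) chosen because they catch the two names reported in this bug.

```python
import re

# One DNS label: letters, digits and hyphens, 1-63 characters,
# not starting or ending with a hyphen (simplified; assumption).
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def dnsname_errors(name: str) -> list[str]:
    """Return a list of problems found in a single dnsName (empty if none)."""
    errors = []
    if any(ch.isspace() for ch in name):
        errors.append("contains whitespace")
    rest = name
    if name.startswith("*"):
        if name.startswith("*."):
            rest = name[2:]
        else:
            # e.g. "*yhjypt.com": the dot after the wildcard is missing
            errors.append("wildcard not followed by '.'")
            rest = name[1:]
    labels = rest.split(".")
    if len(labels) < 2:
        errors.append("fewer than two labels")
    for label in labels:
        if not _LABEL.fullmatch(label):
            errors.append(f"invalid label {label!r}")
    return errors


def hard_fail_check(names: list[str]) -> None:
    """Refuse issuance (raise) if any requested name has a problem."""
    problems = {n: dnsname_errors(n) for n in names}
    problems = {n: e for n, e in problems.items() if e}
    if problems:
        raise ValueError(f"issuance blocked: {problems}")
```

Calling `hard_fail_check(["*yhjypt.com"])` or `hard_fail_check(["* .conac.cn"])` raises, while `hard_fail_check(["*.example.com"])` returns normally. The point of raising rather than warning is that the operator cannot proceed past the error.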
Flags: needinfo?(jonathansshn)

(In reply to Jonathan Rudenberg from comment #4)

Have you linted your currently valid certificates to find others that have invalid dnsNames?
Yes, we have linted our currently valid certificates to make sure that no more invalid dnsNames are in use.

Yes, we have linted our currently valid certificates to make sure that no more invalid dnsNames are in use.

I found this currently valid, unrevoked certificate with a null byte at the end of one of the SANs: tms.yillionbank.com\x00

https://crt.sh/?id=1043325361&opt=zlint
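
A trailing null byte is invisible when the name is displayed, which is presumably why a by-eye review would miss it. A minimal byte-level sketch that flags it is below; the function name `non_printable_positions` is hypothetical, and the rule (every byte must be printable, non-space ASCII) is a simplification rather than the exact zlint logic.

```python
def non_printable_positions(raw: bytes) -> list[tuple[int, int]]:
    """Return (offset, byte value) for every byte outside printable,
    non-space ASCII (0x21-0x7E) in the raw dNSName value."""
    return [(i, b) for i, b in enumerate(raw) if not 0x21 <= b <= 0x7E]


san = b"tms.yillionbank.com\x00"
print(non_printable_positions(san))  # [(19, 0)]
```

Running the check on the raw bytes from the certificate, rather than on a decoded or displayed string, matters here because some string handling stops at or hides the null byte.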

(In reply to Jonathan Rudenberg from comment #7)

This bug was fixed in the February 27, 2019 update. The certificate was revoked on February 28, 2019, and we confirmed with the customer that no damage was caused.

This misissued certificate was also noted in [1]: https://crt.sh/?id=1231965201&opt=zlint

Can you please explain how these certificates were missed when you linted certificates for invalid dnsNames and provide incident reports for them?

[1] https://groups.google.com/d/msg/mozilla.dev.security.policy/0Pf-ExrXaNY/NooJ2L3KAAAJ

Flags: needinfo?(jonathansshn)

(In reply to Jonathan Rudenberg from comment #9)

Regarding the incorrect subjectAltName input in this case: as Jakob Bohm said in https://groups.google.com/d/msg/mozilla.dev.security.policy/0Pf-ExrXaNY/NooJ2L3KAAAJ , the CAA check could not fully prevent this from happening. We checked the production log, and this error was caused by an operator's manual input. In the February 27 update we finished system changes that automatically check the TLD in the common name and the subjectAltNames; after this update, the system reports a wrong TLD input as "invalid TLD". Additional training has been given to operators.
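
As an illustration only, an automatic TLD check of the kind described could look like the sketch below. It assumes the list of valid TLDs comes from a file in the format IANA publishes (one upper-case TLD per line, `#` comment lines); the function names and the sample list are hypothetical, and only the "invalid TLD" message is taken from the comment above.

```python
from typing import Optional


def parse_tld_list(text: str) -> set[str]:
    """Parse a one-TLD-per-line list, skipping blank and '#' comment lines."""
    return {
        line.strip().lower()
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }


def tld_error(name: str, valid_tlds: set[str]) -> Optional[str]:
    """Return "invalid TLD" if the last label of name is not a known TLD."""
    tld = name.rstrip(".").rsplit(".", 1)[-1].lower()
    return None if tld in valid_tlds else "invalid TLD"


# Tiny hand-written sample for illustration, NOT the real IANA file.
sample = "# sample excerpt\nCN\nCOM\nORG\n"
tlds = parse_tld_list(sample)
print(tld_error("www.example.com", tlds))  # None
print(tld_error("www.example.con", tlds))  # invalid TLD
```

A check like this only validates the last label. It would need to run alongside character and wildcard checks, since a malformed dNSName can still end in a valid TLD.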

Flags: needinfo?(jonathansshn)

Jonathan: That explanation does not explain how it was missed. That is, the CAA checking is orthogonal and unrelated to this certificate, as this certificate had a malformed dNSName.

The request in Comment #9 was an explanation about how these certificates were not detected based on Comment #5. As it stands, I don't believe there is high confidence that CFCA's past examination was correct, and thus it is difficult to believe that there are no further issues.

A path forward on this is to describe what steps CFCA used to scan its existing certificates in Comment #5, why those steps failed to detect the certificates in Comment #7 and Comment #9, and what steps CFCA is taking to:

  1. Rescan its database of issued certificates
  2. Understand why the existing scan failed
  3. Address whatever systemic issues are revealed through that investigation

I highlight this, because a failure of a CA to detect previously misissued certificates, after it has claimed it has scanned them, is a very serious issue, as much or more serious than the misissuance itself. Multiple CAs have been distrusted for failing to detect other certificates they've misissued after an incident, such as those two highlighted, and thus CFCA should endeavor to understand why they also failed, and provide a thorough update to the community about how they will be preventing such failures in the future.

Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Flags: needinfo?(jonathansshn)

(In reply to Ryan Sleevi from comment #11)

The earlier "scanning" of the database was based on very basic logic that finds only obvious mistakes, such as a DNS name containing an invalid character. Because CFCA's certificate issuance volume is not very large, we processed applications using a manual double check, which relied on the internal auditor's skills. This is why the mistake happened.

We understand that review by human eyes was too weak to catch less visible problems such as illegal characters, blank spaces, and illegal dnsNames.

We found that relying too much on manual work was the main cause. Stricter checks based on RFC 5280, the IANA TLD restrictions, and the CA/B Forum BR were submitted to our R&D engineers, and these restrictions have been added to the production system. The audit checking tool is still under development, but in the meantime we checked the existing data in a simulated offline environment and found no more valid certificates containing invalid dnsNames.

In the future, the CFCA system will automatically pre-check every part of the customer input according to RFC 5280, the IANA TLD restrictions, and the CA/B Forum BR, and the production system will then repeat the same checks that the pre-check performs.
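
To make the "same checks at pre-check, in production, and over existing data" idea concrete, here is a small sketch of a rescan that applies one shared list of check functions to every stored certificate. The record layout (certificate ID plus list of SAN dnsNames), the function names, and the two simplified checks are all assumptions for illustration, not a description of CFCA's tooling.

```python
from typing import Callable, Iterable

# A check takes one dnsName and returns a list of problem descriptions.
Check = Callable[[str], list[str]]


def check_characters(name: str) -> list[str]:
    """Flag any character outside printable, non-space ASCII."""
    bad = any(not 0x21 <= ord(ch) <= 0x7E for ch in name)
    return ["contains whitespace or non-printable character"] if bad else []


def check_wildcard(name: str) -> list[str]:
    """A wildcard is accepted only as a single leading '*.' label."""
    if "*" in name and not (name.startswith("*.") and name.count("*") == 1):
        return ["malformed wildcard"]
    return []


def rescan(
    records: Iterable[tuple[str, list[str]]], checks: list[Check]
) -> dict[str, list[str]]:
    """Apply every check to every name; return findings keyed by cert ID."""
    findings: dict[str, list[str]] = {}
    for cert_id, names in records:
        problems = [
            f"{name!r}: {msg}"
            for name in names
            for check in checks
            for msg in check(name)
        ]
        if problems:
            findings[cert_id] = problems
    return findings
```

Because `rescan` takes the list of checks as a parameter, the issuance path and the audit of already-issued certificates can share exactly the same functions, so a rule added for new certificates also gets applied to existing ones.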

Flags: needinfo?(jonathansshn)

I'm uncertain as to how to understand this response.

If I understand correctly, CFCA's scanning of the database was to perform a human review of all of the domain names it issued, and this human review failed to detect the issues, even after being alerted to them on this issue. Despite developing tools (in Comment #5) to prevent new certificates, existing certificates were not examined for their compliance with this.

If I understand correctly, CFCA is planning to replace this with automated systems. However, no clear timeline is provided for these automated systems, based on Comment #0 and Comment #5.

Is that correct?

Flags: needinfo?(jonathansshn)
Flags: needinfo?(wthayer)

Emailed POCs on 2019-07-04 regarding this issue, highlighting https://wiki.mozilla.org/CA/Responding_To_An_Incident#Keeping_Us_Informed

(In reply to Ryan Sleevi from comment #13)

Hi, Ryan

Your understanding is correct. We have planned to replace this with automated systems to avoid relying too much on the discipline and skill of employees.

This plan began earlier this year. After the February emergency update to the operation flow and RA system, we planned a complete update schedule, which includes automated validation and strict automated audit functions. The whole schedule should be completed in late September 2019 and be in use in the production system by the end of November 2019 (after strict testing and bug fixing).

Thanks again to Wayne, Jonathan, and Kathleen for pointing out this problem, and thanks to Ryan for your opinions and help.

Flags: needinfo?(jonathansshn)

The whole schedule should be completed in late September 2019 and be in use in the production system by the end of November 2019 (after strict testing and bug fixing).

Oliver: please update this bug when the update is completed and when it is used in production, and if the schedule changes.

Flags: needinfo?(wthayer)
Whiteboard: [ca-compliance] → [ca-compliance] - Next Update - 1-October 2019