Closed Bug 1391089 Opened 7 years ago Closed 7 years ago

WISeKey: Non-BR-Compliant Certificate Issuance

Categories

(CA Program :: CA Certificate Compliance, task)

Type: task
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: kathleen.a.wilson, Assigned: pfuentes)

References

Details

(Whiteboard: [ca-compliance] [ov-misissuance])

The following problems have been found in certificates issued by your CA, and reported in the mozilla.dev.security.policy forum. Direct links to those discussions are provided for your convenience.

To continue inclusion of your CA’s root certificates in Mozilla’s Root Store, you must respond in this bug to provide the following information:
1) How your CA first became aware of the problems listed below (e.g. via a Problem Report, via the discussion in mozilla.dev.security.policy, or via this Bugzilla Bug), and the date.
2) Prompt confirmation that your CA has stopped issuing TLS/SSL certificates with the problems listed below.
3) Complete list of certificates that your CA finds with each of the listed issues during the remediation process. The recommended way to handle this is to ensure each certificate is logged to CT and then attach a CSV file/spreadsheet of the fingerprints or crt.sh IDs, with one list per distinct problem.
4) Summary of the problematic certificates. For each problem listed below: number of certs, date first and last certs with that problem were issued.
5) Explanation about how and why the mistakes were made, and not caught and fixed earlier.
6) List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.
7) Regular updates to confirm when those steps have been completed.

Note Section 4.9.1.1 of the CA/Browser Forum’s Baseline Requirements, which states:
“The CA SHALL revoke a Certificate within 24 hours if one or more of the following occurs: …
9. The CA is made aware that the Certificate was not issued in accordance with these Requirements or the CA’s Certificate Policy or Certification Practice Statement; 
10. The CA determines that any of the information appearing in the Certificate is inaccurate or misleading; …
14. Revocation is required by the CA’s Certificate Policy and/or Certification Practice Statement; or 
15. The technical content or format of the Certificate presents an unacceptable risk to Application Software Suppliers or Relying Parties (e.g. the CA/Browser Forum might determine that a deprecated cryptographic/signature algorithm or key size presents an unacceptable risk and that such Certificates should be revoked and replaced by CAs within a given period of time).

However, it is not our intent to introduce additional problems by forcing the immediate revocation of certificates that are not BR compliant when they do not pose an urgent security concern. Therefore, we request that your CA perform careful analysis of the situation. If there is justification to not revoke the problematic certificates, then explain those reasons and provide a timeline for when the bulk of the certificates will expire or be revoked/replaced.

We expect that your forthcoming audit statements will indicate the findings of these problems. If your CA will not be revoking the certificates within 24 hours in accordance with the BRs, then that will also need to be listed as a finding in your CA’s BR audit statement.

We expect that your CA will work with your auditor (and supervisory body, as appropriate) and the Root Store(s) that your CA participates in to ensure your analysis of the risk and plan of remediation is acceptable. If your CA will not be revoking the problematic certificates as required by the BRs, then we recommend that you also contact the other root programs that your CA participates in to acknowledge this non-compliance and discuss what expectations their Root Programs have with respect to these certificates.


The problems reported for your CA in the mozilla.dev.security.policy forum are as follows:

** Failure to respond within 24 hours after Problem Report submitted
https://groups.google.com/d/msg/mozilla.dev.security.policy/PrsDfS8AMEk/w2AMK81jAQAJ
The problems were reported via your CA’s Problem Reporting Mechanism as listed here:
https://ccadb-public.secure.force.com/mozilla/CAInformationReport
Therefore, if this is the first time you have received notice of the problem(s) listed below, please review and fix your CA’s Problem Reporting Mechanism to ensure that it will work the next time someone reports a problem like this.

** Invalid dnsNames (e.g. invalid characters, internal names, and wildcards in the wrong position)
https://groups.google.com/d/msg/mozilla.dev.security.policy/CfyeeybBz9c/lmmUT4x2CAAJ
https://groups.google.com/d/msg/mozilla.dev.security.policy/D0poUHqiYMw/Pf5p0kB7CAAJ
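For illustration of the checks involved, the three problem classes above can be detected roughly as in the following Python sketch. This is a generic example, not any CA's actual tooling, and the internal-name suffix list shown is only illustrative:

  import re

  # Example internal suffixes only; a real check should also consult the
  # public suffix list and the IANA TLD list.
  INTERNAL_SUFFIXES = (".local", ".internal", ".corp", ".lan")
  LABEL_RE = re.compile(r"^(\*|[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?)$", re.IGNORECASE)

  def dns_name_problems(name):
      """Return the BR-relevant problems found in a single dNSName SAN value."""
      problems = []
      labels = name.rstrip(".").split(".")
      if any(not LABEL_RE.match(label) for label in labels):
          problems.append("invalid characters or malformed label")
      if any(label == "*" for label in labels[1:]):
          problems.append("wildcard in a position other than the left-most label")
      if len(labels) < 2 or name.lower().endswith(INTERNAL_SUFFIXES):
          problems.append("internal name, not resolvable in the public DNS")
      return problems

  # e.g. dns_name_problems("server01.example.local") -> internal name
  #      dns_name_problems("www.*.example.com")      -> misplaced wildcard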
Hi Kathleen,

WISeKey also mis-issued one (1) certificate where the common name was not also included in the SANs. While WISeKey did not respond on list, they did revoke the certificate 30 hours after the report.

Here is the crt.sh link: https://crt.sh/?id=100609198&opt=cablint
Here is the original thread where the issue was reported: https://groups.google.com/d/msg/mozilla.dev.security.policy/K3sk5ZMv2DE/4oVzlN1xBgAJ

-Vincent
Dear all,
we're sorry for this issue. Please let me respond here as requested.

First of all, I'd like to confirm that there have been two (2) mis-issued certificates: the one primarily referenced in this bug and the one mentioned in the comment above:
- The one mentioned in the bug (CERT-A, from now on) is considered mis-issued because it contains a local DNS name
- The one mentioned in the comment (CERT-B, from now on) is considered mis-issued because it does not contain a valid DNS name in the first SAN
Both certificates have already been revoked. CERT-A was revoked yesterday, 17 Aug, and CERT-B was revoked on 8 Aug.

I'll respond according to your schema:
1) How your CA first became aware of the problems listed below (e.g. via a Problem Report, via the discussion in mozilla.dev.security.policy, or via this Bugzilla Bug), and the date.
***
CERT-A was detected yesterday, via the URL https://misissued.com/batch/8/
CERT-B was reported to us directly by email on 5 Aug by Alex Gaynor. We were not aware of any other notification related to this problem.
***
2) Prompt confirmation that your CA has stopped issuing TLS/SSL certificates with the problems listed below.
***
We have started an investigation into both problems to determine the cause, and we can confirm there is no reason either should occur again.
***
3) Complete list of certificates that your CA finds with each of the listed issues during the remediation process. The recommended way to handle this is to ensure each certificate is logged to CT and then attach a CSV file/spreadsheet of the fingerprints or crt.sh IDs, with one list per distinct problem.
***
CERT-A: https://crt.sh/?id=19429431&opt=cablint
CERT-B: https://crt.sh/?id=100609198&opt=cablint
***
4) Summary of the problematic certificates. For each problem listed below: number of certs, date first and last certs with that problem were issued.
***
Same as above: one certificate for each of the two problems listed in 3).
***
5) Explanation about how and why the mistakes were made, and not caught and fixed earlier.
***
For CERT-A (.local name in SAN)
We implement an automated check in our SSL web portal that scans the CSR to detect invalid domain names. Due to a bug in the process, this certificate bypassed that check. We also apply a manual validation before issuing each certificate, which obviously failed in this case as well. The certificate was not included in any of the random samples reviewed during our internal audits, so the issue went undetected until now.
---
For CERT-B (not valid DNS name in first SAN)
This certificate should not exist. It is a pre-certificate issued by the EJBCA server during the CT log publishing of a test certificate, while we were implementing EV support in our CAs, so it is not a real SSL certificate. During the EV adaptation process the EJBCA server malfunctioned. The certificate itself was not logged in the EJBCA DB (these EJBCA pre-certificates are not kept in the DB), and we had to use a low-level procedure to re-insert it in order to be able to revoke it; that is the reason we could not process the revocation in less than 24 hours. In any case, as said, this was not a real SSL certificate associated with any web server, so it had no impact on any customer.
***
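For context on CERT-B: a pre-certificate is distinguishable from a final certificate by the CT poison extension (OID 1.3.6.1.4.1.11129.2.4.3), so a scan of a certificate store can flag orphaned pre-certificates of this kind. A minimal sketch, assuming the Python cryptography library rather than the EJBCA tooling described in this bug:

  # Sketch only: flag pre-certificates via the CT poison extension.
  # The library choice (python-cryptography, a recent version where the
  # backend argument is optional) is an assumption.
  from cryptography import x509

  PRECERT_POISON_OID = x509.ObjectIdentifier("1.3.6.1.4.1.11129.2.4.3")

  def is_precertificate(pem_bytes):
      cert = x509.load_pem_x509_certificate(pem_bytes)
      try:
          cert.extensions.get_extension_for_oid(PRECERT_POISON_OID)
          return True
      except x509.ExtensionNotFound:
          return False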
6) List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.
***
For CERT-A, we ensured that the code in the SSL portal properly verifies the names included in the CSR.
For CERT-B, we do not foresee any further action, as this was a bug during a test of the CA configuration.
***
7) Regular updates to confirm when those steps have been completed.
***
To our understanding, the problems have already been solved.
***

I hope this settles the issue; in any case, we will do our best to ensure that it is not repeated.

Best regards, and sorry for the inconvenience,
Pedro
(In reply to Pedro Fuentes from comment #2)
> 5) Explanation about how and why the mistakes were made, and not caught and
> fixed earlier.
> ***
> For CERT-A (.local name in SAN)
> We implement an automated check in our SSL web portal, that scans the CSR to
> detect invalid domain names. Apparently this certificate, due to a bug in
> the process, bypassed this check. We apply a manual validation before
> issuing the certificate that obviously also failed in this case. This
> certificate wasn't listed in any of our random samples during internal
> audits, so the issue continued undetected until now.
> ---
> For CERT-B (not valid DNS name in first SAN)
> This certificate shouldn't exist. This is a pre-certificate issued by the
> EJBCA server during the CT log publishing of a test certificate, while we
> were implementing the EV support in our CAs, so not a real SSL certificate.
> During the EV adaptation process the EJBCA server malfunctioned. Actually
> the certificate itself wasn't logged in the EJBCA DB (these EJBCA
> pre-certificates aren't kept in the DB) and we had to do a low-level
> procedure to re-insert it and be able to revoke it, and that's the reason we
> couldn't process the revocation in less than 24 hours. Anyway, as said this
> was not a real SSL associated to any web server, so it didn't have impact on
> any customer.
> ***
> 6) List of steps your CA is taking to resolve the situation and ensure such
> issuance will not be repeated in the future, accompanied with a timeline of
> when your CA expects to accomplish these things.
> ***
> For CERT-A we ensured that the code in the SSL portal is verifying properly
> the names included in the CSR
> For CERT-B we don't foresee any action as this was a bug during a test in
> the CA configuration
> ***

Thanks for responding. I think it's still necessary to provide additional detail.

That is, we can look at the problem from two dimensions: the problem itself, and the systemic issues that allowed the problem to manifest. Your description focuses on the resolution of the problem, but doesn't indicate that any systemic changes have been made. As a consequence, it does not help the community feel that your CA has taken steps to reduce the risk of future violations (of any requirement). That is, one dimension is "Did you fix the bug?", but another is "How was the bug introduced, why was it not detected, and what steps are you taking to prevent future bugs?"

To understand how you might approach this problem, consider https://groups.google.com/d/msg/mozilla.dev.security.policy/vl5eq0PoJxY/W1D4oZ__BwAJ and how it provided a timeline of events, the steps that were already in place (and with substantial detail), where there were controls missing or mistakes made, details about the steps being taken (e.g. "We fixed the bug" is not sufficient detail to understand), and a holistic, systemic awareness of how the CA is managed and the opportunities for errors.


For this specific case:
---
For CERT-A:
- What was the bug in the process that the certificate bypassed?
- What steps have you taken to make sure that the process is not bypassed?
- Describe how you're "verifying properly" the names - that is, what controls have you put in place?
- It sounds like, systemically, the only verification is sampling audit. Compare with PKIoverheid, which has taken steps to both deploy 100% CT for their certificates, pre-issuance monitoring via certlint and related, and post-issuance monitoring for issues. Has your CA taken similar steps? If so, when will they be deployed? If not, why do you believe such critical steps are not necessary?
For CERT-B:
- It sounds like you were doing manual testing. Why did you not detect this as part of your manual testing procedures?
- What changes have you made to your manual testing process in response to this?
- If it was not manual testing, but deployment of updated software, what changes have you made to your software deployment playbook?
- As it was related to an improper configuration, what procedures did you have in place to review configurations before issuance? What changes have you made to those procedures, to ensure the next configuration update does not similarly lead to misissuance?


For both cases, providing a timeline of events, including the steps your CA took, can help the community better understand your CA's commitment to security, standards compliance, and policy adherence.
(In reply to Pedro Fuentes from comment #2)
> ***
> CERT-A was detected yesterday by checking the URL
> https://misissued.com/batch/8/ 

I emailed a problem report about this certificate, can you please investigate and explain why it was not received?

Here are the relevant message headers:

  From: Jonathan Rudenberg <jonathan@titanous.com>
  Subject: Misissuance - invalid dnsNames 
  Message-Id: <A7B38FB3-EC73-444F-9DD8-78A9A784CAE5@titanous.com>
  Date: Sun, 13 Aug 2017 00:45:12 -0400
  To: cps@wisekey.com
(In reply to Jonathan Rudenberg from comment #4)
> (In reply to Pedro Fuentes from comment #2)
> > ***
> > CERT-A was detected yesterday by checking the URL
> > https://misissued.com/batch/8/ 
> 
> I emailed a problem report about this certificate, can you please
> investigate and explain why it was not received?
> 
> Here are the relevant message headers:
> 
>   From: Jonathan Rudenberg <jonathan@titanous.com>
>   Subject: Misissuance - invalid dnsNames 
>   Message-Id: <A7B38FB3-EC73-444F-9DD8-78A9A784CAE5@titanous.com>
>   Date: Sun, 13 Aug 2017 00:45:12 -0400
>   To: cps@wisekey.com

Hi Jonathan, cps@wisekey.com is a mailing list received by me and several other people. I can't find any notification from you in my mailbox, nor was I alerted to it by any colleague. We haven't detected any recent problem in the mail system, so I can't say why it didn't arrive, or whether the problem was on our side or yours.
I'd appreciate it if you could send us a test message.
Thanks
(In reply to Pedro Fuentes from comment #5)
> I'd appreciate if you can send us a test message.

Done.
(In reply to Ryan Sleevi from comment #3)
> (In reply to Pedro Fuentes from comment #2)
>
(...) 
> 
> For both cases, providing a timeline of events, including the steps your CA
> took, can help the community better understand your CA's commitment to
> security, standards compliance, and policy adherence.

Dear Ryan,
thanks for your message and help to improve our response.
Please let me answer as per your additional questions.

> For CERT-A:
> - What was the bug in the process that the certificate bypassed?
***
At the time this certificate was issued, we had implemented a validation process in our SSL portal so that incoming CSRs would be checked to prevent invalid TLDs. That early version did not implement the procedure properly, and a bug appeared under certain conditions (the number of SANs and the position of the infringing TLD). This was fixed some time ago, and the TLD validation has since been tested and validated.
***
> - What steps have you taken to make sure that the process is not bypassed?
***
The backend of the SSL portal now raises more visible warnings on infringing CSRs, and we have automated the rejection of those CSRs at an early stage, so an administrator cannot mistakenly approve an invalid CSR.
***
> - Describe how you're "verifying properly" the names - that is, what
> controls have you put in place?
***
As noted above, we run a process that extracts all DNS names in the CSR and validates them against a list of valid TLDs.
***
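To make the described check concrete, a rough sketch of extracting the dNSNames from a CSR and validating each name's TLD against the IANA list follows. The library (python-cryptography) and the local file name are assumptions for illustration, not the portal's actual code:

  # Illustrative only: validate CSR dNSNames against a locally stored copy of
  # the IANA TLD list (https://data.iana.org/TLD/tlds-alpha-by-domain.txt).
  from cryptography import x509

  def load_valid_tlds(path="tlds-alpha-by-domain.txt"):
      with open(path) as f:
          return {line.strip().lower() for line in f
                  if line.strip() and not line.startswith("#")}

  def names_with_invalid_tld(csr_pem, valid_tlds):
      csr = x509.load_pem_x509_csr(csr_pem)
      try:
          san = csr.extensions.get_extension_for_class(x509.SubjectAlternativeName)
          names = san.value.get_values_for_type(x509.DNSName)
      except x509.ExtensionNotFound:
          names = []
      return [n for n in names
              if n.rstrip(".").rsplit(".", 1)[-1].lower() not in valid_tlds]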
> - It sounds like, systemically, the only verification is sampling audit.
> Compare with PKIoverheid, which has taken steps to both deploy 100% CT for
> their certificates, pre-issuance monitoring via certlint and related, and
> post-issuance monitoring for issues. Has your CA taken similar steps? If so,
> when will they be deployed? If not, why do you believe such critical steps
> are not necessary?
***
Until now we have relied on manual verification before issuance and on periodic reviews based on samples. With the availability of new tools we are starting to rely on automated testing, but the plan is to move to the certificate management software of QuoVadis (a company recently acquired by WISeKey), which is more mature in terms of CA/Browser Forum BR compliance. The plan is to phase out the current CMS before the end of 2017; in the meantime we are enforcing deeper manual validation and a periodic verification of 100% of SSL certificates.
***
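One common way to wire in such pre-issuance automation is to run a linter such as certlint or zlint over the to-be-issued certificate and block issuance on any finding. The sketch below assumes a zlint binary on the PATH and its JSON output format; both are assumptions about the tool, and this says nothing about how the WISeKey or QuoVadis platforms actually do it:

  # Sketch of a pre-issuance lint gate. The zlint invocation and output
  # format (a JSON object per lint with a "result" field) are assumptions.
  import json
  import subprocess
  import sys

  def blocking_lints(cert_path):
      out = subprocess.run(["zlint", cert_path], capture_output=True, text=True)
      results = json.loads(out.stdout)
      return [name for name, r in results.items()
              if r.get("result") in ("warn", "error", "fatal")]

  if __name__ == "__main__":
      failures = blocking_lints(sys.argv[1])
      if failures:
          print("Blocking issuance; flagged lints:", ", ".join(failures))
          sys.exit(1)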

TIMELINE (CET TIMES):
17/Aug 17:40 - We received an internal notification after finding the offending certificate in https://misissued.com/batch/8/
17/Aug 18:00 - After verifying the issue, we revoked the certificate and notified the customer. We started an internal investigation to identify the root cause
18/Aug 11:30 - We closed the investigation, concluding that the current implementation of the TLD validation is correct and that we must enforce additional human validation before issuing each certificate


> For CERT-B:
> - It sounds like you were doing manual testing. Why did you not detect this
> as part of your manual testing procedures?
***
This certificate was not supposed to be issued: it was a pre-certificate generated during a test issuance of an EV certificate, after applying a patch to one EJBCA server to enable publishing to CT logs. EJBCA did not even register the certificate in its DB, so for us the certificate was effectively non-existent and did not appear in our certificate list.
***
> - What changes have you made to your manual testing process in response to
> this?
***
Not in response to this issue specifically, but as a result of the EV (CT publishing) tests being run at the time the offending certificate was issued, we decided not to begin commercial issuance of EV certificates until the platform had been replaced, as mentioned above.
***
> - If it was not manual testing, but deployment of updated software, what
> changes have you made to your software deployment playbook?
***
This particular problem has proven very difficult to track, as it concerns a pre-certificate that was not registered in the DB. To avoid such situations, as noted above, the decision is not to offer a commercial EV service for now and to keep operating only OV certificates, which rely on a different technology.
***
> - As it was related to an improper configuration, what procedures did you
> have in place to review configurations before issuance? What changes have
> you made to those procedures, to ensure the next configuration update does
> not similarly lead to misissuance?
***
Please refer to my response above. This issue is related more to a malfunctioning patch than to a configuration problem, and it cannot recur, as the affected system is no longer enabled to issue new certificates.
***

TIMELINE (CET times):
5/Aug 16:55 - Message sent by Alex Gaynor to cps@wisekey.com, reporting the mis-issued certificate
5/Aug 18:50 - Once we detected that this was a pre-certificate and that it was not in the DB of the CMS, we escalated the issue to the CA management team
7/Aug 05:00 - After verifying that the certificate was not in the EJBCA DB, the issue was escalated further and we requested external support
7/Aug 22:00 - We received a procedure to retrieve the certificate so that we could revoke it
(In reply to Jonathan Rudenberg from comment #6)
> (In reply to Pedro Fuentes from comment #5)
> > I'd appreciate if you can send us a test message.
> 
> Done.

Your message has been received. We'll check the mail server and anti-spam logs to see what could have gone wrong with the previous one.
Thanks.
(In reply to Pedro Fuentes from comment #7)

Thank you for the continued discussion and timely responsiveness.


> ***
> Until now we relied on manual verifications before issuing and periodic
> reviews based on samples. With the availability of new tools we are starting
> to rely on automated testing, but actually the plan is to start using the
> certificate management software of QuoVadis (company acquired recently by
> WISeKey) that is more evolved in terms of CABF BR compliance. The planning
> is to phase out the current CMS before end of 2017, but in the meantime
> we're enforcing a deeper manual validation and a periodic verification of
> 100% of SSL certificates
> ***

This sounds like a reasonable and responsible plan for converging infrastructures and ensuring compliance. Per Request 7, it would be appropriate to use this bug to continue to provide details about that planned transition over the next several months.

> > - As it was related to an improper configuration, what procedures did you
> > have in place to review configurations before issuance? What changes have
> > you made to those procedures, to ensure the next configuration update does
> > not similarly lead to misissuance?
> ***
> Please refer to my above response. This issue is more related to a
> malfunctioning patch than a configuration problem, and it wouldn't be
> repeated as the affected system is not anymore enabled to issue new
> certificates.
> ***

As you transition to your new platform, one thing I would suggest, based on this description, is to ensure you're maintaining audit logs on your HSMs regarding the use of keys for signatures. This would allow you to cross-correlate these logs with the logs on your issuance platform and certificate database. For example, the presence of 5 signing operations, but only 4 certificates, would reveal an incident worth investigating. Even better would be to log the contents of what was signed (at the HSM layer), in addition to recording the certificates issued (at your CA layer).
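A rough sketch of that cross-correlation, assuming hypothetical CSV exports from the HSM audit log and the CA database (the file and column names are made up for illustration):

  # Hypothetical reconciliation of HSM signing operations against the CA
  # database; any signature with no matching certificate record is suspect.
  import csv

  def load_column(path, column):
      with open(path, newline="") as f:
          return [row[column] for row in csv.DictReader(f)]

  hsm_signatures = load_column("hsm_audit_log.csv", "signed_tbs_sha256")
  issued_certs = set(load_column("ca_database_export.csv", "tbs_sha256"))

  orphans = [h for h in hsm_signatures if h not in issued_certs]
  if orphans:
      print(len(orphans), "signing operation(s) with no matching certificate; investigate:")
      print("\n".join(orphans))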

Integrating such logging, and the routine review of such logging, into your CA operations will help provide a worthwhile defense, not just against situations of configurations or bad patches, but also against malicious activity.


I think the outstanding items to track on this bug are:
- Regular updates about the transition to QuoVadis' system and tools
- Resolution as to why Jonathan's problem report was not detected
Dear all,
as a follow-up to this issue, I'd just like to confirm that WISeKey has transitioned its SSL issuance to the QuoVadis platform, as stated in this bug. Only in special cases will we keep using the current platform (a few MPKI customers with domain constraints).
Regards,
Pedro
Pedro: Do you have an update on the investigation with respect to Comment #8?
(In reply to Ryan Sleevi from comment #11)
> Pedro: Do you have an update on the investigation with respect to Comment #8?

Hi Ryan,
we were able to verify that:
- None of the three people who receive messages sent to the cps@wisekey.com mailing list got that message in their mailboxes
- We did not find it stopped by the spam filter
- We did not record any downtime or malfunction of the mail server in that timeframe

Therefore, without putting in doubt that the mail was sent, after a best-effort analysis we could not find any trace of it.

Regards,
Pedro
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Product: NSS → CA Program
Whiteboard: [ca-compliance] → [ca-compliance] [ov-misissuance]