Closed Bug 1552562 Opened 5 years ago Closed 5 years ago

Entrust: Question marks in certificate O and L fields

Categories

(CA Program :: CA Certificate Compliance, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bruce.morton, Assigned: bruce.morton)

Details

(Whiteboard: [ca-compliance] [ov-misissuance])

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.131 Safari/537.36

Steps to reproduce:

CA issued an SSL certificate with question marks in the O and L fields of the Subject Name.

Expected results:

Unicode characters should have been in the O and L fields of the Subject Name.

This misissuance was reviewed through our audit process. It was discovered that, although we found the mis-issued certificate, revoked it, and addressed the problem, we were not transparent and did not file a misissuance report. Here is the report.

  1. How your CA first became aware of the problem

Entrust Datacard discovered the misissuance through verification review and linting.

  2. A timeline of the actions your CA took in response

July 5, 2018, 6:02 UTC - Certificate issued
July 5, 2018, 13:30 UTC - Mis-issued certificate discovered
July 6, 2018, 1:28 UTC - Certificate revoked
August 2, 2018 - Patch deployed

  3. Confirmation that your CA has stopped issuing TLS/SSL certificates with the problem

The certificate was mis-issued due to a bug in a release. Future misissuances were first prevented by monitoring. A patch has been issued to correct the bug.

  4. A summary of the problematic certificates

Only one certificate was mis-issued. This certificate was requested with Chinese characters. In setting up the enterprise account, the verified data was copied from one place to another; the Unicode was not preserved, and the characters ended up as question marks.

  5. The complete certificate data for the problematic certificates

https://crt.sh/?id=574798325

  6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

The bug which corrupted Chinese characters in our system was introduced as part of a release in March 2018. The problem was detected as soon as the first certificate was issued with incorrect data.

  7. List of steps your CA is taking to resolve the situation

We introduced a manual monitoring process to stop these certificates from being issued again. The process was patched on 2 August 2018, and we use certificate linting to detect this issue if it occurs again.

The certificate was mis-issued due to a bug in a release. Future misissuances were first prevented by monitoring. A patch has been issued to correct the bug.
The bug which corrupted Chinese characters in our system was introduced as part of a release in March 2018. The problem was detected as soon as the first certificate was issued with incorrect data.

This is not a sufficient level of detail. In order to be assured that Entrust has understood and mitigated the root cause, as well as systemically addressed it in other places it may present, and to help the community better understand best practices that may be employed in the development, implementation, and verification of CA software, I'm hoping you can provide further detail.

To help demonstrate how other CAs are providing this, see https://bugzilla.mozilla.org/show_bug.cgi?id=1551374#c3 or comments https://bugzilla.mozilla.org/show_bug.cgi?id=1550645#c8 through https://bugzilla.mozilla.org/show_bug.cgi?id=1550645#c11 (and the related follow-up questions) to understand some of the goals and objectives of this process.

Flags: needinfo?(bruce.morton)
Assignee: wthayer → bruce.morton
Status: UNCONFIRMED → ASSIGNED
Type: defect → task
Ever confirmed: true
Summary: Entrust - Question marks in certificate O and L fields → Entrust: Question marks in certificate O and L fields
Whiteboard: [ca-compliance]

Bruce: thank you for this report. Given the timing and the comment that "we were not transparent and did not file a misissuance report", I would like to request that Entrust answer questions 6 and 7 in the context of failing to report misissuance in a timely fashion. I would also like to know whether Entrust has checked for any other unreported misissuances.

(In reply to Ryan Sleevi from comment #2)

This is not a sufficient level of detail. In order to be assured that Entrust has understood and mitigated the root cause, as well as systemically addressed it in other places it may present, and to help the community better understand best practices that may be employed in the development, implementation, and verification of CA software, I'm hoping you can provide further detail.

Details of the bug:
Data was vetted as Unicode in the order table. Upon completion of vetting, the data was copied to the account table. The bug was that the data was copied as varchar instead of nvarchar, resulting in any non-ASCII characters being converted to question marks. All instances of this pattern were fixed at the same time.

Other improvements:
We now have online pre-issuance linting which will catch errors like this.

Flags: needinfo?(bruce.morton)

(In reply to Wayne Thayer [:wayne] from comment #3)

Bruce: thank you for this report. Given the timing and the comment that "we were not transparent and did not file a misissuance report", I would like to request that Entrust answer questions 6 and 7 in the context of failing to report misissuance in a timely fashion. I would also like to know whether Entrust has checked for any other unreported misissuances.

This was an oversight. The misissuance was discovered, the certificate was revoked, and changes were designed as normal business practice. A misissuance report should have been filed, as was done for other misissuances before and after this one. Based on previous experience, our policy is to be transparent and "immediately" file misissuance reports even if the data is incomplete, as more data can be added to the report as the investigation is completed.

Based on internal review and our annual compliance audit, all misissuances have been reported.

(In reply to Bruce Morton from comment #4)

Details of the bug:
Data was vetted as Unicode in the order table. Upon completion of vetting, the data was copied to the account table. The bug was that the data was copied as varchar instead of nvarchar, resulting in any non-ASCII characters being converted to question marks. All instances of this pattern were fixed at the same time.

I hope we can agree that this doesn't really rise to the level of detail provided in the other examples.

The handling of Unicode data is, admittedly, complex, but a responsible incident report would cover the flow of data from entry, through the various systems, and ultimately to issuance. It would provide an understanding of all the components involved, how data is exchanged between each of those components, and how each layer has been carefully examined and documented to avoid not just this issue, but any related data conversion or translation issues. Understanding which layer performed the translation, why it translates as such, and whether any such past translations have occurred in any other fields or information is similarly appropriate for a 'good' incident response.

Other improvements:
We now have online pre-issuance linting which will catch errors like this.

While a valuable improvement, combined with the provided incident report, it does not instill confidence that Entrust has sufficiently understood and communicated the root cause, nor taken appropriate steps to mitigate or prevent this in the future. For example, the provided response does not demonstrate whether or not Entrust has considered any linting failures during pre-issuance linting to be signals of systemic design flaws within their system, and appropriate for careful review (and documentation) of each factor that led to the lint failure and how that's being addressed.

In short, while preissuance linting is an essential step in preventing a CA from actually signing a malformed certificate, from a policy and compliance perspective, any lint failure, even if caught, should be seen as a critical and systemic design flaw.

In light of the example incident reports I provided, which demonstrate a much greater level of detail than provided here, I'm curious whether or not Entrust would like to amend or extend its incident report to sufficiently demonstrate that they have truly understood the root causes and systemically addressed them. The greater the level of detail provided now and in the future, the better a CA can demonstrate it is capable of responding in a timely manner and with sufficient detail to reassure the community that no further actions are required when looking at the set of incidents a CA has had. It similarly helps reassure the community that the CA is committed to the ongoing security of the ecosystem, by helping ensure that all CAs are able to learn from the issues and implement similar checks, thus improving the ecosystem.

Flags: needinfo?(bruce.morton)

(In reply to Ryan Sleevi from comment #6)

Expected subject name
CN = www.dlggzy.cn
O = 大理公共资源交易中心
L = 大理
ST = Yunnan
C = CN

Actual subject name
CN = www.dlggzy.cn
O = ??????????
L = ??
ST = Yunnan
C = CN

How did the organization and locality fields end up as question marks?

First it will help to have some background information on the data model and work flow for new accounts. An account is an entity that holds verified information that can be used in a certificate, such as domain names and organization names. Before a new account is created, it starts out as an order with unverified information. Once the order data is verified the account is created and activated, and becomes available for authorized account users to log in and create certificates containing the verified information.

The process of creating and activating an account includes copying verified organization and locality information from the order table to the account table.

The order table and account table both store organization and locality as Unicode (nvarchar), although this wasn't always the case. Originally these fields used the ASCII data type (varchar) and Entrust did not support Unicode organization names. When Unicode support was added, in addition to changing the data schema we also updated the code that copies these fields to treat them as Unicode. Unfortunately the code path that handles new accounts, as opposed to adding new organizations on existing accounts, was missed in this update, and not discovered in manual or automated testing.

As a result of this bug, the 10-character verified organization name and 2-character verified locality became 10 question marks and 2 question marks respectively in the activated account.
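
As an illustration only (Entrust's actual code is not shown in this report), the effect described above can be reproduced by forcing Unicode text through an ASCII-only encoding with replacement. In the Python sketch below, `str.encode` with `errors="replace"` stands in for the varchar conversion, and the function name `copy_as_varchar` is hypothetical.

```python
def copy_as_varchar(value: str) -> str:
    """Simulate copying Unicode text through an ASCII-only (varchar-like) path.

    Assumption: the conversion substitutes '?' for every character it
    cannot represent, as the incident report describes.
    """
    return value.encode("ascii", errors="replace").decode("ascii")


organization = "大理公共资源交易中心"  # 10 characters, as verified
locality = "大理"                      # 2 characters, as verified

corrupted_o = copy_as_varchar(organization)
corrupted_l = copy_as_varchar(locality)

print(corrupted_o)  # ??????????
print(corrupted_l)  # ??
```

Each unrepresentable character becomes exactly one question mark, which matches the 10 and 2 question marks seen in the issued certificate. ASCII-only values such as `Yunnan` pass through unchanged, which is consistent with the bug going unnoticed in ASCII-only testing.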

Normally the verification agent (human) activating the account reviews the account page after the data's been copied from the order page to ensure that everything looks as expected prior to activating the account, but that did not happen in this case. It did happen after account activation, and was reported to the development team, but due to time zone differences the development team did not receive the report until the next morning. By this time the customer had already issued a certificate containing the 10-question mark organization and 2-question mark locality.

How have we improved our development process to ensure this won't happen again?
Manual and automated testing now uses Unicode characters in every possible field on input forms and APIs.
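
A minimal sketch of what such a round-trip test could look like, assuming a simplified two-table model. The table names, column names, and the `copy_order_to_account` helper are hypothetical, and SQLite (via Python's standard `sqlite3` module) is used only because it is self-contained; it is not the database described in this report.

```python
import sqlite3

# Sample values covering several scripts; any field that accepts text
# should survive the order -> account copy unchanged.
SAMPLES = ["大理公共资源交易中心", "大理", "Zürich", "Ελλάδα", "Yunnan"]


def copy_order_to_account(conn, order_id):
    """Hypothetical stand-in for the code path that activates an account."""
    conn.execute(
        "INSERT INTO account (id, organization) "
        "SELECT id, organization FROM orders WHERE id = ?",
        (order_id,),
    )


def check_round_trip(samples):
    """Return (original, stored) pairs that changed during the copy."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, organization TEXT)")
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, organization TEXT)")
    failures = []
    for order_id, org in enumerate(samples, start=1):
        conn.execute(
            "INSERT INTO orders (id, organization) VALUES (?, ?)", (order_id, org)
        )
        copy_order_to_account(conn, order_id)
        (stored,) = conn.execute(
            "SELECT organization FROM account WHERE id = ?", (order_id,)
        ).fetchone()
        if stored != org:
            failures.append((org, stored))
    conn.close()
    return failures


failures = check_round_trip(SAMPLES)
print(failures)
```

The point of the design is that the test compares the value at the end of the copy against the value that was verified, rather than checking only that the copy completed without error.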

Flags: needinfo?(bruce.morton)

(In reply to Bruce Morton from comment #7)

Normally the verification agent (human) activating the account reviews the account page after the data's been copied from the order page to ensure that everything looks as expected prior to activating the account, but that did not happen in this case. It did happen after account activation, and was reported to the development team, but due to time zone differences the development team did not receive the report until the next morning. By this time the customer had already issued a certificate containing the 10-question mark organization and 2-question mark locality.

Thanks. I think we're making good progress here in further delving into root causes.

I'm encouraged that you've added additional manual and automated testing here. Why was this overlooked in the initial testing? What changes, organizationally, are being made with how you develop tests, both when modifying/migrating existing systems and when developing new systems? Who (organizationally, not personally) is responsible for generating the test cases and ensuring their completeness, and who (organizationally, not personally) is responsible for reviewing them? Has that changed?

Similarly, this highlights a potential systemic issue: a human review factor, and the human review not being conducted. This potentially represents a more systemic, and potentially more serious, flaw, in that human review processes are known to have issues. What additional steps are being taken to systemically mitigate these issues going forward? For example, two-party review for any form of manual review, increased internal audits and spot checks, etc?

Flags: needinfo?(bruce.morton)

(In reply to Ryan Sleevi from comment #8)
The tester is responsible for writing the test cases. The developer and the QA Prime are responsible for reviewing the test cases. All parties are aware that Unicode testing is now mandatory.

We use two parties for all verification. The two parties are a Verification Specialist to perform the verification and a Verification Auditor to review that the process was followed correctly and the objective evidence is sound. The Verification Auditor does a check on the account org profile to make sure it is the same as what was verified, which helps prevent this error.
Please note that:

  1. All verified accounts are reviewed by 2 different people under 2 different roles,
  2. In this case, the vetting portal correctly showed the organization and it was reviewed by 2 people,
  3. When accounts are approved by the 2 people on the verification side, the account is approved and created. This is where the bad data issue occurred, and
  4. We have implemented a new process where the verification agent will check the account in the customer account portal as soon as they activate it to make sure the data has not changed.

We have activated pre-issuance linting based on zlint and post-issuance linting based on cablint to help prevent or detect certificate misissuances.
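
For readers unfamiliar with the idea, the sketch below shows the kind of check a pre-issuance step could apply to this specific failure. It is a hypothetical Python illustration, not the zlint or cablint API, and the rule it applies (flag subject fields made up entirely of `?` or U+FFFD replacement characters) is an assumption chosen to match this incident.

```python
# Characters commonly left behind by a lossy encoding conversion.
REPLACEMENT_CHARS = {"?", "\ufffd"}


def find_suspect_subject_fields(subject):
    """Return the names of subject fields that look like lost Unicode.

    A field is flagged when it is non-empty and, ignoring spaces,
    consists only of replacement characters.
    """
    suspects = []
    for field, value in subject.items():
        stripped = value.replace(" ", "")
        if stripped and all(ch in REPLACEMENT_CHARS for ch in stripped):
            suspects.append(field)
    return suspects


expected_subject = {
    "CN": "www.dlggzy.cn",
    "O": "大理公共资源交易中心",
    "L": "大理",
    "ST": "Yunnan",
    "C": "CN",
}
actual_subject = {
    "CN": "www.dlggzy.cn",
    "O": "??????????",
    "L": "??",
    "ST": "Yunnan",
    "C": "CN",
}

print(find_suspect_subject_fields(expected_subject))  # []
print(find_suspect_subject_fields(actual_subject))    # ['O', 'L']
```

A check like this only catches fields that were corrupted in full; a value in which only some characters were replaced would need a comparison against the verified data instead.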

Flags: needinfo?(bruce.morton)

It appears that remediation is completed.

Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Product: NSS → CA Program
Whiteboard: [ca-compliance] → [ca-compliance] [ov-misissuance]