Closed Bug 1687608 Opened 3 years ago Closed 3 years ago

E-Tugra: The failure to revoke a certificate

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dtokgoz, Assigned: dtokgoz)

Details

(Whiteboard: [ca-compliance] [leaf-revocation-delay])

This is a preliminary report.

  1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.
    On 15 Jan 2021 15:04 (UTC), we received an email from George (george@fozzie.dev) requesting revocation of a certificate whose commonName, "cebimde.com.tr", is not included in the certificate's SAN. The revocation was completed on 18 Jan 2021, but the certificate was required to be revoked within 5 days. On 20 Jan, the decision to open this bug was taken in bug https://bugzilla.mozilla.org/show_bug.cgi?id=1687139.
  2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.
    (all times are GMT)
    23 Dec 2020 12:33 The mentioned certificate was issued.
    24 Dec 2020 14:18 The problem with the certificate was found and a new certificate was issued. The first certificate was marked for revocation.
    5 Jan 2021 18:10 An email from George was received.
    16 Jan 2021 Technical staff queried the certificate in our systems and found that it had been reissued and marked for revocation. They forwarded it to the Security Group for investigation. The technical team did not inform the reporter about the process.
    18 Jan 2021 The Security Group investigated the revocation problem and revoked the certificate. The certificate had been reissued, but the revocation process had not been completed. The Security Group and System Developers continue to investigate the root cause and to enhance the system.
    We continue to investigate the reason.
  3. Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.
    None of our CA services are affected.
  4. A summary of the problematic certificates. For each problem: number of certs, and the date the first and last certs with that problem were issued.
    The revocation of the mentioned certificate was completed. No other certificates were found in a similar state.
  5. The complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem.
    No other certificates were issued with the same problem.
  6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.
    This is still under investigation. The root cause lies in the history of the certificate: it was reissued 24 hours later and was marked as reissued at the subscriber's request.
  7. List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.
    This is pending completion of our investigation. We are planning how to enhance our systems and our procedures for monitoring all certificate issuance and all certificate life cycles.
    The Final Incident Report will be published here.
Assignee: bwilson → dtokgoz
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Whiteboard: [ca-compliance] [delayed-revocation-leaf]
  1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.
    On 15 Jan 2021 15:04 (UTC), we received an email from George (george@fozzie.dev) requesting revocation of a certificate whose commonName, "cebimde.com.tr", is not included in the certificate's SAN. The revocation was completed on 18 Jan 2021, but the certificate was required to be revoked within 5 days. On 20 Jan, the decision to open this bug was taken in bug https://bugzilla.mozilla.org/show_bug.cgi?id=1687139.

  2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.
    (all times are GMT)
    23 Dec 2020 12:33 The mentioned certificate was issued.
    24 Dec 2020 14:18 The problem with the certificate was found and a new certificate was issued. The first certificate was marked for revocation.
    5 Jan 2021 18:10 An email from George was received.
    16 Jan 2021 Technical staff queried the certificate in our systems and found that it had been reissued and marked for revocation. They forwarded it to the Security Group for investigation. The technical team did not inform the reporter about the process.
    18 Jan 2021 The Security Group investigated the revocation problem and revoked the certificate. The certificate had been reissued, but the revocation process had not been completed. The Security Group and System Developers continue to investigate the root cause and to enhance the system.
    18 Jan 2021 We started a more detailed investigation into how this error occurred.
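    The gap between discovery and revocation in this timeline can be checked with a little date arithmetic. The sketch below is illustrative only. It assumes the five-day window starts when the problem was found (24 Dec 2020 14:18), and because the timeline gives no time of day for the 18 Jan 2021 revocation, it uses 00:00 as the earliest possible moment. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the revocation window is 5 days, counted from the moment
# the CA became aware of the problem.
REVOCATION_WINDOW = timedelta(days=5)


def revocation_deadline(aware_at: datetime) -> datetime:
    """Return the latest time by which revocation should be complete."""
    return aware_at + REVOCATION_WINDOW


# Timestamp taken from the timeline (stated as GMT).
problem_found = datetime(2020, 12, 24, 14, 18, tzinfo=timezone.utc)

# The timeline gives only a date for the revocation; midnight is an
# assumed lower bound, so the real delay can only be longer.
revoked_earliest = datetime(2021, 1, 18, 0, 0, tzinfo=timezone.utc)

deadline = revocation_deadline(problem_found)
overdue = revoked_earliest - deadline

print("deadline:", deadline.isoformat())
print("overdue by at least:", overdue)
```

    Under these assumptions the deadline falls on 29 Dec 2020 14:18, and the revocation on 18 Jan 2021 came more than 19 days after it.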

  3. Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.
    None of our CA services are affected.

  4. A summary of the problematic certificates. For each problem: number of certs, and the date the first and last certs with that problem were issued.
    The revocation of the mentioned certificate was completed. No other certificates were found in a similar state.

  5. The complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem.
    No other certificates were issued with the same problem.

  6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.
    The root cause lies in the history of the certificate: it was reissued 24 hours later at the subscriber's request.
    The certificate was created with a SAN conflict error, as described in https://bugzilla.mozilla.org/show_bug.cgi?id=1687139. Our system caught this error in the certificate, and the certificate was not sent to the Subscriber. The next day it was reissued with the correct values.
    Our systems raise an alert on any error found before or after certificate issuance. We have three levels of validation controls.
    • Pre-validations: applied before a certificate is issued.
    • Post-validations: applied after a certificate is issued.
    • Zlint controls: applied after post-validations as the final control.
    This error was caught by our post-validation controls, and the certificate was marked as having an error but not as misissued. The final zlint validations were not applied to this certificate because an error had already been found by the previous controls, so the misissuance could not be caught by the zlint controls.
    Reissued certificates are revoked after a period of time, and this certificate was placed on that list in our reports. All misissued certificates are revoked as soon as the misissuance is determined. However, other certificates that, like this one, were not marked as misissued are revoked only after approval is given.

  7. List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.
    During the investigation, we planned and implemented enhancements to our systems and our procedures for monitoring all certificate issuance and all certificate life cycles.
    Post-validation controls were reviewed to cover all possible errors. The review covered the whole system. Zlint validations are now applied whether the post-validations succeed or fail.
    We are developing a workflow in which any error that occurs during certificate issuance, whether it is a misissuance or not, will be reviewed and approved for revocation by trusted personnel in our systems. We will complete it by the middle of February. We will update our internal procedures and processes with this workflow.

So if I'm understanding this correctly, you determined that the certificate had an error through Zlint but you didn't think it was a misissuance and so didn't revoke the certificate within the 5 days mandated by the BRs? I'm unsure how E-Tugra came to this conclusion as Zlint also provides citations for each of its lints and the "e_subject_common_name_not_from_san" lint has a citation of "BRs: 7.1.4.2.2".[1] Do E-Tugra staff have access to this information when determining whether a certificate error is a misissuance or not?

[1] - https://github.com/zmap/zlint/blob/master/v3/lints/cabf_br/lint_subject_common_name_not_from_san.go

Flags: needinfo?(dtokgoz)

Hi,
I think my English caused some misunderstanding.
We had three validation controls.
• 1st control, “Pre-Validation Controls”: applied before a certificate is issued. If any errors are found, the certificate is not issued.
• 2nd control, “Post-Validation Controls”: applied after a certificate is issued.
• 3rd control, “Zlint Controls”: applied after the “Post-Validation Controls” as the final control. This control was run on a certificate only if the certificate passed the 2nd control, “Post-Validation Controls”.

This error was caught by the “Post-Validation Controls” of our Certificate Authority system, and the certificate was marked as having an error, but not as a misissued certificate.
Because this error was caught by the “Post-Validation Controls”, the next validation control, “Zlint Controls”, was not applied to this certificate.
It is absolutely true that if the zlint control had been run, zlint would have caught the error. But in this case it was not run, because the certificate did not pass the 2nd control, “Post-Validation Controls”.

There are three parts that we are fixing.
• We are revising all of the “Post-Validation Controls” routines to make sure that no controls are missing and to prevent misissuance.
• We now run the “Zlint Controls” for all SSL certificates, whether or not a certificate passes the “Post-Validation Controls”. Thus, we always run two independent validation controls (“Post-Validation Controls” and “Zlint Controls”) after a certificate is issued.
• We are enhancing our procedures as follows. For any kind of certificate error, the system will create a recurring notification until an action is taken. We are also enhancing our procedures, control checklists, and revocation processes to prevent such errors and situations from happening again.
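
The difference between the original and revised control flow can be sketched as follows. This is only an illustration under stated assumptions, not E-Tugra's implementation: the certificate is modelled as a plain dictionary, `post_validation` and `lint` are hypothetical stand-ins for the custom checks and a zlint run (this is not the zlint API), and the SAN value is a placeholder because the real one is not given in this bug.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Report:
    post_errors: List[str] = field(default_factory=list)
    lint_errors: List[str] = field(default_factory=list)
    lint_ran: bool = False

    @property
    def needs_review(self) -> bool:
        # Any error from either control should trigger a revocation review.
        return bool(self.post_errors or self.lint_errors)


def post_validation(cert: dict) -> List[str]:
    # Hypothetical stand-in for the CA's custom post-issuance checks.
    return [] if cert["common_name"] in cert["san"] else ["SAN conflict"]


def lint(cert: dict) -> List[str]:
    # Hypothetical stand-in for a zlint run; only the lint name is real.
    if cert["common_name"] in cert["san"]:
        return []
    return ["e_subject_common_name_not_from_san"]


def original_flow(cert: dict) -> Report:
    report = Report(post_errors=post_validation(cert))
    if not report.post_errors:
        # The lint step ran only when post-validation passed.
        report.lint_errors = lint(cert)
        report.lint_ran = True
    return report


def revised_flow(cert: dict) -> Report:
    # Both controls always run, independently of each other's result.
    return Report(
        post_errors=post_validation(cert),
        lint_errors=lint(cert),
        lint_ran=True,
    )


# Placeholder SAN; the actual SAN of the certificate is not stated here.
cert = {"common_name": "cebimde.com.tr", "san": ["placeholder.example"]}

before = original_flow(cert)
after = revised_flow(cert)
print("original: lint ran =", before.lint_ran, "lint errors =", before.lint_errors)
print("revised:  lint ran =", after.lint_ran, "lint errors =", after.lint_errors)
```

In the original flow the lint step is skipped for exactly the certificates that already failed a check, so the lint finding and its BR citation never reach the reviewer. In the revised flow both results are recorded and either one marks the certificate for review.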

Flags: needinfo?(dtokgoz)

The proposed steps seem far below industry good practice here; e.g. post-issuance linting is known to regularly result in misissuance, and CAs that fail to take appropriate steps to prevent misissuance (like more consistent pre-issuance linting) ultimately have trust in them removed.

It sounds like your remediation effort is doubling-down on post-issuance lints, rather than trying to work these into pre-issuance lints (e.g. as Comment #2 was suggesting). Am I misunderstanding something?

Flags: needinfo?(dtokgoz)

Hi Ryan
You are right. We have pre-issuance controls that were developed inside our system. We have implemented additional pre-issuance linting using zlint. This was noted in the related bug (https://bugzilla.mozilla.org/show_bug.cgi?id=1687139). The implementation is complete, although there was a two-week delay in completing it. We plan to put it into production at the end of this week (3 April).

Flags: needinfo?(dtokgoz)

Update and Close Request
Post-validation controls were reviewed to cover all possible errors. Zlint validations are applied independently of our custom post-validations and always run, whether our post-validations succeed or fail.
We also added the zlint control before certificate issuance, at the tbsCertificate phase. Development and testing are complete, and we will apply it in production at the end of this week (3 April).
We developed a workflow in which any error that occurs during certificate issuance, whether it is a misissuance or not, is reviewed and approved for revocation by trusted personnel in our systems. We completed it at the end of February. We rebuilt our internal procedures and processes around this workflow.

Unless I hear otherwise, I am going to assume that the proposed changes will be put into production tomorrow and that I can close this matter next Wed. 7-April-2021.

Flags: needinfo?(bwilson)
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Flags: needinfo?(bwilson)
Resolution: --- → FIXED
Product: NSS → CA Program
Whiteboard: [ca-compliance] [delayed-revocation-leaf] → [ca-compliance] [leaf-revocation-delay]