Closed Bug 1921254 Opened 11 months ago Closed 6 months ago

Izenpe: Duplicate attribute in Subject

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: d-fernandez, Assigned: d-fernandez)

Details

(Whiteboard: [ca-compliance] [ev-misissuance])

Attachments

(5 files)

Incident Report

Summary

Izenpe issued a certificate with a duplicate attribute in the Subject. The BR version in force at the time of issuance (2.0.6)
states in section 7.1.4.1:

>Each Name MUST NOT contain more than one instance of a given AttributeTypeAndValue across all RelativeDistinguishedNames unless explicitly allowed in these Requirements.

So this is a violation of the BRs.
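
A check for this rule can be sketched as follows. This is only a minimal illustration, not the lint that zlint implements: the Subject is modelled as a list of RDNs, each a list of (OID, value) pairs, the function name `duplicate_attribute_types` is hypothetical, and the `ALLOWED_MULTIPLE` set is an assumption about which attribute types are allowed to repeat. The sample values are made up.

```python
from collections import Counter

# Assumption: attribute types that may legitimately appear more than once.
ALLOWED_MULTIPLE = {
    "0.9.2342.19200300.100.1.25",  # domainComponent
    "2.5.4.9",                     # streetAddress
}


def duplicate_attribute_types(subject, allowed=ALLOWED_MULTIPLE):
    """Return the attribute OIDs that occur more than once across all RDNs.

    `subject` is a list of RDNs; each RDN is a list of (oid, value) pairs.
    """
    counts = Counter(oid for rdn in subject for oid, _value in rdn)
    return sorted(oid for oid, n in counts.items() if n > 1 and oid not in allowed)


# Illustrative EV-style Subject with the jurisdiction country attribute
# repeated, mirroring the kind of duplication described in this report.
subject = [
    [("2.5.4.6", "ES")],
    [("2.5.4.10", "Example Org")],
    [("2.5.4.3", "www.example.com")],
    [("1.3.6.1.4.1.311.60.2.1.3", "ES")],
    [("1.3.6.1.4.1.311.60.2.1.3", "ES")],
]
print(duplicate_attribute_types(subject))
```

Counting by attribute type across every RDN, rather than per RDN, matches the "across all RelativeDistinguishedNames" wording of the requirement.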

Impact

Only one certificate has been affected (https://crt.sh/?id=14410275037).

Timeline

2023-04-11 00:00 - BR 2.0.0 is published

2024-04-03 00:00 - Izenpe installs Zlint 3.6.2-rc1

2024-08-04 00:00 - Zlint 3.6.3 is released.

2024-09-02 06:00 - A new version of our client application is deployed.

2024-09-03 07:31 - A certificate with duplicate attribute (1.3.6.1.4.1.311.60.2.1.3 - jurisdictionOfIncorporationCountryName) in subject is issued.

2024-09-03 07:40 - The certificate is revoked.

2024-09-04 06:00 - A new version of the application is deployed to fix the bug.

2024-09-26 08:00 - Izenpe updates Zlint from version 3.6.2 to 3.6.3.

2024-09-26 08:30 - Izenpe analyzes all certificates issued in the last 13 months, finding one with an error (excluding those already covered by Bug 1876565)

Root Cause Analysis

The combination of two distinct situations led to this incident.
On September 2nd, a change to the application that manages certificate requests from our clients introduced the bug.
The change was intended to affect only DV/OV certificates, but it had a collateral impact on EV, duplicating the last attribute of the Subject.
The certificate was issued without any problem because ZLINT raised no error, but the later manual review noticed this "anomaly" and the issuing team decided to revoke it.

Lessons Learned

What went well

Given the number of certificates we issue, each certificate is still checked manually. The skill of the issuing team
helped them detect the anomaly; they decided to revoke the certificate and report it to the development team so it could be fixed.

What didn't go well

  • The code change to our application should have been tested with all the certificate profiles we issue, not only with DV and OV.
  • If any certificate is revoked because it seems to have been mis-issued, it should also be reported to the PKI Team, so we can
    analyze it and proceed, if required, with the Bugzilla notification.
  • We had not established a deadline between the publication of a new zlint version and the moment we install it.

Where we got lucky

The issuing team detected the anomaly and prevented the client from downloading the certificate.

Action Items

| Action Item | Kind | Due Date |
| --- | --- | --- |
| Stop issuing SSL certificates | Prevent | 2024-09-04 |
| Fix application code | Remediation | 2024-09-05 |
| Update ZLINT to 3.6.3 | Prevent | 2024-09-26 |
| Update ZLINT to latest version in less than 7 days | Prevent | 2024-xx-xx |
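
The 7-day update target can be checked mechanically. A minimal sketch, using the Zlint 3.6.3 release and installation dates from the Timeline; the function names and the `max_days` parameter are illustrative, not part of any Izenpe tooling.

```python
from datetime import date


def update_lag_days(released: date, installed: date) -> int:
    """Days elapsed between a linter release and its installation."""
    return (installed - released).days


def within_deadline(released: date, installed: date, max_days: int = 7) -> bool:
    """True if the linter was installed within the allowed number of days."""
    return update_lag_days(released, installed) <= max_days


# Dates taken from the Timeline for Zlint 3.6.3.
released = date(2024, 8, 4)
installed = date(2024, 9, 26)
print(update_lag_days(released, installed))  # 53
print(within_deadline(released, installed))  # False
```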

Appendix

Details of affected certificates

https://crt.sh/?id=14410275037

Assignee: nobody → d-fernandez
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Summary: Duplicate attribute in Subject → Izenpe: Duplicate attribute in Subject
Whiteboard: [ca-compliance] [ev-misissuance]

Why did it take over 3 weeks for Izenpe to raise this incident? 'What didn't go well' highlights a discrepancy in revocation, mis-issuance, and change to the CA issuance process all happening without any oversight from the PKI team. No action item exists to resolve this, and an explanation of what went wrong would be greatly appreciated.

The Root Cause Analysis section implies that a manual check is what caught this certificate mis-issuance. What do your pre-linting and post-linting processes look like, such that this could happen? From what I can tell in the report, every certificate is manually checked?

On the action items themselves, 'Remediation' is not a category as per CCADB's Incident Reporting Guidelines. Which action items are complete, and when were they completed? Is this a preliminary report, or the final report?

Flags: needinfo?(d-fernandez)

Why did it take over 3 weeks for Izenpe to raise this incident? 'What didn't go well' highlights a discrepancy in revocation, mis-issuance, and change to the CA issuance process all happening without any oversight from the PKI team. No action item exists to resolve this, and an explanation of what went wrong would be greatly appreciated.

The mis-issuance was detected after updating zlint to version 3.6.3 and analyzing all the certificates issued in the last 13 months.
After this finding we realized that the zlint version then in use (3.6.2) had not detected the issue, and we started asking the issuing team why the certificate had been revoked.
They told us that they had detected the problem and, just in case, revoked the certificate and notified the development team.
Changes to the CA issuance process are always guided by the PKI Team and we are notified when they happen. But in this situation, when the problem was noted,
the communication was only between the issuing team and the development team, as they believed there had not been any mis-issuance and did not consider it a major bug. We will reflect this in the action items.

The Root Cause Analysis section implies that a manual check is what caught this certificate mis-issuance. What do your pre-linting and post-linting processes look like, such that this could happen? From what I can tell in the report, every certificate is manually checked?

Before linting solutions were applied, the issuing team manually checked certificates (mainly the Subject) to avoid delivering badly issued certificates to clients,
and they still do so. We are talking about 3-4 certificates per day on average. In this situation, where ZLINT 3.6.2 didn't catch the issue, they didn't believe there had been a major issue.

On the action items themselves, 'Remediation' is not a category as per CCADB's Incident Reporting Guidelines. Which action items are complete, and when were they completed? Is this a preliminary report, or the final report?

Sorry for the confusion, I am including a new report version.

Flags: needinfo?(d-fernandez)

Incident Report

Action Items

| Action Item | Kind | Due Date |
| --- | --- | --- |
| Stop issuing SSL certificates | Prevent | 2024-09-04 |
| Fix application code | Mitigate | 2024-09-05 |
| Update ZLINT to 3.6.3 | Detect | 2024-09-26 |
| Update ZLINT to latest version in less than 7 days | Prevent | 2024-xx-xx |
| Issuing Team is informed that it MUST report any anomaly to the PKI Team | Mitigate | 2024-09-26 |
| Development Team is informed that it MUST report any deployment to the PKI Team | Mitigate | 2024-09-26 |

Appendix

Details of affected certificates

https://crt.sh/?id=14410275037

comment 2 does not sufficiently answer the question posed in comment 1.

If this certificate was “revoked just in case” why wasn’t a preliminary incident report opened? Beyond that, how was it not clear this is a misissuance?

Couple more things, linting tools are not exhaustive on compliance. Do you have any other measures other than linting tools you use for making sure your certificates are compliant?

Beyond that, it might help to explain why you’ve historically waited so long to update your linting tools.

Do you do linting pre or post issuance? Or both?

Is the issuance process manual or automated? Please explain how the issuance process works, preferably accompanied by screenshots. For example if I reached out to you to get a cert for example.com, what’s the process for that?

Flags: needinfo?(d-fernandez)

Hi again,
I will try to go through your questions:
If this certificate was “revoked just in case” why wasn’t a preliminary incident report opened? Beyond that, how was it not clear this is a misissuance?

The issuing Team does not have enough skills to determine whether this was a mis-issuance or not; they just check whether something does not look "normal" in the Subject. If it does not, they should have reported it to the PKI Team, which they did not do.
To prevent this situation, the PKI Team will automatically be notified (by email) each time a TLS certificate is revoked.
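
Such a notification could look like the following sketch. It only builds the message with Python's standard `email` package; the function name, addresses, serial number and wording are all hypothetical, and nothing here describes Izenpe's actual implementation.

```python
from email.message import EmailMessage


def revocation_notice(serial_hex, reason, sender, recipient):
    """Build (but do not send) an email telling the PKI Team about a revocation."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"TLS certificate revoked: serial {serial_hex}"
    msg.set_content(
        "A TLS certificate was revoked.\n"
        f"Serial: {serial_hex}\n"
        f"Reason given: {reason}\n"
        "Please review whether this revocation indicates a mis-issuance.\n"
    )
    return msg


# Hypothetical values, for illustration only.
notice = revocation_notice(
    "0A1B2C", "anomaly in Subject", "ca-app@example.com", "pki-team@example.com"
)
print(notice["Subject"])
```

The message could then be handed to `smtplib.SMTP.send_message`; hooking it into the revocation path itself means the notice no longer depends on someone remembering to report.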

Couple more things, linting tools are not exhaustive on compliance. Do you have any other measures other than linting tools you use for making sure your certificates are compliant?

No, but after what has happened we are planning to use pkimetal at least after issuance.

Beyond that, it might help to explain why you’ve historically waited so long to update your linting tools.

Since Bug 1876565, we have planned to keep zlint up to date. The first update after that incident (3.6.2) was installed within 24 hours. Version 3.6.3 was published in August, when most of the PKI Team was on holiday; when we installed it on September 26th, we discovered the problem by running it across all the certificates issued in the last 13 months. Yesterday, the 3.6.4 release candidate was published and we have installed it for post-issuance linting.

Do you do linting pre or post issuance? Or both?

Both. For each issued certificate, we also receive an email summarizing the info, warn, error and fatal messages from the zlint analysis.
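
A summary of this kind can be sketched as below. The sketch assumes the linter output is a JSON object mapping each lint name to an object with a `result` field; that shape, the lint names in the sample, and the choice of `error` and `fatal` as blocking levels are assumptions made for illustration.

```python
import json
from collections import Counter

REPORTED_LEVELS = ("info", "warn", "error", "fatal")
BLOCKING_LEVELS = {"error", "fatal"}  # assumption: these should stop issuance


def summarize(lint_json: str) -> dict:
    """Count findings per reported level in a linter JSON result."""
    results = json.loads(lint_json)
    counts = Counter(entry["result"] for entry in results.values())
    return {level: counts.get(level, 0) for level in REPORTED_LEVELS}


def blocking_findings(lint_json: str) -> list:
    """Names of lints whose result should block (or trigger review of) issuance."""
    results = json.loads(lint_json)
    return sorted(
        name for name, entry in results.items() if entry["result"] in BLOCKING_LEVELS
    )


# Hypothetical lint names and results.
sample = json.dumps({
    "e_hypothetical_duplicate_attribute": {"result": "error"},
    "w_hypothetical_warning": {"result": "warn"},
    "e_hypothetical_other_check": {"result": "pass"},
})
print(summarize(sample))
print(blocking_findings(sample))
```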

Is the issuance process manual or automated? Please explain how the issuance process works, preferably accompanied by screenshots. For example if I reached out to you to get a cert for example.com, what’s the process for that?

The issuance process is manual; that is, the issuing Team must click the "issue certificate" button.
The process for issuing certificates (with the current Root) is, in summary:
1. Register the company after Izenpe checks all the information about it (VAT number, official name, etc.).
2. Log in to the application using a Natural Person qualified certificate.
3. Complete the form with CN and SAN, domain validation method and type of certificate.
4. Once the PKCS#10 file is uploaded and the application has validated the domain(s), the issuing Team can click the issue button, and the public key, Subject data (from the form) and SANs are sent to the
PKI infrastructure.
5. The PKI infrastructure performs the issuance, applying the requested profile. At this point, other tasks are also carried out, such as sending the precertificate to CT log servers, checking the public key for issues, and so on.

A pdf is attached to show this process and the incident report has been updated to reflect two new action items.

Flags: needinfo?(d-fernandez)

Incident Report

Action Items

| Action Item | Kind | Due Date |
| --- | --- | --- |
| Stop issuing SSL certificates | Prevent | 2024-09-04 |
| Fix application code | Mitigate | 2024-09-05 |
| Update ZLINT to 3.6.3 | Detect | 2024-09-26 |
| Update ZLINT to latest version in less than 7 days | Prevent | 2024-xx-xx |
| Issuing Team is informed that it MUST report any anomaly to the PKI Team | Mitigate | 2024-09-26 |
| Development Team is informed that it MUST report any deployment to the PKI Team | Mitigate | 2024-09-26 |
| PKI Team will automatically receive an email whenever a TLS certificate is revoked | Prevent | 2024-10-18 |
| Run pkimetal at least at post issuance | Detect | 2024-11-15 |

Appendix

Details of affected certificates

https://crt.sh/?id=14410275037

Thank you for this report. We have a few questions.

  1. In the What didn’t go well Section it’s cited that the code change was not tested with all certificate profiles. Is there a corresponding action item that ensures incomplete certificate profile testing does not reoccur? Is there possibly an even larger action item for complete testing of code changes in general before production release?
  2. What is the status of the remaining Action Items?
  3. In the Action Items section there are two “Mitigate” items that include informing teams of the need to report. Can you elaborate on what it means to be “informed” and "report" in these actions? How are you planning to measure the effectiveness of these actions?
  4. Can you describe how the last 13 months of certificates were analyzed, as stated in Comment 2?
  5. How is "normal" defined, as stated in Comment 5? "The issuing Team does not have enough skills to determine whether this was a mis-issuance or not; they just check whether something does not look "normal" in the Subject."
  6. Considering the dependency on manual certificate issuance, how would you address a large-scale revocation situation while simultaneously fulfilling the expectations of the TLS BRs?

Related to question 4 in Comment 8 (and the analysis of certificates over the last 13 months), it appears several additional time-valid and unrevoked certificates are mis-issued and not considered by Izenpe’s analysis.

For example:

Specifically, these certificates contain the "User Notice" policy qualifier within the certificatePolicies extension, prohibited by the TLS BRs effective September 15, 2023.

The above certificates were randomly sampled and should not be considered exhaustive.

Given this sampling, we’re particularly interested in understanding more about the analysis previously performed, especially considering a similar incident report [1] was posted approximately 6 months ago. Our expectation is that when new incidents are reported, all CA Owners review them thoroughly to determine if they are also impacted by the issue or its underlying root cause(s) and need to take action.

Hi,
I will answer the previous questions ASAP, but today I have focused on the issue raised in the previous comment (thanks to Clements),
regarding some certificates that were still valid and should have been revoked under Bug 1876565. I have therefore opened a new bug, Bug 1922844, to clarify and fix what happened.

These are the answers to Comment 8

In the What didn’t go well Section it’s cited that the code change was not tested with all certificate profiles. Is there a corresponding action item that ensures incomplete certificate profile testing does not reoccur?
Is there possibly an even larger action item for complete testing of code changes in general before production release?

In this situation we have not set an action item, as this is something that should have been done as normal procedure whenever a change is made.

What is the status of the remaining Action Items?

All the items have been completed except the last two, which require a small development. Regarding zlint, 3.6.4rc1 was tested the same day it was released.

In the Action Items section there are two “Mitigate” items that include informing teams of the need to report. Can you elaborate on what it means to be “informed” and "report" in these actions?
How are you planning to measure the effectiveness of these actions?

From the issuing team, we expect communication by any of our usual means, that is, phone call, email or Teams. From the Development team, we expect an automatically generated email every time a new version is deployed.

Can you describe how the last 13 months of certificates were analyzed, as stated in Comment 2?

When a certificate is issued, it is copied to another server to be post-linted. Once that happens (and the corresponding email is sent), it is moved to a "processed" folder. When a new version of zlint is deployed, we extract from this folder
all the certificates issued in the last 13 months and run zlint over them.
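
The selection step can be sketched as follows. This is a minimal illustration that assumes the issuance date of each archived certificate is already known (here as a plain dictionary); the file names and function names are hypothetical, and running the linter over the selected files is left out.

```python
import calendar
from datetime import date


def months_before(day: date, months: int) -> date:
    """Return the date `months` calendar months before `day`, clamping the day of month."""
    total = day.year * 12 + (day.month - 1) - months
    year, month0 = divmod(total, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(day.day, last_day))


def select_for_relint(issued: dict, today: date, months: int = 13) -> list:
    """Names of certificates issued within the last `months` months.

    `issued` maps a certificate file name to its issuance date.
    """
    cutoff = months_before(today, months)
    return sorted(name for name, when in issued.items() if cutoff <= when <= today)


# Hypothetical archive contents.
issued = {
    "cert-a.pem": date(2023, 8, 1),
    "cert-b.pem": date(2023, 9, 15),
    "cert-c.pem": date(2024, 9, 3),
}
print(select_for_relint(issued, today=date(2024, 9, 26)))
```

Filtering on the issuance date rather than on when the file was archived keeps the window correct even if certificates are copied late.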

How is "normal" defined, as stated in Comment 5? "The issuing Team does not have enough skills to determine whether this was a mis-issuance or not; they just check whether something does not look "normal" in the Subject."

When a certificate is issued they receive a couple of emails (attached to this bug). One is the zlint output and the other just shows the Subject and SAN of the certificate. The latter is the part they can control, and visually they can notice
that something may have gone wrong.

Considering the dependency on manual certificate issuance, how would you address a large-scale revocation situation while simultaneously fulfilling the expectations of the TLS BRs?

In other situations in the past where we have had to revoke some thousands of certificates (I remember the ROCA vulnerability), this was done automatically: given a list of serial numbers, a process performed all the
revocations. If we had to face such a massive revocation again, it would be done the same way; it would not be a problem.
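
A batch process of that kind can be sketched as below. The `revoke` argument is a hypothetical callable standing in for whatever interface the CA software exposes (it is expected to raise on failure); the serial numbers and the normalization rules are likewise illustrative assumptions.

```python
def normalize_serial(raw: str) -> str:
    """Strip whitespace and colons, and upper-case a hex serial number."""
    return raw.strip().replace(":", "").upper()


def revoke_batch(serials, revoke):
    """Revoke each distinct serial once; return (revoked, failed) lists.

    Failures are collected instead of aborting the run, so one bad entry
    does not delay the remaining revocations.
    """
    seen, revoked, failed = set(), [], []
    for raw in serials:
        serial = normalize_serial(raw)
        if not serial or serial in seen:
            continue  # skip blank lines and duplicates
        seen.add(serial)
        try:
            revoke(serial)
            revoked.append(serial)
        except Exception as exc:
            failed.append((serial, str(exc)))
    return revoked, failed


def fake_revoke(serial):
    """Stand-in for the real revocation call, failing for one serial."""
    if serial == "DEAD":
        raise RuntimeError("unknown serial")


revoked, failed = revoke_batch(["0a:1b", "0A1B", "", "dead", "BEEF\n"], fake_revoke)
print(revoked)
print(failed)
```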

Attached image subject_screenshot.jpg

Regarding Comment 9, the last question is also answered
in Bug 1922844 and I will summarize here too.
Every time a certificate is issued, an automatic process downloads it to a server to be post-linted.
That server holds a copy of all the certificates, so it is quite easy
to analyze a subset (for example, all certificates issued in the last 13 months) when the linting tool is updated.
Where we failed was in not copying the precertificates that were never finally issued (due to a problem with the SCTs);
these should have been reported when we opened Bug 1876565, as they must be
considered as if they were really issued. This behaviour will be fixed soon.
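
The gap described here amounts to a set difference between the precertificates that were produced and the certificates that reached the linting archive. A minimal sketch, with hypothetical serial numbers and function name:

```python
def precerts_missing_from_archive(precert_serials, archived_serials):
    """Serials that have a precertificate but no copy in the linting archive."""
    return sorted(set(precert_serials) - set(archived_serials))


# Hypothetical data: precertificate "02" never produced a final certificate.
precerts = ["01", "02", "03"]
archived = ["01", "03"]
print(precerts_missing_from_archive(precerts, archived))
```

Any serial returned by such a comparison would need to be linted (and, if necessary, reported) just like a final certificate.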

Attached image zlint_screenshot.jpg

Hi,
as part of the prevention items, since this week we receive an email whenever an SSL certificate is revoked.
It has been tested with this certificate, as can be seen in this screenshot.
Regards,

Hi,
we are having some delays setting up a new Docker server where we plan to run pkimetal. We hope this problem will be fixed soon and that we will finish the installation before the end of the month.
Regards,

Attached image zlint.jpg

Hi,
we have finally set up a Docker container to run pkimetal through its API, and it now sends an email whenever a certificate is issued.
We have also submitted all the certificates issued in the last 13 months to the pkimetal API, without any findings.
Regards,

Please provide a status update. By the way, status updates are required on a weekly basis unless a "Next update" has been set in the Whiteboard.

Flags: needinfo?(d-fernandez)

Hi Ben,
we failed to set a Next Update in this bug; sorry for the inconvenience.

Flags: needinfo?(d-fernandez)

• Incident Description:
One EV certificate with a duplicate "jurisdictionCountryName" attribute in the Subject was issued. The problem was detected manually and the certificate was revoked.
• Incident Root Cause(s):
A change in our certificate management application, intended for DV and OV certificates, introduced a bug in EV certificates.
In addition, the Zlint version in use at that time did not detect the problem, so the certificate was issued.
• Remediation Description:
The bug in the code was fixed, we updated zlint to the latest version, and we included in our roadmap a commitment to do so within a week of each new release.
• Commitment Summary:
We have changed some internal procedures to handle these situations. On the one hand, we will perform more tests when we update the web application; on the other, we have warned the issuing team that any revocation
must be reported to the PKI team.

All Action Items disclosed in this Incident Report have been completed as described, and we request its closure.

Based on Comment #22, I am considering this Incident Report closed and will close this bug tomorrow, 19-Feb-2025, unless there are any outstanding issues or questions to discuss.

Flags: needinfo?(bwilson)
Status: ASSIGNED → RESOLVED
Closed: 6 months ago
Flags: needinfo?(bwilson)
Resolution: --- → FIXED