Closed Bug 1597135 Opened 25 days ago Closed 8 days ago

HARICA: 3 EV TLS Certificates without L or ST

Categories

(NSS :: CA Certificate Compliance, task)

RESOLVED FIXED

People

(Reporter: jimmy, Assigned: jimmy)

Details

(Whiteboard: [ca-compliance] )

Attachments

(1 file, 1 obsolete file)

Our internal checks have identified three EV TLS Certificates that were issued without a subject:localityName or subject:stateOrProvinceName attribute. All three have been scheduled for revocation within 5 days. An incident report with initial investigation and remediation actions will be posted tomorrow. In the meantime, no EV SSL certificates will be issued until the issue is fully remediated.

Type: defect → task
Whiteboard: [ca-compliance]
Assignee: wthayer → jimmy

Incident Report Analysis

HOW HARICA FIRST BECAME AWARE OF THE PROBLEM

During the internal quality checks and tests to introduce improved linting software, it was discovered that three (3) EV TLS certificates were issued without L or ST in the subjectDN.

IMMEDIATE ACTIONS

The problematic certificates were all issued from a recently created QWAC certificate profile. The QWAC certificate profile was reconfigured to require at least localityName.

In addition to this, it was discovered that the issuing CA was linked to certlint linter instead of the recommended cablint linter. The CA configuration was updated to use cablint linter.

All Issuing CAs technically capable of issuing TLS Certificates were scanned to ensure that the recommended linter was enabled. Only one Issuing CA was affected. All certificates issued from this CA were scanned and re-linted to confirm that no other certificate was issued in error.
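The scan described above can be pictured with a minimal sketch. It assumes a hypothetical in-memory inventory of Issuing CAs; the `IssuingCA` type, its field names, and the sample CA names are illustrative only and do not reflect HARICA's or EJBCA's actual configuration model.

```python
from dataclasses import dataclass

# The thread names cablint as the recommended linter for TLS certificates.
RECOMMENDED_TLS_LINTER = "cablint"


@dataclass(frozen=True)
class IssuingCA:
    """Hypothetical inventory record for one Issuing CA."""
    name: str
    can_issue_tls: bool
    pre_issuance_linter: str


def find_misconfigured(cas):
    """Return TLS-capable CAs whose pre-issuance linter is not the recommended one."""
    return [
        ca for ca in cas
        if ca.can_issue_tls and ca.pre_issuance_linter != RECOMMENDED_TLS_LINTER
    ]


# Illustrative inventory (names are made up).
inventory = [
    IssuingCA("tls-ca-a", can_issue_tls=True, pre_issuance_linter="cablint"),
    IssuingCA("qwac-ca", can_issue_tls=True, pre_issuance_linter="certlint"),
    IssuingCA("smime-ca", can_issue_tls=False, pre_issuance_linter="certlint"),
]

misconfigured = find_misconfigured(inventory)
for ca in misconfigured:
    print(f"{ca.name}: uses {ca.pre_issuance_linter}, expected {RECOMMENDED_TLS_LINTER}")
```

Only CAs that are technically capable of issuing TLS certificates are flagged, which mirrors the scope of the scan described in the report.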

TIMELINE OF THE ACTIONS HARICA TOOK IN RESPONSE

Sunday, November 17, 2019

  • During internal quality checks carried out while testing new linting software, it was discovered that three (3) EV Certificates using a recently created QWAC profile did not include a localityName or stateOrProvince attribute in their subjectDN field.
  • Further investigation revealed that the QWAC certificate profile did not enforce “localityName or stateOrProvince” (conditional rule) to be present in the subjectDN of end-entity certificates.
  • As a temporary measure, all certificate profiles technically capable of issuing OV/IV/EV/QWAC TLS certificates were updated to enforce the existence of localityName.
  • The issuing CA (https://crt.sh/?caid=119883) was linked to certlint linter instead of the recommended cablint linter. However, all previously executed quarterly audits were using the recommended linter (cablint) as it was a separate process. The last internal audit was executed with certificates issued until 2019-09-30 and did not reveal any mis-issuances.
  • EV/QWAC Certificate issuance was stopped.
  • The CA configuration of the issuing CA (https://crt.sh/?caid=119883) was updated to use cablint for pre-signing linting.
  • All Issuing CAs technically capable of issuing TLS Certificates were scanned to ensure the recommended linter was enabled. The scan confirmed that only one Issuing CA was affected (https://crt.sh/?caid=119883). All certificates issued from this CA were scanned and re-linted to confirm that no other certificate was issued in error.
  • A notification to Bugzilla was drafted and submitted (https://bugzilla.mozilla.org/show_bug.cgi?id=1597135).
  • A notification to affected subscribers was drafted.
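
The conditional rule mentioned in the timeline above (localityName OR stateOrProvinceName must be present in the subjectDN) can be sketched as a small check. This is an illustration only: the subject is modelled as a plain list of (attribute, value) pairs, the function name is hypothetical, and the attribute names follow the spelling used in this thread rather than any particular library's API.

```python
# Attributes that satisfy the conditional rule. The jurisdictionOfIncorporation*
# attributes are deliberately NOT in this set: they describe where the entity
# is incorporated, not where it is located.
LOCATION_ATTRIBUTES = {"localityName", "stateOrProvinceName"}


def has_locality_or_state(subject):
    """Return True if the subject carries a non-empty L or ST attribute.

    `subject` is a list of (attribute_name, value) pairs.
    """
    present = {name for name, value in subject if value.strip()}
    return bool(present & LOCATION_ATTRIBUTES)


# A subject shaped like the mis-issued certificates: jurisdiction data only.
misissued_subject = [
    ("countryName", "GR"),
    ("organizationName", "Example Org"),
    ("jurisdictionOfIncorporationLocalityName", "Athens"),
]

# The same subject with the location attribute included as well.
corrected_subject = misissued_subject + [("localityName", "Athens")]

print(has_locality_or_state(misissued_subject))
print(has_locality_or_state(corrected_subject))
```

The first subject fails the check even though it names Athens, because the jurisdiction attribute does not count as a location attribute.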

Monday, November 18, 2019

  • EV/QWAC Certificate issuance was re-enabled
  • Our auditor was notified about the incident, the preliminary findings and planned actions.
  • The affected parties were contacted to replace their Certificates within the revocation timeline according to the Baseline Requirements.
  • The post-subCA creation ceremony script was updated to include steps that update the crt.sh mis-issuance check script with the new subCA.
  • PrimeKey was contacted to request a feature to enable a conditional configuration check, to require subjectDN field (localityName OR stateOrProvinceName) for end-entity profiles.
  • The Validation Specialists were specifically reminded that for EV Certificates and QWACs the JoI Locality OR StateOrProvince is required, and that these are not related to the subjectDN:localityName, subjectDN:stateOrProvince (the LocalityName and/or StateOrProvince must be copied in both locations).
  • The training material for Validation Specialists was scheduled to be updated by end of the week (November 22, 2019) to explicitly describe this requirement to avoid future misunderstandings.
  • The affected certificates are scheduled to be revoked on Thursday, November 21, 2019.

IS THE PROBLEM SOLVED?

Yes. The affected subCA has been configured to use the recommended linter in order to technically enforce all applicable EV requirements. Additionally, the certificate profiles have been updated to require at least the localityName to appear in the subjectDN field for IV/OV/EV/QWACs, and Validation Specialists have been specifically reminded of this requirement. Mis-issued certificates will be revoked by Thursday, November 21, 2019 (within the required time frame).

THE COMPLETE CERTIFICATE DATA FOR THE PROBLEMATIC CERTIFICATES

The entire certificate database was examined. Here are the problematic certificates:

WHY WERE THESE PROBLEMS NOT DETECTED SOONER?

The pre-issuance lint for the Issuing CA that issued the problematic certificates was not configured to use the recommended linting tool.

ACTIONS TO PREVENT RECURRENCE OF THIS ISSUE

There were two problems that were detected in our root cause analysis:

  1. The Validation Specialist that issued the QWACs mistakenly thought that the jurisdictionOfIncorporationLocalityName and jurisdictionOfIncorporationStateOrProvinceName were sufficient to convey to the Relying Parties that the organization is located in that specific Locality and State. All Validation Specialists were notified about the requirement to include the subjectDN:localityName OR subjectDN:stateOrProvinceName field in IV/OV/EV Certificates. Their training material was updated to include this clarification.
  2. A particular subCA was linked to certlint linter instead of the linter recommended for this case (cablint). We decided to introduce stricter controls to reduce the human-error factor, thus we updated our linting script to automatically detect whether the lint is for a TLS Certificate subject to the Baseline Requirements and EV Guidelines. This will ensure the recommended linter is automatically selected. In addition, we configured the end-entity profiles capable of issuing IV/OV/EV/QWACs to require the subjectDN:localityName to appear in the Certificate until a better technical control is offered by the software vendor. We submitted a feature request to PrimeKey to enable the combined check for localityName OR stateOrProvinceName.
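
The report does not say how the updated linting script detects a TLS certificate. One plausible approach, sketched here purely as an assumption, is to look at the Extended Key Usage OIDs of the certificate to be signed and pick the linter from that. The function name and the decision to treat a missing EKU as TLS-capable are illustrative choices, not HARICA's documented logic.

```python
# Well-known Extended Key Usage OIDs.
ID_KP_SERVER_AUTH = "1.3.6.1.5.5.7.3.1"
ANY_EXTENDED_KEY_USAGE = "2.5.29.37.0"
ID_KP_EMAIL_PROTECTION = "1.3.6.1.5.5.7.3.4"


def select_linter(eku_oids):
    """Pick a linter name from a certificate's EKU OIDs (hypothetical helper).

    Assumption: a certificate with no EKU, with serverAuth, or with
    anyExtendedKeyUsage is treated as usable for TLS and therefore linted
    with cablint. Everything else falls back to certlint.
    """
    ekus = set(eku_oids)
    if not ekus or ID_KP_SERVER_AUTH in ekus or ANY_EXTENDED_KEY_USAGE in ekus:
        return "cablint"
    return "certlint"


print(select_linter([ID_KP_SERVER_AUTH]))
print(select_linter([ID_KP_EMAIL_PROTECTION]))
print(select_linter([]))
```

Choosing the linter from the certificate itself, rather than from a per-CA setting, removes the manual post-ceremony step that was misconfigured in this incident.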

Incident Impact

This incident affected two Subscribers, whose certificates had to be replaced. No other impact was detected.

Conclusions and Recommendations

We consider the mitigations applied to be sufficient to prevent similar issues from taking place in the future, but we will closely monitor to verify. HARICA already has technical controls in place to restrict certificate profiles according to the Baseline Requirements and EV Guidelines. EJBCA's inability to require that a certificate include “localityName OR stateOrProvince” is now the subject of a feature request to the software vendor, PrimeKey. In the meantime, HARICA configured the end-entity profiles related to IV/OV/EV/QWACs for TLS Certificates to require subject:localityName.

Dimitris,

Thanks for sharing this report. It's very clear and very helpful, and it makes a great preliminary report. However, in the process of diverging from the Mozilla process ( https://wiki.mozilla.org/CA/Responding_To_An_Incident ), it lacks several critical details.

I think one of the key aspects we're missing here is root cause analysis. Root cause is not simply about preventing this exact scenario next time, but about understanding what systemic issues might have led or contributed to this, and understanding how they can be mitigated.

I don't really see any introspection into the systemic controls, just about mitigating this specific issue. Can you clarify when you'll be providing a complete incident report? I think one of the key things will be understanding what controls existed, where and why they failed, and what sort of steps are being taken. There seems to be a number of steps that could have caught or prevented this.

Flags: needinfo?(jimmy)

(In reply to Ryan Sleevi from comment #2)

Dimitris,

Thanks for sharing this report. It's very clear and very helpful, and it makes a great preliminary report. However, in the process of diverging from the Mozilla process ( https://wiki.mozilla.org/CA/Responding_To_An_Incident ), it lacks several critical details.

I am not sure why you think our response is diverging from the Mozilla process. We are using a template that is based on the Mozilla's template. This is not our final incident report, because our actions have not been completed yet. We have provided several critical details, based on our initial investigation, trying to be as transparent as possible.

I think one of the key aspects we're missing here is root cause analysis. Root cause is not simply about preventing this exact scenario next time, but about understanding what systemic issues might have led or contributed to this, and understanding how they can be mitigated.

The distinction between causes, root causes and symptoms is very clear to us; without it, there is no way to minimize the possibility of similar issues recurring. Even though we have not finished our work, we will try to elaborate on that:

The first cause was a misunderstanding by the Validation Specialists, who assumed that for EV Certificates it would be redundant to include both subject:jurisdictionOfIncorporationLocalityName and subject:localityName in the same certificate, and similarly for subject:jurisdictionOfIncorporationStateOrProvinceName and subject:stateOrProvinceName. Even though this example was not explicitly mentioned during training, our training material highlights that for EV Certificates, EV Guidelines are applied in addition to Baseline Requirements.

For this reason we decided to improve our training material with this clarification and also add it to our written exam questions. We are also considering reviewing the entire training material in search of other similar places of misunderstanding, even though misunderstandings are the hardest to predict. Since we have never before had an incident attributed to improper training of our staff, we will make efforts to improve the training by adding more examples, based on different scenarios.

The second cause, which "allowed" the first cause to manifest as a misissuance, was the configuration of the new SubCAs to use a linter that was not the best available for these types of Certificates. This was attributed to human error during the post-ceremony configuration activities. Had the recommended linter been configured, it would have prevented the misissuance.

The mitigation for this particular issue was to improve the linting script to auto-detect the type of certificate and apply the recommended linter. We are examining further improvements to automate CA configuration in post-ceremony activities based on the types of Certificates that the Issuing CA is technically capable of issuing. HARICA is currently using several available CLI configuration options to automate the post-ceremony CA configuration, but certain EJBCA configuration steps cannot easily be performed using the CLI.

We now reach the third cause, which is the technical restrictions posed by the EJBCA software. For other certificate profile requirements (whether fields are required/optional, acceptable values and size per subject attribute, etc), EJBCA provides the necessary tools and HARICA is using them to enforce the Certificate Profiles per the Baseline Requirements and EV Guidelines. The only rule that EJBCA was not able to express in the end-entity profile tools was the requirement that either subject:LocalityName OR subject:StateOrProvinceName be present.

To address this lack of support in the EEP configuration, we requested a feature from PrimeKey (https://jira.primekey.se/browse/ECA-8704) to allow for these conditional restrictions in end-entity profiles, as this is a feature that most publicly-trusted CAs would like to use and implement. Until then, we set our end-entity profiles for issuance of TLS Certificates to require subject:localityName for IV/OV/EV/QWACs (in addition to the other required fields).
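
To illustrate the difference between the requested conditional restriction and the interim workaround, here is a minimal sketch of a profile rule that supports both plain required attributes and "at least one of" groups. The `ProfileRule` type and `violations` function are hypothetical and are not EJBCA's end-entity profile model; the required attribute sets are abbreviated for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileRule:
    """Hypothetical profile rule: required attributes plus any-of groups."""
    required: frozenset
    any_of: tuple = ()  # each entry is a frozenset; one member must be present


def violations(rule, present):
    """Return human-readable rule violations for a set of present attributes."""
    problems = [f"missing required: {name}" for name in sorted(rule.required - present)]
    for group in rule.any_of:
        if not (group & present):
            problems.append("missing one of: " + ", ".join(sorted(group)))
    return problems


# Interim workaround described in the thread: always require localityName.
interim = ProfileRule(
    required=frozenset({"countryName", "organizationName", "localityName"}),
)

# Requested feature: require localityName OR stateOrProvinceName.
conditional = ProfileRule(
    required=frozenset({"countryName", "organizationName"}),
    any_of=(frozenset({"localityName", "stateOrProvinceName"}),),
)

state_only = frozenset({"countryName", "organizationName", "stateOrProvinceName"})
neither = frozenset({"countryName", "organizationName"})

print(violations(interim, state_only))
print(violations(conditional, state_only))
print(violations(conditional, neither))
```

The interim rule is stricter than the actual requirement: it rejects a subject that carries only stateOrProvinceName, which the conditional rule would accept. Both rules reject a subject with neither attribute, which is the case that led to this incident.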

Analyzing further on the above causes, especially the first two which are part of HARICA’s internal organization, one could consider the “human factor” to be the source. For us, it is a clear indication that our strategy to use more automation and remove the possibility for human errors in as many places as possible, is correct.

Going deeper in our analysis, while automation is certainly valuable, it still transfers part of the risk to systems, engineers and/or developers to properly implement, configure, operate and monitor them. Simply put, misunderstandings, omissions or errors, at the human or system level, can always happen; but, this is precisely a good reminder of the value of our -costly- choice to maintain multiple overlapping controls, so that we have a fault-tolerant system. In this case, it is clear that more than two things had to go wrong for a misissuance to occur, and even when that happened, our processes were able to detect the issue and take immediate action.

I hope the above is not seen as praise of our own processes; a misissuance did take place. We remain alert and continue to evaluate specific good practices (some mentioned above and included in our incident report).

I don't really see any introspection into the systemic controls, just about mitigating this specific issue. Can you clarify when you'll be providing a complete incident report? I think one of the key things will be understanding what controls existed, where and why they failed, and what sort of steps are being taken.

We focused on the controls and mitigations that apply for the L/ST issue that hopefully address the concerns raised. We expect to deliver the complete incident report next week, after the affected certificates are revoked and our analysis is completed. The final report will include additional information that is included in this response.

There seems to be a number of steps that could have caught or prevented this.

We agree. As mentioned above, despite the failure of 2 controls (training, pre-linting), this issue was caught by HARICA exactly because we had a 3rd control in place (internal checks/testing). This testing was conducted as part of our continuous preemptive actions to introduce improved tools (linters in this case) and more automation. We are continuously trying to learn from existing incidents and improve our existing tools and practices.

Please let us know if you have any further questions or concerns.

Flags: needinfo?(jimmy)

All problematic certificates were revoked as planned. An updated report will be posted early next week.

1. Incident Report Analysis

1.1 HOW HARICA FIRST BECAME AWARE OF THE PROBLEM

During the internal quality checks and tests to introduce improved linting software, it was discovered that three (3) EV TLS certificates were issued without L or ST in the subjectDN.

1.2 A TIMELINE OF THE ACTIONS HARICA TOOK IN RESPONSE

The problematic certificates were all issued from a recently created QWAC certificate profile. The QWAC certificate profile was reconfigured to require at least localityName.

Additionally, it was discovered that the issuing CA was linked to certlint linter instead of the recommended cablint linter. The CA configuration was updated to use cablint linter.

All Issuing CAs technically capable of issuing TLS Certificates were scanned to ensure that the recommended linter was enabled. Only one Issuing CA was affected. All certificates issued from this CA were scanned and re-linted to confirm that no other certificate was issued in error.

Here is a detailed timeline:

Sunday, November 17, 2019

  • During internal quality checks carried out while testing new linting software, it was
    discovered that three (3) EV Certificates using a recently created QWAC profile did not
    include a localityName or stateOrProvince attribute in their subjectDN field.
  • Further investigation revealed that the QWAC certificate profile did not enforce
    “localityName or stateOrProvince” (conditional rule) to be present in the subjectDN of
    end-entity certificates.
  • As a temporary measure, all certificate profiles technically capable of issuing
    OV/IV/EV/QWAC TLS certificates were updated to enforce the existence of
    localityName.
  • The issuing CA (https://crt.sh/?caid=119883) was linked to certlint linter instead of the
    recommended cablint linter. However, all previously executed quarterly audits were
    using the recommended linter (cablint) as it was a separate process. The last internal
    audit was executed with certificates issued until 2019-09-30 and did not reveal any mis-
    issuances.
  • EV/QWAC Certificate issuance was stopped.
  • The CA configuration of the issuing CA (https://crt.sh/?caid=119883) was updated to use
    cablint for pre-signing linting.
  • All Issuing CAs technically capable of issuing TLS Certificates were scanned to ensure
    the recommended linter was enabled. The scan confirmed that only one Issuing CA was
    affected (https://crt.sh/?caid=119883). All certificates issued from this CA were scanned
    and re-linted to confirm that no other certificate was issued in error.
  • A notification to Bugzilla was drafted and submitted
    (https://bugzilla.mozilla.org/show_bug.cgi?id=1597135).
  • A notification to affected subscribers was drafted.

Monday, November 18, 2019

  • EV/QWAC Certificate issuance was re-enabled.
  • Our auditor was notified about the incident, the preliminary findings and planned
    actions.
  • The affected parties were contacted to replace their Certificates within the revocation
    timeline according to the Baseline Requirements.
  • The post-subCA creation ceremony script was updated to include steps that update the
    crt.sh mis-issuance check script with the new subCA.
  • PrimeKey was contacted to request a feature to enable a conditional configuration check,
    to require subjectDN field (localityName OR stateOrProvinceName) for end-entity
    profiles.
  • The Validation Specialists were specifically reminded that for EV Certificates and QWACs
    the JoI Locality OR StateOrProvince is required, and that these are not related to the
    subjectDN:localityName, subjectDN:stateOrProvince (the LocalityName and/or
    StateOrProvince must be copied in both locations).
  • The training material for Validation Specialists was scheduled to be updated by end of
    the week (November 22, 2019) to explicitly describe this requirement to avoid future
    misunderstandings.
  • The certificate with serial number 06607994DD3087EAECDA6184FD21E7B0 was
    revoked.
  • The remaining affected certificates were scheduled to be revoked on Thursday,
    November 21, 2019.

Tuesday, November 19, 2019

  • Further analysis of the incident was conducted. The incident analysis was documented
    in more detail.
  • Additional information related to the incident was posted to the Bugzilla bug.

Thursday, November 21, 2019

  • Certificates with serial numbers 41F7531AB378BA72F082AB5B0CB6150C and
    1DEB0CC305A21A74B1527363548084D6 were revoked.

Friday, November 22, 2019

  • Training documentation was updated with this particular incident explanation and clear
    instructions for Validation Specialists. New related questions were added to the test
    options.

Tuesday, November 27, 2019

  • Final report was posted to Bugzilla.

1.3 HAS HARICA STOPPED, OR HAS NOT YET STOPPED, ISSUING CERTIFICATES WITH THE PROBLEM?

Yes. The affected subCA was configured to use the recommended linter in order to technically enforce all applicable EV requirements. Additionally, the certificate profiles were updated to require at least the localityName to appear in the subjectDN field for IV/OV/EV/QWACs, and Validation Specialists have been specifically reminded of this requirement. Misissued certificates were all revoked by Friday, November 22, 2019, within the required time frame.
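
The revocation timing can be checked with simple date arithmetic. The sketch below uses the discovery date and revocation dates from the timeline and the 5-day window mentioned in the initial report; treating the discovery date as the start of the window is an assumption made for the example.

```python
from datetime import date, timedelta

# 5-day revocation window, as stated in the initial report.
REVOCATION_WINDOW = timedelta(days=5)

# Assumption: the window starts on the day the problem was discovered.
discovered = date(2019, 11, 17)
deadline = discovered + REVOCATION_WINDOW

# Serial numbers and revocation dates as listed in the timeline.
revocations = {
    "06607994DD3087EAECDA6184FD21E7B0": date(2019, 11, 18),
    "41F7531AB378BA72F082AB5B0CB6150C": date(2019, 11, 21),
    "1DEB0CC305A21A74B1527363548084D6": date(2019, 11, 21),
}

# Any certificate revoked after the deadline would be listed here.
late = [serial for serial, revoked_on in revocations.items() if revoked_on > deadline]

print("deadline:", deadline.isoformat())
print("late revocations:", late)
```

Under that assumption the deadline falls on 2019-11-22, and all three revocation dates in the timeline precede it.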

1.4 A SUMMARY OF THE PROBLEMATIC CERTIFICATES

A full list is included in the next section.

1.5 THE COMPLETE CERTIFICATE DATA FOR THE PROBLEMATIC CERTIFICATES

The entire certificate database was examined. Here are the problematic certificates:

1.6 EXPLANATION ABOUT HOW AND WHY THE MISTAKES WERE MADE OR BUGS INTRODUCED, AND HOW THEY AVOIDED DETECTION UNTIL NOW

The pre-issuance lint for the Issuing CA that issued the problematic certificates was not configured to use the recommended linting tool and the mis-issuance was not immediately detected. Our last quarterly audit scan included Certificates issued between 2019-07-01 and 2019-09-30.

1.7 ACTIONS TO PREVENT RECURRENCE OF THIS ISSUE

There were three causes that were detected in our analysis, all of which occurred for the issue to take place:

  1. The Validation Specialist that issued the QWACs mistakenly thought that the jurisdictionOfIncorporationLocalityName and jurisdictionOfIncorporationStateOrProvinceName were sufficient to convey to the Relying Parties that the organization is located in that specific Locality and State. All Validation Specialists were notified about the requirement to include the subjectDN:localityName OR subjectDN:stateOrProvinceName field in IV/OV/EV Certificates. Their training material was updated to include this clarification. In addition to the specific notification/announcement to all Validation Specialists about the proper way to apply the requirements and the training material update for this specific issue, a review of the entire training material for Validation Specialists was considered, in search of other areas of misunderstanding. Given that misunderstandings are the hardest to predict when writing or evaluating training material, we also considered changes in our training practices. The conclusion at this time is that, since we have never before had an incident attributed to improper training of our staff, we will make efforts to improve training practices and use more real-life examples in the training material and, whenever recommended, more collaborative knowledge sharing between different team roles based on different scenarios, especially in cases where technical controls are not feasible and thus a human is the main control.
  2. A particular subCA was linked to certlint linter instead of the linter recommended for this case (cablint). We decided to introduce stricter controls to reduce the human-error factor, thus we updated our linting script to automatically detect whether the lint is for a TLS Certificate subject to the Baseline Requirements and EV Guidelines. This will ensure the recommended linter is automatically selected. The improvement to auto-detect the type of certificate is considered effective to prevent this human error from repeating. We are continuously examining improvements to automate CA configuration in post-ceremony activities based on the types of Certificates that the Issuing CA is technically capable of issuing. HARICA is currently using several available CLI configuration options to automate the post-ceremony CA configuration, but certain EJBCA configuration steps cannot easily be performed using the CLI.
  3. There were technical restrictions posed by the EJBCA software. For other certificate profile requirements (whether fields are required/optional, acceptable values and size per subject attribute, etc), EJBCA provides the necessary tools and HARICA is using them to enforce the Certificate Profiles per the Baseline Requirements and EV Guidelines. The only rule that EJBCA was not able to provide in the end-entity profile tools was the combination of existence of subject:LocalityName OR subject:StateOrProvinceName. To address this lack of support in the EEP configuration, we requested a feature from PrimeKey (https://jira.primekey.se/browse/ECA-8704) to allow for these conditional restrictions in end-entity profiles, as this is a feature that most publicly-trusted CAs would like to use. Until then, we set our end-entity profiles for issuance of TLS Certificates to require subject:localityName for IV/OV/EV/QWACs (in addition to the other required fields).

2. Incident Impact

This incident affected two Subscribers, whose certificates had to be replaced. No other impact was detected.

3. Conclusions and Recommendations

We consider the mitigations applied to be sufficient to prevent similar issues from taking place in the future, but we will closely monitor to verify. HARICA is continuously improving its technical controls to restrict certificate profiles according to the Baseline Requirements and EV Guidelines. EJBCA's inability to require that a certificate include “localityName OR stateOrProvince” is now the subject of a feature request to the software vendor, PrimeKey. In the meantime, HARICA configured the end-entity profiles related to IV/OV/EV/QWACs for TLS Certificates to require subject:localityName.

We re-affirmed our strategy to use more automation and remove the possibility for human errors in as many places as possible. Despite the failure of 2 controls (training, pre-linting) and EJBCA’s limitation to enforce this specific rule, this issue was caught by HARICA because of multiple layers of control (internal checks/testing). This testing was conducted as part of our continuous preemptive actions to introduce improved tools (linters in this case) and more automation. With respect to the quality and completeness of the training practices, we consider that training people with more real-life examples is the key to avoiding misunderstandings or improper handling of corner cases. We are continuously trying to learn from existing incidents and improve our existing tools and practices.

Attachment #9109775 - Attachment is obsolete: true

The attached PDF is the final incident report. We have copied the core sections of this report in markdown for easier review.
Please let us know if there are further questions or concerns.

Could you explain what you mean by:

The Validation Specialists were specifically reminded that for EV Certificates and QWACs
the JoI Locality OR StateOrProvince is required, and that these are not related to the
subjectDN:localityName, subjectDN:stateOrProvince (the LocalityName and/or
StateOrProvince must be copied in both locations).

It sounds like you're saying the jurisdictionOf* information is copied into the subjectDN:locality/StateOrProvince, but these fields are expressing different things: one where the entity is incorporated, another where the entity is operated. It may be the same as incorporation, but not necessarily, depending on how the CA is verifying information. I'm hoping you can explain more the process here?

Flags: needinfo?(jimmy)

The Validation Specialists are very clear about the two different roles that the jurisdictionOf* and the other subjectDN stateOrProvince and Locality mean. The word "copied" was meant for the particular mis-issued cases where the information of JoI* and the other subject DN fields were the same.

To make it clearer, it is not a "mistake" but a "requirement" for an organization that is registered/incorporated in the city of Athens and is also physically located in Athens to include both localityName=Athens and jurisdictionOfIncorporationLocalityName=Athens.

Perhaps it would be clearer if we replaced the word "copied" with the word "included".

I hope this helps.

Flags: needinfo?(jimmy)

Gotcha! Yeah, that makes it clearer, and why I was trying to get clarification to better understand :)

I wasn't sure if Wayne had further questions here. It seems pre-issuance linting was configured, but with the wrong linter, combined with a correct profile from the compliance team, but a wrong understanding of the validation specialists. All things considered, there were a lot more controls in place here than we've seen from other incidents, and so I'm more comfortable with an approach of "fix the controls" (e.g. more training, additional configuration) since many systemic controls were already in place.

Flags: needinfo?(wthayer)

It appears that all questions have been answered and remediation is complete.

Status: UNCONFIRMED → RESOLVED
Closed: 8 days ago
Flags: needinfo?(wthayer)
Resolution: --- → FIXED