Closed Bug 1623356 Opened 2 years ago Closed 1 year ago

GlobalSign: Misissuance of QWAC Certificates

Categories

(NSS :: CA Certificate Compliance, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: douglas.beattie, Assigned: douglas.beattie)

Details

(Whiteboard: [ca-compliance])

Attachments

(1 file)

Steps to reproduce:

GlobalSign has issued 4 QWAC certificates as of this date and 3 of them are not compliant and are either revoked, or in the process of being revoked. 
https://crt.sh/?Identity=%25&iCAID=126739

Actual results:

#1) This certificate has improper content for Organization Identifier with a repeated "PSDFR-ACPR". It has PSDFR-ACPR-PSDFR-ACPR-10278 and it should have been PSDFR-ACPR-10278
https://crt.sh/?id=2540741251

#2) This certificate is missing the CABFOrganizationIdentifier extension and there is no value in the  PSD2QcType field in the QCStatement extension.  https://crt.sh/?id=2573480105

#3) Same as issue #2:  https://crt.sh/?id=2577599695

Expected results:

#1: the OrganizationIdentifier should have been: PSDFR-ACPR-10278, and the registrationReference field of the CABFOrganizationIdentifier should have been ACPR-10278 instead of ACPR-PSDFR-ACPR-10278

#2 and #3: The certificate should have had the CABFOrganizationIdentifier extension and the PSD2QcType field in the QCStatement extension

We've stopped issuing QWAC certificates until the Validation agents are re-trained and the system is patched.  The full misissuance report will be forthcoming.

Type: defect → task
Whiteboard: [ca-compliance]
Assignee: wthayer → douglas.beattie
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Attached file Incident Report

This report covers 3 certificates as described below.

  1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, via a discussion in mozilla.dev.security.policy, or via a Bugzilla bug), and the time and date.

#1) We monitor the issuance of new certificate products closely, so we observed these issues as part of our auditing process.  We noticed this issue shortly after issuance (see the timeline below).

#2 and #3)  These were both for the same customer.  We noticed #2 and #3 shortly after issuance.

  2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.

#1: https://crt.sh/?id=2540741251

Issued: Mar 5 15:52:47 GMT

Observed: Mar 5 15:59 GMT

Communicated with Customer:  March 5

Revoked: Mar 9, 09:20:52 GMT. We reached out to the customer on Thursday 3/5 and Friday 3/6, but it took until Monday to confirm revocation with them.

#2:  https://crt.sh/?id=2573480105 

Issued: Mar 13 08:59:58 2020 GMT

Observed: Mar 13 09:12

Communicated with Customer Mar 13.  We communicated to the customer; however, because it was the weekend they did not get the message until Monday.

Revoked:  Mar 20  03:18:51

#3: https://crt.sh/?id=2577599695

Initially it was our understanding that the bug encountered in #2 was caused by the customer's selection of the UCC SAN option when requesting the certificate, so we had them submit a new order without that option selected.  This was reviewed and approved for issuance by management (the validation agents did not issue without approval, following the guidance provided to them as part of the remediation of #2, as stated below).  Given that this certificate had the same issue as #2, we stopped all issuance until the root cause could be identified.

Issued: Mar 14 16:01:07 2020 GMT

Observed: Mar 14 16:40

Communicated with Customer to let them know we plan to revoke: March 15

Revoked: Mar 20 10:43:05 

General Incident related tickets and events:

Mar 9: Provided updated training to the Validation agents that process QWAC orders regarding the content of the OrganizationIdentifier

Mar 13: Based on #2, provided direction to stop issuance of all QWAC certificates until further notice, except if explicitly authorized by a member of the compliance team.

Mar 13, 11:55: opened ticket for developers defining the errors in QWAC #2 and #3 for their analysis and investigation

Mar 14: Stopped all issuance until root cause is identified

Mar 17 03:12 - Development team identified possible cause, but needed more time to confirm.

Mar 17 10:10 - Development team confirmed a bug in the RA application that is triggered after certificate request data has been edited.

Mar 19 : Patch installed to address #2 and #3

Mar 20: Issuance permitted to resume

  3. Whether your CA has stopped, or has not yet stopped, issuing TLS/SSL certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.

We temporarily stopped issuance of all QWAC certificates until the patch was applied, and we have now resumed issuance.

  4. A summary of the problematic certificates. For each problem: number of certs, and the date the first and last certs with that problem were issued.

As listed above

  5. The complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem.

As listed above.

  6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

#1: The Applicant entered the entire PSP Identifier into the PSP Number field, "PSDFR-ACPR-10278" instead of 10278.  While our validation agent verified the information with the proper authorities, they missed the fact that "PSDFR-ACPR" was repeated in the OrganizationIdentifier.

#2 and #3: The issue was due to a bug in our RA platform which was encountered when the order data was modified.  The result was a malformed certificate with the issues discussed above.  Our QWAC certificate issuance is new and we've only issued a couple of certificates. The issue was detected within an hour of issuance.

  7. List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.

#1: We provided a refresher training to the validation agents so they are aware of exactly how to validate and verify these fields in QWAC certificates.  On 30 March we will be updating the QWAC ordering page with improved guidance for the Applicant as well as adding field validation to be sure that the customer entered PSP Number does not start with VAT, NTR or PSP.

#2 and #3:  We have updated our QA testing process and methodology to include testing of modified certificate requests.  In the past we've tested valid and invalid requests, but haven't included all the manual steps to modify each identity field that is editable to verify that no unforeseen issues were introduced. This is now in place.

It seems like there were two delayed revocations in this report, which warrants a separate issue, if I understood correctly.
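
For readers who want to check the elapsed times, here is a minimal sketch that computes observed-to-revoked durations from the timestamps in the report. Two things are assumptions rather than statements from this bug: the year 2020 for certificate #1 (the report gives a year only for #2 and #3), and the five-day revocation window used as the threshold.

```python
from datetime import datetime, timedelta

# (observed, revoked) timestamps in GMT, copied from the incident report.
# The year for #1 is an assumption; the report states 2020 only for #2 and #3.
EVENTS = {
    "#1": ("2020-03-05 15:59:00", "2020-03-09 09:20:52"),
    "#2": ("2020-03-13 09:12:00", "2020-03-20 03:18:51"),
    "#3": ("2020-03-14 16:40:00", "2020-03-20 10:43:05"),
}

# Assumption: a five-day window counted from when the CA observed the problem.
REVOCATION_WINDOW = timedelta(days=5)

FMT = "%Y-%m-%d %H:%M:%S"


def time_to_revoke(observed: str, revoked: str) -> timedelta:
    """Elapsed time between observing the problem and revoking."""
    return datetime.strptime(revoked, FMT) - datetime.strptime(observed, FMT)


def delayed(events, window=REVOCATION_WINDOW):
    """Return the certificates whose revocation took longer than the window."""
    return [
        cert
        for cert, (observed, revoked) in events.items()
        if time_to_revoke(observed, revoked) > window
    ]


if __name__ == "__main__":
    for cert, (observed, revoked) in EVENTS.items():
        print(cert, time_to_revoke(observed, revoked))
    print("over the assumed window:", delayed(EVENTS))
```

Under those assumptions, #2 and #3 fall outside the window and #1 does not.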

From your timelines, restructured to be an actual timeline (trimmed to minutes since some were missing seconds):

Incident #1

While human error happens, I don't know that I'm terribly confident in the root cause analysis in #1. It sounds like the system relies entirely on human agents to enter well-formed values, and at this point I think it's been firmly established that it's negligent for CAs to rely on human-only controls for items that can be technically validated (as we've seen with DNS names, IP addresses, jurisdiction information, etc.).

The described mitigations of field validation equally seem quite incomplete, given the formal structure captured in TS 119 495, and so I have little confidence that there won't be repeat issues. For example, had the validation agent entered "FR-ACPR-10278" the issue could just as well repeat. What other considerations has GlobalSign given here?

Incident #2 / #3

I'm not really sure that the explanation of root causes is really helpful here, or helps other CAs avoid similar mistakes. As best I can determine, the explanation is just that "It was a bug". Comment #0 describes what was wrong, but Comment #2 just says when the bug happened ("If data was modified"), without really explaining how or why such an invalid cert could be issued.

As the examples in https://wiki.mozilla.org/CA/Responding_To_An_Incident try to highlight, helping the community build an understanding of how GlobalSign's system is designed, how that led to such a bug, what the conditions were and why/how that manifested, helps identify root causes more thoroughly and helps everyone better design robust systems. As captured with Incident #1, I can't help but feel that there are more missed opportunities here, and so I'm not particularly inspired by the incident response to date that the root causes have been identified, or why/how it's sufficient to only address this with "more QA".

Flags: needinfo?(douglas.beattie)

Ryan,

For #1, both the applicant and the validation agent use a pull-down to populate the NCA from a predefined list of NCAs. This also includes the country, so both of those fields are limited to system supplied values.

The issue in #1 was the validation of the Payment Service Provider number. On Monday we're changing the input prompt to be more clear/accurate and we are also adding a check to prevent users from entering their OrganizationIdentifier into the PSP field. This would have prevented the user from entering PSDFR-ACPR-10278 into the PSP field.

Flags: needinfo?(douglas.beattie)

The first half of the reply sounds like it may be misunderstanding my concern. To help clarify, can you provide more specific details about what technical changes you’re putting in place? As presently described, I’m not confident they would address the concern, but this may be an area where providing additional technical details (i.e. that anybody else would be able to reliably implement based on your description) would help demonstrate y’all are actually on top of it.

Flags: needinfo?(douglas.beattie)

As you know, the OrganizationIdentifier for a PSD2 QWAC is constructed from the following fields:

  • "PSD" as 3 character legal person identity type reference;
  • NCA Country: the 2 character ISO 3166 country code representing the NCA country;
  • hyphen-minus "-" (0x2D (ASCII), U+002D (UTF-8));
  • NCA Identifier: This is the 2-8 character NCA identifier (A-Z uppercase only, no separator);
  • hyphen-minus "-" (0x2D (ASCII), U+002D (UTF-8));
  • PSP (Payment Service Provider): This is the authorization number as specified by the NCA for this PSP.
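
As a minimal sketch of the construction above (not GlobalSign's RA code; the function name `build_org_id` is hypothetical), the three variable fields can be checked and joined like this. Because the PSP identifier is unrestricted, pasting a full identifier into that field still yields a string with the right overall shape, which is what happened in #1.

```python
import re

COUNTRY_RE = re.compile(r"[A-Z]{2}")   # 2-character ISO 3166 country code
NCA_RE = re.compile(r"[A-Z]{2,8}")     # 2-8 uppercase letters, no separator


def build_org_id(country: str, nca: str, psp: str) -> str:
    """Join the fields as PSD<country>-<NCA identifier>-<PSP identifier>."""
    if not COUNTRY_RE.fullmatch(country):
        raise ValueError(f"bad NCA country: {country!r}")
    if not NCA_RE.fullmatch(nca):
        raise ValueError(f"bad NCA identifier: {nca!r}")
    if not psp:
        raise ValueError("PSP identifier is empty")
    return f"PSD{country}-{nca}-{psp}"


print(build_org_id("FR", "ACPR", "10278"))             # PSDFR-ACPR-10278
print(build_org_id("FR", "ACPR", "PSDFR-ACPR-10278"))  # PSDFR-ACPR-PSDFR-ACPR-10278
```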

Currently the NCA Name and (corresponding) Country are fixed in a pulldown with only the following value pairs permitted (a descriptive name is shown, but these are the only values that can be populated into this field). We didn't make this up, it came from: https://www.etsi.org/deliver/etsi_ts/119400_119499/119495/01.02.01_60/ts_119495v010201p.pdf

  • AT-FMA
  • BE-NBB
  • BG-BNB
  • HR-CNB
  • CY-CBC
  • CZ-CNB
  • DK-DFSA
  • EE-FI
  • FI-FINFSA
  • FR-ACPR
  • DE-BAFIN
  • GR-BOG
  • HU-CBH
  • IS-FME
  • IE-CBI
  • IT-BI
  • LI-FMA
  • LV-FCMC
  • LT-BL
  • LU-CSSF
  • NO-FSA
  • MT-MFSA
  • NL-DNB
  • PL-PFSA
  • PT-BP
  • RO-NBR
  • SK-NBS
  • SI-BS
  • ES-BE
  • SE-FINA
  • GB-FCA

So, of the 6 fields in the OrganizationIdentifier, all are either fixed or from the pull-down above except for the PSP Number. We're improving the description of that field on the certificate request page and we are adding validation to make sure that it does not start with PSD (to catch those that might be pasting in their entire OrganizationIdentifier instead of what we're asking for here).

(In reply to douglas.beattie from comment #6)

So, of the 6 fields in the OrganizationIdentifier, all are either fixed or from the pull-down above except for the PSP Number. We're improving the description of that field on the certificate request page and we are adding validation to make sure that it does not start with PSD (to catch those that might be pasting in their entire OrganizationIdentifier instead of what we're asking for here).

Thanks for the added details Doug. I've been helping review QWAC lints for zlint, written by MTG AG, as part of this.

The concern I was trying to capture in Comment #3, and I think Comment #6 highlights is still a concern, is trying to understand what validation happens either on the PSP number and/or on the entire OrganizationIdentifier. You've rightly called out that the OrganizationIdentifier has a defined structure, but Incident #1 reveals a situation where GlobalSign was not actually verifying that structure.

The original reply appeared that the sole mitigation was retraining users on the PSP number (and, from Comment #6, adding a description of that field). However, that doesn't seem like it systemically addresses the risk.

In Comment #6, I think we're getting closer, because you're highlighting "Adding validation to make sure that it does not start with PSD", but that doesn't seem like it meaningfully addresses the root concern? I was trying to highlight how a validation agent could end up providing a PSP Number that wasn't valid, but which didn't start with "PSD", as an example that would bypass that validation.

A more systemic fix would seem to be in validating the overall OrganizationIdentifier against the structure captured in 119 495, as you referenced, along with the NCA list. You can see I discussed some of that over on the ZLint project. Similarly, validating the PSP Identifier against the NCA's format seems like a more systemic fix than simply blocking the string "PSD". It's the NCAs that define the formatting restrictions for PSP identifiers, and I'm trying to understand what, if any, effort is being spent to validate that.
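
A rough sketch of what checking the whole OrganizationIdentifier could look like follows. It is not a zlint lint and not GlobalSign's implementation. `KNOWN_NCAS` is only an excerpt of the country/NCA pairs listed earlier in this thread, and the "PSP identifier repeats the country/NCA pair" rule is a heuristic added here for illustration, not something TS 119 495 requires.

```python
import re

# Excerpt of the country/NCA pairs quoted earlier in this thread; a real
# check would load the current published list instead.
KNOWN_NCAS = {"FR-ACPR", "DE-BAFIN", "GB-FCA", "NL-DNB", "BE-NBB"}

ORG_ID_RE = re.compile(r"PSD([A-Z]{2})-([A-Z]{2,8})-(.+)")


def check_org_id(value: str, known_ncas=KNOWN_NCAS) -> list:
    """Return a list of findings; an empty list means nothing was flagged."""
    match = ORG_ID_RE.fullmatch(value)
    if match is None:
        return ["does not match PSD<CC>-<NCA>-<PSP identifier>"]
    country, nca, psp = match.groups()
    pair = f"{country}-{nca}"
    findings = []
    if pair not in known_ncas:
        findings.append(f"unknown country/NCA pair {pair}")
    # Illustrative heuristic: the PSP identifier should not embed the pair again.
    if pair in psp:
        findings.append("PSP identifier repeats the country/NCA pair")
    return findings


print(check_org_id("PSDFR-ACPR-10278"))             # []
print(check_org_id("PSDFR-ACPR-PSDFR-ACPR-10278"))  # one finding
print(check_org_id("PSDFR-ACPR-FR-ACPR-10278"))     # one finding
```

Unlike a check for a leading "PSD" on the PSP field, this also flags the "FR-ACPR-10278" variant raised earlier in the thread.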

(In reply to douglas.beattie from comment #6)

We didn't make this up, it came from: https://www.etsi.org/deliver/etsi_ts/119400_119499/119495/01.02.01_60/ts_119495v010201p.pdf

Also, I'd be remiss in not highlighting that you're referring to an older version. The current deliverable is v1.4.1, replacing v1.3.2 and v1.3.1 that came out in the time since v1.2.1.

You can find more about the workstream here, which calls out that 1.4.1 replaced the EBA list in Annex D and clarifying the expectations for transition. Hopefully, this helps reduce the risk of future compliance issues from using the outdated tables and versions :)

For example, HR-CNB is renamed to HR-HNB in v1.3.2, so the dropdown may need updating.

Ryan,
I believe that the OrganizationIdentifier in #1 does/could follow the strict format in the case where the PSP is: PSDFR-ACPR-10278

Yes, to the educated eye that looks very odd. The ETSI spec I referenced above says:

  • PSP identifier (authorization number as specified by the NCA. There are no restrictions on the characters used).

How would you write a lint to detect the issue in #1, other than looking for some obvious mistakes?

(In reply to douglas.beattie from comment #10)

Yes, to the educated eye that looks very odd. The ETSI spec I referenced above says:

  • PSP identifier (authorization number as specified by the NCA. There are no restrictions on the characters used).

How would you write a lint to detect the issue in #1, other than looking for some obvious mistakes?

You validate the PSP Identifier based on the format defined by the NCA.

Or, put differently, you have a list of NCAs you've recognized (note Comment #9, a slightly outdated list). You would determine whether or not the NCA had defined a format for the PSP identifier. For example, DigiCert has already begun to examine their Incorporating Agencies for defined identifier formats, which relates to serialNumber.
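
One way to picture that lookup is the sketch below. The pattern shown for FR-ACPR is HYPOTHETICAL and exists only to make the example runnable; real patterns would have to come from whatever each NCA actually publishes, and NCAs with no published format are reported as such rather than guessed at.

```python
import re

# HYPOTHETICAL per-NCA formats, for illustration only. Nothing in this thread
# states what format any NCA actually uses for PSP identifiers.
PSP_FORMATS = {
    "FR-ACPR": re.compile(r"[0-9]{5}"),
}


def check_psp(nca_key: str, psp: str) -> str:
    """Classify a PSP identifier as 'ok', 'escalate' or 'no-format-known'."""
    pattern = PSP_FORMATS.get(nca_key)
    if pattern is None:
        # No format recorded for this NCA: nothing to enforce automatically.
        return "no-format-known"
    return "ok" if pattern.fullmatch(psp) else "escalate"


print(check_psp("FR-ACPR", "10278"))             # ok
print(check_psp("FR-ACPR", "PSDFR-ACPR-10278"))  # escalate
print(check_psp("GB-FCA", "123456"))             # no-format-known
```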

In response to comment 11:

Ryan, Yes, I inadvertently referenced the old spec and we're updating our NCA list in our ordering platform. If someone selected this NCA we would have detected that during our validation process (even with the old value in our ordering platform) because the validation agents had the updated list.

In response to comment 7

A more systemic fix would seem to be in validating the overall OrganizationIdentifier against the structure captured in 119 495, as you referenced, along with the NCA list. You can see I discussed some of that over on the ZLint project. Similarly, validating the PSP Identifier against the NCA's format seems like a more systemic fix than simply blocking the string "PSD". It's the NCAs that define the formatting restrictions for PSP identifiers, and I'm trying to understand what, if any, effort is being spent to validate that.

Yes, it’s certainly possible to create unique lints for each and every NCA, but doing that and keeping it up-to-date would be a large task, and I think a lint that works across all NCAs would be best, if possible. Given that the verification agents need to validate the PSP number, and given the number of expected QWAC certificates as compared with OV and EV, I’d recommend we focus more time on lints for those types of certificates over those for QWAC to give us the most return for the effort.

Flags: needinfo?(douglas.beattie)

I am still concerned, because QWACs are nominally meant to be more robust than EV, at least from the EU perspective, and yet despite this, misissuance occurred. Blocking the string “PSD”, and user training, continues to feel like a short term fix that does not address root causes, but is merely temporary until a real fix can be, and is, developed. The fact that each of these certificates required not one, but two independent validation agents to review and make mistakes does highlight the systemic nature here, and the potential flaws.

Again, using QWACs as the example, something such as NCA or VAT could seemingly slip by.

Similarly, I’m not exactly reassured that your validation agents had the new list and this inconsistency wasn’t spotted until now. Although it’s somewhat reassuring that you have additional human controls to go with existing (allowlisting) technical controls, it seems there is a breakdown in systems management when the technical controls allow illegal values and have to resort to human controls to correct.

I’m trying to approach this as a systems problem. I can understand short term fixes for the issue as it previously presented, but I’m trying to look for systemic fixes and prevent the class of issues. And to make sure we’ve identified the root causes.

I am still concerned, because QWACs are nominally meant to be more robust than EV, at least from the EU perspective, and yet despite this, misissuance occurred. Blocking the string “PSD”, and user training, continues to feel like a short term fix that does not address root causes, but is merely temporary until a real fix can be, and is, developed. The fact that each of these certificates required not one, but two independent validation agents to review and make mistakes does highlight the systemic nature here, and the potential flaws.

While the validation agents did define the OrgID correctly, they didn't spot that the system logic, there to help them, was going to add the PSD and NCA info separately. As Doug mentioned, this has now been addressed, both from a UI point of view and from an agent-training point of view. To address the root cause, we have kickstarted the process of investigating the different formats across jurisdictions and types of service providers to come up with a list of formats that we can enforce on a system level. We plan to conclude this investigation by end of day Wednesday 8th of April. The implementation timeline of the system controls will partially depend on the complexity of what we find.

Similarly, I’m not exactly reassured that your validation agents had the new list and this inconsistency wasn’t spotted until now. Although it’s somewhat reassuring that you have additional human controls to go with existing (allowlisting) technical controls, it seems there is a breakdown in systems management when the technical controls allow illegal values and have to resort to human controls to correct.

As you say, the validation agents had this list as a last (and additional) line of defense. As the last line of defense, they would have only spotted it when a Croatian order came in, which hasn't been the case. We have tightened our review process to better inform asset owners of the specific changes required to technical controls when relevant requirements have changed, and we are expanding on this in our existing GRC solution.

Gotcha.

If I may make a suggestion: if it turns out that escalation filters (i.e. if it doesn't meet the filter, it gets escalated for additional review) for the NCA-specific PSP identifiers don't work, I think it's still worth making sure that you examine the full set of prefixes (beyond just PSD) as a point of escalation. As y'all mentioned, the rules around identification schemes are sadly lax with the organizationIdentifier, with PSP identifiers allowed to not only start with "PSD", but also other potentially confusing strings such as "PAS" or "IDC" (that is, conflating with an ETSI TS 119 412-1, 5.1.3 natural person semantics identifier). Figuring out how to escalate any 'strange' values, both in UI and to additional eyes, seems like a more systemic fix.
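
A prefix-based escalation filter of that kind could be sketched as follows. The prefix list contains only the strings mentioned in this thread and is not claimed to be the full set of identifier-scheme prefixes; the function name is hypothetical. A hit means "send to a second reviewer", not "reject".

```python
# Identifier-scheme prefixes mentioned in this thread; not a complete list.
CONFUSING_PREFIXES = ("PSD", "VAT", "NTR", "PAS", "IDC")


def escalation_reasons(psp: str) -> list:
    """Return reasons to escalate a PSP identifier for additional review."""
    candidate = psp.strip().upper()
    return [
        f"starts with identifier-scheme prefix {prefix}"
        for prefix in CONFUSING_PREFIXES
        if candidate.startswith(prefix)
    ]


print(escalation_reasons("10278"))             # []
print(escalation_reasons("PSDFR-ACPR-10278"))  # one reason (PSD)
print(escalation_reasons("FR-ACPR-10278"))     # [] - a prefix filter alone misses this
```

The last call shows why such a filter is an aid to reviewers and not a complete control on its own.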

Thanks for your suggestion. We are awaiting the final results of the investigation on the formatting of the numbers, which will be concluded EOD today. From what we can currently see already, there seems to be a wide variety of formats across jurisdictions for the PSD2 certificates alone. From the initial feedback, it seems like it is difficult to pin down the format requirements within each jurisdiction, so we have to go by representative samples for some jurisdictions. We have no clear indication of how the format may even change over time, so your suggestion is a welcome one. We are looking at a variety of technical controls to assist our validation specialists, and will keep you up to date with the progress of that review.

QA Contact: wthayer → bwilson
Whiteboard: [ca-compliance] → [ca-compliance] - Next Update - 15-June 2020
Flags: needinfo?(eva.vansteenberge)
Whiteboard: [ca-compliance] - Next Update - 15-June 2020 → [ca-compliance] Next Update 15-July 2020

Based on our investigation, we have created an internal format validator of the fields used during the validation process. This has been in use for two weeks. We have also tested it for all historic orders, and all those passed the format validator test. This was the expected outcome, since the compliance team has been reviewing these for all of the orders since the opening of this ticket.

We are not enforcing the format when the customers enter the information in an automated manner, since not all NCAs have yet published formal structure requirements. Therefore, we review them during the vetting phase where any divergence flagged by the format validator gets escalated to the compliance team for review. This has been working most successfully (one escalation so far was due to the number being published in the EUCLID database having a different format than the number published by the NCA on their own website). We are investigating the number format for other types of Organization Identifiers as well (e.g. NTR and VAT) and exploring the possibility of implementing similar checks for those.

Flags: needinfo?(eva.vansteenberge)

I intend to close this bug on or about next Friday, 24-July, unless there are reasons it should remain open.

Flags: needinfo?(bwilson)
Whiteboard: [ca-compliance] Next Update 15-July 2020 → [ca-compliance]
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Flags: needinfo?(bwilson)
Resolution: --- → FIXED