Closed Bug 1715024 Opened 3 years ago Closed 3 years ago

Sectigo: Misspellings in stateOrProvince or localityName fields

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: tim.callan, Assigned: tim.callan)

Details

(Whiteboard: [ca-compliance] [ov-misissuance])

Attachments

(1 file)

22.23 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
  1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.

We received a report in our SSL Abuse line from another CA reporting nine certificates with apparent misspellings in either the stateOrProvince or localityName field.

  2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.

All times Eastern Daylight Time

May 19, 5:32 am
Certificates reported to Sectigo’s abuse address.

10:08 am
Investigation completed and errors in all nine certificates acknowledged.

May 20, 11:09 am
The first of these certificates is revoked. One of these certificates expires.

May 23, 9:22 am
All remaining certificates revoked.

  3. Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.

These errors are typos, indicating human error. Two of these typos are in the stateOrProvince field. Those could not occur today due to our use of a lookup table for this field.

The others occur in the localityName field. Due to the variable nature of this field’s contents, there is no systematic block we can put in place for typos on this field.

  4. A summary of the problematic certificates. For each problem: number of certs, and the date the first and last certs with that problem were issued.

Nine certificates.
The earliest was issued October 21, 2019.
The last was issued June 12, 2020.

  5. The complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem.

https://crt.sh/?id=2523908929
https://crt.sh/?id=2927042667
https://crt.sh/?id=2394808803
https://crt.sh/?id=2103791444
https://crt.sh/?id=2835361204
https://crt.sh/?id=2348122084
https://crt.sh/?id=2938630670
https://crt.sh/?id=2021129023
https://crt.sh/?id=2032081476

  6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

These mistakes are all clear typos.

Two of them occur in the stateOrProvince field and the others in the localityName field. These certificates were issued prior to our implementation of a lookup table for the stateOrProvince field. At the time all these fields were dependent entirely on human input. It appears that these typos went unnoticed through the issuance process.

As a follow-on to bug #1710243 we are presently conducting an investigation to discover other certificates with names in the stateOrProvince field that do not match our accepted list of values. We expect to find some. We plan to add information to that bug (or this one) as we learn more.

  7. List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.

For the two certificates with errors in the stateOrProvince field, this problem is solved by the fact that we now use a lookup table.

The localityName field is tougher, simply because the variety of information that might correctly sit in this field is so great and there is no official source of potentially correct strings for this field.

We have an open ticket to prevent inclusion of the localityName field whenever stateOrProvince is present. It happens that this functionality would have saved the other seven certificates from misissuance.

The value in this change is that localityName is a high risk field due to the difficulty in putting programmatic checks on its contents. And there are many instances where a required field (stateOrProvince) is already doing the necessary work. In this case, eliminating localityName from a given certificate eliminates a potential repository of typos with no downside.
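A minimal sketch of the rule described in the open ticket, assuming the subject is held as a plain dictionary keyed by attribute name. The function name `build_subject` and the dictionary layout are hypothetical and are not taken from Sectigo's systems.

```python
def build_subject(fields: dict[str, str]) -> dict[str, str]:
    """Return the subject attributes that would go into the certificate.

    Hypothetical rule: when stateOrProvinceName is present, localityName
    is left out, so a typo in that free-text field cannot reach the
    certificate.
    """
    subject = dict(fields)  # copy so the caller's order data is untouched
    if subject.get("stateOrProvinceName"):
        subject.pop("localityName", None)
    return subject


# Illustrative order data only.
with_state = build_subject(
    {"countryName": "XX", "stateOrProvinceName": "Example State", "localityName": "Exmaple City"}
)
without_state = build_subject({"countryName": "XX", "localityName": "Example City"})
```

When no stateOrProvinceName is supplied the localityName is kept, which is the residual case the following paragraph acknowledges.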

We recognize, of course, that oftentimes the localityName field will be present even after this new functionality is released and that the typo vulnerability will still exist for these certificates. As I mentioned in bug #1712120, we have a roadmap item to build the strongest programmatic protections we can for the content of all fields of all public certificate types. Internally we have been referring to this as the “Guard Rails” project.

As part of the Guard Rails project, we will be examining localityName with an eye toward how to eliminate or mitigate errors in this field. While it may be impossible to completely eliminate the possibility of error, there are certain kinds of errors we can eliminate. For example, our roadmap includes the exclusion of pure numerical values in this field, which would have prevented the issuance errors behind bug #1714193. It will take a degree of creativity to fully fence in this particular field, and we imagine it will be an iterative process for us. We will start with the best we can figure out and take the learning process from there.
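As an illustration of the one guard rail named above, a check for purely numerical localityName values might look like the sketch below. The function name `is_purely_numeric` and the decision to ignore whitespace are assumptions; the actual roadmap item may define "pure numerical" differently.

```python
def is_purely_numeric(value: str) -> bool:
    """Return True when the value consists only of digits once whitespace is removed.

    An empty value returns False: it is absent rather than numeric.
    """
    compact = "".join(ch for ch in value if not ch.isspace())
    return compact.isdigit()
```

A value such as "District 9" passes this check because it contains letters, so this only removes one class of error and leaves ordinary typos untouched.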

Assignee: bwilson → tim.callan
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Whiteboard: [ca-compliance]

We have tried to provide a thorough writeup of this event, including what we can and plan to put in place to mitigate this kind of misissuance in the future. We of course are available to respond to the community's questions or comments.

Tim:

  • 2020-10-28: In Bug 1645686, Comment #61, Sectigo stated: "We anticipate full remediation of this issue by year’s end."
  • 2021-01-04: In Bug 1645686, Comment #82, Sectigo provided a "list of all ST field values which were contained in the Subject information of issued EV or OV certificates at the start of this bug", where "start of this bug" was 2020-06-14 (Bug 1645686, Comment #0)
  • 2021-02-08: In Bug 1645686, Comment #90, Sectigo stated "At this time we have revoked all certificates affected by this issue and have put in place a whitelist of allowed values for subject:stateOrProvinceName and subject:jurisdictionStateOrProvinceName based primarily on ISO 3166-2."
  • 2021-05-11: In Bug 1645686, Comment #96, Sectigo acknowledges that they did not scan for typos, and will provide a full report.
  • 2021-05-16: In Bug 1710243, Comment #14, Sectigo provides a full report, referencing the previous work completed 2021-02-08 as having addressed this.

This bug demonstrates two certificates, https://crt.sh/?id=2021129023 and https://crt.sh/?id=2032081476, that the work described in the above statements failed to catch.

Comment #0 states:

As a follow-on to bug #1710243 we are presently conducting an investigation to discover other certificates with names in the stateOrProvince field that do not match our accepted list of values. We expect to find some. We plan to add information to that bug (or this one) as we learn more.

Here's where I'm struggling to make sense: On 2021-02-08, Sectigo stated they implemented an allowlist that did not contain the spelling errors from the initial list they used. They also stated that they had revoked all certificates affected by this issue. However, the statements here in Comment #0 seem to suggest that Sectigo used one list to scan, changed that list to resolve issues, and did not rescan their corpus against that (corrected) list. Yet, in presenting the corrected list, they implied that they had done so.

I hope this makes it clear why there's confusion about how Sectigo failed to detect these issues (repeatedly, now), and why I'm trying to understand what process Sectigo has in place to ensure that, as it improves compliance in response to incidents, it actually re-examines its past corpus.

For example, it seems easy to imagine that, having completed the first scan (against the bespoke list), conducting a second scan (against the modified list) would be far less time-consuming and reveal things (such as these) that were overlooked.

What am I missing here?

Flags: needinfo?(tim.callan)

Here I need to echo my post in bug #1714193 comment #5. We are working on a considered reply to this question but have prioritized our focused attention on bug #1712188 for the short term. This post is to acknowledge the question and let you know a response is coming. Thanks for your understanding. For now I'm leaving needinfo open.

(In reply to Ryan Sleevi from comment #2)
Here is the process we went through.

In June 2020 bug 1645686 pointed out a series of stateOrProvince names including countries, continents, and words like “null.” This led to a long exploration of the problem which ultimately landed on the plan to use the ISO 3166-2 list as a basis, with some modifications as discussed, for going-forward issuance. This has the benefit of providing a list of discrete options that are previously vetted and that we can be highly confident are acceptable. It removes the risk of typos and the like. I believe this is what you reference as the “corrected” list.

As discussed previously in bug 1710243 comment 15, it is not necessary for a stateOrProvince name to be in our lookup table for it to be an acceptable name according to the BRs. There is nothing to suggest that these names on the list are the only possible names.

For example, let’s consider England.

In Subject and Issuer Names in certs, we use "C=GB", meaning the "United Kingdom of Great Britain and Northern Ireland." England is a country, but it's a subdivision of "C=GB" (which is sort of a "country of countries").

X.520 section 5.3.3 defines "State or Province Name" as follows (emphasis mine):

"The State or Province Name attribute type specifies a state or province. When used as a component of a directory name, it identifies a geographical subdivision in which the named object is physically located or with which it is associated in some other important way."

By that definition England can be a valid name for stateOrProvince under Great Britain, even though you won’t see it in our going-forward states list. We won’t issue new certificates with England in stateOrProvince, but in the event of an existing certificate with this value, it is not wrong and does not constitute misissuance.

With that idea in mind, in February we went through the entire list of differentials between our active certificate base and our new ISO-based table. We had to examine 14,730 lines and make a determination about whether or not this name 1) matched our going-forward name, 2) did not match our going-forward name but was an acceptable substitute, 3) did not match any going-forward name but was still an acceptable name within the definition given above, or 4) was not an acceptable stateOrProvince name and required revocation.

Most of the differences between our going-forward list and other acceptable names involve the substitution of a name in place of the name on our list for the same region. Reasons for those substitutions can include:

  • Character sets. We determined that Provence-Alpes-Cote D'Azur was acceptable for Provence-Alpes-Côte-d’Azur.
  • Translations. We determined that North Rhine-Westphalia was acceptable for Nordrhein-Westfalen.
  • Friendly names. We determined that Hlavní město Praha was acceptable for Praha, Hlavní mesto.
  • Spaces, hyphens, capitalization, and other incidental punctuation. We determined that Zuid Holland was acceptable for Zuid-Holland.
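The character-set and punctuation substitutions in the list above can be approximated mechanically. The following is a sketch only, assuming Unicode decomposition plus removal of non-alphanumeric characters is an acceptable stand-in for the reviewer's judgement; `normalize_name` and `same_region_name` are hypothetical names. Translations and friendly names would still need an alias table or a human decision.

```python
import unicodedata


def normalize_name(name: str) -> str:
    """Reduce a region name to a comparison key.

    NFKD splits accented letters into a base letter plus combining marks;
    keeping only alphanumeric characters then drops those marks along with
    spaces, hyphens and apostrophes. casefold() removes capitalization.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if ch.isalnum()).casefold()


def same_region_name(candidate: str, reference: str) -> bool:
    """True when the two names differ only by accents, case or punctuation."""
    return normalize_name(candidate) == normalize_name(reference)
```

Under this comparison "Zuid Holland" matches "Zuid-Holland", while a genuine typo such as "Singpapore" does not match "Singapore" and would fall through to manual review.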

We were cognizant that a review of 14,730 individual names, many of them in languages unfamiliar to us, would likely turn out imperfect. To provide transparency on the results of the process, we published the full list on January 14 in bug 1645686 comment 82. We were prepared to receive feedback from the community on this list, but it didn’t come in that form. Rather, it came in the form of reports of misissued certificates.

For example, “Singpapore” is certainly a typo. It crept through our review process. It’s sitting in the January 14 list at line 12195. Had anyone in the community discovered it there, they could have responded and we would surely have added this certificate to the revocation list. Instead, it came in subsequently as an inbound misissuance report, and we promptly responded then. Warminsko-Wazurskie and Malopolskia are less obvious to English speakers, but the same principle applies. Malopolskia is on line 10476 and Warminsko-Wazurskie is on line 10506.

This was a very difficult process for the reasons stated above. I have to suspect that these examples are all simply cases of someone making an error when going through a very arduous task. The good news is that our change to the new lookup table means this kind of error will not repeat itself in the future (and this goes back to our theme this year of replacing human judgement and human labor with codified, software-based behavior as described in the first few paragraphs of bug 1712188 comment 20). Closing off future misissuance is the bigger win than whether or not we were successful in swatting every last typo in our corpus of certificates.

Directly in response to bug 1710243 and now this bug we are going back to take another pass at finding and removing errors of this sort. This project is underway.

Our first step was a query for certificates that did not match the countryName in our table. We found a total of 29 country codes in 1511 certificates. We vetted them all and every country code was a country code legitimately in use for some kind of remote territory of a larger country with unusual legal status. Examples include Aruba, Gibraltar, and the Cook Islands. We determined that while we would not be using these country codes moving forward (by, for example, using GB rather than GI for Gibraltar residents), we nonetheless did not need to revoke any of these certificates. This is good news in that it validates that the quality of the work we did in the beginning of 2021 was high.

These country codes are:

  • AW - Aruba
  • BM - Bermuda
  • CK - Cook Islands
  • CW - Curacao
  • FO - Faroe Islands
  • GF - French Guiana
  • GG - Guernsey
  • GI - Gibraltar
  • GP - Guadeloupe
  • GU - Guam
  • IM - Isle of Man
  • JE - Jersey
  • KY - Cayman Islands
  • MF - Saint Martin
  • MO - Macao
  • MP - Northern Mariana Islands
  • MQ - Martinique
  • NC - New Caledonia
  • NF - Norfolk Island
  • PF - French Polynesia
  • PM - Saint Pierre and Miquelon
  • PR - Puerto Rico
  • RE - Réunion
  • SX - Sint Maarten
  • TC - Turks and Caicos Islands
  • VG - British Virgin Islands
  • VI - U.S. Virgin Islands
  • YT - Mayotte
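The countryName query described above amounts to a filter and a tally. A rough sketch follows, assuming the certificate base can be read as (certificate id, countryName) pairs; the sample rows and the name `unmatched_country_summary` are invented for illustration and do not reflect the real query or data.

```python
from collections import Counter

# Illustrative rows only: (certificate id, countryName).
SAMPLE_CERTS = [
    ("cert-1", "GB"),
    ("cert-2", "GI"),
    ("cert-3", "GI"),
    ("cert-4", "AW"),
    ("cert-5", "US"),
]


def unmatched_country_summary(certs, table_countries):
    """Count certificates per countryName that is absent from the lookup table."""
    return Counter(
        country for _cert_id, country in certs if country not in table_countries
    )


summary = unmatched_country_summary(SAMPLE_CERTS, {"GB", "US"})
distinct_codes = len(summary)           # number of country codes to vet
affected_certs = sum(summary.values())  # number of certificates carrying them
```

The two totals correspond to the kind of figures reported above (country codes and certificates); each code still has to be vetted by a person.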

Next we moved on to stateOrProvince. We are partway through that process now, which is where it’s been stalled for a few weeks for reasons discussed extensively in this and other presently open bugs. In this case our query returned a much bigger list of 1021 strings that were not a perfect match for the new records we are using. Nearly all the strings examined so far meet the criteria laid out above for names we won’t use going forward but that don’t constitute a misissued certificate. A few are friendly names that we will be adding to our table when we make it that far down the to-do list (see bug 1710243 comment 18 for a discussion of friendly names).

We are about 150 records into the 1000+ that need to be checked. I am hopeful we can get that done this week, but please don’t interpret that as a hard commit. We’ll revoke and report any that don’t meet the five criteria laid out above.

Flags: needinfo?(tim.callan)

Is the view that these four certificates also fit the following explanation?

I have to suspect that these examples are all simply cases of someone making an error when going through a very arduous task.

I agree that it's important and significant that the transition to an allowlist ensures greater validation care and awareness, but I'm still a bit concerned how a stateOrProvinceName of APAC could creep through onto that allowlist. That said, I'm aware the review is still in progress.

Flags: needinfo?(tim.callan)

As part of our investigation we uncovered 26 certificates with Gothenburg in the stateOrProvinceName field. We revoked these certificates on July 10.

https://crt.sh/?serial=12F7EDAA4738017B18DE4E7B8D059B28
https://crt.sh/?serial=07412089954F79FD0553A7EF28E9F220
https://crt.sh/?serial=5B297F5DDE277E4339068950857D8309
https://crt.sh/?serial=449C61618424806E3CFC1A948803673A
https://crt.sh/?serial=69150B004CF92519F340DDF5E24F696B
https://crt.sh/?serial=42F4097A10C55AB990592048F8B0BCFD
https://crt.sh/?serial=009E92F4B4D5379AD8136155947DA8A066
https://crt.sh/?serial=74E2CF4604217574933F615EA621D13C
https://crt.sh/?serial=4167899D21F6A97F490E08E8BFC99D7B
https://crt.sh/?serial=00DEEC8A9E475310329D7DB8C7670936AD
https://crt.sh/?serial=00DE638A7DE1070F20391B336EA1B152B7
https://crt.sh/?serial=7592A9CF1ECF6551B22D3260E8058B25
https://crt.sh/?serial=00A806421519C05CEF36261D5209C3870C
https://crt.sh/?serial=38974B8623451EAC0218E4056A1E56DF
https://crt.sh/?serial=1FF44BB36C7384F62DC78E522CEC056E
https://crt.sh/?serial=00DF8CB9DCEB0C93D70C72CC550B128602
https://crt.sh/?serial=13B199115C91B663477C662A041F5B02
https://crt.sh/?serial=00E68F180B34D5C0AE25F0C79210FF029A
https://crt.sh/?serial=33B0DB6732B62247851F9A25E778D92E
https://crt.sh/?serial=00B5ABF26CDE6E3C47DDF837B2E8A9C72C
https://crt.sh/?serial=04A4B89AA80F59ECF082897989C07D0F
https://crt.sh/?serial=00D8B8BF99EDD893D3F081542CD492233A
https://crt.sh/?serial=00E148813E20FBF313F69655277C4DA108
https://crt.sh/?serial=00F56BD7D03568CE01FC79F1ED346E4E6B
https://crt.sh/?serial=037C23373B38AA444B9EA5CE0D2997F3

(In reply to Tim Callan from comment #7)

As part of our investigation we uncovered 26 certificates with Gothenburg in the stateOrProvinceName field. We revoked these certificates on July 10.

Excuse me. Typo. 25 certificates.

(In reply to Ryan Sleevi from comment #6)

Is the view that these four certificates also fit the following explanation?

I have to suspect that these examples are all simply cases of someone making an error when going through a very arduous task.

... I'm still a bit concerned how a stateOrProvinceName of APAC could creep through onto that allowlist.

Yes, it is.

Apac is a state in Uganda from the ISO list. As explained in comment 4, we accepted substitutions of “Spaces, hyphens, capitalization, and other incidental punctuation” (emphasis mine). We believe the reviewer didn’t realize the country was New Zealand, not Uganda.

Flags: needinfo?(tim.callan)

(In reply to Tim Callan from comment #9)

Apac is a state in Uganda from the ISO list. As explained in comment 4, we accepted substitutions of “Spaces, hyphens, capitalization, and other incidental punctuation” (emphasis mine). We believe the reviewer didn’t realize the country was New Zealand, not Uganda.

Thanks. Wanting to double check I understand both this and the existing Sectigo controls.

If I'm understanding correctly, it sounds like either (a) the review of existing values was done in isolation of other fields (such as countryName) or (b) the reviewer made a mistake and only looked for the stateOrProvinceName value in the ISO lookup.

Could you clarify which it was, and talk a little more about how the review was conducted?

Where I'm going with this is trying to make sure that the current allowlists consider the fields as part of a logical group (e.g. stateOrProvinceName values depend on the countryName values) and that's technically enforced. Understanding a little more how the original review was conducted is trying to understand whether we have similar risks of values that are valid for one region but not another being allowed.

Flags: needinfo?(tim.callan)

(In reply to Ryan Sleevi from comment #10)

If I'm understanding correctly, it sounds like either (a) the review of existing values was done in isolation of other fields (such as countryName) or (b) the reviewer made a mistake and only looked for the stateOrProvinceName value in the ISO lookup.

Could you clarify which it was, and talk a little more about how the review was conducted?

It had to have been (b), as evidenced by our published list of existing stateOrProvinceName values at bug 1645686 comment 82, in which the stateOrProvinceName and countryName values are both present. We did not consider any other fields as we conducted this review, as we were looking at unique state-country combinations and not individual certificates.

Our process went as follows:

  • Run a query of the certificate base and build a table of all stateOrProvinceName values and their associated countryName values
  • Find everything that was a perfect match for one of our new going-forward strings, including countryName match. Mark the record No for revocation.
  • Go down the list looking at the other entries individually to determine if they meet the substitution criteria described in comment 4. If so, mark the record No for revocation. Per comment 9 we believe this is where the APAC error occurred.
  • Look at outstanding names to see if they meet the broader definition of “State or Province Name” as described in comment 4. If it does, mark the record No for revocation. This may require a small bit of research.
  • If a record does not meet any of the above criteria, mark it Yes for revocation.
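The first three steps of this process can be sketched as a small triage function. This is an illustration under assumptions: the table contents are sample values, the names `Decision`, `triage` and `SAMPLE_TABLE` are hypothetical, and the last two steps are collapsed into a single "manual review" outcome because they depend on human research.

```python
from enum import Enum


class Decision(Enum):
    KEEP_EXACT = "perfect match, no revocation"
    KEEP_SUBSTITUTE = "acceptable substitute, no revocation"
    MANUAL_REVIEW = "needs human research before a revocation decision"


# Illustrative going-forward table: countryName -> allowed stateOrProvinceName values.
SAMPLE_TABLE = {"UG": {"Apac"}, "NZ": {"Auckland"}}


def _key(name: str) -> str:
    """Comparison key that ignores case, spaces and punctuation."""
    return "".join(ch for ch in name if ch.isalnum()).casefold()


def triage(state: str, country: str, table: dict[str, set[str]]) -> Decision:
    """Classify one unique (stateOrProvinceName, countryName) pair."""
    allowed = table.get(country, set())
    if state in allowed:
        return Decision.KEEP_EXACT
    if _key(state) in {_key(name) for name in allowed}:
        return Decision.KEEP_SUBSTITUTE
    return Decision.MANUAL_REVIEW
```

Because the lookup is restricted to the record's own countryName, the pair ("APAC", "NZ") lands in manual review, whereas ("APAC", "UG") is treated as a capitalization variant of Apac.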

Where I'm going with this is trying to make sure that the current allowlists consider the fields as part of a logical group (e.g. stateOrProvinceName values depend on the countryName values) and that's technically enforced.

Our programmatic logic requires that the stateOrProvinceName value must match one of the entries from this order’s countryName value.
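For new issuance, a gate of this kind could be as strict as the sketch below: an exact match against the entries for the order's countryName, with no substitutions allowed. The exception class, function name and table contents are assumptions made for illustration and do not describe Sectigo's actual implementation.

```python
class SubjectValidationError(ValueError):
    """Raised when a subject value fails the issuance-time check."""


# Illustrative table only: countryName -> allowed stateOrProvinceName values.
ISSUANCE_TABLE = {"UG": {"Apac"}, "NZ": {"Auckland"}}


def require_allowed_state(country: str, state: str, table: dict[str, set[str]]) -> None:
    """Raise unless `state` is listed under `country` in the lookup table."""
    allowed = table.get(country)
    if allowed is None:
        raise SubjectValidationError(
            f"no stateOrProvinceName entries for countryName {country!r}"
        )
    if state not in allowed:
        raise SubjectValidationError(
            f"stateOrProvinceName {state!r} is not listed for countryName {country!r}"
        )
```

With this shape, a value that is valid for one country (Apac under UG) is rejected when paired with another (NZ).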

Flags: needinfo?(tim.callan)

We are monitoring this bug for additional questions or comments.

Comment #4 mentioned you were still in process of reviewing, which Comment #5 reiterated. It's unclear if Comment #7 is meant to represent the conclusion of that investigation; Comment #12 seems to suggest that, but I don't want to make any assumptions here, and thus had been waiting for conclusive updates.

Flags: needinfo?(tim.callan)
Flags: needinfo?(tim.callan)

(In reply to Ryan Sleevi from comment #13)
No, I just messed up. Yesterday I posted comment 12 before realizing I had an update to provide.

We have concluded our research into improper state or province names in our corpus of certificates. We used the process described in comment 11. That research discovered an additional 182 certificates with stateOrProvinceName values that did not meet our criteria. They were issued between May 17, 2019 and August 11, 2020.

We revoked these certificates on August 1. I have attached the list in attachment 9234454.

(In reply to Tim Callan from comment #15)

(In reply to Ryan Sleevi from comment #13)
No, I just messed up. Yesterday I posted comment 12 before realizing I had an update to provide.

I'm not trying to beat you up here - I know Sectigo has a lot going on - but please understand, I cannot help but feel these mistakes are becoming a pattern, and they're undermining the confidence in Sectigo's improvements. Specifically, in Bug 1563579, Comment #25, Sectigo made the following commitment:

c. Each proposed incident response and substantive comment is peer-reviewed for correctness and clarity by at least one other team member.

I realize that, when looking at this, the determination was likely "This wasn't a substantive comment, it didn't need peer review", but when I look at recent mistakes on Sectigo issues, I can't help but feel that there are more bugs evading this peer review than should be, and I'm concerned.

This seems like a good opportunity to revisit those processes and controls, and perhaps add an additional peer reviewer to ensure no substantive comment is expected, or ensure even the non-substantive comments get double checked.

When it comes to the clarity of the comment, Comment #15 mentions the process in Comment #11. My understanding of Comment #11 was that this was describing the previous, flawed process (Comment #10), but it sounds like you're saying you repeated that same process to detect new issues?

Where I'm going with this is wanting to understand whether or not we'll have another incident in 3-9 months, and whether the process in Comment #11 will need to be repeated again, at great expense and at the last minute. Comment #4 describes how you already went through this process (in June 2020), and now we've got more certificates that slipped through, detected by the process in July 2021. How can we be sure it's solved once and for all?

Flags: needinfo?(tim.callan)

(In reply to Ryan Sleevi from comment #16)

This seems like a good opportunity to revisit those processes and controls, and perhaps add an additional peer reviewer to ensure no substantive comment is expected, or ensure even the non-substantive comments get double checked.

You’re correct that I deemed this a non-substantive comment and therefore didn’t show it to anyone before I put it up. We do obtain peer review on the actual copy for all substantive comments to ensure we’re providing the right, accurate information. During our bi-weekly WebPKI Incident Response (WIR) calls we go over what we’re doing with each open bug, which sometimes is in the line of “post a check-in on Wednesday if nothing happens before then.” That is the form peer review takes for minor things. We maintain a task list, which we update in those meetings and operate from. After comment 16 and bug 1720744 comment 4, we’re keeping an eye on this process for ways to continue improving.

When it comes to the clarity of the comment, Comment #15 mentions the process in Comment #11. My understanding of Comment #11 was that this was describing the previous, flawed process (Comment #10), but it sounds like you're saying you repeated that same process to detect new issues?

The criteria for determining if the contents of this field are acceptable have not changed. However, there are meaningful differences between this year’s exercise and last year’s.

  1. Scope. Last year we reviewed nearly 15,000 state or province names while this year we reviewed 1,021. This is a vast difference when it comes to ensuring the quality of results.
  2. Peer review. This year we included a peer review step. After the initial reviewer (myself) finished the first pass, I then passed the results to other members of the WIR team to put more eyes on these decisions. We had an agenda item to discuss the results and the peer review in one of our working meetings.
  3. Multiple reviews. After the peer review step, the initial reviewer went down all the results one more time to confirm we still felt good about these decisions.

No equivalent of points 2 and 3 occurred during last year’s exercise. I didn’t get into these differences in my earlier post because I sought to identify the criteria we used to determine which names were okay and which were not, rather than to compare what we did differently on the two occasions.

How can we be sure it's solved once and for all?

It’s also important to remember that we now have a highly reliable method of populating this field for new issuance, which wasn’t the case throughout 2020. Bug 1710243 describes how today we populate and maintain our table of acceptable state and province names. We systematically ensure a match to this list, so the total set of at-risk certificates will continue to drop until it reaches zero in August 2022 when the last of the two-year certificates expires.

Flags: needinfo?(tim.callan)

Are there any more questions on this bug?

I've got no further questions regarding this incident, and am sending to Ben, although I note my reservations and concerns in Comment #16 and hope future incidents pay better attention to substance and detail.

Flags: needinfo?(bwilson)

Ben,

We believe this bug is ready to close. Are you satisfied to close this bug?

I'll close this bug on Friday, 27-Aug-2021.

Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Flags: needinfo?(bwilson)
Resolution: --- → FIXED
Product: NSS → CA Program
Whiteboard: [ca-compliance] → [ca-compliance] [ov-misissuance]