Closed Bug 1645686 Opened 1 year ago Closed 1 month ago

Sectigo: Lack of input validation in stateOrProvinceName

Categories

(NSS :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1710243

People

(Reporter: fozzie, Assigned: rich)

Details

(Whiteboard: [ca-compliance])

Attachments

(2 files)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:79.0) Gecko/20100101 Firefox/79.0

Steps to reproduce:

I have identified many mistakes in the stateOrProvinceName field in Sectigo's EV certificates. The affected certificates, including both leaf certificates and precertificates, are listed below (or leaf only: https://misissued.com/batch/99/).

"Default Province":

"null":

"Great Britain":

"United Kingdom":

"Europe":

"Russia":

This seems worryingly similar to bug 1548713 and bug 1551362.

All of the certificates are currently queued to be revoked according to Sectigo.

Actual results:

The certificates were misissued.

Expected results:

The certificates shouldn't have been issued with these mistakes. Sectigo should have better input validation on these fields.

While I was going to initially dupe this, I see two reasons to keep this as distinct (and there may be more, but these two stood out)

  • The certificates do not appear to have been previously identified by Sectigo, suggesting system-wide scans were inadequate or incomplete
    • Issue 1548713 stressed the importance of examining past incidents. Sectigo’s representative acknowledged this, indicating it would be continued as part of Issue 1575022, but appears to have used that shift in bugs to only focus on EV.
    • This makes me think that this is a very worryingly incomplete remediation of Issue 1548713
  • At least one certificate appears to have been issued after Sectigo gave assurances of a complete remediation.

Robin: You know what to do, and hopefully you will do so in a timely fashion.

Assignee: bwilson → Robin.Alden
Status: UNCONFIRMED → ASSIGNED
Type: defect → task
Ever confirmed: true
Flags: needinfo?(Robin.Alden)
Whiteboard: [ca-compliance]

A few more certificates:

"France" (20) - https://misissued.com/batch/103/
"Germany" (5) - https://misissued.com/batch/104/
"Ireland" (2) - https://misissued.com/batch/105/
"Belgium" (1) - https://crt.sh?q=2558339609
"Sweden" (1) - https://crt.sh?q=1706756419

I think it's likely there are more.

I acknowledge this bug report.

We are working to identify the factors that allowed these certificates to be issued.

I will provide an initial report later today with our actions and findings so far.

We are still examining systems and data to fully understand the issue. I will respond substantively later today.

(In reply to Robin Alden from comment #4)

I will respond substantively later today.

It is not "today" in any time zone anymore. With regard to bug 1563579, you might not want to commit yourself to deadlines which you are not able to meet.

Blocks: 1563579

Robin: Is there a reason why https://crt.sh/?id=2048475000&opt=cablint hasn't been revoked like the others? This has passed the 5 day requirement.

Another certificate with the "France" province (issued after I submitted this report): https://crt.sh/?id=2956083627

Robin: This is completely unacceptable, both in terms of lack of response and in terms of continuing to misissue. Please confirm that you have halted all EV issuance until Sectigo is able to provide a meaningful response and fully commit that no further certificates will have an error.

Flags: needinfo?(rob)
Flags: needinfo?(rich)
Flags: needinfo?(kwilson)
Flags: needinfo?(bwilson)

Maybe you could create a spreadsheet that would help us track when these certificates were issued, the cause of the error, when revoked, and what has been implemented to prevent these problems in the future?

Flags: needinfo?(bwilson)
Flags: needinfo?(kwilson)

Ryan: (In reply to Ryan Sleevi from comment #8)
The issue that permitted the issuance of these certificates has been fixed.
We corrected the issue that permitted the issuance of some of these certificates despite the work previously done.
I will follow up with the detail of what changed and how it related to the earlier bugs you referenced.

(In reply to george from comment #6)
https://crt.sh/?id=2048475000&opt=cablint has been revoked.
I will follow up in this ticket to explain what led to its delayed revocation.

(In reply to Ben Wilson from comment #10)
Ben, I will be happy to provide that information.

As well as addressing the above points, our next task on this ticket will be to prepare and share the report of all of the certificates affected by this issue. This will involve disclosing a list of certificates that are still to be revoked.

We are working through the body of unexpired and unrevoked certificates doing a comparison with the driving tables of country/state data (mostly from ISO 3166-2) looking for other misissued certificates that require revocation.
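The comparison described above could be sketched roughly as follows in Python. This is only an illustration: the record layout, the `DRIVING_TABLE` contents and the function name are assumptions made for the example, not Sectigo's actual data or tooling.

```python
from datetime import date

# Hypothetical driving table: country code -> accepted stateOrProvinceName values.
# A tiny illustrative subset; a real table would be far larger.
DRIVING_TABLE = {
    "FR": {"Île-de-France", "Bretagne"},
    "BE": {"Brussels-Capital"},
}

def find_suspect_certs(certs, today):
    """Return unexpired, unrevoked certs whose ST is not in the driving table."""
    suspects = []
    for cert in certs:
        if cert["revoked"] or cert["not_after"] < today:
            continue  # only the live population is examined
        st = cert.get("st")
        if st is None:
            continue  # ST omitted: nothing to compare
        # Countries missing from the table yield an empty set, so any ST is flagged.
        if st not in DRIVING_TABLE.get(cert["c"], set()):
            suspects.append(cert)  # candidate for manual review, not an automatic revocation
    return suspects

certs = [
    {"id": 1, "c": "FR", "st": "France", "not_after": date(2021, 1, 1), "revoked": False},
    {"id": 2, "c": "FR", "st": "Île-de-France", "not_after": date(2021, 1, 1), "revoked": False},
    {"id": 3, "c": "BE", "st": "Belgium", "not_after": date(2020, 1, 1), "revoked": False},
]
suspect_ids = [c["id"] for c in find_suspect_certs(certs, date(2020, 7, 1))]
print(suspect_ids)  # [1]
```

An exact-match pass like this is deliberately conservative, so its output would still need the manual review step before anything is treated as misissued.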

This is the first list of additional certificates for which the state field value does not match our driving tables, and we are working to revoke these additional certificates now. There are 78 certificates in this batch. https://misissued.com/batch/115/

We anticipate that it will take about a week to work through the whole list. After the automated matching, the list needs some manual grooming to separate misissued certificates from false positives caused by the automated matching being over-conservative (which is fine for new issuance).

I hope to be able to provide daily incremental lists of these additional misissuances related to this issue.

In regards to certificates like https://crt.sh/?id=1777593900 - are we counting "Scotland", "England" and "Wales" as misissued? There are a lot of certificates issued with these provinces (not just Sectigo).

A censys search for:
"England" - https://bit.ly/3hWZqjN
"Scotland" - https://bit.ly/2YtsebL
"Wales" - https://bit.ly/2YuJWvO

Regarding that certificate that george just mentioned (https://crt.sh/?id=1777593900), that has some other interesting issues:

Subject:
streetAddress = Suite L4a, 160 Dundee Street
streetAddress = Edinburgh
localityName = Midlothian

One Google verification checks 'Suite L4a, 160 Dundee Street' in localityName=Edinburgh, not localityName=Midlothian.

I may be wrong here, but this implies that there is one more validation issue, as subject:locality is not mentioned in bug 1575022 and the other open bugs for Sectigo either do not cover an incorrect subject or are explicitly about another subject field.

Robin, could you verify this?

We processed another portion of the list and found 3 more certificates. Details at https://misissued.com/batch/116/
We continue to work to complete this disclosure.

Regards

(In reply to george from comment #13)

In regards to certificates like https://crt.sh/?id=1777593900 - are we counting "Scotland", "England" and "Wales" as misissued? There are a lot of certificates issued with these provinces (not just Sectigo).

A censys search for:
"England" - https://bit.ly/3hWZqjN
"Scotland" - https://bit.ly/2YtsebL
"Wales" - https://bit.ly/2YuJWvO

Hi George,

For the issuance of new certificates our list of acceptable stateOrProvince values is based on ISO 3166-2 with some customization of the list where local knowledge is available to tailor the list to what is regularly and correctly in use in that country since those are the values we expect to find in the relevant QGISs.
‘Scotland’ is a sub-division of the UK in ISO 3166-2 and will be widely accepted on that basis, although we choose not to accept it for new certificates.

Regards

(In reply to Matthias from comment #14)

Regarding that certificate that george just mentioned (https://crt.sh/?id=1777593900), that has some other interesting issues:

Subject:
streetAddress = Suite L4a, 160 Dundee Street
streetAddress = Edinburgh
localityName = Midlothian

One Google verification checks 'Suite L4a, 160 Dundee Street' in localityName=Edinburgh, not localityName=Midlothian.

I may be wrong here, but this implies that there is one more validation issue, as subject:locality is not mentioned in bug 1575022 and the other open bugs for Sectigo either do not cover an incorrect subject or are explicitly about another subject field.

Robin, could you verify this?

Hi Matthias,

We're verifying this specific address issue and will come back later with more information.

Regards

(In reply to Iñigo from comment #17)

Hi Matthias,

We're verifying this specific address issue and will come back later with more information.

Regards

This certificate was revoked

(In reply to Iñigo from comment #18)

We're verifying this specific address issue and will come back later with more information.

Regards

This certificate was revoked

This doesn't sound like the additional information that was promised. Are we to expect more, or would it be better to open a separate incident report for this, since it sounds like a new, not-yet-addressed issue?

It also sounds like there may have been a compliance failure in timely revoking the certificate? Or is it just that Sectigo did not update this bug when revocation was performed?

(In reply to Ryan Sleevi from comment #19)

It also sounds like there may have been a compliance failure in timely revoking the certificate? Or is it just that Sectigo did not update this bug when revocation was performed?

I checked all of the certificates within the two batches they sent, and they were all revoked within the mandated 5 days. It seems Sectigo just didn't update this bug report to say so.

Flags: needinfo?(rob)

We have made further progress in processing the list of our previous EV issuance. We will follow up shortly with the next list of certificates that prove to be misissued and must be revoked because of invalid 'state' field values.

The "next list" that Robin mentioned in Comment 21 can be found here:
https://misissued.com/batch/172/

Flags: needinfo?(rich)

Are you just checking if there's a country name in the stateOrProvince field or are you checking all certificates against ISO 3166-2 for that country? The list you provided just seems to include certificates with either the country name or the alpha-2 code in the stateOrProvinceName field.

It also seems like Sectigo isn't revoking these certificates within 5 days after discovery: this certificate (https://crt.sh/?id=1420901014) is included, but it expired on June 24th. Can you explain if you're revoking these after discovery or after you've compiled them all as a list to share here?

We are setting the 5 day clock as they are discovered. Manual review is ongoing.

I should clarify that, because at the end of the day the 5 day clock will be set in batches. We are not strictly following ISO 3166-2, though that's a significant first pass filter. We have a number of people performing the manual review. The first line may look at X in the state field and say, "that's wrong, that should be revoked." That will then get kicked up to the next level for verification. That person may agree, or may say, "that's an Anglicized version of the ISO 3166-2 entry," or, "that's a historical subdivision which is still widely accepted as correct." So I guess it would be more correct to say that we are starting the 5 day clock as the discovery is confirmed, and that does tend to happen in batches, as neither the initial review nor the final review results are submitted up for action one by one.
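To make the batch-level clock concrete, here is a small Python sketch. It assumes the clock starts on the date a batch is confirmed and works at day granularity; the batch record, dates and function names are hypothetical and only illustrate the arithmetic.

```python
from datetime import date, timedelta

REVOCATION_WINDOW = timedelta(days=5)  # the 5 day window discussed in this bug

def revocation_deadline(confirmed_on):
    """Latest revocation date for a batch whose findings were confirmed on `confirmed_on`."""
    return confirmed_on + REVOCATION_WINDOW

def days_late(confirmed_on, revoked_on):
    """Whole days past the deadline (0 if revoked in time)."""
    return max(0, (revoked_on - revocation_deadline(confirmed_on)).days)

# Hypothetical batch: every certificate in it shares one confirmation date.
batch = {"confirmed_on": date(2020, 7, 1), "cert_ids": [101, 102, 103]}

deadline = revocation_deadline(batch["confirmed_on"])
print(deadline)                                            # 2020-07-06
print(days_late(batch["confirmed_on"], date(2020, 7, 5)))  # 0
print(days_late(batch["confirmed_on"], date(2020, 7, 7)))  # 1
```

Tracking the confirmation date per batch, rather than per certificate, is what makes the deadline for every certificate in a batch easy to audit afterwards.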

So when was the first certificate in this batch approved? I'm struggling to see how a certificate that expired on the 24th of June has made it into the batch if you're only scanning valid certificates?

(In reply to Rich Smith from comment #25)

The first line may look at X in the state field and say, "that's wrong, that should be revoked." That will then get kicked up to the next level for verification. That person may agree, or may say, "that's an Anglicized version of the ISO 3166-2 entry," or, "that's a historical subdivision which is still widely accepted as correct."

BR section 3.2.2.1:

The CA SHALL verify the identity and address of the Applicant using documentation provided by, or through communication with, at least one of the following:

  1. A government agency in the jurisdiction of the Applicant’s legal creation, existence, or recognition;
  2. A third party database that is periodically updated and considered a Reliable Data Source;
  3. A site visit by the CA or a third party who is acting as an agent for the CA; or
  4. An Attestation Letter.

Your statement implies that BR 3.2.2.1 option 2 is used, but can be overridden by a CA operator without going through option 1, 3 or 4, and that is quite concerning.

The BR seem to require the CA to use reasonably up-to-date information in all of the options 1 to 4 (that is, minus the optional 825 day validation data reuse clause in BR section 4.2.1). I do not believe that the statement "that's a historical subdivision widely accepted as correct" from any of the CA's operators is a valid alternative to any of the validation methods in 3.2.2.1.

Flags: needinfo?(rich)

(In response to george in comment 27)

So when was the first certificate in this batch approved? I'm struggling to see how a certificate that expired on the 24th of June has made it into the batch if you're only scanning valid certificates?

What is being reviewed is a pull of all issued EV certificates as of the day prior to us putting in a code change to only allow a specific set of approved entries into the State field. Because this review is being done on a specific report, pulled on a specific date (I'm sorry that I don't know the exact date), it's possible that, as in this case, some of the certificates have expired since that report was extracted. Robin has mostly been handling the posting of finalized batches, but as he is out of the office today, I took over. I wasn't sure whether or not to include the expired certificate but chose to do so in the interest of transparency.

(In response to Matthias in comment 28)

Your statement implies that BR 3.2.2.1 option 2 is used, but can be overridden by a CA operator without going through option 1, 3 or 4

You are assuming that all (any?) of the possible sources listed in 3.2.2.1 actually strictly follow ISO 3166-2. That is a faulty assumption, even when dealing with government agencies. There is a very small set of countries for which I have fairly high confidence that most if not all sources will match state/province information with what is listed in ISO 3166-2. There is another small subset, UK among them, for which I know things on the ground are all over the place and are just as likely not to have anything to do with what's officially listed in ISO 3166-2. Then add to that the use of completely different character sets such as Cyrillic, Kanji, etc. ISO utilizes UTF-8, but it's basically a subset of UTF-8 that I would characterize as Roman alphabet plus diacritical marks, not full native language/characters. Sectigo, I think correctly, allows full native language and native character set, but that's yet another bit that requires review from this dataset.

I do not believe that the statement "that's a historical subdivision widely accepted as correct"

Again, you are making a faulty assumption that just because one small group within a particular country's government decides to publish a list to ISO means that is the be-all and end-all. The reality is it's not. Even among different departments of the same government they don't necessarily adhere to that standard. You need look no further than UK Companies House for numerous illustrations. It's complicated, which is why the CA/B Forum has floated the idea of requiring CAs to adhere to ISO 3166-2 for the state/province field, but has not made the decision to make that a requirement. Requiring strict adherence to ISO 3166-2 would certainly make my job easier, but I'm not sure it would necessarily make everything more accurate.

I've just uploaded a new batch here:
https://misissued.com/batch/173/

(In reply to Rich Smith from comment #29)

Your statement implies that BR 3.2.2.1 option 2 is used, but can be overridden by a CA operator without going through option 1, 3 or 4

You are assuming that all (any?) of the possible sources listed in 3.2.2.1 actually strictly follow ISO 3166-2.

No, I'm not assuming conformance to ISO 3166-2 (although that would be great of course), but I'm pointing out that "the CA operator said it was OK" is not a validation method as specified in 3.2.2.1.

The BR make it clear that at least one of the 4 specified validation methods must be used for validating the subject matter. If you use an external database to validate subject matter, and this validation fails (quote from your validation methodology explanation: "That's wrong, that should be revoked"), you must find another validation method that complies with 3.2.2.1 and that does validate the subject matter, or you must revoke / not issue the certificate with the subject matter. A CA's operator that states "that's a historical subdivision which is still widely accepted as correct" is not a validation method according to 3.2.2.1, and that is what I wanted to point out. So, your current validation method as published does not (in my opinion) validate the subject matter in a way compatible with 3.2.2.1, as it allows certificates to pass without being validated by one of the 4 methods in 3.2.2.1 (be they ISO 3166-2 or any other 3rd party datasource).

Again, you are making a faulty assumption that just because one small group within a particular country's government decides to publish a list to ISO means that is the be-all and end-all. The reality is it's not. Even among different departments of the same government they don't necessarily adhere to that standard. You need look no further than UK Companies House for numerous illustrations. It's complicated, which is why the CA/B Forum has floated the idea of requiring CAs to adhere to ISO 3166-2 for the state/province field, but has not made the decision to make that a requirement. Requiring strict adherence to ISO 3166-2 would certainly make my job easier, but I'm not sure it would necessarily make everything more accurate.

I appreciate this, and agree that there's no perfect way to do validation. But the way you've currently explained Sectigo's validation practices is explicitly wrong, and not for the reason of not fitting ISO 3166-2 to a T, but for skipping the requirement of validating the relevant subject matter using at least one of the validation methods in BR section 3.2.2.1.

https://misissued.com/batch/173/

19 certificates in this batch were included but should not have been. These certificates are not misissued and will not be revoked.
crt.sh IDs:
2545122841
2172246239
1590062998
1547840241
1920027985
1926187497
1926187418
1467043405
1305971524
1414783985
1930240450
1451202666
1763737769
929848277
977892360
2034981669
643103060
807750636
2677710546

Flags: needinfo?(rich)

Can you explain why you don't consider "Brussel" to be a misissuance? I can see "Brussels-Capital" under ISO 3166-2 but not "Brussel" unless it's accepted as a region somewhere else. Although it's entirely possible I'm just missing something here.

(In reply to george from comment #33)

Can you explain why you don't consider "Brussel" to be a misissuance? I can see "Brussels-Capital" under ISO 3166-2 but not "Brussel" unless it's accepted as a region somewhere else. Although it's entirely possible I'm just missing something here.

It was pointed out that Brussel is the Dutch language version of Brussels.

I understand, although it is my understanding that the majority of Brussels is French speaking. As this is loosely defined in the BRs I think it should be fine but maybe just use "Brussels-Capital" for certificates issued in the future to avoid confusion?

I've just uploaded a new batch here:
https://misissued.com/batch/174

(In reply to Robin Alden from comment #11)

(In reply to george from comment #6)
https://crt.sh/?id=2048475000&opt=cablint has been revoked.
I will follow up in this ticket to explain what led to its delayed revocation.

Just to backtrack a bit here, do you have that follow up now, or when can we expect it?

Never mind, those are the incidents Rich explained. Please ignore, sorry!

I've uploaded a new batch here:
https://misissued.com/batch/175/

At this point we believe we've found and revoked all EV certificates containing invalid information in the ST field.

There were two root causes:

  1. A longstanding system requirement that both L and ST fields be populated. The frontline workaround for this system requirement (I hesitate to call it a bug because at the time it was implemented I believe it was both intentional, and well-intentioned, road to Hades notwithstanding) was, in most cases, to repeat either L or C field data in the ST field if there was no valid ST information to populate.
  2. Basic human error: typos, misspellings and, in some cases, information in the ST field that simply should not have passed validation.

Our respective remediation for these issues is:

  1. Our systems have been modified such that it is no longer required that both L and ST fields be populated, so if ST should be blank it will be blank.
  2. We have instituted a fixed list of acceptable ST fields which will allow no other information to populate into the field.
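As an illustration of the two remediations above, a minimal Python sketch might look like this. The `ACCEPTED_ST` contents, the function name and the dictionary-based subject are assumptions for the example, not a description of Sectigo's actual systems.

```python
# Hypothetical fixed list: country code -> ST values accepted for new certificates.
ACCEPTED_ST = {
    "US": {"California", "New York"},
    "FR": {"Île-de-France"},
}

def build_subject(c, l, st=None):
    """Assemble subject fields; ST is optional and must come from the fixed list."""
    subject = {"C": c, "L": l}
    if st is None or st.strip() == "":
        # Remediation 1: no valid ST means the field is left out,
        # rather than being filled with a copy of L or C.
        return subject
    if st not in ACCEPTED_ST.get(c, set()):
        # Remediation 2: anything outside the fixed list is rejected.
        raise ValueError(f"stateOrProvinceName {st!r} is not accepted for country {c!r}")
    subject["ST"] = st
    return subject

print(build_subject("DK", "Copenhagen"))             # {'C': 'DK', 'L': 'Copenhagen'}
print(build_subject("FR", "Paris", "Île-de-France"))
try:
    build_subject("FR", "Paris", "France")
except ValueError as err:
    print(err)
```

Rejecting at issuance time, instead of silently correcting the value, keeps a validation agent in the loop whenever the supplied ST does not match the list.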

In response to george in comment 37

I'm chasing up Robin for this.

In response to george, comment #6:

We have reviewed all data we have regarding this particular certificate and can determine no reason for the delay save human error. At the time that this bug was opened staff had to manually revoke each certificate separately. Near as I can tell this one just got missed as the agent was running through the list. As mentioned previously in another bug (sorry, I can't find which at the moment) we have put a system in place which allows for managing the full batch of certificates in cases where there is more than one, in order to try to insure against this sort of oversight in the future.

Thanks Rich.

I have gone through some Censys queries for stateOrProvinceNames in various countries and have found all of the incorrect ones have been revoked and I've been unable to identify any more.

We've now taken what we learned from processing and revoking these EV certificates, made further improvements to our ST field filtering and applied that to OV. In this process we've uncovered an initial list of OV certificates which may require revocation. A number of certs which are on this list were discovered in parallel by another CA, who notified us.

There are three batches to report now, the first of which contains 604 certificates and has been revoked. There are two other batches which for reasons I will explain in a separate bug/incident report were not revoked w/in the 5 day window required by the BR. The first of those, a batch of 5, was revoked today, 1 day later than the 5 days allowable by the BR. The final is a batch of 124 certificates which is in progress. I’ll post these all to misissued.com tomorrow and provide links as I’ve not had a chance to pull all of the crt.sh IDs together.

We are working through the rest of the OV list we've generated internally in a similar manner to which we reviewed the EV certificates, so we expect to be posting additional batches of OV certs as they are finalized.

The batches mentioned in comment 45 are:

Batch 1, all revoked w/in the timeframe allowed by the BR
Batch 2, revoked 1 day beyond the 5 day window allowed by the BR
Batch 3, to be revoked on or before Sept. 20, 2020

A bug with incident report covering these latter two batches can be found here

During a very brief Censys searching session I have discovered at least one OV certificate that had been issued since Robin stated here on the 22nd of June in comment 11 that this issue has been fixed. When did Sectigo apply this fix to OV certificates?

https://crt.sh/?q=94ca7757e02ee3eb98cb91a0430962e899f0af51190007bed768f8f9ca4dc464

As I mentioned in Comment #45 we have applied what we learned from looking at our EV population to OV. We have discovered a suspect list of up to ~21,000 OV certificates that may have invalid state or province in the Subject. We are still reviewing these certificates to verify that they do in fact require revocation, so anticipate that the final number will be somewhat lower than this. Nonetheless this represents a far higher number of certificates than we will be able to revoke and replace w/in the 5 days required, even accounting for the fact that final verification of various batches will be staggered somewhat as it was for the EV population. We will follow up with a target date for full revocation as we better understand the number and characteristics of the certificates in question.

I understand that the initial scan for this bug was only EV certificates. However when you applied the patch to only allow certain values in the ST field was this only for EV certificates or OV as well? If it was only for EV when was the patch applied to OV?

Sectigo has issued 5 intermediate certificates with "Singapore" in the stateOrProvinceName field. Looking at ISO 3166-2, the subdivisions in Singapore are "Central", "North East", "North West", "South East", "South West". Does Sectigo consider "Singapore" to be valid in this field?

In response to george, comment 49;

These changes required additional development work to fully implement for OV, and this deployment was completed on September 17, 2020.

In response to george in comment 50

george, SG-01 in ISO 3166-2 is "Central Singapore", not just "Central". While the code we have now deployed no longer allows just "Singapore" we have in the past and continue to consider this to be an acceptable value to denote ISO 3166-2 SG-01 and we will not be taking any action on either leaf or intermediate certificates in which we previously permitted this value.

I will also note for the record that there are other values in the Subject:stateOrProvinceName field of previously issued certificates which are neither ISO 3166-2 values nor contained w/in our current list of acceptable values for certificates going forward, but that nevertheless we do not consider inaccurate/incorrect, and therefore will also not be taking action upon.

The initial list of these values was generated during our review of EV certificates, and is as yet incomplete as we are adding to it based upon additional findings as we process the OV certificates. We will be happy to post the final list publicly once we’ve worked through this entire issue. We anticipate having a firmer timeline for full resolution of this issue by the end of next week.

I think there’s a pretty large difference between “Central Singapore” and “Singapore” especially when the country is named Singapore. This seems to be close to “France” vs “Île-de-France” which Sectigo has determined was misissued in the past.

Assignee: Robin.Alden → rich

We continue to work on the review of the OV certificates mentioned in comment 48. We hope to have this issue fully resolved by the end of the year and I’ll be publishing the first batch of confirmed mis-issued certificates early next week.

The first confirmed batch of OV certs with bad ST fields has been posted here.

I'm still working on finalizing the next batch to report but we've hit the 5 day mark on the batch I posted in Comment 56. As I predicted in Comment 48, we will not be able to get these certificates revoked w/in the 5 days required by the BR. Ben, I would prefer to continue to deal with all aspects of this issue on this bug because I think Sectigo, Mozilla and the community at large will all be better able to follow the issue if everything is in one place, nevertheless if you prefer I will open a new bug for this. Please let me know what you'd like me to do.

Flags: needinfo?(bwilson)

What caused the five day miss? My understanding is that timeline is strict and there are no exceptions other than potential covid delays. We had to do a big process when we missed for covid reasons. I think there needs to be more explanation on why there was a miss, similar to the expectation on other bugs.

We've added a new batch of OV certs w/confirmed invalid ST fields here.

In response to Jeremy Rowley in Comment 58:
That's a good question and I'll answer it in the incident report I'll be posting in the next few days.

  1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.

In investigating this bug we discovered a large number of certificates which had been issued with invalid data contained in the Subject:stateOrProvinceName (ST) field. EV certificates were reviewed and corrected first and, while the full process took time, the finalized batches were each small enough, due to the smaller population of EV certificates, that we were able to notify subscribers and get corrected certificates out w/in the 5 day revocation window stipulated by the Baseline Requirements and our CPS. The OV population, on the other hand, is much larger and, as such, the finalized batches, especially the first, are too large for our staff to even be able to notify all Subscribers w/in the 5 day window.

  2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.

After completing remediation of the EV certificate population we began looking at our OV certificate base. After applying filtering based upon the lessons learned in processing EV, on September 27th we determined that there were ~21k OV certificates with invalid ST field information. We are still working to finalize the review and the total number has been adjusted down at this point to somewhere between 15k and 18k. We posted the first batch of 7,660 confirmed OVs with invalid ST field data on October 15th.

  3. Whether your CA has stopped, or has not yet stopped, certificate issuance or the process giving rise to the problem or incident. A statement that you have stopped will be considered a pledge to the community; a statement that you have not stopped requires an explanation.

As of September 17, 2020 all certificates issued will only contain a value in the ST field which is listed in a data table of values we have determined to be acceptable. These values are largely determined by ISO 3166-2 with the addition of some values determined to be acceptable and commonly used w/in certain jurisdictions, as well as local language/UTF-8 character variants of ISO 3166-2 entries, Korean Hangul character variants being one example.
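One way a table like this could handle local-language and accented variants is to normalize values before lookup. The Python sketch below is an assumption about how such matching might be done; the table entries, the spellings chosen and the function names are illustrative only and are not taken from Sectigo's data table.

```python
import unicodedata

def normalize(value):
    """NFC-normalize, trim and case-fold so equivalent spellings compare equal."""
    return unicodedata.normalize("NFC", value).strip().casefold()

# Hypothetical table: accepted spelling -> ISO 3166-2 code.
# Several spellings, including a Hangul variant, may map to the same code.
ACCEPTED_VARIANTS = {
    "Seoul": "KR-11",
    "서울특별시": "KR-11",
    "Île-de-France": "FR-IDF",
}
LOOKUP = {normalize(name): code for name, code in ACCEPTED_VARIANTS.items()}

def lookup_subdivision(st):
    """Return the ISO 3166-2 code for an accepted ST value, or None."""
    return LOOKUP.get(normalize(st))

decomposed = "I\u0302le-de-France"   # "I" followed by a combining circumflex
print(lookup_subdivision(decomposed))    # FR-IDF
print(lookup_subdivision("서울특별시"))   # KR-11
print(lookup_subdivision("France"))      # None
```

Normalizing only for comparison leaves open which spelling ends up in the certificate; a stricter design would emit the table's canonical form instead of the applicant's input.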

  4. In a case involving certificates, a summary of the problematic certificates. For each problem: the number of certificates, and the date the first and last certificates with that problem were issued. In other incidents that do not involve enumerating the affected certificates (e.g. OCSP failures, audit findings, delayed responses, etc.), please provide other similar statistics, aggregates, and a summary for each type of problem identified. This will help us measure the severity of each problem.

Batches of OV certificates are being posted to misissued.com and the links to those batches provided in this bug as we finalize them.

  1. In a case involving certificates, the complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem. In other cases not involving a review of affected certificates, please provide other similar, relevant specifics, if any.

See #4

  1. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

The ultimate root causes of this issue were stated above in Comment 41.

Regarding the failure to revoke within the 5 days allowed by the BR, the number of OV certificates involved are too great to even allow our agents to send out notification to all the affected Subscribers within the 5 day window. In a vast majority of these cases the invalid ST data consists of either a repeat of the Subject:localityName (L) or Subject:countryName (C) field data and was caused by the bug in our systems described in Comment 41(a) above. Some examples: L = Copenhagen, ST = Denmark, C = DK, or; L = 250 Euston Road, ST = London, NW1 2AF, C = GB. That second one is completely botched yet completely correct at the same time. While this is technically inaccurate, we think any reasonable person would agree that it is not misleading. As such we made a judgement call that the community at large, including the Relying Parties would not be best served if we revoked all these certificates thereby bringing down all these web sites, for which there is no evidence of fraud, phishing or any other intent to mislead, without allowing for time to notify the Subscribers and allow them time to obtain a replacement certificate, whether from Sectigo or another provider.

  1. List of steps your CA is taking to resolve the situation and ensure that such situation or incident will not be repeated in the future, accompanied with a binding timeline of when your CA expects to accomplish each of these remediation steps.

We have made sure that all certificates we issue contain only ST data values which are allowed by a whitelist encoded into our systems. Our staff is working as quickly as possible to finalize identification of the problematic OV certificates and notify the Subscribers and get the affected certificates revoked. We anticipate full remediation of this issue by year’s end.

Regarding the failure to revoke within the 5 days allowed by the BR, the number of OV certificates involved are too great to even allow our agents to send out notification to all the affected Subscribers within the 5 day window.

The obligation to timely revoke certificates stands regardless of the CA’s ability to communicate with its customers prior to compliance deadlines. Additionally, the inability to notify customers of impending revocation via e-mail has not been previously mentioned by a CA as a reason to delay revocation, so it would be beneficial to the community to highlight the challenges currently being experienced that make e-mail notification within 5 days exceptionally onerous.

L = 250 Euston Road, ST = London, NW1 2AF, C = GB. That second one is completely botched yet completely correct at the same time.

BR section 7.1.4.2.2 (e) clearly defines the acceptable values that can be encoded in a localityName. Unsurprisingly, as the name “localityName” would suggest, only locality names (or state/province names in the case of XX country code) are allowed and encoding a street address or any other data that should be included in another subject field is not compliant or “completely correct”.

As such we made a judgement call that the community at large, including the Relying Parties would not be best served if we revoked all these certificates thereby bringing down all these web sites, for which there is no evidence of fraud, phishing or any other intent to mislead, without allowing for time to notify the Subscribers and allow them time to obtain a replacement certificate, whether from Sectigo or another provider.

Mozilla’s incident reporting guidelines regarding revocation delays (https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation) explicitly state that deeming non-compliant certificates “not a security risk” is not acceptable justification for delayed revocation. Can you provide more comprehensive rationale on the decision to not revoke the problematic certificates within the mandated timeframe in a separate incident report to be in alignment with Mozilla incident reporting expectations and established precedent?

Batch 3 of OV with bad ST fields here.

The OV Batch 2 certificates have passed the 5-day deadline. They have not all been revoked, for the same reasons stated in the incident report regarding OV batch 1 here.

In response to Corey Bonnell, comment 62:

the inability to notify customers of impending revocation via e-mail has not been previously mentioned by a CA as a reason to delay revocation, so it would be beneficial to the community to highlight the challenges currently being experienced that make e-mail notification within 5 days exceptionally onerous.

The review and notifications are all being done mostly manually, which means we can't just hit a button and send out all the notifications at once. The reason for this is that we now have an ST whitelist built into our system to ensure this doesn't happen again; however, that does not mean that we are automatically revoking every certificate that has a value different from the whitelist. As an example, we are not allowing abbreviations for most ST entries going forward, but if an existing certificate has an acceptable abbreviation it does not require revocation.

Can you provide more comprehensive rationale on the decision to not revoke the problematic certificates within the mandated timeframe in a separate incident report to be in alignment with Mozilla incident reporting expectations and established precedent?

I stated my rationale for keeping the incident response in this bug, and asked Ben (and by implication any of the other peers) to let me know if he would rather have me open a new bug in Comment 57.

The lack of a way to automatically send out notification emails doesn't seem limited to this bug. Will Sectigo look into a way of automatically sending out notifications for revocation?

In response to george comment 65

This is one of the things we're trying to address in our revocation process improvements referenced in bug 1648717.

Batch 4 of OV certificates with bad ST fields here.

The Batch 3 certificates, referenced here, have not all been revoked, for the same reasons stated previously for batches 1 and 2.

(In reply to Rich Smith from comment #67)

Batch 4 of OV certificates with bad ST fields here.

None of my clients who have certificates listed in Batch #4 have been notified by Sectigo.
When will notices be issued?

In response to Hank Nussbacher in comment 69:

Batch 4 is now beyond the 5 day revocation deadline. Hank, as I mentioned, getting the notifications out is the thing keeping us from getting these all revoked w/in 5 days. That said, if you see certs that you can take action on, by all means we encourage you to reach out to our customer service team and get them sorted. We're making good progress and revoking certificates from this issue nearly every day. To this point more than half of all certificates confirmed and disclosed have been revoked, but we welcome any help you or any other of our Subscribers who see certificates belonging to them might give in moving their certificates along in the process.

New batch posted here.

6th and final batch posted here.
Final total of OV certs with invalid ST fields came to just under 14k so substantially fewer than my initial posted estimate.

Progress update:
As mentioned above, we found just under 14k certificates with invalid ST values, primarily a repeat of either the L or C field information. As of today 10,444 of these certificates have been resolved. The remaining 3,329 certificates still outstanding will all either expire or be revoked before January 1.

Update: Currently there are 10759 resolved. By the end of the day we should have 11677 resolved and 2110 still outstanding.

We have no update at this time.

We have no update at this time.

We have no update at this time.

All certificates with invalid ST fields were revoked or expired as of December 31, 2020. This issue has been fully resolved and a whitelist of valid ST fields has been put in place to prevent future errors. I'll post a more detailed post mortem soon.

Sorry for the delay. I was off for two weeks during the holidays and have been playing catch up. Post mortem write up is about half done. Hope to post by the end of the week.

Flags: needinfo?(bwilson)

I'm posting two files to this bug.

  1. List of values which our system currently accepts for subject:stateOrProvinceName and subject:joiStateOrProvinceName:
    country_state_abbreviations_20210103.csv

  2. List of all ST field values which were contained in the Subject information of issued EV or OV certificates at the start of this bug and the final determination as to whether or not that value required revocation:
    ReviewedSTFinal.csv

These files constitute what I believe to be the last open items required in response to this incident.

So, is there an additional submission (e.g. postmortem mentioned above) that I should be waiting for? Thanks.

Flags: needinfo?(Robin.Alden) → needinfo?(rich)

(In reply to Rich Smith from comment #70)

Batch 4 is now beyond the 5 day revocation deadline. Hank, as I mentioned, getting the notifications out is the thing keeping us from getting these all revoked w/in 5 days.

Sectigo failed to meet their obligation under the BRs to revoke multiple times as documented in this bug. I think it would be useful for Sectigo to open a separate bug with an incident report that details the difficulties they have revoking certificates and what steps they will take to improve. That way any timelines (such as for improving subscriber notifications as mentioned in comment #70) can be tracked there as that seems like a separate issue.

In response to Ben Wilson, comment 83:

The lessons learned in this exercise fell roughly into two categories.

The first related to subscribers’ certificate agility:

  • While most subscribers could handle swapping out their certificates in the designated five-day period, a nontrivial subset represented that it would be difficult or impossible to do so. Their stated reasons included (but may not have been limited to):
    • A third party has to make this change and isn’t available in the stated time period
    • The certificates are located in physically remote places, such as utility infrastructure, meaning accessing a large number of them in a short time period is infeasible
    • Only one or a few individuals can make the change and they aren’t available in the time period
  • There is, of course, an important difference between what is impossible for the Subscriber and what is merely inconvenient. In the case of most of these complainers, it turned out to be the latter and not the former.
  • Some Subscribers represented that failure of these systems would be a danger to life and limb. This puts the CA in a difficult situation as ostensibly the company can only remain compliant with the BRs by disregarding the health and safety of human beings. It is impossible as a practical matter for the CA to determine the accuracy of these claims or how severe and extensive that risk is. Of course, it was the Subscriber that actually put health and safety at risk by designing and operating a system that couldn’t tolerate the revocation of some or even all of its public certificates. Nonetheless, as a practical matter, the CA is deciding whether or not to meet its five-day deadline knowing that it might be endangering people by doing so. We were told that revocation before replacement could occur would disable (among other things):
    • Air traffic control systems
    • Commuter train operating systems
    • Natural gas infrastructure, potentially leaving thousands of households without heat in the winter
    • Medical systems directly contributing to the care of patients
    • COVID-19 response command and control, information sharing, etc.
    • Transportation, logistics, and supply chain for medical personal protective gear

On the bright side, these conversations did give us an opportunity to stress to these Subscribers the untenable fragility of their current methods; it’s hard to determine what action they ultimately will take. It’s also important to note that no mechanism exists by which CAs can monitor or enforce that Subscribers have deployed their certificates properly or with sufficient agility to handle unexpected revocations.

The other lessons learned from this experience had mostly to do with our internal revocation processing. This is an area we've already been looking at and making improvements upon, but trying to manage ~14k revocations revealed some weaknesses we hadn't previously thought about.

  • Our bulk revocation platform needs more work
    • The bulk revocation portal is great for revoking large numbers of certificates, but aside from loading a batch and setting the revocation time, there isn’t much else there to help manage the overall process.
    • We can load a large number of certificates easily, but if mistakes are made in the uploading of a batch, corrections are extremely difficult and time consuming.
    • We do not yet have the ability to automate notifications from the bulk revocation portal.
  • Time calculations for 24-hour or 5-day revocation timeframes are still partially manual on both bulk and single cert revocation platforms, and thus prone to error.
  • Getting data from a problem report into both revocation portals is via manual data entry and so prone to error.

These are some of the issues we came across in terms of our overall revocation processing functions. I’m sure there are others, and we’ll be adding more detail and following up on this subject more in bug 1648717.
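As a rough illustration of how the 24-hour and 5-day time calculations mentioned above could be automated rather than worked out by hand, here is a minimal sketch. The function name `revocation_deadline` and the two constants are assumptions for this example. Which event starts the clock is defined by the BRs and is not modeled here; the caller is assumed to supply that timestamp.

```python
from datetime import datetime, timedelta, timezone

# Window lengths in hours for the two timeframes discussed in this bug.
WINDOW_24_HOURS = 24
WINDOW_5_DAYS = 5 * 24


def revocation_deadline(clock_start: datetime, window_hours: int) -> datetime:
    """Return the UTC deadline `window_hours` after `clock_start`.

    Naive datetimes are rejected so that a local-time value is never
    silently treated as UTC.
    """
    if clock_start.tzinfo is None:
        raise ValueError("clock_start must be timezone-aware")
    return clock_start.astimezone(timezone.utc) + timedelta(hours=window_hours)


if __name__ == "__main__":
    start = datetime(2020, 10, 15, 12, 0, tzinfo=timezone.utc)
    print(revocation_deadline(start, WINDOW_5_DAYS).isoformat())
```

Normalizing to UTC before adding the window avoids daylight-saving and local-time ambiguity, which is one common source of error when such deadlines are computed manually.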

Flags: needinfo?(rich)

(In reply to Rich Smith from comment #85)

These are some of the issues we came across in terms of our overall revocation processing functions. I’m sure there are others, and we’ll be adding more detail and following up on this subject more in bug 1648717.

Bug 1648717 was about a failure to provide a preliminary report to the reporter. The preliminary report could be sent before the CA has decided to revoke, and it would still need to be sent if revocation was not necessary. The issues related to subscriber agility and subscriber notifications don't seem the same as the issue in bug 1648717. The provided timeline in that bug actually shows that the subscriber was notified properly; the issue was communication with the reporter.

In response to Mathew Hodson comment #85:

Mathew, you're not wrong that bug 1648717 (https://bugzilla.mozilla.org/show_bug.cgi?id=1648717) started with a problem responding to the reporting party. However, it has since morphed into a topic on our overall revocation processing systems, policies and procedures, and indeed these are all inter-related both from a staffing and systems stand-point. As such we are looking at it and trying to deal with it from a holistic frame of reference. Both the initial report on bug 1648717 as well as subsequent comments, and the processing of revocations for this bug have shown that we have multiple points of weakness/failure in the entire problem reporting and revocation processing system that we need to address and improve upon. From our side it seems more logical, at least at this stage, to deal with these by looking at it as one holistic, systemic issue, and we believe that it will be easier for us to communicate progress with the community in the same way, so on the same bug. It's possible that either the Mozilla peers may disagree with this approach, or that as we progress we may see that it might be better to split topics apart and if either of those are or become true we are absolutely willing to separate things out.

(In response to Rich Smith from comment #85)

We have two more lessons learned we can add here.

With a large certificate base, scouring it for a specific type of error can actually be very difficult. In this case we wrote a script that pulled a list of suspect certificates, but then we had to vet them by human eye to remove false positives. This was an essential step as we started with a suspect list of about 24,000 certs and wound up with about 14,000 that actually had an issue with the State field. In other words, we had a false positive rate of about 40%. As you may imagine, this was a brutal exercise that took a great deal of time to complete. Unfortunately we couldn’t come up with a way to cut out that human verification step when taking on a purge of this nature. One implication is that there is an enormous difference between a process to uncover SOME or even MOST of the certificates with a certain type of irregularity vs. the BRs and a process to uncover ALL of them. The latter is much, much harder than the former.
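To make the numbers above concrete: 24,000 flagged minus 14,000 confirmed leaves 10,000 false positives, and 10,000 / 24,000 is roughly 41.7%, consistent with the "about 40%" figure. The sketch below pairs that arithmetic with a simple pre-filter based on the pattern reported earlier in this bug (ST repeating the L value or the country). It is not the script that was actually used; `COUNTRY_NAMES`, `is_suspect`, and `false_positive_share` are hypothetical names, and the country-name table is an illustrative subset.

```python
# Illustrative subset mapping country codes to names; an assumption for this sketch.
COUNTRY_NAMES = {"DK": "Denmark", "GB": "United Kingdom"}


def is_suspect(subject: dict) -> bool:
    """Flag a subject whose ST repeats its locality or its country name.

    A flag only marks the certificate for human review; it does not by
    itself establish that the ST value is invalid.
    """
    st = subject.get("ST", "").strip().casefold()
    if not st:
        return False
    locality = subject.get("L", "").strip().casefold()
    country_name = COUNTRY_NAMES.get(subject.get("C", ""), "").casefold()
    return st == locality or st == country_name


def false_positive_share(flagged: int, confirmed: int) -> float:
    """Share of flagged items that review did not confirm."""
    return (flagged - confirmed) / flagged


if __name__ == "__main__":
    print(is_suspect({"L": "Copenhagen", "ST": "Denmark", "C": "DK"}))
    print(f"{false_positive_share(24_000, 14_000):.1%}")
```

A narrow heuristic like this trades recall for precision, which is the tension described above: it can find some or most problem certificates cheaply, but finding all of them still needs a broader net followed by manual vetting.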

Second, we encountered a change in the list of ISO region descriptions midway through this process. By sheer coincidence of timing, ISO added a region in Norway in the five-day period between the time we flagged a set of certs for revocation and the actual revocation occurred. Unfortunately we didn’t find out until after revocation that this set of certs could have stayed active. The takeaway here is that the available list of locations according to ISO is a moving target, with both the addition and removal of accepted values occurring on an ongoing basis. CAs need to think about the best way to stay current with these changes.
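One possible way to notice such changes is to compare successive snapshots of the accepted-value table before acting on a revocation list. The following is a minimal sketch under that assumption; `diff_allowlists` and the placeholder region names are hypothetical, and how snapshots of the ISO data are obtained is outside its scope.

```python
def diff_allowlists(old: dict, new: dict) -> tuple:
    """Compare two {country_code: set_of_values} snapshots.

    Returns (added, removed), each a dict holding only the countries
    whose accepted values changed between the snapshots.
    """
    added, removed = {}, {}
    for country in old.keys() | new.keys():
        old_values = old.get(country, set())
        new_values = new.get(country, set())
        if new_values - old_values:
            added[country] = new_values - old_values
        if old_values - new_values:
            removed[country] = old_values - new_values
    return added, removed


if __name__ == "__main__":
    # Placeholder region names, not real ISO 3166-2 entries.
    before = {"NO": {"Region A"}}
    after = {"NO": {"Region A", "Region B"}}
    print(diff_allowlists(before, after))
```

Values that show up in `added` between flagging and revocation would be candidates for re-checking the pending list, while values in `removed` would point to newly non-compliant entries.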

We have no update to add right now.

Ben,
At this time we have revoked all certificates affected by this issue and have put in place a whitelist of allowed values for subject:stateOrProvinceName and subject:jurisdictionStateOrProvinceName based primarily on ISO 3166-2. Unless there are additional comments or questions, we think this bug has been resolved and would ask that it be closed.

Flags: needinfo?(bwilson)

It appears that everything has been completed, so I will schedule this bug for closure on or about Friday, 12-Feb-2021, unless I am mistaken.

Status: ASSIGNED → RESOLVED
Closed: 4 months ago
Flags: needinfo?(bwilson)
Resolution: --- → FIXED

Bug #1710243 is a DUPLICATE of this bug. I am redirecting dialog from that bug to this one and requesting we reopen this bug and close the duplicate.

In the bug's first few comments, the poster reported one certificate with Moldova in the stateOrProvinceName field, five with Malta, one with Warminsko-Wazurskie, and one with Malopolskia.

The original poster also made a comment about the contents of an OU field, which was addressed by another poster in bug #1710243 comment #7.

We will carve out and respond to the individual state fields reported here.

Flags: needinfo?(bwilson)

Bug #1710243 comment #4 points out an incorrect State field in Poland: Warminsko-Wazurski, which is a typo for Warminsko-Mazurski. We agree that it is and have set the certificate for revocation. I'll follow up with a full report shortly.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Tim, although I am reopening this bug because of your comment #92 and Comment #93, I do believe it's appropriate to continue to use Bug 1710243 for discussion, and to close this issue.

I realize you're suggesting similar root causes, due to timing of when the original certificates were issued, but I do believe the substance alone is different enough to warrant separate discussion. I will respond separately on that bug.

Status: REOPENED → RESOLVED
Closed: 4 months ago → 1 month ago
Flags: needinfo?(bwilson) → needinfo?(tim.callan)
Resolution: --- → FIXED

Sorry, didn't mean to close Ben's N-I. Ultimately, the choice in Comment #94 is up to him and Kathleen as whether to duplicate, although I do not support it.

Flags: needinfo?(bwilson)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Bug #1710243 comment #6 references Malopolskia as an incorrect State field string. We concur and will set the certificate for revocation. I'll add this to the full report I'm working on.

Let's close this bug and continue discussion in the new bug, Bug #1710243. Even though it might not be a "Duplicate" per se, I'll select that option for expediency.

Status: REOPENED → RESOLVED
Closed: 1 month ago → 1 month ago
Flags: needinfo?(bwilson)
Resolution: --- → DUPLICATE
Duplicate of bug: 1710243
Flags: needinfo?(tim.callan)