Closed Bug 1720744 Opened 3 years ago Closed 3 years ago

Sectigo: State name in localityName

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: tim.callan, Assigned: tim.callan)

Details

(Whiteboard: [ca-compliance] [ov-misissuance])

1. How your CA first became aware of the problem

On July 9 our internal certificate investigation discovered six certificates issued with “Suffolk” in both the stateOrProvinceName and localityName fields. Suffolk is a valid stateOrProvinceName value but is not the name of the city in which this company resides.

2. Timeline

All times Eastern Daylight Time

July 9, 2:16 pm
Certificates discovered

July 11
Two of the affected certificates expire

July 13, 7:28 pm
Remaining four certificates revoked

3. Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem

These certificates were issued to the same Subscriber. We have searched our active certificate base and there are no other examples of this error. One of our former RAs committed the original error. The subscriber ordered replacements for two of the affected certificates in June 2021, which is why two of these certificates show Not Before dates that are after the removal of this RA’s privileges.

We removed this RA’s privileges early in 2021. Because these certificates have been revoked, this problem cannot replicate itself in additional reissuance.

4. A summary of the problematic certificates

Six certificates issued between July 10, 2020 and July 8, 2021

5. Affected certificates

https://crt.sh/?id=4829983798
https://crt.sh/?id=3069098675
https://crt.sh/?id=4829956563
https://crt.sh/?id=3072657815
https://crt.sh/?id=3069095570
https://crt.sh/?id=3072657830

6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now

One of our former RAs committed this error.

At the time of original issuance we lacked programmatic controls to regulate the content of the localityName field.
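
A programmatic control of the kind described as missing could look something like the sketch below. It is only an illustration under stated assumptions, not Sectigo's code: the function name `flag_duplicate_locality` and the dictionary layout for subject fields are hypothetical. It flags a subject whose localityName repeats its stateOrProvinceName so that a validation agent can confirm the pair, since legitimate matches such as New York exist.

```python
def flag_duplicate_locality(subject: dict) -> bool:
    """Return True when localityName repeats stateOrProvinceName.

    `subject` is a hypothetical mapping of subject attribute names to values.
    A True result means "send to manual review", not "reject".
    """
    # Compare case-insensitively and ignore surrounding whitespace.
    locality = subject.get("localityName", "").strip().casefold()
    state = subject.get("stateOrProvinceName", "").strip().casefold()
    # An absent or empty localityName is never flagged by this check.
    return bool(locality) and locality == state
```

A check like this would have flagged the six certificates in this report, but it would not catch a state name entered as localityName when stateOrProvinceName is absent.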

7. List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future

Bug 1715929 comment 13 describes our intention to eliminate the localityName field from new certificates and how we will do so. This change will eliminate misissuance of this sort.

The RA who committed this error was in the batch whose privileges we removed in Q1 as part of our drive to maintain closer control over validation. We discuss our RA program, our reasons for removing privileges, and how former RAs can continue to add value for their customers in bug 1714193.

Assignee: bwilson → tim.callan
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Whiteboard: [ca-compliance]

Tim: I'm not fully sure I parsed Bug 1715929, Comment #13 correctly, but it seems the suggestion here is that even if an RA has access disabled, customers whom that RA validated previously can still obtain new certificates using that previous information, is that correct?

How does Sectigo manage the reuse of that information? I'm just trying to understand that if the RA performed the validation, is there a risk of RA reusing validation, and then Sectigo reusing validation - effectively doubling the reuse period? Anything more you can shed light into the process here re: the June '21 certificates would be extremely valuable.

Flags: needinfo?(tim.callan)
Summary: Sectigo: State name in in locality → Sectigo: State name in localityName

(In reply to Ryan Sleevi from comment #1)
You make a good point that after RA privileges were removed this OV information should have been rendered invalid. It is our policy that once an RA’s privileges are removed we require fresh validation of any OV information that came from that RA.

Therefore this is unexpected behavior and we simply didn’t put that together while we were writing up the Bugzilla report. Mea culpa. We’ve been looking into this and it turns out there was a software bug for active single-domain certificates only in which our flag for OV documentation does not reset after RA privileges are removed.

Because removal of RA privileges is an uncommon, one-time thing and this bug only manifests itself under very specific circumstances, we did not detect it until now. We have checked a fix in to QA and expect to release it in this coming weekend’s release cycle.

Flags: needinfo?(tim.callan)

We are monitoring this bug for further questions or comments from the community.

(In reply to Tim Callan from comment #2)

We have checked a fix in to QA and expect to release it in this coming weekend’s release cycle.

(In reply to Tim Callan from comment #3)

We are monitoring this bug for further questions or comments from the community.

I mean, the obvious question: Did you actually manage to release it? I don't think we can take for granted here you did, and was hoping the update would clarify.

Therefore this is unexpected behavior and we simply didn’t put that together while we were writing up the Bugzilla report

I'm not trying to beat you up here - mistakes happen, and the goal of these incident reports is to help bring transparency in a way so that we can spot these and work to improve things better. At the same time, this seems like a mistake that should have been caught during your peer review, and triggered a deeper dive into understanding what had happened.

To be clear, I recognize there's a tension between prompt reports (where all the facts aren't known yet) and detailed reports, but this is an area where if you were still investigating how this happened, it would have been good to acknowledge and flag, because the assumption is that omission is unawareness. I had hoped, for example, we'd have a more detailed description of the bug, how it happened, and whether you've revisited past issuances. You mentioned "we did not detect it until now", but didn't mention if it had happened before.

Basically, I was hoping Comment #3 was going to dive into the substance of the issues recognized in Comment #2. But it feels that hope was misguided? What's being done to help better execute and deliver here?

Flags: needinfo?(tim.callan)

(In reply to Ryan Sleevi from comment #4)

Did you actually manage to release it?

We released it on July 25.

I had hoped, for example, we'd have a more detailed description of the bug, how it happened, and whether you've revisited past issuances.

Replacement has different logic for multi-domain and single-domain certificates for the simple reason that there are no commercial implications for replacement of a single-domain certificate. Hosting partners regularly change the total set of domains in their multi-domain certificates as they reconfigure clusters, and our systems need to track that usage. We accomplish this by creating a new order each time, which requires checking the status of OV reuse. In this scenario, if RA privileges have been removed, the system correctly discerns that no valid OV information is available and requires new validation. This is the expected behavior.

Single-domain certificate reissuance occurs through a simple, “streamlined” code process that goes back to before the BRs were in place. In this case it turns out the code issues replacement certificates without rechecking the RA’s status. We surmise that when this code was written the consequences of removing RA status hadn’t been fully explored and the potential need to revalidate a replacement certificate from a deprecated RA hadn’t been considered.

We addressed this problem through a check we added, as reported in comment 2. We deployed that code update on July 25.
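
The added check can be pictured roughly as follows. This is a sketch under stated assumptions rather than the deployed code: `ValidationRecord`, `can_reuse_validation`, `reissue`, and the set of active RA identifiers are all hypothetical names. The idea it illustrates is that a replacement may reuse OV evidence only if the RA that produced the evidence still holds privileges, and that the same gate applies to both single-domain and multi-domain paths.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ValidationRecord:
    """Hypothetical record of OV evidence attached to an order."""
    subscriber: str
    validated_by_ra: Optional[str]  # None means the CA validated directly


def can_reuse_validation(record: ValidationRecord, active_ras: set) -> bool:
    """Allow reuse only when the evidence did not come from a removed RA."""
    if record.validated_by_ra is None:
        return True
    return record.validated_by_ra in active_ras


def reissue(record: ValidationRecord, active_ras: set) -> str:
    """Decide the next step for a replacement request.

    Both the single-domain and multi-domain paths would call this same
    gate (an assumption of this sketch) instead of only one of them.
    """
    if can_reuse_validation(record, active_ras):
        return "issue"
    return "revalidate"
```

The sketch deliberately leaves out other reuse conditions, such as the age of the validation evidence.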

This incident gets into a topic we’ve touched on earlier in other bugs, which is the idea of legacy code review and code simplification. We have taken that on as one of our large quality initiatives (along with other well-discussed projects such as our validation documentation review and update, our CPS review and update, the Guard Rails project, and our certificate base investigation for misissued certificates). We can’t “boil the ocean” by attempting review and simplification of all aspects of our code and systems at once, so instead we are isolating individual parts of our operation and prioritizing them.

As covered in bug 1694233, bug 1718771, and bug 1718579, we started with DCV. While by an accident of timing this examination came to a head with the discovery of two types of DCV error, it’s valuable to remember that our DCV project was underway before we discovered these bugs. (Note also that even though our investigation and problem mitigation are complete for the “DCV Reuse” and “Manual DCV” bugs, the original scope of the project spanned code review, code simplification, and code documentation. The full extent of that project is still ongoing.)

You mentioned "we did not detect it until now", but didn't mention if it had happened before.

As we previously were unaware of the incorrect behavior, we had not looked into that question. We have conducted an investigation, which took a little time since it required looking individually into each suspect certificate.

Just this morning we concluded the investigation. We’ve identified 24 certificates requiring revocation. This number is small because the flaw only occurs with a specific combination of factors, as laid out earlier in this bug. We will share the list once revocation is complete.

Flags: needinfo?(tim.callan)

We have conducted an investigation, which took a little time since it required looking individually into each suspect certificate.

Thanks Tim.

I think, in thinking about future incidents, it's useful to be very explicit about these, as touched on in Comment #4. Our goal is to build an accurate, and contemporaneous, understanding of what the CA is doing. The more that a CA can state, up front, what they're doing, the better it is to avoid any concern or doubt.

I don't have further questions, as Comment #5 addresses my follow-up concerns that arise from Comment #5 (i.e. "This was legacy code" -> "We're dealing with legacy code via guard rails and systemic revamping"). I'll flag this for Ben, or you can, once the remaining deliverables have been completed.

Flags: needinfo?(bwilson)

In looking back at this bug to ensure everything is cleaned up, we observed that on July 22 we issued a certificate with Suffolk as the localityName and with no stateOrProvinceName field, which you can see at https://crt.sh/?id=4912123547. We revoked this certificate on August 11.

We looked into this particular issuance, and here is what occurred:

  • The validation team could not verify the original address on the order.
  • We obtained a new address from the relevant QGIS: 82 James Carter Road, Mildenhall Industrial Estate, Suffolk, IP28 7DE, UK.
  • As no city was present in the QGIS, the validation rep mistook Suffolk for the city name and entered it as such.
  • Because there is no programmatic check on localityName, the system accepted this entry as valid.

This error perfectly illustrates the difficulty in validating localityName and why we are removing it from certificates moving forward. As this certificate did not come from the same former RA as the earlier reported certificates, the fact that Suffolk is the specific offending name is a coincidence.

stateOrProvinceName-localityName exclusivity removes the possibility of errors like this for new certificates. Had the functionality been in place on July 22, this misissuance would not have occurred. See bug 1724476 comment 3 for more detail.

This is a good moment to point out how extremely vague the definition of “locality” is: X.520 says “When used as a component of a directory name, it identifies a geographical area or locality in which the named object is physically located or with which it is associated in some other important way” and we have found no other definition. In the instance of the certificate referenced in comment 7, there is a strong argument to be made that no misissuance occurred. We chose to revoke this certificate for the sake of consistency with the earlier revocation decision from comment 0. However, there is a very important difference between the latest certificate from comment 7 and those earlier certificates.
The first batch of certificates each contained a subject:stateOrProvinceName value of Suffolk in addition to the word Suffolk in the subject:localityName field. This latest certificate contained Suffolk in the subject:localityName field and had no subject:stateOrProvinceName field at all.

This difference matters because for the first set of certificates, the “state or province” and “locality” being reported were exactly the same. While it can occur that a city and “state or province” both have the same name (such as New York or Moscow), in this case there is no city named Suffolk. Suffolk is the county, pure and simple. Bear this point in mind because we will return to it.

Let’s deal with the simpler case first, the most recent certificate reported in comment 7. Here the certificate simply offered the business’s “locality” as Suffolk. Per the X.520 definition, the business’s locality is, in fact, Suffolk. The BRs state that subject:localityName "MUST contain the Subject’s locality information as verified under Section 3.2.2.1" and that subject:stateOrProvinceName "MUST contain the Subject’s state or province information as verified under Section 3.2.2.1". We cannot see anything in BR 3.2.2.1 to explain the difference between “locality” and “state or province.” In bug 1715024 comment 4 we explain why England could be considered a valid entry for stateOrProvinceName. If "ST=England, C=GB" is fine, there is nothing in the language of the BRs to indicate that "L=Suffolk, ST=England, C=GB" is not equally fine. In short, on its own, we believe this certificate was not misissued.

The first batch is not quite the same in that subject:stateOrProvinceName contains Suffolk, which is the same as the subject:localityName. In this case, we felt that using both fields, which we believe are generally understood to determine geographic location at different levels of granularity, to report the exact same governmental demarcation of land would be against the spirit of the BRs, even if perhaps not the letter. Here we see the difference between Suffolk and New York at play.

Once we had revoked a set of certificates for Suffolk in the subject:localityName field, that further complicated the matter of a certificate that contained only Suffolk in the subject:localityName field without a subject:stateOrProvinceName field being present. In this case, we decided that consistent behavior was valuable and so revoked this certificate as well.

As mentioned in many other posts on multiple bugs, we are in the process of removing subject:localityName from our certificates. Doing so eliminates a freeform text field in favor of a list of preapproved values and eliminates a degree of squishiness, which we so want to avoid.
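
As a rough illustration of the preapproved-values idea, and not a description of Sectigo's implementation, a subject check might look like the sketch below. The list contents, the function name `check_subject`, the dictionary layout, and the rules that localityName must be absent and stateOrProvinceName present are all assumptions made for the example.

```python
# Hypothetical preapproved values per country; a real list would be far longer.
APPROVED_STATES = {"GB": {"England", "Suffolk"}}


def check_subject(subject: dict) -> list:
    """Return a list of problems found in the subject fields (empty if none)."""
    problems = []
    # Assumption: the freeform localityName field is no longer permitted.
    if "localityName" in subject:
        problems.append("localityName must be omitted")
    country = subject.get("countryName", "")
    state = subject.get("stateOrProvinceName")
    # Assumption: with localityName gone, stateOrProvinceName is mandatory.
    if state is None:
        problems.append("stateOrProvinceName is required")
    elif state not in APPROVED_STATES.get(country, set()):
        problems.append("stateOrProvinceName is not a preapproved value")
    return problems
```

Under these assumptions a subject carrying only a localityName of Suffolk would be reported with two problems, while a misspelled stateOrProvinceName would be rejected because it is not on the list.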

We have added some additional information we felt was relevant to this bug and the larger conversation. We’ll also point out that a description of the phase-in process for stateOrProvinceName-localityName exclusivity is available at bug 1724476 comment 3.

We feel this bug is ready to be closed. Is there any other detail we can provide the community, or should we go ahead and close this bug?

I'll slate this bug for closure on Friday, 3-Sept-2021.

While we wait for this bug to close, we will continue to monitor it for further discussion.

Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Flags: needinfo?(bwilson)
Resolution: --- → FIXED
Product: NSS → CA Program
Whiteboard: [ca-compliance] → [ca-compliance] [ov-misissuance]