Closed Bug 1670861 Opened 4 years ago Closed 3 years ago

Actalis: delayed revocation related to inaccurate value in stateOrProvinceName

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: adriano.santoni, Assigned: adriano.santoni)

Details

(Whiteboard: [ca-compliance] [leaf-revocation-delay])

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.75 Safari/537.36

Actual results:

  1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.

This bug is related to Actalis not revoking within 5 days, as per the BR, just one (1) of the certificates affected by bug 1648997 (all the others were revoked on time).
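As a rough illustration of the 5-day limit referred to above, the following sketch computes a revocation deadline and the amount by which a revocation overran it. The function names and the dates in the usage note are hypothetical, and the sketch assumes the clock starts at the moment the CA becomes aware of the problem; the BR text remains the authority on when the period begins.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the 5-day window mentioned in this report, counted from
# the moment the CA became aware of the problem.
REVOCATION_WINDOW = timedelta(days=5)


def revocation_deadline(aware_at: datetime) -> datetime:
    """Latest time by which the certificate should be revoked."""
    return aware_at + REVOCATION_WINDOW


def delay_beyond_deadline(aware_at: datetime, revoked_at: datetime) -> timedelta:
    """How far past the deadline the revocation happened (zero if on time)."""
    overrun = revoked_at - revocation_deadline(aware_at)
    return max(overrun, timedelta(0))
```

For example, with made-up timestamps, a CA that became aware on 1 January at 00:00 UTC and revoked on 8 January at 00:00 UTC would be two days past the deadline; these dates are not the ones from this incident.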

  2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.

See bug 1648997.

  3. Whether your CA has stopped, or has not yet stopped, certificate issuance or the process giving rise to the problem or incident. A statement that you have stopped will be considered a pledge to the community; a statement that you have not stopped requires an explanation.

Regarding certificate issuance, see bug 1648997.

The problem that this bug is about is that we had to grant our customer a few more days than allowed by the BR to revoke one particular certificate (out of 20).

  4. In a case involving certificates, a summary of the problematic certificates. For each problem: the number of certificates, and the date the first and last certificates with that problem were issued. In other incidents that do not involve enumerating the affected certificates (e.g. OCSP failures, audit findings, delayed responses, etc.), please provide other similar statistics, aggregates, and a summary for each type of problem identified. This will help us measure the severity of each problem.

See bug 1648997.

  5. In a case involving certificates, the complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem. In other cases not involving a review of affected certificates, please provide other similar, relevant specifics, if any.

See bug 1648997.

  6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

We decided to grant our customer a slight delay before revoking one particular certificate after having considered: the practical difficulty for our customer to replace that single certificate within 5 days, the disruption that a revocation within 5 days would have caused, and the absence of meaningful security risks connected to the inaccurate (but not misleading) value of stateOrProvinceName.

  7. List of steps your CA is taking to resolve the situation and ensure that such situation or incident will not be repeated in the future, accompanied with a binding timeline of when your CA expects to accomplish each of these remediation steps.

We are implementing several measures that we believe will reduce the likelihood of recurrence and the extent of delays in revoking certificates:

  • Since the delays are mostly caused by the difficulty, for complex organizations, in quickly replacing many certificates with manual procedures, at unexpected times, we intend to push more decisively towards greater automation of the enrollment process. In particular, we are going to start promoting the adoption of the ACME protocol by our customers: to that end, we will start a communication initiative by the end of November, aimed at our biggest customers.

  • Considering that customers sometimes request deferment of revocation terms because of improper practices, such as certificate pinning or the use of web-trusted certificates in contexts where it is not necessary, we are introducing in our CPS appropriate warnings and clarifications regarding those practices, which we discourage and will not consider valid reasons to delay the revocation of certificates. We expect to be able to publish our next CPS version by the end of October (it is currently being reviewed by several stakeholders).

A further measure, addressing the specific problem of bug 1648997, is the integration of a third-party online geographical database with our registration authority management system, to make sure that the address details are accurate without relying on human judgement alone. We are on track with this effort: testing is already in progress and we aim to deploy it in production before EOY.
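To make the idea of an automated address check concrete, here is a minimal sketch of validating stateOrProvinceName against a reference table. It is not a description of Actalis' actual integration: the `REFERENCE` table, the `check_state` function, and the result labels are all hypothetical, and a real deployment would query the third-party geographical database instead of a hard-coded dictionary.

```python
# Hypothetical reference data: (country, locality) -> expected province.
# A real system would look this up in an external geographical database.
REFERENCE = {
    ("IT", "ponte san pietro"): "Bergamo",
    ("IT", "sesto san giovanni"): "Milano",
}


def check_state(country: str, locality: str, state: str) -> str:
    """Compare a requested stateOrProvinceName with the reference value.

    Returns "ok", "mismatch", or "unknown" (locality not in the table,
    which would be escalated to manual review in this sketch).
    """
    key = (country.strip().upper(), locality.strip().casefold())
    expected = REFERENCE.get(key)
    if expected is None:
        return "unknown"
    if state.strip().casefold() == expected.casefold():
        return "ok"
    return "mismatch"
```

In this sketch a request with locality "Ponte San Pietro" and state "Milano" would be flagged as a mismatch before issuance, while a locality missing from the table falls back to a human reviewer rather than being silently accepted.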

Assignee: bwilson → adriano.santoni
Status: UNCONFIRMED → ASSIGNED
Type: defect → task
Ever confirmed: true
Whiteboard: [ca-compliance] [delayed-revocation-leaf]

The most recent CPS on your website is Version: 5.8, dated Oct 05, 2020. Do you have a more current one yet?

Flags: needinfo?(adriano.santoni)

That is the most recent. We inserted several clarifications and warnings in section 1.4.2.

Can you provide an update with respect to automation? In particular, I would want to ensure we don't have a future issue where "We told our customers about ACME, but they chose not to use it, so it's not our fault we are delaying again". There are many concrete steps the CA can take, as well as objective measures of progress, to ensure that the proposed solution (automation) is adopted and prevents the highlighted issue.

Sorry about the delay.

Here is an update:

  • as of today, we already have some major customers who get certificates from us through ACME;
  • we received positive feedback from some of the other customers to whom we have proposed the adoption of ACME, and we are about to start a testing phase with some of them;
  • we are planning measures that will make the use of ACME convenient also from an economic point of view.

If you're willing to share other concrete steps to foster customer adoption of ACME that we haven't thought of ourselves, we'd be happy to hear and evaluate them.

Given that the goal of automation, in the context of this issue, is to reduce or eliminate disruptiveness for customers when it comes to replacing certificates (e.g. to make revocation easier, by making issuance easier), Comment #3 was trying to highlight that it only achieves that goal if you work with customers to actually implement and deploy.

It sounds like you've made progress (with respect to Comment #4), which is fantastic to hear. In terms of concrete steps, it also sounds like you're making progress in working with your customers to understand and identify challenges, and those are also worthwhile to share with the community (e.g. the IETF and/or the CA/B Forum and/or m.d.s.p.) if you're not sure of options/solutions.

I'm assigning this to Ben for consideration for closure, because I don't think my remaining suggestions below need to block closure.

Concrete additional actions:

  • Implement monitoring to determine what percentage of your new certificates are issued via automated means vs non-automated means. This can help guide you in identifying customers who would be at risk in the event of revocation, to identify their challenges and make sure you've got solutions.
  • It can also make them aware that they are accepting any operational risks that might result from not automating, and help you reiterate your commitment to the Baseline Requirements' prompt and timely revocation.
  • Equally, however, consider your existing certificates, and identify which of those customers have adopted automation (for their new certificates). That is, I imagine that, like most CAs, you have some customers with many certificates spread throughout the year, and some customers with only a few certificates, so this obviously isn't a one-size-fits-all. However, the objective here is to determine, from a customer level, how well the adoption of automation is progressing, because the automation reduces the risk of revocation delays by ensuring easier/prompt replacement.
  • However, even with perfect automation there can still be risk (e.g. see https://bugzilla.mozilla.org/show_bug.cgi?id=1619179#c7 ), so it is worth working with customers so that, as they implement automation, they also think about how to manage such risks at scale. This is also an opportunity for Actalis to work with other CAs to identify common solutions for your customers' needs.
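The monitoring suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Issuance` record, the `"acme"` and `"manual"` channel labels, and the 50% threshold are hypothetical and do not reflect any real CA's issuance logs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Issuance:
    customer: str
    channel: str  # hypothetical labels: "acme" (automated) or "manual"


def automation_share(records):
    """Return {customer: fraction of certificates issued via ACME}."""
    totals = defaultdict(int)
    automated = defaultdict(int)
    for record in records:
        totals[record.customer] += 1
        if record.channel == "acme":
            automated[record.customer] += 1
    return {customer: automated[customer] / totals[customer] for customer in totals}


def at_risk(records, threshold=0.5):
    """Customers whose automated share is below the (assumed) threshold."""
    shares = automation_share(records)
    return sorted(c for c, share in shares.items() if share < threshold)
```

Grouping by customer rather than reporting a single global percentage matches the point made above: the aim is to find the specific customers who would struggle to replace certificates quickly if a revocation were required.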
Flags: needinfo?(adriano.santoni) → needinfo?(bwilson)

I will close this bug on or about next Wednesday 27-Jan-2021 unless there are issues still to address.

Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Flags: needinfo?(bwilson)
Resolution: --- → FIXED
Product: NSS → CA Program
Whiteboard: [ca-compliance] [delayed-revocation-leaf] → [ca-compliance] [leaf-revocation-delay]