Closed Bug 1826713 Opened 1 year ago Closed 9 months ago

Actalis: Certificates issued with validity period greater than 398 days

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: adriano.santoni, Assigned: adriano.santoni)

Details

(Whiteboard: [ca-compliance] [ov-misissuance])

Steps to reproduce:

We received a report that three certificates, issued by Actalis, have a validity period greater than 398 days. We have started investigations and will publish an incident report soon.

Assignee: nobody → adriano.santoni
Status: UNCONFIRMED → ASSIGNED
Type: defect → task
Ever confirmed: true
Whiteboard: [ca-compliance] [ov-misissuance]
Flags: needinfo?(adriano.santoni)

We apologize for the delay.
Here is our preliminary incident report:

1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in the MDSP mailing list, a Bugzilla bug, or internal self-audit), and the time and date.

We received an email from michael.lettona@digicert.com, sent to our mailbox cert-problem@actalis.it, reporting that we had 3 certificates with a validity period greater than 398 days.

2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was performed.

(all times are local Italian times)
2023-04-05 23:21 | We received an email from michael.lettona@digicert.com reporting that we had 3 certificates with a validity period greater than 398 days.

2023-04-06 08:17 | We began our investigations on the reported issue. First of all, we checked that the report was reliable and accurate, and we found that it was. Then we started looking into the possible causes of the problem.

2023-04-06 09:25 | We found that those certificates were issued with an old, offline SubCA that is only used for a few websites run by our company. The reason why it was possible, from that SubCA, to issue certificates with a validity period longer than 398 days is still to be clarified and requires further investigation.

2023-04-06 10:02 | Considering that those certificates were issued for servers managed by our company and that their replacement would not create any problems, we decided to proceed immediately with their replacement and revocation.

2023-04-06 15:28 | We completed replacement and revocation of the offending certificates.

2023-04-06 17:13 | As an immediate preventative measure, pending completion of our investigations, we had the configuration of the involved SubCA modified so as to allow only a 1-year validity for TLS certificates.

3. Whether your CA has stopped, or has not yet stopped, certificate issuance or the process giving rise to the problem or incident. A statement that you have stopped will be considered a pledge to the community; a statement that you have not stopped requires an explanation.

We stopped issuing certificates with a validity period greater than 398 days in September 2020, as required by the BRs. The three certificates discussed here that do not comply with the rule are, in fact, the result of an incident.

We have already taken interim measures to prevent the recurrence of this type of problem and further measures will be taken later.

4. In a case involving certificates, a summary of the problematic certificates. For each problem: the number of certificates, and the date the first and last certificates with that problem were issued. In other incidents that do not involve enumerating the affected certificates (e.g. OCSP failures, audit findings, delayed responses, etc.), please provide other similar statistics, aggregates, and a summary for each type of problem identified. This will help us measure the severity of each problem.

A total of three (3) certificates were affected, as listed in section 5 below.

The oldest affected certificate was issued on 2021-03-18.

The most recent affected certificate was issued on 2023-03-07.

The third affected certificate was issued on 2022-07-01.

5. In a case involving TLS server certificates, the complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem. It is also recommended that you use this form in your list "https://crt.sh/?sha256=[sha256-hash]", unless circumstances dictate otherwise. When the incident being reported involves an SMIME certificate, if disclosure of personally identifiable information in the certificate may be contrary to applicable law, please provide at least the certificate serial number and SHA256 hash of the certificate. In other cases not involving a review of affected certificates, please provide other similar, relevant specifics, if any.

https://search.censys.io/certificates-legacy/facf84a18abb35b92a3816e16f6a5ac6b03286b8748da2cbb6e766c80d48a967
https://search.censys.io/certificates-legacy/2a97e3c4990e1930c839af678329598f52b1c3613296525f82e60ac7bd289365
https://search.censys.io/certificates-legacy/97a4e050cf9f2b17be54c7a668cfd83dc1dc34609f6fa1c0b6d6a3df2438ed20

6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

This is still under investigation.

The old SubCA used to issue the three offending certificates is not connected to our online systems and is essentially offline (it can only be operated manually). Since 1st September 2020, only 20 certificates have been issued by that SubCA, and as of today only 3 of them are still active (excluding an OCSP responder's certificate).

We found that in 2020 this SubCA was not updated to comply with the maximum validity of 398 days that came into force on 2020-09-01 (as per BR 6.3.2); the reason is still under investigation. Furthermore, the compliance verification mechanism that was set up on that CA apparently did not trigger.
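For readers who want to see the 398-day rule as a concrete check, here is a minimal sketch. It is not the verification mechanism used on the SubCA. It assumes that the validity period includes both endpoints (hence the extra second), that a day is 86,400 seconds, and that notBefore can stand in for the issuance date when deciding whether the limit applies. The function names and sample dates are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Limit cited in the report (BR 6.3.2), effective 2020-09-01.
MAX_VALIDITY = timedelta(days=398)
LIMIT_EFFECTIVE = datetime(2020, 9, 1, tzinfo=timezone.utc)


def validity_period(not_before: datetime, not_after: datetime) -> timedelta:
    """Length of the validity period, counting both endpoints (assumption)."""
    return not_after - not_before + timedelta(seconds=1)


def exceeds_max_validity(not_before: datetime, not_after: datetime) -> bool:
    """True if the limit applies and the validity period is longer than 398 days."""
    # Assumption: notBefore approximates the issuance date.
    if not_before < LIMIT_EFFECTIVE:
        return False
    return validity_period(not_before, not_after) > MAX_VALIDITY


# Illustrative dates only, not taken from the affected certificates.
start = datetime(2023, 1, 1, tzinfo=timezone.utc)
two_years = exceeds_max_validity(start, start + timedelta(days=730))
one_year = exceeds_max_validity(start, start + timedelta(days=365))
```

With these sample dates, `two_years` is `True` and `one_year` is `False`.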

7. List of steps your CA is taking to resolve the situation and ensure that such situation or incident will not be repeated in the future. The steps should include the action(s) for resolving the issue, the status of each action, and the date each action will be completed.

The first step that we took was to modify the certificate profiles on the involved SubCA, so that it is no longer possible to issue TLS certificates with a validity period greater than 1 year (this was done on April 6th).

The further measures that we will take will depend on the outcomes of the investigations we have in progress.

Flags: needinfo?(adriano.santoni)

In the following we provide the results of our investigations and the list of measures we decided to take to prevent recurrence of this problem.

The 3 offending certificates were issued by a SubCA which, with the exceptions we clarify below, we have not used for 4 years and we should have already decommissioned.

In fact, for a number of reasons, about 4 years ago we decided to discontinue using that SubCA to issue TLS certificates to customers. Consequently, in 2019 that SubCA was disconnected from our online systems, and since then it has no longer been possible to issue TLS certificates with that SubCA through our normal online processes and related validation and issuance procedures.

However, that SubCA had not been decommissioned, because some departments of our company needed to be able to obtain, for some more time, TLS certificates issued by that SubCA, as this was required by old versions of our remote signature client software still being used by a number of customers. Therefore, it was still possible to issue certificates with that SubCA for a few domains of our company, following an old procedure that was applied only in the context of our remote signature service (not related to our public CA service for TLS certificates).

Our investigations found that, when the maximum validity of 398 days was introduced in the BRs (effective September 2020), the configuration of that SubCA was not updated accordingly. This was apparently because the staff in charge of making the necessary configuration changes mistakenly considered that SubCA decommissioned, presumably after misunderstanding certain communications between different departments of our company.

A post-issuance linting script had been set up on that SubCA since early 2020. Unfortunately, the email alerts it generated when the 3 offending certificates were issued were not delivered to the intended recipients. Our internal email services were changed in 2021, and the script apparently was never updated to call the new mail server, for reasons that could not be reconstructed.
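The failure mode described here, an alert that is generated but silently never delivered, can be illustrated with a small sketch. This is not the linting script in question. It assumes a hypothetical design in which each alert channel is a callable that raises on failure, channels are tried in order, and the caller gets an exception if none of them works. All names are made up for illustration.

```python
from typing import Callable, List, Sequence

# Hypothetical transport type: sends a message, raises an exception on failure.
Transport = Callable[[str], None]


def deliver_alert(message: str, transports: Sequence[Transport]) -> int:
    """Try each transport in order and return the index of the first success.

    Raises RuntimeError if every transport fails, so that a broken alert
    path is surfaced to the caller instead of being ignored.
    """
    errors: List[str] = []
    for index, send in enumerate(transports):
        try:
            send(message)
            return index
        except Exception as exc:
            errors.append(f"transport {index}: {exc!r}")
    raise RuntimeError("alert not delivered: " + "; ".join(errors))


# Stand-ins for real channels (assumptions, for demonstration only).
def retired_mail_server(message: str) -> None:
    raise ConnectionError("mail server no longer exists")


fallback_log: List[str] = []

used = deliver_alert(
    "lint finding: validity period too long",
    [retired_mail_server, fallback_log.append],
)
```

In this run the first channel fails and the message lands in `fallback_log`, so `used` is `1`. With only `retired_mail_server` configured, the call would raise instead of returning quietly.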

In conclusion, this incident was mostly caused by a combination of incorrect assumptions and misunderstandings.

The measures we intend to take to avoid the recurrence of this type of problem are:

  • modify the certificate profiles on the involved SubCA, so as to prevent issuance of TLS certificates with a validity period of more than 1 year -- done on April 6th;
  • meeting of all involved company departments to share what happened, highlight the mistakes made and raise awareness among all stakeholders -- held on April 17th;
  • internal formalization of the fact that the involved SubCA (which in any case expires in February 2024) from now on will no longer be usable for any purpose whatsoever -- by end of April;
  • deactivate the involved SubCA so that further certificate issuance is simply impossible (certificate life cycle management and the CRL/OCSP services will still be ensured) -- by mid-May.

Hi Adriano,

You summarized the root cause as:

mostly caused by a combination of incorrect assumptions and misunderstandings.

Are there any lessons learned that can be shared related to:

  1. avoiding the incorrect assumptions or misunderstandings that contributed to this incident?
  2. decommissioning CAs, in general?
  3. positive and negative testing for alerting?

Additionally, is there anything worth sharing with the community as a result of the April 17th meeting?

Hi Chris,

In response to your questions:

  • The meeting of 17 April allowed the participants to realize that there had been an insufficient focus on certain important information that had been shared among different departments of our company, in particular between the area responsible for the CA service and other areas that deal with completely different services which however also rely on the CA. We have agreed that, for compliance-related communications, all recipient departments must explicitly acknowledge that they fully understand what has been communicated. In this case the "tacit assent" principle was applied instead, which proved to be fallacious.

  • We then agreed that when important and sensitive tasks are entrusted to the technical staff, even senior people who are normally meticulous and reliable, it is always necessary to carefully check that the required tasks have been carried out thoroughly and precisely, including by verifying specific objective evidence for each processing environment in which the interventions were requested. Unfortunately, this approach can easily be perceived as a sign of mistrust and micro-management; however, when a potential non-compliance is at stake, there is no alternative.

  • We have also established that, when we decide to decommission (in the sense of ceasing to use) a particular CA without revoking it immediately, it is necessary to formalize and share with all the stakeholders the reasons why the revocation of the CA is postponed. In this case it is still required to adopt all the measures necessary to ensure that the CA remains fully compliant with the BRs for as long as it remains "alive", including periodic verification that the existing compliance-checking mechanisms work correctly. These checks must be tracked. We should consider a CA truly decommissioned only when it is revoked or expired; otherwise it must still be subject to all the necessary compliance checks, even if it no longer signs certificates but only CRLs. Where pre-issuance linting is not possible, the opportunity to quickly migrate that CA to a more advanced software environment that allows it must be evaluated, and the related decisions tracked.
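The periodic verification of compliance-checking mechanisms mentioned in the last point, together with the question about positive and negative testing, can be sketched as a self-test that feeds the checker one known-compliant and one known-non-compliant sample. This is only an illustration under assumptions: the checker is modelled as a function that returns True when it flags a problem, the samples are plain validity durations, and all names are hypothetical.

```python
from datetime import timedelta
from typing import Callable, List

# Hypothetical checker type: returns True when the sample is flagged.
Check = Callable[[timedelta], bool]


def self_test(check: Check, compliant: timedelta, non_compliant: timedelta) -> List[str]:
    """Run a negative and a positive test against a compliance check.

    Returns a list of problems; an empty list means the check behaved
    as expected on both samples.
    """
    problems: List[str] = []
    if check(compliant):
        problems.append("false positive: compliant sample was flagged")
    if not check(non_compliant):
        problems.append("false negative: non-compliant sample was not flagged")
    return problems


def working_check(validity: timedelta) -> bool:
    return validity > timedelta(days=398)


def silent_check(validity: timedelta) -> bool:
    # Models a mechanism that never triggers.
    return False


ok = self_test(working_check, timedelta(days=365), timedelta(days=730))
broken = self_test(silent_check, timedelta(days=365), timedelta(days=730))
```

Here `ok` is an empty list, while `broken` contains the false-negative message. Running such a self-test on a schedule, and recording the result, is one way to notice that a check has stopped triggering before a real certificate slips through.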

We have no further updates.

I will close this on or about Wed. 19-Jul-2023 unless further discussion is required.

Flags: needinfo?(bwilson)
Status: ASSIGNED → RESOLVED
Closed: 9 months ago
Flags: needinfo?(bwilson)
Resolution: --- → FIXED