Closed Bug 1668523 Opened 4 years ago Closed 4 years ago

Asseco DS / Certum: Failure to revoke within 5 days

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: aleksandra.kurosz, Assigned: aleksandra.kurosz)

References

Details

(Whiteboard: [ca-compliance] [leaf-revocation-delay])

Attachments

(2 files)

This bug is related to bug 1667986. We will be back with a report soon.

Assignee: bwilson → aleksandra.kurosz
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Whiteboard: [ca-compliance] [delayed-revocation-leaf]
  1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.

We received an alert email at 2020-09-26 19:44 (UTC+2).

  2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.

2020-09-26 19:00 (UTC+2) - Notification is received via the email address revoke@certum.pl.
2020-09-26 20:00 (UTC+2) - The employee operating the mailbox accepts the request, checks it for possible security impact (finds none), and forwards it to the second analysis stage.
2020-09-27 08:00 (UTC+2) - The second stage of analysis starts.
2020-09-27 12:00 (UTC+2) - The database is searched for all occurrences of "Russian Federation" in the stateOrProvinceName field. Certum prepares a list of affected certificates. The customer is informed about the problem.
2020-09-28 10:37 (UTC+2) - The reporter is informed that we received their report. The reason for the late response is described in bug 1667684 (https://bugzilla.mozilla.org/show_bug.cgi?id=1667684).
2020-09-28 11:00 (UTC+2) - The stateOrProvinceName field is removed from the customer’s dedicated SSL certificate profile.
2020-09-28 17:30 (UTC+2) - The customer starts issuing new certificates without the stateOrProvinceName field.
2020-09-29 09:00 (UTC+2) - A meeting with the customer is held to schedule the revocation of the incorrectly issued certificates. Certum agrees to wait for the customer's confirmation that the certificates have been replaced before revoking them. The rationale will be described in bug 1668523 (https://bugzilla.mozilla.org/show_bug.cgi?id=1668523) soon.
2020-09-30 11:50 (UTC+2) - Certum is informed by the customer that the first batch of 786 certificates has been replaced on the customer’s servers.
2020-10-01 11:00 (UTC+2) - Certum revokes 777 certificates from the first batch (9 certificates have expired in the meantime). The remaining certificates will be revoked by 2020-10-13 (EOD).
2020-10-02 15:00 (UTC+2) - Over 1900 certificates have been reissued.
2020-10-02 15:00 (UTC+2) - The next batch of 732 certificates is revoked.
2020-10-13 EOD (UTC+2) - All affected certificates will be revoked.
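
The 2020-09-27 12:00 step in the timeline above, finding every certificate with "Russian Federation" in the stateOrProvinceName field, can be illustrated with a minimal Python sketch. The record layout (`serial`, `subject`) and the function name `find_affected` are assumptions made for illustration only; they do not describe Certum's actual database or tooling.

```python
from typing import Dict, Iterable, List

# Value found in stateOrProvinceName, as named in the timeline.
BAD_STATE = "Russian Federation"


def find_affected(records: Iterable[Dict]) -> List[str]:
    """Return serial numbers of records whose subject carries the bad value.

    Each record is assumed (hypothetically) to look like:
    {"serial": "01AB", "subject": {"stateOrProvinceName": "..."}}
    """
    affected = []
    for record in records:
        state = record.get("subject", {}).get("stateOrProvinceName")
        # Compare case-insensitively and ignore surrounding whitespace.
        if state is not None and state.strip().lower() == BAD_STATE.lower():
            affected.append(record["serial"])
    return affected
```

A real scan would more likely parse the issued certificates themselves rather than rely on stored request data, so that the list reflects what was actually signed.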

  3. Whether your CA has stopped, or has not yet stopped, certificate issuance or the process giving rise to the problem or incident. A statement that you have stopped will be considered a pledge to the community; a statement that you have not stopped requires an explanation.

Not applicable.

  4. In a case involving certificates, a summary of the problematic certificates. For each problem: the number of certificates, and the date the first and last certificates with that problem were issued. In other incidents that do not involve enumerating the affected certificates (e.g. OCSP failures, audit findings, delayed responses, etc.), please provide other similar statistics, aggregates, and a summary for each type of problem identified. This will help us measure the severity of each problem.

In the attachment crt_sn_bug1668523.txt.

  5. In a case involving certificates, the complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem. In other cases not involving a review of affected certificates, please provide other similar, relevant specifics, if any.

In the attachment crt_sn_bug1668523.txt.

  6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

All affected certificates were issued for one of our business customers, which is the biggest Internet company in Eastern Europe. Its services reach a total audience of more than 70 million users per month, with total traffic of 2 Tbit/s. Our customer takes great care of users' privacy and provides all services via HTTPS. All internal connections within the company are also protected with SSL.

An emergency revocation of all SSL certificates would have had an unprecedented business impact and could have harmed a large share of Internet users. Certum and the customer decided together that revocation had to be done without undue delay, but with due regard for user safety. Certum agreed to hold off on revocations until we received confirmation that the certificates had been replaced. We believe that this decision was a reasonable step for the Internet community.

Our client is currently working on the certificate reissue process and needs up to 10 business days to reissue and replace all SSL certificates.

  7. List of steps your CA is taking to resolve the situation and ensure that such situation or incident will not be repeated in the future, accompanied with a binding timeline of when your CA expects to accomplish each of these remediation steps.

We believe that such mass revocation of thousands of certificates for such critical infrastructure with such a high volume of traffic should be always investigated separately. We are working with our customer in order to establish a procedure of operation in case of such an incident in the future.

At the same time, we are focusing on verifying certificate profiles in order to avoid similar errors that require certificate revocation (https://bugzilla.mozilla.org/show_bug.cgi?id=1667986).

Attached file crt_sn_bug1668523.txt

List of affected certificates.

See Also: → 1667684

Thanks for at least following https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation with respect to filing this incident before the deadline.

Given that much larger, global Internet companies have been able to replace the vast majority (>99%) of certificates within hours-or-days, and under the timeline required, through the use of automation, I'm expecting that any plan involving commitments to have this not happen again (regardless of whether it's this same customer) will look at Certum supporting APIs for automated issuance, and taking proactive measures to ensure that the majority of their users make use of such APIs.

Type: defect → task

The vast majority of our certificates are issued using our API.

In the next weeks, we are going to increase awareness of security incident handling among our partners who use the API, especially of situations where certificates must be revoked quickly. Regardless of the number of affected certificates, we are going to encourage our partners to prepare an action plan on their side for such incidents. We will, of course, strongly support them during these processes.

We also intend to survey our partners on whether we can improve anything on our side to facilitate their actions in the event of an incident.

We believe that the path we have chosen will allow us to better manage the urgent revocation of certificates in the event of a security incident.

Tomorrow I will update the status of revocations.

Update on revocations:

2201 of 2202 certificates have now been revoked. I will update this bug when the last certificate has been revoked.

Can you provide the details for the last certificate and an explanation of why it's taking longer than the rest to replace?

Flags: needinfo?(wtrapczynski)

The last certificate was revoked at 2020-10-09 22:45:50 UTC.

This certificate was in the last batch of planned revocations and was unintentionally not replaced on the server. We noticed this while verifying the revocations, and the certificate was replaced and revoked a few hours later.
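
For reference, the size of the delay can be worked out from the timestamps in this bug. The sketch below assumes the 5-day window is counted from the notification time given in the timeline (2020-09-26 19:00 UTC+2); the report also mentions 19:44 for the alert email, which would shift the result by 44 minutes. The variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

UTC_PLUS_2 = timezone(timedelta(hours=2))  # the timeline uses UTC+2

# Assumption: the clock starts at the notification time from the timeline.
notified = datetime(2020, 9, 26, 19, 0, tzinfo=UTC_PLUS_2)
deadline = notified + timedelta(days=5)

# Last revocation, as reported above (UTC).
last_revoked = datetime(2020, 10, 9, 22, 45, 50, tzinfo=timezone.utc)

# Subtracting aware datetimes accounts for the differing UTC offsets.
overrun = last_revoked - deadline

print(deadline.astimezone(timezone.utc).isoformat())  # 2020-10-01T17:00:00+00:00
print(overrun)  # 8 days, 5:45:50
```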

Flags: needinfo?(wtrapczynski)

From Comment #1

We believe that such mass revocation of thousands of certificates for such critical infrastructure with such a high volume of traffic should be always investigated separately. We are working with our customer in order to establish a procedure of operation in case of such an incident in the future.

This sounds like a statement that Asseco DS/Certum won't abide by its CP/CPS in certain circumstances; that is, that Asseco DS/Certum believes there are exceptions to 4.9.1 of its CP/CPS.

In Comment #4, I don't really see concrete timelines or deliverables to understand what Asseco DS/Certum is doing, going forward, to ensure that their CP/CPS is adhered to. Have I overlooked an important detail?

Flags: needinfo?(wtrapczynski)

(In reply to Ryan Sleevi from comment #8)

From Comment #1

We believe that such mass revocation of thousands of certificates for such critical infrastructure with such a high volume of traffic should be always investigated separately. We are working with our customer in order to establish a procedure of operation in case of such an incident in the future.

This sounds like a statement that Asseco DS/Certum won't abide by its CP/CPS in certain circumstances; that is, that Asseco DS/Certum believes there are exceptions to 4.9.1 of its CP/CPS.

No. We do not assume that in certain circumstances the postpone of revocations is something usual and relieves us of our obligation to comply with CP/CPS. This wording is rather a reference to Mozilla Wiki's: "Mozilla recognizes that in some exceptional circumstances, revoking the affected certificates within the prescribed deadline may cause significant harm, such as when the certificate is used in critical infrastructure and cannot be safely replaced prior to the revocation deadline, or when the volume of revocations in a short period of time would result in a large cumulative impact to the web." In this case, there were many affected certificates installed on servers with high network traffic. If we had forced revocations before the certificates were replaced on the servers, it could have harmed a significant number of Internet users. Taking into account the above and the fact that there was no direct risk for end users, we decided that a delay of a few days in revocations would be less harmful to the Internet community.

In Comment #4, I don't really see concrete timelines or deliverables to understand what Asseco DS/Certum is doing, going forward, to ensure that their CP/CPS is adhered to. Have I overlooked an important detail?

As a CA, we are always ready to revoke many certificates in a short time. On further analysis, we identified the root cause of these delayed revocations as weak awareness of certificate revocation timeframes (in case of an incident) among our API clients.

Therefore, the steps we are going to take in the next weeks are:

  • Preparing a guide to certificate revocation for when one of the cases described in CABF BR 4.9.1.1 occurs (in progress right now).
  • Preparing a questionnaire to identify potential problems our clients may be unable to deal with (we are going to send it two weeks after providing the guide).
  • Analyzing the questionnaire results and working with clients to solve issues.

The main goal of all these steps is to build greater awareness among our clients. This should allow us and our clients to improve the incident revocation and renewal management process.

Flags: needinfo?(wtrapczynski)

(In reply to Wojciech Trapczyński from comment #9)

(In reply to Ryan Sleevi from comment #8)

From Comment #1

We believe that such mass revocation of thousands of certificates for such critical infrastructure with such a high volume of traffic should be always investigated separately. We are working with our customer in order to establish a procedure of operation in case of such an incident in the future.

This sounds like a statement that Asseco DS/Certum won't abide by its CP/CPS in certain circumstances; that is, that Asseco DS/Certum believes there are exceptions to 4.9.1 of its CP/CPS.

No. We do not assume that in certain circumstances the postpone of revocations is something usual and relieves us of our obligation to comply with CP/CPS. This wording is rather a reference to Mozilla Wiki's:

This is in reference to Responding to An Incident: Revocation.

However, it's important to point out a core requirement of this, which is:

You will perform an analysis to determine the factors that prevented timely revocation of the certificates, and include a set of remediation actions in the final incident report that aim to prevent future revocation delays.

A statement such as "should be always investigated separately" does seem to imply that there will be future revocation delays, and is thus at odds with the expectation.

The main goal of all these steps is to build greater awareness among our clients. This should allow us and our clients to improve the incident revocation and renewal management process.

Thanks. I recognize that Comment #9 begins to include some of the details required in the aforementioned expectation. However, I do want to highlight that the objective that Asseco DS is targeting is not merely "building greater awareness", but ensuring you prevent future revocation delays.

It is undeniable that "awareness" is likely a significant component of that, but I do want to make sure we see a clear plan of action as to what would happen if, despite all of Asseco DS's efforts to build greater awareness, a subscriber would still find themselves in the situation described here. Would Asseco reasonably state, as expected, that they made the subscriber aware and that revocation is required?

The problem here is that awareness is not a behaviour change, nor does it ensure behavior changes. The goal is to ensure behavior changes, and to have a plan towards that. For example, other CAs have adopted approaches of further reductions to their certificate lifetime, or even unpredictable replacement requirements, precisely to ensure all Subscribers are prepared and ready should Asseco need to revoke such certificates. The recently adopted SC35 in the CA/B Forum was adopted to ensure that CAs understand that the Subscriber Agreement, a binding legal agreement with the Subscriber, ensures the CA can and will revoke for violations of the CP/CPS, which was similarly a result of past discussions with CAs.

As it stands, I'm not sure "in the next weeks" is a very clear timeline, so can you make sure to provide a

binding timeline of when your CA expects to accomplish each of these remediation steps.

Flags: needinfo?(wtrapczynski)

(In reply to Ryan Sleevi from comment #10)

Thanks. I recognize that Comment #9 begins to include some of the details required in the aforementioned expectation. However, I do want to highlight that the objective that Asseco DS is targeting is not merely "building greater awareness", but ensuring you prevent future revocation delays.

It is undeniable that "awareness" is likely a significant component of that, but I do want to make sure we see a clear plan of action as to what would happen if, despite all of Asseco DS's efforts to build greater awareness, a subscriber would still find themselves in the situation described here. Would Asseco reasonably state, as expected, that they made the subscriber aware and that revocation is required?

The problem here is that awareness is not a behaviour change, nor does it ensure behavior changes. The goal is to ensure behavior changes, and to have a plan towards that. For example, other CAs have adopted approaches of further reductions to their certificate lifetime, or even unpredictable replacement requirements, precisely to ensure all Subscribers are prepared and ready should Asseco need to revoke such certificates. The recently adopted SC35 in the CA/B Forum was adopted to ensure that CAs understand that the Subscriber Agreement, a binding legal agreement with the Subscriber, ensures the CA can and will revoke for violations of the CP/CPS, which was similarly a result of past discussions with CAs.

Good point.

We are going to build awareness by sending a guide. Then, by sending a survey and analyzing the results, we are going to change behavior.

As it stands, I'm not sure "in the next weeks" is a very clear timeline, so can you make sure to provide a

binding timeline of when your CA expects to accomplish each of these remediation steps.

The timeline for these tasks is:

2020-11-02 - Sending a guide to our clients.
2020-11-16 - Sending a questionnaire to our clients and gathering answers up to 2020-11-20.
2020-11-27 - Completion of the survey analysis and conclusions.

After that, we are going to implement changes that will bring about behavior change. The changes will be determined mainly by the results of the survey. I will be able to provide the next steps after completing the survey analysis.

Flags: needinfo?(wtrapczynski)

Update: The guide is ready and will be sent on 2020-11-02.

Update: The guide has been sent.

Can you share the guide with us here?

Flags: needinfo?(wtrapczynski)

(In reply to george from comment #14)

Can you share the guide with us here?

I attached it to this bug.

Flags: needinfo?(wtrapczynski)
Whiteboard: [ca-compliance] [delayed-revocation-leaf] → [ca-compliance] [delayed-revocation-leaf] Next update 2020-12-01

All planned tasks have been completed on time.

We have decided that the following actions will be taken to prevent future revocation delays. We have one formal step, one technical step, and three technical recommendations that we will share with our API users.

[Formal - will be done by 2021-01-01]

  1. Change in the partners' contracts - adding clear information about certificate revocation

We noticed that our contracts with partners contain no clear information about the reasons for and timeframes of certificate revocation.

We decided to add clear provisions on the reasons for and timeframes of certificate revocation to the contracts.

[Technical - will be done by 2021-01-15]

  1. Change in API method - additional field for the revocation contact

We noticed that, for some partners, it takes too long for information about a certificate revocation to pass from Certum to the partner, and then from the partner to the customer.

An additional field will be added to the API method in which the partner can provide a contact email address of the customer, to be used if the certificate needs to be revoked. If this address is provided, Certum will contact both the partner and the customer directly. The address will not be used for any purpose other than certificate revocation. This will significantly shorten the time in which the customer receives information about a planned revocation, giving the partner and the customer more time to react and replace the certificate before revocation.
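
The notification rule described above can be sketched roughly as follows. This is not Certum's API: the class `OrderRequest`, its field names, and the function `revocation_recipients` are hypothetical and only illustrate the idea of an optional revocation contact that is notified alongside the partner.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OrderRequest:
    """Hypothetical API order payload; all field names are illustrative."""
    partner_email: str
    common_name: str
    # Optional customer contact, to be used only for revocation notices.
    revocation_contact_email: Optional[str] = None


def revocation_recipients(order: OrderRequest) -> List[str]:
    """The partner is always notified; the customer too, if a contact was given."""
    recipients = [order.partner_email]
    if order.revocation_contact_email:
        recipients.append(order.revocation_contact_email)
    return recipients
```

Keeping the field optional means existing integrations keep working unchanged, while partners who supply the address let the customer hear about a planned revocation without waiting for the partner to relay it.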

[Recommendations - will be done by 2020-12-15]

  1. Implement the reissue certificate API function

Only a small number of partners currently have the reissue certificate function implemented, which allows a certificate to be reissued quickly. This function may be useful when a revoked certificate must be replaced quickly with a new one.

We will encourage our partners to implement the reissue function.

  2. Implement a fully automated certificate installation process

Only a small number of partners currently have a fully automated process for installing SSL certificates on servers. Lack of automation may extend the time needed to update certificates in case of revocation. This can be particularly important when many certificates need to be updated in a short time.

We will encourage our partners to implement a fully automated process for installing SSL certificates on servers.

  3. Focus on certificate replacement, not on certificate revocation

Usually, in the case of an incident, Certum revokes all affected certificates. As an option, we have allowed our partners to revoke affected certificates themselves within the time required by the CABF BR.

We decided to give up this practice. From now on, in case of an incident, all necessary revocations will be performed directly by Certum. This will allow our partners to focus solely on the timely replacement of the certificates.

Whiteboard: [ca-compliance] [delayed-revocation-leaf] Next update 2020-12-01 → [ca-compliance] [delayed-revocation-leaf] Next update 2021-01-15

I would like to inform you that all planned steps have been completed on time.

Flags: needinfo?(bwilson)

I will close this next Wed. 27-Jan-2021.

Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Flags: needinfo?(bwilson)
Resolution: --- → FIXED
Product: NSS → CA Program
Summary: Asseco DS/Certum: Failure to revoke within 5 days → Asseco DS / Certum: Failure to revoke within 5 days
Whiteboard: [ca-compliance] [delayed-revocation-leaf] Next update 2021-01-15 → [ca-compliance] [leaf-revocation-delay]