Closed Bug 1819105 Opened 2 years ago Closed 2 years ago

NETLOCK: Disclosed CRL is expired

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: agwa-bugs, Assigned: horvath.tamas2)

Details

(Whiteboard: [ca-compliance] [crl-failure])

Netlock has disclosed the CRL http://crl1.netlock.hu/index.cgi?crl=trustev3&crltype=f for the issuer "C=HU, L=Budapest, O=NETLOCK Ltd., CN=NETLOCK Trust EV CA 3". This CRL has a Next Update in 2020:

        Issuer: C = HU, L = Budapest, O = NETLOCK Ltd., CN = NETLOCK Trust EV CA 3
        Last Update: May 27 12:50:51 2020 GMT
        Next Update: May 28 12:50:51 2020 GMT
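
The staleness of the CRL quoted above can be checked mechanically. The following is a minimal sketch, not Netlock's or the reporter's tooling: it parses the Next Update string in the layout shown in the dump and compares it with a check time. The function names and the check date are illustrative assumptions, and `%b` parsing assumes an English/C locale.

```python
from datetime import datetime, timezone

# Timestamp layout used in the CRL text dump quoted above.
CRL_TIME_FORMAT = "%b %d %H:%M:%S %Y GMT"


def parse_crl_time(text: str) -> datetime:
    """Parse a timestamp such as 'May 28 12:50:51 2020 GMT' as UTC."""
    parsed = datetime.strptime(text.strip(), CRL_TIME_FORMAT)
    return parsed.replace(tzinfo=timezone.utc)


def crl_is_expired(next_update: str, now: datetime) -> bool:
    """A CRL is stale once the check time is past its Next Update."""
    return now > parse_crl_time(next_update)


# Hypothetical check time, chosen only for illustration.
check_time = datetime(2023, 2, 27, tzinfo=timezone.utc)
print(crl_is_expired("May 28 12:50:51 2020 GMT", check_time))
```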
Assignee: nobody → banyai.anna
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Whiteboard: [ca-compliance] [crl-failure]

Netlock, please provide:

  1. a full incident report
  2. an explanation as to why 11 days have passed without acknowledgment
  3. what changes will be made to ensure more timely acknowledgment and incident reporting from Netlock in the future
Assignee: banyai.anna → horvath.tamas
Flags: needinfo?(horvath.tamas)

How your CA first became aware of the problem (e.g., via a problem report submitted to your Problem Reporting Mechanism, a discussion in the MDSP mailing list, a Bugzilla bug, or internal self-audit), and the time and date.

We received a Bugzilla ticket on the matter from Mr. Andrew Ayer: https://bugzilla.mozilla.org/show_bug.cgi?id=1819105

A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was performed.

Date        Action taken
01/03/2023  The Bugzilla ticket was processed by our internal team, which created an internal ticket on the matter
02/03/2023  The problem was identified and resolved by the support engineers
03/03/2023  The cache problem that arose was identified and resolved by the support engineers

Whether your CA has stopped, or has not yet stopped, certificate issuance or the process giving rise to the problem or incident. A statement that you have stopped will be considered a pledge to the community; a statement that you have not stopped requires an explanation.

The NETLOCK Trust EV CA 3 is only used to generate test certificates for the valid, expired, and revoked test sites (valid.ev.tanusitvany.hu, expired.ev.tanusitvany.hu, revoked.ev.tanusitvany.hu). It is considered a test function, and therefore it is currently monitored on the infrastructure level only.

In a case involving certificates, a summary of the problematic certificates. For each problem: the number of certificates, and the date the first and last certificates with that problem were issued. In other incidents that do not involve enumerating the affected certificates (e.g., OCSP failures, audit findings, delayed responses, etc.), please provide other similar statistics, aggregates, and a summary for each type of problem identified. This will help us measure the severity of each problem.

This case does not involve certificates.

In a case involving TLS server certificates, the complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem. It is also recommended that you use this form in your list "https://crt.sh/?sha256=[sha256-hash]", unless circumstances dictate otherwise. When the incident being reported involves an SMIME certificate, if disclosure of personally identifiable information in the certificate may be contrary to applicable law, please provide at least the certificate serial number and SHA256 hash of the certificate. In other cases not involving a review of affected certificates, please provide other similar, relevant specifics, if any.

The only affected certificate is for the valid.ev.tanusitvany.hu test site.
https://crt.sh/?id=8127310592

Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

The root cause of the problem was an incorrectly generated CRL, which was cached by our front-end servers. The CRL generation error was fixed right away, which cleared all of the caches except one. Unfortunately, as mentioned above, we do not currently monitor all aspects of the test chains, so our monitoring did not pick up the error on that single cache server (only one of the three cache servers held and served the incorrectly generated CRL).
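
A per-cache comparison is one way a single stale cache server could be detected. The sketch below assumes the Next Update value of the CRL served by each cache has already been fetched and parsed separately; the cache names, the function name, and two of the three timestamps are hypothetical and are not taken from Netlock's infrastructure.

```python
from datetime import datetime, timezone


def lagging_caches(next_updates: dict[str, datetime]) -> list[str]:
    """Return names of caches whose CRL has an older Next Update than the newest one seen."""
    newest = max(next_updates.values())
    return sorted(name for name, value in next_updates.items() if value < newest)


# Hypothetical observations: one Next Update value per cache server.
observed = {
    "cache-a": datetime(2023, 3, 4, 12, 0, tzinfo=timezone.utc),
    "cache-b": datetime(2023, 3, 4, 12, 0, tzinfo=timezone.utc),
    # Next Update value quoted in the original report.
    "cache-c": datetime(2020, 5, 28, 12, 50, 51, tzinfo=timezone.utc),
}
print(lagging_caches(observed))
```

Comparing the caches against each other, rather than only against the clock, would also flag a cache that serves an older but not yet expired CRL.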

List of steps your CA is taking to resolve the situation and ensure that such situation or incident will not be repeated in the future. The steps should include the action(s) for resolving the issue, the status of each action, and the date each action will be completed.

We are currently in the process of redesigning our monitoring system, where we are planning to extend both the levels of checking and the tools we use. At the moment we are in the process of defining the business requirements which can be automated by different monitoring implementations.

Flags: needinfo?(horvath.tamas)

Hello Chris!

We are currently undergoing multiple external audits, and our compliance resources are limited; this is the reason we did not respond to the initial request. One of the audits is almost over, and our team has also been extended with new internal and external colleagues (including me).

Tamás,
Are there any updates from your side?
Thanks,
Ben

Flags: needinfo?(horvath.tamas)

Tamás,

The delays observed in Netlock's open incident reports are unacceptable and minimally represent a failure to comply with the Chrome Root Program's commitment to timely and transparent incident reporting. Beyond providing updates as requested across this bug (and all others related to Netlock), please explain specific and time-bound steps Netlock will complete to improve upon the delivery of its existing commitments.

-Ryan

Hello Ben!

No further update at this moment. We have improved the monitoring on the infrastructure level to be able to detect such incidents.

Hello Ryan, thanks for the note. You are absolutely right, there have been no updates on the issues in the past couple of weeks. As we have limited resources and went through multiple audits in the past 8 weeks, my attention was diverted from communication on these matters. I'm sorry about it and will improve it in the upcoming days/weeks.

Thanks

Tamas

Flags: needinfo?(horvath.tamas)

Hi Tamás,

Comment 5 requests Netlock explain specific and time-bound steps that it will complete to improve upon the delivery of its existing commitments. Please see that this is provided.

Limited resources and a competing audit schedule are not satisfactory explanations as to the delays and insufficient prioritization of incident management demonstrated in this bug, and many others.

As a reminder, we provide examples of good practice on CCADB.org.

Thanks,
Ryan

Hello!

Our entire monitoring system is currently under review.

The review of our monitoring system is also required by the Hungarian Authority of the TSP and must be completed by a specific deadline (2024.01.02.).

By the end of this year, we will finish reviewing our monitoring systems. The following steps have already been completed:

  1. Identified the system elements that are included in the monitoring
  2. Reviewed their necessity
  3. Specified the necessary elements
  4. Started the procurements

Until the new system is installed, we are performing manual checks on a predefined basis.

These CRL checks are carried out manually, with each due check tracked in a Jira ticket.

An internal tool for automated tests and checks will also be written within a month.

Tamas
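
As an illustration of what an automated CRL check might do beyond a plain expiry test, the sketch below classifies a CRL by how much of its validity window remains, so that a warning can be raised before Next Update passes. This is an assumption about one possible design, not a description of the internal tool mentioned above; the function name and the 25% warning threshold are hypothetical.

```python
from datetime import datetime, timezone


def crl_status(this_update: datetime, next_update: datetime,
               now: datetime, warn_fraction: float = 0.25) -> str:
    """Classify a CRL as 'ok', 'warning', or 'expired' at time `now`."""
    if now >= next_update:
        return "expired"
    lifetime = next_update - this_update
    remaining = next_update - now
    # Warn once less than `warn_fraction` of the validity window is left.
    if remaining <= lifetime * warn_fraction:
        return "warning"
    return "ok"


# Last Update / Next Update values from the CRL quoted in the original report.
last = datetime(2020, 5, 27, 12, 50, 51, tzinfo=timezone.utc)
nxt = datetime(2020, 5, 28, 12, 50, 51, tzinfo=timezone.utc)
print(crl_status(last, nxt, datetime(2020, 5, 28, 8, 0, tzinfo=timezone.utc)))
```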

Summary: Netlock: Disclosed CRL is expired → NETLOCK: Disclosed CRL is expired

Hello!

Besides updating our internal monitoring systems, we implemented a new external monitoring tool that checks crlwatch, ocspwatch, and crt.sh data so that we get faster alerts on issues related to our certificates and services. The external tool is uptimerobot; the configuration is already implemented on multiple devices, on which we receive push notifications about issues.

Tamas

Unless there are additional questions or comments, I intend to close this on Friday, 29-Sept-2023.

Flags: needinfo?(bwilson)
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Flags: needinfo?(bwilson)
Resolution: --- → FIXED