Camerfirma: certificate for unregistered domain cuatis.net
Categories
(CA Program :: CA Certificate Compliance, task)
Tracking
(Not tracked)
People
(Reporter: agwa-bugs, Assigned: ana.lopes)
Details
(Whiteboard: [ca-compliance] [ov-misissuance])
On Nov 12, 2019 at 12:17 UTC, Camerfirma issued the following certificate for mail.cuatis.net (note the subject organization "CUALTIS S.L.U"):
https://crt.sh/?sha256=F2C6316C4EFA5C18EAB78E36AB2EA3DA54CD544D47ACCE0607AFAE84C68FD0CC
At 16:55 UTC that day, this certificate was revoked.
At 17:18 UTC that day, Camerfirma issued the following certificate for mail.cualtis.net (note the addition of an l in the domain name):
https://crt.sh/?sha256=8B5620B6B878B52FFFB8ADB333C23950C461E4AD09045A3F5256F5B30DED80EE
At this time, cuatis.net is not a registered domain. While this does not preclude the possibility that the first certificate above was correctly validated, the sequence of events strongly suggests that the first certificate was the result of a typo in the domain name. I would therefore ask Camerfirma to provide details about how this certificate was validated.
Please find below the incident report.
- How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.
We became aware of the problem on October 22nd through bug 1672423, which Andrew Ayer opened on October 21st.
- A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.
• After reviewing the information and verifying that the certificate had been issued with an error in its domain name, we started to investigate the case. (October 22nd)
• We asked our Operations department why it had been possible to issue that certificate. (October 22nd)
• The Operations department sent us their explanation. (October 22nd)
• We reviewed the measures that were in place at that time to prevent the issuance of certificates with errors. (October 22nd)
• We confirmed with the development department that the controls developed since then are currently working to prevent this kind of problem. (October 22nd)
- Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.
The certificate identified is the only one issued with this problem, and it was revoked immediately after the incorrect domain name was detected. The CA has not issued any other certificates with this error.
- A summary of the problematic certificates. For each problem: number of certs, and the date the first and last certs with that problem were issued.
https://crt.sh/?sha256=F2C6316C4EFA5C18EAB78E36AB2EA3DA54CD544D47ACCE0607AFAE84C68FD0CC
- The complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem.
https://crt.sh/?sha256=F2C6316C4EFA5C18EAB78E36AB2EA3DA54CD544D47ACCE0607AFAE84C68FD0CC
- Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.
This error was caused by human error.
The client made a typo when entering the domain name.
At that time, we did not yet have automatic controls, and the checks were performed manually by comparing against the name given by the client, so the error was not detected at the time of issuance.
However, the RA operator detected the typo later, and the certificate was revoked and replaced with the correct one in less than 5 hours.
- List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.
• The main action that prevents this issue in future certificates has already been implemented. Camerfirma developed new functionality in its systems to check the domain; it was deployed to the production environment and came into use on August 24th, 2020.
• We are going to reinforce the training for our RA operators so that they better understand how to report this kind of mistake. (December 31st, 2020)
Comment 2•5 years ago
(In reply to Ana Lopes from comment #1)
- Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.
The certificate identified is the only one issued with this problem, and it was revoked immediately after the incorrect domain name was detected. The CA has not issued any other certificates with this error.
How do you know this?
Nothing about this incident report provides much confidence that this can be stated as authoritatively as stated here; Camerfirma was unaware of the existence of the issue in the first place until it was reported externally.
The fact that this was externally detected and reported makes it clear that there are indeed a number of remedial steps that Camerfirma could be taking to provide assurance to the public. For example:
- Examining every single revocation performed by the RA operators to ensure that there were no other failures to report.
- Examining every single unexpired certificate for other examples of potentially-invalid domain names, by cross-referencing against WHOIS data.
However, given that Camerfirma proposes that it will take them two months to train their staff as to the basics of PKI operations, it seems unlikely that Camerfirma could conduct either of those steps correctly in a timely fashion. It also seems to undermine the confidence with which this statement was made; for example, should someone find another certificate demonstrating this problem, what would Camerfirma's response be, and what should the community response be?
- Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.
This error was caused by human error.
The client made a typo when entering the domain name.
At that time, we did not yet have automatic controls, and the checks were performed manually by comparing against the name given by the client, so the error was not detected at the time of issuance.
However, the RA operator detected the typo later, and the certificate was revoked and replaced with the correct one in less than 5 hours.
Human error is not a root cause. "Reinforcing training" does not address the issue. This is clearly called out in Mozilla's expectations for incidents, and it's very disappointing that Camerfirma would overlook such a key element.
Let's look at a number of key failures here:
- A lack of validation that domains are well-formed/valid (let alone validating authorization, this is simply about making sure "is a real domain")
- A lack of automated CAA checking
- A lack of supervision regarding certificate revocations
- A lack of historic examination of past certificates in response to incidents
- A lack of meeting the baseline expectations for incident reporting
This is a profoundly significant issue. Beyond issuing unauthorized certificates, it would appear there was a coverup at Camerfirma regarding this, much like what happened at DigiNotar.
I'm sure Camerfirma would assert it was "human error", a single RA operator mistake, a lack of understanding, or countless other seemingly-benign explanations, but it bears calling out: Camerfirma's leadership team implemented policies and architectures that enabled such situations and failed to supervise them accordingly, and these "benign answers" point to serious deficiencies.
Camerfirma's explanation fails to provide reasonable assurance that this would be prevented going forward. An RA operator can just as well make mistakes and revoke them without supervision. I think this more fundamentally calls into question all certificates issued by Camerfirma, given the clear lack of suitable controls here. I'd like to encourage you to take a more detailed look at examining root causes for what went wrong here, and what steps could be taken to prevent this. I think further, it's appropriate to understand the organizational separation at Camerfirma that it requires an "Operations department" to explain here, and how Camerfirma is structurally organized.
Similarly, it's absolutely essential to understand why Camerfirma does not see the completion of training until 2020-12-31, what that training will entail, why that training is necessary, what the current practices around training are, and how that training is being updated in light of this.
Hi Ryan,
We know that no other certificates were issued with the same error because we examined all issued certificates to verify it.
Note that we now use automatic controls to review certificates, which were not available when the incident took place; that is why we can be more confident about this now.
Regarding your questions about the training, we would like to add that the training will be progressive, incorporating information about problems as we detect them, and will be accompanied by technical controls that let the Expertise department know if any suspicious revocation is performed.
We consider that, apart from automatic controls, RA staff need a good level of knowledge to avoid future errors that the automatic controls have not anticipated.
Regarding the causes and solutions, we agree with you that human error is not the root cause of the problem. Human error triggered the problem, but the root cause was a combination of the following three failures that you mentioned:
• A lack of validation that domains are well-formed/valid (let alone validating authorization, this is simply about making sure "is a real domain")
• A lack of automated CAA checking
And it was not reported because of:
• A lack of supervision regarding certificate revocations
We considered them when the incident was detected and studied, and that is why we have implemented measures to avoid those errors in the future, such as:
• The automatic controls to detect the incorrect name
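The thread does not describe how this control works. As a minimal sketch of one possible pre-issuance syntax check (the function name `is_well_formed_hostname` and the exact rules are assumptions for illustration, not Camerfirma's implementation), a hostname can be checked label by label against the usual letter-digit-hyphen rules. Wildcard and internationalized names are deliberately out of scope here.

```python
import re

# One DNS label: 1-63 ASCII letters, digits, or hyphens,
# neither starting nor ending with a hyphen.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_well_formed_hostname(name: str) -> bool:
    """Return True if `name` is a syntactically plausible multi-label hostname.

    Hypothetical helper: it checks syntax only and says nothing about
    whether the domain is actually registered.
    """
    if name.endswith("."):  # tolerate one trailing dot (fully qualified form)
        name = name[:-1]
    if not name or len(name) > 253:
        return False
    labels = name.split(".")
    if len(labels) < 2:  # require at least "label.tld"
        return False
    if labels[-1].isdigit():  # an all-numeric last label is not a TLD
        return False
    return all(_LABEL.fullmatch(label) for label in labels)
```

Note that a syntax check alone would not have caught this incident: mail.cuatis.net is well-formed, so it would need to be combined with a check that the domain is registered and that the applicant controls it.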
To avoid this kind of problem in the future, we have created a new RFC to develop functionality that detects every revocation performed less than a week after the certificate was issued, because we consider this a suspicious activity that would not normally occur unless an error had been made.
If that happens, an alert will be sent to the Expertise department automatically to study the case.
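The detection rule described above could be sketched as follows. This is only an illustration: the `CertRecord` type, its fields, and the `rapid_revocations` function are hypothetical names, and the real alerting path to the Expertise department is not described in this bug.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class CertRecord:
    """Hypothetical issuance record; field names are assumptions."""
    serial: str
    issued_at: datetime
    revoked_at: Optional[datetime] = None


def rapid_revocations(
    records: Iterable[CertRecord],
    window: timedelta = timedelta(days=7),
) -> List[CertRecord]:
    """Return certificates revoked less than `window` after issuance."""
    return [
        r for r in records
        if r.revoked_at is not None and r.revoked_at - r.issued_at < window
    ]


# The timestamps reported in this bug: issued 12:17 UTC and revoked
# 16:55 UTC on Nov 12, 2019. The serial is a placeholder.
incident = CertRecord(
    serial="placeholder-serial",
    issued_at=datetime(2019, 11, 12, 12, 17, tzinfo=timezone.utc),
    revoked_at=datetime(2019, 11, 12, 16, 55, tzinfo=timezone.utc),
)
flagged = rapid_revocations([incident])
```

Each record in `flagged` would then be routed for manual review; with the one-week window, the certificate in this incident falls inside the rule.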
We want to highlight that we will add some other alerts as we discover more suspicious situations that we have to take into account.
All RA operators already receive the mandatory training courses, but we want to add short modules on the problems and bugs detected, to increase awareness so that operators understand the consequences of each possible error and know how to react to new incidents or problems that the training has not covered.
The plan is to include this new information in the training courses for new operators and to send notifications to the existing operators.
In addition, following your advice, we are going to examine in more depth other possible root causes that could have contributed to the problem, and review the organization of Camerfirma to map the different requirements to the departments involved.
Update:
The following controls already deployed will be able to avoid future problems:
- Syntax control of the domain (August 2020)
- Automatic verification of CAA (June 2020).
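For context on what an automatic CAA verification involves, here is a minimal sketch of the RFC 8659 logic for a non-wildcard name: climb from the requested name toward the root, take the first non-empty CAA record set, and allow issuance only if that set does not restrict it or names the CA. The DNS query is passed in as a `lookup` callable, and all names here (`relevant_caa_set`, `may_issue`, `ca.example`) are assumptions for illustration. Lookup failures, DNSSEC, and wildcard (`issuewild`) handling are not covered.

```python
from typing import Callable, List, Tuple

CaaRecord = Tuple[int, str, str]           # (flags, tag, value)
Lookup = Callable[[str], List[CaaRecord]]  # hypothetical DNS query hook

_KNOWN_TAGS = {"issue", "issuewild", "iodef"}
_CRITICAL = 0x80  # issuer-critical flag bit


def relevant_caa_set(fqdn: str, lookup: Lookup) -> List[CaaRecord]:
    """Climb the DNS tree and return the first non-empty CAA record set."""
    labels = fqdn.rstrip(".").lower().split(".")
    for i in range(len(labels)):
        records = lookup(".".join(labels[i:]))
        if records:
            return records
    return []


def may_issue(fqdn: str, ca_domain: str, lookup: Lookup) -> bool:
    """Decide whether CAA permits `ca_domain` to issue for a non-wildcard name."""
    records = relevant_caa_set(fqdn, lookup)
    # An unrecognized property marked critical forbids issuance.
    for flags, tag, _value in records:
        if flags & _CRITICAL and tag.lower() not in _KNOWN_TAGS:
            return False
    issue_values = [v for _f, tag, v in records if tag.lower() == "issue"]
    if not issue_values:
        return True  # no "issue" property, so CAA does not restrict issuance
    for value in issue_values:
        # The issuer domain precedes any ";"-separated parameters.
        issuer = value.split(";", 1)[0].strip().lower()
        if issuer == ca_domain.lower():
            return True
    return False
```

Injecting `lookup` keeps the decision logic testable without network access; a production check would also have to decide how to treat lookup failures.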
In addition, as planned, we have already updated the training for RA operators to reinforce how to report mistakes detected during their work.
However, the new functionality we mentioned, which detects all revocations performed less than a week after issuance, will be developed by March 2021.
Do you think the bug can be closed, or do you need any further information about it?
Comment 5•5 years ago
I think it's probably best to keep any further discussion at https://groups.google.com/g/mozilla.dev.security.policy/c/dSeD3dgnpzk/m/diOfeWNpBQAJ since I think we've reached the point of useful replies.
Hi Ben,
As you can see, there is no discussion of this bug specifically at https://groups.google.com/g/mozilla.dev.security.policy/c/dSeD3dgnpzk/m/diOfeWNpBQAJ because the information there covers the issues only in general terms.
Do you think we need to add further information about this bug so that you can close it?
Comment 7•5 years ago
I will schedule to close this bug on or about 22-Jan-2021 unless there are further comments/issues that need to be addressed specifically here rather than in the thread in m.d.s.p. started here: https://groups.google.com/g/mozilla.dev.security.policy/c/dSeD3dgnpzk/m/4_UkFO2SAAAJ.
Comment 8•5 years ago
Thank you Ben. We do not have more updates to add.