Closed Bug 1717790 Opened 5 months ago Closed 4 months ago

Firmaprofesional: 2021 Audit Report Finding 1 out of 3

Categories

(NSS :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mprieto, Assigned: mprieto)

Details

(Whiteboard: [ca-compliance])

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36

Steps to reproduce:

#1 The existence of a user who is exercising a trusted role (System Administrator) has been evidenced without proof of the formal assignment and acceptance of said role.
1.1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.
It is a finding identified in the annual eIDAS audit, on 29 March 2021.

Actual results:

1.2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.
On 2021-04-08 16:41, this finding was registered in our JIRA (ticketing system), an action plan was established, and the relevant ETSI points and obligations were studied.
On 2021-04-08 16:47, the process “PR101-Human Resources” was modified and the following paragraph was added:
Each manager of the different areas must validate the correct assignment and acceptance of the role before making it effective and delivering the permissions and credentials associated with that role.
On 2021-04-28 17:10, the new process was validated by Firmaprofesional's Human Resources Director, and the onboarding of new personnel already follows this requirement.

1.3. Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.
It does not apply.
1.4. A summary of the problematic certificates. For each problem: number of certs, and the date the first and last certs with that problem were issued.
It does not apply.
1.5. The complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem.
It does not apply.

Expected results:

1.6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.
There was no explicit role verification before registering a profile for the performance of certain tasks that require a specific role assignment.
1.7. List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.
On 2021-04-28 17:10, the new procedure was validated by Firmaprofesional's Human Resources Director. The onboarding of new personnel already follows it: each manager of the different areas must validate the correct assignment and acceptance of the role before making it effective and delivering the permissions and credentials associated with that role.

Type: enhancement → task
Assignee: bwilson → mprieto
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Whiteboard: [ca-compliance]

This is the third year in a row that Firmaprofesional has had audit findings that have revealed new, previously undisclosed, incidents.

11 months ago, in Bug 1649502, Comment #2, concerns were raised about Firmaprofesional's lack of prompt disclosure. https://wiki.mozilla.org/CA/Responding_To_An_Incident#Incident_Report includes the following language:

Each incident should result in an incident report, written as soon as the problem is fully diagnosed and (temporary or permanent) measures have been put in place to make sure it will not re-occur. If the permanent fix is going to take significant time to implement, you should not wait until this is done before issuing the report. We expect to see incident reports as soon as possible, and certainly within two weeks of the initial issue report. While remediation work may still be ongoing, a satisfactory incident report will serve to resolve the issue from a Mozilla perspective.

Similarly, concerns were raised in Bug 1649726, Comment #6 about the quality of the reports, and the goals of the incident reporting process, related to Firmaprofesional's incident reports.

While I am encouraged, in a depressing way, that there remain ETSI auditors capable of highlighting issues and not just sweeping them off the report once remediated, I'm also concerned that Firmaprofesional's incident reporting process does not appear to have substantially improved in the interim. It suggests that there are deeper, systemic flaws at play here, and that we will continue to see Firmaprofesional fail to meet expectations: in operations, and in incident reports. Equally, there's a long string of troubling incidents showing that there's a pattern of non-compliance here.

Most importantly, I'm troubled by the actual substance of this issue, and Firmaprofesional's failure to recognize why it's quite troubling, and thus seek to address this as part of their incident report.

It would appear that Firmaprofesional gave someone system administrator permissions, without formally assigning that role or expectation as expected. Implicitly, this also suggests that there was the potential of this bypassing the separation of roles and duties - because Firmaprofesional failed to record that role and duty. Further, when confronted with this, it seems Firmaprofesional's solution was to try to fix this for "new" assignments, without any retroactive analysis or correction.

Equally, the perfunctory nature of this incident report makes it difficult to understand how things were, how they are, and the full impact. This seems very much a "Frodo has an adventure" sort of summary that fails to meet a good incident report.

Can you provide more substance here, both to why you failed to report this timely, and what the full issue was?

Flags: needinfo?(mprieto)

This is the third year in a row that Firmaprofesional has had audit findings that have revealed new, previously undisclosed, incidents.

2019: Bug 1606380, Bug 1612929, Bug 1610448
2020: Bug 1649502, Bug 1649679, Bug 1649724, Bug 1649726

11 months ago, in Bug 1649502, Comment #2, concerns were raised about Firmaprofesional's lack of prompt disclosure. https://wiki.mozilla.org/CA/Responding_To_An_Incident#Incident_Report includes the following language:

Each incident should result in an incident report, written as soon as the problem is fully diagnosed and (temporary or permanent) measures have been put in place to make sure it will not re-occur. If the permanent fix is going to take significant time to implement, you should not wait until this is done before issuing the report. We expect to see incident reports as soon as possible, and certainly within two weeks of the initial issue report. While remediation work may still be ongoing, a satisfactory incident report will serve to resolve the issue from a Mozilla perspective.

Similarly, concerns were raised in Bug 1649726, Comment #6 about the quality of the reports, and the goals of the incident reporting process, related to Firmaprofesional's incident reports.

While I am encouraged, in a depressing way, that there remain ETSI auditors capable of highlighting issues and not just sweeping them off the report once remediated, I'm also concerned that Firmaprofesional's incident reporting process does not appear to have substantially improved in the interim. It suggests that there are deeper, systemic flaws at play here, and that we will continue to see Firmaprofesional fail to meet expectations: in operations, and in incident reports. Equally, there's a long string of troubling incidents showing that there's a pattern of non-compliance here.

Most importantly, I'm troubled by the actual substance of this issue, and Firmaprofesional's failure to recognize why it's quite troubling, and thus seek to address this as part of their incident report.

Dear Ryan,
Thank you for your comments that always help CAs to improve their processes. We have read thoroughly "Frodo has an adventure" and we will try to do better.
Regarding your comment on the incidents of the last three years, Firmaprofesional is making efforts to be more rigorous with the standards. New incidents always concern new matters, which demonstrates that the actions and measures taken on previous findings are effective. Without a doubt, we can all always improve in different aspects, which is the reason for the annual audits.

Some of this year's incidents are in fact proposals by the auditors to make processes clearer: we were already following them, but they can be made more transparent when it comes to providing evidence. In addition, Firmaprofesional is proactively participating in some discussions and votes in this forum. This shows how interested we are in improving.

On the other hand, regarding incident notification times, we notify incidents as soon as the auditors deliver the final report, since we have rejected some of the reported findings because we did not agree with them. One such case is RFC5280, where the link provided by @Dimitris shows, even now, that we were right. Bug 1700145 is a good example of how Firmaprofesional tried to do its best in this matter.

It would appear that Firmaprofesional gave someone system administrator permissions, without formally assigning that role or expectation as expected. Implicitly, this also suggests that there was the potential of this bypassing the separation of roles and duties - because Firmaprofesional failed to record that role and duty. Further, when confronted with this, it seems Firmaprofesional's solution was to try to fix this for "new" assignments, without any retroactive analysis or correction.

Equally, the perfunctory nature of this incident report makes it difficult to understand how things were, how they are, and the full impact. This seems very much a "Frodo has an adventure" sort of summary that fails to meet a good incident report.

Can you provide more substance here, both to why you failed to report this timely, and what the full issue was?

The Security Officer is the only person who can grant such a role, so we always have control over who holds which security role. It was an isolated matter of the bureaucratic procedure for documenting such an assignment.

We acknowledge that documented evidence is also important and this is why we have changed our HR procedure, to ensure that not only the assignment is authorized but it is also documented and easily traceable.

Additionally, the Security Officer and the head of Human Resources verified that the role assignments of all personnel had been carried out correctly and that the reported incident was an isolated case; therefore, the measures to be adopted in this case are aimed at improving future processes. This task was done on March 30th, after discovering the potential finding.

Flags: needinfo?(mprieto)

On the other hand, regarding incident notification times, we notify incidents as soon as the auditors deliver the final report, since we have rejected some of the reported findings because we did not agree with them. One such case is RFC5280, where the link provided by @Dimitris shows, even now, that we were right.

In these cases, and in general, it's better to over-disclose, rather than under-disclose. Having a CA show proactive disclosure and engagement, while ultimately resolving the matter as Resolved/Invalid, is a far better outcome than having a CA be aware of a matter of concern, spend several months debating it with their auditor, and then ultimately finding out their evaluation was wrong.

A key point of these incident reports is to provide transparency and gather feedback. It also allows others to provide support for your analysis, or to highlight concerns with the analysis. This exists to ensure CAs are behaving transparently.

All of this to say: In the future, if there is the suspicion of an incident, it would be far better for Firmaprofesional's continued trust to disclose it (along with the preliminary analysis), including any analysis why you may disagree. The more transparent a CA is, the better, and it also helps identify areas that might be confusing auditors (leading to improvements) or highlighting areas of concern with auditors (to be addressed directly with them). In the worst case, and it is a legitimate issue, then you've promptly disclosed, which is still the best outcome for users.

With respect to this incident itself, it seems you've largely focused on correcting the immediate issue, which appears to be a lack of required documentation. However, this doesn't explain how Firmaprofesional missed this issue (until the auditor detected it). It also suggests that there may be other issues, because 5.4.1 of the BRs requires these events (and supporting records) be logged.

Put differently, Question 6 of the Incident Report seeks to understand "How did things go wrong, and how will things be prevented from going wrong in the future". You focused on the incident, but we also want to understand how the process went wrong to let the incident happen, and what's being changed (in the compliance process) to detect and prevent incidents of this category (i.e. not implementing the BRs correctly), rather than this specific incident.

Flags: needinfo?(mprieto)

In these cases, and in general, it's better to over-disclose, rather than under-disclose. Having a CA show proactive disclosure and engagement, while ultimately resolving the matter as Resolved/Invalid, is a far better outcome than having a CA be aware of a matter of concern, spend several months debating it with their auditor, and then ultimately finding out their evaluation was wrong.
A key point of these incident reports is to provide transparency and gather feedback. It also allows others to provide support for your analysis, or to highlight concerns with the analysis. This exists to ensure CAs are behaving transparently.
All of this to say: In the future, if there is the suspicion of an incident, it would be far better for Firmaprofesional's continued trust to disclose it (along with the preliminary analysis), including any analysis why you may disagree. The more transparent a CA is, the better, and it also helps identify areas that might be confusing auditors (leading to improvements) or highlighting areas of concern with auditors (to be addressed directly with them). In the worst case, and it is a legitimate issue, then you've promptly disclosed, which is still the best outcome for users.

Thanks, Ryan, for showing us this way to improve in the future, which is also perfectly aligned with what we have proposed in https://bugzilla.mozilla.org/show_bug.cgi?id=1717795
Finally, on future occasions, when there are discrepancies between the auditors' interpretation of a possible finding and ours, we will launch a public discussion to determine which is the valid criterion, which will help both the community and us in the specific case.

With respect to this incident itself, it seems you've largely focused on correcting the immediate issue, which appears to be a lack of required documentation. However, this doesn't explain how Firmaprofesional missed this issue (until the auditor detected it). It also suggests that there may be other issues, because 5.4.1 of the BRs requires these events (and supporting records) be logged.

Analyzing the detail of section 5.4.1 of the BRs that Ryan mentions, we consider that all logs and recorded events are managed correctly, for the following reasons:

  1. Because they have recently been audited.
  2. Because only Firmaprofesional operators issue SSL certificates, and this type of certificate is not delegated to any external entity.
  3. Because we use an automated ticketing system (JIRA) for SSL issuance requests.
  4. Because we use our RA certificate issuance system, which can only be accessed with a digital operator certificate and in which all the steps of issuing an SSL certificate are recorded.
  5. Because the finding was found by the auditor while reviewing the records and not finding the formal assignment and acceptance of said role.

Put differently, Question 6 of the Incident Report seeks to understand "How did things go wrong, and how will things be prevented from going wrong in the future". You focused on the incident, but we also want to understand how the process went wrong to let the incident happen, and what's being changed (in the compliance process) to detect and prevent incidents of this category (i.e. not implementing the BRs correctly), rather than this specific incident.

Two key departments participate in the process involved in this incident: HR and the technical department (or the department to which the specific personnel are attached).

We believe that this incident is resolved and, above all, prevented from recurring by improving communication between the affected departments through a modified procedure. And, indeed, this is what we proposed.

The resolution of the incident is easy: the person responsible for the user made the formal assignment of the role, and the user formally accepted the role. The resolution of the root cause is what prompted us to modify the procedure. What we are trying to achieve with the redesign of the procedure is that the person in charge of the area (e.g. Security Manager) that the person with the new role (e.g. Systems Administrator) is going to join proactively notifies HR of the specific role assignment for the new worker. In this way, HR performs the assignment of roles based on the tasks confirmed by the security manager, correctly closing the communication circle.
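
For illustration only, the redesigned procedure and the retroactive review of existing role holders could be sketched in Python as follows. This is a minimal sketch under assumptions: the `RoleAssignment` record, its field names, and both helper functions are hypothetical and do not describe Firmaprofesional's actual systems or the text of PR101.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional, Tuple


@dataclass
class RoleAssignment:
    """Hypothetical record of a trusted-role assignment (not a real schema)."""
    user: str
    role: str
    assigned_by_manager_at: Optional[datetime] = None  # formal assignment by the area manager
    accepted_by_user_at: Optional[datetime] = None      # formal acceptance by the user
    hr_recorded_at: Optional[datetime] = None           # HR has filed the record

    def is_documented(self) -> bool:
        # All three steps must have a timestamp before the role is effective.
        steps = (self.assigned_by_manager_at,
                 self.accepted_by_user_at,
                 self.hr_recorded_at)
        return all(step is not None for step in steps)


def may_deliver_credentials(assignment: RoleAssignment) -> bool:
    """Gate: permissions and credentials are delivered only for documented roles."""
    return assignment.is_documented()


def undocumented_holders(
    active_roles: Iterable[Tuple[str, str]],
    assignments: Iterable[RoleAssignment],
) -> List[Tuple[str, str]]:
    """Retroactive check: (user, role) pairs that hold permissions without a complete record."""
    documented = {(a.user, a.role) for a in assignments if a.is_documented()}
    return sorted(set(active_roles) - documented)
```

In this sketch, running `undocumented_holders` against the list of accounts that currently hold trusted-role permissions would surface any case like the one the auditor found, and `may_deliver_credentials` would block new cases before credentials are issued.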

Any proposal for improvement will be well received!

Flags: needinfo?(mprieto)
Flags: needinfo?(bwilson)

I don't have any further suggestions or questions. I will schedule this to be closed on or about this Friday, 13-Aug-2021.

Status: ASSIGNED → RESOLVED
Closed: 4 months ago
Flags: needinfo?(bwilson)
Resolution: --- → FIXED