CommScope: Certificate not revoked as it was supposed to be
Categories
(CA Program :: CA Certificate Compliance, task)
Tracking
(Not tracked)
People
(Reporter: nicol.so, Assigned: nicol.so)
Details
(Whiteboard: [ca-compliance] [policy-failure])
Incident Report
Summary
CommScope was informed of an issue with one of our websites intended to support revocation testing. The website's certificate was supposed to be revoked, but neither the applicable CRL nor the responses from the applicable OCSP responder reported it as revoked.
Impact
The number of certificates involved is 1 (a certificate used for a test website was supposed to have been revoked but was not).
Timeline
All times are UTC-0700.
2023-10-11:
- 11:23 Timestamp of email from Mozilla through which CommScope was alerted to the issue
- 11:30 Investigation began to verify the issues found in test websites for the root CA “CommScope Public Trust ECC Root-01”
- 12:26 Email sent to CA Officers and selected members of the R&D team with an analysis of the two test website certificate issues reported
- 14:18 Determination made that one of the issues (certificate not revoked as expected) needed to be handled as an incident; a case would be created in CommScope’s issue tracking system
- 14:42 Case created in the issue tracking system
- 15:11 The incident/case was approved by a CA Officer; remediation actions authorized to proceed
- 15:35 The certificate involved was revoked
- 15:46 Case status updated to “resolved”
2023-10-12:
- 13:00 Meeting held to discuss factors contributing to the incident and what improvements, if any, should be made to help prevent repeats of the incident. A decision was made to modify the operating procedure involved in the incident
2023-10-13:
- 15:21 First draft of updated operating procedure circulated
2023-10-17:
- 13:27 Updated operating procedure approved and published
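For readers working in UTC, the timeline entries above can be converted and the elapsed times computed as in the following minimal sketch. The helper name `at` and the choice of which entries to compare are illustrative assumptions; only the dates, times, and the UTC-0700 offset come from the report.

```python
from datetime import datetime, timedelta, timezone

# The report states that all timeline times are UTC-0700.
REPORT_TZ = timezone(timedelta(hours=-7))

def at(date: str, time: str) -> datetime:
    """Parse a timeline entry (hypothetical helper) in the report's offset."""
    parsed = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M")
    return parsed.replace(tzinfo=REPORT_TZ)

alerted = at("2023-10-11", "11:23")              # email from Mozilla
revoked = at("2023-10-11", "15:35")              # certificate revoked
procedure_published = at("2023-10-17", "13:27")  # updated procedure published

time_to_revoke = revoked - alerted
time_to_procedure = procedure_published - alerted

print(time_to_revoke)                                # 4:12:00
print(revoked.astimezone(timezone.utc).isoformat())  # 2023-10-11T22:35:00+00:00
print(time_to_procedure)                             # 6 days, 2:04:00
```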
Root Cause Analysis
The issue was caused by a human error in executing a procedure. On 2023-09-08, a ticket was created in our issue tracking system for an internally originated request to revoke 4 certificates used by our test websites. In processing the request, an operator used an application UI to initiate revocation of the certificates, one at a time. Instead of processing each to-be-revoked certificate once, the operator inadvertently processed one of the certificates twice, leaving another certificate unrevoked. A contributing factor to the non-detection of the issue was that no separate check was performed after the revocations were carried out.
Lessons Learned
Procedures with parts requiring human intervention should incorporate technical and/or procedural controls to protect against human errors. More particularly, where a human operator is required to process a list of items, control(s) should be put in place to ensure that all of the items are processed, by comparing the items processed against the source list.
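The comparison described above can be illustrated with a minimal sketch. It is not CommScope's implementation; the function name `reconcile` and the identifiers `cert-A` through `cert-D` are hypothetical, and the example data only mirrors the shape of the incident (four requested items, one processed twice, one skipped).

```python
from collections import Counter

def reconcile(requested, processed):
    """Compare the items an operator processed against the source list.

    Returns items that were never processed, items processed more than
    once, and items processed that were not in the request.
    """
    processed_counts = Counter(processed)
    return {
        "missing": sorted(set(requested) - set(processed)),
        "duplicated": sorted(i for i, n in processed_counts.items() if n > 1),
        "unexpected": sorted(set(processed) - set(requested)),
    }

# Hypothetical identifiers standing in for the 4 requested certificates.
requested = ["cert-A", "cert-B", "cert-C", "cert-D"]
# One certificate processed twice, one never processed.
processed = ["cert-A", "cert-B", "cert-B", "cert-D"]

result = reconcile(requested, processed)
print(result)
```

Note that counting alone would not catch this kind of error: four operations were performed for four requested items, so the check has to compare identities, not totals.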
Action Items
None (no outstanding action items). All planned remediation and improvement actions have already been completed at the time of report submission.
Action Item | Kind | Due Date |
---|---|---|
Revise the operating procedure involved to add a step to check the status of all requested revocations to ensure that all of them have been processed | Prevent | 2023-10-17 |
Appendix
Details of affected certificates
https://crt.sh/?sha256=E3374A9E42C766BE23FA796B2DBEDE7D58CD00285EDF74D29B3A744EC5E2366A
Comment 1•2 years ago
Can other remediation/action items be developed from this to more fully address the root cause analysis and the lessons learned? It seems that revising the operating procedure as outlined under your Action Items is inadequate when additional technical measures might be available to implement. In other words, you should take additional actions to minimize the chance that procedural/human errors will again be the reason for future incidents.
In the near term, we’re exploring other measures to reduce the likelihood of human error in the workflow involved in the incident. We plan to implement a certificate status checking tool for use by our operators. The tool takes a list of certificates, checks the status of the certificates, and produces a report. The input to this tool is copied from the original revocation request. We plan to have one operator do the revocation and a different one to run the tool to check the results. We expect the use of this tool to reliably detect the kind of human error in the incident.
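A tool of the kind described above might look roughly like the following sketch. Everything here is an assumption made for illustration: the input format (one identifier per line), the function names, the status strings, and the `lookup_status` callback, which stands in for whatever CRL or OCSP query the real tool performs.

```python
from typing import Callable, Iterable

def parse_request(text: str) -> list[str]:
    """Extract identifiers from text copied from the revocation request.

    Assumed format: one identifier per line; blank lines are ignored.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]

def check_statuses(
    identifiers: Iterable[str],
    lookup_status: Callable[[str], str],
) -> list[tuple[str, str]]:
    """Look up the current status of every requested certificate."""
    return [(ident, lookup_status(ident)) for ident in identifiers]

def format_report(results: list[tuple[str, str]]) -> str:
    """One line per certificate, followed by an overall verdict."""
    lines = [f"{ident}\t{status}" for ident, status in results]
    not_revoked = [ident for ident, status in results if status != "revoked"]
    if not_revoked:
        lines.append(f"FAIL: {len(not_revoked)} not revoked")
    else:
        lines.append("PASS")
    return "\n".join(lines)

# Stand-in for a real CRL/OCSP query (hypothetical data).
observed = {"cert-A": "revoked", "cert-B": "revoked",
            "cert-C": "good", "cert-D": "revoked"}
request_text = "cert-A\ncert-B\ncert-C\ncert-D\n"

results = check_statuses(parse_request(request_text),
                         lambda ident: observed.get(ident, "unknown"))
report = format_report(results)
print(report)
```

Because the input is taken from the original request and not from the operator's own record of what was processed, a skipped certificate shows up as a non-revoked entry in the report.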
Comment 3•2 years ago
When you get the chance and have more information, can you update your action items?
Thanks,
Ben
> When you get the chance and have more information, can you update your action items?
CommScope will provide updates on the implementation of our new tool and procedure as we progress.
Comment 5•1 year ago
Please provide an update on the status of this remediation ASAP. In accordance with https://www.ccadb.org/cas/incident-report, "Once the report is posted, CA Owners should respond promptly to questions that are asked, and in no circumstances should a question linger without a response for more than one week, even if the response is only to acknowledge the question and provide a later date when an answer will be delivered."
Thanks,
Ben
We estimate that the development of the new tool to support our improved certificate revocation procedure (mentioned in our 2023-11-02 comment) is about 80% complete. We are targeting start of testing by the end of this week.
For the future please note:
> 2023-10-12:
> 13:00 Meeting held to discuss factors contributing to the incident and what improvements, if any, should be made to help prevent repeats of the incident. A decision was made to modify the operating procedure involved in the incident
This is the day you should've filed the incident report. Not 6 days after.
(In reply to amir from comment #7)
> For the future please note:
> 2023-10-12:
> 13:00 Meeting held to discuss factors contributing to the incident and what improvements, if any, should be made to help prevent repeats of the incident. A decision was made to modify the operating procedure involved in the incident
> This is the day you should've filed the incident report. Not 6 days after.
Thanks for the comment. We will make sure that, when there is a reportable incident in the future, we provide a timely preliminary report in accordance with the CCADB guidelines.
Update: We have started testing the new tool mentioned in our 2023-11-02 comment. Some needed changes have been identified. In the next several days, we plan to continue testing and then modify the implementation to incorporate the needed changes identified.
Comment 10•1 year ago
Update: We have finished development testing of the new tool. We will begin the process of getting the tool approved and incorporating it into a revised operating procedure.
Comment 11•1 year ago
Update: The process of getting the new tool approved and incorporated into a revised operating procedure is in progress.
Comment 12•1 year ago
Hello Nicol,
do you think you could create an alert using this new tool? For example, have it run once a day and alert if a test certificate does not match its intended state?
Thanks
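The periodic check suggested in this comment could be sketched as follows. This is only an illustration of the suggestion, not something CommScope has said it runs; the site names, status strings, and the function `find_mismatches` are hypothetical, and the `observed` mapping stands in for the result of a daily CRL or OCSP query.

```python
def find_mismatches(intended: dict[str, str], observed: dict[str, str]) -> list[str]:
    """Return one alert message per test site whose observed certificate
    status differs from its intended status."""
    alerts = []
    for site, want in sorted(intended.items()):
        got = observed.get(site, "unknown")
        if got != want:
            alerts.append(f"{site}: expected {want}, observed {got}")
    return alerts

# Hypothetical test sites and their intended certificate states.
intended = {"valid.example.test": "good", "revoked.example.test": "revoked"}
# Hypothetical result of a daily status query.
observed = {"valid.example.test": "good", "revoked.example.test": "good"}

alerts = find_mismatches(intended, observed)
for message in alerts:
    print(message)
```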
Comment 13•1 year ago
Update: We are in the final stage of installing a revised operating procedure that incorporates the improvements we outlined in our 2023-11-02 comment. Due to some late edits in the procedure document, the review process was extended. We expect the improved procedure to be put into use very soon.
Comment 14•1 year ago
(In reply to Antonis from comment #12)
> do you think you could create an alert using this new tool? For example, have it run once a day and alert if a test certificate does not match its intended state?
Thanks for the comment. I believe your suggestion is directed towards a different issue than the one at hand. Our new tool and procedure are designed to catch operator errors. Once it is verified that all intended-to-be-revoked certificates have been revoked (none of the certificates was inadvertently omitted), periodically checking the status of a test certificate will not yield additional benefits.
Comment 15•1 year ago
We have implemented the improvements mentioned in our 2023-11-02 comment. The new tool and operating procedure have been put into use.
Comment 16•1 year ago
It has been 9 days since we reported the completion of the improvements that we said, in comment 2, that we would make. There have been no follow-up questions. There are no outstanding remediation action items. CommScope would like to request that this issue be treated as resolved.
Comment 17•1 year ago
I'll close this on Wed. 6-March-2024.