Closed Bug 1967929 Opened 3 months ago Closed 27 days ago

KIR: Failed to respond to a Certificate Problem Report within 24 hours

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: piotr.grabowski, Assigned: piotr.grabowski)

Details

(Whiteboard: [ca-compliance] [policy-failure])

Preliminary Incident Report

Summary

  • Incident description:
    Before Bug https://bugzilla.mozilla.org/show_bug.cgi?id=1966006 was filed, the Chrome Root Program Team sent two CPR messages: the first on May 2, 2025 (~14:09 UTC) to the KIR problem reporting address disclosed to the CCADB, and the second on May 5, 2025 (~13:07 UTC) to the CA Email Alias address disclosed to the CCADB. KIR failed to respond within the time requirement of Section 4.9.5 of the TLS BRs.

A full incident report will be provided no later than Monday May 29th 2025.

  • Relevant policies:
    Section 4.9.5 of the TLS BRs: Within 24 hours after receiving a Certificate Problem Report, the CA SHALL investigate the facts and circumstances related to a Certificate Problem Report and provide a preliminary report on its findings to both the Subscriber and the entity who filed the Certificate Problem Report.

  • Source of incident disclosure:
    Third Party Reported

Full Incident Report

Summary

  • CA Owner CCADB unique ID: A000251

  • Incident description:
    Before Bug https://bugzilla.mozilla.org/show_bug.cgi?id=1966006 was filed, the Chrome Root Program Team sent two CPR messages: the first on May 2, 2025 (~14:09 UTC) to the KIR problem reporting address disclosed to the CCADB, and the second on May 5, 2025 (~13:07 UTC) to the CA Email Alias address disclosed to the CCADB. KIR failed to respond within the time requirement of Section 4.9.5 of the TLS BRs.

  • Timeline summary:

    • Non-compliance start date: May 3, 2025
    • Non-compliance identified date: May 12, 2025
    • Non-compliance end date: May 13, 2025
  • Relevant policies:
    Section 4.9.5 of the TLS BRs: Within 24 hours after receiving a Certificate Problem Report, the CA SHALL investigate the facts and circumstances related to a Certificate Problem Report and provide a preliminary report on its findings to both the Subscriber and the entity who filed the Certificate Problem Report.

  • Source of incident disclosure:
    Third Party Reported - The Chrome Root Program (CRP) Team

Impact

  • Total number of certificates: 1

  • Total number of "remaining valid" certificates: 0

  • Affected certificate types: SubCA

  • Incident heuristic:
    Subordinate CA certificate SZAFIR TRUSTED CA3 - https://crt.sh/?sha256=ec036c294f512dd28c5666c2d53ec0dcf6f397fed6f8703a7c7532da3e02de8c
    The CA SHALL initiate an investigation into the relevant facts and circumstances within 24 hours of receiving a Certificate Problem Report and must deliver a preliminary report of its findings to both the Subscriber and the party that submitted the report.

  • Was issuance stopped in response to this incident, and why or why not?:
    The incident did not involve certificate issuance.

  • Analysis:
    This incident report is about a delayed CPR response for one certificate.

  • Additional considerations:

Timeline

March 05, 2025 – 12:55 UTC – Revocation of Szafir Trusted CA3 certificate according to the closure plan of Szafir Trusted CA3
March 13, 2025 – 12:55 UTC – Expected CCADB disclosure deadline (7 days)
May 02, 2025 – 14:09 UTC – The Chrome Root Program (CRP) Team posts an email message to KIR problem reporting address disclosed to the CCADB.
May 05, 2025 – 13:07 UTC – The Chrome Root Program (CRP) Team posts an email message to the CA Email Alias address disclosed to the CCADB (certificates@kir.com.pl).
May 09, 2025 – 18:27 UTC – The Chrome Root Program (CRP) Team posts an email message to the individual contact addresses disclosed to the CCADB.
May 12, 2025 – 11:10 UTC – An email message to the individual contact addresses disclosed to the CCADB was read
May 13, 2025 – 07:03 UTC – Preliminary incident report regarding the revocation of subordinate CA certificate Szafir Trusted CA3 was submitted
May 13, 2025 – 20:21 UTC – Full incident report https://bugzilla.mozilla.org/show_bug.cgi?id=1966006 was submitted
May 21, 2025 – 03:17 UTC – The Chrome Root Program (CRP) Team posts a comment (https://bugzilla.mozilla.org/show_bug.cgi?id=1966006#c2) noting that the observed CPR response timeline also violates the expectations of the TLS BRs.
May 21, 2025 – 05:00 UTC – The WebPKI team begins a preliminary investigation.
May 22, 2025 – 09:33 UTC – Preliminary Incident Report submitted.
May 26, 2025 – 10:10 UTC – Piotr Grabowski of the KIR WebPKI team sends an email message to the Chrome Root Program (CRP) Team asking for clarification of the comment https://bugzilla.mozilla.org/show_bug.cgi?id=1966006#c2
May 27, 2025 – 12:15 UTC – The Chrome Root Program (CRP) Team responds to Piotr Grabowski with clarifications of the comment https://bugzilla.mozilla.org/show_bug.cgi?id=1966006#c2
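
The 24-hour window from Section 4.9.5 of the TLS BRs can be checked against the timestamps in this timeline. The following is a minimal sketch, assuming the clock starts when the first message reached the disclosed problem reporting address and stops when the preliminary incident report was submitted on May 13; the function names are illustrative and not part of any KIR tooling.

```python
from datetime import datetime, timedelta, timezone

# Response window required by Section 4.9.5 of the TLS BRs.
CPR_RESPONSE_WINDOW = timedelta(hours=24)


def cpr_deadline(received_at):
    """Latest time a preliminary report may be sent for a CPR."""
    return received_at + CPR_RESPONSE_WINDOW


def overdue_by(received_at, responded_at):
    """How far past the deadline the response was (zero if on time)."""
    late = responded_at - cpr_deadline(received_at)
    return max(late, timedelta(0))


# Timestamps taken from the timeline above (UTC).
first_cpr = datetime(2025, 5, 2, 14, 9, tzinfo=timezone.utc)
third_email_read = datetime(2025, 5, 12, 11, 10, tzinfo=timezone.utc)
preliminary_report = datetime(2025, 5, 13, 7, 3, tzinfo=timezone.utc)

print(cpr_deadline(first_cpr).isoformat())            # 2025-05-03T14:09:00+00:00
print(overdue_by(first_cpr, preliminary_report))      # 9 days, 16:54:00
print(overdue_by(third_email_read, preliminary_report))  # 0:00:00
```

Under these assumptions the deadline falls on May 3, which matches the non-compliance start date given in the summary, while the response measured from the moment the third email was read stays inside the window.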

Related Incidents

Bug | Date | Description
1886626 | 2024-03-20 | Delayed response to CPR.
1886998 | 2024-03-22 | Late response to a CPR - email problems.

Root Cause Analysis

Contributing Factor #1: The late response was caused by several factors occurring simultaneously

  • Description:
    • The first email was sent on May 02, 2025 – 14:09 UTC to the Problem Reporting Mechanism disclosed in the CCADB, which is a webform (http://www.elektronicznypodpis.pl/en/contact-us/contact-form/). The email address behind the webform is a general contact address. The email did not match any existing classification, so it was forwarded to the standard queue. In addition, the standard queue was under peak load due to high interest in KIR services, which increased the standard processing time from 3 to 7 days. KIR offers a special form for revocation requests and High Priority Problem Reports; emails received at the special address behind that form are processed within 24 hours.
    • The second email was sent on May 05, 2025 – 13:07 UTC to the CA Email Alias address disclosed to the CCADB (certificates@kir.com.pl). This is a group address that included members of the WebPKI team who were on holiday at the time and one person who no longer works with certificates (department changed).
    • The third email message was sent to the individual contact addresses disclosed to the CCADB on May 09, 2025 – 18:27 UTC. It was read by the recipients on May 12, 2025 – 11:10 UTC and processed by the WebPKI team within 24 hours, with the Preliminary Incident Report submitted on May 13, 2025 – 07:03 UTC.

  • Timeline: We have already updated KIR CA Email Alias address disclosed to the CCADB to contain all and only persons from WebPKI team.

  • Detection: Third Party Reported - The Chrome Root Program (CRP) Team

  • Interaction with other factors:

  • Root Cause Analysis methodology used: 5-Whys

Lessons Learned

  • What went well: After the third email message was read by the recipients on May 12, 2025 – 11:10 UTC, the WebPKI team acted promptly and submitted the Preliminary Incident Report within 24 hours, on May 13, 2025 – 07:03 UTC.

  • What didn’t go well:
    The KIR problem reporting address disclosed to the CCADB, behind a webform (http://www.elektronicznypodpis.pl/en/contact-us/contact-form/), is a general contact email address.
    The CA Email Alias address disclosed to the CCADB did not include the complete WebPKI team and also included one person who no longer works with certificates (department changed).

  • Where we got lucky:

  • Additional:

Action Items

Action Item | Kind | Corresponding Root Cause(s) | Evaluation Criteria | Due Date | Status
Update KIR CA Email Alias address disclosed to the CCADB to contain all and only persons from WebPKI team. | Prevent | Root Cause # 1 | | 2025-05-28 | Completed
Update KIR problem reporting address disclosed to the CCADB to be a dedicated email address for certificate problem reporting for fast processing within 24 hours | Prevent | Root Cause # 1 | | 2025-06-01 | Ongoing
Provide internal procedure for quarterly checking email group addresses for correctness and completeness | Prevent | Root Cause # 1 | | 2025-06-05 | Ongoing
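
The periodic check of group addresses for correctness and completeness amounts to comparing the alias membership with the current team roster. The following is a minimal sketch, assuming both lists can be exported as plain email addresses; the function name and the example.com addresses are hypothetical and do not describe KIR's actual procedure.

```python
def audit_alias(alias_members, team_roster):
    """Compare a group alias with the team roster (case-insensitive)."""
    alias = {addr.strip().lower() for addr in alias_members}
    roster = {addr.strip().lower() for addr in team_roster}
    return {
        "missing": sorted(roster - alias),  # team members not on the alias
        "stale": sorted(alias - roster),    # alias members no longer on the team
    }


# Hypothetical addresses for illustration only.
alias = ["a@example.com", "b@example.com", "former@example.com"]
roster = ["a@example.com", "b@example.com", "c@example.com"]

report = audit_alias(alias, roster)
print(report)  # {'missing': ['c@example.com'], 'stale': ['former@example.com']}
```

An alias that contains "all and only" the team members yields two empty lists; anything else is a finding for the quarterly review.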

Appendix

Based on Incident Reporting Template v. 3.0

Actions 1 and 3 completed.
Action 2 - submitted to Root Store (case 00002456) - waiting for the case to be closed.

The Root Cause Analysis as submitted above appears rather shallow. I believe responding to the following questions, at least to start with, would be beneficial for both KIR and the community at large to tease out deeper underlying causes.

Question 1: Given that "KIR offers special form for revocation requests and for High Priority Problem Reports.", why was the "Problem Reporting Mechanism disclosed in CCADB" a webform that leads to a "general contact email"?

Question 2: Why did the original CPR "not match any existing classification", resulting in it being "forwarded to standard queue"?

Question 3: Does "not match any existing classification" refer to some automated process, a classification process conducted by humans, or a combination thereof? Can you describe that process?

Question 4: You state the "CA Email Alias address disclosed to the CCADB is a group address which included persons from WebPKI team who at the time were on their holidays". What procedures does KIR have in place to ensure the constant monitoring of key email inboxes, such as that disclosed to CCADB, while key personnel are on holidays?

Question 5: You list two incidents as related. Why do you not consider the following incidents related? Bug 1905509, Bug 1888881, Bug 1885754, Bug 1886722, Bug 1902868?

Question 6: The list in Question 5 is not necessarily comprehensive; are there any additional incidents that KIR considers related per the definition of "related incident" in the Incident Reporting Guidelines?

Question 7: What procedures does KIR have in place to monitor incident reports posted by other CAs in this forum?

Question 8: What steps or reviews of its systems and/or procedures has KIR taken or conducted over the preceding two years (i.e. within the scope of "related incidents" for this incident per the Incident Reporting Guidelines) based on incidents it considers related to this incident?

We appreciate the detailed questions and the opportunity to provide additional context and clarity. Below are our responses to each question raised, aligned with our responsibilities in accordance with the WebTrust for CAs audit framework and Mozilla’s Root Store Incident Reporting expectations.

Response to Question 1)

While KIR has maintained a dedicated internal mechanism for handling certificate revocation requests and high-priority problem reports, the Problem Reporting Mechanism (PRM) disclosed in the Common CA Database (CCADB) had not been updated to reflect this separation. As a result, the URL listed in CCADB directed users to a general-purpose webform, which routed submissions to a general contact email alias, not specifically designated for high-severity security or compliance issues. This misalignment occurred due to an oversight in our CCADB configuration management. While internal processes had evolved to support high-priority CPR (Certificate Problem Report) triage, our externally published PRM details were not promptly updated to align with these internal improvements.

Responses to Questions 2-4)

At the time of the incident, KIR’s CPR intake process relied on a general-purpose webform configured to categorize incoming messages using rule-based classification logic. This logic attempted to match report content (based on form fields or keywords) against a set of predefined categories such as revocation requests, technical support, or policy questions.

To address this issue comprehensively and prevent recurrence, KIR has established a dedicated email address specifically for receiving Certificate Problem Reports. This remediation provides the following improvements:

  • Direct, Unambiguous Routing: The new email address bypasses automated classification and routes all messages directly to the WebPKI response team, ensuring that CPRs are not filtered or delayed through general support triage.
  • Continuous Monitoring: This mailbox is monitored by on-call personnel, with clearly defined escalation procedures and holiday coverage to ensure responsiveness at all times. Team calendars are centrally monitored to ensure no gaps in coverage during periods of planned absence, with automatic reassignment of monitoring duties.
  • Updated CCADB Entry: The CCADB "Problem Reporting Mechanism" entry has been updated to list this new address, clearly labeled as the appropriate contact point for high-priority WebPKI issues, including misissuance, revocation delays, or other non-compliance concerns.
  • SLA-Based Response Assurance: Incoming messages to the dedicated CPR address are now handled in accordance with documented service level targets, including acknowledgement within 24 hours and triage within one day.
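
The classification behaviour described in this response can be sketched roughly as follows. This is only an illustration under stated assumptions: the category names follow the examples given above, while the keywords, queue names, the `route` function, and the sample message are hypothetical and do not describe KIR's actual system.

```python
# Hypothetical keyword rules; the real categories and keywords are not disclosed.
RULES = {
    "revocation_request": ("revoke", "revocation"),
    "technical_support": ("install", "error", "login"),
    "policy_question": ("cps", "policy"),
}


def route(subject, body, fallback="standard_queue"):
    """Return the first category whose keyword appears in the message.

    Messages that match no rule go to `fallback`. A slow fallback queue
    is the failure mode described above; a dedicated CPR address avoids
    the classification step entirely.
    """
    text = f"{subject}\n{body}".lower()
    for category, keywords in RULES.items():
        if any(keyword in text for keyword in keywords):
            return category
    return fallback


# Hypothetical message that matches none of the rules.
subject = "Question regarding SZAFIR TRUSTED CA3"
body = "Please confirm the status of this subordinate CA."

print(route(subject, body))                              # standard_queue
print(route(subject, body, fallback="webpki_priority"))  # webpki_priority
print(route("Please revoke my certificate", ""))         # revocation_request
```

The sketch shows why the choice of fallback matters: any report phrased in a way the rules do not anticipate inherits the processing time of the default queue.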

Responses to Questions 5–8)

  • Bug 1905509 (NETLOCK)
  • Bug 1888881 (CFCA)
  • Bug 1885754 (Entrust)
  • Bug 1886722 (Hongkong Post)
  • Bug 1902868 (GoDaddy)

At the time of submission, these incidents were not included as “Related Incidents” because of our initial focus on CA-internal causal factors. KIR initially defined “related” narrowly, interpreting the Mozilla Incident Reporting Guidelines as primarily focusing on internal control or process similarities. Our incident was caused by specific issues such as:

  • Misclassification of a CPR due to non-standard routing logic in a webform,
  • A temporary gap in monitoring of a group email alias due to coinciding staff leave.

The referenced incidents involved external CAs and appeared to involve different root causes, including third-party communication breakdowns, alert fatigue, or procedural backlogs unrelated to KIR’s system configuration at the time.

However, upon further review, we now acknowledge that these incidents share a failure domain with KIR’s event, namely deficiencies in timely recognition, classification, or escalation of Certificate Problem Reports (CPRs). We will update our report to include these incidents as related.

Upon review, two additional CA compliance incidents have been identified and included:
https://bugzilla.mozilla.org/show_bug.cgi?id=1959733 (CFCA: Failed to respond a Certificate Problem Report within 24 hours which violates Section 4.9.5 of the TLS BRs)
https://bugzilla.mozilla.org/show_bug.cgi?id=1963629 (HARICA: One of the two Certificate Problem Report email aliases not working)
We will update our report to include these incidents as related as well.

KIR recognizes its responsibility not only to manage and respond to its own incidents, but also to remain informed about compliance issues affecting the broader CA ecosystem. In line with this responsibility, and consistent with the expectations of the Mozilla Root Store Policy and the WebTrust for CAs criteria, KIR has implemented the following procedures: reviews are conducted at least once per week to identify newly reported incidents and updates to ongoing cases. Each incident is analyzed for relevance to KIR systems or procedures. If an incident from another CA demonstrates a risk similar to a vulnerability within KIR’s environment, an internal process review or analysis is triggered immediately.

Over the past two years, KIR has taken a number of proactive steps and conducted process evaluations in response to incidents it deems related. These included the internal process reviews and analyses mentioned above, as well as annual self-assessments and internal audits against the WebTrust for CAs criteria.

Updated section of Related Incidents

Related Incidents

Bug | Date | Description
1886626 | 2024-03-20 | certSIGN: Delayed response to CPR.
1886998 | 2024-03-22 | Microsec: Late response to a CPR - email problems.
1905509 | 2024-06-29 | NETLOCK: CPR was not responded to in 24 hours
1888881 | 2024-04-01 | CFCA: Failure to respond to a CPR in a complete and/or timely manner
1885754 | 2024-03-16 | Entrust: CPR was not responded to in 24 hours
1886722 | 2024-03-21 | Hongkong Post: Delayed response to CPR
1902868 | 2024-06-15 | GoDaddy: CPR was not responded to in 24 hours
1959733 | 2025-04-10 | CFCA: Failed to respond a Certificate Problem Report within 24 hours which violates Section 4.9.5 of the TLS BRs
1963629 | 2025-04-30 | HARICA: One of the two Certificate Problem Report email aliases not working

All action items have been completed

Action Items

Action Item | Kind | Corresponding Root Cause(s) | Evaluation Criteria | Due Date | Status
Update KIR CA Email Alias address disclosed to the CCADB to contain all and only persons from WebPKI team. | Prevent | Root Cause # 1 | | 2025-05-28 | Completed
Update KIR problem reporting address disclosed to the CCADB to be a dedicated email address for certificate problem reporting for fast processing within 24 hours | Prevent | Root Cause # 1 | | 2025-06-01 | Completed
Provide internal procedure for quarterly checking email group addresses for correctness and completeness | Prevent | Root Cause # 1 | | 2025-06-05 | Completed

We have no further updates.

That is a very good update and explanation. Thank you for the thoughtful response.

That was a great update and explanation, but on checking into your certificate practice statement (v1.23) I noted a few things that need updated in 1.5.2:

  1. The general contact email address seems to be the same as discussed previously
  2. There is an outdated link to a CPR form that does not seem to have been used for some time

Otherwise this has been a very impressive incident report.

I'll mirror Mike and Wayne in saying that the response above is a very good update, and answers my questions fully. Thank you for taking the time to engage with the questions in detail.

(In reply to Wayne from comment #8)

Thank you all for your kind words and for your careful review of our Certificate Practice Statement (CPS). We appreciate the opportunity to clarify the points raised:

General Contact Email Address:
As noted, the general contact email address remains the same as discussed previously. We confirm that this email address is valid, active, and monitored, and continues to serve as the official first point of contact.

Link to CPR Form:
We would like to clarify that the link currently provided in section 1.5.2 of CPS v1.23 is valid and leads to the page with the new CPR mechanism, webtrustcerts@kir.pl.

Please feel free to reach out if any additional clarification is needed or if further review identifies other areas for improvement.

Please provide a Closure Report if you feel this is ready for closure.

Report Closure Summary

  • Incident description:
    Before Bug https://bugzilla.mozilla.org/show_bug.cgi?id=1966006 was filed, the Chrome Root Program Team sent two CPR messages: the first on May 2, 2025 (~14:09 UTC) to the KIR problem reporting address disclosed to the CCADB, and the second on May 5, 2025 (~13:07 UTC) to the CA Email Alias address disclosed to the CCADB.
    KIR failed to respond within the time requirement of Section 4.9.5 of the TLS BRs.

  • Incident Root Cause(s):
    The delayed response resulted from a combination of factors occurring at the same time:
    May 2, 2025 (14:09 UTC): The first report was submitted via the webform listed in the CCADB. This form routes to a general contact email and did not match any predefined category, leading to it being forwarded to a standard queue. At the time, the standard queue was experiencing peak load due to increased demand for KIR services, extending processing time from the usual 3 days to up to 7 days. Notably, a special form exists for revocation or high-priority issues, which ensures a 24-hour response, but it was not used.
    May 5, 2025 (13:07 UTC): The second message was sent to the CA email alias provided in CCADB. Unfortunately, the alias included staff members from the WebPKI team who were on vacation, and one member no longer working with certificates, leading to further delay.
    May 9, 2025 (18:27 UTC): A third email was sent directly to individual contact addresses listed in CCADB. This message was read on May 12, 2025 (11:10 UTC), and the WebPKI team submitted a preliminary incident report within 24 hours, on May 13, 2025 (07:03 UTC).
    The combination of incorrect channel usage, staffing issues, and peak workload led to a compounded delay in incident response.

  • Remediation description:

    KIR updated:
    • the CA Email Alias address disclosed to the CCADB to contain all and only persons from the WebPKI team;
    • the problem reporting address disclosed to the CCADB to be a dedicated email address for certificate problem reporting, for fast processing within 24 hours.

    KIR also provided an internal procedure for quarterly checking of email group addresses for correctness and completeness.

  • Commitment summary:
    KIR confirms that all action items have been implemented.
    All Action Items disclosed in this report have been completed as described, and we request its closure.

This is a final call for comments or questions on this Incident Report.

Otherwise, it will be closed on approximately 2025-07-17.
