Closed Bug 2004698 Opened 6 months ago Closed 5 months ago

NAVER Cloud Trust Services: Failure to respond to CPR within 24 hours

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dexter.c.dopping, Assigned: hogeun.yoo)

Details

(Whiteboard: [ca-compliance] [policy-failure] [external])

Steps to reproduce:

Around 2025-12-06 00:59 UTC I sent the following message to dl_rootca[at]navercorp[dot]com:

Dear NAVER,

After seeing this BugZilla report: https://bugzilla.mozilla.org/show_bug.cgi?id=2004492

I looked through other intermediate certificates to see if they have the same issue.

It appears that the following certificate is affected:

C=KR,O=NAVER Cloud Trust Services Corp.,CN=NAVER Secure Certification Authority 2
https://crt.sh/?q=571e521e5e22810d33bb1a39991143e9e64cd8dae97d65931b194e19aee81e86

It includes an issuer ca link to http://rca.navercorp.com/cert/naverrca1.der, which at the moment returns a PEM encoded certificate.

According to RFC 5280, Section 4.2.2.1, the certificate must not be PEM encoded:

> Where the information is available via HTTP or FTP, accessLocation
> MUST be a uniformResourceIdentifier and the URI MUST point to either
> a single DER encoded certificate as specified in [RFC2585] or a
> collection of certificates in a BER or DER encoded "certs-only" CMS
> message as specified in [RFC2797].

Could you please look into this?

Thank you, Kind regards,
Dexter
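
The encoding check described in the message above can be sketched in Python. This is a minimal heuristic, not a full X.509 parser: it only tests whether a response body starts with the PEM certificate marker or is exactly one DER-encoded ASN.1 SEQUENCE. The function names are hypothetical, and `fetch_and_classify` performs a network request, so its result depends on what the server returns at the time.

```python
import urllib.request

PEM_MARKER = b"-----BEGIN CERTIFICATE-----"


def is_single_der_sequence(body: bytes) -> bool:
    """True if body is exactly one DER SEQUENCE (the outer shape of a certificate)."""
    if len(body) < 2 or body[0] != 0x30:  # 0x30 is the ASN.1 SEQUENCE tag
        return False
    first = body[1]
    if first < 0x80:  # short-form length
        return len(body) == 2 + first
    n = first & 0x7F  # long form: the next n bytes hold the length
    if n == 0 or len(body) < 2 + n:  # n == 0 is indefinite length, not valid DER
        return False
    length = int.from_bytes(body[2:2 + n], "big")
    return len(body) == 2 + n + length


def classify_encoding(body: bytes) -> str:
    """Return 'pem', 'der', or 'unknown' for a caIssuers response body."""
    if body.lstrip().startswith(PEM_MARKER):
        return "pem"
    if is_single_der_sequence(body):
        return "der"
    return "unknown"


def fetch_and_classify(url: str) -> str:
    """Download the URL and classify the response body (network access required)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return classify_encoding(response.read())
```

Calling `fetch_and_classify` with a caIssuers URL would return `"pem"` for a response like the one described in the message and `"der"` for a response that meets the RFC 5280 requirement quoted above.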

Actual results:

Nothing. There was neither a delivery failure message nor a response from NAVER.

Expected results:

There should've been a response within 24 hours in accordance with section 4.9.5 of the BRs.

Because there was no response, I don't know if my message was received.

We confirm receipt of this report and are investigating.
A Preliminary Incident Report will follow shortly.

Assignee: nobody → hogeun.yoo
Status: UNCONFIRMED → ASSIGNED
Type: defect → task
Ever confirmed: true
Whiteboard: [ca-compliance] [policy-failure] [external]

Preliminary Incident Report

Summary

Full Incident Report

Summary

  • CA Owner CCADB unique ID: A005672

  • Incident description:
    A Certificate Problem Report (CPR) was submitted to dl_rootca@navercorp.com on 2025-12-06 00:59 UTC regarding an issue related to the caIssuers URI. However, no acknowledgment or preliminary response was provided within the 24-hour timeframe required by Baseline Requirements (BR) Section 4.9.5. The CA’s first acknowledgment of the report was posted publicly in the Mozilla Bugzilla CA Certificate Compliance component after the 24-hour window had elapsed, at which point investigation was initiated.

  • Timeline summary:

    • Non-compliance start date: December 7, 2025 00:59:00 UTC (24 hours elapsed after the Certificate Problem Report was submitted without review)
    • Non-compliance identified date: December 8, 2025 14:58:00 UTC (06:58 PST)
      (The CA acknowledged receipt of the report in Bugzilla and initiated investigation)
    • Non-compliance end date: December 8, 2025 15:37:00 UTC (07:37 PST)
      (Preliminary Incident Report submitted)
  • Relevant policies:
    Baseline Requirements Section 4.9.5 – Time within which CA must process the revocation request
    https://github.com/cabforum/servercert/blob/main/docs/BR.md#495-time-within-which-ca-must-process-the-revocation-request

  • Source of incident disclosure:
    Third-party report via Bugzilla
    https://bugzilla.mozilla.org/show_bug.cgi?id=2004698

Impact

  • Total number of certificates:
    N/A – This incident related to delayed handling of a Certificate Problem Report and did not involve certificate issuance or revocation.
  • Total number of "remaining valid" certificates:
    N/A
  • Affected certificate types:
    None
  • Incident heuristic:
    Process and monitoring gap in Certificate Problem Report intake and acknowledgement.
  • Was issuance stopped in response to this incident, and why or why not?:
    No. This incident did not involve incorrect certificate issuance, certificate compromise, or any condition requiring revocation. The non-compliance was limited to delayed acknowledgement of a Certificate Problem Report.
  • Analysis:
    N/A
  • Additional considerations:
    N/A

Timeline

  • December 6, 2025 00:59 UTC
    A Certificate Problem Report (CPR) was received at the CA’s designated reporting mailbox.

  • December 7, 2025 00:59 UTC
    The 24-hour response window required under BR 4.9.5 elapsed, marking the start of the non-compliance condition.

  • December 8, 2025 13:45 UTC
    The Reporter submitted an incident report to the Mozilla Bugzilla CA Certificate Compliance component
    (Bug ID: 2004698).

  • December 8, 2025 14:58 UTC
    The CA acknowledged receipt of the report in Bugzilla and formally initiated investigation.

  • December 8, 2025 15:37 UTC
    The CA submitted the Preliminary Incident Report in Bugzilla.
    This point marks the end of the non-compliance condition.

  • December 8, 2025 16:07 UTC
    The CA convened an internal incident response meeting and began analysis of the root cause and potential corrective measures.

  • December 9, 2025 00:00 UTC
    Additional internal discussions were conducted, including a review of similar incidents disclosed in the Bugzilla CA Certificate Compliance component and analysis of alternative root causes for comparable failures.

  • December 9, 2025 08:32 UTC
    The CA formalized the following corrective measures as organizational policy and completed executive (C-level) review and approval:

    • Strengthening alerting mechanisms for the reporting mailbox, including enhanced notifications and web-based popup alerts.
    • Establishment of a mandatory twice-daily review process for the reporting mailbox, including weekends and public holidays.
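
The deadline arithmetic behind this timeline can be reproduced with a short Python sketch. The timestamps are taken from the timeline above; the variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Response window from BR Section 4.9.5, as cited in this report.
CPR_RESPONSE_WINDOW = timedelta(hours=24)

# Timestamps from the incident timeline (all UTC).
received = datetime(2025, 12, 6, 0, 59, tzinfo=timezone.utc)
preliminary_report = datetime(2025, 12, 8, 15, 37, tzinfo=timezone.utc)

# Start of non-compliance: 24 hours after the CPR was received.
deadline = received + CPR_RESPONSE_WINDOW
# Length of the non-compliance condition, from deadline to preliminary report.
overdue_by = preliminary_report - deadline

print(deadline.isoformat())  # 2025-12-07T00:59:00+00:00
print(overdue_by)            # 1 day, 14:38:00
print(received.weekday())    # 5, i.e. the CPR arrived on a Saturday
```

The non-compliance condition therefore lasted 38 hours and 38 minutes, and the CPR arrived on a weekend, which is consistent with the non-business-hours gap discussed in the root cause analysis.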

Related Incidents

| Bug | Date | Description |
| --- | --- | --- |
| 1994454 | 2025-10-15 | Failed to respond to a Certificate Problem Report within 24 hours, in violation of Section 4.9.5 of the TLS Baseline Requirements, which is comparable to the delayed CPR acknowledgement in the subject incident. |
| 1959733 | 2025-04-10 | Involved delayed handling of a Certificate Problem Report beyond the 24-hour requirement defined in Section 4.9.5 of the TLS BRs, similar in nature to the subject incident. |
| 1970727 | 2025-06-05 | Failure to respond to a Certificate Problem Report within the required 24-hour timeframe under Section 4.9.5 of the TLS BRs, reflecting a comparable CPR monitoring gap. |
| 1967929 | 2025-05-22 | Delayed acknowledgement of a Certificate Problem Report exceeding the 24-hour response window mandated by Section 4.9.5 of the TLS Baseline Requirements. |
| 1985466 | 2025-08-27 | Incident related to failure to timely respond to a Certificate Problem Report within 24 hours, aligning with the same BR 4.9.5 violation pattern as the subject incident. |
| 1963629 | 2025-04-30 | Certificate Problem Report was not reviewed within the 24-hour requirement under Section 4.9.5 of the TLS BRs, demonstrating a similar process-related failure. |
| 1905509 | 2024-06-29 | Failed to respond to a Certificate Problem Report within 24 hours in violation of Section 4.9.5 of the TLS Baseline Requirements, comparable to the subject incident. |

Root Cause Analysis

Contributing Factor #1: Insufficient monitoring and alerting for the CPR reporting mailbox

  • Description:
    The CA relied primarily on manual monitoring of the designated Certificate Problem Report (CPR) reporting mailbox. Although the mailbox was established as the official intake channel for CPRs, automated alerting and escalation mechanisms were not sufficiently implemented to ensure timely awareness of time-sensitive compliance emails. As a result, the CPR was not reviewed within the 24-hour timeframe required by Section 4.9.5 of the TLS Baseline Requirements.

  • Timeline:
    The reporting mailbox had been in use prior to the incident as the primary channel for receiving CPRs; however, the level of automated alerting and escalation in place at the time was not adequate to guarantee timely detection of CPRs.

  • Detection:
    This contributing factor was identified during the incident investigation, when the CA reviewed the reporting mailbox and Bugzilla and confirmed that the CPR had not been acknowledged within the required timeframe. Prior to the incident, reliance on manual monitoring limited the ability to detect delayed acknowledgement in a timely manner.

  • Interaction with other factors:
    This factor interacted with the absence of a formalized review procedure, increasing the risk of delayed CPR acknowledgement during non-business hours, including weekends and public holidays.

  • Root Cause Analysis methodology used:
    5-Whys analysis.

Contributing Factor #2: Absence of a formalized review procedure covering non-business hours

  • Description:
    At the time of the incident, the CA did not have a documented and mandatory review procedure requiring periodic checks of the CPR reporting mailbox during non-business hours. This procedural gap contributed to the delay in acknowledging the CPR within the required response window.
  • Timeline:
    The lack of a formal review schedule existed prior to the incident and had not previously resulted in missed CPR acknowledgements, which allowed the risk to remain unaddressed.
  • Detection:
    This contributing factor was identified during internal incident response discussions following acknowledgment of the CPR, when monitoring responsibilities outside standard business hours were reviewed.
  • Interaction with other factors:
    In combination with manual monitoring and lack of alerting, this procedural gap increased reliance on individual awareness rather than systematic controls.
  • Root Cause Analysis methodology used:
    Process gap analysis.

Contributing Factor #3: Over-reliance on historical success of existing processes

  • Description:
    The CA’s existing CPR intake and handling processes had historically functioned without incident, which led to an implicit assumption that the current monitoring approach was sufficient. This reduced the perceived urgency to implement additional safeguards such as automated alerts or formalized escalation procedures.
  • Timeline:
    This condition developed over time as prior CPRs were handled without delay, reinforcing confidence in the existing process until the incident occurred.
  • Detection:
    This contributing factor was identified during post-incident root cause analysis, when reviewing why known monitoring gaps had not been proactively addressed.
  • Interaction with other factors:
    This factor reinforced both the lack of automated monitoring and the absence of formal review procedures, allowing the issue to persist until external reporting via Bugzilla occurred.
  • Root Cause Analysis methodology used:
    Retrospective operational review.

Lessons Learned

  • What went well:
    Once the incident was identified, the CA promptly acknowledged the report in Bugzilla, initiated investigation, and submitted a Preliminary Incident Report on the same day. Internal coordination was swift, and no incorrect certificate issuance or certificate compromise occurred as a result of the delayed CPR acknowledgement.

  • What didn’t go well:
    The CA relied on manual monitoring of the CPR reporting mailbox without automated alerting or a formalized review procedure covering non-business hours. This resulted in the CPR not being reviewed within the 24-hour timeframe required by Section 4.9.5 of the TLS Baseline Requirements.

  • Where we got lucky:
    The incident did not involve any misissued certificates, revocation delays, or impact to relying parties. Additionally, the scope and impact remained limited, and the issue was escalated externally via Bugzilla before any further risk materialized.

  • Additional:
    The incident highlighted the need to treat CPR handling as a time-critical compliance function requiring systematic controls rather than reliance on historical process stability or individual awareness. This led to the identification of gaps in monitoring, escalation, and policy formalization, which are being addressed through corrective and preventive actions.

Action Items

| Action Item | Kind | Corresponding Root Cause(s) | Evaluation Criteria | Due Date | Status |
| --- | --- | --- | --- | --- | --- |
| Implement enhanced alerting for the CPR reporting mailbox, including mobile notifications and web-based popup alerts, to ensure immediate awareness of time-sensitive compliance emails. | Detect / Prevent | Root Cause #1 | Confirmation that alerts are triggered immediately upon receipt of CPR emails; internal testing results and alert configuration documentation. | 2025-12-10 | Complete |
| Establish and formalize a mandatory twice-daily review process for the CPR reporting mailbox, including weekends and public holidays, as an organizational policy. | Prevent | Root Cause #2 | Approved policy documentation and evidence of scheduled reviews; internal verification of adherence to the review process. | 2025-12-10 | Complete |
| Require executive (C-level) review and approval of the CPR monitoring and escalation policy to ensure organizational accountability and long-term sustainability. | Prevent | Root Cause #2, Root Cause #3 | Documented executive approval records and inclusion of the policy in official internal compliance documentation. | 2025-12-10 | Complete |
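
As an illustration of the kind of systematic control these action items describe, the following Python sketch flags unacknowledged CPRs by age. It is not the CA's implementation, which this report does not describe. The `Cpr` record, the `triage` function, and the 12-hour escalation threshold are all assumptions made for the example; only the 24-hour window comes from BR Section 4.9.5 as cited in this report.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Tuple

RESPONSE_WINDOW = timedelta(hours=24)  # BR 4.9.5 window cited in this report
ESCALATE_AFTER = timedelta(hours=12)   # hypothetical internal threshold


@dataclass
class Cpr:
    """Hypothetical record of one Certificate Problem Report email."""
    message_id: str
    received: datetime
    acknowledged: Optional[datetime] = None


def triage(cprs: List[Cpr], now: datetime) -> Tuple[List[str], List[str]]:
    """Return (escalate, overdue) message IDs for unacknowledged CPRs."""
    escalate, overdue = [], []
    for cpr in cprs:
        if cpr.acknowledged is not None:
            continue  # already handled
        age = now - cpr.received
        if age >= RESPONSE_WINDOW:
            overdue.append(cpr.message_id)
        elif age >= ESCALATE_AFTER:
            escalate.append(cpr.message_id)
    return escalate, overdue


# Example using the receipt time from this incident and a check 13 hours later.
now = datetime(2025, 12, 6, 13, 59, tzinfo=timezone.utc)
inbox = [Cpr("cpr-1", datetime(2025, 12, 6, 0, 59, tzinfo=timezone.utc))]
print(triage(inbox, now))  # (['cpr-1'], [])
```

With a twice-daily review, the longest gap between checks is roughly 12 hours, so a report that arrives just after one review is seen with about half of the 24-hour window remaining. The escalation threshold in the sketch is set to match that gap.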

This report has gone stale. If it is ready for closure, please file a Closure Report.

Flags: needinfo?(hogeun.yoo)

Report Closure Summary

  • Incident description: Failed to acknowledge a CPR regarding caIssuers encoding within the 24-hour window required by TLS BR 4.9.5.
  • Incident Root Cause(s): Reliance on manual monitoring of the reporting mailbox without automated alerting or formalized review procedures for weekends and holidays.
  • Remediation description: Implemented automated mobile/web alerts and established a mandatory twice-daily mailbox review process that covers 365 days a year.
  • Commitment summary: NAVER Cloud Trust Services is committed to maintaining the 24-hour response standard through the newly implemented automated alerts and formalized daily review protocols. These measures have been integrated into our official internal compliance documentation and organizational policy to prevent any future recurrence of delayed responses.

All Action Items disclosed in this report have been completed as described, and we request its closure.

Flags: needinfo?(hogeun.yoo)

This is a final call for comments or questions on this Incident Report.

Otherwise, it will be closed on approximately 2026-01-14.

Flags: needinfo?(incident-reporting)
Whiteboard: [ca-compliance] [policy-failure] [external] → [close on 2026-01-14] [ca-compliance] [policy-failure] [external]
Status: ASSIGNED → RESOLVED
Closed: 5 months ago
Flags: needinfo?(incident-reporting)
Resolution: --- → FIXED
Whiteboard: [close on 2026-01-14] [ca-compliance] [policy-failure] [external] → [ca-compliance] [policy-failure] [external]