Open Bug 1903066 Opened 18 days ago Updated 10 minutes ago

Chunghwa Telecom: Delayed Revocation with Controversial Extension (2.5.29.9, SubjectDirectoryAttributes)

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

ASSIGNED

People

(Reporter: leox, Assigned: leox, NeedInfo)

Details

(Whiteboard: [ca-compliance] [leaf-revocation-delay])

Attachments

(7 files)

Incident Report

Summary

This issue stems from Bugzilla bugs 1887096 and 1899466. Because certificate extension 2.5.29.9 carried a controversial value, we decided after discussion to remove the extension in order to meet the first point of BR 7.1.2.11.5. We are also preparing a schedule to revoke the certificates containing this extension. Because this schedule falls so soon after the previous batch revocation (3/19-5/13), we have not had time to execute our promotion plan, so we can only proceed with batch revocation in the same manner as last time. As a result, it is very difficult to revoke all certificates within the five days stipulated by the BR.

How we first became aware of the problem

As discussed in Bugzilla 1899466, the certificate extension used a controversial value. After internal discussions, it was agreed to remove the extension and revoke the certificates containing it. This action falls too soon after the previous round (3/19-5/13), and we have not had sufficient time to communicate individually with subscribers and conduct policy briefing sessions. Therefore, we will have to proceed with batch revocation in the same manner as last time. Due to the impact on various government agencies, it is very difficult to revoke all certificates within the five days stipulated by the BR.

Impact

A total of 12,911 certificates are affected.

Root Cause

Given that this action is too close to the previous round, we have not been able to individually communicate with subscribers regarding the certificate revocation deadlines (e.g., conducting policy briefing sessions). Therefore, we will proceed with batch revocation in the same manner as the previous operation. For these two incidents, we have compiled the following situations that have led to the inability to complete certificate revocation within five days:

  • Many user contacts do not have an IT background and cannot replace the certificates themselves. They need to contact IT vendors or equipment suppliers, which usually requires 1-2 weeks to schedule an appointment.
  • Government agencies use official documents for approval processes. Replacing certificates requires approval from multiple levels of management, making it very difficult to complete within five days.

Timeline

All times are UTC+8.

2024-05-23

  • 12:00 Investigated, and reviewed the cause.
  • 14:46 Stopped issuing certificates.
  • 16:00 Engineers provided adjustment solutions, updated CPS in the new version
  • 18:00 Adjusted the program, tested issuance
  • 18:30 Resumed normal certificate issuance

As of 18:30 on 5/23, normal certificate issuance has resumed, and subsequent certificates will no longer contain the extension (2.5.29.9 subjectDirectoryAttributes).

2024-05-24

  • Response to first actions taken.

2024-05-27

  • Report to the supervisor for a decision.
  • Decide to revoke certificates containing the controversial extension.

2024-05-28

  • Assess the scope of impact and develop a timeline for certificate revocation.
  • Plan the reissuance process and notify the responsible units.

2024-05-29

  • Begin reissuing certificates.

2024-06-03

  • Completed the first email notification to subscribers regarding certificate reissuance.

2024-06-07

  • According to BR, all affected certificates should be revoked by Day 5.
  • Completed the second email notification to subscribers regarding certificate reissuance.

2024-06-13

  • Revoked the first batch of certificates for users who had replaced them, totaling 4,306 certificates. Additionally, 15 certificates expired, making a total of 4,321 certificates.

2024-06-17

  • Issued the Delayed Revocation Incident Report on Bugzilla.

Action Items

| Action Item | Status | Due Date |
| --- | --- | --- |
| Stop issuing and remove the controversial extension | Finished | 2024-05-23 |
| Response to first actions taken | Finished | 2024-05-24 |
| Report to the supervisor for a decision | Finished | 2024-05-27 |
| Schedule and discuss a meeting | Finished | 2024-05-28 |
| Reissue certificates | Finished | 2024-05-31 |
| Batch Revocation of Replaced Subscriber Certificates | Started | 2024-07-14 |
| Contact subscribers to replace certificates | Started | 2024-07-14 |
| Revoke all affected certificates | Started | 2024-07-14 |

Lessons Learned

What went well

  • Due to the experience from the last mass revocation, we were able to quickly establish a schedule and manage tasks according to the process.
  • Following the previous Incident Report (IR), we have strengthened the CA's capabilities, equipping us with the ability to handle large-scale revocations and reissues.

What didn't go well

  • The previous certificate misissuance (EKU set to critical) and the current use of a controversial OID occurred too close together, making it difficult to effectively communicate with users and improve the existing SOPs that still need refinement.
  • Government administrative processes require approvals before actions can be taken.
  • It is very challenging to explain to government agencies why emergency revocation is necessary and to clarify the differences before and after certificate changes.
  • The number of certificates to be revoked this time is double that of the last time, presenting an even greater challenge, but we will strive to complete the revocation faster than before.

Appendix

  • List of 12,911 affected certificates to be revoked.
  • On 6/13, revoked the first batch of installed subscriber certificates, totaling 4,321 certificates (of which 15 were expired, with an actual revocation of 4,306 certificates).

I'm sure that Chunghwa Telecom are aware of Mozilla's incident report policy for delayed revocation, as tracking and following that policy is a requirement for any included root, but reading this incident report prompts me to remind Chunghwa of these elements:

  • The decision and rationale for delaying revocation will be disclosed in the form of a preliminary incident report immediately; preferably before the BR-mandated revocation deadline. The rationale must include detailed and substantiated explanations for why the situation is exceptional. Responses similar to “we do not deem this non-compliant certificate to be a security risk” are not acceptable. When revocation is delayed at the request of specific Subscribers, the rationale must be provided on a per-Subscriber basis.
  • Your CA will work with your auditor (and supervisory body, as appropriate) and the Root Store(s) that your CA participates in to ensure your analysis of the risk and plan of remediation is acceptable.

The attached list of crt.sh links is not sufficient to meet the standard of per-subscriber detail specified in the policy, and the filing of this report so long after the incident, rather than "immediately", is also a violation of that policy. I don't know if we need a separate incident for the failure to follow the delrev incident report policy, though; I think it's enough to address it here: Please provide detail as to why there was no preliminary incident report filed as per the policy.

Please provide per-subscriber rationale for delaying revocation, including detail about the harm that would result from prompt revocation, and why there were no options that would mitigate that harm.

Please also provide detail about how Chunghwa Telecom worked with its auditors and the root programs.

I have other issues with the actions described in the report so far, but let's get it to at least be complete and then we can discuss the decisions that were made.

Flags: needinfo?(leox)

(In reply to Leo Fang from comment #0)

Given that this action is too close to the previous round, we have not been able to individually communicate with subscribers regarding the certificate revocation deadlines (e.g., conducting policy briefing sessions). Therefore, we will proceed with batch revocation in the same manner as the previous operation. For these two incidents, we have compiled the following situations that have led to the inability to complete certificate revocation within five days:

  • Many user contacts do not have an IT background and cannot replace the certificates themselves. They need to contact IT vendors or equipment suppliers, which usually requires 1-2 weeks to schedule an appointment.
  • Government agencies use official documents for approval processes. Replacing certificates requires approval from multiple levels of management, making it very difficult to complete within five days.

Leo,
In bug 1892419 comment 13, the Chrome root program said,

The revocation timelines described in 4.9 of the TLS BRs must be considered as superseding to those desires of a customer.

In your response to that post you did not acknowledge that revoking certificates is your responsibility and not your Subscriber’s. In that entire bug you have not provided any action items that will address the true root cause of the incident, which is Chunghwa Telecom’s failure to follow through on its responsibilities as a public CA as detailed in the Baseline Requirements.

Now here we are, exactly a month later, and you have announced your intent once again to willfully delay revocation for an extended period. You already are two weeks late on this revocation, and for more than 8000 of these certificates you plan on taking another month.

This despite bug 1899466 comment 11, which promises “relevant reform plans” and bug 1899466 comment 13, which states,

In the viewpoint of Root CA team, I think the revocation should be executed as scheduled to comply with the BR, even if the subscriber window cannot be contacted.

Question 1: Were these two comments empty promises?

Revocation is the first step but not the last. The report in comment 0 does not acknowledge this incident’s true root cause, which is that Chunghwa Telecom willfully dismissed the Baseline Requirements rules in favor of its Subscribers’ convenience. The report contains no action items that address this root cause. Bug 1892419 has the same problem.

I will point you to bug 1889062 comment 15, in which Mozilla clarifies that delayed revocation bugs need to stay open until offending CAs acknowledge their responsibility to pursue revocations correctly and include appropriate action items. Mozilla describes appropriate action items this way:

Action Items should include:

  • technological improvements that describe any technological upgrades or changes to the infrastructure that will help in faster detection and response to incidents requiring revocation;
  • detailed changes to policies and procedures to ensure timely revocation, including new guidelines, checklists, and approval processes; and
  • monitoring and auditing to ensure compliance with such policies and procedures and to identify any lapses quickly.

Update:

  • 6/20 revoked 8,292 certificates
  • 6/20 expired 262 certificates

Total affected certificates: 12,911
Total certificates revoked: 12,598 (97.57%)
Total certificates expired: 277 (2.15%)
Remaining: 36 (0.28%)

Flags: needinfo?(leox)

(In reply to Mike Shaver (:shaver emeritus) from comment #2)

I'm sure that Chunghwa Telecom are aware of Mozilla's incident report policy for delayed revocation, as tracking and following that policy is a requirement for any included root, but reading this incident report prompts me to remind Chunghwa of these elements:

  • The decision and rationale for delaying revocation will be disclosed in the form of a preliminary incident report immediately; preferably before the BR-mandated revocation deadline. The rationale must include detailed and substantiated explanations for why the situation is exceptional. Responses similar to “we do not deem this non-compliant certificate to be a security risk” are not acceptable. When revocation is delayed at the request of specific Subscribers, the rationale must be provided on a per-Subscriber basis.

That's right. We should have posted a preliminary report within 72 hours and then followed up with a supplementary report. We have recognized this oversight and will promptly supply the missing report.

  • Your CA will work with your auditor (and supervisory body, as appropriate) and the Root Store(s) that your CA participates in to ensure your analysis of the risk and plan of remediation is acceptable.

We have subsequently formulated some remedial measures to avoid the next incident.

  1. The terms agreed with users will emphasize that we must abide by the BR time limits and fulfill the CA's responsibilities. Government agencies that are deemed unable to cooperate should not use TLS certificates issued by GTLSCA.
  2. Synchronize the system update procedure: propagate CHT's BR-related procedures and certificate profiles to the CAs of both teams, and complete each update within the deadline to ensure that the certificate format complies with the BR.
  3. Strengthen education and training so that subscribers truly understand the rules the CA must follow to comply with the BR, as well as the policies that require full cooperation from users.

The attached list of crt.sh links is not sufficient to meet the standard of per-subscriber detail specified in the policy, and the filing of this report so long after the incident, rather than "immediately", is also a violation of that policy. I don't know if we need a separate incident for the failure to follow the delrev incident report policy, though; I think it's enough to address it here: Please provide detail as to why there was no preliminary incident report filed as per the policy.

Please provide per-subscriber rationale for delaying revocation, including detail about the harm that would result from prompt revocation, and why there were no options that would mitigate that harm.

For this incident, we used ReasonCode(4) as the revocation reason for all certificates.

Please also provide detail about how Chunghwa Telecom worked with its auditors and the root programs.

The auditor communicates with us by letter. CHT cooperates by providing the requested information and hosting on-site audits, after which the external auditor (an accounting firm) issues an audit report.
The CHT Root CA communicates with browsers through CCADB.

I have other issues with the actions described in the report so far, but let's get it to at least be complete and then we can discuss the decisions that were made.

(In reply to Tim Callan from comment #3)

Leo,
In bug 1892419 comment 13, the Chrome root program said,

The revocation timelines described in 4.9 of the TLS BRs must be considered as superseding to those desires of a customer.

Now here we are, exactly a month later, and you have announced your intent once again to willfully delay revocation for an extended period. You already are two weeks late on this revocation, and for more than 8000 of these certificates you plan on taking another month.

We revoked most of the more than 8,000 certificates on 6/20. The revocation was delayed mainly because the users are all government agencies and require official documents to be signed and returned.

This despite bug 1899466 comment 11, which promises “relevant reform plans” and bug 1899466 comment 13, which states,

In the viewpoint of Root CA team, I think the revocation should be executed as scheduled to comply with the BR, even if the subscriber window cannot be contacted.

Question 1: Were these two comments empty promises?

Revocation is the first step but not the last. The report in comment 0 does not acknowledge this incident’s true root cause, which is that Chunghwa Telecom willfully dismissed the Baseline Requirements rules in favor of its Subscribers’ convenience. The report contains no action items that address this root cause. Bug 1892419 has the same problem.

I will point you to bug 1889062 comment 15, in which Mozilla clarifies that delayed revocation bugs need to stay open until offending CAs acknowledge their responsibility to pursue revocations correctly and include appropriate action items. Mozilla describes appropriate action items this way:

Action Items should include:

  • technological improvements that describe any technological upgrades or changes to the infrastructure that will help in faster detection and response to incidents requiring revocation;
  • detailed changes to policies and procedures to ensure timely revocation, including new guidelines, checklists, and approval processes; and
  • monitoring and auditing to ensure compliance with such policies and procedures and to identify any lapses quickly.

Thank you for your suggestion. We have realized that our attitude towards this incident was wrong and that we should take responsibility as a CA.

(In reply to Leo Fang from comment #5)

The attached list of crt.sh links is not sufficient to meet the standard of per-subscriber detail specified in the policy, and the filing of this report so long after the incident, rather than "immediately", is also a violation of that policy. I don't know if we need a separate incident for the failure to follow the delrev incident report policy, though; I think it's enough to address it here: Please provide detail as to why there was no preliminary incident report filed as per the policy.

Please provide per-subscriber rationale for delaying revocation, including detail about the harm that would result from prompt revocation, and why there were no options that would mitigate that harm.

For this incident, we used ReasonCode(4) as the revocation reason for all certificates.

I will only respond to this answer as it is showing a tremendous misunderstanding on Chunghwa Telecom's behalf. To be clear: this is a delayed revocation incident. As part of this, when a subscriber says they cannot handle the revocation deadline that reason needs to be publicly communicated. It does not have to be a direct quote from the subscriber, but an accurate portrayal of what made things impossible.

Put simply, we need a list of subscribers who went past the 5-day threshold and specifically what was stopping them from accepting the certificate revocation. As an example please see the attachment here.

Flags: needinfo?(leox)

Update:

  • 6/21 revoked 23 certificates

Total affected certificates: 12,911
Total certificates revoked: 12,621 (97.75%)
Total certificates expired: 277 (2.15%)
Remaining: 13 (0.10%)

Flags: needinfo?(leox)

(In reply to Wayne from comment #7)

I will only respond to this answer as it is showing a tremendous misunderstanding on Chunghwa Telecom's behalf. To be clear: this is a delayed revocation incident. As part of this, when a subscriber says they cannot handle the revocation deadline that reason needs to be publicly communicated. It does not have to be a direct quote from the subscriber, but an accurate portrayal of what made things impossible.

Put simply, we need a list of subscribers who went past the 5-day threshold and specifically what was stopping them from accepting the certificate revocation. As an example please see the attachment here.

Due to this delayed revocation, we summarized three reasons:

| Reasons for delayed revocation | Number of certificates affected |
| --- | --- |
| Government agencies need to process official documents and obtain approval from higher-level supervisors before they can carry out replacement operations. | 7,571 |
| Government agencies lack IT capabilities and need to contact IT vendors and equipment suppliers to perform certificate replacements. | 5,304 |
| Government agencies need to obtain supervisor approval for official documents and, based on these documents, schedule IT vendors and equipment suppliers to perform certificate replacements. | 36 |

To date, the revocation rate is 99.90%.

It is expected that after revoking 11 more certificates on June 25, the rate will reach 99.98% (with 2 remaining certificates to be updated, which are expected to be fully revoked by June 30).

(In reply to Leo Fang from comment #9)

(In reply to Wayne from comment #7)

I will only respond to this answer as it is showing a tremendous misunderstanding on Chunghwa Telecom's behalf. To be clear: this is a delayed revocation incident. As part of this, when a subscriber says they cannot handle the revocation deadline that reason needs to be publicly communicated. It does not have to be a direct quote from the subscriber, but an accurate portrayal of what made things impossible.

Put simply, we need a list of subscribers who went past the 5-day threshold and specifically what was stopping them from accepting the certificate revocation. As an example please see the attachment here.

Due to this delayed revocation, we summarized three reasons:

| Reasons for delayed revocation | Number of certificates affected |
| --- | --- |
| Government agencies need to process official documents and obtain approval from higher-level supervisors before they can carry out replacement operations. | 7,571 |
| Government agencies lack IT capabilities and need to contact IT vendors and equipment suppliers to perform certificate replacements. | 5,304 |
| Government agencies need to obtain supervisor approval for official documents and, based on these documents, schedule IT vendors and equipment suppliers to perform certificate replacements. | 36 |

To date, the revocation rate is 99.90%.

It is expected that after revoking 11 more certificates on June 25, the rate will reach 99.98% (with 2 remaining certificates to be updated, which are expected to be fully revoked by June 30).

I see no action items detailing the efforts Chunghwa Telecom will make to ensure that this (delayed revocation) does not happen again. You need to work with your subscribers so that they change their process for "official documents" and for obtaining "supervisor approval" so that in case of another revocation event they can handle revocation within 24 hours/5 days.

I also want to point out that these reasons don't really look compatible with Mozilla's guidance on delayed revocation:
"Mozilla recognizes that in some exceptional circumstances, revoking the affected certificates within the prescribed deadline may cause significant harm, such as when the certificate is used in critical infrastructure and cannot be safely replaced prior to the revocation deadline, or when the volume of revocations in a short period of time would result in a large cumulative impact to the web."

What harms would timely revocation cause?

The attachment for the latest batch of revoked certificates contains search links, meaning you have to click the single search result to view the details on crt.sh. Please provide direct links in the future. Because crt.sh seems to be under a high load currently I couldn't check all of the listed certificates, but for every one I could check, I found no public DNS record.

Flags: needinfo?(leox)

Update:

  • 6/25 revoked 10 certificates
  • 6/25 expired 1 certificate

Total affected certificates: 12,911
Total certificates revoked: 12,631 (97.83%)
Total certificates expired: 278 (2.16%)
Remaining: 2 (0.01%)

Flags: needinfo?(leox)
Attached file revoke0627-crtsh-2.csv

Update:

  • 6/27 revoked 2 certificates

Total affected certificates: 12,911
Total certificates revoked: 12,633 (97.85%)
Total certificates expired: 278 (2.15%)
All affected certificates were revoked or expired.
The revocation was completed 18 days earlier than expected.
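
For readers who want to re-check the final tally above, a minimal Python sketch follows. The function name `revocation_tally` and its return shape are illustrative assumptions, not part of any tooling described in this bug. Percentages are rounded to two decimal places.

```python
def revocation_tally(total: int, revoked: int, expired: int) -> dict:
    """Summarize revocation progress for a batch of affected certificates.

    Hypothetical helper for illustration only.
    """
    if revoked + expired > total:
        raise ValueError("revoked + expired exceeds total")
    remaining = total - revoked - expired

    def pct(count: int) -> float:
        # Share of the affected population, rounded to two decimals.
        return round(100 * count / total, 2)

    return {
        "revoked_pct": pct(revoked),
        "expired_pct": pct(expired),
        "remaining": remaining,
        "remaining_pct": pct(remaining),
    }


# Figures taken from the 6/27 update.
final = revocation_tally(total=12_911, revoked=12_633, expired=278)
print(final)
```

A result with `remaining` equal to zero corresponds to the statement that every affected certificate was either revoked or had expired.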

(In reply to Leo Fang from comment #9)

(In reply to Wayne from comment #7)

I will only respond to this answer as it is showing a tremendous misunderstanding on Chunghwa Telecom's behalf. To be clear: this is a delayed revocation incident. As part of this, when a subscriber says they cannot handle the revocation deadline that reason needs to be publicly communicated. It does not have to be a direct quote from the subscriber, but an accurate portrayal of what made things impossible.

Put simply, we need a list of subscribers who went past the 5-day threshold and specifically what was stopping them from accepting the certificate revocation. As an example please see the attachment here.

Due to this delayed revocation, we summarized three reasons:

| Reasons for delayed revocation | Number of certificates affected |
| --- | --- |
| Government agencies need to process official documents and obtain approval from higher-level supervisors before they can carry out replacement operations. | 7,571 |
| Government agencies lack IT capabilities and need to contact IT vendors and equipment suppliers to perform certificate replacements. | 5,304 |
| Government agencies need to obtain supervisor approval for official documents and, based on these documents, schedule IT vendors and equipment suppliers to perform certificate replacements. | 36 |

To date, the revocation rate is 99.90%.

It is expected that after revoking 11 more certificates on June 25, the rate will reach 99.98% (with 2 remaining certificates to be updated, which are expected to be fully revoked by June 30).

I don’t know how to be more explicit about this than I am now, so I hope it works:

For each subscriber (but ideally for each certificate) you must provide a detailed explanation of why it was impossible for the Subscriber to replace the certificate, and what specific harm to the web ecosystem would have resulted from revoking that Subscriber’s certificates promptly.

A summary of general trends is not sufficient. Saying that an affected site is part of an important industry is not sufficient.

When describing why it was impossible to replace the certificate, it should be the case that the Subscriber would have faced the same challenges in the event of a key compromise. If there is a legislative or regulatory limitation, you should state what that regulation or law is, and what limits it places on this specific situation.

Please comply with this requirement and provide a sufficiently-detailed description, and quickly; this incident report is currently very much not compliant with Mozilla’s incident response policy.

Flags: needinfo?(leox)

(In reply to Zacharias from comment #10)

I see no action items detailing the efforts Chunghwa Telecom will make to ensure that this (delayed revocation) does not happen again. You need to work with your subscribers so that they change their process for "official documents" and for obtaining "supervisor approval" so that in case of another revocation event they can handle revocation within 24 hours/5 days.

We will add the BR revocation deadlines to the user agreement, and we have obtained official consent to do so. This incident has attracted official attention. If similar incidents occur in the future, users will be required to act within the deadline according to the user agreement. The related official document approval process can be completed afterwards so that the deadline is met.

I also want to point out that these reasons don't really look compatible with Mozilla's guidance on delayed revocation:
"Mozilla recognizes that in some exceptional circumstances, revoking the affected certificates within the prescribed deadline may cause significant harm, such as when the certificate is used in critical infrastructure and cannot be safely replaced prior to the revocation deadline, or when the volume of revocations in a short period of time would result in a large cumulative impact to the web."
What harms would timely revocation cause?

All our users are government agencies, and the government's infrastructure spans across various crucial services nationwide. If we revoke all certificates immediately, the impact will be extensive, as illustrated below:

  1. The airport control tower's monitoring system would be unable to function properly, affecting flight takeoff and landing as well as allocation.
  2. The ICU centralized monitoring system in the healthcare system would be paralyzed, affecting patients' medical rights.
  3. Voltage load monitoring would be paralyzed, making it impossible to manage the national power grid, which could lead to regional or nationwide blackouts. Current power load data would be inaccessible.
  4. The railway system would be paralyzed, making it impossible to obtain real-time dispatch information, which could result in railway accidents and jeopardize passenger safety.
  5. The meteorological radar monitoring system would be paralyzed, unable to display radar information normally, thus hindering accurate weather prediction and information dissemination.
  6. Regarding the economy, as it is currently the tax reporting season in our country, it would affect tax reporting, service integration, tax applications, and inquiries.
  7. Government units such as the household registration system would be paralyzed, causing significant disruptions because no domestic services related to individuals or families could be performed.

Availability is also a part of information security. While complying with BR and cybersecurity regulations, we must also consider the availability of services to prevent the paralysis of government digital services.

The attachment for the latest batch of revoked certificates contains search links, meaning you have to click the single search result to view the details on crt.sh. Please provide direct links in the future. Because crt.sh seems to be under a high load currently I couldn't check all of the listed certificates,

We will attempt to provide direct links in the future; the method is still under research.
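
One possible approach, sketched here as an illustration rather than as a method confirmed anywhere in this thread: crt.sh can look up a certificate directly by its SHA-256 fingerprint through the `sha256` query parameter, so a direct link can be derived from the certificate bytes alone. The helper names `pem_to_der` and `crtsh_direct_link` are hypothetical.

```python
import base64
import hashlib


def pem_to_der(pem: str) -> bytes:
    """Decode a single PEM block to DER by dropping the armor lines."""
    body = [
        line.strip()
        for line in pem.strip().splitlines()
        if line.strip() and not line.strip().startswith("-----")
    ]
    return base64.b64decode("".join(body))


def crtsh_direct_link(cert_der: bytes) -> str:
    """Build a crt.sh URL keyed on the certificate's SHA-256 fingerprint.

    Assumes crt.sh's `?sha256=` lookup; this is not an API documented
    in this bug, so verify it against crt.sh before relying on it.
    """
    fingerprint = hashlib.sha256(cert_der).hexdigest().upper()
    return f"https://crt.sh/?sha256={fingerprint}"
```

Applied to each revoked certificate, this would yield one link per row that opens the certificate detail page instead of a search result list.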

but for every one I could check, I found no public DNS record.

Before issuing each certificate, we thoroughly verify the DNS names. Our external auditors review them annually, and they are all compliant.

Flags: needinfo?(leox)

(In reply to Leo Fang from comment #14)

What harms would timely revocation cause?

All our users are government agencies, and the government's infrastructure spans across various crucial services nationwide. If we revoke all certificates immediately, the impact will be extensive, as illustrated below:

  1. The airport control tower's monitoring system would be unable to function properly, affecting flight takeoff and landing as well as allocation.
  2. The ICU centralized monitoring system in the healthcare system would be paralyzed, affecting patients' medical rights.
  3. Voltage load monitoring would be paralyzed, making it impossible to manage the national power grid, which could lead to regional or nationwide blackouts. Current power load data would be inaccessible.
  4. The railway system would be paralyzed, making it impossible to obtain real-time dispatch information, which could result in railway accidents and jeopardize passenger safety.
  5. The meteorological radar monitoring system would be paralyzed, unable to display radar information normally, thus hindering accurate weather prediction and information dissemination.
  6. Regarding the economy, as it is currently the tax reporting season in our country, it would affect tax reporting, service integration, tax applications, and inquiries.
  7. Government units such as the household registration system would be paralyzed, causing significant disruptions because no domestic services related to individuals or families could be performed.

Availability is also a part of information security. While complying with BR and cybersecurity regulations, we must also consider the availability of services to prevent the paralysis of government digital services.

If even one of those is true, it raises serious concerns over why these services are using publicly-trusted roots. I can appreciate the care in handling a single revocation incident for these services, but the ongoing issuance is asking for far more serious incidents in the future. Is there a service of critical national importance that hasn't been built to rely on these certificates?

I am aware that the Root Programs already review these incidents but I am going to have to ask them to weigh in on this publicly. If this is not a single CA issue, then there needs to be some very serious talks urgently on how to avoid building in critical failure points into critical infrastructure. We do not want to wait for when something goes wrong at this scale to have that kind of discussion.

Flags: needinfo?(ryandickson)
Flags: needinfo?(clintw)
Flags: needinfo?(bwilson)

(In reply to Leo Fang from comment #14)

(In reply to Zacharias from comment #10)

I see no action items detailing the efforts Chunghwa Telecom will make to ensure that this (delayed revocation) does not happen again. You need to work with your subscribers so that they change their process for "official documents" and for obtaining "supervisor approval" so that in case of another revocation event they can handle revocation within 24 hours/5 days.

We will add the revocation deadlines of the BR to the user agreement, and we have obtained official consent. This incident has attracted official attention. In case of similar incidents in the future, users will be required to take action within the deadline according to the user agreement. The relevant official document approval process can be completed later to ensure that the deadline is met.

I also want to point out that these reasons don't really look compatible with Mozilla's guidance on delayed revocation:
"Mozilla recognizes that in some exceptional circumstances, revoking the affected certificates within the prescribed deadline may cause significant harm, such as when the certificate is used in critical infrastructure and cannot be safely replaced prior to the revocation deadline, or when the volume of revocations in a short period of time would result in a large cumulative impact to the web."
What harms would timely revocation cause?

All our users are government agencies, and the government's infrastructure spans across various crucial services nationwide. If we revoke all certificates immediately, the impact will be extensive, as illustrated below:

  1. The airport control tower's monitoring system would be unable to function properly, affecting flight takeoff and landing as well as allocation.
  2. The ICU centralized monitoring system in the healthcare system would be paralyzed, affecting patients' medical rights.
  3. Voltage load monitoring would be paralyzed, making it impossible to manage the national power grid, which could lead to regional or nationwide blackouts. Current power load data would be inaccessible.
  4. The railway system would be paralyzed, making it impossible to obtain real-time dispatch information, which could result in railway accidents and jeopardize passenger safety.
  5. The meteorological radar monitoring system would be paralyzed, unable to display radar information normally, thus hindering accurate weather prediction and information dissemination.
  6. Regarding the economy, as it is currently the tax reporting season in our country, it would affect tax reporting, service integration, tax applications, and inquiries.
  7. Government units such as the household registration system would be paralyzed, causing significant disruptions as all domestic services related to individuals or families would be unperformable.

While I agree that disruption of the listed services would risk almost certain harm, of these, numbers 6 and 7 are obviously of importance for the public to have access to. But numbers 1-5? Those services, that are critical to the public’s interests, are not relevant for them to access. Quite the opposite in fact: if I was asked to make an ICU monitoring system publicly available I would assume that it was a joke and refuse to assist if it wasn’t.

Seriously, Taiwan isn’t that big. Why not run a dedicated network for these critical services?

Availability is also a part of information security. While complying with BR and cybersecurity regulations, we must also consider the availability of services to prevent the paralysis of government digital services.

The attachment for the latest batch of revoked certificates contains search links, meaning you have to click the single search result to view the details on crt.sh. Please provide direct links in the future. Because crt.sh seems to be under a high load currently I couldn't check all of the listed certificates,

We will attempt to provide direct links in the future; the method is still under research.

Thank you

but for every one I could check, I noticed that there was no public DNS record for them.

Before issuing each certificate, we thoroughly verify the DNS names. Our external auditors review them annually, and they are all compliant.

I’m not saying you didn’t verify the DNS names, I’m saying I couldn’t even find any. But maybe this is user error on my part. But now that I know what many of these certificates are for, I feel a bit safer knowing that I couldn’t connect to many of their subjects.

Flags: needinfo?(leox)

Let me ask this in the most clear way possible:

Let's say an incident happens June 2025 that requires all of these certificates to be revoked and reissued in either 24 or 120 hours.

Will Chunghwa Telecom be able to actually do that? or would we see excuses like:

The meteorological radar monitoring system would be paralyzed, unable to display radar information normally, thus hindering accurate weather prediction and information dissemination.
Regarding the economy, as it is currently the tax reporting season in our country, it would affect tax reporting, service integration, tax applications, and inquiries.
Government units such as the household registration system would be paralyzed, causing significant disruptions as all domestic services related to individuals or families would be unperformable.

If you do say that you will not be delaying revocation in the future, then what action item is actually making that possible for your CA?
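
For reference, the two timelines in the question above can be worked out mechanically. The following is a minimal sketch, not anything Chunghwa Telecom or the BRs prescribe: the function name `revocation_deadline`, the window labels, and the example start time are all hypothetical, and it assumes the clock starts at the moment the CA becomes aware of the problem.

```python
from datetime import datetime, timedelta, timezone

# The two revocation windows mentioned in the thread (24 hours and 120 hours / 5 days).
# The dictionary keys are illustrative labels, not terms taken from the BRs.
REVOCATION_WINDOWS = {
    "24h": timedelta(hours=24),
    "5d": timedelta(hours=120),
}


def revocation_deadline(aware_at: datetime, window: str) -> datetime:
    """Return the latest revocation time for a given window.

    `aware_at` is assumed to be the time the CA became aware of the problem.
    """
    return aware_at + REVOCATION_WINDOWS[window]


# Hypothetical incident detected at midnight UTC on 1 June 2025.
aware_at = datetime(2025, 6, 1, 0, 0, tzinfo=timezone.utc)
for label in REVOCATION_WINDOWS:
    print(label, revocation_deadline(aware_at, label).isoformat())
```

With a fixed deadline computed up front, any subscriber process (document approval, supervisor sign-off) has to be measured against that timestamp rather than the other way around.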

(In reply to Wayne from comment #15)

I am aware that the Root Programs already review these incidents but I am going to have to ask them to weigh in on this publicly. If this is not a single CA issue, then there needs to be some very serious talks urgently on how to avoid building in critical failure points into critical infrastructure. We do not want to wait for when something goes wrong at this scale to have that kind of discussion.

Agreed. Ongoing discussions need to be prioritized and accelerated. We also need to obtain all necessary information and reach the right conclusions at a faster pace to address these current situations and have solutions in place ahead of time.

Flags: needinfo?(bwilson)

All our users are government agencies, and the government's infrastructure spans across various crucial services nationwide.

Which of these crucial services are intended to be accessed by major web browsers (Firefox, Chrome, Safari, Edge) that use the latest versions of the major WebPKI root stores (that are managed by Mozilla, Chrome, Apple, Microsoft)?
And conversely, which of these crucial services are only intended to be accessed by client software that does not use such a root store?

Thought experiment: Comment #14 claims that the "impact will be extensive" if these certificates are revoked (via CRL and/or OCSP). An alternative way forward could be to leave the affected certificates unrevoked and to instead request removal of the affected root(s) from the various WebPKI root stores. If this happened, what would be the expected impact?

To be clear: I'm not suggesting a browser-enforced "distrust" here. I'm simply wondering if CA-initiated root removal is a viable solution to this problem, perhaps combined with spinning up a new root that only serves legitimate WebPKI use cases.

Flags: needinfo?(clintw)

(In reply to Mike Shaver (:shaver emeritus) from comment #13)

(In reply to Leo Fang from comment #9)
I don’t know how to be more explicit about this than I am now, so I hope it works:
For each subscriber (but ideally for each certificate) you must provide a detailed explanation of why it was impossible for the Subscriber to replace the certificate, and what specific harm to the web ecosystem would have resulted from revoking that Subscriber’s certificates promptly.
A summary of general trends is not sufficient. Saying that an affected site is part of an important industry is not sufficient.

The reason we could not revoke in time on this occasion is the official document approval process and the limited IT capability of personnel. As mentioned in the previous response, we will include and emphasize the BR and CPS regulations in the user agreement in the future.

When describing why it was impossible to replace the certificate, it should be the case that the Subscriber would have faced the same challenges in the event of a key compromise. If there is a legislative or regulatory limitation, you should state what that regulation or law is, and what limits it places on this specific situation.
Please comply with this requirement and provide a sufficiently-detailed description, and quickly; this incident report is currently very much not compliant with Mozilla’s incident response policy.

We agree with your point of view. We will emphasize the revocation deadlines set by the BR and CPS in the user agreement, so that users have a clear basis for understanding that the CA must revoke all affected certificates within the specified timeframe.

Flags: needinfo?(leox)

(In reply to Wayne from comment #15)

If even one of those is true, it raises serious concerns over why these services are using publicly-trusted roots. I can appreciate the care in handling a single revocation incident for these services, but the ongoing issuance is asking for far more serious incidents in the future. Is there a service of critical national importance that hasn't been built to rely on these certificates?

  1. We will amend the user agreement to emphasize that, under the BR and CPS, the CA must revoke all affected certificates within the deadline, giving users a basis for action.
  2. We will strengthen user outreach so that users better understand their responsibilities. If they cannot comply with the BR and CPS, their service is critical infrastructure that is not suited to "Public Trusted" SSL certificates, and they should find another solution.

I am aware that the Root Programs already review these incidents but I am going to have to ask them to weigh in on this publicly. If this is not a single CA issue, then there needs to be some very serious talks urgently on how to avoid building in critical failure points into critical infrastructure. We do not want to wait for when something goes wrong at this scale to have that kind of discussion.

This issue was discussed at the CA/Browser Forum F2F Meeting in May, and we recommend referring to the opinions provided by the Root Programs.

(In reply to Zacharias from comment #16)

While I agree that disruption of the listed services would risk almost certain harm, of these, numbers 6 and 7 are obviously of importance for the public to have access to. But numbers 1-5? Those services, that are critical to the public’s interests, are not relevant for them to access. Quite the opposite in fact: if I was asked to make an ICU monitoring system publicly available I would assume that it was a joke and refuse to assist if it wasn’t.
Seriously, Taiwan isn’t that big. Why not run a dedicated network for these critical services?

Yes, these critical infrastructures currently all have dedicated networks. However, because the regulatory authorities require all units to implement HTTPS connections, various government agencies have applied for certificates issued by GTLSCA. In the future, we will focus on promoting dedicated encryption methods for these critical infrastructures and for web services provided to specific users.

I’m not saying you didn’t verify the DNS names, I’m saying I couldn’t even find any. But maybe this is user error on my part. But now that I know what many of these certificates are for, I feel a bit safer knowing that I couldn’t connect to many of their subjects.

Thank you, Zacharias. Please refer to the Direct Link provided in the attachment.
revokedList-crtshID-12911.csv
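
To illustrate the difference between a search link and a direct link, here is a minimal sketch of how direct crt.sh links could be built from numeric certificate IDs. It assumes the attachment lists one crt.sh ID per certificate (as the file name suggests) and relies on crt.sh serving a certificate's detail page at `https://crt.sh/?id=<ID>`. The function names and the sample IDs are hypothetical and are not taken from the attachment.

```python
from urllib.parse import urlencode

CRT_SH_BASE = "https://crt.sh/"


def direct_link(crtsh_id: int) -> str:
    """Build a link that opens the certificate's detail page directly."""
    return f"{CRT_SH_BASE}?{urlencode({'id': crtsh_id})}"


def direct_links(crtsh_ids):
    """Build direct links for a sequence of crt.sh IDs, preserving order."""
    return [direct_link(i) for i in crtsh_ids]


# Hypothetical IDs, for illustration only.
for url in direct_links([12345, 67890]):
    print(url)
```

A link of this form needs no intermediate search step, which also reduces the number of requests a reviewer has to make when crt.sh is under load.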

(In reply to amir from comment #17)

Let me ask this in the most clear way possible:
Let's say an incident happens June 2025 that requires all of these certificates to be revoked and reissued in either 24 or 120 hours.
Will Chunghwa Telecom be able to actually do that? or would we see excuses like:

The meteorological radar monitoring system would be paralyzed, unable to display radar information normally, thus hindering accurate weather prediction and information dissemination.
Regarding the economy, as it is currently the tax reporting season in our country, it would affect tax reporting, service integration, tax applications, and inquiries.
Government units such as the household registration system would be paralyzed, causing significant disruptions as all domestic services related to individuals or families would be unperformable.
If you do say that you will not be delaying revocation in the future, then what action item is actually making that possible for your CA?

Thank you, Amir, for your clear questions.

We have two measures for this:

  1. Emphasize in the user agreement the deadlines for revocation based on BR and CPS, providing users with a basis for processing. We will reiterate this in annual promotion with users, stressing that users should establish a response process that can be completed within the deadlines; otherwise, users will not comply with the requirements of the GTLSCA user agreement.
  2. Focus on promoting the adoption of dedicated encryption methods for these critical infrastructures or web services provided to specific users.

Although we have proposed relevant solutions, we still need some time for advocacy and communication with users.

(In reply to Ben Wilson from comment #18)

Agreed. Ongoing discussions need to be prioritized and accelerated. We also need to obtain all necessary information and reach the right conclusions at a faster pace to address these current situations and have solutions in place ahead of time.

Thank you, Ben. We currently propose two corresponding measures; please refer to our reply to amir above.

(In reply to Rob Stradling from comment #19)

All our users are government agencies, and the government's infrastructure spans across various crucial services nationwide.
Which of these crucial services are intended to be accessed by major web browsers (Firefox, Chrome, Safari, Edge) that use the latest versions of the major WebPKI root stores (that are managed by Mozilla, Chrome, Apple, Microsoft)?
And conversely, which of these crucial services are only intended to be accessed by client software that does not use such a root store?

Currently, most of our country's digital services are web-based, with a high dependency on browsers, especially those used by the general public. For web services provided to specific users, dedicated networks can be considered.

We recommend that government agencies evaluate on their own whether to adopt dedicated networks for critical infrastructures or web services provided to specific users. If GTLSCA certificates are still used, they should establish complementary measures that can be completed within the specified deadline.

Thought experiment: Comment #14 claims that the "impact will be extensive" if these certificates are revoked (via CRL and/or OCSP). An alternative way forward could be to leave the affected certificates unrevoked and to instead request removal of the affected root(s) from the various WebPKI root stores. If this happened, what would be the expected impact?
To be clear: I'm not suggesting a browser-enforced "distrust" here. I'm simply wondering if CA-initiated root removal is a viable solution to this problem, perhaps combined with spinning up a new root that only serves legitimate WebPKI use cases.

We recommend that government agencies communicate to all units that for critical infrastructures or web services provided to specific users, dedicated networks should be used, and "Public Trusted" certificates should not be adopted.

(In reply to Leo Fang from comment #24)

(In reply to amir from comment #17)

Let me ask this in the most clear way possible:
Let's say an incident happens June 2025 that requires all of these certificates to be revoked and reissued in either 24 or 120 hours.
Will Chunghwa Telecom be able to actually do that? or would we see excuses like:

The meteorological radar monitoring system would be paralyzed, unable to display radar information normally, thus hindering accurate weather prediction and information dissemination.
Regarding the economy, as it is currently the tax reporting season in our country, it would affect tax reporting, service integration, tax applications, and inquiries.
Government units such as the household registration system would be paralyzed, causing significant disruptions as all domestic services related to individuals or families would be unperformable.
If you do say that you will not be delaying revocation in the future, then what action item is actually making that possible for your CA?

Thank you, Amir, for your clear questions.

We have two measures for this:

  1. Emphasize in the user agreement the deadlines for revocation based on BR and CPS, providing users with a basis for processing. We will reiterate this in annual promotion with users, stressing that users should establish a response process that can be completed within the deadlines; otherwise, users will not comply with the requirements of the GTLSCA user agreement.
  2. Focus on promoting the adoption of dedicated encryption methods for these critical infrastructures or web services provided to specific users.

Although we have proposed relevant solutions, we still need some time for advocacy and communication with users.

These are not compatible with the requirements of being a CA. Your CA is currently in non-compliance with regards to the BRs and various root store policies.

I'd like you to explain what decision making process happened (if any) at Chunghwa Telecom for accepting these critical infrastructure use cases with no regard to Chunghwa's prior commitments to the root stores and root programs? This should be considered part of your root cause analysis. How many Chunghwa Telecom staff were involved? Was a risk analysis done?

Also, to make sure we're on the same page, if there was a domain validation error on your end that required a 24 hour revocation timeline for all of your issued certificates, would you be delaying that revocation too as there is critical infrastructure on the line?

Furthermore, I would like a commitment, with a deadline, that if a mass-revocation is necessary by any incident beyond {SomeDate}, you will NOT be delaying revocation. While I'm happy that you're going to encourage your customers to adopt private PKI, I want a firm date where your CA will draw the line and say no more delayed revocations beyond this date.

(In reply to amir from comment #27)

I'd like you to explain what decision making process happened (if any) at Chunghwa Telecom for accepting these critical infrastructure use cases with no regard to Chunghwa's prior commitments to the root stores and root programs? This should be considered part of your root cause analysis. How many Chunghwa Telecom staff were involved? Was a risk analysis done?

We only verify the ownership of the DNS according to the domain validation procedures of the OV level. We do not know what specific services the users install the certificates on. We fulfill our CA duties in accordance with the CPS.

It is only when an event requiring certificate revocation occurs and we notify users through announcements and letters that users inform us why they cannot complete the replacement within the specified deadline. We then consider the importance of the service and the system's availability and reluctantly delay the revocation.

Through these two large-scale revocations, we have analyzed, based on user feedback, where users specifically install these SSL certificates and how their administrative processes differ. This is why we did not immediately include these reasons in the root cause analysis.

In past risk analyses, the focus was mainly on whether the validation process was thorough, whether system availability met the requirements, whether there was proper hierarchical authorization of personnel, and the risk analysis of system security. We had not considered how to respond to such large-scale revocations. Moving forward, we will include this in our risk assessments.

Also, to make sure we're on the same page, if there was a domain validation error on your end that required a 24 hour revocation timeline for all of your issued certificates, would you be delaying that revocation too as there is critical infrastructure on the line?

Based on these two large-scale revocations, we have also emphasized to stakeholders and users the necessity of complying with BR and CPS. In the event of incidents such as domain validation errors, we will notify users by letter to replace their certificates and then proceed to revoke their certificates within the deadline specified by BR.

Furthermore, I would like a commitment, with a deadline, that if a mass-revocation is necessary by any incident beyond {SomeDate}, you will NOT be delaying revocation. While I'm happy that you're going to encourage your customers to adopt private PKI, I want a firm date where your CA will draw the line and say no more delayed revocations beyond this date.

If possible, we hope to clearly specify in the user agreement the reasons and time limits for revocation mandated by the BR. One year after these stipulations are clearly marked, starting July 1, 2025, there will be no further delays in revocation.

The main reason is that the certificate validity period is one year. Although our user agreement already includes compliance with BR and CPS, these terms were not previously emphasized, and users did not pay attention to them. Therefore, we hope to promote awareness among users so that they have a clear understanding and can fully grasp these emphasized rules when renewing certificates. This way, in future large-scale revocations, users will no longer have excuses for delays.
