Telekom Security: Revocation delay for TLS certificates with basicConstraints not marked as critical
Categories: CA Program :: CA Certificate Compliance, task
Tracking: Not tracked
People: Reporter: Arnold.Essing, Assigned: Arnold.Essing
References: Blocks 1 open bug
Details: Whiteboard: [ca-compliance] [leaf-revocation-delay]
Attachments: 1 file (29.21 KB, text/csv)
Summary
Telekom Security issued TLS certificates that contain the basicConstraints extension without marking it as critical, as described in https://bugzilla.mozilla.org/show_bug.cgi?id=1875820. Under BR section 4.9.1.1, item 12, the affected certificates should have been revoked within 5 days at the latest.
As described in the above-mentioned bug, not all affected certificates could be replaced by the customers within the given period. Because the continuity of some customers' critical infrastructures depends on the affected certificates, not all of them were revoked in time.
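For readers unfamiliar with the underlying check, the rule behind this incident can be sketched roughly as follows. This is only an illustrative model, not the lint used by crt.sh or zlint: the `Extension` record and the `basic_constraints_lint` function are hypothetical names, and a real check would parse the certificate with an X.509 library instead of taking pre-parsed records. The only fixed value is the OID of basicConstraints, 2.5.29.19.

```python
from dataclasses import dataclass

# OID of the X.509 basicConstraints extension (id-ce-basicConstraints).
BASIC_CONSTRAINTS_OID = "2.5.29.19"


@dataclass(frozen=True)
class Extension:
    """Hypothetical, simplified stand-in for one parsed X.509 extension."""
    oid: str
    critical: bool


def basic_constraints_lint(extensions):
    """Apply the rule: basicConstraints MAY appear; if present it MUST be critical.

    Returns "pass", "error", or "na" (extension not present).
    """
    for ext in extensions:
        if ext.oid == BASIC_CONSTRAINTS_OID:
            return "pass" if ext.critical else "error"
    return "na"


# A certificate shaped like the affected ones: extension present, not critical.
affected = [Extension(BASIC_CONSTRAINTS_OID, critical=False)]
result = basic_constraints_lint(affected)
```

In this model the affected certificates produce "error", while a certificate that omits the extension entirely is not flagged.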
Impact
336 of the affected certificates were not revoked in time; these certificates are the subject of this bug.
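As a rough cross-check, the figure above can be combined with the 41.17% share quoted later in the discussion to estimate the total number of affected certificates. That total is not stated in this report; the value computed below is an inference from the two published figures, not a number provided by Telekom Security.

```python
unrevoked = 336         # from the Impact section of this report
quoted_share = 0.4117   # 41.17 %, as quoted in the discussion below

# Total number of affected certificates implied by the two figures.
# This is inferred, not stated anywhere in this report.
implied_total = round(unrevoked / quoted_share)

# Share recomputed from the implied total, in hundredths of a percent,
# truncated (integer arithmetic avoids floating-point rounding).
share_hundredths = unrevoked * 10_000 // implied_total

print(implied_total, share_hundredths / 100)
```

Under these assumptions the implied total is 816 certificates, and 336 of 816 truncates to the quoted 41.17%.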
Timeline
All times are UTC.
2024-01-22:
16:37 The affected customers were informed and were asked to exchange and revoke the affected certificates as soon as possible but within 5 days at the latest.
Since 2024-01-23 (ongoing):
Regular contact with the affected customers, mostly by phone.
2024-01-26:
08:00 Video conference with the management of the Trust Center to decide whether certificates should not be revoked due to their criticality.
09:57 Last email to affected customers with a plea for a detailed explanation of the reasons why certificates should not be revoked, e.g. due to the criticality of the infrastructures in which these certificates are used.
2024-01-27:
10:00 Video conference between the root and compliance team and the solution management of the affected solution for a final check of whether all certificates had been revoked within the 5-day period, except those declared as critical by the affected customers.
Root Cause Analysis
The affected certificates belong to enterprise customers who use them in their infrastructures. Some customers had a large number of affected certificates, in some cases more than a hundred.
Because the necessary changes to the customers' critical infrastructures had to be planned and implemented carefully, not all customers were able to replace all certificates within the given timeline.
Because revoking certificates that had not yet been replaced would have had a major impact on the continuity of the customers' infrastructures, not all certificates were revoked within the 5-day period.
Lessons learned
What went well?
Most of the affected customers reacted at short notice.
What didn’t go well?
Some of the affected customers did not react at short notice, so many repeated inquiries to these customers were needed.
Where we got lucky
n/a
Action Items
Action Item | Kind | Due Date
Sensitization of the customers with regard to reacting at short notice and being prepared for faster replacement procedures | Mitigate | ongoing
Appendix
The list of the affected certificates that could not be revoked within the 5-day period, with links to crt.sh (precertificates, as not all leaf certificates are published in crt.sh), is attached.
Comment 1•2 years ago
Thanks for the incident report Arnold,
I feel like "Sensitization of the customers" is not guaranteed to be able to "Mitigate" this risk and prevent this from happening or reducing its scope.
Are there any other steps your CA could take to ensure this will not happen (at this scale) again in the future? Being unable to revoke 41.17% of your certificates when they're in the hundreds is not very encouraging, especially when they may get in the tens of thousands.
Do you have any plans to support and direct customers at issuance with ACME, and supporting protocols such as ARI, to lower this number using technical controls?
Do you have some cut-off point for certificates without a "detailed explanation of the reasons"?
From my point of view, pleas over e-mail and phone did not prove effective and will not scale when you have more customers.
Comment 3•2 years ago
Hello Antonis,
Thanks for your questions, to which we would like to give feedback.
Are there any other steps your CA could take to ensure this will not happen (at this scale) again in the future? Being unable to revoke 41.17% of your certificates when they're in the hundreds is not very encouraging, especially when they may get in the tens of thousands.
As this involves the use of certificates in infrastructure components, automation is not as simple as using certificates purely as web server certificates, for which automation is possible via ACME, for example. From our point of view, raising customer awareness for the implementation of more and more automation is essential for this, as it is not possible to contractually oblige all customers to automate.
However, we would also like to emphasize that "unable to revoke 41.17% of your certificates" does not, in our opinion, reflect all the circumstances. We are certainly able to revoke all certificates and are willing to do so if necessary. Since the basicConstraints in end entity certificates are optional in both RFC5280 and the S/MIME-BR, we considered the potential security risk to be very low and decided not to enforce revocation within 5 days for the declared certificates, whose revocation would impact critical infrastructures.
Do you have any plans to support and direct customers at issuance with ACME, and supporting protocols such as ARI, to lower this number using technical controls?
We have currently implemented ACME for the issuance of DV certificates and we plan to offer automation, e.g. via ACME or other protocols, for the issuance of OV and EV certificates in the future, as is currently being discussed in the SCWG.
Do you have some cut-off point for certificates without a "detailed explanation of the reasons"?
The cut-off point would have been the 5-day period as per BR#4.9.1.1. To this end, shortly before the 5-day period ended (2024-01-27, see timeline above), we evaluated whether all certificates had been revoked or whether the customers had provided plausible reasons to delay the revocation (including the risk assessment from our side).
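The cut-off described here amounts to simple date arithmetic, sketched below. Which event starts the 5-day clock is an assumption in this sketch: the customer notification at 16:37 UTC on 2024-01-22 from the timeline is used purely as an illustration, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Revocation window referred to in this bug (BR section 4.9.1.1).
REVOCATION_WINDOW = timedelta(days=5)


def revocation_deadline(clock_start):
    """Latest permitted revocation time for a given start of the clock."""
    return clock_start + REVOCATION_WINDOW


def is_late(clock_start, revoked_at):
    """True if a revocation happened after the deadline."""
    return revoked_at > revocation_deadline(clock_start)


# Assumed clock start: customer notification from the timeline (UTC).
notified = datetime(2024, 1, 22, 16, 37, tzinfo=timezone.utc)
deadline = revocation_deadline(notified)
print(deadline.isoformat())
```

With this assumed start, the deadline falls on 2024-01-27 at 16:37 UTC, which matches the date the report treats as the end of the 5-day period.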
From my point of view, pleas over e-mail and phone did not prove effective and will not scale when you have more customers.
We agree that pleas over phone (in addition to obligatory mass emails) may not prove effective for standard offerings with thousands of customers with less than a handful of certificates each, but they can prove effective and, in our view, must be done in Enterprise RA environments where the affected customers have dozens or even hundreds of certificates, as was the case here.
Comment 4•2 years ago
This report doesn't meet the expectations set by Mozilla for CAs responding to a revocation incident. It doesn't include "detailed and substantiated explanations for why the situation is exceptional."
(In reply to Arnold Essing from comment #3)
We are certainly able to revoke all certificates and are willing to do so if necessary. Since the basicConstraints in end entity certificates are optional in both RFC5280 and the S/MIME-BR, we considered the potential security risk to be very low and decided not to enforce revocation within 5 days for the declared certificates, whose revocation would impact critical infrastructures.
In what situation would you consider revocation necessary? You are already required to revoke these certificates by the Baseline Requirements and by Mozilla Root Store Policy.
Specifically, your response is called out by Mozilla as an example of an unacceptable rationale:
Responses similar to “we do not deem this non-compliant certificate to be a security risk” are not acceptable. When revocation is delayed at the request of specific Subscribers, the rationale must be provided on a per-Subscriber basis.
I'm wondering if the root programs are not here to enforce the revocation requirements, should we remove these revocation requirements?
My gut feeling is that some CAs are providing promises to their downstream entities that they knowingly can not keep. I'll repeat what I said in the Buypass incident, you are not a private PKI. Your commitment is not only to your subscribers, but to everyone on the web.
At this point though, the ball is in the court of the root programs. At what point do these root programs make it clear that delayed revocation is not acceptable, and enforce this on the CAs? The lack of enforcement of these rules effectively makes them impossible to enforce in more dire circumstances.
My take here: If you're unable to revoke a handful of certificates within a 5 day time frame, you should not be a CA. Being a CA is hard, and you have the trust of the world placed in you. That trust should not come lightly.
Comment 6•2 years ago
It sounds as though you contacted the Subscribers and chose which revocations to delay based on which Subscribers asked to delay the revocation. Per the link that Mathew Hodson provided above, you’ll need to provide specific rationales for each Subscriber:
When revocation is delayed at the request of specific Subscribers, the rationale must be provided on a per-Subscriber basis.
Each individual rationale must be compelling in its own right. “The Subscriber doesn’t have a way to replace the certificates in time” isn’t likely to cut it.
Comment 7•2 years ago
(In reply to Comment #5)
Thank you for sharing your perspective!
Speaking only on behalf of Chrome, it’s important to clarify that we do expect CA Owners to comply with our program policy, their own policies, and the BRs. However, the enforcement mechanisms and the visibility of these actions might not always be immediate or publicly evident. The frequency of a particular incident for a CA Owner, the quality of responses (detail, transparency, etc.), and commitment to make meaningful and demonstrable change aligned with evidenced continuous improvement are all significant factors when we evaluate CA Owners for initial and continued inclusion in the Chrome Root Store.
While a single incident of delayed revocation by a CA might not lead to an explicit enforcement action, it does feed into our constant evaluation and assessment of a CA Owner's ability to comply with the policies and commitments they have made to the ecosystem, and their ability to competently and reliably serve the community. These evaluations may result in the removal of a CA Owner from the Chrome Root Store, or the application of other technical controls that affect how the certificates they issue are trusted in Chrome.
Comment 8•2 years ago
For each of these affected Telekom Security Subscribers:
- What other approaches or solutions were considered or explored?
- How can their infrastructures and their planning be improved so that Telekom Security can uphold its policy commitments?
Also, we would highly encourage additional detail to be provided in this report. Minimally:
Request for update #1: The Timeline section should be updated to include the separate incident that determined the need for revocation, which was 1875820 on January 22, 2024. All of the events detailed on the CCADB incident report page need to be included in the Timeline.
Request for update #2: The Root Cause Analysis section should be updated to include a detailed analysis of the combined conditions that created the issue. As stated on the CCADB incident report page “It is unusual for an incident to have a single root cause”. What were all of the conditions which combined to give rise to this issue? When did they first arise and how did they avoid detection, especially considering the topic of delayed revocation is longstanding.
Request for update #3: We would also encourage a more robust analysis of Lessons Learned and corresponding Action Items. These could possibly be related to the other approaches or solutions considered when analyzing the affected Subscribers. Consider how your Action Items will instill confidence in the community that this issue will not recur in the future.
Comment 9•2 years ago
(in reply to Comment #4 and Comment #6)
We are aware that all affected certificates must be revoked according to the BR, we did not want to put this into question. By "if necessary" we meant that we as a CA would revoke the certificates if the affected Enterprise RAs, who are primarily obliged to revoke the affected certificates, did not react in time. Accordingly, we have now revoked all affected certificates that have not yet been revoked.
As mentioned in the timeline above, on January 22 we asked all affected customers to revoke their certificates as soon as possible, but within 5 days at the latest. The following customers then asked for more time as they were not able to replace all certificates within the given time. They have all credibly explained to us that revoking the not yet replaced certificates would have a serious impact on their infrastructures and the associated business processes, as the continuity of their infrastructures, or at least parts thereof, depends on these certificates:
- GK SOFTWARE SE: Internationally operating service provider (SAP among others)
- NTT Data Business Solutions AG: Internationally operating service provider (SAP among others)
- Ratiodata SE: IT service provider in the banking sector
- Polizei Bayern: IT Infrastructure of the police in Bavaria
- Satellic NV: IT Infrastructure of the toll system in Belgium
- AKDB: Data center for German municipalities
- Hochtief AG: IT Infrastructure of “Hochtief”, a global infrastructure group (construction)
- Deka Bank: IT Infrastructure of the German “Deka Bank”
- Deutsche Telekom Group: IT Infrastructure of Mobile/Voice Services
The above list indicates the infrastructures of the affected customers. Due to the protection of internal customer information, we cannot disclose more detailed information, e.g., on the reasons for the time needed by the customers for the replacement of the certificates or the affected systems in detail.
Note: We appreciate the discussion in the CA/Browser Forum regarding a third revocation period for TLS certificates that need to be revoked but are not creating any harm to certificate consumers or contain misleading information. (see https://cabforum.org/2023/12/07/minutes-of-the-f2f-60-meeting-in-portsmouth-nh-usa-3-5-october-2023-scwg-4-october/, https://bugzilla.mozilla.org/show_bug.cgi?id=1861069#c8)
(In reply to Comment #8)
We will update the bug shortly regarding the timeline, root cause and lessons learned.
Comment 10•2 years ago
(in reply to Comment#8)
To support automation in Enterprise RA environments as well, we implemented the option of requesting certificates via CMP some time ago. As of now, CMP is widely used by the Enterprise RAs for requesting S/MIME certificates but less for TLS certificates. As already mentioned, we have so far also implemented ACME for the application of DV certificates and will extend this to OV and EV certificates as far as possible. To this end, we welcome the current discussion in the SCWG about amending the EVGL for automation.
However, we do not know all of our customers' infrastructures and are therefore unable to assess the opportunities for automation in these infrastructures. Based on the lessons we have learned from this bug, we will not only sensitize customers to more automation, but we will also offer training and workshops for our Enterprise RAs on this topic. In future, we will also carry out annual self-assessments for our Enterprise RAs, in which we will also ask about the status of the implementation of automated processes.
Update timeline (including relevant extracts from the original bug https://bugzilla.mozilla.org/show_bug.cgi?id=1875820):
2023-04-22
BR version 2.0.0 adopted
2023-09-15
BR version 2.0.0 effective, beginning of the misissuance
2024-01-07
zlint version 3.6.0 was released
2024-01-22
08:25 The error message “ERROR: basicConstraints MAY appear in the certificate, and when it is included MUST be marked as critical” in crt.sh was found in our weekly checks
11:37 The list of the affected certificates and customers was provided
16:37 The affected customers were informed and were asked to exchange and revoke the affected certificates as soon as possible but within 5 days at the latest
2024-01-23 to 2024-02-05
Regular contact with the affected customers, mostly by phone.
2024-01-26
08:00 Video conference with the management of the Trust Center to decide whether certificates should not be revoked due to their criticality.
09:57 Last email to affected customers with a plea for a detailed explanation of the reasons why certificates should not be revoked, e.g. due to the criticality of the infrastructures in which these certificates are used.
2024-01-27 (end of the 5-day period)
10:00 Video conference between the root and compliance team and the solution management of the affected solution for a final check of whether all certificates had been revoked within the 5-day period, except those declared as critical by the affected customers. 336 were not yet revoked (see Attachment)
2024-01-30
07:52 Opening this revocation delay bug
2024-02-01
09:11 Test of the zlint version 3.6.0 in the test environment successfully completed
2024-02-05
13:00 Video conference with the management of the Trust Center, with the decision to revoke on the next day all certificates that had not yet been revoked
13:25 Information to customers about the final revocation of all certificates the next day
2024-02-06
15:41 All affected certificates are replaced and revoked.
Update Root Cause Analysis
In addition to the initial Root Cause Analysis we now see further causes in retrospect:
Although all Enterprise RAs have received initial training and have been informed of changes as required, we have found that they are not sufficiently sensitized. Due to the fact that the error was not caused by them and the erroneous certificates had practically no impact on usage, we had to do some persuasive work. In addition, we also had to learn that even with understanding customers, implementation takes longer than expected, as many contacts from different areas (Enterprise RA, Management, Organization, IT department, Support) had to be coordinated on the customer side. Since we have mostly had very good experiences with our Enterprise RA customers, we have underestimated these circumstances so far.
Update Lessons learned
As already mentioned, we have learned that we not only need to sensitize customers more, but also demand more automation or at least better preparation for faster certificate replacement processes as stated above (training, workshops, annual self-assessments).
Update Action Items
In addition to the initial Action Item (Sensitization of the customers) we now see further Action Items:
Action Item | Kind | Due Date
Preparing slides for a training with regard to automation | Prevent | 2024-03
Preparing a self-assessment for Enterprise-RAs | Prevent | 2024-04
On-site audits at selected customers in which, in addition to the existing Enterprise RA topics, the possibilities of automation and preparation for quicker reactions are also considered | Prevent | starting in the second half of 2024
Comment 11•2 years ago
We are monitoring this bug for feedback. Please let us know if there are any further comments or questions.
Comment 12•2 years ago
We are monitoring this bug for feedback. Please let us know if there are any further comments or questions.
Comment 13•2 years ago
We are monitoring this bug for feedback. Please let us know if there are any further comments or questions.
Comment 14•2 years ago
We are monitoring this bug for feedback. Please let us know if there are any further comments or questions.
Comment 15•1 year ago
We are monitoring this bug for feedback. Please let us know if there are any further comments or questions.
Comment 16•1 year ago
We are monitoring this bug for feedback. Please let us know if there are any further comments or questions.
Comment 17•1 year ago
We are monitoring this bug for feedback. Please let us know if there are any further comments or questions.
Comment 18•1 year ago
Thank you for the updates in Comment 10.
It's not clear if the first two Action Items have been completed. Additionally, they are detailed as preparation activities. Were they executed? If so, what was the result?
How are you evaluating (or planning to monitor and evaluate) their effectiveness in preventing this issue from recurring in the future?
Comment 19•1 year ago
Hello Chris!
It's not clear if the first two Action Items have been completed. Additionally, they are detailed as preparation activities. Were they executed? If so, what was the result?
We are still working on the slides and the self-assessment and expect to be ready by mid-May. Due to the annual audit in the last few weeks, we are a little behind schedule.
How are you evaluating (or planning to monitor and evaluate) their effectiveness in preventing this issue from recurring in the future?
One goal of the regular self-assessments is monitoring customer awareness with regard to short-term response and automation and to be prepared for faster replacement procedures. The feedback on the self-assessments is evaluated jointly by Compliance Management, Solution Management and internal auditors. This evaluation will lead to further steps being taken, if necessary.
Comment 20•1 year ago
Our action items are still ongoing. Please let us know if there are any further comments or questions.
Comment 21•1 year ago
Our action items are still ongoing. Please let us know if there are any further comments or questions.
Comment 22•1 year ago
When I review this bug, I see a lot of attention paid to the difficulty Subscribers can have in managing their certificates. Certainly this is an important point and an area where perhaps CAs can show the way. However, this bug contains very little in the way of acknowledgement that the CA’s primary responsibility is to the greater internet-using public and not to individual Subscribers. Comment 4, comment 5, and comment 6 make this point clearly. Nonetheless, as late as comment 10 you have a root cause analysis that does not acknowledge this failure on Telekom Security’s part.
Although all Enterprise RAs have received initial training and have been informed of changes as required, we have found that they are not sufficiently sensitized. Due to the fact that the error was not caused by them and the erroneous certificates had practically no impact on usage, we had to do some persuasive work. In addition, we also had to learn that even with understanding customers, implementation takes longer than expected, as many contacts from different areas (Enterprise RA, Management, Organization, IT department, Support) had to be coordinated on the customer side.
Furthermore, your updated action items contain no provision for preventing the willful delay of revocation when it requires taking an uncomfortable position with your Subscriber.
The thing you must come to understand is that Subscribers will routinely tell you they need more time. However, when presented with hard deadlines, it turns out they actually can swap out their certificates. They just don’t want to.
CAs need to hold the line on revocation deadlines, or the deadlines become meaningless - and so does revocation. Is Telekom Security prepared to publicly commit to permitting no more late revocations from here on out?
Comment 23•1 year ago
Hi Tim,
Thank you for your feedback. In principle, we agree on the position you take and especially the fact that the deadlines are important and that they should be complied with high priority. It seems that our comments so far did not make this sufficiently clear, however, we did push our customers a lot in order to accelerate the revocation process and enforce the deadlines. We are also aware of the fact that subscribers often claim that they need more time despite actually being able to act timely when presented with hard deadlines. Let us assure you that we presented each customer with hard deadlines (24h as well as 5 days) and only refrained from enforcing those in the last moment if necessary. Ultimately, we did revoke all remaining certificates without exception as stated in comment#10.
We had several escalations and displeased customers and even lost a few in the process. So we did actually take a very uncomfortable position with our customers and also pushed those that claimed to not have enough time. However, we can not make an ultimate statement that there will be no more delayed revocations. The Mozilla wiki itself states that this decision must be based on the risks imposed on the parties: “It is our position that your CA is ultimately responsible for deciding if the harm caused by following the requirements of the Baseline Requirements outweighs the risks that are passed on to individuals who rely on the web PKI by choosing not to meet this requirement.”
However, we are doing everything we can to avoid these situations by optimizing processes together with our customers, promoting automation and so on. Speaking of which:
We finalized our action item tasks “slides for a training with regard to automation” and “self-assessment for Enterprise RAs”. We approached the first customers with the request to fill out the self-assessment in a timely manner and plan on using their feedback for some fine-tuning. Afterwards, the self-assessment will be distributed to all remaining customers. Based on the results, we will then approach individual customers in regard to automation of certificate management.
Unless there are comments in the next few days, we would like to request to close this bug.
Comment 24•1 year ago
(In reply to Arnold Essing from comment #23)
It seems that our comments so far did not make this sufficiently clear, however, we did push our customers a lot in order to accelerate the revocation process and enforce the deadlines. We are also aware of the fact that subscribers often claim that they need more time despite actually being able to act timely when presented with hard deadlines. Let us assure you that we presented each customer with hard deadlines (24h as well as 5 days) and only refrained from enforcing those in the last moment if necessary.
Given the overlap of what you are saying and this recent M-D-S-P post would Telekom Security be willing to give examples of how they approached this difficult issue with subscribers? I appreciate this can be a delicate matter for all parties, but the quantity of delayed revocation issues across all CAs lately is raising questions on how these issues are handled.
Comment 25•1 year ago
(In reply to Arnold Essing from comment #23)
We had several escalations and displeased customers and even lost a few in the process. So we did actually take a very uncomfortable position with our customers and also pushed those that claimed to not have enough time. However, we can not make an ultimate statement that there will be no more delayed revocations. The Mozilla wiki itself states that this decision must be based on the risks imposed on the parties: “It is our position that your CA is ultimately responsible for deciding if the harm caused by following the requirements of the Baseline Requirements outweighs the risks that are passed on to individuals who rely on the web PKI by choosing not to meet this requirement.”
However, we are doing everything we can to avoid these situations by optimizing processes together with our customers, promoting automation and so on. Speaking of which:
We finalized our action item tasks “slides for a training with regard to automation” and “self-assessment for Enterprise RAs”. We approached the first customers with the request to fill out the self-assessment in a timely manner and plan on using their feedback for some fine-tuning. Afterwards, the self-assessment will be distributed to all remaining customers. Based on the results, we will then approach individual customers in regard to automation of certificate management.
Unless there are comments in the next few days, we would like to request to close this bug.
First, thank you for upholding your commitment to the BRs and the integrity of the WebPKI even when it was uncomfortable and had negative consequences for your business. It is appreciated!
The action items listed here, however, seem to stop at "we will try" rather than "we will ensure". You describe outreach and assessment, which are certainly good, but it seems like Telekom Security should be working to ensure that they do not have any certificates issued in the future to subscribers who cannot accommodate the revocation deadlines set forth in the BRs. Will Telekom Security ensure that subscribers are assessed for this capability before certificates are issued to them? Will you update your CPS (1.4.2) and associated documentation/notices to indicate that the certificates are not appropriate for use in circumstances where a 24-hour revocation may be required in an emergency, and a 5-day revocation required during the course of normal business?
Comment 26•1 year ago
It has been 7 days and no answers have been provided for the above questions.
Comment 27•1 year ago
Our answers are nearly ready, but we would like to give them a final review. As we are attending the CA/Browser Forum F2F meeting this week, we will reply on Monday.
Comment 28•1 year ago
Regarding comment#24:
Hello Wayne,
Thanks for the interesting M-D-S-P post. We certainly thought about what, or rather how, we communicate to our customers so that they react in a timely manner, but your post contains some interesting considerations that, admittedly, were not yet on our radar. While we are not able to share examples from our customer communication, we did some evaluation and can provide some insight:
- We put in quite a lot of information about what went wrong and why revocation was necessary. This was motivated by experience from past communications, where customers asked for more and more details because they obviously wanted to argue. Surprisingly, this time the need for revocation was not challenged by anyone. This might be due to our explanations, or maybe the customers know from past incidents that arguing does not “help”. However, you are certainly correct that this transparency also invites further discussion. We will keep this in mind for future incidents, test different approaches and observe their impact on the willingness of the customers to comply.
- We did not provide any options at all. Especially something like “If you can not revoke in time, give us a reason” was never proactively communicated. The message only contained “revocation is necessary”.
- We did give final deadlines at which all affected certificates had to be revoked.
- We provided each customer with an individual list of certificates to be revoked and even gave them a regular status on their own progress each day. The scaling here is no problem.
We hope that this response is helpful and provides useful feedback on your comment. In any case, we will keep an eye on the M-D-S-P discussion and see what we can take away from it.
Regarding comment#25:
Hello Mike,
Thank you for acknowledging our efforts. As described at the end of comment#23, based on the results of the assessment, we intend to cooperatively work with the customers to solve problems in regard to complying with revocation deadlines. As a matter of fact, we have discontinued business relationships with several customers that were not cooperative in the past. From our point of view, this counts towards “working to ensure” that we can comply with the revocation deadlines more reliably. However, we might not quite share your interpretation of “try” and “ensure”. In the end, basically every control has the purpose to ensure something but, naturally, also has the potential to fail due to unforeseen circumstances etc.
The need for completely new controls like an assessment of subscribers before certificate issuance or additional content in the CPS should, in our opinion, be discussed in the public discussion list for it to be applicable to all CAs equally. To demand such measures from a single CA does not seem expedient for the goals of the entire webPKI nor does it seem fair since it is related to significant additional costs on the CA and potentially its customers. However, if we understand your intentions behind those proposals correctly, we believe the effect of both measures to overlap with the required acceptance of the terms of use / subscriber agreement which already includes the commitment of the subscriber to revoke within 24h/5days.
Also, this topic is currently being discussed in the CAB/F and just recently in the F2F-Meeting. Be assured, that we will continue to follow those discussions and integrate any new information into our future planning.
Comment 29•1 year ago
Why aren’t you able to provide examples of the communication you sent to the customers? I understand if you don’t want to share their replies.
Assignee
Comment 30•1 year ago
In our opinion communications between the CA and its customers are not something that should be disclosed in public discussion forums. We appreciate the approach to avoid revocation delays by improving communication with customers. As stated by Wayne, this is a very delicate matter for all parties and the trigger is “the quantity” of delayed revocation issues “across all CAs”. We believe that further discussions should be held within the framework of the CA/Browser Forum, where initiatives on how to deal with the revocation delay issues have already started and were also discussed at the last F2F meeting in Bergamo.
Assignee
Comment 31•1 year ago
If there are no further questions, we kindly ask to close this bug.
Comment 32•1 year ago
In our opinion communications between the CA and its customers are not something that should be disclosed in public discussion forums.
We believe that further discussions should be held within the framework of the CA/Browser Forum, where initiatives on how to deal with the revocation delay issues have already started and were also discussed at the last F2F meeting in Bergamo.
You're entitled to your opinion. Your opinion is not a valid one in my view. This is a forum of public trust. The CA/Browser Forum is not as easily accessible to the public as this forum is. If you're not sharing your communication with your customers (which a lot of other CAs have no problem doing), then how can the public be assured of the trust you've been given?
I don't think this bug should be closed until Telekom Security provides reasonable action items to prevent further delayed/failed revocations. None of the ones in this bug really address that.
Also beyond that:
To demand such measures from a single CA does not seem expedient for the goals of the entire webPKI nor does it seem fair since it is related to significant additional costs on the CA and potentially its customers.
This is not a reasonable stance. Every CA has a requirement to follow the baseline requirements. There are many CAs that are following them without a problem, and a bunch that keep struggling. An alternative view to this could be that the root programs start distrusting all the CAs that are failing to follow this - is that a more preferred outcome here?
Comment 33•1 year ago
Apart from our view that the discussions and decisions of the CA/Browser Forum are also public, since all minutes are freely available on the Internet, and the fact that we are aware of only one other CA that has published its customer communications, we would like to clarify that the last quote seems to have been taken out of context.
Our statement in comment #28
To demand such measures from a single CA does not seem expedient for the goals of the entire webPKI nor does it seem fair since it is related to significant additional costs on the CA and potentially its customers.
referred to Mike's suggestion in comment #25 to ensure that subscribers are assessed on their ability to comply with the revocation deadlines set forth in the BRs. As the Baseline Requirements do not contain any requirements for CAs to assess future customers in regard to their revocation capabilities or to write some specific clause in Chapter 1.4.2 of a CPS, we see no deviation from the BR.
Comment 34•1 year ago
We are monitoring this bug for feedback. Please let us know if there are any further comments or questions.
Comment 35•1 year ago
(In reply to Stefan Kirch from comment #34)
We are monitoring this bug for feedback. Please let us know if there are any further comments or questions.
Stefan,
Telekom Security never dealt with the points I brought up more than a month ago in comment 22. The root cause of this delrev bug is the willful choice by your CA to prioritize your Subscribers’ convenience over the rules all CAs are expected to follow. This is a severe problem in a public CA. Your handling of this bug has exacerbated this problem by failing to recognize, despite repeated comments, that you made this error and failing to do anything about it.
In bug 1889062 comment 15 Mozilla clarifies that before closing a deliberate delrev bug, the CA must address the policy and decision making flaws that allowed that problem to occur, including,
detailed changes to policies and procedures to ensure timely revocation, including new guidelines, checklists, and approval processes
It’s worth remembering that Mozilla’s “exceptional circumstances” language actually says, “Mozilla does not grant exceptions to the BR revocation requirements” and that CAs should (emphasis mine) “acknowledge this non-compliance.” This language is not permission for CAs to delay revocation. It is instruction for one of the remediation measures a CA must take if it fails to follow revocation rules correctly, with the goal to rectify the erroneous behavior so that the CA does not repeat the error.
In reviewing this thread I do not see a Root Cause Analysis that acknowledges the willful decision to choose customer convenience over obedience to the rules, nor do I see action items to address this deliberate disobedience. This despite the fact that comment 22, comment 25, and comment 32, from three different posters, make it clear that this is required.
It is disappointing to see your message at this point in the remediation. Thirteen days ago, in comment 31, you asked to close this bug despite your woefully incomplete response and lack of acknowledgement that Telekom Security was 100% the author of this problem. Now here we are, two weeks later, and it’s clear none of this is sinking in.
To make it easy, I will lay out what you need to do.
- Read the responses on your own bugs and address all of them in a timely manner, preferably straight away but no matter what within a week. Pay attention to what commentators actually say and respond to that, rather than simply parroting your own assumptions over and over again.
- Go back now and review all comments on this bug and make sure you have addressed each of them sincerely, in its entirety, and with an open mind.
- Follow this practice for every other bug opened against your CA for as long as you continue to be in the root stores.
- Provide an updated Incident Report that is complete with a correct Root Cause Analysis and one or more Action Items that address this root cause.
- The true Root Cause is that Telekom Security knowingly chose to violate the rules to accommodate its Subscribers’ stated unwillingness to install replacement certificates in time. The Subscriber’s ability and willingness to install replacements are not the CA’s concern. The CA’s role is to revoke the certificates within the stated deadline and to let the Subscriber know that this revocation will occur. The rest is for the Subscriber to work out. CAs do not have understanding nor control of Subscriber priorities and processes and will never be able to judge or alter Subscriber certificate processes.
- Suitable Action Items include establishment of a policy that prohibits the willing suspension of revocation deadlines under any circumstances, mechanisms in place to enforce that policy, and a hard promise to the WebPKI community that your CA will never make this choice again.
IMHO Telekom Security needs to follow these steps before this bug is suitable to close.
Comment 36•1 year ago
Thank you for your opinion. We cannot follow on all points as there are some personal opinions, unjustified accusations and unsubstantiated claims. Regardless of the fact that we would like to see a more careful choice of words in a public discussion, we would like to address a few points:
The root cause of this delrev bug is the willful decision to choose customer convenience over obedience to the rules
As already written in some comments, this is not true! The affected certificates had a critical impact on the critical infrastructures of the customers as well as the people who depend on these infrastructures, e.g. an affected toll system. The decisions not to revoke some certificates were based on credible explanations of the customers. As stated in comment#23, we had several escalations and displeased customers and even lost a few in the process. So we did actually take a very uncomfortable position and we are far away from "customers' convenience". In the end, we have made our decisions in order to avoid disrupting critical infrastructures by revoking certificates that carry no security risk at all.
In reviewing this thread I do not see a Root Cause Analysis that acknowledges the willful decision to choose customer convenience over obedience to the rules, nor do I see action items to address this deliberate disobedience.
As written in comment#10 we see the root cause in particular in the fact that the customers were not sufficiently sensitized with regard to the revocation periods and accordingly the processes for the short-term exchange of certificates were not established with some customers. We are still of the opinion that the decision not to revoke under the given circumstances was a consequence, but not the cause. For this, we have defined the action items to sensitize the customers with regard to the revocation deadlines as well as automation, to get their acknowledgment by filling out a self assessment and to dig deeper in onsite Audits, where we consider it appropriate.
In addition, regardless of this bug, we are in the process of offering and promoting automation more and more to avoid misissuances as well as revocation delays.
These are all action items that contribute to preventing such bugs in the future.
In bug 1889062 comment 15 Mozilla clarifies that before closing a deliberate delrev bug, the CA must address the policy and decision making flaws that allowed that problem to occur, including,
-detailed changes to policies and procedures to ensure timely revocation, including new guidelines, checklists, and approval processes
From our point of view, the action items mentioned in bug 1889062 are specifically tailored to GDCA and therefore cannot be generally applied to all other bugs with delayed revocation. But of course we have looked at and checked them anyway as part of our monitoring of the bugs and have come to the conclusion that they are not suitable for us. Our policy states that we follow the BRs, including the revocation deadlines. The decision to deviate from this in order to prevent greater harm to the customers and the general public was well discussed with all experts and approved by our management. And, of course, we have publicly announced the decision to deviate by filing this bug. We do not see this as a policy or decision making flaw.
Telekom Security never dealt with the points I brought up more than a month ago in comment 22.
We have answered to your comments after two days, then there was no reaction for 5 weeks. Wouldn't it be fair in a public discussion to not only expect the CA that posted the bug to respond within 7 days ("Read the responses on your own bugs and address all of them in a timely manner, preferably straight away but no matter what within a week."), but also those who comment to react in a timely manner if they are not satisfied with the answers?
Ultimately, the discussions at the F2F meetings show how complex the revocation delay issue is. There are already a number of initiatives to find reasonable solutions, initial proposals are under discussion, such as a further revocation deadline or an exception for certain customers in combination with the obligation to use 90-day certificates, etc. From our point of view, further discussions on this topic should be held in the CA/Browser Forum mailing lists, working groups or F2F . We will not be able to solve the problem conclusively in the currently open revocation delay bugs.
Comment 37•1 year ago
I think you’ve fundamentally and grossly misunderstood your place in this ecosystem.
For this, we have defined the action items to sensitize the customers with regard to the revocation deadlines as well as automation, to get their acknowledgment by filling out a self assessment and to dig deeper in onsite Audits, where we consider it appropriate.
None of this actually matters? You can decide to do this however you want but it is not acceptable to have a repeat of this revocation delay in the future because you’re going to put your “critical” customers ahead of the requirements here.
Ultimately, the discussions at the F2F meetings show how complex the revocation delay issue is. There are already a number of initiatives to find reasonable solutions, initial proposals are under discussion, such as a further revocation deadline or an exception for certain customers in combination with the obligation to use 90-day certificates, etc. From our point of view, further discussions on this topic should be held in the CA/Browser Forum mailing lists, working groups or F2F . We will not be able to solve the problem conclusively in the currently open revocation delay bugs.
Not really, no. There are a few CAs that insist that their customers are more special than other subscribers and therefore they can’t revoke. Whatever the outcome of these meetings, you’re still bound to follow the baseline requirements as they are today. If and when they change, you should change your behavior then. As we saw with other bugs, future rule changes do not impact your current commitments.
As Tim put it, none of your current action items prevent a delayed revocation from happening in the future.
Are you ready to commit that, for example, anytime after August 1st 2024, if you have a situation needing a mass revocation you will not delay it? If you can’t commit to this, then you’re willfully in non-compliance with the requirements.
I would hope that the recent decision to distrust Entrust would make your CA realize that repeated delayed revocations and dodging questions are unacceptable and will have consequences.
Comment 38•1 year ago
(In reply to Mathew Hodson from comment #4)
This report doesn't meet the expectations set by Mozilla for CAs responding to a revocation incident. It doesn't include "detailed and substantiated explanations for why the situation is exceptional."
We're 5 months in and the core of this comment has not been understood by Telekom Security. To date we have action items that functionally amount to preparing slides and a Q&A with subscribers, which took months to materialize. We're still lacking a per-subscriber breakdown of why revocation was impossible, and why it will not happen again.
(In reply to Stefan Kirch from comment #36)
Thank you for your opinion. We cannot follow on all points as there are some personal opinions, unjustified accusations and unsubstantiated claims. Regardless of the fact that we would like to see a more careful choice of words in a public discussion, we would like to address a few points:
I too would like to see a more careful choice of words in a public discussion. Whichever sentence struck Telekom Security the wrong way, it would be helpful if they elaborated and helped everyone understand what generated this response. As it stands the response highlighted so far:
The root cause of this delrev bug is the willful decision to choose customer convenience over obedience to the rules
As already written in some comments, this is not true! The affected certificates had a critical impact on the critical infrastructures of the customers as well as the people who depend on these infrastructures, e.g. an affected toll system. The decisions not to revoke some certificates were based on credible explanations of the customers. As stated in comment#23, we had several escalations and displeased customers and even lost a few in the process. So we did actually take a very uncomfortable position and we are far away from "customers' convenience". In the end, we have made our decisions in order to avoid disrupting critical infrastructures by revoking certificates that carry no security risk at all.
This highlights a complete misunderstanding of what critical infrastructure is, a recurring misunderstanding among CAs who seem to believe that any computer system a subscriber uses must inherently be critical. If any aspect of this was uncomfortable for the CA, then it reflects a misunderstanding of what their role in the ecosystem is, and I would advise some internal discussion on how to resolve this going forward.
In reviewing this thread I do not see a Root Cause Analysis that acknowledges the willful decision to choose customer convenience over obedience to the rules, nor do I see action items to address this deliberate disobedience.
As written in comment#10 we see the root cause in particular in the fact that the customers were not sufficiently sensitized with regard to the revocation periods and accordingly the processes for the short-term exchange of certificates were not established with some customers. We are still of the opinion that the decision not to revoke under the given circumstances was a consequence, but not the cause. For this, we have defined the action items to sensitize the customers with regard to the revocation deadlines as well as automation, to get their acknowledgment by filling out a self assessment and to dig deeper in onsite Audits, where we consider it appropriate.
In addition, regardless of this bug, we are in the process of offering and promoting automation more and more to avoid misissuances as well as revocation delays.
These are all action items that contribute to preventing such bugs in the future.
The root cause as described is "the customers were not ready"; it does not explain your role as a CA in revocation. I appreciate that you understand not revoking was a consequence, but I think there is a misunderstanding of what it was a consequence of. There needs to be an understanding of the subscriber/CA relationship. Were this to occur again tomorrow, would every subscriber's certificate be revoked within 5 days?
In bug 1889062 comment 15 Mozilla clarifies that before closing a deliberate delrev bug, the CA must address the policy and decision making flaws that allowed that problem to occur, including,
-detailed changes to policies and procedures to ensure timely revocation, including new guidelines, checklists, and approval processes
From our point of view, the action items mentioned in bug 1889062 are specifically tailored to GDCA and therefore cannot be generally applied to all other bugs with delayed revocation. But of course we have looked at and checked them anyway as part of our monitoring of the bugs and have come to the conclusion that they are not suitable for us. Our policy states that we follow the BRs, including the revocation deadlines. The decision to deviate from this in order to prevent greater harm to the customers and the general public was well discussed with all experts and approved by our management. And, of course, we have publicly announced the decision to deviate by filing this bug. We do not see this as a policy or decision making flaw.
You need to learn from other CAs' mistakes, not wait for a Root Program to explain precisely what you have to do in order to be compliant. This is about working in good faith with all parties; it is expected that more than the bare minimum will be done. What greater harm is being implied here, by the way? Whether your management approves anything has no bearing on how compliant it is; it just reflects how much of the culture needs to change.
In short, please improve your policy and decision making going forward.
Telekom Security never dealt with the points I brought up more than a month ago in comment 22.
We have answered to your comments after two days, then there was no reaction for 5 weeks. Wouldn't it be fair in a public discussion to not only expect the CA that posted the bug to respond within 7 days ("Read the responses on your own bugs and address all of them in a timely manner, preferably straight away but no matter what within a week."), but also those who comment to react in a timely manner if they are not satisfied with the answers?
Ultimately, the discussions at the F2F meetings show how complex the revocation delay issue is. There are already a number of initiatives to find reasonable solutions, initial proposals are under discussion, such as a further revocation deadline or an exception for certain customers in combination with the obligation to use 90-day certificates, etc. From our point of view, further discussions on this topic should be held in the CA/Browser Forum mailing lists, working groups or F2F . We will not be able to solve the problem conclusively in the currently open revocation delay bugs.
We have already heard in detail what actually happened at those F2F meetings, and what was told to CAs. Revocation delays can be complex, but they are not something that should recur. The details of these F2F meetings are not made public immediately, hence the delay in responses from parties who were not physically present.
Comment 39•1 year ago
(In reply to Stefan Kirch from comment #36)
Thank you for your opinion. We cannot follow on all points as there are some personal opinions, unjustified accusations and unsubstantiated claims. Regardless of the fact that we would like to see a more careful choice of words in a public discussion, we would like to address a few points:
Reviewing comment 35, I feel it was factual, fair, and substantiated, with correctly chosen words. Therefore, I must ask you to clarify.
Question 1: Please quote all passages that you feel are personal opinions. As you consider your reply, I will remind you that a capsule summary of established facts in one’s own words is not an example of personal opinion.
Question 2: Please quote all passages that you feel are unjustified accusations.
Question 3: Please quote all passages that you feel are unsubstantiated claims.
Question 4: Please quote all passages where you feel that “a more careful choice of words” is needed. For each quoted passage please explain why you feel that way.
The root cause of this delrev bug is the willful decision to choose customer convenience over obedience to the rules
As already written in some comments, this is not true! The affected certificates had a critical impact on the critical infrastructures of the customers as well as the people who depend on these infrastructures, e.g. an affected toll system. The decisions not to revoke some certificates were based on credible explanations of the customers. As stated in comment#23, we had several escalations and displeased customers and even lost a few in the process. So we did actually take a very uncomfortable position and we are far away from "customers' convenience". In the end, we have made our decisions in order to avoid disrupting critical infrastructures by revoking certificates that carry no security risk at all.
In your reply you start by saying it’s not true and then go on to elaborate how you chose to violate the revocation rules given in the BRs because you were unwilling to put your Subscribers into a position where they were forced to choose between accelerated certificate deployment and an outage. Which is you admitting that it is true.
In reviewing this thread I do not see a Root Cause Analysis that acknowledges the willful decision to choose customer convenience over obedience to the rules, nor do I see action items to address this deliberate disobedience.
As written in comment#10 we see the root cause in particular in the fact that the customers were not sufficiently sensitized with regard to the revocation periods and accordingly the processes for the short-term exchange of certificates were not established with some customers. We are still of the opinion that the decision not to revoke under the given circumstances was a consequence, but not the cause.
Then you see it incorrectly and your opinion is misguided. The root cause is that you had the power to revoke and you chose not to because you were uncomfortable revoking certificates when Subscribers told you they were not ready. That is, factually, the willful choice to violate the BRs for the sake of your customers’ convenience.
Your unwillingness or inability to acknowledge this point is a very bad look for a public CA. The fact is that many parties, including those representing root programs, have made this point to many CAs on many bugs over the past four months, including directly to Telekom Security. At this date in 2024 it strains credulity for any CA to claim it doesn’t understand that it alone controls if revocation occurs on time.
For this, we have defined the action items to sensitize the customers with regard to the revocation deadlines as well as automation, to get their acknowledgment by filling out a self assessment and to dig deeper in onsite Audits, where we consider it appropriate.
In addition, regardless of this bug, we are in the process of offering and promoting automation more and more to avoid misissuances as well as revocation delays.
These are all action items that contribute to preventing such bugs in the future.
They may very well contribute, and I encourage you to continue with them. They are not, however, adequate to resolve this bug. To resolve this bug, your CA must come to admit that it alone is capable of completing revocations on time and that it alone is responsible for its failure to perform its duty as a public CA. To resolve this bug, your CA must propose a set of concrete, measurable action items that credibly allow the WebPKI community to believe that this error (the willful delay of mandated revocation) will not occur again.
Until you do these things, this bug will not close, and you will continue to have this debate on this forum with those who see more clearly than you do. You’ve been doing this for five months now. Don’t you want a path to resolution? I am showing you that path. You simply need to accept it and follow what you’re being given.
In bug 1889062 comment 15 Mozilla clarifies that before closing a deliberate delrev bug, the CA must address the policy and decision making flaws that allowed that problem to occur, including,
-detailed changes to policies and procedures to ensure timely revocation, including new guidelines, checklists, and approval processes
From our point of view, the action items mentioned in bug 1889062 are specifically tailored to GDCA and therefore cannot be generally applied to all other bugs with delayed revocation. But of course we have looked at and checked them anyway as part of our monitoring of the bugs and have come to the conclusion that they are not suitable for us.
Simply put, that is a wrong conclusion. There is no material difference between that bug and yours in terms of the root cause and its acceptable solutions. As a public CA you are expected to take a sophisticated and open-minded approach to understanding accepted norms and behaviors. You are strongly expected to pay attention to related bugs for information and ideas to make your own practice more reliable, compliant, and secure.
The fact that you can, with a straight face, state that these action items do not apply to you is deeply disheartening. It’s obvious they apply to you. This isn’t difficult to figure out.
Our policy states that we follow the BRs, including the revocation deadlines. The decision to deviate from this in order to prevent greater harm to the customers and the general public was well discussed with all experts and approved by our management. And, of course, we have publicly announced the decision to deviate by filing this bug. We do not see this as a policy or decision making flaw.
As above, your insistence that this was not a policy or decision making flaw is a major problem. Not only did you willfully pick and choose the rules you wanted to follow, but even now, you still refuse to admit that doing so is wrong. I cannot state how bad this is.
Telekom Security never dealt with the points I brought up more than a month ago in comment 22.
We have answered to your comments after two days, then there was no reaction for 5 weeks. Wouldn't it be fair in a public discussion to not only expect the CA that posted the bug to respond within 7 days ("Read the responses on your own bugs and address all of them in a timely manner, preferably straight away but no matter what within a week."), but also those who comment to react in a timely manner if they are not satisfied with the answers?
In fact, no. Of the tens of thousands of technology companies in the world, only a few dozen have the rare privilege of being gatekeepers of public digital trust. This amazing privilege comes with uncompromising obligations. In addition to following the published rules (which Telekom Security failed to do and which is why we’re engaged in this very discussion), one of the other obligations is that any member of the internet-using public, of which I am one, has the right to examine or question how the CA handles its reported incidents. The CA has the obligation to address all serious questions and comments in a candid, informative, accurate, and timely manner with the objective of identifying and acting on opportunities to improve its operation. The questioner has no obligation at all.
However, if we take your response instead to mean that I have not dealt fairly with you on this thread, I very much have. Comment 22 was clear in its statements and requests, easy to understand, and consistent with what I am saying now, two months later. Your comment 23 missed the mark by failing to acknowledge the responsibility clearly articulated in comment 22, and then the same day multiple members of the community jumped in to explain the error in your ways. In particular, Mike Shaver’s comment 25 was on point when it said,
The action items listed here, however, seem to stop at "we will try" rather than "we will ensure".
In the months that follow, despite multiple attempts by multiple commentators to show Telekom Security the way forward, your CA continues to dig in its heels on an unsupportable argument up through today.
Ultimately, the discussions at the F2F meetings show how complex the revocation delay issue is. There are already a number of initiatives to find reasonable solutions, initial proposals are under discussion, such as a further revocation deadline or an exception for certain customers in combination with the obligation to use 90-day certificates, etc. From our point of view, further discussions on this topic should be held in the CA/Browser Forum mailing lists, working groups or F2F . We will not be able to solve the problem conclusively in the currently open revocation delay bugs.
This already has been adequately addressed in comment 37.
Comment 40•1 year ago
Hello Tim,
We acknowledge receiving your questions in comment#39, but we will need until next week to provide a full response.
Thanks
Stefan
Comment 41•1 year ago
Before we take the next steps, I would like to give a short summary from our perspective.
Within this bug we have tried to be open and honest about the reasons for the revocation delay and to explain what we consider to be reasonable action items to get better with the aim of preventing revocation delay in the future.
We have tried to explain why we made the decision we did in this specific case in the past. This was not meant as, and should not be, an argument for further revocation delays in the future. We are well aware that the goal is to avoid further revocation delays in the future or, rather, to prevent any bugs and the revocation delays that follow from them. There is no question about that.
Following Chrome’s request in February, we described a more detailed timeline, root cause analysis, and new action items and listed the customers and infrastructures affected. Thereafter, there were no comments for more than three months, apart from Chrome's request to know whether the announced action items had been completed and how we intended to assess their effectiveness. The fact that there were no further comments or questions for over three months was a sign for us that both the root cause analysis and the action items were accepted.
This is questioned in the last comments, so there seems to be a need for further clarifications.
From our point of view, we have responded to all comments up to and including comment#32 in such a way that there were no further questions about it, except comment#22. However, we will go through all the comments again and see if all serious and factual questions and comments are addressed and will give answers or clarifications within the next comments. However, due to the summer vacation and other personal reasons, this will take a few more days.
(in reply to comment#39)
Reviewing comment 35, I feel it was factual, fair, and substantiated, with correctly chosen words. Therefore, I must ask you to clarify.
Question 1: Please quote all passages that you feel are personal opinions. As you consider your reply, I will remind you that a capsule summary of established facts in one’s own words is not an example of personal opinion.
Question 2: Please quote all passages that you feel are unjustified accusations.
Question 3: Please quote all passages that you feel are unsubstantiated claims.
Question 4: Please quote all passages where you feel that “a more careful choice of words” is needed. For each quoted passage please explain why you feel that way.
I suggest that we send you some examples by personal email afterwards. In my opinion we should limit this discussion in Bugzilla to factual discussion. Emotional discussions about the manner of commenting do not seem helpful to me at this point.
Comment 42•1 year ago
(In reply to Stefan Kirch from comment #41)
However, due to the summer vacation and other personal reasons, this will take a few more days.
This reads as Telekom Security believing that "summer vacation" is an acceptable reason for delayed and sub-par responses to queries in Bugzilla, is this interpretation correct? Furthermore, it seems highly concerning that you are so understaffed that simply replying to messages in a timely fashion is not possible for you for -- what I assume is, given the European context -- several weeks every year.
(in reply to comment#39)
Reviewing comment 35, I feel it was factual, fair, and substantiated, with correctly chosen words. Therefore, I must ask you to clarify.
Question 1: Please quote all passages that you feel are personal opinions. As you consider your reply, I will remind you that a capsule summary of established facts in one’s own words is not an example of personal opinion.
Question 2: Please quote all passages that you feel are unjustified accusations.
Question 3: Please quote all passages that you feel are unsubstantiated claims.
Question 4: Please quote all passages where you feel that “a more careful choice of words” is needed. For each quoted passage please explain why you feel that way.
I suggest that we send you some examples by personal email afterwards. In my opinion we should limit this discussion in Bugzilla to factual discussion. Emotional discussions about the manner of commenting do not seem helpful to me at this point.
Given that Telekom Security made what essentially boils down to accusations of unprofessional conduct in this public forum, I believe it is only fair that they explain those accusations in the same public forum. Nevertheless, if Telekom Security refuses to respond to said questions in public, please consider this message as asking those same questions and send the responses to my email as well.
Comment 43•1 year ago
I'm going to start reminding everyone of Mozilla's Community Participation Guidelines (CPG) - https://www.mozilla.org/en-US/about/governance/policies/participation/, especially our prohibition on the use of derogatory and dismissive language. Enforcement of our CPG is something Mozilla takes very seriously. All community members are held accountable to the standard set forth in the CPG. Anyone who posts in Bugzilla or m-d-s-p also sets the standard for others and signals what is, and is not, acceptable in our community. We all have a role to play in fostering a respectful and inclusive community. Remember that your words and actions set the tone for others and reflect the values we uphold at Mozilla. Violations of the CPG will be addressed promptly to ensure a safe and welcoming environment for everyone. Sanctions may include temporary suspensions of posting privileges or outright bans on participating. Let's work together to create a positive and constructive space where all members feel valued and heard.
Comment 44•1 year ago
(In reply to Stefan Kirch from comment #41)
Before we take the next steps, I would like to give a short summary from our perspective.
Within this bug we have tried to be open and honest about the reasons for the revocation delay and to explain what we consider to be reasonable action items to get better with the aim of preventing revocation delay in the future.
We have tried to explain why we made the decision we did in this specific case in the past. This was not meant as, and should not be, an argument for further revocation delays in the future. We are well aware that the goal is to avoid further revocation delays in the future or, rather, to prevent any bugs and the revocation delays that follow from them. There is no question about that.
Following Chrome’s request in February, we described a more detailed timeline, root cause analysis, and new action items and listed the customers and infrastructures affected. Thereafter, there were no comments for more than three months, apart from Chrome's request to know whether the announced action items had been completed and how we intended to assess their effectiveness. The fact that there were no further comments or questions for over three months was a sign for us that both the root cause analysis and the action items were accepted.
This is questioned in the last comments, so there seems to be a need for further clarifications.
From our point of view, we have responded to all comments up to and including comment#32 in such a way that there were no further questions about it, except comment#22. However, we will go through all the comments again and see if all serious and factual questions and comments are addressed and will give answers or clarifications within the next comments. However, due to the summer vacation and other personal reasons, this will take a few more days.
(in reply to comment#39)
Reviewing comment 35, I feel it was factual, fair, and substantiated, with correctly chosen words. Therefore, I must ask you to clarify.
Question 1: Please quote all passages that you feel are personal opinions. As you consider your reply, I will remind you that a capsule summary of established facts in one’s own words is not an example of personal opinion.
Question 2: Please quote all passages that you feel are unjustified accusations.
Question 3: Please quote all passages that you feel are unsubstantiated claims.
Question 4: Please quote all passages where you feel that “a more careful choice of words” is needed. For each quoted passage please explain why you feel that way.
I suggest that we send you some examples by personal email afterwards. In my opinion we should limit this discussion in Bugzilla to factual discussion. Emotional discussions about the manner of commenting do not seem helpful to me at this point.
Telekom, you are arguing in bad faith. You’ve made a claim that people are being rude to you, while never clarifying where. This is unacceptable behavior for this forum. Frankly, I’m not even going to bother arguing this point - as it’s a waste of time and arguably a distraction tactic utilized by Telekom.
Beyond that, can you please explain how you think it’s acceptable to delay responses here because of summer vacation?
With that response, it’s clear that your CA is not properly staffed. Do you have any evidence or actions that prove that you are, in fact, staffed adequately?
From reading this bug, I can potentially even see that the delayed revocations happened because your CA doesn’t have adequate staff to handle talking to customers and revoking and replacing at the same time.
Ben: I don’t see anything in this thread that would be considered unprofessional. There are many direct questions, but considering the nature of trust placed in a CA, these questions are perfectly fine to be asked. I would like to ask that these accusations be formally rescinded, as it carries a risk of this behavior spreading to other CAs. If other CAs adopt the tactics used by Telekom here, community enforcement would become near impossible. Any misbehaving CA can decide to interpret the direct questions as harassment, etc and never actually answer the questions or delay responses.
Comment 45•1 year ago
(In reply to Ben Wilson from comment #43)
(In reply to JSaares from comment #42)
(In reply to Stefan Kirch from comment #41)
I'm going to start reminding everyone of Mozilla's Community Participation Guidelines (CPG) - https://www.mozilla.org/en-US/about/governance/policies/participation/, especially our prohibition on the use of derogatory and dismissive language.
Hi Ben,
I think that Mozilla’s community participation guidelines are valuable and important, even though I have probably been on the wrong side of them a few times in the last 20 years. However, I do not see an example of derogatory or dismissive language in this bug with the possible exception of the claims made about Tim’s individual conduct in comment 36. It is not derogatory to state an opinion of a CA’s conduct, even if that statement is direct and forceful, and even if it makes the representatives of the subject corporation embarrassed or uncomfortable—except to the extent that any negative statement about a related entity could be labelled as derogatory. I feel confident that this is not the extent intended by the CPG, especially given the use of Bugzilla to capture evaluation of CA activities. It is fine, indeed essential, for CAs to be told that their work is not satisfactory; that is the basis on which necessary improvement rests, and it does not need to be softened or couched in dilutive framing for it to be appropriate for this forum.
It could be helpful, I think, for you to specify what writing you think crossed, or approached, Mozilla’s tolerance for derogatory or dismissive language, so that participants can more concretely determine how to express themselves.
Given the importance of the role of CAs in the ecosystem, and the unfortunate history of widespread non-compliance by many CAs, and the minimal degree to which root programs have been seen to enforce established standards of quality in reports and remediation action items, I think a degree of frustration and impatience on the part of the community is understandable. Calling direct public attention to cases where CAs have underperformed is one of the few tools that the community has to defend the health of the web PKI. A desire for more “decorum” and collaborative language is understandable, from the perspective of improving the experience of CA representatives who are relaying the actions and plans that are under discussion, but I think you will find that the community are much more encouraging and generous in their framing when dealing with CAs that have a good record of meeting their commitments to the BRs and MRSP. The importance of community scrutiny must take precedence over a desire for all interactions to be friendly and infinitely patient.
(I actually believe that if WebTrust audits weren’t so much done in the context of keeping the customer happy for the next engagement, we would find that there was a lot less malpractice for CAs to report and the community to scrutinize.)
Comment 46•1 year ago
Thanks. I'll see if we can work on better guidance for what is unacceptable community behavior, in coordination with Mozilla CPG staff.
Comment hidden (admin-reviewed)
Comment 49•1 year ago
This is a polite reminder that Bugzilla is our professional working environment as well as our issue tracker. I encourage you to review our Community Participation and Bugzilla Etiquette guidelines; comments that help move issues towards a resolution are always welcome. Comments that add nothing more than demands that a resolution occur, however, are not.
Only people with editbugs permissions will be able to participate in this conversation.
Comment 50•1 year ago
(in reply to JSaares and Amir)
However, due to the summer vacation and other personal reasons, this will take a few more days.
This reads as Telekom Security believing that "summer vacation" is an acceptable reason for delayed and sub-par responses to queries in Bugzilla, is this interpretation correct? Furthermore, it seems highly concerning that you are so understaffed that simply replying to messages in a timely fashion is not possible for you for -- what I assume is, given the European context -- several weeks every year.
Beyond that, can you please explain how you think it’s acceptable to delay responses here because of summer vacation?
As I have written it is not only the summer vacations, but there are also other personal reasons which I did not want to list for privacy reasons of my colleagues and myself, so it is a special situation this week. To deduce from this that we are basically understaffed, that we are not available for several weeks during the holidays or that the understaffing is also the reason for the revocation delay, is not correct and maybe this is a good example of what I consider to be unsubstantiated claims.
| However, we will go through all the comments again and see if all serious and factual questions and comments are addressed and will give answers or clarifications within the next comments.
Currently we are at comment#50, and going through them all in detail, making sure we have understood everything correctly, and preparing satisfactory answers will take time. Please also note that we are not native English speakers and therefore have to have our answers and translations double-checked by at least one more person, so that no misleading translations arise, which has unfortunately happened in the past and then led to further ambiguities. This also takes time and we therefore ask for your understanding.
(in reply to JSaares, Amir and Mike Shaver)
I suggest that we send you some examples by personal email afterwards. In my opinion we should limit this discussion in Bugzilla to factual discussion. Emotional discussions about the manner of commenting do not seem helpful to me at this point.
Given that Telekom Security made what essentially boils down to accusations of unprofessional conduct in this public forum, I believe it is only fair that they explain those accusations in the same public forum
Telekom, you are arguing in bad faith. You’ve made a claim that people are being rude to you, while never clarifying where. This is unacceptable behavior for this forum. Frankly, I’m not even going to bother arguing this point - as it’s a waste of time and arguably a distraction tactic utilized by Telekom.
However, I do not see an example of derogatory or dismissive language in this bug with the possible exception of the claims made about Tim’s individual conduct in comment 36
Behind the suggestion to reply to Tim personally was the idea to bring the discussion in public back to an objective discussion. Perhaps we misinterpreted the comments, but we felt (and still feel, see the claim above) unobjectively attacked and wanted to return to objectivity. But we also didn't want to leave the accusations against Tim's comments unanswered and therefore suggested explaining them in a personal email.
Whatever the case, there was no bad faith behind it, nor a distraction tactic or something else. The opposite is the case, we wanted to act with good faith.
In our announced comments next week, we will then also address comment#35 and comment#39 and provide some examples that we consider to be personal opinions or unjustified accusations or unsubstantiated claims instead of responding to Tim by personal email.
Comment 51•1 year ago
(In reply to Stefan Kirch from comment #50)
(in reply to JSaares and Amir)
However, due to the summer vacation and other personal reasons, this will take a few more days.
This reads as Telekom Security believing that "summer vacation" is an acceptable reason for delayed and sub-par responses to queries in Bugzilla, is this interpretation correct? Furthermore, it seems highly concerning that you are so understaffed that simply replying to messages in a timely fashion is not possible for you for -- what I assume is, given the European context -- several weeks every year.
Beyond that, can you please explain how you think it’s acceptable to delay responses here because of summer vacation?
As I have written it is not only the summer vacations, but there are also other personal reasons which I did not want to list for privacy reasons of my colleagues and myself, so it is a special situation this week. To deduce from this that we are basically understaffed, that we are not available for several weeks during the holidays or that the understaffing is also the reason for the revocation delay, is not correct and maybe this is a good example of what I consider to be unsubstantiated claims.
Telekom Security can speak about unsubstantiated claims, it truly does not matter. As a CA the role of your company is to adhere to these regulations. Mentioning vacations and other personal reasons makes it very clear that these are issues at the company. These should not be used as a basis to pick and choose which questions you, as an individual, feel comfortable responding to. This is about the company you work at; if you are feeling pressured in any way then it is on your management to provide adequate resources and training to help you perform your duties.
That we are at this stage is a breakdown in communication. It would be helpful if we all tried to read each other's statements in good faith.
| However, we will go through all the comments again and see if all serious and factual questions and comments are addressed and will give answers or clarifications within the next comments.
Currently we are at comment#50, and going through them all in detail, making sure we have understood everything correctly, and preparing satisfactory answers will take time. Please also note that we are not native English speakers and therefore have to have our answers and translations double-checked by at least one more person, so that no misleading translations arise, which has unfortunately happened in the past and then led to further ambiguities. This also takes time and we therefore ask for your understanding.
The 7 day deadline starts when the message is posted. It can be pushed back by stating that they'll be answered more thoroughly at a specific date, but that does not absolve the issues that led up to the deadline being missed. Telekom Security are by no means the only CA operating who have English as a second language, and I assure you everyone is trying to work around that to ensure miscommunications are understood.
Behind the suggestion to reply to Tim personally was the idea to bring the discussion in public back to an objective discussion. Perhaps we misinterpreted the comments, but we felt (and still feel, see the claim above) unobjectively attacked and wanted to return to objectivity. But we also didn't want to leave the accusations against Tim's comments unanswered and therefore suggested explaining them in a personal email.
This is a professional forum, everyone is trying to act in good faith and improve the standards of discussion for everyone at the table. Please take the advice stated, and take a step back to internally address how these statements came from your company. We do not need any report on this publicly, this is a learning opportunity for Telekom Security internally.
Whatever the case, there was no bad faith behind it, nor a distraction tactic or something else. The opposite is the case, we wanted to act with good faith.
In our announced comments next week, we will then also address comment#35 and comment#39 and provide some examples that we consider to be personal opinions or unjustified accusations or unsubstantiated claims instead of responding to Tim by personal email.
By my count I'm seeing the following questions without adequate responses. Please use this as a baseline to show how Telekom Security are adhering to their duties:
2024-07-04 https://bugzilla.mozilla.org/show_bug.cgi?id=1877388#c37
2024-07-04 https://bugzilla.mozilla.org/show_bug.cgi?id=1877388#c38
2024-07-08 https://bugzilla.mozilla.org/show_bug.cgi?id=1877388#c39
2024-07-18 https://bugzilla.mozilla.org/show_bug.cgi?id=1877388#c42
2024-07-18 https://bugzilla.mozilla.org/show_bug.cgi?id=1877388#c44
Please stop focusing on Comment 39 and providing statements that are not helping matters. Questions have been asked for weeks with no response. If you feel a question is misunderstanding your position, please say so, but then answer the question to the best of your ability.
As an aside I appreciate the lockdown for users with editbugs permission, however not all long-term participants of this forum have such permissions. Can I ask that at the very least Tim Callan is provided it, with respect to this bug, in order to have a civil conversation.
Comment 52•1 year ago
As I have written it is not only the summer vacations, but there are also other personal reasons which I did not want to list for privacy reasons of my colleagues and myself, so it is a special situation this week. To deduce from this that we are basically understaffed, that we are not available for several weeks during the holidays or that the understaffing is also the reason for the revocation delay, is not correct and maybe this is a good example of what I consider to be unsubstantiated claims.
However, we will go through all the comments again and see if all serious and factual questions and comments are addressed and will give answers or clarifications within the next comments. However, due to the summer vacation and other personal reasons, this will take a few more days.
Telekom: the only information we as the community have access to is what you give us. When your response to a 7 day response deadline starts with “Before we take the next steps, I would like to give a short summary from our perspective.” and ends with “However, due to the summer vacation and other personal reasons, this will take a few more days.” - that to me is a sign of understaffing and mismanagement.
You’re allowed to label this as unsubstantiated, or whatever you would like. However, personnel taking time off does not allow you to delay responses to these questions. It’s irrelevant to this forum, and it’s only brought to relevancy because you have used it as the reasoning/excuse for not providing responses in a timely manner.
I’m extremely surprised at how dismissive your CA has been in these responses. From an external perspective, this is making me think your CA does not really know how to conduct itself with regards to Bugzilla. To ensure this is not the case, please answer the following questions:
- Do you have a process to triage every Bugzilla bug in the “CA Certificate Compliance” component? If so, please write out what the process for that looks like.
- Do you have a process to triage every post on the various relevant mailing lists (e.g. MDSP, SCWG, etc)? If so, please write out what the process for that looks like.
- If your answer for Question 1 was yes, do you have triage logs available for any of the recent (last 6 months) bugs? If so, please share the triage logs for https://bugzilla.mozilla.org/show_bug.cgi?id=1872371.
These questions will help assure the community that you’re aware of how Bugzilla operates, and that you have a process for triaging bugs that other CAs are impacted by.
——
Beyond that, I also think that restricting this bug doesn’t really make sense, and actively excludes many members of the community that have helpful views and opinions.
Comment 53•1 year ago
As a member of the greater internet-using public who is also working as a security researcher and consultant, I would also like to hear a response from Telekom Security on the following two considerations. These were originally pointed out by Mike Shaver in the GDCA bug discussion, but I agree that they are relevant for other public CAs as well.
From comment https://bugzilla.mozilla.org/show_bug.cgi?id=1889062#c9:
How will [the CA] ensure that they do not issue certificates to subscribers with critical services, who have not also provided assurances that they can replace certificates within 24 hours or themselves (the subscriber) take responsibility for any service disruption if that does not occur?
If a service is essential to society and cannot operate successfully within the constraints of the BRs, then it should not be using WebPKI, and CAs should be ensuring that they do not issue WebPKI certificates to such services.
From comment https://bugzilla.mozilla.org/show_bug.cgi?id=1889062#c11:
Issuing a certificate to a subscriber who did not acknowledge and accept that immediate revocation may occur in the case of BR violation is misissuance. By my understanding of the BRs, and that of a well-informed anonymous expert who I consulted, you should not have issued replacement certificates if the subscriber did not accept that revocation can happen instantly at any time.
Comment 54•1 year ago
(referring to comment#40)
From our point of view, we have responded to all comments up to and including comment#32 in such a way that there were no further questions about it, except comment#22. However, we will go through all the comments again and see if all serious and factual questions and comments are addressed and will give answers or clarifications within the next comments.
As mentioned above, we have reviewed all previous comments, our responses and the reactions to them, and believe that we have addressed all comments up to and including comment #35 as well as some of the more recent comments. Here is an overview:
- comment#2 was addressed in comment#3, after which there was no further query.
- comment#4 was addressed in comment#9, after which there was no further query.
- comment#5 was addressed in comment#7, after which there was no further query.
- comment#6 was addressed in comment#9, after which there was no further query.
- comment#8 was addressed in comment#10, for which there was then a query in comment#18, which was addressed in comment#19, whereupon there were no further queries.
- comment#22 was addressed in comment#23, for which there were no further queries for 5 weeks, so we could assume that the answer was satisfactory. A further question was then raised in comment#35 (see below).
- comment#24 was addressed in comment#28, after which there was no further query.
- comment#25 was addressed in comment#28, after which there was no further query.
- comment#26 was addressed in comment#27, after which there was no further query.
- comment#29 and comment#32 were addressed in comment#30 and comment#33 (one iteration), after which there was no further query.
- comment#35 was addressed in comment#36, after which there were further queries to this in comment#37, comment#38 and comment#39, which have yet to be answered.
- comment#42 was addressed in comment#50, after which there was no further query until now.
- comment#44 was addressed in comment#50, after which there was no further query until now.
- comment#45 was addressed in comment#50, after which there was no further query until now.
We therefore believe that feedback on comment#37 to comment#39 and comment#51 to comment#53 is outstanding.
(in reply to comment#51)
Wayne, we think this basically agrees with your count (before comment#52 and comment#53 came up):
By my count I'm seeing the following questions without adequate responses. Please use this as a baseline to show how Telekom Security are adhering to their duties:
2024-07-04 https://bugzilla.mozilla.org/show_bug.cgi?id=1877388#c37
2024-07-04 https://bugzilla.mozilla.org/show_bug.cgi?id=1877388#c38
2024-07-08 https://bugzilla.mozilla.org/show_bug.cgi?id=1877388#c39
2024-07-18 https://bugzilla.mozilla.org/show_bug.cgi?id=1877388#c42
2024-07-18 https://bugzilla.mozilla.org/show_bug.cgi?id=1877388#c44
Deviating from this, we consider comment#42 and comment#44 to be answered, but we will also look at them again and provide further feedback on the outstanding comments as announced.
Comment 55•1 year ago
(In response to comment#52)
Thank you for your concrete questions.
I’m extremely surprised at how dismissive your CA has been in these responses. From an external perspective, this is making me think your CA does not really know how to conduct itself with regards to Bugzilla. To ensure this is not the case, please answer the following questions:
- Do you have a process to triage every Bugzilla bug in the “CA Certificate Compliance” component? If so, please write out what the process for that looks like.
- Do you have a process to triage every post on the various relevant mailing lists (e.g. MDSP, SCWG, etc)? If so, please write out what the process for that looks like.
- If your answer for Question 1 was yes, do you have triage logs available for any of the recent (last 6 months) bugs? If so, please share the triage logs for https://bugzilla.mozilla.org/show_bug.cgi?id=1872371.
These questions will help assure the community that you’re aware of how Bugzilla operates, and that you have a process for triaging bugs that other CAs are impacted by.
Answer to question 1) Yes, we have a process to triage every bug in the "CA Certificate Compliance" component. Practically speaking, we check on a daily basis. Since we are on the mailing list, we receive an email for every new bug. Due to the time difference, however, we check most of the new bugs until the next morning. Formally, the review takes place at least every week (usually on Tuesdays). Each bug is documented within an Excel sheet and evaluated separately. This process has been in place and documented since the spring of 2018. The task is the responsibility of the "Root and Compliance Team".
It is common practice that there are actually 3 types of procedure:
a) The bug is classified as "not relevant" by the initial assessment.
b) The bug is potentially relevant and will be discussed and evaluated immediately or at least in our weekly "Root and Compliance Telco" (as with the bug in question 3.).
c) A bug remains open for a longer period so that we can pay explicit attention to whether something significant arises later.
The ongoing discussions in the bugs will be monitored in each of the 3 cases afterwards. We receive mails for every comment on every bug. These will all be processed by Tuesday at the latest and any new finding that is important to us will be documented in the comments field and re-evaluated accordingly.
Answer to question 2) Yes, we are on all relevant mailing lists (cabforum, servercert-wg, Mozilla dev-security-policy, ct-policy, smcwg-public, netsec, cabf_validation, CCADB Public, ETSI ESI, ...). These are also usually checked on a daily basis. Every Tuesday at the latest, all of them are read. Topics that are important to us (e.g. future ballots, planned changes to root store policies, CCADB innovations, changes in the ct-log environment, new linter versions, new tools, ...) are also documented by topic and regularly evaluated. In numerous cases, this results in concrete work packages at an early stage.
In addition, we check all associated relevant websites (e.g. CABF, ETSI, Mozilla, Microsoft, Apple, Google (Chrome, G-Suite), CCADB CA Task List, CRL watch, OCSP watch, crt.sh, Mozilla disclosures, BSI, SOG-IS, CT Log, …) at least on a weekly basis for new versions, publications or test-results.
The fact that all the above-mentioned evaluations have been carried out is also documented weekly.
Answer to question 3) The entries in our documentation on the Buypass bug https://bugzilla.mozilla.org/show_bug.cgi?id=1872371 are quite unspectacular. Note: as this is an original excerpt, the entries are in German:
- Nr [No.]: 655
- Bug-Nr [bug no.]: 1872371
- Bug-URL: Buypass: Using an external DNS Resolver for DNS lookups (Incl. Link: https://bugzilla.mozilla.org/show_bug.cgi?id=1872371 )
- Stichworte [keywords]: external DNS resolvers
- Bug veröffentlicht am [bug published on]: 29.12.2023
- Prüfung eingeleitet [review initiated]: 02.01.2024
- Verantwortlich [responsible]: Arnold
- Ergebnisdatum [result date]: 02.01.2024
- Ergebnis [result]: OK
- Bemerkung [remark]: 02.01.2024: Im Root-Team: Buypass hat recursive DNS Resolver genutzt, wir nutzen eigenen DNS [In the root team: Buypass used recursive DNS resolvers; we use our own DNS]
Notes:
- This bug was noted on 2023-12-30, internally communicated via email, but formally documented in accordance with the process on 2024-01-02.
- The further comments in the bug gave no reason to change the initial evaluation.
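For readers who want to keep a similar log, the entry above can be modelled as a small record. This is only a minimal sketch: the class, field and method names are hypothetical English renderings of the column headers, the three procedure types mirror a) to c) from the answer to question 1, and the 7-day threshold is an assumption derived from the statement that the formal review takes place at least every week.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Procedure(Enum):
    """The three procedure types described in the answer to question 1."""
    NOT_RELEVANT = "a"   # classified as not relevant by the initial assessment
    DISCUSS = "b"        # potentially relevant, discussed immediately or weekly
    WATCH = "c"          # kept open to see whether something significant arises


@dataclass
class TriageEntry:
    """One row of the triage sheet (hypothetical field names)."""
    number: int
    bug_id: int
    title: str
    keywords: str
    published: date
    review_started: date
    responsible: str
    result_date: date
    result: str
    procedure: Procedure

    def days_to_review(self) -> int:
        """Days between publication of the bug and the start of the formal review."""
        return (self.review_started - self.published).days

    def reviewed_within_week(self) -> bool:
        """Assumed check: the formal review started at most 7 days after publication."""
        return self.days_to_review() <= 7


# Values taken from the log excerpt for bug 1872371.
entry = TriageEntry(
    number=655,
    bug_id=1872371,
    title="Buypass: Using an external DNS Resolver for DNS lookups",
    keywords="external DNS resolvers",
    published=date(2023, 12, 29),
    review_started=date(2024, 1, 2),
    responsible="Arnold",
    result_date=date(2024, 1, 2),
    result="OK",
    procedure=Procedure.DISCUSS,
)
```

With the dates from the excerpt, `entry.days_to_review()` gives 4 and `entry.reviewed_within_week()` is true.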
Comment 56•1 year ago
|
||
(In reply to Stefan Kirch from comment #55)
Thank you for the response! Your bug triage process being written out is very valuable and can help other CAs who have sometimes struggled with keeping up with Bugzilla.
Comment 57•1 year ago
|
||
(in reply to comment#35 and comment#39 regarding personal opinions, accusations, claims and chosen words)
Reviewing comment 35, I feel it was factual, fair, and substantiated, with correctly chosen words. Therefore, I must ask you to clarify.
Question 1: Please quote all passages that you feel are personal opinions. As you consider your reply, I will remind you that a capsule summary of established facts in one’s own words is not an example of personal opinion.
Question 2: Please quote all passages that you feel are unjustified accusations.
Question 3: Please quote all passages that you feel are unsubstantiated claims.
Question 4: Please quote all passages where you feel that “a more careful choice of words” is needed. For each quoted passage please explain why you feel that way.
From our point of view, the requirement to list all passages does not contribute to solving the actual bug; on this, see also Wayne's comment:
Please stop focusing on Comment 39 and providing statements that are not helping matters
However, as we do not want to leave this point open, we will limit ourselves to a few examples in the hope of finding a reasonable compromise between the desire not to go into it at all and the desire for detailed feedback.
The root cause of this delrev bug is the willful choice by your CA to prioritize your Subscribers’ convenience over the rules all CAs are expected to follow.
The claim that we acted out of convenience for our customers is not true. Regardless of whether the root cause is the fact that we had not sufficiently sensitized the customers or underestimated the replacement processes or our deliberate decision not to revoke and thus violate the BRs, we did not make the decision out of customer convenience, but to avoid disruption to infrastructures and their affected users.
…to accommodate its Subscribers’ stated unwillingness to install replacement certificates in time.
Regardless of classifying this as a claim or a personal opinion, this is also not true. Our customers have never stated their unwillingness, in fact they have been constructive.
The CA has the obligation to address all serious questions and comments in a candid, informative, accurate, and timely manner with the objective of identifying and acting on opportunities to improve its operation. The questioner has no obligation at all.
This is your opinion; we also see obligations for the questioners. Even if there are no written guidelines for questioners, e.g. in the rules for responding to an incident or in the CPG, we simply see it as part of a constructive and respectful discussion that both sides respond appropriately, especially in a public discussion.
As for the choice of words, we found the general tone of the commentary to be very commanding (“read the answers”, “go back now”, “follow this practice”) and lecturing (“To make it easy, I will lay out what you need to do”, “The thing you must come to understand”, “I am showing you that path”).
Comment 58•1 year ago
|
||
(in reply to comment#37 to comment#39)
Since the core statements in comment#37 to comment#39 have the same background or goal in many aspects, we summarize them in our commentary for the sake of clarity, in the hope of answering all questions.
Core statement 1: The discussions and future regulations of the CABF are not relevant for this bug
We pointed them out because we think it is important to note that revocation periods and delays are fundamental issues that need to be clarified in general and are already being addressed in the CABF. The goal was to state that we are actively following these discussions and possible changes to the rules. It was not meant to use these discussions as a reason for the delay in the current bug, as future regulations of course do not matter today.
Core statement 2: The action items don't go far enough
The listed action items have so far been interpreted in such a way that they basically contribute to the avoidance of further revocation delay bugs but do not go far enough to really exclude them. In order to make our declared goal of excluding revocation delay bugs in the future clearer, we would like to explain them in more detail.
Note: Since this revocation delay bug only affects our Trust Service for Enterprise RA customers, the action items are focused on this Trust Service accordingly.
Action Item | Kind | Due Date
Sensitizing customers to react at short notice and to be prepared for faster replacement procedures | Mitigate | ongoing
Preparing slides for a training with regard to automation | Prevent | 2024-03
Preparing a self-assessment for Enterprise RAs | Prevent | 2024-04
On-site audits at selected customers in which, in addition to the existing Enterprise RA topics, the possibilities of automation and preparation for quicker reactions are also considered | Prevent | starting in the second half of 2024
While the slides only contribute to raising customer awareness in terms of automation as well as the demand for timely revocation and thus do not prevent revocation delays, there is much more to the self-assessments.
The Enterprise RAs are obliged by means of a RA agreement to adhere to the requirements of our CP and CPS and thus also to the BR. As a consequence, they are therefore also bound to comply with the revocation periods, as these are of course also stipulated in our CP and CPS. But since the revocation periods are some of many requirements, the first step is to use the self-assessment to ask the Enterprise RAs for awareness and a clear statement on compliance with this requirement.
Depending on the evaluation of the answers of every single Enterprise RA as part of their self-assessment, we will take one of the next steps:
Variant A: The Enterprise RA confirms its awareness and gives a clear statement that all certificates can be replaced within the deadlines and, accordingly, that all certificates can be revoked within the deadlines, either by the Enterprise RA itself or by the CA. If we consider the answer to be credible, no further action is needed at the moment. If we have any doubts, we will first follow up via email and, if necessary, conduct an on-site audit and take any further action we deem appropriate.
Variant B: The Enterprise RA gives a statement that potentially not all certificates can be replaced within the deadlines and accordingly has a problem with the revocation deadlines.
In this case we will try to work out a solution with this customer, which can be, e.g.
- the use of private certificates instead of public certificates wherever possible,
Note: To make this change on our side, it's just a configuration for that client, so it can be done in a very short time, since our solution offers private certificates anyway.
- the (temporary) use of DV certificates for a faster replacement from our automated solution via ACME instead of OV certificates via Enterprise RA website or CMP.
If no solution can be found, we give the customer the choice to
- continue to use our certificates but accept that we will revoke them within the deadlines when needed, i.e. accept that, for example, their infrastructure will be disrupted, or
- terminate the contract and obtain certificates from another CA.
Ultimately, our goal is to maintain agreements only with customers, and to enter into agreements only with new customers, who are aware of the revocation deadlines and clearly commit to revoking in good time themselves or, if necessary, to accepting a (mass) revocation by the CA.
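The decision flow for evaluating a self-assessment, as described above, can be sketched roughly as follows. This is an illustration only, not our actual tooling: the function name, the parameter names and the enum are hypothetical, and the three boolean inputs simplify what is in practice a case-by-case evaluation.

```python
from enum import Enum


class NextStep(Enum):
    """Possible outcomes of evaluating one Enterprise RA self-assessment."""
    NO_ACTION = "no further action at the moment"
    FOLLOW_UP = "follow up via email, then on-site audit if necessary"
    APPLY_SOLUTION = "apply agreed solution (e.g. private or DV/ACME certificates)"
    CUSTOMER_CHOICE = "customer accepts timely revocation or terminates the contract"


def next_step(confirms_deadlines: bool,
              answer_credible: bool = True,
              solution_found: bool = False) -> NextStep:
    """Map a self-assessment answer to the next step (hypothetical helper).

    confirms_deadlines: the Enterprise RA states that all certificates can be
        replaced and revoked within the deadlines (Variant A); otherwise Variant B.
    answer_credible: only relevant for Variant A.
    solution_found: only relevant for Variant B.
    """
    if confirms_deadlines:
        # Variant A: accept a credible answer, otherwise follow up.
        return NextStep.NO_ACTION if answer_credible else NextStep.FOLLOW_UP
    # Variant B: try to work out a solution, otherwise the customer must choose.
    return NextStep.APPLY_SOLUTION if solution_found else NextStep.CUSTOMER_CHOICE
```

For example, `next_step(False, solution_found=False)` yields `NextStep.CUSTOMER_CHOICE`, i.e. the customer either accepts revocation within the deadlines or terminates the contract.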
Comment 59•1 year ago
|
||
(in reply to comment#53)
From our point of view, these questions have already been addressed in the answers in comment#58. Is that OK?
Comment 60•1 year ago
|
||
We are monitoring this bug for feedback. Please let us know if there are any further comments or questions.
Assignee | ||
Comment 61•1 year ago
|
||
We are monitoring this bug for feedback. Please let us know if there are any further comments or questions.
Assignee | ||
Comment 62•1 year ago
|
||
We are monitoring this bug for feedback. Please let us know if there are any further comments or questions.
Assignee | ||
Comment 63•1 year ago
|
||
We are monitoring this bug for feedback. Please let us know if there are any further comments or questions.
Assignee | ||
Comment 64•1 year ago
|
||
We are monitoring this bug for feedback. Please let us know if there are any further comments or questions.
Assignee | ||
Comment 65•1 year ago
|
||
We are monitoring this bug for feedback. Please let us know if there are any further comments or questions.
Assignee | ||
Comment 66•1 year ago
|
||
We are monitoring this bug for feedback. Please let us know if there are any further comments or questions.
Comment 67•1 year ago
|
||
I anticipate taking action on all old delayed revocation bugs prior to October 1, 2024.
Comment 68•10 months ago
|
||
We continue work on incident-reporting and compliance requirements aimed at reducing delayed revocation, so this bug will remain open until at least February 1, 2025. Meanwhile, CAs should review https://github.com/mozilla/www.ccadb.org/pull/186.
Comment 69•8 months ago
|
||
Referring to Mozilla's announcement to close Bug 1911183, in which this bug is referenced, here is the Closure Summary of this bug:
Incident Description: 336 misissued certificates of Bug 1875820 were not revoked in time.
Incident Root Cause(s): Since the customers were unable to replace the affected certificates in time and a failure of their infrastructures was to be avoided, Telekom Security did not force the revocation as required.
Remediation Description: To address this issue, Telekom Security sensitized customers to the possibilities of automation and preparation for timely (mass) revocation in case of future bugs and conducted a self-assessment for all Enterprise RAs as well as some on-site audits with selected customers to assess their ability to revoke in a timely manner.
Furthermore, Telekom Security has sharpened the Terms of Use as well as the contracts with the Enterprise-RAs regarding the obligation for a timely revocation.
Commitment Summary: We will continuously strengthen our processes, follow the updates of BRs and root policies, and regularly train our staff as well as our customers to avoid such incidents in the future.
All Action Items disclosed in this Incident Report have been completed as described, and we request its closure.
Comment 70•8 months ago
|
||
I will close this on or about Friday, 7-Feb-2025.
Comment 71•8 months ago
|
||
Ben,
In my quick review of what admittedly is a long and contentious bug, I did not spot explicit acknowledgement from Telekom Security on several points:
- It deliberately delayed revocation against the explicit instruction of the BRs, which is not a permissible decision for a CA to make.
- This choice was Telekom Security’s and nobody else’s.
- This is unacceptable behavior for a public CA then, now, or in the future.
While I see commitments about communication of expectations to Subscribers in a couple ways, I do not see a commitment to never again deliberately delay revocation. I believe any CA with a history of deliberate delrev should be expected to make this commitment and should have action items to establish an unambiguous policy and train employees on this policy. These action items should be completed before the bug can be closed, in my opinion.
While Telekom Security had plenty of energy for contention against those who called its decisions and policies into question, I do not believe we saw these basic steps take place.
I think this bug should stay open until Telekom Security meets these minimum requirements.
Comment 72•8 months ago
|
||
Thanks, Tim,
I appreciate your points.
While I understand the concern that CAs must follow the Baseline Requirements, I don’t believe that requiring them to make an absolute, irrevocable commitment to never delay revocation again is the right approach. Mozilla expects CAs to adhere to the BRs and implement necessary improvements when compliance issues arise. When a CA has previously delayed revocation, we focus on clear corrective actions, operational and process improvements, and a demonstrated commitment to strengthening compliance going forward. I believe CAs are getting the message that any delayed revocation will bring heightened scrutiny. However, mandating an unqualified, perpetual commitment introduces risks—particularly in complex, real-world scenarios where unforeseen operational challenges might arise. Our goal is to foster an ecosystem where CAs are accountable and transparent while allowing for our assessment of compliance improvements. We will continue to evaluate each case based on its specific facts, circumstances, and the corrective measures taken.
Thanks again,
Ben
Comment 73•7 months ago
|
||
I will pull this up for closure on or about Friday, 14-March-2025.