Closed Bug 1523186 Opened 10 months ago Closed 2 months ago

KIR S.A.: Misissuance - missing OCSP AIA, Validity > 825 days

Categories

(NSS :: CA Certificate Compliance, task)


Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: piotr.grabowski, Assigned: piotr.grabowski)

Details

(Whiteboard: [ca-compliance])

User Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36

Steps to reproduce:

https://crt.sh/?caid=15985&opt=cablint,zlint,x509lint&minNotBefore=2019-01-01

Bug report:

  1.   How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date. 
    

The problem was identified during post-issuance linting procedure on 14/01/19 by operator and sent for further verification.

  1.   A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.

14/01/19 One certificate was issued without an OCSP AIA and with the KeyUsage extension not marked as critical.
14/01/19 Our investigation of the root cause began.
15/01/19 We identified and removed from the system the registration policy that issued the problematic certificate. The problematic policy template was not listed among the policies allowed for Certificate Transparency logging but contained the Signed Certificate Timestamp extension. The use of such a policy template should have been blocked by the CT functionality. We had only one policy in such a state.
16/01/19 We described the issue and raised it with the software vendor with the highest priority.
21/01/19 In agreement with the customer we revoked the one problem certificate.
23/01/19 We received a response from the software vendor with a patch.
25/01/19 The fix was tested and implemented. During the test phase, when testing negative scenarios, we found one more configuration issue. We were able to log an intentionally malformed certificate request for a server from the KIR domain szafir.kir.pl (with a validity period greater than 825 days, https://crt.sh/?id=1142862481). The goal of the test was to block this kind of request. The configuration was fixed and retested with a positive result, and the test certificate was immediately revoked.
25/01/19 We ran a script over our existing certificate and policy database to identify objects that could be affected by this issue. No additional objects were identified.
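
The script mentioned in the last timeline entry is not shown in this report. As a minimal sketch of what such a scan could check, assuming each certificate is available as a simple record, the following flags the two problems named in this bug. The record layout (`serial`, `not_before`, `not_after`, `ocsp_urls`) and the function names are hypothetical and are not taken from KIR's system, and the validity check is a plain date difference that ignores any finer counting rules.

```python
from datetime import datetime, timedelta

# 825 days is the limit named in this bug's summary.
MAX_VALIDITY = timedelta(days=825)


def find_problems(record, max_validity=MAX_VALIDITY):
    """Return a list of problem labels for one certificate record.

    `record` is a hypothetical dict with `not_before` and `not_after`
    (datetime objects) and `ocsp_urls` (list of OCSP AIA URLs).
    """
    problems = []
    if record["not_after"] - record["not_before"] > max_validity:
        problems.append("validity period too long")
    if not record.get("ocsp_urls"):
        problems.append("missing OCSP AIA")
    return problems


def scan(records):
    """Map serial -> problems for every record with at least one problem."""
    report = {}
    for record in records:
        problems = find_problems(record)
        if problems:
            report[record["serial"]] = problems
    return report
```

An empty result from `scan` would correspond to the "no additional objects" outcome reported above.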

  1.  Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation. 
    

On 15/01/19 we removed from the system the registration policy that issued the problematic certificate, to prevent this problem from recurring. We have also fixed the test scenario configuration and retested it with a positive result.

  1.  A summary of the problematic certificates. For each problem: number of certs, and the date the first and last certs with that problem were issued. 
    

Two certificates: one issued on 14/01/2019 and another issued during the test phase on 25/01/2019.

  1.  The complete certificate data for the problematic certificates. 
    

https://crt.sh/?id=1120102462
https://crt.sh/?id=1142862481

  1.   Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now. 
    

The policy was previously used for Certificate Transparency logging, but some time ago it was removed from the list of approved CT logging policies. The use of a policy that has been removed from the allowed CT logging list but is still available to operators should be blocked by the system, but it did not work this way.
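
The rule described here can be sketched as a configuration check: any template that embeds the Signed Certificate Timestamp extension but is not on the list of approved CT logging policies should be unavailable for issuance. The data layout (`embeds_sct`, the template names, the allowlist set) is assumed for illustration and does not reflect the vendor's actual software.

```python
def templates_to_block(templates, ct_allowed):
    """Return names of templates that embed SCTs but are not CT-approved.

    `templates` is a hypothetical dict: name -> {"embeds_sct": bool}.
    `ct_allowed` is a hypothetical set of template names approved for
    Certificate Transparency logging.
    """
    return sorted(
        name
        for name, template in templates.items()
        if template.get("embeds_sct") and name not in ct_allowed
    )
```

Running a check like this whenever the allowlist changes would surface a template in the inconsistent state described above before an operator can select it.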

  1. List of steps CA is taking to resolve the situation and ensure it will not be repeated.

We implemented the patch from the software vendor and reconfigured the system to prevent this kind of issue from happening again. We have also changed the certificate policy management procedure to make sure that templates removed from the CT functionality are unavailable to operators even if they would be blocked by the system. We have also double-checked the CT configuration and test scenarios.

Thank you for reporting this Piotr. I believe the issue with the first certificate is that it was not issued in conformance with your CPS, correct? I do not believe the certificate itself is strictly out of compliance with any requirements. The second one with a validity period of over 5 years is clearly a big issue.

Pre-issuance linting is discussed in bug 1495497. Would that have also prevented this issue?

What is the status of pre-issuance linting?

A pattern of repeated misissuance is emerging, and the most recent example of a 5+ year duration is extremely troubling. What will KIR do to ensure that no further misissuance occurs prior to the implementation of pre-issuance linting? Without some reasonable assurance against further incidents, I would suggest that KIR suspend issuance of all certificates until pre-issuance linting is in place.

Assignee: wthayer → piotr.grabowski
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Flags: needinfo?(piotr.grabowski)
Summary: Software bug - 2 certificates with BR violations → KIR S.A.: Misissuance - missing OCSP AIA, Validity > 825 days
Whiteboard: [ca-compliance]

I just noticed this response from Piotr in an email:

I have response from Verizon that they will send us a patch containing some
pre-issuance linting features on 03/02.

Would those linting features have prevented these two certificates from being issued?

I believe the issue with the first certificate is that it was not issued in conformance with your CPS, correct?

-yes, it is not compliant with our CPS

The second one with a validity period of over 5 years is clearly a big issue.

-As I said in my previous post, it was a configuration issue with a new test; it was easily fixed and should not happen again in the future.
I can give you a further explanation of why it is impossible for this kind of issue to recur.

What is the status of pre-issuance linting?

-We are expecting field size validation improvements in February.

A pattern of repeated misissuance is emerging, and the most recent example of a 5+ year duration is extremely troubling. What will KIR do to ensure that no further misissuance occurs prior to the implementation of pre-issuance linting? Without some reasonable assurance against further incidents, I would suggest that KIR suspend issuance of all certificates until pre-issuance linting is in place.

The issue with validity > 825 days was a one-time case and is not going to happen again. As for the rest of our issues, I think we have made great progress in technical and administrative controls, but we are aware that without a full pre-issuance linting feature we are still exposed to small mistakes. In any case, we will put pressure on Verizon to implement pre-issuance linting as soon as possible.

Flags: needinfo?(piotr.grabowski)

Piotr: It was pointed out to me that the OCSP status of the 5+ year certificate appears to be "unknown": https://crt.sh/?id=1142862481&opt=cablint,ocsp

Assuming that this is yet another incident and not just a crt.sh error, please provide an incident report.

Flags: needinfo?(piotr.grabowski)

Wayne, the test certificate (5+) was revoked immediately after generation and was not issued to the customer; that is why its OCSP status is "unknown".
Best regards,
Piotr Grabowski

Flags: needinfo?(piotr.grabowski)

By responding "unknown", this certificate is not revoked. Do you have a timeline to provide the proper response?

Flags: needinfo?(piotr.grabowski)

This certificate is revoked on the CRL. Because the certificate was never received by the customer, its OCSP status is "unknown". To make the certificate "revoked" in OCSP we would first have to make it "valid", which makes no sense. I know there is an inconsistency between the CRL and OCSP, but there are some scenarios in which it can be insecure to make a certificate valid just in order to revoke it.

Flags: needinfo?(piotr.grabowski)

Can you please provide more details explaining why you can not mark this certificate revoked?

Neither the Baseline Requirements nor RFC 5280 distinguishes based on whether or not a certificate has been "received by the customer". Further, a disconnect between the OCSP status and the CRL status is not compatible with the relevant RFCs; they should provide identical status information.
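
The consistency requirement above can be illustrated with a small sketch that compares the two status sources for a set of issued certificates. It works on already-collected data rather than querying a real CRL or OCSP responder; the input shapes and the function name are assumptions made for the example.

```python
def status_mismatches(crl_revoked_serials, ocsp_status_by_serial):
    """Return serial -> (expected, actual) where OCSP disagrees with the CRL.

    `crl_revoked_serials` is a hypothetical set of serials listed on the CRL.
    `ocsp_status_by_serial` is a hypothetical dict of OCSP answers for
    issued certificates, using the strings "good", "revoked", "unknown".
    """
    mismatches = {}
    for serial, actual in ocsp_status_by_serial.items():
        # An issued certificate is either revoked on the CRL or it is not;
        # OCSP is expected to give the matching answer, never "unknown".
        expected = "revoked" if serial in crl_revoked_serials else "good"
        if actual != expected:
            mismatches[serial] = (expected, actual)
    return mismatches
```

For the certificate discussed here, a serial that is on the CRL while OCSP answers "unknown" would be reported as `("revoked", "unknown")`.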

Flags: needinfo?(piotr.grabowski)

Of course I can mark it as revoked after I make it valid, but I think it is a more secure practice not to change its status at all when the certificate has not been received by the customer. Suppose your CA generates a certificate and the customer wants you to deliver it to their office. What OCSP status should the certificate have while you are on your way to the customer's office? Valid? I do not think so. If the certificate is stolen, you are in trouble. So the only option is "unknown", but then we have different statuses on the CRL and OCSP, and we are still safe. It is not only my opinion; we had a long discussion with our auditors about that.

Flags: needinfo?(piotr.grabowski)

I think we'd be happy to discuss further with your auditors.

Certainly, a revoked certificate should clearly indicate that. A CA MUST support the capability of clearly indicating revoked certificates.

If your CA software requires you to do something insecure to accomplish that, then your CA software should be replaced. However, the end result - of a clear revoked status - should be achieved. Until it is, I don't think we can close out this incident report.

As for your hypothetical, I think we'd be happy to continue that discussion on m.d.s.p., as it has come up in other situations and discussions (for example, automated certificate issuance where the OCSP response is not-yet-available). However, broadly speaking, the act of the CA issuing the certificate is of utmost concern, regardless of the delivery mechanism of that certificate.

Flags: needinfo?(piotr.grabowski)

Hi, I think the discussion with our auditors could be very interesting but it could be hard to convince them to take part in it.

We have marked this certificate as revoked in OCSP.

Regards

Flags: needinfo?(piotr.grabowski)

I have created bug 1525082 for Ernst & Young Poland.

The discussion on the mozilla.dev.security.policy list has concluded that it is not permitted to maintain an "unknown" OCSP response for an issued valid or revoked certificate: https://groups.google.com/d/msg/mozilla.dev.security.policy/6MSwBBikf10/q-vyCtr_AwAJ

Piotr: please update this bug as pre-issuance linting is implemented, as noted in comment #3

Whiteboard: [ca-compliance] → [ca-compliance] - Next Update - 03-March 2019

I also found this certificate that has the same issues: https://crt.sh/?id=556959414&opt=cablint,x509lint,zlint

Piotr: has a complete scan of your active certificates been completed? If so, why was the one in comment #14 not reported and revoked?

Also, what is the status of the remediations described in comments #3 and #4?

Flags: needinfo?(piotr.grabowski)

It's been four months without updates. Is there an update here?

Emailed POCs on 2019-07-04 regarding this issue, highlighting https://wiki.mozilla.org/CA/Responding_To_An_Incident#Keeping_Us_Informed

Regarding comments #3 and #4, we have deployed a basic pre-linting patch. We also have procedural controls in place that should protect us from similar misissuance happening again.

As for the full scan of our active certificates, especially given the unknown OCSP responses, I think we should conduct this scan again.
I will let you know about the results when they are ready.

Flags: needinfo?(piotr.grabowski)

Piotr: Do you have an estimated time when they will be ready?

  • When did you implement the controls from Comment #3 and Comment #4?
  • Why was there a delay in notifying Mozilla of these?
  • What steps are being taken to ensure timely and prompt updates, along with clear timelines, going forward?
Flags: needinfo?(piotr.grabowski)

Ryan: I wrote in https://bugzilla.mozilla.org/show_bug.cgi?id=1495497 in comment #16.
The pre-linting patch was deployed in June 2019.

Flags: needinfo?(piotr.grabowski)

Resetting Needs-Info, as not all of the items requested in Comment #19 were responded to.

Note that, per https://wiki.mozilla.org/CA/Responding_To_An_Incident#Incident_Report

A timeline is a date-and-time-stamped sequence of all relevant events

Is it correct to understand that, in https://bugzilla.mozilla.org/show_bug.cgi?id=1495497#c16 , K.I.R. SA did not deploy linting until 2019-06-17 12:26 PDT? If so, it would appear that it may have only been the result of https://bugzilla.mozilla.org/show_bug.cgi?id=1495497#c15 , a remark at 2019-06-17 07:30 PDT, judging by the silence since 2019-01-21 10:06 PDT ( in https://bugzilla.mozilla.org/show_bug.cgi?id=1495497#c12 )

I ask these questions because K.I.R. SA did not provide weekly updates, and thus it's important to have a holistic view about what steps were taken, when they were taken, and what the motivating factors are. These are key elements to distinguishing a CA that has a good internal incident handling procedure, but poor communications, and a CA that is poor at both. This is the opportunity here to present the facts about how K.I.R. SA handled these, and why the community should not be significantly worried about the CA and the lack of updates on the issues until e-mailed.

Flags: needinfo?(piotr.grabowski)

Ryan: KIR deployed the basic pre-linting patch on 2019-06-13 10:00, and it is only a coincidence of timing that this fact was communicated after https://bugzilla.mozilla.org/show_bug.cgi?id=1495497#c15

Flags: needinfo?(piotr.grabowski)

Please do not reset the Needs-Info unless you are confident that you have answered all of the questions that were asked when it was set.

Please review Comment #19 and Comment #21 very carefully. A failure to comprehensively reply and answer all relevant questions will, as mentioned in Comment #21, appear very unfavorably for the CA and its ability to handle and prevent incidents.

Flags: needinfo?(piotr.grabowski)

Ryan: I think I have answered all questions in comment #22.
I would like to emphasize that we communicated the update as soon as the change was deployed, and that it was not directly triggered by your email. Of course, we should have communicated weekly that the patch had not yet been deployed, so that the community would have a holistic view, and this kind of communication should be incorporated into our internal procedures.
I believe this was only a short-term communication problem on our side.

Flags: needinfo?(piotr.grabowski)

I cannot possibly see how these reasonably answer the clear and specific questions of Comment #19.

I can only conclude that this is an unacceptable response, and it has thus resulted in an adverse opinion of the CA and its operations. I am not satisfied with these responses, despite efforts to encourage meaningful response.

I'm passing this over to Wayne to decide if he'd like to push this further, but my conclusion is that K.I.R. SA is either unwilling or unable to answer the questions provided to any meaningful detail, and thus is extremely likely to have further misissuance and poor responsiveness, such that appropriate steps may be removal in the future. However, that's only my recommendation.

Flags: needinfo?(wthayer)

It appears to me that the deployment of the pre-linting patch happened at least 2 weeks before that fact was communicated in comment #22, and that K.I.R. SA is unwilling to describe why that communication was delayed or how delays in communications will be prevented in the future.

In summary, a certificate with a validity period of more than 6 years was issued, K.I.R. SA failed to revoke it, and K.I.R. SA failed to update this bug in a timely fashion and to answer all questions.

Remediation has been completed, so I am resolving this bug.

Status: ASSIGNED → RESOLVED
Closed: 2 months ago
Flags: needinfo?(wthayer)
Resolution: --- → FIXED
Whiteboard: [ca-compliance] - Next Update - 03-March 2019 → [ca-compliance]