Closed Bug 1639502 Opened 4 years ago Closed 4 years ago

Asseco DS / Certum: Incorrect OCSP response encoding

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mpalmer, Assigned: wtrapczynski)

Details

(Whiteboard: [ca-compliance] [ocsp-failure])

Attachments

(1 file)

1.62 KB, application/octet-stream
Details
Attached file invalid_ocsp_response

User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.75 Safari/537.36

Steps to reproduce:

I made an OCSP request via POST to http://dvcasha2.ocsp-certum.com/ at or around 2020-05-20 10:00:00 UTC, for certificate https://crt.sh/?id=1469465875.

Actual results:

I received a validly signed OCSP response (attached) in an HTTP 200 response, whose ResponseData section encodes a default value for the version of the basic OCSP response. This is, as I understand it, a violation of RFC6960, which requires the ResponseData to be DER-encoded within the larger OCSP response, and DER encoding requires default values to not be encoded.

Expected results:

The ResponseData.version field should not be encoded, as long as it has a value equal to that of the default value for the field.

Assignee: bwilson → wtrapczynski
Status: UNCONFIRMED → ASSIGNED
Type: defect → task
Ever confirmed: true
Summary: Certum: Incorrect OCSP response encoding → Asseco DS / Certum: Incorrect OCSP response encoding
Whiteboard: [ca-compliance]
Flags: needinfo?(wtrapczynski)

Thank you. We are looking into it.

We have fixed it.

Does Mozilla require an incident report as described here: https://wiki.mozilla.org/CA/Responding_To_An_Incident#Incident_Report?

Flags: needinfo?(wtrapczynski)

Yes.

Flags: needinfo?(wtrapczynski)
  1. How your CA first became aware of the problem (e.g. via a problem report submitted
    to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a
    Bugzilla bug, or internal self-audit), and the time and date.

On 2020-05-20 at 10:18 UTC, the bug was created https://bugzilla.mozilla.org/show_bug.cgi?
id=1639502.

  1. A timeline of the actions your CA took in response. A timeline is a date-and-time-
    stamped sequence of all relevant events. This may include events before the incident
    was reported, such as when a particular requirement became applicable, or a document
    changed, or a bug was introduced, or an audit was done.

All times are UTC.

2020-05-20 10:18 – Bugzilla bug was created.
2020-05-20 10:30 – We began an investigation.
2020-05-20 11:00 – We confirmed the reported bug.
2020-05-20 15:00 – We finished checking the source code of OCSP Responder to be sure that
there is no other invalid encoding.
2020-05-20 18:00 – We fixed invalid encoding in OCSP Responder.
2020-05-21 11:00 – We installed the fix on production.

  1. Whether your CA has stopped, or has not yet stopped, issuing certificates with the
    problem. A statement that you have will be considered a pledge to the community; a
    statement that you have not requires an explanation.

Not applicable to certificates.

We stopped provide OCSP responses with invalid ResponseData.version field on 2020-05-21
11:00 UTC.

  1. A summary of the problematic certificates. For each problem: number of certs, and the
    date the first and last certs with that problem were issued.

Not applicable.

  1. The complete certificate data for the problematic certificates. The recommended way
    to provide this is to ensure each certificate is logged to CT and then list the
    fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one
    list per distinct problem.

Not applicable.

  1. Explanation about how and why the mistakes were made or bugs introduced, and how
    they avoided detection until now.

This redundant encoding of ResponseData.version field was caused by a bug in OCSP Responder.

We did not detect it earlier because in our test scenarios for OCSP Responder we have not
directly checked the encoding of ResponseData.version field. Despite the presence of an
encoded field which is equal to its default value, tools we use for tests do not return any errors.
The wrongly encoded structure of ResponseData.version field was not breaking signature over
the whole OCSP response. The client's software (e.g. browser) does not return any warning
during OCSP response verification. These things made it a bit difficult to detect that bug.

  1. List of steps your CA is taking to resolve the situation and ensure such issuance will not
    be repeated in the future, accompanied with a timeline of when your CA expects to
    accomplish these things.

The OCSP Responder source code has been checked in order to be sure that there is no other
invalid encoding. Then we have fixed the OCSP Responder application. An additional scenario has
been added to our OCSP tests including additional steps to ensure that if ResponseData.version
field contains default value it is not encoded in response.

Flags: needinfo?(wtrapczynski)

(In reply to Wojciech Trapczyński from comment #4)

The OCSP Responder source code has been checked in order to be sure that there is no other
invalid encoding.

To confirm, the only examination was source examination, right? How many people examined the source?

That is, I'm trying to preemptively address that if there's another issue, it doesn't fall into one of the common responses we see of "The technician who reviewed (was confused/didn't spot the error/didn't examine this requirement) and we've fixed it with (multi-party review / more training / improved clarification)"

Then we have fixed the OCSP Responder application. An additional scenario has
been added to our OCSP tests including additional steps to ensure that if ResponseData.version
field contains default value it is not encoded in response.

  • What tests did you already have for your OCSP tests, in general?
  • What tests do you have specific to encoding/DER, specifically?
  • How is your OCSP profile configured? That is, is it possible that some features are only used some of the time?

Here I'm trying to address the scenario of "We tested this field, but didn't test (that other field / this field under that configuration / this edge case), and we've fixed it by (improving testing)"

Flags: needinfo?(wtrapczynski)

(In reply to Wojciech Trapczyński from comment #4)

  1. Explanation about how and why the mistakes were made or bugs introduced, and how
    they avoided detection until now.

This redundant encoding of ResponseData.version field was caused by a bug in OCSP Responder.

When was the bug introduced, by the way?

Despite the presence of an
encoded field which is equal to its default value, tools we use for tests do not return any errors.

Have you looked for other (additional) tools you could use for tests, which do return errors on invalid DER-encoded responses?

(In reply to Ryan Sleevi from comment #5)

(In reply to Wojciech Trapczyński from comment #4)

The OCSP Responder source code has been checked in order to be sure that there is no other
invalid encoding.

To confirm, the only examination was source examination, right? How many people examined the source?

The source code was examined by two persons. Besides, we examined the DER-encoded OCSP response and that check was conducted by yet another two persons.

  • What tests did you already have for your OCSP tests, in general?

Apart from testing application during the process of its development, we test the things like correctness of OCSP response signature, OCSP response status, OCSP response signing certificate validity, and such. All these tests are separately performed for each of our issuers. All these tests are performed as part of the monitoring of the correct operation of the OCSP service.

  • What tests do you have specific to encoding/DER, specifically?

As we just started using ZLint library for checking TLS certificates we thought that it would be nice to have a similar tool for testing OCSP responses. So, we did it. We created a tool for testing DER encoded OCSP response. So far, we used it to check basics like Signature Algorithm value, Produced At/This Update/Next Update times, Cert Status, and such. We use this tool in offline mode before we install a new version of OCSP responder.

  • How is your OCSP profile configured? That is, is it possible that some features are only used some of the time?

No, we have not such features.

Flags: needinfo?(wtrapczynski)

(In reply to mpalmer from comment #6)

When was the bug introduced, by the way?

This unnecessary encoding of version field was introduced when OCSP Responder was first launched. So, this bug results from assumptions made at the very beginning rather than from a programming error made later.

Have you looked for other (additional) tools you could use for tests, which do return errors on invalid DER-encoded responses?

No, we have not yet.

If there are no other questions, we request the bug to be closed.

Flags: needinfo?(bwilson)

The use of the word "yet" in your last answer implied that Certum would be investigating tools for validating for correct DER encoding. Is that no longer the intention?

(In reply to mpalmer from comment #10)

The use of the word "yet" in your last answer implied that Certum would be investigating tools for validating for correct DER encoding. Is that no longer the intention?

We have tested OpenSSL and Golang crypto libraries. Both of them do not return an error for invalid OCSP response where the default value of ResponseData.version is encoded. Of course, these libraries return an error for more general encoding errors but not for this case you reported in this bug. And such general encoding errors are detected by our monitoring system.

We are not going to look any further because we have deployed linting for OCSP services in which we explicitly check the encoding the ResponseData.version. Additionally (since my last post), we deployed OCSP linting in our continuous monitoring system (not just offline checking as I described it earlier).

To sum up, in our opinion, we deployed sufficient solutions that should prevent us from making a mistake connected with OCSP services once again.

(In reply to Wojciech Trapczyński from comment #11)

(In reply to mpalmer from comment #10)

The use of the word "yet" in your last answer implied that Certum would be investigating tools for validating for correct DER encoding. Is that no longer the intention?

We have tested OpenSSL and Golang crypto libraries. Both of them do not return an error for invalid OCSP response where the default value of ResponseData.version is encoded. Of course, these libraries return an error for more general encoding errors but not for this case you reported in this bug. And such general encoding errors are detected by our monitoring system.

Right, neither of these tools have really ever been useful for detecting malformed things. They're quite liberal in what they accept.

Tools like asn1c (which is used by certlint) do take the ASN.1 schemas and generate stricter decoders from them. This is why the awslab's certlint (now forked and living at https://github.com/certlint/certlint ) uses this approach.

I mention this, because ZLint also does not detect a number of things that violate the ASN.1 schema of RFC 5280 (e.g. size constraints), so this is an opportunity to more holistically look at both certificate and OCSP/CRL linting and add or contribute a layer more in-depth and robust. While this was specific to ResponseData.version, it seems just as likely any other number of issues can present themselves, with the same root cause, and go undetected.

(In reply to Wojciech Trapczyński from comment #11)

We have tested OpenSSL and Golang crypto libraries. Both of them do not return an error for invalid OCSP response where the default value of ResponseData.version is encoded.

I would not expect those libraries to return an error. Their job is to be liberal in what they accept.

We are not going to look any further because we have deployed linting for OCSP services in which we explicitly check the encoding the ResponseData.version.

This seems like an overly literal interpretation of the problem that needs to be solved. I'd prefer if we (the relying party community) didn't have to play "coding mistake whack-a-mole" with CAs, one bug at a time.

I will close this bug early next week unless Ryan or Matt have additional questions.

Flags: needinfo?(bwilson)

(In reply to Ryan Sleevi from comment #12)

Tools like asn1c (which is used by certlint) do take the ASN.1 schemas and generate stricter decoders from them. This is why the awslab's certlint (now forked and living at https://github.com/certlint/certlint ) uses this approach.

Ryan, thank you for pointed that out. We will look into possibilities to integrate asn1c in our linting tools.

(In reply to mpalmer from comment #13)

This seems like an overly literal interpretation of the problem that needs to be solved. I'd prefer if we (the relying party community) didn't have to play "coding mistake whack-a-mole" with CAs, one bug at a time.

Generally, we share your point of view.

In this case, we wanted to prevent us from making the same mistake as soon as possible and we did it. That is why we decided to enhance our linting tool for OCSP a new lint instead of creating a brand new solution.

OK, so now that the immediate problem has been prevented with a stop-gap lint check, what is Certum doing to address larger issue holistically?

(In reply to mpalmer from comment #17)

OK, so now that the immediate problem has been prevented with a stop-gap lint check, what is Certum doing to address larger issue holistically?

We treat adding a new lint as a full solution for this bug. As I wrote in https://bugzilla.mozilla.org/show_bug.cgi?id=1639502#c15 we will consider integrating asn1c in our linting tools but I am not sure should we proceed it in this bug.

I intend to close this bug on or after 5-Aug-2020 unless there are additional issues or questions.

Flags: needinfo?(bwilson)

I don't believe that Certum have fully mitigated the underlying issue, but merely deployed a stop-gap for the narrow problem initially reported.

Matt, do you have some further actions that we can ask Certum/Asseco to take to ensure that future OCSP errors do not occur? Thanks.

Flags: needinfo?(bwilson) → needinfo?(mpalmer)

Yes, deploy a "linting" tool that is aware of the requirements of DER encoding, preferably for all their DER-encoded artifacts (not just OCSP).

Flags: needinfo?(mpalmer)

This is the distinction between cablint/certlint (the AWSlabs version), which uses asn1c to help ensure this, and ZLint, which does not, as mentioned in Comment #12.

I think the substance is that, regardless of whether you use asn1c or a commercial tool (a number available), ensuring valid DER does seem a reasonable expectation for a CA, and the production of invalid DER is rather systemically concerning. This particular issue's root cause has been on Mozilla's "Forbidden Practices" list since virtually the first version of policy. It was also part of the systemic errors highlighted to CAs when launching moz::pkix. While the focus was on "certificates", the artifacts produced (including OCSP responses) all fit within the DER expectation, and presumably a robust solution for certificates would easily lend itself to a robust solution for CRLs and OCSP responses.

Ultimately, I don't think we can "make" the CA implement any particular solution. I think Matt's right to highlight that the systemic risk is unaddressed, and so I think any future failure by the CA, in light of the concerted effort, might be worthy of causing a more serious discussion about continued participation. But I don't know that we need to keep this incident open, since we've done our best to highlight the need to "do better".

Flags: needinfo?(bwilson)

I'd like a response from the CA on these latest points. Thanks.

Flags: needinfo?(bwilson) → needinfo?(wtrapczynski)

As I wrote in https://bugzilla.mozilla.org/show_bug.cgi?id=1639502#c18 we treat adding a new lint as a full solution for this particular bug and we believe that it should prevent us from doing the same error once again.

In https://bugzilla.mozilla.org/show_bug.cgi?id=1639502#c15 I wrote that we will consider integrating asn1c in our linting tools (or as a separate solution). Now, we are in the analyzing process with that. We are going to implement checking for all kinds of DER object. I am not sure should we proceed with it in this bug because I am now not able to determine a timeline for that.

Flags: needinfo?(wtrapczynski)

Thanks for your additional response. I'm still inclined to close this bug, so I intend to do so on or after this Friday, 28-August, unless there are further questions.

Flags: needinfo?(bwilson)
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Flags: needinfo?(bwilson)
Resolution: --- → FIXED
Product: NSS → CA Program
Whiteboard: [ca-compliance] → [ca-compliance] [ocsp-failure]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: