Closed Bug 997509 Opened 10 years ago Closed 10 years ago

mozilla::pkix should not require "revoked" or "unknown" OCSP responses to be fresh or unexpired

Categories

(Core :: Security: PSM, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla33
Tracking Status
firefox30 --- unaffected
firefox31 + verified
firefox32 + verified
firefox33 --- verified
firefox-esr24 --- unaffected
b2g-v2.0 --- fixed
b2g-v2.1 --- fixed

People

(Reporter: briansmith, Assigned: keeler)

References

Details

(Keywords: sec-want)

Attachments

(2 files, 5 obsolete files)

In mozilla::pkix we implemented a freshness check according to the spec to verify that the OCSP response is fresh. We also enforce the 10-day-old rule in the baseline requirements. However, we've found that many CAs are not following the 10-day-old rule, especially for intermediate certs where technically the freshness requirement is 1 year.

We should change the freshness checks within mozilla::pkix's VerifyEncodedOCSPResponse so that we'll honor these not-fresh-enough revoked and unknown OCSP responses within mozilla::pkix. NSSCertDBTrustDomain already contains such logic to do this, however it is non-obvious whether the order that errors are detected and handled is correct with the NSSCertDBTrustDomain-based approach. It will be more clearly correct to move the special processing of old revoked and unknown responses into pkixocsp.cpp, so that revoked and unknown responses effectively never expire. I imagine we'd do this by having VerifyEncodedOCSPResponse return an extra output parameter that indicates the freshness of the OCSP response.
Attached patch patch (obsolete) — Splinter Review
Assignee: nobody → dkeeler
Status: NEW → ASSIGNED
Attachment #8419055 - Flags: review?(brian)
Comment on attachment 8419055 [details] [diff] [review]
patch

Review of attachment 8419055 [details] [diff] [review]:
-----------------------------------------------------------------

Looks pretty reasonable. See the comments below.

::: security/manager/ssl/tests/unit/test_ocsp_caching.js
@@ +131,5 @@
> +                      clearSessionCache);
> +  add_test(function() { do_check_eq(gFetchCount, 0); run_next_test(); });
> +
> +  // Due to the (old) cached Revoked response, no further OCSP requests will
> +  // be made.

Why treat Revoked and Unknown differently? IMO, this should be exactly like the Unknown case: try to get a new response but use the old revoked response if that fails.

Also, why are you using ocsp-stapling-revoked-old above and ocsp-stapling-none here? Why not use ocsp-stapling-none for both? It is unclear to the reader if/how these two hosts are even using the same certificate.

@@ +151,5 @@
> +  add_test(function() { do_check_eq(gFetchCount, 0); run_next_test(); });
> +
> +  // A failure to retrieve an OCSP response must result in the cached Unknown
> +  // response being recognized and honored.
> +  add_connection_test("ocsp-stapling-none.example.com",

Ditto: it is confusing that we're using a different hostname but expecting it to use the same certificate. Could we make it less confusing by using the same hostname?

::: security/pkix/lib/pkixocsp.cpp
@@ +718,5 @@
> +  // Old responses are accepted if they indicate a revoked or unknown
> +  // certificate.
> +  if (context.time - SLOP > notAfter &&
> +      context.certStatus != CertStatus::Revoked &&
> +      context.certStatus != CertStatus::Unknown) {

This isn't what I was intending when I filed this bug and I don't think this is the right behavior for *stapled* OCSP responses. For stapled OCSP responses, we should return SEC_ERROR_OCSP_OLD_RESPONSE because that is the more informative error and because SEC_ERROR_OCSP_OLD_RESPONSE is not overridable. That is why I suggested that we add another output parameter to VerifyEncodedOCSPResponse to indicate whether or not the OCSP response was old.

BTW, we should add a comment to the code that determines which errors are overridable to indicate why it is important that SEC_ERROR_OCSP_OLD_RESPONSE not be overridable.
Attachment #8419055 - Flags: review?(brian) → review-
Keywords: sec-want
Blocks: 1007269
Thanks for the review. I have a couple of clarifications/questions.

(In reply to Brian Smith (:briansmith, was :bsmith; NEEDINFO? for response) from comment #2)
> ::: security/manager/ssl/tests/unit/test_ocsp_caching.js
> @@ +131,5 @@
> > +                      clearSessionCache);
> > +  add_test(function() { do_check_eq(gFetchCount, 0); run_next_test(); });
> > +
> > +  // Due to the (old) cached Revoked response, no further OCSP requests will
> > +  // be made.
> 
> Why treat Revoked and Unknown differently? IMO, this should be exactly like
> the Unknown case: try to get a new response but use the old revoked response
> if that fails.

I think this issue is orthogonal to these changes. It is already the case that Revoked and Unknown are treated differently with regard to attempting to fetch a different response (also, if a certificate has ever been Revoked, I'm not sure it makes sense to treat it as unrevoked if we then see a "good" status, even if it's newer).

> Also, why are you using ocsp-stapling-revoked-old above and
> ocsp-stapling-none here? Why not use ocsp-stapling-none for both? It is
> unclear to the reader if/how these two hosts are even using the same
> certificate.

This is a limitation of our test infrastructure at the moment. A single hostname cannot serve different certificates. However, since -none and -revoked-old serve the same certificate, an OCSP response for one will apply to the other as well (there's already a comment on line 57 about this).

> ::: security/pkix/lib/pkixocsp.cpp
> @@ +718,5 @@
> > +  // Old responses are accepted if they indicate a revoked or unknown
> > +  // certificate.
> > +  if (context.time - SLOP > notAfter &&
> > +      context.certStatus != CertStatus::Revoked &&
> > +      context.certStatus != CertStatus::Unknown) {
> 
> This isn't what I was intending when I filed this bug and I don't think this
> is the right behavior for *stapled* OCSP responses. For stapled OCSP
> responses, we should return SEC_ERROR_OCSP_OLD_RESPONSE because that is the
> more informative error and because SEC_ERROR_OCSP_OLD_RESPONSE is not
> overridable. That is why I suggested that we add another output parameter to
> VerifyEncodedOCSPResponse to indicate whether or not the OCSP response was
> old.

My interpretation of this bug was along the following:
When validating an OCSP response (stapled or otherwise), it may be expired. In most cases, we want to basically discard this response (because if it says the certificate is good, we can't trust that it still is). However, if the response says the certificate has been revoked, we actually do want to make a note of this, because we now know that at one point the CA revoked the certificate. (The unknown case is a little different, because my understanding is that a certificate may go from unknown to good. This is in contrast to my impression that once a certificate has been revoked, its status will never go back to good.)
Is this not what you were going for, here? I guess I would appreciate a little more guidance on what you think is the right way to go about this bug.

> BTW, we should add a comment to the code that determines which errors are
> overridable to indicate why it is important that SEC_ERROR_OCSP_OLD_RESPONSE
> not be overridable.

This is certainly doable.
Flags: needinfo?(brian)
(In reply to David Keeler (:keeler) [needinfo? is a good way to get my attention] from comment #3)
> I think this issue is orthogonal to these changes. It is already the case
> that Revoked and Unknown are treated differently with regard to attempting
> to fetch a different response (also, if a certificate has ever been Revoked,
> I'm not sure it makes sense to treat it as unrevoked if we then see a "good"
> status, even if it's newer).

Please read http://tools.ietf.org/html/rfc6960#section-4.4.8 and let me know if you still have the same position. In particular, do you think that we should treat (eventually) treat Revoked *without* the "extended revoked" extension differently than Revoked *with* that extension? "Clients do not have to parse this extension in order to determine the status of certificates in responses" suggests that we shouldn't treat responses differently based on the presence/absence of the extension. However, Kaspar Brand has also been insistent that our interpretation of "Unknown" goes beyond what the spec says (he says we should try another source of revocation information upon getting Unknown, which we won't do), so I guess we could go either way here.

According to bug 943815 comment 2, at least one commercial OCSP responder software vender has already implemented "extended revoked" with the intention that clients treat "extended revoked" like "Unknown" so I'm worried that there may already be deployed OCSP responders using "extended revoked" and we'd run into the same issues regarding just-issued certificates that the CA's responder doesn't know about "yet" that forced us to aggressively retry for "Unknown". On the other hand, we don't have any real-world reports of that badness happening for Revoked like we had for Unknown.

> > Also, why are you using ocsp-stapling-revoked-old above and
> > ocsp-stapling-none here? Why not use ocsp-stapling-none for both? It is
> > unclear to the reader if/how these two hosts are even using the same
> > certificate.
> 
> This is a limitation of our test infrastructure at the moment. A single
> hostname cannot serve different certificates. However, since -none and
> -revoked-old serve the same certificate, an OCSP response for one will apply
> to the other as well (there's already a comment on line 57 about this).

I see. I think the main issue here is whether we should do this only for fetched responses or for stapled responses too. The comments of mine that you are responding to are based on the idea that we'd only treat expired revoked/unknown responses specially for fetched responses but not for stapled responses. And, that caused me to think it didn't make sense to use the OCSP stapling test infrastructure to test this, but instead that we should use the JS-based mock OCSP responder we created, where the responder would return an expired Revoked/Unknown response the first time it was accessed and a different OCSP response for the next access.

> My interpretation of this bug was along the following:
> When validating an OCSP response (stapled or otherwise), it may be expired.
> In most cases, we want to basically discard this response (because if it
> says the certificate is good, we can't trust that it still is). However, if
> the response says the certificate has been revoked, we actually do want to
> make a note of this, because we now know that at one point the CA revoked
> the certificate. (The unknown case is a little different, because my
> understanding is that a certificate may go from unknown to good. This is in
> contrast to my impression that once a certificate has been revoked, its
> status will never go back to good.)
> Is this not what you were going for, here? I guess I would appreciate a
> little more guidance on what you think is the right way to go about this bug.

Remember that our handling of stapled responses is stricter than our handling of non-stapled responses in preparation for Must-Staple. When a site is Must-Staple-enabled we won't due the fallback to OCSP fetching when we receive an old response; instead we should report the SEC_ERROR_OCSP_OLD_RESPONSE error and stop.

(Further, if a server is Must-Staple then we don't need to store stapled OCSP responses in the OCSP cache.)

Counterpoint: If the server is stapling a response that is Revoked/Unknown then it is already misconfigured by using a bad certificate, and it shouldn't matter whether we return SEC_ERROR_OCSP_OLD_RESPONSE or SEC_ERROR_REVOKED_CERTIFICATE or SEC_ERROR_OCSP_UNKNOWN_CERT.

Counter-counter-point: It actually does matter whether we return SEC_ERROR_OCSP_OLD_RESPONSE or another error code for stapled responses, because we do fallback to OCSP fetching for SEC_ERROR_OCSP_OLD_RESPONSE but not for any other errors.

In your patch, why doesn't the counter-counter-point matter?
Flags: needinfo?(brian)
Brian and I discussed on IRC that for the time being, mozilla::pkix will not be supporting the "Extended Revoked" status. If this becomes a compatibility issue, we can revisit this decision. As a result, responses with the Unknown status will correspond either to certificates that are too new for the responder to know about or were mis-issued. This status is a temporary condition that will cause a verification failure if a Good response can't be fetched. Responses with the Revoked status will correspond to certificates that have been revoked by the CA. This is a permanent condition that will cause a verification failure.

If an expired, stapled response is Unknown, we will attempt to fetch a newer one. If this fails (or fails to result in a newer response), SEC_ERROR_OCSP_OLD_RESPONSE will be returned. If it succeeds, the status of the new response will be used.
If an expired, stapled response is Revoked, SEC_ERROR_REVOKED_CERTIFICATE will be returned.
If an expired, from-network response is Unknown, SEC_ERROR_OCSP_UNKNOWN_CERT will be returned.
If an expired, from-network response is Revoked, SEC_ERROR_REVOKED_CERTIFICATE will be returned.

Brian, let me know if I've left anything out or misunderstood the outcome of our conversation.
That seems right to me.
(In reply to Brian Smith (:briansmith, was :bsmith; NEEDINFO? for response) from comment #6)
> That seems right to me.

Based on the info we got on Twitter about at least one CA in our program already using the Extended Revoked semantics, it would also be OK to treat Revoked and Unknown exactly the same as far as this is concerned, if that is easier.
Blocks: 1018876
Attached patch patch v2 (obsolete) — Splinter Review
This patch implements the behavior that if a fetch failed and if there was a stapled response indicating Revoked or Unknown that has expired, the handshake is halted with the corresponding Revoked or Unknown error.
Attachment #8419055 - Attachment is obsolete: true
Attachment #8434479 - Flags: review?(brian)
Comment on attachment 8434479 [details] [diff] [review]
patch v2

Clearing review for now - this needs rebasing.
Attachment #8434479 - Flags: review?(brian)
Attached patch patch v3 (obsolete) — Splinter Review
This applies on top of bug 1019198, which just landed.
Attachment #8434479 - Attachment is obsolete: true
Attachment #8436160 - Flags: review?(brian)
Comment on attachment 8436160 [details] [diff] [review]
patch v3

Review of attachment 8436160 [details] [diff] [review]:
-----------------------------------------------------------------

::: security/certverifier/NSSCertDBTrustDomain.cpp
@@ +399,4 @@
>      if (stapledOCSPResponse) {
> +      SECStatus rv = VerifyAndMaybeCacheEncodedOCSPResponse(cert, issuerCert,
> +                       time, maxOCSPLifetimeInDays, stapledOCSPResponse,
> +                       ResponseIsFromNetwork);

I don't think that we should verify the stapled OCSP response again. Instead, I think we should have a way to do what VerifyAndMaybeCacheEncodedOCSPResponse does, but passing in the verification result from the stapling above. This way, we're guaranteed to get consistent results.

AFAICT, this would just involve splitting VerifyAndMaybeCacheEncodedOCSPResponse into two functions, that VerifyAndMaybeCacheEncodedOCSPResponse calls in succession.

(Race conditions are very unlikely given how the code is currently written, but AFAICT the patch is technically already racy because the state of the trust database could change between the two verification calls, such that the trust of a delegated OCSP responder cert is modified, giving two different results for the two verifications.)

@@ +438,4 @@
>    if (stapledOCSPResponse) {
> +    SECStatus rv = VerifyAndMaybeCacheEncodedOCSPResponse(cert, issuerCert,
> +                     time, maxOCSPLifetimeInDays, stapledOCSPResponse,
> +                     ResponseIsFromNetwork);

Same here. I think we should avoid trying to verify the same OCSP response more than once in this function.

::: security/manager/ssl/tests/unit/test_ocsp_stapling_expired.js
@@ +141,5 @@
> +    // be returned.
> +    add_ocsp_test("ocsp-stapling-revoked-old.example.com",
> +                  getXPCOMStatusFromNSS(SEC_ERROR_REVOKED_CERTIFICATE),
> +                  null);
> +    add_ocsp_test("ocsp-stapling-revoked-old.example.com", Cr.NS_OK,

I think we should have a comment explaining this case too.

@@ +146,5 @@
> +                  ocspResponseGood);
> +    add_ocsp_test("ocsp-stapling-unknown-old.example.com",
> +                  getXPCOMStatusFromNSS(SEC_ERROR_OCSP_UNKNOWN_CERT),
> +                  null);
> +    add_ocsp_test("ocsp-stapling-unknown-old.example.com", Cr.NS_OK,

ditto here.

::: security/pkix/lib/pkixocsp.cpp
@@ +379,5 @@
> +  if (context.expired && (!badStatusOverridesExpired ||
> +                          context.certStatus == CertStatus::Good)) {
> +    PR_SetError(SEC_ERROR_OCSP_OLD_RESPONSE, 0);
> +    return SECFailure;
> +  }

Like I said before, I think it makes more sense for this function to always return both the error (revoked/unknown) and an indicator about whether the response is expired, so the caller can decide how to rank those two issues against each other, and so the caller doesn't have to call VerifyEncodedOCSPresponse twice for the same response.
Attachment #8436160 - Flags: review?(brian)
Tracking to make sure we have it in 31.
Attached patch patch v4 (obsolete) — Splinter Review
Ok - I reorganized the patch so we don't have to re-verify responses.
Attachment #8436160 - Attachment is obsolete: true
Attachment #8440869 - Flags: review?(brian)
Comment on attachment 8440869 [details] [diff] [review]
patch v4

Review of attachment 8440869 [details] [diff] [review]:
-----------------------------------------------------------------

::: security/certverifier/NSSCertDBTrustDomain.cpp
@@ +418,5 @@
>  
> +  // If the response from the network has expired but indicates a revoked
> +  // or unknown certificate, PR_GetError() will return the appropriate error.
> +  // We actually ignore expired here.
> +  bool expired = false;

NIT: don't assign a value to expired here. (I suspect MSVC will complain of an unused assignment if you do.)

@@ +454,5 @@
>  SECStatus
>  NSSCertDBTrustDomain::VerifyAndMaybeCacheEncodedOCSPResponse(
>    const CERTCertificate* cert, CERTCertificate* issuerCert, PRTime time,
>    uint16_t maxLifetimeInDays, const SECItem* encodedResponse,
> +  EncodedResponseSource responseSource, bool& expired)

/*out*/ bool& expired

::: security/certverifier/NSSCertDBTrustDomain.h
@@ +90,5 @@
>    static const PRTime ServerFailureDelay = 5 * 60 * PR_USEC_PER_SEC;
>    SECStatus VerifyAndMaybeCacheEncodedOCSPResponse(
>      const CERTCertificate* cert, CERTCertificate* issuerCert, PRTime time,
>      uint16_t maxLifetimeInDays, const SECItem* encodedResponse,
> +    EncodedResponseSource responseSource, bool& expired);

/*out*/ bool& expired

::: security/pkix/include/pkix/pkix.h
@@ +111,5 @@
>  
> +// The out parameter expired will be true if the response has expired. If the
> +// response also indicates a revoked or unknown certificate, that error
> +// will be returned by PR_GetError(). Otherwise, SEC_ERROR_OCSP_OLD_RESPONSE
> +// will be returned by PR_GetError().

"Otherwise, SEC_ERROR_OCSP_OLD_RESPONSE will be returned by PR_GetError() for an expired response."
Attachment #8440869 - Flags: review?(brian) → review+
Attached patch patch v4.1 (obsolete) — Splinter Review
Great - thanks for the reviews.
https://tbpl.mozilla.org/?tree=Try&rev=ebf922d2c57e
Attachment #8440869 - Attachment is obsolete: true
Attachment #8440970 - Flags: review+
Attached patch patch v4.2Splinter Review
I think I figured out the windows failures:
https://tbpl.mozilla.org/?tree=Try&rev=9020b5f9d519
Attachment #8440970 - Attachment is obsolete: true
Attachment #8443154 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/31310e455130
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Comment on attachment 8443154 [details] [diff] [review]
patch v4.2

[Approval Request Comment]
Bug caused by (feature/regressing bug #): mozilla::pkix
User impact if declined: in some cases, a user visiting a site with a revoked certificate will not be notified that it has been revoked
Testing completed (on m-c, etc.): landed on m-c, has tests
Risk to taking this patch (and alternatives if risky): low
String or IDL/UUID changes made by this patch: none
Attachment #8443154 - Flags: approval-mozilla-aurora?
Attached patch patch for betaSplinter Review
[Approval Request Comment]
See previous request comment.
There were trivial merge conflicts for beta, hence this patch.
Attachment #8444564 - Flags: review+
Attachment #8444564 - Flags: approval-mozilla-beta?
Attachment #8444564 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #8443154 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
This appears to be the issue I encountered when I filed bug 1007891. I looked at the sites listed in that bug, and all of them function correctly in the latest build of Fx31. 

Based on this, marking verified for Fx31. If anyone disagrees, set the flag back to fixed and provide more instructions for verification purposes. Thank you.
Marking verified for Fx32/33 as well, as this issue was fixed a while ago and we no longer encounter this error in our continued testing.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.