Closed Bug 933109 Opened 11 years ago Closed 11 years ago

"Unknown" OCSP responses are being cached in memory, preventing us from receiving a "Good" response

Tracking

(firefox25 wontfix, firefox26 wontfix, firefox27 fixed, firefox28 fixed, firefox29 fixed, firefox-esr17 wontfix, firefox-esr2427+ fixed, b2g18 affected, b2g-v1.1hd affected, b2g-v1.2 affected)

Status:

RESOLVED FIXED

Milestone:

3.15.4

Tracking Flags:

Tracking

Status

firefox25

---

wontfix

firefox26

---

wontfix

firefox27

---

fixed

firefox28

---

fixed

firefox29

---

fixed

firefox-esr17

---

wontfix

firefox-esr24

27+

fixed

b2g18

---

affected

b2g-v1.1hd

---

affected

b2g-v1.2

---

affected

People

(Reporter: briansmith, Assigned: briansmith)

References

(Blocks 1 open bug)

Details

(Whiteboard: [see links to user reports in comment 4][qa-])

Attachments

(3 files, 5 obsolete files)

Treat replies with "unknown" status as temporary failures 11 years ago Kaspar 1.79 KB, patch	KaiE : review-	Details \| Diff \| Splinter Review
patch v3, work in progress 11 years ago Kai Engert (:KaiE:) 28.15 KB, patch		Details \| Diff \| Splinter Review
ocsp-unknown-caching-fix.patch 11 years ago Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 12.89 KB, patch	wtc : review+	Details \| Diff \| Splinter Review
unknown-caching-tests.patch for Gecko 11 years ago Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 10.58 KB, patch	keeler : review-	Details \| Diff \| Splinter Review
unknown-caching-tests.patch for Gecko [v2] 11 years ago Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 6.59 KB, patch	keeler : review+ briansmith : checked-in+	Details \| Diff \| Splinter Review
ocsp-unknown-caching-fix.patch [v2] 11 years ago Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 12.35 KB, patch	wtc : review+ wtc : review+	Details \| Diff \| Splinter Review
ocsp-unknown-caching-fix.patch [v3] 11 years ago Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 13.64 KB, patch	wtc : review+	Details \| Diff \| Splinter Review
ocsp-unknown-caching-fix.patch [v4] 11 years ago Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 13.81 KB, patch	briansmith : review+ briansmith : checked-in+	Details \| Diff \| Splinter Review

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Description

•

11 years ago

See bug 745747 comment 24 and whatever discussion follows that.

We need to make sure that if we've received an "Unknown" response and cached it, that when we do OCSP fetching we try to fetch a response again, even if the "Unknown" response hasn't expired yet. If the fetch fails, then return SEC_ERROR_OCSP_UNKNOWN_CERT.

If we don't do this, then some sites will get DoS'd temporarily when they first get issued a cert and the CA's OCSP responder returns "Unknown" for that cert due to not (yet) being aware of its issuance. This is happening surprisingly often in the real world, based on reports from Twitter and Hacker News that I've seen. It is happening more frequently now because of the baseline requirements requiring CAs to return "Unknown" for unknown certs, where many of them were previously returning "Good" for unknown certs. It is especially bad because SEC_ERROR_OCSP_UNKNOWN_CERT is not overridable.

We should fix this soon if it is at all reasonable. To fix it right away in Firefox would mean to implement what I suggest in bug 745747 comment 24 (or similar) in NSS's classic cert verification code. For EV, we'd have to do the same for libpkix. It would be more important to do the classic mode fix first, since most certs are not EV.

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 1

•

11 years ago

Sid, can you suggest somebody to work on this? I have too much high-priority stuff on my plate right now. Thanks.

Flags: needinfo?(sstamm)

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 2

•

11 years ago

We should uplift the fix for this as far as we can.

status-b2g18: --- → affected

status-b2g-v1.1hd: --- → affected

status-b2g-v1.2: --- → affected

status-firefox25: --- → affected

status-firefox26: --- → affected

status-firefox27: --- → affected

status-firefox28: --- → affected

status-firefox-esr17: --- → affected

status-firefox-esr24: --- → affected

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Updated

•

11 years ago

Blocks: 871954

Sid Stamm [:geekboy or :sstamm]

Comment 3

•

11 years ago

Keeler offered to take this.

Assignee: nobody → dkeeler

Status: NEW → ASSIGNED

Flags: needinfo?(sstamm)

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 4

•

11 years ago

https://twitter.com/search?q=sec_error_ocsp_unknown_cert
https://input.mozilla.org/en-US/?q=sec_error_ocsp_unknown_cert&date_start=2013-08-15&selected=90d

Whiteboard: [see links to user reports in comment 4]

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 5

•

11 years ago

See:
http://translate.google.com/translate?sl=auto&tl=en&js=n&prev=_t&hl=en&ie=UTF-8&u=http%3A%2F%2Fwww.dgkim.net%2Fwordpress%2F2013%2F11%2F18%2Focsp-%25EA%25B2%2580%25EC%25A6%259D%25ED%2595%25B4%25EB%25B3%25B4%25EA%25B8%25B0%2F

https://bugzilla.startcom.org/show_bug.cgi?id=322

Priority: P2 → P1

Kaspar

Comment 6

•

11 years ago

Attached patch Treat replies with "unknown" status as temporary failures (obsolete) — Details — Splinter Review

It's not a PSM patch, but for the time being, perhaps the most effective solution would consist of not caching unknown responses in any case (neither from POST replies nor stapled ones), as illustrated by the attached patch. cert_RememberOCSPProcessingFailure will then prevent further requests for that cert for 1h by default.

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Updated

•

11 years ago

Assignee: dkeeler → brian

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Updated

•

11 years ago

Comment 7

•

11 years ago

The fix for this bug may also need to handle the "extended revoked" status in a similar way. See bug 943815 for a description of "extended revoked."

Kai Engert (:KaiE:)

Updated

•

11 years ago

Depends on: 943952

Kai Engert (:KaiE:)

Comment 8

•

11 years ago

Comment on attachment 8335124 [details] [diff] [review]
Treat replies with "unknown" status as temporary failures

(In reply to Kaspar Brand from comment #6)
> Created attachment 8335124 [details] [diff] [review]
> Treat replies with "unknown" status as temporary failures

This patch probably doesn't behave as you had intended.

Firefox runs in relaxed OCSP mode. (Ignoring OCSP network errors.)

When executing for the first time, the OCSP code will return a failure (because of the unknown status).

When executing again (e.g. when reloading the page), the cache entry created by cert_RememberOCSPProcessingFailure will be found. However, because of the relaxed OCSP mode, the remembered failure will be ignored.

As a result, I believe, this patch has the effect to ignore the unknown status and proceed anyway (at least on the second and subsequent load attempts).

Attachment #8335124 - Flags: review-

Kai Engert (:KaiE:)

Comment 9

•

11 years ago

I think a better patch, with less unexpected side effects, would be:

Cache the unknown status response, but cache it for a very short amount of time. Five minutes?

Kai Engert (:KaiE:)

Comment 10

•

11 years ago

It seems the logic in PKIX_RevocationChecker_Check is stricter than the classic code. Even with an unknown status locally cached, with the cache item still fresh, it enforces an OCSP retrieval from network anyway, because the cache item is treated as having an invalid status.

Kai Engert (:KaiE:)

Comment 11

•

11 years ago

Attached patch patch v3, work in progress (obsolete) — Details — Splinter Review

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 12

•

11 years ago

Similar issues affected GlobalSign (and thus CloudFlare, SoundCloud, etc.) today. So, even more reason why we need to make sure this patch gets into Firefox 27.

Kai Engert (:KaiE:)

Comment 13

•

11 years ago

There is no working patch yet.

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Updated

•

11 years ago

Blocks: 943815

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Updated

•

11 years ago

Component: Security: PSM → Libraries

Product: Core → NSS

Target Milestone: mozilla28 → 3.15.4

Version: Trunk → trunk

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 14

•

11 years ago

Attached patch ocsp-unknown-caching-fix.patch (obsolete) — Details — Splinter Review

This patches makes the following changes:

1. 'Unknown' responses are always considered stale.

2. When we encounter a stale 'Unknown' or 'Revoked' response, we will continue to try to fetch a newer one like before. However, if that fetch fails, and hard fail mode is not enabled, then we will fall back to the stale 'Unknown' or 'Revoked' response.

3. We will now never replace an 'Unknown' or 'Revoked' response with a negative (failure) cache entry like before; instead, an 'Unknown' or 'Revoked' cache entry can only be replaced by a valid signed entry.

Attachment #8335124 - Attachment is obsolete: true

Attachment #8339540 - Attachment is obsolete: true

Attachment #8350814 - Flags: review?(wtc)

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 15

•

11 years ago

Attached patch unknown-caching-tests.patch for Gecko (obsolete) — Details — Splinter Review

I had some trouble doing the testing with the NSS OCSP testing code that Kai started in this bug.

For now, I have written these Gecko tests to verify that the behavior is as we intend. The tests verify that:

1. Unknown responses are still put into the cache.
2. We use cached Unknown responses when an OCSP fetch fails.
3. A newer 'Good' response fetched from the network will take precedence over an existing 'Unknown' response in the cache.
4. A newer 'Good' response fetched from the network will replace an existing 'Unknown' response in the cache and that the newly-cached 'Good' response will be used to satisfy future requests.

Attachment #8350816 - Flags: review?(dkeeler)

Wan-Teh Chang

Comment 16

•

11 years ago

Comment on attachment 8350814 [details] [diff] [review]
ocsp-unknown-caching-fix.patch

Review of attachment 8350814 [details] [diff] [review]:
-----------------------------------------------------------------

r=wtc. The patch seems reasonable, but it may be prudent to review it more.
I'll try to review it again tomorrow. You can check it in first.

::: lib/certhigh/ocsp.c
@@ +758,5 @@
> +
> +    /* Work around broken OCSP responders that return unknown responses for
> +     * certificates, especially certificates that were just recently issued.
> +     * Forcing cacheItem->nextFetchAttemptTime to zero will cause us to attempt
> +     * to update the cache entry every time we look at it.

The comment says "forcing cacheItem->nextFetchAttemptTime to zero", but
the code below doesn't do that. Is the comment wrong? Or does the comment
describe code in some other function?

@@ +761,5 @@
> +     * Forcing cacheItem->nextFetchAttemptTime to zero will cause us to attempt
> +     * to update the cache entry every time we look at it.
> +     */
> +    if (fresh && cacheItem->certStatusArena &&
> +        cacheItem->certStatus.certStatusType == ocspCertStatus_unknown) {

Why is it necessary to test cacheItem->certStatusArena?

@@ +765,5 @@
> +        cacheItem->certStatus.certStatusType == ocspCertStatus_unknown) {
> +        fresh = PR_FALSE;
> +    }
> +
> +    OCSP_TRACE(("OCSP ocsp_IsCacheItemFresh: %d\n", fresh));

Move this OCSP_TRACE statement outside PR_ExitMonitor.

@@ +4981,2 @@
>              rv = SECSuccess;
> +        } else if (ocsp_IsCacheItemFresh(cacheItem)) {

Nit: we can call ocsp_IsCacheItemFresh(cacheItem) at line 4973 and save its return value in a local variable.

@@ +4983,2 @@
>              /*
>               * No status cached, the previous attempt failed.

This comment needs to be updated because the condition here is now "no status cached, and cache item is fresh".

@@ +4997,1 @@
>          }

Should we add an "else" branch here and set *cacheFreshness to ocspStale?

Or we can just do
    *cacheFreshness = ocsp_IsCacheItemFresh(cacheItem) ? ocspFresh : ocspStale;
at line 4973.

::: lib/certhigh/ocspi.h
@@ +40,5 @@
>                                   PRTime time,
>                                   PRBool addServiceLocator,
>                                   CERTCertificate *signerCert);
>  
> +typedef enum { ocspFresh, ocspStale, ocspMissing } OCSPFreshness;

Nit: the enums will get assigned the values 0, 1, 2. It would be nice to let ocspMissing get the 0 value, by listing it first.

Attachment #8350814 - Flags: review?(wtc) → review+

Dana Keeler (she/her) (use needinfo) [:keeler] (on leave)

Comment 17

•

11 years ago

Comment on attachment 8350816 [details] [diff] [review]
unknown-caching-tests.patch for Gecko

Review of attachment 8350816 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good, but I'm surprised the tests in test_ocsp_stapling_expired.js need to be removed.

::: security/manager/ssl/tests/unit/test_ocsp_stapling_expired.js
@@ -63,5 @@
>    // server staples an OCSP response with a recent signature but with an
>    // out-of-date validity period. The certificate has not expired.
>    add_ocsp_test("ocsp-stapling-expired.example.com", Cr.NS_OK,
> -                ocspResponseGood);
> -  add_ocsp_test("ocsp-stapling-expired-fresh-ca.example.com", Cr.NS_OK,

Why do your changes affect the tests in this file?

@@ -83,2 @@
>    add_test(function() { ocspResponder.stop(run_next_test); });
> -  add_test(check_ocsp_stapling_telemetry);

I'm almost certain this doesn't need to be removed - just update the expected telemetry values, below.

::: security/manager/ssl/tests/unit/test_ocsp_unknown_caching.js
@@ +14,5 @@
> +  let ocspResponder = new HttpServer();
> +  ocspResponder.registerPrefixHandler("/", function(request, response) {
> +    ++fetchCount;
> +
> +    dump("fetchCount: " + fetchCount + "\n");

dump isn't handled by the test infrastructure output - use do_print if you think it's important that this information be available (and wherever else you use dump)

@@ +81,5 @@
> +                      clearSessionCache);
> +  add_test(function() { do_check_eq(fetchCount, 2); run_next_test(); });
> +
> +  add_test(function() { ocspResponder.stop(run_next_test); run_next_test(); });
> +

Do you think it's useful to check the stapling telemetry at the end of the test?

Attachment #8350816 - Flags: review?(dkeeler) → review-

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 18

•

11 years ago

Attached patch unknown-caching-tests.patch for Gecko [v2] — Details — Splinter Review

David, it was a mistake to include the changes to OCSPCommon.cpp and test_ocsp_stapling_expired.js in my patch. I had made those changes when I was trying to diagnose a bug in my patch. I have removed those changes from the patch.

I updated the patch to use do_print() instead of dump().

I did not add the OCSP stapling telemetry to the patch. Tests should try to test one thing when possible. I think we have enough testing of the telemetry.

Attachment #8350855 - Flags: review?(dkeeler)

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Updated

•

11 years ago

Attachment #8350816 - Attachment is obsolete: true

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 19

•

11 years ago

Comment on attachment 8350814 [details] [diff] [review]
ocsp-unknown-caching-fix.patch

Review of attachment 8350814 [details] [diff] [review]:
-----------------------------------------------------------------

::: lib/certhigh/ocsp.c
@@ +758,5 @@
> +
> +    /* Work around broken OCSP responders that return unknown responses for
> +     * certificates, especially certificates that were just recently issued.
> +     * Forcing cacheItem->nextFetchAttemptTime to zero will cause us to attempt
> +     * to update the cache entry every time we look at it.

My patch originally did force cacheItem->nextFetchAttemptTime to zero, which probably has the same effect, but I changed it to work the way the patch currently works when debugging the patch. I think the current way is clearer.

@@ +761,5 @@
> +     * Forcing cacheItem->nextFetchAttemptTime to zero will cause us to attempt
> +     * to update the cache entry every time we look at it.
> +     */
> +    if (fresh && cacheItem->certStatusArena &&
> +        cacheItem->certStatus.certStatusType == ocspCertStatus_unknown) {

My understanding is that the members of cacheItem->certStatus are undefined when certStatusArena is NULL, based on how other code that accesses certStatusArena is written and documented.

@@ +765,5 @@
> +        cacheItem->certStatus.certStatusType == ocspCertStatus_unknown) {
> +        fresh = PR_FALSE;
> +    }
> +
> +    OCSP_TRACE(("OCSP ocsp_IsCacheItemFresh: %d\n", fresh));

Instead, I removed the PR_EnterMonitor and PR_ExitMonitor calls, because this function is always called with the monitor entered already.

@@ +4981,2 @@
>              rv = SECSuccess;
> +        } else if (ocsp_IsCacheItemFresh(cacheItem)) {

I did something very similar.

@@ +4983,2 @@
>              /*
>               * No status cached, the previous attempt failed.

I believe nothing has changed here. This block was previously already guarded by ocsp_IsCacheItemFresh(cacheItem). Please see if the new version of the code is clearer.

@@ +4997,1 @@
>          }

Effectively, that is what I have done.

::: lib/certhigh/ocspi.h
@@ +40,5 @@
>                                   PRTime time,
>                                   PRBool addServiceLocator,
>                                   CERTCertificate *signerCert);
>  
> +typedef enum { ocspFresh, ocspStale, ocspMissing } OCSPFreshness;

Done.

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 20

•

11 years ago

Attached patch ocsp-unknown-caching-fix.patch [v2] (obsolete) — Details — Splinter Review

Review comments addressed.

Attachment #8350814 - Attachment is obsolete: true

Attachment #8350858 - Flags: review?(wtc)

Wan-Teh Chang

Comment 21

•

11 years ago

Comment on attachment 8350858 [details] [diff] [review]
ocsp-unknown-caching-fix.patch [v2]

Review of attachment 8350858 [details] [diff] [review]:
-----------------------------------------------------------------

I reviewed the patch again, this time very carefully.
I understand the patch much better now.

I have some new comments.

::: lib/certhigh/ocsp.c
@@ +800,5 @@
> +     * response to replace it.
> +     */
> +    if (!single && cacheItem && cacheItem->certStatusArena &&
> +        (cacheItem->certStatus.certStatusType == ocspCertStatus_revoked ||
> +         cacheItem->certStatus.certStatusType == ocspCertStatus_unknown)) {

Please explain why we should skip the
ocsp_FreshenCacheItemNextFetchAttemptTime call at the end of the
function. I agree it is OK to skip the ocsp_CheckCacheSize call.

I don't need a detailed explanation. A simple confirmation that
you considered this issue would be good enough.

@@ +4946,5 @@
>   * in a failure, missingResponseError shows the error code of
>   * that failure.
>   */
>  SECStatus
> +ocsp_GetCachedOCSPResponseStatus(CERTOCSPCertID *certID,

Please update the comment for this function. In particular,
"Return value SECFailure means: not found or not fresh" is no
longer true.

@@ +4976,5 @@
>              if (*rvOcsp != SECSuccess) {
>                  *missingResponseError = PORT_GetError();
>              }
>              rv = SECSuccess;
> +        } else if (*cacheFreshness == ocspFresh) {

I now think this should be changed back to just "} else {".

This way, all cache items will get the same treatment whether they
contain a certStatus or an error. Right now the patch leaves this
code in a half-changed state, making it hard to understand.

I don't understand this code as well as you do, so I may have
missed something subtle.

@@ +5088,5 @@
>                                         &certIDWasConsumed, 
>                                         &rvOcsp);
>      if (rv != SECSuccess) {
> +        if (ocsp_FetchingFailureIsVerificationFailure()) {
> +            rvOcsp = SECFailure;

Should we call PORT_SetError here? I guess
ocsp_GetOCSPStatusFromNetwork has set the error code. If so, there
is still a small risk that ocsp_FetchingFailureIsVerificationFailure
may modify the error code.

@@ +5091,5 @@
> +        if (ocsp_FetchingFailureIsVerificationFailure()) {
> +            rvOcsp = SECFailure;
> +        } else if (cachedResponseFreshness == ocspStale &&
> +                   (cachedErrorCode == SEC_ERROR_OCSP_UNKNOWN_CERT ||
> +                    cachedErrorCode == SEC_ERROR_REVOKED_CERTIFICATE)) {

Do we also need to check that rv == SECSuccess &&
rvOcsp != SECSuccess? Or is that implied by a nonzero
cachedErrorCode?

::: lib/certhigh/ocspi.h
@@ +40,5 @@
>                                   PRTime time,
>                                   PRBool addServiceLocator,
>                                   CERTCertificate *signerCert);
>  
> +typedef enum { ocspMissing, ocspFresh, ocspStale } OCSPFreshness;

We probably should name this enum type OCSPCacheFreshness, with values ocspCacheMissing, ocspCacheFresh, and ocspCacheStale.

The reason is that "ocspMissing" is ambiguous. It can mean "the cache item is missing" or "we have a cache item but the cert status is missing (i.e., we have an error entry)".

Attachment #8350858 - Flags: review?(wtc)

Attachment #8350858 - Flags: review+

Wan-Teh Chang

Comment 22

•

11 years ago

Comment on attachment 8350814 [details] [diff] [review]
ocsp-unknown-caching-fix.patch

Review of attachment 8350814 [details] [diff] [review]:
-----------------------------------------------------------------

::: lib/certhigh/ocsp.c
@@ +4983,2 @@
>              /*
>               * No status cached, the previous attempt failed.

You are right. The comment "No status cached" refers to the fact
that cacheItem->certStatusArena is NULL, which is still true.

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 23

•

11 years ago

Comment on attachment 8350858 [details] [diff] [review]
ocsp-unknown-caching-fix.patch [v2]

Review of attachment 8350858 [details] [diff] [review]:
-----------------------------------------------------------------

::: lib/certhigh/ocsp.c
@@ +800,5 @@
> +     * response to replace it.
> +     */
> +    if (!single && cacheItem && cacheItem->certStatusArena &&
> +        (cacheItem->certStatus.certStatusType == ocspCertStatus_revoked ||
> +         cacheItem->certStatus.certStatusType == ocspCertStatus_unknown)) {

ocsp_FreshenCacheItemNextFetchAttemptTime would cause us to extend the amount of time we consider the possibly-already-stale existing entry to be fresh upon failure to update it. AFAICT, that's not what we want; we want to keep trying to update it on failure.

@@ +4946,5 @@
>   * in a failure, missingResponseError shows the error code of
>   * that failure.
>   */
>  SECStatus
> +ocsp_GetCachedOCSPResponseStatus(CERTOCSPCertID *certID,

Done. SECFailure now means that the function failed in a fatal way. It will now return SECSuccess for missing, fresh, or stale entries. Please see the new documentation in the new patch.

@@ +4976,5 @@
>              if (*rvOcsp != SECSuccess) {
>                  *missingResponseError = PORT_GetError();
>              }
>              rv = SECSuccess;
> +        } else if (*cacheFreshness == ocspFresh) {

The full content of the else block is:

  } else if (*cacheFreshness == ocspFresh) {
      /*
       * No status cached, the previous attempt failed.
       * If OCSP is required, we never decide based on a failed attempt 
       * However, if OCSP is optional, a recent OCSP failure is
       * an allowed good state.
       */
      if (!ignoreGlobalOcspFailureSetting &&
          OCSP_Global.ocspFailureMode == 
              ocspMode_FailureIsNotAVerificationFailure) {
          rv = SECSuccess;
          *rvOcsp = SECSuccess;
      }
      *missingResponseError = cacheItem->missingResponseError;
  }

Note that we must NOT set *rvOCSP = SECSuccess when the error entry is stale. Otherwise, a failure to retrieve an OCSP response would permanently disable OCSP fetching for that certificate.

Consequently, I did not make the change you suggested. However, I did move the (*cacheFreshness == ocspFresh) test from the outer else clause to the inner block's if:

            if (*cacheFreshness == ocspFresh &&
                !ignoreGlobalOcspFailureSetting &&
                OCSP_Global.ocspFailureMode == 
                    ocspMode_FailureIsNotAVerificationFailure) {
                *rvOcsp = SECSuccess;
            }

I also factored out the (rv = SECSuccess) statement.

@@ +5088,5 @@
>                                         &certIDWasConsumed, 
>                                         &rvOcsp);
>      if (rv != SECSuccess) {
> +        if (ocsp_FetchingFailureIsVerificationFailure()) {
> +            rvOcsp = SECFailure;

I made this change (defensive programming).

@@ +5091,5 @@
> +        if (ocsp_FetchingFailureIsVerificationFailure()) {
> +            rvOcsp = SECFailure;
> +        } else if (cachedResponseFreshness == ocspStale &&
> +                   (cachedErrorCode == SEC_ERROR_OCSP_UNKNOWN_CERT ||
> +                    cachedErrorCode == SEC_ERROR_REVOKED_CERTIFICATE)) {

This is much clearer in the new patch, because we know the return value from ocsp_GetCachedOCSPResponseStatus will be SECSuccess if we've gotten to this point. Also, we know that ocspResponseFreshness was either ocspMissing or ocspStale because if it were ocspFresh then we would have returned early before attempting the fetch.

::: lib/certhigh/ocspi.h
@@ +40,5 @@
>                                   PRTime time,
>                                   PRBool addServiceLocator,
>                                   CERTCertificate *signerCert);
>  
> +typedef enum { ocspMissing, ocspFresh, ocspStale } OCSPFreshness;

I think "Cache" is redundant because it is implied by "freshness". What else would we be checking freshness of, if not a cache entry?

In the new patch, ocspMissing is now unambiguous; it means there was no cache entry.

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 24

•

11 years ago

Attached patch ocsp-unknown-caching-fix.patch [v3] — Details — Splinter Review

Attachment #8350858 - Attachment is obsolete: true

Attachment #8350913 - Flags: review?(wtc)

Wan-Teh Chang

Comment 25

•

11 years ago

Comment on attachment 8350913 [details] [diff] [review]
ocsp-unknown-caching-fix.patch [v3]

Review of attachment 8350913 [details] [diff] [review]:
-----------------------------------------------------------------

r=wtc. I didn't read your response to my previous review comments.
I just review this patch directly.

You just need to fix the comment problems. The patch needs to converge
if we want it in NSS 3.15.4. I think you wrote a very good patch.
This is not an easy bug.

::: lib/certhigh/ocsp.c
@@ +4936,5 @@
>  {
>      return ocsp_CertHasGoodStatus(single->certStatus, time);
>  }
>  
> +/* SECFailure means a fatal error occurred.

"fatal error" is not clear. I can say "certificate revoked" is a
fatal error, but that's not what you meant. Perhaps you can just
say it's an invalid argument (SEC_ERROR_INVALID_ARGS).

@@ +4946,5 @@
>   * in a failure, missingResponseError shows the error code of
>   * that failure.
> + * cacheFreshness is ocspCacheMissing if no entry was found,
> + *                   ocspCacheFresh if a fresh entry was found, or
> + *                   ocspCacheStale if a stale entry was found.

The three enum values in this comment should be changed back
to ocspMissing, ocspFresh, and ocspStale. No need to rename them.

@@ +4990,2 @@
>                  OCSP_Global.ocspFailureMode == 
>                      ocspMode_FailureIsNotAVerificationFailure) {

I guess you think ignoring OCSP failure should require a
fresh cache item. This seems reasonable. It also matches the
comment "if OCSP is optional, a recent OCSP failure is an
allowed good state". So at least we don't need to update the
comment.

Note that whether we check *cacheFreshness == ocspFresh or not
here doesn't matter to the call site in CERT_CheckOCSPStatus
because it will use the rvOcsp output argument only when
cachedResponseFreshness == ocspFresh.

NOTE: the call site in CERT_CacheOCSPResponseFromSideChannel
probably should not ignore a cached OCSP failure. I think that
call site should pass ignoreGlobalOcspFailureSetting=PR_TRUE.
But that is a pre-existing problem and you can ignore it in
this bug report.

Attachment #8350913 - Flags: review?(wtc) → review+

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 26

•

11 years ago

Attached patch ocsp-unknown-caching-fix.patch [v4] — Details — Splinter Review

http://hg.mozilla.org/projects/nss/rev/4c3d2fec4ea1

Attachment #8350962 - Flags: review+

Attachment #8350962 - Flags: checked-in+

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 27

•

11 years ago

(In reply to Wan-Teh Chang from comment #25)
> "fatal error" is not clear. I can say "certificate revoked" is a
> fatal error, but that's not what you meant. Perhaps you can just
> say it's an invalid argument (SEC_ERROR_INVALID_ARGS).

Done.

> The three enum values in this comment should be changed back
> to ocspMissing, ocspFresh, and ocspStale. No need to rename them.

Done.

> NOTE: the call site in CERT_CacheOCSPResponseFromSideChannel
> probably should not ignore a cached OCSP failure. I think that
> call site should pass ignoreGlobalOcspFailureSetting=PR_TRUE.
> But that is a pre-existing problem and you can ignore it in
> this bug report.

I filed bug 952808 and checked in the fix for it in a separate patch.

Status: ASSIGNED → RESOLVED

Closed: 11 years ago

Resolution: --- → FIXED

Dana Keeler (she/her) (use needinfo) [:keeler] (on leave)

Comment 28

•

11 years ago

Comment on attachment 8350855 [details] [diff] [review]
unknown-caching-tests.patch for Gecko [v2]

Review of attachment 8350855 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM.

::: security/manager/ssl/tests/unit/test_ocsp_unknown_caching.js
@@ +3,5 @@
> +// License, v. 2.0. If a copy of the MPL was not distributed with this
> +// file, You can obtain one at http://mozilla.org/MPL/2.0/.
> +"use strict";
> +
> +let fetchCount = 0;

suggestion: I find it useful to prefix global variables with "g" (so "gFetchCount" in this case)

Attachment #8350855 - Flags: review?(dkeeler) → review+

Dana Keeler (she/her) (use needinfo) [:keeler] (on leave)

Updated

•

11 years ago

Blocks: 952876

Kai Engert (:KaiE:)

Comment 29

•

10 years ago

> @@ +765,5 @@
> > +        cacheItem->certStatus.certStatusType == ocspCertStatus_unknown) {
> > +        fresh = PR_FALSE;
> > +    }
> > +
> > +    OCSP_TRACE(("OCSP ocsp_IsCacheItemFresh: %d\n", fresh));
>
> Instead, I removed the PR_EnterMonitor and PR_ExitMonitor calls, because this function is
> always called with the monitor entered already.


It would be better to keep calls to Monitors. It reminds the programmer that protection is necessary. If you remove the call, future programmers might get into trouble, if they add calls to the function, without noticing the requirement for locking.

Kai Engert (:KaiE:)

Comment 30

•

10 years ago

(In reply to Brian Smith from comment #14)
> 
> 2. When we encounter a stale 'Unknown' or 'Revoked' response, we will
> continue to try to fetch a newer one like before. However, if that fetch
> fails, and hard fail mode is not enabled, then we will fall back to the
> stale 'Unknown' or 'Revoked' response.
> 
> 3. We will now never replace an 'Unknown' or 'Revoked' response with a
> negative (failure) cache entry like before; instead, an 'Unknown' or
> 'Revoked' cache entry can only be replaced by a valid signed entry.


It worries me that you allow a response with a good status to override a previously received revoked status.

I'm worried about replay attacks during the hours after a revocation took place.

An attacker who hacked a compromised cert might replay a good status message (e.g. stapled into a MITMed connection).

In my understanding of past discussions in NSS meetings, there was consensus that status revoked should always stick.

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 31

•

10 years ago

(In reply to Kai Engert (:kaie) from comment #29)
> It would be better to keep calls to Monitors. It reminds the programmer that
> protection is necessary. If you remove the call, future programmers might
> get into trouble, if they add calls to the function, without noticing the
> requirement for locking.

Entering/exiting monitors has a cost, so it is better for performance to avoid doing so when reasonable. Also, entering/exiting monitors tends to make the code harder to read by encouraging us to use "goto" and friends.

Before removing the monitor usage in that function, I looked at other functions and found ones like ocsp_SetCacheItemResponse that also do not enter/exit the monitor. So, I think the reasoning I used to justify my change is already being used in ocsp.c.

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 32

•

10 years ago

(In reply to Kai Engert (:kaie) from comment #30)
> (In reply to Brian Smith from comment #14)
> > 
> > 2. When we encounter a stale 'Unknown' or 'Revoked' response, we will
> > continue to try to fetch a newer one like before. However, if that fetch
> > fails, and hard fail mode is not enabled, then we will fall back to the
> > stale 'Unknown' or 'Revoked' response.
> > 
> > 3. We will now never replace an 'Unknown' or 'Revoked' response with a
> > negative (failure) cache entry like before; instead, an 'Unknown' or
> > 'Revoked' cache entry can only be replaced by a valid signed entry.
> 
> It worries me that you allow a response with a good status to override a
> previously received revoked status.

In the PKIX model, some revocations can be undone. In particular, a certificate can be revoked with "on hold" status and then un-revoked. Also, in RFC 6960 there is the "extended revoked" status where 

> An attacker who hacked a compromised cert might replay a good status message
> (e.g. stapled into a MITMed connection).

Stapling an older Good response will not work, because CERT_CacheOCSPResponseFromSideChannel looks like this:

    rv = ocsp_GetDecodedVerifiedSingleResponseForID(handle, certID, cert,
						    time, pwArg,
						    encodedResponse,
						    &decodedResponse,
						    &singleResponse);
    if (rv == SECSuccess) {
	rvOcsp = ocsp_SingleResponseCertHasGoodStatus(singleResponse, time);
	/* Cache any valid singleResponse, regardless of status. */
	ocsp_CacheSingleResponse(certID, singleResponse, &certIDWasConsumed);

ocsp_CacheSingleResponse calls ocsp_


> 
> In my understanding of past discussions in NSS meetings, there was consensus
> that status revoked should always stick.

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Updated

•

10 years ago

Depends on: 956838

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 33

•

10 years ago

[Sorry, accidentally submitted my comment too early]

...

CERT_CacheOCSPResponseFromSideChannel calls ocsp_CacheSingleResponse which calls ocsp_CreateOrUpdateCacheEntry which does check that the new OCSP response is newer than the older one (based on thisUpdate). So, stapling a Good OCSP response will not allow an attacker to override a cached Revoked response unless the Good response is newer. And, if the Good response is newer then the certificate should be considered Good, on the presumption that the certificate was revoked either as "on hold" or "extended revoked / unknown certificate".

However, I do think that there is a problem in ocsp_GetOCSPStatusFromNetwork. It calls ocsp_CacheSingleResponse to cache the new response, which means we'll never replace a newer entry with an older one. However, ocsp_GetOCSPStatusFromNetwork doesn't return the newer (cached) status instead of the older (just-fetched) status, which seems like a legit bug. I filed bug 956838 for this.

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Updated

•

10 years ago

No longer depends on: 943952

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 34

•

10 years ago

Comment on attachment 8350855 [details] [diff] [review]
unknown-caching-tests.patch for Gecko [v2]

https://hg.mozilla.org/integration/mozilla-inbound/rev/75e5396d0847

Tests checked into mozilla-inbound. I did s/fetchCount/gFetchCount/g

Attachment #8350855 - Flags: checked-in+

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 35

•

10 years ago

Backed out for the failures in https://tbpl.mozilla.org/php/getParsedLog.php?id=32608408&tree=Mozilla-Inbound

http://hg.mozilla.org/integration/mozilla-inbound/rev/b1270a3aad90

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Updated

•

10 years ago

Flags: needinfo?(brian)

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 36

•

10 years ago

Thanks Wes.

I relanded this with the missing fail-if = os == 'android' in the xpcshell.ini:

https://hg.mozilla.org/integration/mozilla-inbound/rev/d59351efd413

Flags: needinfo?(brian)

Ed Morley [:emorley]

Comment 37

•

10 years ago

https://hg.mozilla.org/mozilla-central/rev/d59351efd413

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 38

•

10 years ago

tests landed on mozilla-aurora, a=test-only:
https://hg.mozilla.org/releases/mozilla-aurora/rev/cd228d859f4b

status-firefox25: affected → wontfix

status-firefox26: affected → wontfix

status-firefox27: affected → fixed

status-firefox28: affected → fixed

status-firefox29: --- → fixed

status-firefox-esr17: affected → wontfix

Ryan VanderMeulen [:RyanVM]

Updated

•

10 years ago

Flags: in-testsuite+

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Comment 39

•

10 years ago

Tests landed in mozilla-beta, a=text-only:
https://hg.mozilla.org/releases/mozilla-beta/rev/49de12be81eb

Ryan VanderMeulen [:RyanVM]

Comment 40

•

10 years ago

Backed out from beta for test failures.
https://hg.mozilla.org/releases/mozilla-beta/rev/88682fb1c94e

u279076

Comment 41

•

10 years ago

Given in-testsuite coverage this fix will not be manually verified by QA. If you believe this warrants extra QA attention please nominate for testing by removing this whiteboard tag and adding the verifyme keyword. Please also provide any details you have that may inform our testing.

Whiteboard: [see links to user reports in comment 4] → [see links to user reports in comment 4][qa-]

Daniel Veditz [:dveditz]

Updated

•

10 years ago

status-firefox-esr24: affected → fixed

tracking-firefox-esr24: --- → 27+

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Assignee

Updated

•

10 years ago

Depends on: 964493

You need to log in before you can comment on or make changes to this bug.