Closed Bug 1040889 Opened 10 years ago Closed 10 years ago

NSSCertDBTrustDomain refreshes cached OCSP server error responses, which can prevent the client from ever trying to get a response again

Categories

(Core :: Security: PSM, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla34
Tracking Status
firefox30 --- unaffected
firefox31 + affected
firefox32 + fixed
firefox33 + fixed
firefox34 --- fixed
firefox-esr24 --- unaffected
firefox-esr31 --- wontfix
b2g-v1.4 --- unaffected
b2g-v2.0 --- fixed
b2g-v2.1 --- fixed

People

(Reporter: keeler, Assigned: keeler)

References

Details

(Keywords: relnote)

Attachments

(2 files, 5 obsolete files)

NSSCertDBTrustDomain skips fetching an OCSP response if it has a cached error indication from the server (e.g. SEC_ERROR_OCSP_SERVER_FAILURE). Unfortunately, it also basically goes, "Oh, I must not have gotten a response. I better remember that," causing that same error to be reinserted into the OCSP cache with a newer timestamp. In theory this error could persist indefinitely and NSSCertDBTrustDomain will never attempt to fetch an OCSP response again for that certificate.
Attached patch patch (obsolete) — Splinter Review
Assignee: nobody → dkeeler
Status: NEW → ASSIGNED
Attachment #8458880 - Flags: review?(brian)
Comment on attachment 8458880 [details] [diff] [review]
patch

Review of attachment 8458880 [details] [diff] [review]:
-----------------------------------------------------------------

r-, mostly because I am concerned about the expired stapled response case, but also because I think at a minimum we need to document what we're trying to do with some comments.

::: security/certverifier/NSSCertDBTrustDomain.cpp
@@ +521,2 @@
>      }
>      PR_SetError(error, 0);

Please add a comment that describes why we're doing this conditionally instead of unconditionally. My understanding is that error != 0 only when we attempted to fetch a response and the fetch failed (a network error, or a response that wasn't a valid Good response, or maybe else). When error == 0 and !response, that means we didn't even try to fetch a response, so we have nothing new to cache.

Isn't it possible for us to reach this point in the code with an expired stapled response? Note that there is some affordance for that a few lines below:

"if (stapledOCSPResponseErrorCode != 0) {"

Do we need to have logic like this:

if (error == 0) {
  error = stapledResponseErrorCode;
}
if (error == 0) {
  error = cachedReponseErrorCode;
} else {
   PRTime timeout = ...;
   if (mOCSPCache.Put(...
}

In particular, maybe the old logic that was reading a possibly-uninitialized result from PR_GetError() was intended to be reading the result of processing a stapled response? But, in the new code, and in my patch in the bug this was cloned from, the PR_SetError (or "rv = Success" in my patch) undoes that.

If it isn't possible to get to the expired stapled OCSP response case from here, then we should clarify that by removing the chunk of code below this chunk that handles that.
Attachment #8458880 - Flags: review?(brian) → review-
I'm going to wait to land bug 1039064 until we have this sorted out, because if David's patch is wrong regarding the expired stapled response case, then my patch is probably wrong too.
Blocks: 1039064
(In reply to Brian Smith (:briansmith, was :bsmith; NEEDINFO? for response) from comment #2)
> Please add a comment that describes why we're doing this conditionally
> instead of unconditionally. My understanding is that error != 0 only when we
> attempted to fetch a response and the fetch failed (a network error, or a
> response that wasn't a valid Good response, or maybe else). When error == 0
> and !response, that means we didn't even try to fetch a response, so we have
> nothing new to cache.

Your understanding is correct. I added some comments and an assertion.

> Isn't it possible for us to reach this point in the code with an expired
> stapled response? Note that there is some affordance for that a few lines
> below:

Yes, but that's properly handled. Basically, if hard-fail is on, we'll report the network/server failure error. If we have a more serious cached response, we'll report that. Finally, if there was an expired stapled response, we'll report that.
Attached patch patch with comments (obsolete) — Splinter Review
Attachment #8458880 - Attachment is obsolete: true
Attachment #8458957 - Flags: review?(brian)
Comment on attachment 8458957 [details] [diff] [review]
patch with comments

Review of attachment 8458957 [details] [diff] [review]:
-----------------------------------------------------------------

It is still unclear to me why we don't cache expired stapled revoked/unknown responses when fetching fails. Perhaps that should be a separate bug though?

If you think my suggested changes are equally readable or more readable than what you did, then please make those changes. Even after you added the new comments and assertion, I still had trouble understanding things.

Please ask cviecco to look over the final version of the patch. He was the reviewer on the initial changes. When such a mistake is found, the author and the reviewer of the code that added the bug should both understand the fix, so that both can learn from it. (i.e. it would have been better to have cviecco be the revieer of this patch.)

::: security/certverifier/NSSCertDBTrustDomain.cpp
@@ +506,5 @@
>  
>      response = DoOCSPRequest(arena.get(), url, request,
>                               OCSPFetchingTypeToTimeoutTime(mOCSPFetching),
>                               mOCSPGetConfig == CertVerifier::ocsp_get_enabled);
>    }

I suggest doing this:

- }
+ } else {
+   PR_ASSERT(cachedResponseErrorCode != 0);
+   PR_SetError(cachedResponseErrorCode, 0);
+ }

@@ +514,4 @@
>    if (!response) {
>      PRErrorCode error = PR_GetError();
> +    // If the current error code is 0, we haven't attempted to fetch a
> +    // response (thus, we have nothing to tell the cache).

If you take my suggestion, then you can PR_ASSERT(error != 0) here, which IMO is much clearer.

Please add a comment about why we don't want to cache an expired REVOKED or UNKNOWN response that was stapled. (I'm not sure why we don't, actually.)

@@ +525,4 @@
>      }
> +    // Note that because we attempted to fetch a response if the cached
> +    // response error code was 0 (see above), error cannot be 0 here.
> +    PR_ASSERT(error != 0);

If you take my suggestion, this assertion would be redundant.
Attachment #8458957 - Flags: review?(brian) → review+
(In reply to Brian Smith (:briansmith, was :bsmith; NEEDINFO? for response) from comment #6)
> Please ask cviecco to look over the final version of the patch.

FWIW, this doesn't need to block checking it in; you can land it without cviecco's r+ if you think it is urgent. It seems like we're probably going to uplift this all the way through beta, so not sure it matters either way, though.
We're probably not going to make 31 on this, so let's relnote that there's a known bug. We'll certainly get it fixed for 32.
Keywords: relnote
(In reply to Brian Smith (:briansmith, was :bsmith; NEEDINFO? for response) from comment #6)
> @@ +514,4 @@
> Please add a comment about why we don't want to cache an expired REVOKED or
> UNKNOWN response that was stapled. (I'm not sure why we don't, actually.)

At the moment, our cache doesn't have the ability to store an entry indicating an expired Revoked/Unknown response (maybe it should). Would it make sense to store a stapled expired Revoked/Unknown response as either just expired or just Revoked/Unknown?
Attached patch patch v2 (obsolete) — Splinter Review
I made the changes Brian suggested minus the comment about caching stapled expired Revoked/Unknown responses - we'll handle that in another bug.
Attachment #8458957 - Attachment is obsolete: true
Attachment #8459710 - Flags: review?(cviecco)
Comment on attachment 8459710 [details] [diff] [review]
patch v2

Review of attachment 8459710 [details] [diff] [review]:
-----------------------------------------------------------------

R+ with addition of extra check

::: security/certverifier/NSSCertDBTrustDomain.cpp
@@ +516,3 @@
>    if (!response) {
>      PRErrorCode error = PR_GetError();
> +    PR_ASSERT(error != 0);

PR_ASSERT becomes null on non-debug builds, change this for another mechanism?
Attachment #8459710 - Flags: review?(cviecco) → review+
Comment on attachment 8459710 [details] [diff] [review]
patch v2

Review of attachment 8459710 [details] [diff] [review]:
-----------------------------------------------------------------

::: security/certverifier/NSSCertDBTrustDomain.cpp
@@ +519,5 @@
> +    // If the current error code is the same as the cached error code, we
> +    // haven't attempted to fetch a response (thus, we have nothing to tell the
> +    // cache). If the converse is true, then we do have something new to tell
> +    // the cache.
> +    if (error != cachedResponseErrorCode) {

When rebasing my patches on this one, I became unsure about this logic. Isn't it possible for DoOCSPRequest to return an error code that is equal to cachedResponseErrorCode? And, in that case, we'd still want to cache it, right?

Perhaps it would be clearer to have a flag "bool sentOCSPRequest" instead?
(In reply to Brian Smith (:briansmith, was :bsmith; NEEDINFO? for response) from comment #12)
> When rebasing my patches on this one, I became unsure about this logic.
> Isn't it possible for DoOCSPRequest to return an error code that is equal to
> cachedResponseErrorCode?

Actually, no, because we only attempt to fetch a response if the following condition is met:

497   if (cachedResponseErrorCode == 0 ||
498       cachedResponseErrorCode == SEC_ERROR_OCSP_UNKNOWN_CERT ||
499       cachedResponseErrorCode == SEC_ERROR_OCSP_OLD_RESPONSE) {

DoOCSPRequest shouldn't ever fail (i.e. return nullptr) having called PR_SetError with the error code as 0, SEC_ERROR_OCSP_UNKNOWN_CERT, or SEC_ERROR_OCSP_OLD_RESPONSE.

I suppose I should add a comment explaining this. Also, do you have an opinion on using debug-only vs. debug-and-release assertions? (see Camilo's review)
(In reply to David Keeler (:keeler) [use needinfo?] from comment #9)
> At the moment, our cache doesn't have the ability to store an entry
> indicating an expired Revoked/Unknown response (maybe it should). Would it
> make sense to store a stapled expired Revoked/Unknown response as either
> just expired or just Revoked/Unknown?

Imagine that you got a stapled Revoked response and then stored it in the cache. Then the response expired. I would that receiving an expired Revoked response (with no better entry already in the cache) would leave the cache in the same state as a Revoked response that was already expired when we put it in the cache. However, I can also accept the logic that we really don't need to put stapled OCSP responses (stapled or otherwise) into the cache, since the server is almost definitely going to be consistently stapling or not. (We're not too worried about what an attacker can do with stapled OCSP responses because only Must-Staple can really offer any reasonable protection from such an attacker; and, Must-Staple requires the server to consistently staple the response so again the cache isn't needed.)

(In reply to David Keeler (:keeler) [use needinfo?] from comment #13)
> (In reply to Brian Smith (:briansmith, was :bsmith; NEEDINFO? for response)
> from comment #12)
> > When rebasing my patches on this one, I became unsure about this logic.
> > Isn't it possible for DoOCSPRequest to return an error code that is equal to
> > cachedResponseErrorCode?
> 
> Actually, no, because we only attempt to fetch a response if the following
> condition is met:
> 
> 497   if (cachedResponseErrorCode == 0 ||
> 498       cachedResponseErrorCode == SEC_ERROR_OCSP_UNKNOWN_CERT ||
> 499       cachedResponseErrorCode == SEC_ERROR_OCSP_OLD_RESPONSE) {
> 
> DoOCSPRequest shouldn't ever fail (i.e. return nullptr) having called
> PR_SetError with the error code as 0, SEC_ERROR_OCSP_UNKNOWN_CERT, or
> SEC_ERROR_OCSP_OLD_RESPONSE.

Are you sure that SEC_ERROR_OCSP_UNKNOWN_CERT and SEC_ERROR_OCSP_OLD_RESPONSE are not possible? Surely the OCSP responder could return an "Unknown" response to the OCSP request, and then DoOCSPRequest would return SEC_ERROR_OCSP_UNKNOWN_CERT, right? (Similarly, pretty sure SEC_ERROR_OCSP_OLD_RESPONSE is possible.)

> I suppose I should add a comment explaining this. Also, do you have an
> opinion on using debug-only vs. debug-and-release assertions? (see Camilo's
> review)

I don't have a strong opinion. I think MOZ_ASSERT is OK but it does require us to assume that everything is calling PR_SetError when it should be. Camilo's suggestion is safer because with his suggestion it wouldn't even be possible for us to accidentally PR_SetError(0, 0); (i.e. Good).
Attached patch patch v3 (obsolete) — Splinter Review
This modifies the patch so that it works better with the changes in bug 1039064.
Attachment #8459710 - Attachment is obsolete: true
Attachment #8461092 - Flags: review?(brian)
Comment on attachment 8461092 [details] [diff] [review]
patch v3

Review of attachment 8461092 [details] [diff] [review]:
-----------------------------------------------------------------

::: security/certverifier/NSSCertDBTrustDomain.cpp
@@ +494,5 @@
>    // Only request a response if we didn't have a cached indication of failure
>    // (don't keep requesting responses from a failing server).
> +  const SECItem* response;
> +  bool attemptedRequest;
> +  PRErrorCode error;

Why not initialize all the values here instead an remove the else branch?
Attachment #8461092 - Flags: review?(cviecco) → review-
Comment on attachment 8461092 [details] [diff] [review]
patch v3

Review of attachment 8461092 [details] [diff] [review]:
-----------------------------------------------------------------

::: security/certverifier/NSSCertDBTrustDomain.cpp
@@ +494,5 @@
>    // Only request a response if we didn't have a cached indication of failure
>    // (don't keep requesting responses from a failing server).
> +  const SECItem* response;
> +  bool attemptedRequest;
> +  PRErrorCode error;

I think it is fine this way. (Anyway, he's just following the example from my patch.)
Attachment #8461092 - Flags: review?(brian) → review+
The idea is that the compiler will tell us if those values are ever used uninitialized.
Comment on attachment 8461092 [details] [diff] [review]
patch v3

Review of attachment 8461092 [details] [diff] [review]:
-----------------------------------------------------------------

I would prefer the inits. But if brian says is OK then it is. 
r+ with initialization of error.
Attachment #8461092 - Flags: review- → review+
Looks like the compiler on windows didn't like that.
https://tbpl.mozilla.org/?tree=Try&rev=0054c9d157db
(In reply to David Keeler (:keeler) [use needinfo?] from comment #21)
> Looks like the compiler on windows didn't like that.
> https://tbpl.mozilla.org/?tree=Try&rev=0054c9d157db

David, I went ahead and checked this version into Mozilla-Inbound so that it can get into Nightly soon so the uplifting process can start sooner, mostly because I am eager to start clearing my backlog of patches to land next week:

https://hg.mozilla.org/integration/mozilla-inbound/rev/1ed822e820d3
https://hg.mozilla.org/mozilla-central/rev/1ed822e820d3
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
Comment on attachment 8461092 [details] [diff] [review]
patch v3

Approval Request Comment
[Feature/regressing bug #]: mozilla::pkix / OCSP fetching/caching
[User impact if declined]: users may not get fresh revocation information
[Describe test coverage new/current, TBPL]: we have sufficient test coverage to catch regressions this may cause
[Risks and why]: low risk - it's unlikely this patch can make things worse
[String/UUID change made/needed]: none
Attachment #8461092 - Flags: approval-mozilla-aurora?
Attached patch patch for beta (obsolete) — Splinter Review
The original patch doesn't apply cleanly to beta, so this is a branch-specific patch. Carrying over r+.

Approval Request Comment
see previous comment
Attachment #8464865 - Flags: review+
Attachment #8464865 - Flags: approval-mozilla-beta?
Comment on attachment 8461092 [details] [diff] [review]
patch v3

Aurora+
Attachment #8461092 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8464865 [details] [diff] [review]
patch for beta

Beta+
Attachment #8464865 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Windows *debug* bustage apparently.
D'oh - I rebased that patch on the wrong version. (What got checked into aurora is correct, however.)
Attachment #8464865 - Attachment is obsolete: true
Attachment #8465593 - Flags: review+
Flags: needinfo?(dkeeler)
For future reference (and so hopefully no one else gets confused), this is what was checked into central and aurora.
Attachment #8461092 - Attachment is obsolete: true
Attachment #8465595 - Flags: review+
This does not meet ESR landing criteria.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: