Closed Bug 542538 Opened 14 years ago Closed 14 years ago

NSS: Add function for recording OCSP stapled replies

Categories

(NSS :: Libraries, enhancement, P1)

enhancement

Tracking

(Not tracked)

RESOLVED FIXED
3.12.6

People

(Reporter: agl, Assigned: agl)

References

Details

Attachments

(3 files, 5 obsolete files)

Attached patch patch v1 (obsolete) — Splinter Review
When an OCSP stapled reply is found, we want some way to insert it into the OCSP cache.
Attachment #423803 - Attachment is patch: true
Attachment #423803 - Attachment mime type: application/octet-stream → text/plain
Attachment #423803 - Flags: review?(wtc)
This bug is highly desirable for NSS 3.12.6 but should not block 3.12.6.

Patch v1 is complicated because I asked Adam to refactor existing code
to allow sharing code with the new function.  I'll attach Adam's original
patch next, which implements the new function independently, without
modifying any existing functions.  It may be easier to review and assess
the risk.
Assignee: nobody → agl
Blocks: 360420, 527659
Severity: normal → enhancement
OS: Linux → All
Priority: -- → P1
Hardware: x86 → All
Target Milestone: --- → 3.12.6
go ahead have put review? for me once the reduced patch is given. We'll keep the v1 patch here to incorporate in 3.12.7. Refactor cleanup is a good thing;).

bob
I will shuffle code around in Adam's patch v1 and
see if I can reduce the amount of diffs.
Attached patch Adam's patch v2 (obsolete) — Splinter Review
I edited Adam's patch v1 to reduce the amount of diffs.

This patch adds a new public function,
CERT_CacheOCSPResponseFromSideChannel.  The new function
is based on CERT_CheckOCSPStatus, except that instead
of getting an OCSP response from network, we already have
an OCSP response in hand.

I asked Adam to refactor ocsp_GetOCSPStatusFromNetwork
so that the bottom half of ocsp_GetOCSPStatusFromNetwork,
when we already have an encoded response, becomes a new
internal function ocsp_CacheEncodedOCSPResponse that can
be shared with CERT_CacheOCSPResponseFromSideChannel.

Bob, should CERT_CacheOCSPResponseFromSideChannel take
a void *pwArg argument?
Attachment #423803 - Attachment is obsolete: true
Attachment #424116 - Flags: review?(rrelyea)
Attachment #423803 - Flags: review?(wtc)
Comment on attachment 424116 [details] [diff] [review]
Adam's patch v2

r-

For just one issue: This patch changes several OCSP errors from 'Network Errors' to 'OCSP Failed result'. That means if someone has ocsp_FetchingFailureIsVerificationFailure() set to false, there are cases that would have been accepted that weren't before. I don't think we can make this kind of semantic change in this patch.

I think we need the dual rv (function succeeded) and rv_ocsp (actual ocsp status) flow into ocsp_CacheEncodedOCSPResponse.

bob
Attachment #424116 - Flags: review?(rrelyea) → review-
Bob, this patch does not change the use of
ocsp_FetchingFailureIsVerificationFailure in
CERT_CheckOCSPStatus.  This patch doesn't
change CERT_CheckOCSPStatus at all.  So
CERT_CheckOCSPStatus still calls
ocsp_FetchingFailureIsVerificationFailure if
ocsp_GetOCSPStatusFromNetwork fails.
Bob, I found that this patch makes ocsp_GetOCSPStatusFromNetwork
return SECFailure when the function successfully gets a response
from the network.  So this patch is definitely r-.

Clearly the refactoring is much harder to verify correct by
code review than I expected.  For 3.12.6 we probably should
try Adam's original patch.  I'll look for it.
Ah, what I saw was that it was returning SECSuccess in cases where it should have returned SECFailure. Of course you are right, it always returns failure.

bob
Bob, I think I got your point in comment 5 now.
I will make another attempt at making the refactoring
approach work.
I will review this patch myself carefully tomorrow.
I made the change that Bob suggested.

Adam, please review the final return statement in
CERT_CacheOCSPResponseFromSideChannel and make sure
it implements your spec.  Alternatively, do you think
CERT_CacheOCSPResponseFromSideChannel should also have
a rv_ocsp output argument?
Attachment #424116 - Attachment is obsolete: true
Attachment #424167 - Flags: superreview?(rrelyea)
Attachment #424167 - Flags: review?(agl)
Comment on attachment 424167 [details] [diff] [review]
patch v3 (checked in)

r+ that takes care of it.

bob
Attachment #424167 - Flags: superreview?(rrelyea) → superreview+
Comment on attachment 424167 [details] [diff] [review]
patch v3 (checked in)

I verified carefully that this patch does not change the
behavior of existing functions CERT_CheckOCSPStatus and
ocsp_GetOCSPStatusFromNetwork, i.e., the refactoring was
successful.  So I checked it in on the NSS trunk (NSS 3.12.6).

Checking in certhigh/ocsp.c;
/cvsroot/mozilla/security/nss/lib/certhigh/ocsp.c,v  <--  ocsp.c
new revision: 1.62; previous revision: 1.61
done
Checking in certhigh/ocsp.h;
/cvsroot/mozilla/security/nss/lib/certhigh/ocsp.h,v  <--  ocsp.h
new revision: 1.15; previous revision: 1.14
done
Checking in nss/nss.def;
/cvsroot/mozilla/security/nss/lib/nss/nss.def,v  <--  nss.def
new revision: 1.205; previous revision: 1.204
done

I will check the new CERT_CacheOCSPResponseFromSideChannel
function tomorrow.  Adam, please focus your review on that,
too.

Bob, do you know if we need a void *pwArg argument for the
new function? CERT_CheckOCSPStatus has that argument.  Adam,
if Bob is not sure, we may want to add the argument just to
be safe.
Attachment #424167 - Attachment description: patch v3 → patch v3 (checked in)
Comment on attachment 424167 [details] [diff] [review]
patch v3 (checked in)

I also patched in OCSP stapling support and tested it against an Apache server (which also has support). Everything seems to be working.
Attachment #424167 - Flags: review?(agl) → review+
Adam: thanks for testing the new function.

To facilitate code review by Bob, I changed
CERT_CacheOCSPResponseFromSideChannel to look like
CERT_CheckOCSPStatus.  Please doublecheck I didn't
make a mistake.

Could you please write a patch to fix the following comments?
1. I forgot to update the comments for ocsp_CacheEncodedOCSPResponse
after I added the rv_ocsp argument.  Please document the
rv_ocsp argument and update the description of the return
value.
2. The use of "positive" and "valid" in your comments is
vague to me.  Please clarify that by using the exact
terms from the OCSP spec or whatever ocsp.c is using
("good"?).

I'm also not sure about some points in your patch.

1. If we already have a cached "revoked" response, why
do we still want to validate the stapled response?

+    if (rv == SECSuccess && rvOcsp == SECSuccess) {
+	/* The cached value is positive. We don't want to waste time validating
+	 * this OCSP response. */
+        CERT_DestroyOCSPCertID(certID);
+        return rv;
+    }

It is unlikely that a revoked cert will become good again.

2. Re: cache poisoning: this issue is not specific to
stapled responses.  Most OCSP responders use the HTTP
protocol (not HTTPS), so they can be attacked, too.
So I don't see why the cache policies for responses
from the network and responses from side channels
should differ.  Do you think our cache policy for
responses from the network is vulnerable to cache poisoning?
Are we caching expired responses from the network?
I will add a void *pwArg agument to
CERT_CacheOCSPResponseFromSideChannel.  The OCSP stapling code
should pass ss->pkcs11PinArg as pwArg to
CERT_CacheOCSPResponseFromSideChannel.  See

http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/security/nss/lib/ssl/sslauth.c&rev=1.16&mark=235,254-255#230

After a lot of code browsing, I found that pwArg (also called
'wincx' in other NSS functions) is ultimately passed to the
PK11_Global.getPass and PK11_Global.verifyPass functions (set
by PK11_SetPasswordFunc and PK11_SetVerifyPasswordFunc).
pwarg and wincx *should* be two very different things.  In some cases, NSS 
stores a value passed in by one of those two names, and later outputs it
to a callback function by that same name.  I hope there are no cases where
we mix the two, passing a wincx value as a pwarg or vice versa.
wtc: will write the requested patch tomorrow.
(In reply to comment #16)
Current NSS code uses pwarg and wincx interchangeably.

SSL_AuthCertificate passes pwarg as the wincx argument to CERT_VerifyCertNow:

http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/security/nss/lib/ssl/sslauth.c&rev=1.16&mark=235,254-255#230

http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/security/nss/lib/certhigh/certvfy.c&rev=1.69&mark=1535-1536#1535

... Lots of intermediate functions with the wincx argument ...

Eventually, PK11_DoPassword passes wincx to PK11_Global.verifyPass
and PK11_Global.verifyPass:

http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/security/nss/lib/pk11wrap/pk11auth.c&rev=1.11&mark=554,578,594#553
RE comment 16:

Actually in NSS proper, wincx is a badly named pwarg. They *are* the same thing. Wincx is the confusing historical name that has not yet been replaced by pwarg. This is the application supplied value that is passed to the password callback function that the application defined. For GUI apps this is usually the window context of the window associated with the current operation (for modality).

pwArg is needed if your function, or some function called by you (including directly) needs to authenticate to a token *and* does not have some other NSS structure which is presumed to be associated with a particular operation and has a pwarg/wincx. SymKeys and SECKey*s have wincx/pwargs (which means the are needed on creation. It appears SSL connections also have them (which makes sense), The usage in SSL clearly shows that pkcs11PinArg is identical to wincx (SSL just has the better name for it). I would be highly surprised if PSM does not pass the window context into SSL_SetPKCS11PinArg().

bob

> It is unlikely that a revoked cert will become good again.

Is this the case the the cert is really revoked, or set to 'HOLD'. There is an idea of 'HOLD' a cert that is thought to maybe be compromised, but we aren't sure. Later if it was found that the key for the cert was secure anyway (it just locked in the car or left at home), you would take the cert off of 'HOLD' state.

I'm not sure how 'HOLD' state is communicated via OCSP, but we wouldn't want that semantic to fail.

bob
(In reply to comment #14)

> 1. I forgot to update the comments for ocsp_CacheEncodedOCSPResponse
> after I added the rv_ocsp argument.  Please document the
> rv_ocsp argument and update the description of the return
> value.
> 2. The use of "positive" and "valid" in your comments is
> vague to me.  Please clarify that by using the exact
> terms from the OCSP spec or whatever ocsp.c is using
> ("good"?).

I have updated the comments here in the attached patch.


> 1. If we already have a cached "revoked" response, why
> do we still want to validate the stapled response?

(referencing #16. RFC 2560 2.2, an 'on hold' certificate is considered to be 'revoked' by OCSP. So there's a trade off here. If we don't respect a previous 'revoked' then an attacker can use OCSP stapling to give us a 'good' OCSP response from before the revocation and we'll accept it - even if we saw the revoked message. This lasts for as long as the timestamp on the OCSP response.

However, if we *do* respect the revoked status then any 'on hold' certificates could be poisoned forever. Esp when we have a disk cache.

> 2. Re: cache poisoning: this issue is not specific to
> stapled responses.  Most OCSP responders use the HTTP
> protocol (not HTTPS), so they can be attacked, too.
> So I don't see why the cache policies for responses
> from the network and responses from side channels
> should differ.  Do you think our cache policy for
> responses from the network is vulnerable to cache poisoning?
> Are we caching expired responses from the network?

1) I assert that it's reasonable to have different policies for responses from side channels and from the network:

Consider a client that validates OCSP stapled replies and certificates before the Finished message. (say, because it wants to check the certificate before sending client-side certs.) In that case, if the attacker can direct the client to make a connection to an attacker controlled TLS server, then he can present arbitrary inputs to side channel function. For a web browser, this is trival (just include an <img> link).

It's significantly harder to MITM attack an HTTP connection (although certainly not impossible)

2) I assert that the side channel function shouldn't cache invalid responses.

Note that ocsp_CacheEncodedOCSPResponse doesn't just add negative cache entries for valid, but revoked, responses, it does it for totally invalid responses too.

I'm not sure that this is a good idea for the network case (although I can see arguments for it), but it certainly isn't good for the side channel case.

Consider a user visiting an attacker controlled web site. An invisible <img> causes the browser to connect to https://attacker.example.com which presents an OCSP stapled response (that doesn't even parse) and a certificate for "www.bank.com". The side channel OCSP response must be parsed before checking the certificate. The certificate will return as invalid, but not before we inserted a negative OCSP cache entry for "www.bank.com".

In the future, the user cannot now visit "www.bank.com" over HTTPS. Maybe the user changes to HTTP in order to "make it work" and is now vulnerable to other attacks.
Attachment #424587 - Flags: review?
Comment on attachment 424357 [details] [diff] [review]
Add the pwArg argument to CERT_CacheOCSPResponseFromSideChannel (checked in)

r+ rrelyea
Attachment #424357 - Flags: superreview?(rrelyea) → superreview+
Attachment #424357 - Attachment description: Add the pwArg argument to CERT_CacheOCSPResponseFromSideChannel → Add the pwArg argument to CERT_CacheOCSPResponseFromSideChannel (checked in)
Comment on attachment 424357 [details] [diff] [review]
Add the pwArg argument to CERT_CacheOCSPResponseFromSideChannel (checked in)

I checked in this patch on the NSS trunk (NSS 3.12.6).

Checking in ocsp.c;
/cvsroot/mozilla/security/nss/lib/certhigh/ocsp.c,v  <--  ocsp.c
new revision: 1.63; previous revision: 1.62
done
Checking in ocsp.h;
/cvsroot/mozilla/security/nss/lib/certhigh/ocsp.h,v  <--  ocsp.h
new revision: 1.16; previous revision: 1.15
done
I edited Adam's patch and checked it in on the NSS trunk
(NSS 3.12.6).

Checking in ocsp.c;
/cvsroot/mozilla/security/nss/lib/certhigh/ocsp.c,v  <--  ocsp.c
new revision: 1.64; previous revision: 1.63
done
Checking in ocsp.h;
/cvsroot/mozilla/security/nss/lib/certhigh/ocsp.h,v  <--  ocsp.h
new revision: 1.17; previous revision: 1.16
done
Attachment #424587 - Attachment is obsolete: true
Attachment #424587 - Flags: review?
Adam, I'm still trying to parse the trade-off you described
in your answer to the first question in my comment 14.  I
don't understand what you meant by "respecting a cached
'revoked' status".

Do you think the ocsp_SetCacheItemResponse function should
compare "producedAt" timestamps of the cache item and the
new response", so that we will replace the cached response
with the new response only if the new response was produced
at a later time than the cached response?  This will prevent
an attacker from inserting a 'good' OCSP response from before
the revocation into our OCSP cache.
wtc: checking the producedAt time is a good idea. With that, we should still validate an OCSP response if we find a cache entry that is 'revoked'. But, if the response is valid and good we should compare the producedAt times and pick the most recent.

I'm reworking bits of the caching now to add disk caching support so I'll do this too.
wtc: Please check over my logic here.
Attachment #424984 - Flags: review?(wtc)
I don't believe that cert_RememberOCSPProcessingFailure will currently work in all cases. It calls ocsp_CreateOrUpdateCacheEntry with single==NULL and that causes ocsp_CreateOrUpdateCacheEntry to set the missingResponseError on any cache entry that it finds.

However, if there is already a cache entry, and that cache entry has a certStatusArena, then setting missingResponseError is not sufficient. It's the non-NULLness of certStatusArena that is checked in ocsp_GetCachedOCSPResponseStatusIfFresh
Attachment #424987 - Flags: review?(wtc)
Adam, do you think the CERT_CacheOCSPResponseFromSideChannel function
that we added is usable without the two remaining patches?

If so, I'll mark this bug fixed and open a new bug for the two remaining
patches.  Otherwise, I'll change the target milestone to 3.12.7.  Thanks.
As I recall, it's usable as is. The two additional patches should land at some point, but if the administration is easier with new bugs then that's not a problem.
I opened bug 554369 and moved the remaining two patches there.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Attachment #424984 - Attachment description: change the caching semantics of side channel responses. → change the caching semantics of side channel responses. (Moved to bug 554369)
Attachment #424984 - Attachment is obsolete: true
Attachment #424984 - Flags: review?(wtc)
Attachment #424987 - Attachment description: fix bug that I noticed in my travels → fix bug that I noticed in my travels (moved to bug 554369)
Attachment #424987 - Attachment is obsolete: true
Attachment #424987 - Flags: review?(wtc)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: