Last Comment Bug 375019 - Cache-enable pkix_OcspChecker_Check
: Cache-enable pkix_OcspChecker_Check
Status: RESOLVED FIXED
PKIX NSS312B2
:
Product: NSS
Classification: Components
Component: Libraries (show other bugs)
: trunk
: All All
: P1 normal (vote)
: 3.12
Assigned To: Kai Engert (:kaie)
:
Mentors:
Depends on: 403795 403853
Blocks:
  Show dependency treegraph
 
Reported: 2007-03-22 22:58 PDT by Kai Engert (:kaie)
Modified: 2008-02-18 02:55 PST (History)
5 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---


Attachments
work in progress v1, offsite backup... (36.82 KB, patch)
2008-01-28 16:17 PST, Kai Engert (:kaie)
no flags Details | Diff | Splinter Review
work in progress v2 (41.52 KB, patch)
2008-01-29 16:40 PST, Kai Engert (:kaie)
no flags Details | Diff | Splinter Review
Patch v3 (51.23 KB, patch)
2008-01-29 21:03 PST, Kai Engert (:kaie)
no flags Details | Diff | Splinter Review
ocsp tracing improvements (5.27 KB, patch)
2008-01-29 21:04 PST, Kai Engert (:kaie)
nelson: review-
Details | Diff | Splinter Review
Patch v4 (52.90 KB, patch)
2008-01-30 11:45 PST, Kai Engert (:kaie)
no flags Details | Diff | Splinter Review
Patch v5 (45.36 KB, patch)
2008-01-30 13:13 PST, Kai Engert (:kaie)
no flags Details | Diff | Splinter Review
Patch v6 (52.89 KB, patch)
2008-01-30 21:26 PST, Kai Engert (:kaie)
no flags Details | Diff | Splinter Review
Patch v7 (61.27 KB, patch)
2008-01-31 11:44 PST, Kai Engert (:kaie)
no flags Details | Diff | Splinter Review
Patch v8 (64.65 KB, patch)
2008-01-31 13:32 PST, Kai Engert (:kaie)
nelson: review-
Details | Diff | Splinter Review
Patch v9 [checked in] (67.78 KB, patch)
2008-02-05 15:05 PST, Kai Engert (:kaie)
nelson: review+
alvolkov.bgs: review-
Details | Diff | Splinter Review
Incremental Patch v10 [checked in] (5.37 KB, patch)
2008-02-07 10:21 PST, Kai Engert (:kaie)
alvolkov.bgs: review+
Details | Diff | Splinter Review
ocsp tracing v11 [checked in] (6.00 KB, patch)
2008-02-07 15:43 PST, Kai Engert (:kaie)
nelson: review+
Details | Diff | Splinter Review
tracing diff diff for v1 (4.91 KB, patch)
2008-02-07 15:45 PST, Kai Engert (:kaie)
no flags Details | Diff | Splinter Review
Potential leak of certRequest (1.46 KB, patch)
2008-02-13 18:00 PST, Alexei Volkov
kaie: review+
Details | Diff | Splinter Review

Description Kai Engert (:kaie) 2007-03-22 22:58:53 PDT
When we worked on the OCSP cache in bug 205406, Nelson asked about caching OCSP and libPKIX.

Quoting my own comment from bug 205406 comment 41:
"My proposal for OCSP-cache-enabling libpkix is:
Change pkix_OcspChecker_Check to match the behaviour of CERT_CheckOCSPStatus as
its being done in patch v6.
In other words, add the cache lookup to the implementation of
pkix_OcspChecker_Check."

This will require that we make the internal functions ocsp_GetCachedOCSPResponseStatusIfFresh and ocsp_GetOCSPStatusFromNetwork functions available to libPKIX. (I don't know yet whether libPKIX operates on NSS public API only.)
Comment 1 Alexei Volkov 2007-05-17 11:48:41 PDT
No need to make them available publicly. libPKIX is linked a part of nss shared library.
Comment 2 Kai Engert (:kaie) 2007-11-14 16:23:43 PST
pkix_OcspChecker_Check can be parameterized by 
  PKIX_ProcessingParams *procParams
it's used (only) for the call to pkix_pl_OcspResponse_VerifySignature.

When we implemented the cache, we said, we should avoid repetitive verification and only store the verified result.

But what about applications where different procParams are being used?

Should the OCSP cache simply ignore it?
Is it sufficient to use the provided procParams when verifying a response, and reuse the cached result, even when the caller requested different procParams?
Comment 3 Kai Engert (:kaie) 2007-11-14 16:24:39 PST
CC'ing more people, thoughts on my comment 2 are welcome.
Comment 4 Kai Engert (:kaie) 2007-11-14 16:32:00 PST
In "classic" NSS (non-libpkix), life is simple. We have a global ocsp responder configuration. All cache objects were checked with the same responder rule. When the global config changes, we clear the cache.

With libpkix, life is more complicated.
A call to pkix_OcspChecker_Check provides a checkerObject, which specify a responder and a verification function.

Should the cache objects depend on the responder+verification function (and possibly the procParams, too, see comment 2), and therefore. This would mean, all these params become part of the cache entry key...?
Comment 5 Kai Engert (:kaie) 2007-11-14 16:42:14 PST
Should we simply define:
It's an application responsibility to clear the OCSP cache whenever it switches its use of checkers/responders/params?
Comment 6 Kai Engert (:kaie) 2007-11-14 18:19:45 PST
I think PKIX based OCSP should work, before we try to implement caching...
Making dependent on bug 403853.

Comment 7 Nelson Bolyard (seldom reads bugmail) 2007-11-15 00:06:07 PST
While libPKIX allows multiple different checker objects to be used, it's
not clear that public NSS interfaces will do so.  IIRC, the new 
CERT_PKIXVerifyCert merely offers the caller a boolean: do or don't use OCSP.
I would surely like to see a simple model for managing this. 
Kai's idea in comment 5 appeals to me.
Comment 8 Kai Engert (:kaie) 2007-12-18 12:15:59 PST
I've continued to work on this bug, and I conducted some more tests, always had environment variable NSS_ENABLE_PKIX_VERIFY=1 set.

In my testing, libPKIX's OCSP function pkix_OcspChecker_Check is reached (only) when all of the following is true:
- we are calling CERT_PKIXVerifyCert
- the caller requested CERT_REV_FLAG_OCSP (full chain)

On the other hand:
- When the caller of CERT_PKIXVerifyCert requests CERT_REV_FLAG_OCSP_LEAF_ONLY, then pkix_OcspChecker_Check is never reached.
- When classic NSS cert verification functions get called, then pkix_OcspChecker_Check is never reached.

I'm a bit surprised, but maybe you'll tell me that my observation is expected.

I had assumed NSS_ENABLE_PKIX_VERIFY=1 would cause all verification functions go through libpkix.
But it appears, the only place where this is checked is in chain building.


If CERT_PKIXVerifyCert requests CERT_REV_FLAG_OCSP_LEAF_ONLY, the code will not call any OCSP checker! Neither the classic code nor the libpkix code!
Comment 9 Nelson Bolyard (seldom reads bugmail) 2007-12-18 14:00:43 PST
We know that the process of validating a leaf cert involves (at least) two 
classes of checks, which are:
a) does the leaf cert meet the application's criteria for acceptability, and
b) does the leaf cert validate when checked against its issuer's cert (a recursive problem) and its issuer's CRL or OCSP response.  

The application's criteria might include (among others)
- the subject (alternate) name(s) in the cert are as expected by the app, and 
- its key usage and extended key usage attributes are acceptable to the app, 
- policy OID is acceptable to the app

The checks of the cert against its issuer cert would include (among others)
- issued cert's issuer name matches issuer cert's subject name
- issued cert's issuer Key ID matches issuer cert's subject key ID
- issued cert's signature verifies with issuer cert's public key
- issued cert's policy OIDs are allowed by issuer cert's policy extensions
  (when explicit policy checking is being performed)

The classic NSS Cert verification functions were structured so that an outer
function (CERT_VerifyCert) would check the first group of criteria, and then
a second function (CERT_VerifyCertChain) would verify the second set of criteria.  

The libPKIX functions are believed to only perform the second class of tests,
and thus libPKIX serves to act as a replacement for CERT_VerifyCertChain but
not as a replacement for CERT_VerifyCert or CERT_VerifyCertificate.  

But we've run into a few surprises.  It seems that libPKIX does not perform certain checks of that second class on the leaf cert, checks that we would expect it to perform, because they involve checking the issued cert against
information supplied by the issuer, not by the application.  

Previously we found that the leaf cert's policy extension policy OIDs were 
not being checked by libPKIX. Now, it seems that revocation checking is not being performed on the leaf cert by libPKIX, either.  

Frankly, this makes me wonder these questions: 
- Is libPKIX checking ANY of the second class of checks on the leaf cert? 
- Is libPKIX checking the signature in the leaf cert?
- Is libPKIX checking the issuer name against the issuer's subject name?
- Is libPKIX checking the Authority key ID against the issuer's subject key ID?

And, if not, is this because the path chain validation algorithm in RFC 3280
does not specify that it should do so?  Or is this merely an implementation bug, a failure to correctly implement the algorithm in RFC 3280.  And finally,
should libPKIX do these tests, even if RFC 3280 does not specify it?  
(IMO, the answer to that last question is emphatic: yes!)

Maybe we need a separate P1 bug to ensure the right answers to those questions.
Comment 10 Nelson Bolyard (seldom reads bugmail) 2007-12-18 14:29:54 PST
Looking at function CERT_VerifyCert, we see that revocation checking on 
the leaf cert is done in that function, after the call to CERT_VerifyCertChain
has returned successfully.  From that fact, we can infer that 
CERT_PKIXVerifyCert should likewise do the revocation checking on the leaf 
cert after libPKIX has been called to do the equivalent of CERT_VerifyCertChain.
Comment 11 Kai Engert (:kaie) 2007-12-18 14:54:49 PST
(In reply to comment #10)
> ... CERT_PKIXVerifyCert should likewise do the revocation checking 
> on the leaf cert ...

filed bug 408903.
Comment 12 Kai Engert (:kaie) 2008-01-28 16:17:47 PST
Created attachment 299872 [details] [diff] [review]
work in progress v1, offsite backup...
Comment 13 Kai Engert (:kaie) 2008-01-29 16:40:10 PST
Created attachment 300183 [details] [diff] [review]
work in progress v2
Comment 14 Kai Engert (:kaie) 2008-01-29 21:03:09 PST
Created attachment 300257 [details] [diff] [review]
Patch v3

First patch that seems to work :-)
Comment 15 Kai Engert (:kaie) 2008-01-29 21:04:38 PST
Created attachment 300259 [details] [diff] [review]
ocsp tracing improvements

This separate patch greatly helped me in testing/debugging the new code.
Comment 16 Kai Engert (:kaie) 2008-01-30 11:45:45 PST
Created attachment 300435 [details] [diff] [review]
Patch v4

In the previous patch I exported 3 functions.
I figured that's not necessary, I no longer export them but now declare them in ocspi.h 

I think pkix_OcspChecker_Check should use a failure result code by default, to ensure all failure exit paths will indicate the failure... Right now it sets resultCode to 0 which means "good".
For now I've changed that to SEC_ERROR_REVOKED_CERTIFICATE_OCSP, which is probably too strict, but at least this change helped me to find bug 414911 :-/

I added another change, which we might do in a separate bug. It's easy to move it to a separate patch. I was wondering about the behavior of pkix_OcspChecker_Check when the cert lacks a URI. The current code says "passes ocsp" in this scenario. For testing purposes I added a flag to the plContext object which may carry the callers revocation requirements. (Now I realize this might be better put into procParams).
Comment 17 Kai Engert (:kaie) 2008-01-30 12:43:43 PST
(In reply to comment #16)
> I added another change, which we might do in a separate bug. It's easy to move
> it to a separate patch. I was wondering about the behavior of
> pkix_OcspChecker_Check when the cert lacks a URI. The current code says "passes
> ocsp" in this scenario. For testing purposes I added a flag to the plContext
> object which may carry the callers revocation requirements. (Now I realize this
> might be better put into procParams). 

I've now moved this portion into a separate patch attached to bug 413997 (patch v1). I'll remove a new patch to this bug without this portion.
Comment 18 Kai Engert (:kaie) 2008-01-30 13:13:00 PST
Created attachment 300467 [details] [diff] [review]
Patch v5

I think this is now ready for review. My testing shows that OCSP requests triggered from libpkix are now passing through the cache. Who wants to have a look?

Here are some comments that might help you reviewing it:

- calculating the CERTOCSPCertID from a certificate looks like an expensive operation, so I'm trying hard to calculate it only once per OCSP request

- We can't use type CERTOCSPCertID directly inside pkix_OcspChecker_Check, so I introduced a wrapper type PKIX_PL_OCSPCertID in the nss portability layer

- Wrapper type PKIX_PL_OCSPCertID starts to own its CERTOCSPCertID*, but it might finally transfer ownership to the cache.

- updating the OCSP cache with results from the OCSP response requires access to CERTOCSPSingleResponse, which isn't easily accessible, so it was easiest to combine status-retrieval with cache-updating. This is why I created the combined function cert_GetOCSPStatusAndUpdateCache.

- in classic NSS code there was a single function that controlled most of the network and cache updating activity. But with libpkix the logic is distributed over multiple functions. This is why I introduced cert_RememberOCSPProcessingFailure and call it whenever OCSP processing fails.

- as said earlier, in function pkix_OcspChecker_Check, we must find a good default error code for the "resultCode" variable. We should have a better idea what to do while we work on 414911.

- Please look at the FIXME comment in the patch, I'd appreciate suggestions
Comment 19 Kai Engert (:kaie) 2008-01-30 13:16:13 PST
(In reply to comment #18)
> - calculating the CERTOCSPCertID from a certificate looks like an expensive
> operation, so I'm trying hard to calculate it only once per OCSP request

Actually, there is room for improvement, pkix_pl_OcspRequest_Create still calls CERT_CreateOCSPRequest (which calculates the cert id internally).

I should change that to pass the certid (already available) to pkix_pl_OcspRequest_Create and have it call cert_CreateSingleCertOCSPRequest (which accepts a cert id).
Comment 20 Kai Engert (:kaie) 2008-01-30 21:05:43 PST
While I tested this code, I ran into a failure, and I had drawn in incorrect conclusion. The result was bug 414911, which I've now resolved as invalid.

However, bug 414911 comment 3 and 4 explain the need for further improvements to the ocsp caching code.

I'll attach an updated patch.
Comment 21 Kai Engert (:kaie) 2008-01-30 21:26:03 PST
Created attachment 300589 [details] [diff] [review]
Patch v6

As a reminder, NSS has a global configuration setting called "OCSP failure mode". It defines whether a failure in processing (unable to received response) will be treated as a hard failure (the classic mode). Firefox 3 will use the soft failure mode by default.

While this soft failure mode is great for DV SSL, it's a problem for EV SSL.

When testing a cert for EV, we must be able to detect an OCSP failure, even when running in soft failure mode.

I enhanced the internal OCSP APIs to make the additional information available. In particular, function ocsp_GetCachedOCSPResponseStatusIfFresh will return two separate status values:
- rv_basedOnOcspFailureMode
- real_rv_ocsp

The OCSP APIs in classic NSS can continue to use rv_basedOnOcspFailureMode, while in libpkix we can look at real_rv_ocsp.

(For example, if an OCSP server can not be contacted and we are running in soft failure mode, rv_basedOnOcspFailureMode will say "success" and real_rv_ocsp will say "failure".)


For now I have changed the libpkix OCSP code to look at real_rv_ocsp, because that's what we require now for EV.

But libpkix OCSP code should not ignore whether we are running in soft or hard failure mode.

In order to be correct, the following needs to happen: Introduce a flag to plContext or ProcParams that define the desired OCSP failure mode (overriding the globally defined mode).

This will allow function CERT_PKIXVerifyCert to look at the revocation flags provided as input parameters, and set the context flags accordingly.


In addition to the global result OCSP result (success or failure) I've added the error to the cache. This seems necessary, as libpkix code that calls into OCSP is interested in the exact reason of OCSP failure.

So, whenever a cache entry gets created, but no OCSP response is available (in order to remember the failure), the cache entry will remember the current error code obtained used PORT_GetError().

libpkix obtains the error code when reading OCSP cache entries, so libpkix behavior can be identical, whether it's a real OCSP failure or a cached OCSP failure.
Comment 22 Kai Engert (:kaie) 2008-01-30 21:31:18 PST
I tested this patch using the following change in pkix_ocspchecker.c

                 /* No uri to check is considered passing! */
                 if (uriFound == PKIX_FALSE) {
                         /* no caching for certs lacking URI */
-                        passed = PKIX_TRUE;
-                        resultCode = 0;
+                        passed = PKIX_FALSE;
+                        resultCode = PORT_GetError();
                         goto cleanup;
                 }

This change does not make sense, it's just for testing.
This forces an error from the OCSP function when a cert lacks an OCSP AIA.
With this change I no longer get EV UI on paypal, because one of the intermediates gets rejected.

This is the result that I had initially expected when I filed (invalid) bug 414911, and with patch v6 we now get the expected behavior.
Comment 23 Kai Engert (:kaie) 2008-01-31 11:44:42 PST
Created attachment 300702 [details] [diff] [review]
Patch v7
Comment 24 Kai Engert (:kaie) 2008-01-31 11:55:44 PST
Comment on attachment 300702 [details] [diff] [review]
Patch v7

(In reply to comment #19)
> Actually, there is room for improvement, pkix_pl_OcspRequest_Create still calls
> CERT_CreateOCSPRequest (which calculates the cert id internally).
> 
> I should change that to pass the certid (already available) to
> pkix_pl_OcspRequest_Create and have it call cert_CreateSingleCertOCSPRequest
> (which accepts a cert id).


I fixed this in Patch v7.

CERT_CreateOCSPRequest currently ignored parameter signingCert.
I want to replace that with a call to cert_CreateSingleCertOCSPRequest.
For completeness, I added that parameter to the latter function, even though it will ignore it, too (because the pkix ocsp code sets the parameter).

This patch simplifies PKIX_PL_OcspRequestStruct and pkix_pl_OcspRequest_Create. Because we pass in a certificate ID and use cert_CreateSingleCertOCSPRequest, it no longer needs to deal with a cert list.
Comment 25 Kai Engert (:kaie) 2008-01-31 13:30:22 PST
Nelson, thanks for some first review comments you gave me through a personal message.

- You said some lines are longer than 80 chars. I fixed most of them, however, I propose to keep the long lines in file pkix_errorstrings.h, because more than 50% of all lines in that file are long lines. If you want this fixed, let me know, and I will file a separate bug to get this cleaned up...

- I've sorted the enums I found.

- I think I've changed all changed name_name_name to camelCaseNames.

- moved }else{ to single lines

- you said I'm adding functions to .h files without adding a comment.

Well, for libpkix this seems to be common style in the existing OCSP files. Neither pkix_pl_ocsprequest.h nor pkix_pl_ocspresponse.h use comments. Instead they point to the .c file. I've now added more comments to pkix_pl_ocspcertid.c

For file ocspi.h, this is for internal functions.
I've added documentation for the new functions I have created (cert_GetOCSPStatusAndUpdateCache and cert_RememberOCSPProcessingFailure).

I have not added documentation for the 2 existing functions that I simply make available, in order to allow the libpkix portability layer to call them. I think we don't require documentation for all internal functions, do we?
Comment 26 Kai Engert (:kaie) 2008-01-31 13:32:23 PST
Created attachment 300726 [details] [diff] [review]
Patch v8
Comment 27 Nelson Bolyard (seldom reads bugmail) 2008-01-31 23:26:54 PST
Kai, pkix_errorstrings.h is definitely an exception to the 80-column rule.

Some libPKIX functions have block comments in .h files and others in .c files.
All functions should have such block comments in one place or the other.
I believe there is some rule by which the past libPKIX authors decided whether
to put the comment in a .h or in the .c, but I don't know what that rule is.
I think it has something to do with whether the function is considered private
to libPKIX (for use only from within libPKIX itself) or is considered callable
from outside of libPKIX (e.g. from within CERT_PKIXVerifyCert). 

Alexei, can you enlighten us about that rule?
Comment 28 Nelson Bolyard (seldom reads bugmail) 2008-02-04 18:51:57 PST
Comment on attachment 300726 [details] [diff] [review]
Patch v8

R-.  In no particular order:

1. This patch adds an argument named signerCert to cert_CreateSingleCertOCSPRequest, and requires that the caller must 
always pass NULL as the value for that argument.  This is a good change,
but it took me a while to become convinced of that.

The patch adds a comment telling the reader to "See CERT_CreateOCSPRequest for an explanation why signerCert gets ignored."  But that comment talks about why that function returns without setting an error code.  The logic seems to think that since PSM will not use this feature of the function, the function can get away with breaking one of NSS's fundamental rules about error code setting.

So, please make these changes:

a) change both cert_CreateSingleCertOCSPRequest and CERT_CreateOCSPRequest
to set a meaningful error code if the argument is not NULL.  We have an 
error code that means "feature not implemented".  That is the right error
code to use in this case (don't use INVALID_ARGS for this).  And fix the comment in CERT_CreateOCSPRequest so that it no longer makes an excuse for not setting an error code.

b) replace the comment in cert_CreateSingleCertOCSPRequest with one that
explains that this feature may be implemented later.

2.  The patch makes this change:

> -    if (!certID || !rv_ocsp) {
> +    if (!certID || !missingResponseError
> +        || !rvBasedOnOcspFailureMode | !rvOcspReal) {
Should that be "||" instead of "|" ?    ^ here  
(I think it should)

3. This patch adds two new arguments, named rvBasedOnOcspFailureMode and rvOcspReal, to function ocsp_GetCachedOCSPResponseStatusIfFresh.  
Those names don't mean much to me.  I think we can find better names.  If the second value is "real", then is the first value "imaginary"?  Maybe 
the "real" value should not be a SECStatus, but rather some enum type that
lists the possible cert statuses that may come from an OCSP response, 
something like CERTCRLEntryReasonCode (but for OCSP).  At a minimum, we need a block comment to explain what they mean in detail.

I suspect that one of them tells if the attempt to fetch an OCSP response
itself succeeded or failed, and the other tells if the cert is OK or 
revoked.  But I'm not sure. Maybe they should be named "OperationStatus"
and "certValidity"?  Maintainers will not keep the code correct 
if they cannot figure out what it is supposed to do. 

Related to that observation, this comment

> /* rvOcspReal still indicates the failure */

tells me nothing.  Indicates WHAT failure?  Maybe the comment should 
explain why it's returning SECSuccess in the presence of a failure.

4. Why define a variable that we ignore?

> -    SECStatus rv_ocsp;
> +    SECStatus rvOcspReal; /* we ignore this */

Maybe we should pass NULL, rather than the address of this dummy variable,
to the function ocsp_GetCachedOCSPResponseStatusIfFresh and the function 
should understand a NULL pointer to mean "caller doesn't care for this 
result", rather than it being an error.

5. Please add a comment explaining WHY CERT_GetOCSPStatusForCertID does
not update the cache, and why a caller might want to do an OCSP request
and not cache the results.  It's not immediately obvious to me.  I doubt
it will be apparent to future maintainers of this code if we don't tell
them.

6. Comments on cert_GetOCSPStatusAndUpdateCache.  
6a) That name is really confusing.  Maybe this should be named 
"GetOCSPStatusFromResponse" or "InterpretOCSPResponseForStatus".  This 
function doesn't "get" the status in the sense of initiating a status 
request.  Rather it interprets the status in a previously fetched response.  

6b) It's not clear if the argument signerCert is input or output.  
If it is presently required to be NULL, that must be stated in the comment.

6c) The argument named updateCache is redundant.  Please eliminate it.
Use the test (certIDWasConsumed != NULL) rather than (updateCache != 0).
The comments should state that when certIDWasConsumed is non-NULL, 
the function will attempt to update the cache with the value.

6d) Please initialize the local variable named single.

7. The contents of file pkix_errorstrings.h are still not is sort -f order.
The symbol(s) named OCSPCHECKER* should follow those named OCSPCERTID*

8. We're trying to get rid of certain macros that obscure the meaning of
code in libPKIX without providing any real benefit.  So, 
8a) we ask that no new code ever use the macros PKIX_PL_NSSCALL or
PKIX_PL_NSSCALLRV, and 
8b) we ask that when modifying functions that use those macros, those
macros should be removed and replaced by their expansions.  E.g. 

An invocation such as 
  PKIX_PL_NSSCALLRV(a,b,c,d);
should be replaced by 
  b = c d;
(discarding the first argument), and one such as
  PKIX_PL_NSSCALL(a,b,c);
should be replaced by 
  b c;
(also discarding the first argument).

9. I'm going to ask Alexei to advise regarding the "FIXME" comment in 
pkix_OcspChecker_Check.

10. This patch eliminates blocks of code in pkix_pl_OcspRequest_Create and 
in pkix_pl_OcspRequest_Destroy that tried to deal with ownership issues
related to certs and certlists.  The fact that you got rid of both blocks
suggests that there si some new solution to the problem that obviates that
old code.  Please provide a short description of the new method by which 
this new code solves that old problem.  Please add it as a comment to this
bug.  It may also be appropriate to add comments to the code, in place of
the code being removed, briefly explaining the fix there, also.
Comment 29 Kai Engert (:kaie) 2008-02-05 14:53:45 PST
> a) change both cert_CreateSingleCertOCSPRequest and CERT_CreateOCSPRequest
> to set a meaningful error code if the argument is not NULL.  We have an 
> error code that means "feature not implemented".  

I found PR_NOT_IMPLEMENTED_ERROR


> That is the right error
> code to use in this case (don't use INVALID_ARGS for this).  And fix the
> comment in CERT_CreateOCSPRequest so that it no longer makes an excuse for not
> setting an error code.

done


> b) replace the comment in cert_CreateSingleCertOCSPRequest with one that
> explains that this feature may be implemented later.

done


> 2.  The patch makes this change:
> 
> > -    if (!certID || !rv_ocsp) {
> > +    if (!certID || !missingResponseError
> > +        || !rvBasedOnOcspFailureMode | !rvOcspReal) {
> Should that be "||" instead of "|" ?    ^ here  
> (I think it should)

Oops, thanks Nelson, typo fixed.


> 3. This patch adds two new arguments, named rvBasedOnOcspFailureMode and
> rvOcspReal, to function ocsp_GetCachedOCSPResponseStatusIfFresh.  
> Those names don't mean much to me.  I think we can find better names.  If the
> second value is "real", then is the first value "imaginary"?  

(This section is now mostly obsolete, because I switched to using a 
single rvOcsp.)

What do you think about rvOriginalOcspStatus and rvDerivedFromOcspFailureMode?

The idea is, which I have now explained in a comment,
some consumers might be interested in the raw original result from OCSP
(e.g. CERT_PKIXVerifyCert), other comsumers want the final result after
considering the failure mode (e.g. the classic NSS code).

In the new patch I've added this comment:

/* Return value SECFailure means: not found or not fresh.
 * On SECSuccess, the out parameters contain the OCSP status.
 * rvOriginalOcspStatus contains the status irrelevant of 
 *   the global OcspFailureMode setting.
 * rvDerivedFromOcspFailureMode contains the status
 *   based on the global OcspFailureMode setting.
 * If the cached attempt to obtain OCSP information had resulted
 * in a failure, missingResponseError shows the error code of
 * that failure.
 */


> I suspect that one of them tells if the attempt to fetch an OCSP response
> itself succeeded or failed, and the other tells if the cert is OK or 
> revoked. 

No. I think the above comments made that clear.


> Related to that observation, this comment
> > /* rvOcspReal still indicates the failure */
> tells me nothing.  Indicates WHAT failure?  Maybe the comment should 
> explain why it's returning SECSuccess in the presence of a failure.

(This section is now mostly obsolete, because I switched to using a 
single rvOcsp.)

I've changed that to
    if (OCSP_Global.ocspFailureMode == 
            ocspMode_FailureIsNotAVerificationFailure) {
        rv = SECSuccess;
        *rvDerivedFromOcspFailureMode = SECSuccess;
        /* rvOriginalOcspStatus still indicates the failure */
    }

As said in the comment above, "return SECSuccess" means "found cache entry".
If no response is cached, and "failure mode" is set to "soft", 
then rvDerivedFromOcspFailureMode indicates "success".
However, for consumers that don't want data hidden by the global failure mode,
the real OCSP result is provided in function rvOriginalOcspStatus.


> 4. Why define a variable that we ignore?

Because the implementation provides information as an out parameter,
which the caller isn't interested in.


> > -    SECStatus rv_ocsp;
> > +    SECStatus rvOcspReal; /* we ignore this */
> 
> Maybe we should pass NULL, rather than the address of this dummy variable,
> to the function ocsp_GetCachedOCSPResponseStatusIfFresh and the function 
> should understand a NULL pointer to mean "caller doesn't care for this 
> result", rather than it being an error.

Actually, now that I've been thinking about this more, 
I assume a caller would like to obtain 
EITHER rvOriginalOcspStatus OR rvDerivedFromOcspFailureMode.

It seems unlikely that a caller would ever be interested in both values,
so there is an alternative to your proposal:

Use an additional input parameter, which says 
  "should the global failure mode be ignored or considered?"
Let's remove one of the out parameters, use a single rvOcsp out parameter
(as the code in cvs currently does).

I've now made that change, which obsoletes some of my above explanations.


> 5. Please add a comment explaining WHY CERT_GetOCSPStatusForCertID does
> not update the cache, and why a caller might want to do an OCSP request
> and not cache the results.  It's not immediately obvious to me.  I doubt
> it will be apparent to future maintainers of this code if we don't tell
> them.

I've added a better comment.


> 6. Comments on cert_GetOCSPStatusAndUpdateCache.  
> 6a) That name is really confusing.  Maybe this should be named 
> "GetOCSPStatusFromResponse" or "InterpretOCSPResponseForStatus".  This 
> function doesn't "get" the status in the sense of initiating a status 
> request.  Rather it interprets the status in a previously fetched response.  

Please note this deficiency is present in the old external
API function CERT_GetOCSPStatusForCertID, too.
I had chosen a name for this "and update cache" helper function
that is similar to the existing function, to indicate that both 
functions are related...

Maybe we simply use cert_ProcessOCSPResponse?
This name indicates it's necessary to call the function for processing
the results available in the OCSP response (which currently includes updating 
the cache).
It's common for NSS function to return a status, so it seems ok to omit 
the "GetStatus" portion.


> 6b) It's not clear if the argument signerCert is input or output.  

Well, parameter signerCert is used all over the OCSP code, and it's never
being documented anywhere in the implementation code whether it's in or out...
However, it's already documented for public CERT_GetOCSPStatusForCertID.

I think it can be clearly derived that cert_GetOCSPStatusAndUpdateCache's
parameter signerCert matches 1:1 what external API CERT_GetOCSPStatusForCertID
uses.

Also, I think an out parameter of type CERTCertificate would use **, no?
It seems obvious to me that CERTCertificate* is an input parameter.

Anyway, I added this comment:
/*
 * The first 5 parameters match the definition of CERT_GetOCSPStatusForCertID.
 */


> If it is presently required to be NULL, that must be stated in the comment.

This function does not have any requirements on signerCert being 
NULL or non-NULL. It simply passes that parameter on to other functions.
I tried to follow the chain of function calls, but gave up after 4 levels...
This is old stuff.


> 6c) The argument named updateCache is redundant.  Please eliminate it.
> Use the test (certIDWasConsumed != NULL) rather than (updateCache != 0).

I have to agree.

(Although I had initially planned to offer the caller the ability to control
whether to update the cache or not, I now agree, there is no need to 
offer that.)


> The comments should state that when certIDWasConsumed is non-NULL, 
> the function will attempt to update the cache with the value.

This might change in the future.
I think we should say, if certIDWasConsumed is NULL, the function
might attempt to produce a deep copy of certID for updating the cache.
(to be implemented later).


> 6d) Please initialize the local variable named single.

good idea, done.


> 7. The contents of file pkix_errorstrings.h are still not is sort -f order.
> The symbol(s) named OCSPCHECKER* should follow those named OCSPCERTID*

done now, hopefully :-)


> 8. We're trying to get rid of certain macros that obscure the meaning of
> code in libPKIX without providing any real benefit.  

I'm really really glad to hear that. Done.


> 9. I'm going to ask Alexei to advise regarding the "FIXME" comment in 
> pkix_OcspChecker_Check.

Thanks, after having talked to Alexei I updated this code block with his help.


> 10. This patch eliminates blocks of code in pkix_pl_OcspRequest_Create and 
> in pkix_pl_OcspRequest_Destroy that tried to deal with ownership issues
> related to certs and certlists.  The fact that you got rid of both blocks
> suggests that there si some new solution to the problem that obviates that
> old code.  Please provide a short description of the new method by which 
> this new code solves that old problem.  Please add it as a comment to this
> bug.  It may also be appropriate to add comments to the code, in place of
> the code being removed, briefly explaining the fix there, also.

It's really simple, and I think no comments are necessary.

The new OCSP code no longer requires to deal with certificate lists at all.

The old code used NSS' external API to create an OCSP request based on
a list of certificates. Therefore, the old code created a certificate
list from the single certificate of interest.

The new code uses an internal API that takes a single certificate as input.
Comment 30 Kai Engert (:kaie) 2008-02-05 15:05:10 PST
Created attachment 301590 [details] [diff] [review]
Patch v9 [checked in]

Addressed Nelson's comments.
Comment 31 Nelson Bolyard (seldom reads bugmail) 2008-02-05 20:43:54 PST
Comment on attachment 301590 [details] [diff] [review]
Patch v9 [checked in]

Thanks, Kai, for addressing all my requests.  r=nelson
Comment 32 Kai Engert (:kaie) 2008-02-06 09:28:50 PST
Checked in. In addition to the patch, I removed an assignment to a "certList=NULL" assignment that Alexei recently checked in (the variable has gone away).

Marking fixed.

Checking in certhigh/ocsp.c;
/cvsroot/mozilla/security/nss/lib/certhigh/ocsp.c,v  <--  ocsp.c
new revision: 1.47; previous revision: 1.46
done
Checking in certhigh/ocspi.h;
/cvsroot/mozilla/security/nss/lib/certhigh/ocspi.h,v  <--  ocspi.h
new revision: 1.9; previous revision: 1.8
done
Checking in libpkix/include/pkix_errorstrings.h;
/cvsroot/mozilla/security/nss/lib/libpkix/include/pkix_errorstrings.h,v  <--  pkix_errorstrings.h
new revision: 1.16; previous revision: 1.15
done
Checking in libpkix/include/pkix_pl_pki.h;
/cvsroot/mozilla/security/nss/lib/libpkix/include/pkix_pl_pki.h,v  <--  pkix_pl_pki.h
new revision: 1.8; previous revision: 1.7
done
Checking in libpkix/include/pkixt.h;
/cvsroot/mozilla/security/nss/lib/libpkix/include/pkixt.h,v  <--  pkixt.h
new revision: 1.12; previous revision: 1.11
done
Checking in libpkix/pkix/checker/pkix_ocspchecker.c;
/cvsroot/mozilla/security/nss/lib/libpkix/pkix/checker/pkix_ocspchecker.c,v  <--  pkix_ocspchecker.c
new revision: 1.7; previous revision: 1.6
done
Checking in libpkix/pkix/util/pkix_tools.h;
/cvsroot/mozilla/security/nss/lib/libpkix/pkix/util/pkix_tools.h,v  <--  pkix_tools.h
new revision: 1.15; previous revision: 1.14
done
Checking in libpkix/pkix_pl_nss/pki/manifest.mn;
/cvsroot/mozilla/security/nss/lib/libpkix/pkix_pl_nss/pki/manifest.mn,v  <--  manifest.mn
new revision: 1.5; previous revision: 1.4
done
RCS file: /cvsroot/mozilla/security/nss/lib/libpkix/pkix_pl_nss/pki/pkix_pl_ocspcertid.c,v
done
Checking in libpkix/pkix_pl_nss/pki/pkix_pl_ocspcertid.c;
/cvsroot/mozilla/security/nss/lib/libpkix/pkix_pl_nss/pki/pkix_pl_ocspcertid.c,v  <--  pkix_pl_ocspcertid.c
initial revision: 1.1
done
RCS file: /cvsroot/mozilla/security/nss/lib/libpkix/pkix_pl_nss/pki/pkix_pl_ocspcertid.h,v
done
Checking in libpkix/pkix_pl_nss/pki/pkix_pl_ocspcertid.h;
/cvsroot/mozilla/security/nss/lib/libpkix/pkix_pl_nss/pki/pkix_pl_ocspcertid.h,v  <--  pkix_pl_ocspcertid.h
initial revision: 1.1
done
Checking in libpkix/pkix_pl_nss/pki/pkix_pl_ocsprequest.c;
/cvsroot/mozilla/security/nss/lib/libpkix/pkix_pl_nss/pki/pkix_pl_ocsprequest.c,v  <--  pkix_pl_ocsprequest.c
new revision: 1.8; previous revision: 1.7
done
Checking in libpkix/pkix_pl_nss/pki/pkix_pl_ocsprequest.h;
/cvsroot/mozilla/security/nss/lib/libpkix/pkix_pl_nss/pki/pkix_pl_ocsprequest.h,v  <--  pkix_pl_ocsprequest.h
new revision: 1.4; previous revision: 1.3
done
Checking in libpkix/pkix_pl_nss/pki/pkix_pl_ocspresponse.c;
/cvsroot/mozilla/security/nss/lib/libpkix/pkix_pl_nss/pki/pkix_pl_ocspresponse.c,v  <--  pkix_pl_ocspresponse.c
new revision: 1.8; previous revision: 1.7
done
Checking in libpkix/pkix_pl_nss/pki/pkix_pl_ocspresponse.h;
/cvsroot/mozilla/security/nss/lib/libpkix/pkix_pl_nss/pki/pkix_pl_ocspresponse.h,v  <--  pkix_pl_ocspresponse.h
new revision: 1.5; previous revision: 1.4
done
Checking in libpkix/pkix_pl_nss/system/pkix_pl_lifecycle.c;
/cvsroot/mozilla/security/nss/lib/libpkix/pkix_pl_nss/system/pkix_pl_lifecycle.c,v  <--  pkix_pl_lifecycle.c
new revision: 1.17; previous revision: 1.16
done
Checking in libpkix/pkix_pl_nss/system/pkix_pl_lifecycle.h;
/cvsroot/mozilla/security/nss/lib/libpkix/pkix_pl_nss/system/pkix_pl_lifecycle.h,v  <--  pkix_pl_lifecycle.h
new revision: 1.4; previous revision: 1.3
done
Comment 33 Alexei Volkov 2008-02-06 11:12:34 PST
(In reply to comment #27)
> Kai, pkix_errorstrings.h is definitely an exception to the 80-column rule.
> 
> Some libPKIX functions have block comments in .h files and others in .c files.
> All functions should have such block comments in one place or the other.
> I believe there is some rule by which the past libPKIX authors decided whether
> to put the comment in a .h or in the .c, but I don't know what that rule is.

Remember how libpkix initially planed to be used by designers? NSS supposed to provide the low lever functionality and not supposed to be visible(accessible) to an application. For this, libpkix has it's public API defined in libpkix/include/*. files.

Comments should describe all libpkix public functions in those header files where they are declared. All private function suppose to be described where they defined.
Comment 34 Nelson Bolyard (seldom reads bugmail) 2008-02-06 11:58:02 PST
Thanks, Alexei.  
I think you're saying that some functions in libPKIX are purely private to 
libPKIX itself, only to be called by other libPKIX functions, while other 
libPKIX functions are "public", meaning intended to be called by code outside 
of libPKIX (although only from withing NSS - all libPKIX functions are private 
to NSS).  I think you're saying the that functions that are purely private to 
libPKIX are to be documented in the .c files, and the functions that are to be 
callable by NSS code outside of libPKIX itself are to be documented in the .h files.  
Comment 35 Alexei Volkov 2008-02-06 13:56:25 PST
Comment on attachment 301590 [details] [diff] [review]
Patch v9 [checked in]

r- based on the comments below. New patch is needed since this patch is already committed. 

*In function PKIX_PL_OcspCertID_Create  cid does not get destroyed in case of error. You should have PKIX_DECREF(cid) after cleanup label.

*In the same function, the product of CERT_CreateOCSPCertID is not check for NULL.

*In function PKIX_PL_OcspCertID_GetFreshCacheStatus, the comment below has a reference to a renamed/removed variable. 
        /* We ignore rvBasedOnOcspFailureMode, we are interested in the
         * real result. 

* In ocsp.c and pkix_ocspchecker.c it we try to save the status on the response in the cache. What worries me is that we do not differentiate between failures due to no connection to server and failures due to internal error like memory allocation or time parsing, etc. Seems like we may end up in situation, when we return "revoked" status for the good certs. Any opinions on this? (See code for pkix_OcspChecker_Check after "cleanup" label)
Comment 36 Kai Engert (:kaie) 2008-02-07 09:55:21 PST
Note that I have added block comments for all the new functions that I added to libpkix.

The only place where I might have omitted a comment for a function is in classic NSS ocsp.c, but it's just a tiny helper function.
Comment 37 Kai Engert (:kaie) 2008-02-07 10:20:17 PST
During our chat I learned the name for the override parameter named
  ignoreOcspFailureMode
was confusing.
I renamed that to 
  ignoreGlobalOcspFailureSetting


> *In function PKIX_PL_OcspCertID_Create  cid does not get destroyed in case of
> error. You should have PKIX_DECREF(cid) after cleanup label.

added


> *In the same function, the product of CERT_CreateOCSPCertID is not check for
> NULL.

check added


> *In function PKIX_PL_OcspCertID_GetFreshCacheStatus, the comment below has a
> reference to a renamed/removed variable. 
>         /* We ignore rvBasedOnOcspFailureMode, we are interested in the
>          * real result. 

comment is no longer necessary, removed


> * In ocsp.c and pkix_ocspchecker.c it we try to save the status on the response
> in the cache. What worries me is that we do not differentiate between failures
> due to no connection to server and failures due to internal error like memory
> allocation or time parsing, etc. Seems like we may end up in situation, when we
> return "revoked" status for the good certs. Any opinions on this? (See code for
> pkix_OcspChecker_Check after "cleanup" label)

Yes, this was the behavior of old ocsp code, and I implemented it in libpkix in the same way.

I propose we do this enhancement in a separate bug, and do it in both ocsp.c and libpkix with similar rules. I filed bug 416169.
Comment 38 Kai Engert (:kaie) 2008-02-07 10:21:33 PST
Created attachment 301943 [details] [diff] [review]
Incremental Patch v10 [checked in]
Comment 39 Kai Engert (:kaie) 2008-02-07 10:22:34 PST
Alexei or Nelson, could you please also review the earlier patch labeled "ocsp tracing improvements"? It's attachment 300259 [details] [diff] [review]. Thanks!
Comment 40 Alexei Volkov 2008-02-07 10:43:00 PST
Comment on attachment 301943 [details] [diff] [review]
Incremental Patch v10 [checked in]

r=alexei
Comment 41 Nelson Bolyard (seldom reads bugmail) 2008-02-07 11:02:04 PST
Comment on attachment 300259 [details] [diff] [review]
ocsp tracing improvements

Kai,
It appears to me that this patch adds a new function, named dumpCertID,
inside of a large #ifdef DEBUG.  It also adds a call to that new function
inside function ocsp_FindCacheEntry.  That call is not #ifdef DEBUG.
So, it appears to me that this patch will result in a build failure in 
non-DEBUG builds.  I haven't done a test build to test that hypothesis.
If this patch breaks non-DEBUG builds, then r-, else r+.
I am marking it r-, but feel free to tell me if I'm wrong.
Comment 42 Kai Engert (:kaie) 2008-02-07 15:43:18 PST
Created attachment 302020 [details] [diff] [review]
ocsp tracing v11 [checked in]

(In reply to comment #41)
> So, it appears to me that this patch will result in a build failure in 
> non-DEBUG builds.

Nelson, you're right, thanks for catching it!
I've updated the patch and verified it does build in a non-debug build.
Comment 43 Kai Engert (:kaie) 2008-02-07 15:45:45 PST
Created attachment 302021 [details] [diff] [review]
tracing diff diff for v1

Nelson, as bugzilla's diff feature is broken with v11, this is a diff between the original ocsp tracing patch and the new v11
Comment 44 Nelson Bolyard (seldom reads bugmail) 2008-02-07 17:08:31 PST
Comment on attachment 302020 [details] [diff] [review]
ocsp tracing v11 [checked in]

r=nelson
Comment 45 Kai Engert (:kaie) 2008-02-13 07:29:39 PST
Comment on attachment 302020 [details] [diff] [review]
ocsp tracing v11 [checked in]

Checking in ocsp.c;
/cvsroot/mozilla/security/nss/lib/certhigh/ocsp.c,v  <--  ocsp.c
new revision: 1.50; previous revision: 1.49
Comment 46 Kai Engert (:kaie) 2008-02-13 07:30:13 PST
Alexei, Nelson, thanks a lot for your thorough reviews.

All patches checked in, marking fixed.
Comment 47 Alexei Volkov 2008-02-13 18:00:58 PST
Created attachment 303174 [details] [diff] [review]
Potential leak of certRequest

Leaking certRequest if CERT_AddOCSPAcceptableResponses returns an error.
Comment 48 Kai Engert (:kaie) 2008-02-18 02:49:42 PST
Comment on attachment 303174 [details] [diff] [review]
Potential leak of certRequest

r=kengert

Thanks Alexei! I'll check it in.
Comment 49 Kai Engert (:kaie) 2008-02-18 02:55:27 PST
Comment on attachment 303174 [details] [diff] [review]
Potential leak of certRequest

Checking in pki/pkix_pl_ocsprequest.c;
/cvsroot/mozilla/security/nss/lib/libpkix/pkix_pl_nss/pki/pkix_pl_ocsprequest.c,v  <--  pkix_pl_ocsprequest.c
new revision: 1.9; previous revision: 1.8
done

Note You need to log in before you can comment on or make changes to this bug.