Last Comment Bug 434398 - libPKIX cannot find issuer cert immediately after checking it with OCSP
: libPKIX cannot find issuer cert immediately after checking it with OCSP
Status: RESOLVED FIXED
PKIX
:
Product: NSS
Classification: Components
Component: Libraries (show other bugs)
: 3.12
: All All
: P1 major (vote)
: 3.12.1
Assigned To: Nelson Bolyard (seldom reads bugmail)
:
Mentors:
Depends on: 433594
Blocks:
  Show dependency treegraph
 
Reported: 2008-05-18 16:15 PDT by Nelson Bolyard (seldom reads bugmail)
Modified: 2008-06-13 22:06 PDT (History)
1 user (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---


Attachments
patch part 1, v1 (2.68 KB, patch)
2008-05-18 19:16 PDT, Nelson Bolyard (seldom reads bugmail)
alvolkov.bgs: review+
julien.pierre: superreview+
Details | Diff | Review
patch v2 - patch v1 plus two leaks plugged (3.93 KB, patch)
2008-05-19 22:41 PDT, Nelson Bolyard (seldom reads bugmail)
no flags Details | Diff | Review
Patch, part 2, v1 (blocks patch part 1, v1) (1.25 KB, patch)
2008-05-20 16:56 PDT, Nelson Bolyard (seldom reads bugmail)
alvolkov.bgs: review+
julien.pierre: superreview+
Details | Diff | Review

Description Nelson Bolyard (seldom reads bugmail) 2008-05-18 16:15:15 PDT
NO attempts to fetch revocation information from the network should be 
done until AFTER we know that the cert being checked is valid in all 
respects except revocation.  That is, we MUST be sure that a cert has 
a valid issuer, and a valid cert chain that chains up to a trusted root,
before we do ANY OCSP or CRL DP fetches for that cert.  libPKIX MUST NOT
send out any OCSP or CRL DP fetches for any cert whose issuer's validity
has not yet been established.  

One implication of this rule is that, when checking revocation of all 
certs in a chain, the revocation checks must proceed in the direction 
from trust anchor to EE cert, and NOT in the opposite direction.

Bug 433594 contains evidence that strongly suggests that libPKIX does 
OCSP checks on certs whose issuer cert it has not yet found.  That is 
wrong, in the extreme!
Comment 1 Nelson Bolyard (seldom reads bugmail) 2008-05-18 17:31:58 PDT
I am confirming this because I can reproduce it now.

The mystery deepens.  It seems that libPKIX *IS* doing the OCSP revocation
checks in the right order, from root to leaf.  But a very strange thing 
happens.  I have a hypothesis or two about how it happens.

The cert chain sent out by the server https://www.techdata.com/ is 
incomplete, containing only the server cert, not the intermediate CA cert.
But when libPKIX checks the chain, it has the missing intermediate cert
chain in its possession.  I suspect it got that cert via an AIA cert fetch.  
So, first libPKIX does the OCSP check on the intermediate CA cert itself,
and this succeeds.  The intermediate CA cert being checked has:

Subject;
CN=VeriSign Class 3 Extended Validation SSL CA,
OU=Terms of use at https://www.verisign.com/rpa (c)06,
OU=VeriSign Trust Network,O="VeriSign, Inc.",C=US

Issuer:
CN=VeriSign Class 3 Public Primary Certification Authority - G5,
OU="(c) 2006 VeriSign, Inc. - For authorized use only",
OU=VeriSign Trust Network,O="VeriSign, Inc.",C=US

Then, after successfully doing the OCSP check on that cert, libPKIX next
tries to do the OCSP check on the leaf (EE, server) cert, which has:

Subject:
CN=www.techdata.com,OU=web1-5,O=Tech Data Corporation,
OID.2.5.4.9=5350 Tech Data Drive,L=Clearwater,ST=Florida,
postalCode=33760,C=US,serialNumber=465173,
OID.2.5.4.15="V1.0, Clause 5.(b)",
OID.1.3.6.1.4.1.311.60.2.1.2=Florida,
OID.1.3.6.1.4.1.311.60.2.1.3=US

Issuer:
CN=VeriSign Class 3 Extended Validation SSL CA,
OU=Terms of use at https://www.verisign.com/rpa (c)06,
OU=VeriSign Trust Network,O="VeriSign, Inc.",C=US

But when ocsp_CreateCertID calls CERT_FindCertIssuer, it returns NULL!
It fails to find the very issuer cert that it has JUST finished checking
via OCSP.  

So, the question is:  How can it fail to find that cert?  

I can think of two possible hypotheses:

1) after the OCSP check on the intermediate CA cert finished, it discarded
(freed) the intermediate CA certificate.  That would be premature, to say 
the least.

2) The intermediate CA cert has not been properly made available to NSS.
Perhaps it was fetched via AIA (this is presently an unproven hypothesis)
and then it was decoded with CERT_DecodeDERCertificate rather than with 
CERT_NewTempCertificate.  That would explain why CERT_FindCertIssuer failed
to find the issuer.  CERT_DecodeDERCertificate only decodes the cert, but 
does NOT enter the cert into the "temp cert DB" (now better known as the 
cert cache) where it must be for CERT_FindCertIssuer to find it.

I think the next step is for me to verify or refute the hypothesis that the 
missing intermediate CA certificate is being fetched via AIA.
Comment 2 Nelson Bolyard (seldom reads bugmail) 2008-05-18 18:23:03 PDT
I confirm that the intermediate CA certificate is being fetched via AIA,
and that it is being decoded with CERT_DecodeDERCertificate in functions
pkix_pl_Cert_CreateToList and PKIX_PL_Cert_Create, both in pkix_pl_cert.c.

Alexei and Julien, do you know of any reason why we should NOT change those
calls into calls to CERT_NewTempCertificate?  (I'm not sure that would be entirely sufficient to fix this bug, but it might be a start.)
Comment 3 Nelson Bolyard (seldom reads bugmail) 2008-05-18 19:16:36 PDT
Created attachment 321548 [details] [diff] [review]
patch part 1, v1

This patch seems to solve the problem of not being able to find the 
issuer cert immediately after checking the issuer cert with OCSP in libPKIX.

Alexei, please review.
Comment 4 Julien Pierre 2008-05-19 18:22:15 PDT
Nelson,

IMO, CERT_FindCertIssuer is part of the legacy PKIX code and should not be coming into the mix here when the chain validation is done with libpkix.
That API only returns one cert even though there are multiple possible paths. It is broken by design because of its prototype.

Even if we add the libpkix certs to the cache with CERT_NewTempCertificate, CERT_FindCertIssuer may not return the intended cert, but only one possible cert that chains to it.

I know it was deliberate for us to use CERT_DecodeDERCertificate instead of CERT_NewTempCertificate. I cannot recall the exact reason right now. I think one reason was that putting potentially bad certs (which haven't been verified yet) into the cache is not necessarily desirable, and can interfere with verification in other threads. That would be the case if any part of the pkix code relied on such a broken API as CERT_FindCertIssuer.

I think we need to make changes to OCSP to not rely on CERT_FindCertIssuer or other legacy pkix calls. I thought that Richard already worked on that actually. I will try to dig the bug(s).
Comment 5 Julien Pierre 2008-05-19 18:38:31 PDT
The bug was 339737 . In this case it was calling CERT_VerifyCert to verify the OCSP signing cert.

Nelson, is the code in pkix_pl_ocspresponse.c being used in this case ?

Comment 6 Julien Pierre 2008-05-19 18:42:19 PDT
Unfortunately, pkix_pl_OcspResponse_VerifySignature also calls CERT_FindCertIssuer . I would say that's a bug. But it seems you may not even be using that code here.

Comment 7 Julien Pierre 2008-05-19 19:00:09 PDT
I now see that PKIX_PL_OcspCertID_Create calls into CERT_CreateOCSPCertID, which calls ocsp_CreateCertId, which calls CERT_FindCertIssuer .

I guess we need to further differentiate the PKIX OCSP code from the legacy one and also have a PKIX variant of CERT_CreateOCSPCertID that doesn't use CERT_FindCertIssuer.
Comment 8 Nelson Bolyard (seldom reads bugmail) 2008-05-19 19:14:07 PDT
Julien, I think your question in comment 5 is equivalent to this question:
What is the stack that leads to the failing call to CERT_FindCertIssuer ?

CERT_FindCertIssuer()  			Line 228
ocsp_CreateCertID()  			Line 1666 
CERT_CreateOCSPCertID()  		Line 1737 
PKIX_PL_OcspCertID_Create()  		Line 170 
pkix_OcspChecker_Check()  		Line 183 
pkix_RevCheckCert()  			Line 361 
pkix_CheckChain()  			Line 949 
pkix_Build_ValidateEntireChain()  	Line 1631 
pkix_BuildForwardDepthFirstSearch()  	Line 2853 
pkix_Build_InitiateBuildChain()  	Line 4198 
PKIX_BuildChain()  			Line 4388 
CERT_PKIXVerifyCert()  			Line 2160 
Comment 9 Julien Pierre 2008-05-19 19:21:00 PDT
Yes, I figured what the stack was now by looking at the source.

Upon further examination of ocsp_CreateCertID, it only needs the issuer cert in order to get the subject name (which it can also get from the subject cert itself in the issuer field) and public key info (SPKI).

So, it doesn't matter if we call CERT_FindCertIssuer and it returns an alternate issuer cert that we don't necessarily trust, because any one of those certs would have the same subject and public key info. Thus, using CERT_FindCertIssuer is actually OK here, though I find it distasteful. It would be more efficient not to have to look up the issuer cert at all, and just have it passed from the higher-level PKIX function calls. And it would also not require the issuer cert to be in the cache.

Unfortunately, CERT_CreateOCSPCertID is already a public function, so we can't change its prototype. I would suggest adding a CERT_CreateOCSPCertIDWithSPKI call, and using that in PKIX_PL_OcspCertID_Create instead.
Comment 10 Nelson Bolyard (seldom reads bugmail) 2008-05-19 19:41:50 PDT
For grins, I set a breakpoint in CERT_FindCertIssuer, and then ran the command 
  vfychain -f -pp -s allow-crl-and-ocsp -u 1 -vv cert.000
and captured all the stacks at the breakpoint hits.  
Here are the unique stacks.  
I suspect one or more bugs will be filed due to these findings. :)


CERT_FindCertIssuer  				Line 228
ocsp_CreateCertID  				Line 1666 
CERT_CreateOCSPCertID  				Line 1737 
PKIX_PL_OcspCertID_Create  			Line 170 
pkix_OcspChecker_Check  			Line 183 
pkix_RevCheckCert  				Line 361 
pkix_CheckChain  				Line 949 
pkix_Build_ValidateEntireChain  		Line 1631 
pkix_BuildForwardDepthFirstSearch  		Line 2853 
pkix_Build_InitiateBuildChain  			Line 4198 
PKIX_BuildChain  				Line 4388 
CERT_PKIXVerifyCert  				Line 2160 


CERT_FindCertIssuer  				Line 228	
pkix_pl_OcspResponse_VerifySignature  		Line 834 
pkix_OcspChecker_Check  			Line 274 
pkix_RevCheckCert  				Line 361 
pkix_CheckChain  				Line 949 
pkix_Build_ValidateEntireChain  		Line 1631 
pkix_BuildForwardDepthFirstSearch  		Line 2853 
pkix_Build_InitiateBuildChain  			Line 4198 
PKIX_BuildChain  				Line 4388 
CERT_PKIXVerifyCert  				Line 2160 


CERT_FindCertIssuer()  				Line 228
cert_VerifyCertChainOld()  			Line 626 
cert_VerifyCertChain()  			Line 873 
CERT_VerifyCertChain()  			Line 882 
CERT_VerifyCert()  				Line 1479 
pkix_pl_OcspResponse_CallCertVerify()  		Line 746 
pkix_pl_OcspResponse_VerifySignature()  	Line 925 
pkix_OcspChecker_Check()  			Line 274 
pkix_RevCheckCert()  				Line 361 
pkix_CheckChain()  				Line 949 
pkix_Build_ValidateEntireChain()  		Line 1631 
pkix_BuildForwardDepthFirstSearch()  		Line 2853 
pkix_Build_InitiateBuildChain()  		Line 4198 
PKIX_BuildChain()  				Line 4388 
CERT_PKIXVerifyCert()  				Line 2160 


CERT_FindCertIssuer()  				Line 228
ocsp_AuthorizedResponderForCertID()  		Line 4187 
ocsp_VerifySingleResponse()  			Line 4322 
ocsp_GetVerifiedSingleResponseForCertID() 	Line 4912 
cert_ProcessOCSPResponse()  			Line 4969 
pkix_pl_OcspResponse_GetStatusForCert() 	Line 1015 
pkix_OcspChecker_Check()  			Line 288 
pkix_RevCheckCert()  				Line 361 
pkix_CheckChain()  				Line 949 
pkix_Build_ValidateEntireChain()  		Line 1631 
pkix_BuildForwardDepthFirstSearch()  		Line 2853 
pkix_Build_InitiateBuildChain()  		Line 4198 
PKIX_BuildChain()  				Line 4388 
CERT_PKIXVerifyCert()  				Line 2160 
Comment 11 Julien Pierre 2008-05-19 20:43:17 PDT
The 1st stack is this bug.

The 2nd one is the problem I mentioned in comment 6.

I believe the 3rd of those stacks is due to the cert callback being set to call CERT_VerifyCert . It's not technically a library bug. The application should register a cert callback that uses libpkix, IMO (we might want to provide one).

Looks like the 4th one is another case where the bottom function should be rewritten for libpkix.
Comment 12 Nelson Bolyard (seldom reads bugmail) 2008-05-19 20:57:00 PDT
This patch reveals a latent leak of two CERTCertificates.
When they were just the result of CERT_DecodeDERCert, the leaks went 
unnoticed, (apparently).  But now that they are the result of CERT_NewTempCertificate, their leaks cause an assertion failure at shutdown.  
Comment 13 Nelson Bolyard (seldom reads bugmail) 2008-05-19 22:08:41 PDT
I found one of the two leaks. 
It is caused by line 166 of pkix_pl_ocspresponse.c
<http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/security/nss/lib/libpkix/pkix_pl_nss/pki/pkix_pl_ocspresponse.c&rev=1.9&mark=166#146>

The stack is 
pkix_pl_OcspResponse_Destroy()  	Line 166	
PKIX_PL_Object_DecRef()  		Line 936 
pkix_OcspChecker_Check()  		Line 305 
pkix_RevCheckCert()  			Line 361 
pkix_CheckChain()  			Line 949 
pkix_Build_ValidateEntireChain()  	Line 1631 
pkix_BuildForwardDepthFirstSearch()  	Line 2853 
pkix_Build_InitiateBuildChain()  	Line 4198 
PKIX_BuildChain()  			Line 4388 
CERT_PKIXVerifyCert()  			Line 2160 

There are two separate references to the OCSP signer's certificate, due to 
line 957 of the same file.  
<http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/security/nss/lib/libpkix/pkix_pl_nss/pki/pkix_pl_ocspresponse.c&rev=1.9&mark=957#949>

The offending code properly destroys one reference, and then just erases the
second reference by assigning NULL to the pointer.  The fix is to delete
line 166 of that file.  That plugs the leak of the OCSP responder cert.

Now, I must find and plug the leak(s) of the root CA cert (leaked more than 
once).
Comment 14 Nelson Bolyard (seldom reads bugmail) 2008-05-19 22:18:57 PDT
I believe I have found one more leak.  It may be the only remaining leak.
It occurs in function pkix_pl_OcspResponse_VerifySignature following line 833.
The reference stored in the variable issuerCert at line 833 does not appear 
to be destroyed.  The stack is:

pkix_pl_OcspResponse_VerifySignature()  Line 840
pkix_OcspChecker_Check()  		Line 274 
pkix_RevCheckCert()  			Line 361 
pkix_CheckChain()  			Line 949 
pkix_Build_ValidateEntireChain()  	Line 1631 
pkix_BuildForwardDepthFirstSearch()  	Line 2853 
pkix_Build_InitiateBuildChain()  	Line 4198 
PKIX_BuildChain()  			Line 4388 
CERT_PKIXVerifyCert()  			Line 2160 

I will try to plug it next and see if any other leaks remain.
Comment 15 Nelson Bolyard (seldom reads bugmail) 2008-05-19 22:41:16 PDT
Created attachment 321717 [details] [diff] [review]
patch v2 - patch v1 plus two leaks plugged

I believe it is correct to plug these two leaks, even if we decide that
we do not want to use CERT_NewTempCertificate.  

However, having said that, I observer that nowhere else in NSS do we avoid
adding decoded certs to the temp cert DB.  There are many places in NSS where
we get certs from an external source, such as in an S/MIME signature, or an
OCSP signature, or from an SSL peer, and we decode them into the "temp cert DB"
by calling CERT_NewTempCertificate.   Isolating one thread/context from 
another was an objective of Stan's "crypto context" object, but we abandoned
Stan because it was too much work and there was no end in sight.  I think that
there is no point in locking one door while all the others remain unlocked,
that is, no point is trying to avoid using the temp cert DB for unproven certs
here, when we use it for unproven certs everywhere else.  IOW, I'm advocating
going ahead and using CERT_NewTempCertificate here.

One other question occurs to me.  IIRC, in a previous comment, Julien wrote 
that when building an OCSP "certID", we only get an issuer cert to extract 
its subject name.  I think he observed that any issuer cert that we get by 
those means will have the correct subject name.  Maybe I misunderstood what 
Julien was saying in that comment, but if I understood correctly, that 
observation begs the question: since the issuer's subject name is identical 
to the issuer name of the cert under test, what is the point of getting the 
issuer cert?  Why not just extract the name from the issuer field of the cert 
under test?
Comment 16 Julien Pierre 2008-05-20 10:52:06 PDT
Nelson,

re: comment 15, I wrote that we need the subject name and the SPKI (subject & public key info). We can get the former from the subject cert, but not the later. 
Comment 17 Nelson Bolyard (seldom reads bugmail) 2008-05-20 11:01:40 PDT
Ah, yes, and you wrote:
"any one of those certs would have the same subject and public key info."

Undoubtedly, any of those certs would have the same subject, else they 
would not be returned by CERT_FindCertIssuer.  The SPKIs might or might not
all be the same.  Certainly, if we validated the signature on the cert 
whose issuer we were seeking against the returned supposed issuer, we could
be certain that it is (or is not) the right SPKI, but AFAIK, we don't do 
that.  

I found that libPKIX has its own function, equivalent to CERT_FindCertIssuer
that has the same API characteristic: it returns a single result.  I see no
reason to have two functions that have the same limitation.  Seems to me 
that what is needed is a function named CERT_FindCertIssuers (plural).
Comment 18 Julien Pierre 2008-05-20 11:53:41 PDT
Nelson,

re: comment 17,

I think as long as the issuer key matches the subject cert, we are OK. And as you point out, any cert returned by CERT_FindCertIssuer should meet that requirement.
So I'm saying that in this particular case we don't need to have any particular issuer cert - any of them should be OK. CERT_FindCertIssuers isn't required here.

I agree that we don't need 2 functions that have the same limitation.
What is the name of the PKIX function ? Does it have any other arguments besides the subject cert ?
Comment 19 Nelson Bolyard (seldom reads bugmail) 2008-05-20 13:30:01 PDT
I don't know what "issuer key matches the subject cert" means. 
CERT_FindCertIssuer does not verify any relationship between the 
issuer cert's public key and the issued cert.  
Comment 20 Julien Pierre 2008-05-20 13:48:14 PDT
Nelson,

I believe it does so, but not by doing a signature check. It does so by checking that the AKID / SKID match between the subject and issuer cert. So, CERT_FindCertIssuer should  find the issuer with the proper public key.

If the CA screwed up and used multiple keys without the proper key IDs, then we won't detect that error at that step though. However this will probably be detected later, such as by the OCSP responder sending an error response.
Comment 21 Nelson Bolyard (seldom reads bugmail) 2008-05-20 14:05:58 PDT
In the absence of malicious CA certs, and in the absence of multiple CA certs, 
issued by multiple independent CAs, that happen to have the same subject name, 
I agree that CERT_FindCertIssuer should find an issuer cert with an 
appropriate public key.
Comment 22 Julien Pierre 2008-05-20 14:27:18 PDT
Those issuers issued by multiple independent CAs would have to have both the same subject name, but also the same key ID, in order for there to be ambiguity between two of them. That's more likely to be a deliberate (attack) case than incompetence.

But that would result in NSS sending the wrong issuer public key in the cert ID of the request. The responder should detect this as a bad request because the public key will be unknown, and fail the revocation.

It would be better to add a signature check, but it is probably not required.
I still think we should avoid the lookup altogether as I proposed in comment 9, since by the time we get to the OCSP revocation step, we have already verified that the signature of the cert matches its issuer's key. So we should just extract the SPKI from the already-known issuer cert to build the cert ID.
Comment 23 Nelson Bolyard (seldom reads bugmail) 2008-05-20 16:56:04 PDT
Created attachment 321854 [details] [diff] [review]
Patch, part 2, v1 (blocks patch part 1, v1)

I decided to separate the first patch (which changes CERT_DecodeDERCert 
calls into CERT_NewTempCertificate calls) from the second patch, which 
plugs two leaks, and does not depend on the first.  I will ask them to 
be reviewed separately.
Comment 24 Nelson Bolyard (seldom reads bugmail) 2008-05-20 16:57:07 PDT
Comment on attachment 321548 [details] [diff] [review]
patch part 1, v1

I'm unobsoleting this first patch.
Comment 25 Alexei Volkov 2008-05-22 10:50:44 PDT
Comment on attachment 321854 [details] [diff] [review]
Patch, part 2, v1 (blocks patch part 1, v1)

patch is correct. r=alexei
Comment 26 Julien Pierre 2008-06-06 19:18:01 PDT
Comment on attachment 321548 [details] [diff] [review]
patch part 1, v1

I'm OK'ing this patch per our meeting. I suggest you check it in and then keep this bug open with a P2 priority to implement the request to remove the unnecessary issuer lookup.
Comment 27 Nelson Bolyard (seldom reads bugmail) 2008-06-13 22:06:11 PDT
pkix_pl_ocspresponse.c; new revision: 1.11; previous revision: 1.10
pkix_pl_cert.c;         new revision: 1.18; previous revision: 1.17

Note You need to log in before you can comment on or make changes to this bug.