Closed Bug 341323 Opened 18 years ago Closed 18 years ago

Race condition in Stan import cert code called from __CERT_NewTempCertificate


(NSS :: Libraries, defect, P1)



(Not tracked)



(Reporter: julien.pierre, Assigned: nelson)



(3 files, 1 obsolete file)

__CERT_NewTempCertificate contains the following code :

    /* this function cannot detect if the cert exists as a temp cert now, but
     * didn't when CERT_NewTemp was first called.
    nssrv = NSSCryptoContext_ImportCertificate(gCC, c);
    if (nssrv != PR_SUCCESS) {
	goto loser;
    /* so find the entry in the temp store */
    tempCert = NSSCryptoContext_FindCertificateByIssuerAndSerialNumber(gCC,
    /* destroy the copy */
    if (tempCert) {
	/* and use the "official" entry */
	c = tempCert;
    	cc = STAN_GetCERTCertificateOrRelease(c);
        if (!cc) {
            return NULL;
    } else {
	return NULL;

There is a race condition in that code. NSSCryptoContext_ImportCertificate is a no-op if the cert already exists in the context's cert store. It just checks the content of the hash table, without incrementing any refcount or returning any object.

So, when we get to the next line, the NSSCryptoContext_FindCertificateByIssuerAndSerialNumber call, it's possible for the cert to have been removed from the context by another thread.

This is exactly what happens in a test program I wrote that calls __CERT_NewTempCertificate from multiple threads.

The Stan import API is broken by design. Just returning a boolean that says "yup, I imported" isn't good enough, because the caller is never guaranteed that the cert is actually in the store even after the function returned success !
I don't understand why there was no assertion in the final "return NULL" case in the code fragment. That case should never be valid - we import a cert, "successful", look it up, it's not there. And we just return NULL ?!

I believe unfortunately that this defficiency applies to all the import APIs in Stan, not just for this particular cert import function, but also for CRLs, cert chains, trust, S/MIME profile, etc.

There are two possible fixes :
1) Fix the API and the callers
- Rewrite the prototype of these import APIs to increment the refcount of the imported object and return it.
- Change all the callers so they deal with this appropriately.

2) Very ugly hacks.
If we import and then can't find the cert, just try importing again. I will attach such a hack patch for the case quoted above.
Assignee: nobody → julien.pierre.bugs
Priority: -- → P1
Blocks: 331279
Summary: Race condition in Stan import cert code → Race condition in Stan import cert code called from __CERT_NewTempCertificate
No longer blocks: 331279

The problem is that __CERT_NewTempCertificate calls two separate functions,
NSSCryptoContext_ImportCertificate and 
and their action is not atomic.

The proposed solution is to stop using those two functions in 
__CERT_NewTempCertificate and instead call a new function that atomicly 
combines the functionality of those two functions.

Once that's done, it's not clear to me that there remains any case in which
one would want to use the old NSSCryptoContext_ImportCertificate function
any more.  (That is, I imagine one would always want the atomic import-and-
return function in all cases.)  But I guess there's no rush to get rid of 
that old function.
Assignee: julien.pierre.bugs → nelson

NSSCryptoContext_ImportCertificate only has one caller - __CERT_New_TempCertificate, so once the new combined function is provided, it can be deleted .
untested.  Not for review at this time.
Remove the functions that Julien observed were dead code.
Attachment #232385 - Attachment is obsolete: true
Comment on attachment 232404 [details] [diff] [review]
next revision: remove dead old functions

Reviewers, Please carefully review New function 
NSSCryptoContext_FindOrImportCertificate for correctness.  
It is modeled on its predecessor.  I was surprised to find that
"importing" a cert into a crypto context (cc) meant:
- adding the cert to the cert store, and 
- marking the cert itself as belonging to the cc
But the cc object itself contains no record of the cert.
This means that a cert cannot simultaneously be part of more than one cc.
If we have only one cc, a "global" one, then this is probably OK,
but if we have multiple cc's (and that WAS the idea for Stan, no?)
then this seems like a design flaw.  
Bob, I'd especially like your comments on that.
Attachment #232404 - Flags: superreview?(rrelyea)
Attachment #232404 - Flags: review?(julien.pierre.bugs)
Comment on attachment 232404 [details] [diff] [review]
next revision: remove dead old functions

This patch solves the race condition in my test case.

I only have nits with the comments :

a) In __CERT_NewTempCertificate :
/* Should this be nssCertificate_Destroy(c);  instead ? */

It would be best to find the answer to that question before checkin.

b) In NSSCryptoContext_FindOrImportCertificate
/* Set error */
I think an nss_SetError call is missing . But I'm not sure which error code would apply . Perhaps a new one needs to be created. On the other hand, this should never happen - the CryptoContext always creates a CertStore . So the store can only be NULL if the creation failed. This might be a candidate for an assertion only, if we are confident about the CryptoContext creation code.

c) I think the comment at the beginning NSSCryptoContext_FindOrImportCertificate can be slightly improved. 

I propose the following changes :

- s/contains this cert/contains this DER cert .

- s/return the address of the NSSCertificate/return the address of the matching NSSCertificate/

- s/If this cert/If this DER cert/

- s/then add it to the store/then add the new NSSCertificate to the store/

- s/if this cert is not in the store/If this DER cert is not in the store/

Do we have any tests for certs with duplicate issuer and serial numbers ? I think it wouldn't be a luxury to have one when making changes to this area of the code.
Attachment #232404 - Flags: review?(julien.pierre.bugs) → review+
Comment on attachment 232404 [details] [diff] [review]
next revision: remove dead old functions

 /* Should this be nssCertificate_Destroy(c);  instead ? */

I think the answer is yes, but it would need investigation in nssCertificate_Destroy.

There is one point in the code where a partially constructed certificate calls loser, but it appears the parts that are constructed are arena alloc data.

Attachment #232404 - Flags: superreview?(rrelyea) → superreview+
I fixed the comments as Julien suggested and committed on trunk.
Fix race in CERT_NewTempCertificate.  Bug 341323. r=julien,rrelyea

Checking in certdb/stanpcertdb.c; new revision: 1.71; previous revision: 1.70
Checking in pki/cryptocontext.c;  new revision: 1.16; previous revision: 1.15
Checking in pki/nsspki.h;         new revision: 1.11; previous revision: 1.10
Checking in pki/pkistore.c;       new revision: 1.28; previous revision: 1.27
Checking in pki/pkistore.h;       new revision: 1.10; previous revision: 1.9
This is what I checked in.
On NSS 3.11 branch
certdb/stanpcertdb.c; new revision:; previous revision:
pki/cryptocontext.c;  new revision:; previous revision:
pki/nsspki.h;         new revision:; previous revision: 1.10
pki/pkistore.c;       new revision:; previous revision:
pki/pkistore.h;       new revision:; previous revision:
Closed: 18 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.