Closed Bug 216701 Opened 21 years ago Closed 21 years ago

Certificates stop verifying after new CA cert is installed if old CRL is present

Categories

(NSS :: Libraries, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: julien.pierre, Assigned: julien.pierre)

Details

Attachments

(1 file)

I had an old Intranet Netscape CA CRL in my Mozilla certificate database. The
CRL was from august 4th. I didn't have Mozilla set to automatically update it.

On august 8th, the Intranet Netscape CA certificate was renewed. The "not
before" in that new certificate was set to that date.

After I installed that root, I could no longer verify certificates issued by the
CA in mozilla.

I debugged the NSS code and found that the problem was in the CRL cache.
During certificate verification, we try to look up a CRL for every certificate
in the chain.

Due to security issues, the CRL cache is very strict about CRL validity : if it
finds any bad CRL, it automatically considers all certificates issued by the CA
as revoked. This is to protect against a partial failure or compromise of
revocation system. 

It turns out that the CRL I had dated 8/4/2003 could no longer verify because of
the presence of the new root certificate with a NotBefore of 8/8/2003 .

Here is a summary of what the CRL cache does :
1) gets called from the cert verification code with two certificates : the
certificates we want to check revocation on, and the issuer certificate
2) looks up all CRLs for the issuer's subject
3) tries to verify CRLs for this issuer against the issuer cert provided

It fails in step 3 because the issuer cert has been renewed and there is no
longer a time overlap between its validity date and the issuing date of the CRL.

The workaround is most likely that we need to check for other issuers certs in
step 3. Ie. we need to build a cert subject list and find out if there are other
certs with the same subject but different validity dates.

This cert search could be quite expensive to do potentially within the CRL
cache. However, it may not be so bad considering that the CRL cache also
remembers failures, so if it fails once, the search shouldn't be attempted again.
I was trying to create a preliminary fix for this, and it occurred to me that
there is another issue.

Even if I do search for multiple issuers certs for the CRL at the time of the
first verification of a cert for that issuer, it is possible that another issuer
certificate will become available later, if the user installs one in the
permanent cert database. Also, the new issuer certificate could live on a
removable token. This means that we would need to invalidate the CRL cache for
that issuer upon any token insertion, to make sure that we try to validate the
CRL again if it was cached as invalid.

We may also need to invalidate the CRL cache too if the token that contained the
issuer cert used to validate the CRL is removed, or if the issuer certificate is
deleted. In practice this is much less of an issue, as it shouldn't hurt
anything - the revocation will just continue to work after the token is
unplugged/issuer cert deleted. Hmm. The CRL cache might keep a reference to that
cert, though.
Digging back in the history of the CRL cache implementation, I found the
following bug :
http://bugzilla.mozilla.org/show_bug.cgi?id=166714

That bug set a precedent for decoding and caching a CRL even if the issuer
certificate was not available at the time of insertion into the cache. Since
that fix was made, the CRL cache keeps a state on each CRL and in particular
stores the reason why the CRL was invalid.

In the case of this particular bug, the reason code is CRL_CACHE_INVALID_CRLS .
This reason code is shared between two actual failures - invalid DER or invalid
signature - the later being the problem here. When this bug gets fixed we'll
need to differentiate between the two failure types. This is all internal to the
CRL cache, fortunately.

There is another parallel between this bug and 166174 .
In 166174 we marked the CRL as unverified and left the issuer pointer NULL. This
told the cache to verify the CRL at a later time.

The logic we need to implement to fix this one is :

1. at the time of the initial verification of the CRL, look for all possible
issuer certs for the given subject, if the one provided in the cert chain
doesn't match the CRL signature

2. if a matching cert is not found for the CRL initially, we need a way to know
when it's time to verify it again on subsequent revocation checks. In other
words, if the CRL gets cached as invalid due to an invalid signature, do we now
have a different issuer certificate available for that same subject that we want
to use to try to verify the CRL again ?

The first part of the fix is straightforward and I hope to have a patch
available soon. This would resolve most cases - where clients have both the old
and new CA certificates in their database, and just need the code to continue to
work.

The second part is much more involved, because the check must be very
inexpensive or free, or the CRL cache may start degrading performance unnecessarily.

There is unfortunately no way in PKCS#11 to say "notify me when a new issuer
cert for this subject is available".

I have opened a bug to poll the state of removable tokens to help with
implementing this check. See bug 216727 for this. It is not certain that it can
be implemented with adequate performance however.

Permanent tokens need to be dealt with also. Hooks on NSS cert object creation
can work to some degree, but they fail to solve the problem if there is shared
memory and multiple processes are touching the permanent token. Only a full
search/fetch on the token works, but it is too expensive for the CRL cache,
since those fetches (any PKCS#11 call) is precisely what the cache is trying to
avoid.

If we don't implement the second part of the fix, here are the consequences :

a) certs may stop verifying initially after the new CA cert is installed to the
database, but things will worki again if the user stops the application and
restarts it
b) if the new or old CA cert lives on a removable token, the problem will
continue to show up at random times

It is possible to reduce the occurrence of b) by copying the CA cert from the
removable token to the NSS cert database. We might already do this.
The current code where this fails is :

            int64 issuingDate = 0;
            signstatus = DER_UTCTimeToTime(&issuingDate,
&crlobject->crl.lastUpdate);
            if (SECSuccess == signstatus) {
                signstatus = CERT_VerifySignedData(&crlobject->signatureWrap,
                                                    cache->issuer, issuingDate,
wincx);

It fails because the CRL's issuing date (last update) doesn't fall within the
new issuer cert's NotBefore / NotAfter .

Bob suggest that I verify the CRL's signature against the date passed to the
verification operation (ie. the date passed to CERT_VerifyCert) rather than the
issuing date of the CRL.

At this level of the stack, that information is unavailable, as shown below :

CERT_VerifySignedData(CERTSignedDataStr * 0x00588558, CERTCertificateStr *
0x00583790, __int64 0x0003c416c22a2b00, void * 0x0012fb44) line 153
DPCache_Refresh(CRLDPCacheStr * 0x0058491c, CERTSignedCrlStr * 0x005884e0, void
* 0x0012fb44) line 1218 + 31 bytes
DPCache_Fetch(CRLDPCacheStr * 0x0058491c, void * 0x0012fb44) line 1383 + 17 bytes
DPCache_Update(CRLDPCacheStr * 0x0058491c, CERTCertificateStr * 0x00583790, void
* 0x0012fb44, int 0x00000000) line 1537 + 13 bytes
AcquireDPCache(CERTCertificateStr * 0x00583790, SECItemStr * 0x005837e4,
SECItemStr * 0x00000000, __int64 0x0003c5c145fd8467, void * 0x0012fb44,
CRLDPCacheStr * * 0x0012f6d0, int * 0x0012f6cc) line 1821 + 31 bytes
CERT_CheckCRL(CERTCertificateStr * 0x005800f0, CERTCertificateStr * 0x00583790,
SECItemStr * 0x00000000, __int64 0x0003c5c145fd8467, void * 0x0012fb44) line
1862 + 40 bytes
SEC_CheckCRL(NSSTrustDomainStr * 0x0054acf0, CERTCertificateStr * 0x005800f0,
CERTCertificateStr * 0x00583790, __int64 0x0003c5c145fd8467, void * 0x0012fb44)
line 303 + 27 bytes

The last function that had the information in the above stack was AcquireDPCache.
Changes to the CRL cache are needed to pass it all the way down the stack.

I think the reasoning for doing the CRL signature check against its lastUpdate
is that the CRL should always verify on that date. The assumption proves wrong
in this case, though, since the CA certificate has been renewed.

The problem is that this signature's check is used in the cache as an optimization.

If the CRL signature check passes, we use the CRL and put it into the cache.
If the CRL signature check fails, we consider it a bad CRL and every cert from
that CA is then considered to be bad from the beginning of time.

Obviously, any application can pass any int64 time it wants into CERT_VerifyCert
. eg. it could try to verify a cert in the year 20000. The CRL signature would
fail to verify at that date because the CA would be expired. We don't want that
to cause the CRL cache to be set into the invalid state.

We could decide not to cache the negative case anymore, but this would require
subsequent CRL fetches and substantially reduce the benefit of the CRL cache.

If I look at this from a high-level point of view, this is what happens :
1. we try to verify cert A at time t1
2. we find an issuer cert B valid for time t1
3. we locate the CRL C updated at time t2 that matches the subject S of cert B .
This is always the most recent CRL for S found in any token. There is no
relation between t2 and t1
4. we try to verify CRL C against cert B at time t2
5. NotBefore(B) > t2, so we fail and cache the CRL as bad

If we change step 4 as Bob suggests :
4. we try to verify CRL C against cert B at time t1
5. we cache the CRL

This can work and we can preserve the optimizations, IF in step 2 always return
an issuer B where NotBefore(B) <= t1 <= NotAfter(B) . If the condition is ever
false, and we go into CRL processing with that cert B, the wrong results will
end up being cached - ie. the CRL will be cached invalid.

Currently, we only do CRL checks from CERT_VerifyCert and derivatives. We don't
export the CRL check function. So perhaps this will not really happen.

Perhaps the solution there is to add a check for that condition in CERT_CheckCRL
before we even start looking for a CRL . If the issuer is expired at time t,
then it doesn't make sense to go fetch a CRL : we will always fail to verify the
CRL if there is one. So we could fail the revocation check with
SEC_ERROR_EXPIRED_ISSUER_CERTIFICATE .

I'll try to see how tough it is to make a patch for this. It ought to be a lot
less difficult than my other proposed approach of looking for an alternate
issuer cert.
Taking bug.
Assignee: wtc → jpierre
Comment on attachment 130439 [details] [diff] [review]
change date used to verify CRLs. Propagadate verification date from CERT_VerifyCert down the stack of the CRL cache.


Bob, please review this patch.
Attachment #130439 - Attachment description: change date used to verify CRLs → change date used to verify CRLs. Propagadate verification date from CERT_VerifyCert down the stack of the CRL cache.
Attachment #130439 - Flags: superreview?
Attachment #130439 - Flags: review?(relyea)
Attachment #130439 - Flags: superreview? → superreview?(wchang0222)
Attachment #130439 - Flags: superreview?(wchang0222) → superreview+
Checking in crl.c;
/cvsroot/mozilla/security/nss/lib/certdb/crl.c,v  <--  crl.c
new revision: 1.36; previous revision: 1.35
done
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Comment on attachment 130439 [details] [diff] [review]
change date used to verify CRLs. Propagadate verification date from CERT_VerifyCert down the stack of the CRL cache.


signstatus is uninitialized in the case where cert->issuer is NULL. around line
1209
Attachment #130439 - Flags: review?(rrelyea0264) → review-
Bob,

That was fixed in the patch that was checked in.
Question: in what NSS release did the fix for this bug first appear? 
That value should be recorded in the target milestone, but isn't.

It appears that the checkin for this bug *may* have reintroduced a previously
fixed bug, reported as bug 171219 and bug 175115, by reintroducing the check
for the issuer cert's validity dates into CERT_CheckCRL.
This bug was fixed in NSS 3.9.
Target Milestone: --- → 3.9
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: