Closed Bug 482789 Opened 15 years ago Closed 2 years ago

Firefox reports sec_error_crl_invalid error on many Microsoft HTTPS sites

Categories

(NSS :: Libraries, defect, P1)

3.12.1

Tracking

(Not tracked)

RESOLVED WORKSFORME
3.12.9

People

(Reporter: eimaj2, Unassigned)

References

()

Details

(Keywords: csectype-dos, Whiteboard: [sg:dos])

Attachments

(4 files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7 (.NET CLR 3.5.30729)
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7 (.NET CLR 3.5.30729)

When visiting a number of Microsoft HTTPS sites, Firefox fails to connect and shows a standard Firefox error page. The text is:

Secure Connection Failed
An error occurred during a connection to connect.microsoft.com.
New CRL has an invalid format.
(Error code: sec_error_crl_invalid)

Other browsers (IE8 RC, Opera 9.62, Safari 4 beta) all connect without a problem.

Reproducible: Always

Steps to Reproduce:
1. Visit https://connect.microsoft.com/

Actual Results:  
Site cannot be visited, error page is shown. Text is as per Details.

Expected Results:  
A connection is established and the secure site is shown in the browser.
Since Firefox does not yet download CRLs at cert validation time, it must 
be that the CRL is already downloaded.  That would only happen if you had
manually downloaded the CRL at least once.   So, to diagnose this, we need
more info about the CRL that you have, and also 
- the URL from which you downloaded it 
- whether you have automatic periodic CRL updating enabled for this CRL
- when was the last time it was updated
- when is the next time it is scheduled to be updated
- Have there been prior update failures?

All that info should be available in various PSM windows.  Perhaps Kai can
advise us best on how to gather that info.

Jamie, Please create a zip file containing all these files from your profile
directory:
- the cert8.db file, (but not the key3.db file) 
- the secmod.db file, 
- If you have a subdirectory named cert8.dir, then also include the entire
contents of that subdirectory in the zip file.

Please email that zip file to me.  I will examine it for CRL problems.
Oh, the browser must not be running when you create that zip file.

After you've made the zip file, there should be some way to get rid of 
the CRL that is causing the problems.  Hopefully Firefox has some nice
GUI method, and Kai can advise us of that.  
But if not, there are command line tools available that can do it.
I don't remember explicitly adding a CRL, but that doesn't mean that I didn't. I was writing some SSL server code at one point in time and was doing a lot of messing around with certificates. It's possible I did something then.

I checked my revocation lists in Firefox, and sure enough there was one in there from Microsoft. I removed that and have no more problems accessing the site.

Even though my problem has been fixed, it's still worth investigating why this CRL is causing problems with NSS. It may be something that Microsoft are doing wrong, or it could be a genuine bug in NSS.
Severity: major → normal
Jamie, 
THANK YOU for reporting this problem, and for attaching your cert8.db file.
Without it, this might have been a VERY difficult problem to reproduce and 
diagnose.  With this, we may be able to nail this pretty quickly.
This bug is probably P2 for NSS 3.12.2, but P1 for the next release.
I will attach another attachment and add some diagnosis in a subsequent 
comment.
Assignee: nobody → julien.pierre.boogz
Group: core-security
Severity: normal → major
Status: UNCONFIRMED → NEW
Ever confirmed: true
Priority: -- → P2
Version: unspecified → 3.12.1
This zip file contains 3 certs in files named cert.00[012].
They are the present cert chain for the afflicted server.
I have saved the chain here for posterity, and also so that we can test 
this bug offline.

With these 3 certs in "." and the reporter's cert8.db file in DBDIR,
this bug can be reproduced with the command:

   vfychain -d DBDIR -u 1 -v cert.0*

Diagnosis in next comment to follow.
I marked this bug security sensitive because I suspect there may be potential DOS attack vectors here.

In this bug's cert8.db file are two certs for a CA that issues https server
certs.  The command 

certutil -L -d DBDIR -n "Microsoft Secure Server Authority"

shows those two CA certs.  

One of them is valid from 2007 Sep 28  to 2009 Apr 19.  
It has a 2k bit public key, and a subject key ID that ends with 0x55

The other is valid from 2008 Apr 09 to 2011 Feb 19. 
It has a different 2k bit public key, and a different subject key ID,
which ends in 0xbc.  

These two certs have an overlapping validity period from 
2008 Apr 09 to 2009 Apr 19, of which there are about 40 days left now.
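
The overlap arithmetic can be checked with a short sketch. The validity dates are the ones quoted above; `assumed_today` is a hypothetical date chosen only for illustration, since the comment does not state its own date.

```python
from datetime import date

# Validity windows as quoted above: (not_before, not_after).
older_ca = (date(2007, 9, 28), date(2009, 4, 19))
newer_ca = (date(2008, 4, 9), date(2011, 2, 19))


def overlap(a, b):
    """Return the (start, end) window in which both certs are valid, or None."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return (start, end) if start <= end else None


window = overlap(older_ca, newer_ca)
overlap_days = (window[1] - window[0]).days

# Hypothetical "today"; an assumption, not a date taken from the bug.
assumed_today = date(2009, 3, 10)
days_left = (window[1] - assumed_today).days
```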

Also in this cert8.db file is a very old CRL issued by this CA.  
The command 
  crlutil -L -d DBDIR -n "Microsoft Secure Server Authority"
shows that the CRL was issued on 2008 May 19, and should have been updated 
by 2008 May 27 (8 days later).  It bears an Authority Key ID that ends in 
0x55, indicating that its signature can be verified with the older CA cert 
whose subject key ID ends in 0x55.  The CRL is "empty"; that is, it contains
zero serial numbers of revoked certs.

The problem (for this https server cert's validation) is as follows:

The https server's cert has an authority key ID ending in 0xbc, indicating
that its signature can be verified with the public key in the newer CA cert,
not with the public key in the older CA cert.  

When we go to check that server cert to see if it was revoked by this issuer, 
we get the issuer cert for this server's cert, which is the cert whose 
subject key ID ends with 0xbc, and we find a CRL for that issuer, without 
regard to the issuer's subject Key ID.  We get the CRL for the older CA 
cert.  The signature on that CRL is then checked with the public key from
the server cert's issuer (the newer CA cert), and the signature check fails
(because it is being checked with the wrong public key).
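
The lookup described above can be modelled in a few lines. This is a toy sketch, not NSS code: the `CACert` and `CRL` classes, the string "keys", and both lookup functions are hypothetical stand-ins, and only the key ID suffixes 55 and bc come from the diagnosis. It does not address CRLs that lack an Authority Key ID extension.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CACert:
    subject: str
    subject_key_id: str  # only the last byte is known from the diagnosis
    public_key: str      # stand-in for a real public key


@dataclass(frozen=True)
class CRL:
    issuer: str
    authority_key_id: str
    signed_with: str     # stand-in for a real signature


def signature_ok(crl, ca):
    """Toy signature check: passes only if the CRL was signed with this CA's key."""
    return crl.signed_with == ca.public_key


def find_crl_by_name(crls, ca):
    """Lookup as described above: match on issuer name only."""
    for crl in crls:
        if crl.issuer == ca.subject:
            return crl
    return None


def find_crl_by_name_and_key_id(crls, ca):
    """Alternative: also require the CRL's authority key ID to match."""
    for crl in crls:
        if crl.issuer == ca.subject and crl.authority_key_id == ca.subject_key_id:
            return crl
    return None


NAME = "Microsoft Secure Server Authority"
old_ca = CACert(NAME, "55", "key-old")
new_ca = CACert(NAME, "bc", "key-new")
stale_crl = CRL(NAME, "55", "key-old")
db = [stale_crl]

picked = find_crl_by_name(db, new_ca)             # returns the older CA's CRL
name_only_result = signature_ok(picked, new_ca)   # checked with the wrong key
picked_strict = find_crl_by_name_and_key_id(db, new_ca)
```

With the name-only lookup the signature check fails. With the stricter lookup no CRL is found for the newer CA cert, which under an optional revocation policy would not by itself cause a failure.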

This diagnosis reveals numerous other problems:

- Firefox's CRL fetcher had stopped fetching CRLs from this issuer, leaving 
the cert DB with an old CRL.

- The CRL lookup returns the CRL for the wrong issuer cert. (I think we 
may already have a bug on file about this.)

- The CRL signature check fails to notice that it is using the public key
from the wrong issuer CA cert to verify the signature on the CRL.

- The https server cert is ultimately rejected with 
  ERROR -8159: New CRL has an invalid format.
which is misleading (the CRL is definitely NOT new, and its format is not
invalid, AFAIK) but is a complaint about the CRL, not about the cert. 
There should be some error code that says something about revocation of 
the server cert.  

- When invoked with an issuer cert's nickname, the program crlutil only 
displays one CRL for one cert with the requested issuer's nickname.  :(

- crlutil does not display the content of the outer CRL, but only the inner
"to be signed" content of the CRL.

Looks like our CRL cache and CRL checking code needs more work.
Attachment #366913 - Attachment mime type: application/octet-stream → application/zip
I wonder if these issues do (or should) lessen our desire and zeal to
get CRLDP done ASAP.
It is also interesting to note that the URL for the old CRL in the cert DB is:
> http://mscrl.microsoft.com/pki/mscorp/crl/Microsoft%20Secure%20Server%20Authority(4).crl

But the URL in the CrlDP extension of the https server cert is:
> http://mscrl.microsoft.com/pki/mscorp/crl/Microsoft%20Secure%20Server%20Authority(5).crl

That's (4) vs (5).
Whiteboard: [sg:dos]
Note the presence of these extensions:
  Certificate Authority Key Identifier   (just like in a cert)
  CRL Number                             (same in both CRLs)

Two different key IDs, two different signatures, but otherwise these
appear to have the same content.
This bug is almost certainly a duplicate of 
bug 217387  CRL's AuthorityKeyID ignored during CRL verification, or
bug 418762  PK11_ImportCRL ignores CRL AuthorityKeyID extension 

The only news here is that this example comes from a live Internet deployment
while the older bug came from a test suite.
Thanks for the investigation, Nelson.

I am relieved to hear that both CRLs have the same content, i.e. the same list of revoked serial numbers. It's perfectly valid to issue multiple CRLs that way, signed with different keys, as long as they each include all the serial numbers for the CA.

Now, about some of the details.

Error -8159 is SEC_ERROR_CRL_INVALID, which maps to "New CRL has an invalid format". The same error code is returned in several places in the libraries, mostly during lookup calls. The only place where it's returned at import time is when PK11_ImportCRL is called, but that wasn't the case here. The CRL had already been fetched by Firefox's CRL fetcher, and was already in the NSS cert database. Clearly the string for this error is only relevant to the PK11_ImportCRL case. Perhaps we should consider adding a new error code to separate these 2 classes of errors, invalid CRLs during fetching and importing.

I'm not quite sure why this error code would be returned all the way to the SSL layer, though. The SEC_ERROR_CRL_INVALID should eventually be remapped to SEC_ERROR_REVOKED_CERTIFICATE during the revocation check in the CRL cache. This may be another bug.
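
A minimal sketch of the remapping I would expect, using symbolic names only. The `SecError` enum and both functions are hypothetical and are not the NSS CRL cache API.

```python
from enum import Enum


class SecError(Enum):
    # Symbolic stand-ins only; the numeric NSS values are deliberately omitted.
    NONE = "no error"
    CRL_INVALID = "SEC_ERROR_CRL_INVALID"
    REVOKED_CERTIFICATE = "SEC_ERROR_REVOKED_CERTIFICATE"


def crl_cache_lookup(crl_signature_ok: bool) -> SecError:
    """Hypothetical lookup step: reports a CRL-level error for an unusable CRL."""
    return SecError.NONE if crl_signature_ok else SecError.CRL_INVALID


def revocation_check(crl_signature_ok: bool) -> SecError:
    """Translate a CRL-level failure into an error about the cert being checked."""
    err = crl_cache_lookup(crl_signature_ok)
    if err is SecError.CRL_INVALID:
        return SecError.REVOKED_CERTIFICATE
    return err


# What the caller (e.g. the SSL layer) would see for a CRL that fails verification.
seen_by_caller = revocation_check(crl_signature_ok=False)
```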

I don't think bug 418762 is being hit here because the CRL was successfully imported into the database, either because the matching CA cert was in the DB, or because no signature check was required at import time. I think it's the latter - PSM blindly imports all CRLs into the DB without signature checks.

I think the root cause of this problem is bug 217387.

Bug 217392 also plays a role in this deployment - if the user instructed Firefox to download both versions of the CRL, signed by different issuers, then one CRL would overwrite the other in softoken.
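
The overwrite can be illustrated with a toy store. This assumes, for illustration only, that the store keeps one CRL per issuer subject; the functions and labels below are hypothetical and do not reflect softoken's actual schema. The second CRL (key ID suffix bc) is also hypothetical, since only the older one was in the reporter's DB.

```python
NAME = "Microsoft Secure Server Authority"

# (issuer subject, authority key ID suffix, label)
crls = [
    (NAME, "55", "older CRL"),
    (NAME, "bc", "newer CRL"),  # hypothetical second CRL
]


def import_keyed_by_subject(entries):
    """One slot per issuer subject: a later import replaces an earlier one."""
    store = {}
    for subject, key_id, label in entries:
        store[subject] = label
    return store


def import_keyed_by_subject_and_key_id(entries):
    """One slot per (subject, key ID) pair: both CRLs are kept."""
    store = {}
    for subject, key_id, label in entries:
        store[(subject, key_id)] = label
    return store


by_subject = import_keyed_by_subject(crls)
by_both = import_keyed_by_subject_and_key_id(crls)
```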
Nelson,

What type of DOS attack were you referring to in comment 6? I think this is just a bug; there will not be any false positive revocation checks because of this bug, only false negatives due to problems verifying the signature of the CRL.
It seems that an invalid CRL (or CRL whose validity appears to be 
contraindicated) causes all certs issued by its apparent issuer to be 
treated as revoked.  That's a denial of service to the user, denying
him access to all those sites.  

I agree that the CRL error should not have propagated all the way up,
and should have been turned into a revoked cert error along the way.
That's a separate bug I think.
Nelson,

This is actually not a DOS attack, but working as designed.

In the past, NSS implemented an "optional" revocation policy, ie. it did not fail verifying certs in the absence of a CRL, unlike the NIST policy.

So, the decision was made a long time ago that if a CRL was found, that was evidence that the user actually tried to use revocation - something most users did not do. If the attempt to use that CRL was not successful for whatever reason, then it would result in all certs for the given issuer being considered revoked, until the CRL problem was resolved.

The revocation code, including the code in the CRL cache and many other places, is written with that policy in mind - most errors during revocation processing are considered fatal. There was agreement from other members of the team that this was the right thing to do when I wrote the code. I don't think that part needs to be changed. But we do need to extend our support to more types of CRLs in order to not trip on those errors when we add automatic CRL fetching via CRL DP.
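
The policy amounts to a three-way decision, sketched below. The function name, the `Verdict` enum, and the `crl_usable` flag are hypothetical; this is a simplified model of the policy as described, not the actual CRL cache logic.

```python
from enum import Enum
from typing import Optional, Set


class Verdict(Enum):
    NOT_REVOKED = "not revoked"
    REVOKED = "revoked"


def check_revocation(
    serial: int,
    revoked_serials: Optional[Set[int]],
    crl_usable: bool = True,
) -> Verdict:
    """Apply the "optional" revocation policy to one cert serial number.

    revoked_serials is None when no CRL exists for the issuer.
    crl_usable is False when a CRL exists but could not be processed,
    for example because its signature did not verify.
    """
    if revoked_serials is None:
        # No CRL at all: revocation is optional, so do not fail.
        return Verdict.NOT_REVOKED
    if not crl_usable:
        # A CRL exists but cannot be used: treat every cert from this issuer as revoked.
        return Verdict.REVOKED
    return Verdict.REVOKED if serial in revoked_serials else Verdict.NOT_REVOKED


no_crl = check_revocation(1234, None)
empty_crl = check_revocation(1234, set())
broken_crl = check_revocation(1234, set(), crl_usable=False)
```

The third case is the one hit in this bug: the CRL is empty, but because it cannot be verified, the cert is rejected anyway.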
Priority: P2 → P1
Target Milestone: --- → 3.12.5
Assignee: bugzilla+nospam → alexei.volkov.bugs
A variant of this bug, with CRL distribution points support of libpkix
enabled, has been reported by Linux Chrome users: http://crbug.com/53334

Steps to reproduce:

1. Run Linux Chrome on a Linux distribution with system NSS version 3.12.4
or later, so that CRL distribution points extension is supported.

2. Visit https://outlook.com.  Log in.  (You just need a Windows Live ID.
You don't need to have an Outlook Web App mailbox.)

3. Visit https://msdn.microsoft.com/en-us/subscriptions/manage/default.aspx
You'll get the "The server's security certificate is revoked!" error page.

The underlying cause is the same: there are two certificates for
"Microsoft Secure Server Authority", with overlapping validity periods
(both are valid now), and different subject key identifiers.  These
two certs are identified as "Microsoft Secure Server Authority(5)"
and "Microsoft Secure Server Authority(8)" in the CRL distribution points
and AIA extensions of the server certs.

I wrote this patch after three hours of work this afternoon.  It still
needs more testing, and I'm not sure if it correctly handles CRLs without
Authority Key Identifier extensions.

I agree this bug seems to be a duplicate of bug 217387, but it is
different from what Julien said in that bug:
  In the past, the CRL cache has never had to actually look up the issuer
  certificate. The issuer of the cert being verified was passed in from the
  CERT_VerifyCert call. However, it is possible that different keys are used to
  sign the certs than the CRL, so we can not rely on that cert to be the correct
  cert in this case.

In this bug, the same key is used to sign the cert and the CRL.
It's just that there are two parallel sets of certs and CRLs issued
by two CA certs with the same subject name but different keys.
Assignee: alexei.volkov.bugs → wtc
Status: NEW → ASSIGNED
OS: Windows Vista → All
Hardware: x86 → All
Target Milestone: 3.12.5 → 3.12.9
Wan-Teh - any update here?
Josh: I haven't worked on this bug since then (2010-09-03).  Sorry.
Josh, we usually do not keep DOS bugs hidden for Firefox. And, this is not a very high priority one for Firefox. But, this may affect some other products, especially server products, so I am not sure if it is a good idea to open it up.
Assignee: wtc → nobody
Status: ASSIGNED → NEW
Seems unlikely that a server product would be using CRLs, but I guess they could.
Dan, servers that request client authentication certs will almost certainly want 
to have local copies of the CRLs for the CAs they trust to issue user certs.
Keywords: csec-dos
Group: crypto-core-security
Group: crypto-core-security
Group: core-security → crypto-core-security
Group: crypto-core-security

In the process of migrating remaining bugs to the new severity system, the severity for this bug cannot be automatically determined. Please retriage this bug using the new severity system.

Severity: major → --
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → WORKSFORME