Closed Bug 231051 Opened 21 years ago Closed 21 years ago

crlutil asserts after importing and deleting CRL

Categories

(NSS :: Tools, defect, P1)

Other
Linux
defect

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: bishakhabanerjee, Assigned: julien.pierre)

Details

Attachments

(5 files)

Have already spoken to Julien on this, and through testing have determined that
it has to do with paths

While doing a crlutil -I -i <CRL.crl> -d <DB> inside the NSS test harness,
coredumps on Linux. It does the import, and then crashes. A crlutil -L -d <DB>
does list the CRL that was being imported.

Using the gdb that comes on my RedHat AS system, looking at the core file:
haddock:/u/bishakhabanerjee/mozilla 20% gdb
../../../../dist/Linux2.4_x86_glibc_PTH_DBG.OBJ/bin/crlutil -core core
GNU gdb Red Hat Linux (5.2-2)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
Core was generated by `crlutil -I -i /u/bishakhabanerjee/mozilla_1/PKITS_DATA/cr
ls/GoodCACRL.crl -d ..'.
Program terminated with signal 6, Aborted.
Reading symbols from /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG
.OBJ/lib/libssl3.so...done.
Loaded symbols for /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG.O
BJ/lib/libssl3.so
Reading symbols from /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG
.OBJ/lib/libsmime3.so...done.
Loaded symbols for /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG.O
BJ/lib/libsmime3.so
Reading symbols from /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG
.OBJ/lib/libnss3.so...done.
Loaded symbols for /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG.O
BJ/lib/libnss3.so
Reading symbols from /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG
.OBJ/lib/libplc4.so...done.
Loaded symbols for /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG.O
BJ/lib/libplc4.so
Reading symbols from /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG
.OBJ/lib/libplds4.so...done.
Loaded symbols for /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG.O
BJ/lib/libplds4.so
Reading symbols from /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG
.OBJ/lib/libnspr4.so...done.
Loaded symbols for /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG.O
BJ/lib/libnspr4.so
Reading symbols from /lib/i686/libpthread.so.0...done.
Loaded symbols for /lib/i686/libpthread.so.0
Reading symbols from /lib/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
---Type <return> to continue, or q <return> to quit---
Reading symbols from /lib/i686/libc.so.6...done.
Loaded symbols for /lib/i686/libc.so.6
Reading symbols from /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG
.OBJ/lib/libsoftokn3.so...done.
Loaded symbols for /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG.O
BJ/lib/libsoftokn3.so
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
#0  0x401ebb41 in __kill () at __kill:-1
-1      __kill: No such file or directory.
        in __kill
Doesn't crash after doing the import
Bishakha, please try this test on Solaris.  
core files are often more useful on that platform.
this test script uses the NSS test harness. When running it, it will create a
HOSTDIR.1 under mozilla/tests_results

This crashes everytime *after* doing a crl import. A crlutil -L will tell you
that the CRL was imported.
the two scripts that I have attached above can be copied into
mozilla/security/nss/tests/pkits and run from there, e.g. as test1.sh and test2.sh

PKITS_DATA can be set to /share/builds/sbstools/nsstools/tmp/PKITS_data
I reproduced a "trap" on Windows running the test2 script.
Actually it is an assertion during NSS_Shutdown 

NTDLL! DbgBreakPoint address 0x77fa144b
SECMOD_Shutdown() line 112 + 36 bytes
NSS_Shutdown() line 556 + 5 bytes
main(int 0x00000006, char * * 0x004b4e18) line 473 + 5 bytes
CRLUTIL! mainCRTStartup + 227 bytes
KERNEL32! ProcessIdToSessionId + 381 bytes

The line is

	PORT_Assert(secmod_PrivateModuleCount == 0);
Ah HA!.  A reference count leak.  Some object is not being destroyed before
calling NSS_Shutdown, which in turn prevents the module from being unloaded,
which prevents the module count from going to zero.  

This assertion only occurs when a certain environment variable is set, and
apparently the "test harness" sets that variable.  

So, I think this is a P1 bug.  Previous ref count error bugs have been P1.
Priority: -- → P1
Target Milestone: --- → 3.9.1
On another bug, there was a comment on why this was not being seen more often.
If crlutil does not do teh import of a CRL for valid reasons, like "Peer's
certificate issuer is bad" or "Bad signature CRL", crlutil says so, and does not
crash. It is only when crlutil sees a valid CRL, and imports it, and then
crashes. So the path followed in the latter case does free its pbjects correctly.
Besides what is already mentioned, the following command also causes a crash:
crlutil -D (deleting a CRL from the database)

but the command: crlutil -L (listing the CRLs in the database) does not crash
Updating bug title.
Summary: crlutil crashes while importing CRL → crlutil asserts after importing CRL
Attached patch fix cert leakSplinter Review
The CA certificate used for verifying the signature was being leaked during the
import.
Attachment #139773 - Flags: superreview?(wchang0222)
Attachment #139773 - Flags: review?(MisterSSL)
Comment on attachment 139773 [details] [diff] [review]
fix cert leak

r=wtc.
Attachment #139773 - Flags: superreview?(wchang0222) → superreview+
Comment on attachment 139773 [details] [diff] [review]
fix cert leak

good job!
Attachment #139773 - Flags: review?(MisterSSL) → review+
Julien,

Re comment #8, I can still see the crash when deleting a CRL. Crash after
import has gone away. Attaching test script, which can be copied to
nss/tests/pkits directory.
Checking in pk11cert.c;
/cvsroot/mozilla/security/nss/lib/pk11wrap/pk11cert.c,v  <--  pk11cert.c
new revision: 1.129; previous revision: 1.128
done

Bishakha, I will look into the other crash on delete, but it is most likely in a
different codepath.
Comment on attachment 139773 [details] [diff] [review]
fix cert leak

Julien, this patch also needs to be checked in on
the NSS_3_9_BRANCH because this bug is targeted at
NSS 3.9.1.
Checked it in to NSS_3_9_BRANCH also .

Checking in pk11cert.c;
/cvsroot/mozilla/security/nss/lib/pk11wrap/pk11cert.c,v  <--  pk11cert.c
new revision: 1.128.4.1; previous revision: 1.128

Also, I determined that this bug was not a regression. It existed as far back as
NSS 3.2 (oldest NSS source tree I have), although the code used to live in
lib/certhigh in CERT_ImportCRL instead of PK11_ImportCRL .
Bishakha,

I reproduced a crash using your new script, but it was in vfychain :

vfychain -d PKITSdb -u 4 g:/nsstools/tmp/pkits_data/certs/ValidDSAParameterInher
itanceTest5EE.crt g:/nsstools/tmp/pkits_data/certs/DSAParametersInheritedCACert.
crt g:/nsstools/tmp/pkits_data/certs/DSACACert.crt
[1] + Done(133) ?
  33460 Trap    vfychain

I was not able to get the debugger to tell me where the crash happened (no
pop-up) due to the MKS shell intercepting the exception. Unfortunately the sh -X
is only effective on the parent shell program, but not on the sub-shells that
you call. We really need to find a fix for this problem !

Anyway, I think that crash is a different bug and would recommend opening
another bugzilla defect.
More findings :
1) In the shell script, using | tee $PKITS_LOG
is what causes the exception handler to be reset !
Unlike sh.exe, I could not find an argument to enable the system exception handler.
So, I just removed the | tee to get the debugger pop-up

2) Once I did this, I noticed in fact more assertions in crlutil upon delete.
They were bugs in the crlutil tool this time. After permanently deleting the CRL
from the database, the temporary object was not freed. I fixed two occurrences
of this bug, although the test only tripped on one of them.
I will attach a patch for this problem.

3) There is a remaining failure in vfychain. It is also an assertion caused by
strict shutdown. I have not yet figured out the cause for it or a patch.

Attached patch fix CRL leakSplinter Review
Attachment #139963 - Flags: superreview?(wchang0222)
Attachment #139963 - Flags: review?(MisterSSL)
I found that the vfychain trap is related to the DSA cert verification. If I
take out signature verification in lib/certhigh/certvfy.c :
	    /* rv = CERT_VerifySignedData(&subjectCert->signatureWrap,
				       issuerCert, t, wincx);
                                       */
and replace it with rv = SECSuccess;

Then the assertion does not occur in NSS_Shutdown().
I have tried debugging the code further. Simply removing the call to
CERT_VerifySignedDataWithPublicKey within CERT_VerifySignedData does not suffice
to fix the assertion. I have to also remove the code that extracts the key and
frees it, making CERT_VerifySignedData a no-op, for the test to pass.
I spent some time looking at seckey_ExtractPublicKey, but did not find anything
wrong with the code there.
I probably have to run only the vfychain program with my changes to find out
what the bug is, rather than re-rerunning the whole script.
Comment on attachment 139963 [details] [diff] [review]
fix CRL leak

sr=wtc. Please remove the "if (crl)" test in the
following code because 'crl' cannot be NULL at
that point.  (We just tested it before the
SEC_DeletePermCRL call.) The "crl = NULL;"
assignment is also not necessary because 'crl'
is a local variable and we are near the end of
that function.

>@@ -177,6 +178,10 @@
> 	return SECFailure;
>     }
>     rv = SEC_DeletePermCRL (crl);
>+    if (crl) {
>+        SEC_DestroyCrl(crl);
>+        crl = NULL;
>+    }
>     if (rv != SECSuccess) {
> 	SECU_PrintError
> 		(progName, "fail to delete the issuer %s's CRL from the perm database (reason: %s)",
Attachment #139963 - Flags: superreview?(wchang0222) → superreview+
Comment on attachment 139963 [details] [diff] [review]
fix CRL leak

At about line 180, the patch adds a NULL test of the variable crl.
However, just 6 lines earlier, the code tests crl for NULL and returns
if it is NULL.	So, crl cannot be null here.  

Please remove this unnecessary test of CRL for NULL.  Then check it in.  
Thanks.
Attachment #139963 - Flags: review?(MisterSSL) → review+
Checked in with suggested changes.

Tip :

Checking in crlutil.c;
/cvsroot/mozilla/security/nss/cmd/crlutil/crlutil.c,v  <--  crlutil.c
new revision: 1.21; previous revision: 1.20

3.9 branch :

Checking in crlutil.c;
/cvsroot/mozilla/security/nss/cmd/crlutil/crlutil.c,v  <--  crlutil.c
new revision: 1.20.38.1; previous revision: 1.20
Marking fixed.
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Marking verified for both crashes after import of CRL, and delete of CRL.
Status: RESOLVED → VERIFIED
Summary: crlutil asserts after importing CRL → crlutil asserts after importing and deleting CRL
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: