Closed Bug 231051 Opened 21 years ago Closed 21 years ago

crlutil asserts after importing and deleting CRL

Categories

(NSS :: Tools, defect, P1)

Other
Linux
defect

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: bishakhabanerjee, Assigned: julien.pierre)

Details

Attachments

(5 files)

Have already spoken to Julien on this, and through testing have determined that it has to do with paths While doing a crlutil -I -i <CRL.crl> -d <DB> inside the NSS test harness, coredumps on Linux. It does the import, and then crashes. A crlutil -L -d <DB> does list the CRL that was being imported. Using the gdb that comes on my RedHat AS system, looking at the core file: haddock:/u/bishakhabanerjee/mozilla 20% gdb ../../../../dist/Linux2.4_x86_glibc_PTH_DBG.OBJ/bin/crlutil -core core GNU gdb Red Hat Linux (5.2-2) Copyright 2002 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-redhat-linux"... Core was generated by `crlutil -I -i /u/bishakhabanerjee/mozilla_1/PKITS_DATA/cr ls/GoodCACRL.crl -d ..'. Program terminated with signal 6, Aborted. Reading symbols from /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG .OBJ/lib/libssl3.so...done. Loaded symbols for /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG.O BJ/lib/libssl3.so Reading symbols from /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG .OBJ/lib/libsmime3.so...done. Loaded symbols for /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG.O BJ/lib/libsmime3.so Reading symbols from /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG .OBJ/lib/libnss3.so...done. Loaded symbols for /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG.O BJ/lib/libnss3.so Reading symbols from /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG .OBJ/lib/libplc4.so...done. Loaded symbols for /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG.O BJ/lib/libplc4.so Reading symbols from /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG .OBJ/lib/libplds4.so...done. Loaded symbols for /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG.O BJ/lib/libplds4.so Reading symbols from /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG .OBJ/lib/libnspr4.so...done. Loaded symbols for /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG.O BJ/lib/libnspr4.so Reading symbols from /lib/i686/libpthread.so.0...done. Loaded symbols for /lib/i686/libpthread.so.0 Reading symbols from /lib/libdl.so.2...done. Loaded symbols for /lib/libdl.so.2 ---Type <return> to continue, or q <return> to quit--- Reading symbols from /lib/i686/libc.so.6...done. Loaded symbols for /lib/i686/libc.so.6 Reading symbols from /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG .OBJ/lib/libsoftokn3.so...done. Loaded symbols for /u/bishakhabanerjee/mozilla/dist/Linux2.4_x86_glibc_PTH_DBG.O BJ/lib/libsoftokn3.so Reading symbols from /lib/ld-linux.so.2...done. Loaded symbols for /lib/ld-linux.so.2 #0 0x401ebb41 in __kill () at __kill:-1 -1 __kill: No such file or directory. in __kill
Doesn't crash after doing the import
Bishakha, please try this test on Solaris. core files are often more useful on that platform.
this test script uses the NSS test harness. When running it, it will create a HOSTDIR.1 under mozilla/tests_results This crashes everytime *after* doing a crl import. A crlutil -L will tell you that the CRL was imported.
the two scripts that I have attached above can be copied into mozilla/security/nss/tests/pkits and run from there, e.g. as test1.sh and test2.sh PKITS_DATA can be set to /share/builds/sbstools/nsstools/tmp/PKITS_data
I reproduced a "trap" on Windows running the test2 script. Actually it is an assertion during NSS_Shutdown NTDLL! DbgBreakPoint address 0x77fa144b SECMOD_Shutdown() line 112 + 36 bytes NSS_Shutdown() line 556 + 5 bytes main(int 0x00000006, char * * 0x004b4e18) line 473 + 5 bytes CRLUTIL! mainCRTStartup + 227 bytes KERNEL32! ProcessIdToSessionId + 381 bytes The line is PORT_Assert(secmod_PrivateModuleCount == 0);
Ah HA!. A reference count leak. Some object is not being destroyed before calling NSS_Shutdown, which in turn prevents the module from being unloaded, which prevents the module count from going to zero. This assertion only occurs when a certain environment variable is set, and apparently the "test harness" sets that variable. So, I think this is a P1 bug. Previous ref count error bugs have been P1.
Priority: -- → P1
Target Milestone: --- → 3.9.1
On another bug, there was a comment on why this was not being seen more often. If crlutil does not do teh import of a CRL for valid reasons, like "Peer's certificate issuer is bad" or "Bad signature CRL", crlutil says so, and does not crash. It is only when crlutil sees a valid CRL, and imports it, and then crashes. So the path followed in the latter case does free its pbjects correctly.
Besides what is already mentioned, the following command also causes a crash: crlutil -D (deleting a CRL from the database) but the command: crlutil -L (listing the CRLs in the database) does not crash
Updating bug title.
Summary: crlutil crashes while importing CRL → crlutil asserts after importing CRL
Attached patch fix cert leakSplinter Review
The CA certificate used for verifying the signature was being leaked during the import.
Attachment #139773 - Flags: superreview?(wchang0222)
Attachment #139773 - Flags: review?(MisterSSL)
Comment on attachment 139773 [details] [diff] [review] fix cert leak r=wtc.
Attachment #139773 - Flags: superreview?(wchang0222) → superreview+
Comment on attachment 139773 [details] [diff] [review] fix cert leak good job!
Attachment #139773 - Flags: review?(MisterSSL) → review+
Julien, Re comment #8, I can still see the crash when deleting a CRL. Crash after import has gone away. Attaching test script, which can be copied to nss/tests/pkits directory.
Checking in pk11cert.c; /cvsroot/mozilla/security/nss/lib/pk11wrap/pk11cert.c,v <-- pk11cert.c new revision: 1.129; previous revision: 1.128 done Bishakha, I will look into the other crash on delete, but it is most likely in a different codepath.
Comment on attachment 139773 [details] [diff] [review] fix cert leak Julien, this patch also needs to be checked in on the NSS_3_9_BRANCH because this bug is targeted at NSS 3.9.1.
Checked it in to NSS_3_9_BRANCH also . Checking in pk11cert.c; /cvsroot/mozilla/security/nss/lib/pk11wrap/pk11cert.c,v <-- pk11cert.c new revision: 1.128.4.1; previous revision: 1.128 Also, I determined that this bug was not a regression. It existed as far back as NSS 3.2 (oldest NSS source tree I have), although the code used to live in lib/certhigh in CERT_ImportCRL instead of PK11_ImportCRL .
Bishakha, I reproduced a crash using your new script, but it was in vfychain : vfychain -d PKITSdb -u 4 g:/nsstools/tmp/pkits_data/certs/ValidDSAParameterInher itanceTest5EE.crt g:/nsstools/tmp/pkits_data/certs/DSAParametersInheritedCACert. crt g:/nsstools/tmp/pkits_data/certs/DSACACert.crt [1] + Done(133) ? 33460 Trap vfychain I was not able to get the debugger to tell me where the crash happened (no pop-up) due to the MKS shell intercepting the exception. Unfortunately the sh -X is only effective on the parent shell program, but not on the sub-shells that you call. We really need to find a fix for this problem ! Anyway, I think that crash is a different bug and would recommend opening another bugzilla defect.
More findings : 1) In the shell script, using | tee $PKITS_LOG is what causes the exception handler to be reset ! Unlike sh.exe, I could not find an argument to enable the system exception handler. So, I just removed the | tee to get the debugger pop-up 2) Once I did this, I noticed in fact more assertions in crlutil upon delete. They were bugs in the crlutil tool this time. After permanently deleting the CRL from the database, the temporary object was not freed. I fixed two occurrences of this bug, although the test only tripped on one of them. I will attach a patch for this problem. 3) There is a remaining failure in vfychain. It is also an assertion caused by strict shutdown. I have not yet figured out the cause for it or a patch.
Attached patch fix CRL leakSplinter Review
Attachment #139963 - Flags: superreview?(wchang0222)
Attachment #139963 - Flags: review?(MisterSSL)
I found that the vfychain trap is related to the DSA cert verification. If I take out signature verification in lib/certhigh/certvfy.c : /* rv = CERT_VerifySignedData(&subjectCert->signatureWrap, issuerCert, t, wincx); */ and replace it with rv = SECSuccess; Then the assertion does not occur in NSS_Shutdown(). I have tried debugging the code further. Simply removing the call to CERT_VerifySignedDataWithPublicKey within CERT_VerifySignedData does not suffice to fix the assertion. I have to also remove the code that extracts the key and frees it, making CERT_VerifySignedData a no-op, for the test to pass. I spent some time looking at seckey_ExtractPublicKey, but did not find anything wrong with the code there. I probably have to run only the vfychain program with my changes to find out what the bug is, rather than re-rerunning the whole script.
Comment on attachment 139963 [details] [diff] [review] fix CRL leak sr=wtc. Please remove the "if (crl)" test in the following code because 'crl' cannot be NULL at that point. (We just tested it before the SEC_DeletePermCRL call.) The "crl = NULL;" assignment is also not necessary because 'crl' is a local variable and we are near the end of that function. >@@ -177,6 +178,10 @@ > return SECFailure; > } > rv = SEC_DeletePermCRL (crl); >+ if (crl) { >+ SEC_DestroyCrl(crl); >+ crl = NULL; >+ } > if (rv != SECSuccess) { > SECU_PrintError > (progName, "fail to delete the issuer %s's CRL from the perm database (reason: %s)",
Attachment #139963 - Flags: superreview?(wchang0222) → superreview+
Comment on attachment 139963 [details] [diff] [review] fix CRL leak At about line 180, the patch adds a NULL test of the variable crl. However, just 6 lines earlier, the code tests crl for NULL and returns if it is NULL. So, crl cannot be null here. Please remove this unnecessary test of CRL for NULL. Then check it in. Thanks.
Attachment #139963 - Flags: review?(MisterSSL) → review+
Checked in with suggested changes. Tip : Checking in crlutil.c; /cvsroot/mozilla/security/nss/cmd/crlutil/crlutil.c,v <-- crlutil.c new revision: 1.21; previous revision: 1.20 3.9 branch : Checking in crlutil.c; /cvsroot/mozilla/security/nss/cmd/crlutil/crlutil.c,v <-- crlutil.c new revision: 1.20.38.1; previous revision: 1.20
Marking fixed.
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Marking verified for both crashes after import of CRL, and delete of CRL.
Status: RESOLVED → VERIFIED
Summary: crlutil asserts after importing CRL → crlutil asserts after importing and deleting CRL
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: