Last Comment Bug 320029 - NSS crashes trying to make cert8.db from cert5.db
: NSS crashes trying to make cert8.db from cert5.db
Status: RESOLVED FIXED
REGRESSION in 3.11, fixed in 3.11.1, ...
: crash
Product: NSS
Classification: Components
Component: Libraries (show other bugs)
: 3.11
: Sun SunOS
: P2 normal (vote)
: 3.11.1
Assigned To: Robert Relyea
:
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2005-12-12 14:02 PST by Nelson Bolyard (seldom reads bugmail)
Modified: 2006-10-25 11:50 PDT (History)
4 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---


Attachments
patch v1 (876 bytes, patch)
2005-12-12 17:56 PST, Nelson Bolyard (seldom reads bugmail)
rrelyea: review+
julien.pierre: superreview+
Details | Diff | Review

Description Nelson Bolyard (seldom reads bugmail) 2005-12-12 14:02:51 PST
I took a working cert5.db from an installation of an old (but still supported)
NSS-based server, and tried to make a cert8.db from it, using NSS 3.11 and 
this command:

   certutil -L -d dbdir -X 

certutil crashed.  Completely repeatable (at least on Solaris).
The directory contained an old secmod.db, an old keydb, and cert5.db.
I traced through it a bit and found that NSS succesfully copied one 
cert from the old DB to the new, and then crashed trying to get the 
second cert.  The problem seems to be a reference counting issue with
the old DB.  Seems that when the old CERTCertificate was destroyed, 
the old DB was closed (its ref count went to zero), so the DB was 
closed when the second cert read attempt was made.  

So, there are actually two issues here:
a) dbm should make at least SOME check on entry to each external method
that the DB structure is in a valid state, and not crash if it is not.
b) the ref counting in NSS needs to be fixed, at least in the DB conversion
code for DB5.  

This is probably going to become a big issue at these older server products
finally come to EOL, and their DBs need to be converted.  

See also bug 320027, about the fact that we have no regular QA testing of 
this NSS functionality.  

I will attach a cert8.db file and a stack trace to this bug.
Comment 1 Nelson Bolyard (seldom reads bugmail) 2005-12-12 14:10:24 PST
Marking P2 for 3.11.1.  But leaving state unconfirmed until I have more info.
Comment 2 Robert Relyea 2005-12-12 15:57:55 PST
I presume you mean attack a cert5.db;).

bob
Comment 3 timeless 2005-12-12 17:00:07 PST
i'd rather he attach a cert5.db than attack one :-)
Comment 4 Nelson Bolyard (seldom reads bugmail) 2005-12-12 17:53:36 PST
Stack trace at crash.

=>[1] certdb_Seq(db = (nil), key = 0xfffffd7fffdfd598, data = 0xfffffd7fffdfd5a8, flags = 0x7U), line 337 in "pcertdb.c"
  [2] nsslowcert_TraverseDBEntries(handle = 0xfffffd7fffdfd700, type = certDBEntryTypeCert, callback = 0xfffffd7ffed4de70 = &`libsoftokn3.so`pcertdb.c`certcallback(struct SECItemStr *dbdata, struct SECItemStr *dbkey, certDBEntryType type, void *data), udata = 0xfffffd7fffdfd628), line 4307 in "pcertdb.c"
  [3] TraversePermCertsNoLocking(handle = 0xfffffd7fffdfd700, certfunc = 0xfffffd7ffed4c610 = &`libsoftokn3.so`pcertdb.c`updateV5Callback(struct NSSLOWCERTCertificateStr *cert, struct SECItemStr *k, void *pdata), udata = 0x4553a0), line 4473 in "pcertdb.c"
  [4] nsslowcert_TraversePermCerts(handle = 0xfffffd7fffdfd700, certfunc = 0xfffffd7ffed4c610 = &`libsoftokn3.so`pcertdb.c`updateV5Callback(struct NSSLOWCERTCertificateStr *cert, struct SECItemStr *k, void *pdata), udata = 0x4553a0), line 4491 in "pcertdb.c"
  [5] UpdateV5DB(handle = 0x4553a0, updatedb = 0x48f6d0), line 3854 in "pcertdb.c"
  [6] openNewCertDB(appName = (nil), prefix = 0x4494a0 "", certdbname = 0x44de30 "/export/home/nb95248/tmp/telstra/old/cert8.db", handle = 0x4553a0, namecb = 0xfffffd7ffed2cc70 = &`libsoftokn3.so`dbinit.c`sftk_certdb_name_cb(void *arg, int dbVersion), cbarg = 0x456a00), line 4094 in "pcertdb.c"
  [7] nsslowcert_OpenPermCertDB(handle = 0x4553a0, readOnly = 0, appName = (nil), prefix = 0x4494a0 "", namecb = 0xfffffd7ffed2cc70 = &`libsoftokn3.so`dbinit.c`sftk_certdb_name_cb(void *arg, int dbVersion), cbarg = 0x456a00), line 4175 in "pcertdb.c"
  [8] nsslowcert_OpenCertDB(handle = 0x4553a0, readOnly = 0, appName = (nil), prefix = 0x4494a0 "", namecb = 0xfffffd7ffed2cc70 = &`libsoftokn3.so`dbinit.c`sftk_certdb_name_cb(void *arg, int dbVersion), cbarg = 0x456a00, openVolatile = 0), line 4663 in "pcertdb.c"
  [9] sftk_OpenCertDB(configdir = 0x44deb0 "/export/home/nb95248/tmp/telstra/old", prefix = 0x4494a0 "", readOnly = 0, certdbPtr = 0xfffffd7fffdfda88), line 172 in "dbinit.c"
  [10] sftk_DBInit(configdir = 0x44deb0 "/export/home/nb95248/tmp/telstra/old", certPrefix = 0x4494a0 "", keyPrefix = 0x449520 "", readOnly = 0, noCertDB = 0, noKeyDB = 0, forceOpen = 0, certdbPtr = 0xfffffd7fffdfda88, keydbPtr = 0xfffffd7fffdfda80), line 240 in "dbinit.c"
  [11] SFTK_SlotReInit(slot = 0x470230, configdir = 0x44deb0 "/export/home/nb95248/tmp/telstra/old", params = 0x451b10, moduleIndex = 0), line 2566 in "pkcs11.c"
  [12] SFTK_SlotInit(configdir = 0x44deb0 "/export/home/nb95248/tmp/telstra/old", params = 0x451b10, moduleIndex = 0), line 2672 in "pkcs11.c"
  [13] nsc_CommonInitialize(pReserved = 0xfffffd7fffdfdcf0, isFIPS = 0), line 3030 in "pkcs11.c"
  [14] NSC_Initialize(pReserved = 0xfffffd7fffdfdcf0), line 3052 in "pkcs11.c"
  [15] secmod_ModuleInit(mod = 0x456aa0, alreadyLoaded = 0xfffffd7fffdfddb0), line 150 in "pk11load.c"
  [16] SECMOD_LoadPKCS11Module(mod = 0x456aa0), line 322 in "pk11load.c"
  [17] SECMOD_LoadModule(modulespec = 0x457510 " name="NSS Internal PKCS #11 Module" parameters="configdir='/export/home/nb95248/tmp/telstra/old' certPrefix='' keyPrefix='' secmod='secmod.db' flags= " NSS="trustOrder=75 cipherOrder=100 slotParams={0x00000001=[slotFlags=RSA,RC4,RC2,DES,DH,SHA1,MD5,MD2,SSL,TLS,AES,RANDOM askpw=any timeout=30 ] }  Flags=internal,critical"", parent = 0x451350, recurse = 0x1), line 323 in "pk11pars.c"
  [18] SECMOD_LoadModule(modulespec = 0x450440 "name="NSS Internal Module" parameters="configdir='/export/home/nb95248/tmp/telstra/old' certPrefix='' keyPrefix='' secmod='secmod.db' flags= " NSS="flags=internal,moduleDB,moduleDBOnly,critical"", parent = (nil), recurse = 0x1), line 338 in "pk11pars.c"
  [19] nss_Init(configdir = 0x447698 "/export/home/nb95248/tmp/telstra/old", certPrefix = 0x446330 "", keyPrefix = 0x446330 "", secmodName = 0x42a1a8 "secmod.db", readOnly = 0, noCertDB = 0, noModDB = 0, forceOpen = 0, noRootInit = 0, optimizeSpace = 0, noSingleThreadedModules = 0, allowAlreadyInitializedModules = 0, dontFinalizeModules = 0), line 467 in "nssinit.c"
  [20] NSS_Initialize(configdir = 0x447698 "/export/home/nb95248/tmp/telstra/old", certPrefix = 0x446330 "", keyPrefix = 0x446330 "", secmodName = 0x42a1a8 "secmod.db", flags = 0), line 575 in "nssinit.c"
  [21] certutil_main(argc = 0x5, argv = 0xfffffd7fffdfeb68, initialize = 0x1), line 2798 in "certutil.c"
  [22] main(argc = 0x5, argv = 0xfffffd7fffdfeb68), line 3161 in "certutil.c"
(dbx) 


Here is the stack trace at the point where the old cert DB is prematurely 
closed.  

=>[1] sftk_freeCertDB(certHandle = 0xfffffd7fffdfd700), line 292 in "dbinit.c"
  [2] AddCertToPermDB(handle = 0x4553a0, cert = 0x499db0, nickname = 0x499430 "Thawte Server CA", trust = 0x499400), line 3400 in "pcertdb.c"
  [3] updateV5Callback(cert = 0x499db0, k = 0xfffffd7fffdfd560, pdata = 0x4553a0), line 3836 in "pcertdb.c"
  [4] certcallback(dbdata = 0xfffffd7fffdfd578, dbkey = 0xfffffd7fffdfd560, type = certDBEntryTypeCert, data = 0xfffffd7fffdfd628), line 4441 in "pcertdb.c"
  [5] nsslowcert_TraverseDBEntries(handle = 0xfffffd7fffdfd700, type = certDBEntryTypeCert, callback = 0xfffffd7ffed4de70 = &`libsoftokn3.so`pcertdb.c`certcallback(struct SECItemStr *dbdata, struct SECItemStr *dbkey, certDBEntryType type, void *data), udata = 0xfffffd7fffdfd628), line 4302 in "pcertdb.c"
  [6] TraversePermCertsNoLocking(handle = 0xfffffd7fffdfd700, certfunc = 0xfffffd7ffed4c610 = &`libsoftokn3.so`pcertdb.c`updateV5Callback(struct NSSLOWCERTCertificateStr *cert, struct SECItemStr *k, void *pdata), udata = 0x4553a0), line 4473 in "pcertdb.c"
  [7] nsslowcert_TraversePermCerts(handle = 0xfffffd7fffdfd700, certfunc = 0xfffffd7ffed4c610 = &`libsoftokn3.so`pcertdb.c`updateV5Callback(struct NSSLOWCERTCertificateStr *cert, struct SECItemStr *k, void *pdata), udata = 0x4553a0), line 4491 in "pcertdb.c"
  [8] UpdateV5DB(handle = 0x4553a0, updatedb = 0x48f6d0), line 3854 in "pcertdb.c"
  [9] openNewCertDB(appName = (nil), prefix = 0x4494a0 "", certdbname = 0x44de30 "/export/home/nb95248/tmp/telstra/old/cert8.db", handle = 0x4553a0, namecb = 0xfffffd7ffed2cc70 = &`libsoftokn3.so`dbinit.c`sftk_certdb_name_cb(void *arg, int dbVersion), cbarg = 0x456a00), line 4094 in "pcertdb.c"
  [10] nsslowcert_OpenPermCertDB(handle = 0x4553a0, readOnly = 0, appName = (nil), prefix = 0x4494a0 "", namecb = 0xfffffd7ffed2cc70 = &`libsoftokn3.so`dbinit.c`sftk_certdb_name_cb(void *arg, int dbVersion), cbarg = 0x456a00), line 4175 in "pcertdb.c"
  [11] nsslowcert_OpenCertDB(handle = 0x4553a0, readOnly = 0, appName = (nil), prefix = 0x4494a0 "", namecb = 0xfffffd7ffed2cc70 = &`libsoftokn3.so`dbinit.c`sftk_certdb_name_cb(void *arg, int dbVersion), cbarg = 0x456a00, openVolatile = 0), line 4663 in "pcertdb.c"
  [12] sftk_OpenCertDB(configdir = 0x44deb0 "/export/home/nb95248/tmp/telstra/old", prefix = 0x4494a0 "", readOnly = 0, certdbPtr = 0xfffffd7fffdfda88), line 172 in "dbinit.c"
  [13] sftk_DBInit(configdir = 0x44deb0 "/export/home/nb95248/tmp/telstra/old", certPrefix = 0x4494a0 "", keyPrefix = 0x449520 "", readOnly = 0, noCertDB = 0, noKeyDB = 0, forceOpen = 0, certdbPtr = 0xfffffd7fffdfda88, keydbPtr = 0xfffffd7fffdfda80), line 240 in "dbinit.c"
  [14] SFTK_SlotReInit(slot = 0x470230, configdir = 0x44deb0 "/export/home/nb95248/tmp/telstra/old", params = 0x451b10, moduleIndex = 0), line 2566 in "pkcs11.c"
  [15] SFTK_SlotInit(configdir = 0x44deb0 "/export/home/nb95248/tmp/telstra/old", params = 0x451b10, moduleIndex = 0), line 2672 in "pkcs11.c"
  [16] nsc_CommonInitialize(pReserved = 0xfffffd7fffdfdcf0, isFIPS = 0), line 3030 in "pkcs11.c"
  [17] NSC_Initialize(pReserved = 0xfffffd7fffdfdcf0), line 3052 in "pkcs11.c"
  [18] secmod_ModuleInit(mod = 0x456aa0, alreadyLoaded = 0xfffffd7fffdfddb0), line 150 in "pk11load.c"
  [19] SECMOD_LoadPKCS11Module(mod = 0x456aa0), line 322 in "pk11load.c"
  [20] SECMOD_LoadModule(modulespec = 0x457510 " name="NSS Internal PKCS #11 Module" parameters="configdir='/export/home/nb95248/tmp/telstra/old' certPrefix='' keyPrefix='' secmod='secmod.db' flags= " NSS="trustOrder=75 cipherOrder=100 slotParams={0x00000001=[slotFlags=RSA,RC4,RC2,DES,DH,SHA1,MD5,MD2,SSL,TLS,AES,RANDOM askpw=any timeout=30 ] }  Flags=internal,critical"", parent = 0x451350, recurse = 0x1), line 323 in "pk11pars.c"
  [21] SECMOD_LoadModule(modulespec = 0x450440 "name="NSS Internal Module" parameters="configdir='/export/home/nb95248/tmp/telstra/old' certPrefix='' keyPrefix='' secmod='secmod.db' flags= " NSS="flags=internal,moduleDB,moduleDBOnly,critical"", parent = (nil), recurse = 0x1), line 338 in "pk11pars.c"
  [22] nss_Init(configdir = 0x447698 "/export/home/nb95248/tmp/telstra/old", certPrefix = 0x446330 "", keyPrefix = 0x446330 "", secmodName = 0x42a1a8 "secmod.db", readOnly = 0, noCertDB = 0, noModDB = 0, forceOpen = 0, noRootInit = 0, optimizeSpace = 0, noSingleThreadedModules = 0, allowAlreadyInitializedModules = 0, dontFinalizeModules = 0), line 467 in "nssinit.c"
  [23] NSS_Initialize(configdir = 0x447698 "/export/home/nb95248/tmp/telstra/old", certPrefix = 0x446330 "", keyPrefix = 0x446330 "", secmodName = 0x42a1a8 "secmod.db", flags = 0), line 575 in "nssinit.c"
  [24] certutil_main(argc = 0x5, argv = 0xfffffd7fffdfeb68, initialize = 0x1), line 2798 in "certutil.c"
  [25] main(argc = 0x5, argv = 0xfffffd7fffdfeb68), line 3161 in "certutil.c"
(dbx) 

The problem appears to be in function UpdateV5DB in pcertdb.c.
The ref count member of the "updatehandle" struct is never initialized.
If the value happens to be non-zero, then this code succeeds, but if it 
is zero, the code crashes.  Initializing the ref count to 1 fixes the 
problem.

It appears to me that this code has had this bug for at least 4 years,
and perhaps longer.  This begs some questions, such as 
a) when did this break?
b) what change to what code broke it?
c) what was the most recent version in which an update from v5 worked (if any)?

Patch forthcoming.  

(Yes, I meant cert5.db, not cert8.  However, I need to get permission 
before I can attach that db file to this bug. :-(
Comment 5 Nelson Bolyard (seldom reads bugmail) 2005-12-12 17:56:18 PST
Created attachment 205683 [details] [diff] [review]
patch v1

Bob, please review.  
It looks like this code should always have had this problem.
I can't see how it ever worked, yet I know that it did at one time. :-/
Bob, any idean when/how it broke?
Comment 6 Robert Relyea 2005-12-13 14:52:28 PST
Comment on attachment 205683 [details] [diff] [review]
patch v1

This patch is sufficient for NSS 3.11 for NSS 3.12, I'd like to see an initializer function for NSSLOWCERTCertDBHandle along the lines of nsslowkey_NewHandle() for NSSLOWKEYDBHandle. Both UpdateV5DB and nsslowcert_OpenCertDB would use this initializer.
Comment 7 Robert Relyea 2005-12-13 14:54:21 PST
This is a regression on NSS 3.11. Prior to 3.11, the database data structure was not reference counted, but initialized only at startup and shutdown.

bob
Comment 8 Robert Relyea 2005-12-13 15:04:04 PST
so my comment #7 was meant to address nelson's question, but I realized it wasn't complete:

In NSS 3.11 I added the ability to open and close key and cert databases on the fly while Softoken was still running. This involved reference counting the databases. Whenever some code needed to access a database handle it would get a safe reference, then it would know the database handle would not disappear on it before it was through. There were several ad hoc versions of the key database, so a new 'constructor' was created to safely build these handles and make sure they were all properly initialized. I obviously missed the one ad hoc cert handle here. 

I believe keeping the around as a stack data structure is error prone, so in 3.12 we should change it to a pointer and create a constructor for the cert db handle much like the constructor for the key db handle.

bob
Comment 9 Nelson Bolyard (seldom reads bugmail) 2005-12-19 19:05:37 PST
Comment on attachment 205683 [details] [diff] [review]
patch v1

Julien, please SR for 3.11.1
Comment 10 Julien Pierre 2006-01-17 19:33:52 PST
Comment on attachment 205683 [details] [diff] [review]
patch v1

r+ for 3.11.1 . I agree with Bob's comments that we should use the new constructor function for 3.12 .
Comment 11 Nelson Bolyard (seldom reads bugmail) 2006-01-18 18:06:32 PST
Fix committed on 3.11 branch

/cvsroot/mozilla/security/nss/lib/softoken/pcertdb.c,v  <--  pcertdb.c
new revision: 1.53.2.1; previous revision: 1.53

Comment 12 Nelson Bolyard (seldom reads bugmail) 2006-01-18 18:19:26 PST
Checked in on trunk.
Checking in pcertdb.c;  new revision: 1.55; previous revision: 1.54

Bob and Julien, I know that there is another fix that you'd prefer over 
this one for the trunk.  But I don't know what that other preferred change
is, exactly.  So, I checked in the 3.11 fix on the trunk, too.  

Bob, please feel free to take this bug now, and make a patch for the change 
that you envisioned for the trunk.  
Comment 13 Robert Relyea 2006-01-19 09:52:09 PST
Checking in this patch to the trunk is fine for now. I'll take to bug and add the updated fix.

The reason we want the update is to reduce the changes of future problems. The reason we don't want it in 3.11 is because it would involve some extra rewrites.

bob
Comment 14 Nelson Bolyard (seldom reads bugmail) 2006-10-25 11:50:28 PDT
Bob, this bug is now fixed on the trunk and in NSS 3.11.1
If you still want to produce an enhanced fix for the trunk, 
please open an RFE on it.

Note You need to log in before you can comment on or make changes to this bug.