This crash was reported to us recently.

=> pk11_mkHandle(slot = 0xbd128, dbKey = 0x60, class = 1879048192U), line 2495 in "pkcs11u.c"
   pk11_searchCertsAndTrust(slot = 0xbd128, derCert = 0xfe1615e0, name = 0xfe1615c8, derSubject = 0xfe1615d4, issuerSN = 0xfe161580, email = 0xfe1615bc, classFlags = 1U, handles = 0x340870, pTemplate = 0xfe1617c8, ulCount = 3), line 3800 in "pkcs11.c"
   pk11_searchTokenList(slot = 0xbd128, search = 0x340870, pTemplate = 0xfe1617c8, ulCount = 3, tokenOnly = 0xfe161668, isLoggedIn = 0), line 4041 in "pkcs11.c"
   NSC_FindObjectsInit(hSession = 16777217U, pTemplate = 0xfe1617c8, ulCount = 3U), line 4098 in "pkcs11.c"
   traverse_objects_by_template(tok = 0x103648, sessionOpt = (nil), obj_template = 0xfe1617c8, otsize = 3U, callback = 0xfe6a3da0 = &`libnss3.so`devobject.c`retrieve_cert(struct NSSTokenStr *t, struct nssSessionStr *session, CK_OBJECT_HANDLE h, void *arg), arg = 0xfe161860), line 223 in "devobject.c"
   nssToken_TraverseCertificatesBySubject(token = 0x103648, sessionOpt = (nil), subject = 0xfe161910, search = 0xfe161860), line 643 in "devobject.c"
   NSSTrustDomain_FindBestCertificateBySubject(td = 0x1034b8, subject = 0xfe161910, timeOpt = (nil), usage = 0xfe161904, policiesOpt = (nil)), line 638 in "trustdomain.c"
   CERT_FindCertByName(handle = 0x1034b8, name = 0x1342c4), line 337 in "stanpcertdb.c"

(Note that this stack is similar to the stack in http://bugzilla.mozilla.org/show_bug.cgi?id=132548#c1.)

This crash is caused by dereferencing a null pointer. Here is what I think causes the crash: In pk11_searchCertsAndTrust(), the local variable 'cert' is set to NULL because some other function failed earlier. Then we pass &cert->certKey as the 'dbKey' argument to pk11_mkHandle. (If 'cert' is NULL, &cert->certKey is 0x60, which is the value of 'dbKey' in the stack trace.) pk11_mkHandle then crashes when it dereferences that bogus 'dbKey' pointer.
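To illustrate why a NULL 'cert' shows up as dbKey = 0x60 rather than 0, here is a minimal sketch in C. DemoCert and DemoItem are hypothetical stand-ins, not the real NSS structures; the 0x60 bytes of padding are an assumption chosen only so that certKey lands at the offset seen in the stack trace. The sketch uses offsetof to compute the member offset safely instead of evaluating &cert->certKey on a null pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a length-prefixed data item. */
typedef struct {
    unsigned char *data;
    unsigned int len;
} DemoItem;

/* Hypothetical stand-in for the cert structure. Only the layout matters:
 * the placeholder array stands for whatever members precede certKey, and
 * its size is an assumption picked to reproduce the 0x60 from the trace. */
typedef struct {
    unsigned char earlierFields[0x60];
    DemoItem certKey;
} DemoCert;

/* The address &cert->certKey is "cert plus this offset", so with a NULL
 * cert the resulting pointer value equals the offset itself. */
static size_t cert_key_offset(void)
{
    return offsetof(DemoCert, certKey);
}

/* Sanity check that the demo layout matches the value in the stack trace. */
static void check_demo_layout(void)
{
    assert(cert_key_offset() == 0x60);
}

/* Guarded accessor: yields NULL for a NULL cert instead of a small,
 * non-NULL bogus address that would slip past a later NULL check. */
static DemoItem *cert_key_or_null(DemoCert *cert)
{
    return cert ? &cert->certKey : NULL;
}
```

The point of the guarded accessor is that a pointer such as 0x60 is not NULL, so a callee that only checks its argument against NULL would still dereference it.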
I've determined that this is caused by nsslowcert_TraversePermCertsForNickname, nsslowcert_TraversePermCertsForSubject, or nsslowcert_TraversePermCerts passing a null 'cert' argument to the callback function pk11_cert_collect. pk11_cert_collect then stores the null 'cert' in the certData.certs array. It would be easy to modify pk11_cert_collect so that it does not store a null 'cert' in the certData.certs array, but we might need to find out why the nsslowcert_TraversePermCerts* functions get a null 'cert' in the first place. Looking at the code, I found that this happens when nsslowcert_FindCertByKey (pcertdb.c:2874), DecodeACert (pcertdb.c:3857 and 4111), or ultimately nsslowcert_DecodeDERCertificate (pcertdb.c:3799) fails.
Created attachment 78551
Patch to prevent pk11_cert_collect from storing NULL in certData.certs

This patch may prevent the crash but I am wondering if we should fix the underlying problem, that is, the nsslowcert_DecodeDERCertificate failure.
DER_DecodeCert could fail if the DER cert data isn't valid. This could occur because the database had been corrupted at some point, or because we've somehow injected an invalid certificate into the database. Wan-Teh's fix seems reasonable to me. We may want to review the code and see where else a corrupted certificate may cause us problems. We should probably look at the cert and make sure it's a corruption problem and not a valid cert that isn't being parsed by our templates.

bob
Comment on attachment 78551
Patch to prevent pk11_cert_collect from storing NULL in certData.certs

Just wanted to note that Bob has reviewed this patch and I've checked it in on the trunk. I specifically asked Bob whether pk11_cert_collect should return SECSuccess if 'cert' is NULL. I explained that if pk11_cert_collect returns SECFailure, nsslowcert_TraversePermCertsForSubject will abort the cert traversal. Bob said returning SECSuccess is correct.
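The behavior discussed above (skip a null 'cert' but still report success so the traversal keeps going) can be sketched as follows. This is not the attached patch or the real NSS code: every name here (DemoStatus, DemoCertData, demo_cert_collect, demo_traverse, DEMO_MAX_CERTS) is a hypothetical stand-in, and the fixed-size array is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes, playing the role of SECSuccess/SECFailure. */
typedef enum { DEMO_SUCCESS = 0, DEMO_FAILURE = -1 } DemoStatus;

typedef struct { int id; } DemoCert;

#define DEMO_MAX_CERTS 8

/* Hypothetical collector state, playing the role of certData. */
typedef struct {
    DemoCert *certs[DEMO_MAX_CERTS];
    int count;
} DemoCertData;

typedef DemoStatus (*DemoCertCallback)(DemoCert *cert, void *arg);

/* Collector callback: a NULL cert (for example, one that failed to decode)
 * is skipped, but success is returned so the caller keeps traversing. */
static DemoStatus demo_cert_collect(DemoCert *cert, void *arg)
{
    DemoCertData *cd = (DemoCertData *)arg;

    if (cert == NULL) {
        return DEMO_SUCCESS; /* nothing stored; do not abort the traversal */
    }
    if (cd->count >= DEMO_MAX_CERTS) {
        return DEMO_FAILURE; /* out of room: a genuine error */
    }
    cd->certs[cd->count++] = cert;
    return DEMO_SUCCESS;
}

/* Traversal that stops at the first callback failure, which is why the
 * callback must not report failure for an entry it merely wants to skip. */
static DemoStatus demo_traverse(DemoCert **entries, size_t n,
                                DemoCertCallback cb, void *arg)
{
    assert(cb != NULL);
    for (size_t i = 0; i < n; i++) {
        DemoStatus rv = cb(entries[i], arg);
        if (rv != DEMO_SUCCESS) {
            return rv;
        }
    }
    return DEMO_SUCCESS;
}
```

With entries such as { &a, NULL, &b }, the collector ends up holding only the two valid certs. Had the callback returned failure for the NULL entry, the traversal would have stopped before reaching the last one.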
We should land this on the branch. Nominating adt1.0.0. I will send mail to drivers.
Comment on attachment 78551
Patch to prevent pk11_cert_collect from storing NULL in certData.certs

firstname.lastname@example.org
adt1.0.0+ (on ADT's behalf) for checkin to 1.0. Please check this into the trunk and the 1.0 branch.
I checked in the fix on the MOZILLA_1_0_0_BRANCH.
Thanks, Wan-Teh. Should we check this in to NSS_CLIENT_TAG, too?
When checking a fix into 1.0, please add fixed1.0.0. Thanks!
Removing adt1.0.0+, as this is fixed on 1.0 branch.
Changed the QA contact to Bishakha.
I checked in the fix on the NSS_3_4_BRANCH.
Set target milestone 3.4.2.
Marked the bug fixed.