Closed Bug 394040 Opened 16 years ago Closed 16 years ago
Tstclnt crashed in NISCC testing
NISCC tests failed today with a core file, which is different from other days (usually the failure is caused by bug 382012). Core analysis:

t@null (l@1) program terminated by signal SEGV (no mapping at the fault address)
Current function is CERT_GetNextGeneralName
  281   next = current->l.next;
=> CERT_GetNextGeneralName(current = (nil)), line 281 in "genname.c"
   CERT_CompareNameSpace(cert = 0x80b9b80, namesList = (nil), certsList = 0x80bd568, reqArena = 0x80bcda8, pBadCert = 0x8045fd4), line 1636 in "genname.c"
   cert_VerifyCertChain(handle = 0x80944b0, cert = 0x80b5848, checkSig = 1, sigerror = (nil), certUsage = certUsageSSLServer, t = 1188298765567398LL, wincx = 0x807f030, log = (nil), revoked = (nil)), line 878 in "certvfy.c"
   CERT_VerifyCertChain(handle = 0x80944b0, cert = 0x80b5848, checkSig = 1, certUsage = certUsageSSLServer, t = 1188298765567398LL, wincx = 0x807f030, log = (nil)), line 940 in "certvfy.c"
   CERT_VerifyCert(handle = 0x80944b0, cert = 0x80b5848, checkSig = 1, certUsage = certUsageSSLServer, t = 1188298765567398LL, wincx = 0x807f030, log = (nil)), line 1540 in "certvfy.c"
   CERT_VerifyCertNow(handle = 0x80944b0, cert = 0x80b5848, checkSig = 1, certUsage = certUsageSSLServer, wincx = 0x807f030), line 1591 in "certvfy.c"
   SSL_AuthCertificate(arg = 0x80944b0, fd = 0x807fc80, checkSig = 1, isServer = 0), line 255 in "sslauth.c"
   ssl3_HandleCertificate(ss = 0x809aa20, b = 0x80a2f20 "^N", length = 0), line 7120 in "ssl3con.c"
   ssl3_HandleHandshakeMessage(ss = 0x809aa20, b = 0x80a1eae "", length = 4210U), line 7782 in "ssl3con.c"
   ssl3_HandleHandshake(ss = 0x809aa20, origBuf = 0x809ac7c), line 7898 in "ssl3con.c"
   ssl3_HandleRecord(ss = 0x809aa20, cText = 0x8046318, databuf = 0x809ac7c), line 8161 in "ssl3con.c"
   ssl3_GatherCompleteHandshake(ss = 0x809aa20, flags = 0), line 206 in "ssl3gthr.c"
   ssl_GatherRecord1stHandshake(ss = 0x809aa20), line 1258 in "sslcon.c"
   ssl_Do1stHandshake(ss = 0x809aa20), line 149 in "sslsecur.c"
   ssl_SecureSend(ss = 0x809aa20, buf = 0x8046458 "GET /stop HTTP/1.0\n\n", len = 20, flags = 0), line 1100 in "sslsecur.c"
   ssl_Send(fd = 0x807fc80, buf = 0x8046458, len = 20, flags = 0, timeout = 4294967295U), line 1421 in "sslsock.c"
   PR_Send(fd = 0x807fc80, buf = 0x8046458, amount = 20, flags = 0, timeout = 4294967295U), line 226 in "priometh.c"
   main(argc = 13, argv = 0x8047620), line 980 in "tstclnt.c"

I checked recent changes in the NSS code and found a change in sslcon.c (bug 392846).
Summary: Tstclnt failed in NISCC testing. → Tstclnt crashed in NISCC testing.
SSL is irrelevant. This is a crash in cert chain verification code.

Slavo, do you have this core file? Can you tell me what cert file(s) were used in the case that failed? Does the problem happen on every NISCC now? Or did it only happen once? I want to know if we can reproduce it, and if so, how (short of running the entire NISCC test to do it).

Julien, the immediate cause of this crash is that cert_VerifyCertChain called CERT_CompareNameSpace with a NULL list of names to compare with the namespace constraints.

The first question is: how did the variable namesList come to be NULL in function cert_VerifyCertChain? The only answer seems to be that CERT_GetCertificateNames returned NULL, indicating failure, and the function didn't notice. That's a definite bug. It looks like this will always cause a crash if the first call to CERT_GetCertificateNames returns NULL. This is an OLD bug that we've apparently just never run into before.

The second question is: why did CERT_GetCertificateNames return NULL in this case, when it had apparently never done so before? What changed? Did the process simply run out of memory? Did the code change somewhere, causing a failure to decode any of the cert's possible names, in a way that it had never before failed? The only change I can see that looks even _remotely_ relevant is the fix for bug 390710 in genname.c, checked in 3 weeks ago. I'm pretty sure that's not the cause. If it was, then again the question is: why didn't this happen before now? (or is this crash not recent?)

I will write a patch for the bug that fails to notice a NULL return from CERT_GetCertificateNames and that will fix the immediate cause of the crash. But the bigger question of what changed will remain.

Even though this crash occurred on the trunk, the bug is present in 3.11.x, so I have set the target fix milestone to 3.11.8.
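The failure mode described above (a lookup returns NULL, the caller does not check, and a list-walking helper dereferences the NULL pointer) can be illustrated with a minimal, self-contained sketch. This is not the NSS code and not the actual patch: NameNode, next_name, get_names, check_names and CheckStatus are hypothetical names invented for illustration, and the real types and functions in genname.c and certvfy.c differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a node in a circular list of names. */
typedef struct NameNode {
    const char *value;
    struct NameNode *next;   /* circular: the last node points back to the first */
} NameNode;

typedef enum { CHECK_OK = 0, CHECK_FAILED = -1 } CheckStatus;

/* Callee-side guard: return NULL instead of dereferencing a NULL node. */
static NameNode *next_name(NameNode *current)
{
    if (current == NULL) {
        return NULL;
    }
    return current->next;
}

/* Hypothetical lookup that can fail; here it simply passes through its
 * argument so that a NULL input models a failed lookup. */
static NameNode *get_names(NameNode *names_or_null)
{
    return names_or_null;
}

/* Caller-side check: treat a NULL list as a failure before walking it. */
static CheckStatus check_names(NameNode *source, size_t *out_count)
{
    assert(out_count != NULL);
    *out_count = 0;

    NameNode *names = get_names(source);
    if (names == NULL) {
        return CHECK_FAILED;   /* report failure instead of crashing */
    }

    /* Walk the circular list exactly once. */
    NameNode *cur = names;
    do {
        (*out_count)++;
        cur = next_name(cur);
    } while (cur != NULL && cur != names);

    return CHECK_OK;
}
```

In this sketch, calling check_names with a NULL source returns CHECK_FAILED and leaves the count at zero, while a two-node circular list is walked once and counted as two. Guarding in both the caller and the helper is a design choice made here for illustration only.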
Assignee: nobody → nelson
Priority: -- → P1
Target Milestone: --- → 3.11.8
Until we find the cert chain that triggered this bug, I have no way to test it. Julien, please review.
Attachment #278702 - Flags: review?(julien.pierre.boogz)
I checked older logs and found that this bug also occurred in build 20070825.1. The next 2 days were OK, and then it occurred again yesterday in build 20070828.1. I don't have any more information about the cert chain. The core file and additional logs are stored in the /niscc/archive/securitytip/20070828.1 directory on mahatma.
Comment on attachment 278702: patch v1, untested

It doesn't make any sense why our code didn't trip on this earlier, and why it doesn't trip on it every time now if there was a regression. This brings the methodology of our NISCC testing into question. Slavo, please look into the possible reasons. Perhaps some certs are not being tested all the time. Anyway, this patch looks good, so r+.
Attachment #278702 - Flags: review?(julien.pierre.boogz) → review+
On trunk:
Checking in certdb/genname.c; new revision: 1.34; previous revision: 1.33
Checking in certhigh/certvfy.c; new revision: 1.54; previous revision: 1.53

On branch:
certdb/genname.c; new revision: 18.104.22.168; previous revision: 1.29
certhigh/certvfy.c; new revision: 22.214.171.124; previous revision: 126.96.36.199
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED