Closed Bug 311440 Opened 19 years ago Closed 19 years ago

selfserv with ECC enabled crashes in free()

Categories

(NSS :: Libraries, defect, P1)

3.11
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: wtc, Assigned: nelson)

Details

(Keywords: crash, regression)

Attachments

(3 files)

I built the NSS tip (NSS 3.11 pre-release) with NSS_ENABLE_ECC=1. When running all.sh, selfserv crashes in free() because it frees memory that was not allocated from the heap (this stack trace was produced with gdb on Red Hat Enterprise Linux 4): #0 0x0079d7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2 #1 0x001387d5 in raise () from /lib/tls/libc.so.6 #2 0x0013a149 in abort () from /lib/tls/libc.so.6 #3 0x0016c27a in __libc_message () from /lib/tls/libc.so.6 #4 0x00172abf in _int_free () from /lib/tls/libc.so.6 #5 0x00172e3a in free () from /lib/tls/libc.so.6 #6 0x0032ee61 in PR_Free () from /home/wtc/nss-ecc/mozilla/dist/Linux2.6_x86_glibc_PTH_OPT.OBJ/lib/libnspr4.so #7 0x0067f2d5 in PORT_Free () from /home/wtc/nss-ecc/mozilla/dist/Linux2.6_x86_glibc_PTH_OPT.OBJ/lib/libnss3.so #8 0x00cda221 in ssl3_HandleECDHClientKeyExchange () from /home/wtc/nss-ecc/mozilla/dist/Linux2.6_x86_glibc_PTH_OPT.OBJ/lib/libssl3.so #9 0x00ccb389 in ssl3_HandleHandshakeMessage () from /home/wtc/nss-ecc/mozilla/dist/Linux2.6_x86_glibc_PTH_OPT.OBJ/lib/libssl3.so #10 0x00ccc64c in ssl3_HandleRecord () from /home/wtc/nss-ecc/mozilla/dist/Linux2.6_x86_glibc_PTH_OPT.OBJ/lib/libssl3.so #11 0x00ccd2f4 in ssl3_GatherCompleteHandshake () from /home/wtc/nss-ecc/mozilla/dist/Linux2.6_x86_glibc_PTH_OPT.OBJ/lib/libssl3.so #12 0x00cce891 in ssl_GatherRecord1stHandshake () from /home/wtc/nss-ecc/mozilla/dist/Linux2.6_x86_glibc_PTH_OPT.OBJ/lib/libssl3.so #13 0x00cd2f8c in ssl_Do1stHandshake () from /home/wtc/nss-ecc/mozilla/dist/Linux2.6_x86_glibc_PTH_OPT.OBJ/lib/libssl3.so #14 0x00cd42da in ssl_SecureRecv () from /home/wtc/nss-ecc/mozilla/dist/Linux2.6_x86_glibc_PTH_OPT.OBJ/lib/libssl3.so #15 0x00cd4555 in ssl_SecureRead () from /home/wtc/nss-ecc/mozilla/dist/Linux2.6_x86_glibc_PTH_OPT.OBJ/lib/libssl3.so #16 0x00cd7b5a in ssl_Read () from /home/wtc/nss-ecc/mozilla/dist/Linux2.6_x86_glibc_PTH_OPT.OBJ/lib/libssl3.so #17 0x003283e9 in PR_Read () from /home/wtc/nss-ecc/mozilla/dist/Linux2.6_x86_glibc_PTH_OPT.OBJ/lib/libnspr4.so #18 0x0804cbb4 in jobLoop () #19 0x0804c5e8 in thread_wrapper () #20 0x00340221 in _pt_root () from /home/wtc/nss-ecc/mozilla/dist/Linux2.6_x86_glibc_PTH_OPT.OBJ/lib/libnspr4.so #21 0x009ea341 in start_thread () from /lib/tls/libpthread.so.0 #22 0x001d7fee in clone () from /lib/tls/libc.so.6 If I remove the PORT_Free call in ssl3_HandleECDHClientKeyExchange, the crash is gone, but the ECC TLS tests still fail.
The output.log file is too big to be attached to the bug report, here is the relevant excerpt.
I examined results.html closer and found that the ECC SSL Cipher Coverage and Client Authentication tests passed. It was only the ECC SSL Stress tests that failed. So I've updated the summary of this bug accordingly.
Summary: ECC TLS doesn't work → ECC TLS stress tests failed
I believe this is the right fix. Nelson, please review. On Windows with the debug MSVC 7.1 runtime library, the free() call generates this debug assertion failure: Microsoft Visual C++ Debug Library Debug Assertion Failed! Program: C:\nss-ecc\mozilla\dist\win954.0_dbg.objd\bin\selfserv.exe File: dbgheap.c Line: 1132 Expression: _CrtIsValidHeapPointer(pUserData) For information on how your program can cause an assertion failure, see the Visual C++ documentation on asserts. (Press Retry to debug the application) Abort Retry Ignore
Attachment #198759 - Flags: review?(nelson)
Douglas sent me email, which reminded me that the ECC TLS stress tests are a known bug (bug 238051). So the only regression is that selfserv crashes in free(). I updated the bug's summary again.
Keywords: crash
Priority: -- → P1
Summary: ECC TLS stress tests failed → selfserv with ECC enabled crashes in free()
Target Milestone: --- → 3.11
(In reply to comment #5) > Douglas sent me email, which reminded me that the ECC TLS > stress tests are a known bug (bug 238051). So the only > regression is that selfserv crashes in free(). I updated > the bug's summary again. So do we know if a build with both patches applied (i.e. patches for bug Ids 311440 and 238051) passes the stress tests. I'm bringing this up because, if I remember correctly, the ec versions of the files in nss/tests/ssl already account for the lack of session reuse, e.g. if you look inside ecsslstress.txt, you'll notice that -N is set for two of the ECC stress tests which disables session reuse. You might have discovered a bug that I recall seeing in the past but only on some platforms and not very consistently. vipul
Vipul: you are right. I misunderstood what bug 238051 is. I found that the ECC SSL stress test failures exist in the first NSS release (NSS 3.9) whose all.sh can do ECC tests. So this is a pre-existing problem, and is different from bug 238051. I will open a new bug about it. I applied the patch from bug 238051 to the NSS tip. As expected, the ECC SSL stress tests still fail.
(In reply to comment #7) > Vipul: you are right. I misunderstood what bug 238051 is. > > I found that the ECC SSL stress test failures exist in the > first NSS release (NSS 3.9) whose all.sh can do ECC tests. > So this is a pre-existing problem, and is different from > bug 238051. I will open a new bug about it. > > I applied the patch from bug 238051 to the NSS tip. As > expected, the ECC SSL stress tests still fail. So does this mean we don't yet have a patch (or combination of patches) that would cause the stress test to pass? Here's my current understanding of where we stand. Please correct me if I'm wrong. - the current patch for 311440 fixes the selfserv crash - ECC session reuse is disabled but the stress tests still fail If so, do we know what's causing the failure message to be printed. Could it be that the test script is expecting the incorrect number of cache hits (because session reuse is disabled)? I know I had to make this change but it is possible that the change never made it back to the NSS tip. vipul
Vipul: Your understanding is correct: - the current patch for 311440 fixes the selfserv crash - ECC session reuse is disabled but the stress tests still fail All the error messages came from strsclnt: strsclnt: PR_Write returned error -12268: Cannot connect: SSL is disabled. selfserv did not emit any messages and was killed by our test script, apparently because it was still waiting for connections from strsclnt. The SSL error code -12268 is SSL_ERROR_SSL_DISABLED, "Cannot connect: SSL is disabled", which means: The local socket is configured in such a way that it cannot use any of the SSL cipher suites. Possible causes include: (a) both SSL2 and SSL3 are disabled, (b) All the individual SSL cipher suites are disabled, or (c) the socket is configured to handshake as a server, but the certificate associated with that socket is inappropriate for the Key Exchange Algorithm selected. Since strsclnt is an SSL client, (c) does not apply. I will try to continue looking into this next week.
Comment on attachment 198759 [details] [diff] [review] Patch to fix the crash in free() This patch appears to be correct. I found what appear to be two places in this function that appear to declare a fatal error without sending an alert to the peer. Those should be fixed. But they are not relevant to the immediate cause of this bug.
Attachment #198759 - Flags: review?(nelson) → review+
Comment on attachment 198759 [details] [diff] [review] Patch to fix the crash in free() Patch checked in on the NSS tip (NSS 3.11) I also examined all the other call sites of ssl3_ConsumeHandshakeVariable and verified that none of them free the SECItem's buffer returned by ssl3_ConsumeHandshakeVariable. I saw two places in this function marked with SEND_ALERT. I believe that's what Nelson meant.
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Let me clarify about the two places in ssl3_HandleECDHClientKeyExchange. The first "SEND_ALERT", in the error path after the call to ssl3_ConsumeHandshakeVariable, should be removed because ssl3_ConsumeHandshakeVariable sends the alert itself, so no additional alert needs to be sent in that case. The comment "/* last gasp. */" appears in the error path following a call to PK11_PubDeriveWithKDF. It appears that no alert is sent in this error path, unless PK11_PubDeriveWithKDF itself sends an alert (I didn't check). I also didn't check to see if the caller of ssl3_HandleECDHClientKeyExchange sends an alert in this case. The second "SEND_ALERT", in the error path after the call to ssl3_InitPendingCipherSpec, is the second path in which no alert is sent.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: