Closed
Bug 638821
Opened 13 years ago
Closed 13 years ago
Possible lock order inconsistency (potential deadlock) in SSL machinery
Categories
(NSS :: Libraries, defect, P2)
Tracking
(Not tracked)
RESOLVED
FIXED
3.12.10
People
(Reporter: jseward, Assigned: wtc)
Details
Attachments
(1 file)
1.04 KB,
patch
|
wtc
:
review+
|
Details | Diff | Splinter Review |
using Helgrinding setup described in bug 551155 comment 13 and later (ignore preceding junk) when loading http://blog.chromium.org/2011/01/html-video-codec-support-in-chrome.html It appears there's a lock acquisition order inconsistency, leading to report below. ------------------- From code inspection: security/nss/lib/ssl/sslcon.c:3128: 3128 ssl_GetXmitBufLock(ss); /***************************************/ 3129 ssl_GetSSL3HandshakeLock(ss); acquires XmitBufLock then SSL3HandshakeLock vs security/nss/lib/ssl/ssl3con.c:5700 5700 PORT_Assert( ss->opt.noLocks || ssl_HaveSSL3HandshakeLock(ss) ); 5701 5702 if (ws != wait_hello_done && 5703 ws != wait_server_cert && 5704 ws != wait_server_key && 5705 ws != wait_cert_request) { 5706 SSL3_SendAlert(ss, alert_fatal, unexpected_message); 5707 PORT_SetError(SSL_ERROR_RX_UNEXPECTED_HELLO_DONE); 5708 return SECFailure; 5709 } 5710 5711 ssl_GetXmitBufLock(ss); /*******************************/ acquires SSL3HandshakeLock(implied) then XmitBufLock hence potential deadlock ------------------- Report: Thread #21: lock order "0x25AB2D18 before 0x25AAFD48" violated at 0x4C2BC62: pthread_mutex_lock (/home/sewardj/VgTRUNK/trunk/helgrind/hg_intercepts.c:496) by 0x7A3E4F5: PR_Lock (nsprpub/pr/src/pthreads/ptsynch.c:208) by 0x7A3E60D: PR_EnterMonitor (nsprpub/pr/src/pthreads/ptsynch.c:591) by 0xBA36CB3: ssl3_HandleServerHelloDone (security/nss/lib/ssl/ssl3con.c:5711) by 0xBA3A9C6: ssl3_HandleHandshakeMessage (security/nss/lib/ssl/ssl3con.c:8632) by 0xBA3B6F6: ssl3_HandleRecord (security/nss/lib/ssl/ssl3con.c:8727) by 0xBA3BC86: ssl3_GatherCompleteHandshake (security/nss/lib/ssl/ssl3gthr.c:209) by 0xBA3DC96: ssl_GatherRecord1stHandshake (security/nss/lib/ssl/sslcon.c:1258) by 0xBA42A0A: ssl_Do1stHandshake (security/nss/lib/ssl/sslsecur.c:151) by 0xBA4369D: ssl_SecureSend (security/nss/lib/ssl/sslsecur.c:1213) by 0xBA468D4: ssl_Write (security/nss/lib/ssl/sslsock.c:1652) by 0x600FE95: nsSSLThread::Run() (security/manager/ssl/src/nsSSLThread.cpp:1045) Required order was established by acquisition of lock at 0x25AB2D18 at 0x4C2BC62: pthread_mutex_lock (/home/sewardj/VgTRUNK/trunk/helgrind/hg_intercepts.c:496) by 0x7A3E4F5: PR_Lock (nsprpub/pr/src/pthreads/ptsynch.c:208) by 0x7A3E60D: PR_EnterMonitor (nsprpub/pr/src/pthreads/ptsynch.c:591) by 0xBA3D9FC: ssl2_BeginClientHandshake (security/nss/lib/ssl/sslcon.c:3128) by 0xBA42A0A: ssl_Do1stHandshake (security/nss/lib/ssl/sslsecur.c:151) by 0xBA4369D: ssl_SecureSend (security/nss/lib/ssl/sslsecur.c:1213) by 0xBA468D4: ssl_Write (security/nss/lib/ssl/sslsock.c:1652) by 0x600FE95: nsSSLThread::Run() (security/manager/ssl/src/nsSSLThread.cpp:1045) by 0x600F7D8: nsPSMBackgroundThread::nsThreadRunner(void*) (security/manager/ssl/src/nsPSMBackgroundThread.cpp:44) by 0x7A43A1C: _pt_root (nsprpub/pr/src/pthreads/ptthread.c:189) by 0x4C2CBE7: mythread_wrapper (/home/sewardj/VgTRUNK/trunk/helgrind/hg_intercepts.c:221) by 0x4E369C9: start_thread (/build/buildd/eglibc-2.11.1/nptl/pthread_create.c:300) followed by a later acquisition of lock at 0x25AAFD48 at 0x4C2BC62: pthread_mutex_lock (/home/sewardj/VgTRUNK/trunk/helgrind/hg_intercepts.c:496) by 0x7A3E4F5: PR_Lock (nsprpub/pr/src/pthreads/ptsynch.c:208) by 0x7A3E60D: PR_EnterMonitor (nsprpub/pr/src/pthreads/ptsynch.c:591) by 0xBA3DA0E: ssl2_BeginClientHandshake (security/nss/lib/ssl/sslcon.c:3129) by 0xBA42A0A: ssl_Do1stHandshake (security/nss/lib/ssl/sslsecur.c:151) by 0xBA4369D: ssl_SecureSend (security/nss/lib/ssl/sslsecur.c:1213) by 0xBA468D4: ssl_Write (security/nss/lib/ssl/sslsock.c:1652) by 0x600FE95: nsSSLThread::Run() (security/manager/ssl/src/nsSSLThread.cpp:1045) by 0x600F7D8: nsPSMBackgroundThread::nsThreadRunner(void*) (security/manager/ssl/src/nsPSMBackgroundThread.cpp:44) by 0x7A43A1C: _pt_root (nsprpub/pr/src/pthreads/ptthread.c:189) by 0x4C2CBE7: mythread_wrapper (/home/sewardj/VgTRUNK/trunk/helgrind/hg_intercepts.c:221) by 0x4E369C9: start_thread (/build/buildd/eglibc-2.11.1/nptl/pthread_create.c:300)
Updated•13 years ago
|
Component: Networking: HTTP → Security
QA Contact: networking.http → toolkit
Comment 1•13 years ago
|
||
could this be a dup of 588698 ?
Reporter | ||
Comment 2•13 years ago
|
||
(In reply to comment #1) > could this be a dup of bug 588698 ? It's clearly very closely related, but I think it's different. Bug 588698 comment 22 contains a patch which, amongst things, documents the rank order of locks. The relevant bit is Rank (order) of locks recvLock -> firstHandshake -> recvbuf -> ssl3Handshake -> xmitbuf -> "spec" sendLock -> In this bug (comment 0), we have sslcon3.c acquiring locks in the order ssl3Handshake -> xmitBuf which is in accordance with the documented order, but sslcon.c does xmitBufLock -> ssl3Handshake so it seems (at least superficially) that sslcon.c is wrong.
Updated•13 years ago
|
Assignee: nobody → nobody
Component: Security → Libraries
Product: Core → NSS
QA Contact: toolkit → libraries
Version: Trunk → trunk
Assignee | ||
Comment 3•13 years ago
|
||
Julian: thank you for the bug report and analysis. The bug in sslcon.c has been fixed on the NSS trunk (NSS 3.13) in bug 588698 comment 22. Before: http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/security/nss/lib/ssl/sslcon.c&rev=1.40&mark=3128,3132#3128 After: http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/security/nss/lib/ssl/sslcon.c&rev=1.41&mark=3129,3131#3128 For some reason (perhaps to minimize risk by only fixing the bug that the customer reported), I didn't backport this fix to the NSS_3_12_BRANCH for NSS 3.12.8. Can you find out if you have SSL 2.0 enabled? The code in sslcon.c should only be used if SSL 2.0 is enabled.
Assignee: nobody → wtc
Status: NEW → ASSIGNED
Priority: -- → P2
Target Milestone: --- → 3.12.10
Reporter | ||
Comment 4•13 years ago
|
||
(In reply to comment #3) > Can you find out if you have SSL 2.0 enabled? The code in sslcon.c > should only be used if SSL 2.0 is enabled. How do I do that? I know about finding threading errors and stuff like that, but nothing about Fx's SSL code.
Comment 5•13 years ago
|
||
What's the value of security.enable_ssl2 ?
Reporter | ||
Comment 6•13 years ago
|
||
PreferenceName=security.enable_ssl2 Status=default Type=boolean Value=false and I'm pretty sure this is the same profile as used when the inconsistency was detected.
Reporter | ||
Comment 7•13 years ago
|
||
(In reply to comment #3) > The bug in sslcon.c has been fixed on the NSS trunk (NSS 3.13) > in bug 588698 comment 22. FTR, the inconsistency in summary is: sslcon.c: 3128 ssl_GetXmitBufLock(ss); /***************************************/ 3129 ssl_GetSSL3HandshakeLock(ss); vs ssl3con.c: 9053: ssl_GetSSL3HandshakeLock(ss); 5711: ssl_GetXmitBufLock(ss); /*******************************/ > Can you find out if you have SSL 2.0 enabled? The code in sslcon.c > should only be used if SSL 2.0 is enabled. SSL 2.0 is disabled, but sslcon.c is obviously being used. I have the following pref settings: security.enable_ssl2 false security.enable_ssl3 true security.enable_tls true
Reporter | ||
Comment 8•13 years ago
|
||
I found two more instances of the same thing: sslcon.c: 3128 ssl_GetXmitBufLock(ss); /***************************************/ 3129 ssl_GetSSL3HandshakeLock(ss); vs ssl3con.c 9053 ssl_GetSSL3HandshakeLock(ss); 8391 ssl_GetXmitBufLock(ss); /*************************************/ and sslcon.c: 3128 ssl_GetXmitBufLock(ss); /***************************************/ 3129 ssl_GetSSL3HandshakeLock(ss); vs sslsock.c: 466 ssl_GetSSL3HandshakeLock(ss); 467 ssl_GetXmitBufLock(ss); From this it is clear that sslcon.c:3128/9 are disagrees with at least three other lock-pair acquisition sequences. From which I'd guess the fix mentioned in comment #3 is the right fix.
Reporter | ||
Comment 9•13 years ago
|
||
A copy of sslcon.c rev 1.41, that makes the sslcon.c lock order agree with everyone else. With this in place I can't detect any lock order violations in the ssl machinery any more.
Reporter | ||
Updated•13 years ago
|
Attachment #522954 -
Flags: review?(wtc)
Assignee | ||
Comment 10•13 years ago
|
||
Comment on attachment 522954 [details] [diff] [review] patch, version 1 r=wtc. Patch checked in on the NSS_3_12_BRANCH (NSS 3.12.10). Checking in sslcon.c; /cvsroot/mozilla/security/nss/lib/ssl/sslcon.c,v <-- sslcon.c new revision: 1.40.2.1; previous revision: 1.40 done Note: this patch is already in the NSS trunk (NSS 3.13) as part of the patch (attachment 470102 [details] [diff] [review]) in bug 588698.
Attachment #522954 -
Flags: review?(wtc)
Attachment #522954 -
Flags: review+
Assignee | ||
Comment 11•13 years ago
|
||
Just FYI: I figured out why we always start the handshake in ssl2_BeginClientHandshake. That is set up by the SSL_ResetHandshake call: http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/security/nss/lib/ssl/sslsecur.c&rev=1.48&mark=220,243,247#217 In the first ssl2_BeginClientHandshake call, if SSL 2.0 is disabled, we switch to ssl_GatherRecord1stHandshake as the handshake function: http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/security/nss/lib/ssl/sslcon.c&rev=1.41&mark=3119-3120,3123,3126,3130#3119 and we use ssl_GatherRecord1stHandshake from that point on.
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•