Closed
Bug 283690
Opened 20 years ago
Closed 20 years ago
NSS improperly handles PKCS11 sessions for SSL derived keys.
Categories
(NSS :: Libraries, defect, P2)
Tracking
(Not tracked)
RESOLVED
FIXED
3.10
People
(Reporter: rrelyea, Assigned: rrelyea)
References
Details
Attachments
(1 file, 2 obsolete files)
|
1.98 KB,
patch
|
nelson
:
review+
|
Details | Diff | Splinter Review |
There may be a duplicate of this bug. It would show up as 'nss passes invalid object handles to C_DestroyObject()'. But I couldn't find the orginal bug. ----------------------------------------------- When verifying that we are really are creating keys on their own session, I went looking at our SSL code. In SSL, we call a function to 'derive' keys for us. This function returns a symkey, which basically has the session that the derived keys all live on, and a data structure that has the handles for all the keys that were derived. We then call PK11_SymKeyFromHandle() to create separate symkeys for each of these keys. PK11_SymKeyFromHandle() gets a new symKey structure (which automatically returns a structure with a session), closes the session, and addes the session from the parent key, marking the symKey as not being the owner of the session. After it creates all these SymKeys, the original SymKey is freed! (which in theory will free the session). -------------- Why I think it works "most" of the time. To make getting keys more efficient, NSS keeps a freelist of SymKeys. This saves memory allocations, lock allocations, and session creations for the new keys. What I believe is happening is when we free the parent SymKey, the SymKey is placed on the free list, which the session that was created for it. Later, when the key is reused, the session is problably reused. The only exception is cases like PK11_SymKeyFromHandle, where the session is closed because the new key is going to share a session with some other key.
| Assignee | ||
Comment 1•20 years ago
|
||
I believe this problem is not only the source of bugs, but also a major performance issue. The NSS key library is trying not only to keep the keys on the freelist, but also to limit the number of new sessions that need to be created to handle SSL operations. Grabing new keys from the free list, then closing their sessions is inefficient. To fix this part of the problem, I suggest we keep 2 types of key freelists: those which we preserve the session, and those which we know won't own their sessions. The former is already kept on a per slot basis. The latter can be allocated out of a general pool. When we want a symkey which we are going to supply our own session for, we should grab it from the general free list. This reduce the number of new session calls we need to make, and allow us to recycle existing sessions better. This doesn't really "FIX" the real problem that this bug describes, however (it will *mask* the problem for longer). I'm wondering if we don't need a special call for SSL to create a stack of keys from the derive, and then a call to free that entire stack when the socket is finished. There are two other possible fixes I see for the 'real' problem: 1) hang on to the symkey returned by us from PK11_DeriveSymKey() along with all the child keys. 2) keep a reference to the parent key in the child keys and free that reference when we free the child. I think the latter is cleanest. bob
| Assignee | ||
Comment 2•20 years ago
|
||
BTW if we recycle contexts like we recycle keys, we can come pretty close to elliminating memory allocations in pk11wrap (and probably softoken) when running SSL. bob
| Assignee | ||
Comment 3•20 years ago
|
||
OK, I'm about to attach a patch to fix the base problem. In doing so I discovered that we are using some trickery to keep the session around, by having one of the keys 'inherit' the session. The problem comes is, because of references or other things, this key is freed before the other keys. Most likely the SSL session is over at this point, and the other keys won't be used, but it will trigger the delete of the non-existant keys. This patch will guarrentee the session isn't squashed before until the last key is deleted. I haven't tested the patch, I've just verified that it compiles. This patch doesn't fix the performance issue I've identified. bob
| Assignee | ||
Comment 4•20 years ago
|
||
| Assignee | ||
Comment 5•20 years ago
|
||
oops, the other attachement wasn't a patch;(.
Attachment #175587 -
Attachment is obsolete: true
Comment 6•20 years ago
|
||
Saul, please consider testing the above patch to ensure that it is correct and to see if it improves performance.
| Assignee | ||
Comment 7•20 years ago
|
||
The patch shouldn't improve performance, but it should improve the hardware issue. To improve performance we need to implement the separate freelists I suggested (which are not part of this patch). bob
Updated•20 years ago
|
Assignee: wtchang → rrelyea
Comment 8•20 years ago
|
||
nominating for NSS 3.10. Current measurements show 25% of TOTAL server CPU time being spent in softoken bookkeeping code, that is, in softoken, but not in freebl. #1 source of lock contention is function pk11_ObjectFromHandleOnSlot(). Looks like majority of session keys are all on the same session. Investigation continues.
Priority: -- → P2
Target Milestone: --- → 3.10
Version: 3.9.7 → 3.9.5
| Assignee | ||
Comment 9•20 years ago
|
||
Actually object handles are not on a per session basis, but a per slot basis (the same object handle referenced from different sessions point to the same object). Two things that could help out this contention: 1) move to reader/writer locks. I don't know how much this will help because we still have a fair number of writes (key creations and deletions), but there are a *LOT* more reads.... it's especially noteworthy that it's contending on the lookup. 2) There are a set of functions call 'tokenKeyLock()' and 'tokenKeyUnlock()' which are using the same locks. Since we know we are condending on these locks, we should create another slot level lock for these functions. 3) Look at increasing the bucket size for the hash tables. bob
Comment 10•20 years ago
|
||
Comment on attachment 175588 [details] [diff] [review] Don't let the owner of a key's session go until the key is done. Bob, this patch increases the size of struct PK11SymKeyStr which is public, IINM. Doesn't this potentially break backward binary compatibility? If you think it's OK for us to lengthen this struct, what do you think about lengthening struct CERTCertificateRequestStr by adding this member? SECItem ** extensions; That change would benefit the fix for bug 263779.
Comment 11•20 years ago
|
||
This patch is intended to help us find and eliminate causes of invalid handle errors.
Comment 12•20 years ago
|
||
I threw in an assertion at every place where softoken returns CKR_XxXXXXX_HANDLE_INVALID and ran ssl.sh. Assertion failed in the first SSL3 stress test. Stack is [5] PR_Assert(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff16cb7d [6] NSC_DestroyObject(hSession = 791U, hObject = 388U), line 201 in "pkcs11c.c" [7] PK11_FreeSymKey(symKey = 0x57a710), line 159 in "pk11skey.c" [8] PK11_DestroyContext(context = 0x5990b0, freeit = 1), line 97 in "pk11cxt.c" [9] ssl3_CleanupKeyMaterial(mat = 0x5b75f0), line 1261 in "ssl3con.c" [10] ssl3_DestroyCipherSpec(spec = 0x5b7550), line 1286 in "ssl3con.c" [11] ssl3_DestroySSL3Info(ssl3 = 0x5b7420), line 8995 in "ssl3con.c" [12] ssl_DestroySocketContents(ss = 0xfffffd7ffec04c50), line 377 in "sslsock.c" [13] ssl_FreeSocket(ss = 0x5581a0), line 439 in "sslsock.c" [14] ssl_DefClose(ss = 0x5581a0), line 247 in "ssldef.c" [15] ssl_SecureClose(ss = 0x5581a0), line 904 in "sslsecur.c" [16] ssl_Close(fd = 0x432fb0), line 1204 in "sslsock.c" [17] PR_Close(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff168d95 [18] handle_connection(tcp_sock = 0x432fb0, model_sock = 0x4329b0, requestCert = 0), line 1078 in "selfserv.c" [19] jobLoop(a = (nil), b = (nil), c = 0), line 507 in "selfserv.c" [20] thread_wrapper(arg = 0x540780), line 476 in "selfserv.c" PK11_DestroyContext was being called to destroy a MAC context, and the session handle was invalid. The MAC key corresponding to that context had just been destroyed. I did the test a second time and got a nearly identical crash. The only difference was in frames 8 and 9. ssl3_CleanupKeyMaterial was trying to destroy the server "write" (encryption) key. Now I will test with Bob's patch above.
Comment 13•20 years ago
|
||
With Bob's patch (above) patch applied, the crashes no longer occur in ssl.sh. I have run ssl.sh several time through without failure.
Comment 14•20 years ago
|
||
Comment on attachment 175588 [details] [diff] [review] Don't let the owner of a key's session go until the key is done. Bob, this patch appears to be correct. There is still additional optimization to be achieved, but in the interest of correctness, I'd like you to check this in, unless you have objections to your own patch :)
Attachment #175588 -
Flags: review+
Comment 15•20 years ago
|
||
Comment on attachment 176111 [details] [diff] [review] assert on invalid session/object handle Wan-Teh, Please review this patch. I believe that passing any PKCS11 function a handle to a session or object that has already been destroyed is a programming error. As I understand your criteria for asserts, you believe we should assert on programming errors. So, do you agree that the assertions in this patch are appropriate?
Attachment #176111 -
Flags: review?(wtchang)
| Assignee | ||
Comment 16•20 years ago
|
||
RE: binary compatibility. (comment 10) PK11SymKey is opaque publically (the data structure is defined in secmodti.h, which is private). The public definition is in secmodt.h which is simply: typedef struct PK11SymKeyStr PK11SymKey; As far as lengthening data public data structures, I think we should apply the following criteria: 1) avoid it if there is a reasonable alternative. 2) never do it for a data structure the application would reasonably imbed it one of it's own data structures. Under those criteria some data structures obviously full in case '2': SECItem. Some are "relatively safe": PK11SlotInfo, SECMODModule Some are "iffy": CERTCertificate I think we have lengthened some of these data structures in the past. The CERTCertRequest looks like it fits between the two. It's API implies applications are going to try to build one up itself (it has a constructor and a destructor. I'd say it's probably at least as safe to extend as a CERTCertificate (which I think we have in the past). bob bob
Comment 17•20 years ago
|
||
Comment on attachment 176111 [details] [diff] [review] assert on invalid session/object handle The assertions you added apply not only to NSS internal programming errors, but also errors in other PKCS#11 applications. The PKCS#11 API is public, and the caller may very well not be a piece of NSS code. For example, the softoken can be called directly from the JDK or other applications. Thus, IMO, we should not unconditionally assert if they pass bogus session or object handles to us. Perhaps we should conditionally assert, maybe based on the state of an environment variable.
Attachment #176111 -
Flags: superreview-
Comment 18•20 years ago
|
||
I checked in Bob's Patch above for him, while he is on vacation. I'm marking this bug fixed. There may be pressure to back-port this to NSS 3.9.x for some products.
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Comment 19•20 years ago
|
||
Comment on attachment 176111 [details] [diff] [review] assert on invalid session/object handle Withdrawing patch
Attachment #176111 -
Attachment is obsolete: true
Attachment #176111 -
Flags: review?(wtchang)
Comment 20•20 years ago
|
||
Changed bug summary to show these are PKCS11 sessions, not SSL sessions, that are being mishandled.
Summary: NSS improperly handles sessions for SSL derived keys. → NSS improperly handles PKCS11 sessions for SSL derived keys.
| Assignee | ||
Comment 21•19 years ago
|
||
*** Bug 216552 has been marked as a duplicate of this bug. ***
You need to log in
before you can comment on or make changes to this bug.
Description
•