Closed
Bug 292049
Opened 19 years ago
Closed 19 years ago
NSS passes invalid handles to C_UnwrapKey
Categories
(NSS :: Libraries, defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
3.11
People
(Reporter: nelson, Assigned: nelson)
References
Details
Attachments
(1 file, 4 obsolete files)
22.54 KB,
patch
|
rrelyea
:
review+
|
Details | Diff | Splinter Review |
A certain PKCS11 module crashes if/when its C_UnwrapKey method is called and is passed CK_INVALID_HANDLE (that is, zero) as the session handle. NSS has been seen to do this when using a temporary ("session") RSA private key. Here is a stack trace. C_UnwrapKey (0, febdcbb4, 1dd848, 20e00c, 80, febdcbd0) + fc pk11_AnyUnwrapKey (b31c8, 1dd848, 183958, 0, febdcda0, 370) + 5a4 PK11_PubUnwrapSymKey (208058, febdcda0, 370, 10c, 0, 26930) + b0 I observe that NONE of the following functions checks for a valid session handle before returning the symkey or using the session handle: pk11_getKeyFromList pk11_CreateSymKey C_UnwrapKey I'm going to attach a patch that will a) help us figure out where the invalid session handle is coming from, and b) avoid passing a zero session handle to C_UnwrapKey
Assignee | ||
Comment 1•19 years ago
|
||
P1 for 3.10.1
Status: NEW → ASSIGNED
Priority: -- → P1
Target Milestone: --- → 3.10.1
Assignee | ||
Comment 2•19 years ago
|
||
Bob, even though I don't intend to commit this patch as is, please review it and tell me if any of my new error path code is obviously wrong. Thanks. Saul, please test this patch on the system experiencing this failure. Let's see if it either cures the crash or creates a core showing clearly the source of the invalid session handle.
Attachment #181938 -
Flags: review?(rrelyea)
Comment 3•19 years ago
|
||
Comment on attachment 181938 [details] [diff] [review] patch for debugging r+ for debugging. pk11_GetNewSession should never return CK_INVALID_HANDLE. PK11_GetRWSession can! (the checks there should probably permanent). pk11_CloseSession checks the only value itself. It's coded to always pair with a pk11_GetNewSession. Looking at the code, it looks like there is a bug in PK11_GetRWSession. If it can't get any new sessions, it closes the slot session and marks it as invalid. PK11_RestoreROSession then tries to restore it. The problem is if some keys were created with the slot session, they now suddenly become invalid. PK11_GetRWSession should just fail in those cases.
Attachment #181938 -
Flags: review?(rrelyea) → review+
Comment 4•19 years ago
|
||
A full patch should probably add some comments;).
Assignee | ||
Comment 5•19 years ago
|
||
This patch replaces both patches above. It differs from the first patch by using CK_INVALID_SESSION instead of CK_INVALID_HANDLE for comparison with session handles. It differs from Bob's patch in that it releases the slot monitor even when the rwsession passed in is CK_INVALID_SESSION. Bob, please confirm that my revision of your patch is the right thing for this function to do. Saul, please give this new patch a try.
Attachment #181938 -
Attachment is obsolete: true
Attachment #181982 -
Attachment is obsolete: true
Attachment #182002 -
Flags: review?(rrelyea)
Assignee | ||
Comment 6•19 years ago
|
||
Saul, I have a question about the occurence of this bug. Does it happen one the first handshake, or the second? Or does it happen after many handshakes have happened, (many > 1000) ?
Comment 7•19 years ago
|
||
Comment on attachment 182002 [details] [diff] [review] new debug patch r- PK11_GetRWSession releases the slot monitor itself if the session is invalid. perhaps the correct fix is to maintain your patch for RestoreROSession and for GetRWSession to always maintain the SlotMonitor. The reason is there are 2 conditions where the rw session could be CK_INVALID_SESSION without releasing the monitor... slot->session is CK_INVALID_SESSION (a case which should assert), and the PKCS #11 module returns crv as OK, but gives us an invalid session. The latter case is a bug in the PKCS #11 module, but we should fail in that case, not hold locks open forever;).
Attachment #182002 -
Flags: review?(rrelyea) → review-
Comment 8•19 years ago
|
||
BTW the rest of the patch is fine.
Assignee | ||
Comment 9•19 years ago
|
||
Bob, You assert that PK11_GetRWSession doesn't hold the monitor when it returns CK_INVALID_SESSION, but there are at least two paths through PK11_GetRWSession that can return CK_INVALID_SESSION but hold the slot monitor. They are: a) slot->defRWSession is true, but slot->session is CK_INVALID_SESSION, or b) C_OpenSession returns CKR_OK but has set rwsession to CK_INVALID_SESSION PK11_GetRWSession does not detect these, and returns while holding the monitor. That is why my code released the monitor when rwsession is CK_INVALID_SESSION. I agree that we can and should assert in either of these cases. But in the non-debug builds, what should PK11_GetRWSession do in these cases? Should PK11_GetRWSession ALWAYS release the monitor every time it returns CK_INVALID_SESSION? even when slot->defRWSession is true? If slot->session is CK_INVALID_SESSION, should PK11_GetRWSession a) clear the slot->defRWSession ? and/or b) try to get a new RW session, and if it succeeds, make that the new default RW session for that slot? Should the definition of PK11_RWSessionHasLock be changed from (PRBool)(!slot->isThreadSafe || slot->defRWSession) to (PRBool)(!slot->isThreadSafe || (slot->defRWSession && slot->session != CK_INVALID_SESSION) ?? Should PK11_RestoreROSession be changed to only exit the slot monitor if (slot->defRWSession && slot->session != CK_INVALID_SESSION) ?? I will attach another patch with my proposed answer to these questions.
Assignee | ||
Comment 10•19 years ago
|
||
This patch differs from the previous one only in pk11skey.c It ensures that the lock is not held whenever GetRWSession returns 0. It also attempts to "fix" the slotInfo (or at least return it to a self- consistent state) in the event that slot->defRWSession is true but slot->session is zero. Does this seem better, Bob?
Attachment #182002 -
Attachment is obsolete: true
Attachment #182026 -
Flags: review?(rrelyea)
Comment 11•19 years ago
|
||
It sort of went in the other direction I was thinking. I was thinking to just remove the exitMonitor in PK11_GetRWSession(). That would make all the rest a lot easier to follow (you always have the monitor out of RWSession....). The direction you went has the advantage as it is more what the callers would expect (that no locks are held if the function fails). It also has an attempt at recovering from slot->session being CK_INVALID_HANDLE in the defRW case. (defRW BTW happens when NSS finds a token which is not read only, but can only support one session). Anyway there is definate advantages for the added complexity. Here are the things I saw wrong in the patch: -- in PK11_GetRWSession(): + PORT_Assert(rwsession != CK_INVALID_SESSION || crv != CKR_OK); + if (crv != CKR_OK || rwsession == CK_INVALID_SESSION) { + if (crv == CKR_OK) + crv = CKR_DEVICE_ERROR; + if (haveMonitor) + PK11_ExitSlotMonitor(slot); The PORT_Assert should read 'crv == CKR_OK && rwSession != CKR_INVALID_SESSION'. It's valid for a PKCS #11 module to faile a C_OpenSession because it has ran out of sessions. +PK11_RWSessionHasLock(PK11SlotInfo *slot,CK_SESSION_HANDLE session_handle) +{ + PRBool hasLock; + hasLock = (PRBool)(!slot->isThreadSafe || + (slot->defRWSession && slot->session != CK_INVALID_SESSION)); + return hasLock; +} hasLock should also be false if session_handle is CK_INVALID_SESSION. in pk11_RestoreRWSession: + PORT_Assert(rwsession != CK_INVALID_SESSION); + if (rwsession != CK_INVALID_SESSION) { + PRBool doExit = PK11_RWSessionHasLock(slot, rwsession); + if (!pk11_RWSessionIsDefault(slot, rwsession)) + PK11_GETTAB(slot)->C_CloseSession(rwsession); + if (doExit) + PK11_ExitSlotMonitor(slot); Remove the Assert (it's valid for PK11_GetRWSession to fail). your direction is fine, r+ for either going to your previous patch and removing the PK11_ExitMonitor call, or taking your current patch with the above changes. bob
Assignee | ||
Comment 12•19 years ago
|
||
(In reply to comment #11) > Here are the things I saw wrong in the patch: > -- in PK11_GetRWSession(): > + PORT_Assert(rwsession != CK_INVALID_SESSION || crv != CKR_OK); Are you sure that's wrong? > The PORT_Assert should read > 'crv == CKR_OK && rwSession != CKR_INVALID_SESSION'. > It's valid for a PKCS #11 module to faile a C_OpenSession because it has ran > out of sessions. Yes, and it must return a crv other than CKR_OK in that case. My assertion allows that case to proceed. It says that the code may proceed if either (a) we got a valid non-zero session number, or (b) there was a non-zero error code set. It disallows the case of getting a 0 session ID together with CKR_OK. It appears to me that the assertion you suggested above would cause an assertion failure if either: - a 0 session id was returned, or - any non-zero error code was returned. > +PK11_RWSessionHasLock(PK11SlotInfo *slot,CK_SESSION_HANDLE session_handle) > +{ > + PRBool hasLock; > + hasLock = (PRBool)(!slot->isThreadSafe || > + (slot->defRWSession && slot->session != CK_INVALID_SESSION)); > + return hasLock; > +} > hasLock should also be false if session_handle is CK_INVALID_SESSION. Agreed. > in pk11_RestoreRWSession: > + PORT_Assert(rwsession != CK_INVALID_SESSION); > Remove the Assert (it's valid for PK11_GetRWSession to fail). Yes, but I claim that when PK11_GetRWSession fails, returning zero, it is not valid for the caller to thereafter call pk11_RestoreROSession. By analogy, when open() returns -1, one does not call close(-1). Do you agree with that? I think this also means that, if we want to ensure that NSS never calls a PKCS11 C_* function with a zere sessionID, then we must inspect EVERY caller of PK11_GetRWSession to ensure that it tests the returned session ID and does NOT proceed to call the PKCS11 C_* function if the returned sessionID was zero.
Comment 13•19 years ago
|
||
Your right, your assert is correct. It might be easer to read if it read (! (ckr == CKR_OK && rwsesson == CKR_INVALID_SESSION)) ) As for the other assert, that would be fine if we go through and change all the callers of GetRWSession to abort on a bad session. Since I didn't see any patches to that effect, I assumed the code was not changing. This kind of change will have to be implemented carefully with the KeyPairGen function, which currently knows way to much about the sideeffects of the subfunctions it calls.
Assignee | ||
Comment 14•19 years ago
|
||
Comment on attachment 182026 [details] [diff] [review] ensure that lock is not held if returned session is 0 I have a newer patch, much more extensive, that detects possible zero session handles in many more places and avoids sending them to the PKCS11 module. I will attach that patch here soon.
Attachment #182026 -
Attachment is obsolete: true
Attachment #182026 -
Flags: review?(rrelyea)
Assignee | ||
Comment 15•19 years ago
|
||
Unlike the previous patches, which were designated as only for debugging purposes, this patch is intended for real checkin. Bob, please review. This patch detects numerous previous-undetected situations that would have caused zero session handles to be sent to a PKCS11 module, and prevents those invalid PKCS11 function calls from being made. With this patch in place, we see no more crashes with zero session handles being passed in to the PKCS11 module. This patch isn't as clean as it might be. There are some stylistic inconsistencies, and some error number inconsistencies, swo I won't be crushed if this fails review for those issues :). We could use some new error codes for these situations. SEC_ERROR_BAD_DATA and SEC_ERROR_LIBRARY_FAILURE aren't adequately descriptive, IMO.
Attachment #183351 -
Flags: review?(rrelyea)
Comment 16•19 years ago
|
||
Comment on attachment 183351 [details] [diff] [review] patch to eliminate zero session handles r+ = relyea, don't let any religious arguments over the comments below keep you from checking in the pack as is if you want. the following comments are about code readability/style, not function. /* XXX Perhaps this should be: ** if (symKey->sessionOwner) */ change to /* pk11_CloseSession handles the case where sessionOwner == PR_FALSE correctly */ + if (session == CK_INVALID_SESSION) { + if (!isPerm) + pk11_ExitKeyMonitor(symKey); + crv = CKR_SESSION_HANDLE_INVALID; } else { - pk11_ExitKeyMonitor(symKey); + crv = PK11_GETTAB(slot)->C_DeriveKey(session, &mechanism, + baseKey->objectID, keyTemplate, templateCount, &symKey->objectID); + if (isPerm) { + PK11_RestoreROSession(slot, session); + } else { + pk11_ExitKeyMonitor(symKey); + } } + PORT_Assert(rwsession != CK_INVALID_SESSION || crv != CKR_OK); The assert is more readable if we don't explicitly distribute the implied '!' using boolean algrebra.... that is We're asserting that the function didn't return CKR_OK and and invalid session at the same time, so it's more readable to say: PORT_Assert(!(crv == CKR_OK && rwsession == CKR_INVALID_SESSION)); The assert as written is functionally the same as the above assert, but requires a lot more head scratching to believe it (or requires writting the above assert and doing the Boolean algebra;). bob
Attachment #183351 -
Flags: review?(rrelyea) → review+
Comment 17•19 years ago
|
||
ARGU..... something got lost in the cut and paste, the second issue was... if (session == CK_INVALID_SESSION) { + if (!isPerm) + pk11_ExitKeyMonitor(symKey); + crv = CKR_SESSION_HANDLE_INVALID; } else { - pk11_ExitKeyMonitor(symKey); + crv = PK11_GETTAB(slot)->C_DeriveKey(session, &mechanism, + baseKey->objectID, keyTemplate, templateCount, &symKey->objectID); + if (isPerm) { + PK11_RestoreROSession(slot, session); + } else { + pk11_ExitKeyMonitor(symKey); + } } Nesting is getting pretty deep here, good place for a 'goto done'.
Assignee | ||
Updated•19 years ago
|
QA Contact: bishakhabanerjee → jason.m.reid
Assignee | ||
Comment 18•19 years ago
|
||
Checking in pk11akey.c; new revision: 1.4; previous revision: 1.3 Checking in pk11auth.c; new revision: 1.3; previous revision: 1.2 Checking in pk11obj.c; new revision: 1.7; previous revision: 1.6 Checking in pk11skey.c; new revision: 1.102; previous revision: 1.101 Checking in pk11slot.c; new revision: 1.86; previous revision: 1.85 Checking in pk11util.c; new revision: 1.51; previous revision: 1.50 This was (is) a P1 bug, but I'm not going to try to fix it for 3.10.x now.
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Target Milestone: 3.10.1 → 3.11
You need to log in
before you can comment on or make changes to this bug.
Description
•