Closed Bug 257604 Opened 21 years ago Closed 20 years ago

NSS deletes PKCS#11 session objects after closing the session

Categories

(NSS :: Libraries, defect, P1)

Sun
SunOS
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: david.jacobson, Assigned: julien.pierre)

References

Details

User-Agent: Mozilla/5.0 (X11; U; SunOS sun4u; en-US; rv:1.4) Gecko/20040414
Build Identifier: $Header: NSS 3.3.5webserver Aug 11 2003 13:05:45 $

Here is a sample set of calls into the PKCS#11 library, captured with the Solaris "truss" utility and run through "grep -n C_". (Thus the first column is the line number in the original truss output. Next is the process ID, then the LWP ID and thread ID (both the same). Calls are indicated by -> and returns by <-.)

Note that in line 15533 thread 30 calls C_CloseSession, asking it to close session 0x3000016, and the call returns in line 16056. Then in line 16064 thread 25 calls C_DestroyObject, passing session handle 0x3000016. But, of course, session 0x3000016 became invalid because of the call to C_CloseSession, which has already returned. The library detects that handle 0x3000016 is invalid and returns error 179 in line 16069. (Error 179 is CKR_SESSION_HANDLE_INVALID.)

So in this case the PKCS#11 library caught the problem and reported an error. However, it is possible that another thread creates a new session between the two actions and is assigned the handle that was just made invalid. Then the operation names a valid session that was not intended.
15533:11726/30@30: -> libvpkcs11:C_CloseSession(0x3000016, 0x1, 0xeb740, 0xf8c6f760)
15560:11726/25@25: <- libvpkcs11:C_DestroyObject() = 0
15563:11726/25@25: -> libvpkcs11:C_CloseSession(0x4200000c, 0x19, 0x0, 0x0)
15681:11726/36@36: -> libvpkcs11:C_SignFinal(0xc280000c, 0x0, 0xe42efba4, 0xfda80cd0)
15699:11726/31@31: <- libvpkcs11:C_SignInit() = 0
15701:11726/31@31: -> libvpkcs11:C_SignUpdate(0x73000011, 0xf8c3b4e8, 0xd, 0x0)
15721:11726/32@32: <- libvpkcs11:C_SignUpdate() = 0
15722:11726/32@32: -> libvpkcs11:C_SignFinal(0x42800004, 0x126b969, 0xe4beb484, 0x0)
15801:11726/34@34: <- libvpkcs11:C_SignFinal() = 0
15802:11726/34@34: -> libvpkcs11:C_SignFinal(0xf280001f, 0x0, 0xe48eb414, 0xfda80cd0)
15813:11726/34@34: <- libvpkcs11:C_SignFinal() = 145
15814:11726/34@34: -> libvpkcs11:C_SignInit(0xf280001f, 0xe48eb47c, 0xb280000f, 0x108)
15830:11726/29@29: <- libvpkcs11:C_DigestUpdate() = 0
15831:11726/29@29: -> libvpkcs11:C_DigestUpdate(0xd2000005, 0xf8d3fae0, 0x4, 0x0)
15858:11726/34@34: <- libvpkcs11:C_SignInit() = 0
15860:11726/34@34: -> libvpkcs11:C_SignUpdate(0xf280001f, 0xe48eb4e8, 0xd, 0x0)
15871:11726/26@26: <- libvpkcs11:C_SignUpdate() = 0
15872:11726/26@26: -> libvpkcs11:C_SignUpdate(0xe280001e, 0x1216868, 0x3f, 0x0)
15971:11726/36@36: <- libvpkcs11:C_SignFinal() = 145
15973:11726/36@36: -> libvpkcs11:C_SignInit(0xc280000c, 0xe42efc0c, 0x22800009, 0x108)
16013:11726/36@36: <- libvpkcs11:C_SignInit() = 0
16014:11726/36@36: -> libvpkcs11:C_SignUpdate(0xc280000c, 0xe42efc78, 0xd, 0x0)
16056:11726/30@30: <- libvpkcs11:C_CloseSession() = 0
16063:11726/25@25: <- libvpkcs11:C_CloseSession() = 0
16064:11726/25@25: -> libvpkcs11:C_DestroyObject(0x3000016, 0x82800005, 0xfdaa96c4, 0xfda7c7f8)
16069:11726/25@25: <- libvpkcs11:C_DestroyObject() = 179

Reproducible: Sometimes

Steps to Reproduce:
Around May of 2004 I could reproduce this problem easily. That's when the trace above was collected.
With NSS running on a server that was serving about 600 HTTPS pages per second, this error would happen every 10 to 20 seconds. Today I tried again, and could not reproduce it. Here are some things that have changed in the intervening time.

1. The old machine is no longer available. We are now running on a different machine with a slightly faster processor.
2. The library has been modified in ways that affect timing. Also, our handles have a bit field that is incremented every time the same storage is reused as a logically new session. At the time it was failing that field was 4 bits wide. Now it is 8 bits wide. But that should only make it more likely that the system detects this problem.
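To illustrate why a wider reuse field makes detection more likely, here is a minimal sketch of a session table whose handles carry a generation counter. It is not the Sun library's implementation: the handle layout (`SLOT_BITS`), the class `SessionTable`, and the helper `reuses_until_alias` are all hypothetical names chosen for illustration. The sketch counts how many times a storage slot must be reused before a stale handle looks valid again.

```python
class SessionTable:
    """Toy session table: handle = (generation << SLOT_BITS) | slot."""

    SLOT_BITS = 8  # hypothetical split; the real handle layout is not given

    def __init__(self, gen_bits):
        self.gen_mask = (1 << gen_bits) - 1
        self.generation = {}    # slot -> current generation counter
        self.open_slots = set()

    def _slot(self, handle):
        return handle & ((1 << self.SLOT_BITS) - 1)

    def open_session(self, slot):
        # Reusing the storage bumps the generation, modulo the field width.
        gen = (self.generation.get(slot, -1) + 1) & self.gen_mask
        self.generation[slot] = gen
        self.open_slots.add(slot)
        return (gen << self.SLOT_BITS) | slot

    def close_session(self, handle):
        self.open_slots.discard(self._slot(handle))

    def is_valid(self, handle):
        slot = self._slot(handle)
        gen = handle >> self.SLOT_BITS
        return slot in self.open_slots and self.generation.get(slot) == gen


def reuses_until_alias(gen_bits, slot=3):
    """Count reopen cycles before a stale handle is accepted as valid."""
    table = SessionTable(gen_bits)
    stale = table.open_session(slot)
    table.close_session(stale)
    reuses = 0
    while True:
        fresh = table.open_session(slot)
        reuses += 1
        if table.is_valid(stale):
            return reuses
        table.close_session(fresh)
```

Under these assumptions, a 4-bit field lets a stale handle alias a live session after 16 reuses of the same slot, and an 8-bit field after 256. With no generation field at all, the very next reuse aliases.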
Status: UNCONFIRMED → NEW
Ever confirmed: true
Thanks for the bug report. Julien, could you work with David to look into this bug?
Assignee: wchang0222 → julien.pierre.bugs
Which version of NSS are you running in each case? A quick look at the tip shows we call C_DestroyObject in only 4 places:
1) when importing a public key.
2) when PK11_DestroyObject is called.
3) when PK11_DestroyTokenObject is called.
4) when freeing the key.
cases 1 & 2 use the global slot session (which in the server case should never
David, were you running on NSS 3.3.x when you reproduced the problem in May, and on NSS 3.9.x now that you are unable to?
Rescued from the mailbox that catches our bounces:

Date: Mon, 25 Oct 2004 09:16:08 -0700
From: david jacobson <David.Jacobson@Sun.COM>
To: bugzilla-daemon@mozilla.org
Subject: Re: [Bug 257604] NSS makes calls into PKCS#11 (external libraries) that occasionally use invalid handles

I'm 99% sure that it was the same version of NSS. What was different is that I was using a new implementation of the user library for the Sun CryptoAccelerator 4000. It is based on a new implementation of tables (object table, session table, etc.) that significantly reduces the number of synchronization calls and malloc-free calls, and that changed the timing quite a bit.

-- David Jacobson
We need to fix this bug, as it negatively impacts the use of several PKCS#11 modules produced by Sun with NSS. Bumping priority, and targeting to 3.9.5.
Priority: -- → P1
Target Milestone: --- → 3.9.5
It could be a dupe, but not necessarily, so I'm keeping both open.
Mass reassign of 3.9.5 target bugs to 3.9.6.
Target Milestone: 3.9.5 → 3.9.6
I think it is a dupe. David, please test with NSS 3.10 RTM, and reopen if the problem happens again.

*** This bug has been marked as a duplicate of 196353 ***
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → DUPLICATE
Target Milestone: 3.9.6 → 3.10
This bug is about using session handles after the session has been closed. Bug 196353 is about the infamous double-delete problem, wherein a session object was explicitly deleted after the session that created it was closed, implicitly deleting the object. Although they both involve use of invalid handles, one is about invalid session handles and the other about invalid object handles.

I'm not comfortable closing this, unless we show at least one of the following:
1. Bug 196353 would have caused this bug (reason about the trace found in this bug report)
2. The fix for 196353 was general enough to have fixed this problem, too. (reason about the code)
3. The bug can not be reproduced.

So I'm reopening it. Feel free to close it again if you can defend the case that the bug is now gone.
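One way to reason mechanically about a trace like the one in this report is to scan it for calls that pass a session handle after C_CloseSession has already returned for that handle. The sketch below assumes the "grep -n C_" output format shown in the report, and assumes the first argument of every traced call is a session handle (true for the calls in this trace, not for every PKCS#11 function). The function name `find_stale_session_use` is made up for illustration, and the sketch does not model a handle being legitimately reissued by a later C_OpenSession.

```python
import re

# "<line>:<pid>/<lwp>@<tid>: -> <lib>:<func>(<first arg>, ..."
CALL = re.compile(r"^(\d+):\d+/\d+@(\d+): -> \w+:(C_\w+)\((0x[0-9a-f]+)")
# "<line>:<pid>/<lwp>@<tid>: <- <lib>:<func>() = <return code>"
RET = re.compile(r"^(\d+):\d+/\d+@(\d+): <- \w+:(C_\w+)\(\) = (\d+)")


def find_stale_session_use(lines):
    """Return (line number, function, handle) for calls on closed sessions."""
    pending_close = {}  # thread id -> handle passed to C_CloseSession
    closed = set()      # handles whose C_CloseSession has returned 0
    findings = []
    for line in lines:
        m = CALL.match(line)
        if m:
            lineno, tid, func, first_arg = m.groups()
            if func == "C_CloseSession":
                pending_close[tid] = first_arg
            elif first_arg in closed:
                findings.append((int(lineno), func, first_arg))
            continue
        m = RET.match(line)
        if m:
            lineno, tid, func, rc = m.groups()
            if func == "C_CloseSession" and rc == "0" and tid in pending_close:
                closed.add(pending_close.pop(tid))
    return findings


# The four lines of the reported trace that matter for session 0x3000016.
TRACE = [
    "15533:11726/30@30: -> libvpkcs11:C_CloseSession(0x3000016, 0x1, 0xeb740, 0xf8c6f760)",
    "16056:11726/30@30: <- libvpkcs11:C_CloseSession() = 0",
    "16064:11726/25@25: -> libvpkcs11:C_DestroyObject(0x3000016, 0x82800005, 0xfdaa96c4, 0xfda7c7f8)",
    "16069:11726/25@25: <- libvpkcs11:C_DestroyObject() = 179",
]

print(find_stale_session_use(TRACE))
```

On these four lines the check flags the C_DestroyObject call at trace line 16064, which matches the sequence described in the original report.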
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
This case is about deleting a session object after its session was already deleted. We definitely found and fixed a case of that. Please retest with 3.10 and see whether the problem is still reproducible. 3.10 is available in /s/b/c
David, both of these two bugs (257604, 196353) include traces that show a session being closed and then an object on that session being destroyed. The object destruction references the session on which it was created, because the session number is stored along with the object handle. From an analysis of the erroneous sequence of events these appear to be the same problem. What do you see that makes these two sequences of events appear to be different problems? The fix employed for bug 283690 is to ensure that sessions are not deleted until after all their objects have been deleted. That appears to have eliminated all sequences like the ones shown above.
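The ordering rule described above (a session is not closed until all of its objects have been destroyed) can be sketched as a deferred close. This is only an illustration of the idea, not the actual NSS change for bug 283690: `FakeModule`, `SessionRef`, and their methods are hypothetical names, and the sketch is single-threaded, whereas a real implementation would need locking around the bookkeeping.

```python
class FakeModule:
    """Stands in for a PKCS#11 library; records the order of calls."""

    def __init__(self):
        self.calls = []

    def destroy_object(self, session, obj):
        self.calls.append(("C_DestroyObject", session, obj))

    def close_session(self, session):
        self.calls.append(("C_CloseSession", session))


class SessionRef:
    """Defers closing a session until its last session object is destroyed."""

    def __init__(self, module, handle):
        self.module = module
        self.handle = handle
        self.live_objects = set()
        self.close_requested = False

    def add_object(self, obj):
        self.live_objects.add(obj)

    def destroy_object(self, obj):
        self.live_objects.remove(obj)
        self.module.destroy_object(self.handle, obj)
        self._maybe_close()

    def request_close(self):
        # Only note the request; the real close may have to wait.
        self.close_requested = True
        self._maybe_close()

    def _maybe_close(self):
        if self.close_requested and not self.live_objects:
            self.module.close_session(self.handle)
            self.close_requested = False


module = FakeModule()
session = SessionRef(module, 0x3000016)
session.add_object(0x82800005)
session.request_close()               # deferred: one object is still alive
session.destroy_object(0x82800005)    # destroys the object, then closes
```

With this ordering the module sees C_DestroyObject before C_CloseSession, even though the close was requested first, so the destroy call never names a session that is already gone.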
Summary: NSS makes calls into PKCS#11 (external libraries) that occasionally use invalid handles → NSS deletes PKCS#11 session objects after closing the session
Based on your comment "The fix employed for bug 283690 is to ensure that sessions are not deleted until after all their objects have been deleted. That appears to have eliminated all sequences like the ones shown above.", I'm willing to close it. I think we should say that the fix to Bug 283690 also fixed this problem, not that it is a duplicate of Bug 196353. (I'm new to Bugzilla. I can't seem to resolve it to closed, so I'm choosing "worksforme". Bottom line is that I agree that we can consider it fixed.)
Status: REOPENED → RESOLVED
Closed: 20 years ago
Resolution: --- → WORKSFORME
Two bugs are duplicates if they report the same wrong behavior, even if the reporters are concerned with different aspects of that wrong behavior. But since you agree that we can consider it fixed, I will change this bug to resolved fixed.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Status: REOPENED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED