Closed
Bug 196353
Opened 22 years ago
Closed 20 years ago
NSS deletes PKCS#11 session objects after session was closed
Categories
(NSS :: Libraries, defect, P1)
Tracking
(Not tracked)
VERIFIED
FIXED
3.10
People
(Reporter: bugz, Assigned: julien.pierre)
Details
(Reporting for David Jacobson)
The webserver, which accesses cryptographic services via NSS, sometimes makes a
call to C_DestroyObject after the designated object has already been implicitly
destroyed by C_CloseSession. (The object's CKA_TOKEN attribute is false, i.e.
it is a session object. PKCS#11 requires that session objects be automatically
deleted when the session that created them is closed. Thus the C_CloseSession
is the first deletion, and the explicit C_DestroyObject is a second.)
With luck, the C_DestroyObject fails with CKR_OBJECT_HANDLE_INVALID. But
sometimes in the intervening time some other thread has done one of the
operations that generates a key, and that key gets assigned the same handle as
the deleted key had. Then the erroneous C_DestroyObject destroys the other
thread's object.
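The failure mode described above can be replayed in a toy model. This is only a sketch, not the token's or NSS's code: `ToyToken` and its methods are hypothetical names, the real PKCS#11 calls also take a session handle that is ignored here, and the policy of handing out the lowest free index as the next handle is an assumption made for illustration. The interleaving is replayed sequentially rather than with real threads.

```python
CKR_OK = 0x00
CKR_OBJECT_HANDLE_INVALID = 0x82  # decimal 130, the PKCS#11 return code


class ToyToken:
    """Hypothetical token whose object handles are reusable table indexes."""

    def __init__(self):
        self.objects = {}        # handle -> (owning session, label)
        self.next_session = 1

    def open_session(self):
        session = self.next_session
        self.next_session += 1
        return session

    def create_object(self, session, label):
        # Assumption: the lowest free index is handed out again immediately.
        handle = 1
        while handle in self.objects:
            handle += 1
        self.objects[handle] = (session, label)
        return handle

    def close_session(self, session):
        # Session objects are destroyed implicitly when their session closes.
        doomed = [h for h, (s, _) in self.objects.items() if s == session]
        for handle in doomed:
            del self.objects[handle]

    def destroy_object(self, handle):
        if handle not in self.objects:
            return CKR_OBJECT_HANDLE_INVALID
        del self.objects[handle]
        return CKR_OK


token = ToyToken()

# Lucky case: nobody reuses the handle, so the stale destroy just fails.
session_a = token.open_session()
key_a = token.create_object(session_a, "thread A key")
token.close_session(session_a)              # first, implicit deletion
lucky_rv = token.destroy_object(key_a)      # second deletion is rejected

# Unlucky case: another thread creates a key between close and destroy.
session_b = token.open_session()
key_a2 = token.create_object(session_b, "thread A key")
token.close_session(session_b)
session_c = token.open_session()
key_b = token.create_object(session_c, "thread B key")  # same handle again
unlucky_rv = token.destroy_object(key_a2)   # stale destroy hits thread B's key
victim_survived = key_b in token.objects
```

In the lucky case the token can reject the second deletion. In the unlucky case it has no way to tell the stale handle from the live one, so the call succeeds and thread B's key is gone.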
We have truss output that shows the bug. But first, we use an empty
function, debug_trace, to show things we need to see that are not normally
visible via truss. The first parameter is a tag, and the second is a
value. The tags we use are:
1 session handle created by C_OpenSession
2 object handle found by C_FindObjects
3 object handle created by C_CreateObject
4 object handle created by C_CopyObject
5 object handle created by C_GenerateKey
6 object handle for public key created by C_GenerateKeyPair
7 object handle for private key created by C_GenerateKeyPair
8 object handle created by C_UnwrapKey
9 object handle returned directly by C_DeriveKey (often 0)
a object handle for client mac secret generated by C_DeriveKey
b object handle for server mac secret generated by C_DeriveKey
c object handle for client key generated by C_DeriveKey
d object handle for server key generated by C_DeriveKey
e object handle deleted by C_CloseSession
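For reading the trace, the tag list above can be turned into a small lookup. This is a sketch only: `TAGS`, `TRACE_RE`, and `decode_trace` are hypothetical helper names, and the regular expression assumes the `debug_trace(tag, value, ...)` layout seen in the truss lines of this report.

```python
import re

# Tag meanings copied from the list above.
TAGS = {
    0x1: "session handle created by C_OpenSession",
    0x2: "object handle found by C_FindObjects",
    0x3: "object handle created by C_CreateObject",
    0x4: "object handle created by C_CopyObject",
    0x5: "object handle created by C_GenerateKey",
    0x6: "object handle for public key created by C_GenerateKeyPair",
    0x7: "object handle for private key created by C_GenerateKeyPair",
    0x8: "object handle created by C_UnwrapKey",
    0x9: "object handle returned directly by C_DeriveKey (often 0)",
    0xA: "object handle for client mac secret generated by C_DeriveKey",
    0xB: "object handle for server mac secret generated by C_DeriveKey",
    0xC: "object handle for client key generated by C_DeriveKey",
    0xD: "object handle for server key generated by C_DeriveKey",
    0xE: "object handle deleted by C_CloseSession",
}

# Captures the first two hexadecimal arguments of a debug_trace call.
TRACE_RE = re.compile(r"debug_trace\((0x[0-9a-fA-F]+),\s*(0x[0-9a-fA-F]+)")


def decode_trace(line):
    """Return (tag description, value) for a debug_trace line, else None."""
    match = TRACE_RE.search(line)
    if match is None:
        return None
    tag = int(match.group(1), 16)
    value = int(match.group(2), 16)
    return TAGS.get(tag, "unknown tag"), value


sample = "6755/19@19: -> libvpkcs11:debug_trace(0xb, 0x56, 0x53, 0x0)"
decoded = decode_trace(sample)
```

Only the first two arguments are decoded, since the report defines only the tag and the value.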
Here are selected lines from the truss output, with line numbers of the
original truss output down the left side. The lines are too long and have wrapped.
3029 6755/19@19: -> libvpkcs11:C_OpenSession(0x1, 0x4, 0x56b7e0, 0xfe99aaa0)
3034 6755/19@19: -> libvpkcs11:debug_trace(0x1, 0x8b, 0x4, 0x56b7e0)
3036 6755/19@19: <- libvpkcs11:C_OpenSession() = 0
3037 6755/19@19: -> libvpkcs11:C_DeriveKey(0x8b, 0xfb04f734, 0x54,
0xfb04f740)
5700 6755/19@19: -> libvpkcs11:debug_trace(0xa, 0x55, 0x53, 0x0)
5710 6755/19@19: -> libvpkcs11:debug_trace(0xb, 0x56, 0x53, 0x0)
5714 6755/19@19: -> libvpkcs11:debug_trace(0xc, 0x5a, 0x53, 0x0)
5717 6755/19@19: -> libvpkcs11:debug_trace(0xd, 0x5b, 0x53, 0x0)
6037 6755/19@19: -> libvpkcs11:debug_trace(0x9, 0x0, 0x53, 0x1260388)
6043 6755/19@19: <- libvpkcs11:C_DeriveKey() = 0
62452 6755/36@36: -> libvpkcs11:C_CloseSession(0x8b, 0x56b7e0, 0x2e,
0xf8eff740)
62463 6755/36@36: -> libvpkcs11:debug_trace(0xe, 0x56, 0xff38f6ac,
0x73f00)
62470 6755/36@36: -> libvpkcs11:debug_trace(0xe, 0x5b, 0xff38f6ac,
0x73e58)
62916 6755/20@20: -> libvpkcs11:C_DestroyObject(0x8b, 0x56, 0x0, 0x0)
62921 6755/20@20: <- libvpkcs11:C_DestroyObject() = 130
Note that the C_DeriveKey launched in line 3037 creates a server mac
secret with handle 0x56, traced in line 5710.
In line 62452 C_CloseSession is called, which destroys this object, as
traced in line 62463.
Then in line 62916 there is a call to C_DestroyObject listing this
object. But this is an invalid object handle! The return code of 130
is CKR_OBJECT_HANDLE_INVALID.
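The reasoning above can be checked mechanically. The sketch below is an illustration only: the event list is transcribed by hand from the excerpt (truss line number, thread, action, session, handle), and `find_stale_destroys` is a hypothetical helper, not part of any tool mentioned in this report.

```python
CKR_OBJECT_HANDLE_INVALID = 0x82  # the value 130 returned at line 62921

# (truss line, thread, action, session handle, object handle)
EVENTS = [
    (5700, 19, "create", 0x8B, 0x55),    # client mac secret
    (5710, 19, "create", 0x8B, 0x56),    # server mac secret
    (5714, 19, "create", 0x8B, 0x5A),    # client key
    (5717, 19, "create", 0x8B, 0x5B),    # server key
    (62463, 36, "closed", 0x8B, 0x56),   # deleted by C_CloseSession
    (62470, 36, "closed", 0x8B, 0x5B),   # deleted by C_CloseSession
    (62916, 20, "destroy", 0x8B, 0x56),  # explicit C_DestroyObject
]


def find_stale_destroys(events):
    """Return (destroy line, close line, handle) for each destroy of a
    handle that a session close had already deleted."""
    closed_at = {}  # handle -> truss line where C_CloseSession deleted it
    stale = []
    for line, _thread, action, _session, handle in events:
        if action == "create":
            closed_at.pop(handle, None)  # the handle is live again
        elif action == "closed":
            closed_at[handle] = line
        elif action == "destroy" and handle in closed_at:
            stale.append((line, closed_at[handle], handle))
    return stale


stale = find_stale_destroys(EVENTS)
```

Note that a checker like this only catches the lucky case. If another thread had created an object with handle 0x56 between lines 62463 and 62916, the handle would count as live again and the destroy would look legitimate.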
Reporter
Comment 1•22 years ago
Here was my reply:
Unfortunately, your scheme for generating object handles is going to cause all
kinds of problems for NSS. I agree that NSS should be a better PKCS#11 citizen,
and recognize when a session is being closed while NSS still holds object
handles from it. But the complexity of including that kind of logic in NSS today
would be daunting. The SSL code does not synchronize objects with sessions
(well, it does to some degree, but as you can see, there are cases where it is
possible for one thread to close the session before another thread destroys the
object).
The story I have heard in the past is that tokens should make an effort to
ensure that object handles are not reused in a short timeframe. That is, once a
specific object handle is released, it should not appear again for a long time
(depending on how you define "long" time). NSS uses a monotonically increasing
counter, so object handles are not reused until 2**28 objects have been created
(it's 28 because IIRC the high-order 4 bits are used for some kind of "header"
information). This avoids the kind of problem you are seeing.
In your case, what is the size of your table? If it is indexed by a short, you
would still have 16 bits left over to play with in the object handle. You could
use a counter in those 16 bits, and simply mask the handle before indexing the
table. Is that a possibility for you?
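The suggestion above can be sketched as follows. Everything here is an assumption made for illustration: the 16/16 bit split, the `SlotTable` class, and the rule that a lookup fails when the counter in the handle does not match the slot's current counter. It is not the token's implementation.

```python
CKR_OK = 0x00
CKR_OBJECT_HANDLE_INVALID = 0x82

INDEX_BITS = 16
INDEX_MASK = (1 << INDEX_BITS) - 1   # low 16 bits: table index
COUNTER_MAX = 0xFFFF                 # high 16 bits: per-slot reuse counter


class SlotTable:
    """Hypothetical object table whose handles carry a reuse counter."""

    def __init__(self, size):
        self.slots = [None] * size   # stored object, or None when free
        self.counters = [0] * size   # bumped every time a slot is reused

    def create(self, obj):
        index = self.slots.index(None)  # raises ValueError if table is full
        # Cycle through 1..0xFFFF so that a handle is never 0.
        self.counters[index] = self.counters[index] % COUNTER_MAX + 1
        self.slots[index] = obj
        return (self.counters[index] << INDEX_BITS) | index

    def _lookup(self, handle):
        index = handle & INDEX_MASK     # mask the handle before indexing
        counter = handle >> INDEX_BITS
        if index >= len(self.slots) or self.slots[index] is None:
            return None
        if self.counters[index] != counter:
            return None                 # stale handle from an earlier object
        return index

    def destroy(self, handle):
        index = self._lookup(handle)
        if index is None:
            return CKR_OBJECT_HANDLE_INVALID
        self.slots[index] = None
        return CKR_OK


table = SlotTable(4)
old = table.create("thread A key")
first_rv = table.destroy(old)
new = table.create("thread B key")   # same index, different counter
stale_rv = table.destroy(old)        # rejected instead of hitting the new key
```

With this layout a slot can be reused immediately, yet a stale handle is only accepted again after the same slot has been recycled 65535 times.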
Reporter
Comment 2•22 years ago
Bob, is my interpretation correct? What could we do about changing NSS? It's
my suspicion it would be a fairly involved change...
Reporter
Comment 3•22 years ago
In case it's not clear, the token in question is not the softoken. This token
generates object handles from an index, and the bug results from the same index
being used for a recently destroyed object and a new object.
Comment 4•22 years ago
I think your evaluation is correct. NSS should attempt to keep track of the
session the object was created with. We should avoid closing the session before
we delete the object (I thought that was already the case in most cases with
SymKeys, but perhaps with session sharing this doesn't always happen as expected).
As far as cycling through handles, it's pretty important that old handles not be
reused quickly. Often it's not possible to know that the handle has been changed
on the fly. This is particularly true of session handles (more so than object
handles).
Anyway we should see what is causing the sessions to be destroyed before the
C_DestroyObject call in NSS. We should at least try to keep that one safe.
bob
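One way to picture the session tracking described in this comment is a wrapper that defers the close until every object created in the session has been destroyed. This is a hypothetical sketch, not NSS code and not the fix that was eventually applied: `TrackedSession` and its methods are invented names, and `close_fn` and `destroy_fn` merely stand in for C_CloseSession and C_DestroyObject.

```python
import threading


class TrackedSession:
    """Hypothetical wrapper: the session closes only once its objects are gone."""

    def __init__(self, handle, close_fn):
        self.handle = handle
        self._close_fn = close_fn        # stands in for C_CloseSession
        self._live_objects = set()
        self._close_requested = False
        self._lock = threading.Lock()
        self.closed = False

    def add_object(self, obj_handle):
        with self._lock:
            self._live_objects.add(obj_handle)

    def destroy_object(self, obj_handle, destroy_fn):
        with self._lock:
            self._live_objects.discard(obj_handle)
            destroy_fn(self.handle, obj_handle)  # stands in for C_DestroyObject
            self._maybe_close()

    def request_close(self):
        with self._lock:
            self._close_requested = True
            self._maybe_close()

    def _maybe_close(self):
        # Caller holds the lock. Close only when nothing is left to destroy.
        if self._close_requested and not self._live_objects and not self.closed:
            self.closed = True
            self._close_fn(self.handle)


calls = []
session = TrackedSession(0x8B, lambda s: calls.append(("close", s)))
session.add_object(0x56)
session.request_close()                  # deferred: 0x56 is still alive
deferred = session.closed
session.destroy_object(0x56, lambda s, o: calls.append(("destroy", s, o)))
```

With this ordering the explicit destroy always reaches the token before the close, so the token never sees a handle that the close already invalidated.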
Reporter
Comment 5•22 years ago
I've tried to reproduce this with the tip, 3.3.5 beta, and 3.3.2, but have been
unable to do so. David, can you give me more information on your test setup?
Also, is it possible for you to send me a complete log (like the excerpt on this
bug) showing the failure?
Assignee
Comment 7•21 years ago
Bumping up priority and taking bug.
Assignee: bugz → julien.pierre.bugs
Priority: P2 → P1
Hardware: PC → All
Target Milestone: --- → Future
Assignee
Updated•21 years ago
Target Milestone: Future → 3.9.5
Assignee
Comment 8•20 years ago
Mass reassign of 3.9.5 target bugs to 3.9.6.
Target Milestone: 3.9.5 → 3.9.6
Assignee
Updated•20 years ago
Target Milestone: 3.9.6 → 3.10
Comment 9•20 years ago
Ah, this is it. Making this bug depend on bug 283690, which describes a possible
mechanism which triggers this problem.
Depends on: 283690
Comment 10•20 years ago
Since bug 283690 was fixed, I haven't seen any recurrences of this problem
with SSL testing or with all.sh.
I have been running with a softoken that had assertions in it in every
place that returned CKR_OBJECT_HANDLE_INVALID or CKR_SESSION_HANDLE_INVALID
and haven't hit any of them. So, I propose that we mark this bug fixed.
Any objections?
Assignee
Comment 11•20 years ago
I think we should rerun our test case with the version of Solaris softoken that
could reproduce this problem (pre-FCS Solaris 10 build) before we close this bug.
Assignee
Comment 12•20 years ago
*** Bug 257604 has been marked as a duplicate of this bug. ***
Comment 13•20 years ago
(In reply to comment #11)
> I think we should rerun our test case with the version of Solaris softoken that
> could reproduce this problem (pre-FCS Solaris 10 build) before we close this
> bug.
OK, who can do that?
I don't see enough info in this bug to know how to reproduce it.
QA Contact: bishakhabanerjee → jason.m.reid
Assignee
Comment 14•20 years ago
I spent a little bit of time last week trying with old bits of Solaris softoken
and NSS 3.9.5 (before this fix), but didn't get it to crash. It used to crash
within minutes, so I must have done something wrong. I will try again later
today. I can only verify the 3.10 fix if the problem is reproduced with 3.9
first. Perhaps we should mark this bug fixed, and mark it VERIFIED later when
the test is completed.
Assignee
Comment 15•20 years ago
I was able to reproduce the bad handle problem with 3.9.5 about 5 times in a
half-hour period. But with 3.10, it did not occur in a run that lasted for a
whole two-hour meeting. Marking FIXED.
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Assignee
Updated•20 years ago
Status: RESOLVED → VERIFIED
Comment 16•20 years ago
Making the description of this bug match the observed behavior
Summary: NSS double-deletes PKCS#11 objects → NSS deletes PKCS#11 session objects after session was closed