An efficient method to generate unique PKCS#11 session handles

REOPENED
NSS
Libraries
P2
enhancement
16 years ago
12 years ago

People

(Reporter: Wan-Teh Chang, Assigned: Robert Relyea)

(Reporter)

Description

16 years ago
In NSS 3.4, we generate new PKCS#11 session handles from
a 24-bit counter.  When the counter exceeds 2^24 and wraps
around, we need to check that the session handle we generate
is not still in use.  We do that by doing a lookup in the
session hash table and retry if there is a collision.  This
can be expensive if a big block of consecutive session
handles are still in use.
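
As a rough illustration of the cost described above, here is a minimal
sketch of a lookup-and-retry generator. It is not the NSS 3.4 code: the
flat in_use[] array is a hypothetical stand-in for the session hash table,
and new_session_handle() is a made-up name. The sketch also assumes that
handle 0 is never handed out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SESSION_BITS 24u
#define SESSION_MASK ((1u << SESSION_BITS) - 1u)

/* Hypothetical stand-in for the session hash table: one flag per handle. */
static bool in_use[1u << SESSION_BITS];
static uint32_t counter;            /* 24-bit counter, wraps at 2^24 */

/*
 * Advance the counter until it lands on a handle that is not in use.
 * Returns the number of lookups performed (0 if every handle is taken)
 * and stores the new handle in *out.
 */
static uint32_t new_session_handle(uint32_t *out)
{
    assert(out != NULL);
    for (uint32_t probes = 1; probes <= SESSION_MASK + 1u; probes++) {
        counter = (counter + 1u) & SESSION_MASK;
        if (counter != 0 && !in_use[counter]) {   /* assumption: 0 is never valid */
            in_use[counter] = true;
            *out = counter;
            return probes;
        }
    }
    return 0;
}
```

If handles 2 through 1000 are still in use when the counter reaches 1, the
next call performs 1000 lookups before it returns handle 1001, which is the
"big block of consecutive handles" case.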

If performance profiling shows that the lookup-and-retry
method in NSS 3.4 is a performance bottleneck, we will need
to implement a more efficient method to generate PKCS#11
session handles.
(Reporter)

Updated

16 years ago
Priority: -- → P2
Target Milestone: --- → 3.4.1

Comment 1

16 years ago
Created attachment 70961 [details]
selfserv results with the tip (beta2)

Performance appears to be on a par with where we were before the fix for bug
125149.
The existing design of session ID numbers for the software tokens 
allocates 8 bits to slot number, and 24 bits to session number within
the slot.  But we do not anticipate having anywhere near 2^24 simultaneous
PKCS#11 sessions.  Our present tests use fewer than 2^16.  Even allowing 
for growth and scalability, we anticipate fewer than 2^18 sessions.

The reason that the number of sessions is so high (2^24) in the current
design is to try to reduce problems that arise from the reuse of old 
sessionIDs after a softoken removal/insertion.  The idea is that in the 
event that a software token is logically removed and reinserted (e.g.
due to reconfiguration of, say, a DB token), it would be bad if session
IDs issued immediately after the insertion were indistinguishable from
session IDs that were issued before the removal and are still being used
by threads that are not yet aware of the reinsertion.  

But having such a large space of sessions leads to potentially large 
amounts of storage required to keep track of so many different sessions.  
And, when the session ID count eventually does wrap around at 2^24, it
does not provide a way to distinguish session IDs from previous insertions
from session IDs in the current insertion.

So, I propose an alternative to the present design, based on the following 
assumptions:

#1) virtual insertion/removal of __software__ tokens will be infrequent.  
It will be a configuration change to a running server (or client) that 
will necessitate it.  It will not happen nearly as often as the removal
and insertion of a physical token may occur in a client.  

#2) the solution should avoid problems of reuse of old sessions, even if
those old sessions last a REALLY Long time, e.g. even after 2^24 sessions
have come and gone, it should not be possible for an application to cause
a problem by reusing a session ID that it got over 2^24 sessions ago.
We should be able to tell whether a session is from the current insertion
or from a previous insertion, regardless of the number of sessions that 
have been issued since the last (re)insertion.

Note that I'm talking about session IDs that survive a large number of 
other session creations, not that survive a large number of token insertions
and removals.  

#3) For each software token "slot", we keep a counter of the number of times
that a token has been (re)inserted into that slot.  Each time a software 
token is (re)inserted, we increment the counter.  We call this counter the
slot "generation" or "series" number.  

#4) Each NSS software token slot continues to keep linked lists of sessions, 
which lists are found using a hash of the session ID value.  

Proposal: part A

I propose that a session ID contain 3 parts: 
1. slot number (8 bits)
2. slot "series" number  (6 least significant bits of the slot's series
   number at the time the session was created).
3. session number within slot and series.  These 18-bit numbers start at 1 
   each time the token is (re)inserted in the slot, and wrap around as needed.
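
To make the layout concrete, here is a small sketch of how the three parts
could be packed into and unpacked from a 32-bit session ID. The bit order
(slot in the top 8 bits, then series, then session number in the low 18
bits) is my assumption, as are all of the function and macro names.

```c
#include <assert.h>
#include <stdint.h>

#define SESSION_NUM_BITS 18u
#define SERIES_BITS       6u
#define SLOT_BITS         8u

#define SESSION_NUM_MASK ((1u << SESSION_NUM_BITS) - 1u)
#define SERIES_MASK      ((1u << SERIES_BITS) - 1u)
#define SLOT_MASK        ((1u << SLOT_BITS) - 1u)

/* Build a session ID; only the 6 low bits of the slot's series are kept. */
static uint32_t make_session_id(uint32_t slot, uint32_t series, uint32_t num)
{
    assert(slot <= SLOT_MASK);
    assert(num <= SESSION_NUM_MASK);
    return (slot << (SERIES_BITS + SESSION_NUM_BITS))
         | ((series & SERIES_MASK) << SESSION_NUM_BITS)
         | num;
}

static uint32_t id_slot(uint32_t id)
{
    return (id >> (SERIES_BITS + SESSION_NUM_BITS)) & SLOT_MASK;
}

static uint32_t id_series(uint32_t id)
{
    return (id >> SESSION_NUM_BITS) & SERIES_MASK;
}

static uint32_t id_number(uint32_t id)
{
    return id & SESSION_NUM_MASK;
}
```

For example, slot 3 with a series counter of 70 and session number 5 gives
0x03180005: the series field holds 70 mod 64 = 6.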

Using this scheme, it is not necessary for the space of session IDs to be
bigger than twice the maximum number of concurrent sessions.  If we anticipate 
2^17 concurrent sessions, then 18 bits is enough for the session number.  

Session numbers are assigned round-robin (skipping in-use numbers) from
within the space of 2^18 (1/4 meg) sessions, much as they are now (with
Wan-Teh's recent patch).  The session table entry records the entire
session ID (all 3 parts).  A session ID from a previous insertion will
not match a session ID from a later insertion because of the series number
inside the session ID.  When a token is reinserted, all the old sessions
are terminated and removed from the slot's session tables.  All the sessions
entered into the table after that have the new series number, and will not
match any session IDs issued before the reinsertion.

A 6-bit series number will detect use of old sessions for up to 63 
reinsertions of the token, regardless of the number of sessions that have
been created on that token since the old session was originally created.
The ability to detect stale session IDs is limited only by the number of
reinsertions, not by the number of intervening session creations.
It seems unlikely to me that a thread would hold onto a session ID and
not discover that the token has been reinserted for so long a time that
the token is reinserted exactly 64 times (or a multiple thereof).
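
The series check itself is a single comparison. A sketch, with
hypothetical names, of how a stale ID would be recognized and where the
6-bit field stops helping:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SERIES_BITS 6u
#define SERIES_MASK ((1u << SERIES_BITS) - 1u)

/*
 * id_series:   the 6-bit series field stored in the session ID.
 * slot_series: the slot's full (re)insertion counter right now.
 * Returns true when the ID was issued under a different series.
 */
static bool session_is_stale(uint32_t id_series, uint32_t slot_series)
{
    assert(id_series <= SERIES_MASK);
    return id_series != (slot_series & SERIES_MASK);
}
```

A session created when the slot's counter was 5 is reported stale when the
counter is anywhere from 6 through 68, but not at 69 (5 + 64), which is the
one unlikely case described above.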

Proposal: part B

In the present implementation, each NSS softoken slot uses 1024 linked 
lists of session IDs.  I propose that the "hash" function used to find
the index of the chain "head" from the session ID simply be to use the 
10 least significant bits of the session ID as the index.  If the number 
of concurrent sessions is limited to 256K, then there can be at most 
256 session ID entries in each list.  
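
A sketch of that "hash" and of the per-chain bound follows. The names are
mine, not taken from the softoken source.

```c
#include <assert.h>
#include <stdint.h>

#define CHAIN_BITS 10u
#define NUM_CHAINS (1u << CHAIN_BITS)      /* 1024 linked lists per slot */

/* The chain index is simply the 10 least significant bits of the ID. */
static uint32_t chain_index(uint32_t session_id)
{
    return session_id & (NUM_CHAINS - 1u);
}

/* Upper bound on entries per chain for a given session-number width. */
static uint32_t max_entries_per_chain(uint32_t session_num_bits)
{
    assert(session_num_bits >= CHAIN_BITS && session_num_bits < 32u);
    return 1u << (session_num_bits - CHAIN_BITS);
}
```

With 18-bit session numbers this gives 2^18 / 2^10 = 256 entries per chain
at most.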

When the session ID numbers wrap around after 2^18 (256k) sessions, we need 
to be able to find new session ID numbers that are not already in use.
To do this efficiently, I propose to use bit masks to quickly determine 
whether a session number is already in use within a slot.  A single bit
mask for the slot would have to be protected by a lock that could become
hot (a slot lock).  To avoid a hot lock, I propose that there be a 
separate bit mask associated with each of the slot's linked lists of 
sessions (that is, with each of the slot's session chain "heads").  
The lock that protects each chain also protects the bit mask for that
chain.  A single bit mask of 256 bits (32 bytes, 8 32-bit words) can be 
used to quickly find an unused session ID from the space of IDs that can 
be kept in that chain.  

When searching for an unused session ID, we simply find the next bit in 
the mask that is zero (indicating not in use), then the session ID is
simply 
	(bit_index << 10) | list_head_index

If we find that the entire chain is full, we release the lock on the 
current chain and try another chain.
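
Here is a minimal sketch of the search, under a few assumptions of mine:
the Chain struct and function names are hypothetical, the per-chain lock
is shown only as comments, the scan takes the lowest zero bit rather than
resuming from the last position, and reserving session number 0 is left
out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CHAIN_BITS     10u
#define NUM_CHAINS     (1u << CHAIN_BITS)
#define BITS_PER_CHAIN 256u
#define WORDS_PER_MASK (BITS_PER_CHAIN / 32u)   /* 8 32-bit words */

typedef struct {
    uint32_t mask[WORDS_PER_MASK];   /* bit set = session number in use */
    /* a real chain would also hold its lock and its list head */
} Chain;

static Chain chains[NUM_CHAINS];

/* Claim the lowest zero bit in one chain's mask; false if the chain is full. */
static bool claim_in_chain(Chain *c, uint32_t *bit_index)
{
    for (uint32_t w = 0; w < WORDS_PER_MASK; w++) {
        if (c->mask[w] == UINT32_MAX)
            continue;                        /* all 32 numbers in use */
        for (uint32_t b = 0; b < 32u; b++) {
            if (!(c->mask[w] & (1u << b))) {
                c->mask[w] |= 1u << b;
                *bit_index = w * 32u + b;
                return true;
            }
        }
    }
    return false;
}

/* Try start_chain first, then the following chains, until a number is found. */
static bool alloc_session_number(uint32_t start_chain, uint32_t *num)
{
    assert(start_chain < NUM_CHAINS && num != NULL);
    for (uint32_t i = 0; i < NUM_CHAINS; i++) {
        uint32_t head = (start_chain + i) & (NUM_CHAINS - 1u);
        uint32_t bit;
        /* lock chains[head] here */
        bool ok = claim_in_chain(&chains[head], &bit);
        /* unlock chains[head] here */
        if (ok) {
            *num = (bit << CHAIN_BITS) | head;
            return true;
        }
    }
    return false;                            /* all 2^18 numbers in use */
}
```

Each attempt touches at most 8 words under a single chain lock, so a long
run of in-use numbers costs a few word compares rather than one hash
lookup per candidate.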

Conclusion:

This design uses less memory than the present day design because it 
limits the number of concurrent session entries to at most 2^18 rather
than 2^24.  The 1024 32byte bit masks occupy only 32kB per slot.  

If the performance of the existing design is deemed too slow, or using
too much memory, this alternative should be easy to implement and should
improve speed and memory use.
(Reporter)

Comment 3

16 years ago
Changed the QA contact to Bishakha.
QA Contact: sonja.mirtitsch → bishakhabanerjee
(Assignee)

Comment 4

16 years ago
This bug has been fixed in 3.4
Status: NEW → RESOLVED
Last Resolved: 16 years ago
Resolution: --- → FIXED
(Reporter)

Comment 5

16 years ago
This bug (or rather request for enhancement) is not fixed in
NSS 3.4.  This bug is to request a method for generating unique
PKCS #11 session handles that is more efficient than the method
used in NSS 3.4.

I am reopening this bug and setting target milestone to Future.
We can also resolve it WONTFIX if the method used in NSS 3.4
does not have performance problems.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: 3.4.1 → Future
QA Contact: bishakhabanerjee → jason.m.reid
QA Contact: jason.m.reid → libraries