In NSS 3.4, we generate new PKCS#11 session handles from a 24-bit counter. When the counter reaches 2^24 and wraps around, we need to check that the session handle we generate is not still in use. We do that by looking up the handle in the session hash table and retrying if there is a collision. This can be expensive if a big block of consecutive session handles is still in use. If performance profiling shows that the lookup-and-retry method in NSS 3.4 is a bottleneck, we will need to implement a more efficient method to generate PKCS#11 session handles.
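For readers who want to see the shape of the lookup-and-retry method described above, here is a minimal sketch. It is not the NSS 3.4 code: `ToySlot`, `toy_lookup`, and `toy_new_handle` are hypothetical names, a small linear array stands in for the real session hash table, and the 8-bit slot part of the handle is left out. It only illustrates why a long run of in-use handles makes each allocation after wraparound more expensive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SESSION_COUNT_MASK 0x00FFFFFFu /* 24-bit per-slot counter */
#define TOY_TABLE_SIZE     64u         /* hypothetical, kept tiny for the sketch */

/* Hypothetical stand-in for a slot: a flat array replaces the hash table. */
typedef struct {
    uint32_t in_use[TOY_TABLE_SIZE]; /* handles currently in use */
    size_t   count;                  /* number of valid entries in in_use */
    uint32_t counter;                /* last session number handed out */
} ToySlot;

/* Returns true if the handle is still in use (the "lookup" step). */
static bool toy_lookup(const ToySlot *slot, uint32_t handle) {
    for (size_t i = 0; i < slot->count; i++) {
        if (slot->in_use[i] == handle) {
            return true;
        }
    }
    return false;
}

/* Advances the counter until it lands on a handle that is not in use
 * (the "retry" step). Returns 0 if the toy table has no room left. */
static uint32_t toy_new_handle(ToySlot *slot) {
    assert(slot != NULL);
    if (slot->count >= TOY_TABLE_SIZE) {
        return 0;
    }
    for (;;) {
        slot->counter = (slot->counter + 1u) & SESSION_COUNT_MASK;
        if (slot->counter == 0) {
            continue; /* 0 is CK_INVALID_HANDLE in PKCS#11, never issue it */
        }
        if (!toy_lookup(slot, slot->counter)) {
            break; /* one lookup per candidate: costly across a long in-use run */
        }
    }
    slot->in_use[slot->count++] = slot->counter;
    return slot->counter;
}
```

Every candidate handle costs one table lookup, so a block of N consecutive in-use handles right after the wraparound point costs N lookups before a free handle is found.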
Created attachment 70961: selfserv results with the tip (beta2)

Performance appears to be on a par with where we were before the fix for bug 125149.
The existing design of session ID numbers for the software tokens allocates 8 bits to the slot number and 24 bits to the session number within the slot. But we do not anticipate having anywhere near 2^24 simultaneous PKCS#11 sessions. Our present tests use fewer than 2^16. Even allowing for growth and scalability, we anticipate fewer than 2^18 sessions.

The reason that the number of sessions is so high (2^24) in the current design is to reduce problems that arise from the reuse of old session IDs after a softoken removal/insertion. In the event that a software token is logically removed and reinserted (e.g. due to reconfiguration of, say, a DB token), it would be bad if session IDs issued immediately after the insertion were indistinguishable from session IDs that were issued before the removal and are still being used by threads that are not yet aware of the reinsertion.

But having such a large space of sessions leads to potentially large amounts of storage required to keep track of so many different sessions. And when the session ID count eventually does wrap around at 2^24, it does not provide a way to distinguish session IDs from previous insertions from session IDs in the current insertion.

So, I propose an alternative to the present design, based on the following assumptions:

#1) Virtual insertion/removal of __software__ tokens will be infrequent. It will be a configuration change to a running server (or client) that will necessitate it. It will not happen nearly as often as the removal/insertion of a physical token may occur in a client.

#2) The solution should avoid problems of reuse of old sessions, even if those old sessions last a REALLY long time. E.g. even after 2^24 sessions have come and gone, it should not be possible for an application to cause a problem by reusing a session ID that it got over 2^24 sessions ago.
We should be able to tell whether a session is from the current insertion or from a previous insertion, regardless of the number of sessions that have been issued since the last (re)insertion. Note that I'm talking about session IDs that survive a large number of other session creations, not ones that survive a large number of token insertions and removals.

#3) For each software token "slot", we keep a counter of the number of times that a token has been (re)inserted into that slot. Each time a software token is (re)inserted, we increment the counter. We call this counter the slot "generation" or "series" number.

#4) Each NSS software token slot continues to keep linked lists of sessions, which are found using a hash of the session ID value.

Proposal: part A

I propose that a session ID contain 3 parts:

1. slot number (8 bits)
2. slot "series" number (the 6 least significant bits of the slot's series number at the time the session was created)
3. session number within slot and series. These 18-bit numbers start at 1 each time the token is (re)inserted in the slot, and wrap around as needed.

Using this scheme, it is not necessary for the space of session IDs to be bigger than twice the maximum number of concurrent sessions. If we anticipate 2^17 concurrent sessions, then 18 bits is enough for the session number. Session numbers are assigned round-robin (skipping in-use numbers) from within the space of 2^18 (1/4 meg) sessions, much as they are now (with Wan-Teh's recent patch).

The session table entry records the entire session ID (all 3 parts). A session ID from a previous insertion will not match a session ID from a later insertion because of the series number inside the session ID. When a token is reinserted, all the old sessions are terminated and removed from the slot's session tables. All the sessions entered into the table after that have the new series number, and will not match any session IDs issued before the reinsertion.
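The three-part session ID of part A can be sketched as plain bit packing. This is only an illustration under stated assumptions: the proposal gives the field widths (8, 6, and 18 bits) but not their order, so the layout below (slot in the top 8 bits, series in the next 6, session number in the low 18) is an assumption, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SLOT_BITS    8u
#define SERIES_BITS  6u
#define SESSION_BITS 18u

#define SERIES_MASK  ((1u << SERIES_BITS) - 1u)  /* 0x3F */
#define SESSION_MASK ((1u << SESSION_BITS) - 1u) /* 0x3FFFF */

/* Packs slot, the low 6 bits of the slot's series counter, and the
 * 18-bit session number into one 32-bit ID (assumed field order). */
static uint32_t make_session_id(uint32_t slot, uint32_t slot_series,
                                uint32_t session_num) {
    assert(slot < (1u << SLOT_BITS));
    return (slot << (SERIES_BITS + SESSION_BITS))
         | ((slot_series & SERIES_MASK) << SESSION_BITS)
         | (session_num & SESSION_MASK);
}

static uint32_t id_slot(uint32_t id)    { return id >> (SERIES_BITS + SESSION_BITS); }
static uint32_t id_series(uint32_t id)  { return (id >> SESSION_BITS) & SERIES_MASK; }
static uint32_t id_session(uint32_t id) { return id & SESSION_MASK; }

/* A session ID is stale if its series field differs from the low
 * 6 bits of the slot's current series counter. */
static bool id_is_current(uint32_t id, uint32_t slot_series) {
    return id_series(id) == (slot_series & SERIES_MASK);
}
```

For example, `make_session_id(2, 5, 1)` yields `0x02140001` under this layout. After one reinsertion the slot's series counter is 6, so `id_is_current(id, 6)` is false and the old ID is rejected. Because only the low 6 bits are compared, a series counter that has advanced by an exact multiple of 64 would compare equal again.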
A 6-bit series number will detect use of old sessions for fewer than 64 reinsertions of the token, regardless of the number of sessions that have been created on that token since the old session was originally created. The ability to detect stale session IDs is limited only by the number of reinsertions, not by the number of intervening session creations. It seems unlikely to me that a thread would hold onto a session ID and not discover that the token has been reinserted for so long a time that the token is reinserted exactly 64 times (or a multiple thereof).

Proposal: part B

In the present implementation, each NSS softoken slot uses 1024 linked lists of session IDs. I propose that the "hash" function used to find the index of the chain "head" from the session ID simply be to use the 10 least significant bits of the session ID as the index. If the number of concurrent sessions is limited to 256K, then there can be at most 256 session ID entries in each list (2^18 session numbers / 2^10 lists).

When the session ID numbers wrap around after 2^18 (256K) sessions, we need to be able to find new session ID numbers that are not already in use. To do this efficiently, I propose to use bit masks to quickly determine whether a session number is already in use within a slot.

A single bit mask for the slot would have to be protected by a lock that could become hot (a slot lock). To avoid a hot lock, I propose that there be a separate bit mask associated with each of the slot's linked lists of sessions (that is, with each of the slot's session chain "heads"). The lock that protects each chain also protects the bit mask for that chain. A single bit mask of 256 bits (32 bytes, eight 32-bit words) can be used to quickly find an unused session ID from the space of IDs that can be kept in that chain.
When searching for an unused session ID, we simply find the next bit in the mask that is zero (indicating not in use). The session ID is then simply

    (bit_index << 10) | list_head_index

If we find that the entire chain is full, we release the lock on the current chain and try another chain.

Conclusion:

This design uses less memory than the present-day design because it limits the number of concurrent session entries to at most 2^18 rather than 2^24. The 1024 32-byte bit masks occupy only 32 kB per slot. If the performance of the existing design is deemed too slow, or it uses too much memory, this alternative should be easy to implement and should improve speed and memory use.
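The per-chain bit mask search in part B can be sketched as follows. This is a rough illustration, not a proposed patch: `ChainMask`, `chain_alloc`, and `chain_free` are hypothetical names, the caller is assumed to already hold the chain's lock, and for brevity the scan starts at bit 0 each time instead of continuing round-robin from the last bit handed out.

```c
#include <assert.h>
#include <stdint.h>

#define CHAIN_BITS    10u                   /* 1024 chains per slot */
#define IDS_PER_CHAIN 256u                  /* 2^18 numbers / 2^10 chains */
#define MASK_WORDS    (IDS_PER_CHAIN / 32u) /* eight 32-bit words */
#define NO_FREE_ID    UINT32_MAX            /* hypothetical "chain full" marker */

/* Hypothetical per-chain state; the chain's lock and list head would
 * sit alongside the mask. */
typedef struct {
    uint32_t in_use[MASK_WORDS]; /* bit b set: (b << 10) | chain index is taken */
} ChainMask;

/* Finds a clear bit, marks it, and returns the 18-bit session number
 * (bit_index << 10) | chain_index. Returns NO_FREE_ID if the chain is
 * full, in which case the caller would release the lock and try
 * another chain. */
static uint32_t chain_alloc(ChainMask *chain, uint32_t chain_index) {
    assert(chain_index < (1u << CHAIN_BITS));
    for (uint32_t w = 0; w < MASK_WORDS; w++) {
        if (chain->in_use[w] == UINT32_MAX) {
            continue; /* all 32 numbers in this word are taken */
        }
        for (uint32_t b = 0; b < 32u; b++) {
            if (chain->in_use[w] & (1u << b)) {
                continue;
            }
            uint32_t bit_index = w * 32u + b;
            uint32_t num = (bit_index << CHAIN_BITS) | chain_index;
            if (num == 0) {
                continue; /* session numbers start at 1, so never issue 0 */
            }
            chain->in_use[w] |= 1u << b;
            return num;
        }
    }
    return NO_FREE_ID;
}

/* Clears the bit for a session number when the session is closed. */
static void chain_free(ChainMask *chain, uint32_t session_num) {
    uint32_t bit_index = (session_num & 0x3FFFFu) >> CHAIN_BITS;
    chain->in_use[bit_index / 32u] &= ~(1u << (bit_index % 32u));
}
```

With 1024 chains of 32 bytes each, the masks add up to the 32 kB per slot mentioned in the conclusion, and a full chain is detected after inspecting at most eight words.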
Changed the QA contact to Bishakha.
QA Contact: sonja.mirtitsch → bishakhabanerjee
This bug has been fixed in 3.4
Status: NEW → RESOLVED
Last Resolved: 16 years ago
Resolution: --- → FIXED
This bug (or rather request for enhancement) is not fixed in NSS 3.4. This bug is to request a method for generating unique PKCS #11 session handles that is more efficient than the method used in NSS 3.4. I am reopening this bug and setting target milestone to Future. We can also resolve it WONTFIX if the method used in NSS 3.4 does not have performance problems.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: 3.4.1 → Future
QA Contact: bishakhabanerjee → jason.m.reid