Closed
Bug 364684
Opened 18 years ago
Closed 17 years ago
NSS crashes when slot's session handle counter overflows
Categories
(NSS :: Libraries, defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
3.11.6
People
(Reporter: nelson, Assigned: nelson)
Details
Attachments
(2 files, 5 obsolete files)
15.74 KB,
patch
|
rrelyea
:
review+
wtc
:
superreview+
|
Details | Diff | Splinter Review |
13.44 KB,
patch
|
Details | Diff | Splinter Review |
sftk_HandleObject uses the following code to assign new handles to session objects. 1818 /* PKCS #11 object ID's are unique for all objects on a 1819 * token */ 1820 PZ_Lock(slot->objectLock); 1821 object->handle = slot->tokenIDCount++; 1822 PZ_Unlock(slot->objectLock); It actually does this for all new objects, but then the handle is typically changed again for token objects. Anyway, when the value of this tokenIDCount counter goes from 0x7fffffff to 0x80000000, bad things happen, specifically crashes. This is because the sign bit tells softoken that this is a token object handle, not a session object handle. This is actually seen to happen about every 90 days in busy servers. It's obviously easy enough to change that line of code to AND off the high order bit, e.g. 1821 object->handle = slot->tokenIDCount++ & ~SFTK_TOKEN_MASK; But that won't solve the problem of duplicated session handles. So, the challenge is to find a better way of assigning unique session handles. I can think of several ideas, including a free list of handle values. Good ideas welcome.
Assignee | ||
Comment 1•18 years ago
|
||
Bob, in your review, please tell me if the understanding reflected in the new comments is incorrect, and if so, please enlighten me.
Assignee | ||
Updated•18 years ago
|
Priority: -- → P1
Target Milestone: --- → 3.11.5
Assignee | ||
Updated•18 years ago
|
Attachment #249605 -
Flags: review?(wtchang)
Assignee | ||
Comment 2•18 years ago
|
||
It's unclear to me how we want to target this bug fix. Do we intend to hold of all softoken fixes for the remainder of NSS 3.11.x ? Do all softoken fixes go into 3.12 (and later) at this point? Or might we produce 3.11.x releases of softoken (with x > 5) that are not FIPS validated and contain fixes not in softoken 3.11.4 ? Until I know, this bug will be targetted at 3.12.
Target Milestone: 3.11.5 → 3.12
Assignee | ||
Comment 3•18 years ago
|
||
Comment on attachment 249605 [details] [diff] [review] patch v1 I see a few typos in my patch that I will fix before checkin, e.g. >+ * At this point, all new objects are structured session objects. Should read: structured as session objects. >+ * sftk_narrowToXxxObject uses soft_isToken, Should read: uses sftk_isToken I should initialize duplicateObject to NULL at the beginning of this new loop below. >+ do { >+ PRUint32 index; >+ >+ PZ_Lock(slot->objectLock); >+ handle = slot->tokenIDCount; >+ slot->tokenIDCount = (handle + 1U) & ~SFTK_TOKEN_MASK; >+ /* Is there already a session object with this handle? */ >+ index = sftk_hash(handle, slot->tokObjHashSize); >+ sftkqueue_find2(duplicateObject, handle, index, slot->tokObjects); >+ PZ_Unlock(slot->objectLock); >+ } while (duplicateObject != NULL); >+ object->handle = handle;
Assignee | ||
Comment 4•18 years ago
|
||
This patch contains all the fixes I described in previous comments, plus one enhancement. It only starts checking for duplicate object IDs after the object ID counter has wrapped around once. With this patch, the counter counts from 0x00000001 up to 0x7fffffff, then keeps going up through 0x80000000 and eventually through 0xffffffff. We mask off the high order bit to get the actual object handle value, so the session object handles are always in the range 0x1-0x7fffffff. Once the high-order "sign" bit is set in the counter, it stays set. When the counter overflows from 0xffffffff to zero, it gets set to 0x80000001. This way, we remember that it has wrapped around (gone past 0x7fffffff) and we avoid using a zero object handle. The sign bit also serves to remind us to check for (and avoid) duplicate object handles.
Attachment #249605 -
Attachment is obsolete: true
Attachment #249825 -
Flags: superreview?(wtchang)
Attachment #249825 -
Flags: review?
Attachment #249605 -
Flags: review?(wtchang)
Attachment #249605 -
Flags: review?(rrelyea)
Assignee | ||
Updated•18 years ago
|
Attachment #249825 -
Flags: review? → review?(rrelyea)
Assignee | ||
Comment 5•18 years ago
|
||
This patch renames "tokenIDCount" (which was not a count of token IDs) to "sessionObjectHandleCount" which accurately describes its purpose. It also uses a variable named minSessionObjectHandle, rather than a hard-coded decimal constant 1, to initialize the sessionObjectHandleCount at NSS Init time, and also upon counter overflow. This allows me to test the overflow code by setting the value of this variable to a large value, such as 0x7fffff80 in the debugger.
Attachment #249825 -
Attachment is obsolete: true
Attachment #249976 -
Flags: superreview?(wtchang)
Attachment #249976 -
Flags: review?(rrelyea)
Attachment #249825 -
Flags: superreview?(wtchang)
Attachment #249825 -
Flags: review?(rrelyea)
Comment 6•18 years ago
|
||
Comment on attachment 249976 [details] [diff] [review] patch v3 We already solved this problem for session handles in bug 125149. So I encourage you to make the code that tests for a duplicate object handle as close as possible to the corresponding code in NSC_OpenSession. In particular, a sftkqueue_find call could replace the sftk_hash and sftkqueue_find2 calls in this patch. I'd not rename the tokenIDCount member, or would do a more complete job. To see why, please read this comment in pkcs11i.h: > ..., objectLock protects all elements of the token >- * object hash table (tokObjects[], tokenIDCount, and tokenHashTable), >+ * object hash table (tokObjects[], sessionObjectHandleCount, and tokenHashTable), We renamed one member, the the comment still says "token object hash table", and there are still members named tokObjects and tokObjHashSize. In pkcs11.c, I'd declare minSessionObjectHandle as const and only manually remove the const before building the file for a debugging session. If this bug is critical and there is no workaround, we need to fix it in a NSS 3.11.x patch release, which will need to be revalidated. (How the revalidation should be done is another question.)
Attachment #249976 -
Flags: superreview?(wtchang) → superreview+
Assignee | ||
Comment 7•18 years ago
|
||
This patch is the same as v3, plus the following changes: a) now uses a sftkqueue_find call instead of the sftk_hash and sftkqueue_find2 calls used in previous patch. b) more complete renaming of variables related to the slot's object hash table and the slot's session object handle counter. The changed names were/are: tokObjects -> slotObjects (new in this patch) tokObjHashSize -> objectHashSize ( " " " " ) tokenHashTable -> objectHashTable ( " " " " ) tokenIDCount -> sessionObjectHandleCount (also in patch v3) This renaming affects only 3 files in softoken, but it changes many lines in those 3 files. Because of the magnitude of this change, I'm asking for review of this patch, especially from Bob Relyea (even though patch v3 got sr+).
Attachment #250338 -
Flags: superreview?(wtchang)
Attachment #250338 -
Flags: review?(rrelyea)
Assignee | ||
Comment 8•18 years ago
|
||
Wan-Teh, in comment 6 you wrote: > If this bug is critical and there is no workaround, we need to fix it in > a NSS 3.11.x patch release, which will need to be revalidated. This bug is critical for NSS servers that stay up long enough for this counter to overflow. That is less than 90 days for some busy servers. I'm genuinely surprised that we haven't seen this 4+ year old bug before now. We generate at least 5 new objects for each SSL connection. It overflows after ~429.5 million connections. At 55 connections/second, that's ~90 days. At 100 connections/second, that's less than 50 days. There is no workaround, other than stopping and restarting the server before the counter overflows. That workaround is unacceptable to some server owners.
Comment 9•17 years ago
|
||
Comment on attachment 250338 [details] [diff] [review] patch v4, incorporating Wan-Teh's comments I strongly suggest that you create a version of this patch without the renaming for the NSS_3_11_BRANCH. This patch is inundated with noise that it's hard to see the bug fix. >+static PRUint32 minSessionObjectHandle = 1U; I still think we should make this variable 'const'. >+ sftkqueue_find(duplicateObject, handle, slot->slotObjects, \ >+ slot->objectHashSize); Is it necessary to continue the line here with a backslash (\)? It's not necessary in C, but since sftkqueue_find is a C preprocessor macro, I am not sure what the rule is. >- slot->tokObjHashSize = SPACE_TOKEN_OBJECT_HASH_SIZE; >+ slot->objectHashSize = SPACE_TOKEN_OBJECT_HASH_SIZE; >- slot->tokObjHashSize = TIME_TOKEN_OBJECT_HASH_SIZE; >+ slot->objectHashSize = TIME_TOKEN_OBJECT_HASH_SIZE; This is another reason that I think renaming should be done in a separate patch. We see "TOKEN_OBJECT" here again.
Attachment #250338 -
Flags: superreview?(wtchang) → superreview+
Assignee | ||
Comment 10•17 years ago
|
||
Alright, I'll make a separate patch for the branch. This patch is waiting for Bob Relyea's review, even on the trunk.
Target Milestone: 3.12 → 3.11.5
Comment 11•17 years ago
|
||
(In reply to comment #7) > > tokenHashTable -> objectHashTable ( " " " " ) The new name may be incorrect. We seem to store token key objects in this hash table.
Comment 12•17 years ago
|
||
Comment on attachment 250338 [details] [diff] [review] patch v4, incorporating Wan-Teh's comments >+ * (head[]->refCount), objectLock protects all elements of the slot's It may be better to change "token's" to "slot's". >+ SFTKObject **slotObjects; /* variable - reset */ >+ unsigned int objectHashSize; /* invariant */ We seem to only store session object handles in this hash table. If I'm right, the new names probably should reflect that. The fact that only session object handles are stored in this hash table is used in proving the correctness of the following line in this patch: + sftkqueue_find(duplicateObject, handle, slot->slotObjects, \ + slot->objectHashSize);
Comment 13•17 years ago
|
||
Comment on attachment 250338 [details] [diff] [review] patch v4, incorporating Wan-Teh's comments r+ = relyea
Attachment #250338 -
Flags: review?(rrelyea) → review+
Assignee | ||
Comment 14•17 years ago
|
||
Wan-Teh, Now that you've pointed out that the NSS-style hash table of objects is really exclusively a table of session objects, I am convinced more than ever that these variable names MUST be changed, even on the branch. There were two variables whose names CLEARLY identified them as pertaining to token objects, that were in fact exclusively for session objects. That's as wrong as it can be, as wrong as a logic error in the code, and completely misleading to developers (like me!). I think this new set of names clears it up. Please let me know if you disagree. So, this patch is the same as the previous one, except for the new names tokenIDCount -> sessionObjectHandleCount (also in patch v3) tokenHashTable -> tokObjHashTable (new in this patch) tokObjects -> sesObjHashTable ( " " " " ) tokObjHashSize -> sesObjHashSize ( " " " " )
Attachment #249976 -
Attachment is obsolete: true
Attachment #249976 -
Flags: review?(rrelyea)
Assignee | ||
Comment 15•17 years ago
|
||
I also meant to fix the misnamed _TOKEN_OBJECT_HASH_SIZE symbols, which are actually SESSION_OBJECT_HASH_SIZEs.
Attachment #250541 -
Attachment is obsolete: true
Attachment #250545 -
Flags: review?(wtchang)
Comment 16•17 years ago
|
||
Comment on attachment 250545 [details] [diff] [review] patch v6 - alternative variable renaming (take two) Nelson, Bob should review the new names. I was just pointing out that some of the new names you chose before were inaccurate, and that there are other members/macros that also need to be renamed. I don't know the code well enough to do a good review quickly. I would change 'sesObj' to 'sessObj' to be consistent with the existing member named 'sessHashSize'. Note that -- NSS developers read both the (patched) code and the diffs -- non-NSS developers only read the diffs So smaller, simpler diffs have its advantages. Imagine the FIPS testing lab or a Federal government consultant review this patch to assess the differences from the version of softoken that we submitted for FIPS validation. Simply adding comments to pkcs11i.h to explain what each structure member is would accomplish most of the benefits of renaming. I don't want to block the checkin of your fix, so you can check in this patch. (I didn't review it.)
Attachment #250545 -
Flags: review?(wtchang) → review?(rrelyea)
Comment 17•17 years ago
|
||
Comment on attachment 250545 [details] [diff] [review] patch v6 - alternative variable renaming (take two) r+, though wtc is correct, sess is more consistant than ses for abreviation for session. The existing code has lots of occurances of sessObject. Also, in the line fixing my typo of declare, it misses the typo of formatting.
Attachment #250545 -
Flags: review?(rrelyea) → review+
Assignee | ||
Comment 18•17 years ago
|
||
This patch incorporates the suggestion to use an "sess" prefix, rather than "ses". This was committed on the trunk some time ago. I plan to commit it to the 3_11 BRANCH today.
Attachment #250545 -
Attachment is obsolete: true
Assignee | ||
Comment 19•17 years ago
|
||
Committed on branch. Checkin comment: Bug 364684. Don't let the counter whose values are used for session object handles overflow the sign bit. Rename several softoken variables to correctly identify their meaning. r=rrelyea,wtchang tokenIDCount -> sessionObjectHandleCount tokenHashTable -> tokObjHashTable tokObjects -> sessObjHashTable tokObjHashSize -> sessObjHashSize Checking in pkcs11.c; new revision: 1.112.2.21; previous revision: 1.112.2.20 Checking in pkcs11i.h; new revision: 1.41.2.4; previous revision: 1.41.2.3 Checking in pkcs11u.c; new revision: 1.69.2.4; previous revision: 1.69.2.3
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Target Milestone: 3.11.5 → 3.11.6
You need to log in
before you can comment on or make changes to this bug.
Description
•