Closed Bug 123693 Opened 23 years ago Closed 21 years ago

Memory leaks in the error paths in softoken/pkcs11.c, PK11_SlotInit

Categories

(NSS :: Libraries, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: wtc, Assigned: wtc)

Details

Attachments

(2 files, 2 obsolete files)

In mozilla/security/nss/lib/softoken/pkcs11.c, function PK11_SlotInit(),
if we fail in initializing any member of the PK11Slot object, the
partially constructed PK11Slot object is not destroyed.  So we leak
memory in the error paths.

The best way to fix this is probably to call pk11_DestroySlotData()
before we return from PK11_SlotInit() on failure.  This may require
fixing pk11_DestroySlotData() so that it can destroy a partially
constructed PK11Slot object.
Priority: -- → P3
Target Milestone: --- → 3.4.1
Priority: P3 → P2
Changed the QA contact to Bishakha.
QA Contact: sonja.mirtitsch → bishakhabanerjee
Set target milestone to NSS 3.5.
Target Milestone: 3.4.1 → 3.5
Set target milestone to NSS 3.6.
Target Milestone: 3.5 → 3.6
Target Milestone: 3.6 → 3.7
Moved to target milestone 3.8 because the original
NSS 3.7 release has been renamed 3.8.
Target Milestone: 3.7 → 3.8
Remove target milestone of 3.8, since these bugs didn't get into that release.
Target Milestone: 3.8 → ---
Target Milestone: 3.9
Target Milestone: --- → 3.9
Removing 3.9 as target milestone.  I will look at this bug.  
Target Milestone: 3.9 → ---
Assignee: kirk.erickson → MisterSSL
Status: NEW → ASSIGNED
Attachment #140142 - Flags: review?(wchang0222)
Nominated for NSS 3.9.1
Target Milestone: --- → 3.9.1
Comment on attachment 140142 [details] [diff] [review]
Implement Wan-Teh's suggested fix.

This patch looks correct.  I verified that
pk11_DestroySlotData can destroy partially
constructed PK11Slot.

I think it is better to say:
	crv = CKR_HOST_MEMORY;
	goto loser;
than "goto mem_loser;" because the former
is the prevalent style in this file.  (I
wonder if you did a lot of assembly
programming before.)
Attachment #140142 - Flags: review?(wchang0222) → review?(rrelyea0264)
Comment on attachment 140142 [details] [diff] [review]
Implement Wan-Teh's suggested fix.

>     if (slot == NULL) {
>-	return CKR_HOST_MEMORY;
>+	return CKR_HOST_MEMORY;		/* this error is bogus */
>     }

I don't know why you added this comment.
pk11_NewSlotFromID can only fail because
memory allocation fails (except one case:
if moduleIndex != index).

I've verified that this patch is correct,
with the two issues I mentioned in this
comment and the previous comment.
Attachment #140142 - Flags: superreview+
Comment on attachment 140142 [details] [diff] [review]
Implement Wan-Teh's suggested fix.

The goto memloser in this function is fine.
Attachment #140142 - Flags: review?(rrelyea0264) → review+
This fix doesn't need to go into NSS 3.9.1.
I found this bug during code inspection.  No
customer was affected by this bug.
softoken has many places that return CKR_HOST_MEMORY regardless of the 
real underlying error.  I think we should fix them.
In this example, there is one other possible cause.  
But I removed the new comment, and checked it in.

/cvsroot/mozilla/security/nss/lib/softoken/pkcs11.c,v  <--  pkcs11.c
new revision: 1.90; previous revision: 1.89
Status: ASSIGNED → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Target Milestone: 3.9.1 → 3.10
I backed out that patch.  It caused shlibsign to crash on windows.
shlibsign is invoked by the makefile without a reasonable -d argument,
so it tries to open the DBs in $HOME/.netscape (which typically doesn't
exist on windows), which fails.  
The patch above causes the slot structure to be freed, but the reference
to that structure still exists in a PL hash table.  I will attach a new
patch that fixes that.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attached patch Patch v2Splinter Review
This patch is essentially the same as v1, except:
1) the addition of a call to /tmp/123693b.txt at the new loser label, and
2) the removal of an assertion that is no longer correct, now that we remove
slots during initialization.

I think this bug actually was probably originally invalid.  
It claimed that data was leaked, but actually the data was still referenced
by the hash table entry, and was cleaned up during shutdown.
The new patch cleans up the data right away, rather than waiting until
shutdown.
Attachment #140142 - Attachment is obsolete: true
Attachment #140204 - Flags: review?(rrelyea0264)
Seems to have been a but-n-paste problem in that last comment.  
I meant to write:

1) the addition of a call to PL_HashTableRemove at the new loser label, and
2) the removal of an assertion that is no longer correct, now that we remove
slots during initialization.

Status: REOPENED → ASSIGNED
Comment on attachment 140204 [details] [diff] [review]
Patch v2

As I look at stuff, We may return this non-existant slotID if the application
calls C_GetSlotList().

This shouldn't be an issue since 1) C_Initialize() will fail,  and the code
will always return NULL for this slot (which will appear to the user as in
invalid slot id).

Nothing wrong with the patch, just the slightly different underlying semantics
incase something else weird pops up.

bob
Attachment #140204 - Flags: review?(rrelyea0264) → review+
Checked in Patch v2.

/cvsroot/mozilla/security/nss/lib/softoken/pkcs11.c,v  <--  pkcs11.c
new revision: 1.92; previous revision: 1.91
Status: ASSIGNED → RESOLVED
Closed: 21 years ago21 years ago
Resolution: --- → FIXED
If an error occurs during the construction of
a new slot, the fix for this bug only removes
the slot from the slot hash table but does not
remove it from the slot list.  This leaves the
slot list inconsistent with the slot hash table.

Here is a supplemental patch that maintains the
consistency between the slot list and the
slot hash table.  It achieves that by adding the
new slot to the slot hash table and slot list
only after the slot is fully constructed.  So
we don't have to worry about removing the slot
from the slot hash table and slot list if the
construction of the slot fails.

1. I changed pk11_NewSlotFromID to be only
responsible for adding the slot to the slot
hash table and slot list and not responsible
for creating the PK11Slot structure.  I
renamed it pk11_RegisterSlotWithID to reflect
its reduced responsibility.

I changed the return type of this function
to report the reason of the failure.

2. I removed one of the two duplicate assignments
"slot->slotID = slotID;".

3. I removed the removal of the slot from the
slot hash table under the "loser" label.

4. I restored the assertion that asserts the
consistency between the slot hash table and the
slot list.
Comment on attachment 140282 [details] [diff] [review]
Add the new slot to the slot hash table and slot list only when the slot is fully constructed

Let me know if you think this supplemental patch is a good idea.  Thanks.
Attachment #140282 - Flags: superreview?(MisterSSL)
Attachment #140282 - Flags: review?(rrelyea0264)
Comment on attachment 140282 [details] [diff] [review]
Add the new slot to the slot hash table and slot list only when the slot is fully constructed

Bob, if you like this patch, I'd like you to
decide
1. the new name of the pk11_NewSlotFromID
function; and

2. the appropriate error code for

     /* make sure the slotID for this module is valid */
     if (moduleIndex != index) {
-	return NULL;
+	return CKR_DEVICE_ERROR;
     }
1) The register name is fine.
2) CKR_SLOT_ID_INVALID or CKR_ARGUMENTS_BAD. The former is better.

We should make sure all.sh continues to run, and the builds continue to work.
Also, after checking in, we should make sure there is no regression in mozilla
switching into FIPS mode.

bob
I'm reopening this because I think Wan-Teh's patch is a superior solution 
to the one that is checked in.  Review comments will follow.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Reassigning to Wan-Teh
Assignee: MisterSSL → wchang0222
Status: REOPENED → NEW
Comment on attachment 140282 [details] [diff] [review]
Add the new slot to the slot hash table and slot list only when the slot is fully constructed

r=MisterSSL with two suggested changes:

1. Return CKR_SLOT_ID_INVALID instead of CKR_DEVICE_ERROR here:

>@@ -2372,7 +2372,7 @@
> 
>     /* make sure the slotID for this module is valid */
>     if (moduleIndex != index) {
>-	return NULL;
>+	return CKR_DEVICE_ERROR;  /* XXX is this the right error? */
>     }
> 
>     if (nscSlotList[index] == NULL) {


2.  remove the slotID argument from pk11_RegisterSlotWithID, 

>-PK11Slot * pk11_NewSlotFromID(CK_SLOT_ID slotID, int moduleIndex)
>+static CK_RV
>+pk11_RegisterSlotWithID(PK11Slot *slot, CK_SLOT_ID slotID, int moduleIndex)
> {

and do not assign that argument value to slot->slotID in that function.
e.g.

>     slot->index = (nscSlotCount[index] & 0x7f) | ((index << 7) & 0x80);
>-     slot->slotID = slotID;
>-    nscSlotList[index][nscSlotCount[index]++] = slotID;
>+    nscSlotList[index][nscSlotCount[index]++] = slot->slotID;

Instead, go back to setting slot->slotID where it was set before.

>@@ -2534,7 +2528,6 @@
>     slot->isLoggedIn = PR_FALSE;
>     slot->ssoLoggedIn = PR_FALSE;
>     slot->DB_loaded = PR_FALSE;
>+    slot->slotID = slotID;
>     slot->certDB = NULL;
>     slot->keyDB = NULL;
>     slot->minimumPinLen = 0;

The rationale for this change is to set set->slotID as early as possible.
In my previous patch, I changed pk11_NewSlotFromID to set slot->slotID
because that function was called at the very beginning of PK11_SlotInit,
so I was changing it to set slot->slotID earlier than it had been.
Now that PK11_SlotInit has been renamed pk11_RegisterSlotWithID, and the
call to it has been moved to near the end of PK11_SlotInit, I'd prefer that
slot->slotID be set at the earlier time.
Attachment #140282 - Flags: superreview?(MisterSSL) → superreview+
Attachment #140282 - Flags: review?(rrelyea0264)
I incorporated Bob's and Nelson's review comments.
Attachment #140282 - Attachment is obsolete: true
Attachment #140637 - Flags: superreview?(MisterSSL)
Attachment #140637 - Flags: review?(rrelyea0264)
Comment on attachment 140637 [details] [diff] [review]
Supplemental patch v1.1

r=MisterSSL
Attachment #140637 - Flags: superreview?(MisterSSL) → superreview+
Comment on attachment 140637 [details] [diff] [review]
Supplemental patch v1.1

looks good r=relyea
Attachment #140637 - Flags: review?(rrelyea0264) → review+
Did the tests Bob asked for in comment 23.

Built and ran all.sh on Windows XP.  Had Mozilla switch into
and out of FIPS mode.

Checking in pkcs11.c;
/cvsroot/mozilla/security/nss/lib/softoken/pkcs11.c,v  <--  pkcs11.c
new revision: 1.93; previous revision: 1.92
done
Status: NEW → RESOLVED
Closed: 21 years ago21 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: