Closed Bug 444850 Opened 17 years ago Closed 16 years ago

NSS misbehaves badly in the presence of a disabled PKCS#11 slot

Categories

(NSS :: Libraries, defect, P1)

3.11

Tracking

(Not tracked)

RESOLVED FIXED
3.11.10

People

(Reporter: nelson, Assigned: nelson)

References

Details

Attachments

(6 files, 10 obsolete files)

941 bytes, patch
alvolkov.bgs
: review+
Details | Diff | Splinter Review
1.33 KB, patch
alvolkov.bgs
: review+
julien.pierre
: review+
Details | Diff | Splinter Review
6.50 KB, patch
julien.pierre
: review+
Details | Diff | Splinter Review
14.44 KB, patch
julien.pierre
: review+
rrelyea
: superreview+
Details | Diff | Splinter Review
13.24 KB, patch
rrelyea
: review+
Details | Diff | Splinter Review
60.07 KB, patch
rrelyea
: review+
alvolkov.bgs
: review+
Details | Diff | Splinter Review
In the past month, we've received numerous reports of NSS misbehaving when it is configured to have Solaris S10's "Crypto Framework" PKCS11 module (a.k.a. MetaSlot, SoftToken (note: two adjacent T's)). Simply adding that PKCS#11 module to the secmod.db, and doing nothing else (not putting any certs or keys into that module, not using that module as the "default" for any mechanisms) is enough to cause the following misbehaviors: 1. The SSL server sends out only its server cert, not the entire cert chain, even though the entire cert chain is present in the secmod.db and has the proper and expected trust flags. 2. Received cert chains that are valid and correct are rejected with "unknown issuer". 3. Attempts to pretty print the contents of a certificate using certutil -Ln cause certutil to crash. There are actually many different bugs in NSS revealed by these tests, but they all have one root cause in common. The PKCS#11 module tells NSS that it has two slots, both non-removable, and one of them, the second one, is not present. This combination of a non-removable, not present, slot/token causes NSS to mark the slot as disabled during initialization. Being marked as disabled means that NSS does not initialize the token structure associated with the slot, and does not establish a default session for that slot. See http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/security/nss/lib/pk11wrap/pk11slot.c&rev=1.93&mark=1323-1329,1334#1323 This appears to be as intended and correct. However, thereafter, NSS ignores these indications that the slot is disabled, and attempts to use the slot and token anyway. This leads to a multitude of errors, including attempting to make PKCS#11 API calls on that slot/token, using an invalid session handle (the value zero). There is one loop, in particular, that walks through all the "active" slots (which includes disabled ones, :-( ) trying to look for certs in all those slots. when it attempts to access one slot with an invalid session number, and the module correctly rejects that attempt, the loop aborts, reporting no certs found, even when there were multiple certs found. That is the immediate cause of most of the symptoms described above. Having debugged this in detail, I intend to attach a series of patches to correct various aspects of this. I intend to fix this at various levels, in a redundant belt-and-suspenders approach. I intend to try to a) stop NSS from using disabled slots b) stop NSS from using the invalid session zero with PKCS#11 modules, and c) get that loop to reports the certs it DID find, even when one or more slots is unhappy, and d) not crash in certutil when an attempt to find all certs with a given subject name fails, even though one such cert is known to exist. Ideally, I'd like Bob Relyea to review these patches. I'm concerned that the patches might detect and introduce failures in cases that are intended to work, despite the obvious problems. But I think we can't wait until August to release the fixes for these symptoms, so I'll be asking others to review this in his absence.
Note that, although the platform on which the problem was observed was Solaris S10, the problems reported here are not specific to that platform but are in code that is common to all platforms. So, this bug is filed against ALL/ALL.
certutil finds a cert by nickname, then calls CERT_CreateSubjectCertList to get a list of certs with that same subject name. At a minimum, the list should come back with the very same cert from which certutil took the subject name for the search criteria. But when that function returns NULL, certutil crashes. This patch fixes that. I will request review later, after I've attached most of the patches
Among other things, this patch causes nssToken_CreateFromPK11SlotInfo to not generate a new token object for a disabled slot. It also avoids a fixes a crash in STAN_InitTokenForSlotInfo that occurs when nssToken_CreateFromPK11SlotInfo returns NULL (which it could do before this patch, so this is a real crash fix). One of my concerns about this patch is that it causes several files to depend on a new header file for the function PK11_IsDisabled. I suppose that there is supposed to be some sort of ordering of dependencies, to prevent circular dependencies, but they aren't recorded anywhere, and so I don't know if it's OK for files in lib/dev and lib/pki to depend on pk11pub.h or not. I hope Bob or Wan-Teh can answer that. If not, then another option is to put a copy of the disabled bit in the NSSSlot structure, also.
This patch detects the use of the reserved invalid session number zero in several places where we blindly accepted it before.
I believe this is the most immediate cause of the problems with failure to find issuer certs when a slot/token is throwing fits.
Comment on attachment 329182 [details] [diff] [review] use PK11_IsDisabled to detect and avoid disabled slots, v1 I think I see a problem with this patch for the function nssTrustDomain_GetActiveSlots. I think this patch makes it leak NSSSlots that are disabled. :( Will test that. > for (tp = tokens; *tp; tp++) { >- slots[count++] = nssToken_GetSlot(*tp); >+ NSSSlot * slot = nssToken_GetSlot(*tp); >+ if (!PK11_IsDisabled(slot->pk11slot)) { >+ slots[count++] = slot; >+ } > }
Comment on attachment 329183 [details] [diff] [review] don't allow (session->handle == CK_INVALID_SESSION), v1 I think this patch is incomplete. - nssToken_Refresh should be smarter and return failure when the session handle is zero and when nssSession_ImportNSS3Session returns NULL. - All the callers of nssToken_Refresh should pay attention to its return value, not ignore it.
This seems more complete than the previous version of this patch.
Attachment #329183 - Attachment is obsolete: true
This version of the patch plugs the leak in the previous version.
Attachment #329182 - Attachment is obsolete: true
corrected version.
Attachment #329189 - Attachment is obsolete: true
I have one more concern about these patches. I'm concerned that there are causes for slots to be disabled temporarily, and this patch might have the unintended effect of permanently disabling (for the lifetime of a process) a slot that was intended to be disabled only temporarily. I think that slots with removable tokens may be disabled while the tokens are removed. I guess it will take lots of testing to find out. Too bad we don't have any sort of regression testing for removable tokens.
Priority: -- → P1
Comment on attachment 329171 [details] [diff] [review] fix certutil crash, v1 (checked in on trunk and branch) This patch is now tested, and as expected, it stops the crash in certutil -L -n Server-Cert.
Attachment #329171 - Flags: superreview?(wtc)
Attachment #329171 - Flags: review?(alexei.volkov.bugs)
I am able to reproduce this problem in selfserv now. The failure occurs while trying to build the server's cert chain to send out to the client. The stack is: =>[1] nssTrustDomain_FindCertificatesBySubject(td = 0x133510, subject = 0x1531d8, rvOpt = (nil), maximumOpt = 0, arenaOpt = 0x13c4c0), line 638 in "trustdomain.c" [2] find_cert_issuer(c = 0x1531a0, timeOpt = (nil), usage = 0xffbff284, policiesOpt = (nil), td = 0x133510, cc = 0x13a870), line 443 in "certificate.c" [3] nssCertificate_BuildChain(c = 0x1531a0, timeOpt = (nil), usage = 0xffbff3a0, policiesOpt = (nil), rvOpt = (nil), rvLimit = 20U, arenaOpt = (nil), statusOpt = (nil), td = 0x133510, cc = 0x13a870), line 514 in "certificate.c" [4] NSSCertificate_BuildChain(c = 0x13c680, timeOpt = (nil), usage = 0xffbff3a0, policiesOpt = (nil), rvOpt = (nil), rvLimit = 20U, arenaOpt = (nil), statusOpt = (nil), td = 0x133510, cc = 0x13a870), line 562 in "certificate.c" [5] CERT_CertChainFromCert(cert = 0x13b598, usage = certUsageSSLServer, includeRoot = 1), line 955 in "certhigh.c" [6] SSL_ConfigSecureServer(fd = 0x47170, cert = 0x13b598, key = 0x142bb8, kea = ssl_kea_rsa), line 736 in "sslsecur.c" [7] server_main(listen_sock = 0x47170, requestCert = 0, privKey = 0xffbff5ac, cert = 0xffbff5c0), line 1450 in "selfserv.c" [8] main(argc = 11, argv = 0xffbff66c), line 2086 in "selfserv.c" I expect that the "Don't abort" patch above will suffice to cure this particular symptom. I will test that shortly.
Comment on attachment 329184 [details] [diff] [review] Don't abort search for certs when a slot reports error, v1 As expected, this patch cures the symptom of the server sending out an empty cert chain (containing only the server cert).
Attachment #329184 - Flags: superreview?(wtc)
Attachment #329184 - Flags: review?(alexei.volkov.bugs)
Comment on attachment 329190 [details] [diff] [review] use PK11_IsDisabled to detect and avoid disabled slots, v2 As expected, this patch has the effect of keeping the disabled slot out of the trust domain's token list, which in turn means that NSS will never attempt to search the disabled token/slot for certs. This patch AVOIDS the problem that the previous patch works around. However, the question is: will this patch have undesired detrimental effects on removable tokens? I'm asking Bob to review it for that reason.
Attachment #329190 - Flags: superreview?(wtc)
Attachment #329190 - Flags: review?(rrelyea)
The previous version of this patch was dependent on another patch. Since I want these patches to be independent, I made this more complete. This patch now contains one piece also found in another of these patches. Also, I found that there were some other bugs in the patched functions, and I fixed them, too. I found that, without this patch, the number of calls to find_objects that attempts to get C_FindObjectsInit to use an invalid (zero) session handle is staggering!
Attachment #329191 - Attachment is obsolete: true
This part is low enough risk that it needn't wait for Bob's return.
Attachment #329623 - Flags: superreview?(wtc)
Attachment #329623 - Flags: review?(alexei.volkov.bugs)
Comment on attachment 329622 [details] [diff] [review] don't allow (session->handle == CK_INVALID_SESSION), v4 I want Bob to review this for inobvious issues. I'm also going to test this on another non-Solaris system that has a removable token.
Attachment #329622 - Flags: review?(rrelyea)
Comment on attachment 329171 [details] [diff] [review] fix certutil crash, v1 (checked in on trunk and branch) r+ with one comment: should not call PORT_SetError to avoid real error code to be overwritten.
Attachment #329171 - Flags: review?(alexei.volkov.bugs) → review+
Comment on attachment 329171 [details] [diff] [review] fix certutil crash, v1 (checked in on trunk and branch) r=wtc. Why do you change the error code to SEC_ERROR_LIBRARY_FAILURE? First, we'll lose the error code set by CERT_CreateSubjectCertList. Second, SEC_ERROR_LIBRARY_FAILURE means an NSS internal error (bug), but CERT_CreateSubjectCertList can fail for other reasons (e.g., out of memory).
Attachment #329171 - Attachment description: fix certutil crash, v1 → fix certutil crash, v1 (wtc disagrees with the SEC_ERROR_LIBRARY_FAILUREerror code)
Attachment #329171 - Flags: superreview?(wtc) → superreview+
> Why do you change the error code to SEC_ERROR_LIBRARY_FAILURE? It is not, in general, an error for CERT_CreateSubjectCertList to make no list. If no certs meet the selection criteria, then making no list is appropriate. We are not guaranteed that any error code will be set, in general, but in this case, it is the meaningless SEC_ERROR_IO. In this code path in certutil, the function is only called when the program is already in possession of a CERTCertificate for a cert that qualifies. There's no way that an out-of-memory condition should develop in certutil, since it is so short lived. The fact that the program already possesses a CERTCertificate for a cert that meets the criteria means that if CERT_CreateSubjectCertList returns no list, a software error has occurred. The program has an existence proof that a certificate meets the criteria and is known to NSS. If NSS fails to include that cert in the returned list of certs with that subject, an NSS library has failed to do its job. Whatever error code is set by the library at that point is irrelevant. It is actually SEC_ERROR_IO, which is utterly meaningless. (NSS really should do better at translating PKCS#11 errors into meaningful error codes.) So, rather than report an error code that will send users and developers off in pursuit of undomesticated water fowl, I chose to put the blame squarely where it belongs. Do you disagree with that?
Comment on attachment 329184 [details] [diff] [review] Don't abort search for certs when a slot reports error, v1 r=wtc. I would not align the '=' signs in the initializations of 'count' and 'errors' because that's not the prevalent style in this file. Since 'count' is now used in a larger scope, it would be nice to give it a more meaningful name such as 'numCerts' or 'collectionCount'.
Attachment #329184 - Flags: superreview?(wtc) → superreview+
(In reply to comment #21) It's not obvious at all. You should add a short comment to explain the SEC_ERROR_LIBRARY_FAILURE error code.
Comment on attachment 329190 [details] [diff] [review] use PK11_IsDisabled to detect and avoid disabled slots, v2 r=wtc. 1. mozilla/security/nss/lib/dev/devslot.c >+ return (PRBool)!PK11_IsDisabled(slot->pk11slot); The (PRBool) typecast is not necessary because PK11_IsDisabled returns PRBool. 2. mozilla/security/nss/lib/pk11wrap/dev3hack.c >+ /* Don't create a token object for a disabled slot */ >+ if (nss3slot->disabled) { >+ return NULL; >+ } Would be nice to set an error code. 3. mozilla/security/nss/lib/pki/pki3hack.c > token = nssToken_CreateFromPK11SlotInfo(td, slot); > PK11Slot_SetNSSToken(slot, token); >- NSSRWLock_LockWrite(td->tokensLock); >- nssList_Add(td->tokenList, token); >- NSSRWLock_UnlockWrite(td->tokensLock); >+ /* Don't add non-existent token to TD's token list */ >+ if (token) { >+ NSSRWLock_LockWrite(td->tokensLock); >+ nssList_Add(td->tokenList, token); >+ NSSRWLock_UnlockWrite(td->tokensLock); >+ } Should we move "PK11Slot_SetNSSToken(slot, token);" into the if block, too? 4. mozilla/security/nss/lib/pki/trustdomain.c Should nssTrustDomain_GetActiveSlots free 'slots' and return NULL if 'count' is 0 at the end of the function?
Attachment #329190 - Flags: superreview?(wtc) → superreview+
Comment on attachment 329623 [details] [diff] [review] lowest risk subset patch to disallow invalid handles (checked in on trunk and branch, with requested changes) r=wtc. >+ /* PORT_Assert(session->handle != CK_INVALID_SESSION); */ You should remove this comment.
Attachment #329623 - Flags: superreview?(wtc) → superreview+
(In reply to comment #24) Wan-Teh wrote: > 1. mozilla/security/nss/lib/dev/devslot.c > > >+ return (PRBool)!PK11_IsDisabled(slot->pk11slot); > > The (PRBool) typecast is not necessary because PK11_IsDisabled > returns PRBool. Yes, PK11_IsDisabled returns PRBool, but the returned value is not the result of PK11_IsDisabled, but rather is the c language "not" of that, e.g. !PK11_IsDisabled. Is that value a PRBool? If c compilers all agree that a boolean "not" result is always a PRBool, then I agree, the cast is unnecessary. > 2. mozilla/security/nss/lib/pk11wrap/dev3hack.c > > >+ /* Don't create a token object for a disabled slot */ > >+ if (nss3slot->disabled) { > >+ return NULL; > >+ } > > Would be nice to set an error code. Yeah, I agree. Care to nominate one? I was too lazy to create one like SEC_ERROR_DISABLED_SLOT but maybe there already is one that's close enough? > 3. mozilla/security/nss/lib/pki/pki3hack.c > Should we move "PK11Slot_SetNSSToken(slot, token);" into > the if block, too? I wanted the slot's token pointer value to be updated, even if it is being updated to NULL. > 4. mozilla/security/nss/lib/pki/trustdomain.c > > Should nssTrustDomain_GetActiveSlots free 'slots' > and return NULL if 'count' is 0 at the end of the function? Good question. I need to look into that. Thanks for the comments.
you can't convert to prbool by casting, you have to use !! or ! depending. and yes, PRBool is understood to be compatible w/ !! and !.
PRBool is a synonym of 'int'. It is not an enum or the built-in bool type of the C++ language. ! applied to a PRBool value is 0 or 1, which is also a PRBool value. So the (PRBool) cast is not needed.
Re: error code for the if (nss3slot->disabled) condition in dev3hack.c The first issue is whether we should use SEC_ error or NSS_ (Stan) error here. If NSS_ error (http://lxr.mozilla.org/security/source/security/nss/lib/base/errorval.c), I think NSS_ERROR_INVALID_ARGUMENT, NSS_ERROR_DEVICE_ERROR, or NSS_ERROR_PKCS11 can be used. If SEC_ error, I think SEC_ERROR_INVALID_ARGS or SEC_ERROR_NO_TOKEN can be used. Setting an approximate or overly generic error code is still better than using whatever error code is left over from the previous failed operation.
In reply to comment 28, thanks. I thought that PRBool was an enum. I think I was confusing it with PRStatus.
Comment on attachment 329184 [details] [diff] [review] Don't abort search for certs when a slot reports error, v1 seeing second review
Attachment #329184 - Flags: review?(julien.pierre.boogz)
Comment on attachment 329623 [details] [diff] [review] lowest risk subset patch to disallow invalid handles (checked in on trunk and branch, with requested changes) Seeking second review for branch
Attachment #329623 - Flags: review?(julien.pierre.boogz)
Comment on attachment 329623 [details] [diff] [review] lowest risk subset patch to disallow invalid handles (checked in on trunk and branch, with requested changes) This is fine. I have the same feedback as Wan-Teh here - please delete the commented assertion that this patch adds before check-in. + /* PORT_Assert(session->handle != CK_INVALID_SESSION); */
Attachment #329623 - Flags: review?(julien.pierre.boogz) → review+
Comment on attachment 329184 [details] [diff] [review] Don't abort search for certs when a slot reports error, v1 This looks OK, but there are a couple of issues I am wondering about in the code. They are not regressions in the patch. 1) nssToken_FindCertificatesBySubject returns both an array of instances and a status. But the code only checks the status. It's potentially possible for an array to be returned and leaked. I would suggest to add an assertion that instances is really NULL when we continue in the loop because of status not being PR_SUCCESS. 2) The whole counting business is a bit confusing. If everything goes right, then count should at most be equal to maximumOpt, but never greater. Yet we allow that case and ignore it. Again, I would suggest an assertion that ensures that count is <= maximumOpt . Or we could also perform the count on the number of instances, before we add them to the collection, but that is more code.
Attachment #329184 - Flags: review?(julien.pierre.boogz) → review+
Attachment #329623 - Flags: review?(alexei.volkov.bugs) → review+
Comment on attachment 329171 [details] [diff] [review] fix certutil crash, v1 (checked in on trunk and branch) I added a comment explaining the library failure error code.
Attachment #329171 - Attachment description: fix certutil crash, v1 (wtc disagrees with the SEC_ERROR_LIBRARY_FAILUREerror code) → fix certutil crash, v1 (checked in on trunk)
Comment on attachment 329623 [details] [diff] [review] lowest risk subset patch to disallow invalid handles (checked in on trunk and branch, with requested changes) I replaced the commented-out assertion, but I intend to put it back in, (not commented out) when we get the rest of this mess of disabled slots working better. Checking in dev/devtoken.c; new revision: 1.49; previous revision: 1.48
Attachment #329623 - Attachment description: lowest risk subset patch to disallow invalid handles → lowest risk subset patch to disallow invalid handles (checked in on trunk)
Comment on attachment 329171 [details] [diff] [review] fix certutil crash, v1 (checked in on trunk and branch) Checking in certutil.c; new revision: 1.139; previous revision: 1.138
I was about to commit v1 of this patch on the trunk when I realized that there are two nearly identical functions in this source file, and the existing patch only fixes one of them. This patch fixes both. It also addresses Wan-Teh's request for a better variable name. This patch is for the trunk, while the previous patch was for the branch. I can produce a branch flavor of this patch, if desired.
Attachment #329184 - Attachment is obsolete: true
Attachment #330715 - Flags: superreview?(wtc)
Attachment #330715 - Flags: review?(julien.pierre.boogz)
Attachment #329184 - Flags: review?(alexei.volkov.bugs)
This is the same as the previous patch, but is for the branch. This patch should compare to the previous (v1) version of this patch.
Comment on attachment 330715 [details] [diff] [review] Don't abort search for certs when a slot reports error, v2 (checked in with requested fix on trunk and branch) This is OK, but I have the same remarks about nssTrustDomain_FindCertificatesBySubject as I did for the previous patch.
Attachment #330715 - Flags: review?(julien.pierre.boogz) → review+
Comment on attachment 330715 [details] [diff] [review] Don't abort search for certs when a slot reports error, v2 (checked in with requested fix on trunk and branch) r=wtc. It seems that we can define a common function that takes a pointer to nssToken_FindCertificatesByNickname or nssToken_FindCertificatesBySubject as an argument.
Attachment #330715 - Flags: superreview?(wtc) → superreview+
Nelson, (In reply to comment #34) Julien wrote: > (From update of attachment 329184 [details] [diff] [review]) > > The whole counting business is a bit confusing. If everything goes right, then > count should at most be equal to maximumOpt, but never greater. Yet we allow > that case and ignore it. The original code does assume that count <= maximumOpt. This is why the test in the code below is == 0, not <= 0. - numRemaining = maximumOpt - count; - if (numRemaining == 0) break; To match that, in the new code we should use == or >= in the following new code. + if (collectionCount > maximumOpt) + break; Otherwise, we may continue the loop with numRemaining == 0, and 0 has a special meaning.
Comment on attachment 330715 [details] [diff] [review] Don't abort search for certs when a slot reports error, v2 (checked in with requested fix on trunk and branch) I withdrew my r= and wait for Nelson's reply to my comment 42.
Attachment #330715 - Flags: superreview+ → superreview?(wtc)
I agree the new test should be + if (collectionCount >= maximumOpt) + break;
Both functions nssTrustDomain_FindCertificatesByNickname and nssTrustDomain_FindCertificatesBySubject start by getting a list of "active slots" to search for certs matching the criteria. They get that list by calling nssTrustDomain_GetActiveSlots. Sadly, the returned list of slots may contain slots that have been "disabled" for various reasons, and/or have no tokens present. NSS would have been unable to establish any session with such slots, and would have only an invalid session handle for them. Yet despite these obvious disqualifications, these slots have been placed into the "active slots" list. Consequently, when NSS attempts to search those slots, it gets errors about using an invalid session handle. Ideally, I'd really like to keep all slots for which no valid session handle exists out of the active slots list completely. This would be very easy to do, and the existing patches do it, except for the fact that the state of being "disabled" is NOT always permanent (for the lifetime of a process). Slots with removable tokens are also "disabled" while the tokens are absent. We want to be sure that such slots ARE treated as "active slots" when a token is (re)inserted. The concern is that, if we remove all slots that are disabled from the active token list, perhaps slots with removable tokens will not be utilized once they are reinserted. This is part of the price we pay for lack of design documentation. It's not clear how slots with removable tokens are supposed to work. The choices seem to be: a) keep disabled slots out of the active slot list, but update that list when a token is inserted into a slot, or b) keep disabled slots in the active slot list, but avoid actually trying to use them when walking down the slot list, if their tokens are removable and absent.
Comment on attachment 330715 [details] [diff] [review] Don't abort search for certs when a slot reports error, v2 (checked in with requested fix on trunk and branch) r=wtc. Please remember to change the new test in nssTrustDomain_FindCertificatesByNickname to if (collectionCount >= maximumOpt). The new test in nssTrustDomain_FindCertificatesBySubject is correct.
Attachment #330715 - Attachment description: Don't abort search for certs when a slot reports error, v2 (for trunk) → Don't abort search for certs when a slot reports error, v2 (for trunk) (wtc: use collectionCount >= maximumOpt in one of the new tests)
Attachment #330715 - Flags: superreview?(wtc) → superreview+
Comment on attachment 330715 [details] [diff] [review] Don't abort search for certs when a slot reports error, v2 (checked in with requested fix on trunk and branch) Checking in lib/pki/trustdomain.c; new revision: 1.58; previous revision: 1.57
Attachment #330715 - Attachment description: Don't abort search for certs when a slot reports error, v2 (for trunk) (wtc: use collectionCount >= maximumOpt in one of the new tests) → Don't abort search for certs when a slot reports error, v2 (for trunk) (checked in with requested fix)
Comment on attachment 329622 [details] [diff] [review] don't allow (session->handle == CK_INVALID_SESSION), v4 This patch has the effect that the slot->nssToken object is not created during initialization if the removable token is not present. That would be OK, in my opinion, if it got created at a later time, such as when the token is inserted into the slot. But that doesn't happen. STAN_InitTokenForSlotInfo only gets calls during initialization, not when a token gets plugged in. So, with this patch, we see strange crashes when we attempt to authenticate to a token that has been plugged in after initialization. NSS gets quite far, and even successfully authenticates to the underlying token, before NSS discovers that the nssToken object is missing (NULL). There seems to be a fundamental choice to make here. Either a) the slot->nssToken object always gets created during initialization, even when there is no token in the slot, and we deal with all the problems that can occur when the session pointers in that token object are NULL, or b) we allow slot->nssToken to be null if no token is present during initialization (as this patch does now), and we change NSS to create the nssToken object when the token is inserted. I am presently working on a patch to do the former. I find that I must test all patches alternately on two systems, 1) a system with removable tokens, and 2) a system with a slot that gets marked disabled during startup. Unfortunately for me, the two systems I have for those purposes have different types of CPUs, different OSes, and one is only accessible to me through a VPN that only works sometimes.
Attachment #329622 - Attachment is obsolete: true
Attachment #329622 - Flags: review?(rrelyea)
Comment on attachment 329190 [details] [diff] [review] use PK11_IsDisabled to detect and avoid disabled slots, v2 The separation of this patch from the patch to detect and disallow session objects with invalid session handles was and is artificial. It seems like these should be separable, but so far, it has been necessary to always update both at the same time. So, going forward, I will just combine the two into a bigger patch.
Attachment #329190 - Attachment is obsolete: true
Attachment #329190 - Flags: review?(rrelyea)
This patch (for the trunk) works on both my Windows system with removable tokens, and on a S10 system with the module that reports absent non-removable tokens. I've restored the assertion that detects an attempt to call find_objects with an invalid (zero) session handle, and the assertion is not hit on either system (a big improvement over the existing code). This patch is one of two alternatives I'm contemplating. The other would not create a token object when no token is present during initialization, but would create that object later, if/when it was inserted. I think that's cleaner, but it might be a much bigger change. I found a sequence of operations that caused a crash with the previous version of this patch, but does not cause a crash with this patch. I will record it here for posterity. It is: 1) start the browser with the slot empty 2) insert the token 3) start a login to the token (using the device manager) 4) enter a wrong password, then 5) on the second try, cancel. You will get a dialog saying login failed. 6) start another login using the device manager 7) enter the correct password Boom. I don't know what was special about that sequence of operations, but without the failed attempts, entering the correct password worked as expected.
Comment on attachment 331456 [details] [diff] [review] create token object, even when there is no session, for trunk, v5 In this patch, I made the changes previously suggested by Wan-Teh. I set an error code, using PORT_SetError, when the code detects an attempt to use a disabled slot, and I discard an empty slot list in nssTrustDomain_GetActiveSlots, rather than returning an empty list. That code path should probably set an error code, too. I chose to use PORT_SetError rather than an NSS error code because I'm not sure that any code actually pays attention to NSS error codes anywhere. I wanted to be sure that the error code would actually be noticed by callers of NSS, if they call PR_GetError. If setting an NSS error code also assures that outcome, I should consider that, but I don't believe it does.
Attachment #331456 - Flags: superreview?(rrelyea)
Attachment #331456 - Flags: review?(wtc)
This patch is the same as the previous one, but it ignores whitespace. It may be easier to review this copy than the the previous one. (It's a shame that bugzilla's diff viewing tool can't do this with regular patches.)
Comment on attachment 329171 [details] [diff] [review] fix certutil crash, v1 (checked in on trunk and branch) cmd/certutil/certutil.c; new revision: 1.97.2.14; previous revision: 1.97.2.13
Attachment #329171 - Attachment description: fix certutil crash, v1 (checked in on trunk) → fix certutil crash, v1 (checked in on trunk and branch)
Comment on attachment 329623 [details] [diff] [review] lowest risk subset patch to disallow invalid handles (checked in on trunk and branch, with requested changes) lib/dev/devtoken.c; new revision: 1.39.28.2; previous revision: 1.39.28.1
Attachment #329623 - Attachment description: lowest risk subset patch to disallow invalid handles (checked in on trunk) → lowest risk subset patch to disallow invalid handles (checked in on trunk and branch, with requested changes)
Blocks: 374247
Attachment #330716 - Attachment is obsolete: true
Comment on attachment 330715 [details] [diff] [review] Don't abort search for certs when a slot reports error, v2 (checked in with requested fix on trunk and branch) lib/pki/trustdomain.c; new revision: 1.51.28.6; previous revision: 1.51.28.5
Attachment #330715 - Attachment description: Don't abort search for certs when a slot reports error, v2 (for trunk) (checked in with requested fix) → Don't abort search for certs when a slot reports error, v2 (checked in with requested fix on trunk and branch)
Attachment #331456 - Flags: review?(julien.pierre.boogz)
Comment on attachment 331522 [details] [diff] [review] create token object, even when there is no session, for trunk, v5 (IGNORING WHITESPACE) 1) There is a problem in the following code sequence : nssSession_ExitMonitor(session); /* token notr emoved, finished */ if (session->handle != CK_INVALID_SESSION) return PR_TRUE; The issue is that session->handle is being read after the monitor on the session has been released. The session handle may therefore have changed due to another thread operating on the default session. The handle check needs to happen before the monitor is released. This is a problem in existing code, but I feel that it needs to be fixed in this patch, since it rewrites some of the code that contained this bug. 2) in find_objects, doesn't somebody need to acquire the session monitor also, in order to check session ? There is an existing if check for session->handle, which is done outside the monitor ownership. That seems wrong, because again, session->handle could change when another thread acquires the monitor. And therefore, the new assertion would also be in the wrong place.
Attachment #331522 - Flags: review-
Note that I reviewed the whole patch and the rest looks fine to me.
Comment on attachment 331456 [details] [diff] [review] create token object, even when there is no session, for trunk, v5 See my comments on the same patch "IGNORING WHITESPACE".
Attachment #331456 - Flags: review?(julien.pierre.boogz) → review-
Julien, I believe the session monitor protects the opaque objects inside the PKCS#11 module that are identified by the session handle, objects such as the contexts of multi-part hash or encryption operations that may be undertaken on a session (using a session handle) in the module. I perceive that, as used in NSS, the session monitor *does not protect* the contents of an nssSession object, but rather protects the contents of the underlying PKCS#11 module's data storage associated with the session handle held in that object. I think you will find that the monitor is typically held only around the calls to one (or a few) PKCS#11 API functions that use a session handle as an argument, and is not held generally around all code that uses an nssSession object. If the monitor was intended to protect the nssSession object itself, one might imagine that functions such as nssToken_GetDefaultSession, that return pointers to these objects, would acquire those monitors. But they do not. I don't mind changing nssSlot_IsTokenPresent in this patch, as you suggested, so that the decision of whether or not the handle is invalid is made while holding the session monitor, and will do so. But I think that the questions you're raising are, essentially: - what all is protected by the monitor in the nssSession object? and - can session->handle be referenced without holding that monitor? and implicitly - are nssSession objects adequately protected by the current code? If the answer to this last question is negative, then I think the scope and magnitude of the necessary changes are MUCH greater than the scope and magnitude of this patch, and I would say that this patch should not be held up while those larger changes are contemplated.
Nelson, Re: comment 59, I am not sure that the purpose you state forthe session monitor (which may be correct) is the only one. If this monitor does not protect the content of an nssSession object, then what does ? If you look at code from nssSlot_IsTokenPresent in the diff for the patch, you will see the following existing assignment : session->handle = CK_INVALID_SESSION; This assignment is done while the monitor is held. Is that accidental or intentional ? In other places, for example in find_objects, there is code that reads session->handle . It tries to check that it's valid before it goes to go a PKCS#11 call with that handle. But it doesn't check after acquiring the monitor if that session handle is still valid. So, what I see there is a race - the session handle can be invalidated between the time it has been checked and the time it's being used to make a call. Is there some other higher-level global lock I am missing that prevents this race from actually happening ? So far, I am not seeing it. I think protecting the content of the object was one of the purposes of the nssMonitor, but it seems to have been applied only to writes and not to reads.
> If this monitor does not protect the content of an nssSession object, then > what does ? Good question. Maybe nothing. I think that if you survey all the code that uses that object, you will find that most of it uses nothing. If that is the case, then that may need to be fixed, but it is beyond the scope of this bug. > This assignment is done while the monitor is held. Is that accidental or > intentional ? My guess is that it is essentially accidental. I don't think it indicates that the monitor is used for that purpose in a systematic way throughout NSS. But I'd be happy to be wrong. I'm going way out on a limb here, and *guessing* that the objects that represent modules, slots, PK11Slots, and nssSessions simply aren't systematically locked when used. I suspect the rationale is something like this: Most of these things are entirely static, setup once during init, and then effectively read-only after that. No race occurs while they are being initialized, and there is no need to lock them while they are in that read-only state. Of course, removable tokens aren't quite so static. :) Take a look at the functions that call nssTrustDomain_GetActiveSlots, such as nssTrustDomain_FindCertificatesByNickname, nssTrustDomain_FindCertificatesBySubject, (which my patch improves), and also nssTrustDomain_FindCertificateByIssuerAndSerialNumber NSSTrustDomain_TraverseCertificates and NSSTrustDomain_TraverseUserCertificates nssTrustDomain_FindTrustForCertificate nssTrustDomain_FindCRLsBySubject (which all need to be fixed in the same way as nssTrustDomain_FindCertificatesBySubject was, I think :-/ ) and the functions they call, such as nssTrustDomain_GetActiveSlots and nssTrustDomain_GetSessionForToken As those functions get and traverse lists of slots and tokens, calling nssTrustDomain_GetSessionForToken for each token, none of them appear to do any locking of any of those objects. I'm not saying that's correct or good, but I'm saying that if we're going to change it, we must do that in a systematic way that affects all the relevants parts of NSS at once. Until such time as we decide to do that, I think we should not attempt to impose a new set of locking requirements on patches that fix bugs in the existing code.
This is the same as v5, except for a tiny difference in PK11_IsTokenPresent suggested by Julien.
Attachment #331456 - Attachment is obsolete: true
Attachment #331522 - Attachment is obsolete: true
Attachment #331699 - Flags: superreview?(rrelyea)
Attachment #331699 - Flags: review?(wtc)
Attachment #331456 - Flags: superreview?(rrelyea)
Attachment #331456 - Flags: review?(wtc)
OK, Just some answers to the questions in the comments.... RE pk11pub.h in lib/dev and lib/pki --- It's not a layer violation. PK11_wrap is logically below the stan code. They only issue is a pure stan implementation should have a disable at the stan level. Your idea of creating a stan "is it disabled" is probably most correct, though there isn't anything wrong with what you actually did. RE temporary disable. The ony way to temporarily disable a slot is with the user functions PK11_UserDisableSlot() and PK11_UserEnableSlot(). I don't know of any applications that currently use them, but for correctness anything we do to the disabled slot in the normal init path, we should also do with PK11_UserDisableSlot() and we should undo with PK11_UserEnableSlot(). Looking at the code, there appears to be a bug in PK11_UserEnableSlot(). It will enable a slot even if it was disabled 'automatically'. It should probably only renable slots that were disabled by the user. NOTE:I have not yet looked at the patches, but disables slots should still show up in queries for all the slots in the system. In short, mozilla should still show the slots in it's device UI. RE: locking of session handles. -- the global session handle is purposefully not protected by a lock. It's considered an 'invariant' in the code even though it can change. That change is always triggered by a token removal. In all those cases all the existing operations on that token are expected to fail or not more or less randomly. In general you cannot lock the validity of a session handle because you can't prevent the user from invalidating it by pulling the token. I'll go look at the patches themselves now.... bob
re my comments on session handles. Those comments apply to slot->sessionHandles, which are integers. We may in fact have to look more closely at the stan session objects, if we don't at least hold a reference to them... bob
Bob, Re: comment 63 and comment 64, About session handles - are you saying that it doesn't matter if the value of the value of the session handle is changed randomly by another thread in the middle of the execution of those functions ? If so, then we should remove all the code that checks that the handle is not CK_INVALID_HOME, as well as the assertions. I can see how we can never ensure that the session handle is valid - the session of course becomes invalid the moment the token is unplugged - but that's different from the handle to the session being changed.
I meant CK_INVALID_SESSION . sigh.
Comment on attachment 331699 [details] [diff] [review] create token object, even when there is no session, for trunk, v6 (checked in) r+ With the following observations... I moving the session handle check inside the monitor isn't necessary, but doesn't hurt (so it's ok to keep it). I'm more worred about the lifetime of the session object itself. It doesn't appear to be ref-counted as I would have expected. in line 1207 the added crv = appears to be redundant (that is it is set but never checked). I believe this semantic is correct (It doesn't seem right to necessarily fail a token just because it's random number generator returned an error). I believe this code will function correctly WRT PK11_UserDisableSlot()/PK11_UserEnableSlot() since the slot left out of the trust domain on init. A user disabled slot will remain on the trust domain, but will disappear for any Active list. It would show up again when it's turned on. I see some things that should probably be looked at namely: 1) life cycle of session objects. 2) Only renabling User Disabled tokens. both of which are pre existing conditions so should not hold up this patch. bob
Attachment #331699 - Flags: superreview?(rrelyea) → superreview+
Comment on attachment 331699 [details] [diff] [review] create token object, even when there is no session, for trunk, v6 (checked in) Thanks, Bob, I expect (and hope) there will be some more discussion about all this. You're right that the addition of "crv =" at line 1212 in PK11_InitToken didn't actually change the essential behavior of the function any. It just allowed me to see what value was being returned by PKCS#11 modules for which the source was not handy (so, stepping into the function wasn't possible), while debugging, and set breakpoints conditioned on those return values. I suspected that this function was returning errors, and sure enough. >- PK11_GETTAB(slot)->C_SeedRandom(slot->session, >+ crv = PK11_GETTAB(slot)->C_SeedRandom(slot->session, > random_bytes, sizeof(random_bytes)); One more question. I wanted to put the assertion that the session handle was not zero into find_objects. I really want us to stop asking modules to use invalid handles. I figure that if NSS is working correctly, we'll never this this assertion, and if it's not, we'll find out and fix it. But there is some concern that perhaps NSS has more ways of using invalid session handles than we know about, and this assertion might be perceived as a reliability regression for FF3, especially in developers' builds. So, what do you think of the advisability of that assertion?
Bob, even though you've SR'ed my latest patch for this bug, I'd still like to explore alternatives with you. One of the things that bothers me about this patch is that, with it, there is ALWAYS an nssToken object allocated and present, even when there is no token in the slot. The object sits there, with many flags and variables that purport to describe the state of an object that does not exist. This seems like a recipe for lots of bugs (like the ones this bug report is trying to fix). I would prefer that there be NO nssToken object when the slot is empty. But I tried to make that happen, and found that there were lots of problems with it. In particular, there seems to be no code path to create an nssToken except during module initialization. I had a patch that deleted the nssToken when the token was removed. In testing it, I was surprised to find that NSS was able to go so far as to actually login to the token, even though there was no nssToken object present. This suggests to me that some (perhaps much) of the information that is needed to interact with a token is not stored in the right object. It is possible to login to a token in the absence of an nssToken object because the data needed to do so is not stored in the nssToken object, but rather is somewhere else. Do you agree, in principle, with the idea that nssToken objects should not exist when the slot is empty, and with the idea that it ought not to be possible to interact meaningfully with a token without having an nssToken object?
re NULL session handle assert. I'm of two minds... one, we shouldn't be calling find will a null session handle, but 2 It's good to call the module once in a while with a NULL session handle. PKCS #11 modules are supposed to handle that correctly (e.i. return invalid session), and we do use invalid session handle internally to signal error states. I can't think of a case, however, when we should call find with such a handle, so the assert is probably good. RE nssToken object. Well what you describe is exactly what was supposed to happen. Token objects were only suppose to exist if the token was present, and was supposed to be deleted when the token was removed. Ian evidently didn't quite implement it that way, or at least he didn't in the non-pure Stan case. nssToken objects are stan objects, so it's not surprising you make it quite a ways before you notice the lack. All the old wrapper code still uses the PK11SlotInfo object as the basic knowledge of the slot/token. In Stan that task is split between the nssSlot and nssToken objects. Part of the problem may be in the 'mixed' code Ian slaved the nssToken object to the PK11SlotInfo rather than the nssSlot. Anyway, I agree with your principle. bob
Bob, I agree the assert is good, but only as long as we assert on the handle that will actually be used for the PKCS#11 call that follows. Because nssSession isn't locked, that's not necessarily the case today, as session->handle can change during the test or assert, and the actual PKCS#11 call. The problem can be solved in several ways : 1) short-term simple fix : read session->handle only once, to a local variable, and then test/assert on that local variable, and then use that local variable for the handle passed to the PKCS#11 call 2) longer-term : properly lock nssSession so that handle is properly protected . This is a pre-existing problem that shouldn't hold up the patch for this bug. There should probably be a separate bug, at this point.
Comment on attachment 331699 [details] [diff] [review] create token object, even when there is no session, for trunk, v6 (checked in) Seeking second review for branch.
Attachment #331699 - Flags: review?(julien.pierre.boogz)
Attachment #331699 - Flags: review?(julien.pierre.boogz) → review+
re locking the session handle. (long term) So we shouldn't have to lock the session handle, but we should make the session handle object reference counted. the session handle within the session handle object should be invariant. bob
On trunk lib/dev/devslot.c; new revision: 1.24; previous revision: 1.23 lib/dev/devtoken.c; new revision: 1.50; previous revision: 1.49 lib/pk11wrap/dev3hack.c; new revision: 1.24; previous revision: 1.23 lib/pk11wrap/pk11slot.c; new revision: 1.94; previous revision: 1.93 lib/pki/pki3hack.c; new revision: 1.96; previous revision: 1.95 lib/pki/trustdomain.c; new revision: 1.59; previous revision: 1.58
Attachment #331699 - Flags: review?(wtc)
On Branch: dev/devslot.c; new revision: 1.22.2.2; previous revision: 1.22.2.1 dev/devtoken.c; new revision: 1.39.28.3; previous revision: 1.39.28.2 pk11wrap/dev3hack.c; new revision: 1.21.28.2; previous revision: 1.21.28.1 pk11wrap/pk11slot.c; new revision: 1.87.2.3; previous revision: 1.87.2.2 pki/pki3hack.c; new revision: 1.86.28.7; previous revision: 1.86.28.6 pki/trustdomain.c; new revision: 1.51.28.7; previous revision: 1.51.28.6 Watching tinderbox now.
In comment 61, I wrote that there was a list of functions that call nssTrustDomain_GetActiveSlots and all need to be fixed. Some of them are now fixed, and some are not. > nssTrustDomain_FindCertificatesByNickname, (now fixed) > nssTrustDomain_FindCertificatesBySubject, (now fixed) > nssTrustDomain_FindCertificateByIssuerAndSerialNumber (still buggy) > NSSTrustDomain_TraverseCertificates (still buggy) > NSSTrustDomain_TraverseUserCertificates (still buggy) > nssTrustDomain_FindTrustForCertificate (still buggy) > nssTrustDomain_FindCRLsBySubject (still buggy) I wonder if it makes sense to consider this bug fixed while some functions remain that continue to abort their loops at the first bad slot they encounter. I rather doubt it. :(
I am trying to determine how to test the patches to each of the following functions: > nssTrustDomain_FindCertificateByIssuerAndSerialNumber - called when verifying a signature with the old PKCS7 library sec_pkcs7_verify_signature -> CERT_FindCertByIssuerAndSN - MAY be called to verify CMS signature in libSMIME NSS_CMSSignerInfo_GetSigningCertificate -> CERT_FindCertByIssuerAndSN > NSSTrustDomain_TraverseCertificates called from CERT_GetCertNicknames called from NSS_GetClientAuthData called from CERT_FindUserCertsByUsage called from CERT_MatchUserCert (dead) called from various PSM functions called from PK11_TraverseSlotCerts called from CERT_GetSSLCACerts used in signtool commands to list certs called from PK11_ListCerts used by certutil -L (among many others) > NSSTrustDomain_TraverseUserCertificates No need to test. It's dead code, being removed. > nssTrustDomain_FindTrustForCertificate called from nssTrust_GetCERTCertTrustForCert called from fill_CERTCertificateFields called from stan_GetCERTCertificate called from STAN_ChangeCertTrust called from CERT_ChangeCertTrust used in certutil -A, -M called from __CERT_AddTempCertToPerm > nssTrustDomain_FindCRLsBySubject called from PK11_FindCrlByName called from SEC_FindCrlByKeyOnSlot called from crl_storeCRL called from PK11_ImportCRL called from SEC_NewCrl called from CERT_ImportCRL called from SECU_StoreCRL used in crlutil called from crlutil's ImportCRL
Bob Relyea, please read this command and add your thoughts. Bug 444974 was experiencing the same problems described in comment 0 above. Some problem with a PKCS#11 module was causing the loops in trustdomain.c to terminate prematurely, and this occurred only when the smart card was plugged into the reader for the third-party PKCS#11 module. Then the reporter tried FF 3.0.2 build 6, which incorporates NSS 3.12.1.1, and his problem deteriorated from a mere "unknown issuer" error to a crash. The problem is that there are many functions in nss/lib/dev/devtoken.c that assume that the "defaultSession" field of the NSSToken structure will never be NULL. We see this line of code in 9 places in that file: nssSession *session = (sessionOpt) ? sessionOpt : token->defaultSession; Thereafter, the code blindly assumes that session cannot be NULL. We could change those functions to test session and try to do something else if it is NULL. But I wonder: why are these functions even getting called in that case? When we have no default session for a token, this pretty much means the token is useless to us. So it seems to me we should detect that at a high level and stop trying to use that token while that condition is trie. token->defaultSession is assigned in just two places, both in nss/lib/pk11wrap/dev3hack.c. One in nssToken_CreateFromPK11SlotInfo and one in nssToken_Refresh. Both places assign to it the value returned from a call to nssSession_ImportNSS3Session. That function takes a CK_SESSION_HANDLE argument, and returns a NULL value if that handle is 0, or if the malloc fails. The most likely explanation for that function to return NULL is that the CK_SESSION_HANDLE passed to it was the magic value CK_INVALID_SESSION, which is 0. This begs the question, why are the callers of nssSession_ImportNSS3Session passing it CK_INVALID_SESSION? Why are they even trying to do what they're doing with an unusable token? Should those callers do something else instead? and if so, what? The BIG question I have is: - Should we allow an NSSToken to have a null defaultSession pointer, and require that every function that tries to use it test it for NULL? *OR* - Should we disallow NSSTokens to have null defaultSession pointers, and refuse to create them, or destroy them, when that value is NULL? Disallowing NSSTokens with null defaultSessions would avoid getting into the functions that would try to use the null defaultSession in the first place. Those functions will generally not be called with a NULL NSSToken pointer. However, if we destroy the NSSToken structure when the default session goes to NULL, then we have to re-create it when a token is plugged in, something that NSS does not do now. But I think that is the right solution. Bob, I'd like your help in figuring out how to solve this problem.
Attachment #331699 - Attachment description: create token object, even when there is no session, for trunk, v6 → create token object, even when there is no session, for trunk, v6 (checked in)
This link shows all the references to token->defaultSession. All the places that assume that it is non-NULL are in the list from this search. http://mxr.mozilla.org/security/search?string=defaultSession&case=on&find=nss%2Flib&findi=&filter=%5E%5B%5E%5C0%5D*%24&hitlimit=&tree=security No token object should exist with a NULL defaultSession. IMO, the right fix is to ensure that token objects do not exist with NULL defaultSessions. That fix will, in general, avoid any of these functions from being called.
The code that should first detect that a token has no session includes the function nssToken_CreateFromPK11SlotInfo, and all the callers of nssToken_Refresh, which are listed at http://mxr.mozilla.org/security/ident?i=nssToken_Refresh They should destroy the token object if no session can be created for it. I have previously developed a patch to do just that. The unresolved issue with that patch was that it necessitates a new call to nssToken_CreateFromPK11SlotInfo that would occur when a token is (re)inserted into the slot. The existing code assumes that the token object is always present.
re comment 79: those are good questions. In the idealized stan design the token object itself wouldn't exist. The hybrid is definitively causing problems. I think the ideal fix would be to make the PK11SlotInfo * object point to the nssslot which would then point the the nsstoken. Unfortunately doing that surgery would require a pretty extensive patch with some careful review (particularly along the lines of object lifecycle). Barring that extensive of a change, I'm wondering how we are getting into this code if the token isn't even present. If the token is present, we should have a valid default session. If we don't have a valid default session, we should treat the token as 'not present'. So the question is 1) how are we getting into these functions we effectively non-present tokens, and 2) if the token is present why is the session null. (NOTE: disable should imply not present). bob
Comment on attachment 334034 [details] [diff] [review] patch for remaining functions in trustdomain.c (checked in) Bob, please review. I'd like to get the patches for bug 200704, bug 444974 and this bug all into NSS 3.12.next
Attachment #334034 - Flags: review?(rrelyea)
Blocks: 293319
Attachment #334034 - Flags: review?(rrelyea) → review+
Comment on attachment 334034 [details] [diff] [review] patch for remaining functions in trustdomain.c (checked in) lib/pki/trustdomain.c; new revision: 1.60; previous revision: 1.59
Attachment #334034 - Attachment description: patch for remaining functions in trustdomain.c → patch for remaining functions in trustdomain.c (checked in)
I think this is all fixed on the trunk. The remaining task is to back port this work to the 3.11 branch. I will attempt to do that soon (probably at least a day away though).
Depends on: 444974
This patch for the 3.11. branch back ports changes that have already been committed on the trunk for this and related bugs. It includes: - the final patch for the trunk in this bug, for file trustdomain.c - the patch for bug 444974 (which bug was revealed when the patches for this bug were committed on the trunk) - the patch for bug 431929 - a few miscellaneous changes (all null pointer detections), all of which are on the trunk now, but not in 3.11, and - the difference between patch v5 and patch v6 of the patch for this bug which was named "create token object, even when there is no session". It seems that the version checked in on the branch was v5. This patch includes the diffs between v5 and v6, all in file dev/devslot.c. Since all these changes have already been reviewed for the trunk, I'm just going to ask for one additional review for this patch.
Attachment #344595 - Flags: review?
Attachment #344595 - Flags: review? → review?(julien.pierre.boogz)
Comment on attachment 344595 [details] [diff] [review] roll-up patch for 3.11.10 (checked in on branch) The major components of this patch are parts of other patches, from this bug and others. The include: This bug "patch for remaining functions in trustdomain.c" https://bugzilla.mozilla.org/attachment.cgi?id=334034 Bug 444974 "Stop assuming session pointers are non-NULL" https://bugzilla.mozilla.org/attachment.cgi?id=339706 This bug, the difference between v5 and v6 of the patch "create token object, even when there is no session, for trunk" https://bugzilla.mozilla.org/attachment.cgi?id=331456 (v5) https://bugzilla.mozilla.org/attachment.cgi?id=331699 (v6) This difference is the entire change for dev/devslot.c in this patch.
Comment on attachment 344595 [details] [diff] [review] roll-up patch for 3.11.10 (checked in on branch) Looking for a reviewer for this patch, which back ports fixes from trunk to branch. This is holding up NSS 3.11.10, and Julien has been given other tasks of higher priority.
Attachment #344595 - Flags: superreview?(wtc)
Attachment #344595 - Flags: review?(rrelyea)
Attachment #344595 - Flags: review?(rrelyea) → review+
Comment on attachment 344595 [details] [diff] [review] roll-up patch for 3.11.10 (checked in on branch) r+ this is quite a rollup bug. (it may have been better to review the separate component parts as there are several logical changes). Most changes introduced no new semantic, only rearraging of the code. The rest did the following: 1) added more fault tolerance to the code, detecting failures in the session. 2) in the CRL lookup, we now no longer fail if we have a failure in one token. Both changes appear to be desirable and correct. bob
Comment on attachment 344595 [details] [diff] [review] roll-up patch for 3.11.10 (checked in on branch) seeking another review for branch
Attachment #344595 - Flags: review?(alexei.volkov.bugs)
Attachment #344595 - Flags: superreview?(wtc)
Nelson, Re: attachment 344595 [details] [diff] [review], the change in devslot.c doesn't seem to correspond to any of the other patches you referenced. That change is not on the trunk also and probably should be, if it's going to the branch.
Re comment 92, yes, I mentioned that issue with devslot.c in comment 86. I had planned to check it in when when I checkin the fix for the 3.11 branch.
Nelson, Re: comment 93 and 87, I missed that. So many patches. The description of attachment 331699 [details] [diff] [review] says "create token object, even when there is no session, for trunk, v6 (checked in)". I guess that's incorrect if only v5 was checked in. Since the v6 patch got 2 reviews, the remainder should be checked in to the trunk. Because there are so many patches involved here, I'm first verifying that the roll-up patch is the sum of the others, modulo the changes you listed, and then I will review each patch individually.
Comment on attachment 344595 [details] [diff] [review] roll-up patch for 3.11.10 (checked in on branch) Alexei has taken over this review. Taking myself off.
Attachment #344595 - Flags: review?(julien.pierre.boogz)
Comment on attachment 344595 [details] [diff] [review] roll-up patch for 3.11.10 (checked in on branch) Patch looks good r+. Two things: it partially includes Julien's patch for bug 232392. I guess we should not forget to integrate the rest of the patch into the branch later. Another thing is how we create and destroy nssToken default session. It always assumed that the session will be created on the other object(not token) arena. But, for the constructor(nssSession_ImportNSS3Session function http://mxr.mozilla.org/security/ident?i=nssSession_ImportNSS3Session) arena is optional. Token destructor does not do any thing to session, thinking that it belongs to another object arena. So my question is where a possibility, that nssToken session will be created without use of arena?
Attachment #344595 - Flags: review?(alexei.volkov.bugs) → review+
Comment on attachment 331699 [details] [diff] [review] create token object, even when there is no session, for trunk, v6 (checked in) Previously I had accidentally checked in version 5 of this patch, for file devslot.c, even though version 6 was the reviewed version. Now I have corrected that by committing the diffs. nss/lib/dev/devslot.c; new revision: 1.25; previous revision: 1.24
This bug is the last pending fix for NSS 3.11.10 (AFAIK). Is there anything blocking the resolution of this bug ?
Comment on attachment 344595 [details] [diff] [review] roll-up patch for 3.11.10 (checked in on branch) Committed on branch dev/ckhelper.c; new: 1.34.28.2; previous: 1.34.28.1 dev/dev.h; new: 1.37.28.2; previous: 1.37.28.1 dev/devslot.c; new: 1.22.2.3; previous: 1.22.2.2 dev/devtoken.c; new: 1.39.28.4; previous: 1.39.28.3 dev/devutil.c; new: 1.26.28.2; previous: 1.26.28.1 pk11wrap/dev3hack.c; new: 1.21.28.3; previous: 1.21.28.2 pk11wrap/pk11cert.c; new: 1.143.2.14; previous: 1.143.2.13 pki/trustdomain.c; new: 1.51.28.8; previous: 1.51.28.7
Attachment #344595 - Attachment description: roll-up patch for 3.11.10 → roll-up patch for 3.11.10 (checked in on branch)
with patches committed on trunk and branch, I am optimistically marking this bug resolved/fixed.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: