Closed
Bug 164501
Opened 22 years ago
Closed 22 years ago
PK11_FindCrlByName returns invalid SECItem when running low on memory
Categories
(NSS :: Libraries, defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
3.8
People
(Reporter: julien.pierre, Assigned: rrelyea)
References
Details
Attachments
(3 files, 2 obsolete files)
1.43 KB,
patch
|
Details | Diff | Splinter Review | |
6.08 KB,
patch
|
Details | Diff | Splinter Review | |
1.50 KB,
patch
|
Details | Diff | Splinter Review |
I'm doing some work from home on my OS/2 machine this week-end, so I used the same large cert7.db database that holds my 26 MB CRL. Much to my surprise, NSS is unable to correctly lookup the CRL when verifying a certificate using certutil -V. Further investigation shows that PK11_FindCrlByName returns a non-NULL SECItem*, with a data field of NULL, and a length of NULL. However, it also returns a non-zero CK_OBJECT_HANDLE, indicating that it found a CRL. I tried this on an NSS 3.5 tree and the same problem occurred. So this is not related to the recent memory optimizations in 3.6. When I do the same thing on Solaris, both in 3.5 and 3.6, there is no problem, and the CRL is found. So it seems that this bug only shows on OS/2 so This might be yet another DBM issue. I will continue investigating.
Reporter | ||
Comment 1•22 years ago
|
||
FYI, PK11_LookupCrls, used for crlutil -L, works fine. It's only the lookup of the CRL by subject that fails. I traced the code some more and see that it's not DBM causing the problem - it reads it fine. Somehow the object is getting lost when being returned to the higher-level functions. This may be due to the excessive number of copies we do. Despite being 32-bit, OS/2 has a per-process address space limit of 512MB. There are other factors that reduce this limit, such as shared memory in use in the OS. It is not impossible that I have hit this memory addressing limit in my test, but I don't know for sure yet. In fact, I just confirmed that this is the problem. The process was using more than 300 MB of memory for some reason, and some calls to SECItem_DupItem were returning NULL, and eventually this is what got returned to the top. I disabled the arena free list by setting NSS_DISABLE_ARENA_FREE_LIST=1. Then the memory usage of the process was much lower, and the CRL lookup succeeded !!! Now for the interesting part about how I discovered this : I have a test that checks for revocation status on a certificate listed on the CRL. On OS/2, the test was returning that the certificate was valid !!! What happened is that after getting the invalid SECItem, the higher-level code tried to decode it. That failed. So it was unable to build a CERTSignedCrl*, and returned NULL. Then SEC_CheckCrl believed that there was no CRL for the issuer, and proceeded happily to validate the certificate.
Priority: -- → P1
Target Milestone: --- → 3.6
Reporter | ||
Updated•22 years ago
|
OS: OS/2 → All
Hardware: PC → All
Summary: PK11_FindCrlByName returns invalid SECItem on OS/2 → PK11_FindCrlByName returns invalid SECItem when running low of memory
Comment 2•22 years ago
|
||
This patch changes static function FreeArenaList in nsprpub/lib/ds/plarena.c to free any arena block > 100KB rather than putting it on the free arena list. This should provide some immediate reduction in memory usage when giant things get allocated in an arena pool. This patch also causes the blocks that are put on the free list to be cleared and causes the pool's counter of arena blocks to be reduced by the number of blocks moved to the free list. Review invited. (I haven't tested this yet.) I will also produce a patch that makes the arena free list be sorted in ascending order of free block size. This should make the first-fit allocation algorithm perform as a best fit one.
Comment 3•22 years ago
|
||
This patch is an alternative to the previous one. It keeps the freelist ordered by increasing arena block size, which causes the first-fit allocation scheme to be a best-fit allocation scheme. To shorten the time spent searching the freelist for insertion or removal of a fitting block, the freelist is broken into 4 lists, all protected by the same single lock. The 4 lists are used for different sizes of blocks. The ranges of sizes for the 4 lists are: 0k < x < 2k 2k < x < 8k 8k < x < 32k 32k < x comments invited. Code is untested.
Comment 4•22 years ago
|
||
Oops. That last comment should have said: 0k < x <= 2k 2k < x <= 8k 8k < x <= 32k 32k < x
Reporter | ||
Comment 5•22 years ago
|
||
Nelson, I can't compile your patch. I think you made it against a different version of NSPR than the one I'm using. I am using the tip of NSPR. I tried to rev plarena.c to release 3.6.2.1 which you patched, but I still get a compile error : [e:\dev\nss\36\mozilla\nsprpub\os22.45_icc_dbg.obj\lib\ds]gmake icc -Foplarena.obj -c /Q /W1 /Q /qlibansi /Gd+ /Gm+ /Su4 /Mp /Tl9 /Ti+ -UND EBUG -DDEBUG_charm -DDEBUG=1 -DXP_OS2=1 -DXP_PC=1 -DBSD_SELECT=1 -D_PR_GLO BAL_THREADS_ONLY=1 -DXP_OS2_VACPP=1 -DOS2=4 -DTCPV40HDRS=1 -D_X86_=1 -DHAVE _STRERROR=1 -DFORCE_PR_LOG -IE:/DEV/NSS/36/mozilla/security/nss/../../dist/OS2 2.45_icc_DBG.OBJ/include -I../../../pr/include ../../../lib/ds/plarena.c ../../../lib/ds/plarena.c(306:38) : error EDC0068: Operation between types "unsi gned long" and "struct PLArena*" is not allowed. ../../../lib/ds/plarena.c(306:42) : error EDC0045: Undeclared identifier base. E:/DEV/MOZTOOLS/GMAKE.EXE: *** [plarena.obj] Error 12 [e:\dev\nss\36\mozilla\nsprpub\os22.45_icc_dbg.obj\lib\ds] Please give me a patch for the NSPR tip. I have a good test case to reproduce this problem - my cert verification test program runs out of memory when the NSS arena free list is enabled, because it hits the OS/2 512 MB process limit. It only runs successfully with the free list disabled. So if I could get the test to work with your patch and the free list enabled, that would be good progress. I feel that the arena enhancement should probably be the subject of another bug though, on NSPR, so feel free to open one. In this bug, I will track down the place where we try to allocate memory and it fails without us catching it.
Reporter | ||
Comment 6•22 years ago
|
||
The problem occurs in softoken/pcertdb.c, line 4824 . The SECITEM_DupItem returns NULL. This error is ignored, and the function returns SECSuccess .
Reporter | ||
Comment 7•22 years ago
|
||
Unfortunately, even checking for this condition and returning SECFailure doesn't fix the problem. This is a design issue - this API wasn't designed to support failures. It was supposed to either return objects or no objects, but no in between, such as "not enough memory to return the object that is present" which is this case. So we get stuck with bogus data being returned. However, this is an internal API, so it could be fixed with a better prototype, as well additional checking in some of the layers. I'm afraid the stack is rather big when this happens, though : Function | Part -----------------------------------+----------------- nsslowcert_FindCrlByKey | PCERTDB.OBJ pk11_getCrl | PKCS11U.OBJ pk11_FindCrlAttribute | PKCS11U.OBJ pk11_FindTokenAttribute | PKCS11U.OBJ pk11_FindAttribute | PKCS11U.OBJ NSC_GetAttributeValue | PKCS11.OBJ nssCKObject_GetAttributes | CKHELPER.OBJ nssCryptokiCRL_GetAttributes | CKHELPER.OBJ nssCRL_Create | CERTIFICATE.OBJ crl_createObject | PKIBASE.OBJ nssPKIObjectCollection_GetObjects | PKIBASE.OBJ nssPKIObjectCollection_GetCRLs | PKIBASE.OBJ nssTrustDomain_FindCRLsBySubject | TRUSTDOMAIN.OBJ PK11_FindCrlByName | PK11CERT.OBJ SEC_FindCrlByKeyOnSlot | CRL.OBJ DPCache_Fetch | CRL.OBJ DPCache_Update | CRL.OBJ AcquireDPCache | CRL.OBJ CERT_CheckCRL | CRL.OBJ SEC_CheckCRL | CERTVFY.OBJ cert_VerifyCertChain | CERTVFY.OBJ CERT_VerifyCertificate | CERTVFY.OBJ VerifyCert | CERTUTIL.OBJ ValidateCert | CERTUTIL.OBJ main | CERTUTIL.OBJ _start | EXESTRTI 0x1FFECE33 | DOSCALL1.DLL:4
Assignee | ||
Comment 8•22 years ago
|
||
The problem is that nssCKObject_GetAttributes() has to handle the case where we may be asking for attributes that are not critical and may not be understood by the higher level. This means the CRL that gets created has a NULL value. Somewhere this value needs to be checked for validity, and if that check fails we need to fail a crl validiation. Changing the error code coming back from softoken won't affect this.
Comment 9•22 years ago
|
||
Comment on attachment 96763 [details] [diff] [review] patch creates 4 freelists, sorted by size This patch contains a syntax error. The replacement patch will be added to bug 166701
Attachment #96763 -
Attachment is obsolete: true
Attachment #96763 -
Flags: needs-work+
Assignee | ||
Comment 11•22 years ago
|
||
The basic idea of this patch is that error types are sent up using PORT_SetError(). In our search routine, if we simply don't find what we are looking for, we set the error NOT_FOUND. Upper level search codes, when no CRL is found check for this error code and treats it as a no error condition. This allows the CRL data to fail for other reasons. This patch does fail cert validation if the C_FindInit* calls return an error, or the object is found, but an error is found in subsequent processing. Our standard regression tests pass with this code. This patch deals with the Cert lookup and Cert Validations paths, so it is deemed high risk. It should be tested against Julian's tests, and ran in a couple of existing products before it goes into RTM code. I recommend moving this bug to NSS 3.7.
Assignee | ||
Comment 12•22 years ago
|
||
since the patch is riskier than the current behavior, moving out to 3.7.
Target Milestone: 3.6 → 3.7
Assignee | ||
Comment 13•22 years ago
|
||
Fixed checked into NSS 3.7. I'd like Julien to verify against his test case before I close this, though.
Reporter | ||
Comment 14•22 years ago
|
||
Actually, I ran this with my test case yesterday. The test was the verification of a revoked certificate. Prior to this fix, the code would return an invalid SECItem, then the decoding would fail, then the CRL code would do the same thing and assume revoked by default. Now, the wrapper code returns a NULL SECItem, so the CRL code assumes there is no CRL, and therefore the CRL validates. I have adjusted the test case by adding a malloc() in NSSInit on my tree that allocates RAM based on the environment variable "LEAK". This eats into OS/2's limited process address space (about 300 MB actual due to shared memory, out of the 512 MB theoritical). If I set LEAK to 10 MB, then the cert is considered revoked because the CRL can be read/decoded. If I set LEAK to 100 MB, there isn't enough memory, and no CRL is found, and the cert actually verifies ... We need some way of returning something other than a good or a NULL CRL. This is why I was saying the API was broken. There is no way to report a failure ...
Reporter | ||
Comment 15•22 years ago
|
||
In the previous comment, in place of "same thing" read "safe thing". Also, it's the cert that validates, not the CRL.
Reporter | ||
Comment 16•22 years ago
|
||
Sorry, I think I tested on the 3.6 branch, which doesn't have the fix. I am building the tip now.
Reporter | ||
Comment 17•22 years ago
|
||
Even with the tip, it still fails. I found that the code doesn't go through the code path that Bob thought. Even though the last error set is indeed NSS_ERROR_NO_MEMORY, what happens is that the error is never checked. Inside of PK11_FindCrlByName, at line 3704, a non-NULL array is returned. This causes the NSS_GetError() call that Bob added to be skipped. However, the content of the array is empty, so the code goes on to the for loop , which checks for interesting CRLs in the array. There are none, so it just returns SEC_ERROR_CRL_NOT_FOUND. Suggestions : - if the array is completely empty, then perhaps this should be a fatal error and we should not set "not found" ? I'll create a patch for this - we probably should not be returning an empty array
Reporter | ||
Updated•22 years ago
|
Summary: PK11_FindCrlByName returns invalid SECItem when running low of memory → PK11_FindCrlByName returns invalid SECItem when running low on memory
Reporter | ||
Comment 18•22 years ago
|
||
This causes the correct behavior - the certs (both revoked and valid) are considered invalid when a CRL cannot be successfully transferred from softoken due to a low memory condition. However, the error code reported at the highest level from CERT_VerifyCertificate is "invalid arguments". The fix for that may not necessarily be in the same code that was just patched but higher up.
Reporter | ||
Comment 19•22 years ago
|
||
Bob, Could you please review my latest (patch 102646) ? I think this should go in the tip and perhaps also in 3.6.1 . Thanks.
Assignee | ||
Comment 20•22 years ago
|
||
That patch would work, though a simpler patch for the same effect would be to change line 3726 from: > if (!crls) { to: > if (!crls || *crls == NULL) { > if (crls) nssCrlArray_Destroy(crls); bob
Comment 21•22 years ago
|
||
Comment on attachment 102646 [details] [diff] [review] patch to consider an empty array an error and not set "not found" code I just wanted to ask that you use the prevalent comment block formatting style in the file.
Reporter | ||
Comment 22•22 years ago
|
||
Fix comment style
Attachment #102646 -
Attachment is obsolete: true
Comment 23•22 years ago
|
||
Moved to target milestone 3.8 because the original NSS 3.7 release has been renamed 3.8.
Target Milestone: 3.7 → 3.8
Assignee | ||
Comment 24•22 years ago
|
||
The shorter version of the patch i proposed in the comment is now checked in to the trunk.
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•