Last Comment Bug 160635 - CRL lookup consumes too much memory, and leaks some
: CRL lookup consumes too much memory, and leaks some
Status: RESOLVED FIXED
[3.4.3] [3.5.1]
:
Product: NSS
Classification: Components
Component: Libraries (show other bugs)
: 3.4
: All All
: P1 normal (vote)
: 3.6
Assigned To: Julien Pierre
: Bishakha Banerjee
:
Mentors:
Depends on:
Blocks: 162983 164373 164501
  Show dependency treegraph
 
Reported: 2002-08-02 00:36 PDT by Julien Pierre
Modified: 2002-10-11 16:59 PDT (History)
12 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---


Attachments
patch for NSS 3.6 to fix CRL memory leaks (1.30 KB, patch)
2002-08-12 18:28 PDT, Julien Pierre
wtc: review+
Details | Diff | Splinter Review
alternate patch that fixes the leak while also reducing memory overhead (4.88 KB, patch)
2002-08-12 19:04 PDT, Julien Pierre
no flags Details | Diff | Splinter Review
Better patch for dealing with excessive crl copies (5.56 KB, patch)
2002-08-14 09:26 PDT, Robert Relyea
no flags Details | Diff | Splinter Review

Description Julien Pierre 2002-08-02 00:36:05 PDT
Test case is a cert7.db containing a single 26 MB CRL.
Run crlutil -d . -L to list CRLs.

On the current NSS tip, debug build, the process grows to 260 MB of memory
before even attempting to decode the CRL. This is the memory usage observed in
top at the following stack, upon entry of the CRL decode function :

=>[1] CERT_DecodeDERCrlEx(narena = 0x62010, derSignedCrl = 0xffbef5e0, type = 1,
options = 0), line 367 in "crl.c"
  [2] CERT_DecodeDERCrl(narena = 0x62010, derSignedCrl = 0xffbef5e0, type = 1),
line 458 in "crl.c"
  [3] pk11_CollectCrls(slot = 0x5de50, crlID = 3497291292U, arg = 0x623f0), line
854 in "pk11cert.c"
  [4] PK11_TraverseSlot(slot = 0x5de50, arg = 0xffbef76c), line 951 in "pk11cert.c"
  [5] pk11_TraverseAllSlots(callback = 0xff1b1b18 = &PK11_TraverseSlot(struct
PK11SlotInfoStr *slot, void *arg), arg = 0xffbef76c, wincx = (nil)), line 983 in
"pk11cert.c"
  [6] PK11_LookupCrls(nodes = 0x623f0, type = 1, wincx = (nil)), line 1067 in
"pk11cert.c"
  [7] SEC_LookupCrls(handle = 0x556f8, nodes = 0xffbef86c, type = 1), line 668
in "crl.c"
  [8] ListCRLNames(certHandle = 0x556f8, crlType = 1), line 111 in "crlutil.c"
  [9] ListCRL(certHandle = 0x556f8, nickName = (nil), crlType = 1), line 147 in
"crlutil.c"
  [10] main(argc = 4, argv = 0xffbef9fc), line 383 in "crlutil.c"
(dbx) p derSignedCrl
derSignedCrl = 0xffbef5e0

Something in the lookup of objects is causing a huge amount of memory to be
allocated just to look for that one 26 MB object. This is *wrong* ! We shouldn't
consume 10x as much RAM as the entire database size to look up one object ...

I'm not sure what the problem is here as I haven't had time to investigate it
and my other CRL tasks will not leave me any time to do so.

My best guesses would be some inefficiencies in pk11wrap and softoken. Third
would be berkeley DB - maybe it really wasn't meant to handle such large objects.
Comment 1 Nelson Bolyard (seldom reads bugmail) 2002-08-02 12:15:35 PDT
Julien, I have a easily testible hypothesis as to the cause of this problem.
I think it has to do with the way that NSPR's arena free list works. 
To teste this, simply disable NSS's use of the arena free list entirely.
That is done by setting an environment variable: NSS_DISABLE_ARENA_FREE_LIST 

Please try that and see if it solves the memory bloat, and report it here.  
Also, see what effect if has on performance.  

If disabling the free list solves the problem, then there is a pretty simple
change we can make to nsprpub/lib/ds/plarena.c, or perhaps to PORT_FreeArena
in security/nss/lib/util/secport.c.  I won't explain my hypothesis unless
testing confirms it.
Comment 2 Terry Hayes 2002-08-02 15:02:18 PDT
I'd also like to see this test done with a modification to the CRL parsing code 
that uses a larger "chucksize" for the arena.  I'd be interested in a run that 
uses 256K (or more) as the argument to PORT_NewArena.  This should probably be 
combined with Nelson's suggestion to disable the free list, since the large 
pools won't be useful in too many other places.
Comment 3 Julien Pierre 2002-08-02 15:56:24 PDT
Nelson,

I set the environment variable. The process memory usage was reduced to 182 MB
upon entry of the CRL decode function, as measured by top. Also, it is worth
noting that no further process memory growth was observed during the decode
afterwards, even though it does need about 30 MB of memory to allocate the array
of CRLEntry structure and the CRLEntry structures themselves.

I think what this probably means is that there were a lot of temporary objects
created during the lookup, enough to total 182 MB at the peak (or 260 MB with
the free list enabled). But some of the objects were freed , which allowed for
the decode to reuse the memory already allocated in the process.

Still, the problem remains that we are using 182 MB of RAM for temporary objects
to lookup a 26 MB object in a 26 MB database.
Comment 4 Julien Pierre 2002-08-02 17:18:27 PDT
Terry, this bug is about things that happen before CRL decoding - in other
words, the memory being allocated while reading the CRL from cert7.db , before
CRL decoding function is called. I measured it by setting a breakpoint upon
entry of the CRL decode function, and looking at the process memory usage when
that breakpoint was hit.

I am not sure where the arena is being created in the lookup case. This would
need to be investigated before I could change the chunk size.
Comment 5 Julien Pierre 2002-08-02 20:06:20 PDT
I found where the arena was allocated, it is in SEC_LookupCrls . The default
chunk size was 2 KB. I increased it to 256 KB. It didn't reduce the memory usage
upon entry of the CRL function, on the contrary, it slightly increased it to 183
MB (this was with the freelist disabled). I wouldn't expect the arena chunk size
to make a difference in the amount of memory used, since it is up to the caller
to decide what it wants to allocate / deallocate. So I stand by my initial
comments that this is a more fundamental problem in NSS that we make too many
temporary copies of objects when looking them up. In this case, it looks like we
make about 7 copies of this 26 MB CRL in memory before it gets decoded.
Comment 6 Terry Hayes 2002-08-04 16:00:51 PDT
Actually, I was looking for a change in the processing time due to this change.
Did you see any change? You are right, this discussion should be moved to the
correct bug for further discussion.
Comment 7 Robert Relyea 2002-08-05 10:15:33 PDT
I recently made several changes to the softoken which reduced the number of
memory copies and memory allocations used in the certificate case (In this case
to improve the performance of allocating a single cert). Some of these changes
may have and effect on CRL's but I did not touch most of the CRL processing
code. The underlying database primitives from pcertdb does a lot of copying of
data. That would be a good first place to look.

bob
Comment 8 Julien Pierre 2002-08-05 11:47:31 PDT
Terry, yes, I saw a difference in performance when I increased the arena chunk
size. See bug # 160640 . 
Comment 9 Nelson Bolyard (seldom reads bugmail) 2002-08-05 13:42:01 PDT
The result reported in comment 3 above, that memory usage was reduced by 
~80 MB when NSS does not use NSPR's arena free list, suggests that we 
need to change the way we use that arena free list.

IIRC, NSPR's arena free list is more or less a LIFO.  When an arena or 
chain of arenas is freed, it is put on the head of the free list.  
Allocation of arenas from the freelist uses a first-fit algorithm.  
Starting at the head of the list, the first arena that is large enough 
to hold the requested amount of memory is taken.   

This means that a 25 MB arena can be freed to the free list, and then 
that huge arena can be allocated to an arenapool that needs only 2KB.
Then if another 25 MB arena is needed, a new one must be created.

I can think of several possible solutions to this problem.  These are not
mutually exclusive.  They can be implemented either in NSPR or in NSS.  
They include:

1. When freeing an arena that is larger than some limit, free it back 
to the heap, rather than to the free arena list.

2. When the zone allocator is in use, make sure that arenas are allocated
using that allocator, and then disable the arena free list alltogether,
and let the zone allocator act as the arena free list.  This will prevent
the problem of allocating very large arenas when a small one would suffice.

3. Change NSPR's arena free list to keep the list sorted in ascending
order by size of the arena.  When a very large arena is freed, it will
go to the end of the list (or near the end).  Then the first fit allocation
algorithm will take the smallest arena from the free list that is large
enough to satisfy the allocation request.  This solution may have the 
undesirable effect of increasing the time that the free list lock is held
for a typical deallocation.  
Comment 10 Stephane Saux 2002-08-05 18:47:38 PDT
Another option would be to have size buckets. Storing an arena is such a free
list is only one more test on the size.  retrieving is one test, then if there's
nothing in the correct bucket, go to the next bigger one.
Comment 11 Nelson Bolyard (seldom reads bugmail) 2002-08-05 19:20:44 PDT
"Size buckets" are exactly what the zone allocator implements.  I think that, 
when the zone allocator is enabled, it should replace all other NSS and NSPR 
free lists.
Comment 12 Julien Pierre 2002-08-09 15:34:57 PDT
I believe there is a significant memory leak in the lookup of the CRL. Yesterday
I added a loop in certutil to do the verification of a single certificate more
than once, with a cert database containing a 26 MB CRL and a revoked
certificate. The process memory measured by top kept growing from about 300 MB
at the first iteration to about 700 MB when I stopped it. The machine was
swapping heavily.

I used the dbx checkleaks functionality which showed a confirmed few small
leaks, as well as potential very large leaks that seemed to comprise the entire CRL.
Comment 13 Julien Pierre 2002-08-09 19:27:45 PDT
Confirmed the leak when using Mozilla. Mozilla does a large number of
(unnecessary, repeated) cert verifications for each cert in PSM. With my cert
database contained about 12 user certs and the AOL CRL (30 KB) and Thawte CRL
(300 KB), I was able to eat up 120 MB of RAM and make my OS/2 system start to
swap in about 5 clicks back and forth between tabs in PSM. When I removed the
CRLs, I could click on tabs as many times as I wanted without any more memory
being used.
I'd like to point out that this was a very realistic real-world test, with my
production database, not a huge 26 MB CRL, with a real application, and the leak
was so significant that it made Mozilla unusable.
Comment 14 Julien Pierre 2002-08-09 19:35:29 PDT
I also tried the same test as in comment #13 with NSS 3.5 . The same leak was
observed. So this isn't a regression. It has probably been around in NSS since
at least 3.4 , since there isn't much difference between 3.4 and 3.5. It may
have been in there for much longer.

I will try to create a small database with just a thawte cert and CRL so the
leak can be reproduced quickly with certutil on all versions of NSS back to 3.2.
Comment 15 Julien Pierre 2002-08-09 21:24:39 PDT
There are 3 leaks for this case, two minor ones and a big one.

With NSS 3.3 it leaks as follows between two verifications - these ar the small
leaks :

(dbx) showleaks
Checking for memory leaks...

Actual leaks report    (actual leaks:         2  total size:      60 bytes)

 Total  Num of  Leaked      Allocation call stack
 Size   Blocks  Block
                Address
======  ====== ==========  =======================================
    34       1   0x2068c0  PR_Malloc < PORT_Alloc < SECITEM_CopyItem <
cert_FindExtensionByOID < cert_FindExtension < CERT_FindCertExtension <
CERT_GetCertificateNames < CERT_VerifyCertChain < CERT_VerifyCert < ValidateCert
< main 
    26       1   0x220c40  PR_Malloc < PORT_Alloc < SECITEM_CopyItem <
cert_FindExtensionByOID < cert_FindExtension < CERT_FindCertExtension <
CERT_GetCertificateNames < CERT_VerifyCertChain < CERT_VerifyCert < ValidateCert
< main 
 

Possible leaks report  (possible leaks:       0  total size:       0 bytes)

That's a leak of 60 bytes, which was observed when verifying my Thawte cert
against the Thawte CRL.

Here is the result of the same test in NSS 3.4 - which is the same , plus a big
leak of the entire CRL :

(dbx) showleaks
Checking for memory leaks...

Actual leaks report    (actual leaks:         3  total size:     228 bytes)

 Total  Num of  Leaked      Allocation call stack
 Size   Blocks  Block
                Address
======  ====== ==========  =======================================
   108       1   0x2a4f98  PL_strdup < nsslowcert_FindCrlByKey < pk11_getUrl <
pk11_FindCrlAttribute < pk11_FindTokenAttribute < pk11_FindAttribute <
NSC_GetAttributeValue < PK11_GetAttributes < PK11_FindCrlByName <
SEC_FindCrlByKeyOnSlot < SEC_FindCrlByName < SEC_CheckCRL < CERT_VerifyCertChain
< CERT_VerifyCert < ValidateCert < main 
   108       1   0x2a3410  PL_strdup < nsslowcert_FindCrlByKey < pk11_getUrl <
pk11_FindCrlAttribute < pk11_FindTokenAttribute < pk11_FindAttribute <
NSC_GetAttributeValue < PK11_GetAttributes < PK11_FindCrlByName <
SEC_FindCrlByKeyOnSlot < SEC_FindCrlByName < SEC_CheckCRL < CERT_VerifyCertChain
< CERT_VerifyCert < ValidateCert < main 
    12       1   0x2a0a10  PR_Malloc < PORT_Alloc < SECITEM_DupItem <
nsslowcert_FindCrlByKey < pk11_searchCrls < pk11_searchTokenList <
NSC_FindObjectsInit < pk11_FindObjectByTemplate < PK11_FindCrlByName <
SEC_FindCrlByKeyOnSlot < SEC_FindCrlByName < SEC_CheckCRL < CERT_VerifyCertChain
< CERT_VerifyCert < ValidateCert < main 
 

Possible leaks report  (possible leaks:       1  total size:  294518 bytes)

 Total  Num of  Leaked      Allocation call stack
 Size   Blocks  Block
                Address
======  ====== ==========  =======================================
294518       1   0x4afe80  PR_Malloc < PORT_Alloc < SECITEM_DupItem <
nsslowcert_FindCrlByKey < pk11_searchCrls < pk11_searchTokenList <
NSC_FindObjectsInit < pk11_FindObjectByTemplate < PK11_FindCrlByName <
SEC_FindCrlByKeyOnSlot < SEC_FindCrlByName < SEC_CheckCRL < CERT_VerifyCertChain
< CERT_VerifyCert < ValidateCert < main 
 
(dbx) 

I have confirmed that the last "possible leak" is real.
Ie. the CRL object returned through PKCS#11 is not being freed.

I tested this again on NSS 3.6, and got the same results as with NSS 3.4 .
Comment 16 Julien Pierre 2002-08-12 16:56:14 PDT
Just for the record, this is the leak report for NSS 3.6. It looks like function
names changed, so this is useful to make the patch. Unfortunately this means
that the fix won't be as easy to backport to 3.4 or 3.5 if we need to.

(dbx) showleaks
Checking for memory leaks...

Actual leaks report    (actual leaks:         3  total size:     228 bytes)

 Total  Num of  Leaked      Allocation call stack
 Size   Blocks  Block
                Address
======  ====== ==========  =======================================
   108       1    0xa37b0  PR_Malloc < PORT_Alloc < PORT_Strdup <
nsslowcert_FindCrlByKey < pk11_getUrl < pk11_FindCrlAttribute <
pk11_FindTokenAttribute < pk11_FindAttribute < NSC_GetAttributeValue <
nssCKObject_GetAttributes < nssCryptokiCRL_GetAttributes < nssCRL_Create <
crl_createObject < nssPKIObjectCollection_GetObjects <
nssPKIObjectCollection_GetCRLs < nssTrustDomain_FindCRLsBySubject 
   108       1    0x9ff68  PR_Malloc < PORT_Alloc < PORT_Strdup <
nsslowcert_FindCrlByKey < pk11_getUrl < pk11_FindCrlAttribute <
pk11_FindTokenAttribute < pk11_FindAttribute < NSC_GetAttributeValue <
nssCKObject_GetAttributes < nssCryptokiCRL_GetAttributes < nssCRL_Create <
crl_createObject < nssPKIObjectCollection_GetObjects <
nssPKIObjectCollection_GetCRLs < nssTrustDomain_FindCRLsBySubject 
    12       1    0xb0200  PR_Malloc < PORT_Alloc < SECITEM_DupItem <
nsslowcert_FindCrlByKey < pk11_searchCrls < pk11_searchTokenList <
NSC_FindObjectsInit < find_objects < find_objects_by_template <
nssToken_FindCRLsBySubject < nssTrustDomain_FindCRLsBySubject <
PK11_FindCrlByName < SEC_FindCrlByKeyOnSlot < SEC_FindCrlByName < SEC_CheckCRL <
cert_VerifyCertChain 
 

Possible leaks report  (possible leaks:       1  total size:  294518 bytes)

 Total  Num of  Leaked      Allocation call stack
 Size   Blocks  Block
                Address
======  ====== ==========  =======================================
294518       1   0x4093d0  PR_Malloc < PORT_Alloc < SECITEM_DupItem <
nsslowcert_FindCrlByKey < pk11_searchCrls < pk11_searchTokenList <
NSC_FindObjectsInit < find_objects < find_objects_by_template <
nssToken_FindCRLsBySubject < nssTrustDomain_FindCRLsBySubject <
PK11_FindCrlByName < SEC_FindCrlByKeyOnSlot < SEC_FindCrlByName < SEC_CheckCRL <
cert_VerifyCertChain 
 
(dbx) 
Comment 17 Julien Pierre 2002-08-12 18:28:00 PDT
Created attachment 95026 [details] [diff] [review]
patch for NSS 3.6 to fix CRL memory leaks

All the leaks were isolated in softoken. The url and the entire DER CRL were
leaked.
Comment 18 Julien Pierre 2002-08-12 18:33:20 PDT
I just created a patch for NSS 3.6 to plug the leaks. While doing so, I noticed
an incredible number of copies. In particular, the biggest leak in this case was
a leak of a complete copy of the DER CRL, which was only used as a boolean to
see if we have the CRL ! This is a waste and increases the memory footprint
unnecessarily. The patch I have attached only plugs the leak but does not
decrease the footprint, which was the original subject of this bug. I will
attempt to produce another patch that fixes both at the same time, but it will
be a more extensive patch.

Note that this leak and inefficient use of memory was only in softoken. Still, I
have seen a high memory usage occur when searching for CRLs in other tokens. So
there is probably yet another part in this problem, in the PKCS#11 wrapper,
which could use memory optimization.
Comment 19 Julien Pierre 2002-08-12 19:04:30 PDT
Created attachment 95034 [details] [diff] [review]
alternate patch that fixes the leak while also reducing memory overhead
Comment 20 Wan-Teh Chang 2002-08-13 14:22:21 PDT
Assigned the bug to Julien.  Thanks for writing the bug report
and proposing fixes!
Comment 21 Julien Pierre 2002-08-13 15:34:55 PDT
Actually, these patches apply successfully to all of NSS 3.4, NSS 3.5 and NSS
3.6 trees, and I have verified that they fix the leak in all cases.
Comment 22 Wan-Teh Chang 2002-08-13 16:24:09 PDT
Comment on attachment 95026 [details] [diff] [review]
patch for NSS 3.6 to fix CRL memory leaks

r=wtc.
Comment 23 Wan-Teh Chang 2002-08-13 16:44:55 PDT
Comment on attachment 95034 [details] [diff] [review]
alternate patch that fixes the leak while also reducing memory overhead

In pcertdb.c, you have:

>@@ -4784,14 +4784,13 @@
>  * Lookup a CRL in the databases. We mirror the same fast caching data base
>  *  caching stuff used by certificates....?
>  */
>-SECItem *
>+SECStatus
> nsslowcert_FindCrlByKey(NSSLOWCERTCertDBHandle *handle, SECItem *crlKey, 
>-		char **url, PRBool isKRL)
>+		char **url, SECItem** crl, PRBool isKRL)
> {
>     SECItem keyitem;
>     DBT key;
>     SECStatus rv;
>-    SECItem *crl = NULL;
>     PRArenaPool *arena = NULL;
>     certDBEntryRevocation *entry = NULL;
>     certDBEntryType crlType = isKRL ? certDBEntryTypeKeyRevocation  
>@@ -4814,13 +4813,16 @@
>     entry = ReadDBCrlEntry(handle, crlKey, crlType);
> 	
>     if ( entry == NULL ) {
>+        rv = SECFailure;
> 	goto loser;
>     }

You also need to add 'rv = SECFailure;' to the body of 'if ( arena == NULL )'.

I suggest that you initialize rv to SECFailure and set rv to SECSuccess
right before the 'loser' label.

In pkcs11u.c, you have:

>@@ -358,8 +358,10 @@
>     }
> 
>     isKrl = (PRBool) object->obj.handle == PK11_TOKEN_KRL_HANDLE;
>-    crl = nsslowcert_FindCrlByKey(object->obj.slot->certDB,&object->dbKey,
>-								NULL,isKrl);
>+    if (SECSuccess != nsslowcert_FindCrlByKey(object->obj.slot->certDB,&object->dbKey,
>+								NULL, &crl, isKrl)) {
>+        return NULL;
>+    }
>     object->obj.objectInfo = (void *)crl;
>     object->obj.infoFree = (PK11Free) pk11_FreeItem;
>     return crl;

This does not seem to be equivalent to the original code.  The original
code sets object->obj.infoFree to pk11_FreeItem if nsslowcert_FindCrlByKey
fails while object->obj.objectInfo remains NULL.  Not sure if this matters.


>@@ -377,13 +378,9 @@
>     }
> 
>     isKrl = (PRBool) object->obj.handle == PK11_TOKEN_KRL_HANDLE;
>-    crl = nsslowcert_FindCrlByKey(object->obj.slot->certDB,&object->dbKey,
>-								&url,isKrl);
>-    if (object->obj.objectInfo == NULL) {
>-	object->obj.objectInfo = (void *)crl;
>-	object->obj.infoFree = (PK11Free) pk11_FreeItem;
>-    } else {
>-	if (crl) SECITEM_FreeItem(crl,PR_TRUE);
>+    if (SECSuccess != nsslowcert_FindCrlByKey(object->obj.slot->certDB,&object->dbKey,
>+								&url, NULL, isKrl)) {
>+        return NULL;
>     }
>     return url;
> }

This is not equivalent to the original code.  The original code sets
object->obj.objectInfo if it is originally NULL.  I am not sure if
this side effect is necessary.
Comment 24 Julien Pierre 2002-08-13 17:51:51 PDT
Wan-Teh,

About the review :

- pcertdb.c : I agree with your comments

- pkcs11u.c : I agree with your first comment, I shouldn't return NULL there. I
just want to
PR_ASSERT(crl == NULL); 
and fall through

- pkcs11u.c : I am not sure about your second comment. This is where the memory
optimization is. By not passing a pointer to return a SECItem for the CRL,
nsslowcert_FindCrlByKey does not make a temporary copy of the full CRL.
I'm not exactly sure what the purpose of that copy was, since it appears it was
going to get freed right away or soon after. 
I changed the prototype for nsslowcert_FindCrlByKey solely to do this. If we
decide that we don't want to do it, then we can just go with the first simpler
patch.
Comment 25 Wan-Teh Chang 2002-08-13 18:04:39 PDT
Comment on attachment 95034 [details] [diff] [review]
alternate patch that fixes the leak while also reducing memory overhead

Regarding my second comment about pkcs11u.c

>@@ -377,13 +378,9 @@
>     }
> 
>     isKrl = (PRBool) object->obj.handle == PK11_TOKEN_KRL_HANDLE;
>-    crl = nsslowcert_FindCrlByKey(object->obj.slot->certDB,&object->dbKey,
>-								&url,isKrl);
>-    if (object->obj.objectInfo == NULL) {
>-	object->obj.objectInfo = (void *)crl;
>-	object->obj.infoFree = (PK11Free) pk11_FreeItem;
>-    } else {
>-	if (crl) SECITEM_FreeItem(crl,PR_TRUE);
>+    if (SECSuccess != nsslowcert_FindCrlByKey(object->obj.slot->certDB,&object->dbKey,
>+								&url, NULL, isKrl)) {
>+        return NULL;
>     }
>     return url;
> }

I meant that the new code doesn't have the equivalent of

      if (object->obj.objectInfo == NULL) {
	object->obj.objectInfo = (void *)crl;
	object->obj.infoFree = (PK11Free) pk11_FreeItem;

but I am not sure if this piece of code is necessary.

Bob, could you review Julien's patches?
Comment 26 Robert Relyea 2002-08-14 08:42:13 PDT
I think we may want to consider changing the lookup to return the entry instead.
This will get rid of the memory copy, and deal with the URL case, and still be
more efficient.

The reason for the getURL call was the URL wasn't recorded because I was storing
just the raw CRL, not a CRL structure. If we store the CRLEntry, it will have
both the rawCRL and the URL. And it avoids the data copy.

Right now if you do a getattributes call and pass both the URL and the VALUE,
with Julian's code you get two fetches from the database (and two entry copies).
In the old code you would also get this, but only if you VALUE proceeded URL in
the call.

bob
Comment 27 Robert Relyea 2002-08-14 09:26:02 PDT
Created attachment 95265 [details] [diff] [review]
Better patch for dealing with excessive crl copies

This will elliminate the offending CRL copy altogether. There is one more thing
we can do and that is change the nsslowcert_FindCrlByKey() in pkcs11.c with a
new function nsslowcert_VerifyCrlForKeyExists() call. This call would not even
create an entry because all we want to know is does the CRL exist or not... we
don't need to copy the full CRL. 

It may even be possible to make it even more effecient if we can see if there
is a way to determine that the CRL exists in the database without fetching it
through the berkeley DB structure.
Comment 28 Julien Pierre 2002-08-15 21:02:42 PDT
Bob,

I don't know enough about the softoken code structure at this time to be able to
review your patch, which is more extensive than mine. I have other more urgent
tasks that prevent me from looking into it more closely right now, but I have
verified that it works and solves the memory leak problem.
Comment 29 Julien Pierre 2002-08-15 21:26:15 PDT
FYI, Bob's latest patch also reduced the peak memory usage of the certutil
process in my repeated cert verification test with the 26 MB large CRL to 157 MB
from 182 MB . The savings corresponds to 1x the size of the CRL.
Comment 30 Robert Relyea 2002-08-16 12:24:34 PDT
That follows expectations, since it removes one copy as well as removing the leaks.
Comment 31 Julien Pierre 2002-08-25 07:29:11 PDT
Bob,

The latest patch for this is causing crashes. See bug # 164373 .
Comment 32 Robert Relyea 2002-08-26 09:26:54 PDT
See patch attached to 164373.

bob
Comment 33 Julien Pierre 2002-09-04 16:00:30 PDT
The leak is fixed. The memory usage has also been reduced by Bob's new handling
of large objects. With arena free lists disabled and these changes, the process
peak size before entering the decode function is now 107 MB, which is roughly
equal to 4 copies of the 26 MB CRL + other program memory.

Comment 34 Wan-Teh Chang 2002-10-04 11:37:30 PDT
Comment on attachment 95265 [details] [diff] [review]
Better patch for dealing with excessive crl copies

This patch requires some additional changes,
which were already made on the tip.  Remember
to make these changes if you apply this patch
to a branch.  The changes are described below.

In pkcs11u.c,

>@@ -991,7 +968,7 @@
> static PK11Attribute *
> pk11_FindCrlAttribute(PK11TokenObject *object, CK_ATTRIBUTE_TYPE type)
> {
>-    SECItem *crl;
>+    certDBEntryRevocation *crl;
>     char *url;

The unused variable "char *url;" should be deleted.

>+    crl =  pk11_getCrl(object);
>+    switch (type) {
>     case CKA_NETSCAPE_URL:
>-	url = pk11_getUrl(object);
> 	if (url == NULL) {
> 	    return (PK11Attribute *) &pk11_StaticNullAttr;
> 	}

"url == NULL" should be "crl->url == NULL".
Comment 35 Wan-Teh Chang 2002-10-04 11:44:01 PDT
Comment on attachment 95265 [details] [diff] [review]
Better patch for dealing with excessive crl copies

Bob, please merge this patch on the NSS_3_4_BRANCH.
Thanks.
Comment 36 Robert Relyea 2002-10-04 14:55:26 PDT
so merged;).

bob
Comment 37 Julien Pierre 2002-10-04 19:07:50 PDT
Bob,

Could you also please merge it into NSS_3_5_BRANCH ?
This is needed for testing the fix with the browser, which won't work with 3.4
due to some missing functions. My browser leaks badly within seconds of opening
PSM. Given the number of CRLs I have and the fact that it does 40 cert
verifications and therefore CRL checks for each cert, and does it again for each
repaint, it's very bad. It quickly reaches a half GB of memory usage ...
(I know 3.6 fixes it but I was trying to figure out something that looks like a
regression in 3.6)
Comment 38 Wan-Teh Chang 2002-10-04 19:17:14 PDT
Julien, it's easy to merge Bob's CRL memory leak fix
on NSS_3_5_BRANCH.  Just apply attachment 95265 [details] [diff] [review] and
then make the two changes described in comment #34.
Comment 39 Wan-Teh Chang 2002-10-04 19:18:36 PDT
By the way, the additional change is attachment 96728 [details] [diff] [review].
Comment 40 Julien Pierre 2002-10-04 19:24:12 PDT
Checked in both fixes to NSS_3_5_BRANCH :

Checking in pcert.h;
/cvsroot/mozilla/security/nss/lib/softoken/pcert.h,v  <--  pcert.h
new revision: 1.4.6.1; previous revision: 1.4
done
Checking in pcertdb.c;
/cvsroot/mozilla/security/nss/lib/softoken/pcertdb.c,v  <--  pcertdb.c
new revision: 1.17.2.1; previous revision: 1.17
done
Checking in pkcs11.c;
/cvsroot/mozilla/security/nss/lib/softoken/pkcs11.c,v  <--  pkcs11.c
new revision: 1.45.2.2; previous revision: 1.45.2.1
done
Checking in pkcs11u.c;
/cvsroot/mozilla/security/nss/lib/softoken/pkcs11u.c,v  <--  pkcs11u.c
new revision: 1.30.2.1; previous revision: 1.30
done
Comment 41 Wan-Teh Chang 2002-10-11 16:59:39 PDT
Marked the bug fixed.

Note You need to log in before you can comment on or make changes to this bug.