SSL client with single handshake leaks memory with NSS 3.4

VERIFIED FIXED in 3.4

Status

P1
critical
VERIFIED FIXED
17 years ago
17 years ago

People

(Reporter: julien.pierre, Assigned: bugz)

Tracking

Sun
Solaris
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(4 attachments, 1 obsolete attachment)

(Reporter)

Description

17 years ago
After running the web server GAT client in SSL mode for a few minutes 
on my Sun, it starts swapping because the client has used up all the RAM.
Quickly, the ops/s go down to zero because all the machine is doing is paging.

I have checked this with dbx checkleaks and here is the information I collected :

Possible leaks report  (possible leaks:     366  total size:  755424 bytes)

 Total  Num of  Leaked      Allocation call stack
 Size   Blocks  Block
                Address
======  ====== ==========  =======================================
596496     289      -      PR_Malloc < PL_ArenaAllocate < PORT_ArenaAlloc <
PORT_ArenaZAlloc < SEC_ASN1DecoderStart < SEC_ASN1Decode < SEC_ASN1DecodeItem <
nsslowcert_DecodeDERCertificate < DecodeACert < FindCertByKey <
nsslowcert_FindCertByKey < pk11_getCert < pk11_FindCertAttribute <
pk11_FindTokenAttribute < pk11_FindAttribute < NSC_GetAttributeValue 
156864      76      -      PR_Malloc < PL_ArenaAllocate < PORT_ArenaAlloc <
PORT_ArenaZAlloc < SEC_ASN1DecoderStart < SEC_ASN1Decode < SEC_ASN1DecodeItem <
nsslowcert_DecodeDERCertificate < DecodeACert < FindCertByKey <
nsslowcert_FindCertByKey < nsslowcert_FindCertByIssuerAndSN <
pk11_searchCertsAndTrust < pk11_searchTokenList < NSC_FindObjectsInit <
find_object_by_template 
  2064       1   0x1e9cd8  PR_Malloc < PL_ArenaAllocate < PORT_ArenaAlloc <
EncodeDBCertKey < ReadDBCertEntry < FindCertByKey < nsslowcert_FindCertByKey <
nsslowcert_FindCertByIssuerAndSN < pk11_searchCertsAndTrust <
pk11_searchTokenList < NSC_FindObjectsInit < find_object_by_template <
get_cert_trust_handle < nssToken_FindTrustForCert <
nssTrust_GetCERTCertTrustForCert < fill_CERTCertificateFields
(Reporter)

Updated

17 years ago
Priority: -- → P1
(Reporter)

Comment 1

17 years ago
I put "SSL client" in the description because this leak doesn't appear to be
happening on the server side as far as I can tell, only on the client side. It
isn't per se in the SSL code.

Comment 2

17 years ago
Let's plug these leaks!  I'll see if I can get around
to these tomorrow.  If not, I may have to ask Ian to
help.

Bob, Ian, you can go ahead and look at the leaks in your
code.  Thanks.
Status: NEW → ASSIGNED
Target Milestone: --- → 3.4
(Reporter)

Comment 3

17 years ago
FYI, I have confirmed that there is no leak when I put the 3.3.2 shared libraries
into my client. The memory stays stable under top for over 3 minutes.

With the NSS 3.4 binaries (built 1/31/2001 afternoon), the process grows about 5
MB per minute, eating up all my available RAM in about 5 minutes. So it is
definitely 3.4 related.

I will try running the server with 3.4 binaries for a longer period, hitting it
with a 3.4 client, to make sure there really isn't also a server-side leak.
(Reporter)

Comment 4

17 years ago
I meant I would hit it the server with a non-leaking 3.3 client of course, to
avoid running out of RAM.
(Assignee)

Comment 5

17 years ago
Good catch, Julien.  I've already found one leak, but there seems to be another.
 I'm looking into it.
(Reporter)

Comment 6

17 years ago
My server crashed repeatedly in the overnight test. I was running optimized bits
so I'm not sure if it was because of a leak or not. The core file was rather
large, 460 MB, though, so it is likely the case.

I'm building newer 3.4 bits with your fixes from this morning now and will test
them with both the client and server.
(Reporter)

Comment 7

17 years ago
The client still leaks. The client process size grew from 7MB to 87MB over
lunch. The ops/s were down to zero.
I'll restart the test again with a 3.3.2 client and the latest 3.4 on the server
side.
(Reporter)

Comment 8

17 years ago
To clarify further what my SSL test client was doing :
It was just doing full handshake SSL ops, without client auth, just lots of
them, with many threads, in a loop.
That test is the one that showed a client leak with 3.4 but no leak with 3.3 .

When I try to run the full GAT test suite in continuous stres mode - a test
suite which includes over 400 different tests, including about 25 with client
auth, running one test per client thread - then the client leaks both with NSS
3.3.2 and NSS 3.4 .

Right now I'm just running the full handshake test again with client auth using
3.3.2 client, trying to see if the 3.4 based server will leak or not. I'm not
confident yet about the lack of leak on the server side due to the fact that I
haven't been able to run the test long enough.

What client are we talking about here?

Ian wrote me that this is tstclnt, but it cannot be that, because tstclnt
does only one connection and exits.  
Is it strsclnt?   
Some special GAT client?
(Reporter)

Comment 10

17 years ago
It is the http/https client used in the web server GAT which I wrote, called
httptest.
Typically, when GAT is run in web server tinderboxes, httptest just runs every
test twice, and uses a pool of 50 threads to run all 400+ tests.
I also have used httptest in the past as an SSL stress client, by disabling the
results checking. This is how I have been collecting all my SSL performance data
in the past.
(Reporter)

Comment 11

17 years ago
I have confirmed there is no server leak when running with NSS 3.4 and doing
full handshakes from the client.

Comment 12

17 years ago
Assigned the bug to Ian.  Nelson and Bob will also look for
memory leaks in their code.
Assignee: wtc → ian.mcgreer
Status: ASSIGNED → NEW
(Reporter)

Comment 13

17 years ago
FYI, the dbx checkleaks output with 3.4 as of noon today looks like this :

Actual leaks report    (actual leaks:         0  total size:       0 bytes)

 

Possible leaks report  (possible leaks:     266  total size:  549024 bytes)

 Total  Num of  Leaked      Allocation call stack
 Size   Blocks  Block
                Address
======  ====== ==========  =======================================
470592     228      -      PR_Malloc < PL_ArenaAllocate < PORT_ArenaAlloc <
PORT_ArenaZAlloc < SEC_ASN1DecoderStart < SEC_ASN1Decode < SEC_ASN1DecodeItem <
nsslowcert_DecodeDERCertificate < DecodeACert < FindCertByKey <
nsslowcert_FindCertByKey < pk11_getCert < pk11_FindCertAttribute <
pk11_FindTokenAttribute < pk11_FindAttribute < NSC_GetAttributeValue 
 78432      38      -      PR_Malloc < PL_ArenaAllocate < PORT_ArenaAlloc <
PORT_ArenaZAlloc < SEC_ASN1DecoderStart < SEC_ASN1Decode < SEC_ASN1DecodeItem <
nsslowcert_DecodeDERCertificate < DecodeACert < FindCertByKey <
nsslowcert_FindCertByKey < nsslowcert_FindCertByIssuerAndSN <
pk11_searchCertsAndTrust < pk11_searchTokenList < NSC_FindObjectsInit <
find_object_by_template 


The httptest command I am using to reproduce this single-threaded,
no-client-auth full-handshake test is :
./httptest -s -H 1 -h strange:1890 -P -e 3600 -L 30 -x url51

Comment 14

17 years ago
(Sigh) These leaks look like the old "Arena shift the blame" leaks where the
arena code shifts the leaks from the real leaker who reclaimed an arena free
list object, to the poor chap that originally grabbed it. We may want to run
without reclaimed arena data to find the real culprits.
(Reporter)

Comment 15

17 years ago
Bob, is there a way to disable arenas ?

Comment 16

17 years ago
Not disable. I've been meaning to write a patch which allows NSS to control
whether arena's for freed or placed on a free list based on an environment
variable. Currently I've just been hacking secport so that it calls
PL_ArenaXXXFinishXXX rather than PL_ArenaXXXFreeXXX in PORT_FreeArena.

bob
Julien,

In this bug and in 123081, can you get the dbx checkleak stacks to be 
longer and more complete?  NSS stacks tend to be pretty deep, and it 
is possible (indeed, likely, IMO) that the code responsible for the 
leak is not being shown in these short checkleak stacks.  
(Reporter)

Comment 18

17 years ago
Nelson,

Unfortunately I can't. dbx checkleaks can only match up to 16 levels deep in the
stack, and that's what I had it set to.
(Reporter)

Updated

17 years ago
Blocks: 123094
Created attachment 67546 [details] [diff] [review]
Patch to disable arena free list with an evironment variable

This patch will check for the presence of the environment variable
NSS_DISABLE_ARENA_FREE_LIST in debug builds on platforms that have 
environment vairables, and will disable the arena free list if found.  

Maybe it should be for both debug and optimized builds?

Comment 20

17 years ago
I would vote for that.

bob

Comment 21

17 years ago
Comment on attachment 67546 [details] [diff] [review]
Patch to disable arena free list with an evironment variable

>+#if defined(DEBUG) && ( defined(XP_UNIX) || defined(XP_WIN32) )
>+	    char *ev = getenv("NSS_DISABLE_ARENA_FREE_LIST");
>+	    if (!ev)
>+#endif /* DEBUG */

You can just say:

#include "prenv.h"

#ifdef DEBUG
    char *ev = PR_GetEnv("NSS_DISABLE_ARENA_FREE_LIST");
    if (!ev)
#endif /* DEBUG */
Attachment #67546 - Flags: needs-work+
(Reporter)

Comment 22

17 years ago
Here are the latest results for this case with arenas disabled :

(dbx) showleaks
Checking for memory leaks...

Actual leaks report    (actual leaks:         0  total size:       0 bytes)

 

Possible leaks report  (possible leaks:     277  total size:  571728 bytes)

 Total  Num of  Leaked      Allocation call stack
 Size   Blocks  Block
                Address
======  ====== ==========  =======================================
127968      62      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nssList_Create < nssList_Clone <
nssList_CreateIterator < nssTrust_GetCERTCertTrustForCert <
fill_CERTCertificateFields < STAN_GetCERTCertificate < __CERT_NewTempCertificate
< ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake <
ssl3_HandleRecord < ssl3_GatherCompleteHandshake 
119712      58      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nssUTF8_Duplicate < add_nickname_entry <
add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake <
ssl3_HandleRecord < ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
119712      58      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nssItem_Create < nssItem_Duplicate <
add_subject_entry < add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake <
ssl3_HandleRecord < ssl3_GatherCompleteHandshake 
117648      57      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < new_cache_entry < add_subject_entry <
add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake <
ssl3_HandleRecord < ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
 26832      13      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < add_certificate_entry <
nssCertificateStore_Add < NSSCryptoContext_ImportCertificate <
__CERT_NewTempCertificate < ssl3_HandleCertificate < ssl3_HandleHandshakeMessage
< ssl3_HandleHandshake < ssl3_HandleRecord < ssl3_GatherCompleteHandshake <
ssl_GatherRecord1stHandshake < ssl_Do1stHandshake < ssl_SecureSend 
 26832      13      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_certificate_entry <
nssCertificateStore_Add < NSSCryptoContext_ImportCertificate <
__CERT_NewTempCertificate < ssl3_HandleCertificate < ssl3_HandleHandshakeMessage
< ssl3_HandleHandshake < ssl3_HandleRecord 
 26832      13      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nsslist_add_element < nssList_AddUnique <
add_subject_entry < nssCertificateStore_Add < NSSCryptoContext_ImportCertificate
< __CERT_NewTempCertificate < ssl3_HandleCertificate <
ssl3_HandleHandshakeMessage < ssl3_HandleHandshake < ssl3_HandleRecord <
ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
  2064       1    0xe4fc8  PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_issuer_and_serial_entry
< add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
  2064       1    0xfcfc8  PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_subject_entry <
add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
  2064       1    0xff820  PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nsslist_add_element < nssList_AddUnique <
add_subject_entry < add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake <
ssl3_HandleRecord < ssl3_GatherCompleteHandshake 
 
(dbx) 
(Reporter)

Comment 23

17 years ago
Additional results after exactly 1000 new SSL sessions, full handshake, no
client auth.

(dbx) showleaks
Checking for memory leaks...

Actual leaks report    (actual leaks:         0  total size:       0 bytes)

 

Possible leaks report  (possible leaks:     205  total size:  423120 bytes)

 Total  Num of  Leaked      Allocation call stack
 Size   Blocks  Block
                Address
======  ====== ==========  =======================================
107328      52      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nssItem_Create < nssItem_Duplicate <
add_subject_entry < add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake <
ssl3_HandleRecord < ssl3_GatherCompleteHandshake 
105264      51      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < new_cache_entry < add_subject_entry <
add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake <
ssl3_HandleRecord < ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
105264      51      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nssUTF8_Duplicate < add_nickname_entry <
add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake <
ssl3_HandleRecord < ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
 24768      12      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nssList_CreateIterator <
nssTrust_GetCERTCertTrustForCert < fill_CERTCertificateFields <
STAN_GetCERTCertificate < __CERT_NewTempCertificate < ssl3_HandleCertificate <
ssl3_HandleHandshakeMessage < ssl3_HandleHandshake < ssl3_HandleRecord <
ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake < ssl_Do1stHandshake 
 24768      12      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_certificate_entry <
nssCertificateStore_Add < NSSCryptoContext_ImportCertificate <
__CERT_NewTempCertificate < ssl3_HandleCertificate < ssl3_HandleHandshakeMessage
< ssl3_HandleHandshake < ssl3_HandleRecord 
 24768      12      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < add_certificate_entry <
nssCertificateStore_Add < NSSCryptoContext_ImportCertificate <
__CERT_NewTempCertificate < ssl3_HandleCertificate < ssl3_HandleHandshakeMessage
< ssl3_HandleHandshake < ssl3_HandleRecord < ssl3_GatherCompleteHandshake <
ssl_GatherRecord1stHandshake < ssl_Do1stHandshake < ssl_SecureSend 
 22704      11      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nsslist_add_element < nssList_AddUnique <
add_subject_entry < nssCertificateStore_Add < NSSCryptoContext_ImportCertificate
< __CERT_NewTempCertificate < ssl3_HandleCertificate <
ssl3_HandleHandshakeMessage < ssl3_HandleHandshake < ssl3_HandleRecord <
ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
  2064       1    0xed4b0  PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_issuer_and_serial_entry
< add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
  2064       1    0xffc38  PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_subject_entry <
add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
  2064       1   0x101478  PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < new_cache_entry <
add_issuer_and_serial_entry < add_cert_to_cache < nssTrustDomain_AddCertsToCache
< NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate
< ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake <
ssl3_HandleRecord < ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
  2064       1   0x100460  PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nsslist_add_element < nssList_AddUnique <
add_subject_entry < add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake <
ssl3_HandleRecord < ssl3_GatherCompleteHandshake 
Summary: memory leak in SSL client → memory leak in SSL client in NSS 3.4
(Reporter)

Comment 24

17 years ago
Created attachment 67988 [details] [diff] [review]
better patch to disable arena free lists with an environment variable
(Reporter)

Updated

17 years ago
Attachment #67546 - Attachment is obsolete: true
(Reporter)

Comment 25

17 years ago
Output after 1000 runs , with patch 67989 applied:

(dbx) sl
Checking for memory leaks...

Actual leaks report    (actual leaks:         0  total size:       0 bytes)

 

Possible leaks report  (possible leaks:      69  total size:  142416 bytes)

 Total  Num of  Leaked      Allocation call stack
 Size   Blocks  Block
                Address
======  ====== ==========  =======================================
 24768      12      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_subject_entry <
add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
 24768      12      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_nickname_entry <
add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
 24768      12      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < remove_subject_entry <
nssCertificateStore_Remove < CERT_DestroyCertificate < ssl_DestroySID <
ssl_FreeLockedSID < ssl_FreeSID < ssl_DestroySecurityInfo < ssl_FreeSocket 
 22704      11      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_issuer_and_serial_entry
< add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
 22704      11      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_certificate_entry <
nssCertificateStore_Add < NSSCryptoContext_ImportCertificate <
__CERT_NewTempCertificate < ssl3_HandleCertificate < ssl3_HandleHandshakeMessage
< ssl3_HandleHandshake < ssl3_HandleRecord 
 22704      11      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nsslist_add_element < nssList_AddUnique <
add_subject_entry < nssCertificateStore_Add < NSSCryptoContext_ImportCertificate
< __CERT_NewTempCertificate < ssl3_HandleCertificate <
ssl3_HandleHandshakeMessage < ssl3_HandleHandshake < ssl3_HandleRecord <
ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
 
(dbx) 
(Reporter)

Comment 26

17 years ago
After applying 

r1.6 of /cvsroot/mozilla/security/nss/lib/pki/pkistore.c,v 

and

r 1.24 of /cvsroot/mozilla/security/nss/lib/pki/tdcache.c,v  

The leak in this test is going down :

(dbx) sl
Checking for memory leaks...

Actual leaks report    (actual leaks:         0  total size:       0 bytes)

 

Possible leaks report  (possible leaks:      64  total size:  132096 bytes)

 Total  Num of  Leaked      Allocation call stack
 Size   Blocks  Block
                Address
======  ====== ==========  =======================================
 22704      11      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_subject_entry <
add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
 22704      11      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_nickname_entry <
add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
 22704      11      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < remove_subject_entry <
nssCertificateStore_Remove < CERT_DestroyCertificate < ssl_DestroySID <
ssl_FreeLockedSID < ssl_FreeSID < ssl_DestroySecurityInfo < ssl_FreeSocket 
 22704      11      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nsslist_add_element < nssList_AddUnique <
add_subject_entry < nssCertificateStore_Add < NSSCryptoContext_ImportCertificate
< __CERT_NewTempCertificate < ssl3_HandleCertificate <
ssl3_HandleHandshakeMessage < ssl3_HandleHandshake < ssl3_HandleRecord <
ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
 20640      10      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_issuer_and_serial_entry
< add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
 20640      10      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_certificate_entry <
nssCertificateStore_Add < NSSCryptoContext_ImportCertificate <
__CERT_NewTempCertificate < ssl3_HandleCertificate < ssl3_HandleHandshakeMessage
< ssl3_HandleHandshake < ssl3_HandleRecord 
 
(dbx) 
(Reporter)

Comment 27

17 years ago
oops, sorry, there wasn't much difference with that update. I thought it had one
less stack, but it's the same. The number of bytes leaked is comparable.

Comment 28

17 years ago
Comment on attachment 67988 [details] [diff] [review]
better patch to disable arena free lists with an environment variable

>-    static PRBool  doFreeArenaPool;
>+    static PRBool  doFreeArenaPool = PR_FALSE;

Static variables are initialized to 0 by the compilers.
So this change is not necessary.  Sorry, couldn't resist :-)

r=wtc.	Thanks, Julien.
Attachment #67988 - Flags: review+
(Reporter)

Comment 29

17 years ago
This is the dbx output with the double handhshake test :
Checking for memory leaks...

Actual leaks report    (actual leaks:         0  total size:       0 bytes)

 

Possible leaks report  (possible leaks:     141  total size:  291024 bytes)

 Total  Num of  Leaked      Allocation call stack
 Size   Blocks  Block
                Address
======  ====== ==========  =======================================
 49536      24      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_subject_entry <
add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
 49536      24      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_issuer_and_serial_entry
< add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
 47472      23      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_certificate_entry <
nssCertificateStore_Add < NSSCryptoContext_ImportCertificate <
__CERT_NewTempCertificate < ssl3_HandleCertificate < ssl3_HandleHandshakeMessage
< ssl3_HandleHandshake < ssl3_HandleRecord 
 47472      23      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_nickname_entry <
add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
 24768      12      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < remove_subject_entry <
nssCertificateStore_Remove < CERT_DestroyCertificate < ssl_DestroySID <
ssl_FreeLockedSID < ssl_FreeSID < ssl_DestroySecurityInfo < ssl_FreeSocket 
 24768      12      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nsslist_add_element < nssList_AddUnique <
add_subject_entry < nssCertificateStore_Add < NSSCryptoContext_ImportCertificate
< __CERT_NewTempCertificate < ssl3_HandleCertificate <
ssl3_HandleHandshakeMessage < ssl3_HandleHandshake < ssl3_HandleRecord <
ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
 24768      12      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < remove_subject_entry <
nssCertificateStore_Remove < CERT_DestroyCertificate < ssl3_HandleCertificate <
ssl3_HandleHandshakeMessage < ssl3_HandleHandshake < ssl3_HandleRecord <
ssl3_GatherCompleteHandshake 
 22704      11      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nsslist_add_element < nssList_AddUnique <
add_subject_entry < nssCertificateStore_Add < NSSCryptoContext_ImportCertificate
< __CERT_NewTempCertificate < ssl3_HandleCertificate <
ssl3_HandleHandshakeMessage < ssl3_HandleHandshake < ssl3_HandleRecord <
ssl3_GatherCompleteHandshake < ssl3_GatherAppDataRecord 
 
(dbx) 
(Reporter)

Comment 30

17 years ago
As shown in bug 123081, this leak occurs in the first SSL handshake on a
session, regardless of whether client auth is used. The dbx output after 1000
tests shows the same stacks (though not exactly the same numbers) for both
cases, so I'm assimilating both cases into this defect, not specifying in the
title whether client auth is used or not as it makes no difference for this leak.
Summary: memory leak in SSL client in NSS 3.4 → SSL client with single handshake leaks memory with NSS 3.4
(Reporter)

Updated

17 years ago
Blocks: 123081

Comment 31

17 years ago
Created attachment 68059 [details] [diff] [review]
Allocate hash entries from the heap rather than the arena so they can be freed on the fly

This patch gets rid of all the leaks in PK11_ListCerts except those that have
to do with the built-in token (which, unfortunately, is still pretty
substantial).
(Reporter)

Comment 32

17 years ago
Here are the results after 1000 passes. I started the measurements at the
beginning of the 3rd pass. There are still some leaks, but not on every SSL
session/handshake.

(dbx) sl
Checking for memory leaks...

Actual leaks report    (actual leaks:         0  total size:       0 bytes)

 

Possible leaks report  (possible leaks:      12  total size:   22728 bytes)

 Total  Num of  Leaked      Allocation call stack
 Size   Blocks  Block
                Address
======  ====== ==========  =======================================
 22704      11      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nsslist_add_element < nssList_AddUnique <
add_subject_entry < nssCertificateStore_Add < NSSCryptoContext_ImportCertificate
< __CERT_NewTempCertificate < ssl3_HandleCertificate <
ssl3_HandleHandshakeMessage < ssl3_HandleHandshake < ssl3_HandleRecord <
ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
    24       1    0xdbb88  calloc < PR_Calloc < nss_ZAlloc <
nss_arena_hash_alloc_entry < PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add
< remove_subject_entry < nssCertificateStore_Remove < CERT_DestroyCertificate <
ssl_DestroySID < ssl_FreeLockedSID < ssl_FreeSID < ssl_DestroySecurityInfo <
ssl_FreeSocket < ssl_DefClose 
(Assignee)

Comment 33

17 years ago
Created attachment 68204 [details] [diff] [review]
debug output for subject list in temp store

Julien-

Couple things.	First, I reviewed pkistore.c again and found one more misuse of
an arena.  I fixed it, see 

/cvsroot/mozilla/security/nss/lib/pki/pkistore.c,v  <--  pkistore.c
new revision: 1.7; previous revision: 1.6

I'm not sure if this is related to the leak.  It could be, but I'm not
convinced it is.

The other thing is this attachment.  This is the only way I can think of trying
to track the leak down.  If you could apply it and then attach the output, that
should be useful.
(Reporter)

Comment 34

17 years ago
Ian,

This patch causes my client to be much too slow to be usable due to the printfs
being executed for each test iteration (or more frequently, I don't know).
FYI, it looks like this and just continuously scrolls :

list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
(Reporter)

Comment 35

17 years ago
After 1000 runs without client auth, and with the built-in root cert module
deleted, arena free lists disabled, here is the output of dbx checkleaks :

Possible leaks report  (possible leaks:      12  total size:   22728 bytes)

 Total  Num of  Leaked      Allocation call stack
 Size   Blocks  Block
                Address
======  ====== ==========  =======================================
 22704      11      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nsslist_add_element < nssList_AddUnique <
add_subject_entry < nssCertificateStore_Add < NSSCryptoContext_ImportCertificate
< __CERT_NewTempCertificate < ssl3_HandleCertificate <
ssl3_HandleHandshakeMessage < ssl3_HandleHandshake < ssl3_HandleRecord <
ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
    24       1    0xf9b88  calloc < PR_Calloc < nss_ZAlloc <
nss_arena_hash_alloc_entry < PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add
< remove_subject_entry < nssCertificateStore_Remove < CERT_DestroyCertificate <
ssl_DestroySID < ssl_FreeLockedSID < ssl_FreeSID < ssl_DestroySecurityInfo <
ssl_FreeSocket < ssl_DefClose 
 
(dbx) 
(Reporter)

Comment 36

17 years ago
I just did a 1000 runs with client auth in the first handshake (required on the
server). Again this is without the root certs module library in the client.
The leak was exactly the same as without client auth, as pasted above :

(dbx) sl
Checking for memory leaks...

Actual leaks report    (actual leaks:         0  total size:       0 bytes)

 

Possible leaks report  (possible leaks:      12  total size:   22728 bytes)

 Total  Num of  Leaked      Allocation call stack
 Size   Blocks  Block
                Address
======  ====== ==========  =======================================
 22704      11      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nsslist_add_element < nssList_AddUnique <
add_subject_entry < nssCertificateStore_Add < NSSCryptoContext_ImportCertificate
< __CERT_NewTempCertificate < ssl3_HandleCertificate <
ssl3_HandleHandshakeMessage < ssl3_HandleHandshake < ssl3_HandleRecord <
ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
    24       1    0xe9de8  calloc < PR_Calloc < nss_ZAlloc <
nss_arena_hash_alloc_entry < PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add
< remove_subject_entry < nssCertificateStore_Remove < CERT_DestroyCertificate <
ssl_DestroySID < ssl_FreeLockedSID < ssl_FreeSID < ssl_DestroySecurityInfo <
ssl_FreeSocket < ssl_DefClose 
(Reporter)

Comment 37

17 years ago
I reran both single-handshake tests - without and with client auth - this time
with the built-in module installed. The leak was still exactly the same.
Therefore I conclude the built-ins are not the cause of this leak.
(Assignee)

Comment 38

17 years ago
Created attachment 68245 [details] [diff] [review]
attempt to fix uniqueness of temp certs

The printf output showed certs (or more likely, the same cert) flying in and
out of the temporary store, while the same subject list never goes away.  This
explains the leak.

I believe I know why this happens.  CERT_NewTempCertificate is not looking in
the current temp store for the cert, thus it has the potential to create
multiple instances of the same cert.  These instances could get dumped in the
store, creating a subject list of the same cert over and over.

This patch attempts to address both issues.  That is, CERT_NewTempCertificate
should make sure no temp certs already exist for the DER.  And secondly, the
temp store should use a stronger comparison than pointer address to determine
if a cert is already in a subject list.

Julien, go ahead and give this a shot when you have the change (and a working
network).
(Assignee)

Comment 39

17 years ago
I checked in the certdb portion,

/cvsroot/mozilla/security/nss/lib/certdb/stanpcertdb.c,v  <--  stanpcertdb.c
new revision: 1.30; previous revision: 1.29


This change is needed whether it fixes this bug or not, but I believe it will
fix this bug.
(Reporter)

Comment 40

17 years ago
While I haven't done extensive testing in the debugger yet, it would appear from
the top output that the client's memory usage indeed stabilizes, both without
client auth and with single-handshake client auth. I will do the full
verification tomorrow and mark this bug fixed if that's the case.
(Reporter)

Comment 41

17 years ago
When running the client in single-threaded mode with client auth, there is no
leak. I ran a stress test for 3h during a meeting , for 68814 sessions on
Solaris. The top memory usage for httptest was exactly the same at the beginning
of my test as at the end.

I also ran the same test on my NT box, for 80675 sessions and the same duration,
and saw no leak there either.

(Assignee)

Comment 42

17 years ago
Both Wan-Teh and I have verified this bug is fixed.  This includes testing under
Purify/BoundsChecker and running a client for several hours.

Marking fixed.
(Assignee)

Comment 43

17 years ago
oops, did everything but change the state...
Status: NEW → RESOLVED
Last Resolved: 17 years ago
Resolution: --- → FIXED

Comment 44

17 years ago
Marked the bug verified per Julien's comment in
http://bugzilla.mozilla.org/show_bug.cgi?id=125149#c41.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.