Closed Bug 122907 Opened 23 years ago Closed 22 years ago

SSL client with single handshake leaks memory with NSS 3.4

Categories

(NSS :: Libraries, defect, P1)

Sun
Solaris
defect

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: julien.pierre, Assigned: bugz)

References

Details

Attachments

(4 files, 1 obsolete file)

After running the web server GAT client in SSL mode for a few minutes 
on my Sun, it starts swapping because the client has used up all the RAM.
Quickly, the ops/s go down to zero because all the machine is doing is paging.

I have checked this with dbx checkleaks and here is the information I collected :

Possible leaks report  (possible leaks:     366  total size:  755424 bytes)

 Total  Num of  Leaked      Allocation call stack
 Size   Blocks  Block
                Address
======  ====== ==========  =======================================
596496     289      -      PR_Malloc < PL_ArenaAllocate < PORT_ArenaAlloc <
PORT_ArenaZAlloc < SEC_ASN1DecoderStart < SEC_ASN1Decode < SEC_ASN1DecodeItem <
nsslowcert_DecodeDERCertificate < DecodeACert < FindCertByKey <
nsslowcert_FindCertByKey < pk11_getCert < pk11_FindCertAttribute <
pk11_FindTokenAttribute < pk11_FindAttribute < NSC_GetAttributeValue 
156864      76      -      PR_Malloc < PL_ArenaAllocate < PORT_ArenaAlloc <
PORT_ArenaZAlloc < SEC_ASN1DecoderStart < SEC_ASN1Decode < SEC_ASN1DecodeItem <
nsslowcert_DecodeDERCertificate < DecodeACert < FindCertByKey <
nsslowcert_FindCertByKey < nsslowcert_FindCertByIssuerAndSN <
pk11_searchCertsAndTrust < pk11_searchTokenList < NSC_FindObjectsInit <
find_object_by_template 
  2064       1   0x1e9cd8  PR_Malloc < PL_ArenaAllocate < PORT_ArenaAlloc <
EncodeDBCertKey < ReadDBCertEntry < FindCertByKey < nsslowcert_FindCertByKey <
nsslowcert_FindCertByIssuerAndSN < pk11_searchCertsAndTrust <
pk11_searchTokenList < NSC_FindObjectsInit < find_object_by_template <
get_cert_trust_handle < nssToken_FindTrustForCert <
nssTrust_GetCERTCertTrustForCert < fill_CERTCertificateFields
Priority: -- → P1
I put "SSL client" in the description because this leak doesn't appear to be
happening on the server side as far as I can tell, only on the client side. It
isn't per se in the SSL code.
Let's plug these leaks!  I'll see if I can get around
to these tomorrow.  If not, I may have to ask Ian to
help.

Bob, Ian, you can go ahead and look at the leaks in your
code.  Thanks.
Status: NEW → ASSIGNED
Target Milestone: --- → 3.4
FYI, I have confirmed that there is no leak when I put the 3.3.2 shared libraries
into my client. The memory stays stable under top for over 3 minutes.

With the NSS 3.4 binaries (built 1/31/2001 afternoon), the process grows about 5
MB per minute, eating up all my available RAM in about 5 minutes. So it is
definitely 3.4 related.

I will try running the server with 3.4 binaries for a longer period, hitting it
with a 3.4 client, to make sure there really isn't also a server-side leak.
I meant I would hit it the server with a non-leaking 3.3 client of course, to
avoid running out of RAM.
Good catch, Julien.  I've already found one leak, but there seems to be another.
 I'm looking into it.
My server crashed repeatedly in the overnight test. I was running optimized bits
so I'm not sure if it was because of a leak or not. The core file was rather
large, 460 MB, though, so it is likely the case.

I'm building newer 3.4 bits with your fixes from this morning now and will test
them with both the client and server.
The client still leaks. The client process size grew from 7MB to 87MB over
lunch. The ops/s were down to zero.
I'll restart the test again with a 3.3.2 client and the latest 3.4 on the server
side.
To clarify further what my SSL test client was doing :
It was just doing full handshake SSL ops, without client auth, just lots of
them, with many threads, in a loop.
That test is the one that showed a client leak with 3.4 but no leak with 3.3 .

When I try to run the full GAT test suite in continuous stres mode - a test
suite which includes over 400 different tests, including about 25 with client
auth, running one test per client thread - then the client leaks both with NSS
3.3.2 and NSS 3.4 .

Right now I'm just running the full handshake test again with client auth using
3.3.2 client, trying to see if the 3.4 based server will leak or not. I'm not
confident yet about the lack of leak on the server side due to the fact that I
haven't been able to run the test long enough.

What client are we talking about here?

Ian wrote me that this is tstclnt, but it cannot be that, because tstclnt
does only one connection and exits.  
Is it strsclnt?   
Some special GAT client?
It is the http/https client used in the web server GAT which I wrote, called
httptest.
Typically, when GAT is run in web server tinderboxes, httptest just runs every
test twice, and uses a pool of 50 threads to run all 400+ tests.
I also have used httptest in the past as an SSL stress client, by disabling the
results checking. This is how I have been collecting all my SSL performance data
in the past.
I have confirmed there is no server leak when running with NSS 3.4 and doing
full handshakes from the client.
Assigned the bug to Ian.  Nelson and Bob will also look for
memory leaks in their code.
Assignee: wtc → ian.mcgreer
Status: ASSIGNED → NEW
FYI, the dbx checkleaks output with 3.4 as of noon today looks like this :

Actual leaks report    (actual leaks:         0  total size:       0 bytes)

 

Possible leaks report  (possible leaks:     266  total size:  549024 bytes)

 Total  Num of  Leaked      Allocation call stack
 Size   Blocks  Block
                Address
======  ====== ==========  =======================================
470592     228      -      PR_Malloc < PL_ArenaAllocate < PORT_ArenaAlloc <
PORT_ArenaZAlloc < SEC_ASN1DecoderStart < SEC_ASN1Decode < SEC_ASN1DecodeItem <
nsslowcert_DecodeDERCertificate < DecodeACert < FindCertByKey <
nsslowcert_FindCertByKey < pk11_getCert < pk11_FindCertAttribute <
pk11_FindTokenAttribute < pk11_FindAttribute < NSC_GetAttributeValue 
 78432      38      -      PR_Malloc < PL_ArenaAllocate < PORT_ArenaAlloc <
PORT_ArenaZAlloc < SEC_ASN1DecoderStart < SEC_ASN1Decode < SEC_ASN1DecodeItem <
nsslowcert_DecodeDERCertificate < DecodeACert < FindCertByKey <
nsslowcert_FindCertByKey < nsslowcert_FindCertByIssuerAndSN <
pk11_searchCertsAndTrust < pk11_searchTokenList < NSC_FindObjectsInit <
find_object_by_template 


The httptest command I am using to reproduce this single-threaded,
no-client-auth full-handshake test is :
./httptest -s -H 1 -h strange:1890 -P -e 3600 -L 30 -x url51
(Sigh) These leaks look like the old "Arena shift the blame" leaks where the
arena code shifts the leaks from the real leaker who reclaimed an arena free
list object, to the poor chap that originally grabbed it. We may want to run
without reclaimed arena data to find the real culprits.
Bob, is there a way to disable arenas ?
Not disable. I've been meaning to write a patch which allows NSS to control
whether arena's for freed or placed on a free list based on an environment
variable. Currently I've just been hacking secport so that it calls
PL_ArenaXXXFinishXXX rather than PL_ArenaXXXFreeXXX in PORT_FreeArena.

bob
Julien,

In this bug and in 123081, can you get the dbx checkleak stacks to be 
longer and more complete?  NSS stacks tend to be pretty deep, and it 
is possible (indeed, likely, IMO) that the code responsible for the 
leak is not being shown in these short checkleak stacks.  
Nelson,

Unfortunately I can't. dbx checkleaks can only match up to 16 levels deep in the
stack, and that's what I had it set to.
Blocks: 123094
This patch will check for the presence of the environment variable
NSS_DISABLE_ARENA_FREE_LIST in debug builds on platforms that have 
environment vairables, and will disable the arena free list if found.  

Maybe it should be for both debug and optimized builds?
I would vote for that.

bob
Comment on attachment 67546 [details] [diff] [review]
Patch to disable arena free list with an evironment variable

>+#if defined(DEBUG) && ( defined(XP_UNIX) || defined(XP_WIN32) )
>+	    char *ev = getenv("NSS_DISABLE_ARENA_FREE_LIST");
>+	    if (!ev)
>+#endif /* DEBUG */

You can just say:

#include "prenv.h"

#ifdef DEBUG
    char *ev = PR_GetEnv("NSS_DISABLE_ARENA_FREE_LIST");
    if (!ev)
#endif /* DEBUG */
Attachment #67546 - Flags: needs-work+
Here are the latest results for this case with arenas disabled :

(dbx) showleaks
Checking for memory leaks...

Actual leaks report    (actual leaks:         0  total size:       0 bytes)

 

Possible leaks report  (possible leaks:     277  total size:  571728 bytes)

 Total  Num of  Leaked      Allocation call stack
 Size   Blocks  Block
                Address
======  ====== ==========  =======================================
127968      62      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nssList_Create < nssList_Clone <
nssList_CreateIterator < nssTrust_GetCERTCertTrustForCert <
fill_CERTCertificateFields < STAN_GetCERTCertificate < __CERT_NewTempCertificate
< ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake <
ssl3_HandleRecord < ssl3_GatherCompleteHandshake 
119712      58      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nssUTF8_Duplicate < add_nickname_entry <
add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake <
ssl3_HandleRecord < ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
119712      58      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nssItem_Create < nssItem_Duplicate <
add_subject_entry < add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake <
ssl3_HandleRecord < ssl3_GatherCompleteHandshake 
117648      57      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < new_cache_entry < add_subject_entry <
add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake <
ssl3_HandleRecord < ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
 26832      13      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < add_certificate_entry <
nssCertificateStore_Add < NSSCryptoContext_ImportCertificate <
__CERT_NewTempCertificate < ssl3_HandleCertificate < ssl3_HandleHandshakeMessage
< ssl3_HandleHandshake < ssl3_HandleRecord < ssl3_GatherCompleteHandshake <
ssl_GatherRecord1stHandshake < ssl_Do1stHandshake < ssl_SecureSend 
 26832      13      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_certificate_entry <
nssCertificateStore_Add < NSSCryptoContext_ImportCertificate <
__CERT_NewTempCertificate < ssl3_HandleCertificate < ssl3_HandleHandshakeMessage
< ssl3_HandleHandshake < ssl3_HandleRecord 
 26832      13      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nsslist_add_element < nssList_AddUnique <
add_subject_entry < nssCertificateStore_Add < NSSCryptoContext_ImportCertificate
< __CERT_NewTempCertificate < ssl3_HandleCertificate <
ssl3_HandleHandshakeMessage < ssl3_HandleHandshake < ssl3_HandleRecord <
ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
  2064       1    0xe4fc8  PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_issuer_and_serial_entry
< add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
  2064       1    0xfcfc8  PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_subject_entry <
add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
  2064       1    0xff820  PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nsslist_add_element < nssList_AddUnique <
add_subject_entry < add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake <
ssl3_HandleRecord < ssl3_GatherCompleteHandshake 
 
(dbx) 
Additional results after exactly 1000 new SSL sessions, full handshake, no
client auth.

(dbx) showleaks
Checking for memory leaks...

Actual leaks report    (actual leaks:         0  total size:       0 bytes)

 

Possible leaks report  (possible leaks:     205  total size:  423120 bytes)

 Total  Num of  Leaked      Allocation call stack
 Size   Blocks  Block
                Address
======  ====== ==========  =======================================
107328      52      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nssItem_Create < nssItem_Duplicate <
add_subject_entry < add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake <
ssl3_HandleRecord < ssl3_GatherCompleteHandshake 
105264      51      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < new_cache_entry < add_subject_entry <
add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake <
ssl3_HandleRecord < ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
105264      51      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nssUTF8_Duplicate < add_nickname_entry <
add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake <
ssl3_HandleRecord < ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
 24768      12      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nssList_CreateIterator <
nssTrust_GetCERTCertTrustForCert < fill_CERTCertificateFields <
STAN_GetCERTCertificate < __CERT_NewTempCertificate < ssl3_HandleCertificate <
ssl3_HandleHandshakeMessage < ssl3_HandleHandshake < ssl3_HandleRecord <
ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake < ssl_Do1stHandshake 
 24768      12      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_certificate_entry <
nssCertificateStore_Add < NSSCryptoContext_ImportCertificate <
__CERT_NewTempCertificate < ssl3_HandleCertificate < ssl3_HandleHandshakeMessage
< ssl3_HandleHandshake < ssl3_HandleRecord 
 24768      12      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < add_certificate_entry <
nssCertificateStore_Add < NSSCryptoContext_ImportCertificate <
__CERT_NewTempCertificate < ssl3_HandleCertificate < ssl3_HandleHandshakeMessage
< ssl3_HandleHandshake < ssl3_HandleRecord < ssl3_GatherCompleteHandshake <
ssl_GatherRecord1stHandshake < ssl_Do1stHandshake < ssl_SecureSend 
 22704      11      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nsslist_add_element < nssList_AddUnique <
add_subject_entry < nssCertificateStore_Add < NSSCryptoContext_ImportCertificate
< __CERT_NewTempCertificate < ssl3_HandleCertificate <
ssl3_HandleHandshakeMessage < ssl3_HandleHandshake < ssl3_HandleRecord <
ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
  2064       1    0xed4b0  PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_issuer_and_serial_entry
< add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
  2064       1    0xffc38  PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_subject_entry <
add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
  2064       1   0x101478  PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < new_cache_entry <
add_issuer_and_serial_entry < add_cert_to_cache < nssTrustDomain_AddCertsToCache
< NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate
< ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake <
ssl3_HandleRecord < ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
  2064       1   0x100460  PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nsslist_add_element < nssList_AddUnique <
add_subject_entry < add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake <
ssl3_HandleRecord < ssl3_GatherCompleteHandshake 
Summary: memory leak in SSL client → memory leak in SSL client in NSS 3.4
Attachment #67546 - Attachment is obsolete: true
Output after 1000 runs , with patch 67989 applied:

(dbx) sl
Checking for memory leaks...

Actual leaks report    (actual leaks:         0  total size:       0 bytes)

 

Possible leaks report  (possible leaks:      69  total size:  142416 bytes)

 Total  Num of  Leaked      Allocation call stack
 Size   Blocks  Block
                Address
======  ====== ==========  =======================================
 24768      12      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_subject_entry <
add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
 24768      12      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_nickname_entry <
add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
 24768      12      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < remove_subject_entry <
nssCertificateStore_Remove < CERT_DestroyCertificate < ssl_DestroySID <
ssl_FreeLockedSID < ssl_FreeSID < ssl_DestroySecurityInfo < ssl_FreeSocket 
 22704      11      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_issuer_and_serial_entry
< add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
 22704      11      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_certificate_entry <
nssCertificateStore_Add < NSSCryptoContext_ImportCertificate <
__CERT_NewTempCertificate < ssl3_HandleCertificate < ssl3_HandleHandshakeMessage
< ssl3_HandleHandshake < ssl3_HandleRecord 
 22704      11      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nsslist_add_element < nssList_AddUnique <
add_subject_entry < nssCertificateStore_Add < NSSCryptoContext_ImportCertificate
< __CERT_NewTempCertificate < ssl3_HandleCertificate <
ssl3_HandleHandshakeMessage < ssl3_HandleHandshake < ssl3_HandleRecord <
ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
 
(dbx) 
After applying 

r1.6 of /cvsroot/mozilla/security/nss/lib/pki/pkistore.c,v 

and

r 1.24 of /cvsroot/mozilla/security/nss/lib/pki/tdcache.c,v  

The leak in this test is going down :

(dbx) sl
Checking for memory leaks...

Actual leaks report    (actual leaks:         0  total size:       0 bytes)

 

Possible leaks report  (possible leaks:      64  total size:  132096 bytes)

 Total  Num of  Leaked      Allocation call stack
 Size   Blocks  Block
                Address
======  ====== ==========  =======================================
 22704      11      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_subject_entry <
add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
 22704      11      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_nickname_entry <
add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
 22704      11      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < remove_subject_entry <
nssCertificateStore_Remove < CERT_DestroyCertificate < ssl_DestroySID <
ssl_FreeLockedSID < ssl_FreeSID < ssl_DestroySecurityInfo < ssl_FreeSocket 
 22704      11      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nsslist_add_element < nssList_AddUnique <
add_subject_entry < nssCertificateStore_Add < NSSCryptoContext_ImportCertificate
< __CERT_NewTempCertificate < ssl3_HandleCertificate <
ssl3_HandleHandshakeMessage < ssl3_HandleHandshake < ssl3_HandleRecord <
ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
 20640      10      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_issuer_and_serial_entry
< add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
 20640      10      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_certificate_entry <
nssCertificateStore_Add < NSSCryptoContext_ImportCertificate <
__CERT_NewTempCertificate < ssl3_HandleCertificate < ssl3_HandleHandshakeMessage
< ssl3_HandleHandshake < ssl3_HandleRecord 
 
(dbx) 
oops, sorry, there wasn't much difference with that update. I thought it had one
less stack, but it's the same. The number of bytes leaked is comparable.
Comment on attachment 67988 [details] [diff] [review]
better patch to disable arena free lists with an environment variable

>-    static PRBool  doFreeArenaPool;
>+    static PRBool  doFreeArenaPool = PR_FALSE;

Static variables are initialized to 0 by the compilers.
So this change is not necessary.  Sorry, couldn't resist :-)

r=wtc.	Thanks, Julien.
Attachment #67988 - Flags: review+
This is the dbx output with the double handhshake test :
Checking for memory leaks...

Actual leaks report    (actual leaks:         0  total size:       0 bytes)

 

Possible leaks report  (possible leaks:     141  total size:  291024 bytes)

 Total  Num of  Leaked      Allocation call stack
 Size   Blocks  Block
                Address
======  ====== ==========  =======================================
 49536      24      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_subject_entry <
add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
 49536      24      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_issuer_and_serial_entry
< add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
 47472      23      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_certificate_entry <
nssCertificateStore_Add < NSSCryptoContext_ImportCertificate <
__CERT_NewTempCertificate < ssl3_HandleCertificate < ssl3_HandleHandshakeMessage
< ssl3_HandleHandshake < ssl3_HandleRecord 
 47472      23      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < add_nickname_entry <
add_cert_to_cache < nssTrustDomain_AddCertsToCache <
NSSTrustDomain_FindCertificateByEncodedCertificate < __CERT_NewTempCertificate <
ssl3_HandleCertificate < ssl3_HandleHandshakeMessage < ssl3_HandleHandshake 
 24768      12      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < remove_subject_entry <
nssCertificateStore_Remove < CERT_DestroyCertificate < ssl_DestroySID <
ssl_FreeLockedSID < ssl_FreeSID < ssl_DestroySecurityInfo < ssl_FreeSocket 
 24768      12      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nsslist_add_element < nssList_AddUnique <
add_subject_entry < nssCertificateStore_Add < NSSCryptoContext_ImportCertificate
< __CERT_NewTempCertificate < ssl3_HandleCertificate <
ssl3_HandleHandshakeMessage < ssl3_HandleHandshake < ssl3_HandleRecord <
ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
 24768      12      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nss_arena_hash_alloc_entry <
PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add < remove_subject_entry <
nssCertificateStore_Remove < CERT_DestroyCertificate < ssl3_HandleCertificate <
ssl3_HandleHandshakeMessage < ssl3_HandleHandshake < ssl3_HandleRecord <
ssl3_GatherCompleteHandshake 
 22704      11      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nsslist_add_element < nssList_AddUnique <
add_subject_entry < nssCertificateStore_Add < NSSCryptoContext_ImportCertificate
< __CERT_NewTempCertificate < ssl3_HandleCertificate <
ssl3_HandleHandshakeMessage < ssl3_HandleHandshake < ssl3_HandleRecord <
ssl3_GatherCompleteHandshake < ssl3_GatherAppDataRecord 
 
(dbx) 
As shown in bug 123081, this leak occurs in the first SSL handshake on a
session, regardless of whether client auth is used. The dbx output after 1000
tests shows the same stacks (though not exactly the same numbers) for both
cases, so I'm assimilating both cases into this defect, not specifying in the
title whether client auth is used or not as it makes no difference for this leak.
Summary: memory leak in SSL client in NSS 3.4 → SSL client with single handshake leaks memory with NSS 3.4
Blocks: 123081
This patch gets rid of all the leaks in PK11_ListCerts except those that have
to do with the built-in token (which, unfortunately, is still pretty
substantial).
Here are the results after 1000 passes. I started the measurements at the
beginning of the 3rd pass. There are still some leaks, but not on every SSL
session/handshake.

(dbx) sl
Checking for memory leaks...

Actual leaks report    (actual leaks:         0  total size:       0 bytes)

 

Possible leaks report  (possible leaks:      12  total size:   22728 bytes)

 Total  Num of  Leaked      Allocation call stack
 Size   Blocks  Block
                Address
======  ====== ==========  =======================================
 22704      11      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nsslist_add_element < nssList_AddUnique <
add_subject_entry < nssCertificateStore_Add < NSSCryptoContext_ImportCertificate
< __CERT_NewTempCertificate < ssl3_HandleCertificate <
ssl3_HandleHandshakeMessage < ssl3_HandleHandshake < ssl3_HandleRecord <
ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
    24       1    0xdbb88  calloc < PR_Calloc < nss_ZAlloc <
nss_arena_hash_alloc_entry < PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add
< remove_subject_entry < nssCertificateStore_Remove < CERT_DestroyCertificate <
ssl_DestroySID < ssl_FreeLockedSID < ssl_FreeSID < ssl_DestroySecurityInfo <
ssl_FreeSocket < ssl_DefClose 
Julien-

Couple things.	First, I reviewed pkistore.c again and found one more misuse of
an arena.  I fixed it, see 

/cvsroot/mozilla/security/nss/lib/pki/pkistore.c,v  <--  pkistore.c
new revision: 1.7; previous revision: 1.6

I'm not sure if this is related to the leak.  It could be, but I'm not
convinced it is.

The other thing is this attachment.  This is the only way I can think of trying
to track the leak down.  If you could apply it and then attach the output, that
should be useful.
Ian,

This patch causes my client to be much too slow to be usable due to the printfs
being executed for each test iteration (or more frequently, I don't know).
FYI, it looks like this and just continuously scrolls :

list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
list readded 10e4b0 (subject=58)
cert added to list 10e4b0 (subject=58)
After 1000 runs without client auth, and with the built-in root cert module
deleted, arena free lists disabled, here is the output of dbx checkleaks :

Possible leaks report  (possible leaks:      12  total size:   22728 bytes)

 Total  Num of  Leaked      Allocation call stack
 Size   Blocks  Block
                Address
======  ====== ==========  =======================================
 22704      11      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nsslist_add_element < nssList_AddUnique <
add_subject_entry < nssCertificateStore_Add < NSSCryptoContext_ImportCertificate
< __CERT_NewTempCertificate < ssl3_HandleCertificate <
ssl3_HandleHandshakeMessage < ssl3_HandleHandshake < ssl3_HandleRecord <
ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
    24       1    0xf9b88  calloc < PR_Calloc < nss_ZAlloc <
nss_arena_hash_alloc_entry < PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add
< remove_subject_entry < nssCertificateStore_Remove < CERT_DestroyCertificate <
ssl_DestroySID < ssl_FreeLockedSID < ssl_FreeSID < ssl_DestroySecurityInfo <
ssl_FreeSocket < ssl_DefClose 
 
(dbx) 
I just did a 1000 runs with client auth in the first handshake (required on the
server). Again this is without the root certs module library in the client.
The leak was exactly the same as without client auth, as pasted above :

(dbx) sl
Checking for memory leaks...

Actual leaks report    (actual leaks:         0  total size:       0 bytes)

 

Possible leaks report  (possible leaks:      12  total size:   22728 bytes)

 Total  Num of  Leaked      Allocation call stack
 Size   Blocks  Block
                Address
======  ====== ==========  =======================================
 22704      11      -      PR_Malloc < PL_ArenaAllocate <
nss_zalloc_arena_locked < nss_ZAlloc < nsslist_add_element < nssList_AddUnique <
add_subject_entry < nssCertificateStore_Add < NSSCryptoContext_ImportCertificate
< __CERT_NewTempCertificate < ssl3_HandleCertificate <
ssl3_HandleHandshakeMessage < ssl3_HandleHandshake < ssl3_HandleRecord <
ssl3_GatherCompleteHandshake < ssl_GatherRecord1stHandshake 
    24       1    0xe9de8  calloc < PR_Calloc < nss_ZAlloc <
nss_arena_hash_alloc_entry < PL_HashTableRawAdd < PL_HashTableAdd < nssHash_Add
< remove_subject_entry < nssCertificateStore_Remove < CERT_DestroyCertificate <
ssl_DestroySID < ssl_FreeLockedSID < ssl_FreeSID < ssl_DestroySecurityInfo <
ssl_FreeSocket < ssl_DefClose 
I reran both single-handshake tests - without and with client auth - this time
with the built-in module installed. The leak was still exactly the same.
Therefore I conclude the built-ins are not the cause of this leak.
The printf output showed certs (or more likely, the same cert) flying in and
out of the temporary store, while the same subject list never goes away.  This
explains the leak.

I believe I know why this happens.  CERT_NewTempCertificate is not looking in
the current temp store for the cert, thus it has the potential to create
multiple instances of the same cert.  These instances could get dumped in the
store, creating a subject list of the same cert over and over.

This patch attempts to address both issues.  That is, CERT_NewTempCertificate
should make sure no temp certs already exist for the DER.  And secondly, the
temp store should use a stronger comparison than pointer address to determine
if a cert is already in a subject list.

Julien, go ahead and give this a shot when you have the change (and a working
network).
I checked in the certdb portion,

/cvsroot/mozilla/security/nss/lib/certdb/stanpcertdb.c,v  <--  stanpcertdb.c
new revision: 1.30; previous revision: 1.29


This change is needed whether it fixes this bug or not, but I believe it will
fix this bug.
While I haven't done extensive testing in the debugger yet, it would appear from
the top output that the client's memory usage indeed stabilizes, both without
client auth and with single-handshake client auth. I will do the full
verification tomorrow and mark this bug fixed if that's the case.
When running the client in single-threaded mode with client auth, there is no
leak. I ran a stress test for 3h during a meeting , for 68814 sessions on
Solaris. The top memory usage for httptest was exactly the same at the beginning
of my test as at the end.

I also ran the same test on my NT box, for 80675 sessions and the same duration,
and saw no leak there either.

Both Wan-Teh and I have verified this bug is fixed.  This includes testing under
Purify/BoundsChecker and running a client for several hours.

Marking fixed.
oops, did everything but change the state...
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Marked the bug verified per Julien's comment in
http://bugzilla.mozilla.org/show_bug.cgi?id=125149#c41.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: