Open Bug 282320 Opened 20 years ago Updated 17 years ago

Memory leak on ldap_simple_bind_s

Categories

(Directory :: LDAP C SDK, defect)

defect
Not set
normal

Tracking

(Not tracked)

ASSIGNED

People

(Reporter: girish.sundarraj, Assigned: mcs)

Details

(Keywords: memory-leak)

Attachments

(7 files, 2 obsolete files)

User-Agent:       Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)
Build Identifier: netscape sdk dll downloaded from sun ldap csdk 5.08

simple_bind_s dumps memory leak when bind operation is done on a separate 
thread. The leak is from set_ld_error function of error.c(168). This happens 
in the dll downloaded from sun csdk 5.08. If binding is done on the main thread
the leak does not happen. The detailed memory leak dump from rational purify 
run on windows is attached below.

[W] MLK: Memory leak of 12 bytes from 1 block allocated in set_ld_error 
[NSLDAP32V50.DLL]
        Distribution of leaked blocks
        Allocation location
            calloc         [MSVCRT.dll]
            set_ld_error   [open.c:168]
            ldap_set_lderrno [error.c:239]
            nsldapi_send_server_request [request.c:306]
            nsldapi_send_initial_request [request.c:134]
            simple_bind_nolock [sbind.c:135]
            ldap_simple_bind [sbind.c:64]
            ldap_simple_bind_s [sbind.c:163]
            local1(void)   [NSLDAP.cpp:44]
            closeIt(void *) [NSLDAP.cpp:282]
[W] MLK: Memory leak of 2 bytes from 2 blocks allocated in nslberi_malloc 
[NSLDAP32V50.DLL]
        Distribution of leaked blocks
        Allocation location
            malloc         [MSVCRT.dll]
            nslberi_malloc [io.c:1405]
            ber_get_stringa [decode.c:265]
            ber_scanf      [decode.c:448]
            nsldapi_parse_result [error.c:395]
            ldap_parse_result [error.c:296]
            ldap_result2error [error.c:191]
            ldap_simple_bind_s [sbind.c:169]
            local1(void)   [NSLDAP.cpp:44]
            closeIt(void *) [NSLDAP.cpp:282]



Reproducible: Always

Steps to Reproduce:
1. Write a normal c program with function (other than main) that makes 
ldap_init & ldap_simple_bind_s and ldap_unbind.
2. From main function spawn new thread that invokes this function
3. run the application against rational purify

Actual Results:  
Rational purify dumps following leak
[W] MLK: Memory leak of 12 bytes from 1 block allocated in set_ld_error 
[NSLDAP32V50.DLL]
        Distribution of leaked blocks
        Allocation location
            calloc         [MSVCRT.dll]
            set_ld_error   [open.c:168]
            ldap_set_lderrno [error.c:239]
            nsldapi_send_server_request [request.c:306]
            nsldapi_send_initial_request [request.c:134]
            simple_bind_nolock [sbind.c:135]
            ldap_simple_bind [sbind.c:64]
            ldap_simple_bind_s [sbind.c:163]
            local1(void)   [NSLDAP.cpp:44]
            closeIt(void *) [NSLDAP.cpp:282]
[W] MLK: Memory leak of 2 bytes from 2 blocks allocated in nslberi_malloc 
[NSLDAP32V50.DLL]
        Distribution of leaked blocks
        Allocation location
            malloc         [MSVCRT.dll]
            nslberi_malloc [io.c:1405]
            ber_get_stringa [decode.c:265]
            ber_scanf      [decode.c:448]
            nsldapi_parse_result [error.c:395]
            ldap_parse_result [error.c:296]
            ldap_result2error [error.c:191]
            ldap_simple_bind_s [sbind.c:169]
            local1(void)   [NSLDAP.cpp:44]
            closeIt(void *) [NSLDAP.cpp:282]



Expected Results:  
Leaks should not have happened

We cannot release our product that uses these dlls with memory leak. Hence
it is blocker for us which needs to be fixed ASAP
This comment appears in mozilla/directory/csdk/ldap/libldap/error.c near the
ldap_set_lderrno() function:


/*
 * Note: there is no need for callers of ldap_set_lderrno() to lock the
 * ld mutex.  If applications intend to share an LDAP session handle
 * between threads they *must* perform their own locking around the
 * session handle or they must install a "set lderrno" thread callback
 * function.
 *
 */

This comment is slightly "dated" because for quite some time open.c has
installed a default set of thread callback functions.  On Windows, thread local
storage is used.  Looking at the open.c:set_ld_error() code, it looks to me like
there may be a one-time leak.  Here is the start of that function:

#ifdef USE_WINDOWS_TLS
static void
set_ld_error( int err, char *matched, char *errmsg, void *dummy )
{
    struct nsldapi_ldap_error *le;
    void *tsd;

    le = TlsGetValue( dwTlsIndex );

    if (le == NULL) {
        tsd = (void *)calloc(1, sizeof(struct nsldapi_ldap_error));
        TlsSetValue( dwTlsIndex, tsd );
    }
    ...

I assume the leak that Purify is reporting is the tsd value (that is the only
calloc in that function).  Can you verify whether it is a one-time leak (which
is not so critical) or a repeated leak (which is critical)?

Please look into my comments
Comment on attachment 174580 [details]
sample code to simulate the memory leak

Hi,

Thanks for the quick reply.

I verified with purify just now. The memory leak is growing for each bind
operation spawned in different threads. However in my context, the LDAP pointer
is local to thread and hence I don’t see any reason for synchronizing it using
mutex.

In our application , each time a user logs in a thread is created for his
session. In this thread we bind to LDAP using following routines...

1) ldap_init()
2) ldap_simple_bind_s()

This itself creates the leak. So for each session created additional leaks are
dumped.  Since the leak is growing for each client connect it is very critical.
Also similar leaks could also be seen on ldap_search_s in the same locations...
I hope all these boils down to same issue.

Also attaching the sample program which I am using to isolate the issue. Please
link it with netscape ldap libraries. This sample is programmed for windows

The main program is like this

int main()
{

	DWORD threadId;
	HANDLE th;
	th =
CreateThread(NULL,0,(LPTHREAD_START_ROUTINE)threadProc,NULL,NULL,&threadId);

	//bindToLDAP();

	getch();	

	return 0;	
}


threadProc thread function calls bindToLDAP() internally which causes the leak.
That means when binding happens in new thread it leaks. But if you comment
"CreateThread" and call bindToLDAP directly in the main thread by uncommenting
it, then the leak does not happen.

Hope you have all the necessary inputs needed to debug your application and
suggest me a way to overcome this crisis. 


Thanks in advance...
Thanks for the sample code.  I do not have time to reproduce and debug this today.

However, if you really are not sharing LDAP * handles across multiple threads,
you can work around the leak by overriding the default thread functions and
disabling them.  At least that should work.  To do that, after each call to
ldap_init() use ldap_set_option() to clear the thread functions, e.g.,

  LDAP *ld = ldap_init(...);
  if (NULL != ld)
  {
        struct ldap_thread_fns threadFns = {0};
        struct ldap_extra_thread_fns exThreadFns = {0};
        
        if (ldap_set_option(ld, LDAP_OPT_THREAD_FN_PTRS,
                (void *)&threadFns ) != LDAP_SUCCESS ||
            ldap_set_option(ld, LDAP_OPT_EXTRA_THREAD_FN_PTRS,
                (void *)&exThreadFns) != LDAP_SUCCESS) {
               /* error */
        }
  }
Status: NEW → ASSIGNED
Severity: blocker → normal
Keywords: mlk
Memory leak still happens if you try to communicate to LDAP server through SSL

If we make call to following SSL API's we get huge memory leak
nErrorCode = ldapssl_client_init("E:/Program Files/Softerra/LDAP Browser  2.6",NULL);
hLDAP = ldapssl_init("10.66.16.5" ,636,1);

Could you please let me know what needs to be done here?

(In reply to comment #4)
> Thanks for the sample code.  I do not have time to reproduce and debug this
> today.
> However, if you really are not sharing LDAP * handles across multiple threads,
> you can work around the leak by overriding the default thread functions and
> disabling them.  At least that should work.  To do that, after each call to
> ldap_init() use ldap_set_option() to clear the thread functions, e.g.,
>   LDAP *ld = ldap_init(...);
>   if (NULL != ld)
>   {
>         struct ldap_thread_fns threadFns = {0};
>         struct ldap_extra_thread_fns exThreadFns = {0};
>         if (ldap_set_option(ld, LDAP_OPT_THREAD_FN_PTRS,
>                 (void *)&threadFns ) != LDAP_SUCCESS ||
>             ldap_set_option(ld, LDAP_OPT_EXTRA_THREAD_FN_PTRS,
>                 (void *)&exThreadFns) != LDAP_SUCCESS) {
>                /* error */
>         }
>   }

Do your threads exit when they are done?  When you use SSL, the libprldap (libldap using NSPR) layer is used.  It is designed to free all memory when the thread exits.  Can you provide details on where the leak is occurring (I am sure it will be in a slightly different place or at least the stack will be deeper than when not using SSL)?
Attached file Sample for SSL memory leak (obsolete) —
Sample for SSL memory leak
We get malloc memory leak for following api calls
ldap_simple_bind_s
ldapssl_client_init
Attached image purify screen shot for memory leak (obsolete) —
purify screen shot for memory leak. We also get leak for ldap_result call
which is not included in the sample code.
Attachment #223779 - Attachment is obsolete: true
Attachment #223782 - Attachment is obsolete: true
The latest cpp file which was build using debug version of mozilla which
contains the memory leak dump of purify in the text format
Attached purify memory leak dump file in txt format which depicts exact location of memory leak which would be helpful for u to debug.

The latest cpp file with caption "Sample SSL code which is par with purify dump" which was build using debug version of mozilla which
contains the memory leak dump of purify in the text format with caption
"Purify memory leak dump"
First, there was a client cert auth memory leak fixed in the mozilla ldap c sdk 5.17.  You can grab the binaries from here:
ftp://ftp.mozilla.org/pub/mozilla.org/directory/c-sdk/releases/v5.17

But I don't think that is the problem you are seeing.  NSS caches many things in memory.  These things are normally cleaned up by performing the following two steps after your program is finished, before calling exit (or return from main):

/* Clean up any NSS objects cached as a result of client
   side SSL connections */
SSL_ShutdownClientSessionIDCache();

/* Tell NSS we are done - shut it down */
if ((rv = NSS_Shutdown()) != SECSuccess) {
    fprintf(stderr,
         "NSS_Shutdown failed: %d", PR_GetError());
}

Try putting this code after your ldap_unbind_s in your main test program, and run it with purify.  You should see most, if not all, of your memory leaks go away.
More leak detected after making call to SSL_Shutdown
Hi,

There is no method called SSL_ShutdownClientSessionIDCache. We have
only SSL_ShutdownServerSessionIDCache. Should we use this?

Even if we use the method along with NSS_Shutdown, we get more memory leaks
than before.

Are we misssing out anything else. Also I am using 5.17 version of cldap
Comment on attachment 223924 [details]
More leak detected after making call to NSS_Shutdown

More leak detected after making call to NSS_Shutdown
Attachment #223924 - Attachment description: More leak detected after making call to SSL_Shutdown → More leak detected after making call to NSS_Shutdown
Sorry - the client cache clearing function is called SSL_ClearSessionCache().
Even after putting the code SSL_ClearSessionCache(), memory leak happens.
Even after putting SSL_ClearSessionCache() and NSS_Shutdown code in place,
the memory leak occurs. Please help as we are waiting to go for production.
I did find a memory leak in your code.  GetSSLConnection() calls ldapssl_install_routines() which calls ldapssl_alloc_sessioninfo() which does the following:
... ssip = (LDAPSSLSessionInfo *)PR_Calloc( 1, sizeof( LDAPSSLSessionInfo ))

ldapssl_install_routines() should only be called from ldapssl_init().  If this is not clear in the documentation, please let us know where to fix it.

What version of NSS are you using?

I don't have Windows/purify, but I do have Fedora Core 5 and valgrind 3.1.0:
uname -a -> Linux localhost.localdomain 2.6.16-1.2096_FC5 #1 Wed Apr 19 05:14:36 EDT 2006 i686 i686 i386 GNU/Linux

I modified your program slightly for linux/gcc and ran it using valgrind.  My memory leak traces are very different than yours.  It appears purify leaves out stack frames, so it's hard to trace exactly what is leaking.  However, valgrind produces a full stack trace.  What's interesting is that valgrind did not report the leak associated with ldapssl_install_routines(), but just by visual inspection of the mozldap 5.17 code, I don't see any place that would free the session info.

I also modified the code to call ldapssl_client_init in main(), before doSearch(), then call doSearch() many times, then the SSL/NSS clean up.  I'm trying to find out if there is a leak per connection or per operation.  There is only one odd thing I've found - a leak which occurs during operation processing, but only 2 times (with a run of 1000 doSearch() calls).  So that particular leak doesn't appear to be per connection or operation.  All of the other memory leaks are one time only, associated with NSS_Initialize which should only ever be called once per process (and the leaks are less than 100 KB so not bad).

I'm using the latest ldap c sdk with NSPR 4.6 and NSS 3.11.  NSS 3.11 has a memory leak (fixed in NSS 3.11.1) but AFAIK it only affects the server side, but this may be it's manifestation on the client side.

I'm going to attach my code, and the valgrind memory leak log.
(In reply to comment #20)
> I did find a memory leak in your code.  GetSSLConnection() calls
> ldapssl_install_routines() which calls ldapssl_alloc_sessioninfo() which does
> the following:
> ... ssip = (LDAPSSLSessionInfo *)PR_Calloc( 1, sizeof( LDAPSSLSessionInfo ))
> 
> ldapssl_install_routines() should only be called from ldapssl_init().  If this
> is not clear in the documentation, please let us know where to fix it.

I suppose it would be possible to modify ldapssl_install_routines() so it noticed if it was being called a 2nd time.  But as you said, applications should either use ldapssl_init() OR ldap_init() followed by ldapssl_install_routines().
Could you please let me know from where I can download NSS 3.11.1 binaries?
I can see only the source code.
It looks as though the NSS 3.11.1 binaries have not yet been posted to ftp.mozilla.org.  Keep checking - I expect them any day now.
Hmm - NSS 3.11.2 is now out - still no binaries :-(  I would suggest to try and build it for yourself.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: