User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:22.214.171.124) Gecko/20060728 Firefox/126.96.36.199 Build Identifier: http://www.mozilla.org/directory There currently is no public API in libssldap to cleanup references to NSS objects, making calls to NSS_Shutdown by client apps unsafe. Reproducible: Always Steps to Reproduce: The following code initializes an SSL session with SASL/EXTERNAL authentication: ldapssl_clientauth_init(...); ldapssl_init(...); ldapssl_enable_clientauth(...); ldap_sasl_bind_s(...); If you try calling NSS_Shutdown and re-peforming the above steps, NSS_Shutdown always returns error SEC_ERROR_BUSY, and the libldapssl calls (in my case it was ldapssl_enable_clientauth) causes a segmentation fault. I am running LDAP C SDK 5.17 with NSS 3.7.3 and NSPR 4.2.2 on Solaris 8. Actual Results: Segmentation fault inside ldapssl_enable_clientauth Expected Results: ldapssl_enable_clientauth, nor any libldapssl functions, should seg fault after NSS_Shutdown. There is a thread in mozilla.dev.tech.ldap titled "Re-reading certificates at runtime" that discusses this further.
I have a somewhat different perspective on this RFE. The sequence of calls in comment 0 shows that a number of ldap calls are made to initialize state, which calls also initialize NSS. The bug reporter wishes to start over, re-initializing NSS, for the purpose of picking up a different (new or modified) set of cert/key DBs. IMO, architecturally speaking, the right way to do this is for his code to shut down libldap, effectively undoing the initialization steps shown above, preferrably in reverse order, and have libldap shut down NSS in the course of those steps. Only libldap knows what handles/references it is holding to NSS objects, and of course all those handles/references must be released/ destroyed before shutting down NSS. So, rather than the user trying to shutdown NSS directly, tearing out the foundation on which libldap sits, without libldap's participation, it would be better if libldap facilitated that shutdown. I'd also like to suggest that methods be added/enhanced to facilitate the caller specification of such things as: a) DB file name prefixes b) read-only vs read-write attribute of NSS Someone suggested to the bug reporter than he might initialize NSS himself before calling ldapssl_clientauth_init. I guess that works with the present implementation (the one in /directory/suncsdk/c-sdk), but it seems fragile. It will break if libldap's implementation changes in ways that it ought reasonably to be able to change. One related comment: it appears to me that if ldapssl_enable_clientauth is called a second time on the same LDAP struct, it will leak the previously set nickname and password strings. I don't know whether it should (a) detect a second call and refuse to change the established values, or (b) free the previously used values on a second call.
(In reply to comment #1) > ... So, rather than the user trying to > shutdown NSS directly, tearing out the foundation on which libldap sits, > without libldap's participation, it would be better if libldap facilitated > that shutdown. I agree with this. If you have destroyed all LDAP * session handles, most every NSS resource should be released. But apparently not all. > I'd also like to suggest that methods be added/enhanced to facilitate the > caller specification of such things as: > a) DB file name prefixes > b) read-only vs read-write attribute of NSS OK, but the original intent was that (for libldap) NSS initialization should be separable from the layering of LDAP over SSL/TLS. I am not sure we have really kept it separate, but more advanced applications should in theory be able to write their own NSS initialization code (directly calling NSS APIs) and still use functions like ldapssl_init(), ldap_start_tls_s(), ldapssl_install_routines(), and ldapssl_enable_clientauth(). > Someone suggested to the bug reporter than he might initialize NSS himself > before calling ldapssl_clientauth_init. I guess that works with the present > implementation (the one in /directory/suncsdk/c-sdk), but it seems fragile. > It will break if libldap's implementation changes in ways that it ought > reasonably to be able to change. If this is true that is something we should try to fix in libldap. I don't think the initialization vs. "use LDAP over SSL" dependencies are very tight, but I am not 100% sure. > One related comment: it appears to me that if ldapssl_enable_clientauth is > called a second time on the same LDAP struct, it will leak the previously > set nickname and password strings. I don't know whether it should > (a) detect a second call and refuse to change the established values, or > (b) free the previously used values on a second call. I would say (a). Note that most applications just get a new LDAP * handle when they need to make a new connection or reauthenticate. But the reporter of this bug is trying to use the LDAP reconnect option, which is not as widely used.
(In reply to comment #2) > > One related comment: it appears to me that if ldapssl_enable_clientauth is > > called a second time on the same LDAP struct, it will leak the previously > > set nickname and password strings. I don't know whether it should > > (a) detect a second call and refuse to change the established values, or > > (b) free the previously used values on a second call. > > I would say (a). Note that most applications just get a new LDAP * handle when > they need to make a new connection or reauthenticate. But the reporter of this > bug is trying to use the LDAP reconnect option, which is not as widely used. Errr.... I mean to say (b) (free the old values and start using the new ones). Sorry.
i dont like this approach of adding yet another libssldap call, as a wrapper to NSS_Shutdown() or what else in NSS. my reasons for this are fairly simple: - i expect NSS to handle NSS state. the user can initiate shutdown according to NSS guidelines and libssldap can get notified via NSS_RegisterShutdown handler to do its own cleanup before NSS shuts itself down. - since some libssldap users use NSS not only for LDAP but also for other things in their applications introducing yet another wrapper makes things more complex and less transparent for them. ideally i would say libssldap here should be as transparent as it can be which is not the case today. - the main interest here is to make libssldap 'restartable' after NSS shut down ie clean up its own state with minimal user interaction. we already have quite a few wrappers that were the source of problems in the past, sometimes due to poor design and sometimes due to changes in NSS. the bottom line for me is this: if we can use NSS_RegisterShutdown() to register our handler to do the cleanup i think we should use it and keep the things simple and transparent. if we cannot use NSS_RegisterShutdown then we gonna have to introduce some public wrapper that gonna perform our own cleanup plus NSS cleanup upon the users request.
Architecurally, either a) libldap should BOTH initialize NSS and shutdown NSS for the appliction, or b) libldap should NEITHER initialize NSS nor shut it down, and the appliation should do those things itself. Since ldap now initializes NSS for the app, it should also do the shutdown, IMO.
(In reply to comment #5) > Since ldap now initializes NSS for the app, it should also do the shutdown, > IMO. point taken. can you please comment on the following stuff i did ask earlier today : - NSS_RegisterShutdown() can be used to hookup my own callback so that its called upon NSS_Shutdown() invocation, prior to actual shutdown taking place. anything i should be aware of? like limitations etc ? - SSL_ClearSessionCache() shall be called prior to NSS_Shutdown i been told. tho it is my understanding SSL_ClearSessionCache is pretty harmless i'd like to understand if it is really is a prerequisite to NSS_Shutdown and if so any specific details explaining why. - SSL_ClearSessionCache() can be safely called from NSS_Register Shutdown handler. true or false ? and can you also expand on proper NSS shutdown sequence as it seems to me now that more than one NSS/SSL call should be invoked to do that nice and clean. regardless of which approach we end up taking this is important for us to understand the logic of NSS shutdown and what steps should be taken in what order.
ok, heres somewhat dodgy woraround for now : ldap_unbind(ld); SSL_ClearSessionCache(); NSS_Shutdown(); NSS_Initialize(MY_CERTDB_PATH,NULL,NULL,NULL,NSS_INIT_READONLY); Rich was absolutely right saying that SSL_ClearSessionCache() has to be called prior to NSS_Shutdown(). i dunno why NSS shutdown does not clean up its own stuff like cache in this case but whatever explanation is we gonna have to call SSL_ClearSessionCache() or ask users to do so for us. now i did try above workaround in the tight loop where ldapssl_* init functions are called then we bind over SSL to make sure it works then the workaround sequence executed then goto ldapssl_* init over again. funny thing is we actually do cleanup after ourselves in libssldap when ldap_unbind() gets called. it does clean up via ldapssl_disposehandle() for the session info and ldapssl_close() for the socket so its all rosy. the reason NSS_Initialize() has to be called explicitly by the user is because we have that "inited" global there which protects basic_init from happening more than once right now. now regarding leaks. with above workaround running in the loop and lib umem audit setup i found to my surprise that we dont actually leak any thing there but NSS does, via NSS_Initialize(), big time. right now i have absolutely no idea if i have to call any additional NSS/SSL/etc functions upon NSS shutdown to prevent those NSS_Initialize leaks. on the other hand my testcase doesnt represent any real life scenario coz its extremely unlikely any app gonna execute that many shutdown/init seqs in the loop. if one call shutdown/init only every once in a while i doubt those leaks gonna make much difference, still they are leaks and we need to figure out how to get rid of them if its in our scope. thats it for now. next i'm gonna find out about restarting basic_init either via NSS_RegisterShutdown hook or new pub api [whichever works] .
Created attachment 239042 [details] [diff] [review] first cut of the patch this is preliminary version of the fix. posting it so you can have a look at general approach and advise on what else should be taken into acount here. i'm still looking for any leftovers caused by re-inits and via diff ldapssl init functions to cleap up and checking if we have to do anything special re NSPR and if we need some giant lock there to protect cleanup. as you can see from the diff we can now handle shutdown and re-init when users calling either new public api ldapssl_shutdown() OR users calling NSS_Shutdown() directly. handling both is pretty flexible i reckon.
Anton, NSS_Shutdown can't call SSL_ClearSessionCache because this would introduce a circular dependency between shared libraries - libssl3 depends on libnss3, so libnss3 cannot call a libssl3 function .
thanx again for your help Julien! can you please advise of what else has to be called [NSS/SSL cleanup functions i mean] prior to NSS_Shutdown() in our case as there seems to be quite a few leaks in NSS after re-init. i can post/send libumem ::findleaks/bufctl output if thats of any help. (In reply to comment #9) > Anton, > > NSS_Shutdown can't call SSL_ClearSessionCache because this would introduce a > circular dependency between shared libraries - libssl3 depends on libnss3, so > libnss3 cannot call a libssl3 function . >
ok, i did run a few tests and it seems we are clean here with this patch so far. - NSPR cannot be inited twice so all subsequent reinits will be noops. - there was a problem with NSS_RegisterShutdown() which i believe Julien just fixed. it wont apply to us directly with current NSS anyway coz we are probably the last to call NSS_RegisterShutdown() in the NSS chain. see https://bugzilla.mozilla.org/show_bug.cgi?id=353608 - sent all the info about NSS leaks upon reinit to Julien. most of them are known to NSS team and in the process of being plugged. - there is nothing else we can call to help NSS shutdown/cleanup here so at this point i believe its done properly in the patch. - locking. i'm still not sure about this and need your input here. since there is a potential for a race for "inited" gobal and un/register of the handler i feel like this should be locked somehow so neither init nor shutdown can take place while the other one still running. my take on it is to introduce some libssldap specific giant lock to take care of that but on the other hand its clear that any app should wait for either ldapssl_shutdown() or NSS_Shutdown() to return success before calling init sequence [say it does so on diff thread] and odds of this race condition happening are kinda low. what are your expectations ?? apart from locking issue i think this patch is ready for review and ci if its any good. so please do review it and tell me what you think re locking. again, many thanx to Julien for all the help.
R.e. locking, I think it is OK to make that issue the application's problem in this case. Initialization of libraries is often confined to one thread (if they are shutting down libldap/libssldap/NSS and are using more than one thread, the application will need to coordinate among threads anyway).
(In reply to comment #12) thanx Mark! what do you think about the rest of the patch ? is there anything else i need to address ? if not i wanna check it in on sun_ merge branch and move on.
Re: NSS_Shutdown() and cache clearing - if there were a function SSL_Shutdown() it could also clear the caches and perform the NSS_Shutdown(). But AFAIK there's not so we have to clear the caches explicitly. I think the code looks good.
(In reply to comment #14) thanx for review Rich! perhaps we should file an RFE against NSS to provide SSL_Shutdown(). its unclear that one has to use SSL_ClearSessionCache() in order to clean up SSL lib state, critical to let NSS_Shutdown() go: neither docs nor header comments in NSS indicate that. maybe NSS can benefit from SSL_Shutdown as some sort of aggregator 'cleanup on user request' function that can take care of session cache and whatever else ? in any case Julien and Nelson are on cc here so they can comment if its a good idea or not. there is one more thing regarding locking. i found that if we use NSS own NSS_IsInitialized() we can probably get rid of that "inited" global and avoid locking altogether. i might give it a try later when merging ssldap changes.
Comment on attachment 239042 [details] [diff] [review] first cut of the patch The changes look OK to me. Please commit. R.e. the idea of using NSS_IsInitialized() that is worth investigating. If an application has already initialized NSS but also needs to call one of our functions to do some LDAP-specific NSS initialization we should make sure that works too.
Attachment #239042 - Flags: review+
Yes. Go ahead and commit this first, then we can investigate using NSSIsInitialized(), perhaps as another bug.
thanx for all reviews and comments on this! Checking in mozilla/directory/c-sdk/ldap/include/ldap_ssl.h; /cvsroot/mozilla/directory/c-sdk/ldap/include/ldap_ssl.h,v <-- ldap_ssl.h new revision: 188.8.131.52; previous revision: 5.5 done Checking in mozilla/directory/c-sdk/ldap/libraries/libldap_ssl.ex; /cvsroot/mozilla/directory/c-sdk/ldap/libraries/libldap_ssl.ex,v <-- libldap_ssl.ex new revision: 184.108.40.206; previous revision: 5.5 done Checking in mozilla/directory/c-sdk/ldap/libraries/libssldap/clientinit.c; /cvsroot/mozilla/directory/c-sdk/ldap/libraries/libssldap/clientinit.c,v <-- clientinit.c new revision: 220.127.116.11; previous revision: 5.6 done
These have been merged on to the trunk. This bug can be resolved as FIXED.
Reassigned to Anton (for tracking purposes) and marked fixed.
Assignee: mcs → anton.bobrov
Status: NEW → RESOLVED
Last Resolved: 12 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.