Open Bug 354614 Opened 18 years ago Updated 2 years ago

non-NSPR programs may crash if they unload libraries that use NSPR

Categories: NSPR :: NSPR, defect (x86, Windows NT)
Tracking: Not tracked
People: Reporter: julien.pierre, Assignee: Unassigned

Details

NSPR currently doesn't know how to do a full shutdown. This means PR_Cleanup leaves memory leaks. But even more seriously, sometimes NSPR internal background threads are left running. This happens on Windows with OS_TARGET=WINNT. As soon as the NSPR DLL is unloaded from the process, the application will crash, because those threads are still executing code that is no longer mapped. Many programs found this out the hard way, for example web server plugins for IIS / Apache that use NSPR. After the web server unloaded these plugins, the server would crash.

The same is true of the NSS PKCS#11 softoken library. See bug 354613.

Various workarounds had to be put in the web server plugins. They generally involved the DLL leaking itself (an extra dlopen / LoadLibrary call on its own file) so that it stays resident in the process forever and the web server is unable to actually unload it (dlclose / FreeLibrary just reduce the refcount by 1).
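For illustration, here is a minimal sketch of that kind of workaround. The helper name pin_library and the idea that the plugin knows its own file name are assumptions made for this example; none of this is taken from an actual plugin. It takes one extra reference on the library and never releases it, so a later dlclose / FreeLibrary by the host only drops the count back to 1.

```c
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#else
#include <dlfcn.h>
#endif

/*
 * Hypothetical helper (not part of NSPR or any web server API).
 * Takes one extra reference on the library at `path` and deliberately
 * never releases it, so the library stays mapped for the process lifetime.
 * Returns 0 on success, -1 on failure.
 */
int pin_library(const char *path)
{
    if (path == NULL)
        return -1;   /* dlopen(NULL) would return the main program instead */
#ifdef _WIN32
    /* Each successful LoadLibrary call increments the module refcount. */
    HMODULE handle = LoadLibraryA(path);
    return handle != NULL ? 0 : -1;
#else
    /* Each successful dlopen call increments the library refcount. */
    void *handle = dlopen(path, RTLD_NOW);
    return handle != NULL ? 0 : -1;
#endif
    /* The handle is intentionally leaked: no FreeLibrary / dlclose. */
}
```

A plugin would call something like this once from its own initialization entry point, passing its own file name.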

I'm wondering if we should continue to put these workarounds one at a time in each library, or if NSPR itself could take care of the problem in a generic way, e.g. by leaking itself and always staying resident automatically if it knows there is no hope of unloading ever succeeding.
Summary: non-NSPR programs may crash if they load libraries that use NSPR → non-NSPR programs may crash if they unload libraries that use NSPR

it'd obviously be nicer if nspr could clean up correctly :).

timeless,

Yes, it would be much nicer, but I don't think the current NSPR API design allows fixing this properly. The problem is that the NSPR initialization usually happens implicitly. The first module that calls any NSPR function initializes it automatically. This initialization isn't refcounted. Thus, even if we fix PR_Cleanup to shut down the NSPR internal threads, as soon as one module called it, other modules that also use NSPR would be dead in the water at worst, and at best would be restarting NSPR automatically. So the consequence is that this fix would work if you have one web server plug-in using NSPR, but not two.
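To make the two-module problem concrete, here is a toy model of implicit, non-refcounted initialization. All names (lib_do_work, lib_cleanup, and so on) are hypothetical and this is not NSPR code; it is also single-threaded on purpose, to keep the sketch short.

```c
/* Toy model of a library with implicit, non-refcounted initialization. */

static int g_initialized = 0;   /* a flag, not a count */
static int g_init_calls  = 0;   /* how many times the runtime was (re)started */

/* Called on demand from every public entry point. */
static void implicit_init(void)
{
    if (!g_initialized) {
        g_initialized = 1;      /* would start the internal threads here */
        g_init_calls++;
    }
}

/* Any public function initializes the library if needed.
 * Returns the number of times the runtime has been started so far. */
int lib_do_work(void)
{
    implicit_init();
    return g_init_calls;
}

/* Unconditional teardown: it cannot know whether other modules
 * in the same process still depend on the library. */
void lib_cleanup(void)
{
    g_initialized = 0;          /* would stop the internal threads here */
}

int lib_is_initialized(void)
{
    return g_initialized;
}
```

If module A and module B both call lib_do_work and A then calls lib_cleanup before being unloaded, the library is torn down underneath B, and B's next call quietly starts it again.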

IMO, the proper fix requires an explicit refcounted initialization / deinitialization of NSPR. Only the last caller of PR_Cleanup would actually stop the internal threads when the refcount goes to zero.
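A rough sketch of what such a refcounted pair could look like follows. The names rt_init and rt_cleanup are made up for this example and are not an existing or proposed NSPR API; the "threads running" flag merely stands in for starting and stopping the real internal threads, and a pthread mutex is assumed for locking.

```c
#include <pthread.h>

static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;
static int g_refcount = 0;          /* outstanding rt_init calls */
static int g_threads_running = 0;   /* stand-in for internal threads */

/* Hypothetical explicit init. Returns the refcount after the call. */
int rt_init(void)
{
    int count;
    pthread_mutex_lock(&g_lock);
    if (g_refcount == 0)
        g_threads_running = 1;      /* first caller starts the threads */
    count = ++g_refcount;
    pthread_mutex_unlock(&g_lock);
    return count;
}

/* Hypothetical explicit cleanup. Returns the refcount after the call,
 * or -1 if it was called without a matching rt_init. */
int rt_cleanup(void)
{
    int count;
    pthread_mutex_lock(&g_lock);
    if (g_refcount == 0) {
        count = -1;                 /* unbalanced call, nothing to do */
    } else {
        count = --g_refcount;
        if (count == 0)
            g_threads_running = 0;  /* last caller stops (and would join) them */
    }
    pthread_mutex_unlock(&g_lock);
    return count;
}

int rt_threads_running(void)
{
    int running;
    pthread_mutex_lock(&g_lock);
    running = g_threads_running;
    pthread_mutex_unlock(&g_lock);
    return running;
}
```

With two plug-ins each calling rt_init once, the first rt_cleanup leaves the threads running and only the second one stops them. This only helps if every module uses the explicit pair; implicit initialization from other entry points would bypass the count.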

Also, note that there is a lot of work left to clean up PR_Cleanup. See bugs 255452, 254983, 254987.

Another library affected by this problem is the LDAP SDK. See bug 286598. This one is a Unix crash in _pt_thread_death, i.e. the thread termination callback. So I think the problem of unloading NSPR is not limited to the Windows NT build.
Julien, you're right that the problem of unloading NSPR is not limited
to the WINNT build.  We just ran into a crash on Linux, which is similar
to bug 286598 of the LDAP C SDK:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=216718#c11
QA Contact: wtchang → nspr

The bug assignee is inactive on Bugzilla, so the assignee is being reset.

Assignee: wtc → nobody
Severity: normal → S3