Open Bug 254983 Opened 20 years ago Updated 2 years ago

There is no way to manually detach NSPR from native threads. PR_Cleanup hangs.

Categories

(NSPR :: NSPR, defect)

defect

Tracking

(Not tracked)

People

(Reporter: julien.pierre, Unassigned)

Details

(Whiteboard: Remaining work: reimplement PR_DetachThread for non-Unix platforms)

Attachments

(4 files)

We have an application which calls NSPR functions on threads that weren't
created by NSPR. These are callback functions of a plug-in running in a
(Microsoft IIS) web server program. In this case, NSPR attaches to the server
thread automatically the first time an NSPR call is made on that thread.

Later, the application calls PR_Cleanup. Then, NSPR tries to join these threads,
which are now considered NSPR user threads. On Windows, the threads
automatically gets detached by NSPR through the use of the DLL_INIT_THREAD
DllMain entry point . However, we can't control these threads because they were
created by the server. The plug-in is only allowed to return up to the caller,
but not to exit from the thread.

It so happens that the server calls the shutdown function of the plug-in, which
calls PR_Cleanup, before those threads have exited. Therefore, the threads are
still attached, and PR_Cleanup waits forever for them to exit.

We need a way for the plug-in threads to manually detach from NSPR before they
return control to IIS, so that cleanup will not hang.

There is a deprecated API called PR_DetachThread which is currently a no-op, but
for which there is an implemention that is ifdef'ed out . It looks like this
implementation would solve the problem and let us manually detach. Is there any
reason not to put this code back in ? I don't think the manual detach is
harmful, especially since any call to NSPR APIs on that thread will
automatically re-attach to the thread.
This is a ZIP of the pseudo-code for an ISAPI plug-in using NSPR, which does
nothing but initialize NSPR and create a lock in its initialization, use the
lock in the main callback, and destroy the lock and cleanup NSPR in the
termination function.
Comment on attachment 155896 [details]
zip of ISAPI plug-in using NSPR 

Forgot to mention that this ZIP includes source code and Makefile . The plug-in
builds under Windows under mozilla/security/nss/lib/isapi .
Attachment #155896 - Attachment description: test case → zip of ISAPI plug-in using NSPR
This is the program to run in order to load and test the plug-in.
The ZIP includes the Makefile . It builds under mozilla/security/nss/cmd/server
.

This program takes two arguments "restarts" and "requests" .
- "restarts" is the number of time the server will load/unload the plug-in,
which will in turn init/cleanup NSPR .

- "requests" is the number of calls the server will make to the plug-in, on its
own native worker thread, before unloading the plug-in.

When setting requests to 1 or greater, the followig scenario occurs :
1. the server calls the plug-in on its own native worker thread
2. the plug-in calls PR_Lock, which causes NSPR to attach to the server thread
3. the plug-in returns control to the server worker thread
4. the worker thread notifies the main server thread that it is done with
requests (the plug-in code will no longer be called)
5. the main thread calls the plug-in cleanup function, which calls PR_Cleanup
6. PR_Cleanup hangs in PR_WaitCondvar, because it is trying to join the
server's worker thread, which has not exited

This "server" program mimics the behavior of IIS . Of course, we don't have
source to the actual IIS, or any opportunity to change it.

Fixes to the "plug-in" program are on the other hand possible. We would like to
keep using NSPR for it.
This allows a manual thread detach, as required in this application.
Comment on attachment 155992 [details] [diff] [review]
bring back PR_Detach implementation

>Index: pruthr.c
>===================================================================
>RCS file: /cvsroot/mozilla/nsprpub/pr/src/threads/combined/pruthr.c,v
>retrieving revision 3.36
>diff -u -1 -0 -r3.36 pruthr.c
>--- pruthr.c	25 Apr 2004 15:01:02 -0000	3.36
>+++ pruthr.c	13 Aug 2004 00:41:51 -0000
>@@ -100,20 +100,21 @@
> {
> #if defined(XP_MAC)
> #pragma unused (maxPTDs)
> #endif
> 
>     PRThread *thread;
>     PRThreadStack *stack;
> 
>     _pr_terminationCVLock = PR_NewLock();
>     _pr_activeLock = PR_NewLock();
>+	_pr_userActive = 0;

Why is this needed?

> PR_IMPLEMENT(void) PR_DetachThread(void)
> {
>     /*
>      * On IRIX, Solaris, and Windows, foreign threads are detached when
>      * they terminate.
>      */
>-#if !defined(IRIX) && !defined(WIN32) \
>+#if !defined(IRIX) /* && !defined(WIN32) \ */ \
>         && !(defined(SOLARIS) && defined(_PR_GLOBAL_THREADS_ONLY))
>     PRThread *me;
>     if (_pr_initialized) {
>         me = _PR_MD_GET_ATTACHED_THREAD();
>         if ((me != NULL) && (me->flags & _PR_ATTACHED))
>             _PRI_DetachThread();
>     }
> #endif
> }

The comment needs to be updated to reflect the new
code.

You should remove the whole #if so this is done for
all platforms.	You also need to fix the other two
PR_DetachThread implementations (in btthread.c and
ptthread.c).
Attachment #155992 - Flags: review-
I don't think the _pr_userActive = 0 is necessary for this patch. It just made
the diff because of other fixes. 

I wasn't sure about removing the whole #if, I wanted your feedback on it before.
I will take a look at other two thtread implementations.

Setting target to P3 because we have a workaround - which is simply never to
call PR_Cleanup, now that the plug-in never gets unloaded/reloaded ...
Priority: -- → P3
Target Milestone: --- → 4.6
I am now motivated to reimplement PR_DetachThread.  It could be used
to work around a problem we ran into on Linux:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=216718#c11
Status: NEW → ASSIGNED
Priority: P3 → P2
Target Milestone: 4.6 → 4.6.5
QA Contact: wtchang → nspr
This patch implements PR_DetachThread for Unix.

PR_DetachThread means "remove the PRThread structure associated with
the current thread".  NSPR creates a PRThread structure for each thread.
The pointer to this PRThread structure is stored as thread-private data.
So PR_DetachThread should:
1. get the thread-private data and save it in the variable 'thred'.
2. if 'thred' is NULL, there is no PRThread structure for this thread,
   so we're done.
3. otherwise, pass 'thred' to the _pt_thread_death function to free the
   PRThread structure.
4. finally, we must set the thread-private data to NULL, to prevent the
   pthread library from calling the thread-private data's destructor
   (_pt_thread_death) when this thread terminates.

You may have noticed that this patch is related to the patch in bug 362768.
Indeed, both patches address the same issue, that any thread that stays
around after libnspr4.so is unloaded must not have a non-NULL thread-private
data, otherwise the pthread library will call the destructor _pt_thread_death
on it.  Unfortunately there is no pthread function to change or unregister a
thread-private data destructor, so the only solution is to set the
thread-private data to NULL in each thread.  The patch in bug 362768 only
takes care of the thread that unloads libnspr4.so.  Other threads need to
call PR_DetachThread to set their thread-private data to NULL.  (Alternatively,
they need to terminate before libnspr4.so is unloaded.)
Attachment #247615 - Flags: superreview?(nelson)
Attachment #247615 - Flags: review?(alexei.volkov.bugs)
Comment on attachment 247615 [details] [diff] [review]
Patch for Unix [checked in]

r=nelson
Attachment #247615 - Flags: superreview?(nelson) → superreview+
Comment on attachment 247615 [details] [diff] [review]
Patch for Unix [checked in]

I checked in the "patch for Unix" on the NSPR trunk (NSPR 4.7).

Checking in ptthread.c;
/cvsroot/mozilla/nsprpub/pr/src/pthreads/ptthread.c,v  <--  ptthread.c
new revision: 3.69; previous revision: 3.68
done
Comment on attachment 247615 [details] [diff] [review]
Patch for Unix [checked in]

I checked in the "patch for Unix" on the NSPRPUB_PRE_4_2_CLIENT_BRANCH
(Mozilla 1.9 alpha 2).

Checking in ptthread.c;
/cvsroot/mozilla/nsprpub/pr/src/pthreads/ptthread.c,v  <--  ptthread.c
new revision: 3.42.2.15; previous revision: 3.42.2.14
done
This bug can be fixed in NSPR 4.7.  Our immediate problem can
be better addressed by the patch in bug 362768.
Target Milestone: 4.6.5 → 4.7
Comment on attachment 247615 [details] [diff] [review]
Patch for Unix [checked in]

Sorry for delay. The patch looks good.
Attachment #247615 - Flags: review?(alexei.volkov.bugs) → review+
Priority: P2 → P1
Whiteboard: Remaining work: reimplement PR_DetachThread for non-Unix platforms
is this ready for check in?
Comment on attachment 247615 [details] [diff] [review]
Patch for Unix [checked in]

please ignore my previous comment.

marking patch as checked in, see comment 10 and comment 11
Attachment #247615 - Attachment description: Patch for Unix → Patch for Unix [checked in]
Wan-Teh, Kai, what is the status of this bug?
One patch got r-, one got r+ and is checked in.
What remains to be done?
This bug is not fixed yet.  The remaining work is specified in the
Status Whiteboard: reimplement PR_DetachThread for non-Unix platforms.

We can mark this bug fixed, changing the bug's summary to reflect that
it's only fixed for Unix, and open a new bug for the remaining work.

The bug assignee didn't login in Bugzilla in the last 7 months and this bug has priority 'P1'.
:KaiE, could you have a look please?
For more information, please visit auto_nag documentation.

Assignee: wtc → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(kaie)
Flags: needinfo?(kaie)
Priority: P1 → --
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: