Closed Bug 214411 Opened 22 years ago Closed 21 years ago

Solaris native thread support: GetThreadPrivate fails if thread not created via NSPR

Categories

(NSPR :: NSPR, defect)

Sun
Solaris
defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: gerard.roos, Assigned: wtc)

Details

Attachments

(2 files, 4 obsolete files)

User-Agent: Mozilla/4.8 [en] (X11; U; SunOS 5.8 sun4u) Build Identifier: NSPR distribution v4.3 With solaris native thread support and with a thread created 'behind the scenes' of NSPR: the thread's NSPR data obtained by _MD_CURRENT_THREAD() (i.e. _pr_current_thread_tls()) is NULL, producing a SEGV in nspr-4.3/mozilla/nsprpub/pr/src/threads/prtpd.c:228 (dereferencing NULL pointer 'self'). Reproducible: Always Steps to Reproduce: 1. Build NSPR (configure --with-native-threads) 2. create a thread yourself without using NSPR 3. in that thread, call PR_GetCurrentThread(), which returns NULL (API doc (out-dated?) says "Never returns NULL"). 4. PR_GetCurrentThread() is also used in PR_GetThreadPrivate() and is assumed to never return NULL. Actual Results: NULL pointer dereference. Expected Results: Lazy initialization of NSPR-specific thread data on that thread upon access (like it is done in the Windows (and probably other platforms as well) implementation if I am not mistaken...)
Version: other → 4.3
Until now this patch has been working for me. However, I am not fully certain that _PR_InitThreads() is intended (and thus legal) to be called at any moment later than NSPR initialization. Furthermore, I suspect---I do not yet fully understand every line of code, sorry to provide a patch anyhow---that by calling _PR_InitThreads() after NSPR initialization, I am doing more work than is actually required to initialize the current thread's missing NSPR-specific data structures (but the extra work--if any--does not seem to hurt so far).
Thanks for the bug report and the patch. Your patch is not the right way to solve the problem. But you don't need to solve this problem in the first place. You should build NSPR without any configure option. NSPR uses the pthread interface by default on Solaris. The pthread interface is merely another interface to the native thread engine. The pthread-based NSPR works with threads not created via NSPR. Please let me know if you must use the Solaris native threads (the thr_ interface) based NSPR. That implementation is deprecated and I am not planning to support it. If you must use that implementation, I can give you some pointers on the right way to solve the problem.
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Thanks for your answer. I have been using NSPR with pthread support (default build) in the first place for building my shared library. However, 'If a POSIX threads program calls fork(2), it implicitly calls fork1(2)' (man pthreads). This change in forking behavour broke the (existing) application into which I was loading my library. I thus had to build the library without pthread dependency. This problem led to the 'solution' proposed above. I would be grateful could you give me hints for a better (i.e. correct) solution.
_pr_current_thread_tls is the right function to modify. You want to make it look like the _MD_CURRENT_THREAD function in mozilla/nsprpub/pr/src/md/windows/w95thred.c. thr_getspecific(threadid_key, (void **)&ret); if (NULL == thread) { thread = _PRI_AttachThread( PR_USER_THREAD, PR_PRIORITY_NORMAL, NULL, 0); } PR_ASSERT(thread != NULL); return thread; Then, you need to set up a destructor for the thread specific data key 'threadid_key', in which you can free the PRThread structure for the attached threads. The destructor needs to be registered using the THR_KEYCREATE call in mozilla/nsprpub/pr/src/md/unix/solaris.c (we are passing NULL right now). The equivalent destructor for the pthreads-based implementation is _pt_thread_death in mozilla/nsprpub/pr/src/pthreads/ptthread.c. You need to write the equivalent code for solaris.c (and perhaps some files in mozilla/nsprpub/pr/src/threads/combined/). Is it possible for you to modify the existing application so that it works with fork1(2) too? fork1(2) is the portable fork semantics for POSIX Threads.
Regarding the destructor for 'threadid_key': It is possible that you just need to register it using the THR_KEYCREATE call in solaris.c, and it would look similar to the code under the DLL_DETACH_THREAD case in mozilla/nsprpub/pr/src/md/windows/w95dllmain.c. Something like this: static void threadid_key_destructor(void *value) { PRThread *me = (PRThread *)value; PR_ASSERT((me != NULL) && (me->flags & _PR_ATTACHED)); _PRI_DetachThread(); } Don't look at _pt_thread_death in ptthread.c.
Attachment #128836 - Attachment is obsolete: true
_MD_GET_ATTACHED_THREAD is defined in _solaris.h
Since _MD_GET_ATTACHED_THREAD (_pr_current_thread_tls on Solaris) is called from within _PRI_Attach_Thread, a second function _pr_attached_thread_tls() is required to prevent an endless loop. It simply returns the attached thread data (as did _pr_current_thread_tls() before), whether valid or NULL.
The last three patches together (one patchfile per file, sorry for not providing a single patchfile) should solve the problem as requested/expected, i.e. tests done so far have shown correct behaviour and have completed successfully.
Wan-Teh Chang, any comments on the patches by Gerard?
Attachment #129589 - Flags: review+
Comment on attachment 129590 [details] [diff] [review] leave _MD_GET_ATTACHED_THREAD undefined on Solaris This patch needs a minor tweak: !defined(SOLARIS) should be replaced by !(defined(SOLARIS) && defined(_PR_GLOBAL_THREADS_ONLY)) because we can also build NSPR on Solaris with _PR_LOCAL_THREADS_ONLY defined.
Attachment #129590 - Flags: review-
Comment on attachment 129591 [details] [diff] [review] register threadid_key destructor and correct lazy init of thread data This patch has a subtle bug, which I discovered only after running an NSPR test (foreign.c, modified to create Solaris native threads) inside a debugger. It turns out that when Solaris destroys thread specific data, it first sets the thread specific data to NULL and then passes the value to the destructor. Consider our destructor. static void threadid_key_destructor(void *value) { PRThread *me = (PRThread *)value; PR_ASSERT((me != NULL) && (me->flags & _PR_ATTACHED)); _PRI_DetachThread(); } When it is invoked, Solaris sets the data for threadid_key to NULL, and passes the value to it. But we are not passing this value to _PRI_DetachThread()! _PRI_DetachThread is supposed to get that value with a _PR_MD_CURRENT_THREAD() call, but _PR_MD_CURRENT_THREAD() will see that the current thread is NULL and attach the thread again. So we end up detaching the newly attached PRThread structure, and the old PRThread structure is leaked. The right solution is to modify _PRI_DetachThread() to take a PRThread * parameter. But this will require modify all the callers of _PRI_DetachThread(). A simpler workaround is for our destructor to set the current thread before calling _PRI_DetachThread(): static void threadid_key_destructor(void *value) { PRThread *me = (PRThread *)value; PR_ASSERT((me != NULL) && (me->flags & _PR_ATTACHED)); + /* + * The Solaris thread library sets the thread specific + * data (the current thread) to NULL before invoking + * the destructor. We need to restore it to prevent the + * _PR_MD_CURRENT_THREAD() call in _PRI_DetachThread() + * from attaching the thread again. + */ + _PR_MD_SET_CURRENT_THREAD(me); _PRI_DetachThread(); } Gerard, could you add this _PR_MD_SET_CURRENT_THREAD(me) call to your threadid_key_destructor and redo your tests?
Attachment #129591 - Flags: review-
Attached patch Proposed patchSplinter Review
This patch subsumes Gerard's three patches. It also includes modifications to the foreign.c and provider.c tests to create Solaris native threads.
Attachment #129589 - Attachment is obsolete: true
Attachment #129590 - Attachment is obsolete: true
Attachment #129591 - Attachment is obsolete: true
Patch checked into the NSPR trunk.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Whiteboard: [4.5]
Target Milestone: --- → 4.4.1
hi, sorry for my late reply. I have run into a bug in threadid_key_destructor() lately. The assertion ((me != NULL) && (me->flags & _PR_ATTACHED)) as proposed in #5 is too restrictive: the PRThreads flag can be _PR_PRIMORDIAL. During implicit initialization (_PR_ImplicitInitialization -> _PR_InitStuff -> _PR_InitThreads), InitThreads sets the thread's flag to _PR_PRIMORDIAL, but never to _PR_ATTACHED. So, the corrected destructor function is: static void threadid_key_destructor(void *value) { PRThread *me = (PRThread *)value; PR_ASSERT(me != NULL); /* the thread could be PRIMORDIAL (thus not ATTACHED) */ if (me->flags & _PR_ATTACHED) { /* * The Solaris thread library sets the thread specific * data (the current thread) to NULL before invoking * the destructor. We need to restore it to prevent the * _PR_MD_CURRENT_THREAD() call in _PRI_DetachThread() * from attaching the thread again. */ _PR_MD_SET_CURRENT_THREAD(me); _PRI_DetachThread(); } } If the thread is not _PR_ATTACHED, nothing is done. This behaviour is similar to the one found in e.g. irix.c: irix_detach_sproc().
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
You are right. Here is your contribution in the form of a patch.
Supplemental patch checked into the NSPR trunk (NSPR 4.5) and NSPR_4_4_BRANCH (NSPR 4.4.1.). Please mark the bug VERIFIED if the code is now working to your satisfaction.
Status: REOPENED → RESOLVED
Closed: 22 years ago21 years ago
Resolution: --- → FIXED
all tests passed, works so far.
Status: RESOLVED → VERIFIED
Whiteboard: [4.5]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: