Closed
Bug 749849
Opened 13 years ago
Closed 12 years ago
NSPR pthread_key_t leak and memory corruption
Categories
(NSPR :: NSPR, defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
4.10.3
People
(Reporter: elio.maldonado.batiz, Assigned: KaiE)
Details
(Whiteboard: summary: comment 45)
Attachments
(8 files, 7 obsolete files)
1.06 KB,
text/plain
|
Details | |
1.54 KB,
text/plain
|
Details | |
2.00 KB,
patch
|
rrelyea
:
review+
|
Details | Diff | Splinter Review |
2.65 KB,
patch
|
Details | Diff | Splinter Review | |
1.27 KB,
patch
|
Details | Diff | Splinter Review | |
737 bytes,
patch
|
KaiE
:
review+
|
Details | Diff | Splinter Review |
2.59 KB,
patch
|
rrelyea
:
review+
wtc
:
checked-in+
|
Details | Diff | Splinter Review |
2.15 KB,
patch
|
KaiE
:
review+
wtc
:
checked-in+
|
Details | Diff | Splinter Review |
This bug was originally reported against nspr 4.8.6 on RHEL 5 with heavy usage but with the reproducers provided I could see on RHEL 6 and fedora-16.
Reporter | ||
Comment 1•13 years ago
|
||
Reporter | ||
Comment 2•13 years ago
|
||
Assignee | ||
Updated•13 years ago
|
Attachment #619215 -
Attachment mime type: text/x-csrc → text/plain
Assignee | ||
Updated•13 years ago
|
Attachment #619217 -
Attachment mime type: text/x-csrc → text/plain
Assignee | ||
Comment 3•13 years ago
|
||
Comment on attachment 619218 [details] [diff] [review]
Proof of concept fix from Joe Orton (fix the leak)
Did you intend to ask Wan-Teh for review?
Attachment #619218 -
Attachment is patch: true
Attachment #619218 -
Flags: review?(wtc)
Reporter | ||
Comment 4•13 years ago
|
||
This hasn't gotten attention for quite some time, I hope it will soon when Wan-Teh comes back. Red Hat folks doing planning are asking me if we can get it on time for an upcoming RHEL update.
Assignee | ||
Comment 5•13 years ago
|
||
Elio, I propose to bring such issues to attention during the weekly conf call, if they are urgent. Bugmail is too easy to miss.
Assignee | ||
Comment 6•13 years ago
|
||
Comment on attachment 619218 [details] [diff] [review]
Proof of concept fix from Joe Orton (fix the leak)
I agree with this patch.
The manual page for pthread_key_create says that it's the responsibility of the application to cleanup the resources associated to the key.
Because _PR_Fini is a function called at unloading a shared library, I agree this is the correct place to perform the cleanup. In particular the existing comment suggests to do it here.
r=kaie
However, I propose a minor enhancement for portability. We should define a symbol, in the same way as it's being done for pthread_key_create, because the name of that function may differ between platforms.
Attachment #619218 -
Flags: review?(wtc) → review+
Assignee | ||
Comment 7•13 years ago
|
||
Slightly improved patch for portability.
Assignee | ||
Comment 8•13 years ago
|
||
Explanation what this bug is about and why we need it fixed:
If the NSPR shared libraries get repeatedly loaded and unloaded during the lifetime of a single process, NSPR might repeatedly allocate the same key again and again, with never freeing it up. Eventually the memory available for thread specific keys will be exhausted and the application will fail.
By cleaning up we avoid this problem and increase reliability.
Just a note, there are really two separate bugs here:
a) memory leak => NSPR never calls pthread_key_delete()
b) memory corruption => NSPR corrupts memory if pthread_key_create() fails by not handling the error, it then "steals" another application's TSD and writes over it. This is arguably the more severe bug.
Calling pthread_key_delete() in _PR_Fini fixes (a) but not (b).
Assignee | ||
Comment 10•13 years ago
|
||
Joe, I agree with you.
But I don't have a quick solution for (b).
If we run in a well behaved environment, and NSPR cleans up after itself (a),
then it's much less likely that we'll suffer from (b), do you agree?
I understand that (a) is the quick fix that we want to get done.
Assignee | ||
Comment 11•13 years ago
|
||
I believe it's difficult to find a solution for (b)
that would allow the code to function anywhy.
It's basically a kind of out-of-memory error of critical coordination memory.
But maybe we should detect (b) and abort the application.
Assignee | ||
Comment 12•13 years ago
|
||
Assignee | ||
Updated•13 years ago
|
Attachment #645312 -
Attachment description: Patch v2 → Patch v2 (fix the leak)
Assignee | ||
Updated•13 years ago
|
Attachment #619218 -
Attachment description: Proof of concept fix from Joe Orton → Proof of concept fix from Joe Orton (fix the leak)
Comment 13•13 years ago
|
||
Comment on attachment 645312 [details] [diff] [review]
Patch v2 (fix the leak) [checked in]
r+ rrelyea.
This patch is safe. We are freeing our thread keyjust before unloading nspr (which will clear the state of _pr_initialized if we ever reload NSPR).
I'm r+ it but it's not complete. If we call PR_Cleanup() before we unload, we won't free the thread key.
bob
Attachment #645312 -
Flags: review+
Assignee | ||
Comment 14•13 years ago
|
||
Checking in include/md/_pth.h;
/cvsroot/mozilla/nsprpub/pr/include/md/_pth.h,v <-- _pth.h
new revision: 3.40; previous revision: 3.39
done
Checking in src/pthreads/ptthread.c;
/cvsroot/mozilla/nsprpub/pr/src/pthreads/ptthread.c,v <-- ptthread.c
new revision: 3.97; previous revision: 3.96
done
Assignee | ||
Updated•13 years ago
|
Attachment #645312 -
Attachment description: Patch v2 (fix the leak) → Patch v2 (fix the leak) [checked in]
Assignee | ||
Comment 15•13 years ago
|
||
> I'm r+ it but it's not complete. If we call PR_Cleanup() before we unload,
> we won't free the thread key.
... because PR_Cleanup() will set _pr_initialized to FALSE.
This makes it difficult to decide in _PR_FINI whether it's necessary
to call pthread_key_delete.
I propose that we avoid this difficulty and call pthread_key_delete
from inside PR_Cleanup().
Assignee | ||
Comment 16•13 years ago
|
||
(In reply to Kai Engert (:kaie) from comment #15)
> > I'm r+ it but it's not complete. If we call PR_Cleanup() before we unload,
> > we won't free the thread key.
>
> ... because PR_Cleanup() will set _pr_initialized to FALSE.
>
> This makes it difficult to decide in _PR_FINI whether it's necessary
> to call pthread_key_delete.
>
> I propose that we avoid this difficulty and call pthread_key_delete
> from inside PR_Cleanup().
Actually, maybe that's too risky if there's still another thread active.
It would be safer to add another bool to pt_book that says whether or not we must free the key.
Assignee | ||
Comment 17•13 years ago
|
||
Assignee: wtc → kaie
Attachment #619218 -
Attachment is obsolete: true
Attachment #645351 -
Attachment is obsolete: true
Attachment #645381 -
Flags: review?(rrelyea)
Comment 18•13 years ago
|
||
Comment on attachment 645381 [details] [diff] [review]
Incremental patch v4, cleanup scenario + abort on failure
r+ rrelyea, but I'd also like wtc to review since it's in NSPR.
Also, I think we can check pt_book.mustFreeThreadKey beforew we try to do a PT_PTHREAD_KEY_CREATE() and not call PT_PTHREAD_KEY_CREATE if pt_book.mustFreeThreadKey is true.
Also, I assume your change of the PR_ASSERTs are so that you blow up even if DEBUG is not on. I'd like wtc's opinion on whether or not this is correct, or if we should just return. I totally agree that we shouldn't set pt_book.mustFreeThreadKey if PT_PTHREAD_KEY_CREATE didn't succeed.
bob
Attachment #645381 -
Flags: review?(rrelyea) → review+
Comment 19•13 years ago
|
||
Comment on attachment 645312 [details] [diff] [review]
Patch v2 (fix the leak) [checked in]
Review of attachment 645312 [details] [diff] [review]:
-----------------------------------------------------------------
::: mozilla/nsprpub/pr/src/pthreads/ptthread.c
@@ +975,4 @@
> rv = pthread_setspecific(pt_book.key, NULL);
> PR_ASSERT(0 == rv);
> }
> + _PT_PTHREAD_KEY_DELETE(pt_book.key);
Please remove the _PT_PTHREAD_KEY_DELETE macro definition
from _pth.h and just use pthread_key_delete. Thanks.
Comment 20•13 years ago
|
||
Comment on attachment 645381 [details] [diff] [review]
Incremental patch v4, cleanup scenario + abort on failure
Review of attachment 645381 [details] [diff] [review]:
-----------------------------------------------------------------
::: mozilla/nsprpub/pr/src/pthreads/ptthread.c
@@ +42,5 @@
> PRLock *ml; /* a lock to protect ourselves */
> PRCondVar *cv; /* used to signal global things */
> PRInt32 system, user; /* a count of the two different types */
> PRUintn this_many; /* number of threads allowed for exit */
> pthread_key_t key; /* private private data key */
Please change "private private" to "thread private".
@@ +43,5 @@
> PRCondVar *cv; /* used to signal global things */
> PRInt32 system, user; /* a count of the two different types */
> PRUintn this_many; /* number of threads allowed for exit */
> pthread_key_t key; /* private private data key */
> + PRBool mustFreeThreadKey;
Please document this boolean member. For example:
/* whether 'key' should be deleted */
I suggest renaming this boolean member "keyCreated" or "keyValid",
to mean whether 'key' has been created or is valid. The name
"mustFreeThreadKey" is a little long.
@@ +970,5 @@
> + * or we already went through PR_Cleanup()
> + */
> + if (pt_book.mustFreeThreadKey)
> + {
> + _PT_PTHREAD_KEY_DELETE(pt_book.key);
You should set pt_book.mustFreeThreadKey to PR_FALSE after
this line.
You can do something like this:
if (pt_book.mustFreeThreadKey) {
pthread_key_delete(pt_book.key);
pt_book.mustFreeThreadKey = PR_FALSE;
}
/* Either NSPR was never successfully initialized or PR_Cleanup has been called. */
if (!_pr_initialized) return;
You need to remove the _PT_PTHREAD_KEY_DELETE call you added
in the previous patch.
Assignee | ||
Comment 21•13 years ago
|
||
Wan-Teh, I'm trying to understand your proposal from comment 20.
The way I understand it doesn't make sense, because pt_book.key would be used after having destroying it already.
I will attach a new patch that addresses the renames and comments you had proposed.
Updated•13 years ago
|
Status: NEW → ASSIGNED
Priority: -- → P1
Hardware: x86_64 → All
Target Milestone: --- → 4.9.3
Version: 4.9 → other
Reporter | ||
Comment 23•13 years ago
|
||
Hi Wan-Teh, have had a chance to review Kai's attachment? I was hoping this would make to cut for 4.9.3.
Comment 24•13 years ago
|
||
Comment on attachment 652387 [details] [diff] [review]
Patch v5
Sorry, I thought this bug is already fixed. I will take a look
at this patch.
Comment 25•13 years ago
|
||
Kai, please review this patch.
I take a simpler approach. I just add a pthread_key_delete call
to PR_Cleanup, but only when there is no other thread active.
(You pointed out this issue in comment 16.)
Let's see if this simpler approach is good enough in practice.
If not, then we will use the approach of your patch v5 (only
call pthread_key_delete in _PR_Fini and use a keyCreated boolean
to keep track of whether the key needs to be deleted).
I didn't fix problem b (memory corruption) that Joe Orton reported
in comment 9. We need to deal with that separately. Many things
break if NSPR initialization fails.
Note: I verified that when libnspr4.so is unloaded and reloaded,
_pr_initialized is 0, even if we don't call PR_Cleanup before
unloading libnspr4.so. This confirms what Bob said in comment 13.
I've checked in the patch on the NSPR trunk (NSPR 4.9.3). I will
address your review comments with a new patch.
Checking in pr/include/md/_pth.h;
/cvsroot/mozilla/nsprpub/pr/include/md/_pth.h,v <-- _pth.h
new revision: 3.41; previous revision: 3.40
done
Checking in pr/src/pthreads/ptthread.c;
/cvsroot/mozilla/nsprpub/pr/src/pthreads/ptthread.c,v <-- ptthread.c
new revision: 3.98; previous revision: 3.97
done
Attachment #652387 -
Attachment is obsolete: true
Attachment #652387 -
Flags: review?(wtc)
Attachment #673474 -
Flags: review?(kaie)
Comment 26•13 years ago
|
||
This cumulative patch summarizes all the patches that have
been checked in.
Updated•13 years ago
|
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Comment 27•13 years ago
|
||
(In reply to Kai Engert (:kaie) from comment #21)
> Wan-Teh, I'm trying to understand your proposal from comment 20.
>
> The way I understand it doesn't make sense, because pt_book.key would be
> used after having destroying it already.
Just for the record: my suggestion was wrong. Thank you for catching my
mistake.
Assignee | ||
Comment 28•13 years ago
|
||
Our test machine buildnss01 frequently reports a failure with a crashed selfserv.
I just looked at the crash, and it might be related to this bug:
#0 0x0000003582a30285 in raise () from /lib64/libc.so.6
#1 0x0000003582a31d30 in abort () from /lib64/libc.so.6
#2 0x00002b06d6561c11 in PR_Assert (s=0x2b06d65907af "0 == rv", file=0x2b06d6590788 "../../../../pr/src/pthreads/ptthread.c", ln=772) at ../../../../pr/src/io/prlog.c:554
#3 0x00002b06d658655f in _pt_thread_death (arg=0x1e32d650) at ../../../../pr/src/pthreads/ptthread.c:772
#4 0x0000003583a05b19 in __nptl_deallocate_tsd () from /lib64/libpthread.so.0
#5 0x0000003583a0678b in start_thread () from /lib64/libpthread.so.0
#6 0x0000003582ad325d in clone () from /lib64/libc.so.6
I marked the line containing the assert with !!!
766 /* PR_TRUE for: call destructors */
767 _pt_thread_death_internal(arg, PR_TRUE);
768
769 if (NULL == thred)
770 {
771 rv = pthread_setspecific(pt_book.key, NULL);
772 !!! PR_ASSERT(0 == rv);
773 }
Assignee | ||
Comment 29•13 years ago
|
||
... and this crash was reported 40 minutes ago, meaning we still crash after Wan-Teh's most recent change to CVS.
Assignee | ||
Comment 30•13 years ago
|
||
reopening, because we're not done
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 31•13 years ago
|
||
Actually, it seems this new crash was INTRODUCED by the changes since Friday.
Sorry, but I don't understand yet why patch v6 is supposed to work. It's a long time since I've worked on this, and I must re-read all involved code before I can make a judgement.
But given that we have a new crash (and this checkin seems the only one that could have introduced it since Friday), I want to back it out, in order to see if the crash goes away.
If this patch v6 was responsible, then I propose we try to check in my proposed patch, to see if it works better.
Assignee | ||
Comment 32•13 years ago
|
||
Comment on attachment 673474 [details] [diff] [review]
Patch v6
backed out
Checking in include/md/_pth.h;
/cvsroot/mozilla/nsprpub/pr/include/md/_pth.h,v <-- _pth.h
new revision: 3.42; previous revision: 3.41
done
Checking in src/pthreads/ptthread.c;
/cvsroot/mozilla/nsprpub/pr/src/pthreads/ptthread.c,v <-- ptthread.c
new revision: 3.99; previous revision: 3.98
done
Comment 33•13 years ago
|
||
Kai, thank you for your help.
The assertion failures that caused the test failures are:
ssl.sh: Stress TLS RC4 128 with MD5 (session ticket, client auth) ----
selfserv starting at Mon Oct 22 09:30:08 EDT 2012
selfserv -D -p 8543 -d ../ext_server -n buildnss01.localdomain -B -s \
-e buildnss01.localdomain-ec -w nss -r -r -u -i ../tests_pid.14519 &
trying to connect to selfserv at Mon Oct 22 09:30:09 EDT 2012
tstclnt -p 8543 -h buildnss01.localdomain -q \
-d ../ext_client -v < /home/tinderbox/mozilla/security/tinderlight/data/buildnss01_trunk_64_DBG/mozilla/security/nss/tests/ssl/sslreq.dat
tstclnt: connecting to buildnss01.localdomain:8543 (address=127.0.0.1)
kill -0 15277 >/dev/null 2>/dev/null
selfserv with PID 15277 found at Mon Oct 22 09:30:09 EDT 2012
selfserv with PID 15277 started at Mon Oct 22 09:30:09 EDT 2012
strsclnt -q -p 8543 -d ../ext_client -w nss -V ssl3: -c 100 -C c -n ExtendedSSLUser -u \
buildnss01.localdomain
strsclnt started at Mon Oct 22 09:30:09 EDT 2012
strsclnt: -- SSL: Server Certificate Validated.
strsclnt: 0 cache hits; 1 cache misses, 0 cache not reusable
0 stateless resumes
strsclnt: 99 cache hits; 1 cache misses, 0 cache not reusable
99 stateless resumes
strsclnt completed at Mon Oct 22 09:30:09 EDT 2012
ssl.sh: #2079: Stress TLS RC4 128 with MD5 (session ticket, client auth) produced a returncode of 0, expected is 0. - PASSED
trying to kill selfserv with PID 15277 at Mon Oct 22 09:30:09 EDT 2012
kill -USR1 15277
Assertion failure: 0 == rv, at ../../../../pr/src/pthreads/ptthread.c:979
selfserv: 99 cache hits; 1 cache misses, 0 cache not reusable
99 stateless resumes, 0 ticket parse failures
selfserv: normal termination
1154582848[1e341f20]: Assertion failure: 0 == rv, at ../../../../pr/src/pthreads/ptthread.c:772
Assertion failure: 0 == rv, at ../../../../pr/src/pthreads/ptthread.c:772
./ssl.sh: line 173: 15277 Aborted (core dumped) ${PROFTOOL} ${BINDIR}/selfserv -D -p ${PORT} -d ${P_R_SERVERDIR} -n ${HOSTADDR} ${SERVER_OPTIONS} ${ECC_OPTIONS} -w nss ${sparam} -i ${R_SERVERPID} $verbose
selfserv -b -p 8543 2>/dev/null;
selfserv with PID 15277 killed at Mon Oct 22 09:30:09 EDT 2012
ssl.sh: #2080: kill_selfserv core detection step - Core file is detected - FAILED
The first assertion failure (much more common) is:
960 void _PR_Fini(void)
961 {
962 void *thred;
963 int rv;
964
965 if (!_pr_initialized) return;
966
967 _PT_PTHREAD_GETSPECIFIC(pt_book.key, thred);
968 if (NULL != thred)
969 {
970 /*
971 * PR_FALSE, because it is unsafe to call back to the
972 * thread private data destructors at final cleanup.
973 */
974 _pt_thread_death_internal(thred, PR_FALSE);
975 rv = pthread_setspecific(pt_book.key, NULL);
976 PR_ASSERT(0 == rv);
977 }
978 rv = pthread_key_delete(pt_book.key);
979 PR_ASSERT(0 == rv); <=== THIS FAILED.
980 /* TODO: free other resources used by NSPR */
981 /* _pr_initialized = PR_FALSE; */
982 } /* _PR_Fini */
The second assertion failure is:
750 static void _pt_thread_death(void *arg)
751 {
752 void *thred;
753 int rv;
754
755 _PT_PTHREAD_GETSPECIFIC(pt_book.key, thred);
756 if (NULL == thred)
757 {
758 /*
759 * Have PR_GetCurrentThread return the expected value to the
760 * destructors.
761 */
762 rv = pthread_setspecific(pt_book.key, arg);
763 PR_ASSERT(0 == rv);
764 }
765
766 /* PR_TRUE for: call destructors */
767 _pt_thread_death_internal(arg, PR_TRUE);
768
769 if (NULL == thred)
770 {
771 rv = pthread_setspecific(pt_book.key, NULL);
772 PR_ASSERT(0 == rv); <=== THIS FAILED.
773 }
774 }
Status: REOPENED → ASSIGNED
Comment 34•13 years ago
|
||
I checked in the code cleanup changes from Kai's patch v5
(attachment 652387 [details] [diff] [review]).
Checking in pr/include/md/_pth.h;
/cvsroot/mozilla/nsprpub/pr/include/md/_pth.h,v <-- _pth.h
new revision: 3.43; previous revision: 3.42
done
Checking in pr/src/pthreads/ptthread.c;
/cvsroot/mozilla/nsprpub/pr/src/pthreads/ptthread.c,v <-- ptthread.c
new revision: 3.100; previous revision: 3.99
done
Attachment #673474 -
Attachment is obsolete: true
Attachment #673474 -
Flags: review?(kaie)
Comment 35•13 years ago
|
||
Attachment #673476 -
Attachment is obsolete: true
Comment 36•13 years ago
|
||
Comment 37•13 years ago
|
||
I want to see if this assertion fails, without any change to PR_Cleanup.
Checking in ptthread.c;
/cvsroot/mozilla/nsprpub/pr/src/pthreads/ptthread.c,v <-- ptthread.c
new revision: 3.101; previous revision: 3.100
done
If this assertion fails, please back out this change only. Keep the
code cleanup changes.
Attachment #674082 -
Flags: review?(kaie)
Assignee | ||
Updated•13 years ago
|
Attachment #674082 -
Flags: review?(kaie) → review+
Reporter | ||
Comment 38•12 years ago
|
||
Why is this bug still in the assigned state?
Reporter | ||
Comment 39•12 years ago
|
||
Kai, Wan-Teh,
This bug is still in the assigned state. The patches where approved and committed to cvs but were not picked up with the migration to mercurial. Was this intentional because may be extra work to do or or oversight due to never having been marked as fixed? Red Hat needs to know soon we can pick it up for use on the upcoming update to RHEL-6. I am inclined to pick it up and I need to make a recommendation very soon. Please, let me know if it will added to the current sources in the hg repository and its safe for me to use it. Thanks in advance for your answer.
-Elio
Flags: needinfo?(kaie)
Assignee | ||
Comment 40•12 years ago
|
||
Elio, there are multiple patches in this bug, and it's not clear to me, which patches still need landing.
Elio, can you please make it clear, which patches are you interested in using? It's best if you quote the attachment numbers.
Flags: needinfo?(kaie)
Assignee | ||
Comment 41•12 years ago
|
||
Comment on attachment 674074 [details] [diff] [review]
Code cleanup changes from Kai's patch v5 [checked in]
This patch from comment 34 is contained in HG.
Assignee | ||
Comment 42•12 years ago
|
||
Comment on attachment 674082 [details] [diff] [review]
Assert that the pthread_key_delete call in _PR_Fini succeeds [checked in]
This patch from comment 37 is contained in HG
Assignee | ||
Comment 43•12 years ago
|
||
Elio, unless you think any of the other patches still must be checked in to HG, this should be marked fixed.
I'm no longer certain what the patch from comment 36 is about. It might have been an alternative proposal that is no longer necessary, because of the patches that Wan-Teh had checked in?
Reporter | ||
Comment 44•12 years ago
|
||
Kai,
I must have been comparing the wrong local copies in my system. After redoing the diff of ptthread.c of last state from from CVS against the current from hg, I see that the current one one has everything that you checked in. Everything needed made in the switch. No other patches are needed. Sorry for the confusion. It seems to me that this bug should be switched to RESOLVED.
Thanks,
-Elio
Assignee | ||
Updated•12 years ago
|
Status: ASSIGNED → RESOLVED
Closed: 13 years ago → 12 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 45•12 years ago
|
||
After reading more carefully. I'm reopening this bug.
This bug is still in an incomplete state. It's unfortunate we aren't able to finalize important correctness bugs in a timely manner, and stretch such issues over a long period of time, where each time going back requires a lot of time to read again and remember all the details.
I'm trying to describe the current state of this bug.
(A) We had noticed that we must call pthread_key_delete.
This has been done (comment 14).
So, *part* of this bug is fixed.
(B) Bob noticed we have some scenarios that still need fixing (comment 13).
(C) I created another patch to complete the fix (comment 17), which Bob r+ed (comment 18),
but suggested that wtc should review, too.
(D) Wan-Teh did a review, but only focused on stylistic details.
I addressed Wan-Teh's requests in comment 22.
(E) Wan-Teh followed up, judging that he didn't like my patch at all,
and wrote an alternative implementation (comment 25).
It was checked in without review.
However, it didn't work, and therefore it was backed out (comment 32).
(F) After Wan-Teh's patch didn't work, he checked in some cleanup from my patch (comment 34)
and he merged the remaining functional portions from my earlier suggested patch
(comment 36), and he checked in an experimental assertion (37).
At this point work on this bug had stalled.
Wan-Teh didn't follow up reviewing the functional changes from my patch 5 (which he attached in comment 36).
Summary:
- Portions of the bugs are fixed
- Portions are still unfixed
- Wan-Teh's experiment from comment 37 didn't cause problems
The remaining work:
- Either Wan-Teh or Bob must review attachment 674079 [details] [diff] [review] (which attempts to fix the issue mentioned by Bob in comment 13)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 46•12 years ago
|
||
Comment on attachment 674079 [details] [diff] [review]
The rest of Kai's patch v5
This patch still needs review.
However, it no longer applies to NSPR trunk, because of a context change. I'll attach a merged version.
Assignee | ||
Updated•12 years ago
|
Whiteboard: summary: comment 45
Assignee | ||
Updated•12 years ago
|
Attachment #674082 -
Attachment description: Assert that the pthread_key_delete call in _PR_Fini succeeds → Assert that the pthread_key_delete call in _PR_Fini succeeds [checked in]
Assignee | ||
Updated•12 years ago
|
Attachment #645381 -
Attachment is obsolete: true
Assignee | ||
Updated•12 years ago
|
Attachment #674074 -
Attachment description: Code cleanup changes from Kai's patch v5 → Code cleanup changes from Kai's patch v5 [checked in]
Assignee | ||
Comment 47•12 years ago
|
||
This is my patch v5 merged to NSPR trunk (only one context line had changed).
Note that right now I don't remember any details about this patch any more. If you have questions on this patch, or request changes to this patch, I will have to reread the related code.
Either Bob or Wan-Teh, could you please review?
Attachment #674079 -
Attachment is obsolete: true
Attachment #789149 -
Flags: superreview?(wtc)
Attachment #789149 -
Flags: review?(rrelyea)
Comment 48•12 years ago
|
||
Comment on attachment 789149 [details] [diff] [review]
Kai's patch v7 (ensure we'll free the thread key)
r+ rrelyea.
This looks like it handles wtc's style comments.
Wan-Teh, it's a relatively short patch that seems to address the issues raised, the only reservation would be if you saw something we missed when you did your rewrite. If not then I think we can drop this in and make this bug go away.
bob
Attachment #789149 -
Flags: review?(rrelyea) → review+
Assignee | ||
Updated•12 years ago
|
Attachment #789149 -
Attachment description: Kai's patch v7 → Kai's patch v7 (ensure we'll free the thread key)
Assignee | ||
Comment 49•12 years ago
|
||
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
Target Milestone: 4.9.3 → 4.10.3
Comment 50•12 years ago
|
||
Comment on attachment 789149 [details] [diff] [review]
Kai's patch v7 (ensure we'll free the thread key)
Review of attachment 789149 [details] [diff] [review]:
-----------------------------------------------------------------
This patch may have caused Mozilla bug 946072: an assertion failure at
nspr/pr/src/pthreads/ptthread.c:848:
826 static void _pt_thread_death(void *arg)
827 {
828 void *thred;
829 int rv;
830
831 _PT_PTHREAD_GETSPECIFIC(pt_book.key, thred);
832 if (NULL == thred)
833 {
834 /*
835 * Have PR_GetCurrentThread return the expected value to the
836 * destructors.
837 */
838 rv = pthread_setspecific(pt_book.key, arg);
839 PR_ASSERT(0 == rv);
840 }
841
842 /* PR_TRUE for: call destructors */
843 _pt_thread_death_internal(arg, PR_TRUE);
844
845 if (NULL == thred)
846 {
847 rv = pthread_setspecific(pt_book.key, NULL);
848 PR_ASSERT(0 == rv); <=== ASSERTION FAILED
849 }
850 }
Bob, do you have a test program for this bug? I'll run the test program
and see if I can come up with an alternative fix. It is very hard to
shut down NSPR cleanly.
Attachment #789149 -
Flags: superreview?(wtc) → checked-in+
Comment 51•12 years ago
|
||
Comment on attachment 789149 [details] [diff] [review]
Kai's patch v7 (ensure we'll free the thread key)
Review of attachment 789149 [details] [diff] [review]:
-----------------------------------------------------------------
I will back out this patch in mozilla-central as an experiment.
If the assertion in Mozilla bug 946072 stops failing, then we
will know the assertion failure is caused by this patch.
Comment 52•12 years ago
|
||
1. Call PR_Assert (a function used by the PR_ASSERT macro) so that
an error message is printed before aborting the process.
2. We don't need to abort the process if the pthread_setspecific call
fails. (Also, that pthread_setspecific call is unlikely to fail.)
PR_ASSERT is sufficient.
3. Assert that the new pthread_key_delete call succeeds.
4. Always set pt_book.keyCreated to PR_FALSE after calling pthread_key_delete.
Attachment #8362138 -
Flags: review?(kaie)
Assignee | ||
Comment 53•12 years ago
|
||
Comment on attachment 8362138 [details] [diff] [review]
Call PR_Assert instead of PR_Abort
Good improvement, thanks.
r=kaie
Attachment #8362138 -
Flags: review?(kaie) → review+
Comment 54•12 years ago
|
||
Comment on attachment 8362138 [details] [diff] [review]
Call PR_Assert instead of PR_Abort
Review of attachment 8362138 [details] [diff] [review]:
-----------------------------------------------------------------
Patch checked in: https://hg.mozilla.org/projects/nspr/rev/67602d9f63c2
Attachment #8362138 -
Flags: checked-in+
You need to log in
before you can comment on or make changes to this bug.
Description
•