Closed
Bug 557586
Opened 14 years ago
Closed 14 years ago
XPCPerThreadData::~XPCPerThreadData race leading to possible crash
Categories
(Core :: XPConnect, defect)
Tracking
()
RESOLVED
FIXED
mozilla1.9.3a5
People
(Reporter: jseward, Assigned: jseward)
Details
Attachments
(1 file)
1.29 KB,
patch
|
timeless
:
review+
|
Details | Diff | Splinter Review |
There's a race in XPCPerThreadData::~XPCPerThreadData that can cause XPCPerThreadData::gLock to be PR_Destroy'd and nulled out whilst a second thread is holding it. In the worst case this could cause an essentially unreproducible segfault at shutdown. All assuming the following analysis is correct, of course. What follows is a bit of background and an interleaving of execution which I believe can lead to the failure. All line numbers pertain to js/src/xpconnect/src/xpcthreadcontext.cpp (m-c as of today). BACKGROUND This code appears to maintain a linked list of XPCPerThreadDatas. Each such element has a field mNextThread which points to the next element. The list is rooted by a global variable 325 XPCPerThreadData* XPCPerThreadData::gThreads = nsnull; which is protected by this lock 324 PRLock* XPCPerThreadData::gLock = nsnull; XPCPerThreadData::~XPCPerThreadData() is intended to remove itself from the linked list |gThreads|. Then, if the list is empty, |gLock| is destroyed and nulled out. However, these actions are not carried out atomically, and it can be that a second thread T2 entering ~XPCPerThreadData() just after a previous thread T1 entered it, can cause T1 to destroy and null out |gLock| while T2 still holds it. FAILURE INTERLEAVING Imagine there are two threads, T1 and T2, there are two entries in |gThreads|, and |gLock| is non-NULL. Here's the relevant code: 394 // Unlink 'this' from the list of threads. 395 if(gLock) 396 { 397 nsAutoLock lock(gLock); 398 if(gThreads == this) 399 gThreads = mNextThread; 400 else 401 { 402 XPCPerThreadData* cur = gThreads; 403 while(cur) 404 { 405 if(cur->mNextThread == this) 406 { 407 cur->mNextThread = mNextThread; 408 break; 409 } 410 cur = cur->mNextThread; 411 } 412 } 413 } 414 415 if(gLock && !gThreads) 416 { 417 PR_DestroyLock(gLock); 418 gLock = nsnull; 419 } T1: 395 (taken) 397 (T1 acquires lock) 398 .. 412 (T1 removes itself from list) At this point |gThreads| is non-NULL (still one entry in it), and we still hold the lock. 413 (T1 drops lock at end of scope) 414 stall T2: 395 (taken) 397 (T2 acquires lock) 398 .. 412 (T2 removes itself from list) |gThreads| is now null 412 stall, still holding the lock T1: 415 (taken - since |gLock| is non-NULL and gThreads is NULL) 416 .. 419 destroy |gLock| and null it T2: 413 (drop lock at end of scope). Unfortunately it's now NULL. Possible segfault. This race was pointed out by Helgrind. It reports that the write of |gThreads| at 399 conflicts with the read of it at 415: Possible data race during write of size 8 at 0x6b23220 by thread #16 at 0x558138C: XPCPerThreadData::~XPCPerThreadData() (xpcthreadcontext.cpp:399) by 0x55813AD: xpc_ThreadDataDtorCB(void*) (xpcthreadcontext.cpp:427) by 0x76DFA4E: _PR_DestroyThreadPrivate (prtpd.c:265) by 0x76F5033: _pt_root (ptthread.c:270) by 0x4C2A5D8: mythread_wrapper (hg_intercepts.c:213) by 0x4E34A03: start_thread (pthread_create.c:300) by 0xAB2680C: clone (clone.S:112) This conflicts with a previous read of size 8 by thread #14 at 0x5581357: XPCPerThreadData::~XPCPerThreadData() (xpcthreadcontext.cpp:415) by 0x55813AD: xpc_ThreadDataDtorCB(void*) (xpcthreadcontext.cpp:427) by 0x76DFA4E: _PR_DestroyThreadPrivate (prtpd.c:265) by 0x76F5033: _pt_root (ptthread.c:270) by 0x4C2A5D8: mythread_wrapper (hg_intercepts.c:213) by 0x4E34A03: start_thread (pthread_create.c:300) by 0xAB2680C: clone (clone.S:112)
Comment 1•14 years ago
|
||
This isn't a problem in practice, because we know the main thread outlives all other threads: we explicitly join all other threads (at least the ones which might run xpconnect) during shutdown well before we get around to destroying the XPCPerThreadData for the main thread.
Assignee | ||
Comment 2•14 years ago
|
||
Comment on attachment 437374 [details] [diff] [review] patch to make update and delete-decision atomic i can live with this.
Attachment #437374 -
Flags: review+
Assignee | ||
Comment 4•14 years ago
|
||
(In reply to comment #1) > This isn't a problem in practice, [...] It would be better if the code was "obviously correct" without having to invoke non-local invariants, given that in this case it's easy to make it so.
Assignee | ||
Updated•14 years ago
|
Keywords: checkin-needed
Updated•14 years ago
|
Assignee: nobody → jseward
Status: NEW → ASSIGNED
Comment 5•14 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/3f75c967a6cc
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9.3a5
You need to log in
before you can comment on or make changes to this bug.
Description
•