Closed Bug 123930 Opened 24 years ago Closed 24 years ago

rt->scopeSharingDone/LockScope deadlock

Categories

(Core :: JavaScript Engine, defect, P1)

defect

Tracking

()

VERIFIED FIXED
mozilla0.9.9

People

(Reporter: epstein, Assigned: brendan)

Details

(Keywords: js1.5)

Attachments

(1 file)

We have seen this deadlock several times. Here's a good example. 45 threads are blocked on ClaimScope(0x81426b8). 11 threads are blocked on ClaimScope(0xb8a02e8). All of these are at jslock.c:444, PR_WaitCondVar(rt->scopeSharingDone...) Two other threads are blocked on js_LockScope, in particular thread 139: #0 0xdf6a06e9 in _lwp_sema_wait () from /usr/lib/libc.so.1 #1 0xdfbb2d5d in _park () from /usr/lib/libthread.so.1 #2 0xdfbb29df in _swtch () from /usr/lib/libthread.so.1 #3 0xdfbb1b79 in cond_wait () from /usr/lib/libthread.so.1 #4 0xdf82f28c in _MD_WAIT_CV (md_cv=0x8135d3c, md_lock=0x8135cf4, timeout=4294967295) at solaris.c:308 #5 0xdf822b6c in _PR_WaitCondVar (thread=0x81de378, cvar=0x8135d18, lock=0x8135cd8, timeout=4294967295) at prucv.c:201 #6 0xdf823063 in PR_WaitCondVar (cvar=0x8135d18, timeout=4294967295) at prucv.c:548 #7 0xdf4f6015 in js_SuspendThread (tl=0x8149e60) at jslock.c:835 #8 0xdf4f60e5 in js_Enqueue (tl=0x8149e60, me=136176504) at jslock.c:876 #9 0xdf4f6187 in js_Lock (tl=0x8149e60, me=136176504) at jslock.c:909 #10 0xdf4f624d in js_LockScope (cx=0x4, scope=0x8149e38) at jslock.c:966 #11 0xdf4f62f0 in js_LockObj (cx=0xa0ff770, obj=0x813e6b0) at jslock.c:1050 #12 0xdf4fc6f6 in js_LookupProperty (cx=0xa0ff770, obj=0x813e6b0, id=135512456, objp=0xdc19856c, propp=0xdc198570) at jsobj.c:1956 #13 0xdf4fcc05 in js_GetProperty (cx=0xa0ff770, obj=0x813e6b0, id=135512456, vp=0xdc1985a0) at jsobj.c:2122 #14 0xdf4fbc99 in js_NewObject (cx=0xa0ff770, clasp=0xdf522aa0, proto=0x813e6b0, parent=0x0) at jsobj.c:1554 ... #44 0xdf4d2afe in JS_CompileUCScript (cx=0xa0ff770, obj=0x813df40, chars=0xaf11900, length=24143, filename=0xdc198ef8 "http://wakeup.vui.en-us.tellme.com/wakeup.js", lineno=0) at jsapi.c:2807 Note that threads 139 is trying to lock the same scope 0x8149e38. The owner of scope 0x8149e38 is thread 0x81c41f8, which is thread 127, one of the 45 threads blocked on ClaimScope(0x81426b8). Thread 127 (LWP 39 ): #0 0xdf6a06e9 in _lwp_sema_wait () from /usr/lib/libc.so.1 #1 0xdfbb2d5d in _park () from /usr/lib/libthread.so.1 #2 0xdfbb29df in _swtch () from /usr/lib/libthread.so.1 #3 0xdfbb1b79 in cond_wait () from /usr/lib/libthread.so.1 #4 0xdf82f28c in _MD_WAIT_CV (md_cv=0x813c68c, md_lock=0x8112314, timeout=4294967295) at solaris.c:308 #5 0xdf822b6c in _PR_WaitCondVar (thread=0x81c41f8, cvar=0x813c668, lock=0x81122f8, timeout=4294967295) at prucv.c:201 #6 0xdf823063 in PR_WaitCondVar (cvar=0x813c668, timeout=4294967295) at prucv.c:548 #7 0xdf4f5964 in ClaimScope (scope=0x81426b8, cx=0xa142200) at jslock.c:444 #8 0xdf4f59f9 in js_GetSlotThreadSafe (cx=0xa142200, obj=0x813e068, slot=2) at jslock.c:494 #9 0xdf4feb31 in js_GetClassPrototype (cx=0xa142200, name=0xdf517b10 "Array", protop=0x4) at jsobj.c:2987 #10 0xdf4fbc14 in js_NewObject (cx=0xa142200, clasp=0xdf520700, proto=0x0, parent=0x0) at jsobj.c:1540 #11 0xdf4d5f1a in js_NewArrayObject (cx=0xa142200, length=0, vector=0x0) at jsarray.c:1410 #12 0xdf50c654 in js_ExecuteRegExp (cx=0xa142200, re=0xa17ce88, str=0xba87d28, indexp=0xdc4b05e0, test=0, rval=0xdc4b06e0) at jsregexp.c:2130 #13 0xdf512d01 in match_or_replace (cx=0xa142200, obj=0xba87d30, argc=4, argv=0xa17ce88, glob=0xdf512d38 <match_glob>, data=0xdc4b062c, rval=0xdc4b06e0, forceFlat=0) at jsstr.c:992 #14 0xdf512e0b in str_match (cx=0xa142200, obj=0xba87d30, argc=1, argv=0xbaac740, rval=0xdc4b06e0) at jsstr.c:1047 #15 0xdf4ec551 in js_Invoke (cx=0xa142200, argc=1, flags=0) at jsinterp.c:777 #16 0xdf4f31c9 in js_Interpret (cx=0xa142200, result=0xdc4b0994) at jsinterp.c:2668 #17 0xdf4ec966 in js_Execute (cx=0xa142200, chain=0xba87c98, script=0x0, down=0x0, special=4, result=0xdc4b0994) at jsinterp.c:956 #18 0xdf4d3214 in JS_ExecuteScript (cx=0xa142200, obj=0xba87c98, script=0xa17e208, rval=0xdc4b0994) at jsapi.c:3150 The scope being claimed has ownercx=0xa0ff770, which is thread 139. There is definitely a two-way deadlock here between thread 127 and thread 139. GC is not running.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Taking. /be
Status: NEW → ASSIGNED
Keywords: js1.5, mozilla0.9.9
Priority: -- → P1
Target Milestone: --- → mozilla0.9.9
This is related to bug 114712 as follows: that bug shows that a regexp object can be held locked while other gc-things are allocated. It happens that not only string gc-things, but also (array) object gc-things, may be allocated. That in turn leads to locking another object (the class constructor) while getting a class prototype object. Nested locks, yikes! The locking design in SpiderMonkey allows for exactly one nested lock case, in js_SetProtoOrParent (jsobj.c). Nested locks in general are an evil that will lead to deadlock. In exceptional cases, such as the one discussed in bug 114712, and for the cycle-check in js_SetProtoOrParent, we allow restricted kinds of lock nesting. But this potentially-long-duration regexp locking to serialize match/replace/split method executions must die! I'll attempt a patch. /be
Attached patch proposed fixSplinter Review
Forget about locking a regexp object over long, non-critical and nested-lock sections, and live free of deadlock. But re.lastIndex may be inconsistent if threads race to update it for a global regexp re. Too bad, I say -- see also bug 76717, which if fixed will leave only the user to blame for global regexps shared unsafely across threads. Comments on my "let them eat races" philosophy welcome. Testing from tellme.com especially welcome, this patch should not make things worse, only better. /be
If re.lastIndex is all that's at stake, I guess I'm OK, though I think this ups the ante on 76717. Let me make sure that I believe that before I bless this.
Comment on attachment 68237 [details] [diff] [review] proposed fix OK, I buy it. Racing on re.lastIndex is a perfectly reasonable risk to leave to an application that shares regexps across threads, just like with any other complex object.
Attachment #68237 - Flags: superreview+
sr=jband. I second what shaver said. I'd rather rogerl or khanson (who know more about the regexp code) stamp it with an r=. But it looks reasonable to me.
The way I chose to fix bug 76717 does not "leave only the user to blame for global regexps shared unsafely across threads", as I wrote in comment #3. I made re.lastIndex thread-specific. ECMA addresses neither concurrency nor compilation separate from execution (concurrency is enough to raise the question: how is lastIndex handled; compilation begs it in connection with concurrency even if no script passes a regexp from thread A to thread B, because the compiler will create a single regexp object shared among all threads executing the script in which the regexp is expressed literally). So I made an extension that was cheap in terms of code and implementation cost, and that should satisfy the thread-safe embeddings that I know about. Note that among all of JSRegExp's members, only lastIndex is mutated after initialization during (single-threaded) creation. So I don't expect anything bad to come of these jsstr.c lock removals. /be
I've successfully run the full JS testsuite on the above patch in the debug JS shell on WinNT: no regressions occurred.
Comment on attachment 68237 [details] [diff] [review] proposed fix r=rogerl
Attachment #68237 - Flags: review+
Fixed. /be
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
I've run the full testsuite on this fix, in both the debug and optimized JS shell, on both WinNT and Linux. No new regressions have occurred. Provisionally marking this bug Verified. Mike (at tellme.com), does this fix the problem for you? If not, please reopen this bug; thanks.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: