JS scope deadlock avoidance doesn't handle GC dependencies

VERIFIED FIXED in mozilla0.9

Status

()

P1
critical
VERIFIED FIXED
18 years ago
17 years ago

People

(Reporter: brendan, Assigned: brendan)

Tracking

({js1.5})

Trunk
mozilla0.9
js1.5
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(URL)

Attachments

(3 attachments)

(Assignee)

Description

18 years ago
If a thread is blocked waiting for the GC to finish, and the GC is in a
finalizer waiting for a scope lock, it's possible for that scope to be owned
exclusively by the first thread's context (I don't understand how the deadlock
comes about yet -- it may require at least three threads).

The deadlock detection in jslock.c's ClaimScope handles the PR_WaitCondVar on
rt->scopeSharingDone, but it doesn't do diddly for the JS_AWAIT_GC_DONE that
follows, when resuming the request.

/be
(Assignee)

Comment 1

18 years ago
Gotta fix fast.

/be
Severity: normal → critical
Status: NEW → ASSIGNED
Keywords: js1.5, mozilla0.9
Priority: -- → P1
Target Milestone: --- → mozilla0.9
(Assignee)

Comment 2

18 years ago
Created attachment 30061 [details] [diff] [review]
proposed fix
(Assignee)

Comment 3

18 years ago
jband: you are my best buddy from bug 54743.  How about another sr=?  Shaver,
can you r=?  Thanks.

/be
I had to read the comment and related code about 5 times, but I think I get it
now.  (Finalizers having to share scopes on the dead objects, who'da thunk it?)

r=shaver

Comment 5

18 years ago
sr=jband
(Assignee)

Comment 6

18 years ago
Created attachment 30166 [details] [diff] [review]
better fix, but conservative/pessimal: GC thread context can't claim a scope, rather forces sharing
(Assignee)

Comment 7

18 years ago
I also fixed the bogus assertion that rt->requestCount > 0 if the ClaimScope cx
is in a request.  The cx running the GC and calling finalizers will be in a
request according to its cx->requestDepth, but the GC has already temporarily
reduced rt->requestCount to 0.

Not only should ClaimScope not assert so bogusly, it shouldn't even fiddle with
rt->requestCount or do notifies based on its 0/non-0 transition at all, if cx is
running the GC (or on the same thread as a cx that's running the GC).

/be
(Assignee)

Comment 8

18 years ago
Fix is in.

/be
Status: ASSIGNED → RESOLVED
Last Resolved: 18 years ago
Resolution: --- → FIXED
(Assignee)

Comment 9

18 years ago
Phil, I think we should do an RC3a tarball, or whatever -- this bug is biting
RC3 downloaders, all around the same time.

/be
(Assignee)

Comment 10

18 years ago
WillDeadlock nicely checks whether the GC is active on cx's thread, and returns
true -- but ClaimScope calls WillDeadlock iff ownercx->scopeToShare.  D'oh!  The
"GC thread is cx's thread" test must happen in ClaimScope, before WillDeadlock
and as part of the same condition that leads to ShareScope.

/be
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Assignee)

Updated

18 years ago
Status: REOPENED → ASSIGNED
(Assignee)

Comment 11

18 years ago
Patch coming right up.

/be
(Assignee)

Comment 12

18 years ago
Created attachment 30441 [details] [diff] [review]
best fix
(Assignee)

Comment 13

18 years ago
Slightly improved code will be checked in: use rt->gcThread rather than
cx->runtime->gcThread in ClaimScope (where rt is available in a local).

I don't know whether anyone can claim to review this code and find bugs before
hand -- I have failed.  But a hindsight-review from jband or shaver would make
me feel better.  Also, we need to tickle this with jband's MT stress test, which
seems not to do enough finalization of objects claimed but slated for sharing by
other threads.

/be
(Assignee)

Comment 14

18 years ago
Fixed for good.

/be
(Assignee)

Comment 15

18 years ago
Urgh.

/be
Status: ASSIGNED → RESOLVED
Last Resolved: 18 years ago18 years ago
Resolution: --- → FIXED

Comment 16

18 years ago
Marking Verified - 
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.