Potential deadlock in nsObserverList::RemoveObserver

RESOLVED FIXED

Status

()

Core
XPCOM
RESOLVED FIXED
11 years ago
11 years ago

People

(Reporter: roc, Assigned: roc)

Tracking

({fixed1.8.0.5, fixed1.8.1})

Trunk
fixed1.8.0.5, fixed1.8.1
Points:
---
Bug Flags:
blocking1.8.0.5 +

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

Even after bug 334523, there is still a potential deadlock. The problem is that we do a QueryInterface while holding mLock, and if the observer is a JS object, then it can run any arbitrary code.

In https://bugzilla.novell.com/show_bug.cgi?id=173986 we seem to be hitting this; the stack trace shows that while running the JS we do a JS GC which releases stuff which deadlocks trying to reenter RemoveObserver. Unfortunately I can't reproduce this myself so I can't yet verify that my fix fixes that issue, but I still think it's worth doing. Hopefully we'll get this verified on the Novell side shortly.
Created attachment 222112 [details] [diff] [review]
fix

Obvious fix --- move the QI to before we grab the lock.
Attachment #222112 - Flags: superreview?(darin)
Attachment #222112 - Flags: review?(darin)

Comment 2

11 years ago
This is 1.8 branch only, right?  The locks don't exist on the trunk anymore.

Comment 3

11 years ago
Dupe of bug 334523 (which I fixed for 1.5.0.4)?
(In reply to comment #2)
> This is 1.8 branch only, right?  The locks don't exist on the trunk anymore.

Yes.

(In reply to comment #3)
> Dupe of bug 334523 (which I fixed for 1.5.0.4)?

No. Our builds have that patch.

Well, our stack does look like attachment 218886 [details]. I think perhaps you diagnosed bug 334523 incorrectly. Your comment there says "If RemoveObserver results in an observer being destroyed, then that could result in another call to RemoveObserver" and you prevented that keeping the observer alive until the lock is released. However these stacks seem to indicate that the transition to JS from RemoveObserver is because of a QI call, not because of a destructor call.
I think we should avoid both QI *and* destructor calls while holding mLock, so we need both patches.

Comment 6

11 years ago
Comment on attachment 222112 [details] [diff] [review]
fix

ok, that makes sense.
Attachment #222112 - Flags: superreview?(darin)
Attachment #222112 - Flags: superreview+
Attachment #222112 - Flags: review?(darin)
Attachment #222112 - Flags: review+
Comment on attachment 222112 [details] [diff] [review]
fix

I'm adding 1.8.1+ on the grounds that Darin knew when he r+ed that this was intended for the 1.8 branch only...
Attachment #222112 - Flags: approval-branch-1.8.1+
checked in.
Status: NEW → RESOLVED
Last Resolved: 11 years ago
Keywords: fixed1.8.1
Resolution: --- → FIXED
Comment on attachment 222112 [details] [diff] [review]
fix

This probably should go into 1.8.0.5.
Attachment #222112 - Flags: approval1.8.0.5?
roc, could you please attach the stack from that Novell bug?  I can't access it (I signed up for a bugzilla.novell.com account, but still no joy).  Thanks,

/be
Flags: blocking1.8.0.5+
Comment on attachment 222112 [details] [diff] [review]
fix

approved for 1.8.0 branch, a=dveditz for drivers
Attachment #222112 - Flags: approval1.8.0.5? → approval1.8.0.5+
Fixed on 1.8.0 branch.
Keywords: fixed1.8.0.5
You need to log in before you can comment on or make changes to this bug.