Closed Bug 442066 Opened 12 years ago Closed 12 years ago

Calling a JS callback from a worker thread can cause a crash

Categories

(Core :: XPConnect, defect)

1.9.0 Branch
defect
Not set

Tracking

()

RESOLVED FIXED

People

(Reporter: bruno, Assigned: mozbugs)

References

Details

(Whiteboard: [needs update on 444859])

Attachments

(3 files, 1 obsolete file)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9) Gecko/2008062515 Firefox/3.0 Flock/2.0b2pre
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9) Gecko/2008062515 Firefox/3.0 Flock/2.0b2pre

Our Flock browser has an indexing (C++) thread that calls back to a JS component to signal that the operation is complete so that another document(bookmark) can be indexed.
I have discovered some threading issues that result in crashes. See attached stacks.

Reproducible: Sometimes

Steps to Reproduce:
1.Start Flock (Firefox based) with no profile
2.Do not import anything
3.Let the default bookmarks be indexed
4.Crash after 10-30 seconds (most reproducible on my Vista machine but seen once on Mac)
Attachment #326933 - Attachment mime type: image/jpeg → text/plain
Flock is based on the Firefox 3 General release.
This is a scenario of the crash:

    Main thread                          Worker thread
                                     calls Release on a nsXPCWrappedJS object
which has a mRefCnt of 2
acquires map lock in GetNewOrUsed
finds a valid root in WrappedJSMap
releases the maplock

finds a valid IID

                                     mRefCnt is decremented to 1 then 0 and
calls NS_DELETEXPCOM

adds reference to it (mRefCnt = 1)
                                     the destructor is called and the object
removed from the WrappedJSMap
tries to access that nsXPCWrappedJS
BOOM
The final Release (when refcnt ==2) usually happens in the main thread, but some race condition can make it happen in the Worker thread
The crash says "Access violation reading location 0xddddddf1" which indicates
deleted memory.
The lock on Release is needed to match the other lock on AddRef on line 336.
What happens is that the Worker thread is destroying an nsXPCWrappedJS object
while the main thread is deciding if it can reuse it or it should create a new
one.
Attachment #326938 - Flags: review?
Status: UNCONFIRMED → NEW
Component: General → XPConnect
Ever confirmed: true
OS: Windows Vista → All
Product: Firefox → Core
Hardware: PC → All
Version: unspecified → 1.9.0 Branch
QA Contact: general → xpconnect
Attachment #326938 - Flags: review? → review?(mrbkap)
Comment on attachment 326938 [details] [diff] [review]
fix by adding synchronization locks 

jst should review this.
Attachment #326938 - Flags: review?(mrbkap) → review?(jst)
Comment on attachment 326938 [details] [diff] [review]
fix by adding synchronization locks 

Nice catch! Looks good, but this needs comments in Release() and GetNewOrUsed() explaining the need for the map lock to prevent two (or more) threads racing in those two calls.

r- only because I want to see the comments, then I can r+sr, even.
Attachment #326938 - Flags: review?(jst) → review-
Bruno's on vacation, so I'm handling this now
Assignee: nobody → manish
Attachment #326938 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #328787 - Flags: review?(jst)
Comment on attachment 328787 [details] [diff] [review]
Add requested comments

Awesome. r+sr=jst, thanks for fixing!
Attachment #328787 - Flags: superreview+
Attachment #328787 - Flags: review?(jst)
Attachment #328787 - Flags: review+
We should land this on the 1.9.0 branch too assuming it tests out good etc.
Flags: wanted1.9.0.x+
Keywords: checkin-needed
Fix landed, thanks a ton for the patch!
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Attachment #328787 - Flags: approval1.9.0.2?
You're welcome. Only took me 3 weeks to hunt the bug.
Possibly, but as I commented in bug 444859, we'd need a more useful stack trace to investigate.
Not approving until we find out if this indeed caused bug 444859. Also, can we get some tests for this?
Flags: in-testsuite?
An update on tests for this or a fix for bug 444859. I'm inclined to minus this for 1.9.0.2 right now...
Depends on: 444859
Whiteboard: [needs update on 444859]
Comment on attachment 328787 [details] [diff] [review]
Add requested comments

We'll revisit approval for 1.9.0.3 given the possible regression bug 444859.
Attachment #328787 - Flags: approval1.9.0.3?
Attachment #328787 - Flags: approval1.9.0.2?
Attachment #328787 - Flags: approval1.9.0.2-
Attachment #328787 - Flags: approval1.9.0.4?
Comment on attachment 328787 [details] [diff] [review]
Add requested comments

Clearing approval request. Please re-request when bug 444859 has been addressed and tests have been created for both bugs.
You need to log in before you can comment on or make changes to this bug.