Closed Bug 442066 Opened 12 years ago Closed 12 years ago
Calling a JS callback from a worker thread can cause a crash
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9) Gecko/2008062515 Firefox/3.0 Flock/2.0b2pre Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9) Gecko/2008062515 Firefox/3.0 Flock/2.0b2pre Our Flock browser has an indexing (C++) thread that calls back to a JS component to signal that the operation is complete so that another document(bookmark) can be indexed. I have discovered some threading issues that result in crashes. See attached stacks. Reproducible: Sometimes Steps to Reproduce: 1.Start Flock (Firefox based) with no profile 2.Do not import anything 3.Let the default bookmarks be indexed 4.Crash after 10-30 seconds (most reproducible on my Vista machine but seen once on Mac)
Attachment #326933 - Attachment mime type: image/jpeg → text/plain
Flock is based on the Firefox 3 General release.
This is a scenario of the crash: Main thread Worker thread calls Release on a nsXPCWrappedJS object which has a mRefCnt of 2 acquires map lock in GetNewOrUsed finds a valid root in WrappedJSMap releases the maplock finds a valid IID mRefCnt is decremented to 1 then 0 and calls NS_DELETEXPCOM adds reference to it (mRefCnt = 1) the destructor is called and the object removed from the WrappedJSMap tries to access that nsXPCWrappedJS BOOM
The final Release (when refcnt ==2) usually happens in the main thread, but some race condition can make it happen in the Worker thread
The crash says "Access violation reading location 0xddddddf1" which indicates deleted memory.
The lock on Release is needed to match the other lock on AddRef on line 336. What happens is that the Worker thread is destroying an nsXPCWrappedJS object while the main thread is deciding if it can reuse it or it should create a new one.
Attachment #326938 - Flags: review?
Status: UNCONFIRMED → NEW
Component: General → XPConnect
Ever confirmed: true
OS: Windows Vista → All
Product: Firefox → Core
Hardware: PC → All
Version: unspecified → 1.9.0 Branch
12 years ago
Attachment #326938 - Flags: review? → review?(mrbkap)
Comment on attachment 326938 [details] [diff] [review] fix by adding synchronization locks jst should review this.
Attachment #326938 - Flags: review?(mrbkap) → review?(jst)
Comment on attachment 326938 [details] [diff] [review] fix by adding synchronization locks Nice catch! Looks good, but this needs comments in Release() and GetNewOrUsed() explaining the need for the map lock to prevent two (or more) threads racing in those two calls. r- only because I want to see the comments, then I can r+sr, even.
Attachment #326938 - Flags: review?(jst) → review-
Bruno's on vacation, so I'm handling this now
Comment on attachment 328787 [details] [diff] [review] Add requested comments Awesome. r+sr=jst, thanks for fixing!
We should land this on the 1.9.0 branch too assuming it tests out good etc.
Fix landed, thanks a ton for the patch!
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
You're welcome. Only took me 3 weeks to hunt the bug.
Could this be the cause of bug 444859?
Possibly, but as I commented in bug 444859, we'd need a more useful stack trace to investigate.
Not approving until we find out if this indeed caused bug 444859. Also, can we get some tests for this?
An update on tests for this or a fix for bug 444859. I'm inclined to minus this for 188.8.131.52 right now...
Depends on: 444859
Whiteboard: [needs update on 444859]
Comment on attachment 328787 [details] [diff] [review] Add requested comments We'll revisit approval for 184.108.40.206 given the possible regression bug 444859.
Attachment #328787 - Flags: approval220.127.116.11?
Comment on attachment 328787 [details] [diff] [review] Add requested comments Clearing approval request. Please re-request when bug 444859 has been addressed and tests have been created for both bugs.
You need to log in before you can comment on or make changes to this bug.