Closed Bug 616072 Opened 14 years ago Closed 13 years ago

Deadlock involving proxy object lock

Categories

(Core :: XPCOM, defect)

x86
Windows 7
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: roc, Unassigned)

References

Details

I was just doing regular browsing, my browser hung, and I caught it in a debugger.

The main thread is stuck here:

 	nspr4.dll!PR_Lock(PRLock * lock)  Line 234	C
 	xul.dll!nsProxyEventObject::Release()  Line 105	C++
 	xul.dll!nsProxyObjectCallInfo::~nsProxyObjectCallInfo()  Line 158	C++
 	xul.dll!nsProxyObjectCallInfo::`scalar deleting destructor'()  + 0x8 bytes	C++
>	xul.dll!nsDOMWorkerEventListenerBase::Release()  Line 50 + 0x1f bytes	C++
 	xul.dll!nsCOMPtr_base::~nsCOMPtr_base()  Line 82	C++
 	xul.dll!nsThread::ProcessNextEvent(int mayWait, int * result)  Line 636	C++
 	xul.dll!NS_ProcessNextEvent_P(nsIThread * thread, int mayWait)  Line 250 + 0xd bytes	C++

We're releasing 'event', an nsCOMPtr<nsIRunnable>, which happens to be an nsProxyObjectCallInfo. The method is sameTypeRootTreeInfo, and we seem to be calling it on an nsDocShell. (I don't know how nsDOMWorkerEventListenerBase is showing up there, it's probably an artifact of my opt build.) That call originated in nsNSSIOLayer.cpp. The main thread is hung waiting for the mProxyCreationLock. As far as I can tell, the main thread is not holding any locks.

Another thread has this stack:

 	nspr4.dll!PR_Lock(PRLock * lock)  Line 234	C
 	xul.dll!nsProxyObjectManager::GetProxyForObject(nsIEventTarget * aTarget, const nsID & aIID, nsISupports * aObj, int proxyType, void * * aProxyObject)  Line 240 + 0x22 bytes	C++
 	xul.dll!NS_GetProxyForObject(nsIEventTarget * target, const nsID & aIID, nsISupports * aObj, int proxyType, void * * aProxyObject)  Line 353 + 0x18 bytes	C++
>	xul.dll!nsThreadPool::ShutdownThread(nsIThread * thread)  Line 147	C++
 	xul.dll!nsThreadPool::Run()  Line 234	C++
 	xul.dll!nsThread::ProcessNextEvent(int mayWait, int * result)  Line 626 + 0x6 bytes	C++
 	xul.dll!NS_ProcessNextEvent_P(nsIThread * thread, int mayWait)  Line 250 + 0xd bytes	C++

So it's waiting for the mProxyCreationLock too. But as far as I can tell, it's not holding any locks either.

Strangely, in PR_Lock, mProxyCreationLock->owner is NULL. But in the Windows CRITICAL_SECTION data correctly, OwningThread is the NSS thread that originated the proxied call to sameTypeRootTreeInfo. But that might just be the *last* thread that owned the critical section. Here's that thread's call stack:

>	nspr4.dll!_PR_MD_WAIT_CV(_MDCVar * cv, _MDLock * lock, unsigned int timeout)  Line 282	C
 	nspr4.dll!_PR_WaitCondVar(PRThread * thread, PRCondVar * cvar, PRLock * lock, unsigned int timeout)  Line 205	C
 	nspr4.dll!PR_Wait(PRMonitor * mon, unsigned int ticks)  Line 184 + 0x1a bytes	C
 	xul.dll!nsEventQueue::GetEvent(int mayWait, nsIRunnable * * result)  Line 85 + 0x9 bytes	C++
 	xul.dll!nsThread::ProcessNextEvent(int mayWait, int * result)  Line 622	C++
 	xul.dll!NS_ProcessNextEvent_P(nsIThread * thread, int mayWait)  Line 250 + 0xd bytes	C++
 	xul.dll!nsProxyEventObject::CallMethod(unsigned short methodIndex, const XPTMethodDescriptor * methodInfo, nsXPTCMiniVariant * params)  Line 260 + 0xa bytes	C++
 	xul.dll!PrepareAndDispatch(nsXPTCStubBase * self, unsigned int methodIndex, unsigned int * args, unsigned int * stackBytesToPop)  Line 114 + 0x15 bytes	C++
 	xul.dll!SharedStub()  Line 142	C++
 	xul.dll!nsNSSSocketInfo::EnsureDocShellDependentStuffKnown()  Line 415	C++
 	xul.dll!nsNSSSocketInfo::GetPreviousCert(nsIX509Cert * * _result)  Line 850	C++
 	xul.dll!HandshakeCallback(PRFileDesc * fd, void * client_data)  Line 948	C++
 	ssl3.dll!ssl3_HandleFinished(sslSocketStr * ss, unsigned char * b, unsigned int length, const SSL3Hashes * hashes)  Line 8503 + 0xa bytes	C
 	ssl3.dll!ssl3_HandleHandshakeMessage(sslSocketStr * ss, unsigned char * b, unsigned int length)  Line 8660	C
 	ssl3.dll!ssl3_HandleHandshake(sslSocketStr * ss, sslBufferStr * origBuf)  Line 8727 + 0x8 bytes	C
 	ssl3.dll!ssl3_HandleRecord(sslSocketStr * ss, SSL3Ciphertext * cText, sslBufferStr * databuf)  Line 9066 + 0x9 bytes	C
 	ssl3.dll!ssl3_GatherCompleteHandshake(sslSocketStr * ss, int flags)  Line 209 + 0x11 bytes	C
 	ssl3.dll!ssl_GatherRecord1stHandshake(sslSocketStr * ss)  Line 1259	C
 	ssl3.dll!ssl_Do1stHandshake(sslSocketStr * ss)  Line 151 + 0x3 bytes	C
 	ssl3.dll!ssl_SecureSend(sslSocketStr * ss, const unsigned char * buf, int len, int flags)  Line 1213 + 0x6 bytes	C
 	ssl3.dll!ssl_SecureWrite(sslSocketStr * ss, const unsigned char * buf, int len)  Line 1258 + 0x13 bytes	C
 	ssl3.dll!ssl_Write(PRFileDesc * fd, const void * buf, int len)  Line 1653	C
 	xul.dll!nsSSLThread::Run()  Line 1052	C++

The state of the CRITICAL_SECTION is weird. Here it is:

-		lock	0x09f6c888 {links={...} owner=0x00000000 waitQ={...} ...}	PRLock *
+		links	{next=0x09f6c888 prev=0x09f6c888 }	PRCListStr
+		owner	0x00000000 {state=??? priority=??? arg=??? ...}	PRThread *
+		waitQ	{next=0x09f6c894 prev=0x09f6c894 }	PRCListStr
		priority	PR_PRIORITY_FIRST	PRThreadPriority
		boostPriority	PR_PRIORITY_FIRST	PRThreadPriority
-		ilock	{mutex={...} notified={...} }	_MDLock
-		mutex	{DebugInfo=0x00362430 LockCount=-10 RecursionCount=1 ...}	_RTL_CRITICAL_SECTION
-		DebugInfo	0x00362430 {Type=0 CreatorBackTraceIndex=0 CriticalSection=0x09f6c8a4 ...}	_RTL_CRITICAL_SECTION_DEBUG *
		Type	0	unsigned short
		CreatorBackTraceIndex	0	unsigned short
+		CriticalSection	0x09f6c8a4 {DebugInfo=0x00362430 LockCount=-10 RecursionCount=1 ...}	_RTL_CRITICAL_SECTION *
+		ProcessLocksList	{Flink=0x003639e8 Blink=0x00362168 }	_LIST_ENTRY
		EntryCount	0	unsigned long
		ContentionCount	15	unsigned long
		Flags	0	unsigned long
		CreatorBackTraceIndexHigh	0	unsigned short
		SpareWORD	0	unsigned short
		LockCount	-10	long
		RecursionCount	1	long
		OwningThread	0x000014c0	void *
		LockSemaphore	0x000006b4	void *
		SpinCount	0	unsigned long
+		notified	{length=0 cv=0x09f6c8c0 link=0x00000000 }	_MDNotified

http://msdn.microsoft.com/en-us/magazine/cc164040.aspx#S2 says that OwningThread is definitely zeroed out when a process releases the critical section. And LockCount should apparently never be less than -1. RecursionCount 1 suggests that the the NSS thread is definitely holding the lock. But all this information might be out of date :-(.

I really don't have a clue what could have happened here. Maybe memory corruption?
Same deadlock occurs on 2011-04-10 nightly

0:006> ~0k
ChildEBP RetAddr
002bd61c 771d0177 ntdll!ZwWaitForSingleObject+0x15
002bd680 771d0142 ntdll!RtlpWaitOnCriticalSection+0x155
002bd6a8 6505b3a7 ntdll!RtlEnterCriticalSection+0x152
002bd6b8 61c2656d nspr4!PR_Lock+0x17
002bd6cc 61c255f8 xul!nsProxyEventObject::Release+0x15
002bd6d8 61c2562c xul!nsProxyObjectCallInfo::~nsProxyObjectCallInfo+0x38
002bd6e0 61adc6b3 xul!nsProxyObjectCallInfo::`scalar deleting destructor'+0x8
002bd6ec 61a5111e xul!nsRunnable::Release+0x33
002bd6f4 61ac2833 xul!nsRefPtr<nsIRunnable>::~nsRefPtr<nsIRunnable>+0xe
002bd720 61aeb02e xul!nsThread::ProcessNextEvent+0x163
002bd754 61c97352 xul!mozilla::ipc::MessagePump::Run+0x6e
002bd760 61c9733b xul!MessageLoop::RunInternal+0x11
002bd778 64dc808b xul!MessageLoop::RunHandler+0x1d
002bd798 61c972e7 MOZCRT19!_VEC_memzero+0x36
002bd7b8 61c7a932 xul!MessageLoop::Run+0x15
002bd7c4 61ca2ad3 xul!nsBaseAppShell::Run+0x34
002bf718 61ca2a7d xul!nsAppShell::Run+0x42
002bf724 61bc7589 xul!nsAppStartup::Run+0x1e
002bfab4 010c134c xul!XRE_main+0xde9
002bfb04 010c16f2 firefox!wmain+0x34c

0:006> dt -r         nspr4!PRLock 05caa4c0
   +0x000 links            : PRCListStr
      +0x000 next             : 0x05caa4c0 PRCListStr
         +0x000 next             : 0x05caa4c0 PRCListStr
         +0x004 prev             : 0x05caa4c0 PRCListStr
      +0x004 prev             : 0x05caa4c0 PRCListStr
         +0x000 next             : 0x05caa4c0 PRCListStr
         +0x004 prev             : 0x05caa4c0 PRCListStr
   +0x008 owner            : (null)
   +0x00c waitQ            : PRCListStr
      +0x000 next             : 0x05caa4cc PRCListStr
         +0x000 next             : 0x05caa4cc PRCListStr
         +0x004 prev             : 0x05caa4cc PRCListStr
      +0x004 prev             : 0x05caa4cc PRCListStr
         +0x000 next             : 0x05caa4cc PRCListStr
         +0x004 prev             : 0x05caa4cc PRCListStr
   +0x014 priority         : 0 ( PR_PRIORITY_FIRST )
   +0x018 boostPriority    : 0 ( PR_PRIORITY_FIRST )
   +0x01c ilock            : _MDLock
      +0x000 mutex            : _RTL_CRITICAL_SECTION
         +0x000 DebugInfo        : 0x00312730 _RTL_CRITICAL_SECTION_DEBUG
         +0x004 LockCount        : -14
         +0x008 RecursionCount   : 1
         +0x00c OwningThread     : 0x00000ee0
         +0x010 LockSemaphore    : 0x0000084c
         +0x014 SpinCount        : 0
      +0x018 notified         : _MDNotified
         +0x000 length           : 0
         +0x004 cv               : [6] <unnamed-tag>
         +0x04c link             : (null)

0:006> ~
   0  Id: 156c.1410 Suspend: 1 Teb: fffdd000 Unfrozen
   1  Id: 156c.fb8 Suspend: 1 Teb: fffaf000 Unfrozen
   2  Id: 156c.12c0 Suspend: 1 Teb: fffa9000 Unfrozen
   3  Id: 156c.e84 Suspend: 1 Teb: fffa6000 Unfrozen
   4  Id: 156c.ae8 Suspend: 1 Teb: fffa3000 Unfrozen
   5  Id: 156c.f74 Suspend: 1 Teb: fffa0000 Unfrozen
.  6  Id: 156c.1bcc Suspend: 1 Teb: fff9d000 Unfrozen
   7  Id: 156c.ee0 Suspend: 1 Teb: fff9a000 Unfrozen
   8  Id: 156c.15d8 Suspend: 1 Teb: fff97000 Unfrozen
   9  Id: 156c.1734 Suspend: 1 Teb: fff94000 Unfrozen
  10  Id: 156c.1440 Suspend: 1 Teb: fff85000 Unfrozen
  11  Id: 156c.1698 Suspend: 1 Teb: fff91000 Unfrozen
  12  Id: 156c.180 Suspend: 1 Teb: fff8e000 Unfrozen
  13  Id: 156c.1740 Suspend: 1 Teb: fff82000 Unfrozen
  14  Id: 156c.e58 Suspend: 1 Teb: fff7c000 Unfrozen
  15  Id: 156c.17d0 Suspend: 1 Teb: fff79000 Unfrozen
  16  Id: 156c.ea8 Suspend: 1 Teb: fff76000 Unfrozen
  17  Id: 156c.153c Suspend: 1 Teb: fff70000 Unfrozen
  18  Id: 156c.1098 Suspend: 1 Teb: fff6d000 Unfrozen
  19  Id: 156c.b6c Suspend: 1 Teb: fff37000 Unfrozen
  20  Id: 156c.1498 Suspend: 1 Teb: fff7f000 Unfrozen
  21  Id: 156c.1494 Suspend: 1 Teb: fff52000 Unfrozen
  22  Id: 156c.1598 Suspend: 1 Teb: fffac000 Unfrozen
  23  Id: 156c.17b8 Suspend: 1 Teb: fff64000 Unfrozen
  24  Id: 156c.e4c Suspend: 1 Teb: fff61000 Unfrozen
  25  Id: 156c.cf8 Suspend: 1 Teb: fff58000 Unfrozen
  26  Id: 156c.1560 Suspend: 1 Teb: fff55000 Unfrozen
  27  Id: 156c.d6c Suspend: 1 Teb: fff8b000 Unfrozen
  29  Id: 156c.1818 Suspend: 1 Teb: fff6a000 Unfrozen
  30  Id: 156c.958 Suspend: 1 Teb: fff88000 Unfrozen

0:006> ~7k 200
ChildEBP RetAddr
04ecf33c 74da1270 ntdll!ZwWaitForSingleObject+0x15
04ecf3ac 74da11d8 kernel32!WaitForSingleObjectEx+0xbe
04ecf3c0 6505f3f9 kernel32!WaitForSingleObject+0x12
04ecf3e4 6505af28 nspr4!_PR_MD_WAIT_CV+0xc9
04ecf404 6505ab31 nspr4!_PR_WaitCondVar+0x58
04ecf420 61ac2a50 nspr4!PR_Wait+0x51
04ecf448 61a5111e xul!nsThread::ProcessNextEvent+0x380
04ecf4d0 61bee46b xul!nsRefPtr<nsIRunnable>::~nsRefPtr<nsIRunnable>+0xe
04ecf510 61c384b8 xul!nsProxyObjectManager::GetProxyForObject+0x191
04ecf558 6233e430 xul!NS_GetProxyForObject+0x48
04ecf59c 6234e2ee xul!nsNSSSocketInfo::EnsureDocShellDependentStuffKnown+0xc2
04ecf5b0 6236d23c xul!nsNSSSocketInfo::GetPreviousCert+0x1c
04ecf644 64dc808b xul!HandshakeCallback+0x313
04ecf708 64db84f9 MOZCRT19!_VEC_memzero+0x36
04ecf724 63ed1bc4 MOZCRT19!arena_dalloc+0x39
04ecf728 64db83c8 softokn3!sftk_DestroySession+0x6b
04ecf740 64db84f9 MOZCRT19!arena_dalloc_small+0x28
04ecf7a0 644d4490 MOZCRT19!arena_dalloc+0x39
04ecf7bc 6505b402 ssl3!ssl3_CacheWrappedMasterSecret+0x142
04ecf7d4 6505b3e9 nspr4!PR_AssertCurrentThreadOwnsLock+0x12
04ecf7dc 644df940 nspr4!PR_Unlock+0x39
04ecf820 644d953a ssl3!CacheSID+0xa0
04ecf870 644d96a4 ssl3!ssl3_HandleHandshakeMessage+0x267
04ecf890 644d9b50 ssl3!ssl3_HandleHandshake+0xa8
04ecf91c 644d9e8c ssl3!ssl3_HandleRecord+0x3b9
04ecf948 644daff9 ssl3!ssl3_GatherCompleteHandshake+0x5d
04ecf95c 644dfaba ssl3!ssl_GatherRecord1stHandshake+0x29
04ecf970 644e0776 ssl3!ssl_Do1stHandshake+0x44
04ecf988 644e37e6 ssl3!ssl_SecureRecv+0xdf
04ecf9a8 61f5a59a ssl3!ssl_Recv+0x51
04ecf9d8 6237e39b xul!nsSSLThread::requestRecvMsgPeek+0xf2
04ecf9f8 61c30714 xul!PSMRecv+0x35
04ecfa28 61c30766 xul!nsSocketTransport::IsAlive+0x6e
04ecfa3c 61be8805 xul!nsHttpConnection::IsAlive+0x15
04ecfa50 61bb8c52 xul!nsHttpConnection::CanReuse+0x3f
04ecfa64 61bb89ce xul!nsHttpConnectionMgr::GetConnection+0x9e
04ecfaa4 61bb8934 xul!nsHttpConnectionMgr::ProcessNewTransaction+0x87
04ecfab4 61a07189 xul!nsHttpConnectionMgr::OnMsgNewTransaction+0x2b
04ecfac0 61ac282a xul!nsHttpConnectionMgr::nsConnEvent::Run+0x10
04ecfb5c 61b29318 xul!nsThread::ProcessNextEvent+0x15a
04ecfb7c 6505bdd9 xul!nsThread::ThreadFunc+0x88
00000000 00000000 nspr4!_PR_NativeRunThread+0x169
SSL thread gets lock by nsProxyObjectManager::GetProxyForObject(), then this thread is waiting event in xul!nsThread::ProcessNextEvent().

But main UI thread try getting this lock.  So dead lock occurs.
There is no SSL thread after bug 674147. Also, the use of XPCOM proxies in PSM was removed in bug 675221.
Status: NEW → RESOLVED
Closed: 13 years ago
Depends on: 675221, 674147
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.