Open Bug 1101547 Opened 10 years ago Updated 2 years ago

Loading multiple PKCS11 modules leads to a hang

Categories

(NSS :: Libraries, defect, P3)

x86_64
macOS

Tracking

(Not tracked)

People

(Reporter: bugzilla, Unassigned)

References

(Blocks 1 open bug)

Details

User Agent: Mozilla/5.0 (Windows NT 5.1; rv:33.0) Gecko/20100101 Firefox/33.0
Build ID: 20141106120505

Steps to reproduce:

Loading multiple PKCS11 modules in the security device manager leads to a hang.

1) load module A in Firefox security device manager, calling it "Module A"
2) load module B in Firefox security device manager, calling it "Module B"
3) insert a smart card that is recognized by "Module A"
4) close security device manager
5) open the certificate manager (Show certificates, in advanced settings)
6) enter the smart card pin/password
7) Firefox hangs

The problem happens on OSX (tested 10.9, 10.10) and Windows (tested XP and 7).



Actual results:

Firefox blocks/hangs and must be forced to close


Expected results:

After entering the pin/password the next window should be shown (certificates window)
Component: Untriaged → Security
OS: Windows XP → Mac OS X
Hardware: x86 → x86_64
Does this happen irrespective of the modules A/B in question? Have you tried older/newer versions (i.e. did this use to work, and/or does it work on the current developer version, Firefox Nightly - https://nightly.mozilla.org/ )?

If it used to work, could you try to find out when this regressed using mozregression ( http://mozilla.github.io/mozregression/ ) ?

(I would normally try and help myself, but I don't know how to load PKCS11 modules and don't have security smartcards - hopefully folks from more relevant teams can help)
Flags: needinfo?(midori)
Product: Firefox → Core
I know this is not easy to reproduce; the proof is that the bug has existed since Firefox 3, and maybe in older versions too.
It is hard to reproduce because it is difficult for Firefox developers and contributors to get access to the hardware and software used with security smart cards.

It happens on Firefox 33.1, Windows 7.
It does not happen on Nightly 36.0a (changeset: 215857:acbd7b68fa8c (Fri Nov 14 17:32:39 2014 -0500)).

Actually, I'm not sure: it may be just by chance that it works on Nightly. There is some kind of deadlock when more than one PKCS#11 library is loaded. I compiled Nightly with debug enabled, which may slow down some operations so that the deadlock simply does not appear.
Anyway, I'll also try with debug disabled, but I need at least some debug symbols to try to understand where the hang is happening.
Flags: needinfo?(midori)
(In reply to Giuseppe Amato from comment #2)
> I know this is not easy to reproduce; the proof is that the bug has existed
> since Firefox 3, and maybe in older versions too.
> It is hard to reproduce because it is difficult for Firefox developers and
> contributors to get access to the hardware and software used with security
> smart cards.
> 
> It happens on Firefox 33.1, Windows 7.
> It does not happen on Nightly 36.0a (changeset: 215857:acbd7b68fa8c (Fri Nov
> 14 17:32:39 2014 -0500)).
> 
> Actually, I'm not sure: it may be just by chance that it works on Nightly.
> There is some kind of deadlock when more than one PKCS#11 library is loaded.
> I compiled Nightly with debug enabled, which may slow down some operations
> so that the deadlock simply does not appear.
> Anyway, I'll also try with debug disabled, but I need at least some debug
> symbols to try to understand where the hang is happening.

Opt builds get symbols by default, so that part should be OK, although because they are optimized, some variables may be optimized out. Thanks for your help!
I've reproduced the bug in Nightly with all debug options removed, i.e. using an empty mozconfig
( Nightly 36.0a, changeset: 215857:acbd7b68fa8c (Fri Nov 14 17:32:39 2014 -0500) )

I've found a possible deadlock, but I can't figure out exactly why it happens.
There are many threads using the NSS layer, working with security tokens.
At the time of the hang there are 3 threads inside the nssSlot_IsTokenPresent() function (2 worker threads and the main thread).
All of them have detected an empty slot and are trying to "clean up" all certificates belonging to it.



"Main thread" (TID=4048 / 0x0FD0) tries to lock the "cert cache lock" (hangs here):
-		&lock->ilock.mutex	0x08e1efcc {DebugInfo=0xffffffff {Type=??? CreatorBackTraceIndex=??? CriticalSection=??? ...} LockCount=...}	_RTL_CRITICAL_SECTION *
+		DebugInfo	0xffffffff {Type=??? CreatorBackTraceIndex=??? CriticalSection=??? ...}	_RTL_CRITICAL_SECTION_DEBUG *
		LockCount	-10	long
		RecursionCount	1	long
		OwningThread	0x00000f60	void *
		LockSemaphore	0x000006b0	void *
		SpinCount	1500	unsigned long


"Worker Thread 1" (TID=552 / 0x0228) tries to lock the "cert cache lock" (hangs here):
-		&lock->ilock.mutex	0x08e1efcc {DebugInfo=0xffffffff {Type=??? CreatorBackTraceIndex=??? CriticalSection=??? ...} LockCount=...}	_RTL_CRITICAL_SECTION *
+		DebugInfo	0xffffffff {Type=??? CreatorBackTraceIndex=??? CriticalSection=??? ...}	_RTL_CRITICAL_SECTION_DEBUG *
		LockCount	-10	long
		RecursionCount	1	long
		OwningThread	0x00000f60	void *
		LockSemaphore	0x000006b0	void *
		SpinCount	1500	unsigned long



"Worker Thread 2" (TID=3936 / 0x0F60) is the owner of the "cert cache lock".
It acquired the lock by calling PZ_Lock(td->cache->lock); in the nssTrustDomain_RemoveTokenCertsFromCache() function.

"Worker Thread 2" then tries to lock the nssPKIObject inside the remove_token_certs() function.
It hangs inside the nssPKIObject_Lock(nssPKIObject * object) function, calling:
PZ_Lock(object->sync.lock);

I'm not an expert in Windows internals, but it looks to me as if something is wrong with the critical section it is trying to lock:
-		object->sync.lock	0x120f5ec0 {links={next=0x00000000 <NULL> prev=0x120f5ec4 {next=0x120f5ec4 {next=0x120f5ec4 {...} prev=...} ...} } ...}	PRLock *
+		links	{next=0x00000000 <NULL> prev=0x120f5ec4 {next=0x120f5ec4 {next=0x120f5ec4 {next=0x120f5ec4 {...} prev=...} ...} ...} }	PRCListStr
+		owner	0x120f5ec4 {state=302997188 priority=302997188 arg=0x00000000 ...}	PRThread *
+		waitQ	{next=0x00000000 <NULL> prev=0x120f5ed0 {next=0x120f5ed0 {next=0x120f5ed0 {next=0x120f5ed0 {...} prev=...} ...} ...} }	PRCListStr
		priority	302997200	PRThreadPriority
		boostPriority	PR_PRIORITY_FIRST (0)	PRThreadPriority
-		ilock	{mutex={DebugInfo=0x00000000 <NULL> LockCount=-1 RecursionCount=-1 ...} notified={length=1500 cv=0x120f5ef8 {...} ...} }	_MDLock
-		mutex	{DebugInfo=0x00000000 <NULL> LockCount=-1 RecursionCount=-1 ...}	_RTL_CRITICAL_SECTION
+		DebugInfo	0x00000000 <NULL>	_RTL_CRITICAL_SECTION_DEBUG *
		LockCount	-1	long
		RecursionCount	-1	long
		OwningThread	0x00000000	void *
		LockSemaphore	0x00000000	void *
		SpinCount	0	unsigned long
+		notified	{length=1500 cv=0x120f5ef8 {{cv=0x00000000 <NULL> times=302997440 notifyHead=0x00000001 {state=??? priority=...} }, ...} ...}	_MDNotified
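
A note on reading the two dumps of the "cert cache lock" earlier in this comment, where LockCount is -10. On Windows Server 2003 SP1 and later, Microsoft's debugger documentation describes LockCount in _RTL_CRITICAL_SECTION as a bit field rather than a plain counter. The sketch below decodes the value under that assumption; decode_lock_count is a hypothetical helper name, not part of NSS, NSPR, or any debugger.

```python
def decode_lock_count(lock_count: int) -> dict:
    """Decode RTL_CRITICAL_SECTION.LockCount (Windows Server 2003 SP1 and later).

    Assumed layout, following Microsoft's debugger documentation:
      bit 0      -> 0 means locked, 1 means not locked
      bit 1      -> 0 means a waiter has been woken, 1 means none has
      other bits -> ones-complement of the number of waiting threads
    """
    return {
        "locked": (lock_count & 1) == 0,
        "waiter_woken": (lock_count & 2) == 0,
        "waiting_threads": (~lock_count) >> 2,
    }


if __name__ == "__main__":
    # LockCount value shown in both dumps of the cert cache lock.
    print(decode_lock_count(-10))
```

Read this way, -10 means the lock is held, no waiter has been woken, and two threads are waiting, which is consistent with the main thread and Worker Thread 1 both being blocked on it.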


Main thread call stack:
	nss3.dll!PR_Lock(PRLock * lock) Line 215	C
 	nss3.dll!nssTrustDomain_RemoveTokenCertsFromCache(NSSTrustDomainStr * td, NSSTokenStr * token) Line 438	C
 	nss3.dll!nssToken_NotifyCertsNotVisible(NSSTokenStr * tok) Line 303	C
 	nss3.dll!nssSlot_IsTokenPresent(NSSSlotStr * slot) Line 206	C
 	nss3.dll!nssSlot_GetToken(NSSSlotStr * slot) Line 230	C
 	nss3.dll!nssTrustDomain_FindTrustForCertificate(NSSTrustDomainStr * td, NSSCertificateStr * c) Line 1077	C
 	nss3.dll!nssTrust_GetCERTCertTrustForCert(NSSCertificateStr * c, CERTCertificateStr * cc) Line 615	C
 	nss3.dll!fill_CERTCertificateFields(NSSCertificateStr * c, CERTCertificateStr * cc, int forced) Line 822	C
 	nss3.dll!stan_GetCERTCertificate(NSSCertificateStr * c, int forceUpdate) Line 890	C
 	nss3.dll!STAN_GetCERTCertificate(NSSCertificateStr * c) Line 922	C
 	nss3.dll!pk11ListCertCallback(NSSCertificateStr * c, void * arg) Line 2450	C
 	nss3.dll!nssPKIObjectCollection_Traverse(nssPKIObjectCollectionStr * collection, nssPKIObjectCallback * callback) Line 924	C
 	nss3.dll!NSSTrustDomain_TraverseCertificates(NSSTrustDomainStr * td, PRStatus (NSSCertificateStr *, void *) * callback, void * arg) Line 1051	C
 	nss3.dll!PK11_ListCerts(PK11CertListType type, void * pwarg) Line 2520	C
 	xul.dll!nsNSSCertCache::CacheAllCerts() Line 49	C++
 	xul.dll!NS_InvokeByIndex(nsISupports * that, unsigned int methodIndex, unsigned int paramCount, nsXPTCVariant * params) Line 71	C++
 	xul.dll!XPCWrappedNative::CallMethod(XPCCallContext & ccx, XPCWrappedNative::CallMode mode) Line 1714	C++
 	xul.dll!XPC_WN_CallMethod(JSContext * cx, unsigned int argc, JS::Value * vp) Line 1244	C++
 	xul.dll!js::Invoke(JSContext * cx, JS::CallArgs args, js::MaybeConstruct construct) Line 475	C++
 	xul.dll!Interpret(JSContext * cx, js::RunState & state) Line 2526	C++
 	xul.dll!js::RunScript(JSContext * cx, js::RunState & state) Line 432	C++
 	xul.dll!js::Invoke(JSContext * cx, JS::CallArgs args, js::MaybeConstruct construct) Line 504	C++
 	xul.dll!js::Invoke(JSContext * cx, const JS::Value & thisv, const JS::Value & fval, unsigned int argc, const JS::Value * argv, JS::MutableHandle<JS::Value> rval) Line 538	C++
 	xul.dll!JS::Call(JSContext * cx, JS::Handle<JS::Value> thisv, JS::Handle<JS::Value> fval, const JS::HandleValueArray & args, JS::MutableHandle<JS::Value> rval) Line 5029	C++
 	xul.dll!mozilla::dom::EventHandlerNonNull::Call(JSContext * cx, JS::Handle<JS::Value> aThisVal, mozilla::dom::Event & event, JS::MutableHandle<JS::Value> aRetVal, mozilla::ErrorResult & aRv) Line 259	C++
 	xul.dll!mozilla::dom::EventHandlerNonNull::Call<nsISupports *>(nsISupports * const & thisObjPtr, mozilla::dom::Event & event, JS::MutableHandle<JS::Value> aRetVal, mozilla::ErrorResult & aRv, mozilla::dom::CallbackObject::ExceptionHandling aExceptionHandling) Line 350	C++
 	xul.dll!mozilla::JSEventHandler::HandleEvent(nsIDOMEvent * aEvent) Line 215	C++
 	xul.dll!mozilla::EventListenerManager::HandleEventSubType(mozilla::EventListenerManager::Listener * aListener, nsIDOMEvent * aDOMEvent, mozilla::dom::EventTarget * aCurrentTarget) Line 964	C++
 	xul.dll!mozilla::EventListenerManager::HandleEventInternal(nsPresContext * aPresContext, mozilla::WidgetEvent * aEvent, nsIDOMEvent * * aDOMEvent, mozilla::dom::EventTarget * aCurrentTarget, nsEventStatus * aEventStatus) Line 1080	C++
 	xul.dll!mozilla::EventListenerManager::HandleEvent(nsPresContext * aPresContext, mozilla::WidgetEvent * aEvent, nsIDOMEvent * * aDOMEvent, mozilla::dom::EventTarget * aCurrentTarget, nsEventStatus * aEventStatus) Line 330	C++
 	xul.dll!mozilla::EventTargetChainItem::HandleEvent(mozilla::EventChainPostVisitor & aVisitor, mozilla::ELMCreationDetector & aCd) Line 203	C++
 	xul.dll!mozilla::EventTargetChainItem::HandleEventTargetChain(nsTArray<mozilla::EventTargetChainItem> & aChain, mozilla::EventChainPostVisitor & aVisitor, mozilla::EventDispatchingCallback * aCallback, mozilla::ELMCreationDetector & aCd) Line 295	C++
 	xul.dll!mozilla::EventDispatcher::Dispatch(nsISupports * aTarget, nsPresContext * aPresContext, mozilla::WidgetEvent * aEvent, nsIDOMEvent * aDOMEvent, nsEventStatus * aEventStatus, mozilla::EventDispatchingCallback * aCallback, nsCOMArray<mozilla::dom::EventTarget> * aTargets) Line 609	C++
 	xul.dll!nsDocumentViewer::LoadComplete(tag_nsresult aStatus) Line 996	C++
 	xul.dll!nsDocShell::EndPageLoad(nsIWebProgress * aProgress, nsIChannel * aChannel, tag_nsresult aStatus) Line 7382	C++
 	xul.dll!nsDocShell::OnStateChange(nsIWebProgress * aProgress, nsIRequest * aRequest, unsigned int aStateFlags, tag_nsresult aStatus) Line 7193	C++
 	xul.dll!nsDocLoader::DoFireOnStateChange(nsIWebProgress * const aProgress, nsIRequest * const aRequest, int & aStateFlags, const tag_nsresult aStatus) Line 1271	C++
 	xul.dll!nsDocLoader::doStopDocumentLoad(nsIRequest * request, tag_nsresult aStatus) Line 850	C++
 	xul.dll!nsDocLoader::DocLoaderIsEmpty(bool aFlushLayout) Line 742	C++
 	xul.dll!nsDocLoader::OnStopRequest(nsIRequest * aRequest, nsISupports * aCtxt, tag_nsresult aStatus) Line 628	C++
 	xul.dll!nsLoadGroup::RemoveRequest(nsIRequest * request, nsISupports * ctxt, tag_nsresult aStatus) Line 689	C++
 	xul.dll!nsDocument::DoUnblockOnload() Line 8913	C++
 	xul.dll!nsUnblockOnloadEvent::Run() Line 8865	C++
 	xul.dll!nsThread::ProcessNextEvent(bool aMayWait, bool * aResult) Line 830	C++
 	xul.dll!NS_ProcessNextEvent(nsIThread * aThread, bool aMayWait) Line 265	C++
 	xul.dll!nsXULWindow::ShowModal() Line 364	C++
 	xul.dll!nsContentTreeOwner::ShowAsModal() Line 557	C++
 	xul.dll!nsWindowWatcher::OpenWindowInternal(nsIDOMWindow * aParent, const char * aUrl, const char * aName, const char * aFeatures, bool aCalledFromJS, bool aDialog, bool aNavigate, nsITabParent * aOpeningTab, nsIArray * argv, nsIDOMWindow * * _retval) Line 1001	C++
 	xul.dll!nsWindowWatcher::OpenWindow2(nsIDOMWindow * aParent, const char * aUrl, const char * aName, const char * aFeatures, bool aCalledFromScript, bool aDialog, bool aNavigate, nsITabParent * aOpeningTab, nsISupports * aArguments, nsIDOMWindow * * _retval) Line 420	C++
 	xul.dll!nsGlobalWindow::OpenInternal(const nsAString_internal & aUrl, const nsAString_internal & aName, const nsAString_internal & aOptions, bool aDialog, bool aContentModal, bool aCalledNoScript, bool aDoJSFixups, bool aNavigate, nsIArray * argv, nsISupports * aExtraArgument, nsIPrincipal * aCalleePrincipal, JSContext * aJSCallerContext, nsIDOMWindow * * aReturn) Line 11889	C++
 	xul.dll!nsGlobalWindow::OpenDialog(JSContext * aCx, const nsAString_internal & aUrl, const nsAString_internal & aName, const nsAString_internal & aOptions, const mozilla::dom::Sequence<JS::Value> & aExtraArgument, mozilla::ErrorResult & aError) Line 7715	C++
 	xul.dll!nsGlobalWindow::OpenDialog(JSContext * aCx, const nsAString_internal & aUrl, const nsAString_internal & aName, const nsAString_internal & aOptions, const mozilla::dom::Sequence<JS::Value> & aExtraArgument, mozilla::ErrorResult & aError) Line 7694	C++
 	xul.dll!mozilla::dom::WindowBinding::openDialog(JSContext * cx, JS::Handle<JSObject *> obj, nsGlobalWindow * self, const JSJitMethodCallArgs & args) Line 5283	C++
 	xul.dll!mozilla::dom::WindowBinding::genericMethod(JSContext * cx, unsigned int argc, JS::Value * vp) Line 12361	C++
 	xul.dll!js::Invoke(JSContext * cx, JS::CallArgs args, js::MaybeConstruct construct) Line 475	C++
 	xul.dll!Interpret(JSContext * cx, js::RunState & state) Line 2526	C++
 	xul.dll!js::RunScript(JSContext * cx, js::RunState & state) Line 432	C++
 	xul.dll!js::Invoke(JSContext * cx, JS::CallArgs args, js::MaybeConstruct construct) Line 504	C++
 	xul.dll!js::CallOrConstructBoundFunction(JSContext * cx, unsigned int argc, JS::Value * vp) Line 1574	C++
 	xul.dll!js::Invoke(JSContext * cx, JS::CallArgs args, js::MaybeConstruct construct) Line 475	C++
 	xul.dll!js::Invoke(JSContext * cx, const JS::Value & thisv, const JS::Value & fval, unsigned int argc, const JS::Value * argv, JS::MutableHandle<JS::Value> rval) Line 538	C++
 	xul.dll!JS::Call(JSContext * cx, JS::Handle<JS::Value> thisv, JS::Handle<JS::Value> fval, const JS::HandleValueArray & args, JS::MutableHandle<JS::Value> rval) Line 5029	C++
 	xul.dll!mozilla::dom::EventListener::HandleEvent(JSContext * cx, JS::Handle<JS::Value> aThisVal, mozilla::dom::Event & event, mozilla::ErrorResult & aRv) Line 47	C++
 	xul.dll!mozilla::dom::EventListener::HandleEvent<mozilla::dom::EventTarget *>(mozilla::dom::EventTarget * const & thisObjPtr, mozilla::dom::Event & event, mozilla::ErrorResult & aRv, mozilla::dom::CallbackObject::ExceptionHandling aExceptionHandling) Line 54	C++
 	xul.dll!mozilla::EventListenerManager::HandleEventSubType(mozilla::EventListenerManager::Listener * aListener, nsIDOMEvent * aDOMEvent, mozilla::dom::EventTarget * aCurrentTarget) Line 962	C++
 	xul.dll!mozilla::EventListenerManager::HandleEventInternal(nsPresContext * aPresContext, mozilla::WidgetEvent * aEvent, nsIDOMEvent * * aDOMEvent, mozilla::dom::EventTarget * aCurrentTarget, nsEventStatus * aEventStatus) Line 1080	C++
 	xul.dll!mozilla::EventListenerManager::HandleEvent(nsPresContext * aPresContext, mozilla::WidgetEvent * aEvent, nsIDOMEvent * * aDOMEvent, mozilla::dom::EventTarget * aCurrentTarget, nsEventStatus * aEventStatus) Line 330	C++
 	xul.dll!mozilla::EventTargetChainItem::HandleEvent(mozilla::EventChainPostVisitor & aVisitor, mozilla::ELMCreationDetector & aCd) Line 203	C++
 	xul.dll!mozilla::EventTargetChainItem::HandleEventTargetChain(nsTArray<mozilla::EventTargetChainItem> & aChain, mozilla::EventChainPostVisitor & aVisitor, mozilla::EventDispatchingCallback * aCallback, mozilla::ELMCreationDetector & aCd) Line 295	C++
 	xul.dll!mozilla::EventDispatcher::Dispatch(nsISupports * aTarget, nsPresContext * aPresContext, mozilla::WidgetEvent * aEvent, nsIDOMEvent * aDOMEvent, nsEventStatus * aEventStatus, mozilla::EventDispatchingCallback * aCallback, nsCOMArray<mozilla::dom::EventTarget> * aTargets) Line 609	C++
 	xul.dll!mozilla::EventDispatcher::DispatchDOMEvent(nsISupports * aTarget, mozilla::WidgetEvent * aEvent, nsIDOMEvent * aDOMEvent, nsPresContext * aPresContext, nsEventStatus * aEventStatus) Line 671	C++
 	xul.dll!PresShell::HandleDOMEventWithTarget(nsIContent * aTargetContent, nsIDOMEvent * aEvent, nsEventStatus * aStatus) Line 8405	C++
 	xul.dll!nsContentUtils::DispatchXULCommand(nsIContent * aTarget, bool aTrusted, nsIDOMEvent * aSourceEvent, nsIPresShell * aShell, bool aCtrl, bool aAlt, bool aShift, bool aMeta) Line 5925	C++
 	xul.dll!nsButtonBoxFrame::DoMouseClick(mozilla::WidgetGUIEvent * aEvent, bool aTrustEvent) Line 155	C++
 	xul.dll!nsButtonBoxFrame::MouseClicked(nsPresContext * aPresContext, mozilla::WidgetGUIEvent * aEvent) Line 33	C++
 	xul.dll!nsButtonBoxFrame::HandleEvent(nsPresContext * aPresContext, mozilla::WidgetGUIEvent * aEvent, nsEventStatus * aEventStatus) Line 118	C++
 	xul.dll!nsPresShellEventCB::HandleEvent(mozilla::EventChainPostVisitor & aVisitor) Line 506	C++
 	xul.dll!mozilla::EventTargetChainItem::HandleEventTargetChain(nsTArray<mozilla::EventTargetChainItem> & aChain, mozilla::EventChainPostVisitor & aVisitor, mozilla::EventDispatchingCallback * aCallback, mozilla::ELMCreationDetector & aCd) Line 340	C++
 	xul.dll!mozilla::EventDispatcher::Dispatch(nsISupports * aTarget, nsPresContext * aPresContext, mozilla::WidgetEvent * aEvent, nsIDOMEvent * aDOMEvent, nsEventStatus * aEventStatus, mozilla::EventDispatchingCallback * aCallback, nsCOMArray<mozilla::dom::EventTarget> * aTargets) Line 609	C++
 	xul.dll!PresShell::HandleEventInternal(mozilla::WidgetEvent * aEvent, nsEventStatus * aStatus) Line 8218	C++
 	xul.dll!PresShell::HandleEventWithTarget(mozilla::WidgetEvent * aEvent, nsIFrame * aFrame, nsIContent * aContent, nsEventStatus * aStatus) Line 7951	C++
 	xul.dll!mozilla::EventStateManager::CheckForAndDispatchClick(nsPresContext * aPresContext, mozilla::WidgetMouseEvent * aEvent, nsEventStatus * aStatus) Line 4413	C++
 	xul.dll!mozilla::EventStateManager::PostHandleEvent(nsPresContext * aPresContext, mozilla::WidgetEvent * aEvent, nsIFrame * aTargetFrame, nsEventStatus * aStatus) Line 2922	C++
 	xul.dll!PresShell::HandleEventInternal(mozilla::WidgetEvent * aEvent, nsEventStatus * aStatus) Line 8230	C++
 	xul.dll!PresShell::HandlePositionedEvent(nsIFrame * aTargetFrame, mozilla::WidgetGUIEvent * aEvent, nsEventStatus * aEventStatus) Line 7924	C++
 	xul.dll!PresShell::HandleEvent(nsIFrame * aFrame, mozilla::WidgetGUIEvent * aEvent, bool aDontRetargetEvents, nsEventStatus * aEventStatus) Line 7721	C++
 	xul.dll!nsViewManager::DispatchEvent(mozilla::WidgetGUIEvent * aEvent, nsView * aView, nsEventStatus * aStatus) Line 776	C++
 	xul.dll!nsView::HandleEvent(mozilla::WidgetGUIEvent * aEvent, bool aUseAttachedEvents) Line 1098	C++
 	xul.dll!nsWindow::DispatchEvent(mozilla::WidgetGUIEvent * event, nsEventStatus & aStatus) Line 3713	C++
 	xul.dll!nsWindow::DispatchMouseEvent(unsigned int aEventType, unsigned int wParam, long lParam, bool aIsContextMenuKey, short aButton, unsigned short aInputSource) Line 4055	C++
 	xul.dll!TimerThread::RemoveTimer(nsTimerImpl * aTimer) Line 439	C++
 	nss3.dll!_MD_CURRENT_THREAD() Line 314	C
 	xul.dll!PresShell::Release() Line 809	C++
 	xul.dll!nsRefreshDriver::Tick(__int64 aNowEpoch, mozilla::TimeStamp aNowTime) Line 1376	C++
 	[External Code]	
 	xul.dll!nsAppShell::ProcessNextNativeEvent(bool mayWait) Line 294	C++
 	xul.dll!nsBaseAppShell::DoProcessNextNativeEvent(bool mayWait, unsigned int recursionDepth) Line 145	C++
 	xul.dll!nsBaseAppShell::OnProcessNextEvent(nsIThreadInternal * thr, bool mayWait, unsigned int recursionDepth) Line 298	C++
 	xul.dll!nsThread::ProcessNextEvent(bool aMayWait, bool * aResult) Line 805	C++
 	xul.dll!NS_ProcessNextEvent(nsIThread * aThread, bool aMayWait) Line 265	C++
 	xul.dll!mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate * aDelegate) Line 140	C++
 	xul.dll!MessageLoop::RunHandler() Line 227	C++
 	xul.dll!MessageLoop::Run() Line 201	C++
 	xul.dll!nsBaseAppShell::Run() Line 166	C++
 	xul.dll!nsAppShell::Run() Line 180	C++
 	xul.dll!nsAppStartup::Run() Line 282	C++
 	xul.dll!XREMain::XRE_mainRun() Line 4150	C++
 	xul.dll!NS_TableDrivenQI(void * aThis, const nsID & aIID, void * * aInstancePtr, const QITableEntry * aEntries) Line 18	C++
 	xul.dll!nsComponentManagerImpl::QueryInterface(const nsID & aIID, void * * aInstancePtr) Line 940	C++
 	xul.dll!nsQueryInterface::operator()(const nsID & aIID, void * * aAnswer) Line 19	C++
 	xul.dll!nsCOMPtr_base::assign_from_qi(const nsQueryInterface aQI, const nsID & aIID) Line 62	C++

    
    
"Worker thread 1" call  stack:
	nss3.dll!PR_Lock(PRLock * lock) Line 215	C
 	nss3.dll!nssTrustDomain_LockCertCache(NSSTrustDomainStr * td) Line 369	C
 	nss3.dll!nssCertificate_Destroy(NSSCertificateStr * c) Line 108	C
 	nss3.dll!CERT_DestroyCertificate(CERTCertificateStr * cert) Line 796	C
 	nss3.dll!ssl3_CleanupPeerCerts(sslSocketStr * ss) Line 9745	C
 	nss3.dll!ssl3_DestroySSL3Info(sslSocketStr * ss) Line 12102	C
 	nss3.dll!ssl_DestroySocketContents(sslSocketStr * ss) Line 351	C
 	nss3.dll!ssl_FreeSocket(sslSocketStr * ss) Line 415	C
 	nss3.dll!ssl_DefClose(sslSocketStr * ss) Line 205	C
 	nss3.dll!ssl_SecureClose(sslSocketStr * ss) Line 1140	C
 	nss3.dll!ssl_Close(PRFileDesc * fd) Line 2081	C
 	xul.dll!nsSocketTransport::ReleaseFD_Locked(PRFileDesc * fd) Line 1675	C++
 	xul.dll!nsSocketTransport::OnSocketDetached(PRFileDesc * fd) Line 1940	C++
 	xul.dll!nsSocketTransportService::DetachSocket(nsSocketTransportService::SocketContext * listHead, nsSocketTransportService::SocketContext * sock) Line 188	C++
 	xul.dll!nsSocketTransportService::DoPollIteration(bool wait) Line 909	C++
 	xul.dll!nsSocketTransportService::Run() Line 736	C++
 	xul.dll!nsThread::ProcessNextEvent(bool aMayWait, bool * aResult) Line 830	C++
 	xul.dll!NS_ProcessNextEvent(nsIThread * aThread, bool aMayWait) Line 265	C++
 	xul.dll!mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate * aDelegate) Line 368	C++
 	xul.dll!MessageLoop::RunHandler() Line 227	C++
 	xul.dll!MessageLoop::Run() Line 201	C++
 	xul.dll!nsThread::ThreadFunc(void * aArg) Line 359	C++
 	nss3.dll!_PR_NativeRunThread(void * arg) Line 419	C
 	nss3.dll!pr_root(void * arg) Line 90	C
 	[External Code]	

    
"Worker thread 2" call  stack:

 	nss3.dll!_PR_MD_WAIT_CV(_MDCVar * cv, _MDLock * lock, unsigned int timeout) Line 250	C
 	nss3.dll!_PR_WaitCondVar(PRThread * thread, PRCondVar * cvar, PRLock * lock, unsigned int timeout) Line 172	C
 	nss3.dll!PR_WaitCondVar(PRCondVar * cvar, unsigned int timeout) Line 525	C
 	nss3.dll!PR_EnterMonitor(PRMonitor * mon) Line 136	C
 	nss3.dll!nssPKIObject_Lock(nssPKIObjectStr * object) Line 25	C
 	nss3.dll!remove_token_certs(const void * k, void * v, void * a) Line 395	C
 	nss3.dll!nss_hash_enumerator(PLHashEntry * he, int index, void * arg) Line 345	C
 	nss3.dll!PL_HashTableEnumerateEntries(PLHashTable * ht, int (PLHashEntry *, int, void *) * f, void * arg) Line 375	C
 	nss3.dll!nssHash_Iterate(nssHashStr * hash, void (const void *, void *, void *) * fcn, void * closure) Line 370	C
 	nss3.dll!nssTrustDomain_RemoveTokenCertsFromCache(NSSTrustDomainStr * td, NSSTokenStr * token) Line 439	C
 	nss3.dll!nssToken_NotifyCertsNotVisible(NSSTokenStr * tok) Line 303	C
	nss3.dll!nssSlot_IsTokenPresent(NSSSlotStr * slot) Line 206	C
 	nss3.dll!nssToken_IsPresent(NSSTokenStr * token) Line 1441	C
 	nss3.dll!pk11_IsPresentCertLoad(PK11SlotInfoStr * slot, int loadCerts) Line 1437	C
 	nss3.dll!PK11_IsPresent(PK11SlotInfoStr * slot) Line 1485	C
 	nss3.dll!secmod_HandleWaitForSlotEvent(SECMODModuleStr * mod, unsigned long flags, unsigned int latency) Line 1072	C
 	nss3.dll!SECMOD_WaitForAnyTokenEvent(SECMODModuleStr * mod, unsigned long flags, unsigned int latency) Line 1124	C
 	xul.dll!SmartCardMonitoringThread::Execute() Line 379	C++
 	xul.dll!SmartCardMonitoringThread::LaunchExecute(void * arg) Line 393	C++
 	nss3.dll!_PR_NativeRunThread(void * arg) Line 419	C
 	nss3.dll!pr_root(void * arg) Line 90	C
 	[External Code]
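
The observations in this comment can be condensed into a small wait-for table. The sketch below is only a model of what the dumps and stacks show, assuming the OwningThread field is accurate; HOLDS, WAITS_FOR, owner_of, and blocked_by are names made up for illustration. It shows that two threads are blocked behind Worker Thread 2, and that the owner of the per-object lock is the piece still missing.

```python
# Locks held and awaited by each thread, as read from the dumps and stacks above.
# Thread and lock labels are informal names, not NSS identifiers.
HOLDS = {
    "worker2": {"cert_cache_lock"},  # OwningThread 0x0f60 in both dumps
}
WAITS_FOR = {
    "main": "cert_cache_lock",
    "worker1": "cert_cache_lock",
    "worker2": "pki_object_lock",
}


def owner_of(lock):
    """Return the thread known to hold `lock`, or None if the dumps do not say."""
    for thread, locks in HOLDS.items():
        if lock in locks:
            return thread
    return None


def blocked_by(thread):
    """Follow the wait-for chain from `thread` until it ends or repeats."""
    chain = [thread]
    while True:
        lock = WAITS_FOR.get(chain[-1])
        if lock is None:
            return chain, "running"
        owner = owner_of(lock)
        if owner is None:
            return chain, f"unknown owner of {lock}"
        if owner in chain:
            return chain + [owner], "cycle"
        chain.append(owner)


if __name__ == "__main__":
    for name in ("main", "worker1", "worker2"):
        print(name, *blocked_by(name))
```

In this model every chain ends at the unknown owner of the object lock. Adding a holder for that lock to HOLDS is what would turn a chain into a cycle.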
Brian/David, does comment #4 here ring any bells, and/or do we need to get other NSS folks involved? :-)
Flags: needinfo?(dkeeler)
Flags: needinfo?(brian)
Guys, I'm not able to debug this problem effectively.
If anyone is interested in reproducing this bug, I can send a HW/SW kit anywhere in the world.
Flags: needinfo?(brian)
I may have found the deadlock:

This is the relevant part of Thread 2:

 	nss3.dll!nssPKIObject_Lock(nssPKIObjectStr * object) Line 25	C
 	nss3.dll!remove_token_certs(const void * k, void * v, void * a) Line 395	C
 	nss3.dll!nss_hash_enumerator(PLHashEntry * he, int index, void * arg) Line 345	C
 	nss3.dll!PL_HashTableEnumerateEntries(PLHashTable * ht, int (PLHashEntry *, int, void *) * f, void * arg) Line 375	C
 	nss3.dll!nssHash_Iterate(nssHashStr * hash, void (const void *, void *, void *) * fcn, void * closure) Line 370	C
 	nss3.dll!nssTrustDomain_RemoveTokenCertsFromCache(NSSTrustDomainStr * td, NSSTokenStr * token) Line 439	C
 	nss3.dll!nssToken_NotifyCertsNotVisible(NSSTokenStr * tok) Line 303	C
	nss3.dll!nssSlot_IsTokenPresent(NSSSlotStr * slot) Line 206	C
 	nss3.dll!nssToken_IsPresent(NSSTokenStr * token) Line 1441	C
 	nss3.dll!pk11_IsPresentCertLoad(PK11SlotInfoStr * slot, int loadCerts) Line 1437	C
 	nss3.dll!PK11_IsPresent(PK11SlotInfoStr * slot) Line 1485	C

This is the relevant part of the Main Thread:

	nss3.dll!PR_Lock(PRLock * lock) Line 215	C
 	nss3.dll!nssTrustDomain_RemoveTokenCertsFromCache(NSSTrustDomainStr * td, NSSTokenStr * token) Line 438	C
 	nss3.dll!nssToken_NotifyCertsNotVisible(NSSTokenStr * tok) Line 303	C
 	nss3.dll!nssSlot_IsTokenPresent(NSSSlotStr * slot) Line 206	C
 	nss3.dll!nssSlot_GetToken(NSSSlotStr * slot) Line 230	C
 	nss3.dll!nssTrustDomain_FindTrustForCertificate(NSSTrustDomainStr * td, NSSCertificateStr * c) Line 1077	C
 	nss3.dll!nssTrust_GetCERTCertTrustForCert(NSSCertificateStr * c, CERTCertificateStr * cc) Line 615	C
 	nss3.dll!fill_CERTCertificateFields(NSSCertificateStr * c, CERTCertificateStr * cc, int forced) Line 822	C
 	nss3.dll!stan_GetCERTCertificate(NSSCertificateStr * c, int forceUpdate) Line 890	C
 	nss3.dll!STAN_GetCERTCertificate(NSSCertificateStr * c) Line 922	C
 	nss3.dll!pk11ListCertCallback(NSSCertificateStr * c, void * arg) Line 2450	C
 	nss3.dll!nssPKIObjectCollection_Traverse(nssPKIObjectCollectionStr * collection, nssPKIObjectCallback * callback) Line 924	C
 	nss3.dll!NSSTrustDomain_TraverseCertificates(NSSTrustDomainStr * td, PRStatus (NSSCertificateStr *, void *) * callback, void * arg) Line 1051	C
 	nss3.dll!PK11_ListCerts(PK11CertListType type, void * pwarg) Line 2520	C

Thread 2 has td->cache->lock. The Main Thread is attempting to acquire it. Also, the Main Thread has acquired the lock on a nssPKIObjectStr in stan_GetCERTCertificate. If my hunch is correct, this is the same lock that Thread 2 is attempting to acquire. Thus the deadlock.
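
If that hunch holds, the pattern is a lock-order inversion. The following is a minimal Python model of that pattern, not NSS code: cache_lock and object_lock are stand-ins for td->cache->lock and the per-object lock, the function names are hypothetical, and the timeout exists only so the demonstration terminates instead of hanging.

```python
import threading

cache_lock = threading.Lock()    # stand-in for td->cache->lock
object_lock = threading.Lock()   # stand-in for the per-object nssPKIObject lock
rendezvous = threading.Barrier(2)
results = {}


def run(name, first, second):
    """Take `first`, wait until the peer holds its own first lock, then try `second`."""
    with first:
        rendezvous.wait()
        # A blocking lock call would wait forever here; the timeout makes the stall visible.
        got_second = second.acquire(timeout=0.5)
        results[name] = got_second
        if got_second:
            second.release()
        # Keep holding `first` until the peer has finished its own attempt.
        rendezvous.wait()


def demo():
    # "main" models the certificate-listing path: object lock first, then cache lock.
    # "worker2" models the token-removal path: cache lock first, then object lock.
    threads = [
        threading.Thread(target=run, args=("main", object_lock, cache_lock)),
        threading.Thread(target=run, args=("worker2", cache_lock, object_lock)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dict(results)


if __name__ == "__main__":
    print(demo())
```

In this model neither thread ever obtains its second lock. Taking the two locks in the same order on both paths would remove the cycle in the model; whether that is practical in NSS is not something this sketch establishes.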

Bob, does this seem plausible?
Assignee: nobody → nobody
Component: Security → Libraries
Flags: needinfo?(dkeeler) → needinfo?(rrelyea)
Product: Core → NSS
Version: 33 Branch → trunk
The worst part of the threading code within NSS is the relationship between locks in pki code (exactly where this is deadlocking). I fixed a deadlock in the directory server in this code that may be related. The patch we use is available in bug 943144.

It may be the patch there will fix this problem (70% chance). There's still a pretty high chance that this is a different deadlock condition.

bob
Flags: needinfo?(rrelyea)
I think that the patch in bug 943144 can fix this problem too. I'm going to try it.
That patch contains a possible memory leak and has not been accepted yet.
I've applied the patch and now there is a deadlock elsewhere.

The deadlock is on td->cache->lock, which is owned by thread 0x0000099c (2460, the main thread), if we can trust the "OwningThread" field in the underlying CRITICAL_SECTION structure.



2460	0	Main Thread	Main Thread	mozglue.dll!arena_dalloc	Normal

 	mozglue.dll!arena_dalloc(void * ptr, unsigned int offset) Line 4720	C
 	mozglue.dll!je_free(void * ptr) Line 6459	C
 	xul.dll!nsTArray_base<nsTArrayInfallibleAllocator,nsTArray_CopyWithMemutils>::ShrinkCapacity(unsigned int aElemSize, unsigned int aElemAlign) Line 227	C++
 	[External Code]	
>	xul.dll!mozilla::widget::WinUtils::WaitForMessage(unsigned long aTimeoutMs) Line 637	C++
 	xul.dll!nsAppShell::ProcessNextNativeEvent(bool mayWait) Line 300	C++
 	xul.dll!nsBaseAppShell::DoProcessNextNativeEvent(bool mayWait, unsigned int recursionDepth) Line 145	C++
 	xul.dll!nsBaseAppShell::OnProcessNextEvent(nsIThreadInternal * thr, bool mayWait, unsigned int recursionDepth) Line 298	C++
 	xul.dll!nsThread::ProcessNextEvent(bool aMayWait, bool * aResult) Line 805	C++
 	xul.dll!NS_ProcessNextEvent(nsIThread * aThread, bool aMayWait) Line 265	C++
 	xul.dll!mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate * aDelegate) Line 140	C++
 	xul.dll!MessageLoop::RunHandler() Line 227	C++
 	xul.dll!MessageLoop::Run() Line 201	C++
 	xul.dll!nsBaseAppShell::Run() Line 166	C++
 	xul.dll!nsAppShell::Run() Line 180	C++
 	xul.dll!nsAppStartup::Run() Line 282	C++
 	xul.dll!XREMain::XRE_mainRun() Line 4150	C++
 	xul.dll!NS_TableDrivenQI(void * aThis, const nsID & aIID, void * * aInstancePtr, const QITableEntry * aEntries) Line 18	C++
 	xul.dll!nsComponentManagerImpl::QueryInterface(const nsID & aIID, void * * aInstancePtr) Line 940	C++
 	xul.dll!nsQueryInterface::operator()(const nsID & aIID, void * * aAnswer) Line 19	C++
 	xul.dll!nsCOMPtr_base::assign_from_qi(const nsQueryInterface aQI, const nsID & aIID) Line 62	C++


2456	0	Worker Thread	msvcr120.dll thread	nss3.dll!PR_Lock	Normal

 	nss3.dll!PR_Lock(PRLock * lock) Line 215	C
>	nss3.dll!nssTrustDomain_GetCertsForSubjectFromCache(NSSTrustDomainStr * td, NSSItemStr * subject, nssListStr * certListOpt) Line 954	C
 	nss3.dll!nssTrustDomain_FindCertificatesBySubject(NSSTrustDomainStr * td, NSSItemStr * subject, NSSCertificateStr * * rvOpt, unsigned int maximumOpt, NSSArenaStr * arenaOpt) Line 585	C
 	nss3.dll!find_cert_issuer(NSSCertificateStr * c, NSSTimeStr * timeOpt, NSSUsageStr * usage, NSSPoliciesStr * policiesOpt, NSSTrustDomainStr * td, NSSCryptoContextStr * cc) Line 410	C
 	nss3.dll!nssCertificate_BuildChain(NSSCertificateStr * c, NSSTimeStr * timeOpt, NSSUsageStr * usage, NSSPoliciesStr * policiesOpt, NSSCertificateStr * * rvOpt, unsigned int rvLimit, NSSArenaStr * arenaOpt, PRStatus * statusOpt, NSSTrustDomainStr * td, NSSCryptoContextStr * cc) Line 481	C
 	nss3.dll!CERT_CertChainFromCert(CERTCertificateStr * cert, SECCertUsageEnum usage, int includeRoot) Line 1042	C
 	nss3.dll!ssl3_HandleCertificateRequest(sslSocketStr * ss, unsigned char * b, unsigned int length) Line 7116	C
 	nss3.dll!ssl3_HandleHandshakeMessage(sslSocketStr * ss, unsigned char * b, unsigned int length) Line 10912	C
 	nss3.dll!ssl3_HandleHandshake(sslSocketStr * ss, sslBufferStr * origBuf) Line 11026	C
 	nss3.dll!ssl3_HandleRecord(sslSocketStr * ss, SSL3Ciphertext * cText, sslBufferStr * databuf) Line 11696	C
 	nss3.dll!md_UnlockAndPostNotifies(_MDLock * lock, PRThread * waitThred, _MDCVar * waitCV) Line 139	C
 	nss3.dll!ssl_Do1stHandshake(sslSocketStr * ss) Line 109	C
 	nss3.dll!ssl_Send(PRFileDesc * fd, const void * buf, int len, int flags, unsigned int timeout) Line 2124	C



3568	0	Worker Thread	msvcr120.dll thread	nss3.dll!PR_Lock	Normal
 
>	nss3.dll!PR_Lock(PRLock * lock) Line 215	C
 	nss3.dll!nssTrustDomain_RemoveTokenCertsFromCache(NSSTrustDomainStr * td, NSSTokenStr * token) Line 456	C
 	nss3.dll!nssToken_NotifyCertsNotVisible(NSSTokenStr * tok) Line 303	C
 	nss3.dll!nssSlot_IsTokenPresent(NSSSlotStr * slot) Line 206	C
 	nss3.dll!nssToken_IsPresent(NSSTokenStr * token) Line 1441	C
 	nss3.dll!pk11_IsPresentCertLoad(PK11SlotInfoStr * slot, int loadCerts) Line 1437	C
 	nss3.dll!PK11_IsPresent(PK11SlotInfoStr * slot) Line 1485	C
 	nss3.dll!secmod_HandleWaitForSlotEvent(SECMODModuleStr * mod, unsigned long flags, unsigned int latency) Line 1072	C
 	nss3.dll!SECMOD_WaitForAnyTokenEvent(SECMODModuleStr * mod, unsigned long flags, unsigned int latency) Line 1124	C
 	xul.dll!SmartCardMonitoringThread::Execute() Line 379	C++
 	xul.dll!SmartCardMonitoringThread::LaunchExecute(void * arg) Line 393	C++
 	nss3.dll!_PR_NativeRunThread(void * arg) Line 419	C
 	nss3.dll!pr_root(void * arg) Line 90	C
 	[External Code]
I've found a very similar report about Firefox 3.5:

http://permalink.gmane.org/gmane.comp.mozilla.crypto/16719


As I stated in comment #2, this bug has existed since early versions of Firefox.
This bug is still here.

Is no one able to fix it?
Blocks: 1399364
Status: UNCONFIRMED → NEW
Ever confirmed: true
Priority: -- → P3
Severity: normal → S3