Closed Bug 1221160 Opened 9 years ago Closed 8 years ago

AutoTraceLogLock can deadlock on same thread (at least on Windows)

Categories

(Core :: XPCOM, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla48
Tracking Status
firefox45 --- affected
firefox48 --- fixed

People

(Reporter: vlad, Assigned: vlad)

References

Details

Attachments

(1 file, 2 obsolete files)

Trying to run with XPCOM_MEM_LOG_CLASSES=... XPCOM_MEM_REFCNT_LOG=1 XPCOM_MEM_LEAK_LOG=1 on Windows, I get a deadlock with AutoTraceLogLock with the following stack.  AutoTraceLogLock attempts to lock by doing a cmpxchng of a bool from false to true; but we're already locked (this thread set the bool to true) in the stack frame marked as ##.  Basically we're doing a bunch of work underneath NS_LogAddRef that might result in another NS_LogAddRef to happen.

A fix would probably be to change the bool value to hold the current thread ID, and then to have AutoTraceLogLock compare that to the current thread value.


 	xul.dll!mozilla::Atomic<bool,1,void>::compareExchange(bool aOldValue, bool aNewValue) Line 747	C++
>> 	xul.dll!AutoTraceLogLock::AutoTraceLogLock() Line 70	C++
 	xul.dll!NS_LogAddRef(void * aPtr, unsigned __int64 aRefcnt, const char * aClass, unsigned int aClassSize) Line 1095	C++
 	xul.dll!nsStringBuffer::Alloc(unsigned __int64 aSize) Line 225	C++
 	xul.dll!nsAString_internal::MutatePrep(unsigned int aCapacity, wchar_t * * aOldData, unsigned int * aOldFlags) Line 133	C++
 	xul.dll!nsAString_internal::SetCapacity(unsigned int aCapacity, const mozilla::fallible_t & __formal) Line 665	C++
 	xul.dll!nsAString_internal::SetCapacity(unsigned int aCapacity) Line 644	C++
 	xul.dll!nsAString_internal::SetLength(unsigned int aLength) Line 697	C++
 	xul.dll!mozilla::NtPathToDosPath(const nsAString_internal & aNtPath, nsAString_internal & aDosPath) Line 86	C++
 	xul.dll!`anonymous namespace'::WinIOAutoObservation::WinIOAutoObservation(mozilla::IOInterposeObserver::Operation aOp, nsAString_internal & aFilename) Line 167	C++
 	xul.dll!`anonymous namespace'::InterposedNtCreateFile(void * * aFileHandle, unsigned long aDesiredAccess, _OBJECT_ATTRIBUTES * aObjectAttributes, _IO_STATUS_BLOCK * aIoStatusBlock, _LARGE_INTEGER * aAllocationSize, unsigned long aFileAttributes, unsigned long aShareAccess, unsigned long aCreateDisposition, unsigned long aCreateOptions, void * aEaBuffer, unsigned long aEaLength) Line 252	C++
 	KernelBase.dll!CreateFileInternal()	Unknown
 	KernelBase.dll!CreateFileW()	Unknown
 	dbghelp.dll!IStreamCRTFile::Create(unsigned short const *,int,int,int *)	Unknown
 	dbghelp.dll!MSF_HB::internalOpen(unsigned short const *,int,long *,long)	Unknown
 	dbghelp.dll!MSF::Open(unsigned short const *,int,long *,long,unsigned long)	Unknown
 	dbghelp.dll!PDB1::OpenEx2W(unsigned short const *,char const *,unsigned long,long,long *,unsigned short *,unsigned __int64,struct PDB * *)	Unknown
 	dbghelp.dll!PDB1::OpenValidate4(unsigned short const *,char const *,struct _GUID const *,unsigned long,unsigned long,long *,unsigned short *,unsigned __int64,struct PDB * *)	Unknown
 	dbghelp.dll!CDiaDataSource::loadAndValidateDataFromPdb(unsigned short const *,struct _GUID *,unsigned long,unsigned long)	Unknown
 	dbghelp.dll!DiaOpenPdbEx(unsigned short const *,struct _GUID *,unsigned long,unsigned long,int,void *,struct _PROCESS_ENTRY *,int *,unsigned short *)	Unknown
 	dbghelp.dll!diaOpenPdb(struct _IMGHLP_DEBUG_DATA *,unsigned short const *,struct _GUID *,unsigned long,unsigned long,int)	Unknown
 	dbghelp.dll!diaLocatePdb(struct _MODULE_ENTRY *,struct _IMGHLP_DEBUG_DATA *,unsigned short *,struct _GUID *,unsigned long,unsigned long,unsigned short const *)	Unknown
 	dbghelp.dll!diaGetPdb(struct _MODULE_ENTRY *,struct _IMGHLP_DEBUG_DATA *)	Unknown
 	dbghelp.dll!GetDebugData(struct _MODULE_ENTRY *,struct _IMGHLP_DEBUG_DATA *)	Unknown
 	dbghelp.dll!modload(void *,struct _MODULE_ENTRY *)	Unknown
 	dbghelp.dll!LoadModule(void *,unsigned short const *,unsigned short const *,unsigned __int64,unsigned long,void *,struct _MODLOAD_DATA *,unsigned long)	Unknown
 	dbghelp.dll!GetModule(void *,char const *,unsigned __int64,unsigned long,void *)	Unknown
 	dbghelp.dll!GetProcessModules(void *,int (*)(void *,char const *,unsigned __int64,unsigned long,void *),void *,int)	Unknown
 	dbghelp.dll!SymInitializeW()	Unknown
 	dbghelp.dll!SymInitialize()	Unknown
 	mozglue.dll!EnsureSymInitialized() Line 758	C++
 	mozglue.dll!MozDescribeCodeAddress(void * aPC, MozCodeAddressDetails * aDetails) Line 780	C++
 	xul.dll!mozilla::CodeAddressService<CodeAddressServiceStringTable,CodeAddressServiceStringAlloc,CodeAddressServiceLock>::GetLocation(unsigned int aFrameNumber, const void * aPc, char * aBuf, unsigned __int64 aBufLen) Line 149	C++
 	xul.dll!PrintStackFrameCached(unsigned int aFrameNumber, void * aPC, void * aSP, void * aClosure) Line 897	C++
 	mozglue.dll!MozStackWalk(void (unsigned int, void *, void *, void *) * aCallback, unsigned int aSkipFrames, unsigned int aMaxFrames, void * aClosure, unsigned __int64 aThread, void * aPlatformData) Line 618	C++
 	xul.dll!nsTraceRefcnt::WalkTheStackCached(_iobuf * aStream) Line 931	C++
##	xul.dll!NS_LogAddRef(void * aPtr, unsigned __int64 aRefcnt, const char * aClass, unsigned int aClassSize) Line 1130	C++
 	xul.dll!mozilla::layers::CompositorVsyncScheduler::AddRef() Line 99	C++
 	xul.dll!RefPtr<mozilla::layers::CompositorVsyncScheduler>::AddRefTraitsAddRefHelper(mozilla::layers::CompositorVsyncScheduler * aPtr) Line 358	C++
 	xul.dll!RefPtr<mozilla::layers::CompositorVsyncScheduler>::AddRefTraits<mozilla::layers::CompositorVsyncScheduler>::AddRef(mozilla::layers::CompositorVsyncScheduler * aPtr) Line 370	C++
 	xul.dll!RefPtr<mozilla::layers::CompositorVsyncScheduler>::assign_with_AddRef(mozilla::layers::CompositorVsyncScheduler * aRawPtr) Line 34	C++
 	xul.dll!RefPtr<mozilla::layers::CompositorVsyncScheduler>::operator=(mozilla::layers::CompositorVsyncScheduler * aRhs) Line 154	C++
 	xul.dll!mozilla::layers::CompositorParent::CompositorParent(nsIWidget * aWidget, bool aUseExternalSurfaceSize, int aSurfaceWidth, int aSurfaceHeight) Line 591	C++
 	xul.dll!nsBaseWidget::NewCompositorParent(int aSurfaceWidth, int aSurfaceHeight) Line 842	C++
 	xul.dll!nsBaseWidget::CreateCompositor(int aWidth, int aHeight) Line 1430	C++
 	xul.dll!nsBaseWidget::CreateCompositor() Line 850	C++
 	xul.dll!nsWindow::GetLayerManager(mozilla::layers::PLayerTransactionChild * aShadowManager, mozilla::layers::LayersBackend aBackendHint, nsIWidget::LayerManagerPersistence aPersistence, bool * aAllowRetaining) Line 3532	C++
 	xul.dll!nsWindow::OnPaint(HDC__ * aDC, unsigned int aNestingLevel) Line 223	C++
 	xul.dll!nsWindow::ProcessMessage(unsigned int msg, unsigned __int64 & wParam, __int64 & lParam, __int64 * aRetValue) Line 5048	C++
 	xul.dll!nsWindow::WindowProcInternal(HWND__ * hWnd, unsigned int msg, unsigned __int64 wParam, __int64 lParam) Line 4597	C++
 	xul.dll!CallWindowProcCrashProtected(__int64 (HWND__ *, unsigned int, unsigned __int64, __int64) * aWndProc, HWND__ * aHWnd, unsigned int aMsg, unsigned __int64 aWParam, __int64 aLParam) Line 35	C++
 	xul.dll!nsWindow::WindowProc(HWND__ * hWnd, unsigned int msg, unsigned __int64 wParam, __int64 lParam) Line 4550	C++
 	user32.dll!UserCallWinProcCheckWow()	Unknown
 	user32.dll!DispatchClientMessage()	Unknown
 	user32.dll!__fnDWORD()	Unknown
 	ntdll.dll!KiUserCallbackDispatcherContinue()	Unknown
 	user32.dll!NtUserCallHwndLock()	Unknown
 	xul.dll!nsWindow::Show(bool bState) Line 1266	C++
 	xul.dll!nsXULWindow::SetVisibility(bool aVisibility) Line 851	C++
 	xul.dll!nsXULWindow::OnChromeLoaded() Line 1056	C++
 	xul.dll!nsWebShellWindow::OnStateChange(nsIWebProgress * aProgress, nsIRequest * aRequest, unsigned int aStateFlags, nsresult aStatus) Line 584	C++
 	xul.dll!nsDocLoader::DoFireOnStateChange(nsIWebProgress * const aProgress, nsIRequest * const aRequest, int & aStateFlags, const nsresult aStatus) Line 1252	C++
 	xul.dll!nsDocLoader::doStopDocumentLoad(nsIRequest * request, nsresult aStatus) Line 843	C++
 	xul.dll!nsDocLoader::DocLoaderIsEmpty(bool aFlushLayout) Line 723	C++
 	xul.dll!nsDocLoader::OnStopRequest(nsIRequest * aRequest, nsISupports * aCtxt, nsresult aStatus) Line 608	C++
 	xul.dll!nsLoadGroup::RemoveRequest(nsIRequest * request, nsISupports * ctxt, nsresult aStatus) Line 640	C++
 	xul.dll!nsDocument::DoUnblockOnload() Line 9113	C++
 	xul.dll!nsUnblockOnloadEvent::Run() Line 9065	C++
 	xul.dll!nsThread::ProcessNextEvent(bool aMayWait, bool * aResult) Line 972	C++
 	xul.dll!NS_ProcessNextEvent(nsIThread * aThread, bool aMayWait) Line 297	C++
 	xul.dll!mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate * aDelegate) Line 95	C++
 	xul.dll!MessageLoop::RunInternal() Line 235	C++
 	xul.dll!MessageLoop::RunHandler() Line 228	C++
 	xul.dll!MessageLoop::Run() Line 202	C++
 	xul.dll!nsBaseAppShell::Run() Line 158	C++
 	xul.dll!nsAppShell::Run() Line 178	C++
 	xul.dll!nsAppStartup::Run() Line 281	C++
 	xul.dll!XREMain::XRE_mainRun() Line 4298	C++
 	xul.dll!XREMain::XRE_main(int argc, char * * argv, const nsXREAppData * aAppData) Line 4391	C++
 	xul.dll!XRE_main(int argc, char * * argv, const nsXREAppData * aAppData, unsigned int aFlags) Line 4493	C++
 	firefox.exe!do_main(int argc, char * * argv, nsIFile * xreDirectory) Line 212	C++
 	firefox.exe!NS_internal_main(int argc, char * * argv) Line 352	C++
 	firefox.exe!wmain(int argc, wchar_t * * argv) Line 131	C++
 	firefox.exe!__tmainCRTStartup() Line 255	C
 	kernel32.dll!BaseThreadInitThunk()	Unknown
 	ntdll.dll!RtlUserThreadStart()	Unknown
This patch doesn't work, since we deadlock when calling NS_GetCurrentThread() when we don't have a nsIThread for the current thread (in order to log the allocation of the nsThread).

Any suggestions?  Only other thing I can think of is to add a saveLogging guard in NS_LogAddRef/NS_LogRelease/etc. like we do in other places (or build it into AutoTraceLogLock); this would drop some logs though, and could cause GetSerialNumber assertions.
Attachment #8682554 - Flags: feedback?(nfroyd)
This doesn't crash/deadlock, but breaks logging entirely.  Almost all destructors/releases seem to be missed.
Attachment #8682558 - Flags: feedback?(nfroyd)
The thread-based patch works; I should have been using PR_GetCurrentThread instead of NS_.  I don't know if this has much of a perf hit on other platforms; a Windows debug build seems fine with refcnt logging enabled.
Assignee: nobody → vladimir
Attachment #8682554 - Attachment is obsolete: true
Attachment #8682558 - Attachment is obsolete: true
Attachment #8682554 - Flags: feedback?(nfroyd)
Attachment #8682558 - Flags: feedback?(nfroyd)
Attachment #8682573 - Flags: review?(nfroyd)
Comment on attachment 8682573 [details] [diff] [review]
Track threads, using PR_GetCurrentThread()

Review of attachment 8682573 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not too worried about the overhead here.
Attachment #8682573 - Flags: review?(nfroyd) → review+
https://hg.mozilla.org/mozilla-central/rev/32d19555348e
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
Depends on: 1270002
Vlad, did you mean to land the changes to the logging format (to include a [thread] annotation)? These changes appear to be unreviewed, and broke at least one refcount log parsing script.
Flags: needinfo?(vladimir)
Flags: needinfo?(vladimir)
You need to log in before you can comment on or make changes to this bug.