Closed
Bug 1375344
Opened 7 years ago
Closed 6 years ago
Crash in shutdownhang | kernelbase.dll@0xcaf18
Categories
(Core :: General, defect, P3)
Tracking
()
RESOLVED
WONTFIX
Tracking | Status | |
---|---|---|
firefox54 | --- | affected |
People
(Reporter: gchang, Unassigned)
References
Details
(Keywords: crash)
Crash Data
This bug was filed from the Socorro interface and is report bp-6f6e1887-4868-468a-9fc5-c10ec0170621.
=============================================================
Frame Module Signature Source
0 ntdll.dll NtWaitForSingleObject
Ø 1 kernelbase.dll kernelbase.dll@0xcaf18
Ø 2 kernelbase.dll kernelbase.dll@0xcae71
3 winscard.dll CSCardSubcontext::WaitForAvailable()
4 winscard.dll CSCardSubcontext::ReleaseContext()
5 winscard.dll CSCardUserContext::ReleaseContext()
6 winscard.dll SCardReleaseContext
Ø 7 tokenmgr.dll tokenmgr.dll@0x37d7
8 kernel32.dll LoadEnclaveData
Ø 9 tokenmgr.dll tokenmgr.dll@0x4d14
Ø 10 tokenmgr.dll tokenmgr.dll@0x2dd1
Ø 11 wdpkcs.dll wdpkcs.dll@0x351af
Ø 12 wdpkcs.dll wdpkcs.dll@0x35caf
Ø 13 wdpkcs.dll wdpkcs.dll@0x28328
14 nss3.dll SECMOD_CancelWait security/nss/lib/pk11wrap/pk11util.c:1222
15 xul.dll SmartCardMonitoringThread::~SmartCardMonitoringThread() security/manager/ssl/nsSmartCardMonitor.cpp:184
16 xul.dll SmartCardThreadEntry::~SmartCardThreadEntry() security/manager/ssl/nsSmartCardMonitor.cpp:109
17 xul.dll SmartCardThreadEntry::`scalar deleting destructor'(unsigned int)
18 xul.dll nsNSSComponent::ShutdownNSS() security/manager/ssl/nsNSSComponent.cpp:2070
19 xul.dll nsNSSComponent::DoProfileBeforeChange() security/manager/ssl/nsNSSComponent.cpp:2314
20 xul.dll nsNSSComponent::Observe(nsISupports*, char const*, char16_t const*) security/manager/ssl/nsNSSComponent.cpp:2149
21 xul.dll nsObserverList::NotifyObservers(nsISupports*, char const*, char16_t const*) xpcom/ds/nsObserverList.cpp:112
22 xul.dll nsObserverService::NotifyObservers(nsISupports*, char const*, char16_t const*) xpcom/ds/nsObserverService.cpp:281
23 xul.dll nsXREDirProvider::DoShutdown() toolkit/xre/nsXREDirProvider.cpp:1248
24 xul.dll ScopedXPCOMStartup::~ScopedXPCOMStartup() toolkit/xre/nsAppRunner.cpp:1417
25 xul.dll mozilla::UniquePtr<ScopedXPCOMStartup, mozilla::DefaultDelete<ScopedXPCOMStartup> >::reset(ScopedXPCOMStartup*) obj-firefox/dist/include/mozilla/UniquePtr.h:345
26 xul.dll mozilla::UniquePtr<ScopedXPCOMStartup, mozilla::DefaultDelete<ScopedXPCOMStartup> >::operator=(std::nullptr_t) obj-firefox/dist/include/mozilla/UniquePtr.h:313
27 xul.dll XREMain::XRE_main(int, char** const, mozilla::BootstrapConfig const&) toolkit/xre/nsAppRunner.cpp:4705
28 xul.dll XRE_main(int, char** const, mozilla::BootstrapConfig const&) toolkit/xre/nsAppRunner.cpp:4768
29 xul.dll mozilla::BootstrapImpl::XRE_main(int, char** const, mozilla::BootstrapConfig const&) toolkit/xre/Bootstrap.cpp:45
30 firefox.exe wmain toolkit/xre/nsWindowsWMain.cpp:115
31 firefox.exe __scrt_common_main_seh f:/dd/vctools/crt/vcstartup/src/startup/exe_common.inl:253
32 kernel32.dll BaseThreadInitThunk
33 ntdll.dll __RtlUserThreadStart
34 ntdll.dll _RtlUserThreadStart

This is the #7 topcrash, and there has been a spike in the last 3 days.
Reporter | ||
Updated•7 years ago
status-firefox54:
--- → affected
Reporter | ||
Comment 1•7 years ago
Hi Nathan, can you help find someone to look at this?
Flags: needinfo?(nfroyd)
Comment 2•7 years ago
The crash in comment 0 comes from something NSS-y; there's another thread stuck doing smart card things:

Thread 26
Frame Module Signature Source
0 ntdll.dll NtWaitForAlertByThreadId
1 ntdll.dll RtlpWaitOnAddressWithTimeout
2 ntdll.dll RtlpWaitOnAddress
3 ntdll.dll RtlpWaitOnCriticalSection
4 ntdll.dll RtlpEnterCriticalSectionContended
5 ntdll.dll RtlEnterCriticalSection
6 nss3.dll PR_Lock nsprpub/pr/src/threads/combined/prulock.c:213
7 nss3.dll SECMOD_WaitForAnyTokenEvent security/nss/lib/pk11wrap/pk11util.c:1148
8 xul.dll SmartCardMonitoringThread::Execute() security/manager/ssl/nsSmartCardMonitor.cpp:344
9 xul.dll SmartCardMonitoringThread::LaunchExecute(void*) security/manager/ssl/nsSmartCardMonitor.cpp:397
10 nss3.dll _PR_NativeRunThread nsprpub/pr/src/threads/combined/pruthr.c:397

which looks like we might have deadlocked? ni keeler to evaluate.
Flags: needinfo?(nfroyd) → needinfo?(dkeeler)
Comment 3•7 years ago
However, there are a few other crashes that I looked at with this signature that look more like network cache crashes. For instance:

https://crash-stats.mozilla.com/report/index/b7c4b6cb-5e1b-43d4-b2b0-868420170623
https://crash-stats.mozilla.com/report/index/e3e93836-348b-4908-bccb-1cd280170623
https://crash-stats.mozilla.com/report/index/ca9dbe9c-21da-4d49-a85f-8b8bd0170623

The first one says the main thread is stuck:

Crashing Thread (0)
Frame Module Signature Source
0 ntdll.dll NtWaitForSingleObject
Ø 1 kernelbase.dll kernelbase.dll@0xcaf18
Ø 2 kernelbase.dll kernelbase.dll@0xcae71
3 xul.dll mozilla::net::detail::BlockingIOWatcher::WatchAndCancel(mozilla::Monitor&) netwerk/cache2/CacheIOThread.cpp:189
4 xul.dll mozilla::net::CacheIOThread::CancelBlockingIO() netwerk/cache2/CacheIOThread.cpp:417
5 xul.dll mozilla::net::ShutdownEvent::PostAndWait() netwerk/cache2/CacheFileIOManager.cpp:587
6 xul.dll mozilla::net::CacheFileIOManager::Shutdown() netwerk/cache2/CacheFileIOManager.cpp:1160
7 xul.dll mozilla::net::CacheObserver::Observe(nsISupports*, char const*, char16_t const*) netwerk/cache2/CacheObserver.cpp:542
8 xul.dll nsObserverList::NotifyObservers(nsISupports*, char const*, char16_t const*) xpcom/ds/nsObserverList.cpp:112
9 xul.dll nsObserverService::NotifyObservers(nsISupports*, char const*, char16_t const*) xpcom/ds/nsObserverService.cpp:281

and it looks like another thread is off doing cache I/O:

Thread 18
Frame Module Signature Source
0 ntdll.dll NtClose
Ø 1 KERNELBASE.dll KERNELBASE.dll@0xcadc9
2 nss3.dll _MD_CloseFile nsprpub/pr/src/md/windows/w95io.c:403
3 nss3.dll FileClose nsprpub/pr/src/io/prfile.c:207
4 nss3.dll PR_Close nsprpub/pr/src/io/priometh.c:104
5 xul.dll mozilla::net::CacheFileIOManager::MaybeReleaseNSPRHandleInternal(mozilla::net::CacheFileHandle*, bool) netwerk/cache2/CacheFileIOManager.cpp:2323
6 xul.dll mozilla::net::ReleaseNSPRHandleEvent::Run() netwerk/cache2/CacheFileIOManager.cpp:857
7 xul.dll mozilla::net::CacheIOThread::LoopOneLevel(unsigned int) netwerk/cache2/CacheIOThread.cpp:565
8 xul.dll mozilla::net::CacheIOThread::ThreadFunc() netwerk/cache2/CacheIOThread.cpp:503
9 xul.dll mozilla::net::CacheIOThread::ThreadFunc(void*) netwerk/cache2/CacheIOThread.cpp:446
10 nss3.dll _PR_NativeRunThread nsprpub/pr/src/threads/combined/pruthr.c:397

Or the second one, where the main thread is hanging:

Crashing Thread (0)
Frame Module Signature Source
0 ntdll.dll NtWaitForSingleObject
Ø 1 kernelbase.dll kernelbase.dll@0xcaf18
Ø 2 kernelbase.dll kernelbase.dll@0xcae71
3 nss3.dll _PR_MD_WAIT_CV nsprpub/pr/src/md/windows/w95cv.c:248
4 nss3.dll _PR_WaitCondVar nsprpub/pr/src/threads/combined/prucv.c:172
5 nss3.dll PR_WaitCondVar nsprpub/pr/src/threads/combined/prucv.c:525
6 xul.dll mozilla::CondVar::Wait(unsigned int) obj-firefox/dist/include/mozilla/CondVar.h:79
7 xul.dll mozilla::net::ShutdownEvent::PostAndWait() netwerk/cache2/CacheFileIOManager.cpp:582
8 xul.dll mozilla::net::CacheFileIOManager::Shutdown() netwerk/cache2/CacheFileIOManager.cpp:1160
9 xul.dll mozilla::net::CacheObserver::Observe(nsISupports*, char const*, char16_t const*) netwerk/cache2/CacheObserver.cpp:542
10 xul.dll nsObserverList::NotifyObservers(nsISupports*, char const*, char16_t const*) xpcom/ds/nsObserverList.cpp:112

and another thread is off doing things, slightly different from the previous:

Thread 20
Frame Module Signature Source
0 ntdll.dll NtSetInformationFile
Ø 1 KERNELBASE.dll KERNELBASE.dll@0xdc945
Ø 2 iNetSafe.dll iNetSafe.dll@0x5630
Ø 3 KERNELBASE.dll KERNELBASE.dll@0xdc75a
Ø 4 KERNELBASE.dll KERNELBASE.dll@0xdc736
5 xul.dll nsLocalFile::CopySingleFile(nsIFile*, nsIFile*, nsAString_internal const&, unsigned int) xpcom/io/nsLocalFileWin.cpp:1982
6 xul.dll nsLocalFile::CopyMove(nsIFile*, nsAString_internal const&, unsigned int) xpcom/io/nsLocalFileWin.cpp:2103
7 xul.dll nsLocalFile::MoveToNative(nsIFile*, nsACString_internal const&) xpcom/io/nsLocalFileWin.cpp:3628
8 xul.dll mozilla::net::CacheFileIOManager::DoomFileInternal(mozilla::net::CacheFileHandle*, mozilla::net::CacheFileIOManager::PinningDoomRestriction) netwerk/cache2/CacheFileIOManager.cpp:2144
9 xul.dll mozilla::net::DoomFileEvent::Run() netwerk/cache2/CacheFileIOManager.cpp:782
10 xul.dll mozilla::net::CacheIOThread::LoopOneLevel(unsigned int) netwerk/cache2/CacheIOThread.cpp:565
11 xul.dll mozilla::net::CacheIOThread::ThreadFunc() netwerk/cache2/CacheIOThread.cpp:503
12 xul.dll mozilla::net::CacheIOThread::ThreadFunc(void*) netwerk/cache2/CacheIOThread.cpp:446
13 nss3.dll _PR_NativeRunThread nsprpub/pr/src/threads/combined/pruthr.c:397

ni to mayhemer for cache knowledge, and ni back to gchang to see if it's possible to have these stacks processed differently so the different crashes show up as, well, different crashes.
Flags: needinfo?(honzab.moz)
Flags: needinfo?(gchang)
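One way the request at the end of comment 3 could be approached is to skip frames that say nothing about the culprit (system wait primitives, unsymbolicated `module@offset` frames) when building the signature. The following is only a sketch under that assumption: `GENERIC_PREFIXES`, `is_generic` and `hang_signature` are hypothetical names, and this is not Socorro's actual signature-generation code.

```python
# Hypothetical list of module prefixes whose frames are too generic to
# identify a hang (an assumption for illustration, not Socorro's real list).
GENERIC_PREFIXES = ("ntdll.dll!", "kernelbase.dll!", "kernel32.dll!")


def is_generic(module: str, function: str) -> bool:
    """Return True for frames that do not help tell hangs apart."""
    frame = f"{module.lower()}!{function}"
    unsymbolicated = "@0x" in function  # e.g. kernelbase.dll@0xcaf18
    return unsymbolicated or frame.startswith(GENERIC_PREFIXES)


def hang_signature(frames):
    """Build 'shutdownhang | <first informative frame>' from (module, function) pairs."""
    for module, function in frames:
        if not is_generic(module, function):
            return f"shutdownhang | {function}"
    # Nothing informative found: fall back to the top frame.
    return f"shutdownhang | {frames[0][1]}" if frames else "shutdownhang | EMPTY"


# Top frames of the smart card stack (comment 0) and the first cache stack (comment 3).
smartcard_stack = [
    ("ntdll.dll", "NtWaitForSingleObject"),
    ("kernelbase.dll", "kernelbase.dll@0xcaf18"),
    ("kernelbase.dll", "kernelbase.dll@0xcae71"),
    ("winscard.dll", "CSCardSubcontext::WaitForAvailable()"),
]
cache_stack = [
    ("ntdll.dll", "NtWaitForSingleObject"),
    ("kernelbase.dll", "kernelbase.dll@0xcaf18"),
    ("kernelbase.dll", "kernelbase.dll@0xcae71"),
    ("xul.dll", "mozilla::net::detail::BlockingIOWatcher::WatchAndCancel(mozilla::Monitor&)"),
]

print(hang_signature(smartcard_stack))
print(hang_signature(cache_stack))
```

With a rule like this the two stacks no longer collapse into `kernelbase.dll@0xcaf18`; the cost is that the skip list has to be maintained by hand.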
There's not much we can do about stacks like the one in comment 0: frames 0 through 13 are completely out of our control. That's a third-party PKCS#11 module (i.e. external code the user loaded into Firefox's memory space). For a long time I've been thinking about writing a PKCS#11 module that would load another given module in a child process, which would prevent things like this from hanging or crashing Firefox, but since only a small percentage of our users actually use PKCS#11 modules, it's been hard to justify the effort.
Flags: needinfo?(dkeeler)
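The child-process idea from the comment above can be illustrated in miniature. This is a sketch only, in Python rather than as a PKCS#11 module: `call_isolated` is a hypothetical helper, and the "third-party code" is just a Python snippet run in a separate interpreter. The point it shows is that a hang or crash in the child costs the parent a timeout, not its own shutdown.

```python
import subprocess
import sys


def call_isolated(snippet: str, timeout_s: float):
    """Run `snippet` in a child Python process.

    Returns the child's stdout on success, or None if the child hangs past
    `timeout_s` or exits with a non-zero status.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", snippet],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so the hang
        # stays contained in the child process.
        return None
    return result.stdout.strip() if result.returncode == 0 else None


# A well-behaved call, a call that hangs, and a call that fails.
print(call_isolated("print('token-ok')", timeout_s=10))
print(call_isolated("import time; time.sleep(30)", timeout_s=0.5))
print(call_isolated("import sys; sys.exit(3)", timeout_s=10))
```

A real implementation would also have to forward every PKCS#11 call and its data across the process boundary, which is where most of the effort mentioned above would go.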
Comment 5•7 years ago
(In reply to Nathan Froyd [:froydnj] from comment #3)
> However, there are a few other crashes that I looked at with this signature
> that look more like network cache crashes. For instance:

All of these are known. The code in the first pair of stacks already tries to handle the case where the main thread is waiting for the IO thread to shut down, by telling the IO thread to cancel its current sync IO operation. Despite what the Windows documentation for the function used suggests, in most cases it doesn't work anyway. It's not worse than what we had before, though. There is also a switch to turn this "cancel sync IO" feature off, I think, but we might end up in an even worse state. Note that after early shutdown we forbid most if not all of the cache background IO and just leak the open file handles (only in release builds without leak checking!).

I've spent a huge amount of time on this already and I'm not sure what more we could do besides using mmap or fully async IO on the Windows versions that support it. Note that sync IO on Windows often gets stuck, probably for extremely long times; we don't know the cause. The crash rate has become lower recently, so we decided not to invest more time here.
Flags: needinfo?(honzab.moz)
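The shutdown pattern discussed in comment 5 (wait for the IO thread, ask it to cancel, and give up if cancellation has no effect) can be modelled roughly as follows. This is a toy model with Python threads, not the Firefox implementation: `shutdown_with_cancel` and its parameters are hypothetical, and the "blocking IO" is simulated with timed waits.

```python
import threading


def shutdown_with_cancel(io_duration_s, cancellable, wait_s=0.2, grace_s=0.2):
    """Simulate shutting down an IO thread; return 'clean', 'cancelled' or 'abandoned'."""
    cancel = threading.Event()
    done = threading.Event()

    def io_thread():
        if cancellable:
            # Cancellable IO: wakes early as soon as `cancel` is set.
            cancel.wait(io_duration_s)
        else:
            # Uncancellable IO: blocks for the full duration regardless.
            threading.Event().wait(io_duration_s)
        done.set()

    # Daemon thread so an abandoned worker cannot block interpreter exit.
    threading.Thread(target=io_thread, daemon=True).start()

    if done.wait(wait_s):
        return "clean"        # IO finished on its own in time
    cancel.set()              # ask the IO thread to abort its current operation
    if done.wait(grace_s):
        return "cancelled"    # the cancel request took effect
    return "abandoned"        # stop waiting; the thread and its handles are leaked


print(shutdown_with_cancel(0.01, cancellable=True))
print(shutdown_with_cancel(2.0, cancellable=True))
print(shutdown_with_cancel(2.0, cancellable=False))
```

The third case corresponds to the stuck stacks in this bug: the cancel request is issued but the blocking call never returns in time, so the only remaining options are to keep waiting (and be reported as a shutdown hang) or to abandon the thread.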
Comment 6•7 years ago
(In reply to Honza Bambas (:mayhemer) from comment #5)
> (In reply to Nathan Froyd [:froydnj] from comment #3)
> > However, there are a few other crashes that I looked at with this signature
> > that look more like network cache crashes. For instance:
>
> All of these are known. The code in the first pair of stacks already tries
> to handle the case where the main thread is waiting for the IO thread to
> shut down, by telling the IO thread to cancel its current sync IO operation.
> Despite what the Windows documentation for the function used suggests, in
> most cases it doesn't work anyway. It's not worse than what we had before,
> though. There is also a switch to turn this "cancel sync IO" feature off, I
> think, but we might end up in an even worse state. Note that after early
> shutdown we forbid most if not all of the cache background IO and just leak
> the open file handles (only in release builds without leak checking!).
>
> I've spent a huge amount of time on this already and I'm not sure what more
> we could do besides using mmap or fully async IO on the Windows versions
> that support it. Note that sync IO on Windows often gets stuck, probably for
> extremely long times; we don't know the cause. The crash rate has become
> lower recently, so we decided not to invest more time here.

Thanks for the detailed response, very helpful! Do you have links to the other crashes/hangs?
Flags: needinfo?(honzab.moz)
Comment 7•7 years ago
https://crash-stats.mozilla.com/search/?signature=~BlockingIOWatcher%3A%3AWatchAndCancel&date=%3E%3D2017-06-03T14%3A27%3A47.000Z&date=%3C2017-07-03T14%3A27%3A47.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature

These go back to version 51, where this was introduced (bug 1288204). We had crashes even before that, like https://bugzilla.mozilla.org/show_bug.cgi?id=1263199, which points back even further.
Flags: needinfo?(honzab.moz)
Reporter | ||
Updated•7 years ago
Flags: needinfo?(gchang)
Comment 8•7 years ago
Moving this to P3 given the low number of crashes and the lack of resources to tackle it.
Priority: -- → P3
Comment 9•6 years ago
Closing because no crashes have been reported in the last 12 weeks.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX