Closed Bug 1045670 Opened 11 years ago Closed 10 years ago

Hang [@ mozilla::net::CacheStorageService::AddStorageEntry ]

Categories

(Core :: Networking: Cache, defect)

x86
Windows 7
defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 1036692

People

(Reporter: ted, Assigned: mayhemer)

Details

My Windows nightly hung yesterday. I broke into it with a debugger and the main thread was stuck in a lock here:
ntdll.dll!_ZwWaitForSingleObject@12() Unknown
ntdll.dll!_RtlpWaitOnCriticalSection@8() Unknown
ntdll.dll!_RtlEnterCriticalSection@4() Unknown
> nss3.dll!PR_Lock(PRLock * lock) Line 215 C
xul.dll!mozilla::OffTheBooksMutex::Lock() Line 69 C++
xul.dll!mozilla::BaseAutoLock<mozilla::Mutex>::BaseAutoLock<mozilla::Mutex>(mozilla::Mutex & aLock) Line 165 C++
xul.dll!mozilla::net::CacheStorageService::AddStorageEntry(const nsACString_internal & aContextKey, nsIURI * aURI, const nsACString_internal & aIdExtension, bool aWriteToDisk, bool aCreateIfNotExist, bool aReplace, mozilla::net::CacheEntryHandle * * aResult) Line 1340 C++
xul.dll!mozilla::net::CacheStorageService::AddStorageEntry(const mozilla::net::CacheStorage * aStorage, nsIURI * aURI, const nsACString_internal & aIdExtension, bool aCreateIfNotExist, bool aReplace, mozilla::net::CacheEntryHandle * * aResult) Line 1311 C++
xul.dll!mozilla::net::CacheStorage::AsyncOpenURI(nsIURI * aURI, const nsACString_internal & aIdExtension, unsigned int aFlags, nsICacheEntryOpenCallback * aCallback) Line 98 C++
xul.dll!mozilla::net::nsHttpChannel::OpenCacheEntry(bool usingSSL) Line 2603 C++
xul.dll!mozilla::net::nsHttpChannel::Connect() Line 333 C++
xul.dll!mozilla::net::nsHttpChannel::BeginConnect() Line 4713 C++
xul.dll!mozilla::net::nsHttpChannel::OnProxyAvailable(nsICancelable * request, nsIURI * uri, nsIProxyInfo * pi, tag_nsresult status) Line 4791 C++
xul.dll!nsAsyncResolveRequest::DoCallback() Line 226 C++
xul.dll!nsAsyncResolveRequest::OnQueryComplete(tag_nsresult status, const nsCString & pacString, const nsCString & newPACURL) Line 196 C++
xul.dll!ExecuteCallback::Run() Line 80 C++
xul.dll!nsThread::ProcessNextEvent(bool aMayWait, bool * aResult) Line 771 C++
xul.dll!NS_ProcessNextEvent(nsIThread * aThread, bool aMayWait) Line 265 C++
xul.dll!mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate * aDelegate) Line 140 C++
xul.dll!MessageLoop::RunHandler() Line 223 C++
xul.dll!MessageLoop::Run() Line 197 C++
xul.dll!nsBaseAppShell::Run() Line 166 C++
xul.dll!nsAppShell::Run() Line 191 C++
xul.dll!nsAppStartup::Run() Line 279 C++
xul.dll!XREMain::XRE_mainRun() Line 4013 C++
xul.dll!XREMain::XRE_main(int argc, char * * argv, const nsXREAppData * aAppData) Line 4084 C++
xul.dll!XRE_main(int argc, char * * argv, const nsXREAppData * aAppData, unsigned int aFlags) Line 4298 C++
firefox.exe!do_main(int argc, char * * argv, nsIFile * xreDirectory) Line 282 C++
firefox.exe!NS_internal_main(int argc, char * * argv) Line 643 C++
firefox.exe!wmain(int argc, wchar_t * * argv) Line 112 C++
firefox.exe!__tmainCRTStartup() Line 552 C
kernel32.dll!@BaseThreadInitThunk@12() Unknown
ntdll.dll!___RtlUserThreadStart@8() Unknown
ntdll.dll!__RtlUserThreadStart@8() Unknown
I crashed it with "killfirefox.exe" and produced this crash report: https://crash-stats.mozilla.com/report/index/7130b5c0-f5ed-476e-8eb5-497262140728 (Ignore the crashing thread, look at the other threads.) I also saved off a minidump locally if that'd be of use.
My self-build has hung twice in AddStorageEntry this week (with slightly different backtraces). The lock appears to think that the cache IO thread still owns it; however, the backtrace on the cache IO thread is this:
xul.dll!mozilla::CondVar::Wait(PR_INTERVAL_NO_TIMEOUT)
xul.dll!mozilla::Monitor::Wait(PR_INTERVAL_NO_TIMEOUT)
xul.dll!mozilla::MonitorAutoLock::Wait(PR_INTERVAL_NO_TIMEOUT)
xul.dll!mozilla::net::CacheIOThread::ThreadFunc()
xul.dll!mozilla::net::CacheIOThread::ThreadFunc(void*)
...and it's just hung again, this time with the same main stack as Ted's hang (but the same Cache IO Thread stack as comment #1). I'm not using it right now, so I can poke around in it with the debugger if you need anything.
Why do you think that the lock thinks that it is owned by cache IO thread? Is there any cache activity on some other thread?
(In reply to Michal Novotny from comment #3)
> Why do you think that the lock thinks that it is owned by cache IO thread?
> Is there any cache activity on some other thread?
On my first hang I started with WinDbg's !analyze -v -hang and it said that there was an abandoned critical section owned by that thread. Then on later hangs I peeked into the lock's ilock's mutex's OwningThread which is the cache IO thread.
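As a rough illustration of what inspecting an owner field means, here is a minimal portable C++ sketch of a mutex that records which thread currently holds it, so the owner can be read from a debugger or from code. OwnerTrackingMutex and its members are hypothetical names invented for this sketch; it is not the Windows critical section layout and not Mozilla's Mutex implementation.

```cpp
#include <atomic>
#include <mutex>
#include <thread>

// Hypothetical wrapper: remembers the id of the thread that holds the lock.
class OwnerTrackingMutex {
public:
    void Lock() {
        mMutex.lock();
        // Record the owner only after the lock has actually been acquired.
        mOwner.store(std::this_thread::get_id());
    }

    void Unlock() {
        // Clear the owner before releasing, so a stale id is never visible
        // while another thread already holds the lock.
        mOwner.store(std::thread::id());
        mMutex.unlock();
    }

    // A default-constructed std::thread::id means "nobody holds the lock".
    std::thread::id Owner() const { return mOwner.load(); }

private:
    std::mutex mMutex;
    std::atomic<std::thread::id> mOwner{std::thread::id()};
};
```

If a thread that is recorded as owner turns out to be idle in an unrelated wait, as in the cache IO thread backtrace above, that points at a lock that was acquired but never released on some code path.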
We don't call Lock()/Unlock() on the CacheStorageService's lock directly; instead we always use a MutexAutoLock allocated on the stack. So how can this happen?
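To make the stack-allocated guard pattern concrete, here is a minimal sketch using the standard library's std::lock_guard as a stand-in for MutexAutoLock. EntryTable and AddEntry are hypothetical names for illustration only and are not the CacheStorageService code. The point is that the guard's destructor releases the mutex on every exit path, which is why a leaked lock is surprising with this pattern.

```cpp
#include <mutex>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a service that guards a table with one mutex.
class EntryTable {
public:
    // Returns true if a new entry was created, false if it already existed.
    // Throws if the entry is missing and createIfNotExist is false.
    bool AddEntry(const std::string& key, bool createIfNotExist) {
        std::lock_guard<std::mutex> lock(mMutex);  // acquired here

        if (mEntries.find(key) != mEntries.end()) {
            return false;  // guard destructor releases on early return
        }
        if (!createIfNotExist) {
            // Guard destructor also releases during stack unwinding.
            throw std::runtime_error("entry not found");
        }
        mEntries.emplace(key, 0);
        return true;  // guard destructor releases on normal return
    }

private:
    std::mutex mMutex;
    std::unordered_map<std::string, int> mEntries;
};
```

With a guard like this, the mutex can only stay held if the guarded scope never exits, for example because the owning thread is blocked or was terminated inside it.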
I updated my build a few days ago and I stressed it out yesterday and the bug did not manifest.
Assignee: nobody → honzab.moz
Could also have something to do with bug 1035411.
Neil, was that a debug build?
Flags: needinfo?(neil)
No, it was just an unoptimized build. (I no longer have that build.)
Flags: needinfo?(neil)
Has this ever manifested again?
Flags: needinfo?(ted)
Flags: needinfo?(neil)
Marking as duplicate. Still, I'd like to know whether you were able to reproduce this recently.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE
I haven't seen this since.
Flags: needinfo?(ted)
I've stopped dogfooding this sort of thing because unoptimised builds are just too slow these days and optimised builds don't give enough useful debugging information, but I don't remember seeing this again.
Flags: needinfo?(neil)