630420 - Cold startup is sometimes slow due to race condition of cache lock

Reporter

Description


15 years ago

Because the UI thread is blocked waiting for the cache lock, cold startup sometimes takes 30s.

ENV
===
2011-01-30 nightly

UI thread
=========
0:021> ~0k
ChildEBP RetAddr
003c9b40 77e08dcd ntdll!ZwWaitForSingleObject+0x15
003c9ba4 77e08d98 ntdll!RtlpWaitOnCriticalSection+0x155
003c9bcc 6956b3a7 ntdll!RtlEnterCriticalSection+0x152
003c9bdc 664a864f nspr4!PR_Lock+0x17
003c9bf4 666221bd xul!nsCacheService::OpenCacheEntry+0x2d
003c9c10 6649dfbb xul!nsCacheSession::AsyncOpenCacheEntry+0x19
003c9c9c 6649de6f xul!nsHttpChannel::OpenNormalCacheEntry+0xc1
003c9d84 664a8d1c xul!nsHttpChannel::OpenCacheEntry+0x111
003c9dbc 664ad05a xul!nsHttpChannel::Connect+0x12c
003c9dd4 667109f8 xul!nsHttpChannel::AsyncOpen+0x1aa
003c9df0 665613f7 xul!NS_InvokeByIndex_P+0x27
003ca16c 68553d1f xul!XPC_WN_CallMethod+0x747

This lock is ...
0:000> dt -r2 nspr4!PRLock 08f21b30
   +0x000 links : PRCListStr
   : :
   +0x01c ilock : _MDLock
      +0x000 mutex : _RTL_CRITICAL_SECTION
         +0x000 DebugInfo : 0x001dcfc8 _RTL_CRITICAL_SECTION_DEBUG
         +0x004 LockCount : -6
         +0x008 RecursionCount : 1
         +0x00c OwningThread : 0x00001244
         +0x010 LockSemaphore : 0x0000046c
         +0x014 SpinCount : 0
      +0x018 notified : _MDNotified
         +0x000 length : 0
         +0x004 cv : [6] <unnamed-tag>
         +0x04c link : (null)

Owner of lock
0:021> ~19k 100
ChildEBP RetAddr
0985f2d4 74e11e4d ntdll!ZwOpenFile+0x12
0985f31c 74e1431f NTMARTA!I_MartaFileNtOpenFile+0x4d
0985f360 74e14171 NTMARTA!MartaFindNextFile+0xc5
0985f3a0 74e14422 NTMARTA!MartaUpdateTree+0x293
0985f3f0 74e13d74 NTMARTA!MartaUpdateTree+0x1e2
0985f448 74e134cd NTMARTA!MartaManualPropagation+0x31c
0985f4f8 772a59a4 NTMARTA!AccRewriteSetNamedRights+0x207
0985f528 6646bcb9 ADVAPI32!SetNamedSecurityInfoW+0x4f
0985f720 6661ed31 xul!nsLocalFile::CopySingleFile+0x16b
0985f980 6667b323 xul!nsLocalFile::CopyMove+0xfb
0985f994 66845899 xul!nsLocalFile::MoveTo+0x15
0985f9e0 6667f05f xul!nsLocalFile::MoveToNative+0x1e5dd8
0985fa88 6684ab0b xul!GetTrashDir+0x73
0985faa0 666d9d19 xul!nsDiskCacheDevice::OpenDiskCache+0x18b371
0985fabc 666d9cac xul!nsDiskCacheBindery::Init+0x14
0985fad0 6662d55b xul!nsCacheService::CreateDiskDevice+0x69
0985fae0 6662243d xul!nsCacheService::SearchCacheDevices+0x68
0985fb00 6662230e xul!nsCacheService::ActivateEntry+0x9b
0985fb34 66621e49 xul!nsCacheService::ProcessRequest+0x32
0985fb48 66552ea6 xul!nsProcessRequestEvent::Run+0x26
0985fbb4 664e994c xul!nsThread::ProcessNextEvent+0x266
0985fbdc 6956bdd9 xul!nsThread::ThreadFunc+0x8c
0985fbfc 6956e05d nspr4!_PR_NativeRunThread+0x169
0985fc04 6ea22c28 nspr4!pr_root+0xd
0985fc3c 6ea22cb6 MOZCRT19!_callthreadstartex+0x48
0985fc44 76ceeccb MOZCRT19!_threadstartex+0x66
0985fc50 77e4d24d kernel32!BaseThreadInitThunk+0xe
0985fc90 77e4d45f ntdll!__RtlUserThreadStart+0x23
0985fca8 00000000 ntdll!_RtlUserThreadStart+0x1b
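For reference, the LockCount value in the PRLock dump (-6) can be decoded by hand. This is a minimal sketch, assuming the LockCount encoding that Microsoft documents for Windows Server 2003 SP1 and later; the report does not state the OS version, so that is an assumption. LockCountInfo and decodeLockCount are hypothetical helper names, not part of NSPR or the Windows API.

```cpp
#include <cstdint>

// Hypothetical helper type for illustration only.
struct LockCountInfo {
    bool locked;                  // lowest bit is 0 when the section is locked
    bool waiterWoken;             // second bit is 0 when a waiter has been woken
    std::uint32_t waitingThreads; // remaining bits: ones-complement of waiter count
};

// Decodes RTL_CRITICAL_SECTION::LockCount under the assumed
// (Windows Server 2003 SP1 and later) encoding.
LockCountInfo decodeLockCount(std::int32_t lockCount) {
    const std::uint32_t bits = static_cast<std::uint32_t>(lockCount);
    LockCountInfo info;
    info.locked = (bits & 0x1u) == 0;
    info.waiterWoken = (bits & 0x2u) == 0;
    info.waitingThreads = (~bits) >> 2;
    return info;
}
```

Under that assumption, decodeLockCount(-6) reports a locked critical section with one waiting thread and no waiter woken yet, which is consistent with the UI thread sitting in PR_Lock while thread 0x1244 owns the lock.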

Boris Zbarsky [:bzbarsky]

Updated


15 years ago

blocking2.0: --- → ?

Boris Zbarsky [:bzbarsky]

Comment 1


15 years ago

To be clear, I nominated this to block because it can cause cold-startup times in the tens of seconds "randomly" (depending on the size of the user's cache, presumably, and possibly on alignment of the moon and whatnot; I don't know when we decide to do that MoveTo call in the OpenDiskCache code).

Boris Zbarsky [:bzbarsky]

Comment 2


15 years ago

And I'm assuming, btw, that the problem is that we're copying lots of data around or something... and locking around that operation, which means we block the UI thread when it tries to also lock during startup.
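The pattern described in this comment can be reproduced in isolation. The following is only a sketch using standard C++ threads, not Mozilla code: cacheLock stands in for the cache service lock, the background lambda stands in for the thread doing the slow disk work, and uiCanLockDuringSlowWork is a hypothetical name. It checks whether a second thread can take the lock while the slow work is in progress, depending on whether the lock is held around that work.

```cpp
#include <future>
#include <mutex>
#include <thread>

// Returns true if the calling ("UI") thread could take the lock while the
// background thread was in the middle of its simulated slow disk work.
bool uiCanLockDuringSlowWork(bool holdLockDuringSlowWork) {
    std::mutex cacheLock;  // hypothetical stand-in for the cache lock
    std::promise<void> slowWorkStarted;
    std::promise<void> letSlowWorkFinish;
    std::future<void> startedSignal = slowWorkStarted.get_future();
    std::future<void> finishSignal = letSlowWorkFinish.get_future();

    std::thread background([&] {
        if (holdLockDuringSlowWork) {
            // Problem pattern: the lock is held for the whole slow operation.
            std::lock_guard<std::mutex> guard(cacheLock);
            slowWorkStarted.set_value();
            finishSignal.wait();  // simulated long-running file move
        } else {
            // Alternative: do the slow work first, lock only to publish.
            slowWorkStarted.set_value();
            finishSignal.wait();  // simulated long-running file move
            std::lock_guard<std::mutex> guard(cacheLock);
        }
    });

    startedSignal.wait();  // the slow work is now in progress
    bool acquired = false;
    // try_lock may fail spuriously, so retry a bounded number of times.
    for (int attempt = 0; attempt < 100 && !acquired; ++attempt) {
        acquired = cacheLock.try_lock();
    }
    if (acquired) {
        cacheLock.unlock();
    }
    letSlowWorkFinish.set_value();
    background.join();
    return acquired;
}
```

With holdLockDuringSlowWork set to true the caller cannot get the lock until the slow work ends, which is the situation the UI thread stack shows. Whether the actual fix should shrink the locked region or move device creation elsewhere is a separate question this sketch does not answer.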

Mike Beltzner [:beltzner, not reading bugmail]

Comment 3


15 years ago

Don't think it blocks ship, but we definitely want to fix this!

blocking2.0: ? → -

Whiteboard: [ts]

Brendan Eich [:brendan]

Comment 4


15 years ago

Who should own this bug? /be

Proposed solution; 15 years ago; Bjarne (:bjarne); 6.46 KB, patch
Patch to simulate long running DeleteDir; 15 years ago; Bjarne (:bjarne); 1.45 KB, patch
Part 1: prepare xpcshell tests; 15 years ago; Bjarne (:bjarne); 25.36 KB, patch; michal: review-
Part 2: Prepare Mochitests; 15 years ago; Bjarne (:bjarne); 2.02 KB, patch; michal: review-
Part 3: Code-changes to create disk-device on background thread; 15 years ago; Bjarne (:bjarne); 7.81 KB, patch; michal: review-
Approach 2: Working cache, temporarily wo/ disk device; 15 years ago; Bjarne (:bjarne); 15.01 KB, patch