Closed Bug 595413 Opened 14 years ago Closed 14 years ago

xpcshell intermittent crash in [@ nsCacheService::SetDiskCacheCapacity]

Categories

(Core :: Networking: Cache, defect)

defect
Not set
critical

Tracking

()

RESOLVED FIXED
Tracking Status
blocking2.0 --- beta7+

People

(Reporter: dbaron, Assigned: jduell.mcbugs)

References

Details

(Keywords: crash, intermittent-failure)

Crash Data

Attachments

(1 file, 2 obsolete files)

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1284162663.1284163685.12045.gz Rev3 Fedora 12x64 mozilla-central opt test xpcshell [testfailed] Crashed here: TEST-PASS | /home/cltbld/talos-slave/mozilla-central_fedora64_test-xpcshell/build/xpcshell/tests/toolkit/mozapps/extensions/test/xpcshell/test_bug570173.js | [anonymous : 74] -2 == -2 TEST-INFO | (xpcshell/head.js) | test 1 finished TEST-INFO | (xpcshell/head.js) | exiting test *** LOG addons.xpi: shutdown TEST-PASS | (xpcshell/head.js) | 5 (+ 0) check(s) passed <<<<<<< PROCESS-CRASH | /home/cltbld/talos-slave/mozilla-central_fedora64_test-xpcshell/build/xpcshell/tests/toolkit/mozapps/extensions/test/xpcshell/test_bug570173.js | application crashed (minidump found) Operating system: Linux 0.0.0 Linux 2.6.31.5-127.fc12.x86_64 #1 SMP Sat Nov 7 21:11:14 EST 2009 x86_64 CPU: amd64 family 6 model 23 stepping 10 2 CPUs Crash reason: SIGSEGV Crash address: 0x14 Thread 0 (crashed) 0 libxul.so!nsCacheService::SetDiskCacheCapacity [nsCacheService.cpp:fa6c83c1a10e : 785 + 0x0] rbx = 0x0009c400 r12 = 0x00000001 r13 = 0x00000001 r14 = 0x4afd69ec r15 = 0x00000000 rip = 0x9c71a974 rsp = 0x4afd6960 rbp = 0x00000000 1 libxul.so!nsSetSmartSizeEvent::Run [nsCacheService.cpp:fa6c83c1a10e : 219 + 0x7] rbx = 0x78000ee0 r12 = 0x00000001 r13 = 0x00000001 r14 = 0x4afd69ec r15 = 0x00000000 rip = 0x9c71d817 rsp = 0x4afd6970 rbp = 0x00000000 2 libxul.so!nsThread::ProcessNextEvent [nsThread.cpp:fa6c83c1a10e : 547 + 0x7] rbx = 0x011ae590 r12 = 0x00000001 r13 = 0x00000001 r14 = 0x4afd69ec r15 = 0x00000000 rip = 0x9d0ff574 rsp = 0x4afd6990 rbp = 0x00000000
Attached patch Fixes race with ::Shutdown (obsolete) — Splinter Review
The problem here is that if the main thread already proceeded to Shutdown before setSmartSize event runs,, mObserver would be null and the event's attempt to call gService->mObserver->DiskCacheEnabled segfaults (the mDiskCacheEnabled variable it accesses is at offset 0x14 on x86_64). Fixed by checking for null mObserver in SetDiskCacheCapacity. JST also wanted to me to fix another potential race here: the two events (one sent to IO thread, which then creates event to send back to main thread) were each holding strong refs to both the mObserver and the Prefs branch. This could potentially result in final release happening on IO thread, which would be Bad. Fixed by Addrefing only in the first event, then Releasing at end of second objects's Run. Also updated comment as per bug 559942 comment 62.
Attachment #474950 - Flags: review?(jst)
Comment on attachment 474950 [details] [diff] [review] Fixes race with ::Shutdown So with this patch there's a potential for a leak if nsCacheService::DispatchToCacheIOThread() fails to dispatch the event, in that case the nsGetSmartSizeEvent will go away, but its Run() method will never be called, etc. Seems it'd be "cleaner" to do the reference count bumps in the code that creates the nsGetSmartSizeEvent objects, then you could do the appropriate releases if the dispatching fails as well, or alternatively add a boolean member to nsGetSmartSizeEvent that gets set in its Run() method, and then release the pointers in the nsGetSmartSizeEvent destructor if that boolean was never set. Another alternative: seems the pointers carried through by these events don't *really* need to be members of the event. The cache is global, so the observer can be reached through that, and the pref branch you can re-create at will, so if this reference counting thing gets too gnarly then not passing these pointers in the events at all could be an option as well. Other than the potential leak this looks good, but r- until that's fixed.
Attachment #474950 - Flags: review?(jst) → review-
OK, this version doesn't pass the prefsBranch and observer to the events, so no leak and/or tricky refcount issues. There's more I want to fix in the smart size code, but I'll open other bugs for them.
Attachment #474950 - Attachment is obsolete: true
Attachment #475009 - Flags: review?(jst)
Comment on attachment 475009 [details] [diff] [review] Fixes race with ::Shutdown w/o refcount issues - In nsSetSmartSizeEvent::Run(): + // also set on observer, in case disk device not init'd yet. + // -- but make sure observer alive: may be after Shutdown + if (nsCacheService::gService->mObserver) Do we need to worry about nsCacheService::gService possibly being null here? Not something a regular user would ever hit, but could we hit that case in our fast startup/shutdown patterns we hit in our tests? r=jst with that fixed or believed to be a non-issue...
Attachment #475009 - Flags: review?(jst) → review+
gService does indeed need checking. Forwarding jst's +r
Assignee: nobody → jduell.mcbugs
Attachment #475009 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #475394 - Flags: review+
Blocks: 596476
Blocking beta7+ as this blocks beta7 blocker bug 559942.
blocking2.0: --- → beta7+
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Blocks: 598007
No longer blocks: 598007
Severity: normal → critical
Keywords: crash
Summary: xpcshell intermittent crash in nsCacheService::SetDiskCacheCapacity → xpcshell intermittent crash in [@ nsCacheService::SetDiskCacheCapacity]
Crash Signature: [@ nsCacheService::SetDiskCacheCapacity]
Whiteboard: [orange]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: