Closed
Bug 706517
Opened 13 years ago
Closed 13 years ago
Intermittent test leak of about 1-3KB (1 CondVar, 2 Mutex, some nsDNSAsyncRequest, 1 nsHTMLDNSPrefetch::nsListener, 1 nsHostRecord, ...)
Categories
(Core :: Networking, defect)
RESOLVED DUPLICATE of bug 463724
People
(Reporter: mbrubeck, Assigned: sworkman)
References
Details
(Keywords: intermittent-failure, memory-leak)
https://tbpl.mozilla.org/php/getParsedLog.php?id=7644429&tree=Mozilla-Inbound
Rev4 MacOSX Snow Leopard 10.6 mozilla-inbound debug test mochitests-1/5 on 2011-11-29 15:50:45 PST for push 39346d506e54

TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 3240 bytes during test execution
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 1 instance of CondVar with size 32 bytes
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 2 instances of Mutex with size 24 bytes each (48 bytes total)
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 29 instances of nsDNSAsyncRequest with size 88 bytes each (2552 bytes total)
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 1 instance of nsHTMLDNSPrefetch::nsListener with size 24 bytes
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 1 instance of nsHostRecord with size 128 bytes
TEST-INFO | automationutils.processLeakLog() | leaked 1 instance of nsHostResolver with size 232 bytes
TEST-INFO | automationutils.processLeakLog() | leaked 27 instances of nsStringBuffer with size 8 bytes each (216 bytes total)
TEST-INFO | automationutils.processLeakLog() | leaked 1 instance of nsTArray_base with size 8 bytes

and

https://tbpl.mozilla.org/php/getParsedLog.php?id=7644414&tree=Mozilla-Inbound
Rev4 MacOSX Snow Leopard 10.6 mozilla-inbound debug test mochitests-5/5 on 2011-11-29 15:59:43 PST for push 4b085d906272

TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 760 bytes during test execution
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 1 instance of CondVar with size 32 bytes
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 2 instances of Mutex with size 24 bytes each (48 bytes total)
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 3 instances of nsDNSAsyncRequest with size 88 bytes each (264 bytes total)
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 1 instance of nsHTMLDNSPrefetch::nsListener with size 24 bytes
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 1 instance of nsHostRecord with size 128 bytes
TEST-INFO | automationutils.processLeakLog() | leaked 1 instance of nsHostResolver with size 232 bytes
TEST-INFO | automationutils.processLeakLog() | leaked 3 instances of nsStringBuffer with size 8 bytes each (24 bytes total)
TEST-INFO | automationutils.processLeakLog() | leaked 1 instance of nsTArray_base with size 8 bytes
Comment 1•13 years ago
Moving this to necko as it seems the most likely culprit. We're detecting this leak on buildfarms; not sure how best to repro. Heard on IRC:

mbrubeck: I think those leaks are your bug 706517, and I think both they and it are "we leak when dns in the buildfarm is kinda busted" since in the midst of them was a talos graphserver name resolution failure
<catlee-away> philor: why does dns in buildfarm cause leaks?
<philor> catlee-away: because we don't know how to give up? I'm no necko hacker, just a phenomenologist
<catlee-away> that smells of tests trying to access something remote
<catlee-away> which we all know is BAD BAD BAD
<philor> I don't doubt for a second that we have them again, but they could also be prefetching dns without accessing anything, I don't think we ban dns
<jduell> philor catlee-away: do we have a necko bug open for the dns leak?
<philor> jduell: you can have bug 706517 if you want it, I'm pretty sure nobody else will
Component: General → Networking
QA Contact: general → networking
Comment 2•13 years ago
Those two being:

https://tbpl.mozilla.org/php/getParsedLog.php?id=7895893&tree=Firefox
Rev4 MacOSX Snow Leopard 10.6 mozilla-central debug test reftest on 2011-12-12 17:43:50 PST for push 3f0c8604e2c1

https://tbpl.mozilla.org/php/getParsedLog.php?id=7895970&tree=Firefox
Rev3 Fedora 12 mozilla-central debug test mochitests-3/5 on 2011-12-12 17:54:31 PST for push 351fcbc12030

So it's not specific to an OS, or a test or two, or even a broad family of tests, with that reftest stuck in there.
OS: Mac OS X → All
Hardware: x86_64 → All
Summary: Intermittent OS X64 mochitest leak of about 1-3KB (1 CondVar, 2 Mutex, some nsDNSAsyncRequest, 1 nsHTMLDNSPrefetch::nsListener, 1 nsHostRecord, ...) → Intermittent test leak of about 1-3KB (1 CondVar, 2 Mutex, some nsDNSAsyncRequest, 1 nsHTMLDNSPrefetch::nsListener, 1 nsHostRecord, ...)
Comment 3•13 years ago
This pair is from the midst of the downtime which is supposed to be fixing the buildfarm's DNS troubles, so they could wind up being your last two.

https://tbpl.mozilla.org/php/getParsedLog.php?id=7921363&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=7921369&tree=Mozilla-Inbound
Comment 4•13 years ago
Still during the downtime, snagged these off try because I like how they both were in mochitest-a11y, yet another suite heard from.

https://tbpl.mozilla.org/php/getParsedLog.php?id=7921510&tree=Try
https://tbpl.mozilla.org/php/getParsedLog.php?id=7921405&tree=Try
Assignee
Comment 6•13 years ago
Did some analysis and discussed with mcmanus offline. I agree with his first thoughts on this (bug 707930 comment 2), based on looking at the code and verifying using XPCOM_MEM_LEAK_LOG. It looks like the following is happening, from DNS requests being made through shutdown:

1. A thread is created using nsHostResolver::ThreadFunc.
-- It is given a strong ref to nsHostResolver.
-- It does a lookup, resulting in an nsHostRecord being dequeued from the High, Med or Low queue.
2. nsHostResolver::Shutdown is called.
-- All nsHostRecords on the pending queues have OnLookupComplete called on them, which ends with them and their callbacks (nsDNSAsyncRequest or nsDNSSyncRequest objects) being released.
-- The hash table is cleared. This should only affect the nsHostDBEnt objects, which hold a simple pointer to the nsHostRecord.
3. The thread is killed (this is my assumption).
-- Since it holds an nsHostRecord that was dequeued but not released, that record's refcount is > 0. Same with all the nsDNSAsyncRequest/nsDNSSyncRequest objects attached to the record.
-- Since it also holds a strong ref to nsHostResolver, the resolver's refcount is > 0 as well.

I have also checked this with mem leak testing, using a page with 99999 links on it, resulting in the max number of link prefetches (512):

1. Load up said page.
2. Cut network access once the page is loaded and prefetching has started.
3. Wait for 10 mins or so to allow all prefetching failures to time out.
4. Check mem leak logs.

I didn't see any mem leaks for these classes using this method. So it looks like the leak is not progressive when access to the DNS server is cut; instead, as Patrick said, it's a timing issue at shutdown. Since it's low priority and doesn't seem to be progressive, I'm going to change importance to minor.
Assignee
Updated•13 years ago
Assignee: nobody → sworkman
Severity: normal → minor
Assignee
Comment 7•13 years ago
Just discovered bug 463724, which has a patch that avoids these leaks, except for one intentional mutex leak. Duping this bug to that one, since it's the same root issue.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → DUPLICATE
Updated•12 years ago
Keywords: intermittent-failure
Updated•12 years ago
Whiteboard: [orange]