Closed Bug 397134 Opened 17 years ago Closed 17 years ago

Leak involving nsTimerImpl and nsLoadGroup with rapid page loads

Categories

(Toolkit :: Safe Browsing, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
Firefox 3 beta1

People

(Reporter: jruderman, Assigned: Waldo)

Details

(Keywords: memory-leak, testcase)

Attachments

(2 files)

Attached file testcase
Steps to reproduce:
1. Save the testcase locally.
2. export XPCOM_MEM_LEAK_LOG=2
3. Run Firefox (debug) with the testcase filename on the command line.
4. As soon as "?retry=5" appears in the address bar, press Cmd+Q.

Result: trace-refcnt complains about the following types of objects leaking:

nsBaseAppShell                        
nsBaseChannel                         
nsBaseURLParser                       
nsDocNavStartProgressListener         
nsDocShell::InterfaceRequestorProxy   
nsFileChannel                         
nsHashPropertyBag                     
nsLoadGroup                           
nsLocalFile                           
nsRunnable                            
nsStandardURL                         
nsStringBuffer                        
nsThread                              
nsTimerImpl                           
nsVariant                             
nsVoidArray                           
nsWeakReference                       

Since there's a timer involved, and the leak goes away if I wait a few seconds before pressing Cmd+Q, I suspect that this might just be bug 387341.  (I don't know which timer is involved, though.)

I don't think this leak affects users, but it does seem to get in the way of automatic testing for nsDocShell leaks.
I think I've diagnosed the leak; more on that in either half an hour or this afternoon, depending on how the fix goes and when I decide to sleep.
Assignee: nobody → jwalden+bmo
Attached patch PatchSplinter Review
The safebrowsing progress listener uses a timer to do its work when it gets notified about a location change.  It initializes the timer *to notify itself*, and then *it stashes the timer in a member array.*  This timer fires after 1500ms, the value assigned to .delay in phishing-warden.js.  When the timer fires, the callback is called and when it completes the timer is removed from the array, breaking the listener->mTimers->listener cycle.  If the timer doesn't fire, the cycle isn't broken, and the requests in mRequests aren't released, which is what causes lots of stuff to leak.

I'm not entirely certain doing this cancellation in SetCallback is the right place or that this fix isn't a bit too cute, but it's less churn than adding a destroy() method or something like it.

Interestingly, I wasn't even debugging this bug when I ran across this problem; I was attacking Rlk instead, which seems to have this problem too.  It's a bit worse there, however, because there it's usually an nsHttpChannel in mRequests, which drags in quite a bit more than an nsFileChannel.  I'm really not sure why this bug doesn't appear in the Linux Rlk numbers; perhaps shutdown is prolonged enough that you don't see the leak there.
Attachment #282001 - Flags: review?(tony)
Status: NEW → ASSIGNED
Component: Embedding: Docshell → Phishing Protection
Product: Core → Firefox
QA Contact: docshell → phishing.protection
Target Milestone: --- → Firefox 3 M8
So basically:

1)  This is a special case of bug 387341
2)  The destructor code there is pointless, since the destructor can't fire
    while we have timers.

right?
Target Milestone: Firefox 3 M8 → Firefox 3 M9
(In reply to comment #2)
> Created an attachment (id=282001) [details]
> Patch

I think Boris is right, won't the timer hold on to the destructor making it so it can't clean up?

When does the callback get nulled out causing the timers to be cleaned up?  I see the callback set in sb-loader.js, but the shutdown method probably needs to null out the callback.
http://mxr.mozilla.org/seamonkey/source/browser/components/safebrowsing/content/sb-loader.js
(In reply to comment #4)

> I think Boris is right, won't the timer hold on to the destructor making it so
> it can't clean up?

Err, that should read, "hold onto the object making it so the destructor doesn't get called."
Indeed the destructor code isn't necessary; I left it in for belt-and-suspenders protection in case future code changes made it fire somehow.

shutdown does null out the callback, inside the phishing warden shutdown (called from shutdown in sb-loader.js):

http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/browser/components/safebrowsing/content/phishing-warden.js&rev=1.24&mark=178#173

As for being a general case of bug 387134, it's arguable.  The progress listener already knows there's a cycle to break, and it correctly breaks it given enough time or, with this patch, nulling the callback in shutdown.  There are cases where cycle collection's the right answer, but I don't really feel *this particular instance* is one of them.
Comment on attachment 282001 [details] [diff] [review]
Patch

looks good
Attachment #282001 - Flags: review?(tony) → review+
Comment on attachment 282001 [details] [diff] [review]
Patch

Eliminates some noise from leak logs, making the remaining leaks easier to tackle...
Attachment #282001 - Flags: approval1.9?
Attachment #282001 - Flags: approval1.9? → approval1.9+
Umm, we'll call this + because it affects RLk and other automated tests, which is only slightly bogus, but realistically this isn't even a user-visible leak 99% of the time -- how often do you load a page and then immediately quit the app less than 1.5s later?

RLk dropped from 4.81K to 12B with this patch (bug 397305 already filed, with info on the leak) -- good stuff.  Marking FIXED...
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Product: Firefox → Toolkit
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: