Closed Bug 381992 Opened 17 years ago Closed 17 years ago

thunderbird leaks a jscontext

Categories

(Thunderbird :: General, defect)

x86
Windows 2000
defect
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: lan, Assigned: mscott)

References

Details

(Keywords: memory-leak, regression)

Attachments

(4 files, 2 obsolete files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8.1) Gecko/20061010 Firefox/2.0
Build Identifier: 2.0.0.0 (20070326) and 3.0a1pre (20070524)

Thunderbird grow's up to "end of virtual memory" windows error (in my case working set grows up to 212MB),

Reproducible: Always

Steps to Reproduce:
1. add NSPR environment variables to system environment
2. run thunderbird
3. close thunderbird
4. run leaks gauge on NSPR logfile
Actual Results:  
Leaked outer window 1edc6b8 at address 1edc6b8.
Leaked inner window 2212f30 (outer 1edc6b8) at address 2212f30.
 ... with URI "about:blank".
Summary:
Leaked 2 out of 17 DOM Windows
Leaked 0 out of 57 documents
Leaked 0 out of 5 docshells 


see also bug 378077, comment 18, Bug 39238, Bug 341447, Bug 380837, Bug 380837.

i used leaks gauge from bug 320915, comment 28
leaks gauge shows a little more with this file:

Leaked inner window 2503b00 (outer 17ab6c0) at address 2503b00.
 ... with URI "about:blank".
Leaked outer window 17ab6c0 at address 17ab6c0.
Leaked document at address 39674c8.
 ... with URI "http://bash.org.ru/rss/".
Summary:
Leaked 2 out of 13 DOM Windows
Leaked 1 out of 54 documents
Leaked 0 out of 5 docshells
Keywords: mlk
Version: unspecified → 2.0
Attachment #266054 - Attachment mime type: application/octet-stream → text/plain
WADA, per bug 378077 comment 9, would it be helpful to have a leak log file using trunk build before 5/20?
Status: UNCONFIRMED → NEW
Ever confirmed: true
(In reply to comment #3)
> WADA, per bug 378077 comment 9, would it be helpful to have a leak log file
> using trunk build before 5/20?

Bug I mentioned in bug 378077 comment 9 is Bug 380837(OS=Mac OS X was never be changed...), and the bug has been closed as DUP of Bug 383269(Core,OS=All).
Let's watch Bug 383269, with setting it in dependency tree.     

Depends on: 383269
It seems highly unlikely that bug 383269 is related.
I'm able to reproduce this dom window leak by starting up and then quitting thunderbird, without letting it check for new mail, and with about:blank as the start page.

Because the dom window is getting leaked, we get the following assertion on shut down:

"Fonts still alive while shutting down gfxFontCache"
Status: NEW → ASSIGNED
Similar reports on browser. (possibly same problem, same cause) 
 Bug 384371 – Leaks with dynamically-inserted IFRAMEs (Tinderbox causes leaks) 
 Bug 385376 – Memory leak when closing a tab?
 Bug 386665 – DOM window leak (++DOMWINDOW) when opening and closing windows 
I've been trying to use our ref counting tools to track down this document leak (http://www.mozilla.org/performance/leak-tutorial.html).

So far I haven't been able to get make-tree.pl to give me the unbalanced trees (I'm on Windows). find-leakers.pl works fine. make-tree.pl keeps generating:

n/a     .root

Imbalance
---------
.root n/a

I'm passing in:

$ /c/build/trees/tb-trunk/mozilla/tools/rb/make-tree.pl --object 0X04F75BE8 < 
refcnt.txt > refcnt.bal

Maybe someone else will have better luck on a non-windows machine.

It's the 3rd GlobalWindowImpl object that leaks.
For those who are curious, here's a leak log showing all the objects we leak if you start up thunderbird and then quit without actually checking for new mail.
the bloatview log implies that we are leaking a jscontext, and that probabl explains most of the other leaks including the global window. I always thought it was really hard to leak a jscontext. This should be fun (not ) to figure out.

I still can't get my windows build to generate a reference tree using make-tree. I wonder if someone on linux or mac would have better luck.
I'm looking at the bloatview log on the branch and thought it worth pointing out that the jscontext is not leaked on the branch so this regression is trunk only.
Keywords: regression
Summary: memory leakage on about:blank → thunderbird leaks a jscontext
No longer depends on: 383269
the fix for Bug 372713 for other RDF data sources could shed some light on a potential fix. I need to investigate more. 
In fact Bug 372713 led me to Bug 383939 which implies that all data sources are leaking if they aren't using the cycle collector...that could explain what's going on here.
I've fixed a set of mailnews leaks recently, here's an updated leak log showing the nsJSContext and GlobalWindow still being leaked. There are no mailnews specific objects in the leak log anymore.
Attachment #271291 - Attachment is obsolete: true
I turned on some cycle collector debugging information and here's what it had to say on shutdown:

nsCycleCollector: XPCWrappedNative (ChromeWindow) 038A4488 was not collected due to 1
  external references (1 total - 0 known)
  The 0 known references to it were from:
nsCycleCollector: XPCWrappedNative (ChromeWindow) 02E6C608 was not collected due to 1
  external references (1 total - 0 known)
  The 0 known references to it were from:
nsCycleCollector: nsGenericElement 03B32010 was not collected due to 1
  external references (2 total - 1 known)
  The 1 known references to it were from:
  nsJSEventListener 04D31C50

FYI, neither the compose window nor the address book suffers from this global window / jscontext leak. It appears to effect the 3-pane only for a startup / shutdown test.
Doing a startup / shutdown test, Thunderbird leaks a JS event listener. This in turn leads to one of the two nsJSContext leaks we're trying to fix.

I've tracked down the JS event listener leak to the following on click handler in messenger.xul's browser element:

  <browser id="messagepane" context="messagePaneContext" autofind="false"
           minheight="1" flex="1" name="messagepane"
           disablehistory="true" type="content-primary" src="about:blank"
           disablesecurity="true" onclick="return contentAreaClick(event);"/>

Removing the onclick handler fixes one nsJSContext leak.

Looking at the bigger picture, I wonder if our browser element is the one XULElement we are leaking and fixing that might fix the larger leak.
I can no confirm that it is indeed the browser element that is getting leaked.

The contstructor for the browser element is firing twice on the *same element*:

------------------------------------------------------------------------
Hit JavaScript "debugger" keyword. JS call stack...
0 () ["chrome://global/content/bindings/browser.xml":571]
    os = undefined
    securityUI = undefined
    this = [object XULElement @ 0x71c36e8 (native @ 0x6d16770)]
1 [native frame]
------------------------------------------------------------------------

and then again:

------------------------------------------------------------------------
Hit JavaScript "debugger" keyword. JS call stack...
0 () ["chrome://global/content/bindings/browser.xml":571]
    os = undefined
    securityUI = undefined
    this = [object XULElement @ 0x71c36e8 (native @ 0x6d16770)]
1 [native frame]
------------------------------------------------------------------------


The first time the constructor tag for our single browser element fires we've just finished loading the last style sheet in our chrome:

 	nsXBLProtoImplAnonymousMethod::Execute() Line 352	C++
 	nsXBLPrototypeBinding::BindingAttached() Line 470	C++
 	nsXBLBinding::ExecuteAttachedHandler() Line 830	C++
 	nsBindingManager::ProcessAttachedQueue() Line 834	C++
 	PresShell::InitialReflow() Line 2390	C++
 	nsXULDocument::StartLayout() Line 1946	C++
 	nsXULDocument::DoneWalking() Line 3086	C++
 	nsXULDocument::ResumeWalk() Line 3028	C++
 	nsXULDocument::OnStreamComplete() Line 3399	C++
 	nsStreamLoader::OnStopRequest() Line 110	C++
 	nsJARChannel::OnStopRequest() Line 753	C++
 	nsInputStreamPump::OnStateStop() Line 577	C++
 	nsInputStreamPump::OnInputStreamReady() Line 401	C++


The second time it fires, we've just finished loading an image (in this case online.png) and the doc loader is now empty, calling nsDocshell::EndPageLoad which ends up calling the constructor for our browser xbl widget:

 	nsXBLProtoImplAnonymousMethod::Execute() Line 352	C++
 	nsXBLPrototypeBinding::BindingAttached() Line 470	C++
 	nsXBLBinding::ExecuteAttachedHandler() Line 830	C++
 	nsBindingManager::ProcessAttachedQueue() Line 834	C++
 	nsBindingManager::EndOutermostUpdate() Line 1423	C++
 	nsDocument::EndUpdate() Line 2630	C++
 	mozAutoDocUpdate::~mozAutoDocUpdate() Line 991	C++
 	nsGenericElement::doInsertChildAt() Line 2662	C++
 	nsGenericElement::InsertChildAt() Line 2581	C++
 	nsXULElement::InsertChildAt() Line 1615	C++
 	nsGenericElement::doReplaceOrInsertBefore() Line 3256	C++
 	nsGenericElement::InsertBefore() Line 2842	C++
 	nsXULElement::InsertBefore() Line 564	C++
 	nsGenericElement::AppendChild() Line 504	C++
 	nsXULElement::AppendChild() Line 564	C++
 	NS_InvokeByIndex_P() Line 102	C++
 	XPCWrappedNative::CallMethod() Line 2253	C++
 	XPCWrappedNative::CallMethod() Line 2253	C++
 	XPC_WN_CallMethod() Line 1467	C++
 	js_Invoke() Line 1308	C
 	js_Interpret() Line 4020	C
 	js_Invoke() Line 1327	C
 	js_InternalInvoke() Line 1402	C
 	JS_CallFunctionValue() Line 4863	C
 	nsJSContext::CallEventHandler() Line 1856	C++
 	nsJSEventListener::HandleEvent() Line 215	C++
 	nsEventListenerManager::HandleEventSubType() Line 1096	C++
 	nsEventListenerManager::HandleEvent() Line 1216	C++
 	nsEventTargetChainItem::HandleEvent() Line 202	C++
 	nsEventTargetChainItem::HandleEventTargetChain() Line 260	C++
 	nsEventDispatcher::Dispatch() Line 473	C++
 	DocumentViewerImpl::LoadComplete() Line 955	C++
 	nsDocShell::EndPageLoad() Line 4878	C++
 	nsWebShell::EndPageLoad() Line 1019	C++
 	nsDocShell::OnStateChange() Line 4793	C++
 	nsDocLoader::FireOnStateChange() Line 1236	C++
 	nsDocLoader::doStopDocumentLoad() Line 869	C++
 	nsDocLoader::DocLoaderIsEmpty() Line 765	C++
 	nsDocLoader::OnStopRequest() Line 682	C++
 	nsLoadGroup::RemoveRequest() Line 676	C++
 	imgRequestProxy::RemoveFromLoadGroup() Line 163	C++
>	imgRequestProxy::OnStopRequest() Line 505	C++
 	imgRequest::OnStopRequest() Line 772	C++
 	ProxyListener::OnStopRequest() Line 867	C++
 	nsJARChannel::OnStopRequest() Line 753	C++
The constructor for the browser binding is getting called twice on the same element, once when the xul document is parsed and once again when we dynamically re-root the browser element to match the appropriate layout:

http://mxr.mozilla.org/mozilla/source/mail/base/content/msgMail3PaneWindow.js#714

The destructor for the same binding fires once at shut down. And this element gets leaked.

Neil, would you expect the constructor to fire on a binding when inserting it back into the DOM? That doesn't seem right to me, I wonder if that's a recent gecko regression. 
In order to convince myself that this is indeed the cause of the leaks, I added the following JS:

getMessageBrowser().destroy();

right before we remove the message pane box:

http://mxr.mozilla.org/mozilla/source/mail/base/content/msgMail3PaneWindow.js#716

and verified that the leaks were no longer present.
(In reply to comment #20)
>In order to convince myself that this is indeed the cause of the leaks, I added
>the following JS:
> 
> getMessageBrowser().destroy();
> 
>right before we remove the message pane box:
Interestingly xpfe's browser (which SeaMonkey no longer uses) protects itself against multiple construction, but I don't know whether the correct fix is that, or to allow toolkit browser to be destroyed more than once (which you would need when changing the layout more than once), or to clone the browser each time.
Hmm... The way XBL1 works, the constructor will fire any time a binding is attached.  That would be any time frames get constructed for the node (e.g. the node gets inserted into the DOM or there's just a frame reconstruct that comes through due to style changed).

On the other hand, destructors fire mostly just on window close, as far as I can tell.  Removing a node from the DOM tears down the binding's anonymous content, but doesn't fire the dtor... nor tear down the JS property stuff, as far as I know.

Note that I'd expect cloning the browser to fire the ctor twice as things stand: once when you clone, and once when it's inserted into the DOM.  Creating it via createElement() is a better idea.

It's all a bit of a mess, really.  :(
It's interesting to note that on the 1.8 branch, I see the same behavior, the constructor for the browser element is firing twice. But we don't leak on the branch. This behavior only causes a jscontext/globalwindow leak on the trunk.

I'll experiment with using create element.
(In reply to comment #19)
> The constructor for the browser binding is getting called twice on the same
> element, once when the xul document is parsed and once again when we
> dynamically re-root the browser element to match the appropriate layout:
>(snip) 
> The destructor for the same binding fires once at shut down. And this element gets leaked.

To Mscott:
Is phenomenon of Bug 329876 explained by your finding? 
Attached patch possible fix (obsolete) — Splinter Review
So far, this is the only fix I've been able to make that cures the leak.

createElement and cloneElement were problematic because the docshell wasn't available immediately when we insert the new element into the document, breaking a lot of the surrounding code. 

I also tried to add protection to the constructor of the browser element to support being called multiple times (like browser.destroy()) but that didn't work because the field is getting re-initialized when the element gets re-inserted into the document.
Phenomenon of Bug 329876 seems to be Bug 388353... 
(In reply to comment #25)
>I also tried to add protection to the constructor of the browser element to
>support being called multiple times (like browser.destroy()) but that didn't
>work because the field is getting re-initialized when the element gets
>re-inserted into the document.
Ah, so that the browser.destroy() protection doesn't get triggered because this.mDestroyed gets reset when the element gets re-inserted? Neat ;-)
The check-in on 2007-07-25 at 14:28 was for bug 388353.
This is the only fix I've been able to get to actually work. This patches seamonkey and Thunderbird and manually triggers a call to the browser's dtor before the ctor fires when we re-insert the element back into the document.
Attachment #272534 - Attachment is obsolete: true
Attachment #276706 - Flags: review?(neil)
Comment on attachment 276706 [details] [diff] [review]
force a matching call to the dtor

Inconveniently someone made switching view crash on trunk but I'm sure it works because when the binding is reattached it resets the mDestroyed field.
Attachment #276706 - Flags: review?(neil) → review+
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Depends on: 392413
Bug 388353 isn't listed as dependency. 
So was this patch changed to avoid that bug?

And are patches finished/suitable for testing on branch?
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: