Closed
Bug 764742
Opened 13 years ago
Closed 13 years ago
Thunderbird shows reference count leaks on shutdown
Categories
(Thunderbird :: Mail Window Front End, defect)
Tracking
(thunderbird13+ fixed, thunderbird14+ fixed, thunderbird15+ fixed)
RESOLVED
FIXED
Thunderbird 16.0
People
(Reporter: standard8, Assigned: standard8)
References
Details
(Keywords: memory-leak, regression)
Attachments
(4 files)
138.15 KB,
text/plain
|
Details | |
2.37 MB,
application/x-gzip
|
Details | |
18.36 KB,
text/plain
|
Details | |
1.57 KB,
patch
|
Bienvenu
:
review+
standard8
:
approval-comm-aurora+
standard8
:
approval-comm-beta+
standard8
:
approval-comm-release+
|
Details | Diff | Splinter Review |
Sometime between Thunderbird 10 and 14, we started showing a whole lot of refcount leaks on shutdown. We missed this because we haven't been running reference count checks on our bloat builds, and we've not got UI based coverage for refcount leaks. I'll file a follow-up bug for that.
We're leaking global windows, so it is possible that we've got observer services holding onto windows that we're not tidying up on shutdown.
An example log is attached, this was generated via "make mailbloat" on a trunk build.
Assignee | ||
Comment 1•13 years ago
|
||
(In reply to Mark Banner (:standard8) from comment #0)
> Sometime between Thunderbird 10 and 14
I've got this down to between THUNDERBIRD_12_0_RELEASE and THUNDERBIRD_13_0_RELEASE.
Comment 2•13 years ago
|
||
Dumb question time: how does one go about rooting these leaks out?
Comment 3•13 years ago
|
||
In Firefox what you'd do if you have no idea what is causing the problem is to generate a cycle collector log on shutdown (the easiest way to do this right now is to set gAlwaysLogCCGraphs to true in xpcom/base/nsCycleCollector.cpp), then use my find_roots script (which you can get from here https://github.com/amccreight/heapgraph/tree/master/cc : you also need the parse_cc_graph.py file). That will tell you what is holding alive one of the nsGlobalWindows. If it is a refcounted object, then you need to do refcount logging for that particular object. Not an easy process.
Assignee | ||
Comment 4•13 years ago
|
||
There's also this old document about refcounting:
http://www-archive.mozilla.org/performance/leak-tutorial.html
(In reply to Mark Banner (:standard8) from comment #1)
> I've got this down to between THUNDERBIRD_12_0_RELEASE and
> THUNDERBIRD_13_0_RELEASE.
and now down to between THUNDERBIRD_12_0_RELEASE and THUNDERBIRD_13_0b1_RELEASE.
Comment 5•13 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #3)
> In Firefox what you'd do if you have no idea what is causing the problem is
> to generate a cycle collector log on shutdown (the easiest way to do this
> right now is to set gAlwaysLogCCGraphs to true in
> xpcom/base/nsCycleCollector.cpp),
I have a build with DEBUG_CC defined, and started it with XPCOM_CC_DRAW_GRAPHS=1 in the environment. That's equivalent, right?
> then use my find_roots script (which you
> can get from here https://github.com/amccreight/heapgraph/tree/master/cc :
> you also need the parse_cc_graph.py file). That will tell you what is
> holding alive one of the nsGlobalWindows.
I then cd /var/tmp/ and run python2.6 find_roots.py gc-edges-4.32000.log nsGlobalWindow
This prints thousands of "Error: Unknown result entry: <some line from the gc-edges-4.32000.log file>" lines in my terminal, and then:
Done loading graph. Reversing graph. Done.
No matching class names found.
What could I be doing wrong? :)
Comment 6•13 years ago
|
||
(In reply to Florian Quèze from comment #5)
> I have a build with DEBUG_CC defined, and started it with
> XPCOM_CC_DRAW_GRAPHS=1 in the environment. That's equivalent, right?
Yup.
> What could I be doing wrong? :)
Sounds like a problem with my script. If you could post a compressed log here or email it to me I can figure out what is happening.
Comment 7•13 years ago
|
||
Updated•13 years ago
|
Attachment #633146 -
Attachment mime type: text/plain → application/x-gzip
Comment 8•13 years ago
|
||
Oh, sorry, I didn't read your earlier post closely enough. There should be two files that it creates, gc-edges-XYZ and cc-edges-XYZ. You want to run the script on the cc-edges file. I forget that we generate both nowadays. I should have been more specific.
Comment 9•13 years ago
|
||
Works much better with the correct input file!
Comment 10•13 years ago
|
||
I really need to add some kind of header so the parsing scripts can see that they are loading the wrong file, instead of just dumping out mountains of useless error messages.
So, here's an example entry. The first line shows the object that the cycle collector has decided is alive. The rest of it shows how that object is keeping alive the nsGlobalWindow, but for our purposes here we probably don't care.
0x242b6a18 [nsXPCWrappedJS (nsIInterfaceRequestor)]
--[mJSObj]-> 0x2340ca00 [JS Object (Object)]
--[type_proto]-> 0x2bb9790 [JS Object (Object)]
--[parent]-> 0x2bb1040 [JS Object (ChromeWindow)]
--[xpc_GetJSPrivate(obj)]-> 0x27c7e10 [XPCWrappedNative (ChromeWindow)]
--[mIdentity]-> 0x27c688c [nsGlobalWindow]
Root 0x242b6a18 is a ref counted object with 1 unknown edge(s).
known edges:
0x242b6a18 [nsXPCWrappedJS (nsIInterfaceRequestor)] --[self]-> 0x242b6a18
Looking over these, it appears that there are three different objects that are keeping the nsGlobalWindow alive:
0x245a9cf8 [nsXPCWrappedJS (nsIMsgDBViewCommandUpdater)]
0x242b6a18 [nsXPCWrappedJS (nsIInterfaceRequestor)]
0x242c1bc8 [nsXPCWrappedJS (nsIMsgHeaderSink)]
These are all wrapped JS, which means that JS is being used to implement some interface exposed to C++, like the nsIMsgDBViewCommandUpdater. They each have 1 unknown edge, which means there's something keeping each of these alive.
My guess would be is that these are some kind of service that is being registered somewhere by being stored into an array or whatever, and that the array isn't being cleared. If some or all of these types aren't widely used then the easiest thing from here would be to just find where they are used in the code. Because these are wrapped JS, which is a widely used class, ref counting may not be practical.
Comment 11•13 years ago
|
||
nsIInterfaceRequestor doesn't help me much, it's used in so many places.
I started investigating in the direction of nsIMsgDBViewCommandUpdater that seems easier.
It's implemented at http://mxr.mozilla.org/comm-central/source/mail/base/content/folderDisplay.js#1229
The only place that really seems to use it, but keeps a strong reference is: http://mxr.mozilla.org/comm-central/source/mailnews/base/src/nsMsgDBView.cpp#2401
So I'm guessing something keeps some references to a nsMsgDBView instance. And there's some funny business with some gDBView global variables in the UI code. I'm trying to understand what's happening there exactly.
nsIMsgHeaderSink is implemented in http://mxr.mozilla.org/comm-central/source/mail/base/content/msgHdrViewOverlay.js#374 and stored at http://mxr.mozilla.org/comm-central/source/mailnews/base/src/nsMsgWindow.cpp#173
So I guess something keeps a reference to the nsMsgWindow instance. I'll investigate that later.
Thanks for the help Andrew!
Comment 12•13 years ago
|
||
No problem! You can try doing ref count logging on nsMsgDBView or something, although it is kind of a pain, so you may want to investigate other approaches first. ;)
Comment 13•13 years ago
|
||
I can look at it from the nsIMsgDBView angle.
Comment 14•13 years ago
|
||
Changing at http://mxr.mozilla.org/comm-central/source/mail/base/content/folderDisplay.js#911 the line
if (this._active) {
to
if (gDBView == this.view.dbView || this._active) {
removes from the output of Andrew's script all the mentions of nsIMsgDBViewCommandUpdater.
We'll need to debug why the this._active value isn't correctly updated (or just stop using it).
Comment 15•13 years ago
|
||
If at http://mxr.mozilla.org/comm-central/source/mail/base/content/msgMail3PaneWindow.js#622 I add " msgWindow.msgHeaderSink = null;", the only thing that remains in the leak log is: http://pastebin.mozilla.org/1662416
Note: setting that global msgWindow variable to null doesn't help.
I don't really know how I can debug the nsXPCWrappedJS (nsIInterfaceRequestor) case that remains.
Comment 16•13 years ago
|
||
(In reply to Florian Quèze from comment #15)
> If at
> http://mxr.mozilla.org/comm-central/source/mail/base/content/
> msgMail3PaneWindow.js#622 I add " msgWindow.msgHeaderSink = null;", the only
> thing that remains in the leak log is: http://pastebin.mozilla.org/1662416
>
> Note: setting that global msgWindow variable to null doesn't help.
>
> I don't really know how I can debug the nsXPCWrappedJS
> (nsIInterfaceRequestor) case that remains.
could it be the notification callbacks for the nsIMsgWindow? i.e., does doing msgWindow.notificationCallbacks = null; help?
Assignee | ||
Comment 17•13 years ago
|
||
As discussed on irc, there's a multitude of leaks but the most severe ones are for the standalone message window, which for some users will get opened and closed a lot.
This patch aims to fix mainly the standalone message ones and incorporates fixes from David, Florian and myself.
The three pane and compose windows still leak - we'll need a follow-up to track those down.
Attachment #633268 -
Flags: review?(dbienvenu)
Comment 18•13 years ago
|
||
Comment on attachment 633268 [details] [diff] [review]
Fix standalone message window leaks
I guess we can mutually r= this.
Attachment #633268 -
Flags: review?(dbienvenu) → review+
Assignee | ||
Comment 19•13 years ago
|
||
Comment on attachment 633268 [details] [diff] [review]
Fix standalone message window leaks
[Triage Comment]
We want this for 13.0.1 due to the memory leak issues in the standalone window. a=me.
Attachment #633268 -
Flags: approval-comm-release+
Attachment #633268 -
Flags: approval-comm-beta+
Attachment #633268 -
Flags: approval-comm-aurora+
Assignee | ||
Comment 20•13 years ago
|
||
Checked in:
https://hg.mozilla.org/comm-central/rev/13b5b398738c
https://hg.mozilla.org/releases/comm-aurora/rev/259fdc067bde
https://hg.mozilla.org/releases/comm-beta/rev/5b81a531181f
https://hg.mozilla.org/releases/comm-release/rev/461d927daa30
Assignee: nobody → mbanner
Status: NEW → RESOLVED
Closed: 13 years ago
status-thunderbird13:
--- → fixed
status-thunderbird14:
--- → fixed
status-thunderbird15:
--- → fixed
tracking-thunderbird13:
--- → +
tracking-thunderbird14:
--- → +
tracking-thunderbird15:
--- → +
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 16.0
Comment 21•13 years ago
|
||
> Fix standalone message window leaks
Thanks, guys!
You need to log in
before you can comment on or make changes to this bug.
Description
•