Leaks memory due to reference count cycle via nsMsgQuote
Categories
(MailNews Core :: Composition, defect, P2)
Tracking
(thunderbird_esr78+ affected, thunderbird83 affected)
People
(Reporter: jcranmer, Assigned: benc)
References
Details
(Keywords: leave-open, memory-leak)
Attachments
(2 files, 1 obsolete file)
|
6.74 KB,
patch
|
jcranmer
:
review-
|
Details | Diff | Splinter Review |
|
5.30 KB,
patch
|
mkmelin
:
review+
|
Details | Diff | Splinter Review |
:darktrojan pointed me to an impressively major leak log involving composition. Some selected lines:
[task 2019-11-15T01:49:40.497Z] 01:49:40 INFO - 754 |QuotingOutputStreamListener | 168 1008| 6 6|
[task 2019-11-15T01:49:40.589Z] 01:49:40 INFO - 1641 |nsMsgQuote | 72 432| 6 6|
[task 2019-11-15T01:49:40.589Z] 01:49:40 INFO - 1642 |nsMsgQuoteListener | 32 192| 6 6|
All of these instances are being created but never freed. There appears to be a reference count cycle going on here:
https://searchfox.org/comm-central/source/mailnews/compose/src/nsMsgCompose.cpp#2958 creates a QuotingOutputStreamListener, who helpfully stores it in a strong reference: https://searchfox.org/comm-central/source/mailnews/compose/src/nsMsgCompose.cpp#2958.
Then, nsMsgCompose passes it to nsMsgQuote::QuoteMessage (https://searchfox.org/comm-central/source/mailnews/compose/src/nsMsgQuote.cpp#87), who happily stores the listener parameter into the strong reference on the class itself.
One of those references should be a weak reference. Possibly nsMsgQuote's listener, actually, since it's only actually held in nsMsgQuote to pass to nsStreamConverter via a hidden channel, and I think AsyncConvertData is actually called synchronously, so a strong-ref across that call is all that is needed.
Updated•6 years ago
|
Updated•6 years ago
|
Updated•6 years ago
|
Updated•5 years ago
|
| Assignee | ||
Comment 1•5 years ago
|
||
Not sure this'll plug the leak, but here's an easy change that removes at least one hard reference.
If this doesn't fix it I'll dig deeper. I did start mapping it all out, but the ownership of the various objects gets very murky very quickly...
How do I generate the leak log :darktrojan reported? Is it a case of fiddling with XPCOM_MEM_ environment variables? (I think Magnus said that try builds might generate them - there's a try run here, but I didn't spot any obvious leak logging)
Comment 2•5 years ago
|
||
Has Joshua become more active lately (I don't follow Matrix chat) or should you find a different reviewer?
Comment 3•5 years ago
|
||
He's around matrix, dunno about bugzilla.
You need to look at a debug build, there's at least the "LEAK AND BLOAT STATISTICS" section, like https://firefoxci.taskcluster-artifacts.net/Qqx7byQMTdOHnD7-cMz5fw/0/public/logs/live_backing.log
| Reporter | ||
Comment 4•5 years ago
|
||
(In reply to Ben Campbell from comment #1)
How do I generate the leak log :darktrojan reported? Is it a case of fiddling with
XPCOM_MEM_environment variables? (I think Magnus said that try builds might generate them - there's a try run here, but I didn't spot any obvious leak logging)
XPCOM_MEM_LEAK_LOG generates essentially the view you find in the logs. For local failures, you can run XPCOM_MEM_REFCNT_LOG and XPCOM_MEM_COMPTR_LOG to figure out which reference isn't being freed (with a healthy dose of mozilla-central/tools/rb/find_leakers.py).
| Reporter | ||
Comment 5•5 years ago
|
||
| Assignee | ||
Comment 6•5 years ago
|
||
Take two :-)
I went with making nsMsgQuote's reference to the QuotingOutputStreamListener a weak one, mostly because it felt like nsMsgQuote should 'contain' the streamlistener. But I never quite worked out the proper relationship between the various actors here - it feels more like a soup of objects.
I wasn't able to replicate the leak locally - my build is a debug one and exiting is pretty flaky. It tends to crash and not display the leak summary. I'll do release build with --enable-logrefcnt and try again, but I think this patch to break the co-dependent refs is worthwhile, even if the leak turns out to be something else.
| Assignee | ||
Comment 7•5 years ago
|
||
Oh, I did do a try build, but there was some breakage at the time so I'm not sure it's useful:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=48ad26ec5eb952baacdc7ef247cc6d7f0ff11893
Comment 8•5 years ago
|
||
| Reporter | ||
Comment 9•5 years ago
|
||
I was hoping this would clean up far more leaks than it actually did. While the apparent improvement in leaked code is ~10MB (!), I doubt we fixed more than a couple of KB with this leak. The main source of the leaks is the entire compose window being leaked (indicated by the presence of nsMsgComposeParams in the leaks; it's not used outside of initializing a compose window), and leaking a full HTML/XUL window causes entire DOMs to be leaked, and that's where the MB of leaks come from.
Still, that's no reason not to accept this patch.
| Reporter | ||
Comment 10•5 years ago
|
||
Updated•5 years ago
|
Comment 11•5 years ago
|
||
Let's get this finished up and landed. We can improve more in other bugs.
| Assignee | ||
Comment 12•5 years ago
|
||
Just adds the missing 'public' as per Comment 10.
More to do on this bug but yes: some improvement is better than none.
| Assignee | ||
Updated•5 years ago
|
Comment 13•5 years ago
|
||
Unless you have very concrete plans for fixing further cases atm, I'd suggest to close this after landing and clone another bug to track down further cases to fix.
Comment 14•5 years ago
|
||
Updated•5 years ago
|
Comment 15•5 years ago
|
||
Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/628f8f5c15e0
Break leaking cyclic reference in nsMsgQuote. r=mkmelin DONTBUILD
| Assignee | ||
Comment 16•5 years ago
|
||
(In reply to Magnus Melin [:mkmelin] from comment #13)
Unless you have very concrete plans for fixing further cases atm, I'd suggest to close this after landing and clone another bug to track down further cases to fix.
I didn't want to lose track - I've got other stuff to work on first but I'd like to come back to this one.
My plan is to sit down and just manually map out all the code - which classes derive/interact with each other. Lots of awful UML scribblings :-)
Once the overall shape is sketched out, a lot of these kinds of ownership/reference issues become clear. (caveat: the compose code has a bit of a "Here be dragons" reputation)
Comment 17•4 years ago
|
||
Time to close this perhaps?
Updated•3 years ago
|
Comment 18•3 years ago
|
||
(In reply to Magnus Melin [:mkmelin] from comment #17)
Time to close this perhaps?
Yes, we should.
(In reply to Ben Campbell from comment #16)
(In reply to Magnus Melin [:mkmelin] from comment #13)
Unless you have very concrete plans for fixing further cases atm, I'd suggest to close this after landing and clone another bug to track down further cases to fix.
I didn't want to lose track - I've got other stuff to work on first but I'd like to come back to this one.
My plan is to sit down and just manually map out all the code - which classes derive/interact with each other. Lots of awful UML scribblings :-)
Once the overall shape is sketched out, a lot of these kinds of ownership/reference issues become clear. (caveat: the compose code has a bit of a "Here be dragons" reputation)
FWIW, https://mzl.la/3YUwDAR lists current compos issues involving memory
| Assignee | ||
Updated•1 year ago
|
Description
•