21364 - [leak] Leak another 10K just viewing a pair of messages (each time)

Reporter

Description

•

25 years ago

Using the 12/9 build....
Open mail
Open inbox
select a message
view the message
select next message
view message
select first message
...etc

You will leak amounts of memory as high as 1M, but rarely less than 200K with
each transition.  The double transition (back to original message) seems to
consistently cost in excess of 1M total.  The messages I'm viewing are each
about 5 screens in length (not too big, but not tiny).  My inbox has 4000
messages, but I don't expect this to have been critical.

Chris Waterson

Comment 1

•

25 years ago

This is probably XUL's fault. XUL has become quite leaky in the last week.

Jim Roskind

Reporter

Comment 2

•

25 years ago

I just tried this with small messages, and the leak was a mere 20-40K per cycle,
so I suspect the size of the message is significant.

lchiang

Updated

•

25 years ago

QA Contact: lchiang → suresh

Chris Waterson

Updated

•

25 years ago

Assignee: phil → waterson

Chris Waterson

Comment 3

•

25 years ago

It turns out we seem to leak a nsDocLoaderImpl (which holds on to the XUL
document) because of a circular reference between the docloader and the load
group it creates. I'm gonna take a look at this, with mscott's help.

leger

Updated

•

25 years ago

Whiteboard: [PDT+]

leger

Comment 4

•

25 years ago

Putting on the PDT+ radar.

Chris Waterson

Updated

•

25 years ago

Assignee: waterson → mscott

Chris Waterson

Comment 5

•

25 years ago

mscott wrastled this from me.

Scott MacGregor

Assignee

Updated

•

25 years ago

Status: NEW → ASSIGNED

Scott MacGregor

Assignee

Comment 6

•

25 years ago

so right now I am not looking at mailnews. I'm trying to figure out why starting
up and shutting down the browser apparently leaks a doc loader and load group as
waterson mentioned.

I've figured out why that leak happens. I'm trying to come up with some code to
fix it right now. Some people try to create a doc loader instance as a service.
The first instance is stored in a global variable and has no ties to a web
shell. Keep in mind that doc loaders have a circular reference with a load
group. So a doc loader's Destroy method needs to be called in order to break the
circular reference. This happens naturally when a doc loader is created in
conjunction with a webshell (because the webshell always calls its doc loader's
destroy method). But this doesn't happen when someone creates a "global" doc
loader by calling into the service manager. So the circular reference is never
broken and we leak the doc loader and load group.

Adding rpotts to the cc list. My fix is going to involve forcing people who want
to use the doc loader as a service to use a different object than those that use
the doc loader as part of the web shell. You'll get the doc loader service, then
ask it for the real doc loader. Then when the doc loader service is destroyed,
it will call the destroy method on the global doc loader it contains.
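
As a rough illustration of the cycle described above (not the actual patch),
here is a minimal C++ sketch. DocLoader, LoadGroup and DocLoaderService are
hypothetical stand-ins for the real classes, and std::shared_ptr is assumed as
a stand-in for XPCOM reference counting. It shows a loader and its group
keeping each other alive until Destroy() is called, and a service wrapper that
makes that call when it goes away.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct DocLoader;

// Hypothetical load group: holds a strong reference back to its loader.
struct LoadGroup {
    std::shared_ptr<DocLoader> observer;
};

// Hypothetical doc loader: owns the group it creates, completing the cycle.
struct DocLoader : std::enable_shared_from_this<DocLoader> {
    static int liveCount;              // loaders currently alive
    std::shared_ptr<LoadGroup> group;

    DocLoader() { ++liveCount; }
    ~DocLoader() { --liveCount; }

    void Init() {
        assert(!group && "Init() called twice");
        group = std::make_shared<LoadGroup>();
        group->observer = shared_from_this();   // loader <-> group cycle
    }

    // Breaks the cycle. May release the last reference to this object,
    // so nothing touches members after the reset.
    void Destroy() {
        std::shared_ptr<LoadGroup> g = std::move(group);
        if (g) g->observer.reset();
    }
};

int DocLoader::liveCount = 0;

// Wrapper handed out "as a service": it owns the global loader and calls
// Destroy() on it when the service itself is torn down.
class DocLoaderService {
public:
    DocLoaderService() : mLoader(std::make_shared<DocLoader>()) { mLoader->Init(); }
    ~DocLoaderService() { mLoader->Destroy(); }
    std::shared_ptr<DocLoader> GetDocLoader() const { return mLoader; }
private:
    std::shared_ptr<DocLoader> mLoader;
};

// A bare loader stays alive after its only owner lets go of it.
bool BareLoaderSurvivesItsOwner() {
    int before = DocLoader::liveCount;
    std::shared_ptr<DocLoader> loader = std::make_shared<DocLoader>();
    loader->Init();
    DocLoader* raw = loader.get();
    loader.reset();                                   // the cycle keeps it alive
    bool stillAlive = (DocLoader::liveCount == before + 1);
    raw->Destroy();                                   // clean up the demo itself
    return stillAlive;
}

// Going through the service leaves nothing behind.
bool ServiceReleasesLoader() {
    int before = DocLoader::liveCount;
    {
        DocLoaderService service;
        std::shared_ptr<DocLoader> loader = service.GetDocLoader();
    }   // loader is released first, then ~DocLoaderService calls Destroy()
    return DocLoader::liveCount == before;
}
```

The point of the wrapper is only that somebody is guaranteed to call Destroy()
on the global loader, the way a webshell does for its own loader.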

rpotts (gone)

Comment 7

•

25 years ago

hey scott,
If the docloader is leaking because of the circular link to the loadgroup, then
let me break that link with a nsWeakPtr...  It used to be done this way, but for
a short time we were proxying the notifications from the load group - and the
weak ptr was lost.

BTW.  leaking the global docloader service should not be the cause of this leak!
This docloader has no associated webshell and *only* contains other docloaders
(no channels).  Since all of the other docloaders (and loadgroups) are cleaned
up this one will be empty when it is leaked...

-- rick
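
A minimal sketch of the weak back-pointer idea, assuming std::weak_ptr as a
stand-in for nsWeakPtr and hypothetical Loader and Group types (none of this
is the real docloader code). The group only borrows its observer, so dropping
the last outside reference frees the loader without any explicit Destroy()
call.

```cpp
#include <cassert>
#include <memory>

struct Loader;

// Hypothetical load group: the back-pointer is weak, so it does not own the loader.
struct Group {
    std::weak_ptr<Loader> observer;
    bool Notify();   // returns true if the observer was still alive
};

struct Loader : std::enable_shared_from_this<Loader> {
    static int liveCount;            // loaders currently alive
    std::shared_ptr<Group> group;    // the loader still owns its group
    int notifications = 0;

    Loader() { ++liveCount; }
    ~Loader() { --liveCount; }

    void Init() {
        assert(!group && "Init() called twice");
        group = std::make_shared<Group>();
        group->observer = shared_from_this();   // stored as a weak reference
    }
};

int Loader::liveCount = 0;

bool Group::Notify() {
    // lock() yields a temporary strong reference, or null if the loader is gone.
    if (std::shared_ptr<Loader> strong = observer.lock()) {
        ++strong->notifications;
        return true;
    }
    return false;
}

// True if releasing the loader frees it and later notifications are dropped.
bool WeakBackPointerFreesLoader() {
    int before = Loader::liveCount;
    std::shared_ptr<Loader> loader = std::make_shared<Loader>();
    loader->Init();
    std::shared_ptr<Group> group = loader->group;   // keep the group alive on its own
    bool deliveredWhileAlive = group->Notify();
    loader.reset();                                 // no cycle, so this frees the loader
    bool deliveredAfterRelease = group->Notify();
    return deliveredWhileAlive && !deliveredAfterRelease &&
           Loader::liveCount == before;
}
```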

Scott MacGregor

Assignee

Comment 8

•

25 years ago

Hey Rick, So I missed this message of yours. I just went ahead and wrote a small
snippet of code to make a doc loader service object which holds onto the global
doc loader.

This fixes the leak, but of course as you pointed out, the leak is quite small
and not the source of the 400K leaks we are seeing on tinderbox which is what
waterson and I were investigating.

Scott MacGregor

Assignee

Comment 9

•

25 years ago

waterson, I hate to have to think about bouncing this back to you but I've fixed
the doc loader / load group problem in my tree and it doesn't affect the bloat
at all. some of the bigger ticket items in the leak log are xul attributes (,
atomimpl, 2 xul documents...

Chris Waterson

Comment 10

•

25 years ago

No problem. if nsDocLoaderImpl doesn't leak, but nsXULDocument still leaks, give
it to me.

Scott MacGregor

Assignee

Comment 11

•

25 years ago

Update: The changes I have in my tree to fix the doc loader leak + some changes
to fix webshells that were leaked by the global window have helped the mailnews
leak story. That is, these changes fix the webshells that were leaked per
message viewed. i.e. if you used to view 4 messages, we would leak 4 webshells.
However, this change doesn't have as dramatic an effect on the leak count as
you might think because the webshells that were getting leaked before were being
destroyed. i.e. their destroy method was being called so they weren't leaking
anything they were holding on to. Just the actual webshell structure itself.

And you won't notice that gain until you close the mail app too.

That being said. When I view messages using today's bits, I'm not seeing jumps of
200K between message views. Sometimes it jumps up, sometimes it goes down
depending on how big the message size is. But I'm not seeing increasing amounts
of memory usage.

Now that being said, the part that I think is still potentially a dogfood fix is
the fact that starting up mail OR the browser leaks the xul document associated
with the window for that application. I think this is contributing to a large
amount of the 400K we leak when you start up and shut down the browser (as seen
on tinderbox)

Scott MacGregor

Assignee

Comment 12

•

25 years ago

More info: I do see my memory usage jump about 200K when I display a message
but as soon as layout is done, the 200K is returned. i.e. if I view 5 or 6
messages that are about the same size, the mem usage when I view the first one
vs when I view the last one is about the same. The memory is being returned. But
I do see 200K spikes when laying out each message.

Also, I've been trying to look into the XUL Document leak for mail and the
browser. The bloat log swears it's getting leaked. Yet when I ask it for the
serial number of the leaked document it won't give me one! So I can't get a leak
tree for the xul document...arggh...

Chris Waterson

Comment 13

•

25 years ago

I'll look for that also. Do you have patches in your tree that I should pick
up? If so, spank 'em  into the bug.

Scott MacGregor

Assignee

Comment 14

•

25 years ago

Chris, I take it back. I am getting logs of nsXULDocument. I was incorrectly
setting the class to be nsWebShell for ref count data and of course we weren't
leaking the webshell. =) So I'm getting a valid ref counting graph for the xul
document. There's one extra ref count in there. I'm trying to track it down right
now but feel free to look too.

I'm attaching a patch which includes changes to globalWindowImpl, webshell,
extensions.

Scott MacGregor

Assignee

Comment 15

•

25 years ago

Attached patch: current patch showing leak fixes in my tree for webshell and global window stuff

Scott MacGregor

Assignee

Updated

•

25 years ago

Summary: [dogfood][leak] Clicking between adjacent messages leaks 200K-1M → [dogfood][leak] Leak 400K when starting up and shutting down the app

Scott MacGregor

Assignee

Comment 16

•

25 years ago

I'm going to change the summary of this bug to represent what we are working on:
starting up and shutting down the browser leaks over 400K according to the leak
tools on tinderbox. The patches I've already posted fix any webshell leaks
related to viewing mailnews messages.


In addition to these patches, I'm slowly chipping away at leaks for starting up
and shutting down the browser. I found some nasty leaks in the cookie code and
the wallet code earlier and I now have my linux box down to the point where it
is leaking 350K instead of 425K. But I'm just chipping around the edges until we
can solve one of the big leaks which I think is the XULDocument.

Scott MacGregor

Assignee

Comment 17

•

25 years ago

load balancing to waterson who will probably spot the xul document leak faster
than me anyhow. I also have leak fixes that help bring this number down. I'll
continue to try to help chris as well.

Scott MacGregor

Assignee

Updated

•

25 years ago

Assignee: mscott → waterson

Status: ASSIGNED → NEW

Scott MacGregor

Assignee

Comment 18

•

25 years ago

re-assigning for real this time.

lchiang

Updated

•

25 years ago

QA Contact: suresh → gerardok

Chris Waterson

Updated

•

25 years ago

Depends on: 21661

Chris Waterson

Updated

•

25 years ago

Depends on: 21643

Chris Waterson

Updated

•

25 years ago

Status: NEW → ASSIGNED

Target Milestone: M12

Chris Waterson

Updated

•

25 years ago

Depends on: 21668

Jim Roskind

Reporter

Updated

•

25 years ago

Summary: [dogfood][leak] Leak 400K when starting up and shutting down the app → [dogfood][leak] Leak another 400K just viewing a pair of messages (each time)

Jim Roskind

Reporter

Comment 19

•

25 years ago

The bug is not a "startup and shutdown" bug.  Sorry the focus changed by
someone. I'm changing the title back to the original which involves about 1M
plus leakage from just viewing two distinct messages (even as you view them
again and again).  Two message views add up to a 1 meg leak.  That was the
bug... not a startup/shutdown issue.

Scott MacGregor

Assignee

Comment 20

•

25 years ago

I'm sorry Jar. I was the one who morphed the subject title. Per my comments way
up there somewhere, I'm not seeing a leak between viewing messages. Yes we use
up 200K to display it but I see that memory getting freed.  I suppose I should
have marked this bug as works for me and we should have used a new one to track
the large start up and shutdown leaks.

One note about my inability to reproduce your problem: I now have lots of leak
fixes in my tree for the startup / shutdown problem (including changes that
prevent us from leaking webshells when viewing messages).

Jim Roskind

Reporter

Comment 21

•

25 years ago

I just tried this on the 12/13 build number 15 on windows (today's build).
Without working too hard (the second pair of messages I chose), I quickly found
a pair that leaks about 350K each time I cycle back and forth through them.  I
can do this again and again, with ever increasing total leak (not just
startup/shutdown).  If there is a startup/shutdown bug, that should be filed...
but this should really look at the problem.  Please ring me up or come by to see
(if my description is still not clear, and you can't reproduce).

Thanks,

Jim

Chris Waterson

Comment 22

•

25 years ago

jar: due to the nature of object ownership, if a XUL document is leaked, it
will leak all the things that it owns.

Jim Roskind

Reporter

Comment 23

•

25 years ago

I understand that if a XUL object is leaked, then all related objects go along
with it (leak) in a reference-count system.  The critical thing about this bug
is that, in the course of an oft-repeated user activity (reading email), a
leak of several hundred K appears for every couple of messages.

IF there was a 400K leak that related to startup and shutdown *ONLY*, then it
would not be a PDT+ bug ('cause it would not impact usability, since a user
doesn't startup and shut down tons of times in each session.... especially
mail... and especially with the startup/shutdown time performance in mail today).

I understand that the leaked object(s) will typically be small... I'm just after
finding the leaked object that transitively induces this mother of a leak. This
bug will be gone when that object (or the one or two big helpers) is isolated,
and repaired.

Chris Waterson

Comment 24

•

25 years ago

Every mail message is a XUL document.

Chris Waterson

Comment 25

•

25 years ago

jar: sorry, re-reading the comments in this bug, my previous comment was a
complete tautology ("an object leaks the objects it owns" -- duh). What I
*meant* to say was that the XUL document is a central object that has a very
high ownership fan-out. It leaks the document content model and script event
listeners, content model elements hold on to RDF datasources, etc. Fixing the
leak of the XUL documents on startup will be necessary (although
possibly not sufficient) to fix the leak of the mail message document.

Scott MacGregor

Assignee

Comment 26

•

25 years ago

chris, here are two small leak fixes that I also have in my tree of the global
context. I'm attaching them here just so we are on the same page. Note: these
fix ref counting leaks on the global context but even with these changes, it is
still being leaked.

Scott MacGregor

Assignee

Comment 27

•

25 years ago

Attached patch: fixes two leaks of the global context

Chris Waterson

Comment 28

•

25 years ago

Attached patch: more fixes. includes fixes for dependent bugs.

Jim Roskind

Reporter

Comment 29

•

25 years ago

Rpotts confirmed that he could not reproduce this in his non-commercial build.
He also confirmed that he could reproduce this in his commercial build.

mscott: When you said you could not reproduce, were you using the commercial
build?  Please try it there... in my demo for rpotts, we were losing about 600K
per pair of messages using the 12/13 build pulled from sweetlou.

The obvious question is why are we leaking so differently on the commercial build??

Chris Waterson

Comment 30

•

25 years ago

In a tree without mods, I see an 80-100K leak per message in the mozilla build. I
see a 400-500Kb leak in the commercial build. I'm going to build a commercial
tree with these mods and see what happens.

Scott MacGregor

Assignee

Comment 31

•

25 years ago

Some AIM stuff was recently turned on when displaying messages in the commercial
build. I bet my bottom dollar it's coming from there...That would explain the
difference.

Scott MacGregor

Assignee

Comment 32

•

25 years ago

Jar, I think I may have forced your problem to be fixed incidentally today. Please
see the PDT+ bug I filed on Alex Musil this afternoon: Bug #21710. Turns out
they had added some stuff recently (within the last week) that was causing an
address book to get opened for each email address in the email message. And I've
been told that the address book is very very leaky. Alex fixed this bug for me
and as a result we only open an address book up once per session instead of
every time you view a message.
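
For illustration only, a minimal sketch of the "open once per session" pattern
described above. AddressBook, MailSession and the Contains() check are
hypothetical placeholders, not the code from Bug #21710. The session opens the
address book lazily on first use and reuses it for every later lookup, so the
number of opens no longer grows with the number of addresses or messages.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical address book; openCount tracks how many times one was opened.
struct AddressBook {
    static int openCount;
    AddressBook() { ++openCount; }
    // Placeholder lookup: treats anything containing '@' as a known address.
    bool Contains(const std::string& address) const {
        return address.find('@') != std::string::npos;
    }
};

int AddressBook::openCount = 0;

// Hypothetical per-session state that caches a single address book.
class MailSession {
public:
    AddressBook& GetAddressBook() {
        if (!mBook) mBook.reset(new AddressBook());   // opened on first use only
        return *mBook;
    }

    // Looks up every address in a message header through the shared book.
    int CountKnownAddresses(const std::vector<std::string>& addresses) {
        int known = 0;
        for (const std::string& address : addresses) {
            if (GetAddressBook().Contains(address)) ++known;
        }
        return known;
    }

private:
    std::unique_ptr<AddressBook> mBook;
};

// Views the same header twice and reports how many address books were opened.
int OpensForTwoViews(const std::vector<std::string>& addresses) {
    int before = AddressBook::openCount;
    MailSession session;
    session.CountKnownAddresses(addresses);
    session.CountKnownAddresses(addresses);
    assert(AddressBook::openCount >= before);
    return AddressBook::openCount - before;
}
```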

I believe this was causing the leak you were seeing since it was only occurring
in the commercial build. I'm running a commercial debug build with alex's
changes for that bug and I'm seeing the same minimal leak behavior I reported to
you earlier on the mozilla build when viewing messages.

I believe we can mark this PDT bug fixed due to the changes in Alex's code
tonight. Tell you what, why don't I wait and you can try out tomorrow's bits. If
you aren't seeing the better leak behavior that I'm seeing in the commercial
build we'll take a look again.

Chris Waterson

Comment 33

•

25 years ago

mscott: it looks like you are right. I just updated my tree, and am no longer
seeing a large leak each time a new message is opened.

jar: i'll let you mark this as fixed when the builds come out tomorrow.

Chris Waterson

Comment 34

•

25 years ago

It appears to be windows-specific. I cannot reproduce the leak in a commercial
Linux build.

chris hofmann

Updated

•

25 years ago

Target Milestone: M12 → M13

chris hofmann

Comment 35

•

25 years ago

let's check this out to see if it looks good in 12/15 builds.
if not, fix in the first m13 carpool landings?

Jim Roskind

Reporter

Comment 36

•

25 years ago

Both Waterson and I can still see the leak in the 12/15 morning build :-(.
It is not gone :-/.

Chris Waterson

Updated

•

25 years ago

Priority: P3 → P1

Chris Waterson

Comment 37

•

25 years ago

jar: in a recent private email, you said that this seems to be down to less than
100K per message now. should we declare victory, or continue work on this? Or
did I mis-read your email? ;-)

Jim Roskind

Reporter

Comment 38

•

25 years ago

I was either sleepy, or miscommunicated (and I can't find the message in my
outbox :-/ ).  Anyway... the problem is still visible on the Netscape Windows
build, but not visible on the mozilla windows build, and not visible on the
Linux Netscape build.  Conclusion: This is a Windows only, Netscape version
only, leak.

Attached is a file that you can use to demonstrate the leak.  Please just cycle
back and forth between the two messages, and you should see about a 380K leak
per cycle.

Jim Roskind

Reporter

Comment 39

•

25 years ago

Attached file: Leak demo (cycle between messages in this local box to see leak)

Jim Roskind

Reporter

Updated

•

25 years ago

Blocks: 22176

Jim Roskind

Reporter

Comment 40

•

25 years ago

Attached file: This is a demo mailbox file, with two messages, that leaks 600-700K per viewing cycle

Jim Roskind

Reporter

Comment 41

•

25 years ago

Second try... this time I attached the actual mailbox, rather than the mailbox
index.   I tried a couple of messages.  I can't tell if the leak is larger
because of the number of folks on the To/CC list, or because of the complexity
of the message.  I could try to debug this down by hand editing the file so that
I could tell what caused the larger leak.  I'll do a little of this and comment
back in a moment.

Jim Roskind

Reporter

Comment 42

•

25 years ago

I tried reducing the number of folks on the CC list, and the leak persisted.  It
would appear that the size of the leak relates to the size and/or complexity of
the message, not the email address count in the headers.

Hope that helps,

Jim

p.s., This was tested on the M12 candidate for the commercial build, built 12/20

Chris Waterson

Comment 43

•

25 years ago

I see 320Kb per message on Linux; which is consistent with the 700Kb per
"cycle", as you were defining it, Jim.

Chris Waterson

Comment 44

•

25 years ago

Ok, I've just done the Dumb Thing and instrumented PR_Malloc() to dump a stack
trace when asked to allocate a block >8Kb. Turns out that message parsing does a
-lot- of this.
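
A rough sketch of this kind of instrumentation, not the actual change made to
PR_Malloc(). TracedMalloc, LargeAllocationLog and the caller-supplied site
string are hypothetical; a real version would capture a stack trace with
whatever facility the platform provides instead of taking a label from the
caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <vector>

// Blocks strictly larger than this are reported (8Kb, as in the comment above).
const std::size_t kLargeBlockThreshold = 8 * 1024;

// One record per large request: how big it was and where it came from.
struct LargeAllocation {
    std::size_t size;
    const char* site;
};

// Process-wide log of large requests, created on first use.
std::vector<LargeAllocation>& LargeAllocationLog() {
    static std::vector<LargeAllocation> log;
    return log;
}

// malloc wrapper that records and prints every request above the threshold.
// 'site' is a caller-supplied label standing in for a real stack trace.
void* TracedMalloc(std::size_t size, const char* site) {
    assert(site != nullptr);
    if (size > kLargeBlockThreshold) {
        LargeAllocation entry = {size, site};
        LargeAllocationLog().push_back(entry);
        std::fprintf(stderr, "large allocation: %lu bytes at %s\n",
                     static_cast<unsigned long>(size), site);
    }
    return std::malloc(size);
}
```

Routing allocations through a wrapper like this makes it easy to see which
call sites ask for big blocks, though it says nothing about whether those
blocks are later freed.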

Specifically, we create ten to twenty 16-32Kb blocks (!) while parsing the
"data:" URL that (presumably) is holding the message data. This is done to copy
string data around. If leaked (or not re-usable because of subsequent
re-partitioning to smaller blocks), this would account for jar's 300Kb leak per
message.

See

http://lxr.mozilla.org/mozilla/source/netwerk/protocol/data/src/nsDataChannel.cpp#118

Specifically, the mURL->GetSpec() is creating 16-32Kb string copies!

Not only does this seem to be -tremendously- inefficient, my guess is that, even
if it isn't leaking memory, it is seriously fragmenting memory.

Can we think of a better way to do this? Maybe adding a [noscript]
GetImmutableSpec() to nsIURI? (warren: this is your chance to say "I told you
so.") Or using a mechanism other than a "data:" URL to transfer message data?
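
To make the copying-versus-borrowing distinction concrete, here is a minimal
sketch. DataUri is a hypothetical stand-in for a URI object holding a large
"data:" spec; it is not nsIURI, and the two accessors only mirror the
GetSpec() / proposed GetImmutableSpec() idea under the assumption that the
spec is stored as a std::string.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical URI object holding a (possibly very large) spec string.
class DataUri {
public:
    explicit DataUri(std::string spec) : mSpec(std::move(spec)) {}

    // Copying accessor: every call allocates and fills a fresh buffer.
    std::string GetSpec() const { return mSpec; }

    // Borrowing accessor: hands out a read-only reference to the stored
    // string. The reference is only valid while the DataUri is alive.
    const std::string& GetImmutableSpec() const { return mSpec; }

private:
    std::string mSpec;
};

// True if a copy obtained through GetSpec() lives in its own buffer.
bool CopyUsesNewBuffer(const DataUri& uri) {
    std::string copy = uri.GetSpec();
    return copy.data() != uri.GetImmutableSpec().data();
}

// True if repeated borrows all refer to the same stored buffer.
bool BorrowReusesBuffer(const DataUri& uri) {
    const std::string& first = uri.GetImmutableSpec();
    const std::string& second = uri.GetImmutableSpec();
    assert(first == second);
    return first.data() == second.data();
}
```

With a 16-32Kb spec, each GetSpec()-style call costs one allocation of that
size, while the borrowing accessor costs none; the trade-off is that callers
must not hold the reference past the lifetime of the URI object.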

Chris Waterson

Comment 45

•

25 years ago

add bienvenu to cc list, who is another leak hero on the mail team.

Scott MacGregor

Assignee

Comment 46

•

25 years ago

Chris: i wouldn't spend too much time on this leak if it looks to be related to
the message body being placed in the data url. I've been spending the last few
days cooking up a new more efficient way to display messages that gets rid of
the data url. It's much more efficient and is improving message display times by
as much as 4 times on slower machines! Plus, it looks to be much more memory
efficient as well. I'm planning on landing it when I get back from the holidays
after New Years.

David :Bienvenu

Comment 47

•

25 years ago

I will purify with Scott's new message loading stuff tomorrow and see what
happens.

Chris Waterson

Updated

•

25 years ago

Assignee: waterson → mscott

Status: ASSIGNED → NEW

Chris Waterson

Comment 48

•

25 years ago

mscott: giving this one to you. when you land your mail reading changes, test
it, see if we still leak. if we do, then you can give it back to me. if we don't
then close it!

Scott MacGregor

Assignee

Updated

•

25 years ago

Status: NEW → ASSIGNED

Scott MacGregor

Assignee

Comment 49

•

25 years ago

I'm not seeing the large message leaks after landing my new souped up message
display stuff. However, I don't consider myself to be a good test case 'cause I
had such a hard time getting the large leak to happen before too. Jar, are you
still seeing large leaks?

Jim Roskind

Reporter

Updated

•

25 years ago

Summary: [dogfood][leak] Leak another 400K just viewing a pair of messages (each time) → [leak] Leak another 400K just viewing a pair of messages (each time)

Whiteboard: [PDT+]

Jim Roskind

Reporter

Comment 50

•

25 years ago

The bug is now very small, and certainly not worthy of a PDT+ status (or even a
dogfood request).  I'm removing both comments.  I think the leak is now under
10K per toggle, which is still a bug... but nothing critical at this point.

IMO, you should mark this at least M15.

Thanks to all,

Jim

Scott MacGregor

Assignee

Updated

•

25 years ago

Target Milestone: M13 → M15

Scott MacGregor

Assignee

Comment 51

•

25 years ago

Thanks for the help jar. Bouncing out past beta 1 for the work involving
tracking down the 10K or so we are still leaking when displaying a message.

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Updated

•

25 years ago

Keywords: mlk

Scott MacGregor

Assignee

Comment 52

•

24 years ago

updating the summary to reflect that the last known leak count was under 10K for
some messages. I'm not actually seeing any significant leakage displaying
messages according to the refcounting logs and by monitoring my memory usage on
windows when displaying messages.

I'm just going to go ahead and mark this fixed. I will of course continue to
keep an eye on leaks displaying messages as we go forward.

Summary: [leak] Leak another 400K just viewing a pair of messages (each time) → [leak] Leak another 10K just viewing a pair of messages (each time)

Scott MacGregor

Assignee

Comment 53

•

24 years ago

fixed.

Status: ASSIGNED → RESOLVED

Closed: 24 years ago

Resolution: --- → FIXED

Jim Roskind

Reporter

Updated

•

24 years ago

No longer blocks: 22176

Myk Melez [:myk] [@mykmelez]

Updated

•

20 years ago

Product: MailNews → Core

Nobody; OK to take it and work on it

Updated

•

16 years ago

Product: Core → MailNews Core

current patch showing leak fixes in my tree for webshell and global window stuff (Scott MacGregor, 25 years ago, 35.11 KB, patch)
fixes two leaks of the global context (Scott MacGregor, 25 years ago, 1.50 KB, patch)
more fixes. includes fixes for dependent bugs. (Chris Waterson, 25 years ago, 15.40 KB, patch)
Leak demo (cycle between messages in this local box to see leak) (Jim Roskind, 25 years ago, 2.48 KB, application/octet-stream)
This is a demo mailbox file, with two messages, that leaks 600-700K per viewing cycle (Jim Roskind, 25 years ago, 34.98 KB, application/octet-stream)