Closed Bug 75641 Opened 23 years ago Closed 23 years ago

Major memory leak reloading slashdot.org

Categories

(SeaMonkey :: General, defect)

x86
Windows NT
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED
mozilla0.9.1

People

(Reporter: keith, Assigned: harishd)

References

()

Details

(4 keywords, Whiteboard: want for 0.9.1)

Attachments

(10 files)

After first loading slashdot.org, the system reports about 9MB memory usage.  If
I do a reload, the memory usage bumps up to 17MB.  This happens with the latest
nightly build I downloaded today (04/11/01).
Unable to reproduce this problem.  After 10 reloads of slashdot.org my memory
usage went up about 120K.  After about 20 reloads it was only up about 60K. 
Some of the reloads seemed to push it up as much as 250K but it went back down
on subsequent loads.  Tested with 2001041104 on win2K
I reproduced it again with 0412 build.  After you go to slashdot.org home page, 
iconify the window.  The memory usage goes down to 1.5MB.  Then raise the window 
again.  The memory usage will climb to 8.8MB.  Then reload the page and memory 
usage jumps to 14.8MB.
Waiting on someone else to be able to reproduce it. Reporter try creating a new
profile see if that fixes it.
Keywords: mlk, perf, qawanted
reporter, I think you're misreading the memory usage stats here.  

1. start the browser (mozilla.org startup page) 18,550 K
2. surf to slashdot 18,740 K
3. iconify browser 800 K
4. maximize browser 12,200 K
5. reload slashdot 18,500 K
6. reload slashdot 18,460 K
7. reload slashdot 18,520 K
8. reload slashdot... (no appreciable increase)

I don't think we're leaking anything here.
OS: Win2K
Build: 2001042012

Here's what I get after 10 reloads:
1. start the browser with blank page 21,812 K
2. surf to slashdot 23,224 K
3. iconify browser 1,600 K
4. maximise browser 8,832 K
5. reload slashdot 15,376 K
6. reload slashdot 16,444 K
7. reload slashdot 16,692 K
8. reload slashdot 17,008 K
9. reload slashdot 17,440 K
10. reload slashdot 17,656 K
11. reload slashdot 18,500 K
12. reload slashdot 18,816 K
13. reload slashdot 18,708 K
14. reload slashdot 19,560 K

That's 4Mb increase over 10 reloads.
It's definately leaking.
For a while now I got used to Mozilla using at most 32Mb of memory even with a
lot of windows open. But during past week, nightlies have been blowing it
through the roof. If I don't close Mozilla for a day, most of my memory will be
gone into it. Today I ended up with Mozilla using 72Mb with just one browser
window. It's a real pain.
Severity: normal → critical
over to chofmann for triage.
Assignee: asa → chofmann
Status: UNCONFIRMED → NEW
Ever confirmed: true
adding namachi,dprice,cathleen,waterson,pav to cc list.

I think shiva has a build tree and purify install on one
of the new dell systems..
shiva can you and dprice get together and check this out.

putting this on the 0.9 radar until we learn more.
lets get some data today...
Assignee: chofmann → dprice
Target Milestone: --- → mozilla0.9
I spent some time looking at this tonight.  I saw us leaking a whole lot of
little things.  The largest leak I saw was in the gfx code, nsImageWin::Init
allocates into mAlphaBits and never deletes it.  I showed a patch to pav, and he
told me the leak is OLD, but should still be fixed.  He also doen't think it is
accounting for this much of a leak.  Unless SlashDot all of a sudden changed the
types of images they use.  More to come tomorrow afternoon.
I don' think the leak I'm having is related to what Keith Hankin is talking about.
In fact, on second review the behaviour Keith described seems quite legit to me
and has to do with Mozilla freeing up memory on minimise, and re-allocating it
when it's brought back.

When I woke up in the morning and had Mozilla using 72Mb, I stumbled upon this
bug when I was about to submit my report, and added my coments to it instead of
creating new bug.

This massive leak I get is not related to SlashDot and occurs during day-to-day
browsing (most of which is Bugzilla anyways). It is consistent, I have to
regularily restart browser to ease it up.

I have this gut feeling it might have something to do with Character Encoding
Auto-Detection, because that's the configuration change I made some time prior
to noticing this leak.
Oh, and if it's of any help, I have Auto-Detect set on Russian.
It's my wild guess, but probably with Auto-Detect on, after processing the pages
are left in memory and accumulate as you browse/reload.
this isn't 0.9. unsetting TM
Target Milestone: mozilla0.9 → ---
No, the memory growth I'm seeing is not just reallocating memory that was freed 
when the window was minimized.  Before minimization, the memory usage was around 
9MB, then when I minimized, it goes down to 1.5MB, then when I open it again, it 
grows to 8.8MB, then when I reload, it jumps up to 14.8MB.
  
There does have something suspecious in autodetection related code. But those
XPCOM stuff drove me nuts, I hope this might not be a very difficult job for 
a person familiar with XPCOM usage. I will just list some suspecious points for 
your reference. 
1, Inside nsDetectionAdaptor.cpp, mObserver is questionable. The comment says 
   it should be weak refed, but I did not see and nsWeakPtr related stuff. Since
   the useful lifespan of detector is no longer than adaptor, we probably should 
   delete mObserver when nsDetectionAdaptor dismissed.
2, I set a break point in descructor of nsDetectionAdaptor, but it never reach there.
   I guess the handling of nsDetectionAdaptor interface in file nsHTMLDocument.cpp, 
   function StartDocumentLoad might not be correct. 
Hope this can be helpful.


add myself to cc list.
Task manager is a little tricky when it comes to displaying the memory usage.
Go to View -> Select Columns and check Peak Mem Usage and VM Size.  VM Size is
the key, it is the total size of the application.  Start mozilla and look at the
VM Size.  Should be somewhere around 17ish  Minimize the window  Mem Usage
shoots way down to about a meg, but the VM Size should not change at all.
Restoring the window again changes the value for Mem Usage and VM Size remains
the same.  Reloading the page caused memory to change a little, but not by 4
megs.  The major culprit appears to be the different add banners being loaded.

Having said all of that, there is still a leak on the page.  When I tested with
my fix, I saw the memory usage hovering between 18.5 and 19.5 megs.  Pav
reviewed my diff.  I'll check it in to 0.9.1 when the tree is open.  Here it is
for all to enjoy:

Index: nsImageWin.cpp
===================================================================
RCS file: /cvsroot/mozilla/gfx/src/windows/nsImageWin.cpp,v
retrieving revision 3.74
diff -u -r3.74 nsImageWin.cpp
--- nsImageWin.cpp      2001/04/04 08:23:14     3.74
+++ nsImageWin.cpp      2001/04/24 22:05:26
@@ -912,6 +912,11 @@
     mImageBits = nsnull;
   }

+  if (mAlphaBits != nsnull) {
+    delete [] mAlphaBits;
+    mAlphaBits = nsnull;
+  }
+
        // Should be an ISupports, so we can release
   if (mColorMap != nsnull){
     delete [] mColorMap->Index;

Target Milestone: --- → mozilla0.9.1
Any reason we wouldn't want that mAlphaBits-deleting patch for 0.9?

/be
I have disabled the Charset Auto-Detection, but I still have the problem.
dprice thanks for the tips on Task Manager, I did some testing and here are results:

OS: Win2K Server SP1
Build: 2001042404 nightly

I used a new profile for testing.
I needed a noticeably large HTML file, so I downloaded
http://www.wellnet-one.de/penguin.html onto my HD and used that for testing.

The first number in these results is Peak Mem Usage reported by Win2K Task
Manager, and the second number is VM Size:

1. Start Mozilla up with a blank page: 20544 K - 13052 K
2. Loaded penguin.html off local drive: 22364 K - 13956 K
3. Reloaded the page 10 times by pressing F5 (and waiting for reload to finish
before pressing it again): 31392 K - 22940 K
4. Reloaded another 10 times: 40416 K - 31940 K
5. Reloaded another 10 times: 49432 K - 40946 K
6. Reloaded another 10 times: 58480 K - 49948 K

As you can see, the result is 36Mb increase after 40 page reloads.

This is a regression that happened in last 2 weeks or so.
Can we try and fix this before 0.9?
brendan: I can't see any harm in putting this in 0.9, gimme an a= and I'll toss
it in the tree.

alexey: can you test with the above patch?  If the leak disappears then we've
got all the more reason to check this in to .9  If the leak doesn't go away, can
you open a new bug?  Then test with something like .8 to confirm that the leak
with the new test case is not something that's been around for a while.  Thanks.
branching for 0.9.1 now so hold this for a bit.
when the tree opens check it in.
I think we will want to pull this to the 0.9 branch.

unfortunately right now I am unable to compile anything, I have no VC++ installed.
So I can't try the patch, sorry.

The problem I'm seeing I think is no more than 1-2 weeks old.
Half a year ago I wasn't impressed at all by Mozilla's memory usage. But lately
I got used to mem use never rising above 32Mb or so, and was quite happy with
it. Until I found myself forced to restart Mozilla to free up  memory in recent
builds, because my computer was getting down to a crawl because of it.

However I don't think the leak I'm seeing is related to images. The page I
tested it with contained no images at all. I think maybe parse tree is not
getting freed up or something like that, because the bigger the page - the
bigger memory increase after each reload. And the memory increase can be over 2
times bigger than the document file size.

Can't anyone else reproduce it on Win2Ksp1 apart from me?
I always delete old Mozilla browser before extracting a new nightly, and I am
seeing it with freshly created profile. I thought this can be easily reproduced?
Because hell, if someone can show me a way of how to *NOT* reproduce it, I'd be
extremely happy. lol :o)
dprice: please get r= on that patch from module owner (pav?) and get it in,
sr=brendan, to the trunk.  We'll get it into the 0.9 branch soon.

Any similar leak in the non-windows gfx/src/*/*Image*.cpp files?

/be
dbaron is master of all leaks and things leak-like.

/be
r=pavlov
i've had this patch in my tree for probably 6 months, so it is pretty safe. :)
fix is in
Do we want this on the branch too?
drivers@mozilla.org would like this for the 0.9 branch.  Please check it in
there too.
  
MOZILLA_0_9_BRANCH

a= asa@mozilla.org
Looks like the leaky Charset Auto-Detection problem is discussed in bug 60908: 
"changing character-coding-->auto-detect appears to leak memory". It might help 
to also paste your leaky data there.
Depends on: 60908
I can confirm the major memory leaking that Alexy is seeing.  On Win2K SP1 with
build 2001043004 opening a fresh browser and repeatedly loading the penguin.html
that he pointed to:

Open -    Mem: 18,332 Pk: 18,464 VM: 11,812
Load 1 -  Mem: 21,688 Pk: 21,796 VM: 14,456
Load 2 -  Mem: 22,736 Pk: 22,940 VM: 15,484
Load 3 -  Mem: 23,680 Pk: 23,844 VM: 16,420
Load 4 -  Mem: 24,768 Pk: 24,916 VM: 17,508
Load 5 -  Mem: 25,600 Pk: 25,744 VM: 18,340
Load 6 -  Mem: 26,448 Pk: 26,596 VM: 19,196
Load 7 -  Mem: 27,296 Pk: 27,444 VM: 20,048
Load 8 -  Mem: 28,236 Pk: 28,320 VM: 20,956
Load 9 -  Mem: 29,084 Pk: 29,216 VM: 21,808
Load 10 - Mem: 29,980 Pk: 30,064 VM: 22,668

Please note this page has no images and I do not have charset auto-detection. 
It's leaking about 1MB per reload.  Not nice.
It leaks on any page.
The bigger the page - the more it leaks on each load.
That penguin.html page is HUGE, so it is more obvious with it.
But even if you browse pages 100 times smaller than that, it accumulates over time.
I just closed Mozilla after peak of 73Mb
On Linux, where the leak isn't quite as big and where it seems to stop after
about 10 reloads or so, I stuck some printfs in the Init method and the
destructor of nsImageGTK to see when they were being called.  I found that when
we reload http://www.slashdot.org/ , we only destroy one nsImageGTK.  All the
rest are destroyed after I close the window, with the stack:

nsImageGTK::~nsImageGTK()
nsImageGTK::Release()
nsCOMPtr<nsIImage>::~nsCOMPtr()
gfxImageFrame::~gfxImageFrame()
gfxImageFrame::Release()
nsSupportsArray::Clear()
imgContainer::~imgContainer()
imgContainer::Release()
nsCOMPtr<imgIContainer>::~nsCOMPtr()
imgRequest::~imgRequest()
imgRequest::Release()
nsCOMPtr<nsISupports>::~nsCOMPtr()
nsCacheEntry::~nsCacheEntry()
nsCacheEntryHashTable::FreeCacheEntries(PLDHashTable*, PLDHashEntryHdr*,
                                        unsigned, void*)
nsCacheEntryHashTable::Finalize(PLDHashTable*)
PL_DHashTableFinish
nsCacheEntryHashTable::~nsCacheEntryHashTable()
nsMemoryCacheDevice::~nsMemoryCacheDevice()
nsCacheService::Shutdown()

Two questions here:

1) Should we be storing the image data in the cache along with the compressed
image?  Or is it the image data that the cache stores, and we don't store the
compressed image?  (I was under the impression that the memory cache was a
network cache, though.)

2) Why are we storing additional copies of the image in the cache when we
already have one?
while necko may have stored a compressed (as downloaded) copy of the image in
the cache (most likely the disk cache), imagelib's use of the cache is to
store a decompressed/decoded form of the data that can be quickly rendered.

slashdot.org's homepage has three or so images with dynamic URLs (created using
the current time from javascript).  so, it seems that as long as your memory
cache has room, we'll definitely add to the memory cache each time this page
is loaded.  in other words, this may not be a real leak at all.  just try
clearing your memory cache after visiting slashdot.org a couple times... the
memory usage should drop to near what it is was when you started the browser.
I was only seeing one of the images destroyed between reloads.  I doubt all but
one of the images on slashdot have dynamic URLs.
Platform: Windows NT sp6a
Mozila: 20010502

This major memory leak is definitely not connected with images. I am working
with intranet system, I have a lot of pege reloads and almost no images but guge
HTML (tables, forms). I run out of memory quite quickly. 

On startup my mozilla is something like 16Mb (VM Size, 400kb bookmark file and
simple intranet startup page loaded).

After several dozens of pageloads (it is 10-15 minutes of intensive work into
system). My mozilla.exe jumps up to 50Mb (in VM). 

The most surprisingly, it starts working slowly and slowly. 

It is *not* due to swapping, It has more Phisical memory available then it
wants. It is also very starange, because I have a lot of allications running and
they are swapped out to disk slowly, but mozilla is whole into RAM, it is not
going to allow any if it's parts to be swapped out.

Slowdown is: When I click on a link or button (on a page) CPU load jumps to 100%
and Mozilla freezes (no animation, no switching to other mozilla window) for a
second or two (on PIII 800). It look like it is searching something into huge
memory pool it is utilizing at that moment. Note: at startup mozilla worked
quite fast. This also explains why mozilla is so memory aggressive.

Usual application when leaking memory forget it completely and it is easily
swapped out by OS, but not in case of mozilla. It look like mozilla review all
the leaked structures during pageloads.

Hope this comment helps.
Your description could implicate memory cache. However it also doesn't 
contradict the assertion that we aren't leaking merely archiving more.  Every 
few hours, you could try creating a new window and closing the old one. See how 
much memory you use then.
Clearing memory cache doesn't decrease this enormous memory use, nor does
closing all browser windows and leaving just one. Only restarting Mozilla gives
some breathing space until it gets unbearable again.

Memory use increases while refreshing same imageless file over and over from
local disk. Don't think it's cache. My guess is it's parser related. The bigger
the file - the more memory increase. Maybe parse tree (or structure like that)
is not getting freed and stays in memory?
Can't you run a tool like purify to find out exactly where the leaks are?
Attached file Purify run..
CCing myself..
*** Bug 78924 has been marked as a duplicate of this bug. ***
harishd, could you please do Purify run for http://www.wellnet-one.de/penguin.html
and reload that page a few times?

[W] MLK: Memory leak of 15372 bytes from 63 blocks allocated in
nsMIMEInfoImplConstructor

That looks like something that we're after, but trying loading a BIG page should
make it more obvious.

Keith do you think you are seeing the same leak as me, Stephen and Myroslav?

Also could someone make some Windows builds of leak detection tools and make
them available for download?
It is quite a buggar, not being able to test this.

In my case http://www.wellnet-one.de/penguin.html just refused to render
(20010502). I downloaded file to a local drive, but it rendered something like
30 lines of text and stopped. I tried to do "View Source", my mozilla became
more then 60Mb of VM size. After closing that "View Source" window, some memory
released, but my mozilla is 42Mb in size. Most surprisingly IE5.5 is only 8.3Mb
in VM size having rendered the same page successfully. I do not understand that
big memory fuutprint of the 419Kb HTML. What can I do to help? Purify is
commercial, what other tool(s) I can use to help to track this bug?
The problem of partial rendering is bug 58917. Memory increase on view-source is
not surprising, taking in account the size of that page and ammount of character
entities/colour-coding information applied to it.
I believe that there is a memory leak with page reloads in text only pages.  I
believe there are some bigger leaks related to images.  However, I believe the
particularly aggregious memory leaking that Alexey and I were seeing with
penuin.html are caused by the handling of the non-terminated <font> tags.

I just finished a 10-oad test of a cleaned up (ie terminated) version located
at:  http://home.hiwaay.net/~stephena/penguin.html

Open - Mem: 18,324 Pk: 18,448 VM: 11,672
1    - Mem: 27,428 Pk: 28,292 VM: 19,600
2    - Mem: 30,964 Pk: 31,696 VM: 23,164
3    - Mem: 30,888 Pk: 31,716 VM: 23,084
4    - Mem: 34,408 Pk: 35,120 VM: 26,600
5    - Mem: 30,952 Pk: 35,120 VM: 23,204
6    - Mem: 30,968 Pk: 35,120 VM: 23,224
7    - Mem: 31,120 Pk: 35,120 VM: 23,312
8    - Mem: 34,520 Pk: 35,172 VM: 26,760
9    - Mem: 31,852 Pk: 35,172 VM: 24,032

While there is still a slight trend in memory growth (30,964kB - 31,852kB0 it is
on the order of 100kB per reload not 1MB per reload as it was with the
unterminated penguin.html.

So, I suspect we have three issues here. Some image leaking, some basic layout
leaking, and some really major layout leaking when handling the broken html. 
Should the last case be a reopen of bug 58917 ?
Stephen BINGO!
I indeed could not reproduce the leak with penguin.html with <font> tags closed.
The ammount of memory Mozilla accumulates over time just shows how many bad HTML
pages are out there.
I have been getting these massive ammounts of memory over time while browsing
Bugzilla, so I tried the page of this bug with W3C validator, it indeed showed
quite a few tag open/close problems!
http://validator.w3.org/check?uri=http://bugzilla.mozilla.org/show_bug.cgi?id=75641&doctype=Inline

I think this is a regression that happened after 0.8.1 milestone. Because I
didn't see this at that time. Now that we narrowed this down to parser
regression in handling of opened/closed tags, can someone please track it down
and try to get it fixed for 0.9? This is a real showstopper for Windows machines.
harishd/nisheeth,   can you have a look at this idea that
unclosed font tags are leaking?
Keywords: topmlk
Alexey: I will do a purify run on penguin.html and will post the result on the
bug. Lets see if that will enlighten us.
Keywords: topmlk
assinging to myself for now....
Assignee: dprice → harishd
Yup, running purify on penguin.html infact suggest that the attribute tokens are
leaking. yikes! Will attach the run.
Status: NEW → ASSIGNED
Whiteboard: want for 0.9.1
[W] MLK: Memory leak of 765554 bytes from 95 blocks allocated in PR_Malloc

That's the leak that's getting to us. It's at the very bottom of the log.
Keywords: topmlk
sorry, not at the very bottom, but a little bit up.
Don't get it confused with this one which looks similar from first glance:

[I] MPK: Potential memory leak of 73746 bytes from 9 blocks allocated in PR_Malloc
> That's the leak that's getting to us. It's at the very bottom of the log.

How do you know?  It could be largely shutdown leaks (i.e., memory that's
supposed to be allocated once for the lifetime of browser and freed at shutdown
for which the latter never happens).
Because that's exactly the ammount I expected to see leaking.
I've tested it plenty, and that's approx by how much the memory increase was on
each reload.

All testing results are above, you can check it yourself.
Increase always been by 700 - 800kb.
The sliding string takes ownership of "unichars" ( in nsScanner::Append ) and is 
responsible for freeing it.

scc?
Related to bug 75310?  Or just entrained by the leak of the attribute tokens? 
(Does purify show leak roots?)
I'm not hitting the nsSharedBufferList's destructor ( I put a break point there 
per scc's advice )!!! The problem is probably confined to the scanner. Will 
investigate.
Several scenarios spring to mind.  You could either leak a reference to the
shared buffer list (either the original work string or one of the token
substrings), or else refcounting in the [sub]string could have broken (less
likely), or else the destruction of the [sub]string could have been messed up
somehow avoiding its destructor and thus failing to do the appropriate
ref-counting (least likely).
Attached file Reduced Testcase
The combination of PRE and FONT seems to cause the leak!
Attached patch Patch v1.1Splinter Review
Note: The above patch is not final. I have to run more tests on it..
was that purify run on the testcase you attached or on the penguin.html?
My last purify run was for penguine.html.
Attached patch patch v1.2Splinter Review
Did v1.2 include the crasher fix you mentioned when I was reviewing this with
you, I did not spot the place in the diff?

I believe we are trying to move away from NS_ASSERTION and replace them with
NS_WARN_IF_FALSE and NS_ABORT_IF_FALSE. I suggest you do that, too, but this is
not a requirement.

"while(mCount)" seems fragile since mCount is of type PRInt32 (i.e. it can be
negative). I would feel better if you made it "while(mCount > -1)". You might
also want to add an assertion before the loop to catch a case where it is
negative, because that most likely marks a bug elsewhere.

Otherwise, r=heikki.
Heikki: I backed out the code that could have caused the crash ( so it will not
be a part of the diff ). while(mCount > -1) makes me nervous because I don't
want to enter the loop when mCount is 0. 
Yeah, I realized my error after submitting the comment. But the idea stands,
just change it to mCount > 0. And you might want to have NS_WARN... before the
loop to catch a case where that condition was false initially - it is probably
bad logic error elsewhere (would probably never happen but I am a little paranoid).
Attached patch patch v1.3Splinter Review
Just to nit pick ReleaseAll() a bit more, how about:

+    NS_WARN_IF_FALSE(mCount >= 0,"count should not be a negative value");

Anyway, you know the drill: make sure view source still works, try some XML
documents for giggles. Looks ok to me: [s]r=waterson

harishd: one more thing that I just thought of. Be sure to bracket the macros
(IF_HOLD, IF_FREE, IF_DELETE) with PR_BEGIN_MACRO and PR_END_MACRO. That'll
avoid weird bugs like:

  if (blah)
    IF_FREE(baz, bop);
  else
    /* this'll get bound to the |if| in the macro! oops! */;
The fix worked, I can no longer reproduce leak I was seeing with penguin.html :o)
Thanks for fixing this!

Win2K build 2001051520
Fix is in. Marking FIXED.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Thanks for the fix; it really improved this particular case.

I'm a little concerned that we are still leaking on average 136KB per reload of
the properly terminated penguin.html.  Is there a generic "leak on reload" bug
or would this be a good subject for a new bug?  
stephen: Could you attach the penguin.html ( with FONT tags properly closed ) so
that I can run purify on it? Thanx.

Btw, a new bug would be preferable.
Okay, I post bug 81446 to cover the residual memory leaking.  Thanks for the
purify run.
Product: Browser → Seamonkey
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: