Closed Bug 503108 Opened 15 years ago Closed 13 years ago

Memory usage climbs slowly but continuously on downloadstats.mozilla.com

Categories

(Core :: General, defect)

Version: 1.9.1 Branch
Platform: x86, Windows XP
Type: defect
Priority: Not set
Severity: normal

Tracking

RESOLVED FIXED
Tracking Status
blocking2.0 --- -
blocking1.9.1 --- -
status1.9.1 --- wanted

People

(Reporter: matthew.bugzilla, Unassigned)

Details

(Whiteboard: [MemShrink])

Attachments

(3 files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2a1pre) Gecko/20090707 Minefield/3.6a1pre
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2a1pre) Gecko/20090707 Minefield/3.6a1pre

Leave the browser open on this page for an hour or so and Task Manager reports several hundred megabytes of memory in use.

Reproducible: Always
Does it happen in safe mode?
Yes.
I see the same thing happening in other browsers.  Does the memory usage not go down after you close the page?  Given what the page is doing, I wouldn't be surprised if it's just using more and more memory by keeping all the data it ever fetched in RAM...
No, the memory usage doesn't go down after closing the page.
I had left my browser open for most of the day and found that it was up to 1.4GB of memory usage.  This never happened in 3.0 but it is happening in 3.5.  It just increases rapidly.
classical@westnet.com.au, was that on the specific page this bug was reported on?  If not, can you please file a new bug and ideally provide the urls you had loaded when this happened?
Going to start by assuming js engine, assuming this is a shutdown leak, since there are no shutdown XPCOM leaks here.  But this might also be a for-the-process-lifetime leak, which would be extra fun.  :(
Assignee: nobody → general
Status: UNCONFIRMED → NEW
Component: General → JavaScript Engine
Ever confirmed: true
Flags: blocking1.9.1.1?
Product: Firefox → Core
QA Contact: general → general
Version: unspecified → 1.9.1 Branch
Flags: blocking1.9.2?
So I'm not seeing obvious leaks in the OSX "leaks" output here, modulo the known tagged pointers.  This suggests that we're looking at a for-the-process-lifetime kind of thing...

Nicholas, valgrind should be able to tell us something about where memory is being allocated here, right?  Just a matter of running the program under valgrind with the right flags for a while?
(In reply to comment #8)
> valgrind should be able to tell us something about where memory is
> being allocated here, right?

Two different types of leak to think about:

* process allocates memory, throws away the pointers, can never
  free them.  This is what the default tool (Memcheck) can find for
  you.

* process allocates memory as it runs (perhaps giving slow constant
  increase in memory use over its lifetime).  At the end it frees it
  all before exiting.  Memcheck won't tell you anything since there is
  no real leak.  What you need is a heap profiler (to answer the
  question "who put all this stuff here") as the process runs.  That'd
  be the Massif tool:   valgrind --tool=massif ...

  See http://www.valgrind.org/docs/manual/ms-manual.html

  Should be easy to use.  If not, pls yell.
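
For anyone who wants to see the second kind of leak in miniature, here is a minimal sketch. It is not Valgrind or Massif: it uses Python's standard tracemalloc module as a stand-in heap profiler, and the `cache` list and `fetch_and_keep` function are hypothetical names standing in for a page that keeps every response it ever fetched.

```python
import tracemalloc

# Hypothetical stand-in for a page that retains all the data it ever fetched.
cache = []


def fetch_and_keep(n_bytes):
    """Simulate one polling cycle whose result is never dropped."""
    cache.append(bytearray(n_bytes))


def measure_growth(cycles, n_bytes):
    """Return (growth in traced bytes, biggest allocation site) for the run."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    for _ in range(cycles):
        fetch_and_keep(n_bytes)
    after, _peak = tracemalloc.get_traced_memory()
    # Statistics come back sorted with the largest allocation site first.
    top_site = tracemalloc.take_snapshot().statistics("lineno")[0]
    tracemalloc.stop()
    return after - before, top_site


growth, top_site = measure_growth(cycles=100, n_bytes=10_000)
print("traced growth in bytes:", growth)
print("largest allocation site:", top_site)

# Everything is released before exit, so a lost-pointer check sees no leak.
cache.clear()
```

Every block stays reachable through `cache` until the end, so a tool that only looks for lost pointers would report nothing; only a profiler that asks who allocated the live memory shows the growth.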
Julian, those are exactly the two types of leaks I mentioned in comment 7.

Running with massif now; here's hoping it no longer lies like crazy (which it did last time I tried it a few years back).
(In reply to comment #10)
> Julian, those are exactly the two types of leaks I mentioned in comment 7.

Oh, sorry.  I didn't understand that.

> Running with massif now; here's hoping it no longer lies like crazy (which it
> did last time I tried it a few years back).

It got totally overhauled for Valgrind 3.3.0; what we measured before that
(space-time product) was misleading and inappropriate for C/C++ apps, so
that was scrapped.  Should be more reliable and understandable now.
This log doesn't show constantly growing memory usage after the first little bit...  Neither did activity monitor for the process in question, actually.

It's possible that the unbounded growth is Windows-only, of course; if so it's more likely to be a cairo issue than JS.
Massif only measures heap blocks (well, you can use --stacks=yes to measure stacks but they seem unlikely to be relevant here).  "Heap blocks" means things allocated with malloc/calloc/realloc/memalign/valloc/new/new[].  Crucially, Massif does *not* measure memory allocated directly with mmap() or brk(), and they might be the source of your growing memory usage.  Eg. I noticed recently that nanojit uses mmap() to allocate code pages, so Massif doesn't record these.  It probably should record such mmaps, but it gets complicated with shared maps and code maps... maybe it could do something like record all private anonymous maps or something.

You can use VALGRIND_MALLOCLIKE_BLOCK to remedy this situation... modifying code is always a pain, but if you have a suspicion that a particular place is causing the slow leak it might be useful.
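
A rough analogy for the blind spot described above, using Python rather than Massif and assuming nothing about Massif's internals: the standard tracemalloc module only sees memory requested through Python's own allocators, so an anonymous mapping created with the mmap module is essentially invisible to it. The `traced_growth` helper and the 10 MiB size are illustrative choices, not anything from this bug.

```python
import mmap
import tracemalloc

SIZE = 10 * 1024 * 1024  # 10 MiB, an arbitrary illustrative size


def traced_growth(allocate):
    """Return (bytes of growth seen by tracemalloc, the allocated object)."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    obj = allocate()
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before, obj


# Allocation through the language's allocator: the profiler sees all of it.
heap_growth, heap_obj = traced_growth(lambda: bytearray(SIZE))

# Anonymous mapping requested straight from the OS: only the small wrapper
# object is traced, not the mapped pages themselves.
map_growth, mapping = traced_growth(lambda: mmap.mmap(-1, SIZE))

print("growth seen for bytearray:", heap_growth)
print("growth seen for mmap:", map_growth)

mapping.close()
```

The same 10 MiB is in use in both cases, but the profiler only attributes it in the first one, which is the kind of gap that marking blocks by hand is meant to close.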
> You can use VALGRIND_MALLOCLIKE_BLOCK to remedy this situation

Can you point me to the details?  It's entirely possible that the nanojit mmap()s are the issue here (though the activity monitor data still makes me question that).
Attached patch: untested patch
Look in valgrind/valgrind.h in your Valgrind distribution for an explanation.  Or even better, use the attached patch as a starting point -- I haven't tested it (I haven't even compiled it, I have to run in a minute) but it will probably work.
blocking1.9.1: --- → -
Flags: wanted1.9.1.x+
Flags: blocking1.9.1.1?
Flags: blocking1.9.1.1-
Assignee: general → nobody
Component: JavaScript Engine → General
Flags: wanted1.9.2+
QA Contact: general → general
How do you use valgrind?  I am having trouble trying to use it, and I would like you to be able to see what might be causing my problems.
It might as well be French to me, since the code does not make much sense to me.  Basically I will need a very detailed step-by-step method of using it.
bz, attached is a patch that changes Massif to track memory at the mmap/brk level rather than the malloc/free level, i.e. it covers *all* memory allocations/deallocations, but at a lower abstraction level.  You may find it helpful -- if there's a slow leak, Massif is guaranteed to find it with this patch.

(Graydon, you might also find it useful for the TM allocation changes you've been working on.)

Apply it to the current Valgrind SVN trunk (with suitable substitutions for all the $VARs):

  svn co svn://svn.valgrind.org/valgrind/trunk $DIR
  cd $DIR
  patch -p0 < $PATCHNAME
  ./autogen.sh
  ./configure --prefix=$INSTALL
  make
  make install

You can skip the 'make install' step and just use $DIR/vg-in-place if you like.

If you apply it to an existing Valgrind workspace you *must* run 'make clean' first.

You'll need to run it with --smc-check=all unless you configured Firefox with --enable-valgrind.  

I've tested it with 'js' but not with a full Firefox build;  hopefully it's robust enough for your needs.
I can't reproduce this on my Linux box or my Mac laptop (this is natively, not using Valgrind).

That is to say, if I open that webpage 'top' tells me that the virtual size of the Firefox process is 576MB/450MB on Linux/Mac, but the resident sizes are more like 65MB/60MB and that doesn't vary much even if I leave the window open for a long time.  The original reporter didn't indicate if the "several hundred megabytes" number increased over time, ie. what the reported number was at start-up.

bz, can you reproduce it?
> The original reporter didn't indicate if the "several hundred
> megabytes" number increased over time, ie. what the reported number was at
> start-up.

Yes, it increases over time.
(In reply to comment #21)
>
> Yes, it increases over time.

How much?  What does it start at?
It starts at about 45MB, and passes 100MB within about 10 minutes.
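
As a quick sanity check on those figures, here is a small sketch that assumes the growth is roughly linear (nothing in the report confirms that) and treats "passes 100MB within about 10 minutes" as exactly 100MB at 10 minutes. The function names are made up for illustration.

```python
def growth_rate_mb_per_min(start_mb, later_mb, minutes):
    """Average growth rate between two readings, in MB per minute."""
    return (later_mb - start_mb) / minutes


def projected_mb(start_mb, rate_mb_per_min, minutes):
    """Linear projection of memory use after the given number of minutes."""
    return start_mb + rate_mb_per_min * minutes


# Readings from the report: about 45MB at start, about 100MB after 10 minutes.
rate = growth_rate_mb_per_min(45, 100, 10)
print(rate)                        # 5.5
print(projected_mb(45, rate, 60))  # 375.0
```

Under that assumption the process would be at roughly 375MB after an hour, which is in line with the "several hundred megabytes" in the original report.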
Add-ons?

/be
> Add-ons?

None.
Flags: wanted1.9.1.x+
Flags: blocking1.9.2? → blocking1.9.2-
Is anything happening with this? It is still there in Firefox 3.6.3.
I haven't had time to look into this, certainly; too many high-priority items on my plate.  We really need someone to sit down with a trunk (or better yet 1.9.2 if this is not reproducible on trunk) build and look at what's going on here....

jst, do we have anyone available to do that?
blocking2.0: --- → ?
If it's happening on trunk, then the newly-detailed about:memory will help narrow down where the memory is accumulating.
I found the source of my problem, and it was an add-on.  I was using an Australian dictionary, and that was the cause of the problem.
Could you provide a link to that add-on?  If spell-checking is causing leaks, we should get on that!  Adding Ehsan, who I think said something about spell-check memory problems in another bug, or something.
The spellchecker didn't use to participate in cycle collection, and it held references to the document, which caused the document to remain in memory and therefore leak very badly.  I have landed a couple of patches for this on trunk, and there shouldn't be any leaks any more (one of the patches has also landed on 1.9.1 and 1.9.2).

classical, could you please try the latest nightly version (http://nightly.mozilla.org) and see if you still see the problem?
The bugs in question are bug 569504 and bug 570417.
Note that classical's situation is NOT the one this bug was reported on.  Comment 25 explicitly says reporter has no add-ons.
Because comment #0 describes behaviour on MS Windows and mentions "memory in use", it probably refers to the "Mem Usage" column of Windows XP's Task Manager (in Vista, "Private Bytes" is used as the column name).

As written in the following document, which I pointed to in Bug 381950 comment #0, MS Windows' memory management uses "Page Trimming" instead of "Page Stealing".
> http://www.demandtech.com/Resources/Papers/WinMemMgmt.pdf
> http://www.microsoft.com/resources/sharedsource/windowsacademic/facultyexperiences/teachingkit.mspx
The "Mem Usage" column value includes real memory that has already been freed but has not yet been returned to the page pool. So the "Mem Usage" column value can be considered a high-water mark of the real memory allocated in the past.
But Windows XP's column name "Mem Usage" is very confusing.
When deciding whether or not there is a leak on MS Windows, the "Virtual Memory Size" column value should be checked first and the "Mem Usage" column value should be ignored.

To find a value close to the Working Set size that is really needed on MS Windows, the following is required.
(1) Set config.trim_on_minimize=true and restart.
(2) Check the real memory size that is really needed/frequently referenced.
(2-1) Use Firefox ordinarily for a while,
      check the "Mem Usage" value and the "Virtual Memory Size" value.
(2-2) Minimize Firefox, and wait for a while,
      check the "Mem Usage" value and the "Virtual Memory Size" value.
(2-3) Return to normal window size, and do some operations until no delay
      in responding is observed,
      check the "Mem Usage" value and the "Virtual Memory Size" value.
(3) Repeat (2) many times.
    The average size at step (2-3) is roughly the really required "Working Set Size".
The delta between Virtual Memory Size and the average value of (2-3) is evidence of inefficient use of virtual memory or Magnate programming, and in some cases, such as a leak caused by an add-on, evidence of a memory leak. The average size at step (2-3) is evidence that Mozilla requires a large amount of real memory to keep response time acceptable.
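
The bookkeeping for step (3) can be sketched as follows. The readings are made-up numbers, not measurements from this bug, and `summarize_step_2_3` is a hypothetical helper; each tuple is one repetition's ("Mem Usage", "Virtual Memory Size") pair taken at step (2-3), in MB.

```python
from statistics import mean


def summarize_step_2_3(samples):
    """Average the step (2-3) readings.

    samples: list of (mem_usage_mb, virtual_size_mb) tuples, one per repetition.
    Returns (approximate working set in MB, delta to virtual size in MB).
    """
    working_set = mean(mem for mem, _virtual in samples)
    virtual_size = mean(virtual for _mem, virtual in samples)
    return working_set, virtual_size - working_set


# Hypothetical readings from three repetitions of step (2).
samples = [(120, 400), (130, 410), (125, 405)]

working_set, delta = summarize_step_2_3(samples)
print("approximate working set (MB):", working_set)
print("virtual size minus working set (MB):", delta)
```

With these invented readings the working set comes out at about 125MB and the delta at about 280MB; per the comment above, it is the delta that should be examined when judging a leak.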
blocking2.0: ? → -
Testing needed.
Whiteboard: [MemShrink]
Summary: Memory usage climbs slowly but continuously → Memory usage climbs slowly but continuously on downloadstats.mozilla.com
I'm going to close this.  Reasons:

- It happened on 3.6, and various memory problems have been fixed since then.

- I was unable to reproduce it when I looked at it 2 years ago.

- http://downloadstats.mozilla.com/ has been retired, it now redirects to http://blog.mozilla.com/website-archive/2011/06/14/glow-1-0/, so there's no obvious way to attempt to reproduce.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED