
Significant memory leak introduced between 4.0b10 and 4.0b12; causing regular system OOMs

RESOLVED INCOMPLETE

Status

()

Core
Plug-ins
--
critical
RESOLVED INCOMPLETE
6 years ago
6 years ago

People

(Reporter: Omari Stephens, Unassigned)

Tracking

({mlk, regression})

Trunk
x86_64
Linux
mlk, regression
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [MemShrink:P2])

Attachments

(11 attachments)

(Reporter)

Description

6 years ago
User-Agent:       Mozilla/5.0 (X11; Linux i686; rv:2.0b12) Gecko/20100101 Firefox/4.0b12
Build Identifier: Mozilla/5.0 (X11; Linux i686; rv:2.0b12) Gecko/20100101 Firefox/4.0b12

Two weeks ago I upgraded both of my machines from 4.0b10 to 4.0b12.  Since then, one of them (named intercal) has started OOMing regularly, and the other (named perl) has shown abnormally high memory usage given how few tabs it has open.


The OOMs on intercal seem to be correlated with abnormally high memory usage for Flash's plugin-container.  I did not upgrade Flash anytime recently, so it would seem that the high Flash memory usage is related to my recent FF upgrade.  Here are my recent OOMs:
$grep -A1 'Out of memory' /var/log/kern.log
Mar  6 09:35:38 intercal kernel: [5268880.338201] Out of memory: Kill process 424 (firefox-bin) score 370 or sacrifice child
Mar  6 09:35:38 intercal kernel: [5268880.338212] Killed process 424 (firefox-bin) total-vm:3768460kB, anon-rss:3037200kB, file-rss:0kB
--
Mar  6 17:33:16 intercal kernel: [5297538.593814] Out of memory: Kill process 16228 (firefox-bin) score 673 or sacrifice child
Mar  6 17:33:16 intercal kernel: [5297538.593826] Killed process 16296 (plugin-containe) total-vm:632640kB, anon-rss:100232kB, file-rss:1636kB
--
Mar  7 00:02:04 intercal kernel: [5320866.717173] Out of memory: Kill process 16228 (firefox-bin) score 582 or sacrifice child
Mar  7 00:02:04 intercal kernel: [5320866.717186] Killed process 2000 (plugin-containe) total-vm:309004kB, anon-rss:7280kB, file-rss:0kB
--
Mar  7 23:49:12 intercal kernel: [5406493.667396] Out of memory: Kill process 16228 (firefox-bin) score 494 or sacrifice child
Mar  7 23:49:12 intercal kernel: [5406493.667410] Killed process 4195 (plugin-containe) total-vm:355372kB, anon-rss:10648kB, file-rss:4kB
--
Mar  7 23:54:12 intercal kernel: [5406794.459170] Out of memory: Kill process 16228 (firefox-bin) score 495 or sacrifice child
Mar  7 23:54:12 intercal kernel: [5406794.459181] Killed process 16228 (firefox-bin) total-vm:5250364kB, anon-rss:4065652kB, file-rss:0kB

This last one, at 5GB of total-vm, is between 2x and 4x the memory usage I saw under 4.0b10.
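
For anyone who wants to tabulate OOM kills like the ones above, here is a minimal sketch that pulls the pid, process name, and sizes out of the "Killed process" lines. The regular expression is written against the line format shown in this report only; other kernel versions print different fields, so treat the pattern as an assumption. The function name and the SAMPLE constant are made up for illustration.

```python
import re

# Matches the "Killed process" lines as they appear in the kern.log excerpt above.
# Assumption: other kernel versions may format this line differently.
KILLED_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB, "
    r"file-rss:(?P<file_rss>\d+)kB"
)


def parse_oom_kills(log_text):
    """Return one dict per 'Killed process' line, with sizes in kB."""
    kills = []
    for line in log_text.splitlines():
        m = KILLED_RE.search(line)
        if m:
            kills.append({
                "pid": int(m["pid"]),
                "name": m["name"],
                "total_vm_kb": int(m["total_vm"]),
                "anon_rss_kb": int(m["anon_rss"]),
                "file_rss_kb": int(m["file_rss"]),
            })
    return kills


# Two lines copied from the log excerpt above.
SAMPLE = """\
Mar  6 09:35:38 intercal kernel: [5268880.338212] Killed process 424 (firefox-bin) total-vm:3768460kB, anon-rss:3037200kB, file-rss:0kB
Mar  7 23:54:12 intercal kernel: [5406794.459181] Killed process 16228 (firefox-bin) total-vm:5250364kB, anon-rss:4065652kB, file-rss:0kB
"""

for kill in parse_oom_kills(SAMPLE):
    print(f"{kill['name']} pid={kill['pid']} "
          f"total-vm={kill['total_vm_kb'] / 1024**2:.2f} GiB "
          f"anon-rss={kill['anon_rss_kb'] / 1024**2:.2f} GiB")
```

Note that the "Out of memory: Kill process" line names the process the kernel scored, while the "Killed process" line names what was actually killed (sometimes a plugin-container child), so only the second kind of line is parsed here.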

Note that I run two independent (`ff4 -no-remote` from different homedirs) FF instances on intercal.  I only run Flash in one of those two instances (pid 16228 in the OOMs).  Memory usage in the non-Flash instance is currently as follows:
Memory mapped: 1,432,354,816
Memory in use: 1,286,469,108


perl has 2 windows, 7 tabs in one, 22 tabs in the other.  Memory usage is currently as follows:
Memory mapped: 1,439,694,848
Memory in use: 696,473,970

There was one GoDaddy tab I left open for about a day, at which point I noticed perl lagging badly.  I closed that single tab on a guess that the memory usage was somehow related to JS, and FF immediately dropped about 600MB in RSS, which meant I could use my system again.  At the same time, 700MB of RSS is pretty insane for 30 tabs.

Reproducible: Always

Steps to Reproduce:
Don't know yet.  I tend to have a bunch of tabs open at a time (80-120 per FF instance on intercal).  On 4.0b10 and earlier, with this quantity of tabs, the instances would generally run up to ~2GB apiece and sit there.
(Reporter)

Comment 1

6 years ago
Oh, I should mention that intercal has 8GB of ram (with no swap).  perl has 1.5GB of ram and 400MB of swap.

The complete memory usage listing (from about:memory) for intercal's non-Flash instance is:
Memory mapped:                     1,432,354,816
Memory in use:                     1,351,033,924

malloc/allocated                   1,351,039,412
malloc/mapped                      1,432,354,816
malloc/committed                   1,432,354,816
malloc/dirty                       3,428,352
js/gc-heap                         727,711,744
js/string-data                     10,509,872
js/mjit-code                       0
storage/sqlite/pagecache           20,116,632
storage/sqlite/other               1,981,776
images/chrome/used/raw             0
images/chrome/used/uncompressed    13,383,240
images/chrome/unused/raw           0
images/chrome/unused/uncompressed  0
images/content/used/raw            21,499,639
images/content/used/uncompressed   1,827,705
images/content/unused/raw          0
images/content/unused/uncompressed 0
layout/all                         47,127,374
layout/bidi                        630
gfx/surface/image                  2,244,048
content/canvas/2d_pixel_bytes      8,471,040


The complete memory usage listing for perl is:
Memory mapped:                     1,439,694,848
Memory in use:                     696,473,970
malloc/allocated                   696,477,434
malloc/mapped                      1,439,694,848
malloc/committed                   1,439,694,848
malloc/dirty                       2,703,360
js/gc-heap                         324,009,984
js/string-data                     7,278,700
js/mjit-code                       3,731,521
storage/sqlite/pagecache           115,389,704
storage/sqlite/other               1,561,992
images/chrome/used/raw             0
images/chrome/used/uncompressed    156,160
images/chrome/unused/raw           0
images/chrome/unused/uncompressed  0
images/content/used/raw            4,623,817
images/content/used/uncompressed   2,066,400
images/content/unused/raw          4,208
images/content/unused/uncompressed 23,120
layout/all                         15,101,251
layout/bidi                        1,766
gfx/surface/image                  167,616
content/canvas/2d_pixel_bytes      6,246,596
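
To compare listings like the two above without eyeballing them, a small parser is enough. This is only a sketch against the plain-text layout pasted in this comment (name, whitespace, comma-grouped byte count); parse_about_memory is a hypothetical helper, not anything Firefox ships. The ratio printed at the end simply divides two reported numbers; whether js/gc-heap is actually counted inside malloc/allocated is an assumption, so read it as a rough comparison only.

```python
def parse_about_memory(listing):
    """Map each reporter name to its byte count.

    Expects lines like 'js/gc-heap   727,711,744'. The name may contain
    spaces and an optional trailing colon ('Memory mapped:').
    """
    values = {}
    for line in listing.splitlines():
        parts = line.rsplit(None, 1)  # split off the last whitespace-separated field
        if len(parts) != 2:
            continue
        name, number = parts
        digits = number.replace(",", "")
        if digits.isdigit():
            values[name.strip().rstrip(":")] = int(digits)
    return values


# A few lines copied from the intercal non-Flash listing above.
SAMPLE = """\
Memory mapped:                     1,432,354,816
Memory in use:                     1,351,033,924
malloc/allocated                   1,351,039,412
js/gc-heap                         727,711,744
storage/sqlite/pagecache           20,116,632
layout/all                         47,127,374
"""

report = parse_about_memory(SAMPLE)
share = report["js/gc-heap"] / report["malloc/allocated"]
print(f"js/gc-heap is {share:.1%} of malloc/allocated")
```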
Assignee: nobody → general
Component: General → JavaScript Engine
Product: Firefox → Core
QA Contact: general → general
Version: unspecified → Trunk
Why is this a JS issue?  It's plugin-container that's using all the RAM... the about:memory numbers above are for the main process only.
Assignee: general → nobody
Component: JavaScript Engine → Plug-ins
QA Contact: general → plugins
(Reporter)

Comment 3

6 years ago
bz: It's possible that there are multiple issues here.
The perl browser has zero plugins (to be specific, about:plugins says "No enabled plugins found"), and yet was using 600MB of RSS for a single, JS-laden tab.  That's clearly not a plugin-container bug.

At this point, I've restarted the intercal Flash browser, but haven't used Flash at all (I use Flashblock).  Memory seems to be holding steady (if you count 500MB RSS swings as "steady"), but most of the OOMs have happened while I've been away at work, so we'll see what happens when I get back home tonight.  I set up a quick loop to capture memory usage (`ps u`) for both processes on 60-second intervals, just to see if that shows anything interesting

If there's anything you want me to try, let me know; I'm all ears.
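
The 60-second sampling loop mentioned above was done with `ps u`; the following is a rough Python equivalent, not the reporter's actual script. It assumes Linux and reads VmRSS from /proc/<pid>/status. The function names, the output format, and the log file name in the usage note are all made up for illustration.

```python
import time


def parse_vm_rss_kb(status_text):
    """Extract VmRSS (in kB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None  # some processes (e.g. kernel threads) have no VmRSS line


def sample_rss(pid, out_path, interval_s=60, samples=None):
    """Append 'unix_time rss_kb' lines to out_path every interval_s seconds.

    Runs forever when samples is None. Raises FileNotFoundError once the
    process has exited, which ends the loop.
    """
    taken = 0
    while samples is None or taken < samples:
        with open(f"/proc/{pid}/status") as f:
            rss_kb = parse_vm_rss_kb(f.read())
        # Mode "a" appends, so restarting the sampler keeps earlier data.
        with open(out_path, "a") as out:
            out.write(f"{int(time.time())} {rss_kb}\n")
        taken += 1
        if samples is None or taken < samples:
            time.sleep(interval_s)


# Parsing demo on a hand-written status fragment (no /proc access needed).
print(parse_vm_rss_kb("Name:\tfirefox-bin\nVmRSS:\t 3215196 kB\n"))
```

Something like `sample_rss(16228, "flash_plus_rss.log")` would then log one instance; running one sampler per pid keeps the files separate and easy to plot.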
Blocks: 632234
Keywords: mlk, regression
(Reporter)

Comment 4

6 years ago
Created attachment 517789 [details]
Memory usage for flash-enabled browser (on intercal) since restart

Note that I locked my screen and went to sleep immediately after I started running the measurements.  Despite all the sawtoothiness, though, the base memory rate seems to be holding steady.  We'll see if that holds up throughout the day while I'm at work.

Again, note that even though this browser has the Flash plugin ("Flash+" in the title), I use Flashblock and have not yet tried playing any Flash content.
> That's clearly not a plugin-container bug.

Sure, but it's also not clearly a "bug" depending on what the page is doing...  Worth investigating, for sure.
Keywords: mlk
(Reporter)

Comment 6

6 years ago
Created attachment 517799 [details]
Memory usage for non-flash browser instance on intercal

x0 on this graph is the same time as x0 on the other graph.  Also, the Y axis limits are identical between the two (0:1,700,000 bytes).

Clearly, in addition to the sawtoothiness, there's a linear increase in RSS, which I'd classify as a leak of some sort.  Again, we'll see where this goes once I get back from work today.  And to reiterate, this happened while the machine was locked and I was sleeping, so there was no human interaction until the very end of the data on the graph.
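
One way to put a number on a "linear increase" like this is an ordinary least-squares fit over the samples, which averages out the sawtooth. The sketch below uses synthetic data, not the values from the attached graphs, and fit_line is just an illustrative helper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values are all identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic data, NOT taken from the attachments: a 2,000 kB/minute ramp
# with a sawtooth of up to 45,000 kB that repeats every 10 minutes.
minutes = list(range(600))
rss_kb = [1_000_000 + 2_000 * t + 5_000 * (t % 10) for t in minutes]

slope, intercept = fit_line(minutes, rss_kb)
print(f"estimated growth: {slope:.0f} kB/minute")
```

A slope that stays clearly positive over many hours of idle time is what separates a steady leak from ordinary GC sawtoothing around a flat baseline.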

Finally, I haven't been recording similar metrics for perl yet, but I'll start now; its memory usage has been going up, so I presume it's seeing some kind of leak as well, but it'd be useful to see what the actual data looks like.
(Reporter)

Comment 7

6 years ago
Oh, here's a current memory breakdown for intercal's non-flash browser (that's showing the linear ramp in memory usage):

Memory mapped:                     1,579,155,456
Memory in use:                     1,536,258,396
malloc/allocated                   1,536,206,540
malloc/mapped                      1,579,155,456
malloc/committed                   1,579,155,456
malloc/dirty                       2,519,040
js/gc-heap                         871,366,656
js/string-data                     12,834,334
js/mjit-code                       0
storage/sqlite/pagecache           20,149,648
storage/sqlite/other               2,034,104
images/chrome/used/raw             0
images/chrome/used/uncompressed    13,383,240
images/chrome/unused/raw           0
images/chrome/unused/uncompressed  0
images/content/used/raw            21,499,639
images/content/used/uncompressed   1,827,705
images/content/unused/raw          0
images/content/unused/uncompressed 0
layout/all                         47,126,301
layout/bidi                        630
gfx/surface/image                  2,244,048
content/canvas/2d_pixel_bytes      8,471,040
(Reporter)

Comment 8

6 years ago
FWIW, it just occurred to me that the units in both graphs are off by an order of magnitude; they should be kilobytes, not bytes. (so the y scales are from 0 to 1.7GB)
(Reporter)

Comment 9

6 years ago
I returned home to an unresponsive-but-not-yet-OOM-killed Flash+ browser.  Flash- is sitting pretty.  Graphs from all three browser instances forthcoming.
(Reporter)

Comment 10

6 years ago
Created attachment 518009 [details]
intercal non-flash memory usage graph

The non-flash instance on intercal continues its nice linear ramp
(Reporter)

Comment 11

6 years ago
Created attachment 518014 [details]
memory usage graph for flash-enabled browser

The flash-enabled instance was fine for a long while, and then it suddenly blew up.  To reiterate, I was at work at the time, and the machine was locked.

Also, as I had mentioned, the Flash+ instance is completely unresponsive right now.  It doesn't redraw anything.  It _is_ still alive, though, so I'll leave it and see what happens.  Given the memory usage, though, I imagine something might get OOM-killed by the morning:

$free -to
             total       used       free     shared    buffers     cached
Mem:       8199588    8065452     134136          0     134964     285508
Swap:            0          0          0
Total:     8199588    8065452     134136
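
For reference, the headroom left in the `free` output above can be estimated as free + buffers + cached, since buffer and page-cache memory can usually (though not always entirely) be reclaimed before the OOM killer fires. The sketch below parses the old-style column layout shown above; newer versions of `free` print different columns, so the column names are an assumption tied to this output. reclaimable_kb is a hypothetical helper.

```python
def reclaimable_kb(free_output):
    """Return free + buffers + cached (kB) from the 'Mem:' row of old-style `free`."""
    lines = free_output.splitlines()
    header = lines[0].split()
    for line in lines[1:]:
        fields = line.split()
        if fields and fields[0] == "Mem:":
            row = dict(zip(header, (int(v) for v in fields[1:])))
            return row["free"] + row["buffers"] + row["cached"]
    raise ValueError("no 'Mem:' row found")


# Output copied from the `free -to` run above.
SAMPLE = """\
             total       used       free     shared    buffers     cached
Mem:       8199588    8065452     134136          0     134964     285508
Swap:            0          0          0
Total:     8199588    8065452     134136
"""

print(f"{reclaimable_kb(SAMPLE)} kB free or readily reclaimable")
```

That comes to roughly half a gigabyte on an 8GB machine with no swap, which is consistent with the expectation that something gets OOM-killed soon.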
(Reporter)

Comment 12

6 years ago
Created attachment 518017 [details]
Memory usage graph for perl's browser

perl seems pretty well-behaved.  The noise at the beginning and end is from when I was using the browser to interact with bugzilla (this morning and now).  As a reminder, perl's browser has no plugins (namely, no Flash)
(Reporter)

Comment 13

6 years ago
Finally, this is the current memory breakdown for intercal's non-flash browser (that's showing the linear ramp in memory usage).  Note that js/gc-heap is growing steadily with each one of these I post:

Memory mapped:                     1,858,076,672
Memory in use:                     1,827,098,152
malloc/allocated                   1,827,105,688
malloc/mapped                      1,858,076,672
malloc/committed                   1,858,076,672
malloc/dirty                       1,368,064
js/gc-heap                         1,138,753,536
js/string-data                     14,886,114
js/mjit-code                       0
storage/sqlite/pagecache           20,218,224
storage/sqlite/other               2,034,440
images/chrome/used/raw             0
images/chrome/used/uncompressed    13,383,240
images/chrome/unused/raw           0
images/chrome/unused/uncompressed  0
images/content/used/raw            21,499,639
images/content/used/uncompressed   1,827,705
images/content/unused/raw          0
images/content/unused/uncompressed 0
layout/all                         47,127,374
layout/bidi                        630
gfx/surface/image                  2,244,048
content/canvas/2d_pixel_bytes      8,471,040


And here's perl's:
Memory mapped:                     1,437,597,696
Memory in use:                     739,658,102
malloc/allocated                   739,661,566
malloc/mapped                      1,437,597,696
malloc/committed                   1,437,597,696
malloc/dirty                       2,412,544
js/gc-heap                         324,009,984
js/string-data                     6,504,054
js/mjit-code                       6,118,204
storage/sqlite/pagecache           137,863,000
storage/sqlite/other               1,749,384
images/chrome/used/raw             0
images/chrome/used/uncompressed    157,184
images/chrome/unused/raw           0
images/chrome/unused/uncompressed  0
images/content/used/raw            5,154,241
images/content/used/uncompressed   4,202,468
images/content/unused/raw          0
images/content/unused/uncompressed 0
layout/all                         17,555,945
layout/bidi                        1,486
gfx/surface/image                  91,840
content/canvas/2d_pixel_bytes      6,246,596
(Reporter)

Comment 14

6 years ago
First off: if there's anything people want me to try/check/whatever, please let me know.  Nothing would make me happier than if this were fixed, so if there's some (reasonable) way I can be of assistance, please speak up.

That said, a short update.  Unfortunately, an errant > instead of >> last night killed my previous data; oops.  That said, intercal flash+ memory usage went up slightly after I grabbed a core dump last night, and then dropped by 35MB overnight.  It's currently sitting at 3215196kB of RSS, solid; no fluctuation at all.

perl instance is sitting right around 759MB.  Base rate is flat, and it's getting tiny GC fluctuations (in other words, looks very reasonable)

intercal flash- is continuing its ramp; current detailed numbers are below.  We're up to 1.3GB classified as js/gc-heap:
Memory mapped:                     2,031,091,712
Memory in use:                     1,920,810,456
malloc/allocated                   1,920,817,992
malloc/mapped                      2,031,091,712
malloc/committed                   2,031,091,712
malloc/dirty                       3,354,624
js/gc-heap                         1,308,622,848
js/string-data                     12,601,150
js/mjit-code                       0
storage/sqlite/pagecache           20,259,928
storage/sqlite/other               2,141,136
images/chrome/used/raw             0
images/chrome/used/uncompressed    13,384,712
images/chrome/unused/raw           0
images/chrome/unused/uncompressed  0
images/content/used/raw            21,498,361
images/content/used/uncompressed   1,825,657
images/content/unused/raw          0
images/content/unused/uncompressed 0
layout/all                         47,259,690
layout/bidi                        630
gfx/surface/image                  2,244,048
content/canvas/2d_pixel_bytes      8,471,040
Blocks: 640452
(Reporter)

Comment 15

6 years ago
Double-update for intercal flash-.

Sometime yesterday:
Memory mapped:                     2,497,708,032
Memory in use:                     2,409,595,346
malloc/allocated                   2,409,602,882
malloc/mapped                      2,497,708,032
malloc/committed                   2,497,708,032
malloc/dirty                       3,104,768
js/gc-heap                         1,768,947,712
js/string-data                     16,135,706
js/mjit-code                       0
storage/sqlite/pagecache           20,364,064
storage/sqlite/other               2,140,408
images/chrome/used/raw             0
images/chrome/used/uncompressed    13,384,712
images/chrome/unused/raw           0
images/chrome/unused/uncompressed  0
images/content/used/raw            21,498,361
images/content/used/uncompressed   1,825,657
images/content/unused/raw          0
images/content/unused/uncompressed 0
layout/all                         47,273,051
layout/bidi                        630
gfx/surface/image                  2,244,048
content/canvas/2d_pixel_bytes      8,471,040

half-an-hour ago:
Memory mapped:                     2,942,304,256
Memory in use:                     2,834,912,766
malloc/allocated                   2,834,920,302
malloc/mapped                      2,942,304,256
malloc/committed                   2,942,304,256
malloc/dirty                       3,125,248
js/gc-heap                         2,204,106,752
js/string-data                     17,475,580
js/mjit-code                       0
storage/sqlite/pagecache           20,432,920
storage/sqlite/other               2,192,712
images/chrome/used/raw             0
images/chrome/used/uncompressed    13,397,864
images/chrome/unused/raw           0
images/chrome/unused/uncompressed  0
images/content/used/raw            21,498,361
images/content/used/uncompressed   1,825,657
images/content/unused/raw          0
images/content/unused/uncompressed 0
layout/all                         47,281,341
layout/bidi                        630
gfx/surface/image                  2,257,200
content/canvas/2d_pixel_bytes      8,471,040
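
Putting the js/gc-heap figures from the successive listings side by side makes the ramp easier to see. The values below are copied from the listings for intercal's non-Flash instance in comments 1, 7, 13, 14 and 15; the snapshots were not taken at recorded times, so this sketch only computes differences, not a rate.

```python
# js/gc-heap values (bytes) for intercal's non-Flash instance, copied from the
# about:memory listings in comments 1, 7, 13, 14 and 15 (two snapshots in 15).
GC_HEAP_BYTES = [
    727_711_744,
    871_366_656,
    1_138_753_536,
    1_308_622_848,
    1_768_947_712,
    2_204_106_752,
]


def deltas(values):
    """Differences between consecutive snapshots."""
    return [b - a for a, b in zip(values, values[1:])]


MIB = 1024 * 1024
for step, delta in enumerate(deltas(GC_HEAP_BYTES), start=1):
    print(f"snapshot {step} -> {step + 1}: +{delta / MIB:.0f} MiB")

total_growth = GC_HEAP_BYTES[-1] - GC_HEAP_BYTES[0]
print(f"total growth: {total_growth / MIB:.0f} MiB")
```

Every step is positive, while the other reporters in the same listings (images, layout, sqlite) stay essentially flat, which is why js/gc-heap looks like the place to dig.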
(Reporter)

Comment 16

6 years ago
So, I woke up yesterday morning to the revelation that _all three_ of my browser instances (across both machines) had been OOM-killed.  Fun graphs forthcoming.

I think my conclusion so far is that there's a slow, consistent memory leak somewhere (which causes js/gc-heap to grow without bound), as well as a spontaneous leak which causes the browser to blow up pretty immediately.

One thought: is it possible that having dialogs open somehow hampers GC effectiveness/retards GC frequency/something like that?  For the intercal/Flash+ instance that blew up and then stopped responding, iirc it did have a "do you want to accept this cookie?" dialog open (judging by the window title; the contents weren't being drawn).
Does the memory leak still occur if all addons are disabled?
(Reporter)

Comment 18

6 years ago
Created attachment 519028 [details]
Pre-OOM memory usage for intercal flash-

Same constant-base-rate linear ramp
(Reporter)

Comment 19

6 years ago
Created attachment 519030 [details]
Pre-OOM memory usage for intercal flash+

It looked reasonable up until everything broke.  It's interesting to note that the base level of memory usage increased after each large swing (this also goes for some of the swings from ~800 to ~1500 minutes).

Also interesting is that each large swing shows at least two different slopes — it starts bleeding memory at a constant rate, and then at some point, it starts going at an even faster constant rate.
(Reporter)

Comment 20

6 years ago
Created attachment 519032 [details]
Pre-OOM memory usage for perl

Perl was well-behaved until things took a sudden turn for the worse.

I should note that this browser instance wasn't really killed by the OOM-killer; it hit the "out of memory" message which, presumably, means that FF's custom malloc was unable to allocate more memory and killed the process.  It's interesting that the rate of increase tapered off some near the end.
(Reporter)

Comment 21

6 years ago
Had a power outage, so sort of forgot about all this until just now.  I'll set up my monitoring again.
[1739260.164336] Out of memory: Kill process 28908 (firefox-bin) score 570 or sacrifice child
[1739260.164351] Killed process 28908 (firefox-bin) total-vm:5782716kB, anon-rss:4681200kB, file-rss:0kB

This was the non-flash instance on intercal

As for running without plugins, I can probably do a short A/B test on the nonflash intercal instance, but I find the web-browsing experience to be nearly unbearable without vertical tabs, so it'll have to be a short-lived experiment.

For the FF folks: is there any way to get a feel for where the memory is actually being used/held, rather than just "it's in the JS heap"?  It would be pretty useful if I could see the memory footprint of each tab (and it'd make it that much easier to figure out if it's a problem related to resources for closed or open tabs).
Not yet.  I think Mike Shaver is working on something there.
Keywords: mlk
Blocks: 659855
No longer blocks: 632234
No longer blocks: 640452
Whiteboard: [MemShrink:P2]
Omari:  any chance you can try your experiments again with a Nightly build of FF6 from http://nightly.mozilla.org/?  Numerous leak fixes and similar improvements have occurred since the FF4 betas you tried.  Thanks!
In particular, bug 656120 has been fixed, and it has helped with a lot of bug reports involving slow increases in memory usage.
I'm going to close this;  there hasn't been any extra info from the reporter in two months, and there's a good chance some MemShrink-related fixes have fixed the original problem.

Omari, please reopen if you can still reproduce with Firefox 7 or later;  also note that Firefox 7 has per-compartment reporters in about:memory, which give a lot more detail about how the JS engine uses memory.
Status: UNCONFIRMED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → INCOMPLETE
(Reporter)

Comment 26

6 years ago
Howdy, all.  Sorry for the lack of response for a while.  Here's a quick update and the current state of affairs:

For one, I think the bugfix that njn mentioned did fix the immediate issue of FF's memory usage growing to the point of getting OOM-killed repeatedly.

That said, memory usage is still significantly higher than I'd expect.  I'm currently running Nightly 2011-10-01 (10.0a1) on perl.  I'm currently at 2.5GB resident for just one of my two FF instances.  Things that look ultra-suspicious are 322MB for "gc-heap-chunk-dirty-unused" and 700MB for "heap-unclassified".

In particular, I recorded the reported memory usage (attachments forthcoming) before and after hitting the "Minimize memory usage" button.  gc-heap-chunk-dirty-unused decreased by a whole 2MB and heap-unclassified by 7MB.  By percentage, this is nothing.
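
To make "by percentage, this is nothing" concrete, here is the arithmetic on the approximate figures quoted above. The exact values live in the attachments; this sketch uses only the rounded numbers from this comment, and percent_drop is an illustrative helper.

```python
def percent_drop(before_mb, dropped_mb):
    """Percentage of the original value that was released."""
    return 100.0 * dropped_mb / before_mb


# Approximate figures quoted in the comment above (MB).
print(f"gc-heap-chunk-dirty-unused: {percent_drop(322, 2):.1f}% released")
print(f"heap-unclassified:          {percent_drop(700, 7):.1f}% released")
```

Both come out at about one percent or less, so "Minimize memory usage" barely touches either category.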
Status: RESOLVED → UNCONFIRMED
Resolution: INCOMPLETE → ---
(Reporter)

Comment 27

6 years ago
Created attachment 564640 [details]
Memory usage before "Minimize memory usage" button

Note in particular gc-heap-chunk-dirty-unused and heap-unclassified.
(Reporter)

Updated

6 years ago
Attachment #564640 - Attachment mime type: application/octet-stream → application/xhtml+xml
(Reporter)

Comment 28

6 years ago
Created attachment 564646 [details]
Memory usage *after* pressing "Minimize memory usage" button
(Reporter)

Updated

6 years ago
Attachment #564646 - Attachment mime type: application/octet-stream → application/xhtml+xml
(Reporter)

Comment 29

6 years ago
Created attachment 564671 [details]
Memory usage after browser restart

In a short moment of non-stupidity, I grabbed a memory usage dump after restarting the browser.  To be explicit, this was a hard shutdown rather than a clean shutdown — I hit ^C in the console I was running the browser from.  As such, the browser should have the same state as it did before I killed it (and in particular, I've waited for it to finish lazy-loading all of my zillion tabs).  After the restart, the browser's resident size is reduced by _1 GB_.

Referring to the metrics I mentioned before, heap-unclassified is down by 300MB to 430MB (from 700MB before), and gc-heap-chunk-dirty-unused is down to 26MB (from 320MB).  That alone is a savings of 0.6GB.
heap-unclassified of 20-30% isn't that unusual, unfortunately.  We have various efforts to improve this linked off of Bug 563700.  gc-heap-chunk-dirty-unused I think is the result of heap fragmentation.  Until we get a much fancier GC, we probably won't be able to avoid that entirely.  How long did you have the browser running before you restarted it?  A few hours?  A few days?

Some addons can also cause memory use to increase, so you can try disabling those.

Unfortunately, without more specifics about what pages or browsing behavior are causing your memory usage to increase, there's nothing we can really act on here.

Also, you can copy and paste about:memory as plain text, which makes it a little easier to view.
Attachment #564671 - Attachment mime type: application/octet-stream → application/xhtml+xml
Omari, thanks for the updates.  I'm going to close this again, because there's no longer a clear problem, such as a leak.  If Firefox is using more memory than you might expect, that alone doesn't make for a terribly good bug report.  There's not enough data here to take any concrete actions, unfortunately.  The good news is that we have many other bugs open for reducing memory usage in general; search for bugs with "MemShrink" in the whiteboard :)

Nb: We have bug 668809 open for the goal of startup memory usage matching memory usage after browsing for some time.
Status: UNCONFIRMED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → INCOMPLETE