Closed Bug 653817 Opened 13 years ago Closed 12 years ago
js/gc-heap memory never fully reclaimed
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:2.0.1) Gecko/20100101 Firefox/4.0.1 Build Identifier: Mozilla/5.0 (Windows NT 5.1; rv:2.0.1) Gecko/20100101 Firefox/4.0.1 When using JS heavy web sites, the about:memory field for js/gc-heap, js/string-data and js/mjit-code values can grow large, which isn't a big surprise. However when closing the tabs associated with those sites it appears string-data and mjit-code go back down to about where they were before the site(s) were loaded, but js/gc-heap never returns to normal. I've reproduced this on mac, windows, and fennec. Reproducible: Always Steps to Reproduce: 1. Launch fresh firefox, open about:memory tab, observe the values. 2. Create a few new tabs and go to JS-heavy sites like google mail + calendar + docs, edit a few docs or copy/paste a large (e.g. few hundred kb) text string into a new google document. 3. Observe the about:memory values for js/gc-heap 4. Close all tabs except about:memory 5. Wait and observe about:memory Actual Results: Depending on which platform you're on and how large you drive the memory allocation up to, you'll get different final resting amounts, but never close to the starting point. For example, on the mac a fresh launch put js/gc-heap in the ~5MB range. I created a new google text document. Copy/pasted in 291K of text from another text document. After the document was fully rendered, js/gc-heap was 100MB. I closed the document and waited an hour. js/gc-heap reduced to 27MB. Not the 5MB starting point. I did the same on windows. The peak js/gc-heap was 147MB, returning to 32MB. On fennec I had less memory so used a smaller test. I got js/gc-heap up to 68MB. After closing all tabs except about:memory, content process js/gc-heap was 9.4MB. Expected Results: While I can understand that the final numbers might not return to the original in every case, it doesn't seem right for tens of megabytes of memory to remain when all content tabs are closed. I'm concerned about memory not being reclaimed properly and/or ongoing "bloated" feeling as firefox runs longer, especially when trying to use it on memory limited devices. Every byte matters.
Did you notice a difference in behavior relative to Firefox 3.6?
Kevin, do you want to try your tool on this bug?
I looked at this with Massif, opening a single gmail tab and then closing it. The main single thing I see is that 2 MB (out of about 26MB of heap) of JSScripts are still held onto. That seems high.
Here's some data from Massif -- the top 24 heap allocation points left after I opened 10 gmail tabs then closed them, leaving open only about:memory. I'm not really sure what to make of this.
At the end, the majority of the JSScripts left are XDR scripts. I wonder if they're involved with running the browser. o1o> 05.09% (2,225,356B): JSScript::NewScript() (jsutil.h:239) | o2o> 03.62% (1,580,817B): js_XDRScript() (jsscript.cpp:549) | | o3o> 03.53% (1,540,964B): js_XDRFunctionObject() (jsfun.cpp:1904) | | | o4o> 03.53% (1,540,964B): js_XDRScript() (jsscript.cpp:673) | | | o5o> 02.92% (1,278,012B): JS_XDRScriptObject () | | | | o6o> 01.74% (762,260B): mozJSComponentLoader::ReadScript() (mozJSComponentLoader.cpp:365) | | | | | o7o> 01.74% (762,260B): mozJSComponentLoader::GlobalForLocation() (mozJSComponentLoader.cpp:1034) | | | | | o8o> 01.09% (478,003B): mozJSComponentLoader::LoadModuleImpl() (mozJSComponentLoader.cpp:688) | | | | | | o9o> 01.09% (478,003B): mozJSComponentLoader::LoadModule() (mozJSComponentLoader.cpp:615) | | | | | | o10> 01.09% (478,003B): nsComponentManagerImpl::KnownModule::Load() (nsComponentManager.cpp:954) | | | | | | o11> 01.09% (478,003B): nsFactoryEntry::GetFactory() (nsComponentManager.cpp:1976) | | | | | | o12> 00.70% (307,128B): nsComponentManagerImpl::CreateInstance() (nsComponentManager.cpp:1212) | | | | | | | o13> 00.70% (305,888B): nsComponentManagerImpl::GetService() (nsComponentManager.cpp:1508) | | | | | | | | o14> 00.70% (305,888B): nsJSCID::GetService() (xpcjsid.cpp:839) | | | | | | | | o15> 00.70% (305,888B): NS_InvokeByIndex_P (xptcinvoke_x86_64_unix.cpp:195)
HashMgr is hunspell spell checking. 2.5MB of memory might be reasonable for that if it doesn't grow over time. pcache1Alloc is SQLite storage, probably used by Places for history/bookmarks/etc. 3.5MB might also be reasonable for that if it doesn't grow over time. All those nsSegmentedBuffers look suspicious to me. They're used for I/O, but I/O memory should have been freed if you're quiescent, I suspect. Can you open GMail 10 times, close all the windows, then open GMail 10 times again and close those, and compare before and after the second 10 GMails?
A sizeable chunk is the spell-checking dictionary: o1o> 06.31% (2,755,787B): HashMgr::add_word(char const*, int, int, unsigned short*, int, char const*, bool) (hashmgr.cpp:176) | o2o> 06.26% (2,735,531B): HashMgr::load_tables(char const*, char const*) (hashmgr.cpp:520) | | o3o> 06.26% (2,735,531B): HashMgr::HashMgr(char const*, char const*, char const*) (hashmgr.cpp:89) | | o4o> 06.26% (2,735,531B): Hunspell::Hunspell(char const*, char const*, char const*) (hunspell.cpp:79) | | o5o> 06.26% (2,735,531B): mozHunspell::SetDictionary(unsigned short const*) (mozHunspell.cpp:168) | | o6o> 06.26% (2,735,531B): mozSpellChecker::SetCurrentDictionary(nsAString_internal const&) (mozSpellChecker.cpp:386) I guess it's reasonable that this stays around. Really, the whole question of this bug is understanding what things are likely to stick around for the full length of the process.
Here's the full Massif data.
Here's the nsSegmentedBuffer info in full: o1o> 05.68% (2,482,176B): nsSegmentedBuffer::AppendNewSegment() (nsSegmentedBuffer.cpp:103) | o2o> 05.68% (2,482,176B): nsStorageStream::Write() (nsStorageStream.cpp:195) | | o3o> 05.68% (2,482,176B): nsInputStreamTeeWriteEvent::Run() (nsInputStreamTee.cpp:130) | | | o4o> 05.68% (2,482,176B): nsThread::ProcessNextEvent() (nsThread.cpp:618) | | | o5o> 05.68% (2,482,176B): NS_ProcessNextEvent_P() (nsThreadUtils.cpp:250) | | | o6o> 05.68% (2,482,176B): nsThread::ThreadFunc() (nsThread.cpp:273) | | | o7o> 05.68% (2,482,176B): _pt_root (ptthread.c:187) | | | o8o> 05.68% (2,482,176B): start_thread (pthread_create.c:300) | | | o9o> 05.68% (2,482,176B): clone (clone.S:112) roc, I'll try your experiment tomorrow, I don't have time today (it'll take a while).
A better growth test for hunspell may be to open a big text document on google docs (or any online editor) to see if more spelling errors and/or words checked is increasing the retained allocation for the HashMgr. Any good tests to try for pcache1Alloc growth? Is this impacted at all by safebrowsing database? Any speculation on what's causing the nsSegmentedBuffers to be retained? On his simple test, that's a couple MB. I wonder if that grows with increased js network traffic on js heavy sites?
Whoever is seeing the segmented buffers; I assume that you're running a clean profile with no extensions? The way nsStorageStream works is that you write data to it once, then can read it multiple times. That means that the data in the nsStorageStream has to stick around for as long as the nsStorageStream does (and also for as long as any of the input streams that can read the data do, etc). Looking at mxr, our nsStorageStream users are: necko memory cache, XHR (for XML uploads only), startup cache, web browser persist ("save as"), the JS component loader, about:cache. That seems to be it. The stack in comment 9 shows a storage stream in combination with an input stream tee, which sort of sounds like memory cache is the most likely thing, if this is a vanilla browser.
(In reply to comment #11) > Whoever is seeing the segmented buffers; I assume that you're running a clean > profile with no extensions? No extensions and an almost-clean profile (it's been used to visit gmail a number of times, and little else).
So the nsStorageStream could just be the Necko memory cache? I guess jvp/njn could set the max cache size to something very small, or just flush the cache, to see if that's the problem. You could probably also disable spellchecking to get an idea how much memory it's using.
> So the nsStorageStream could just be the Necko memory cache? That's my best guess so far, yes. Testing with a limited-size memory cache might be of use.
(In reply to comment #14) > > So the nsStorageStream could just be the Necko memory cache? > > That's my best guess so far, yes. Testing with a limited-size memory cache > might be of use. I experimented a bit tonight. I'm not on a debug build at the moment so don't know a way to do any detailed tracing. I set browser.config.memory.capacity = 1024 and layout.spellcheckDefault=0. Not sure if that's exactly the Necko memory cache you're referring to. Is there a different setting I should be modifying? Starting js/gc-heap was about 3MB. Loaded up gmail, calendar, docs, opened a big text file, big complex spreadsheet, and 7 other misc content tabs. js/gc-heap grew to 45MB. I then closed all tabs except about:memory and left the browser idle for 30 minutes. js/gc-heap quickly dropped to 33.5MB and stayed there for the duration. In case it helps, I had browser.session.max_history=4 and browser.sessionhistory.max_total_viewers=1 during this test.
I did the open-close-reopen-close experiment, but only with 2 tabs because I was having difficulty with higher numbers. The top 20 allocation sites for the two minima are attached. I'll shortly add another attachment that presents the same data in an easier-to-compare fashion.
In this version entries from the two minima are paired. If an entry appeared in one but not the other, it's paired with "n/a". So "n/a" after an entry means the entry only appeared in the first minimum, and "n/a" before an entry means the entry only appeared in the second minimum. (Well, this data is only the top 20 from each minimum.)
jvp, the problem with this bug report is that's it's not clear that there's actually a leak. It's possible that the heap increase you're seeing is due to stuff that is supposed to hang around so long as Firefox is running. An experiment that would be useful would be to do like roc suggested to me -- do your experiment twice in a row (in a single execution of Firefox), and see if the memory usage after the second run is higher than the first run. If it is, that is stronger evidence of a leak.
(In reply to comment #18) > jvp, the problem with this bug report is that's it's not clear that there's > actually a leak. It's possible that the heap increase you're seeing is due to > stuff that is supposed to hang around so long as Firefox is running. Agree. Note that I didn't say leak in the problem description. But the overall footprint grows and is not meaningfully returned, which is problematic, especially in an embedded environment. > An experiment that would be useful would be to do like roc suggested to me -- > do your experiment twice in a row (in a single execution of Firefox), and see > if the memory usage after the second run is higher than the first run. If it > is, that is stronger evidence of a leak. Good idea. I spent some time today running a long experiment. I measured both the firefox-bin size as reported in USS by procrank (in KB) as well as js/gc-heap as reported by about:memory. I loaded eight tabs of content, then closed, then opened, then closed. For the first load, I actually kept the tabs open and left the browser idle and sampled memory consumption every ten minutes for 150 minutes. firefox-bin js/gc-heap Fresh Launch 47,972K 3,145,728 +gmail 99,940K 10,485,760 +calendar 102,896K 13,631,488 +docs dir 123,768K 16,777,216 +text doc 152,256K 20,971,520 +spreadsheet 209,512K 35,651,584 +espn.com 208,916K 35,651,584 +att.net 213,168K 35,651,584 +engadgetmobile 224,712K 36,700,160 Wait 10 minutes 231,248K 40,894,464 Wait 20 minutes 233,636K 40,894,464 Wait 30 minutes 238,572K 40,894,464 Wait 40 minutes 240,488K 42,991,616 Wait 50 minutes 242,260K 42,991,616 Wait 60 minutes 242,352K 42,991,616 Wait 70 minutes 250,016K 45,088,768 Wait 80 minutes 246,536K 45,088,768 Wait 90 minutes 251,844K 45,088,768 Wait 150 minutes 251,448K 49,283,072 Close all tabs 226,328K 38,797,312 Wait 30 minutes 226,252K 38,797,312 Wait 90 minutes 226,156K 38,797,312 Load all tabs 271,528K 38,797,312 Close all tabs 234,799K 38,797,312 Load all tabs 269,200K 38,797,312 Close all tabs 232,852K 38,797,312 Load all tabs 261,532K 38,797,312 Close all tabs 231,936K 37,748,736 A couple interesting things to note. From the fresh launch to the first full close, you'll notice that the working set of the browser increased by 178MB. That's the problem I'm trying to highlight. I can understand that some additional services (e.g. spell checker) are lazy loaded, but 178MB is a HUGE increase, especially in an embedded device. Note that I had spell checking disabled and browser.cache.memory.capacity=1024 and browser.sessionhistory.max_total_viewers=1. Second, note that with the browser sitting "idle", memory use grew significantly. Granted, these are some JS hungry sites with live content. But after closing all tabs, the js/gc-heap and total process footprint remains very large. Third, the peak/valley profile is interesting: 48MB, 251MB, 226MB, 272MB, 235MB, 270MB, 233MB, 262MB, 232MB. So yes, back to back loads/closes of identical content results in significant growth...but after a few load/close cycles it levels off. Is it a leak? Maybe. Maybe not. At a minimum it's a significant increase in memory profile without sufficient reduction when content is unloaded. This needs more scrutiny. Something isn't right here.
In today's Nightly (2011-05-04) bug 654106 was fixed. It likely causes a reasonable amount of memory rise over time. I suggest another test with today's Nightly to eliminate this variable.
jvp: thanks for the details. Giving specific sites and specific measurements is really helpful. Yes, it's hard to say if it's a leak but I agree there should be more drop-off after closing the tabs. If you can repeat the experiment as Hughman suggests in comment 20, that would be helpful. Also, I wonder if opening different sites the second time around would give different results. (Sorry if that sounds like "hey reporter, do a bunch more work!"... these bugs are difficult to diagnose.)
Applied patch and did a build tonight. First reaction after just a quick test is that the rate of growth is MUCH better with that patch applied. It's definitely part of the problem. I'll spend more time tomorrow trying to reproduce a test like above and get more accurate numbers so we can see what's left.
Great! Thanks for the feedback, I'm looking forward to detailed numbers.
(In reply to comment #23) > Great! Thanks for the feedback, I'm looking forward to detailed numbers. I ran the same test early this morning. As you can see, it makes an improvement on closing tabs. But as you can also see, it's only a partial solution. firefox-bin js/gc-heap Fresh Launch 48,700 3,145,728 +gmail 91,252 11,534,336 +calendar 104,408 12,582,912 +docs dir 123,460 15,728,640 +text doc 149,704 19,922,944 +spreadsheet 203,388 36,700,160 +espn.com 215,308 36,700,160 +att.net 226,492 36,700,160 +engadgetmobile 232,036 36,700,160 Wait 10 minutes 235,600 38,797,120 Wait 20 minutes 236,152 39,845,888 Wait 30 minutes 235,924 39,845,888 Wait 40 minutes 234,480 39,845,888 Wait 50 minutes 238,248 41,943,040 Wait 60 minutes 234,964 42,991,616 Wait 70 minutes 236,444 42,991,616 Wait 90 minutes 240,064 45,088,768 Close all tabs 108,908 34,603,008 Load all tabs 239,220 34,603,008 Close all tabs 121,552 34,603,008 Load all tabs 240,100 34,603,008 Close all tabs 123,692 34,603,008 Load all tabs 245,759 37,748,736 Close all tabs 127,588 36,700,160 Wait 30 minutes 122,040 35,651,584 Wait 60 minutes 121,832 35,651,584 From fresh launch to end state (tabs closed except about:memory), there's still ~75MB of growth that isn't reclaimed. That's about 250% of the original. Something's still either leaking or needlessly hanging onto memory.
> From fresh launch to end state (tabs closed except about:memory), there's still > ~75MB of growth that isn't reclaimed. That's about 250% of the original. Just for clarity, that's down from close to 500%, so a big improvement. > Something's still either leaking or needlessly hanging onto memory. Yeah, still seems likely.
I just blogged about this bug and bug 654106; the post may be interesting to people following this bug: http://blog.mozilla.com/nnethercote/2011/05/06/another-leak-fixed-part-2/
Nicholas can you run a massif before/after snapshot of a similar experiment (loading tabs once and then closing) to what I did above?
jvp, I already did that with gmail. You want me to do it with the mix of sites that you used?
(In reply to comment #28) > jvp, I already did that with gmail. You want me to do it with the mix of > sites that you used? Yea, two thoughts. First, your gmail massif was before the other memory leak was found and fixed so some of that leak may have factored into your numbers. Would be good to get a fresh run with the patch applied. Second, could use the massif output compared (at least partially anyway) against the overall numbers I was seeing, and maybe it gives some more clues.
Sorry, haven't got round to another run yet.
jvp: bug 656120 has been fixed. It causes GC to run more often during idle times, preventing
Depends on: 656120
Ugh, let me try again. jvp: bug 656120 has been fixed. It causes GC to run more often during idle times, preventing the gc-heap from getting so big, and has helped with a ton of bug reports about leak-like behaviour. Furthermore, we have some evidence that when all tabs are closed, the size of the gc-heap is smaller as a result of less fragmentation -- see bug 666101 comment 5 and subsequent comments. So, would you mind trying again with a Nightly from June 21 or later? If you want to do a strict comparison, you could do so against a Nightly from June 20. Thanks!
jvp, could you try with recent versions of Firefox?
I think this is basically caused by fragmentation, so compacting GC (bug 619558) will fix it. And we also have 668809 which is more general. So I'll close this.
Status: UNCONFIRMED → RESOLVED
Closed: 12 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.