limi's been seeing some memory issues on Firefox on OSX, and I had a chance to look into it. For reference, his about:memory numbers look like:

malloc/allocated                     262,245,152
malloc/mapped                        577,536,000
malloc/zone0/committed               262,333,104
malloc/zone0/allocated               573,341,696
xpconnect/js/gcchunks                 71,303,168
images/chrome/used/raw                         0
images/chrome/used/uncompressed        1,186,472
images/chrome/unused/raw                       0
images/chrome/unused/uncompressed          3,072
images/content/used/raw                        0
images/content/used/uncompressed       2,134,612
images/content/unused/raw                      0
images/content/unused/uncompressed        12,288
storage/sqlite/pagecache              53,719,256
storage/sqlite/other                   2,180,832
layout/all                               922,395
layout/bidi                                    0
gfx/surface/image                      5,566,996
content/canvas/2d_pixel_bytes            111,000

The zone0 numbers are swapped: committed should be 573MB and allocated should be 262MB (I'll fix that separately).

The above implies massive fragmentation: Firefox is only actually using 260MB or so, but has a commit size of almost 600MB. I grabbed both a vmmap and leaks output from his process, and leaks in particular confirms the 260MB allocated number. Both are attached. I didn't have any way to figure out what was actually in those blocks, or to get some idea of actual fragmentation.

We should probably revisit jemalloc on OSX, particularly now that we have mozalloc and a #define-based approach to using it if we want; that way we may be able to avoid the double-frees coming from core frameworks.
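To put a number on the gap, here is a small C sketch of the arithmetic, using the zone0 figures after un-swapping them (committed 573,341,696, allocated 262,333,104). The helper names are hypothetical and exist only for illustration; nothing here reads real allocator state.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes the allocator has committed but not handed out to the application.
 * Swapped arguments (allocated > committed) are treated as a caller bug. */
static uint64_t overhead_bytes(uint64_t committed, uint64_t allocated)
{
    assert(committed >= allocated);
    return committed - allocated;
}

/* The same overhead expressed as a fraction of the commit size, in [0, 1]. */
static double overhead_fraction(uint64_t committed, uint64_t allocated)
{
    if (committed == 0)
        return 0.0;
    return (double)overhead_bytes(committed, allocated) / (double)committed;
}
```

With the figures above, overhead_bytes() comes out to 311,008,592 bytes, which is roughly 54% of the commit size. That is only an upper bound on fragmentation, since it cannot tell fragmented pages apart from pages the allocator is merely caching.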
Created attachment 462859
leaks output

Note that there are a small handful of actual leak-looking things, though it's possible that they're held through JS somehow, and that leaks can't decipher the object mangling in jsvals usefully or something.
leaks has known false-positives with tagged pointers; should be less of an issue with fatvals, but we use those elsewhere too. jemalloc on mac is bug 414946.
Other ideas:

“After extensive testing and confirmation from Apple employees we realized that there was no way for an allocator to give unused pages of memory back while keeping the address range reserved.”

“(You can unmap them and remap them, but that causes some race conditions and isn’t as performant.) There are APIs that claim to do it (both madvise() and msync()) but they don’t actually do anything. It does appear that pages mapped in that haven’t been written to won’t be accounted for in memory stats, but [once] you’ve written to them they’re going to show as taking up space until you unmap them.”

(from http://blog.pavlov.net/2008/03/11/firefox-3-memory-usage/)

<limi> I guess we could unmap/remap on idle, worst case?
<vlad> yeah, I think we need to revisit that
<vlad> I don't understand why we need to keep the address range reserved, that seems like an optimization
I recently went back and carefully considered the only source of documentation on this subject I'm aware of (other than the XNU source code):

http://lists.apple.com/archives/darwin-development/2003/Apr/msg00223.html

My reading of this is that msync(..., MS_KILLPAGES) actually does something useful, namely the marking of pages as unneeded. If memory pressure pushes the machine to the brink of swapping, these unneeded pages will be discarded rather than being swapped. This is ideal behavior from a performance perspective, though it makes memory usage monitoring more difficult. However, jemalloc keeps internal statistics on how much memory is in active use, so we have a viable alternative to top when it comes to gathering memory usage statistics.

It makes sense to verify that msync(..., MS_KILLPAGES) really behaves as stated, but if it does, I think we have a reasonable technical solution to the problem, though it leaves a public perception problem, since top will tell users that Firefox is hogging memory.
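For concreteness, a minimal sketch of what such a call might look like in C. The helper name mark_pages_unneeded is hypothetical. MS_KILLPAGES is a Darwin-only flag, so the sketch only uses it when the header defines it; the MADV_DONTNEED fallback is an assumption added so the snippet builds elsewhere, and it is not claimed to have the same semantics.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes madvise() and MAP_ANON on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hint that the contents of [addr, addr + len) are no longer needed while
 * leaving the address range mapped. addr must be page-aligned.
 * Returns 0 on success, -1 on failure (errno is set by the underlying call).
 */
static int mark_pages_unneeded(void *addr, size_t len)
{
    assert(addr != NULL && len > 0);
#if defined(MS_KILLPAGES)
    /* Darwin: the flag discussed above. Whether it really lets the kernel
     * discard the pages instead of swapping them still needs verifying. */
    return msync(addr, len, MS_KILLPAGES);
#else
    /* Assumption: stand-in for platforms that do not define MS_KILLPAGES. */
    return madvise(addr, len, MADV_DONTNEED);
#endif
}
```

An allocator would call this on runs of free pages inside a chunk. Since top would keep counting those pages until the kernel reclaims them, any verification would have to look at behavior under memory pressure rather than at the resident size alone.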
jasone: can you comment on this:

<vlad> I don't understand why we need to keep the address range reserved, that seems like an optimization
(In reply to comment #6)
> jasone: can you comment on this:
>
> <vlad> I don't understand why we need to keep the address range reserved, that
> seems like an optimization

The underlying issue is that jemalloc deals with memory mapping in chunks, not pages. Suppose some pages within a chunk were momentarily unmapped in order to break the physical mapping, but another thread won the race to re-map those pages. This would break the assumption that the entirety of a chunk is mapped for the singular purpose of chunk operations. It would be possible to add code such that partial chunks could still work, but it would be non-trivial, and for questionable benefit.
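A rough C sketch of the unmap/remap sequence and where the race sits. This is an illustration only, not jemalloc code; the helper name unmap_and_remap and its return convention are hypothetical.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANON on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Drop the physical pages behind [addr, addr + len) by unmapping the range
 * and immediately asking for the same address back.
 * Returns  0 if the range came back at the same address,
 *          1 if the kernel placed the new mapping elsewhere (the chunk now
 *            has a hole in it),
 *         -1 if munmap() or mmap() failed outright.
 */
static int unmap_and_remap(void *addr, size_t len)
{
    assert(addr != NULL && len > 0);

    if (munmap(addr, len) != 0)
        return -1;

    /* Race window: until the mmap() below returns, any other thread that
     * calls mmap() may be handed this address range. */

    /* addr is only a hint here. MAP_FIXED would force the address, but it
     * would also silently replace whatever the other thread mapped. */
    void *p = mmap(addr, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    if (p != addr) {
        munmap(p, len); /* lost the race: give the stray mapping back */
        return 1;
    }
    return 0;
}
```

The return value 1 is the partial-chunk case described above: the chunk is no longer fully mapped, so every chunk operation would have to cope with holes.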
(In reply to comment #7)
> It would be possible to add code such that partial chunks could still work, but
> it would be non-trivial, and for questionable benefit.

Makes sense. In comment 84 on bug 414946, it seems that madvise(MADV_FREE) is the way to do this, at least on 10.6.4. (These two bugs have gotten very tangled at this point...)
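A minimal C sketch of that approach, with the caveat that the helper name purge_pages is hypothetical and the MADV_DONTNEED fallback is an assumption for systems where MADV_FREE is missing or rejected. The point is that the range stays mapped throughout, so there is no window for another thread to grab it.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes madvise() and MAP_ANON on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Tell the kernel it may reclaim the pages behind [addr, addr + len)
 * without unmapping them. addr must be page-aligned.
 * Returns 0 on success, -1 on failure (errno is set by madvise()).
 */
static int purge_pages(void *addr, size_t len)
{
    assert(addr != NULL && len > 0);
#if defined(MADV_FREE)
    if (madvise(addr, len, MADV_FREE) == 0)
        return 0;
    /* Some kernels define the constant but reject it at runtime. */
#endif
    /* Assumption: fallback hint where MADV_FREE is unavailable. */
    return madvise(addr, len, MADV_DONTNEED);
}
```

After a successful call the old contents must be treated as gone, but the range can be written to again without a new mmap(), which is what keeps chunks whole.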
Actually, I think we can dupe this one into that one. Since I was responsible for getting this one filed, I'll do it. I don't think we need to duplicate the discussion here.