Closed Bug 746009 Opened 13 years ago Closed 12 years ago

Investigate sources of and solutions to jemalloc heap fragmentation

Categories

(Core :: Memory Allocator, defect)

14 Branch
defect
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: justin.lebar+bug, Unassigned)

References

Details

(Whiteboard: [MemShrink:P1])

Attachments

(7 files, 4 obsolete files)

Our heap gets fragmented after you close a bunch of tabs. See http://areweslimyet.com/ -- the red line is roughly twice the light blue/green lines; this is due to heap fragmentation. We don't really understand where this is coming from or what we can do about it.
Some instrumentation which lets you see which size classes are responsible for wasted space.
Here's the output from a short browsing session with this patch applied. The table is (size class, wasted space [mb]).

   8  0.07
  16  0.41
  32  1.37
  48  1.27
  64  1.45
  80  0.77
  96  0.42
 112  0.78
 128  1.22
 144  0.54
 160  0.40
 176  0.30
 192  0.66
 208  0.38
 224  0.39
 240  0.18
 256  1.38
 272  0.38
 288  0.16
 304  0.35
 320  0.19
 336  0.17
 352  0.31
 368  0.15
 384  0.24
 400  0.29
 416  0.11
 432  0.13
 448  0.20
 464  0.37
 480  0.09
 496  0.29
 512  0.92
1024  4.87
2048  4.32
Total 25.54
Comment 2 ordered by waste amount:

1024  4.87
2048  4.32
  64  1.45
 256  1.38
  32  1.37
  48  1.27
 128  1.22
 512  0.92
 112  0.78
  80  0.77
 192  0.66
 144  0.54
  96  0.42
  16  0.41
 160  0.40
 224  0.39
 272  0.38
 208  0.38
 464  0.37
 304  0.35
 352  0.31
 176  0.30
 496  0.29
 400  0.29
 384  0.24
 448  0.20
 320  0.19
 240  0.18
 336  0.17
 288  0.16
 368  0.15
 432  0.13
 416  0.11
 480  0.09
   8  0.07
Total 25.54
Whiteboard: [MemShrink] → [MemShrink:P1]
Depends on: 688979
How about investigating this with the new jemalloc?
(In reply to Mike Hommey [:glandium] from comment #9)
> How about investigating this with the new jemalloc?

Yeah, at the moment, I think we can get somewhere by looking just at the allocation sites. But if I really dig into the allocator's behavior, it would be worthwhile to use the new one, for sure.
Attachment #615559 - Attachment is obsolete: true
Attached file 1024-byte allocation sites (obsolete)
Mostly SQLite.
Attached file 2048-byte allocation sites (obsolete)
Mostly HTML5 parser
Depends on: 746501
Depends on: 746503
I wonder. If these 1024 and 2048 allocations contribute to fragmentation, when we close lots of tabs, it means they are long-lived. Making them bigger is likely to increase memory footprint.
(In reply to Mike Hommey [:glandium] from comment #13)
> I wonder. If these 1024 and 2048 allocations contribute to fragmentation,
> when we close lots of tabs, it means they are long-lived. Making them bigger
> is likely to increase memory footprint.

Indeed it would, unless we usually allocate N 2048-byte chunks, and we'd be switching to N/2 4096-byte chunks. For example, the NSS PL_Arenas are, it seems, usually larger than 2048 bytes, so increasing the chunk size there shouldn't have much of an impact on memory usage. I don't know about SQLite or the HTML5 parser.

Alternatively (but less likely), we could reduce fragmentation by changing the size of some short- or medium-lived chunks which get allocated in-between long-lived chunks, spreading the long-lived chunks out.
> For example, the NSS PL_Arenas are, it
> seems, usually larger than 2048 bytes, so increasing the chunk size there
> shouldn't have much of an impact on memory usage.

I did some instrumentation of them and saw that they often were not larger than 2048 bytes :/
So it just allocates a bunch of separate arenas?
Yes, it seemed to. Enough so that I stopped looking at it closely.
The units of the lifetime field are "number of X-byte malloc's" -- that is, if a malloc has lifetime 10, that means that the allocation survived 10 X-byte malloc's before being free'd. Lifetime inf means the object was never free'd (I believe that when I ran the browser to collect this data, I killed it after GC/CC'ing, rather than shutting down nicely). We exclude |inf|'s when calculating the mean lifetime. I'm not handling realloc, which may be throwing this data off. But I verified a few points by hand, so I'm reasonably confident that the numbers are meaningful, modulo that.
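To make the lifetime definition above concrete, here is a minimal sketch of how it could be computed from an allocation trace. The trace format (tuples of "malloc"/"free" events) and the function names are assumptions made up for this illustration, not the format used by the attached data; like the measurement described above, it ignores realloc.

```python
import math

def lifetimes(events, size_class):
    """Return lifetimes of all size_class allocations in a trace.

    events is a hypothetical trace: ("malloc", addr, size) or ("free", addr)
    tuples in program order. A lifetime is the number of size_class mallocs
    that happened between an object's own malloc and its free; objects that
    are never freed get math.inf.
    """
    seen = 0     # number of size_class mallocs so far
    born = {}    # addr -> value of `seen` right after the object's own malloc
    result = []
    for event in events:
        if event[0] == "malloc":
            _, addr, size = event
            if size == size_class:
                seen += 1
                born[addr] = seen
        elif event[0] == "free":
            addr = event[1]
            if addr in born:
                result.append(seen - born.pop(addr))
    # Anything still live at the end of the trace was never freed.
    result.extend(math.inf for _ in born)
    return result

def mean_lifetime(values):
    """Mean over finite lifetimes only; inf entries are excluded."""
    finite = [v for v in values if v != math.inf]
    return sum(finite) / len(finite) if finite else None

trace = [
    ("malloc", 1, 1024), ("malloc", 2, 1024), ("malloc", 3, 512),
    ("malloc", 4, 1024), ("free", 1), ("malloc", 5, 1024), ("free", 4),
]
print(lifetimes(trace, 1024))
print(mean_lifetime(lifetimes(trace, 1024)))
```

In this toy trace, object 1 survives two later 1024-byte mallocs before it is freed, object 4 survives one, and objects 2 and 5 are never freed.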
Attachment #616044 - Attachment is obsolete: true
Attachment #616045 - Attachment is obsolete: true
Attached image AWSY
I tried changing jemalloc so that any allocation request in the range 513..4095 bytes was rounded up to 4096. (Instead of the usual 1024, 2048 or 4096.) And I did an AWSY run. The idea was that we'll use some more memory, but suffer less fragmentation. The results weren't very good -- memory consumption went up significantly, except for the final "measure after closing all tabs" measurement which was flat. (I've attached a screenshot.) So, the fragmentation improved from a relative point of view, but the cure is worse than the disease.
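As a rough illustration of the rounding experiment described above, the sketch below compares the usual rounding for requests in the 513..4095 range with the experimental "always 4096" rounding. It models only that range; the function names are invented for this example and this is not the actual jemalloc code.

```python
def usual_size(request):
    """Usual rounding for 513..4095-byte requests: 1024, 2048 or 4096."""
    if not 513 <= request <= 4095:
        raise ValueError("only the 513..4095 range is modeled here")
    size = 1024
    while size < request:
        size *= 2
    return size

def experimental_size(request):
    """Experimental rounding: everything in 513..4095 becomes 4096."""
    if not 513 <= request <= 4095:
        raise ValueError("only the 513..4095 range is modeled here")
    return 4096

def extra_bytes(request):
    """Additional bytes the experiment spends on a single request."""
    return experimental_size(request) - usual_size(request)

for request in (600, 1024, 1500, 3000):
    print(request, usual_size(request), experimental_size(request),
          extra_bytes(request))
```

Every request that would normally land in the 1024 or 2048 class costs 3072 or 2048 extra bytes under the experiment, which is consistent with memory consumption going up while tabs are open.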
This patch converts all the NSS arenas that use 2KB chunks to use 4KB chunks. It does likewise with nsPersistentProperties.
This patch converts all the NSS arenas that use 1KB chunks to use 4KB chunks.
I was pretty wrong about how jemalloc handles 512b, 1kb, and 2kb allocations. Here's my updated understanding, with the proviso that I reserve the right to re-understand this again later.

We allocate 512b-2kb allocs out of "runs" of size 8 pages (32kb). One run contains allocations from exactly one size class (i.e., 512b, 1kb, or 2kb).

There's some bookkeeping at the beginning of the run. Because the runs are 8 pages, not 8 pages plus epsilon, the bookkeeping takes up the space we might otherwise use to store one object. So our 32kb run can store only 15 2kb allocs, 31 1kb allocs, or 63 512b allocs.

Afaict, we never madvise/decommit part of a run. I don't see a technical limitation against madvising/decommitting part of a run. It might be slow, both to madvise/decommit and to page-fault in / recommit.
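The run arithmetic above can be checked with a few lines. This is only a sketch of the numbers in the comment, assuming 4kb pages (so that 8 pages is 32kb) and that the bookkeeping costs exactly one object slot; the names are made up for the example.

```python
PAGE_SIZE = 4096   # assumed page size, so an 8-page run is 32kb
RUN_PAGES = 8
RUN_BYTES = PAGE_SIZE * RUN_PAGES

def objects_per_run(size_class):
    """Slots in one run, minus the one slot lost to the run's bookkeeping."""
    return RUN_BYTES // size_class - 1

def bytes_not_live(live_objects, size_class):
    """Bytes of a run that hold no live object (bookkeeping included).

    Since no part of a run is madvised/decommitted, the whole run stays
    resident for as long as any object in it is live.
    """
    if not 0 <= live_objects <= objects_per_run(size_class):
        raise ValueError("live_objects does not fit in one run")
    return RUN_BYTES - live_objects * size_class

for size_class in (2048, 1024, 512):
    print(size_class, objects_per_run(size_class))
print(bytes_not_live(1, 2048))
```

Under these assumptions a single surviving 2kb object keeps a whole 32kb run resident, leaving 30kb of it holding no live data.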
John, would you mind running this patch through AWSY?
Attachment #616889 - Flags: feedback?(jschoenick)
Attachment #616889 - Attachment is patch: true
Comment on attachment 616889
Test patch 3: Consider 1k and 2k allocations as "large"

Actually, I have no idea how my browser even stood up with this patch. There's no way it'll work. The smallest allowable "large" allocation is 1 page.
Attachment #616889 - Flags: feedback?(jschoenick) → feedback-
Depends on: 748440
I talked with jlebar; this bug is serving no useful purpose at this point.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WONTFIX