Bug 746009: Investigate sources of and solutions to jemalloc heap fragmentation
Status: RESOLVED WONTFIX (opened 13 years ago, closed 12 years ago)
Component: Core :: Memory Allocator (defect)
Reporter: justin.lebar+bug; Assignee: Unassigned
Whiteboard: [MemShrink:P1]
Attachments (7 files, 4 obsolete):
  4.65 KB, patch
  73.22 KB, text/plain
  76.82 KB, text/plain
  104.70 KB, image/png
  11.29 KB, patch
  13.33 KB, patch
  842 bytes, patch (justin.lebar+bug: feedback-)
Our heap gets fragmented after you close a bunch of tabs.
See http://areweslimyet.com/ -- the red line is roughly twice the light blue/green lines; this is due to heap fragmentation. We don't really understand where this is coming from or what we can do about it.
Comment 1 • 13 years ago (Reporter)
Some instrumentation which lets you see which size classes are responsible for wasted space.
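For readers without the patch in front of them, the idea is roughly: walk the allocator's small-allocation bins and report, per size class, the committed run space minus the space actually handed out. A conceptual sketch of that calculation is below; the field names are made up for illustration and the real patch reads jemalloc's internal arena/bin structures, which look different.

    /* Conceptual sketch only: hypothetical fields, not jemalloc's real structs. */
    struct bin_stats {
        size_t reg_size;      /* size class, e.g. 1024 bytes              */
        size_t regs_per_run;  /* regions that fit in one run              */
        size_t num_runs;      /* runs currently committed for this bin    */
        size_t allocated;     /* regions currently handed out             */
    };

    /* Wasted space for one size class: committed-but-unused region space. */
    static size_t
    bin_wasted_bytes(const struct bin_stats *b)
    {
        size_t capacity = b->num_runs * b->regs_per_run;
        return (capacity - b->allocated) * b->reg_size;
    }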
Comment 2 • 13 years ago (Reporter)
Here's the output from a short browsing session with this patch applied.
The table is (size class in bytes, wasted space in MB).
8 0.07
16 0.41
32 1.37
48 1.27
64 1.45
80 0.77
96 0.42
112 0.78
128 1.22
144 0.54
160 0.40
176 0.30
192 0.66
208 0.38
224 0.39
240 0.18
256 1.38
272 0.38
288 0.16
304 0.35
320 0.19
336 0.17
352 0.31
368 0.15
384 0.24
400 0.29
416 0.11
432 0.13
448 0.20
464 0.37
480 0.09
496 0.29
512 0.92
1024 4.87
2048 4.32
Total 25.54
Comment 3 • 13 years ago (Reporter)
Attachment #615535 - Attachment is obsolete: true
Comment 4 • 13 years ago (Reporter)
Comment 5 • 13 years ago
Comment 2 ordered by waste amount:
1024 4.87
2048 4.32
64 1.45
256 1.38
32 1.37
48 1.27
128 1.22
512 0.92
112 0.78
80 0.77
192 0.66
144 0.54
96 0.42
16 0.41
160 0.40
224 0.39
272 0.38
208 0.38
464 0.37
304 0.35
352 0.31
176 0.30
496 0.29
400 0.29
384 0.24
448 0.20
320 0.19
240 0.18
336 0.17
288 0.16
368 0.15
432 0.13
416 0.11
480 0.09
8 0.07
Total 25.54
Updated • 13 years ago
Blocks: MatchStartupMem
Whiteboard: [MemShrink] → [MemShrink:P1]
Comment 9 • 13 years ago
How about investigating this with the new jemalloc?
Comment 10 • 13 years ago (Reporter)
(In reply to Mike Hommey [:glandium] from comment #9)
> How about investigating this with the new jemalloc?
Yeah, at the moment, I think we can get somewhere by looking just at the allocation sites. But if I really dig into the allocator's behavior, it would be worthwhile to use the new one, for sure.
Updated • 13 years ago (Reporter)
Attachment #615559 - Attachment is obsolete: true
Comment 11 • 13 years ago (Reporter)
Mostly SQLite.
Comment 12 • 13 years ago (Reporter)
Mostly HTML5 parser
Comment 13 • 13 years ago
I wonder: if these 1024- and 2048-byte allocations contribute to fragmentation when we close lots of tabs, that means they are long-lived, and making them bigger is likely to increase memory footprint.
Comment 14 • 13 years ago (Reporter)
(In reply to Mike Hommey [:glandium] from comment #13)
> I wonder. If these 1024 and 2048 allocations contribute to fragmentation,
> when we close lots of tabs, it means they are long-lived. Making them bigger
> is likely to increase memory footprint.
Indeed it would, unless we usually allocate N 2048-byte chunks, and we'd be switching to N/2 4096-byte chunks. For example, the NSS PL_Arenas are, it seems, usually larger than 2048 bytes, so increasing the chunk size there shouldn't have much of an impact on memory usage. I don't know about SQLite or the HTML5 parser.
Alternatively (but less likely), we could reduce fragmentation by changing the size of some short- or medium-lived chunks which get allocated in-between long-lived chunks, spreading the long-lived chunks out.
Comment 15 • 13 years ago
> For example, the NSS PL_Arenas are, it
> seems, usually larger than 2048 bytes, so increasing the chunk size there
> shouldn't have much of an impact on memory usage.
I did some instrumentation of them and saw that they often were not larger than 2048 bytes :/
Comment 16 • 13 years ago (Reporter)
So it just allocates a bunch of separate arenas?
Comment 17 • 13 years ago
Yes, it seemed to. Enough so that I stopped looking at it closely.
Comment 18 • 13 years ago (Reporter)
The units of the lifetime field are "number of X-byte malloc's" -- that is, if a malloc has lifetime 10, that means that the allocation survived 10 X-byte malloc's before being free'd.
Lifetime inf means the object was never free'd (I believe that when I ran the browser to collect this data, I killed it after GC/CC'ing, rather than shutting down nicely). We exclude |inf|'s when calculating the mean lifetime.
I'm not handling realloc, which may be throwing this data off. But I verified a few points by hand, so I'm reasonably confident that the numbers are meaningful, modulo that.
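For reference, the lifetime measurement described above can be sketched roughly as follows; this is a simplified illustration, not the actual instrumentation patch, and in the real thing the counter is kept per size class inside the allocator.

    /* Simplified illustration of the lifetime measurement described above. */
    static size_t alloc_counter;      /* increments on every tracked malloc */

    struct tracked_alloc {
        size_t birth;                 /* alloc_counter value at malloc time */
    };

    /* At malloc: stamp the allocation with the current counter value. */
    void on_malloc(struct tracked_alloc *t) {
        t->birth = ++alloc_counter;
    }

    /* At free: lifetime = number of tracked mallocs that happened in between.
     * Allocations never freed show up as "inf" and are excluded from the mean. */
    size_t on_free(const struct tracked_alloc *t) {
        return alloc_counter - t->birth;
    }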
Attachment #616044 - Attachment is obsolete: true
Comment 19 • 13 years ago (Reporter)
Attachment #616045 - Attachment is obsolete: true
Comment 20 • 13 years ago
I tried changing jemalloc so that any allocation request in the range 513..4095 bytes was rounded up to 4096. (Instead of the usual 1024, 2048 or 4096.) And I did an AWSY run.
The idea was that we'll use some more memory, but suffer less fragmentation. The results weren't very good -- memory consumption went up significantly, except for the final "measure after closing all tabs" measurement which was flat. (I've attached a screenshot.) So, the fragmentation improved from a relative point of view, but the cure is worse than the disease.
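The change being described amounts to replacing the usual rounding of sub-page requests with a blanket round-up to a page. A rough sketch of the rule is below; it is illustrative only, not the real jemalloc diff, and the size names are simplified.

    /* Illustration of the experiment described above, not the real jemalloc
     * code.  Normally a request in this range is rounded up to the next
     * size class (1024, 2048, or 4096); the experiment rounds everything in
     * 513..4095 straight up to a full page. */
    #define PAGE_SIZE 4096

    static size_t experimental_size_class(size_t request)
    {
        if (request >= 513 && request <= 4095)
            return PAGE_SIZE;        /* was: next power-of-two size class */
        return request;              /* other sizes handled as before */
    }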
Comment 21 • 13 years ago
This patch converts all the NSS arenas that use 2KB chunks to use 4KB chunks. It does likewise with nsPersistentProperties.
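For context, the kind of change involved is small: with NSPR's PLArenaPool the chunk size is just an argument to PL_InitArenaPool. The snippet below is illustrative only (hypothetical call site and arena name); the actual patch edits the existing NSS call sites and nsPersistentProperties.

    #include "plarena.h"   /* NSPR arena API */

    /* Illustrative call site, not from the real patch. */
    static void init_example_pool(PLArenaPool *pool)
    {
        /* Previously these pools requested 2 KB chunks, which land in the
         * fragmentation-prone 2048-byte size class; requesting 4 KB chunks
         * means each chunk is backed by whole pages instead. */
        PL_InitArenaPool(pool, "example-arena", 4096, sizeof(double));
    }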
Comment 22 • 13 years ago
This patch converts all the NSS arenas that use 1KB chunks to use 4KB chunks.
Comment 23 • 13 years ago (Reporter)
I was pretty wrong about how jemalloc handles 512b, 1kb, and 2kb allocations. Here's my updated understanding, with the proviso that I reserve the right to re-understand this again later.
We allocate 512-2kb allocs out of "runs" of size 8 pages (32kb). One run contains allocations from exactly one size class (i.e., 512b, 1kb, or 2kb).
There's some bookkeeping at the beginning of the run. Because the runs are 8 pages, not 8 pages plus epsilon, the bookkeeping takes up the space we might otherwise use to store one object. So our 32kb run can store only 15 2kb allocs, 31 1kb allocs, or 63 512b allocs.
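The arithmetic behind those capacities, assuming 4 KiB pages and that the run's bookkeeping header costs one region slot as described above:

    /* Back-of-envelope check of the capacities quoted above (illustrative;
     * the real header size depends on the run's region bitmap, but it fits
     * within one region for these size classes). */
    #include <stdio.h>

    int main(void)
    {
        const size_t run_bytes = 8 * 4096;               /* 8 pages = 32 KiB */
        const size_t classes[] = { 512, 1024, 2048 };

        for (size_t i = 0; i < 3; i++) {
            size_t regions = run_bytes / classes[i] - 1; /* minus 1 for header */
            printf("%zu-byte class: %zu regions per run\n", classes[i], regions);
        }
        return 0;                                        /* prints 63, 31, 15 */
    }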
Afaict, we never madvise/decommit part of a run.
I don't see a technical limitation against madvising/decommitting part of a run. It might be slow, both to madvise/decommit and to page-fault in / recommit.
Comment 24 • 13 years ago (Reporter)
John, would you mind running this patch through AWSY?
Attachment #616889 - Flags: feedback?(jschoenick)
Updated • 13 years ago (Reporter)
Attachment #616889 - Attachment is patch: true
Comment 25 • 13 years ago (Reporter)
Comment on attachment 616889 [details] [diff] [review]
Test patch 3: Consider 1k and 2k allocations as "large"
Actually, I have no idea how my browser even stood up with this patch. There's no way it'll work. The smallest allowable "large" allocation is 1 page.
Attachment #616889 - Flags: feedback?(jschoenick) → feedback-
Comment 26 • 12 years ago
I talked with jlebar; this bug is serving no useful purpose at this point.
Updated • 12 years ago
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WONTFIX