Bug 746009 - Investigate sources of and solutions to jemalloc heap fragmentation
Status: RESOLVED WONTFIX
Whiteboard: [MemShrink:P1]
Product: Core
Classification: Components
Component: Memory Allocator
Version: 14 Branch
Hardware: All All
Importance: -- normal with 4 votes
Assigned To: Nobody; OK to take it and work on it
Duplicates: 636220 637449 676007
Depends on: 688979 746503 746501 748440
Blocks: MatchStartupMem
Reported: 2012-04-16 16:40 PDT by Justin Lebar (not reading bugmail)
Modified: 2013-01-24 18:46 PST
CC: 18 users


Attachments
Force jemalloc print stats on, and include "waste" in the bin stats. (2.34 KB, patch)
2012-04-16 16:43 PDT, Justin Lebar (not reading bugmail)
Print detailed malloc stats (and include "waste") every time jemalloc_stats are run. (v2) (4.65 KB, patch)
2012-04-16 17:29 PDT, Justin Lebar (not reading bugmail)
Print a stack trace on every allocation. (2.11 KB, patch)
2012-04-16 17:29 PDT, Justin Lebar (not reading bugmail)
1024-byte allocation sites (62.28 KB, text/plain)
2012-04-18 01:20 PDT, Justin Lebar (not reading bugmail)
2048-byte allocation sites (66.20 KB, text/plain)
2012-04-18 01:21 PDT, Justin Lebar (not reading bugmail)
1024-byte allocs (with lifetimes) (73.22 KB, text/plain)
2012-04-18 21:16 PDT, Justin Lebar (not reading bugmail)
2048-byte allocs (with lifetimes) (76.82 KB, text/plain)
2012-04-18 21:17 PDT, Justin Lebar (not reading bugmail)
AWSY (104.70 KB, image/png)
2012-04-18 23:22 PDT, Nicholas Nethercote [:njn]
test patch 1: convert 2KB arenas to 4KB (11.29 KB, patch)
2012-04-19 16:54 PDT, Nicholas Nethercote [:njn]
test patch 2: convert 1KB arenas to 4KB (13.33 KB, patch)
2012-04-19 16:54 PDT, Nicholas Nethercote [:njn]
Test patch 3: Consider 1k and 2k allocations as "large" (842 bytes, patch)
2012-04-20 00:13 PDT, Justin Lebar (not reading bugmail)
justin.lebar+bug: feedback-

Description Justin Lebar (not reading bugmail) 2012-04-16 16:40:23 PDT
Our heap gets fragmented after you close a bunch of tabs.

See http://areweslimyet.com/ -- the red line is roughly twice the light blue/green lines; this is due to heap fragmentation.  We don't really understand where this is coming from or what we can do about it.
Comment 1 Justin Lebar (not reading bugmail) 2012-04-16 16:43:03 PDT
Created attachment 615535 [details] [diff] [review]
Force jemalloc print stats on, and include "waste" in the bin stats.

Some instrumentation which lets you see which size classes are responsible for wasted space.
Comment 2 Justin Lebar (not reading bugmail) 2012-04-16 16:44:00 PDT
Here's the output from a short browsing session with this patch applied.

The table is (size class, wasted space [mb]).

8	0.07
16	0.41
32	1.37
48	1.27
64	1.45
80	0.77
96	0.42
112	0.78
128	1.22
144	0.54
160	0.40
176	0.30
192	0.66
208	0.38
224	0.39
240	0.18
256	1.38
272	0.38
288	0.16
304	0.35
320	0.19
336	0.17
352	0.31
368	0.15
384	0.24
400	0.29
416	0.11
432	0.13
448	0.20
464	0.37
480	0.09
496	0.29
512	0.92
1024	4.87
2048	4.32
Total  25.54
Comment 3 Justin Lebar (not reading bugmail) 2012-04-16 17:29:33 PDT
Created attachment 615558 [details] [diff] [review]
Print detailed malloc stats (and include "waste") every time jemalloc_stats are run. (v2)
Comment 4 Justin Lebar (not reading bugmail) 2012-04-16 17:29:49 PDT
Created attachment 615559 [details] [diff] [review]
Print a stack trace on every allocation.
Comment 5 Nicholas Nethercote [:njn] 2012-04-16 22:53:28 PDT
Comment 2 ordered by waste amount:

1024    4.87
2048    4.32
64      1.45
256     1.38
32      1.37
48      1.27
128     1.22
512     0.92
112     0.78
80      0.77
192     0.66
144     0.54
96      0.42
16      0.41
160     0.40
224     0.39
272     0.38
208     0.38
464     0.37
304     0.35
352     0.31
176     0.30
496     0.29
400     0.29
384     0.24
448     0.20
320     0.19
240     0.18
336     0.17
288     0.16
368     0.15
432     0.13
416     0.11
480     0.09
8       0.07

Total  25.54
Comment 6 Justin Lebar (not reading bugmail) 2012-04-17 17:23:34 PDT
*** Bug 636220 has been marked as a duplicate of this bug. ***
Comment 7 Justin Lebar (not reading bugmail) 2012-04-17 17:25:44 PDT
*** Bug 637449 has been marked as a duplicate of this bug. ***
Comment 8 Justin Lebar (not reading bugmail) 2012-04-17 17:59:23 PDT
*** Bug 676007 has been marked as a duplicate of this bug. ***
Comment 9 Mike Hommey [:glandium] 2012-04-17 22:45:51 PDT
How about investigating this with the new jemalloc?
Comment 10 Justin Lebar (not reading bugmail) 2012-04-18 01:19:01 PDT
(In reply to Mike Hommey [:glandium] from comment #9)
> How about investigating this with the new jemalloc?

Yeah, at the moment, I think we can get somewhere by looking just at the allocation sites.  But if I really dig into the allocator's behavior, it would be worthwhile to use the new one, for sure.
Comment 11 Justin Lebar (not reading bugmail) 2012-04-18 01:20:35 PDT
Created attachment 616044 [details]
1024-byte allocation sites

Mostly SQLite.
Comment 12 Justin Lebar (not reading bugmail) 2012-04-18 01:21:12 PDT
Created attachment 616045 [details]
2048-byte allocation sites

Mostly HTML5 parser
Comment 13 Mike Hommey [:glandium] 2012-04-18 01:57:26 PDT
I wonder. If these 1024- and 2048-byte allocations contribute to fragmentation when we close lots of tabs, it means they are long-lived. Making them bigger is likely to increase memory footprint.
Comment 14 Justin Lebar (not reading bugmail) 2012-04-18 06:45:14 PDT
(In reply to Mike Hommey [:glandium] from comment #13)
> I wonder. If these 1024 and 2048 allocations contribute to fragmentation,
> when we close lots of tabs, it means they are long-lived. Making them bigger
> is likely to increase memory footprint.

Indeed it would, unless we usually allocate N 2048-byte chunks, and we'd be switching to N/2 4096-byte chunks.  For example, the NSS PL_Arenas are, it seems, usually larger than 2048 bytes, so increasing the chunk size there shouldn't have much of an impact on memory usage.  I don't know about SQLite or the HTML5 parser.

Alternatively (but less likely), we could reduce fragmentation by changing the size of some short- or medium-lived chunks which get allocated in-between long-lived chunks, spreading the long-lived chunks out.
Comment 15 Nicholas Nethercote [:njn] 2012-04-18 17:04:49 PDT
> For example, the NSS PL_Arenas are, it
> seems, usually larger than 2048 bytes, so increasing the chunk size there
> shouldn't have much of an impact on memory usage.

I did some instrumentation of them and saw that they often were not larger than 2048 bytes :/
Comment 16 Justin Lebar (not reading bugmail) 2012-04-18 17:05:40 PDT
So it just allocates a bunch of separate arenas?
Comment 17 Nicholas Nethercote [:njn] 2012-04-18 17:13:29 PDT
Yes, it seemed to.  Enough so that I stopped looking at it closely.
Comment 18 Justin Lebar (not reading bugmail) 2012-04-18 21:16:43 PDT
Created attachment 616446 [details]
1024-byte allocs (with lifetimes)

The units of the lifetime field are "number of X-byte malloc's" -- that is, if a malloc has lifetime 10, that means that the allocation survived 10 X-byte malloc's before being free'd.

Lifetime inf means the object was never free'd (I believe that when I ran the browser to collect this data, I killed it after GC/CC'ing, rather than shutting down nicely).  We exclude |inf|'s when calculating the mean lifetime.

I'm not handling realloc, which may be throwing this data off.  But I verified a few points by hand, so I'm reasonably confident that the numbers are meaningful, modulo that.
Comment 19 Justin Lebar (not reading bugmail) 2012-04-18 21:17:22 PDT
Created attachment 616447 [details]
2048-byte allocs (with lifetimes)
Comment 20 Nicholas Nethercote [:njn] 2012-04-18 23:22:43 PDT
Created attachment 616468 [details]
AWSY

I tried changing jemalloc so that any allocation request in the range 513..4095 bytes was rounded up to 4096.  (Instead of the usual 1024, 2048 or 4096.)  And I did an AWSY run.

The idea was that we'll use some more memory, but suffer less fragmentation.  The results weren't very good -- memory consumption went up significantly, except for the final "measure after closing all tabs" measurement which was flat.  (I've attached a screenshot.)  So, the fragmentation improved from a relative point of view, but the cure is worse than the disease.
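The experiment above amounts to a change in the size-class rounding; here is a simplified sketch of the idea (my reconstruction, not the actual modification, and it ignores the small sub-512-byte bins): sub-page requests normally round up to the next power-of-two class, and the experiment sent everything in 513..4095 straight to 4096.

```c
#include <assert.h>
#include <stddef.h>

/* Round up to the next power of two. */
static size_t next_pow2(size_t n) {
    size_t s = 1;
    while (s < n) s <<= 1;
    return s;
}

/* Simplified normal behavior: 513..4095 rounds to 1024, 2048 or 4096
 * (real jemalloc has many more classes below 512; elided here). */
static size_t size_class_normal(size_t req) {
    return req <= 512 ? 512 : next_pow2(req);
}

/* The experiment from the AWSY run above: 513..4095 all become 4096. */
static size_t size_class_experiment(size_t req) {
    if (req > 512 && req < 4096)
        return 4096;
    return size_class_normal(req);
}
```

Rounding a 1000-byte request to 4096 instead of 1024 wastes up to 3 KB per allocation up front, which matches the observed result: overall memory consumption went up even though relative fragmentation improved.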
Comment 21 Nicholas Nethercote [:njn] 2012-04-19 16:54:06 PDT
Created attachment 616807 [details] [diff] [review]
test patch 1: convert 2KB arenas to 4KB

This patch converts all the NSS arenas that use 2KB chunks to use 4KB chunks.  It does likewise with nsPersistentProperties.
Comment 22 Nicholas Nethercote [:njn] 2012-04-19 16:54:52 PDT
Created attachment 616808 [details] [diff] [review]
test patch 2: convert 1KB arenas to 4KB

This patch converts all the NSS arenas that use 1KB chunks to use 4KB chunks.
Comment 23 Justin Lebar (not reading bugmail) 2012-04-19 23:38:09 PDT
I was pretty wrong about how jemalloc handles 512b, 1kb, and 2kb allocations.  Here's my updated understanding, with the proviso that I reserve the right to re-understand this again later.

We allocate 512-2kb allocs out of "runs" of size 8 pages (32kb).  One run contains allocations from exactly one size class (i.e., 512b, 1kb, or 2kb).

There's some bookkeeping at the beginning of the run.  Because the runs are 8 pages, not 8 pages plus epsilon, the bookkeeping takes up the space we might otherwise use to store one object.  So our 32kb run can store only 15 2kb allocs, 31 1kb allocs, or 63 512b allocs.

Afaict, we never madvise/decommit part of a run.

I don't see a technical limitation against madvising/decommitting part of a run.  It might be slow, both to madvise/decommit and to page-fault in / recommit.
Comment 24 Justin Lebar (not reading bugmail) 2012-04-20 00:13:23 PDT
Created attachment 616889 [details] [diff] [review]
Test patch 3: Consider 1k and 2k allocations as "large"

John, would you mind running this patch through AWSY?
Comment 25 Justin Lebar (not reading bugmail) 2012-04-20 00:24:35 PDT
Comment on attachment 616889 [details] [diff] [review]
Test patch 3: Consider 1k and 2k allocations as "large"

Actually, I have no idea how my browser even stood up with this patch.  There's no way it'll work.  The smallest allowable "large" allocation is 1 page.
Comment 26 Nicholas Nethercote [:njn] 2013-01-24 17:01:19 PST
I talked with jlebar;  this bug is serving no useful purpose at this point.
