Bug 746009 - Investigate sources of and solutions to jemalloc heap fragmentation
Status: RESOLVED WONTFIX
Whiteboard: [MemShrink:P1]
Product: Core
Classification: Components
Component: Memory Allocator
Version: 14 Branch
Hardware: All All
Importance: -- normal with 4 votes
Assigned To: Nobody; OK to take it and work on it
Duplicates: 636220 637449 676007
Depends on: 688979 746503 746501 748440
Blocks: MatchStartupMem
Reported: 2012-04-16 16:40 PDT by Justin Lebar (not reading bugmail)
Modified: 2013-01-24 18:46 PST
CC: 18 users


Attachments
Force jemalloc print stats on, and include "waste" in the bin stats. (2.34 KB, patch)
2012-04-16 16:43 PDT, Justin Lebar (not reading bugmail)
Print detailed malloc stats (and include "waste") every time jemalloc_stats are run. (v2) (4.65 KB, patch)
2012-04-16 17:29 PDT, Justin Lebar (not reading bugmail)
Print a stack trace on every allocation. (2.11 KB, patch)
2012-04-16 17:29 PDT, Justin Lebar (not reading bugmail)
1024-byte allocation sites (62.28 KB, text/plain)
2012-04-18 01:20 PDT, Justin Lebar (not reading bugmail)
2048-byte allocation sites (66.20 KB, text/plain)
2012-04-18 01:21 PDT, Justin Lebar (not reading bugmail)
1024-byte allocs (with lifetimes) (73.22 KB, text/plain)
2012-04-18 21:16 PDT, Justin Lebar (not reading bugmail)
2048-byte allocs (with lifetimes) (76.82 KB, text/plain)
2012-04-18 21:17 PDT, Justin Lebar (not reading bugmail)
AWSY (104.70 KB, image/png)
2012-04-18 23:22 PDT, Nicholas Nethercote [:njn]
test patch 1: convert 2KB arenas to 4KB (11.29 KB, patch)
2012-04-19 16:54 PDT, Nicholas Nethercote [:njn]
test patch 2: convert 1KB arenas to 4KB (13.33 KB, patch)
2012-04-19 16:54 PDT, Nicholas Nethercote [:njn]
Test patch 3: Consider 1k and 2k allocations as "large" (842 bytes, patch)
2012-04-20 00:13 PDT, Justin Lebar (not reading bugmail)
justin.lebar+bug: feedback-

Description Justin Lebar (not reading bugmail) 2012-04-16 16:40:23 PDT
Our heap gets fragmented after you close a bunch of tabs.

See http://areweslimyet.com/ -- the red line is roughly twice the light blue/green lines; this is due to heap fragmentation.  We don't really understand where this is coming from or what we can do about it.
Comment 1 Justin Lebar (not reading bugmail) 2012-04-16 16:43:03 PDT
Created attachment 615535 [details] [diff] [review]
Force jemalloc print stats on, and include "waste" in the bin stats.

Some instrumentation which lets you see which size classes are responsible for wasted space.
Comment 2 Justin Lebar (not reading bugmail) 2012-04-16 16:44:00 PDT
Here's the output from a short browsing session with this patch applied.

The table is (size class, wasted space [mb]).

8	0.07
16	0.41
32	1.37
48	1.27
64	1.45
80	0.77
96	0.42
112	0.78
128	1.22
144	0.54
160	0.40
176	0.30
192	0.66
208	0.38
224	0.39
240	0.18
256	1.38
272	0.38
288	0.16
304	0.35
320	0.19
336	0.17
352	0.31
368	0.15
384	0.24
400	0.29
416	0.11
432	0.13
448	0.20
464	0.37
480	0.09
496	0.29
512	0.92
1024	4.87
2048	4.32
Total  25.54
Comment 3 Justin Lebar (not reading bugmail) 2012-04-16 17:29:33 PDT
Created attachment 615558 [details] [diff] [review]
Print detailed malloc stats (and include "waste") every time jemalloc_stats are run. (v2)
Comment 4 Justin Lebar (not reading bugmail) 2012-04-16 17:29:49 PDT
Created attachment 615559 [details] [diff] [review]
Print a stack trace on every allocation.
Comment 5 Nicholas Nethercote [:njn] 2012-04-16 22:53:28 PDT
Comment 2 ordered by waste amount:

1024    4.87
2048    4.32
64      1.45
256     1.38
32      1.37
48      1.27
128     1.22
512     0.92
112     0.78
80      0.77
192     0.66
144     0.54
96      0.42
16      0.41
160     0.40
224     0.39
272     0.38
208     0.38
464     0.37
304     0.35
352     0.31
176     0.30
496     0.29
400     0.29
384     0.24
448     0.20
320     0.19
240     0.18
336     0.17
288     0.16
368     0.15
432     0.13
416     0.11
480     0.09
8       0.07

Total  25.54
Comment 6 Justin Lebar (not reading bugmail) 2012-04-17 17:23:34 PDT
*** Bug 636220 has been marked as a duplicate of this bug. ***
Comment 7 Justin Lebar (not reading bugmail) 2012-04-17 17:25:44 PDT
*** Bug 637449 has been marked as a duplicate of this bug. ***
Comment 8 Justin Lebar (not reading bugmail) 2012-04-17 17:59:23 PDT
*** Bug 676007 has been marked as a duplicate of this bug. ***
Comment 9 Mike Hommey [:glandium] 2012-04-17 22:45:51 PDT
How about investigating this with the new jemalloc?
Comment 10 Justin Lebar (not reading bugmail) 2012-04-18 01:19:01 PDT
(In reply to Mike Hommey [:glandium] from comment #9)
> How about investigating this with the new jemalloc?

Yeah, at the moment, I think we can get somewhere by looking just at the allocation sites.  But if I really dig into the allocator's behavior, it would be worthwhile to use the new one, for sure.
Comment 11 Justin Lebar (not reading bugmail) 2012-04-18 01:20:35 PDT
Created attachment 616044 [details]
1024-byte allocation sites

Mostly SQLite.
Comment 12 Justin Lebar (not reading bugmail) 2012-04-18 01:21:12 PDT
Created attachment 616045 [details]
2048-byte allocation sites

Mostly HTML5 parser
Comment 13 Mike Hommey [:glandium] 2012-04-18 01:57:26 PDT
I wonder. If these 1024- and 2048-byte allocations contribute to fragmentation when we close lots of tabs, it means they are long-lived. Making them bigger is likely to increase memory footprint.
Comment 14 Justin Lebar (not reading bugmail) 2012-04-18 06:45:14 PDT
(In reply to Mike Hommey [:glandium] from comment #13)
> I wonder. If these 1024 and 2048 allocations contribute to fragmentation,
> when we close lots of tabs, it means they are long-lived. Making them bigger
> is likely to increase memory footprint.

Indeed it would, unless we usually allocate N 2048-byte chunks, and we'd be switching to N/2 4096-byte chunks.  For example, the NSS PL_Arenas are, it seems, usually larger than 2048 bytes, so increasing the chunk size there shouldn't have much of an impact on memory usage.  I don't know about SQLite or the HTML5 parser.

Alternatively (but less likely), we could reduce fragmentation by changing the size of some short- or medium-lived chunks which get allocated in-between long-lived chunks, spreading the long-lived chunks out.
Comment 15 Nicholas Nethercote [:njn] 2012-04-18 17:04:49 PDT
> For example, the NSS PL_Arenas are, it
> seems, usually larger than 2048 bytes, so increasing the chunk size there
> shouldn't have much of an impact on memory usage.

I did some instrumentation of them and saw that they often were not larger than 2048 bytes :/
Comment 16 Justin Lebar (not reading bugmail) 2012-04-18 17:05:40 PDT
So it just allocates a bunch of separate arenas?
Comment 17 Nicholas Nethercote [:njn] 2012-04-18 17:13:29 PDT
Yes, it seemed to.  Enough so that I stopped looking at it closely.
Comment 18 Justin Lebar (not reading bugmail) 2012-04-18 21:16:43 PDT
Created attachment 616446 [details]
1024-byte allocs (with lifetimes)

The units of the lifetime field are "number of X-byte malloc's" -- that is, if a malloc has lifetime 10, that means that the allocation survived 10 X-byte malloc's before being free'd.

Lifetime inf means the object was never free'd (I believe that when I ran the browser to collect this data, I killed it after GC/CC'ing, rather than shutting down nicely).  We exclude |inf|'s when calculating the mean lifetime.

I'm not handling realloc, which may be throwing this data off.  But I verified a few points by hand, so I'm reasonably confident that the numbers are meaningful, modulo that.
Comment 19 Justin Lebar (not reading bugmail) 2012-04-18 21:17:22 PDT
Created attachment 616447 [details]
2048-byte allocs (with lifetimes)
Comment 20 Nicholas Nethercote [:njn] 2012-04-18 23:22:43 PDT
Created attachment 616468 [details]
AWSY

I tried changing jemalloc so that any allocation request in the range 513..4095 bytes was rounded up to 4096.  (Instead of the usual 1024, 2048 or 4096.)  And I did an AWSY run.

The idea was that we'll use some more memory, but suffer less fragmentation.  The results weren't very good -- memory consumption went up significantly, except for the final "measure after closing all tabs" measurement which was flat.  (I've attached a screenshot.)  So, the fragmentation improved from a relative point of view, but the cure is worse than the disease.
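The experiment above amounts to a change in the size-class rounding; here is a simplified sketch of the idea (my reconstruction, not the actual modification, and it ignores the small sub-512-byte bins): sub-page requests normally round up to the next power-of-two class, and the experiment sent everything in 513..4095 straight to 4096.

```c
#include <assert.h>
#include <stddef.h>

/* Round up to the next power of two. */
static size_t next_pow2(size_t n) {
    size_t s = 1;
    while (s < n) s <<= 1;
    return s;
}

/* Simplified normal behavior: 513..4095 rounds to 1024, 2048 or 4096
 * (real jemalloc has many more classes below 512; elided here). */
static size_t size_class_normal(size_t req) {
    return req <= 512 ? 512 : next_pow2(req);
}

/* The experiment from the AWSY run above: 513..4095 all become 4096. */
static size_t size_class_experiment(size_t req) {
    if (req > 512 && req < 4096)
        return 4096;
    return size_class_normal(req);
}
```

Rounding a 1000-byte request to 4096 instead of 1024 wastes up to 3 KB per allocation up front, which matches the observed result: overall memory consumption went up even though relative fragmentation improved.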
Comment 21 Nicholas Nethercote [:njn] 2012-04-19 16:54:06 PDT
Created attachment 616807 [details] [diff] [review]
test patch 1: convert 2KB arenas to 4KB

This patch converts all the NSS arenas that use 2KB chunks to use 4KB chunks.  It does likewise with nsPersistentProperties.
Comment 22 Nicholas Nethercote [:njn] 2012-04-19 16:54:52 PDT
Created attachment 616808 [details] [diff] [review]
test patch 2: convert 1KB arenas to 4KB

This patch converts all the NSS arenas that use 1KB chunks to use 4KB chunks.
Comment 23 Justin Lebar (not reading bugmail) 2012-04-19 23:38:09 PDT
I was pretty wrong about how jemalloc handles 512b, 1kb, and 2kb allocations.  Here's my updated understanding, with the proviso that I reserve the right to re-understand this again later.

We allocate 512-2kb allocs out of "runs" of size 8 pages (32kb).  One run contains allocations from exactly one size class (i.e., 512b, 1kb, or 2kb).

There's some bookkeeping at the beginning of the run.  Because the runs are 8 pages, not 8 pages plus epsilon, the bookkeeping takes up the space we might otherwise use to store one object.  So our 32kb run can store only 15 2kb allocs, 31 1kb allocs, or 63 512b allocs.

Afaict, we never madvise/decommit part of a run.

I don't see a technical limitation against madvising/decommitting part of a run.  It might be slow, both to madvise/decommit and to page-fault in / recommit.
Comment 24 Justin Lebar (not reading bugmail) 2012-04-20 00:13:23 PDT
Created attachment 616889 [details] [diff] [review]
Test patch 3: Consider 1k and 2k allocations as "large"

John, would you mind running this patch through AWSY?
Comment 25 Justin Lebar (not reading bugmail) 2012-04-20 00:24:35 PDT
Comment on attachment 616889 [details] [diff] [review]
Test patch 3: Consider 1k and 2k allocations as "large"

Actually, I have no idea how my browser even stood up with this patch.  There's no way it'll work.  The smallest allowable "large" allocation is 1 page.
Comment 26 Nicholas Nethercote [:njn] 2013-01-24 17:01:19 PST
I talked with jlebar;  this bug is serving no useful purpose at this point.
