Closed Bug 684180 Opened 14 years ago Closed 14 years ago

Avoid wasted space caused by AssemblerBuffer's growth strategy

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla9

People

(Reporter: n.nethercote, Assigned: n.nethercote)

Details

(Whiteboard: [inbound])

Attachments

(1 file)

patch 14 years ago Nicholas Nethercote [inactive] 1.96 KB, patch	dvander : review+	Details \| Diff \| Splinter Review

Nicholas Nethercote [inactive]

Assignee

Description

•

14 years ago

Attached patch patch — Details — Splinter Review

AssemblerBuffer starts with an inline buffer of 256 bytes. If that's not enough, it heap allocates progressively larger buffers using the following growth formula: int newCapacity = m_capacity + m_capacity / 2 + extraCapacity; In practice, extraCapacity is always zero so we're multiplying by 1.5 each time. But heap allocators work much better with requests that are powers-of-two; multiplying by 1.5 gives request sizes that get rounded up a lot, leading to wasted memory. In particular, any request in the range 513..8191 bytes gets rounded up by jemalloc to the nearest power of two. (Beyond that, jemalloc rounds to the next 4KB until we get to 1MB, then it rounds to the nearest MB.) I instrumented AssemblerBuffer::grow() to print out on every grow() call a line like this: m_capacity * 1.5 + extraCapacity == newCapacity --> usable_size (waste) Firing up the browser and opening gmail, I get this frequency of lines: 21652 lines: ( 1) 4775 (22.1%, 22.1%): 256 * 1.5 + 0 == 384 --> 384 (0) ( 2) 3821 (17.6%, 39.7%): 384 * 1.5 + 0 == 576 --> 1024 (448) ( 3) 3419 (15.8%, 55.5%): 576 * 1.5 + 0 == 864 --> 1024 (160) ( 4) 2896 (13.4%, 68.9%): 864 * 1.5 + 0 == 1296 --> 2048 (752) ( 5) 2275 (10.5%, 79.4%): 1296 * 1.5 + 0 == 1944 --> 2048 (104) ( 6) 1621 ( 7.5%, 86.9%): 1944 * 1.5 + 0 == 2916 --> 4096 (1180) ( 7) 1114 ( 5.1%, 92.0%): 2916 * 1.5 + 0 == 4374 --> 8192 (3818) ( 8) 758 ( 3.5%, 95.5%): 4374 * 1.5 + 0 == 6561 --> 8192 (1631) ( 9) 475 ( 2.2%, 97.7%): 6561 * 1.5 + 0 == 9841 --> 12288 (2447) (10) 262 ( 1.2%, 98.9%): 9841 * 1.5 + 0 == 14761 --> 16384 (1623) (11) 138 ( 0.6%, 99.5%): 14761 * 1.5 + 0 == 22141 --> 24576 (2435) (12) 62 ( 0.3%, 99.8%): 22141 * 1.5 + 0 == 33211 --> 36864 (3653) (13) 18 ( 0.1%, 99.9%): 33211 * 1.5 + 0 == 49816 --> 53248 (3432) (14) 9 ( 0.0%,100.0%): 49816 * 1.5 + 0 == 74724 --> 77824 (3100) (15) 3 ( 0.0%,100.0%): 74724 * 1.5 + 0 == 112086 --> 114688 (2602) (16) 3 ( 0.0%,100.0%): 112086 * 1.5 + 0 == 168129 --> 172032 (3903) (17) 2 ( 0.0%,100.0%): 168129 * 1.5 + 0 == 252193 --> 253952 (1759) (18) 1 ( 0.0%,100.0%): 252193 * 1.5 + 0 == 378289 --> 380928 (2639) The most common case, 256->384, results in no waste, but all the others give lots of waste. The cumulative waste total is 14,340,985 bytes. I think AssemblerBuffer is fairly short-lived so not many of these bytes are live at any one time but it's still worth fixing this. If we just change the growth factor from 1.5 to 2, we get this: 13417 lines: ( 1) 4645 (34.6%, 34.6%): 256 * 2 + 0 == 512 --> 512 (0) ( 2) 3424 (25.5%, 60.1%): 512 * 2 + 0 == 1024 --> 1024 (0) ( 3) 2562 (19.1%, 79.2%): 1024 * 2 + 0 == 2048 --> 2048 (0) ( 4) 1517 (11.3%, 90.5%): 2048 * 2 + 0 == 4096 --> 4096 (0) ( 5) 784 ( 5.8%, 96.4%): 4096 * 2 + 0 == 8192 --> 8192 (0) ( 6) 351 ( 2.6%, 99.0%): 8192 * 2 + 0 == 16384 --> 16384 (0) ( 7) 107 ( 0.8%, 99.8%): 16384 * 2 + 0 == 32768 --> 32768 (0) ( 8) 18 ( 0.1%, 99.9%): 32768 * 2 + 0 == 65536 --> 65536 (0) ( 9) 6 ( 0.0%,100.0%): 65536 * 2 + 0 == 131072 --> 131072 (0) (10) 2 ( 0.0%,100.0%): 131072 * 2 + 0 == 262144 --> 262144 (0) (11) 1 ( 0.0%,100.0%): 262144 * 2 + 0 == 524288 --> 524288 (0) Zero waste, and roughly 40% fewer malloc/realloc calls because of (a) less waste, and (b) faster growth rate. The attached patch makes this change to the growth rate. It also removes append(), which is unused, and this allowed me to remove extraCapacity, thus guaranteeing that the allocation amounts will always be powers of two. On Sunspider the patch gives a small but clear reduction in instruction counts (measured on a 64-bit shell with '-j -m -n -p'), so it'll definitely help on the much larger codebases that we compile on real sites like gmail: --------------------------------------------------------------- | millions of instructions executed | | total | compiled (may overestimate) | --------------------------------------------------------------- | 53.397 53.295 (1.002x) | 19.438 19.438 (------) | 3d-cube | 43.655 43.642 (------) | 26.796 26.795 (------) | 3d-morph | 57.449 57.393 (1.001x) | 24.341 24.341 (------) | 3d-raytrace | 15.170 15.156 (1.001x) | 8.104 8.104 (------) | access-binary- | 50.813 50.795 (------) | 43.132 43.132 (------) | access-fannkuc | 19.241 19.227 (1.001x) | 11.912 11.912 (------) | access-nbody | 18.924 18.921 (------) | 14.900 14.900 (------) | access-nsieve | 8.540 8.537 (------) | 5.335 5.335 (------) | bitops-3bit-bi | 29.286 29.282 (------) | 26.222 26.222 (------) | bitops-bits-in | 24.985 24.985 (------) | 22.200 22.200 (------) | bitops-bitwise | 29.055 29.053 (------) | 25.415 25.415 (------) | bitops-nsieve- | 15.460 15.449 (1.001x) | 11.625 11.625 (------) | controlflow-re | 26.264 26.234 (1.001x) | 7.856 7.856 (------) | crypto-md5 | 19.451 19.428 (1.001x) | 8.383 8.383 (------) | crypto-sha1 | 61.176 61.182 (------) | 13.092 13.092 (------) | date-format-to | 61.241 61.229 (------) | 9.534 9.534 (------) | date-format-xp | 24.562 24.561 (------) | 20.721 20.721 (------) | math-cordic | 34.751 34.746 (------) | 12.459 12.459 (------) | math-partial-s | 14.987 14.979 (1.001x) | 9.372 9.372 (------) | math-spectral- | 45.492 45.494 (------) | 34.486 34.486 (------) | regexp-dna | 36.756 36.739 (------) | 10.449 10.449 (------) | string-base64 | 41.397 41.371 (1.001x) | 24.435 24.435 (------) | string-fasta | 92.143 92.128 (------) | 17.511 17.511 (------) | string-tagclou | 113.711 113.473 (1.002x) | 12.147 12.147 (------) | string-unpack- | 39.128 39.119 (------) | 10.305 10.305 (------) | string-validat ------- | 977.046 976.427 (1.001x) | 430.181 430.180 (------) | all

Attachment #557774 - Flags: review?(dvander)

Nicholas Nethercote [inactive]

Assignee

Comment 1

•

14 years ago

> The cumulative waste total is 14,340,985 bytes. And the cumulative non-waste total is 46,685,959 bytes. So the waste proportion is 23.5%.

David Anderson [:dvander] - inactive, e-mail if emergency

Updated

•

14 years ago

Attachment #557774 - Flags: review?(dvander) → review+

Nicholas Nethercote [inactive]

Assignee

Comment 2

•

14 years ago

It turns out that append() is used in AssemblerBufferWithConstantPool.h, so I didn't remove it. http://hg.mozilla.org/integration/mozilla-inbound/rev/f31743731039

Whiteboard: [inbound]

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 3

•

14 years ago

http://hg.mozilla.org/mozilla-central/rev/f31743731039

Status: ASSIGNED → RESOLVED

Closed: 14 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla9

You need to log in before you can comment on or make changes to this bug.

Bugzilla

Avoid wasted space caused by AssemblerBuffer's growth strategy

Categories

(Core :: JavaScript Engine, defect)

Tracking

()

People

(Reporter: n.nethercote, Assigned: n.nethercote)

References

Details

(Whiteboard: [inbound])

Crash Data

Security

(public)

User Story

Attachments

(1 file)

Description

Comment 1

Updated

Comment 2

Comment 3

Attachment

General

Description

File Name

Content Type