Closed
Bug 684180
Opened 13 years ago
Closed 13 years ago
Avoid wasted space caused by AssemblerBuffer's growth strategy
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
mozilla9
People
(Reporter: n.nethercote, Assigned: n.nethercote)
Details
(Whiteboard: [inbound])
Attachments
(1 file)
1.96 KB,
patch
|
dvander
:
review+
|
Details | Diff | Splinter Review |
AssemblerBuffer starts with an inline buffer of 256 bytes. If that's not enough, it heap allocates progressively larger buffers using the following growth formula: int newCapacity = m_capacity + m_capacity / 2 + extraCapacity; In practice, extraCapacity is always zero so we're multiplying by 1.5 each time. But heap allocators work much better with requests that are powers-of-two; multiplying by 1.5 gives request sizes that get rounded up a lot, leading to wasted memory. In particular, any request in the range 513..8191 bytes gets rounded up by jemalloc to the nearest power of two. (Beyond that, jemalloc rounds to the next 4KB until we get to 1MB, then it rounds to the nearest MB.) I instrumented AssemblerBuffer::grow() to print out on every grow() call a line like this: m_capacity * 1.5 + extraCapacity == newCapacity --> usable_size (waste) Firing up the browser and opening gmail, I get this frequency of lines: 21652 lines: ( 1) 4775 (22.1%, 22.1%): 256 * 1.5 + 0 == 384 --> 384 (0) ( 2) 3821 (17.6%, 39.7%): 384 * 1.5 + 0 == 576 --> 1024 (448) ( 3) 3419 (15.8%, 55.5%): 576 * 1.5 + 0 == 864 --> 1024 (160) ( 4) 2896 (13.4%, 68.9%): 864 * 1.5 + 0 == 1296 --> 2048 (752) ( 5) 2275 (10.5%, 79.4%): 1296 * 1.5 + 0 == 1944 --> 2048 (104) ( 6) 1621 ( 7.5%, 86.9%): 1944 * 1.5 + 0 == 2916 --> 4096 (1180) ( 7) 1114 ( 5.1%, 92.0%): 2916 * 1.5 + 0 == 4374 --> 8192 (3818) ( 8) 758 ( 3.5%, 95.5%): 4374 * 1.5 + 0 == 6561 --> 8192 (1631) ( 9) 475 ( 2.2%, 97.7%): 6561 * 1.5 + 0 == 9841 --> 12288 (2447) (10) 262 ( 1.2%, 98.9%): 9841 * 1.5 + 0 == 14761 --> 16384 (1623) (11) 138 ( 0.6%, 99.5%): 14761 * 1.5 + 0 == 22141 --> 24576 (2435) (12) 62 ( 0.3%, 99.8%): 22141 * 1.5 + 0 == 33211 --> 36864 (3653) (13) 18 ( 0.1%, 99.9%): 33211 * 1.5 + 0 == 49816 --> 53248 (3432) (14) 9 ( 0.0%,100.0%): 49816 * 1.5 + 0 == 74724 --> 77824 (3100) (15) 3 ( 0.0%,100.0%): 74724 * 1.5 + 0 == 112086 --> 114688 (2602) (16) 3 ( 0.0%,100.0%): 112086 * 1.5 + 0 == 168129 --> 172032 (3903) (17) 2 ( 0.0%,100.0%): 168129 * 1.5 + 0 == 252193 --> 253952 (1759) (18) 1 ( 0.0%,100.0%): 252193 * 1.5 + 0 == 378289 --> 380928 (2639) The most common case, 256->384, results in no waste, but all the others give lots of waste. The cumulative waste total is 14,340,985 bytes. I think AssemblerBuffer is fairly short-lived so not many of these bytes are live at any one time but it's still worth fixing this. If we just change the growth factor from 1.5 to 2, we get this: 13417 lines: ( 1) 4645 (34.6%, 34.6%): 256 * 2 + 0 == 512 --> 512 (0) ( 2) 3424 (25.5%, 60.1%): 512 * 2 + 0 == 1024 --> 1024 (0) ( 3) 2562 (19.1%, 79.2%): 1024 * 2 + 0 == 2048 --> 2048 (0) ( 4) 1517 (11.3%, 90.5%): 2048 * 2 + 0 == 4096 --> 4096 (0) ( 5) 784 ( 5.8%, 96.4%): 4096 * 2 + 0 == 8192 --> 8192 (0) ( 6) 351 ( 2.6%, 99.0%): 8192 * 2 + 0 == 16384 --> 16384 (0) ( 7) 107 ( 0.8%, 99.8%): 16384 * 2 + 0 == 32768 --> 32768 (0) ( 8) 18 ( 0.1%, 99.9%): 32768 * 2 + 0 == 65536 --> 65536 (0) ( 9) 6 ( 0.0%,100.0%): 65536 * 2 + 0 == 131072 --> 131072 (0) (10) 2 ( 0.0%,100.0%): 131072 * 2 + 0 == 262144 --> 262144 (0) (11) 1 ( 0.0%,100.0%): 262144 * 2 + 0 == 524288 --> 524288 (0) Zero waste, and roughly 40% fewer malloc/realloc calls because of (a) less waste, and (b) faster growth rate. The attached patch makes this change to the growth rate. It also removes append(), which is unused, and this allowed me to remove extraCapacity, thus guaranteeing that the allocation amounts will always be powers of two. On Sunspider the patch gives a small but clear reduction in instruction counts (measured on a 64-bit shell with '-j -m -n -p'), so it'll definitely help on the much larger codebases that we compile on real sites like gmail: --------------------------------------------------------------- | millions of instructions executed | | total | compiled (may overestimate) | --------------------------------------------------------------- | 53.397 53.295 (1.002x) | 19.438 19.438 (------) | 3d-cube | 43.655 43.642 (------) | 26.796 26.795 (------) | 3d-morph | 57.449 57.393 (1.001x) | 24.341 24.341 (------) | 3d-raytrace | 15.170 15.156 (1.001x) | 8.104 8.104 (------) | access-binary- | 50.813 50.795 (------) | 43.132 43.132 (------) | access-fannkuc | 19.241 19.227 (1.001x) | 11.912 11.912 (------) | access-nbody | 18.924 18.921 (------) | 14.900 14.900 (------) | access-nsieve | 8.540 8.537 (------) | 5.335 5.335 (------) | bitops-3bit-bi | 29.286 29.282 (------) | 26.222 26.222 (------) | bitops-bits-in | 24.985 24.985 (------) | 22.200 22.200 (------) | bitops-bitwise | 29.055 29.053 (------) | 25.415 25.415 (------) | bitops-nsieve- | 15.460 15.449 (1.001x) | 11.625 11.625 (------) | controlflow-re | 26.264 26.234 (1.001x) | 7.856 7.856 (------) | crypto-md5 | 19.451 19.428 (1.001x) | 8.383 8.383 (------) | crypto-sha1 | 61.176 61.182 (------) | 13.092 13.092 (------) | date-format-to | 61.241 61.229 (------) | 9.534 9.534 (------) | date-format-xp | 24.562 24.561 (------) | 20.721 20.721 (------) | math-cordic | 34.751 34.746 (------) | 12.459 12.459 (------) | math-partial-s | 14.987 14.979 (1.001x) | 9.372 9.372 (------) | math-spectral- | 45.492 45.494 (------) | 34.486 34.486 (------) | regexp-dna | 36.756 36.739 (------) | 10.449 10.449 (------) | string-base64 | 41.397 41.371 (1.001x) | 24.435 24.435 (------) | string-fasta | 92.143 92.128 (------) | 17.511 17.511 (------) | string-tagclou | 113.711 113.473 (1.002x) | 12.147 12.147 (------) | string-unpack- | 39.128 39.119 (------) | 10.305 10.305 (------) | string-validat ------- | 977.046 976.427 (1.001x) | 430.181 430.180 (------) | all
Attachment #557774 -
Flags: review?(dvander)
Assignee | ||
Comment 1•13 years ago
|
||
> The cumulative waste total is 14,340,985 bytes.
And the cumulative non-waste total is 46,685,959 bytes. So the waste proportion is 23.5%.
Updated•13 years ago
|
Attachment #557774 -
Flags: review?(dvander) → review+
Assignee | ||
Comment 2•13 years ago
|
||
It turns out that append() is used in AssemblerBufferWithConstantPool.h, so I didn't remove it. http://hg.mozilla.org/integration/mozilla-inbound/rev/f31743731039
Whiteboard: [inbound]
http://hg.mozilla.org/mozilla-central/rev/f31743731039
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla9
You need to log in
before you can comment on or make changes to this bug.
Description
•