Closed
Bug 571332
Opened 14 years ago
Closed 14 years ago
jemalloc - avoiding the null check in the free method for non-huge allocations
Categories
(Core :: Memory Allocator, enhancement)
RESOLVED
FIXED
People
(Reporter: igor, Assigned: igor)
Details
(Whiteboard: fixed-in-tracemonkey)
Attachments
(2 files)
3.04 KB, patch | jasone: review+
1.01 KB, text/plain
Currently the implementation of the free function in jemalloc essentially does:

    if (ptr != NULL) {
        chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(ptr);
        if (chunk != ptr) {
            arena_dalloc(chunk->arena, chunk, ptr);
        } else {
            huge_dalloc(ptr);
        }
    }

The initial null check can be avoided for small allocations if one takes into account that (arena_chunk_t *)CHUNK_ADDR2BASE(NULL) == NULL on every platform known to me. Hence the idea is to reorganize the above code as:

    chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(ptr);
    if (chunk != ptr) {
        assert(ptr != NULL);
        arena_dalloc(chunk->arena, chunk, ptr);
    } else if (ptr != NULL) {
        huge_dalloc(ptr);
    }

This way only huge allocations would bear the penalty of the NULL check.
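As a rough illustration of the reordering described above, the following self-contained C sketch mimics the two dispatch orders. The 1 MiB chunk size, the SKETCH_* macros, and the classify_* helpers are assumptions made for this example and are not taken from jemalloc.c; the real code calls arena_dalloc and huge_dalloc rather than returning an enum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 1 MiB chunks. The real chunk size is configured inside jemalloc. */
#define SKETCH_CHUNK_SIZE ((uintptr_t)1 << 20)
#define SKETCH_CHUNK_MASK (SKETCH_CHUNK_SIZE - 1)

/* Round an address down to the start of its chunk (stand-in for CHUNK_ADDR2BASE). */
#define SKETCH_ADDR2BASE(p) ((void *)((uintptr_t)(p) & ~SKETCH_CHUNK_MASK))

/* Hypothetical labels for the three outcomes of free(). */
enum sketch_path { SKETCH_NOOP, SKETCH_ARENA, SKETCH_HUGE };

/* Original ordering: every call pays for the NULL check first. */
static enum sketch_path classify_before(const void *ptr)
{
    if (ptr != NULL) {
        if (SKETCH_ADDR2BASE(ptr) != ptr)
            return SKETCH_ARENA;
        return SKETCH_HUGE;
    }
    return SKETCH_NOOP;
}

/* Proposed ordering: NULL masks to NULL, so it falls into the chunk-aligned
 * branch and only that branch needs the NULL check. */
static enum sketch_path classify_after(const void *ptr)
{
    if (SKETCH_ADDR2BASE(ptr) != ptr) {
        assert(ptr != NULL); /* an unaligned address can never be NULL */
        return SKETCH_ARENA;
    } else if (ptr != NULL) {
        return SKETCH_HUGE;
    }
    return SKETCH_NOOP;
}
```

Both helpers should agree for NULL, for a chunk-aligned address, and for an address inside a chunk; the second version simply reaches the arena case with one comparison fewer.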
Comment 1 • 14 years ago (Assignee)
Besides moving the NULL check, the patch also changes idalloc and free to use CHUNK_ADDR2OFFSET, not CHUNK_ADDR2BASE, for optimal performance.
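The relationship between the two macros can be sketched as below, assuming CHUNK_ADDR2OFFSET keeps the low bits of the address that CHUNK_ADDR2BASE clears. The 1 MiB chunk size and all SKETCH_* and helper names are assumptions for illustration only. With the offset form, the "is this an arena allocation?" test becomes a comparison of the masked value against zero, and the chunk base can still be recovered by subtraction when it is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 1 MiB chunks. The real chunk size is configured inside jemalloc. */
#define SKETCH_CHUNK_SIZE ((uintptr_t)1 << 20)
#define SKETCH_CHUNK_MASK (SKETCH_CHUNK_SIZE - 1)

/* Stand-ins for CHUNK_ADDR2BASE and CHUNK_ADDR2OFFSET. */
#define SKETCH_ADDR2BASE(p)   ((void *)((uintptr_t)(p) & ~SKETCH_CHUNK_MASK))
#define SKETCH_ADDR2OFFSET(p) ((size_t)((uintptr_t)(p) & SKETCH_CHUNK_MASK))

/* Base form: mask the address, then compare the result with the pointer. */
static int is_arena_by_base(const void *ptr)
{
    return SKETCH_ADDR2BASE(ptr) != ptr;
}

/* Offset form: mask the address, then compare the result with zero. */
static int is_arena_by_offset(const void *ptr)
{
    return SKETCH_ADDR2OFFSET(ptr) != 0;
}

/* The chunk base remains available as ptr minus its offset. */
static void *base_from_offset(const void *ptr)
{
    void *base = (void *)((uintptr_t)ptr - SKETCH_ADDR2OFFSET(ptr));
    assert(base == SKETCH_ADDR2BASE(ptr));
    return base;
}
```

NULL has offset zero, so it takes the same branch as a chunk-aligned (huge) pointer in either form.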
Comment 2 • 14 years ago (Assignee)
The test case measures how long it takes to call the free method for a 32-byte allocation using the rdtsc instruction. To run it, save it to a directory with the jemalloc source (memory/jemalloc) and compile on Linux with GCC using:

gcc -o x -O3 -std=c99 -Wall -DMOZ_MEMORY -DMOZ_MEMORY_LINUX -DMOZ_MEMORY_SIZEOF_PTR_2POW=3 -DNDEBUG -fstrict-aliasing -fomit-frame-pointer jemalloc.c x.c -lpthread

To get 32-bit output use:

gcc -m32 -o x -O3 -std=c99 -Wall -DMOZ_MEMORY -DMOZ_MEMORY_LINUX -DMOZ_MEMORY_SIZEOF_PTR_2POW=3 -DNDEBUG -fstrict-aliasing -fomit-frame-pointer jemalloc.c x.c -lpthread

With the patch I see the following results on my Intel(R) Core(TM) i5 CPU M 520 @ 2.40GHz laptop:

i686 executable:
before the patch: cycles per loop iteration: average=74.2 min=72.5
after the patch: cycles per loop iteration: average=71.6 min=69.8
or roughly 3% speedup of the hot free call.

x86_64 executable:
before the patch: cycles per loop iteration: average=61.8 min=60.6
after the patch: cycles per loop iteration: average=61.3 min=59.9
or roughly 1% speedup of the hot free call.
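For reference, the quoted percentages can be reproduced from the average cycle counts with a small helper. This is only a sketch of the arithmetic; the function name is hypothetical and does not come from the attached test case.

```c
#include <assert.h>

/* Relative reduction in cycles, in percent: (before - after) / before * 100. */
static double speedup_percent(double cycles_before, double cycles_after)
{
    assert(cycles_before > 0.0);
    return (cycles_before - cycles_after) / cycles_before * 100.0;
}
```

Applied to the averages above, speedup_percent(74.2, 71.6) comes to about 3.5 for the i686 build and speedup_percent(61.8, 61.3) to about 0.8 for the x86_64 build, consistent with the rounded figures of roughly 3% and 1%.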
Updated • 14 years ago (Assignee)
Attachment #450645 - Flags: review?(jasone)
Comment 3 • 14 years ago
Comment on attachment 450645 (v1):
This change seems unlikely to have a measurable impact on Firefox performance, but it certainly won't hurt anything.
Attachment #450645 - Flags: review?(jasone) → review+
Comment 4 • 14 years ago (Assignee)
http://hg.mozilla.org/tracemonkey/rev/2e14a43ef3db
Whiteboard: fixed-in-tracemonkey
Comment 5 • 14 years ago
http://hg.mozilla.org/mozilla-central/rev/2e14a43ef3db
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED