Created attachment 568509 [details] [diff] [review] v1 We get a 2% speedup on EarlyBoyer if ChunkInfo (hot all over the GC allocator paths) is not split across a cache line. An easy and guaranteed-effective way to do this is to pad Chunk out to the full 1MiB allocation. This makes ChunkInfo abut the end of the 1MiB allocation, rather than wherever the Arenas and ChunkBitmap happen to end. Since GC Chunks are aligned at 1MiB address boundaries, this ensures that ChunkInfo sits within a single cache line (as long as it is no larger than one).
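To illustrate the layout argument, here is a minimal sketch, not the actual patch. All sizes and names here (`kArenasPerChunk`, `kBitmapBytes`, the fields of `ChunkInfo`, the 64-byte cache line) are made-up assumptions chosen so that the unpadded layout happens to straddle a line; the real SpiderMonkey definitions differ.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical constants for illustration only.
constexpr std::size_t kChunkSize = std::size_t{1} << 20;  // 1MiB, also the chunk alignment
constexpr std::size_t kCacheLine = 64;                    // assumed cache line size
constexpr std::size_t kArenaSize = 4096;
constexpr std::size_t kArenasPerChunk = 252;
constexpr std::size_t kBitmapBytes = 16100;

struct Arena { std::uint8_t data[kArenaSize]; };
struct ChunkBitmap { std::uint8_t bits[kBitmapBytes]; };
struct ChunkInfo { std::uint64_t fields[4]; };  // 32 bytes of hot bookkeeping (made up)

// Without padding, ChunkInfo lands wherever the arenas and bitmap end.
struct UnpaddedChunk {
    Arena arenas[kArenasPerChunk];
    ChunkBitmap bitmap;
    ChunkInfo info;
};

// Bytes needed to push ChunkInfo up against the end of the 1MiB chunk.
constexpr std::size_t kPadBytes =
    kChunkSize - sizeof(Arena) * kArenasPerChunk - sizeof(ChunkBitmap) - sizeof(ChunkInfo);

struct PaddedChunk {
    Arena arenas[kArenasPerChunk];
    ChunkBitmap bitmap;
    std::uint8_t padding[kPadBytes];
    ChunkInfo info;
};

// True if [offset, offset + size) crosses a cache line boundary. Chunks are
// 1MiB-aligned, so offsets within the chunk decide the cache line.
constexpr bool straddlesCacheLine(std::size_t offset, std::size_t size) {
    return offset / kCacheLine != (offset + size - 1) / kCacheLine;
}

static_assert(sizeof(ChunkInfo) <= kCacheLine, "ChunkInfo must fit in one cache line");
static_assert(sizeof(PaddedChunk) == kChunkSize, "padded chunk fills the allocation exactly");
```

With these made-up numbers, `straddlesCacheLine(offsetof(UnpaddedChunk, info), sizeof(ChunkInfo))` holds for the unpadded layout and not for the padded one, since the padded ChunkInfo ends exactly on the 1MiB boundary.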
Speedup under a 32-bit Linux build (running on a 64-bit machine) is an incredibly modest 0.2%, so the gain appears to be isolated to 64-bit builds.
Comment on attachment 568509 [details] [diff] [review] v1 Nice job. Can you change the assert after the definition of Chunk to read: JS_STATIC_ASSERT(sizeof(Chunk) == GC_CHUNK_SIZE); (instead of <=)?
Created attachment 568572 [details] [diff] [review] v2: With review feedback. Good catch! I actually had this check in my other patch but forgot about it, apparently.