Open Bug 1738467 Opened 4 years ago Updated 4 years ago

Consider speeding up post barriers by skipping same-chunk edges

Categories

(Core :: JavaScript: GC, enhancement, P3)

People

(Reporter: sfink, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

We could skip recording store buffer edges if (src xor dst) < ChunkSize, because those are intra-chunk edges and so cannot go from a tenured chunk to a nursery chunk. The check requires no memory dereferences, though it does add a few instructions to the cases where you do need to insert into the store buffer, as well as to the cross-chunk cases where you end up deciding not to insert.
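A minimal sketch of that check, assuming chunks are a power-of-two size and aligned to that size. The 1 MiB value and the names `kChunkSize`, `sameChunk`, and `mayNeedStoreBufferEntry` are illustrative assumptions, not the engine's actual identifiers:

```cpp
#include <cstdint>

// Assumption for illustration: chunks are 1 MiB and aligned to 1 MiB.
constexpr std::uintptr_t kChunkSize = std::uintptr_t(1) << 20;

// Two addresses share a chunk exactly when they agree on every bit above
// the chunk-offset bits, i.e. when their xor only has low bits set.
constexpr bool sameChunk(std::uintptr_t src, std::uintptr_t dst) {
  return (src ^ dst) < kChunkSize;
}

// Hypothetical pre-filter for a post barrier: only cross-chunk edges can
// possibly be tenured -> nursery, so only those go on to the slower checks.
constexpr bool mayNeedStoreBufferEntry(std::uintptr_t src, std::uintptr_t dst) {
  return !sameChunk(src, dst);
}

static_assert(sameChunk(0x100000, 0x1FFFFF), "first and last byte of one chunk");
static_assert(!sameChunk(0x1FFFFF, 0x200000), "adjacent chunks differ");
```

The filter only rules edges out. A cross-chunk edge still needs the usual tenured-to-nursery test before it is recorded.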

(Note that if we controlled virtual memory addresses such that the nursery came earlier in memory than the tenured heap, we could instead skip edges where src <= dst or (src & ChunkMask) < (dst & ChunkMask) which would remove even more unnecessary dereferences, especially if we arranged to allocate tenured chunks in decreasing address order so that the oldest objects were at the highest addresses.)
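A rough sketch of the ordering idea, assuming every nursery chunk sits at a lower address than every tenured chunk. `skipByAddressOrder` is the literal `src <= dst` condition. `skipByChunkOrder` is one possible reading that compares chunk start addresses so that same-chunk edges are skipped too; that variant is an assumption about the intent, not a formula taken from the comment. The chunk size and all names are illustrative:

```cpp
#include <cstdint>

// Assumptions for illustration: 1 MiB size-aligned chunks, and a mask that
// selects the offset-within-chunk bits.
constexpr std::uintptr_t kChunkSize = std::uintptr_t(1) << 20;
constexpr std::uintptr_t kChunkMask = kChunkSize - 1;

constexpr std::uintptr_t chunkStart(std::uintptr_t addr) {
  return addr & ~kChunkMask;
}

// If the nursery is below the tenured heap, a tenured -> nursery edge always
// points downward, so any edge with src <= dst can be skipped.
constexpr bool skipByAddressOrder(std::uintptr_t src, std::uintptr_t dst) {
  return src <= dst;
}

// Variant (assumed reading): compare chunk starts instead, which also skips
// downward edges that stay inside a single chunk.
constexpr bool skipByChunkOrder(std::uintptr_t src, std::uintptr_t dst) {
  return chunkStart(src) <= chunkStart(dst);
}
```

With a nursery chunk at 0x100000 and tenured chunks at 0x300000 and 0x400000, both tests skip an edge from 0x300010 to 0x400020 and keep one from 0x300010 to 0x100020. Only the chunk-start variant skips an edge from 0x300080 to 0x300010.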

(In reply to Steve Fink [:sfink] [:s:] from comment #0)

> Note that if we controlled virtual memory addresses such that the nursery came earlier in memory than the tenured heap, we could instead skip edges where src <= dst or (src & ChunkMask) < (dst & ChunkMask) which would remove even more unnecessary dereferences,

I like this a lot. This shouldn't be too hard on 64-bit platforms with plenty of address space? For pointer compression it would also be necessary to better control where (certain) GC chunks are allocated...

> Note that if we controlled virtual memory addresses

How much control do we have over virtual addresses? Can we encode nursery vs. tenured heap in a bit of the address?

Maybe we could do something similar on 64-bit to what we do for JIT code: reserve a large region of GC memory, then allocate chunks from a region within that. This would also pave the way for pointer compression.

(In reply to Jan de Mooij [:jandem] from comment #3)
If we reserve an aligned 8GB region per runtime we can allocate nursery chunks from the bottom half and tenured chunks from the top. Then bit 32 of the address will tell us whether a chunk is in the nursery or not.
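As a sketch of how that test could look, assuming the 8 GiB region is aligned to 8 GiB so that bit 32 of an address equals bit 32 of its offset within the region. The function names are hypothetical, and the result is only meaningful for addresses inside the reserved region:

```cpp
#include <cstdint>

// Bit 32 splits an 8 GiB-aligned region into a lower 4 GiB half (nursery)
// and an upper 4 GiB half (tenured).
constexpr std::uint64_t kTenuredBit = std::uint64_t(1) << 32;

// Hypothetical helper: valid only for addresses inside the reserved region.
constexpr bool isNurseryAddress(std::uint64_t addr) {
  return (addr & kTenuredBit) == 0;
}

// An edge needs a store buffer entry only if it goes tenured -> nursery.
constexpr bool isTenuredToNurseryEdge(std::uint64_t src, std::uint64_t dst) {
  return !isNurseryAddress(src) && isNurseryAddress(dst);
}
```

Like the xor check, this needs no memory dereference: both answers come from the pointer values alone.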

For pointer compression we'd put the nursery chunks at the top of the bottom half, and (halfway point - max nursery size) would be our base address.
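The base-address arithmetic can be sketched as follows. The 64 MiB maximum nursery size is a made-up value for illustration, and `compressionBase`, `compress`, and `decompress` are hypothetical names rather than an existing API:

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kHalfRegion = std::uint64_t(1) << 32;       // 4 GiB
constexpr std::uint64_t kMaxNurserySize = std::uint64_t(64) << 20;  // assumed value

// Base = halfway point of the region minus the maximum nursery size, so the
// nursery occupies the first offsets and tenured chunks follow it.
constexpr std::uint64_t compressionBase(std::uint64_t regionStart) {
  return regionStart + kHalfRegion - kMaxNurserySize;
}

// A compressed pointer is a 32-bit offset from the base.
inline std::uint32_t compress(std::uint64_t base, std::uint64_t addr) {
  assert(addr >= base && addr - base < kHalfRegion);
  return static_cast<std::uint32_t>(addr - base);
}

constexpr std::uint64_t decompress(std::uint64_t base, std::uint32_t offset) {
  return base + offset;
}
```

Under these assumptions, offsets below `kMaxNurserySize` are nursery addresses and the remaining 32-bit offsets reach tenured chunks, so only 4 GiB minus the nursery size of tenured heap is addressable through a compressed pointer.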

Depends on: 1738725
See Also: → 1738725
Blocks: GC.size
Severity: -- → N/A
Priority: -- → P3