Optimize Wasm-GC load/store codegen
Categories
(Core :: JavaScript: WebAssembly, task, P2)
Tracking
()
People
(Reporter: rhunt, Assigned: rhunt)
References
(Blocks 1 open bug)
Details
Attachments
(3 files)
Our GC store code is not very efficient. We have several issues:
- The pre-write barrier happens inline, which is bad for code layout
- We do useFixed(PreBarrierReg) for stores to ref-fields, which cause every store to conflict with every other store
- We cannot store constants directly to fields without loading into an intermediate register
| Assignee | ||
Comment 1•8 months ago
|
||
The pre-write barrier requires the valueBase to be in a special
register. We solved this for Ion by having a register allocator
constraint. However the pre-write barrier only runs when an
incremental GC is in progress, in all other times we could be
using different registers.
This commit removes the register constraint and moves the pre-write
barrier call to OOL code. That OOL code manually saves the PreBarrierReg
and moves the valueBase into it for the barrier.
| Assignee | ||
Comment 2•8 months ago
|
||
The guards for ion are now OOL always which means we could invert the guards
so that we only do a branch in the case the barrier must be taken.
| Assignee | ||
Comment 3•8 months ago
|
||
WIP. Also should handle arrays too.
Comment 4•3 months ago
|
||
I rebased these without difficulty. However the first of the 3 patches,
D241493, has problems:
-
the out-of-line code blocks lack a final
masm.jump(ool.rejoin());. This
causes wasm/regress/table-of-anyref.js to fail. -
In CodeGenerator::visitWasmStoreElementRef, there's a regalloc problem. We have
Register index = ToRegister(ins->index()); .. if (base != PreBarrierReg) { masm.Push(PreBarrierReg); masm.movePtr(base, PreBarrierReg); } BaseIndex addr(PreBarrierReg, index, ScalePointer);In the case where
base != PreBarrierRegbutindex == PreBarrierReg,
masm.movePtr(base, PreBarrierReg)trashesindex; yet we use it right
afterwards. -
In both CodeGenerator::visitWasmStoreRef and ::visitWasmStoreElementRef, I
suspect there is a similar conflict between
Register temp = ToRegister(ins->temp0())andPreBarrierReg;
iftempis equal toPreBarrierRegand live ranges of them as created
bywasm::EmitWasmPreBarrierGuardwill conflict.
I was unable to completely fix the second and third problems and wound up
removing the regalloc stuff from the patch, retaining just the move to OOL
code.
With that in place, measured results are (Intel Tiger Lake):
cycles:u, best of 5 runs, million
Basis +D241942
+D241941 +D241943
Df-todomvc-w 24,210 24,049 24,165 24,038
Ko-compose-w 81,336 80,836 80,528 80,301
j2cl-box2d-w 4,894 4,906 4,889 4,860
| Assignee | ||
Comment 5•3 months ago
|
||
Ah, that's a tricky regalloc issue. JS seems to have the same issue and I think we can solve it the same way [1].
Push(PreBarrierReg);
computeEffectiveAddress(BaseIndex(anyBaseReg, anyIndexReg, Scale, PreBarrierReg);
call(preBarrier);
Pop(PreBarrierReg);
Basically always push the PreBarrierReg, then compute the address into the PreBarrierReg (which may trash the base or index reg). Then restore the PreBarrierReg after the call, which will untrash that value.
I think in wasmStoreElementRef it'd be:
masm.Push(PreBarrierReg);
BaseIndex addr(base, index, ScalePointer);
wasm::EmitWasmPreBarrierCallIndex(masm, instance, temp0, temp1, addr);
masm.Pop(PreBarrierReg);
Then we might need to update EmitWasmPreBarrierCallIndex so that it doesn't do the manual save/restore of the base register to a scratch register.
In fact, we might just want to move this push/pop of PreBarrierReg into EmitWasmPreBarrierCall, and remove the need for any of the callers to ever care about ensuring that base == PreBarrierReg. It's a huge pain in baseline to ensure this is true, it'd be a major simplification to remove this.
Description
•