Open Bug 1312138 Opened 9 years ago Updated 2 years ago

Reduce code size when zeroing out memory.

Categories

(Core :: JavaScript Engine: JIT, defect, P5)

People

(Reporter: mbx, Unassigned)

References

(Blocks 1 open bug)

Details

We emit a lot of code to zero out memory in ION (WebAssembly). For instance, in AngryBots, there are 45,161 instructions of the form:

    mov dword ptr [r15 + reg + ##], 0

This is encoded as [41 c7 44 3f ## 00 00 00 00], which is 9 bytes per instruction, so 406,449 bytes or ~ 5% (out of 10M). Since mov has no form that takes an 8-bit immediate for a 32-bit store, perhaps we can zero out a temp register and use that instead to save 2 bytes:

    xor ebx, ebx
    mov dword ptr [r15 + reg + ##], ebx

Another common pattern is:

    mov dword ptr [r15 + rax], 0
    mov dword ptr [r15 + rax + 4], 0
    mov dword ptr [r15 + rax + 8], 0

There are 1247 matches of this exact pattern in AngryBots, and lots more if you consider a more general pattern:

    mov dword ptr [r15 + reg + (i    ) * 4], 0
    mov dword ptr [r15 + reg + (i + 1) * 4], 0
    mov dword ptr [r15 + reg + (i + 2) * 4], 0
    ...
    mov dword ptr [r15 + reg + (i + n) * 4], 0

Can we optimize this?
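The size arithmetic above can be checked with a small sketch. The 9-byte immediate store comes from the report; the 5-byte register store (41 89 5c 3f ##) and the 2-byte `xor ebx, ebx` (31 db) are assumed encodings, and `stores_per_xor` is a hypothetical parameter for how many stores can share one zeroed register. With one xor per store, the net saving is the 2 bytes mentioned above; it grows when the zeroed register is reused.

```python
# Byte sizes per instruction.
IMM_STORE_BYTES = 9  # 41 c7 44 3f ## 00 00 00 00 (from the report)
REG_STORE_BYTES = 5  # 41 89 5c 3f ## (assumed encoding of the register store)
XOR_BYTES = 2        # 31 db, xor ebx, ebx (assumed choice of temp register)


def immediate_form_size(num_stores):
    """Total bytes when every zero store carries a 32-bit immediate."""
    return num_stores * IMM_STORE_BYTES


def register_form_size(num_stores, stores_per_xor):
    """Total bytes when stores read a zeroed register.

    stores_per_xor is a hypothetical knob: how many stores reuse one xor.
    """
    num_xors = -(-num_stores // stores_per_xor)  # ceiling division
    return num_stores * REG_STORE_BYTES + num_xors * XOR_BYTES


def bytes_saved(num_stores, stores_per_xor):
    """Net saving of the register form over the immediate form."""
    return immediate_form_size(num_stores) - register_form_size(
        num_stores, stores_per_xor
    )


if __name__ == "__main__":
    print(immediate_form_size(45161))  # size of the stores counted in AngryBots
    print(bytes_saved(1, 1))           # one xor per store
    print(bytes_saved(3, 3))           # three stores sharing one xor
```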
Blocks: wasm-perf
The former is an interesting question. It'd be nice if, in Lowering, you could ask for a register in a way that could be denied if granting it would increase register pressure. Otherwise, it's hard to weigh the tradeoff between *always* requiring a zeroed register and never requiring one. Poking around, gcc and clang don't seem to do anything fancier here. (Although clang does convert grouped stores of 0 to SIMD stores.)

Bug 897425 would help with the latter case, since it would let us expose the redundant (HeapReg + reg) computation.
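As a rough illustration of the grouped-store idea, the sketch below merges zero stores at consecutive dword offsets from the same base into runs and then covers each run with wider stores. This is not how Ion or clang implement it; the function names, the greedy width choice, and the decision to ignore alignment are all assumptions made for the example.

```python
def contiguous_runs(offsets, width=4):
    """Group store offsets into (start, length_in_bytes) runs.

    Assumes every offset is a `width`-byte zero store off the same base.
    """
    runs = []
    for off in sorted(set(offsets)):
        if runs and off == runs[-1][0] + runs[-1][1]:
            runs[-1][1] += width  # extends the previous run
        else:
            runs.append([off, width])
    return [(start, length) for start, length in runs]


def plan_wide_stores(start, length, widths=(16, 8, 4)):
    """Greedily cover a run with the widest stores available.

    Alignment is ignored here, which a real backend could not do blindly.
    """
    plan = []
    while length > 0:
        for w in widths:
            if w <= length:
                plan.append((start, w))
                start += w
                length -= w
                break
        else:
            raise ValueError("run length is not a multiple of the smallest width")
    return plan


if __name__ == "__main__":
    # The exact three-store pattern from the report: offsets 0, 4, 8.
    for start, length in contiguous_runs([0, 4, 8]):
        print(plan_wide_stores(start, length))
```

Under these assumptions the three dword stores from the report collapse into one 8-byte store and one 4-byte store; a run of four or more dwords would get a 16-byte (SIMD-sized) store first.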
Priority: -- → P5
Severity: normal → S3