Bug 1701164 Comment 1 Edit History

Note: The actual edited comment in the bug view page will always show the original commenter’s name and original timestamp.

Original comment by Lars T Hansen [:lth] on 2021-03-26 08:11:48 PDT

Here's an example:
```
wasmDis(new WebAssembly.Module(wasmTextToBinary(`
  (module
    (func (param v128) (param v128) (result v128)
      (i32x4.add (local.get 1) (local.get 0))))
`)))
```
The core of the code generated for this is:
```
00000024  66 0f 6f d1               movdqa %xmm1, %xmm2
00000028  66 0f 6f ca               movdqa %xmm2, %xmm1
0000002C  66 0f fe c8               paddd %xmm0, %xmm1
00000030  66 0f 6f c1               movdqa %xmm1, %xmm0
```
which is not great. FP addition is commutative, so the optimal code is just the addition; almost-optimal code has at most one move (the one at the end). But even ignoring that, the first two moves are clearly redundant: xmm2 is dead afterwards, so the round trip through it accomplishes nothing.

(Note the swapped operand order. If the body were param0 + param1, the machine code would be simply the paddd.)

Regalloc reports that the input is this:
```
[RegAlloc]     [2,3 WasmParameter] [def v1<simd128>:%xmm0.i4]
[RegAlloc]     [4,5 WasmParameter] [def v2<simd128>:%xmm1.i4]
[RegAlloc]     [6,7 WasmParameter] [def v3<g>:r14]
[RegAlloc]     [8,9 WasmBinarySimd128] [def v4<simd128>:tied(0)] [use v2:r?] [use v1:r]
[RegAlloc]     [10,11 WasmReturn] [use v4:%xmm0.d] [use v3:r14]
```
but at the end the IR looks like this:
```
[RegAlloc]     [2,3 WasmParameter] [def v1<simd128>:%xmm0.i4]
[RegAlloc]     [4,5 WasmParameter] [def v2<simd128>:%xmm1.i4]
[RegAlloc]     [MoveGroup] [%xmm1.i4 -> %xmm2.i4]
[RegAlloc]     [6,7 WasmParameter] [def v3<g>:r14]
[RegAlloc]     [MoveGroup] [%xmm2.i4 -> %xmm1.i4]
[RegAlloc]     [8,9 WasmBinarySimd128] [def v4<simd128>:%xmm1.i4] [use v2:r %xmm1.i4] [use v1:r %xmm0.i4]
[RegAlloc]     [MoveGroup] [%xmm1.i4 -> %xmm0.i4]
[RegAlloc]     [10,11 WasmReturn] [use v4:%xmm0.d %xmm0.i4] [use v3:r14 r14]
```

Revision 1 by Lars T Hansen [:lth] on 2021-03-26 08:13:37 PDT

Here's an example:
```
wasmDis(new WebAssembly.Module(wasmTextToBinary(`
  (module
    (func (param v128) (param v128) (result v128)
      (i32x4.add (local.get 1) (local.get 0))))
`)))
```
The core of the code generated for this is:
```
00000024  66 0f 6f d1               movdqa %xmm1, %xmm2
00000028  66 0f 6f ca               movdqa %xmm2, %xmm1
0000002C  66 0f fe c8               paddd %xmm0, %xmm1
00000030  66 0f 6f c1               movdqa %xmm1, %xmm0
```
which is not great. Integer SIMD addition is commutative, so the optimal code is just the addition; almost-optimal code has at most one move (the one at the end). But even ignoring that, the first two moves are clearly redundant: xmm2 is dead afterwards, so the round trip through it accomplishes nothing.

(Note the swapped operand order. If the body were param0 + param1, the machine code would be simply the paddd.)
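
To make the redundancy concrete, here is a small standalone sketch (plain JavaScript, not SpiderMonkey code) that models the four instructions above on i32x4 values. The `run` helper, the register model, and the sample inputs are all made up for illustration. It compares the generated sequence, the same sequence without the first two moves, and a single `paddd %xmm1, %xmm0`, and should show that all three leave the same value in xmm0.

```javascript
// Toy model of the sequence above. Operands are in AT&T order: (src, dst).
// Registers hold i32x4 values as 4-element arrays. This is an illustrative
// assumption, not how the JIT represents anything.
function run(program, regs) {
  const r = {};
  for (const [name, lanes] of Object.entries(regs)) r[name] = lanes.slice();
  for (const [op, src, dst] of program) {
    if (op === "movdqa") {
      r[dst] = r[src].slice();
    } else if (op === "paddd") {
      // Lane-wise wrapping 32-bit add: dst = dst + src.
      r[dst] = r[dst].map((x, i) => (x + r[src][i]) | 0);
    } else {
      throw new Error("unknown op " + op);
    }
  }
  return r;
}

// The code as generated.
const generated = [
  ["movdqa", "xmm1", "xmm2"],
  ["movdqa", "xmm2", "xmm1"],
  ["paddd", "xmm0", "xmm1"],
  ["movdqa", "xmm1", "xmm0"],
];
// Same code with the xmm1 -> xmm2 -> xmm1 round trip dropped.
const withoutRoundTrip = generated.slice(2);
// Exploiting commutativity: add directly into the return register.
const optimal = [["paddd", "xmm1", "xmm0"]];

// Arbitrary sample inputs; the last lane exercises wraparound.
const inputs = {
  xmm0: [1, 2, 3, 0x7fffffff],
  xmm1: [10, 20, 30, 1],
  xmm2: [0, 0, 0, 0],
};

const results = [generated, withoutRoundTrip, optimal].map(
  (program) => run(program, inputs).xmm0
);
console.log(results.map((lanes) => lanes.join(",")));
```

The three variants differ in what they leave in xmm1 and xmm2, but neither register is live at the return, so only xmm0 matters here.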

Regalloc reports that the input is this:
```
[RegAlloc]     [2,3 WasmParameter] [def v1<simd128>:%xmm0.i4]
[RegAlloc]     [4,5 WasmParameter] [def v2<simd128>:%xmm1.i4]
[RegAlloc]     [6,7 WasmParameter] [def v3<g>:r14]
[RegAlloc]     [8,9 WasmBinarySimd128] [def v4<simd128>:tied(0)] [use v2:r?] [use v1:r]
[RegAlloc]     [10,11 WasmReturn] [use v4:%xmm0.d] [use v3:r14]
```
but at the end the IR looks like this:
```
[RegAlloc]     [2,3 WasmParameter] [def v1<simd128>:%xmm0.i4]
[RegAlloc]     [4,5 WasmParameter] [def v2<simd128>:%xmm1.i4]
[RegAlloc]     [MoveGroup] [%xmm1.i4 -> %xmm2.i4]
[RegAlloc]     [6,7 WasmParameter] [def v3<g>:r14]
[RegAlloc]     [MoveGroup] [%xmm2.i4 -> %xmm1.i4]
[RegAlloc]     [8,9 WasmBinarySimd128] [def v4<simd128>:%xmm1.i4] [use v2:r %xmm1.i4] [use v1:r %xmm0.i4]
[RegAlloc]     [MoveGroup] [%xmm1.i4 -> %xmm0.i4]
[RegAlloc]     [10,11 WasmReturn] [use v4:%xmm0.d %xmm0.i4] [use v3:r14 r14]
```
