Looks like a real (and fairly bad) Cranelift bug.
Test program follows. Note only the last function is exported and called; the others are unreferenced, so I've omitted their bodies.
```
(module
(type (;0;) (func))
(type (;1;) (func (result i32)))
(type (;2;) (func (result i64)))
(func (;0;) (type 2) (result i64) ...)
(func (;1;) (type 1) (result i32) ...)
(func (;2;) (type 0)
i32.const -4098
i32.load16_s offset=1
drop)
(memory (;0;) 1)
(export "g" (func 2)))
```
Baseline code for the body:
```
0x15c6826a11d4 12820000 mov w0, #0xffffefff ;; -4098 + offset 1 = -4097 as a 32-bit value, so 0x00000000ffffefff in x0
0x15c6826a11d8 78e06aa0 ldrsh w0, [x21, x0] ;; load from memory
0x15c6826a11dc 14000002 b #+0x8 (addr 0x15c6826a11e4) ;; I have no idea what this is
0x15c6826a11e0 d4200000 brk #0x0 ;; nor this
```
Cranelift code:
```
0x1ee4f34c1160 92820000 mov x0, #0xffffffffffffefff ;; -4098 + offset 1 = -4097 as a 64-bit value
0x1ee4f34c1164 8b150000 add x0, x0, x21 ;; load from
0x1ee4f34c1168 79800000 ldrsh x0, [x0] ;; memory
```
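The two `mov` encodings in the listings can be decoded by hand to confirm which constant each compiler materializes. This is a minimal sketch; `decode_movn` is a helper name I made up, and it handles only the MOVN form that appears here (field positions are from the A64 move-wide-immediate layout).

```python
def decode_movn(word):
    """Decode an AArch64 MOVN and return (is_64bit, rd, value written)."""
    assert (word >> 23) & 0x3F == 0b100101, "not a move-wide immediate"
    assert (word >> 29) & 0x3 == 0b00, "not MOVN"
    is_64bit = bool(word >> 31)        # sf bit: 0 = Wd, 1 = Xd
    hw = (word >> 21) & 0x3            # which 16-bit lane the immediate goes in
    imm16 = (word >> 5) & 0xFFFF
    rd = word & 0x1F
    width = 64 if is_64bit else 32
    # MOVN writes the bitwise NOT of the shifted immediate.
    value = ~(imm16 << (16 * hw)) & ((1 << width) - 1)
    return is_64bit, rd, value

baseline = decode_movn(0x12820000)   # from the baseline listing
cranelift = decode_movn(0x92820000)  # from the Cranelift listing
print(hex(baseline[2]))   # 0xffffefff
print(hex(cranelift[2]))  # 0xffffffffffffefff
```

Both are MOVN of `#0x1000` into register 0; only the `sf` bit differs. A 32-bit write to `w0` clears the upper half of `x0`, so the baseline value is zero-extended, while the 64-bit form yields -4097 as a full 64-bit quantity.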
The difference seems to be that Cranelift sign-extends the address (the wasm program's i32 index, with the constant offset already folded in) to 64 bits, while baseline zero-extends it. Baseline is right here: wasm i32 values are to be treated as unsigned when used as memory addresses.
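To make the two behaviours concrete, here is a small sketch of the address arithmetic for the test program. `HEAP_BASE` is a made-up value standing in for whatever x21 holds, and the helper names are mine, not anything from the engine.

```python
MASK32 = 0xFFFF_FFFF
MEM_BYTES = 1 * 65536             # (memory 1): one 64 KiB wasm page
HEAP_BASE = 0x7000_0000_0000      # hypothetical heap base (the value in x21)

def effective_address(addr_i32, offset):
    """Wasm semantics: the i32 address is unsigned; the static offset is added on top."""
    return (addr_i32 & MASK32) + offset

def sign_extend32(value):
    """Reinterpret the low 32 bits of value as a signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x8000_0000 else value

ea = effective_address(-4098, 1)        # i32.const -4098 ; i32.load16_s offset=1
in_bounds = ea + 2 <= MEM_BYTES         # a 16-bit load touches 2 bytes

zext_target = HEAP_BASE + ea                   # what baseline computes
sext_target = HEAP_BASE + sign_extend32(ea)    # what the Cranelift code computes

print(hex(ea))                      # 0xffffefff
print(in_bounds)                    # False
print(HEAP_BASE - sext_target)      # 4097
```

With zero extension the access lands almost 4 GiB above the heap base, far outside the one-page memory, so it should trap. With sign extension it lands 4097 bytes below the heap base.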
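The baseline instruction 78e06aa0 (LDRSH, register offset) uses the raw 64-bit value of x0 for the address computation, from what I can tell from the encoding: the option field is 0b011, i.e. plain LSL with no extension and no shift. So it relies on the preceding 32-bit mov having cleared the upper half of x0.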
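For anyone who wants to check the encoding, here is a sketch that pulls the fields out of 78e06aa0. Field positions follow the A64 load/store register (register offset) layout; the function and table names are just for illustration.

```python
# Extend options that are valid for register-offset loads/stores.
EXTEND_NAMES = {0b010: "UXTW", 0b011: "LSL", 0b110: "SXTW", 0b111: "SXTX"}

def decode_ldst_regoffset(word):
    """Split an A64 load/store (register offset) word into its fields."""
    return {
        "size":   (word >> 30) & 0x3,   # 0b01 = halfword
        "opc":    (word >> 22) & 0x3,   # 0b11 = signed load into a W register
        "rm":     (word >> 16) & 0x1F,  # index register
        "option": (word >> 13) & 0x7,   # how the index register is extended
        "s":      (word >> 12) & 0x1,   # 1 = shift index by the access size
        "rn":     (word >> 5) & 0x1F,   # base register
        "rt":     word & 0x1F,          # destination register
    }

fields = decode_ldst_regoffset(0x78E06AA0)
print(fields)
print(EXTEND_NAMES[fields["option"]])  # LSL
```

The index is register 0 with option LSL and S = 0, and the base is register 21, which matches `ldrsh w0, [x21, x0]`: the full 64-bit x0 is added to x21 unmodified.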
Normally there will be other data in memory below the heap. By manipulating the negative address (admittedly it needs to be a constant in the code), those contents could be read. What's there is probably not predictable and it would be easy to crash, but it's still an attack vector.
Bug 1678785 Comment 8 Edit History