Bug 1678785 Comment 8 Edit History

Note: The actual edited comment in the bug view page will always show the original commenter’s name and original timestamp.

Looks like a real (and fairly bad) cranelift bug.

Test program follows.  Note only the last function is exported and called; the others are unreferenced, so I've omitted their bodies.
```
(module
  (type (;0;) (func))
  (type (;1;) (func (result i32)))
  (type (;2;) (func (result i64)))
  (func (;0;) (type 2) (result i64) ...)
  (func (;1;) (type 1) (result i32) ...)
  (func (;2;) (type 0)
    i32.const -4098
    i32.load16_s offset=1
    drop)
  (memory (;0;) 1)
  (export "g" (func 2)))
```
Baseline code for the body:
```
0x15c6826a11d4  12820000  mov     w0, #0xffffefff              ;; -4098 + offset 1 = -4097 as a 32-bit value, so 0x00000000ffffefff in x0
0x15c6826a11d8  78e06aa0  ldrsh   w0, [x21, x0]                ;; load from memory
0x15c6826a11dc  14000002  b       #+0x8 (addr 0x15c6826a11e4)  ;; i have no idea what this is
0x15c6826a11e0  d4200000  brk     #0x0                         ;; nor here
```

Cranelift code:
```
0x1ee4f34c1160  92820000  mov     x0, #0xffffffffffffefff  ;; -4098 + offset 1 = -4097 as 64-bit value
0x1ee4f34c1164  8b150000  add     x0, x0, x21                 ;; load from
0x1ee4f34c1168  79800000  ldrsh   x0, [x0]                     ;;   memory
```

The difference seems to be that cranelift sign extends the offset (the wasm program's address) to 64 bits while baseline zero extends it.  Baseline is right here: wasm i32 values are to be treated as unsigned when used as memory addresses.

The baseline instruction 78e06aa0 (LDRSH) uses the raw value of x0 for the address computation, from what I can tell from the encoding.

Normally there will be stuff in memory below the heap.  By manipulating the negative address (admittedly these need to be constants in the code), those contents could be read.  It's probably not predictable what's there and it would be easy to crash, but there's an attack vector.
Looks like a real (and fairly bad) cranelift bug.

Test program follows.  Note only the last function is exported and called; the others are unreferenced, so I've omitted their bodies.
```
(module
  (type (;0;) (func))
  (type (;1;) (func (result i32)))
  (type (;2;) (func (result i64)))
  (func (;0;) (type 2) (result i64) ...)
  (func (;1;) (type 1) (result i32) ...)
  (func (;2;) (type 0)
    i32.const -4098
    i32.load16_s offset=1
    drop)
  (memory (;0;) 1)
  (export "g" (func 2)))
```
Baseline code for the body:
```
0x15c6826a11d4  12820000  mov     w0, #0xffffefff              ;; -4098 + offset 1 = -4097 as 32-bit, ie 0x00000000ffffefff in x0
0x15c6826a11d8  78e06aa0  ldrsh   w0, [x21, x0]                ;; load from memory
0x15c6826a11dc  14000002  b       #+0x8 (addr 0x15c6826a11e4)  ;; i have no idea what this is
0x15c6826a11e0  d4200000  brk     #0x0                         ;; nor here
```

Cranelift code:
```
0x1ee4f34c1160  92820000  mov     x0, #0xffffffffffffefff  ;; -4098 + offset 1 = -4097 as 64-bit value
0x1ee4f34c1164  8b150000  add     x0, x0, x21              ;; load from
0x1ee4f34c1168  79800000  ldrsh   x0, [x0]                 ;;   memory
```

The difference seems to be that cranelift sign extends the offset (the wasm program's address) to 64 bits while baseline zero extends it.  Baseline is right here: wasm i32 values are to be treated as unsigned when used as memory addresses.

The baseline instruction 78e06aa0 (LDRSH) uses the raw value of x0 for the address computation, from what I can tell from the encoding.

Normally there will be stuff in memory below the heap.  By manipulating the negative address (admittedly these need to be constants in the code), those contents could be read.  It's probably not predictable what's there and it would be easy to crash, but there's an attack vector.
Looks like a real (and fairly bad) cranelift bug.

Test program follows.  Note only the last function is exported and called; the others are unreferenced, so I've omitted their bodies.
```
(module
  (type (;0;) (func))
  (type (;1;) (func (result i32)))
  (type (;2;) (func (result i64)))
  (func (;0;) (type 2) (result i64) ...)
  (func (;1;) (type 1) (result i32) ...)
  (func (;2;) (type 0)
    i32.const -4098
    i32.load16_s offset=1
    drop)
  (memory (;0;) 1)
  (export "g" (func 2)))
```
Baseline code for the body:
```
0x15c6826a11d4  12820000  mov     w0, #0xffffefff              ;; -4098 + offset 1 = -4097 as 32-bit, ie 0x00000000ffffefff in x0
0x15c6826a11d8  78e06aa0  ldrsh   w0, [x21, x0]                ;; load from memory
```

Cranelift code:
```
0x1ee4f34c1160  92820000  mov     x0, #0xffffffffffffefff  ;; -4098 + offset 1 = -4097 as 64-bit value
0x1ee4f34c1164  8b150000  add     x0, x0, x21              ;; load from
0x1ee4f34c1168  79800000  ldrsh   x0, [x0]                 ;;   memory
```

The difference seems to be that cranelift sign extends the offset (the wasm program's address) to 64 bits while baseline zero extends it.  Baseline is right here: wasm i32 values are to be treated as unsigned when used as memory addresses.

The baseline instruction 78e06aa0 (LDRSH) uses the raw value of x0 for the address computation, from what I can tell from the encoding.

Normally there will be stuff in memory below the heap.  By manipulating the negative address (admittedly these need to be constants in the code), those contents could be read.  It's probably not predictable what's there and it would be easy to crash, but there's an attack vector.
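
To make the zero-extend vs sign-extend difference concrete, here is a rough Python sketch of the two address computations. It is only an illustration of the arithmetic, not code from either compiler. `HEAP_BASE` is a made-up stand-in for whatever x21 holds at runtime, and the helper names are hypothetical.

```python
MASK32 = 0xFFFF_FFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def to_u32(value: int) -> int:
    """Reinterpret a (possibly negative) integer as an unsigned 32-bit value."""
    return value & MASK32


def zero_extend_32_to_64(value32: int) -> int:
    """What `mov w0, #imm` does: the upper 32 bits of x0 become zero."""
    return value32 & MASK32


def sign_extend_32_to_64(value32: int) -> int:
    """Copy bit 31 into the upper 32 bits, as in the cranelift listing."""
    value32 &= MASK32
    if value32 & 0x8000_0000:
        value32 -= 1 << 32
    return value32 & MASK64


# wasm address operand (-4098) plus the static offset (1) from the test program.
addr32 = to_u32(-4098) + 1  # the constant that appears in both listings

# ASSUMPTION: an arbitrary heap base, standing in for the runtime value of x21.
HEAP_BASE = 0x0000_1000_0000_0000

baseline_ea = (HEAP_BASE + zero_extend_32_to_64(addr32)) & MASK64
cranelift_ea = (HEAP_BASE + sign_extend_32_to_64(addr32)) & MASK64

print(f"addr32       = {addr32:#010x}")
print(f"baseline ea  = heap base + {baseline_ea - HEAP_BASE:#x}")
print(f"cranelift ea = heap base - {HEAP_BASE - cranelift_ea:#x}")
```

With zero extension the access lands almost 4 GiB above the heap base, far past the single 64 KiB page the module declares, so it is out of bounds as wasm requires. With sign extension it lands 4097 bytes below the heap base, which is the problem described above.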
