Faster interrupt checks via mprotect() + load
Categories
(Core :: JavaScript: WebAssembly, enhancement, P5)
People
(Reporter: lth, Unassigned)
Attachments
(1 file)
Currently our interrupt check at the loop header is this:
ld tmp, *(tls + offs)
cmp tmp, 0
je continue
trap
and on x86/x64 Ion even folds the load and compare, avoiding a register here. This is not too awful: the branch should be well-predicted in any hot loop, so it's mostly a matter of fetching and processing instructions, plus minor concerns about code size.
It is possible to do better in principle by using an mprotect trick. In this scenario, the interrupt "flag" is an entire page that is normally readable but not writable. The check performs a load from that page and discards the result:
ld tmp, *loc
where loc is a run-time constant. To signal an interrupt, we mprotect the page, making it unreadable and making the load trap.
Not without some issues:
(1) Sadly we need a destination register for the load, but that's fixable. On ARM64 we can target the zero register, I think. On x86 and x64 we could instead guarantee that the page contains only zero values and then do this (rax is the destination; adding zero leaves it unchanged):
add *loc, %rax
(2) Encoding a constant address in the code is fine only so long as the code is not saved to disk to be reloaded in another process later, or shared among runtimes that don't all interrupt at the same time. A constant address is also sometimes expensive to encode in the instruction stream.
In practice, we may prefer to use TLS-relative offsets to sidestep all these problems. Suppose for the sake of argument that the tls could start on a page boundary. Then the page before the tls could be the trap page:
ld tmp, *($tls - offs)
where offs might depend on what's most convenient in the instruction set; any location within the page might do.
Comment 1•4 years ago
Comment 2•4 years ago
Measured the above WIP with the following code:
timeout(3, function() {
const buf = new BigUint64Array(i.exports.memory.buffer, 0, 8);
print("timeout! " + buf[0]);
quit(1);
});
const b = wasmTextToBinary(`(module
  (memory (export "memory") 1)
  (func (export "run")
    i32.const 0
    i64.const 0
    i64.store
    loop
      i32.const 0
      i32.const 0
      i64.load
      i64.const 1
      i64.add
      i64.store
      br 0
    end))`);
const i = new WebAssembly.Instance(new WebAssembly.Module(b));
//wasmDis(i.exports.run);
i.exports.run();
Without patch the numbers are (on x64): 1969886744 1926480976 1992487502 1964049245 1945432197
With patch: 1995686307 1981681606 1972536904 1977942344 1930216667 (about 0.6% speed up?)
Reporter
Comment 3•4 years ago
Taking this since it's blocked on me to run some tests. I think we should test loops that terminate (so we don't pay for the interrupt) but also loops that interrupt much more often (to gauge the cost of the interrupt).
Reporter
Comment 4•3 years ago
OK, so on this program with the patch I'm seeing a 0.4% speedup on my dual Xeon (with taskset to pin the job to one of the CPUs) and a small slowdown (in the same range; I didn't bother to run the numbers) on the M1. Both of these with Ion.
I also take back what I said in comment 3 about "paying for the interrupt", clearly that is not an issue here.
In short, this is not worth doing at this time. The patch is interesting, though, and ties into some things I'm doing with bounds checking on memory64, so I'll P5 it for now.
Reporter
Updated•3 years ago