Closed Bug 1920669 Opened 1 year ago Closed 1 year ago

Avoid partial register read in AtomicFetchOp

Categories

(Core :: JavaScript Engine: JIT, enhancement, P2)

enhancement

Tracking

()

RESOLVED FIXED
135 Branch
Tracking Status
firefox135 --- fixed

People

(Reporter: anba, Assigned: anba)

Details

Attachments

(1 file)

Avoid partial register reads in AtomicFetchOp which can cause false-dependencies on the upper register parts. See also [1] for a related LLVM optimisation pass.

[1] https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86FixupBWInsts.cpp

Using movzbl resp. movzwl avoids a partial register read before the
movl + {and,or,xor}l instruction sequence.

"jit/GenerateAtomicOperations.py" shouldn't be affected, because it moves and
operates on correctly sized registers, e.g. AtomicAnd8SeqCst compiles to:

movb (rdi), al
0: movb al, dl
andb sil, dl
lock; cmpxchgb dl, (rdi)
jnz 0b
ret

We could change AtomicFetchOp to also use correctly sized registers, but that
leads to a slight code size increase, because word-sized require an explicit
override byte.

Drive-by change:

  • Rewrite AtomicFetchOp to avoid the ATOMIC_BITOP_BODY macro.

µ-benchmark which improves with this patch:

function f() {
  var u8 = new Uint8Array(4);
  var i32 = new Int32Array([1, 1, 1, 1]);

  var r = 0;
  var t = performance.now();
  for (var i = 0; i < 100_000_000; ++i) {
    // idivl uses %rax.
    i32[i&1] = 100 / i32[i&3];

    // cmpxchgb uses %ax.
    r += Atomics.and(u8, 0, 255);
  }
  print(performance.now() - t, r, i32[0]);
}
for (let i = 0; i < 10; ++i) f();
Severity: -- → N/A
Priority: -- → P2
Pushed by andre.bargull@gmail.com: https://hg.mozilla.org/integration/autoland/rev/fccbf2fb9735 Avoid partial register read in AtomicFetchOp. r=spidermonkey-reviewers,jandem
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 135 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: