Open Bug 1077036 Opened 5 years ago Updated 3 years ago

x86 atomics: Generate better code for byte-sized atomic operations (32-bit only)

Categories

(Core :: JavaScript Engine: JIT, defect, P5)

x86
All

People

(Reporter: lth, Unassigned)

References

(Blocks 1 open bug)

Details

Followup work on bug 979594.

Currently the code generator constrains the register allocator in several ways when it implements atomic operations on byte arrays:

- it requires the use of specific registers that have byte parts (ebx, ecx)
- it requires the use of a full register with a byte part in situations
  where it could work around that and pick any full register

Also it does not conserve registers as well as it could:

- it does not allow use of the "high" registers (ah, bh)
- it does not allow the use of immediate values when it would sometimes
  be possible to do so
See Also: → 1138348
These optimization opportunities (apart from the use of immediate values) apply to 32-bit only; see bug 1138348 for the 64-bit case.
Hardware: x86_64 → x86
Summary: x86 atomics: Generate better code for byte-sized atomic operations → x86 atomics: Generate better code for byte-sized atomic operations (32-bit only)
I also think that, given a properly in-range immediate argument, the final movzbl / movsbl in the atomic binops that return a result (AND, OR, XOR) can be avoided, provided that the initial load performs the proper zero/sign extension. That would also hold for 16-bit operations and for x64.
Observe also that in the case of Atomics.sub() we sometimes end up with

  movl src, eax
  neg eax
  lock xadd al, ...
  movsbl al, eax

and it seems that if the initial mov is necessary, it would be desirable to combine it with the neg. I sense there may be an LEA-based solution.
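
For reference, the sequence above relies on the fact that subtracting a value is the same as adding its two's-complement negation, with the old byte coming back in al and then being sign-extended. Below is a minimal Python sketch of that arithmetic only. It models a single byte cell and does not model the lock prefix or any real atomicity, and the function names (`sign_extend8`, `xadd8`, `sub_via_neg_xadd`) are hypothetical, not taken from the engine.

```python
def sign_extend8(byte):
    """Model movsbl: widen the low 8 bits to a signed integer."""
    byte &= 0xFF
    return byte - 0x100 if byte & 0x80 else byte


def xadd8(mem, index, al):
    """Model a byte-sized xadd: store (mem + al) mod 256, return the old byte."""
    old = mem[index]
    mem[index] = (old + al) & 0xFF
    return old


def sub_via_neg_xadd(mem, index, src):
    """Model the mov / neg / xadd / movsbl sequence for a signed byte."""
    eax = src & 0xFFFFFFFF              # movl src, eax
    eax = (-eax) & 0xFFFFFFFF           # neg eax
    al = xadd8(mem, index, eax & 0xFF)  # xadd al, ... (al receives the old byte)
    return sign_extend8(al)             # movsbl al, eax


mem = bytearray([5])
old = sub_via_neg_xadd(mem, 0, 7)       # 5 - 7 wraps to -2 in a signed byte
print(old, sign_extend8(mem[0]))
```

Only the low byte of the negated value reaches memory, so in this model the 32-bit neg could equally be a byte-sized one.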
Priority: -- → P5
Assignee: lhansen → nobody
Blocks: 1317626
No longer blocks: shared-array-buffer