Bug 1061637 (Closed). Opened 11 years ago, closed 6 years ago.

SIMD: optimize 'select' to use blendvps when SSE4.1 is available.

Categories: Core :: JavaScript Engine: JIT, defect, P5
Platform: x86_64, All
Status: RESOLVED WONTFIX
People: Reporter: dougc, Unassigned
Attachments: 2 files

The blendvps SSE4.1 instruction may be faster than the standard sequence implemented in bug 1060437. The semantics of blendvps differ from those of the 'select' operation: blendvps looks only at the sign bit (the top bit) of each mask lane to decide which input lane to select for the output. However, there are probably many useful cases in which 'select' could be transformed into a blend operation, such as when 'select' follows a comparison.
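To make the difference concrete, here is a minimal scalar sketch of a single 32-bit lane. It assumes the standard 'select' sequence is the usual bitwise and/andnot/or combination; the function names and the (mask, t, f) operand order are hypothetical, chosen for illustration, and are not taken from the SpiderMonkey sources or the patches.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of one 32-bit lane; a real float32x4 value has four such lanes.

// Assumed bitwise 'select': each result bit comes from t where the
// corresponding mask bit is 1, and from f where it is 0.
constexpr uint32_t bitwise_select_lane(uint32_t mask, uint32_t t, uint32_t f) {
    return (mask & t) | (~mask & f);
}

// blendvps-style select: only the sign bit (bit 31) of the mask lane is
// examined, and the whole lane is taken from t or from f.
constexpr uint32_t blendv_lane(uint32_t mask, uint32_t t, uint32_t f) {
    return (mask & 0x80000000u) ? t : f;
}

// Hypothetical helper that checks the cases discussed in the text.
inline void check_select_lane_examples() {
    const uint32_t t = 0x11111111u;
    const uint32_t f = 0x22222222u;

    // Comparison-style masks (all ones or all zeros): both forms agree.
    assert(bitwise_select_lane(0xFFFFFFFFu, t, f) == blendv_lane(0xFFFFFFFFu, t, f));
    assert(bitwise_select_lane(0x00000000u, t, f) == blendv_lane(0x00000000u, t, f));

    // Mixed mask: the bitwise form mixes bits from both inputs, while the
    // blend form takes the whole lane from t because the sign bit is set.
    assert(bitwise_select_lane(0xFFFF0000u, t, f) == 0x11112222u);
    assert(blendv_lane(0xFFFF0000u, t, f) == 0x11111111u);
}
```

Calling `check_select_lane_examples()` exercises the three cases. The mixed-mask case is the one that would make an unconditional replacement of 'select' by blendvps observable.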
Depends on: 1062067
This is a work-in-progress patch that can be used to explore the performance of blendvps. The semantics of blendvps differ from 'select', but it could still be used when we can prove that doing so does not change the semantics of the code, such as when it follows a comparison that sets all bits. It might also need some spec work to define or confirm how this should work. This patch is rebased on top of the VEX patches in bug 1065339 and also supports vblendvps.
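The condition "follows a comparison that sets all bits" can be sketched as a property of the mask: if every mask lane is either all ones or all zeros, the two forms produce the same result. The following is a scalar model under that assumption only. `Lanes`, `compare_less_than`, `is_all_or_nothing_mask`, `bitwise_select` and `blendv_select` are hypothetical names, and a JIT would establish the property statically from the instruction that produced the mask rather than by inspecting values at run time.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Lanes = std::array<uint32_t, 4>;

// Model of a lane-wise less-than comparison: each result lane is all ones
// where a < b holds, and all zeros otherwise.
inline Lanes compare_less_than(const std::array<float, 4>& a,
                               const std::array<float, 4>& b) {
    Lanes mask{};
    for (std::size_t i = 0; i < 4; ++i)
        mask[i] = (a[i] < b[i]) ? 0xFFFFFFFFu : 0u;
    return mask;
}

// True when every lane is 0x00000000 or 0xFFFFFFFF, i.e. comparison-shaped.
inline bool is_all_or_nothing_mask(const Lanes& mask) {
    for (uint32_t lane : mask)
        if (lane != 0u && lane != 0xFFFFFFFFu)
            return false;
    return true;
}

// Assumed bitwise 'select': and/andnot/or applied lane by lane.
inline Lanes bitwise_select(const Lanes& mask, const Lanes& t, const Lanes& f) {
    Lanes r{};
    for (std::size_t i = 0; i < 4; ++i)
        r[i] = (mask[i] & t[i]) | (~mask[i] & f[i]);
    return r;
}

// blendvps-style select: the sign bit of each mask lane picks t or f.
inline Lanes blendv_select(const Lanes& mask, const Lanes& t, const Lanes& f) {
    Lanes r{};
    for (std::size_t i = 0; i < 4; ++i)
        r[i] = (mask[i] & 0x80000000u) ? t[i] : f[i];
    return r;
}
```

With a mask produced by `compare_less_than`, `is_all_or_nothing_mask` holds and `bitwise_select` and `blendv_select` return equal lanes. For an arbitrary mask the predicate can fail, and the substitution would then not be safe.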
Priority: -- → P5
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX