Closed Bug 1690483 Opened 4 years ago Closed 4 years ago

SIMD optimization x64/x86: Better code for variable swizzle

Categories

(Core :: JavaScript: WebAssembly, enhancement, P3)

x86_64
All
enhancement

Tracking


RESOLVED FIXED
95 Branch
Tracking Status
firefox95 --- fixed

People

(Reporter: lth, Assigned: yury)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

The variable swizzle on Intel can use PSHUFB to shuffle the bytes, but the mask vector must first be sanitized so that out-of-range lanes in the mask have the high bit set. Currently we use a compare-with-constant followed by a POR to do this (and we don't even inline the constant load in the compare, sigh), but it's possible to do better by saturating-adding a constant into the mask: https://github.com/WebAssembly/simd/issues/68#issuecomment-470825324
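As a rough illustration of why the saturating add sanitizes the mask, here is a scalar lane-by-lane model in Python. It is a sketch, not the SpiderMonkey code: the helper names are made up for this example, and the constant 0x70 is an assumption (the bug text does not name the value). With 0x70, indices 0..15 map to 0x70..0x7F (high bit clear, low 4 bits unchanged) and every index of 16 or more lands in 0x80..0xFF (high bit set), which is what PSHUFB needs in order to zero the lane.

```python
def paddusb(a, b):
    """Lane-wise unsigned saturating byte add (models PADDUSB)."""
    return [min(x + y, 0xFF) for x, y in zip(a, b)]


def pshufb(src, mask):
    """Models PSHUFB: a mask lane with bit 7 set yields 0; otherwise the
    low 4 bits of the mask lane select a byte of src."""
    return [0 if m & 0x80 else src[m & 0x0F] for m in mask]


def wasm_swizzle(src, idx):
    """Reference semantics of i8x16.swizzle: indices >= 16 yield 0."""
    return [src[i] if i < 16 else 0 for i in idx]


def swizzle_saturating(src, idx):
    # Assumed constant: 0x70. It keeps the low 4 bits for in-range indices
    # and forces bit 7 on for every out-of-range index.
    return pshufb(src, paddusb(idx, [0x70] * 16))


src = [0xA0 + i for i in range(16)]
idx = [0, 15, 16, 17, 127, 128, 143, 144, 255, 1, 2, 3, 4, 5, 6, 7]
result = swizzle_saturating(src, idx)
```

Comparing `result` against `wasm_swizzle(src, idx)` for all 256 possible index bytes is a quick way to convince yourself that the two agree.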

For the specific code generation, I'm not sure whether it's better to (a) splat a byte value into scratch (or load the constant into scratch) and add the mask to the scratch, or (b) move the mask to scratch and add a constant from memory into scratch. Either way the mask register is not clobbered.
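The two orderings can be modelled with the same kind of scalar sketch. Because the saturating add is commutative, both produce the same sanitized mask in scratch and neither writes to the mask register, so the choice is only about instruction selection and cost. The function names and the constant 0x70 are assumptions made for this illustration.

```python
def paddusb(dst, src):
    """Lane-wise unsigned saturating byte add (models PADDUSB dst, src)."""
    return [min(x + y, 0xFF) for x, y in zip(dst, src)]


CONST = [0x70] * 16  # assumed sanitizing constant, one byte per lane


def option_a(mask):
    scratch = list(CONST)             # splat or load the constant into scratch
    scratch = paddusb(scratch, mask)  # add the mask register into scratch
    return scratch


def option_b(mask):
    scratch = list(mask)               # move the mask into scratch
    scratch = paddusb(scratch, CONST)  # add the constant (memory operand)
    return scratch


mask = [0, 15, 16, 200] + [1] * 12
saved = list(mask)
a = option_a(mask)
b = option_b(mask)
```

Here `a` and `b` are identical, and `mask` still equals `saved` after both calls.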

Also see https://github.com/WebAssembly/simd/issues/93 for more discussion, probably worth reading although it ranges across a bunch of topics.

Assignee: nobody → ydelendik
Status: NEW → ASSIGNED

There is a gain of about 3-4% in a local microbenchmark test.

Pushed by ydelendik@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/ee2bc38e681e Use saturating add for mask of SIMD swizzle. r=lth
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 95 Branch
