Closed Bug 1690462 Opened 3 years ago Closed 3 years ago

SIMD optimization: sign replication

Categories

(Core :: JavaScript: WebAssembly, enhancement, P3)

RESOLVED FIXED
90 Branch
Tracking Status
firefox90 --- fixed

People

(Reporter: lth, Assigned: yury)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

https://github.com/WebAssembly/simd/issues/437 points out that some cliches, such as iNxM.shr_s(v, N-1) and iNxM.shr_s(v, -1), actually mean "replicate the sign bit throughout the lane" (the shift count is taken modulo the lane width N, so -1 behaves like N-1), and that on some architectures there are faster instruction sequences for this than a constant right shift. (See the ticket for suggestions.) This should be easy to optimize, as we already handle the shift-by-constant case specially.
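To make the idiom concrete, here is a minimal scalar sketch in Python of what the two shifts compute on 8-bit lanes. The helper names `shr_s_lane` and `shr_s` are hypothetical and only model the lane semantics; they are not SpiderMonkey code.

```python
def shr_s_lane(value, count, bits):
    """Model a signed right shift of one `bits`-wide lane (hypothetical helper)."""
    mask = (1 << bits) - 1
    count %= bits                  # the shift count is taken modulo the lane width
    value &= mask                  # keep only the lane's bits
    if value >> (bits - 1):        # sign bit set: reinterpret the lane as negative
        value -= 1 << bits
    return (value >> count) & mask  # Python's >> on ints is an arithmetic shift


def shr_s(lanes, count, bits):
    """Apply the same shift to every lane, as iNxM.shr_s does."""
    return [shr_s_lane(v, count, bits) for v in lanes]


lanes = [0x00, 0x01, 0x7F, 0x80, 0xFF]
by_n_minus_1 = shr_s(lanes, 7, 8)   # i8x16.shr_s(v, N-1) with N = 8
by_minus_1 = shr_s(lanes, -1, 8)    # i8x16.shr_s(v, -1): -1 mod 8 == 7
```

Both calls leave non-negative lanes as 0x00 and turn negative lanes into 0xFF, which is the "replicate the sign bit" behaviour described above.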

See Also: → 1690490
See Also: → 1693473

Google bug with some ideas for code generation: https://crbug.com/v8/11311

Assignee: nobody → ydelendik
Status: NEW → ASSIGNED

Not sure if it makes sense to replace a single vpsraw/vpsrad instruction with a multi-instruction PXOR/PCMPGTx sequence. Submitted a patch to optimize x86 for i8x16.shr_s and i64x2.shr_s.

(In reply to Yury Delendik (:yury) from comment #3)

> Not sure if it makes sense to replace a single vpsraw/vpsrad instruction with a multi-instruction PXOR/PCMPGTx sequence. Submitted a patch to optimize x86 for i8x16.shr_s and i64x2.shr_s.

I agree: on x64 the 16- and 32-bit cases are best left as they are. The V8 bug also indicates that the payoff is primarily for the 8- and 64-bit cases.
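For reference, a small Python model of why a compare against zero can stand in for the shift. It assumes the PXOR/PCMPGTx sequence means "zero a register, then compute zero > v per lane"; that reading is mine, not a description of the landed patch. The payoff at 8 and 64 bits is presumably because x86 before AVX-512 has no packed arithmetic right shift for those lane widths. All function names are hypothetical.

```python
def to_signed(value, bits):
    """Reinterpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def sign_by_shift(lanes, bits):
    """Sign replication via arithmetic right shift by bits - 1."""
    mask = (1 << bits) - 1
    return [(to_signed(v, bits) >> (bits - 1)) & mask for v in lanes]


def sign_by_compare(lanes, bits):
    """Sign replication via a signed 'zero > v' compare, lane by lane."""
    mask = (1 << bits) - 1
    zero = [0] * len(lanes)  # models PXOR reg, reg
    # models PCMPGTx: all-ones where zero > v (signed), else all-zeros
    return [mask if z > to_signed(v, bits) else 0 for z, v in zip(zero, lanes)]


# The two formulations agree on boundary values for every lane width.
agree = True
for bits in (8, 16, 32, 64):
    top = 1 << (bits - 1)
    lanes = [0, 1, top - 1, top, (1 << bits) - 1]
    agree = agree and sign_by_shift(lanes, bits) == sign_by_compare(lanes, bits)

result8 = sign_by_compare([0x00, 0x7F, 0x80, 0xFF], 8)
```

The model only shows that the results match; which sequence is cheaper is a per-architecture question, as discussed in the comments here.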

On ARM64 there seems to be no particularly good reason to change anything; apart from too many moves (see bug 1712692) we already generate a single shift instruction, and we'll do this for all operand sizes. The ARM64 optimization manual indicates that the execution cost of the shifts does not differ from that of the compares.

Pushed by ydelendik@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/556aff9ffdc6
SIMD optimization for sign replication. r=lth
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 90 Branch