Open Bug 1690460 Opened 3 years ago Updated 1 year ago

[meta] SIMD instruction selection optimizations

Categories

(Core :: JavaScript: WebAssembly, task, P3)

task

Tracking

()

People

(Reporter: lth, Unassigned)

References

(Depends on 10 open bugs, Blocks 1 open bug)

Details

(Keywords: meta)

Tracking bug for SIMD instruction selection optimizations.

There are many suggestions for desirable special cases of instruction selection on the SIMD tracker, and we also have an internal document with a few more suggestions. Here I mean instruction selection narrowly: better code generation for individual SIMD instructions or perhaps very small trees of such instructions, not overarching optimization concerns.

All platforms are fair game here.

Some optimization advice from the Intel optimization manual:

  • "Use PSHUFB if the alternative uses 5 or more instructions"
  • Blend operations that use XMM0 are most natural if XMM0 is the result of a previous operation that creates the mask. (I.e., blend is a natural fit for bitselect but maybe not for shuffle?)
  • Section 5.5 has various ideas for generating common constants with few instructions; the implication is that this is faster than a memory load. (Some benchmarking finds that this is not necessarily the case, and/or the differences are very slight, and/or it's specific to generating integer constants in the integer part of the ALU, and we're not yet at the point where that makes a difference.)
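
To make the blend/bitselect remark concrete, here is a minimal scalar sketch of one byte lane. It assumes the usual semantics: wasm bitselect picks individual bits according to the mask, while PBLENDVB consults only the top bit of each mask byte. The helper names (`bitselect8`, `blendv8`, `isCompareMask`) are hypothetical and the operand order is simplified; this is a model for reasoning, not the code the compiler emits.

```cpp
#include <cstdint>

// Hypothetical one-byte model of wasm v128.bitselect:
// take each bit from `a` where the mask bit is 1, otherwise from `b`.
constexpr uint8_t bitselect8(uint8_t a, uint8_t b, uint8_t mask) {
    return static_cast<uint8_t>((a & mask) | (b & ~mask));
}

// Hypothetical one-byte model of PBLENDVB (operand order simplified):
// only the top bit of the mask byte decides which whole byte is taken.
constexpr uint8_t blendv8(uint8_t a, uint8_t b, uint8_t mask) {
    return (mask & 0x80) ? a : b;
}

// True for the masks a SIMD comparison produces in a byte lane.
constexpr bool isCompareMask(uint8_t mask) {
    return mask == 0x00 || mask == 0xFF;
}

// With a comparison-style mask the two operations agree...
static_assert(isCompareMask(0xFF) && isCompareMask(0x00), "");
static_assert(bitselect8(0xAB, 0xCD, 0xFF) == blendv8(0xAB, 0xCD, 0xFF), "");
static_assert(bitselect8(0xAB, 0xCD, 0x00) == blendv8(0xAB, 0xCD, 0x00), "");

// ...but with an arbitrary mask they differ, so the blend form is only
// valid when the mask is known to be all-ones or all-zeros per byte.
static_assert(!isCompareMask(0xF0), "");
static_assert(bitselect8(0xAB, 0xCD, 0xF0) == 0xAD, "");
static_assert(blendv8(0xAB, 0xCD, 0xF0) == 0xAB, "");
```

The takeaway under these assumptions: selecting a blend for bitselect needs the instruction selector to prove that the mask comes from something like a comparison, which fits the manual's advice about the mask being the result of the previous operation.
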
Severity: -- → N/A
Type: enhancement → task
Priority: -- → P3
Depends on: 1690462
Depends on: 1690466
Depends on: 1690471
Depends on: 1671873
Depends on: 1690478
Depends on: 1690483
Depends on: 1690490
Depends on: 1690492
Keywords: meta
Summary: SIMD instruction selection optimizations → [meta] SIMD instruction selection optimizations
Depends on: 1690533
Depends on: 1690538
Depends on: 1691154
Depends on: 1691343
Depends on: 1693473
Depends on: 1693482
Depends on: 1693490
Depends on: 1693497
Depends on: 1693500
Depends on: 1694191
Depends on: 1694342
Depends on: 1696103

Bug 1691490 adds the i64x2.{gt,lt,ge,le}_s instructions; the code generated for SSE4.1 and below is not optimal. SSE4.2 and above allow a better lowering, see https://github.com/WebAssembly/simd/pull/412
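
For context on why the pre-SSE4.2 path is more expensive: PCMPGTQ (a 64-bit signed compare) only arrives with SSE4.2, so older targets have to build the result from 32-bit operations. A minimal scalar sketch of that decomposition for one lane follows. The names `hi32`, `lo32` and `gtS64ViaHalves` are hypothetical, and this models the general idea only, not the exact instruction sequence used in the bug or in the linked PR.

```cpp
#include <cstdint>

// High 32 bits of a 64-bit lane, interpreted as signed.
// (Relies on arithmetic right shift of negative values, which is what
// mainstream compilers implement.)
constexpr int32_t hi32(int64_t v) {
    return static_cast<int32_t>(v >> 32);
}

// Low 32 bits of a 64-bit lane, interpreted as unsigned.
constexpr uint32_t lo32(int64_t v) {
    return static_cast<uint32_t>(static_cast<uint64_t>(v));
}

// Hypothetical model of i64x2.gt_s for one lane using only 32-bit compares:
// a > b if the high halves compare greater (signed), or the high halves
// are equal and the low halves compare greater (unsigned).
constexpr bool gtS64ViaHalves(int64_t a, int64_t b) {
    return hi32(a) > hi32(b) ||
           (hi32(a) == hi32(b) && lo32(a) > lo32(b));
}

// The model agrees with the native 64-bit comparison on a few edge cases.
static_assert(gtS64ViaHalves(1, 0) == (1 > 0), "");
static_assert(gtS64ViaHalves(-1, 0) == (-1 > 0), "");
static_assert(gtS64ViaHalves(0, -1) == (0 > -1), "");
static_assert(gtS64ViaHalves(-1, -2) == (-1 > -2), "");
// Low half must be compared unsigned: 0xFFFFFFFF in the low half is large.
static_assert(gtS64ViaHalves(4294967295LL, 1) == (4294967295LL > 1), "");
static_assert(gtS64ViaHalves(INT64_MIN, INT64_MAX) == (INT64_MIN > INT64_MAX), "");
```

The signed-high / unsigned-low split is what makes the vector version take several instructions (two 32-bit compares, a way to get an unsigned low compare, a combine, and a shuffle to broadcast the result across the lane), compared with a single PCMPGTQ.
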

Depends on: 1699244
Depends on: 1699620
Depends on: 1700316
Depends on: 1700317
Depends on: 1700319
Depends on: 1709209
Depends on: 1725667