Closed
Bug 1096684
Opened 8 years ago
Closed 8 years ago
Optimize SIMD shuffle and swizzle sequences with SSE3 and SSE4.1
Categories
(Core :: JavaScript Engine: JIT, enhancement)
Core
JavaScript Engine: JIT
Tracking
()
RESOLVED
FIXED
mozilla36
People
(Reporter: sunfish, Assigned: sunfish)
References
Details
Attachments
(4 files)
1.83 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
8.84 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
6.95 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
6.04 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
SSE3 and SSE4.1 provide movsldup, movshdup, insertps, blendps, and pextrd, which are advantageous in SIMD shuffle, insert, and extract sequences.
Assignee | ||
Comment 1•8 years ago
|
||
This patch implements extraName() for the LIR extract and insert element classes, for more readable debug output.
Attachment #8520298 -
Flags: review?(benj)
Assignee | ||
Comment 2•8 years ago
|
||
This patch optimizes shuffles with insertps and blendps.
Attachment #8520301 -
Flags: review?(benj)
Assignee | ||
Comment 3•8 years ago
|
||
This patch adds the pextrd instruction and optimizes extract-element and boxDouble on x86 (where unboxDouble already uses pinsrd).
Attachment #8520303 -
Flags: review?(benj)
Assignee | ||
Comment 4•8 years ago
|
||
This patch adds movsldup and movshdup and uses them for swizzles.
Attachment #8520305 -
Flags: review?(benj)
Comment 5•8 years ago
|
||
Comment on attachment 8520298 [details] [diff] [review] extraname.patch Review of attachment 8520298 [details] [diff] [review]: ----------------------------------------------------------------- Cool. ::: js/src/jit/LIR-Common.h @@ +162,5 @@ > + case LaneX: return "lane x"; > + case LaneY: return "lane y"; > + case LaneZ: return "lane z"; > + case LaneW: return "lane w"; > + default: return "unknown lane"; can you please remove the default case, such that the compiler complains if one adds a member to SimdLane and forgets to add the lane name here? @@ +211,5 @@ > + case LaneX: return "lane x"; > + case LaneY: return "lane y"; > + case LaneZ: return "lane z"; > + case LaneW: return "lane w"; > + default: return "unknown lane"; ditto
Attachment #8520298 -
Flags: review?(benj) → review+
Comment 6•8 years ago
|
||
Comment on attachment 8520301 [details] [diff] [review] insertps-blendps.patch Review of attachment 8520301 [details] [diff] [review]: ----------------------------------------------------------------- Looks good, r+ with tests. ::: js/src/jit/shared/BaseAssembler-x86-shared.h @@ +379,5 @@ > > typedef enum { > OP3_ROUNDSS_VsdWsd = 0x0A, > OP3_ROUNDSD_VsdWsd = 0x0B, > + OP3_BLENDPS_VpdWpdIb = 0x0C, nit: should be OP3_BLENDPS_VpsWpsIb as (legit) operands are packed singles ::: js/src/jit/shared/CodeGenerator-x86-shared.cpp @@ +2476,5 @@ > MOZ_ASSERT(numLanesFromLHS < 4); > > + // If all values stay in their lane, this is a blend. > + if (AssemblerX86Shared::HasSSE41()) { > + if (x % 4 == 0 && y % 4 == 1 && z % 4 == 2 && w % 4 == 3) { Could you add tests in jit-tests/tests/asm.js/testSIMD.js for this case and the ones below, below the line commented as: // Impl-specific special cases for shuffle (case and swapped case)
Attachment #8520301 -
Flags: review?(benj) → review+
Comment 7•8 years ago
|
||
Comment on attachment 8520303 [details] [diff] [review] pextrd.patch Review of attachment 8520303 [details] [diff] [review]: ----------------------------------------------------------------- Nice! ::: js/src/jit/shared/BaseAssembler-x86-shared.h @@ +3728,5 @@ > > + void pextrd_irr(unsigned lane, XMMRegisterID src, RegisterID dst) > + { > + MOZ_ASSERT(lane < 4); > + spew("pextrd $%u, %s, %s", lane, nameFPReg(src), nameIReg(4, dst)); could be $%x for symmetry with pinsrd_irr @@ +3737,5 @@ > + > + void pextrd_imr(unsigned lane, XMMRegisterID src, int offset, RegisterID base) > + { > + MOZ_ASSERT(lane < 4); > + spew("pextrd $%u, %s, %s0x%x(%s)", lane, nameFPReg(src), ditto
Attachment #8520303 -
Flags: review?(benj) → review+
Comment 8•8 years ago
|
||
Comment on attachment 8520305 [details] [diff] [review] movsldup-movshdup.patch Review of attachment 8520305 [details] [diff] [review]: ----------------------------------------------------------------- Cool! ::: js/src/jit/shared/CodeGenerator-x86-shared.cpp @@ +2416,5 @@ > uint32_t z = ins->laneZ(); > uint32_t w = ins->laneW(); > > + if (AssemblerX86Shared::HasSSE3()) { > + if (ins->lanesMatch(0, 0, 2, 2)) { Can you add tests for these two specific patterns in testSIMD.js as well please?
Attachment #8520305 -
Flags: review?(benj) → review+
Assignee | ||
Comment 9•8 years ago
|
||
Nits addressed, and as noted on irc, the tests are in. https://hg.mozilla.org/integration/mozilla-inbound/rev/9bd6e299c84a https://hg.mozilla.org/integration/mozilla-inbound/rev/191a52db5011 https://hg.mozilla.org/integration/mozilla-inbound/rev/cf5beb0f5d25 https://hg.mozilla.org/integration/mozilla-inbound/rev/4c04203003c2
https://hg.mozilla.org/mozilla-central/rev/9bd6e299c84a https://hg.mozilla.org/mozilla-central/rev/191a52db5011 https://hg.mozilla.org/mozilla-central/rev/cf5beb0f5d25 https://hg.mozilla.org/mozilla-central/rev/4c04203003c2
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla36
You need to log in
before you can comment on or make changes to this bug.
Description
•