Closed
Bug 1136226
Opened 10 years ago
Closed 8 years ago
Add SIMD.int8x16 and SIMD.int16x8 support to IonMonkey
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla49
Tracking | Status | |
---|---|---|
firefox49 | --- | fixed |
People
(Reporter: ProgramFOX, Assigned: jolesen)
References
Details
Attachments
(41 files)
28.87 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
9.17 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
12.87 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
26.19 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
19.91 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
60.57 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
5.40 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
6.29 KB,
patch
|
nbp
:
review+
|
Details | Diff | Splinter Review |
20.75 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
17.12 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
5.31 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
6.50 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
12.25 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
7.07 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
17.87 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
3.53 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
10.96 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
38.21 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
2.62 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
11.60 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
20.02 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
18.44 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
33.92 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
26.36 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
18.95 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
13.28 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
9.62 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
45.44 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
24.66 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
2.45 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
8.13 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
28.89 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
27.38 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
39.02 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
48.47 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
9.32 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
4.61 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
14.90 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
15.05 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
9.49 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
15.42 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
Bug 1124291 will add support for SIMD.int8x16 and SIMD.int16x8 to the interpreter, but not yet to IonMonkey.
I think this cannot be immediately done after bug 1124291 is finished, based on nbp's comment on bug 1124205 (which is about adding SIMD.float64x2 to IonMonkey):
> This should be added later, once the infrastructure for int32x4 and float32x4 is complete.
Comment 1•9 years ago
|
||
Emscripten has a small workaround currently in place to ignore its test suite failures since these types don't yet exist for asm.js: https://github.com/juj/emscripten/commit/c1b006d044ec2f47d70a31bafe635fa3f7e7c4a3
Assignee | ||
Comment 2•9 years ago
|
||
Working on this now. I'll be doing the signed and unsigned types simultaneously since most of the code is shared. I am adding asm.js / wasm support during development since asm.js is a good way of writing code generator test cases.
Assignee: nobody → jolesen
Assignee | ||
Comment 3•9 years ago
|
||
We're about to add 16-lane and 8-lane SIMD types to IonMonkey, so we don't want
an enum to identify lanes. Just use an unsigned instead.
Also note that SIMD.js no longer has the .x .y .z .w properties.
Change a few lane < 4 checks to use the number of lanes in the Simd type.
Attachment #8749246 -
Flags: review?(sunfish)
Assignee | ||
Comment 4•9 years ago
|
||
Extend the SimdConstant class to handle other 128-bit SIMD vector types.
Remove some unused methods.
Attachment #8749247 -
Flags: review?(sunfish)
Assignee | ||
Comment 5•9 years ago
|
||
This adds 8x16 and 16x8 types to MIRType, but it does not enable inlining of
the corresponding SIMD operations yet.
Explicitly disable the inlining of non-4-lane constructor calls until we have
support for them.
Attachment #8749248 -
Flags: review?(sunfish)
Assignee | ||
Comment 6•9 years ago
|
||
Rename LIR instructions:
LInt32x4 -> LSimd128Int
LFloat32x4 -> LSimd128Float.
These two LIR instructions can be used to materialize 128-bit SIMD vectors of
other geometries too. Also rename the masm.loadConstant{Int,Float}32x4()
functions to indicate that they can be used for other geometries.
Attachment #8749249 -
Flags: review?(sunfish)
Assignee | ||
Comment 7•9 years ago
|
||
In both classes, rename enums:
INT32X4 -> SIMD128INT
FLOAT32X4 -> SIMD128FLOAT
Catch the new MIRTypes to produce SIMD128INT.
Attachment #8749250 -
Flags: review?(sunfish)
Assignee | ||
Comment 8•9 years ago
|
||
Load, store, and move methods are generic for all 128-bit SIMD types. Keep the
difference between float and int vectors since x86 sometimes cares.
Attachment #8749251 -
Flags: review?(sunfish)
Assignee | ||
Comment 9•9 years ago
|
||
Replace the existing int32x4Constant and float32x4Constant masm methods with a
generic simd128Constant.
Attachment #8749252 -
Flags: review?(sunfish)
Assignee | ||
Comment 10•9 years ago
|
||
We have a lot of SIMD types in Ion now, and there is little benefit in
maintaining a separate bailout kind for each type. Instead, use a single
Bailout_NonSimd bailout kind to indicate that SimdUnbox didn't get the SIMD
type it was expecting.
Attachment #8749253 -
Flags: review?(nicolas.b.pierron)
Assignee | ||
Comment 11•9 years ago
|
||
Constructors and factories take lane lists as arrays instead of four separate
lane arguments.
Attachment #8749254 -
Flags: review?(bbouvier)
Assignee | ||
Comment 12•9 years ago
|
||
This MIR instruction will be used for splatting all vector shapes.
Attachment #8749255 -
Flags: review?(bbouvier)
Assignee | ||
Comment 13•9 years ago
|
||
Support tests using all SIMD types.
Attachment #8749256 -
Flags: review?(bbouvier)
Comment 14•9 years ago
|
||
Comment on attachment 8749246 [details] [diff] [review]
Remove SimdLane enumeration.
Review of attachment 8749246 [details] [diff] [review]:
-----------------------------------------------------------------
Looks good. The x/y/z/w conventions and the enum were added before int8x16/int16x8 were part of SIMD.js.
Attachment #8749246 -
Flags: review?(sunfish) → review+
Comment 15•9 years ago
|
||
Comment on attachment 8749247 [details] [diff] [review]
Support 8x16 and 16x8 types in SimdConstant.
Review of attachment 8749247 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/builtin/SIMD.cpp
@@ +172,5 @@
> if (!IsVectorObject<V>(v))
> return ErrorWrongTypeArg(cx, 1, typeDescr);
>
> Elem* mem = reinterpret_cast<Elem*>(v.toObject().as<TypedObject>().typedMem());
> + *out = jit::SimdConstant::Create(mem);
Calling "Create" with a pointer and no length looks unusual. Looking at the code, I see it infers the length from the element type and assuming that the SIMD type is 128-bit. I think I'd suggest calling this function CreateSimd128, so that it's clear here that it makes that assumption.
Attachment #8749247 -
Flags: review?(sunfish) → review+
Comment 16•9 years ago
|
||
Comment on attachment 8749248 [details] [diff] [review]
Add 16x8 and 8x16 MIRTypes.
Review of attachment 8749248 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/jit/IonTypes.h
@@ +440,5 @@
> Bool32x4 = Boolean | (2 << VECTOR_SCALE_SHIFT),
> Doublex2 = Double | (1 << VECTOR_SCALE_SHIFT)
> };
>
> +static inline MIRType SimdTypeToLaneType(MIRType type);
It looks like it'd be a little tidier to just move SimdTypeToLaneType's definition to this point, rather than just declaring it here.
@@ +618,5 @@
>
> static inline bool
> IsSimdType(MIRType type)
> {
> + return ((unsigned(type) >> VECTOR_SCALE_SHIFT) & VECTOR_SCALE_MASK) != 0;
Cool!
Attachment #8749248 -
Flags: review?(sunfish) → review+
Comment 17•9 years ago
|
||
Comment on attachment 8749249 [details] [diff] [review]
Materialize 8x16 and 16x8 SIMD constants.
Review of attachment 8749249 [details] [diff] [review]:
-----------------------------------------------------------------
On the Float side, we can worry about whether movaps vs movapd matters for float64x2 when we add full support for float64x2, so this looks good.
Attachment #8749249 -
Flags: review?(sunfish) → review+
Comment 18•9 years ago
|
||
Comment on attachment 8749250 [details] [diff] [review]
Update LDefinition and MoveOp for 8x16 and 16x8.
Review of attachment 8749250 [details] [diff] [review]:
-----------------------------------------------------------------
Ditto for float64x2 and movapd :-).
Attachment #8749250 -
Flags: review?(sunfish) → review+
Updated•9 years ago
|
Attachment #8749251 -
Flags: review?(sunfish) → review+
Comment 19•9 years ago
|
||
Comment on attachment 8749252 [details] [diff] [review]
Add masm.simd128Constant().
Review of attachment 8749252 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/jit/x86-shared/BaseAssembler-x86-shared.h
@@ +3399,5 @@
> {
> + const uint8_t* b = reinterpret_cast<const uint8_t*>(data);
> + spew(".simd128 %02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x",
> + b[0], b[1], b[2], b[3], b[4], b[5], b[6], b[7],
> + b[8], b[9], b[10], b[11], b[12], b[13], b[14], b[15]);
The spew strings are actually valid GAS syntax (bugs aside), and it is occasionally useful this to feed spew output into GAS in order to check encodings (though it does take some work to do this). .simd128 unfortunately isn't a GAS directive.
I kind of like printing it as 4 32-bit int values, and if we use hex then it'd still be pretty usable for int8x16 and int16x8 cases too:
".int 0x%08x,0x%08x,0x%08x,0x%08x" (passing it the data as 4 int32 values)
Other options could be:
".byte 0x%02x,0x%02x,0x%02x,..."
".ascii \x%02x\x%02x\x%02x..."
Attachment #8749252 -
Flags: review?(sunfish) → review+
Comment 20•9 years ago
|
||
Comment on attachment 8749255 [details] [diff] [review]
Rename MSimdSplatX4 to MSimdSplat.
Review of attachment 8749255 [details] [diff] [review]:
-----------------------------------------------------------------
Looks good, thanks!
::: js/src/jit/shared/LIR-shared.h
@@ +200,3 @@
> {
> public:
> LIR_HEADER(SimdSplatX4)
I don't know if you've planned this for a subsequent patch, but feel free to rename LSimdSplatX4 into LSimdSplat too, if this makes sense.
Attachment #8749255 -
Flags: review?(bbouvier) → review+
Comment 21•9 years ago
|
||
Comment on attachment 8749256 [details] [diff] [review]
Fix jit-test shell for SIMD.
Review of attachment 8749256 [details] [diff] [review]:
-----------------------------------------------------------------
Thanks.
::: js/src/jit-test/lib/simd.js
@@ +95,5 @@
>
> + assertFunc(Type.extractLane(vec, 0), arr[0]);
> + assertFunc(Type.extractLane(vec, 1), arr[1]);
> + assertFunc(Type.extractLane(vec, 2), arr[2]);
> + assertFunc(Type.extractLane(vec, 3), arr[3]);
Much better!
Attachment #8749256 -
Flags: review?(bbouvier) → review+
Comment 22•9 years ago
|
||
Comment on attachment 8749254 [details] [diff] [review]
Make MSimdSwizzle and MSimdShuffle length-agnostic.
Review of attachment 8749254 [details] [diff] [review]:
-----------------------------------------------------------------
Much simpler, thank you.
::: js/src/jit/FixedList.h
@@ +84,5 @@
> MOZ_ASSERT(index < length_);
> return list_[index];
> }
>
> + T* data() {
See comment in MIR.cpp; i think we can get rid of this one.
::: js/src/jit/MIR.cpp
@@ +1209,3 @@
>
> MOZ_ASSERT(numVectors() == 2);
> + return MSimdShuffle::New(alloc, vector(0), vector(1), lanes.data());
Here and above, could we simply use lanes.begin()? Then I think we can remove the data() helper on the FixedList. It is ok, at least to me, to use "begin()" when meaning "data()" (I think STL does that).
::: js/src/jit/MIR.h
@@ +1972,4 @@
> {
> + arity_ = SimdTypeToLength(type);
> + for (unsigned i = 0; i < arity_; i++) {
> + MOZ_ASSERT(lanes[i] < 2 * arity_);
This is too lax for swizzle: it should be lanes[i] < arity_. Maybe the value to check against could be passed as a DebugOnly<uint32_t>? Or the assertion could be in the children's constructors.
@@ +1995,5 @@
> };
>
> // Applies a shuffle operation to the input, putting the input lanes as
> // indicated in the output register's lanes. This implements the SIMD.js
> // "shuffle" function, that takes one vector and one mask.
Pre-existing: s/shuffle/swizzle/ in this comment
@@ +2141,5 @@
> // In the balanced case, swap operands if needs be, in order to be able
> // to do only one vshufps on x86.
> + unsigned lanesFromLHS = 0;
> + for (unsigned i = 0; i < arity; i++)
> + if (lanes[i] < arity)
nit: Not sure what our style guide says about this kind of situation, but I would put braces for the `for` body here, just to ensure we don't fall into a weird openssl-like situation later.
@@ +2145,5 @@
> + if (lanes[i] < arity)
> + lanesFromLHS++;
> +
> + if (lanesFromLHS < arity / 2 || (arity == 4 && lanesFromLHS == 2 &&
> + lanes[0] >= 4 && lanes[1] >= 4)) {
uber-nit: can you put the second part of the || in its own line, please? (or put it first, then put the `lanesFromLHS < arity / 2` on the second line)
@@ +2154,4 @@
> }
>
> // If all lanes come from the same vector, just use swizzle instead.
> + if (lanesFromLHS == arity)
Nice, it's much simpler to think in terms of arity.
Attachment #8749254 -
Flags: review?(bbouvier) → review+
Assignee | ||
Comment 23•9 years ago
|
||
(In reply to Benjamin Bouvier [:bbouvier] from comment #22)
> Comment on attachment 8749254 [details] [diff] [review]
> Make MSimdSwizzle and MSimdShuffle length-agnostic.
Thanks, Ben!
> ::: js/src/jit/FixedList.h
> @@ +84,5 @@
> > MOZ_ASSERT(index < length_);
> > return list_[index];
> > }
> >
> > + T* data() {
>
> See comment in MIR.cpp; i think we can get rid of this one.
>
> ::: js/src/jit/MIR.cpp
> @@ +1209,3 @@
> >
> > MOZ_ASSERT(numVectors() == 2);
> > + return MSimdShuffle::New(alloc, vector(0), vector(1), lanes.data());
>
> Here and above, could we simply use lanes.begin()? Then I think we can
> remove the data() helper on the FixedList. It is ok, at least to me, to use
> "begin()" when meaning "data()" (I think STL does that).
The begin() and end() methods return iterators, while the data() method returns a pointer. It is common in C++ standard libraries to implement even std::vector iterators as classes since iterators are not guaranteed to support all the same things a pointer can do. Some libraries use extended debug-mode iterators that can perform bounds checking too.
Of course FixedList is our own class, and we can do whatever we want, but I would prefer to keep the separation between iterators and pointers, even though this class uses pointers as its iterators.
> ::: js/src/jit/MIR.h
> @@ +1972,4 @@
> > {
> > + arity_ = SimdTypeToLength(type);
> > + for (unsigned i = 0; i < arity_; i++) {
> > + MOZ_ASSERT(lanes[i] < 2 * arity_);
>
> This is too lax for swizzle: it should be lanes[i] < arity_. Maybe the value
> to check against could be passed as a DebugOnly<uint32_t>? Or the assertion
> could be in the children's constructors.
Thanks, good catch. I'll move the checks to the children.
Assignee | ||
Comment 24•9 years ago
|
||
Assignee | ||
Comment 25•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/48e12dc46e554777b2048cbf515cd4f808364fb7
Bug 1136226 - Remove SimdLane enumeration. r=sunfish
https://hg.mozilla.org/integration/mozilla-inbound/rev/55710af96f41086584b3c9065e552ec129f23a96
Bug 1136226 - Support 8x16 and 16x8 types in SimdConstant. r=sunfish
https://hg.mozilla.org/integration/mozilla-inbound/rev/5de8e799eae3901a46aba497de1601be02c61a1a
Bug 1136226 - Add 16x8 and 8x16 MIRTypes. r=sunfish
https://hg.mozilla.org/integration/mozilla-inbound/rev/2d6da6b2c98ab18cbb2f071c7e623180d48bd218
Bug 1136226 - Materialize 8x16 and 16x8 SIMD constants. r=sunfish
https://hg.mozilla.org/integration/mozilla-inbound/rev/c515aeae0c850113c1127a0db4f3a72e8822f71e
Bug 1136226 - Update LDefinition and MoveOp for 8x16 and 16x8. r=sunfish
https://hg.mozilla.org/integration/mozilla-inbound/rev/f63aa4a372a4af819f067d0aad23cf60a14619f6
Bug 1136226 - Rename 32x4 SIMD masm methods to "Simd128". r=sunfish
https://hg.mozilla.org/integration/mozilla-inbound/rev/bb54ae23b22a7434ffef46e9678731338971d1bd
Bug 1136226 - Add masm.simd128Constant(). r=sunfish
https://hg.mozilla.org/integration/mozilla-inbound/rev/117800523a0368e2a98d90befbc9c15854c7ce6c
Bug 1136226 - Make MSimdSwizzle and MSimdShuffle length-agnostic. r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/a8a0b4464969d0b76cb23556e6bc940d74e481db
Bug 1136226 - Rename MSimdSplatX4 to MSimdSplat. r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/f4b3c6c334e9bacbd278dfb4cd5aeee59098ef9d
Bug 1136226 - Fix jit-test shell for SIMD. r=bbouvier
Assignee | ||
Updated•9 years ago
|
Keywords: leave-open
Comment 26•9 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/48e12dc46e55
https://hg.mozilla.org/mozilla-central/rev/55710af96f41
https://hg.mozilla.org/mozilla-central/rev/5de8e799eae3
https://hg.mozilla.org/mozilla-central/rev/2d6da6b2c98a
https://hg.mozilla.org/mozilla-central/rev/c515aeae0c85
https://hg.mozilla.org/mozilla-central/rev/f63aa4a372a4
https://hg.mozilla.org/mozilla-central/rev/bb54ae23b22a
https://hg.mozilla.org/mozilla-central/rev/117800523a03
https://hg.mozilla.org/mozilla-central/rev/a8a0b4464969
https://hg.mozilla.org/mozilla-central/rev/f4b3c6c334e9
Comment 27•9 years ago
|
||
Comment on attachment 8749253 [details] [diff] [review]
Unify Bailout_NonSimd*Input.
Review of attachment 8749253 [details] [diff] [review]:
-----------------------------------------------------------------
Sorry for the delay, I was off for a few days.
::: js/src/jit/BaselineBailouts.cpp
@@ +1887,5 @@
> case Bailout_NonBooleanInput:
> case Bailout_NonObjectInput:
> case Bailout_NonStringInput:
> case Bailout_NonSymbolInput:
> + case Bailout_NonSimdInput:
Note, we use this bailout to tag a bailout which attempt to unbox a Simd input of one particular type. Thus we can technically bail on a Bool8x16 and succeed on a Float32x4.
Thus I think this should be renamed as well to something like:
Bailout_UnexpectedSimdInput
Attachment #8749253 -
Flags: review?(nicolas.b.pierron) → review+
Assignee | ||
Comment 28•9 years ago
|
||
These are used in the asm.js / wasm code generator.
Also include addSaturate and subSaturate in the SimdOperation enum. They had
been left out by accident.
Attachment #8750870 -
Flags: review?(sunfish)
Assignee | ||
Comment 29•9 years ago
|
||
Attachment #8750871 -
Flags: review?(sunfish)
Assignee | ||
Comment 30•9 years ago
|
||
Attachment #8750872 -
Flags: review?(sunfish)
Assignee | ||
Comment 31•9 years ago
|
||
Attachment #8750873 -
Flags: review?(sunfish)
Assignee | ||
Comment 32•9 years ago
|
||
SIMD constructors with dynamic arguments are translated to a sequence of
MSimdInsertElement instructions, LLVM style. This may change if we need the
optimization.
Attachment #8750874 -
Flags: review?(sunfish)
Assignee | ||
Comment 33•9 years ago
|
||
Attachment #8750875 -
Flags: review?(sunfish)
Assignee | ||
Comment 34•9 years ago
|
||
Almost entirely boilerplate.
Note that IsSimdValidOperationType() does not yet accept 8x16 and 16x8 SIMD
operations which means that none of the new code is reachable yet. We'll turn
this on once IonMonkey has full support for the new SIMD types.
Attachment #8750876 -
Flags: review?(bbouvier)
Assignee | ||
Comment 35•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/c31a5ad159946c74a32c9ebedcaa8585593ff178
Bug 1136226 - Unify Bailout_NonSimd*Input. r=nbp
Comment 36•9 years ago
|
||
Comment on attachment 8750876 [details] [diff] [review]
Asm.js: Add small integer SIMD types.
Review of attachment 8750876 [details] [diff] [review]:
-----------------------------------------------------------------
Very boilerplatey, indeed. Thank you!
::: js/src/asmjs/AsmJS.cpp
@@ +975,5 @@
> + {
> + return which_ == Int8x16 || which_ == Uint8x16 || which_ == Int16x8 ||
> + which_ == Uint16x8 || which_ == Int32x4 || which_ == Uint32x4 ||
> + which_ == Float32x4 || which_ == Bool8x16 || which_ == Bool16x8 ||
> + which_ == Bool32x4;
This, or you can add two enum values to Which: FirstSimdType = Int8x16, LastSimdType = Bool32x4, and then just compare FirstSmdType <= which_ <= LastSimdType.
@@ +2518,5 @@
> + int8_t val[16];
> + for (size_t i = 0; i < 16; i++, arg = NextNode(arg)) {
> + uint32_t u32;
> + JS_ALWAYS_TRUE(IsLiteralInt(m, arg, &u32));
> + val[i] = int8_t(u32);
There is an implicit coercion here. Should we make this explicit and require that e.g. int8x16 lanes literals must be uint8 values in the first place? Otherwise, you could have pretty weird situations like SIMD.Int8x16(0x7fffffff, etc.). Not too sure about this though.
@@ +2520,5 @@
> + uint32_t u32;
> + JS_ALWAYS_TRUE(IsLiteralInt(m, arg, &u32));
> + val[i] = int8_t(u32);
> + }
> + MOZ_ASSERT(arg == nullptr);
I'd prefer MOZ_ASSERT(!arg); here and below, but that's up to your taste.
@@ +2521,5 @@
> + JS_ALWAYS_TRUE(IsLiteralInt(m, arg, &u32));
> + val[i] = int8_t(u32);
> + }
> + MOZ_ASSERT(arg == nullptr);
> + NumLit::Which w = type == SimdType::Uint8x16 ? NumLit::Uint8x16 : NumLit::Int8x16;
Pre-existing, but as there's a coercion from SimdType to Type, and Type enum values are the same as NumLit, could we do NumLit::Which w = NumLit::Which(Type(type)); and maybe inline its use?
@@ +2718,5 @@
> + case SimdOperation::Fn_fromInt8x16Bits: return Expr::Limit;
> + default: break;
> + }
> + MOZ_FALLTHROUGH;
> + case SimdType::Int8x16: {
nit, here and below, pre-existing: no need for the {} here
@@ +7532,4 @@
> #define SET_NATIVE_UINT32X4(op) case SimdOperation::Fn_##op: native = simd_uint32x4_##op; break;
> #define SET_NATIVE_FLOAT32X4(op) case SimdOperation::Fn_##op: native = simd_float32x4_##op; break;
> +#define SET_NATIVE_BOOL8x16(op) case SimdOperation::Fn_##op: native = simd_bool8x16_##op; break;
> +#define SET_NATIVE_BOOL16x8(op) case SimdOperation::Fn_##op: native = simd_bool16x8_##op; break;
uber nit, but can you please make the capitalization of the "x" consistent here? I actually prefer it to be lowercase, so that would mean changing the preexisting macro names.
Attachment #8750876 -
Flags: review?(bbouvier) → review+
Comment 37•9 years ago
|
||
bugherder |
Assignee | ||
Comment 38•9 years ago
|
||
(In reply to Benjamin Bouvier [:bbouvier] from comment #36)
> Comment on attachment 8750876 [details] [diff] [review]
> Asm.js: Add small integer SIMD types.
Thanks, Ben!
> @@ +2518,5 @@
> > + int8_t val[16];
> > + for (size_t i = 0; i < 16; i++, arg = NextNode(arg)) {
> > + uint32_t u32;
> > + JS_ALWAYS_TRUE(IsLiteralInt(m, arg, &u32));
> > + val[i] = int8_t(u32);
>
> There is an implicit coercion here. Should we make this explicit and require
> that e.g. int8x16 lanes literals must be uint8 values in the first place?
> Otherwise, you could have pretty weird situations like
> SIMD.Int8x16(0x7fffffff, etc.). Not too sure about this though.
I don't really have a strong opinion. Nagy at Microsoft is writing a more formal specification of SIMD in asm.js, maybe we should wait for him?
FWIW, the corresponding JS constructor call has the same coercion via the ToInt8() and ToInt16() functions which extract the low bits. Also note that IsLiteralInt() already coerces negative literals into a u32.
Given the JS precedent, I don't think there is a lot of value in restricting asm.js further than requiring 32-bit literals in the 1.5x range INT_MIN - UINT_MAX, which is what IsLiteralInt() does.
> @@ +2521,5 @@
> > + JS_ALWAYS_TRUE(IsLiteralInt(m, arg, &u32));
> > + val[i] = int8_t(u32);
> > + }
> > + MOZ_ASSERT(arg == nullptr);
> > + NumLit::Which w = type == SimdType::Uint8x16 ? NumLit::Uint8x16 : NumLit::Int8x16;
>
> Pre-existing, but as there's a coercion from SimdType to Type, and Type enum
> values are the same as NumLit, could we do NumLit::Which w =
> NumLit::Which(Type(type)); and maybe inline its use?
It would be safe in this code, but it seems dangerous in general to cast between different enums with partially overlapping ranges. I see it's done in Type::lit(), but that is inside the Type class, and it is from the smaller to the larger enum.
I am mostly worried about what happens if somebody changes NumLit and Type in the future.
> @@ +7532,4 @@
> > #define SET_NATIVE_UINT32X4(op) case SimdOperation::Fn_##op: native = simd_uint32x4_##op; break;
> > #define SET_NATIVE_FLOAT32X4(op) case SimdOperation::Fn_##op: native = simd_float32x4_##op; break;
> > +#define SET_NATIVE_BOOL8x16(op) case SimdOperation::Fn_##op: native = simd_bool8x16_##op; break;
> > +#define SET_NATIVE_BOOL16x8(op) case SimdOperation::Fn_##op: native = simd_bool16x8_##op; break;
>
> uber nit, but can you please make the capitalization of the "x" consistent
> here? I actually prefer it to be lowercase, so that would mean changing the
> preexisting macro names.
He he, clearly most of this patch was made with s/32x4/8x16/g.
Updated•9 years ago
|
Attachment #8750870 -
Flags: review?(sunfish) → review+
Updated•9 years ago
|
Attachment #8750871 -
Flags: review?(sunfish) → review+
Updated•9 years ago
|
Attachment #8750872 -
Flags: review?(sunfish) → review+
Updated•9 years ago
|
Attachment #8750873 -
Flags: review?(sunfish) → review+
Comment 39•9 years ago
|
||
Comment on attachment 8750874 [details] [diff] [review]
Wasm: Generate code for small int SIMD constructors.
Review of attachment 8750874 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/asmjs/WasmIonCompile.cpp
@@ +2578,5 @@
> + MDefinition* scalar = 0;
> + if (!f.iter().readSimdCtorArg(ValType::I32, length, i, &scalar))
> + return false;
> + val = f.insertElementSimd(val, scalar, i, type);
> + }
This is missing readSimdCtorArgsEnd and readSimdCtorReturn calls, such as are used in the ValType::I32x4 case below.
@@ +2594,5 @@
> + MDefinition* scalar = 0;
> + if (!f.iter().readSimdCtorArg(ValType::I32, length, i, &scalar))
> + return false;
> + val = f.insertElementSimd(val, EmitSimdBooleanLaneExpr(f, scalar), i, type);
> + }
Same here.
@@ +2609,5 @@
> switch (type) {
> + case ValType::I8x16:
> + return EmitSimdChainedCtor(f, MIRType::Int8x16, SimdConstant::SplatX16(0));
> + case ValType::I16x8:
> + return EmitSimdChainedCtor(f, MIRType::Int16x8, SimdConstant::SplatX8(0));
Would it make sense to migrate the I32x4 and F32x4 cases to use this same approach, for consistency?
Attachment #8750874 -
Flags: review?(sunfish) → review+
Updated•9 years ago
|
Attachment #8750875 -
Flags: review?(sunfish) → review+
Assignee | ||
Comment 40•9 years ago
|
||
(In reply to Dan Gohman [:sunfish] from comment #39)
> Comment on attachment 8750874 [details] [diff] [review]
> Wasm: Generate code for small int SIMD constructors.
Thanks, Dan.
> @@ +2609,5 @@
> > switch (type) {
> > + case ValType::I8x16:
> > + return EmitSimdChainedCtor(f, MIRType::Int8x16, SimdConstant::SplatX16(0));
> > + case ValType::I16x8:
> > + return EmitSimdChainedCtor(f, MIRType::Int16x8, SimdConstant::SplatX8(0));
>
> Would it make sense to migrate the I32x4 and F32x4 cases to use this same
> approach, for consistency?
I thought about doing that, but I didn't want to cause any regressions.
For an F32x4 constructor, we currently use three vunpcklps instructions to shuffle things into place. Expanded like above, we would generate four vinsertps instead which is OK, except they require SSE 4.1. Without SSE 4.1, visitSimdInsertElementF falls back to stack bounces.
Similarly, for I32x4 we already generate four vpinsrd with SSE 4.1, but without we do a stack bounce. However, it's a single stack bounce (four 32-bit stores, one 128-bit load) and expanding the constructor into replaceLane calls would create 4 x (128-bit store, 32-bit store, 128-bit load).
For I16x8, it is always safe to expand the constructor into replaceLanes since vpinsrw is available since SSE 2.
For I8x16, we generate pretty bad code without SSE 4.1 for vpinsrb, but the register allocator would choke on a 16-input constructor instruction anyway.
It will be interesting to see if these SIMD constructors are used in real-world code, and how. If they are often used with many constant arguments, there is an opportunity to convert that into replaceLane operations on a constant initial vector.
Assignee | ||
Comment 41•9 years ago
|
||
Assignee | ||
Comment 42•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/6eb70c8292c06d9ddffe886aac5ae99e5386a93c
Bug 1136226 - Add FORALL enumerations for small integer SIMD types. r=sunfish
https://hg.mozilla.org/integration/mozilla-inbound/rev/14bb0cf9a6b1a32abb30bdf66bb186f85910c6e3
Bug 1136226 - Wasm: Add small integer SIMD types to WasmTypes.h. r=sunfish
https://hg.mozilla.org/integration/mozilla-inbound/rev/47556a2527746aacf8612b553bb7231e75a70682
Bug 1136226 - Wasm: Encode/decode small integer SIMD constants. r=sunfish
https://hg.mozilla.org/integration/mozilla-inbound/rev/2c4e652a57ba5c673283cc65bfcd4a70e13b9b6c
Bug 1136226 - Wasm: Add opcodes for small integer SIMD types. r=sunfish
https://hg.mozilla.org/integration/mozilla-inbound/rev/99c24ad35c7f623a200f02234887dd901921383f
Bug 1136226 - Wasm: Generate code for small int SIMD constructors. r=sunfish
https://hg.mozilla.org/integration/mozilla-inbound/rev/31e5232117ea78c731667a042e9fa993c7f14e56
Bug 1136226 - Wasm: Codegen for small integer SIMD constants. r=sunfish
https://hg.mozilla.org/integration/mozilla-inbound/rev/a372dfd44aadb5b41fb526f5b960d98f79eb9f40
Bug 1136226 - Asm.js: Add small integer SIMD types. r=bbouvier
Comment 43•9 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/6eb70c8292c0
https://hg.mozilla.org/mozilla-central/rev/14bb0cf9a6b1
https://hg.mozilla.org/mozilla-central/rev/47556a252774
https://hg.mozilla.org/mozilla-central/rev/2c4e652a57ba
https://hg.mozilla.org/mozilla-central/rev/99c24ad35c7f
https://hg.mozilla.org/mozilla-central/rev/31e5232117ea
https://hg.mozilla.org/mozilla-central/rev/a372dfd44aad
Assignee | ||
Comment 44•9 years ago
|
||
Attachment #8756048 -
Flags: review?(bbouvier)
Assignee | ||
Comment 45•9 years ago
|
||
More tests will be added as operations are implemented.
Attachment #8756049 -
Flags: review?(bbouvier)
Assignee | ||
Comment 46•9 years ago
|
||
Attachment #8756050 -
Flags: review?(bbouvier)
Assignee | ||
Comment 47•9 years ago
|
||
Use vpinsrw to insert 16x8 lanes. This instruction is available since SSE2, so
it can be used unconditionally.
Attachment #8756051 -
Flags: review?(bbouvier)
Assignee | ||
Comment 48•9 years ago
|
||
- Implement 'not' and 'neg' for 8x16 and 16x8 types.
- Rename some 'bitwiseFooX4' masm functions to 'bitwiseFooSimd128'.
- Rename the zeroInt32x4 and zeroFloat32x4 to zeroSimd128{Int,Float}.
- Add support for the paddb/paddw and psubb/psubw SSE2 instructions in the
assembler.
Attachment #8756052 -
Flags: review?(bbouvier)
Assignee | ||
Comment 49•9 years ago
|
||
- Rename LSimdBinaryBitwiseX4 to LSimdBinaryBitwise and use it for all types.
- Add pmullw to the assembler for 16x8 multiplies.
Don't implement 8x16 multiplies just yet. There are no SSE instructions for
that operation, so they need to be synthesied from 16x8 multiplies and shuffles.
Attachment #8756053 -
Flags: review?(bbouvier)
Assignee | ||
Comment 50•9 years ago
|
||
These all have corresponding SSE instructions.
The 8x16 shifts don't have SSE instructions, so they will be added by the next
commit.
Attachment #8756054 -
Flags: review?(bbouvier)
Assignee | ||
Comment 51•9 years ago
|
||
These operations don't have SSE instructions, so express them in terms of 16x8
shifts in MSimdShift::AddLegalized.
Attachment #8756055 -
Flags: review?(bbouvier)
Assignee | ||
Comment 52•9 years ago
|
||
There are no 8x16 SSE multiply instructions, so expand these multiplies in
terms of 16x8 multiplies.
Attachment #8756056 -
Flags: review?(bbouvier)
Assignee | ||
Comment 53•9 years ago
|
||
Attachment #8756057 -
Flags: review?(bbouvier)
Assignee | ||
Comment 54•9 years ago
|
||
The scalar argument to this operation is expanded into MIR as either -1 or 0 in
an Int32, so the 4-lane splat produces the correct result for 8-lane and
16-lane splats too. Either an all-zeroes vector or an all-ones vector.
Attachment #8756058 -
Flags: review?(bbouvier)
Assignee | ||
Comment 55•9 years ago
|
||
Attachment #8756059 -
Flags: review?(bbouvier)
Assignee | ||
Comment 56•9 years ago
|
||
Remove a normalizing shift of the boolean selector. Boolean SIMD types are
guaranteed to be represented as all-ones.
Attachment #8756060 -
Flags: review?(sunfish)
Assignee | ||
Comment 57•9 years ago
|
||
When we have SSSE3 available, the pshufb instruction can perform any byte-wise
swizzle.
Without SSSE3, fall back to using byte-wise loads and stores to simulate the
swizzle. This applies to CPUs from before 2006.
Attachment #8756061 -
Flags: review?(sunfish)
Assignee | ||
Comment 58•9 years ago
|
||
When SSSE3 is available, two pshufb instructions can be combined to form any
shuffle.
Old machines without SSSE3 bounce the two vectors through the stack.
Attachment #8756062 -
Flags: review?(sunfish)
Assignee | ||
Comment 59•9 years ago
|
||
Since SSE doesn't have unsigned comparisons, add a bias vector and use the
signed comparisons instead, just like we do for the 32x4 unsigned vectors.
Use 'defineReuseInput' even when SIMD input and output types differ. This is
fine now since the register allocator uses a single Simd128 class for all SIMD
registers.
Attachment #8756063 -
Flags: review?(sunfish)
Assignee | ||
Comment 60•9 years ago
|
||
Add a new asm.js test which exercises all the possible bitcast operations as
well as all possible load and store operations, except the partial ones.
Add new Scalar::I8x16, I16x8 enumerators.
Remove the fromUintXXX opcodes from the *_ASMJS macros. This avoids gerating
unwanted wasm opcodes. Include the relevant unsigned bit-casts explicitly where
needed.
Attachment #8756065 -
Flags: review?(sunfish)
Assignee | ||
Comment 61•9 years ago
|
||
Add support for inlining addSaturate and subSaturate SIMD operations when
compiling SIMD.js.
Attachment #8756066 -
Flags: review?(sunfish)
Assignee | ||
Comment 62•9 years ago
|
||
Use the same strategy as we do for asm.js: Start from an initial vector and
insert one lane at a time.
Attachment #8756067 -
Flags: review?(sunfish)
Assignee | ||
Comment 63•9 years ago
|
||
Some MIR instructions have an AddLegalized() static function which much be used
when creating new instructions. This ensures that the proper expansions are
generated.
Make the New() factory method in those instructions private so it isn't used
inadvertently.
De-templatize the inlineSimdBinary<> function since it was only used for two
MIR instructions which are now created differently.
Attachment #8756068 -
Flags: review?(sunfish)
Assignee | ||
Comment 64•9 years ago
|
||
Attachment #8756070 -
Flags: review?(sunfish)
Assignee | ||
Comment 65•9 years ago
|
||
Comment 66•9 years ago
|
||
Comment on attachment 8756048 [details] [diff] [review]
Asm.js: Enable small integer types.
Review of attachment 8756048 [details] [diff] [review]:
-----------------------------------------------------------------
Thanks! Just a technicality: having this patch being pushed before the others create a window in which fuzzers can trip and for which they'll hit MOZ_CRASH(NYI). Can you move this patch in the patch queue to be after all the opcodes have been implemented, please?
Attachment #8756048 -
Flags: review?(bbouvier) → review+
Comment 67•9 years ago
|
||
Comment on attachment 8756049 [details] [diff] [review]
Asm.js: Test cases for small integer SIMD types.
Review of attachment 8756049 [details] [diff] [review]:
-----------------------------------------------------------------
Thanks! Comments apply to both files.
::: js/src/jit-test/tests/asm.js/testSIMD-8x16.js
@@ +5,5 @@
> +
> +// Set to true to see more JS debugging spew.
> +const DEBUG = false;
> +
> +if (!isSimdAvailable() || typeof SIMD === 'undefined') {
The second part is tested in libdir/simd.js and can be removed from here.
@@ +33,5 @@
> +
> +// Linking
> +assertEq(asmLink(asmCompile('glob', USE_ASM + I8x16 + "function f() {} return f"), {SIMD:{Int8x16: SIMD.Int8x16}})(), undefined);
> +assertEq(asmLink(asmCompile('glob', USE_ASM + U8x16 + "function f() {} return f"), {SIMD:{Uint8x16: SIMD.Uint8x16}})(), undefined);
> +assertEq(asmLink(asmCompile('glob', USE_ASM + B8x16 + "function f() {} return f"), {SIMD:{Bool8x16: SIMD.Bool8x16}})(), undefined);
Can you test linking fails against the wrong functions?
@@ +68,5 @@
> +assertAsmTypeFail('glob', USE_ASM + B8x16 + "function f() {var x=b8x16(1,0,0,0);} return f");
> +assertAsmTypeFail('glob', USE_ASM + B8x16 + "function f() {var x=b8x16(1,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1.0);} return f");
> +assertAsmTypeFail('glob', USE_ASM + B8x16 + "function f() {var x=b8x16(1,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1|0);} return f");
> +assertAsmTypeFail('glob', USE_ASM + B8x16 + "function f() {var x=b8x16(1,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1);} return f");
> +assertEq(asmLink(asmCompile('glob', USE_ASM + B8x16 + "function f() {var x=b8x16(1,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1);} return f"), this)(), undefined);
Can you test values other than 0, 1 here too, please? (In particular -1)
@@ +74,5 @@
> +// Only signed Int8x16 allowed as return value.
> +assertEqVecArr(asmLink(asmCompile('glob', USE_ASM + I8x16 + "function f() {return i8x16(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16);} return f"), this)(),
> + [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]);
> +assertEqVecArr(asmLink(asmCompile('glob', USE_ASM + I8x16 + I8x16CHK + "function f() {return i8x16chk(i8x16(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16));} return f"), this)(),
> + [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]);
Can you add a test with a vector that has (INT8X16_MAX + 1) in one component, and check that it gets wrapped?
Attachment #8756049 -
Flags: review?(bbouvier) → review+
Comment 68•9 years ago
|
||
Comment on attachment 8756050 [details] [diff] [review]
Implement MSimdExtractElement for small integer types.
Review of attachment 8756050 [details] [diff] [review]:
-----------------------------------------------------------------
Thank you! Nice tests.
::: js/src/jit/Lowering.cpp
@@ +4400,5 @@
> MOZ_ASSERT(ins->signedness() == SimdSign::Unsigned);
> define(new (alloc()) LSimdExtractElementU2D(use, temp()), ins);
> } else {
> + auto* lir = new (alloc()) LSimdExtractElementI(use);
> +#if defined(JS_CODEGEN_X86)
Lowering should not have platform dependent code. I think you will need an implementation in Lowering-x86-shared. Or ignore ARM at the moment and remove this #ifdef, until we implement ARM.
@@ +4407,5 @@
> + // version of these instructions require a source register that is
> + // %al, %bl, %cl, or %dl.
> + // Fix it to %eax since we can't express that constraint better.
> + if (ins->input()->type() == MIRType::Int8x16) {
> + defineFixed(lir, ins, LAllocation(AnyRegister(eax)));
As there are other instructions which need eax as fixed but don't have other options, would it make sense to use ebx instead, to maybe lower register pressure in some places?
::: js/src/jit/x86-shared/CodeGenerator-x86-shared.cpp
@@ +2690,5 @@
> + signedness = SimdSign::NotApplicable;
> + } else {
> + // Extract the relevant 16 bits containing our lane, then shift the
> + // right 8 bits into place.
> + emitSimdExtractLane16x8(input, output, lane / 2, SimdSign::Unsigned);
As the last parameter is unused, can you use NotApplicable? I wondered why Unsigned was used here, just to find out it's a trick to not trigger signedness = SimdSign::Signed above.
@@ +2699,5 @@
> + signedness = SimdSign::NotApplicable;
> + }
> + }
> +
> + // We have the right low 8 bits in \output|, but we may need to fix the high
nit: |output|
@@ +2752,4 @@
>
> + switch (length) {
> + case 4:
> + emitSimdExtractLane32x4(input, output, mir->lane());
Maybe MOZ_ASSERT(mir->signedness() == Signed)?
Attachment #8756050 -
Flags: review?(bbouvier) → review+
Comment 69•9 years ago
|
||
Comment on attachment 8756050 [details] [diff] [review]
Implement MSimdExtractElement for small integer types.
Review of attachment 8756050 [details] [diff] [review]:
-----------------------------------------------------------------
Sorry I spotted something else.
::: js/src/jit/x86-shared/Assembler-x86-shared.h
@@ +2115,5 @@
> + void vpextrb(unsigned lane, FloatRegister src, Register dest) {
> + MOZ_ASSERT(HasSSE41());
> + masm.vpextrb_irr(lane, src.encoding(), dest.encoding());
> + }
> + void vpextrb(unsigned lane, FloatRegister src, const Operand& dest) {
This variant is actually unused, can we remove it? (unless you need it in a subsequent patch or have plans to use it somehow). In this case, can we remove vpextrb_irm too?
Comment 70•9 years ago
|
||
Comment on attachment 8756051 [details] [diff] [review]
Implement MSimdInsertElement for small integer types.
Review of attachment 8756051 [details] [diff] [review]:
-----------------------------------------------------------------
Thank you!
::: js/src/jit-test/tests/asm.js/testSIMD-8x16.js
@@ +128,5 @@
> +for (var i = 0; i < 16; i++) {
> + var f = replaceI(a, i);
> + var b = a.slice(0);
> + b[i] = 0;
> + assertEqVecArr(f(0), b);
nit: can you use a value more interesting than zero, please? :} Like, a signed one? (applies for the other test too)
::: js/src/jit/x86-shared/Assembler-x86-shared.h
@@ +2098,5 @@
> + void vpinsrb(unsigned lane, Register src1, FloatRegister src0, FloatRegister dest) {
> + MOZ_ASSERT(HasSSE41());
> + masm.vpinsrb_irr(lane, src1.encoding(), src0.encoding(), dest.encoding());
> + }
> + void vpinsrb(unsigned lane, const Operand& src1, FloatRegister src0, FloatRegister dest) {
Can we remove this variant? It seems unused. (and vpinsrb_imr will be too)
Attachment #8756051 -
Flags: review?(bbouvier) → review+
Comment 71•9 years ago
|
||
Comment on attachment 8756052 [details] [diff] [review]
Unary functions for small integer SIMD types.
Review of attachment 8756052 [details] [diff] [review]:
-----------------------------------------------------------------
Looks good. Muuuuuch boilerplate.
::: js/src/jit-test/tests/asm.js/testSIMD-16x8.js
@@ +180,5 @@
> +
> +function unaryB(opname, lanefunc) {
> + simdfunc = asmLink(asmCompile('glob', USE_ASM + B16x8 + B16x8CHK +
> + `var fut = b16x8.${opname};` +
> + 'function f(v) { v = b16x8chk(v); return fut(v); } return f'), this);
Same question as for the other test file.
::: js/src/jit-test/tests/asm.js/testSIMD-8x16.js
@@ +181,5 @@
> +
> +function unaryB(opname, lanefunc) {
> + simdfunc = asmLink(asmCompile('glob', USE_ASM + B8x16 + B8x16CHK +
> + `var fut = b8x16.${opname};` +
> + 'function f(v) { v = b8x16chk(v); return fut(v); } return f'), this);
I thought only signed integer (non bool) vectors could be returned from asm.js to the outside world? Or is it that the "not" operation return an i8x16? Either way, isn't there something to fix here?
::: js/src/jit/Lowering.cpp
@@ +4556,5 @@
>
> // Cannot be at start, as the ouput is used as a temporary to store values.
> LUse in = use(ins->input());
>
> + if (ins->type() == MIRType::Int8x16 || ins->type() == MIRType::Bool8x16) {
Maybe we could switch on ins->type(). Or maybe we don't need to.
::: js/src/jit/x86-shared/CodeGenerator-x86-shared.cpp
@@ +3571,5 @@
> + masm.packedSubInt16(in, out);
> + return;
> + case MSimdUnaryArith::not_:
> + masm.loadConstantSimd128Int(allOnes, out);
> + masm.bitwiseXorSimd128(in, out);
Random question: as not_ is independent from the input's length, could/should we common it out somewhere?
Attachment #8756052 -
Flags: review?(bbouvier) → review+
Assignee | ||
Comment 72•9 years ago
|
||
(In reply to Benjamin Bouvier [:bbouvier] from comment #66)
> Comment on attachment 8756048 [details] [diff] [review]
> Asm.js: Enable small integer types.
>
> Review of attachment 8756048 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Thanks! Just a technicality: having this patch being pushed before the
> others create a window in which fuzzers can trip and for which they'll hit
> MOZ_CRASH(NYI). Can you move this patch in the patch queue to be after all
> the opcodes have been implemented, please?
Good point. Currently, the patches implementing operations are also adding asm.test cases to the two test files. I'll roll up all the test cases into one commit and add in this order:
1. Implement new operations.
2. Enable inlining of new types for JS JIT.
3. Enable new types and operations for asm.js
4. Single commit with asm.js tests.
This should keep the tree passing at all intermediate stages without any NYI crashers.
Assignee | ||
Comment 73•9 years ago
|
||
(In reply to Benjamin Bouvier [:bbouvier] from comment #68)
> Comment on attachment 8756050 [details] [diff] [review]
> Implement MSimdExtractElement for small integer types.
>
> Review of attachment 8756050 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Thank you! Nice tests.
>
> ::: js/src/jit/Lowering.cpp
> @@ +4400,5 @@
> > MOZ_ASSERT(ins->signedness() == SimdSign::Unsigned);
> > define(new (alloc()) LSimdExtractElementU2D(use, temp()), ins);
> > } else {
> > + auto* lir = new (alloc()) LSimdExtractElementI(use);
> > +#if defined(JS_CODEGEN_X86)
>
> Lowering should not have platform dependent code. I think you will need an
> implementation in Lowering-x86-shared. Or ignore ARM at the moment and
> remove this #ifdef, until we implement ARM.
Ugh, I though you might say that ;-)
I agree, this is a bit gross. I'll move it to x86-shared.
> @@ +4407,5 @@
> > + // version of these instructions require a source register that is
> > + // %al, %bl, %cl, or %dl.
> > + // Fix it to %eax since we can't express that constraint better.
> > + if (ins->input()->type() == MIRType::Int8x16) {
> > + defineFixed(lir, ins, LAllocation(AnyRegister(eax)));
>
> As there are other instructions which need eax as fixed but don't have other
> options, would it make sense to use ebx instead, to maybe lower register
> pressure in some places?
Yes, that would make sense. Some shift instructions use ecx hardcoded, so ebx is a good choice.
> ::: js/src/jit/x86-shared/CodeGenerator-x86-shared.cpp
> @@ +2690,5 @@
> > + signedness = SimdSign::NotApplicable;
> > + } else {
> > + // Extract the relevant 16 bits containing our lane, then shift the
> > + // right 8 bits into place.
> > + emitSimdExtractLane16x8(input, output, lane / 2, SimdSign::Unsigned);
>
> As the last parameter is unused, can you use NotApplicable? I wondered why
> Unsigned was used here, just to find out it's a trick to not trigger
> signedness = SimdSign::Signed above.
In the case of an odd lane with unsigned extraction, we do use the zero-extension. We end up emitting this code:
vpextrw lane/2, input, output ; output = 0x0000xxyy
shrl 8, output ; output = 0x000000xx
That's a 32-bit right shift, and we can cancel the insertion of a movzbl below.
> @@ +2699,5 @@
> > + signedness = SimdSign::NotApplicable;
> > + }
> > + }
> > +
> > + // We have the right low 8 bits in \output|, but we may need to fix the high
>
> nit: |output|
>
> @@ +2752,4 @@
> >
> > + switch (length) {
> > + case 4:
> > + emitSimdExtractLane32x4(input, output, mir->lane());
>
> Maybe MOZ_ASSERT(mir->signedness() == Signed)?
I'm not sure if we can depend on asm.js to always generate that, but at least it shouldn't be NotApplicable.
Assignee | ||
Comment 74•9 years ago
|
||
(In reply to Benjamin Bouvier [:bbouvier] from comment #69)
> Comment on attachment 8756050 [details] [diff] [review]
> Implement MSimdExtractElement for small integer types.
>
> Review of attachment 8756050 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Sorry I spotted something else.
>
> ::: js/src/jit/x86-shared/Assembler-x86-shared.h
> @@ +2115,5 @@
> > + void vpextrb(unsigned lane, FloatRegister src, Register dest) {
> > + MOZ_ASSERT(HasSSE41());
> > + masm.vpextrb_irr(lane, src.encoding(), dest.encoding());
> > + }
> > + void vpextrb(unsigned lane, FloatRegister src, const Operand& dest) {
>
> This variant is actually unused, can we remove it? (unless you need it in a
> subsequent patch or have plans to use it somehow). In this case, can we
> remove vpextrb_irm too?
I'm not so sure about this. It makes more sense to me to add all the variants of an opcode once I need one. Then the next guy won't have to worry about which variants of an opcode happen to be implemented already.
Since these are inline functions, I expect that the compiler won't even generate code for them if they're not used.
How do we usually do this? It looked to me like the macro assembler mostly implements the full set of variants.
Assignee | ||
Comment 75•9 years ago
|
||
Comment on attachment 8756055 [details] [diff] [review]
Implement 8x16 SIMD shift operators.
Review of attachment 8756055 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/jit/MIR.cpp
@@ +1389,5 @@
> + // wide = yyxx yyxx yyxx yyxx yyxx yyxx yyxx yyxx
> +
> + MInstruction* shiftMask = MConstant::New(alloc, Int32Value(7));
> + addTo->add(shiftMask);
> + MInstruction* shiftBy = MBitAnd::New(alloc, right, shiftMask);
It looks like I missed this:
shiftBy->setInt32Specialization();
It appears that `specialization` is left uninitialized if I don't do something like that.
Assignee | ||
Comment 76•9 years ago
|
||
(In reply to Benjamin Bouvier [:bbouvier] from comment #71)
> Comment on attachment 8756052 [details] [diff] [review]
> Unary functions for small integer SIMD types.
>
> Review of attachment 8756052 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Looks good. Muuuuuch boilerplate.
Such is the way of the macro assembler ;-)
> ::: js/src/jit-test/tests/asm.js/testSIMD-8x16.js
> @@ +181,5 @@
> > +
> > +function unaryB(opname, lanefunc) {
> > + simdfunc = asmLink(asmCompile('glob', USE_ASM + B8x16 + B8x16CHK +
> > + `var fut = b8x16.${opname};` +
> > + 'function f(v) { v = b8x16chk(v); return fut(v); } return f'), this);
>
> I thought only signed integer (non bool) vectors could be returned from
> asm.js to the outside world? Or is it that the "not" operation return an
> i8x16? Either way, isn't there something to fix here?
We don't allow unsigned vectors because WebAssembly wouldn't be able to tell themapart form signed vectors in its type system. I expect that WebAssembly will have boolean vector types, though, so they are safe to pass around.
For FFI, all SIMD types are banned.
> ::: js/src/jit/Lowering.cpp
> @@ +4556,5 @@
> >
> > // Cannot be at start, as the ouput is used as a temporary to store values.
> > LUse in = use(ins->input());
> >
> > + if (ins->type() == MIRType::Int8x16 || ins->type() == MIRType::Bool8x16) {
>
> Maybe we could switch on ins->type(). Or maybe we don't need to.
That would be good form, I think.
> ::: js/src/jit/x86-shared/CodeGenerator-x86-shared.cpp
> @@ +3571,5 @@
> > + masm.packedSubInt16(in, out);
> > + return;
> > + case MSimdUnaryArith::not_:
> > + masm.loadConstantSimd128Int(allOnes, out);
> > + masm.bitwiseXorSimd128(in, out);
>
> Random question: as not_ is independent from the input's length,
> could/should we common it out somewhere?
We do have a SimdBinaryBitwise opcode, but since there is only the one bitwise unary operation, it doesn't seem worthwhile to add a whole SimdUnaryBitwise instruction.
Updated•9 years ago
|
Attachment #8756060 -
Flags: review?(sunfish) → review+
Comment 77•9 years ago
|
||
Comment on attachment 8756061 [details] [diff] [review]
Implement swizzle for 8x16 and 16x8 SIMD types.
Review of attachment 8756061 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/jit/x86-shared/CodeGenerator-x86-shared.cpp
@@ +3021,5 @@
> + if (AssemblerX86Shared::HasSSSE3()) {
> + ScratchSimd128Scope scratch(masm);
> + masm.loadConstantSimd128Int(SimdConstant::CreateX16(bLane), scratch);
> + if (input != output)
> + masm.vmovdqa(input, output);
VEX prefixes are implemented, but not actually enabled yet, so this doesn't matter at the moment, but it would be nice to use masm.reusedInputFloat32x4 here instead of an unconditional copy, so that we don't need a copy when VEX prefixes are enabled.
@@ +3026,5 @@
> + masm.vpshufb(scratch, output, output);
> + return;
> + }
> +
> + // Worst-case fallback for pre-SSE3 machines. Bounce through memory.
typo; pre-SSSE3 machines.
Attachment #8756061 -
Flags: review?(sunfish) → review+
Comment 78•9 years ago
|
||
Comment on attachment 8756061 [details] [diff] [review]
Implement swizzle for 8x16 and 16x8 SIMD types.
Review of attachment 8756061 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/jit/x86-shared/CodeGenerator-x86-shared.cpp
@@ +3021,5 @@
> + if (AssemblerX86Shared::HasSSSE3()) {
> + ScratchSimd128Scope scratch(masm);
> + masm.loadConstantSimd128Int(SimdConstant::CreateX16(bLane), scratch);
> + if (input != output)
> + masm.vmovdqa(input, output);
Or rather, reusedInputInt32x4.
Comment 79•9 years ago
|
||
Comment on attachment 8756062 [details] [diff] [review]
Implement shuffle for 8x16 and 16x8 SIMD types.
Review of attachment 8756062 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/jit/x86-shared/CodeGenerator-x86-shared.cpp
@@ +3121,5 @@
> + // Use pshufb instructions to gather the lanes from each source vector.
> + // A negative index creates a zero lane, so the two vectors can be combined.
> +
> + // Set scratch2 = lanes from lhs.
> + masm.vmovdqa(lhs, scratch2);
This can use reusedInputInt32x4
@@ +3129,5 @@
> + masm.loadConstantSimd128Int(SimdConstant::CreateX16(idx), scratch1);
> + masm.vpshufb(scratch1, scratch2, scratch2);
> +
> + // Set output = lanes from rhs.
> + masm.vmovdqa(rhs, output);
Ditto :)
Attachment #8756062 -
Flags: review?(sunfish) → review+
Comment 80•9 years ago
|
||
Comment on attachment 8756063 [details] [diff] [review]
Implement compares for 8x16 and 16x8 SIMD types.
Review of attachment 8756063 [details] [diff] [review]:
-----------------------------------------------------------------
Cool. Yay for using defineReuseInput :).
Attachment #8756063 -
Flags: review?(sunfish) → review+
Updated•9 years ago
|
Attachment #8756065 -
Flags: review?(sunfish) → review+
Comment 81•9 years ago
|
||
Comment on attachment 8756066 [details] [diff] [review]
Inline saturating arithmetic operations.
Review of attachment 8756066 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/jit-test/lib/simd.js
@@ +13,4 @@
> var arr = [];
> var [varr, warr] = [simdToArray(v), simdToArray(w)];
> [varr, warr] = [varr.map(Math.fround), warr.map(Math.fround)];
> for (var i = 0; i < 4; i++)
It looks like you want to generalize this function to non-4 sizes, but this loop still counts to 4.
Attachment #8756066 -
Flags: review?(sunfish) → review+
Updated•9 years ago
|
Attachment #8756067 -
Flags: review?(sunfish) → review+
Comment 82•9 years ago
|
||
Comment on attachment 8756068 [details] [diff] [review]
Make MIR New factories private where AddLegalized should be used.
Review of attachment 8756068 [details] [diff] [review]:
-----------------------------------------------------------------
Nice!
Attachment #8756068 -
Flags: review?(sunfish) → review+
Updated•9 years ago
|
Attachment #8756070 -
Flags: review?(sunfish) → review+
Comment 83•9 years ago
|
||
Thank you for all your answers and explanations. I agree to all of them, except for one...
(In reply to Jakob Stoklund Olesen [:jolesen] from comment #74)
> (In reply to Benjamin Bouvier [:bbouvier] from comment #69)
> > Comment on attachment 8756050 [details] [diff] [review]
> > Implement MSimdExtractElement for small integer types.
> >
> > Review of attachment 8756050 [details] [diff] [review]:
> > -----------------------------------------------------------------
> >
> > Sorry I spotted something else.
> >
> > ::: js/src/jit/x86-shared/Assembler-x86-shared.h
> > @@ +2115,5 @@
> > > + void vpextrb(unsigned lane, FloatRegister src, Register dest) {
> > > + MOZ_ASSERT(HasSSE41());
> > > + masm.vpextrb_irr(lane, src.encoding(), dest.encoding());
> > > + }
> > > + void vpextrb(unsigned lane, FloatRegister src, const Operand& dest) {
> >
> > This variant is actually unused, can we remove it? (unless you need it in a
> > subsequent patch or have plans to use it somehow). In this case, can we
> > remove vpextrb_irm too?
>
> I'm not so sure about this. It makes more sense to me to add all the
> variants of an opcode once I need one. Then the next guy won't have to worry
> about which variants of an opcode happen to be implemented already.
>
> Since these are inline functions, I expect that the compiler won't even
> generate code for them if they're not used.
>
> How do we usually do this? It looked to me like the macro assembler mostly
> implements the full set of variants.
Well, it is just a basic golden rule I'm trying to apply: don't add dead code, because it can remain dead for a long time. There was an instance of this I've found in one of the ARM assemblers: the rotation functions have been implemented since the very beginning, but there were no users until we implement rotations in wasm! And it has been like this for some time :-)
I think it is being more pragmatic to not add dead code, of course unless you use it in the patch series, or very soon afterwards. It takes the developer's memory, byte per byte. Plus, if it's not used, it's not tested, so it can have bugs, which might be hard to find for the potential future user.
(Unrelated to this argument: I don't think we can require regalloc that the output of an instruction be in memory. So this function could be used otherwise in codegen, if we manually put something onto the stack or whatnot, but again, if this doesn't happen, please let's just not add dead code)
Or as the Delphi Greek temple front says, "nothing in excess" :).
Comment 84•9 years ago
|
||
Comment on attachment 8756053 [details] [diff] [review]
Binary functions for small integer SIMD types.
Review of attachment 8756053 [details] [diff] [review]:
-----------------------------------------------------------------
Looks good, thanks!
::: js/src/jit/x86-shared/CodeGenerator-x86-shared.cpp
@@ +3370,5 @@
> + FloatRegister lhs = ToFloatRegister(ins->lhs());
> + Operand rhs = ToOperand(ins->rhs());
> + FloatRegister output = ToFloatRegister(ins->output());
> +
> + ScratchSimd128Scope scratch(masm);
That might be pre-existing in other code below, but it seems this is unused.
@@ +3384,5 @@
> + case MSimdBinaryArith::Op_mul:
> + // 8x16 mul is a valid operation, but not supported in SSE or AVX.
> + // The operation is synthesized from 16x8 multiplies by
> + // MSimdBinaryArith::AddLegalized().
> + break;
nit: inconsistent indentation here.
@@ +3402,5 @@
> + FloatRegister lhs = ToFloatRegister(ins->lhs());
> + Operand rhs = ToOperand(ins->rhs());
> + FloatRegister output = ToFloatRegister(ins->output());
> +
> + ScratchSimd128Scope scratch(masm);
ditto
::: js/src/jit/x86-shared/Lowering-x86-shared.cpp
@@ +651,5 @@
> + case MIRType::Int8x16: {
> + LSimdBinaryArithIx16* lir = new (alloc()) LSimdBinaryArithIx16();
> + bool needsTemp =
> + ins->operation() == MSimdBinaryArith::Op_mul && !MacroAssembler::HasSSE41();
> + lir->setTemp(0, needsTemp ? temp(LDefinition::SIMD128INT) : LDefinition::BogusTemp());
If I've understood correctly, ins->operation() can't be Op_mul here? So the temp can consistently be a BogusTemp()?
Attachment #8756053 -
Flags: review?(bbouvier) → review+
Comment 85•9 years ago
|
||
Comment on attachment 8756054 [details] [diff] [review]
Implement 16x8 SIMD shift operators.
Review of attachment 8756054 [details] [diff] [review]:
-----------------------------------------------------------------
Thanks! I really like the fact that each type gets their own test file.
::: js/src/jit-test/tests/asm.js/testSIMD-16x8.js
@@ +252,5 @@
> +// Test shift operators.
> +function zip1map(a, s, f) {
> + let r = [];
> + for (let i = 0; i < a.length; i++)
> + r.push(f(a[i], s));
(ubernit: return a.map(x => f(x, s)); )
::: js/src/jit/x86-shared/BaseAssembler-x86-shared.h
@@ +2771,5 @@
> +
> + void vpsllw_ir(uint32_t count, XMMRegisterID src, XMMRegisterID dst)
> + {
> + MOZ_ASSERT(count < 16);
> + shiftOpImmSimd("vpsllw", OP2_PSLLW_UdqIb, ShiftID::vpslld, count, src, dst);
There are different enum values for vpsrld/vpsrlq; I am fine with reusing them. If you feel like so, can you unify ShiftID::vpsrld with ShiftID::vpsrlq (and the same for vpslld with vpsllq, and remove the size suffixes in those enum values). Or it could be done in a follow up bug.
Attachment #8756054 -
Flags: review?(bbouvier) → review+
Assignee | ||
Comment 86•9 years ago
|
||
(In reply to Benjamin Bouvier [:bbouvier] from comment #83)
> Well, it is just a basic golden rule I'm trying to apply: don't add dead
> code, because it can remain dead for a long time. There was an instance of
> this I've found in one of the ARM assemblers: the rotation functions have
> been implemented since the very beginning, but there were no users until we
> implement rotations in wasm! And it has been like this for some time :-)
>
> I think it is being more pragmatic to not add dead code, of course unless
> you use it in the patch series, or very soon afterwards. It takes the
> developer's memory, byte per byte. Plus, if it's not used, it's not tested,
> so it can have bugs, which might be hard to find for the potential future
> user.
>
> Or as the Delphi Greek temple front says, "nothing in excess" :).
I actually agree with all of this. It must be because the code is so boilerplatey that I don't like pruning it. It really should be generated by a script from a few opcode tables.
I'll prune the dead code.
Assignee | ||
Comment 87•9 years ago
|
||
Since most targets don't implement SIMD and don't enable the generation of SIMD
instructions, it makes sense to provide a default implementation of the SIMD
visitor functions that simply crashes with an NYI error.
Remove the many identical copies of these visitors from the non-SIMD targets.
Attachment #8757023 -
Flags: review?(bbouvier)
Assignee | ||
Comment 88•9 years ago
|
||
This instruction is used when the shuffle indexes are not compile time
constants.
Give the register allocator permission to spill the index arguments to
GeneralShuffle instructions. There can be up to 16 of them, and they can't all
be registers.
Move visitSimdGeneralShuffle into the x86-specific lowering code since it has
special register allocation requirements for 8-bit lanes.
Attachment #8757024 -
Flags: review?(bbouvier)
Assignee | ||
Comment 89•9 years ago
|
||
Assignee | ||
Comment 90•9 years ago
|
||
Ben, I've rearranged the commits as we discussed. You can see the final order in the try push above or here: https://hg.mozilla.org/try/pushloghtml?changeset=32332939081fa174955be2ea12e464d0a8850a56
- All the tests are squashed into one commit that follows enabling the new types.
- I put the "NYI" commit up front so the following commits don't have to touch the arm and mips targets at all.
Comment 91•9 years ago
|
||
(In reply to Jakob Stoklund Olesen [:jolesen] from comment #90)
> Ben, I've rearranged the commits as we discussed. You can see the final
> order in the try push above or here:
> https://hg.mozilla.org/try/
> pushloghtml?changeset=32332939081fa174955be2ea12e464d0a8850a56
>
> - All the tests are squashed into one commit that follows enabling the new
> types.
> - I put the "NYI" commit up front so the following commits don't have to
> touch the arm and mips targets at all.
Fantastic, thank you!
Comment 92•9 years ago
|
||
Comment on attachment 8756055 [details] [diff] [review]
Implement 8x16 SIMD shift operators.
Review of attachment 8756055 [details] [diff] [review]:
-----------------------------------------------------------------
Thank you!
::: js/src/jit/MIR.cpp
@@ +1389,5 @@
> + // wide = yyxx yyxx yyxx yyxx yyxx yyxx yyxx yyxx
> +
> + MInstruction* shiftMask = MConstant::New(alloc, Int32Value(7));
> + addTo->add(shiftMask);
> + MInstruction* shiftBy = MBitAnd::New(alloc, right, shiftMask);
Roger.
@@ +1394,5 @@
> + addTo->add(shiftBy);
> +
> + MInstruction* mask =
> + MSimdConstant::New(alloc, SimdConstant::SplatX8(int16_t(0xff00)), MIRType::Int16x8);
> + addTo->add(mask);
It seems this could be closer to its use, moved down to...
@@ +1408,5 @@
> + // even = xx00 xx00 xx00 xx00 xx00 xx00 xx00 xx00
> + // odd = yyxx yyxx yyxx yyxx yyxx yyxx yyxx yyxx
> +
> + // Left-shift: Clear the low bits in `odd` before shifting.
> + if (op == lsh) {
... right above here.
::: js/src/jit/x86-shared/CodeGenerator-x86-shared.cpp
@@ +2639,2 @@
> masm.vmovdqa(input, output);
> + } else {
nit: no {} for one line if/else bodies.
Attachment #8756055 -
Flags: review?(bbouvier) → review+
Comment 93•9 years ago
|
||
Comment on attachment 8756056 [details] [diff] [review]
Implement 8x16 SIMD multiplies.
Review of attachment 8756056 [details] [diff] [review]:
-----------------------------------------------------------------
Looks good, thanks!
::: js/src/jit/MIR.cpp
@@ +1399,5 @@
> + // oddR = 00bb 00bb 00bb 00bb 00bb 00bb 00bb 00bb
> +
> + // Now do two 16x8 multiplications. We can use the low bits of each.
> + MInstruction* even = MSimdBinaryArith::AddLegalized(alloc, addTo, evenL, evenR, op);
> + MInstruction* odd = MSimdBinaryArith::AddLegalized(alloc, addTo, oddL, oddR, op);
We know op == Op_Mul, should we just replace these two instances here with Op_Mul directly?
Attachment #8756056 -
Flags: review?(bbouvier) → review+
Comment 94•9 years ago
|
||
Comment on attachment 8756057 [details] [diff] [review]
Implement SIMD saturating arithmetic.
Review of attachment 8756057 [details] [diff] [review]:
-----------------------------------------------------------------
Straightforward, thank you.
::: js/src/jit-test/tests/asm.js/testSIMD-16x8.js
@@ +203,5 @@
> return r
> }
>
> function binaryI(opname, lanefunc) {
> simdfunc = asmLink(asmCompile('glob', USE_ASM + I16x8 + I16x8CHK +
(see comment in the other test file)
::: js/src/jit-test/tests/asm.js/testSIMD-8x16.js
@@ +217,4 @@
> }
>
> function binaryU(opname, lanefunc) {
> simdfunc = asmLink(asmCompile('glob', USE_ASM + U8x16 + I8x16 + I8x16CHK + U8x16I8x16 + I8x16U8x16 +
pre-existing, can you add a `let` before simdfunc as well, here and above?
::: js/src/jit/MIR.h
@@ +2413,5 @@
> + public:
> + enum Operation
> + {
> + addSaturate,
> + subSaturate,
As the name of the MIR instruction already contains "saturating", what do you think about calling them simply add/sub?
@@ +2441,5 @@
> + MIRType type = left->type();
> + MOZ_ASSERT(type == MIRType::Int8x16 || type == MIRType::Int16x8);
> + setResultType(type);
> + setMovable();
> + setCommutative();
subSaturate might not be commutative.
::: js/src/jit/x86-shared/Lowering-x86-shared.cpp
@@ +696,5 @@
> +LIRGeneratorX86Shared::visitSimdBinarySaturating(MSimdBinarySaturating* ins)
> +{
> + MOZ_ASSERT(IsSimdType(ins->lhs()->type()));
> + MOZ_ASSERT(IsSimdType(ins->rhs()->type()));
> + MOZ_ASSERT(IsSimdType(ins->type()));
Can probably ReorderCommutative if the instruction is commutative (which would exhibit the sub marked as commutative bug).
Attachment #8756057 -
Flags: review?(bbouvier) → review+
Comment 95•8 years ago
|
||
Comment on attachment 8756058 [details] [diff] [review]
Implement Bool8x16.splat and Bool16x8.splat.
Review of attachment 8756058 [details] [diff] [review]:
-----------------------------------------------------------------
Looks good, thank you. x86 is such a mess.
::: js/src/jit/x86-shared/Assembler-x86-shared.h
@@ +3043,5 @@
> + void vpshuflw(uint32_t mask, FloatRegister src, FloatRegister dest) {
> + MOZ_ASSERT(HasSSE2());
> + masm.vpshuflw_irr(mask, src.encoding(), dest.encoding());
> + }
> + void vpshuflw(uint32_t mask, const Operand& src1, FloatRegister dest) {
Only the vpshuflw mask/register/register variant is used. Can you delete this one please? (and take away the BaseAssembler implementations too?)
@@ +3064,5 @@
> + void vpshufhw(uint32_t mask, FloatRegister src, FloatRegister dest) {
> + MOZ_ASSERT(HasSSE2());
> + masm.vpshufhw_irr(mask, src.encoding(), dest.encoding());
> + }
> + void vpshufhw(uint32_t mask, const Operand& src1, FloatRegister dest) {
These two are unused too at the moment.
@@ +3085,5 @@
> + void vpshufb(FloatRegister mask, FloatRegister src, FloatRegister dest) {
> + MOZ_ASSERT(HasSSSE3());
> + masm.vpshufb_rr(mask.encoding(), src.encoding(), dest.encoding());
> + }
> + void vpshufb(const Operand &mask, FloatRegister src, FloatRegister dest) {
This one is unused at the moment.
::: js/src/jit/x86-shared/CodeGenerator-x86-shared.cpp
@@ +2599,5 @@
> }
>
> void
> +CodeGeneratorX86Shared::visitSimdSplatX16(LSimdSplatX16* ins)
> +{
Here and in the x8 and x4 methods, can we MOZ_ASSERT that we have the SimdTypeToLength of the input's type is the expected one (here 16).
@@ +2624,5 @@
> + Register input = ToRegister(ins->getOperand(0));
> + FloatRegister output = ToFloatRegister(ins->output());
> + masm.vmovd(input, output);
> + masm.vpshuflw(0, output, output);
> + masm.vpshufd(0, output, output);
so we have vpshufw but it only works on 64 bits values, not 128 bits values.
yay x86 ¯\_(ツ)_/¯
Attachment #8756058 -
Flags: review?(bbouvier) → review+
Updated•8 years ago
|
Attachment #8756059 -
Flags: review?(bbouvier) → review+
Comment 96•8 years ago
|
||
Comment on attachment 8757023 [details] [diff] [review]
Provide shared NYI implementations of SIMD visitors.
Review of attachment 8757023 [details] [diff] [review]:
-----------------------------------------------------------------
Nice, thank you.
Attachment #8757023 -
Flags: review?(bbouvier) → review+
Comment 97•8 years ago
|
||
Comment on attachment 8757024 [details] [diff] [review]
Add general shuffle support for 8x16 and 16x8 shuffles.
Review of attachment 8757024 [details] [diff] [review]:
-----------------------------------------------------------------
Thank you for the patch. Great work on all of this.
::: js/src/jit/x86-shared/Lowering-x86-shared.cpp
@@ +947,5 @@
> +#if defined(JS_CODEGEN_X86)
> + // The temp register must be usable with 8-bit load and store
> + // instructions, so one of %eax-%edx.
> + if (ins->type() == MIRType::Int8x16)
> + t = tempFixed(ebx);
I think clobbering `t` will increase register pressure: there was a virtual register for the previous value of `t` which gets unused. Instead, we could just do
#if defined(JS_CODEGEN_X86)
t = ins->type() == ... ? tempFixed(ebx) : temp();
#else
t = temp();
#endif
if you don't mind.
It is a bit sad to have a platform-dependent #ifdef in the shared lowering, but it sounds more compelling than having a platform-specific implementation, since there are so few differences between x86/x64. We could have a platform-specific getTempForGeneralShuffle, but sounds like over-engineering, so let's keep it this way.
@@ +951,5 @@
> + t = tempFixed(ebx);
> +#endif
> + lir = new (alloc()) LSimdGeneralShuffleI(t);
> + } else if (ins->type() == MIRType::Float32x4)
> + lir = new (alloc()) LSimdGeneralShuffleF(temp());
nit: {} here and below as the `if` has {}
@@ +953,5 @@
> + lir = new (alloc()) LSimdGeneralShuffleI(t);
> + } else if (ins->type() == MIRType::Float32x4)
> + lir = new (alloc()) LSimdGeneralShuffleF(temp());
> + else
> + MOZ_CRASH("Unknown SIMD kind when doing a shuffle");
ditto
Attachment #8757024 -
Flags: review?(bbouvier) → review+
Assignee | ||
Comment 98•8 years ago
|
||
(In reply to Benjamin Bouvier [:bbouvier] from comment #95)
> @@ +2624,5 @@
> > + Register input = ToRegister(ins->getOperand(0));
> > + FloatRegister output = ToFloatRegister(ins->output());
> > + masm.vmovd(input, output);
> > + masm.vpshuflw(0, output, output);
> > + masm.vpshufd(0, output, output);
>
> so we have vpshufw but it only works on 64 bits values, not 128 bits values.
>
> yay x86 ¯\_(ツ)_/¯
It's probably because of encoding constraints. A 4-way swizzle can be encoded in 4 x 2 bits = 1 byte which seems to be the size of immediate field the SSE instructions support. An 8-way swizzle would require 8 x 3 bits = 3 bytes immediate field which would probably require a new instruction format.
Instead, we get pshufb which can do it all by not using immediate fields.
Assignee | ||
Updated•8 years ago
|
Keywords: leave-open
Assignee | ||
Comment 99•8 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/1df0548311c90e35e86104dc387bd3ab81dd797f
Bug 1136226 - Provide shared NYI implementations of SIMD visitors. r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/f0779cf0f83dc854e97c87466f7109d9b264b48c
Bug 1136226 - Implement MSimdExtractElement for small integer types. r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/b45c0a42f19cc86dbeb58f0f0282b788fe132d46
Bug 1136226 - Implement MSimdInsertElement for small integer types. r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/62c28a8f7ebf20927d97f809313d3e0c567a064a
Bug 1136226 - Unary functions for small integer SIMD types. r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/69ce6e7501086e2492274e352752eee5eeea447d
Bug 1136226 - Binary functions for small integer SIMD types. r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/25dc50270a77116f3e679f451e143b0031382cdd
Bug 1136226 - Implement 16x8 SIMD shift operators. r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/6ea5fb073f4be87096890646bf58061c90b22fcd
Bug 1136226 - Implement 8x16 SIMD shift operators. r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/43882729d51c668fc0e57a7941cfbdc25664422d
Bug 1136226 - Implement 8x16 SIMD multiplies. r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/4117a5326ded2a0f6543da9070e7c68c0ba0a172
Bug 1136226 - Implement SIMD saturating arithmetic. r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/a17bc6fab38f7beaffa3608ca8ec4a7d660a2bd4
Bug 1136226 - Implement Bool8x16.splat and Bool16x8.splat. r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/fa42a25f4124566158f812c0b796360dd239f814
Bug 1136226 - Handle SIMD global variables for x86 / x64. r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/5472bbbb12079f4ca2da7fb8048fd0787ef6f200
Bug 1136226 - Implement select for 8x16 and 16x8 SIMD types. r=sunfish
https://hg.mozilla.org/integration/mozilla-inbound/rev/018f7422c57ec03d65f58802e4cbb6ee2fc25418
Bug 1136226 - Implement swizzle for 8x16 and 16x8 SIMD types. r=sunfish
https://hg.mozilla.org/integration/mozilla-inbound/rev/7be2feba720f43f6c5df652f9908f7a8c8a39be1
Bug 1136226 - Implement shuffle for 8x16 and 16x8 SIMD types. r=sunfish
https://hg.mozilla.org/integration/mozilla-inbound/rev/e5479106a7ab7033580c4114d96cd5d0d3c062d2
Bug 1136226 - Add general shuffle support for 8x16 and 16x8 shuffles. r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/8e4f48388c600edd5ee5292b014cedb0b8f7672f
Bug 1136226 - Implement compares for 8x16 and 16x8 SIMD types. r=sunfish
https://hg.mozilla.org/integration/mozilla-inbound/rev/a019522f497b3d852ece421f4e09548e654d1e7e
Bug 1136226 - Inline saturating arithmetic operations. r=sunfish
https://hg.mozilla.org/integration/mozilla-inbound/rev/500e86461a3cce584c5b8069a2721a95366f8f72
Bug 1136226 - Test loads, stores, and bitcasts. r=sunfish
https://hg.mozilla.org/integration/mozilla-inbound/rev/b069413fe6b93ada4c38dad2bd6b8f22c2abf8ce
Bug 1136226 - Asm.js: Enable small integer types. r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/5721a25757df53c3650889587768f4c6787c7ed1
Bug 1136226 - Asm.js: Test cases for small integer SIMD types. r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/b986652cc2606730be2a7c0f43697d8b2a9a2641
Bug 1136226 - Make MIR New factories private where AddLegalized should be used. r=sunfish
https://hg.mozilla.org/integration/mozilla-inbound/rev/4314ed7441b656c689d98d54996dcd524f80b72a
Bug 1136226 - Inline small int SIMD.js constructor calls. r=sunfish
https://hg.mozilla.org/integration/mozilla-inbound/rev/c19c99878a6076e928690f45b37403b110cd5482
Bug 1136226 - Enable inlining of 8x16 and 16x8 types. r=sunfish
Comment 100•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/1df0548311c9
https://hg.mozilla.org/mozilla-central/rev/f0779cf0f83d
https://hg.mozilla.org/mozilla-central/rev/b45c0a42f19c
https://hg.mozilla.org/mozilla-central/rev/62c28a8f7ebf
https://hg.mozilla.org/mozilla-central/rev/69ce6e750108
https://hg.mozilla.org/mozilla-central/rev/25dc50270a77
https://hg.mozilla.org/mozilla-central/rev/6ea5fb073f4b
https://hg.mozilla.org/mozilla-central/rev/43882729d51c
https://hg.mozilla.org/mozilla-central/rev/4117a5326ded
https://hg.mozilla.org/mozilla-central/rev/a17bc6fab38f
https://hg.mozilla.org/mozilla-central/rev/fa42a25f4124
https://hg.mozilla.org/mozilla-central/rev/5472bbbb1207
https://hg.mozilla.org/mozilla-central/rev/018f7422c57e
https://hg.mozilla.org/mozilla-central/rev/7be2feba720f
https://hg.mozilla.org/mozilla-central/rev/e5479106a7ab
https://hg.mozilla.org/mozilla-central/rev/8e4f48388c60
https://hg.mozilla.org/mozilla-central/rev/a019522f497b
https://hg.mozilla.org/mozilla-central/rev/500e86461a3c
https://hg.mozilla.org/mozilla-central/rev/b069413fe6b9
https://hg.mozilla.org/mozilla-central/rev/5721a25757df
https://hg.mozilla.org/mozilla-central/rev/b986652cc260
https://hg.mozilla.org/mozilla-central/rev/4314ed7441b6
https://hg.mozilla.org/mozilla-central/rev/c19c99878a60
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox49:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla49
You need to log in
before you can comment on or make changes to this bug.
Description
•