Closed Bug 1160971 Opened 5 years ago Closed 4 years ago

SIMD: remove signMask and implement Bool vectors in the interpreter and in the JITs

Categories

(Core :: JavaScript Engine, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla46
Tracking Status
firefox40 --- affected
firefox46 --- fixed

People

(Reporter: bbouvier, Assigned: jolesen)

References

Details

(Keywords: dev-doc-complete, Whiteboard: [DocArea=JS])

Attachments

(5 files, 32 obsolete files)

77.02 KB, patch
Details | Diff | Splinter Review
88.52 KB, patch
Details | Diff | Splinter Review
105.24 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
140.38 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
83.09 KB, patch
Details | Diff | Splinter Review
Keywords: dev-doc-needed
Whiteboard: [DocArea=JS]
Blocks: 894105
The linked commit only implements it for the int{8x16,16x8,32x4} types (and still removes it for the float{32x4,64x2} types), but the simd-mandelbrot.js test still needs it because it uses signmask of float32x4 at the moment. Does a patch for this bug have to include allTrue/anyTrue for the float types?
Flags: needinfo?(benj)
In the spec, allTrue/anyTrue are removed from the integer types as well, and is only present on the Bool vector types. The most straightforward translation of float32x4 signmask will be to do a SIMD.Float32x4.lessThan and then do allTrue/anyTrue on the result.
What Dan said. Let's morph this bug into implementing Bool vectors, as these are non trivial to do and deserve their own bug.
Blocks: 1173722
Flags: needinfo?(benj)
Summary: SIMD: remove signMask and implement AllTrue/AnyTrue → SIMD: remove signMask and implement Bool vectors in the interpreter and in the JITs
Blocks: 1176375
Attachment #8640012 - Flags: feedback?(bbouvier)
Attachment #8640012 - Flags: feedback?(bbouvier) → feedback?(benj)
Comment on attachment 8640012 [details] [diff] [review]
Implementing bool vector for the interpreter

Review of attachment 8640012 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good overall! A few style nits, easy wins and discussions, so I would like to see another version of this patch.

About the implementation strategy, this is how I would recommend doing things (one patch per bullet)
1. let signMask live and implement bool vectors in the interpreter
2. add support for the JITs
3. update tests to remove signMask and use bool vectors instead
4. remove support for signMask

By moving patches 3. and 4. in your patch queue, you can make sure that all instances of signMask have all been replaced. This makes also landing and testing things easier! How does that sound?

::: js/src/builtin/SIMD.cpp
@@ +164,5 @@
> +//    SIGN_MASK(Float64x2);
> +//   SIGN_MASK(Int8x16);
> +//    SIGN_MASK(Int16x8);
> +//    SIGN_MASK(Int32x4);
> +//#undef SIGN_MASK

This code can be deleted.

@@ +259,5 @@
>      JS_FS_END
>  };
>  
>  const JSPropertySpec Float32x4Defn::TypedObjectProperties[] = {
> +//    JS_PSG("signMask", Float32x4SignMask, JSPROP_PERMANENT),

ditto, here and below

@@ +905,5 @@
> +    Elem* vec = TypedObjectMemory<Elem*>(args[0]);
> +    bool allTrue = true;
> +    // enumerate the lanes
> +    for (unsigned i = 0; i < V::lanes; i++) {
> +        allTrue &= vec[i];

We can make this slightly faster by changing the middle condition in the for loop to:

allTrue && i < V::lanes

@@ +917,5 @@
> +static bool
> +AnyTrue(JSContext* cx, unsigned argc, Value* vp)
> +{
> +    typedef typename V::Elem Elem;
> +    

nit: a few trailing spaces, here and below

@@ +925,5 @@
> +    
> +    Elem* vec = TypedObjectMemory<Elem*>(args[0]);
> +    bool anyTrue = false;
> +    // enumerate the lanes
> +    for (unsigned i = 0; i < V::lanes; i++) {

ditto, !anyTrue && i < V::lanes

@@ +1266,2 @@
>      for (unsigned i = 0; i < V::lanes; i++)
> +        result[i] = mask[i] > 0 ? tv[i] : fv[i];

Probably just testing mask[i] would do the trick?

@@ +1266,3 @@
>      for (unsigned i = 0; i < V::lanes; i++)
> +        result[i] = mask[i] > 0 ? tv[i] : fv[i];
> +    

nit: trailing ws

::: js/src/builtin/SIMD.h
@@ +50,5 @@
> +  V(notEqual, (CompareFunc<Bool32x4, NotEqual, Bool32x4>), 2)
> +
> +#define BOOL8X16_TERNARY_FUNCTION_LIST(V)                                             \
> +  V(replaceLane, (ReplaceLane<Bool8x16>), 3)                                          \
> +  V(select, (Select<Bool32x4, Bool8x16>), 3)

should be Select<Bool8x16, Bool8x16>

@@ +54,5 @@
> +  V(select, (Select<Bool32x4, Bool8x16>), 3)
> +
> +#define BOOL16X8_TERNARY_FUNCTION_LIST(V)                                             \
> +  V(replaceLane, (ReplaceLane<Bool16x8>), 3)                                          \
> +  V(select, (Select<Bool32x4, Bool16x8>), 3)

ditto with Bool16x8

@@ +310,5 @@
>    V(neg, (UnaryFunc<Int32x4, Neg, Int32x4>), 1)                                       \
>    V(not, (UnaryFunc<Int32x4, Not, Int32x4>), 1)                                       \
>    V(splat, (FuncSplat<Int32x4>), 0)
>  
> +

nit: blank line

@@ +551,5 @@
> +        return a;
> +    }
> +    static bool toType(JSContext* cx, JS::HandleValue v, Elem* out) {
> +        *out = ToBoolean(v);
> +        return true;

Isn't there a fallible ToBoolean function, as for the other types?

::: js/src/builtin/TypedObject.h
@@ +365,5 @@
>      macro_(SimdTypeDescr::Int16x8, int16_t, int16, 8)    \
>      macro_(SimdTypeDescr::Int32x4, int32_t, int32, 4)    \
> +    macro_(SimdTypeDescr::Bool8x16, int8_t, int8, 16)     \
> +    macro_(SimdTypeDescr::Bool16x8, int16_t, int16, 8)    \
> +    macro_(SimdTypeDescr::Bool32x4, int32_t, int32, 4)    \

The 3 types should be able to use int8_t for their storage, right? Having 32 bits to represent a bool seems overkill...

::: js/src/builtin/TypedObject.js
@@ +157,5 @@
>      var y = Load_float64(typedObj, offset + 8);
>      return GetFloat64x2TypeDescr()(x, y);
>  
>    case JS_SIMDTYPEREPR_INT8:
> +  case JS_SIMDTYPEREPR_BOOL8:

According to the specification, bool vectors store booleans, not integers, so reusing the integer path won't work.

@@ +196,5 @@
>      var y = Load_int32(typedObj, offset + 4);
>      var z = Load_int32(typedObj, offset + 8);
>      var w = Load_int32(typedObj, offset + 12);
>      return GetInt32x4TypeDescr()(x, y, z, w);
> +

nit: extra spaces

@@ +625,5 @@
> +          var s5 = callFunction(std_SIMD_Bool8x16_extractLane, null, this, 4);
> +          var s6 = callFunction(std_SIMD_Bool8x16_extractLane, null, this, 5);
> +          var s7 = callFunction(std_SIMD_Bool8x16_extractLane, null, this, 6);
> +          var s8 = callFunction(std_SIMD_Bool8x16_extractLane, null, this, 7);
> +          var s9 = callFunction(std_SIMD_Boolx16_extractLane, null, this, 0);

nit: missing std_SIMD_Bool8x16_extractLane, not std_SIMD_Boolx16_extractLane

Also, copy-pasto error: the last indexes restart from 0, while it should start from 7 (maybe a pre-existing issue with int8x16?)

::: js/src/vm/SelfHosting.cpp
@@ +1300,5 @@
>  
>      JS_FN("std_SIMD_Int32x4_extractLane",        simd_int32x4_extractLane,     2,0),
>      JS_FN("std_SIMD_Float32x4_extractLane",      simd_float32x4_extractLane,   2,0),
>      JS_FN("std_SIMD_Float64x2_extractLane",      simd_float64x2_extractLane,   2,0),
> +    JS_FN("std_SIMD_Bool32x4_extractLane",       simd_bool32x4_extractLane,   2,0),

Other Bool types would need their intrinsics too
Attachment #8640012 - Flags: feedback?(benj) → feedback+
Thanks for the review; that flow makes perfect sense to me.

Now, I have bool32x4 working in JIT using SSE intrinsic. The question is, is it okay to use int32 type to represent boolean vectors(Bool32x4, Bool16x8, Bool8x16) and then apply non-vectorized bitwise operations on them?
(In reply to staheri from comment #6)
> Thanks for the review; that flow makes perfect sense to me.
> 
> Now, I have bool32x4 working in JIT using SSE intrinsic. The question is, is
> it okay to use int32 type to represent boolean vectors(Bool32x4, Bool16x8,
> Bool8x16) and then apply non-vectorized bitwise operations on them?

Yes, it sounds fine. The specification is specifically loose about the implementation details of the boolean vectors. The only place where it matters are the places where we need to rematerialize the SIMD values (bailouts, recover instructions), but this just should be about specializing MSimdBox, hopefully.
Attached patch Tests for SIMD boolean vectors (obsolete) — Splinter Review
Attachment #8662166 - Flags: review?(benj)
Attachment #8640012 - Attachment is obsolete: true
Attachment #8662331 - Flags: feedback?(benj)
Comment on attachment 8662166 [details] [diff] [review]
Tests for SIMD boolean vectors

Review of attachment 8662166 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good for bool32x4! I wonder why you had to raise the iteration threshold in most loops? I guess we should be fine just having 50 iterations with the special line at the top. Can you change the value back to what it was before, in all tests, please?

::: js/src/jit-test/tests/SIMD/bool32x4-arith.js
@@ +6,5 @@
> +    var b1 = SIMD.Bool32x4(true, false, true, false);
> +    var b2 = SIMD.Bool32x4(true, true, true, true);
> +    var ret = false;
> +    for (var i = 0; i < 3500; i++) {
> +        assertEqX4(SIMD.Bool32x4.and(b1, b2), booleanBinaryX4((x, y) => x && y, b1, b2));

Where is booleanBinaryX4 defined? (it doesn't seem present to be in this patch)

::: js/src/jit-test/tests/SIMD/check.js
@@ +9,3 @@
>      var i = 0;
>      try {
> +        for (; i < 3500; i++) {

Any reason why you needed to raise the thresholds? The warmup trigger is supposed to make this compile after around 50 uses.

::: js/src/jit-test/tests/SIMD/compare.js
@@ +13,5 @@
>      var i1 = SIMD.Int32x4(1, 2, -3, 4);
>      var i2 = SIMD.Int32x4(1, -2, 3, 0);
>  
> +    for (var i = 0; i < 3500; i++) {
> +        assertEqX4(SIMD.Int32x4.lessThan(i1, i2),             [false, false, true, false]);

i think you can simply change the function bool, instead of changing all expected results, to something like

function bool(x) { return !!x; }

::: js/src/jit-test/tests/SIMD/getters.js
@@ +29,5 @@
> +        assertEq(SIMD.Bool32x4.extractLane(b4, 2), false);
> +        assertEq(SIMD.Bool32x4.extractLane(b4, 3), true);
> +
> +        assertEq(SIMD.Bool32x4.anyTrue(b4), true);
> +        assertEq(SIMD.Bool32x4.allTrue(b4), false);

Can you add some tests for anyTrue/allTrue?

anyTrue for a vector containing all falses
allTrue for a vector containing all trues

::: js/src/jit-test/tests/SIMD/replacelane.js
@@ +169,5 @@
> +        try {
> +            let x = SIMD.Bool32x4.replaceLane(b4, i < 3499 ? 0 : 1.1, true);
> +        } catch(e) {
> +            assertEq(e instanceof TypeError, true);
> +            //assertEq(i, 3499);

nit: please un-comment this line and make sure the test pass

::: js/src/tests/ecma_7/SIMD/binary-operations.js
@@ +389,5 @@
> +}
> +
> +function testBool8x16xor() {
> +  function xorb(a, b) {
> +    return !!(a ^ b);

Out of curiosity, why do we need the explicit boolean coercion here?

::: js/src/tests/ecma_7/SIMD/select-bitselect.js
@@ +16,5 @@
>  
>  function getMask(i, maskLength) {
>      var args = [];
>      for (var j = 0; j < maskLength; j++) args.push(!!((i >> j) & 1));
> +    console.log(args);

nit: please remove this call

@@ +30,5 @@
> +
> +function getSelectBitsMask(i, maskLength) {
> +    var args = [];
> +    for (var j = 0; j < maskLength; j++) args.push(!!((i >> j) & 1));
> +    console.log(args);

nit: ditto

@@ +88,3 @@
>          var mask = getMask(i, maskLength);
>          for ([x, y] of inputs)
> +            assertEqVec(type.selectBits(bitsMask, x, y), selectBits(type, bitsMask, x, y));

pre-existing, but as you're around: please add {} around these two lines, it's actually not doing what it ought to :/

::: js/src/tests/ecma_7/SIMD/shell.js
@@ +151,5 @@
>  }
>  
>  function simdLength(v) {
>      var pt = Object.getPrototypeOf(v);
> +    if (pt == SIMD.Int8x16.prototype || pt === SIMD.Bool8x16.prototype)

pre-existing, but here and below, can you replace all of these by ===?

@@ +163,5 @@
>      throw new TypeError("Unknown SIMD kind.");
>  }
>  
>  function simdLengthType(t) {
> +    if (t == SIMD.Int8x16 || t == SIMD.Bool8x16)

ditto

::: js/src/tests/ecma_7/SIMD/typedobjects.js
@@ +661,2 @@
>  function test() {
> +

nit: blank line
Attachment #8662166 - Flags: review?(benj) → feedback+
Comment on attachment 8662331 [details] [diff] [review]
WIP patch for supporting SIMD boolean vectors in interpreter and JIT

Review of attachment 8662331 [details] [diff] [review]:
-----------------------------------------------------------------

Any chance you'd be willing to split this patch into two parts (with hg histedit/hg record if you're using mercurial, git rebase -i/git add -p if you're using git): the interpreter parts on one hand, the JIT parts on the other hand?
Thanks for the review. I'll provide you with new patches.

To answer the questions that you just raised,

I wonder why you had to raise the iteration threshold in most loops? 
I noticed that 50 iteration is not enough for Ion to generate machine code (Just the baseline).

Where is booleanBinaryX4 defined? You're right, my bad. It supposed to be jit-test/lib/simd.js.
function booleanBinaryX4(op, v, w) {
    var arr = [];
    var [varr, warr] = [simdToArray(v), simdToArray(w)];
    for (var i = 0; i < 4; i++)
        arr[i] = op(varr[i], warr[i]);
    return arr;
}

i think you can simply change the function bool...
That function is from previous implementation and not being used anymore. I guess we can remove it.

Out of curiosity, why do we need the explicit boolean coercion here?
One way to make an xor is to use "^" operator. But, result of "^" operator when applied to two boolean values is a numeric value i.e. "0,1". Since we're using "===" in assertions, we need to make sure that they have the same type as well.
Attachment #8662331 - Attachment is obsolete: true
Attachment #8662331 - Flags: feedback?(benj)
Attachment #8670460 - Flags: feedback?(benj)
Attachment #8670462 - Flags: feedback?(benj)
Comment on attachment 8670460 [details] [diff] [review]
WIP patch for supporting SIMD boolean vectors in interpreter

Review of attachment 8670460 [details] [diff] [review]:
-----------------------------------------------------------------

Thank you for splitting the patch! I'd like to see another version before r+-ing it, but it looks good. I suppose from this patch that your plan is to remove signMask later? That sounds great.

::: js/src/builtin/SIMD.cpp
@@ +225,5 @@
>      static const JSFunctionSpec TypedObjectMethods[];
>      static const JSFunctionSpec Methods[];
>  };
> +class Bool8x16Defn {
> +public:

nit: before "public", half indent please (2 spaces)

@@ +228,5 @@
> +class Bool8x16Defn {
> +public:
> +    static const SimdTypeDescr::Type type = SimdTypeDescr::Bool8x16;
> +    static const JSFunctionSpec TypeDescriptorMethods[];
> +    static const JSPropertySpec TypedObjectProperties[];

You probably don't need this here (see below), and in other types as well.

@@ +233,5 @@
> +    static const JSFunctionSpec TypedObjectMethods[];
> +    static const JSFunctionSpec Methods[];
> +};
> +class Bool16x8Defn {
> +public:

ditto

@@ +241,5 @@
> +    static const JSFunctionSpec TypedObjectMethods[];
> +    static const JSFunctionSpec Methods[];
> +};
> +class Bool32x4Defn {
> +public:

ditto

@@ +384,5 @@
> +    JS_FS_END,
> +};
> +
> +const JSPropertySpec Bool8x16Defn::TypedObjectProperties[] = {
> +    JS_PS_END

You probably don't need this TypedObjectProperties here

@@ +899,5 @@
> +        return ErrorBadArgs(cx);
> +
> +    Elem* vec = TypedObjectMemory<Elem*>(args[0]);
> +    bool allTrue = true;
> +    // enumerate the lanes

nit: i would remove this comment, it's not adding much information

@@ +902,5 @@
> +    bool allTrue = true;
> +    // enumerate the lanes
> +    for (unsigned i = 0; allTrue && i < V::lanes; i++) {
> +        allTrue &= vec[i];
> +    }

nit: you don't need {} for single line for-bodies

@@ +920,5 @@
> +        return ErrorBadArgs(cx);
> +
> +    Elem* vec = TypedObjectMemory<Elem*>(args[0]);
> +    bool anyTrue = false;
> +    // enumerate the lanes

ditto for this comment

@@ +923,5 @@
> +    bool anyTrue = false;
> +    // enumerate the lanes
> +    for (unsigned i = 0; !anyTrue && i < V::lanes; i++) {
> +        anyTrue |= vec[i];
> +    }

ditto for {}

@@ +1255,5 @@
>      Elem* tv = TypedObjectMemory<Elem*>(args[1]);
>      Elem* fv = TypedObjectMemory<Elem*>(args[2]);
>  
>      Elem result[V::lanes];
> +    //Boolean mask items are represented with -1/0 for true/false

Well well well, couldn't just have them be 1 for true, 0 for false, instead? That seems more natural.

::: js/src/builtin/SIMD.h
@@ +50,5 @@
> +  V(notEqual, (CompareFunc<Bool32x4, NotEqual, Bool32x4>), 2)
> +
> +#define BOOL8X16_TERNARY_FUNCTION_LIST(V)                                             \
> +  V(replaceLane, (ReplaceLane<Bool8x16>), 3)                                          \
> +  V(select, (Select<Bool32x4, Bool8x16>), 3)

there's no select, per spec

@@ +54,5 @@
> +  V(select, (Select<Bool32x4, Bool8x16>), 3)
> +
> +#define BOOL16X8_TERNARY_FUNCTION_LIST(V)                                             \
> +  V(replaceLane, (ReplaceLane<Bool16x8>), 3)                                          \
> +  V(select, (Select<Bool32x4, Bool16x8>), 3)

there's no select, per spec

@@ +58,5 @@
> +  V(select, (Select<Bool32x4, Bool16x8>), 3)
> +
> +#define BOOL32X4_TERNARY_FUNCTION_LIST(V)                                             \
> +  V(replaceLane, (ReplaceLane<Bool32x4>), 3)                                          \
> +  V(select, (Select<Bool32x4, Bool32x4>), 3)

there's no select, per spec

@@ +60,5 @@
> +#define BOOL32X4_TERNARY_FUNCTION_LIST(V)                                             \
> +  V(replaceLane, (ReplaceLane<Bool32x4>), 3)                                          \
> +  V(select, (Select<Bool32x4, Bool32x4>), 3)
> +
> +#define BOOL16X8_BINARY_FUNCTION_LIST(V)                                              \

Please add some consistency to the grouping of the new functions: the BINARY macros stay together, the TERNARY macros stay together, etc.

@@ +208,5 @@
>    V(fromInt32x4Bits, (FuncConvertBits<Int32x4, Int8x16>), 1)                          \
>    V(neg, (UnaryFunc<Int8x16, Neg, Int8x16>), 1)                                       \
>    V(not, (UnaryFunc<Int8x16, Not, Int8x16>), 1)                                       \
> +  V(splat, (FuncSplat<Int8x16>), 1)                                                   \
> +

nit: extra line here

@@ +392,5 @@
>  #define BITWISE_COMMONX4_SIMD_OP(_)  \
>      _(and)                           \
>      _(or)                            \
>      _(xor)
> +

nit: extra line

@@ +431,5 @@
>      _(store2)                        \
>      _(store3)                        \
>      _(check)
> +
> +

nit: blank lines

@@ +567,5 @@
> +        return JS::ToInt8(a);
> +    }
> +    static bool toType(JSContext* cx, JS::HandleValue v, Elem* out) {
> +        bool ret = ToInt8(cx, v, out);
> +        *out = *out * -1;

Why do you need to do this? If *out is false, it's 0, and multiplying by 0 won't change it; if it's true, it will change the sign of the result (I guess passing from -1 to 1, in that case). Can you either add a comment explaining what's going on here, or just do:

*out = int8_t(bool(*out));

@@ +568,5 @@
> +    }
> +    static bool toType(JSContext* cx, JS::HandleValue v, Elem* out) {
> +        bool ret = ToInt8(cx, v, out);
> +        *out = *out * -1;
> +        return ret;

nit: the convention, in that case, is to do it this way:

if (!ToInt8(cx, v, out))
    return false;
*out = *out * -1;
return true;

These two remarks apply to the other types, as well :)

::: js/src/builtin/TypedObject.js
@@ +212,5 @@
> +    var s13 = Load_int8(typedObj, offset + 13);
> +    var s14 = Load_int8(typedObj, offset + 14);
> +    var s15 = Load_int8(typedObj, offset + 15);
> +    return GetBool8x16TypeDescr()(s0, s1, s2, s3, s4, s5, s6, s7,
> +                                 s8, s9, s10, s11, s12, s13, s14, s15);

nit: align s8 with the opening parenthesis above, please

@@ +226,5 @@
> +    var s7 = Load_int8(typedObj, offset + 7);
> +    return GetBool16x8TypeDescr()(s0, s1, s2, s3, s4, s5, s6, s7);
> +
> +  case JS_SIMDTYPEREPR_BOOL32:
> +      var x = Load_int8(typedObj, offset + 0);

uber-nit: this should be aligned 2 spaces after the "case" keyword above, for consistency:

  case X:
    var x = Load_int8(etc);

@@ +593,5 @@
>  // SIMD
>  
>  function SimdProtoString(type) {
>    switch (type) {
> +    case JS_SIMDTYPEREPR_BOOL8:

nit: alignment is inconsistent with below

@@ +692,5 @@
> +          var s5 = callFunction(std_SIMD_Bool8x16_extractLane, null, this, 4);
> +          var s6 = callFunction(std_SIMD_Bool8x16_extractLane, null, this, 5);
> +          var s7 = callFunction(std_SIMD_Bool8x16_extractLane, null, this, 6);
> +          var s8 = callFunction(std_SIMD_Bool8x16_extractLane, null, this, 7);
> +          var s9 = callFunction(std_SIMD_Boolx16_extractLane, null, this, 0);

nit: that should be std_SIMD_Bool8x16_extractLane here (the 8 is missing)

Please make sure this is tested (calling toSource() or toString() on a SIMD instance)
Attachment #8670460 - Flags: feedback?(benj) → feedback+
Comment on attachment 8670462 [details] [diff] [review]
WIP patch for supporting SIMD boolean vectors in JIT

Review of attachment 8670462 [details] [diff] [review]:
-----------------------------------------------------------------

One major non-nit: it seems that at some places, an int32x4 register is used as the backend for a bool32x4 vector, but not at others. This will blow up all runtime assertions, especially with respect to memory alignment constraints.
Ideally, MIRType_Bool32x4 would exist, but would lower to an LInt32x4, so you don't need LBool32x4 at all. Does that sound feasible?

::: js/src/jit/BaselineIC.cpp
@@ +8475,5 @@
> +           FOREACH_BOOL32X4_SIMD_OP(ADD_BOOL32X4_SIMD_OP_NAME_))
> +       {
> +            Rooted<SimdTypeDescr*> descr(cx, &cx->global()->bool32x4TypeDescr().as<SimdTypeDescr>());
> +            res.set(cx->compartment()->jitCompartment()->getSimdTemplateObjectFor(cx, descr));
> +            return !!res;

Can you merge the two if conditions, so as to have only one block?

::: js/src/jit/IonTypes.h
@@ +312,5 @@
>          SimdConstant cst;
>          cst.fillInt32x4(v, v, v, v);
>          return cst;
>      }
> +    static SimdConstant CreateX4(int8_t x, int8_t y, int8_t z, int8_t w) {

There's an implicit type risk here: CreateX4 has a variant that takes all int32, so we could create an instance of the wrong type. Maybe call it CreateBoolX4 instead? Or make sure all arguments are bools?

@@ +317,5 @@
> +        SimdConstant cst;
> +        cst.fillBool32x4(x, y, z, w);
> +        return cst;
> +    }
> +    static SimdConstant CreateX4(int8_t* array) {

ditto

@@ +322,5 @@
> +        SimdConstant cst;
> +        cst.fillBool32x4(array[0], array[1], array[2], array[3]);
> +        return cst;
> +    }
> +    static SimdConstant SplatX4(int8_t v) {

ditto

@@ +423,5 @@
>      MIRType_ObjectGroup,               // An ObjectGroup pointer.
>      MIRType_Last = MIRType_ObjectGroup,
>      MIRType_Float32x4 = MIRType_Float32 | (2 << VECTOR_SCALE_SHIFT),
>      MIRType_Int32x4   = MIRType_Int32   | (2 << VECTOR_SCALE_SHIFT),
> +    MIRType_Bool32x4   = MIRType_Boolean   | (2 << VECTOR_SCALE_SHIFT),

nit: remove spaces so as to keep alignment

::: js/src/jit/LIR.cpp
@@ +547,5 @@
>      MOZ_ASSERT(from != to);
>      for (size_t i = 0; i < moves_.length(); i++)
>          MOZ_ASSERT(to != moves_[i].to());
>  
> +    int memoryAlignment = (type == LDefinition::BOOL32X4) ? SimdBoolMemoryAlignment:SimdMemoryAlignment;

I'm not sure to follow: I thought the implementation was using xmm registers (namely, int32x4 registers) for storing the content of bool32x4. Correct me if that's wrong but it seems to be the case. If that's the case, the same memory alignment constraints apply on bool32x4 as for int32x4. If you haven't hit any runtime failure during your testing, it might mean there's something really bad happening, like none of the functions being inlined. This is something you can check by running the shell with the env variable IONFLAGS=logs, and use iongraph to check that the bool32x4 appear there. 

iongraph: https://github.com/sstangl/iongraph

::: js/src/jit/Lowering.cpp
@@ +4016,5 @@
>      else if (ins->type() == MIRType_Float32x4)
>          define(new(alloc()) LFloat32x4(), ins);
> +    else if (ins->type() == MIRType_Bool32x4){
> +        MOZ_CRASH("Visit SIMD Bool Const is NYI");
> +    }

nit: no {} for single line if body.

Also, please do implement this :D

@@ +4107,5 @@
>      }
>  }
>  
>  void
> +LIRGenerator::visitSimdAllTrue(MSimdAllTrue* ins)

Is there any chance this could be part of SimdUnaryX4 instead?

@@ +4114,5 @@
> +    MOZ_ASSERT(IsSimdType(input->type()));
> +    MOZ_ASSERT(input->type() == MIRType_Bool32x4);
> +
> +    LUse use = useRegisterAtStart(input);
> +    define(new(alloc()) LSimdAnyTrue(use), ins);

nit: LSimdAllTrue

@@ +4118,5 @@
> +    define(new(alloc()) LSimdAnyTrue(use), ins);
> +}
> +
> +void
> +LIRGenerator::visitSimdAnyTrue(MSimdAnyTrue* ins)

ditto

@@ +4210,5 @@
>      if (ins->type() == MIRType_Int32x4) {
>          LSimdUnaryArithIx4* lir = new(alloc()) LSimdUnaryArithIx4(in);
>          define(lir, ins);
> +    } else if (ins->type() == MIRType_Bool32x4) {
> +        LSimdUnaryArithIx4* lir = new(alloc()) LSimdUnaryArithIx4(in);

Looks like there's something wrong here: it should probably be LSimdUnaryArithBx4, as the methods

@@ +4257,5 @@
>          LSimdBinaryBitwiseX4* lir = new(alloc()) LSimdBinaryBitwiseX4;
>          lowerForFPU(lir, ins, lhs, rhs);
> +    } else if (ins->type() == MIRType_Bool32x4) {
> +        LSimdBinaryBitwiseX4* lir = new(alloc()) LSimdBinaryBitwiseX4;
> +        lowerForFPU(lir, ins, lhs, rhs);

seems you can just add "|| ins->type() == MIRType_Bool32x4" in the above branch

::: js/src/jit/MCallOptimize.cpp
@@ +3507,5 @@
>      return boxSimd(callInfo, ins, templateObj);
>  }
>  
> +
> +    IonBuilder::InliningStatus

nit: all of this code should be less indented by one tab

@@ +3546,5 @@
>  {
>      switch (type) {
>        case SimdTypeDescr::Float32x4: return Scalar::Float32x4;
>        case SimdTypeDescr::Int32x4:   return Scalar::Int32x4;
> +      case SimdTypeDescr::Bool32x4:  return Scalar::Int32x4;

I don't think you need to return a value here: this function is just used for load/store, which aren't defined for Bool vectors.

That being said, there's a pre-existing naming issue here, and I wonder how the current code can compile. There's already a function called SimdTypeToScalarType (returning MIRType) which is called way above this function, and it doesn't seem to be an issue...

::: js/src/jit/MIR.cpp
@@ +919,5 @@
>          switch (type()) {
> +          case MIRType_Bool32x4:{
> +            int8_t a[4];
> +            for (size_t i = 0; i < 4; ++i)
> +                  a[i] = ((int8_t) getOperand(i)->constantValue().toBoolean() ? -1:0);

nits: no need for the wrapping parenthesis, space before and after the :

::: js/src/jit/MIR.h
@@ +1754,5 @@
>  
> +// Returns true if all lanes are true.
> +class MSimdAllTrue
> +: public MUnaryInstruction,
> +public SimdPolicy<0>::Data

nit: spacing on this 2 lines is wrong (look at the rest of this file)

@@ +1756,5 @@
> +class MSimdAllTrue
> +: public MUnaryInstruction,
> +public SimdPolicy<0>::Data
> +{
> +protected:

nit: half indent (2 spaces) before public/protected/private keywords

@@ +1758,5 @@
> +public SimdPolicy<0>::Data
> +{
> +protected:
> +    explicit MSimdAllTrue(MDefinition* obj, MIRType type)
> +    : MUnaryInstruction(obj)

nit: 2 spaces before the :

@@ +1765,5 @@
> +        specialization_ = type;
> +        setMovable();
> +    }
> +
> +public:

ditto

@@ +1768,5 @@
> +
> +public:
> +    INSTRUCTION_HEADER(SimdAllTrue)
> +
> +    static MSimdAllTrue* NewAsmJS(TempAllocator& alloc, MDefinition* obj)

If it's not used yet, don't add it yet :)

@@ +1787,5 @@
> +        if (!ins->isSimdAllTrue())
> +            return false;
> +        return congruentIfOperandsEqual(ins);
> +    }
> +    ALLOW_CLONE(MSimdAllTrue)

I guess you could also implement foldsTo, but that can be done as a follow up bug

@@ +1793,5 @@
> +
> +// Returns true if any lane is true.
> +class MSimdAnyTrue
> +: public MUnaryInstruction,
> +public SimdPolicy<0>::Data

All comments from the previous class apply here as well, and in the next class as well

@@ -3208,5 @@
>        : MUnaryInstruction(op),
>          templateObject_(templateObject),
>          initialHeap_(initialHeap)
>      {
> -        MOZ_ASSERT(IsSimdType(op->type()));

Please don't remove this assertion.

::: js/src/jit/StackSlotAllocator.h
@@ +98,5 @@
>  #endif
>            case LDefinition::DOUBLE:    return 8;
>            case LDefinition::SINCOS:
>            case LDefinition::FLOAT32X4:
> +

nit: no newline here

::: js/src/jit/shared/LIR-shared.h
@@ +248,5 @@
>          }
>          return "unknown lane";
>      }
>  };
> +// Extracts an element from a given SIMD bool32x4 lane.

nit: Please add a line before and after this class.

@@ +251,5 @@
>  };
> +// Extracts an element from a given SIMD bool32x4 lane.
> +class LSimdExtractElementB : public LSimdExtractElementBase
> +{
> +public:

see nits in other files, like MIR.h

@@ +254,5 @@
> +{
> +public:
> +    LIR_HEADER(SimdExtractElementB);
> +    explicit LSimdExtractElementB(const LAllocation& base)
> +    : LSimdExtractElementBase(base)

here too

@@ +741,5 @@
>          return f_;
>      }
>  };
>  
> +

nit: blank line

@@ +763,5 @@
>      const SimdConstant& getValue() const { return mir_->toSimdConstant()->value(); }
>  };
>  
> +// Constant SIMD Bool32x4
> +class LBool32x4 : public LInstructionHelper<1, 0, 0>//

nit: remove // at the end

::: js/src/jit/x64/Assembler-x64.h
@@ +201,5 @@
>  // here such that it is accessible from the entire codebase. Once full support
>  // for SIMD is reached on all tier-1 platforms, this constant can be deleted.
>  static MOZ_CONSTEXPR_VAR bool SupportsSimd = true;
>  static MOZ_CONSTEXPR_VAR uint32_t SimdMemoryAlignment = 16;
> +static MOZ_CONSTEXPR_VAR uint32_t SimdBoolMemoryAlignment = 4;

Why do we need this? (see also LIR.cpp)

::: js/src/jit/x86-shared/CodeGenerator-x86-shared.cpp
@@ +2396,5 @@
>      }
>  }
>  
>  void
> +CodeGeneratorX86Shared::visitSimdExtractElementB(LSimdExtractElementB* ins)

It's the same code as visitSimdExtractElementI, which makes sense if bool32x4 are represented by int32x4. At an higher level, can you use LSimdExtractElementI instead?

@@ +2460,5 @@
>          masm.canonicalizeFloat(output);
>  }
>  
>  void
> +CodeGeneratorX86Shared::visitSimdInsertElementB(LSimdInsertElementB* ins)

ditto

::: js/src/jit/x86-shared/Encoding-x86-shared.h
@@ +175,5 @@
>      OP2_RCPPS_VpsWps    = 0x53,
>      OP2_ANDPD_VpdWpd    = 0x54,
>      OP2_ORPD_VpdWpd     = 0x56,
>      OP2_XORPD_VpdWpd    = 0x57,
> +    OP2_PUNPCKLBW_VdqWdq= 0x60,

nit: space before the =

::: js/src/jit/x86-shared/MacroAssembler-x86-shared.h
@@ +1015,5 @@
> +        vmovdqu(Operand(src), dest);
> +        // Do a low byte unpacking
> +        vpunpcklbw(dest, dest, dest);
> +        // Do a low half-word unpack
> +        vpunpcklwd(dest, dest, dest);

Why do you need the unpacking? It seems like it's just supposed to read the value from an address (stack or memory), which should already be in the right format.

@@ +1062,5 @@
>      }
>      void storeUnalignedInt32x4(FloatRegister src, const Operand& dest) {
>          vmovdqu(src, dest);
>      }
> +    void storeBool32x4(FloatRegister src, const Address& dest) {

ditto

@@ -1357,5 @@
>              return true;
>          }
>          return false;
>      }
> -

nit: keep this line please
Attachment #8670462 - Flags: feedback?(benj)
Attachment #8662166 - Attachment is obsolete: true
Attachment #8670460 - Attachment is obsolete: true
Attachment #8670462 - Attachment is obsolete: true
Attachment #8686450 - Flags: review?(benj)
Attached patch Tests for SIMD boolean vectors (obsolete) — Splinter Review
Attachment #8686451 - Flags: review?(benj)
Comment on attachment 8686450 [details] [diff] [review]
Implementing bool vector for the interpreter

Review of attachment 8686450 [details] [diff] [review]:
-----------------------------------------------------------------

A few comments below. I'll add them atop of your patch. Thanks for the patch!

::: js/src/builtin/SIMD.cpp
@@ +382,5 @@
> +    JS_SELF_HOSTED_FN("equivalent", "TypeDescrEquivalent", 1, 0),
> +    JS_FS_END,
> +};
> +
> +const JSPropertySpec Bool8x16Defn::TypedObjectProperties[] = {

Do we really need ::TypedObjectProperties arrays?

@@ +901,5 @@
> +    bool allTrue = true;
> +    for (unsigned i = 0; allTrue && i < V::lanes; i++)
> +        allTrue &= vec[i];
> +
> +    V::setReturn(args, allTrue);

I'd just set args.rval() here, to avoid the implicit coercion bool -> int8/int16/int32 -> bool when calling setReturn

@@ +920,5 @@
> +    bool anyTrue = false;
> +    for (unsigned i = 0; !anyTrue && i < V::lanes; i++)
> +        anyTrue |= vec[i];
> +
> +    V::setReturn(args, anyTrue);

ditto

@@ +1250,5 @@
>      Elem* tv = TypedObjectMemory<Elem*>(args[1]);
>      Elem* fv = TypedObjectMemory<Elem*>(args[2]);
>  
>      Elem result[V::lanes];
> +    //Boolean mask items are represented with -1/0 for true/false

Why not just true/false? I really don't see a reason not to do this.

::: js/src/builtin/SIMD.h
@@ +32,5 @@
> +  V(and, (BinaryFunc<Bool8x16, And, Bool8x16>), 2)                                    \
> +  V(or, (BinaryFunc<Bool8x16, Or, Bool8x16>), 2)                                      \
> +  V(xor, (BinaryFunc<Bool8x16, Xor, Bool8x16>), 2)                                    \
> +  V(equal, (CompareFunc<Bool8x16, Equal, Bool8x16>), 2)                               \
> +  V(notEqual, (CompareFunc<Bool8x16, NotEqual, Bool8x16>), 2)

bool8x16 doesn't have equal/notEqual in the spec

@@ +55,5 @@
> +  V(and, (BinaryFunc<Bool16x8, And, Bool16x8>), 2)                                    \
> +  V(or, (BinaryFunc<Bool16x8, Or, Bool16x8>), 2)                                      \
> +  V(xor, (BinaryFunc<Bool16x8, Xor, Bool16x8>), 2)                                    \
> +  V(equal, (CompareFunc<Bool16x8, Equal, Bool16x8>), 2)                               \
> +  V(notEqual, (CompareFunc<Bool16x8, NotEqual, Bool16x8>), 2)

ditto

@@ +78,5 @@
> +  V(and, (BinaryFunc<Bool32x4, And, Bool32x4>), 2)                                    \
> +  V(or, (BinaryFunc<Bool32x4, Or, Bool32x4>), 2)                                      \
> +  V(xor, (BinaryFunc<Bool32x4, Xor, Bool32x4>), 2)                                    \
> +  V(equal, (CompareFunc<Bool32x4, Equal, Bool32x4>), 2)                               \
> +  V(notEqual, (CompareFunc<Bool32x4, NotEqual, Bool32x4>), 2)

ditto

@@ +351,5 @@
>    INT32X4_TERNARY_FUNCTION_LIST(V)                                                    \
>    INT32X4_QUARTERNARY_FUNCTION_LIST(V)                                                \
>    INT32X4_SHUFFLE_FUNCTION_LIST(V)
>  
> +#define FOREACH_BOOL32X4_SIMD_OP(_)   \

This will be needed for when we compile these operators in the JITs, let's not add them before.

@@ +559,5 @@
> +    static bool toType(JSContext* cx, JS::HandleValue v, Elem* out) {
> +        if (!ToInt8(cx, v, out))
> +            return false;
> +        /* Convert true value (i.e. 1) to 0xff which makes SIMD select
> +           implemetation easier*/

nit: implementation + use a simple comment //

@@ +583,5 @@
> +    static bool toType(JSContext* cx, JS::HandleValue v, Elem* out) {
> +        if (!ToInt8(cx, v, out))
> +            return false;
> +        /* Convert true value (i.e. 1) to 0xff which makes SIMD select
> +           implemetation easier*/

ditto

@@ +607,5 @@
> +    static bool toType(JSContext* cx, JS::HandleValue v, Elem* out) {
> +        if (!ToInt8(cx, v, out))
> +            return false;
> +        /* Convert true value (i.e. 1) to 0xff which makes SIMD select
> +           implemetation easier*/

ditto

::: js/src/builtin/TypedObject.js
@@ +692,5 @@
> +          var s5 = callFunction(std_SIMD_Bool8x16_extractLane, null, this, 4);
> +          var s6 = callFunction(std_SIMD_Bool8x16_extractLane, null, this, 5);
> +          var s7 = callFunction(std_SIMD_Bool8x16_extractLane, null, this, 6);
> +          var s8 = callFunction(std_SIMD_Bool8x16_extractLane, null, this, 7);
> +          var s9 = callFunction(std_SIMD_Boolx16_extractLane, null, this, 8);

The 8 of Bool8x16 is still missing on this line, this will break. Is this tested somewhere?
Attachment #8686450 - Flags: review?(benj)
Comment on attachment 8686451 [details] [diff] [review]
Tests for SIMD boolean vectors

Review of attachment 8686451 [details] [diff] [review]:
-----------------------------------------------------------------

Thank you for this patch! I think Bool16x8 and Bool8x16 will need full proper testing as well, but that's going to happen in a different patch set.

::: js/src/jit-test/tests/SIMD/check.js
@@ +9,4 @@
>      var i = 0;
>      try {
> +        for (; i < 3500; i++) {
> +            if (i > 3048)

No need to change these thresholds (when the test is run with --no-threads, it gets ion-compiled. Indeed it wasn't the case a few months ago, so we might need something more to set in the jit compiler options).

::: js/src/jit-test/tests/SIMD/getters.js
@@ +5,5 @@
>  function f() {
>      var i4 = SIMD.Int32x4(1, -2, 3, -4);
> +    var b4 = SIMD.Bool32x4(true, true, false, true);
> +
> +

nit: blank line

::: js/src/jit-test/tests/asm.js/testBug1099216.js
@@ +25,4 @@
>      var g = m.SIMD.Int32x4
>      var h = g.select
>      function f() {
> +        var x = k(0, 0, 0, 0)

This won't validate as asm.js, it should rather stay as is.

::: js/src/tests/ecma_7/SIMD/typedobjects.js
@@ +253,5 @@
>    var f = array[1];
> +
> +  var sj1 = Int8x16.extractLane(f, 3);
> +
> +  assertEq(sj1, 20);

Could stay as it was before.

@@ +661,2 @@
>  function test() {
> +

nit: blank line
Attachment #8686451 - Flags: review?(benj) → review+
Attached patch 1. Interpreter changes (obsolete) — Splinter Review
Updated sajjad's patch.
Attachment #8686450 - Attachment is obsolete: true
Attachment #8688571 - Flags: review?(jolesen)
Attached patch 2. Tests (obsolete) — Splinter Review
Attachment #8686451 - Attachment is obsolete: true
Attachment #8688572 - Flags: review?(jolesen)
Comment on attachment 8688572 [details] [diff] [review]
2. Tests

Review of attachment 8688572 [details] [diff] [review]:
-----------------------------------------------------------------

The ecma_7 tests look great with really good coverage.

The tests under tests/SIMD only cover the x4 types, there are very few tests for x8 and x16 types.

It seems odd that it would be necessary to increase all the iteration counts that much. Wouldn't --baseline-eager and/or --ion-eager test the JITs without slowing down the tests?

Even if it is necessary to bump the iteration counts, I would prefer to do that in a separate patch.

::: js/src/jit-test/tests/SIMD/bool32x4-arith.js
@@ +4,5 @@
> +
> +function f() {
> +    var b1 = SIMD.Bool32x4(true, false, true, false);
> +    var b2 = SIMD.Bool32x4(true, true, true, true);
> +    for (var i = 0; i < 3500; i++) {

Is it necessary to iterate this much when the test is run with --baseline-eager and --ion-eager?

::: js/src/jit-test/tests/SIMD/check.js
@@ +4,5 @@
>  
>  function f() {
>      var f1 = SIMD.Float32x4(1, 2, 3, 4);
>      var i1 = SIMD.Int32x4(1, 2, -3, 4);
> +    var b1 = SIMD.Bool32x4(true, true, false, true);

Should this test cover the x8 and x16 types too?

::: js/src/jit-test/tests/SIMD/compare.js
@@ +6,5 @@
>      var f1 = SIMD.Float32x4(1, 2, 3, 4);
>      var f2 = SIMD.Float32x4(NaN, Infinity, 3.14, -0);
>  
>      var i1 = SIMD.Int32x4(1, 2, -3, 4);
>      var i2 = SIMD.Int32x4(1, -2, 3, 0);

Coverage for Int16x8, Int8x16?

::: js/src/jit-test/tests/SIMD/getters.js
@@ +7,5 @@
> +    var b4 = SIMD.Bool32x4(true, true, false, true);
> +
> +
> +    var bt4 = SIMD.Bool32x4(true, true, true, true);
> +    var bf4 = SIMD.Bool32x4(false, false, false, false);

Add x8 and x16 coverage here too.

::: js/src/tests/ecma_7/SIMD/typedobjects.js
@@ +486,5 @@
> +  assertEq(Bool8x16.extractLane(f, 13), true);
> +  assertEq(Bool8x16.extractLane(f, 14), false);
> +  assertEq(Bool8x16.extractLane(f, 15), false);
> +
> +  assertThrowsInstanceOf(() => Bool8x16.extractLane(f, 16), TypeError);

This is supposed to throw a RangeError. Is that a bug in our implementation?
Attachment #8688572 - Flags: review?(jolesen) → review+
Comment on attachment 8688571 [details] [diff] [review]
1. Interpreter changes

Review of attachment 8688571 [details] [diff] [review]:
-----------------------------------------------------------------

These patches look good to me. I can't really tell if they are complete.


I wonder about the in-memory representation of the bool vectors in the interpreter.

In JIT code, it is clear that we want bool vectors to be represented as 128-bit vectors with 0 lanes for false and -1 lanes for true. This format is what the SSE and NEON vector compare instructions generate, and it is the natural format for implementing select:

select(s, a, b) = (s & a) | (~s & b)

It also works for the blendvps SSE instruction.

Would it make sense to use the same representation in the interpreter? It would simplify jitted code at a small memory cost. I don't think programs will generate huge numbers of bool vectors. At least they would be transient objects.

::: js/public/Class.h
@@ +718,5 @@
>  // the beginning of every global object's slots for use by the
>  // application.
>  #define JSCLASS_GLOBAL_APPLICATION_SLOTS 5
>  #define JSCLASS_GLOBAL_SLOT_COUNT                                             \
> +    (JSCLASS_GLOBAL_APPLICATION_SLOTS + JSProto_LIMIT * 3 + 40)

This is pretty scary.

Will something bad happen if this magic number is not 40?

::: js/src/builtin/SIMD.cpp
@@ +891,5 @@
> +
> +    Elem* vec = TypedObjectMemory<Elem*>(args[0]);
> +    bool allTrue = true;
> +    for (unsigned i = 0; allTrue && i < V::lanes; i++)
> +        allTrue &= vec[i];

This code is not performance critical. I would go with '&&=' to avoid surprises if the in-memory boolean representation changes (i.e., storing 1 vs -1 in the lanes).

This would also help the compiler reason about the loop semantics.

@@ +910,5 @@
> +
> +    Elem* vec = TypedObjectMemory<Elem*>(args[0]);
> +    bool anyTrue = false;
> +    for (unsigned i = 0; !anyTrue && i < V::lanes; i++)
> +        anyTrue |= vec[i];

Same here: ||=

::: js/src/builtin/SIMD.h
@@ +222,5 @@
>  
>  #define INT8X16_TERNARY_FUNCTION_LIST(V)                                              \
>    V(replaceLane, (ReplaceLane<Int8x16>), 3)                                           \
> +  V(select, (Select<Int8x16, Bool8x16>), 3)                                           \
> +  V(selectBits, (SelectBits<Int8x16, Int8x16>), 3)                                    \

Note that selectBits was removed from the spec.
Not relevant for this patch, I suppose.

::: js/src/builtin/TypedObject.js
@@ +608,5 @@
> +      return "Int32x4";
> +    case JS_SIMDTYPEREPR_FLOAT32:
> +      return "Float32x4";
> +    case JS_SIMDTYPEREPR_FLOAT64:
> +      return "Float64x2";

Preserve the numerical order or these constants here: INT, FLOAT, BOOL. See  TypedObjectConstants.h
Attachment #8688571 - Flags: review?(jolesen) → review+
Depends on: 1226027
I am working on this now, starting from Sajjad's last JIT patch. Initially, I am adding Bool32x4 support to Ion.
Assignee: nobody → jolesen
Thanks for the reviews,

The 8 of Bool8x16 is still missing on this line, this will break. Is this tested somewhere?
I added the tests. Thanks for spotting.

Do we really need ::TypedObjectProperties arrays?
They're being referenced in template function CreateAndBindSimdClass which are applied to every SIMD type.

About representation of Boolean vectors in interpreter:
Each Boolean lane is represented by a UInt8. I've used 0xff for true instead of using a Boolean value of "true". This way filling SIMD registers don't need a conversion. For example to fill the Bool32x4 values into a SIMD register we just need to use two unpacking operations. Do you suggest to change the representation?
Adding bug 1225605 as a blocker, as it should simplify quite a few things.
Depends on: 1225605
(In reply to sajjadt from comment #26)
> About representation of Boolean vectors in interpreter:
> Each Boolean lane is represented by a UInt8. I've used 0xff for true instead
> of using a Boolean value of "true". This way filling SIMD registers don't
> need a conversion. For example to fill the Bool32x4 values into a SIMD
> register we just need to use two unpacking operations. Do you suggest to
> change the representation?

I suggest that we represent all the Bool vectors with 128 bits in memory, the same as the corresponding Int vectors with -1 for a true lane and 0 for a false lane. This has some advantages:

- Box and unbox operations become simpler -- they are just load/store instructions. No need for pack and unpack.
- SimdConstant doesn't need a Bool32x4 kind. We can reuse the Int32x4 kind.
- All Bool vector memory operations are the same: Box/unbox, API arg passing, spill/fill.
- We don't need LBool32x4. We can reuse LInt32x4 for bool constants.
- Omitting the pack and unpack instructions when converting bool vectors uses less code memory.

It looks to me like a 4-byte TypedObject and a 16-byte TypedObject will both be assigned to AllocKind::OBJECT2, so we are not actually saving any gc memory with the more compact in-memory representation.
Updated patch:

- Rebased onto the changes from Bug 1225605.

- Changed TypedObject in-memory representation of boolean vectors to always use
  16 bytes. This simplifies the JIT compiler a lot since vector registers can be
  loaded and stored directly.
Attachment #8688571 - Attachment is obsolete: true
Patch by Sajjad Taheri!
Attachment #8688572 - Attachment is obsolete: true
Updated / simplified patch:

- Box/unbox is simplified by the 16-byte representation of boolean vectors.

- SimdConstant doesn't need a BOOL32X4 variant. Just use the INT32x4 one.

- Use LInt32x4 to materialize Bool32x4 constants.

Based on a patch by Sajjad Taheri!
These functions take scalar inputs that are converted ToBoolean() before
becoming a lane in a boolean SIMD vector:

  Bool32x4(scalar, scalar, scalar, scalar)
  Bool32x4.splat(scalar)
  Bool32x4.replaceLane(simd, lane, scalar)

The scalar can legally be any type that ToBoolean() can handle, so it needs to
be converted into a 0 or -1 lane value. This is slightly complicated in the jit
because 1) The ToBoolean function can accept arguments of any type, and 2) Even
a properly types boolean scalar needs conversion from the {0,1} set of Boolean
values to the  {0,-1} set of boolean lane values.

Insert the necessary conversion instructions when inlining these functions such
that these MIR instructions expect an Int32 scalar when producing Bool32x4
values:

  MSimdValueX4(scalar, scalar, scalar, scalar)
  MSimdSplatX4(scalar)
  MSimdInsertElement(simd, scalar, lane)

Use constantToBoolean() when constant folding boolean scalars. This implements
the full semantics of the ECMA ToBoolean() function.

By inserting the necessary ToBoolean(x)-1 conversion when creating the MIR
instructions, the boolean vector version of these instructions can behave
exactly like their integer vector counterparts. This simplifies code generation
which doesn't have to implement the complicated patterns of the MNot and MTest
instructions.
Fix obvious bugs in TypedObject.js.

Also add an ENABLE_SIMD compilation flag that enables SIMD support in the
nightly build only. Previously, SIMD and BINARYDATA used the same flag.
Attachment #8691026 - Attachment is obsolete: true
Also add an ENABLE_SIMD compilation flag that enables SIMD support in the
nightly build only. Previously, SIMD and BINARYDATA used the same flag.
Attachment #8691124 - Attachment is obsolete: true
Patch by Sajjad Taheri!
Attachment #8691027 - Attachment is obsolete: true
Based on a patch by Sajjad Taheri!
Attachment #8691030 - Attachment is obsolete: true
Attachment #8691032 - Attachment is obsolete: true
These operations were removed from the SIMD.js spec.
The SIMD.js spec currently has 10 different SIMD types, and we will probably
support Float64x2 + Bool64X2 as well. It becomes repetitive and error prone to
declare separate opcode enumerations for all these types.

- Merge all the SIMD types into a single SIMD enumeration.
- Merge the EmitI32X4Expr and EmitF32X4Expr functions into EmitSimdExpr.
Attachment #8694402 - Flags: feedback?(luke)
Attached patch Part 6: ASM.js boolean vectors. (obsolete) — Splinter Review
Implement asm.js support for Bool32x4 and the allTrue() / anyTrue() functions.
Comment on attachment 8694402 [details] [diff] [review]
Part 5: Reconcile asm.js SIMD opcodes

Review of attachment 8694402 [details] [diff] [review]:
-----------------------------------------------------------------

Ah, I see.  So one recent task in flight (bug 1229399) is making our internal IR match wasm (for function bodies).  One major difference is that wasm does not segregate by type like the IR does (which is based on the early wasm polyfill work (https://github.com/WebAssembly/polyfill-prototype-1).  Another thing is that wasm has distinct ops for distinct types (so (i32,i64,f32,f64).add), so I think this change moves us farther away from wasm.  It is expected that with all the new SIMD types there will be many total ops, but that's ok since (1) wasm will have variable-length opcodes, (2) each module gets to declare its own opcode table (see BinaryEncoding.md) and assign opcodes to named operators so the hot ops can always fit in a byte.  I realize now you were asking about this earlier, but I misunderstood and thought you were talking about the duplication involving AsmJSSimdOperation; sorry about that.  Perhaps we can leave this as is for now?
Attachment #8694402 - Flags: feedback?(luke)
Attached patch Part 5: ASM.js boolean vectors. (obsolete) — Splinter Review
Luke, this is the copy-and-paste version of this patch that we talked about.

This is good enough to unblock progress on the other SIMD.js types, but it
should be apparent from the diff that this approach does not scale to 12 SIMD
types. There's too many switches.

I'll beef up the tests before asking for reviews on this patch.
Attachment #8694902 - Flags: feedback?(luke)
Attachment #8694402 - Attachment is obsolete: true
Attachment #8694403 - Attachment is obsolete: true
Comment on attachment 8694902 [details] [diff] [review]
Part 5: ASM.js boolean vectors.

Review of attachment 8694902 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for doing this.  Probably best to give bbouvier the r?.
Attachment #8694902 - Flags: feedback?(luke) → feedback+
Also add an ENABLE_SIMD compilation flag that enables SIMD support in the
nightly build only. Previously, SIMD and BINARYDATA used the same flag.

Include a Bool64x2 type to go with the existing Float64x2 type. Neither are in
the current spec.
Attachment #8696136 - Flags: review?(benj)
Attachment #8694398 - Attachment is obsolete: true
Patch by Sajjad Taheri!
Attachment #8696137 - Flags: review?(benj)
Attachment #8694399 - Attachment is obsolete: true
Based on a patch by Sajjad Taheri!
Attachment #8696138 - Flags: review?(benj)
Attachment #8694400 - Attachment is obsolete: true
These operations were removed from the SIMD.js spec.

Also remove TypedObjectProperties from SIMD objects, since there are no such
properties defined in the spec.

Reorganize the operation lists in SIMD.h to be friendlier to the boolean
vectors.
Attachment #8696141 - Flags: review?(benj)
Attachment #8694401 - Attachment is obsolete: true
Attached patch Part 5: ASM.js boolean vectors. (obsolete) — Splinter Review
Implement asm.js support for Bool32x4.
Attachment #8696142 - Flags: review?(benj)
Attachment #8694902 - Attachment is obsolete: true
Comment on attachment 8696136 [details] [diff] [review]
Part 1: SIMD bool vector implementation for the interpreter.

Review of attachment 8696136 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good!

::: js/src/builtin/SIMD.h
@@ +393,2 @@
>      _(xor)
>  #define COMP_COMMONX4_TO_INT32X4_SIMD_OP(_) \

I am assuming this et al. go away in the next patches?

::: js/src/builtin/TypedObject.h
@@ +328,5 @@
>  };
>  
>  /*
> + * Type descriptors `bool8x16`, `bool16x8`, `bool32x4`, int8x16`, `int16x8`,
> + * `int32x4`, `float32x4` and `float64x2`.

Maybe we can just collapse this comment down to "SIMD Type descriptors"?

::: js/src/builtin/TypedObject.js
@@ +437,5 @@
>        Store_int16(typedObj, offset + 14, Load_int16(fromValue, 14));
>        break;
>      case JS_SIMDTYPEREPR_INT32X4:
> +    case JS_SIMDTYPEREPR_BOOL32X4:
> +    case JS_SIMDTYPEREPR_BOOL64X2:

is this really correct for bool64x2?

::: js/src/jit/IonBuilder.cpp
@@ +11165,5 @@
>        case SimdTypeDescr::Int32x4:   return MIRType_Int32x4;
>        case SimdTypeDescr::Float32x4: return MIRType_Float32x4;
>        case SimdTypeDescr::Int8x16:
>        case SimdTypeDescr::Int16x8:
> +      case SimdTypeDescr::Float64x2:

nit: Float64x2 is repeated twice? (I might have introduced this error when rebasing the patch)

::: js/src/jsapi.cpp
@@ +54,1 @@
>  #include "builtin/SIMD.h"

Our #ifdef style guidelines require that we indent by 1 space every time we enter a new #ifdef, but that was not respected beforehand anyway. Feel free to add it though, if you want:

#ifdef ENABLE_SIMD
# include "builtin/SIMD.h"
#endif

::: js/src/jsprototypes.h
@@ +49,5 @@
>  #define IF_BDATA(real,imaginary) imaginary
>  #endif
>  
> +#ifdef ENABLE_SIMD
> +#define IF_SIMD(real,imaginary) real

(see comment in jsapi.cpp)
Attachment #8696136 - Flags: review?(benj) → review+
Comment on attachment 8696137 [details] [diff] [review]
Part 2: JSAPI/JIT tests for SIMD bool vector implementation.

Review of attachment 8696137 [details] [diff] [review]:
-----------------------------------------------------------------

If we could keep the trigger counts in all tests, that'd be great. However, these are only tests so it does not matter much.

Feel free to drop a few bool64x2 tests as well. I am fine with leaving this work for later.

::: js/src/jit-test/tests/SIMD/bool32x4-arith.js
@@ +7,5 @@
> +    var b2 = SIMD.Bool32x4(true, true, true, true);
> +    for (var i = 0; i < 3500; i++) {
> +        assertEqX4(SIMD.Bool32x4.and(b1, b2), booleanBinaryX4((x, y) => x && y, b1, b2));
> +        assertEqX4(SIMD.Bool32x4.or(b1, b2),  booleanBinaryX4((x, y) => x || y, b1, b2));
> +        assertEqX4(SIMD.Bool32x4.xor(b1, b2), booleanBinaryX4((x, y) => ( x || y ) && !( x && y ), b1, b2));

Any reason to explicit the xor here, rather than using the ^ operator? Just asking, as I saw you used the ^ operator in the spec tests.
Attachment #8696137 - Flags: review?(benj) → review+
Comment on attachment 8696138 [details] [diff] [review]
Part 3: SIMD boolean vector support for JIT.

Review of attachment 8696138 [details] [diff] [review]:
-----------------------------------------------------------------

Great work! Mostly style nits, some unused code, some code that I think should live in TypePolicy instead of IonBuilder, so requesting another round of review before r+. Thank you!

::: js/src/jit-test/tests/SIMD/bool32x4-const.js
@@ +10,5 @@
> +    return [
> +        B(false, false, false, true),
> +        B(true),
> +        B(undefined, null, "", "x"),
> +        B({}, 0, 1, -0.0),

Thanks for adding tests! Maybe you can add a test using NaN, a Symbol, objectEmulatingUndefined()? The latter is particular in the sense that it's the only kind of objects which truthiness value is false.

::: js/src/jit/IonBuilder.h
@@ +887,5 @@
>                                    unsigned numElems);
>      InliningStatus inlineSimdStore(CallInfo& callInfo, JSNative native, SimdTypeDescr::Type type,
>                                     unsigned numElems);
>  
> +    InliningStatus inlineSimdBool(CallInfo& callInfo, JSNative native, SimdTypeDescr::Type type);

nit: this method is never defined, it seems it can be deleted.

::: js/src/jit/IonTypes.h
@@ +675,5 @@
>      return MIRType((type >> ELEMENT_TYPE_SHIFT) & ELEMENT_TYPE_MASK);
>  }
>  
> +// Get the type expected when inserting a lane into a SIMD type.
> +// The is the argument type expected by the MSimdValue constructors as well as

nit: missing word here

::: js/src/jit/Lowering.cpp
@@ +4248,5 @@
>  
>      if (ins->type() == MIRType_Int32x4) {
>          LSimdUnaryArithIx4* lir = new(alloc()) LSimdUnaryArithIx4(in);
>          define(lir, ins);
> +    } else if (ins->type() == MIRType_Bool32x4) {

Maybe just a || in the above condition?

::: js/src/jit/MCallOptimize.cpp
@@ +3262,5 @@
> +//
> +// Convert any scalar value into an appropriate SIMD lane value: An Int32 value
> +// that is either 0 for false or -1 for true.
> +MDefinition*
> +IonBuilder::convertToBooleanSimdLane(MDefinition* scalar)

This really feels like this should be in the TypePolicy of the instructions themselves, where we have more information about the input (the types here can be very much imprecise). How big a burden would it be to move this code in TypePolicy.cpp?

@@ +3578,5 @@
> +    if (!checkInlineSimd(callInfo, native, SimdTypeDescr::Bool32x4, 1, &templateObj))
> +        return InliningStatus_NotInlined;
> +
> +    MIRType mirType = SimdTypeDescrToMIRType(SimdTypeDescr::Bool32x4);
> +    MSimdAnyTrue* ins = MSimdAnyTrue::New(alloc(), callInfo.getArg(0), mirType);

mirType can be statically inferred, and thus the value could be just directly copied in the ctor. I know this is less generic and doesn't prepare much well for the future, but can we keep it simple at the moment, and remove the last argument of the ctor, please?

@@ +3594,5 @@
> +    if (!checkInlineSimd(callInfo, native, SimdTypeDescr::Bool32x4, 1, &templateObj))
> +        return InliningStatus_NotInlined;
> +
> +    MIRType mirType = SimdTypeDescrToMIRType(SimdTypeDescr::Bool32x4);
> +    MSimdAllTrue* ins = MSimdAllTrue::New(alloc(), callInfo.getArg(0), mirType);

ditto

::: js/src/jit/MIR.cpp
@@ +975,3 @@
>        case MIRType_Int32x4: {
> +        int32_t v = op->constantValue().toInt32();
> +        cst = SimdConstant::SplatX4(v);

Nice use of ::SplatX4!

::: js/src/jit/MIR.h
@@ +1760,5 @@
>  
> +// Returns true if all lanes are true.
> +class MSimdAllTrue
> +: public MUnaryInstruction,
> +public SimdPolicy<0>::Data

style nit: the spacing is wrong here:

class X
  : public A,
    public B
{
  protected: // half indent
    void method(); // etc.
};

@@ +1762,5 @@
> +class MSimdAllTrue
> +: public MUnaryInstruction,
> +public SimdPolicy<0>::Data
> +{
> +protected:

style nit: half indent here

@@ +1763,5 @@
> +: public MUnaryInstruction,
> +public SimdPolicy<0>::Data
> +{
> +protected:
> +    explicit MSimdAllTrue(MDefinition* obj, MIRType type)

no strict requirement to be explicit here, but fine to keep.

@@ +1771,5 @@
> +        specialization_ = type;
> +        setMovable();
> +    }
> +
> +public:

style nit: half indent here

@@ +1778,5 @@
> +    static MSimdAllTrue* NewAsmJS(TempAllocator& alloc, MDefinition* obj)
> +    {
> +        MOZ_ASSERT(IsSimdType(obj->type()));
> +        MSimdAllTrue* ins = new(alloc) MSimdAllTrue(obj, obj->type());
> +        ins->setResultType(MIRType_Int32);

Any chance the ctor could do setResultType(type), so that we don't have to redefine the result type by hand after calling the ctor?

@@ +1799,5 @@
> +    ALLOW_CLONE(MSimdAllTrue)
> +};
> +
> +// Returns true if any lane is true.
> +class MSimdAnyTrue

All remarks for the class above apply here, I think.

@@ +2223,5 @@
>      enum Operation {
>  #define OP_LIST_(OP) Op_##OP,
>          ARITH_COMMONX4_SIMD_OP(OP_LIST_)
>          BINARY_ARITH_FLOAT32X4_SIMD_OP(OP_LIST_)
> +        BITWISE_COMMONX4_SIMD_OP(OP_LIST_)

This and the macro below shouldn't be changed, right? Bitwise instructions should use MSimdBinaryBitwise.

@@ -3208,5 @@
>        : MUnaryInstruction(op),
>          templateObject_(templateObject),
>          initialHeap_(initialHeap)
>      {
> -        MOZ_ASSERT(IsSimdType(op->type()));

Why is this removed?

::: js/src/jit/shared/LIR-shared.h
@@ +251,5 @@
>  };
> +// Extracts an element from a given SIMD bool32x4 lane.
> +class LSimdExtractElementB : public LSimdExtractElementBase
> +{
> +public:

nit: half indent (two spaces) before "public:"

@@ +254,5 @@
> +{
> +public:
> +    LIR_HEADER(SimdExtractElementB);
> +    explicit LSimdExtractElementB(const LAllocation& base)
> +    : LSimdExtractElementBase(base)

nit: style guidelines require two spaces before the :

@@ +303,5 @@
>  
> +// Replace an element from a given SIMD bool32x4 lane with a given value.
> +class LSimdInsertElementB : public LSimdInsertElementBase
> +{
> +public:

nit: ditto

@@ +306,5 @@
> +{
> +public:
> +    LIR_HEADER(SimdInsertElementB);
> +    LSimdInsertElementB(const LAllocation& vec, const LAllocation& val)
> +    : LSimdInsertElementBase(vec, val)

nit: ditto

@@ +625,5 @@
>  };
>  
> +class LSimdAnyTrue : public LInstructionHelper<1, 1, 0>
> +{
> +public:

nit: ditto

@@ +627,5 @@
> +class LSimdAnyTrue : public LInstructionHelper<1, 1, 0>
> +{
> +public:
> +    LIR_HEADER(SimdAnyTrue)
> +    LSimdAnyTrue(const LAllocation& input) {

nit: need to make the ctor explicit here

@@ +640,5 @@
> +};
> +
> +class LSimdAllTrue : public LInstructionHelper<1, 1, 0>
> +{
> +public:

ditto

@@ +642,5 @@
> +class LSimdAllTrue : public LInstructionHelper<1, 1, 0>
> +{
> +public:
> +    LIR_HEADER(SimdAllTrue)
> +    LSimdAllTrue(const LAllocation& input) {

ditto

@@ +741,5 @@
>          return f_;
>      }
>  };
>  
> +

nit: no new line please

::: js/src/jit/x86-shared/Assembler-x86-shared.h
@@ +2719,5 @@
>          MOZ_ASSERT(HasSSE2());
>          masm.vunpckhps_rr(src1.encoding(), src0.encoding(), dest.encoding());
>      }
> +
> +    void vpunpcklbw(FloatRegister src1, FloatRegister src0, FloatRegister dest) {

These 4 functions seem unused in this patch. Can you move them out to another patch, please? The changes in Encoding/BaseAssembler also need to be moved out.

::: js/src/jit/x86-shared/CodeGenerator-x86-shared.cpp
@@ +2570,5 @@
> +    masm.cmp32(output, Imm32(0x0));
> +    masm.emitSet(Assembler::NonZero, output);
> +}
> +
> +

nit: blank line

::: js/src/jit/x86-shared/Encoding-x86-shared.h
@@ +176,5 @@
>      OP2_RCPPS_VpsWps    = 0x53,
>      OP2_ANDPD_VpdWpd    = 0x54,
>      OP2_ORPD_VpdWpd     = 0x56,
>      OP2_XORPD_VpdWpd    = 0x57,
> +    OP2_PUNPCKLBW_VdqWdq= 0x60,

nit: space before =
Attachment #8696138 - Flags: review?(benj)
Comment on attachment 8696141 [details] [diff] [review]
Part 4: Delete signMask and selectBits.

Review of attachment 8696141 [details] [diff] [review]:
-----------------------------------------------------------------

Glad to see this code go! The good news is, we can even remove more, so asking for another look before r+.
Also, a big thank you for the operations lists reordering.

::: js/src/asmjs/AsmJSLink.cpp
@@ +379,5 @@
>  #define FALLTHROUGH(op) case AsmJSSimdOperation_##op:
>        case AsmJSSimdType_int32x4:
>          switch (global.simdOperation()) {
> +          FORALL_INT32X4_ASMJS_OP(SET_NATIVE_INT32X4)
> +          default:

I'd prefer to keep it the way it was before (explicit cases), so that we get compiler warnings when we miss something and can decide what to do at the moment we see the warning.

@@ +387,5 @@
>          break;
>        case AsmJSSimdType_float32x4:
>          switch (global.simdOperation()) {
> +          FORALL_FLOAT32X4_ASMJS_OP(SET_NATIVE_FLOAT32X4)
> +          default:

ditto

@@ +394,5 @@
> +        }
> +        break;
> +      case AsmJSSimdType_bool32x4:
> +        switch (global.simdOperation()) {
> +          FORALL_BOOL_SIMD_OP(SET_NATIVE_BOOL32X4)

This line should be in the asm.js patch, but it's okay to keep it here, as this path should not ever be taken (no validation in the first place).

@@ +395,5 @@
> +        break;
> +      case AsmJSSimdType_bool32x4:
> +        switch (global.simdOperation()) {
> +          FORALL_BOOL_SIMD_OP(SET_NATIVE_BOOL32X4)
> +          default:

ditto

::: js/src/asmjs/WasmIonCompile.cpp
@@ +315,5 @@
>          MOZ_ASSERT(IsSimdType(mask->type()));
>          MOZ_ASSERT(mask->type() == MIRType_Int32x4);
>          MOZ_ASSERT(IsSimdType(lhs->type()) && rhs->type() == lhs->type());
>          MOZ_ASSERT(lhs->type() == type);
> +        MSimdSelect* ins = MSimdSelect::NewAsmJS(alloc(), mask, lhs, rhs, type, true);

Probably the last argument of MSimdSelect's ctor could be removed as well?

::: js/src/builtin/SIMD.cpp
@@ -131,5 @@
>      return reinterpret_cast<Elem>(obj.typedMem());
>  }
>  
> -template<typename SimdType>
> -static bool SignMask(JSContext* cx, unsigned argc, Value* vp)

\o/

::: js/src/builtin/SIMD.h
@@ +353,5 @@
>    INT32X4_SHUFFLE_FUNCTION_LIST(V)
>  
> +//
> +// The FOREACH macros below partition all of the SIMD operations into disjoint sets.
> +// 

nit: trailing space

Also, general style nit: multi-line comments require either

// Text starting on the first line.
// Blahblahblah.

or

/*
 * Text.
 * Note the space at start of line.
 */

Also: very nice grouping and comments, thank you so much for doing that.

@@ +510,5 @@
> +    _(fromFloat64x2Bits)              \
> +    _(fromInt16x8Bits)                \
> +    _(fromInt32x4Bits)
> +
> +#define FOREACH_INT16x8_SIMD_CAST(_)  \

This contains INT16x8 while the above is INT8X16 (note lowercase vs uppercase 'x'). Can we be consistent here, please? This will probably also avoid compilation errors :)

::: js/src/jit-test/tests/SIMD/getters.js
@@ -16,5 @@
> -    for (var i = 0; i < 3500; i++) {
> -        assertEq(SIMD.Int32x4.extractLane(i4, 0), 1);
> -        assertEq(SIMD.Int32x4.extractLane(i4, 1), -2);
> -        assertEq(SIMD.Int32x4.extractLane(i4, 2), 3);
> -        assertEq(SIMD.Int32x4.extractLane(i4, 3), -4);

Unless it's tested in other files (I honestly can't remember if I've seen this in other test files from today's review...), I would keep this file and just remove the signMask tests.

::: js/src/jit/IonBuilder.cpp
@@ +11202,5 @@
>          return true;
>      }
>  
> +    // Unknown getprop access on a SIMD value
> +    trackOptimizationOutcome(TrackedOutcome::UnknownSimdProperty);

It's actually even better: with signMask leaving, there are actually no more properties on SIMD vector instances, so we can get rid of this entire method (getPropTrySimdGetter). There's going to be a JitCoach message to delete as well, somewhere.
Attachment #8696141 - Flags: review?(benj)
(In reply to Benjamin Bouvier [:bbouvier] from comment #50)
> ::: js/src/jit/MCallOptimize.cpp
> @@ +3262,5 @@
> > +//
> > +// Convert any scalar value into an appropriate SIMD lane value: An Int32 value
> > +// that is either 0 for false or -1 for true.
> > +MDefinition*
> > +IonBuilder::convertToBooleanSimdLane(MDefinition* scalar)
> 
> This really feels like this should be in the TypePolicy of the instructions
> themselves, where we have more information about the input (the types here
> can be very much imprecise). How big a burden would it be to move this code
> in TypePolicy.cpp?

Yes, I agree, and in fact I implemented it like that initially.

However, the TypePolicy functions look like they shold be idempotent, and they wouldn't be for boolean lane values. The canonical input form is an int32 value that is known to be either 0 or -1. The TypePolicy can't distinguish between any old int32 value and one that has been canonicalized to 0/-1, so it won't be idempotent.

Another possibility is to have the TypePolicy canonicalize to a boolean input value, and then generate the conversion from boolean to 0/-1 during lowering. This has the disadvantage of hiding the conversion from any optimizations.
Comment on attachment 8696142 [details] [diff] [review]
Part 5: ASM.js boolean vectors.

Review of attachment 8696142 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good! Indeed it's a bit redundant and won't cleanly scale for other smaller SIMD types. A few questions below that would probably make things easier.

::: js/src/asmjs/AsmJSValidate.cpp
@@ +1902,5 @@
>  
>          uint32_t _;
>          switch (type) {
>            case AsmJSSimdType_int32x4:
> +          case AsmJSSimdType_bool32x4:

Maybe for the literal form, we could just accept 0 or 1 as input values, and fail softly otherwise? Having only two possible values to fill a bool32x4 literal's lanes sounds better than having 2**32.

@@ +1981,5 @@
> +            uint32_t u32;
> +            JS_ALWAYS_TRUE(IsLiteralInt(m, arg, &u32));
> +            val[i] = u32 ? -1 : 0;
> +        }
> +        MOZ_ASSERT(arg== nullptr);

nit: space before ==

@@ +4648,5 @@
>          *type = Type::Float;
>          break;
> +      case AsmJSSimdType_bool32x4:
> +        f.writeOp(I32::B32X4ExtractLane);
> +        *type = Type::Signed;

I *think* this should be intish, otherwise there would be some cases where asm.js would behave differently than plain JS. For example, ffi calls accept Signed, so you could have ffiCall(bool32x4_extractLane(bool32x4(1,1,1,1), 0)), and an integer would be passed to the ffiCall function (so for instance, if ffiCall := console.log, with asm.js, it would print 1, without asm.js, it would print true). If that's true, please add a test case that would have caught this.

@@ +4886,5 @@
> +    switch (opType) {
> +      case AsmJSSimdType_bool32x4:
> +        f.writeOp(I32::B32X4AllTrue);
> +        break;
> +      default:

can you explicit the other cases here, rather than using default, please?

@@ +4902,5 @@
> +    switch (opType) {
> +      case AsmJSSimdType_bool32x4:
> +        f.writeOp(I32::B32X4AnyTrue);
> +        break;
> +      default:

ditto

::: js/src/asmjs/Wasm.h
@@ +50,5 @@
>  {
> +    return vt == ValType::I32x4 || vt == ValType::F32x4 || vt == ValType::B32x4;
> +}
> +
> +// Is this one of the boolean SIMD types?

I think we can delete this comment.

@@ +105,5 @@
>      explicit Val(float f32) : type_(ValType::F32) { u.f32_ = f32; }
>      explicit Val(double f64) : type_(ValType::F64) { u.f64_ = f64; }
> +
> +    explicit Val(const I32x4& i32x4, ValType type = ValType::I32x4) : type_(type) {
> +        MOZ_ASSERT(type_ == ValType::I32x4 || type_ == ValType::B32x4);

I'd be fine with not having the second parameter at all and just add a comment somewhere that B32x4 is stored as a I32x4. The assertion below could also be removed.

::: js/src/asmjs/WasmIonCompile.cpp
@@ +1303,5 @@
>          return true;
>        }
> +      case ValType::B32x4: {
> +        // Boolean vectors are stored as an Int vector with -1 / 0 lanes.
> +        SimdConstant lit(f.readI32X4());

Maybe MOZ_ASSERT that the actual values really are -1 or 0?

@@ +1814,5 @@
> +    if (IsSimdBoolType(simdType)) {
> +        if (!EmitSimdBooleanLaneExpr(f, &scalar))
> +            return false;
> +    } else {
> +        if (!EmitExpr(f, SimdToLaneType(simdType), &scalar))

nit: else if

@@ +2255,5 @@
>  }
>  
> +// Emit an I32 expression and then convert it to a boolean SIMD lane value, i.e. -1 or 0.
> +static bool
> +EmitSimdBooleanLaneExpr(FunctionCompiler& f, MDefinition** def)

The fact that we need this method is unfortunate, although needed in this scheme. What do you think about having the bool32x4 replaceLane/ctor/splat arguments be only int32 literals, in asm.js? That would limit the dynamic use cases (and we wouldn't need this method), but they seem much unlikely to appear in the wild, yet I am not familiar enough with simd codebases to know. What do you think?

::: js/src/jit-test/tests/asm.js/simd-fbirds.js
@@ +57,5 @@
>      var getActualBirds = imp.getActualBirds;
>  
>      var i4 = global.SIMD.Int32x4;
>      var f4 = global.SIMD.Float32x4;
> +    var b4 = global.SIMD.Bool32x4;

Note that this test file and simd-mandelbrot.js have been introduced to make sure the benchmarks would actually run on Intel's website [0] and AWFY [1]. So each update to these files imply that we need to update these two websites as well; can be done once it's all landed, but useful not to forget about.

[0] http://peterjensen.github.io/idf2014-simd/idf2014-simd.html
[1] https://github.com/h4writer/arewefastyet/blob/master/benchmarks/asmjs-ubench/mandelbrot-native.js

::: js/src/jit-test/tests/asm.js/testSIMD.js
@@ -574,5 @@
>  var CheckNegF = CheckUnaryF4('neg', function(x) { return -x });
>  CheckNegF([1, 42.42, 0.63, 13.37]);
>  CheckNegF([NaN, -Infinity, Infinity, 0]);
>  
> -var CheckNotF = CheckUnaryF4('not', (function() {

Cut/paste error? I think this should stay.

@@ -853,5 @@
>  })();
>  
> -CheckF4(ANDF32, 'var x=f4(42, 13.37,-1.42, 23.10); var y=f4(19.89, 2.4, 8.15, 16.36); x=andd(x,y)', bitwise.and([42, 13.37, -1.42, 23.10], [19.89, 2.4, 8.15, 16.36]));
> -CheckF4(ORF32,  'var x=f4(42, 13.37,-1.42, 23.10); var y=f4(19.89, 2.4, 8.15, 16.36); x=orr(x,y)',  bitwise.or( [42, 13.37, -1.42, 23.10], [19.89, 2.4, 8.15, 16.36]));
> -CheckF4(XORF32, 'var x=f4(42, 13.37,-1.42, 23.10); var y=f4(19.89, 2.4, 8.15, 16.36); x=xorr(x,y)', bitwise.xor([42, 13.37, -1.42, 23.10], [19.89, 2.4, 8.15, 16.36]));

ditto
Attachment #8696142 - Flags: review?(benj)
(In reply to Jakob Stoklund Olesen [:jolesen] from comment #52)
> (In reply to Benjamin Bouvier [:bbouvier] from comment #50)
> > ::: js/src/jit/MCallOptimize.cpp
> > @@ +3262,5 @@
> > > +//
> > > +// Convert any scalar value into an appropriate SIMD lane value: An Int32 value
> > > +// that is either 0 for false or -1 for true.
> > > +MDefinition*
> > > +IonBuilder::convertToBooleanSimdLane(MDefinition* scalar)
> > 
> > This really feels like this should be in the TypePolicy of the instructions
> > themselves, where we have more information about the input (the types here
> > can be very much imprecise). How big a burden would it be to move this code
> > in TypePolicy.cpp?
> 
> Yes, I agree, and in fact I implemented it like that initially.
> 
> However, the TypePolicy functions look like they shold be idempotent, and
> they wouldn't be for boolean lane values. The canonical input form is an
> int32 value that is known to be either 0 or -1. The TypePolicy can't
> distinguish between any old int32 value and one that has been canonicalized
> to 0/-1, so it won't be idempotent.

That's a good point, so I am fine with keeping as is then.
(In reply to Benjamin Bouvier [:bbouvier] from comment #48)
> Comment on attachment 8696136 [details] [diff] [review]
> Part 1: SIMD bool vector implementation for the interpreter.

> ::: js/src/builtin/SIMD.h
> @@ +393,2 @@
> >      _(xor)
> >  #define COMP_COMMONX4_TO_INT32X4_SIMD_OP(_) \
> 
> I am assuming this et al. go away in the next patches?

Indeed.

> ::: js/src/builtin/TypedObject.js
> @@ +437,5 @@
> >        Store_int16(typedObj, offset + 14, Load_int16(fromValue, 14));
> >        break;
> >      case JS_SIMDTYPEREPR_INT32X4:
> > +    case JS_SIMDTYPEREPR_BOOL32X4:
> > +    case JS_SIMDTYPEREPR_BOOL64X2:
> 
> is this really correct for bool64x2?

The problem is that we don't have any Load_int64 function, and this at least makes a bitwise copy of the Bool64x2. I am not even sure how to test this self-hosted code. Using Load_float64 feels gross even if it might work.

It's not clear to me if our tests are exercising this?

> ::: js/src/jit/IonBuilder.cpp
> @@ +11165,5 @@
> >        case SimdTypeDescr::Int32x4:   return MIRType_Int32x4;
> >        case SimdTypeDescr::Float32x4: return MIRType_Float32x4;
> >        case SimdTypeDescr::Int8x16:
> >        case SimdTypeDescr::Int16x8:
> > +      case SimdTypeDescr::Float64x2:
> 
> nit: Float64x2 is repeated twice? (I might have introduced this error when
> rebasing the patch)

No, my bad. I messed up a merge conflict at some point.
Flags: needinfo?(benj)
(In reply to Benjamin Bouvier [:bbouvier] from comment #49)
> Comment on attachment 8696137 [details] [diff] [review]
> Part 2: JSAPI/JIT tests for SIMD bool vector implementation.
>
> If we could keep the trigger counts in all tests, that'd be great. However,
> these are only tests so it does not matter much.

I would like that too. Sajjad mentioned that many of these tests did not reach Ion with the old 100 iteration count. I wonder if that is because the ion.warmup.trigger setting is counting function calls, and we don't have any function calls inside some of these loops.

Maybe the 3500 count is required to trigger an OSR recompilation?

I'll rewrite the tests to make 100 calls to a test function. Then the test function should be quickly Ion-compiled.

> Feel free to drop a few bool64x2 tests as well. I am fine with leaving this
> work for later.

I didn't put any Bool64x2 tests in jit-test since this type is not inlined in Ion yet. There should be a few Bool64x2 tests under tests/ecma_7, I think. We may want to isolate the Float64x2 and Bool64x2 tests when we start enabling the standard SIMD types in beta and beyond. I'll do that then.

> ::: js/src/jit-test/tests/SIMD/bool32x4-arith.js
> @@ +7,5 @@
> > +    var b2 = SIMD.Bool32x4(true, true, true, true);
> > +    for (var i = 0; i < 3500; i++) {
> > +        assertEqX4(SIMD.Bool32x4.and(b1, b2), booleanBinaryX4((x, y) => x && y, b1, b2));
> > +        assertEqX4(SIMD.Bool32x4.or(b1, b2),  booleanBinaryX4((x, y) => x || y, b1, b2));
> > +        assertEqX4(SIMD.Bool32x4.xor(b1, b2), booleanBinaryX4((x, y) => ( x || y ) && !( x && y ), b1, b2));
> 
> Any reason to explicit the xor here, rather than using the ^ operator? Just
> asking, as I saw you used the ^ operator in the spec tests.

I guess '^' returns 1 / 0, but here we need true / false. Either !!(x^y) or (x!=y) would also work, though.
(In reply to Jakob Stoklund Olesen [:jolesen] from comment #55)
> The problem is that we don't have any Load_int64 function, and this at least
> makes a bitwise copy of the Bool64x2. I am not even sure how to test this
> self-hosted code. Using Load_float64 feels gross even if it might work.
> 
> It's not clear to me if our tests are exercising this?

They might be related to toString()/toSource(), or other remaining typed object cruft. From a quick scan, it seems related to constructors/property assignments on typed structs/arrays containing a SIMD value and to reifying objects (there certainly are a few tests for that under ecma_7/SIMD/typedobjects.js).

As the general plan is moving forward to SIMD values *not* being typed objects anymore, I think it is fine to let as is in the meanwhile, with a clear comment that this is a (hopefully) temporary situation.
Flags: needinfo?(benj)
(In reply to Benjamin Bouvier [:bbouvier] from comment #51)
> Comment on attachment 8696141 [details] [diff] [review]
> Part 4: Delete signMask and selectBits.

> ::: js/src/asmjs/AsmJSLink.cpp
> @@ +379,5 @@
> >  #define FALLTHROUGH(op) case AsmJSSimdOperation_##op:
> >        case AsmJSSimdType_int32x4:
> >          switch (global.simdOperation()) {
> > +          FORALL_INT32X4_ASMJS_OP(SET_NATIVE_INT32X4)
> > +          default:
> 
> I'd prefer to keep it the way it was before (explicit cases), so that we get
> compiler warnings when we miss something and can decide what to do at the
> moment we see the warning.

This is one of those things that won't scale well to 12 SIMD types. Before this patch there were only two SIMD types in asm.js, and this switch was easy to write because the operations partition into 3 disjoint sets: Shared ops, float-only, and int-only ops.

However, the number of disjoint partitions grows exponentially (literally) as we add more SIMD types. Even when adding just a single Bool32x4 type, we would need a partition of the operations like this:

- Common to all types.
- Common to Int and Bool, but not float.
- Common to Int and Float, but not bool.
- Exclusive to Int.
- Exclusive to Float.
- Exclusive to Bool.

Add to this the various cast and conversion functions. The set of conversions supported by any given type tends to be unique because types don't support no-op conversions to themselves.

The end result would be something like 20-30 lines of macros in each of these switches. While the compiler would emit warnings about missing operations in each switch, it would not be able to tell us if we put each operation in the right case (i.e., is this subset of operations supported for this type or not.)

As we add more SIMD types, it is easy to make mistakes in determining the set of operations supported by each type. I think it is safer to maintain a single copy of that list - the FORALL_INT32X4_ASMJS_OP() macro. Notice that this same macro is used both in AsmJSValidate.cpp:IsSimdValidOperationType() and in AsmJSLink.cpp.

This means that if I make a mistake in listing all the operations on Int32x2, it will hopefully be caught by asm.js validation tests, and it is not required to duplicate all of these tests for the AsmJSLink.cpp:ValidateSimdOperation() function.

To summarize, we have these two options:

a) Keep a single canonical list of all operations on a type, and use a default: label, foregoing the benefit of compiler warnings.

b) Partition SIMD operations into a large number of disjoint sets that are enumerated by every switch statement.

I think a) is less error-prone, particularly when adding or removing operations to a SIMD type.

What do you think?
(In reply to Benjamin Bouvier [:bbouvier] from comment #53)
> Comment on attachment 8696142 [details] [diff] [review]
> Part 5: ASM.js boolean vectors.
> 
> ::: js/src/asmjs/AsmJSValidate.cpp
> @@ +1902,5 @@
> >  
> >          uint32_t _;
> >          switch (type) {
> >            case AsmJSSimdType_int32x4:
> > +          case AsmJSSimdType_bool32x4:
> 
> Maybe for the literal form, we could just accept 0 or 1 as input values, and
> fail softly otherwise? Having only two possible values to fill a bool32x4
> literal's lanes sounds better than having 2**32.

That would be simple to implement, of course, but I'm not sure what the benefit would be? Since Asm.js doesn't have a boolean type but uses int instead, I though it would be better to follow that behavior for boolean literals. In other words, the literals should accept the same types and values as the non-literal version of a constructor.

Otherwise, ti feels like we're halfway introducing a boolean type to asm.js.

Does that make sense?

> @@ +4648,5 @@
> >          *type = Type::Float;
> >          break;
> > +      case AsmJSSimdType_bool32x4:
> > +        f.writeOp(I32::B32X4ExtractLane);
> > +        *type = Type::Signed;
> 
> I *think* this should be intish, otherwise there would be some cases where
> asm.js would behave differently than plain JS. For example, ffi calls accept
> Signed, so you could have ffiCall(bool32x4_extractLane(bool32x4(1,1,1,1),
> 0)), and an integer would be passed to the ffiCall function (so for
> instance, if ffiCall := console.log, with asm.js, it would print 1, without
> asm.js, it would print true). If that's true, please add a test case that
> would have caught this.

I assume this would apply to all functions returning a boolean value? So also the allTrue() and anyTrue() functions?

The CheckNot() and CheckComparison() functions set *type = Type::Int, FWIW.

> @@ +105,5 @@
> >      explicit Val(float f32) : type_(ValType::F32) { u.f32_ = f32; }
> >      explicit Val(double f64) : type_(ValType::F64) { u.f64_ = f64; }
> > +
> > +    explicit Val(const I32x4& i32x4, ValType type = ValType::I32x4) : type_(type) {
> > +        MOZ_ASSERT(type_ == ValType::I32x4 || type_ == ValType::B32x4);
> 
> I'd be fine with not having the second parameter at all and just add a
> comment somewhere that B32x4 is stored as a I32x4. The assertion below could
> also be removed.

This constructor is shared by I32x4 and B32x4 values, and the Val object contains a type_ field. I assumed that it was important for a Val to have the correct type?

> @@ +2255,5 @@
> >  }
> >  
> > +// Emit an I32 expression and then convert it to a boolean SIMD lane value, i.e. -1 or 0.
> > +static bool
> > +EmitSimdBooleanLaneExpr(FunctionCompiler& f, MDefinition** def)
> 
> The fact that we need this method is unfortunate, although needed in this
> scheme. What do you think about having the bool32x4 replaceLane/ctor/splat
> arguments be only int32 literals, in asm.js? That would limit the dynamic
> use cases (and we wouldn't need this method), but they seem much unlikely to
> appear in the wild, yet I am not familiar enough with simd codebases to
> know. What do you think?

A literal-only splat() would be useless since the constructor can do the same thing.

If we require literal arguments to these functions, the only way of getting runtime values into a boolean vector would be with the comparison functions. I'm afraid that would force Emscripten to jump through hoops if it ever needed to do that, and we would then need to implement the MIR optimizations that turns the hoop-jumping back into good code.

I don't think the benefits of this simplification are big enough to justify that.

> ::: js/src/jit-test/tests/asm.js/testSIMD.js
> @@ -574,5 @@
> >  var CheckNegF = CheckUnaryF4('neg', function(x) { return -x });
> >  CheckNegF([1, 42.42, 0.63, 13.37]);
> >  CheckNegF([NaN, -Infinity, Infinity, 0]);
> >  
> > -var CheckNotF = CheckUnaryF4('not', (function() {
> 
> Cut/paste error? I think this should stay.

In the current iteration of the SIMD.js spec, the floating point types don't have any bitwise operations. I removed them from the corresponding macros as part of the "Part 4" patch.
(In reply to Jakob Stoklund Olesen [:jolesen] from comment #59)
> (In reply to Benjamin Bouvier [:bbouvier] from comment #53)
> > Comment on attachment 8696142 [details] [diff] [review]
> > Part 5: ASM.js boolean vectors.
> > 
> > @@ +4648,5 @@
> > >          *type = Type::Float;
> > >          break;
> > > +      case AsmJSSimdType_bool32x4:
> > > +        f.writeOp(I32::B32X4ExtractLane);
> > > +        *type = Type::Signed;
> > 
> > I *think* this should be intish, otherwise there would be some cases where
> > asm.js would behave differently than plain JS. For example, ffi calls accept
> > Signed, so you could have ffiCall(bool32x4_extractLane(bool32x4(1,1,1,1),
> > 0)), and an integer would be passed to the ffiCall function (so for
> > instance, if ffiCall := console.log, with asm.js, it would print 1, without
> > asm.js, it would print true). If that's true, please add a test case that
> > would have caught this.
> 
> I assume this would apply to all functions returning a boolean value? So
> also the allTrue() and anyTrue() functions?
> 
> The CheckNot() and CheckComparison() functions set *type = Type::Int, FWIW.

I added a testcase passing extractLane() to an extern function, and that did indeed cause different behavior with -no-asmjs, passing true/false instead of 0/1.

When I changed the type to 'Int' which is not a subtype of 'extern', I got a validation error for that code, forcing me to add a |0 cast.

I added tests for allTrue and anyTrue, but they were already producing 'Int'.

Nice catch!
Address review comments.
- Remove duplicate Float624x2 so the patch compiles.
- Minor style issues.
Address review comments.
- Revert repeat counts to 150. These tests still trigger with --ion-eager.
- Use inIon() instead of arbitrary repeat counts in new tests.
- Use (x != y) as a bool x bool -> bool implementation of xor.
Address review comments.
- Add suggested weird objects to the ToBoolean() test.
- Minor cleanups as suggested.
- Reinstate assertion that shouldn't have been removed.
- Remove the pack/unpack x86 instructions completely along with their encoding
  details.
- Don't move logic into TypePolicy per idempotency discussion.
Attachment #8698225 - Flags: review?(benj)
Attachment #8696138 - Attachment is obsolete: true
Address review comments.
- Delete even more code dealing with SIMD object properties.
- Remove the MSimdSelect support for bitwise selects. We can always add this
  back if a future spec requires it.
- Reinstate tests/SIMD/getters.js. Not sure why I deleted that test.
- Did NOT remove switch defaults and replace them with explicit lists of
  complement set SIMD operations. See Comment 58 above.
Attachment #8698227 - Flags: review?(benj)
Attachment #8696141 - Attachment is obsolete: true
Attached patch Part 5: ASM.js boolean vectors. (obsolete) — Splinter Review
Address review comments.
- Did NOT restrict requirements on boolean vector literals. See Comment 59
  above.
- Changed the type of a boolean extractLane to Int so it can't be passed
  directly to an FFI function. Added relevant tests.
- Minor stylistic fixes.
Attachment #8698229 - Flags: review?(benj)
Attachment #8696142 - Attachment is obsolete: true
Attachment #8696136 - Attachment is obsolete: true
Attachment #8696137 - Attachment is obsolete: true
Blocks: 1228612
Attachment #8698222 - Attachment is obsolete: true
Rebased; added missing tests for Bool64x2.
Attachment #8698224 - Attachment is obsolete: true
Rebased.
Attachment #8699519 - Flags: review?(benj)
Attachment #8698225 - Attachment is obsolete: true
Attachment #8698225 - Flags: review?(benj)
- Rebased.
- Removed a use of 'constexpr' which broke the Windows build.
- Properly removed all the bitwise operators from Float32x4 and updated the
  tests. In the previous version of this patch, these operators were removed
  from Odin and Ion only, but still present in the interpreter.
Attachment #8699521 - Flags: review?(benj)
Attachment #8698227 - Attachment is obsolete: true
Attachment #8698227 - Flags: review?(benj)
- Rebased.
- Removed all asm.js test uses of Float32x4 bitwise operations.
Attachment #8699522 - Flags: review?(benj)
Attachment #8698229 - Attachment is obsolete: true
Attachment #8698229 - Flags: review?(benj)
(In reply to Benjamin Bouvier [:bbouvier] from comment #57)
> (In reply to Jakob Stoklund Olesen [:jolesen] from comment #55)
> > The problem is that we don't have any Load_int64 function, and this at least
> > makes a bitwise copy of the Bool64x2. I am not even sure how to test this
> > self-hosted code. Using Load_float64 feels gross even if it might work.
> > 
> > It's not clear to me if our tests are exercising this?
> 
> They might be related to toString()/toSource(), or other remaining typed
> object cruft. From a quick scan, it seems related to constructors/property
> assignments on typed structs/arrays containing a SIMD value and to reifying
> objects (there certainly are a few tests for that under
> ecma_7/SIMD/typedobjects.js).
> 
> As the general plan is moving forward to SIMD values *not* being typed
> objects anymore, I think it is fine to let as is in the meanwhile, with a
> clear comment that this is a (hopefully) temporary situation.

It turns out that this code was being tested by the ToSource.js and typedobjects.js tests in tests/ecma_7/SIMD, but the test coverage was incomplete. I've updated these files to cover all the SIMD types, and they even revealed a bug in the Int8x16 implementation. I'm including the fix in the patches for Bug 1233111.

The Bool64x2 copy code you asked about above is being exercised by testBool64x2Reify() in typedobjects.js. It gets called when accessing the Bool64.array(3) thingy.

We should probably remove these things at some point. I don't think the SIMD types are supposed to behave like TypedObjects like that, and an array of Bool SIMD types is extra wrong since they don't have load/store operations, and this probably exposes the representation we're using.
(In reply to Jakob Stoklund Olesen [:jolesen] from comment #58)
> 
> What do you think?

Agreed.
(In reply to Jakob Stoklund Olesen [:jolesen] from comment #72)
> We should probably remove these things at some point. I don't think the SIMD
> types are supposed to behave like TypedObjects like that, and an array of
> Bool SIMD types is extra wrong since they don't have load/store operations,
> and this probably exposes the representation we're using.

Entirely agreed. I think this is cruft inherited from the fact that SIMD values are internally represented by TypedObjects. So this should go away as soon as we represent SIMD instances by their own value type / gc thing.
Comment on attachment 8699519 [details] [diff] [review]
Part 3: SIMD boolean vector support for JIT.

Review of attachment 8699519 [details] [diff] [review]:
-----------------------------------------------------------------

Looks nice, thank you!

::: js/src/jit/Lowering.cpp
@@ +4148,5 @@
>  void
> +LIRGenerator::visitSimdAllTrue(MSimdAllTrue* ins)
> +{
> +    MDefinition* input = ins->input();
> +    MOZ_ASSERT(IsSimdType(input->type()));

Maybe IsBooleanSimdType here and in the function below?

@@ +4260,5 @@
>  LIRGenerator::visitSimdBinaryComp(MSimdBinaryComp* ins)
>  {
>      MOZ_ASSERT(IsSimdType(ins->lhs()->type()));
>      MOZ_ASSERT(IsSimdType(ins->rhs()->type()));
> +    MOZ_ASSERT(ins->type() == MIRType_Bool32x4);

Preparing for the future, this could also be IsBooleanSimdType(ins->type())

::: js/src/jit/MCallOptimize.cpp
@@ +3588,5 @@
> +    return InliningStatus_Inlined;
> +}
> +
> +IonBuilder::InliningStatus
> +IonBuilder::inlineSimdAllTrue(CallInfo& callInfo, JSNative native)

OK for templatizing, if you want to factor out this function with the above.

::: js/src/jit/MIR.cpp
@@ +911,5 @@
>      bool allSame = true;
>  
>      for (size_t i = 0; i < 4; ++i) {
>          MDefinition* op = getOperand(i);
> +        MOZ_ASSERT_IF(laneType != MIRType_Boolean, op->type() == laneType);

Can we straighten this assertion like this? 

MOZ_ASSERT(op->type() == laneType ||
           (laneType == MIRType_Boolean && op->type() == MIRType_Int32));

@@ +962,5 @@
>      DebugOnly<MIRType> laneType = SimdTypeToLaneType(type());
>      MDefinition* op = getOperand(0);
>      if (!op->isConstantValue())
>          return this;
> +    MOZ_ASSERT_IF(laneType != MIRType_Boolean, op->type() == laneType);

ditto

::: js/src/jit/MIR.h
@@ +1790,5 @@
> +    AliasSet getAliasSet() const override {
> +        return AliasSet::None();
> +    }
> +    bool congruentTo(const MDefinition* ins) const override {
> +        if (!ins->isSimdAllTrue())

This check isn't needed: the first check in congruentIfOperandsEqual (testing op()) does just that.

@@ +1831,5 @@
> +        return AliasSet::None();
> +    }
> +    bool congruentTo(const MDefinition* ins) const override {
> +        if (!ins->isSimdAnyTrue())
> +            return false;

ditto

::: js/src/jit/shared/LIR-shared.h
@@ +248,5 @@
>          }
>          return "unknown lane";
>      }
>  };
> +// Extracts an element from a given SIMD bool32x4 lane.

nit: new line before this comment and after the class declaration please

::: js/src/jit/x86-shared/CodeGenerator-x86-shared.cpp
@@ +2414,5 @@
> +    } else {
> +        uint32_t mask = MacroAssembler::ComputeShuffleMask(lane);
> +        masm.shuffleInt32(mask, input, ScratchSimd128Reg);
> +        masm.moveLowInt32(ScratchSimd128Reg, output);
> +    }

Unless I'm missing something, the above block is the same code as visitSimdExtractElementI? Guess that if LSimdExtractElementB inherit LSimdExtractElementI, we could just call visitSimdExtractElementI here. If it's too much work, that can be a followup / good first bug for contributors.

@@ +2466,5 @@
>          masm.canonicalizeFloat(output);
>  }
>  
>  void
> +CodeGeneratorX86Shared::visitSimdInsertElementB(LSimdInsertElementB* ins)

Ditto, do we really need LSimdInsertElementB, as it appears to be generating the same code as LSimdInsertElementI? (may be a follow up as well)
Attachment #8699519 - Flags: review?(benj) → review+
(In reply to Benjamin Bouvier [:bbouvier] from comment #75)
> Comment on attachment 8699519 [details] [diff] [review]
> Part 3: SIMD boolean vector support for JIT.

> ::: js/src/jit/x86-shared/CodeGenerator-x86-shared.cpp
> @@ +2414,5 @@
> > +    } else {
> > +        uint32_t mask = MacroAssembler::ComputeShuffleMask(lane);
> > +        masm.shuffleInt32(mask, input, ScratchSimd128Reg);
> > +        masm.moveLowInt32(ScratchSimd128Reg, output);
> > +    }
> 
> Unless I'm missing something, the above block is the same code as
> visitSimdExtractElementI? Guess that if LSimdExtractElementB inherit
> LSimdExtractElementI, we could just call visitSimdExtractElementI here. If
> it's too much work, that can be a followup / good first bug for contributors.

Yes, visitSimdExtractElementB() is identical to visitSimdExtractElementI() except for the added 'and' instruction at the end to coerce the result into {0,1}.

It would indeed be nice to expose this 'and' to the MIR optimizers. It seems likely to be followed by an MTest instruction, and the 'and' is redundant in that case.

Perhaps what we need is an MToBoolean function which is equivalent to MNot(MNot(x))? Then MTest(MToBoolean(x)) could fold to MTest(x), and we could simply generate MToBoolean(MExtractElementI(x)).

I do think we should defer this to a followup bug.

> @@ +2466,5 @@
> >          masm.canonicalizeFloat(output);
> >  }
> >  
> >  void
> > +CodeGeneratorX86Shared::visitSimdInsertElementB(LSimdInsertElementB* ins)
> 
> Ditto, do we really need LSimdInsertElementB, as it appears to be generating
> the same code as LSimdInsertElementI? (may be a follow up as well)

I think you are right. It should be dropped.
Rebased and addressed review comments:

- Removed LSimdInsertElementB. Just use LSimdInsertElementI everywhere.
- Merged inlineSimdAnyTrue() and inlineSimdAllTrue() into one function.
- Minor style issues etc.
Attachment #8699519 - Attachment is obsolete: true
Comment on attachment 8699521 [details] [diff] [review]
Part 4: Delete signMask and selectBits.

Review of attachment 8699521 [details] [diff] [review]:
-----------------------------------------------------------------

Looks great! I think in Odin, Float32x4 bitwise operators and "not" are still present and might need removal as well. Feel free to re-ask for review if you want to, but it seems trivial enough that you can consider r=me on including this removal and changing the asm.js/testSIMD.js, which ought to need modifications.

::: js/src/asmjs/AsmJSValidate.cpp
@@ -4932,5 @@
>          return CheckSimdSplat(f, call, opType, type);
> -
> -      case AsmJSSimdOperation_allTrue:
> -      case AsmJSSimdOperation_anyTrue:
> -        MOZ_CRASH("unreachable and nyi"); // TODO bug 1160971

If my reading of the macro partitioning is correct, this might need to stay as part of this patch to make sure it compiles independently from the other patches?

::: js/src/asmjs/WasmIR.h
@@ -371,5 @@
>      FromI32X4Bits,
>      Swizzle,
>      Shuffle,
>      Select,
> -    BitSelect,

Unless I am missing something, you also need to delete the BinaryBitwise entry from this list. This will imply changing validation and changing asm.js tests a bit.

Also, Float32x4::not shouldn't validate.

I was wondering why asm.js/testSIMD.js wasn't affected; is it because these functions hadn't been removed? or is the test changed in another patch? (many patches and long time between reviews makes me confused)
Attachment #8699521 - Flags: review?(benj) → review+
(In reply to Benjamin Bouvier [:bbouvier] from comment #78)
> Comment on attachment 8699521 [details] [diff] [review]
> Part 4: Delete signMask and selectBits.
> 
> Review of attachment 8699521 [details] [diff] [review]:

> ::: js/src/asmjs/WasmIR.h
> @@ -371,5 @@
> >      FromI32X4Bits,
> >      Swizzle,
> >      Shuffle,
> >      Select,
> > -    BitSelect,
> 
> Unless I am missing something, you also need to delete the BinaryBitwise
> entry from this list. This will imply changing validation and changing
> asm.js tests a bit.

Yes, I missed that. Thanks!

> Also, Float32x4::not shouldn't validate.
> 
> I was wondering why asm.js/testSIMD.js wasn't affected; is it because these
> functions hadn't been removed? or is the test changed in another patch?
> (many patches and long time between reviews makes me confused)

No, unfortunately these patches are not independent. The asm.js test cases would probably start failing with 'Part 3: JIT', and only start working again with 'Part 5: asm.js'.

Since we're changing the return type of comparisons and removing operations from Float32x4, it is hard to keep tests passing when the patches are organized this way.

The big testcase update happens in the next patch, 'Part 5'.

I'll make sure we have tests for the non-validation of bitwise Float32x4 operations in asm.js.
Comment on attachment 8699522 [details] [diff] [review]
Part 5: ASM.js boolean vectors.

Review of attachment 8699522 [details] [diff] [review]:
-----------------------------------------------------------------

Great work on this patch series, thank you!

::: js/src/asmjs/AsmJSLink.cpp
@@ +147,5 @@
>            case ValType::F64:
>              *(double*)datum = v.f64();
>              break;
>            case ValType::I32x4:
> +          case ValType::B32x4:

Maybe add the "// Bool32x4 uses the same data layout" comment here as well?

::: js/src/asmjs/AsmJSModule.h
@@ +875,5 @@
>            case wasm::ValType::I32:   case wasm::ValType::F32:   width = 4;  break;
>            case wasm::ValType::I64:   case wasm::ValType::F64:   width = 8;  break;
> +          case wasm::ValType::I32x4:
> +          case wasm::ValType::F32x4:
> +          case wasm::ValType::B32x4:

It's weird to have a mix of inline cases and paragraphs cases. Can you make it look more consistent here, please? I'd be good with

case I32x4: case F32x4:
case B32x4:             width = 16; break;

but the other way works too.

::: js/src/asmjs/AsmJSValidate.cpp
@@ +1842,2 @@
>        case AsmJSSimdType_int32x4:
> +      case AsmJSSimdType_float32x4:

unnecessary code motion, but well ;)

@@ +4589,5 @@
>  {
>      switch (opType) {
> +      case AsmJSSimdType_int32x4:   f.writeOp(B32X4::BinaryCompI32X4); break;
> +      case AsmJSSimdType_float32x4: f.writeOp(B32X4::BinaryCompF32X4); break;
> +      case AsmJSSimdType_bool32x4:  MOZ_CRASH("Can't compare boolean vectors");

That should be unreachable, right, since the operation won't have validated in the first place?

@@ +4624,5 @@
>          *type = Type::Float;
>          break;
> +      case AsmJSSimdType_bool32x4:
> +        f.writeOp(I32::B32X4ExtractLane);
> +        *type = Type::Int;

I've gone through all the operators in asm.js, and there doesn't seem to be any which takes int as an input, and that wouldn't do an int32 coercion at this place, so I think we're safe with Int (and don't need to use Intish). In the worst case, the fuzzers will let us know.

@@ +4820,5 @@
>      unsigned numArgs = CallArgListLength(call);
>      if (numArgs != 3)
>          return f.failf(call, "expected 3 arguments to SIMD store, got %u", numArgs);
>  
> +    SwitchPackOp(f, opType, I32X4::Store, F32X4::Store, B32X4::Bad);

Maybe the last argument to SwitchPackOp could be optional and its default value be B32X4::Bad?

::: js/src/asmjs/WasmIonCompile.cpp
@@ +129,5 @@
>                case ValType::F32x4:
>                  ins = MSimdConstant::New(alloc(), SimdConstant::CreateX4(v.f32x4()), MIRType_Float32x4);
>                  break;
> +              case ValType::B32x4:
> +                ins = MSimdConstant::New(alloc(), SimdConstant::CreateX4(v.i32x4()), MIRType_Bool32x4);

Maybe add a comment here explaining that it's safe to call i32x4() ?

@@ +324,5 @@
>          curBlock_->add(ins);
>          return ins;
>      }
>  
> +    MDefinition* simdAllTrue(MDefinition* bv)

Not sure about the naming of the argument: bv stands for boolean vector, i guess? How about just naming it booleanVector, or boolVec, or simply vec?

@@ +334,5 @@
> +        curBlock_->add(ins);
> +        return ins;
> +    }
> +
> +    MDefinition* simdAnyTrue(MDefinition* bv)

ditto

@@ +1824,5 @@
> +    if (IsSimdBoolType(simdType)) {
> +        if (!EmitSimdBooleanLaneExpr(f, &scalar))
> +            return false;
> +    } else {
> +        if (!EmitExpr(f, SimdToLaneType(simdType), &scalar))

style nit: no isolated if in else blocks, please:

} else if (...) {
}

@@ +1958,5 @@
>      return true;
>  }
>  
>  static bool
> +EmitSimdBooleanSplat(FunctionCompiler& f, MDefinition** def)

Feel free to reuse EmitSimdSplat and add the "if (IsSimdBoolType(...))" check as for replaceLane, if you want. I don't have a strong opinion here as it's not much duplicated code.

@@ +2267,5 @@
> +// Emit an I32 expression and then convert it to a boolean SIMD lane value, i.e. -1 or 0.
> +static bool
> +EmitSimdBooleanLaneExpr(FunctionCompiler& f, MDefinition** def)
> +{
> +    MDefinition *i32;

nit: * goes to the left: MDefinition* i32;

@@ +2271,5 @@
> +    MDefinition *i32;
> +    if (!EmitI32Expr(f, &i32))
> +        return false;
> +    // Now compute !i32 - 1 to force the value range into {0, -1}.
> +    MDefinition *noti32 = f.unary<MNot>(i32);

ditto

::: js/src/jit-test/tests/asm.js/simd-fbirds.js
@@ +68,5 @@
>      var f4greaterThan = f4.greaterThan;
>      var f4splat = f4.splat;
>      var f4load = f4.load;
>      var f4store = f4.store;
> +    var b4any = b4.anyTrue;

Note: Intel's repo and arewefastyet will want their updates too, for this file and the other simd- one (automating this is a problem I haven't figured out yet)

::: js/src/jit-test/tests/asm.js/testSIMD.js
@@ +1371,5 @@
>  }
> +
> +// Passing boolean results to extern functions.
> +// Verify that these functions are typed correctly.
> +function isone(x) { return (x===1)|0 }

I think these tests should be put above the previous one, which has a comment stating that it should be the last one, iirc? (because it times out)
Attachment #8699522 - Flags: review?(benj) → review+
(In reply to Benjamin Bouvier [:bbouvier] from comment #80)
> Comment on attachment 8699522 [details] [diff] [review]
> Part 5: ASM.js boolean vectors.
> 
> ::: js/src/asmjs/AsmJSValidate.cpp
> @@ +1842,2 @@
> >        case AsmJSSimdType_int32x4:
> > +      case AsmJSSimdType_float32x4:
> 
> unnecessary code motion, but well ;)

I've been trying to keep a consistent Int-Float-Bool order everywhere.

> @@ +4624,5 @@
> >          *type = Type::Float;
> >          break;
> > +      case AsmJSSimdType_bool32x4:
> > +        f.writeOp(I32::B32X4ExtractLane);
> > +        *type = Type::Int;
> 
> I've gone through all the operators in asm.js, and there doesn't seem to be
> any which takes int as an input, and that wouldn't do an int32 coercion at
> this place, so I think we're safe with Int (and don't need to use Intish).
> In the worst case, the fuzzers will let us know.

Thanks!

> @@ +4820,5 @@
> >      unsigned numArgs = CallArgListLength(call);
> >      if (numArgs != 3)
> >          return f.failf(call, "expected 3 arguments to SIMD store, got %u", numArgs);
> >  
> > +    SwitchPackOp(f, opType, I32X4::Store, F32X4::Store, B32X4::Bad);
> 
> Maybe the last argument to SwitchPackOp could be optional and its default
> value be B32X4::Bad?

I was hoping that SwitchPackOp() would not survive the wasm-ir refactoring you're about to do. This approach does not scale to 12 SIMD types.

The SIMD.js instruction set is very orthogonal, but not completely. There are 12 types and some 30 operations on each type. This gives us a pretty large 'feature matrix' of supported (Type, Operation) pairs.

In my opinion, we should only encode that feature matrix once in our source code, as far as possible. The canonical list of operations supported on the Int32x4 type is in the FORALL_INT32X4_ASMJS_OP macro. In OdinMonkey, the IsSimdValidOperationType(type, op) function represents the feature matrix, and it is defined in terms of the canonical macros.

Ideally, the rest of OdinMonkey would represent SIMD operations as (Type, Operation) pairs. This is much easier to deal with in source code because it avoids all the tedious switches. It does mean that OdinMonkey will be able to represent invalid operations like Float32x4.not() internally, but that is OK because these invalid operations will be filtered out as early as possible by the IsSimdValidOperationType() filter.

A function like SwitchPackOp() is problematic because it requires encoding a second version of the feature matrix: which of the SwitchPackOp() arguments are going to be *::Bad? This version of the feature matrix will even be distributed all over the AsmJSValidate.cpp source code, so it gets quite tricky to compare it against the canonical SIMD.h macros.

When you switch to context-free opcodes that encode both type and operation in one opcode, I would suggest that you provide functions to perform the mapping:

  Opcode <--> (Type, Operation)

I would also suggest that you use bitfields in the opcode so those functions don't have to consist of giant switches. For example:

Opcode
makeSimdOpcode(AsmJSSimdType type, AsmJSSimdOperation operation) {
  return Opcode((type << 8) | operation);
}

This will make it possible to represent invalid operations as an opcode, but it makes the compiler source so much simpler. Just filter out the invalid opcodes up front, like we do for asm.js with IsSimdValidOperationType().
(In reply to Jakob Stoklund Olesen [:jolesen] from comment #79)
> (In reply to Benjamin Bouvier [:bbouvier] from comment #78)
> > Unless I am missing something, you also need to delete the BinaryBitwise
> > entry from this list. This will imply changing validation and changing
> > asm.js tests a bit.
> 
> Yes, I missed that. Thanks!

This illustrates nicely the problem I was talking about in comment #81.

I had already removed Float32x4.and from the canonical list of operations in FORALL_FLOAT32X4_ASMJS_OP, so any asm.js code attempting to use it would not validate. It doesn't really make a difference if the F32X4::BinaryBitwise opcode exists or not. That is not where we filter the legal operations.

It is just a second copy of the feature matrix that needs to be kept in sync with the canonical copy, and the compiler and/or tests can't tell you if it is done correctly.

Since the existence of an opcode does not make that operation legal in itself, we might as well use orthogonal opcodes that can represent any (Type, Operation) pair. It simplifies the source code a lot.
Duplicate of this bug: 1228612
Depends on: 1236522
Depends on: 1238236
You need to log in before you can comment on or make changes to this bug.