Closed Bug 1240796 Opened 4 years ago Closed 4 years ago

Add SIMD.Uint32x4 support to IonMonkey

Categories

(Core :: JavaScript Engine: JIT, defect)

x86_64
Unspecified
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla47
Tracking Status
firefox46 --- affected
firefox47 --- fixed

People

(Reporter: jolesen, Assigned: jolesen)

References

Details

Attachments

(8 files, 6 obsolete files)

17.48 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
9.33 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
2.86 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
22.30 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
28.15 KB, patch
sunfish
: review+
Details | Diff | Splinter Review
17.08 KB, patch
sunfish
: review+
Details | Diff | Splinter Review
4.30 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
15.96 KB, patch
nbp
: review+
Details | Diff | Splinter Review
We already have SIMD.Int32x4 in Ion, and most of the operations are identical, except for shiftRightByScalar(), comparisons, and to/from float conversions.

It should not be necessary to add a new MIRType for the unsigned vectors. Int32x4 and Uint32x4 can both map to MIRType_Int32x4, and we can use the signedness bit to choose between arithmetic/logical right shifts and signed/unsigned integer comparison opcodes.

- We already have separate rsh/ursh opcodes for MSimdShift.
- Add unsigned comparison opcodes to MSimdBinaryCompare.
- Add an 'unsigned' flag to MSimdConvert.
- Add an 'unsigned' flag to MSimdExtractElement. Look at how the >>> operator deals with unsigned 32-bit numbers.
Blocks: 1064540
Depends on: 1238679
Assignee: nobody → jolesen
This saves some code size in a cold function, and it makes it possible to pass
in the SIMD type as a dynamic argument.

Also detemplatize the static CreateSimdType() to save some code size.

Total code shrink ~ 32 KB.
The extractLane(), anyTrue(), and allTrue() SIMD functions produce scalar
values, and so they don't ned a template object. The canInlineSimd() function
was rejecting these functions because of the missing template object.
Extract the code that generates template objects for SIMD operations, and
rewrite it to use the JSJitInfo nativeOp encoding.

This avoids the native function pointer comparisons, and it makes it simpler to
add new SIMD types and operations.
Add a new InlinableNative::SimdUint32x4 enumerator, and emit the corresponding
JSJitInfo objects in SIMD.cpp.

Start producing template objects for Uint32x4 operations in BaselineIC.cpp.

Add a new MIRSign enum class to IonTypes.h which will be used to distinguish
between signed and unsigned integers in the few places where it matters.

Map the SIMD.Uint32x4 type to the existing MIRType_Int32x4 + MIRSign::Unsigned.
Map SIMD.Int32x4 to MITType_Int32x4 + MIRSign::Signed.

Add a 'MIRSign sign' argument to those inlineSimd...() functions that care.
Some MIR instructions will get similar fields in the following commits.

For now, abort inlining if unsigned vectors are actually encountered. These cases
will be fixed in the following commits.
Comment on attachment 8710538 [details] [diff] [review]
Connect SIMD.Uint32x4 operations to the Ion inliner.

Ben, this is a WIP.

The 'enum class MIRSign' is how I intend to add support for unsigned integer vectors in MIR. I will add a MIRSign field to MSimdConvert, MSimdBinaryComp, and MSimdExtractElement.

How would asm.js represent Uint32x4.extractLane()? This is a 32-bit unsigned value. Do you want the bits cast into a MIRType_Int32?

Should we do a SimdSign in SIMD.h instead of a MIRSign?
Flags: needinfo?(bbouvier)
(In reply to Jakob Stoklund Olesen [:jolesen] from comment #6)
> Comment on attachment 8710538 [details] [diff] [review]
> Connect SIMD.Uint32x4 operations to the Ion inliner.
> 
> Ben, this is a WIP.
> 
> The 'enum class MIRSign' is how I intend to add support for unsigned integer
> vectors in MIR. I will add a MIRSign field to MSimdConvert, MSimdBinaryComp,
> and MSimdExtractElement.
> 
> How would asm.js represent Uint32x4.extractLane()? This is a 32-bit unsigned
> value. Do you want the bits cast into a MIRType_Int32?

extractLane would return Type::Unsigned in the asm.js type system, which means int32, so casting to int32 sounds fine. If you look at CheckBitwise in AsmJS.cpp, the logical right shift operator yields the type Unsigned, so it's already well-integrated in asm.js.
 
> 
> Should we do a SimdSign in SIMD.h instead of a MIRSign?

If this can be reused at different locations out of the jit/ dir, yes.
Flags: needinfo?(bbouvier)
Emscripten has a small workaround currently in place to ignore its test suite failures since this type doesn't yet exist for asm.js: https://github.com/juj/emscripten/commit/c1b006d044ec2f47d70a31bafe635fa3f7e7c4a3
This saves some code size in a cold function, and it makes it possible to pass
in the SIMD type as a dynamic argument.

Also detemplatize the static CreateSimdType() to save some code size.

Total code shrink ~ 32 KB.
Attachment #8713747 - Flags: review?(bbouvier)
Attachment #8710534 - Attachment is obsolete: true
The extractLane(), anyTrue(), and allTrue() SIMD functions produce scalar
values, and so they don't need a template object. The canInlineSimd() function
was rejecting these functions because of the missing template object.
Attachment #8713748 - Flags: review?(bbouvier)
Attachment #8710535 - Attachment is obsolete: true
Extract the code that generates template objects for SIMD operations, and
rewrite it to use the JSJitInfo nativeOp encoding.

This avoids the native function pointer comparisons, and it makes it simpler to
add new SIMD types and operations.
Attachment #8713749 - Flags: review?(bbouvier)
Attachment #8710536 - Attachment is obsolete: true
Attachment #8710537 - Attachment is obsolete: true
Add a new InlinableNative::SimdUint32x4 enumerator, and emit the corresponding
JSJitInfo objects in SIMD.cpp.

Start producing template objects for Uint32x4 operations in BaselineIC.cpp.

Add a new SimdSign enum class to SIMD.h which will be used to distinguish
between signed and unsigned integers in the few places where it matters.

Map the SIMD.Uint32x4 type to the existing MIRType_Int32x4 + SimdSign::Unsigned.
Map SIMD.Int32x4 to MITType_Int32x4 + SimdSign::Signed.

Add a 'SimdSign sign' argument to those inlineSimd...() functions that care.
Some MIR instructions will get similar fields in the following commits.

For now, abort inlining if unsigned vectors are actually encountered. These cases
will be fixed in the following commits.
Attachment #8713751 - Flags: review?(bbouvier)
Attachment #8710538 - Attachment is obsolete: true
The conversion from Uint32x4 to Float32x4 is not available as an SSE
instruction, so we need to expand into a larger instruction sequence lifted
from LLVM. Make this expansion early when generating MIR so that it can be
exposed to LICM and GVN optimizations.

The conversion from Float32x4 to Uint32x4 can throw a RangeError. It is handled
similarly to LFloat32x4ToInt32x4. This expansion depends on the details of the
cvttps2dq instruction that can't be expressed in MIR, so it can't be expanded
early.
Attachment #8713752 - Flags: review?(sunfish)
Add a MSimdBinaryComp::AddLegalized function which expands unsigned compares on
target platforms that don't support them directly. The early expansion exposes
the constants to MIR optimizations.

Unsigned comparison is expressed in terms of signed comparison by offsetting
both sides by INT_MIN.
Attachment #8713768 - Flags: review?(sunfish)
The extractLane(), anyTrue(), and allTrue() SIMD functions produce scalar
values, and so they don't need a template object. The canInlineSimd() function
was rejecting these functions because of the missing template object.

At the same time, explicitly avoid inlining any SIMD operations if the JIT does
not support SIMD. This was previously controlled by the absense of the template
object.
Attachment #8713786 - Flags: review?(bbouvier)
Attachment #8713748 - Attachment is obsolete: true
Attachment #8713748 - Flags: review?(bbouvier)
Comment on attachment 8713752 [details] [diff] [review]
Implement Uint32x4 <==> Float32x4 conversions.

Review of attachment 8713752 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. There's just the matter of what I believe is a spec bug :) but we can address that in separate patches.

::: js/src/jit/x86-shared/CodeGenerator-x86-shared.cpp
@@ +2318,5 @@
> +    FloatRegister tempF = ToFloatRegister(ins->tempF());
> +
> +    // Classify lane values into 4 disjoint classes:
> +    //
> +    //   N-lanes:             in < -0.0

See comment below about the specific bound here.

@@ +2345,5 @@
> +
> +    // First we need to filter out N-lanes. We need to use a floating point
> +    // comparison to do that because cvttps2dq maps the negative range
> +    // [-0x0.ffffffp0;-0.0] to 0. We can't simply look at the sign bits of in
> +    // because -0.0 is a valid input.

Hmm. We do actualy want -0x0.ffffffp0 to convert to 0 here. The desired semantics are that the fractional part is discarded first, and then the value is range checked.

Actually, this looks like it's a bug in the spec. I filed https://github.com/tc39/ecmascript_simd/issues/315 to track this.
Attachment #8713752 - Flags: review?(sunfish) → review+
(In reply to Dan Gohman [:sunfish] from comment #17)
> Actually, this looks like it's a bug in the spec. I filed
> https://github.com/tc39/ecmascript_simd/issues/315 to track this.

Thanks, Dan.

When the spec changes, we will have to change the interpreter implementation in builtin/SIMD.cpp too:

    // Float to integer conversions have undefined behavior if the float value
    // is out of the representable integer range (on x86, will yield the undefined
    // value pattern, namely 0x80000000; on arm, will clamp the input value), so
    // check this here.
    template<typename From, typename IntegerType>
    struct ThrowIfNotInRange
    {
        static_assert(mozilla::IsIntegral<IntegerType>::value, "bad destination type");

        static bool value(From v) {
            double d(v);
            return mozilla::IsNaN(d) ||
                   d < double(mozilla::MinValue<IntegerType>::value) ||
                   d > double(mozilla::MaxValue<IntegerType>::value);
        }
    };

If we ever implement a Float64x2 to Int64x2 conversion, we would have problems in both ends of the range: The code above doesn't allow a truncation to become MaxValue or MinValue of it was outside the range before the truncation.

This is not a problem for Float32 where all values are integers in the outer Int32 limits.
Since Uint32 can't be represented in a MIRType_Int32, this function should
return a MIRType_Double.

Allow MSimdExtractElement(Uint32x4) to return a MIRType_Int32 too. It will work
like the double version followed by MTruncateToInt32.
Attachment #8713880 - Flags: review?(nicolas.b.pierron)
Comment on attachment 8713768 [details] [diff] [review]
Implement unsigned SIMD compares.

Review of attachment 8713768 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, with a few comments:

::: js/src/jit/MIR.cpp
@@ +1165,5 @@
> +
> +        return result;
> +    }
> +
> +    if (!SupportsUint32x4Compares && sign == SimdSign::Unsigned && opType == MIRType_Int32x4) {

Since this tests that sign is Unsigned, can it assert that opType is an integer SIMD type in that case, rather than testing for it?

::: js/src/jit/MIR.h
@@ +2158,5 @@
>      }
>  
>    private:
>      Operation operation_;
> +    SimdSign sign_;

The congruentTo function should also check for this new field.
Attachment #8713768 - Flags: review?(sunfish) → review+
Comment on attachment 8713747 [details] [diff] [review]
Detemplatize getOrCreateSimdTypeDescr().

Review of attachment 8713747 [details] [diff] [review]:
-----------------------------------------------------------------

Cool.

::: js/src/builtin/SIMD.cpp
@@ +530,5 @@
> +       return nullptr;
> +
> +    uint32_t typeSlotIndex = uint32_t(simdType);
> +    if (globalSimdObject->as<NativeObject>().getReservedSlot(typeSlotIndex).isUndefined() &&
> +        !GlobalObject::initSimdType(cx, global, simdType)) {

nit: multi-line condition => parenthesis goes to the next line

@@ +531,5 @@
> +
> +    uint32_t typeSlotIndex = uint32_t(simdType);
> +    if (globalSimdObject->as<NativeObject>().getReservedSlot(typeSlotIndex).isUndefined() &&
> +        !GlobalObject::initSimdType(cx, global, simdType)) {
> +            return nullptr;

nit: there are 2 levels of indent instead of one here, right?

::: js/src/builtin/TypedObject.cpp
@@ +2604,5 @@
>      CallArgs args = CallArgsFromVp(argc, vp);
>      Rooted<GlobalObject*> global(cx, cx->global());
>      MOZ_ASSERT(global);
> +    args.rval().setObject(
> +      *GlobalObject::getOrCreateSimdTypeDescr(cx, global, SimdType::Float32x4));

This is a bit ugly to have the parenthesis on one line and the argument on another. Can you use a local variable to store the result of getOrCreateSimdTypeDescr and use that instead? Ditto for other functions.

Alternatively, feel free to group all Get*TypeDescr functions into a single one that would take the SimdType as an argument.
Attachment #8713747 - Flags: review?(bbouvier) → review+
Comment on attachment 8713749 [details] [diff] [review]
Extract baseline code to GetTemplateObjectForSimd().

Review of attachment 8713749 [details] [diff] [review]:
-----------------------------------------------------------------

Much better!

::: js/src/jit/BaselineIC.cpp
@@ +5564,5 @@
> +    switch (jitInfo->inlinableNative) {
> +      case InlinableNative::SimdInt32x4:   ctrlType = SimdType::Int32x4;   break;
> +      case InlinableNative::SimdFloat32x4: ctrlType = SimdType::Float32x4; break;
> +      case InlinableNative::SimdBool32x4:  ctrlType = SimdType::Bool32x4;  break;
> +        // This is not an inlinable SIMD operation.

I'd align this with the start of "case" above.
Attachment #8713749 - Flags: review?(bbouvier) → review+
Attachment #8713750 - Flags: review?(bbouvier) → review+
Comment on attachment 8713786 [details] [diff] [review]
Inline SIMD operations that return scalars.

Review of attachment 8713786 [details] [diff] [review]:
-----------------------------------------------------------------

Righto.
Attachment #8713786 - Flags: review?(bbouvier) → review+
Comment on attachment 8713751 [details] [diff] [review]
Connect SIMD.Uint32x4 operations to the Ion inliner.

Review of attachment 8713751 [details] [diff] [review]:
-----------------------------------------------------------------

Cool.

::: js/src/builtin/SIMD.h
@@ +800,5 @@
>  };
>  
> +// The integer SIMD types have a lot of operations that do the exact same thing
> +// for signed and unsigned integer types. Sometimes it is simpler to treat
> +// signed and unsiged integer SIMD types as the same type, using a SimdSign to

nit: unsiged => unsigned

@@ +801,5 @@
>  
> +// The integer SIMD types have a lot of operations that do the exact same thing
> +// for signed and unsigned integer types. Sometimes it is simpler to treat
> +// signed and unsiged integer SIMD types as the same type, using a SimdSign to
> +// distinguish the few cases where thee is a difference.

nit: Thou Shalt Fix "Thee" Typo.

@@ +804,5 @@
> +// signed and unsiged integer SIMD types as the same type, using a SimdSign to
> +// distinguish the few cases where thee is a difference.
> +enum class SimdSign {
> +    // Signedness is not applicable to this type. (i.e., Float or Bool).
> +    NA,

I'd use the long form NotApplicable, rather than an acronym. It shouldn't appear too much and autocompletion is a thing nowadays.

@@ +813,5 @@
> +};
> +
> +// Get the signedness of a SIMD type.
> +static inline SimdSign
> +GetSimdSign(SimdType t) {

Please replace IsSignedIntSimdType uses below by this, or have IsSignedIntSimdType call GetSimdSign.

::: js/src/jit/BaselineIC.cpp
@@ +5562,5 @@
>      // Check if this is a native inlinable SIMD operation.
>      SimdType ctrlType;
>      switch (jitInfo->inlinableNative) {
>        case InlinableNative::SimdInt32x4:   ctrlType = SimdType::Int32x4;   break;
> +      case InlinableNative::SimdUint32x4:  ctrlType = SimdType::Uint32x4;  break;

Now I can feel the benefits of your previous refactorings :-)

::: js/src/jit/MCallOptimize.cpp
@@ +3151,5 @@
>        case SimdOperation::Fn_shiftLeftByScalar:
>          return inlineSimdBinary<MSimdShift>(callInfo, native, MSimdShift::lsh, simdType);
>        case SimdOperation::Fn_shiftRightByScalar:
> +          return inlineSimdBinary<MSimdShift>(callInfo, native,
> +                     sign == SimdSign::Unsigned ? MSimdShift::ursh : MSimdShift::rsh, simdType);

nit: indent is 2-chars off. Any chance we could align `sign == ...` with the first parenthesis?

@@ +3400,5 @@
>      InlineTypedObject* templateObj = nullptr;
>      if (!canInlineSimd(callInfo, native, 2, &templateObj))
>          return InliningStatus_NotInlined;
>  
> +    // TODO JSO: Implement unsigned integer comparisons.

Unless this is done in later patches in this same bug, please add a bug number to these TODO comments.
Attachment #8713751 - Flags: review?(bbouvier) → review+
Attachment #8713880 - Flags: review?(nicolas.b.pierron) → review+
Blocks: 1244889
Blocks: 1245547
Note that this set of patches (probably attachment 8713786 [details] [diff] [review] in particular) made the performance of asmjs-micro SIMD benchmarks back to very-fast! Good job!

https://arewefastyet.com/#machine=28&view=single&suite=asmjs-ubench&subtest=fbirds-native&start=1454358251&end=1454375103
See Also: → 1252927
You need to log in before you can comment on or make changes to this bug.