Closed
Bug 1044256
Opened 10 years ago
Closed 10 years ago
SIMD backend: implement unary arithmetic operations
Categories
(Core :: JavaScript Engine: JIT, defect)
Tracking
()
RESOLVED
FIXED
mozilla35
People
(Reporter: ivan, Assigned: ivan)
References
Details
Attachments
(3 files, 6 obsolete files)
20.38 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
9.63 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
1.73 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
SIMD.float32x4: float32x4 abs(float32x4 a) float32x4 neg(float32x4 a) float32x4 sqrt(float32x4 a) float32x4 reciprocalSqrt(float32x4 a) float32x4 reciprocal(float32x4 a) SIMD.int32x4: int32x4 neg(int32x4 a)
Assignee | ||
Comment 1•10 years ago
|
||
WIP patch: still needs to add the macro assembler for Float32x4.abs: absps, Float32x4.neg: negps, and Int32x4.neg: pnegd which are implemented using xorps and andps
Attachment #8462862 -
Flags: feedback?(benj)
Assignee | ||
Updated•10 years ago
|
Attachment #8462862 -
Flags: feedback?(sunfish)
Comment 2•10 years ago
|
||
Comment on attachment 8462862 [details] [diff] [review] bug1044256.txt Review of attachment 8462862 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/assembler/assembler/X86Assembler.h @@ +2484,5 @@ > + m_formatter.twoByteOp(OP2_RCPPS_VpsWps, (RegisterID)dst, address); > + } > + > + void rsqrtps_rr(XMMRegisterID src, XMMRegisterID dst){ > + spew("rcpps %s, %s", copy+pasto in the spew string here and a few more below. ::: js/src/jit/MIR.h @@ +1312,5 @@ > + reciprocal, > + reciprocalSqrt, > + sqrt > + }; > + Trailing whitespace. ::: js/src/jit/shared/CodeGenerator-x86-shared.cpp @@ +2185,5 @@ > + masm.packedReciprocalFloat32x4(rhs, lhs); > + return true; > + case MSimdUnaryArith::reciprocalSqrt: > + masm.packedReciprocalSqrtFloat32x4(rhs, lhs); > + return true; This uses the rcp and rsqrt instructions, which are approximation instructions. According to Intel's manual, of rcp, "The relative error for this approximation is: |Relative Error| ≤ 1.5 ∗ 2 −12". I don't think we want to expose these low-precision approximations directly in the JS API, so we'll need to use the Newton-Raphson method to refine the results.
Attachment #8462862 -
Flags: feedback?(sunfish)
Comment 3•10 years ago
|
||
Comment on attachment 8462862 [details] [diff] [review] bug1044256.txt Review of attachment 8462862 [details] [diff] [review]: ----------------------------------------------------------------- Nice start. ::: js/src/jit/Lowering.cpp @@ +3642,5 @@ > MOZ_ASSUME_UNREACHABLE("Unknown SIMD kind when getting lane"); > return false; > } > > +bool nit: trailing whitespace @@ +3658,5 @@ > + LSimdUnaryArithFx4 *lir = new(alloc()) LSimdUnaryArithFx4(op); > + return define(lir, ins); > + } > + > + MOZ_ASSUME_UNREACHABLE("Unknown SIMD kind when comparing values"); nit: copy pasto of another error string here ::: js/src/jit/shared/CodeGenerator-x86-shared.cpp @@ +2160,5 @@ > + > + MSimdUnaryArith::Operation op = ins->operation(); > + switch(op){ > + case MSimdUnaryArith::neg: > + //TODO after 'not' has been implemented nit: TODO is a copy pasto as well. @@ +2162,5 @@ > + switch(op){ > + case MSimdUnaryArith::neg: > + //TODO after 'not' has been implemented > + return true; > + case MSimdUnaryArith::abs: As far as I can see, int32x4 only have neg in the SIMD.js polyfill so far. Could you add assertions in MIR.h that if the type is MIRType_Int32x4, then the op must neg?
Attachment #8462862 -
Flags: feedback?(benj)
Comment 4•10 years ago
|
||
Rebase. Have not integrated the feedback yet.
Attachment #8462862 -
Attachment is obsolete: true
Comment 5•10 years ago
|
||
Rebase after bug 1025127 (the comparisons) and before bug 1021716 (shuffle).
Attachment #8479813 -
Attachment is obsolete: true
Comment 6•10 years ago
|
||
Just reordered things a little for consistency.
Attachment #8480271 -
Attachment is obsolete: true
Updated•10 years ago
|
Attachment #8480402 -
Attachment is patch: true
Comment 7•10 years ago
|
||
Rebase. Some of the feedback has been addressed.
Attachment #8480402 -
Attachment is obsolete: true
Comment 8•10 years ago
|
||
Some of these operations (neg and abs) appear to require the loading of a constant before a binary bitwise operation. It would probably be better to separate the loading of such constants into a separate MIR operation so that they can be hoisted etc. So some of these would be transformed into binary operations when generating the MIR.
Comment 9•10 years ago
|
||
Rebased and completed. About remarks: - SIMD spec committee seems fine to have reciprocal and reciprocalSqrt be approximations, so we just use them in this patch. - Having MIR constants getting out of the instructions would be beneficial (e.g. for not we use a SimdConstant which -1 in all lanes), but it seems weird to add the constraint at instantiation time that the second argument of a MIR node (say an hypothetical MSimdNot) be a constant with the precise given value (here, MSimdConstant(-1) or MSimdSplat(-1)). Perhaps this could be done with a separate optimization phase though, as this would be probably beneficial to all operations that use constants at the codegen level. The issue is not too grave, as constants (simd and non-simd) are already shared in pools, at least on x86 and x64.
Attachment #8481213 -
Attachment is obsolete: true
Attachment #8499589 -
Flags: review?(sunfish)
Comment 10•10 years ago
|
||
Interdiff: in codegen, the input has to be an Operand, not a FloatRegister. See other comments above.
Attachment #8499589 -
Attachment is obsolete: true
Attachment #8499589 -
Flags: review?(sunfish)
Attachment #8499591 -
Flags: review?(sunfish)
Comment 12•10 years ago
|
||
Comment on attachment 8499593 [details] [diff] [review] bug1044256-2.patch Review of attachment 8499593 [details] [diff] [review]: ----------------------------------------------------------------- Way to be
Attachment #8499593 -
Flags: review?(luke) → review+
Comment 13•10 years ago
|
||
Comment on attachment 8499591 [details] [diff] [review] bug1044256.patch Review of attachment 8499591 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/LIR-Common.h @@ +313,5 @@ > +// Unary SIMD arithmetic operation on a SIMD operand > +class LSimdUnaryArith : public LInstructionHelper<1, 1, 0> > +{ > + public: > + LSimdUnaryArith(const LAllocation &in) { nit: Add an explicit keyword. @@ +324,5 @@ > + > +// Unary SIMD arithmetic operation on a Int32x4 operand > +class LSimdUnaryArithIx4 : public LSimdUnaryArith > +{ > + public: nit: indent 'public:' at 2 spaces. @@ +326,5 @@ > +class LSimdUnaryArithIx4 : public LSimdUnaryArith > +{ > + public: > + LIR_HEADER(SimdUnaryArithIx4); > + LSimdUnaryArithIx4(const LAllocation &in) : LSimdUnaryArith(in) {} another explicit keyword @@ +332,5 @@ > + > +// Unary SIMD arithmetic operation on a Float32x4 operand > +class LSimdUnaryArithFx4 : public LSimdUnaryArith > +{ > + public: another 2-space indent @@ +334,5 @@ > +class LSimdUnaryArithFx4 : public LSimdUnaryArith > +{ > + public: > + LIR_HEADER(SimdUnaryArithFx4); > + LSimdUnaryArithFx4(const LAllocation &in) : LSimdUnaryArith(in) {} another explicit :) ::: js/src/jit/shared/CodeGenerator-x86-shared.cpp @@ +2539,5 @@ > + // In int32 arithmetic, -x === ~x + 1 === ~x - 1 > + masm.loadConstantInt32x4(allOnes, ScratchSimdReg); > + masm.moveAlignedInt32x4(ScratchSimdReg, out); > + masm.bitwiseXorX4(in, out); > + masm.packedSubInt32(Operand(ScratchSimdReg), out); This looks correct (aside from a missing '-' in the comment), but it'd be simpler and faster to load zeros into |out| and then subtract |in| from it. @@ +2567,5 @@ > + int32_t maskSignBit = FloatingPoint<float>::Base::kSignBit; > + static const SimdConstant signBitMasks = > + SimdConstant::CreateX4(maskSignBit, maskSignBit, maskSignBit, maskSignBit); > + static const SimdConstant undefSignBitMasks = > + SimdConstant::CreateX4(~maskSignBit, ~maskSignBit, ~maskSignBit, ~maskSignBit); In CodeGeneratorX86Shared::visitAbsF, we construct the all-ones-except-the-sign-bit value with SpecificNaN<float>(0, FloatingPoint<float>::kSignificandBits. I guess it's a matter of taste which is cleaner, but it'd be nice to do the same thing in both places (or perhaps even add utility routines for this). Using that expression, and -0.0f for the sign-bit, would allow you to use float values here, which would allow you to use loadConstantFloat32x4 instead of loadConstantInt32x4 below, which is faster on platforms with domain-crossing penalties. It'd also be nice to add a SimdConstant::CreateSplatX4(s) or so for situations like this. @@ +2586,5 @@ > + case MSimdUnaryArith::reciprocal: > + masm.packedReciprocalFloat32x4(in, out); > + return true; > + case MSimdUnaryArith::reciprocalSqrt: > + masm.packedReciprocalSqrtFloat32x4(in, out); rcp and rsqrt are only approximation instructions, and we haven't yet figured out what we want the spec to say for reciprocal and reciprocalSqrt [0]. I had previously assumed that we wouldn't want to have such approximation semantics in the spec, but I've since learned that this may not be as bad as I previously thought, so this seems ok for now. I guess our strategy for such things now is just mentioning them in bug 1068028 and adding corresponding comments here. [0] https://github.com/johnmccutchan/ecmascript_simd/issues/71
Attachment #8499591 -
Flags: review?(sunfish) → review+
Comment 14•10 years ago
|
||
(In reply to Benjamin Bouvier [:bbouvier] from comment #9) > - Having MIR constants getting out of the instructions would be beneficial > (e.g. for not we use a SimdConstant which -1 in all lanes), but it seems > weird to add the constraint at instantiation time that the second argument > of a MIR node (say an hypothetical MSimdNot) be a constant with the precise > given value (here, MSimdConstant(-1) or MSimdSplat(-1)). Perhaps this could > be done with a separate optimization phase though, as this would be probably > beneficial to all operations that use constants at the codegen level. The > issue is not too grave, as constants (simd and non-simd) are already shared > in pools, at least on x86 and x64. I think Douglas was suggesting that 'not' be represented in MIR as an MSimdConstant and a plain MSimdBinaryBitwise xor. Then normal LICM and GVN would see them. It's definitely worth investigating at some point, though my gut feeling is that we're ok without this for now.
Comment 15•10 years ago
|
||
Thanks for the reviews, added SimdConstant::SplatX4(int32_t/float v) by the way, and a comment in the masm about the precision of rcpps and rsqrtps. remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/2e15d016544a remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/ab230e36399b
Comment 16•10 years ago
|
||
Reverted for test failures: https://treeherder.mozilla.org/ui/logviewer.html#?job_id=2818210&repo=mozilla-inbound https://treeherder.mozilla.org/ui/logviewer.html#?job_id=2820314&repo=mozilla-inbound remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/a9a291cb900b remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/279aac062e97
Comment 17•10 years ago
|
||
r+ over irc by nbp. Asan detected that the hash was actually using bits above the limits of the SimdConstant's array, nice!
Attachment #8501723 -
Flags: review+
Comment 18•10 years ago
|
||
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/d80d80865715 remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/30cb1d37a3ff remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/2ae98c304c5e
https://hg.mozilla.org/mozilla-central/rev/d80d80865715 https://hg.mozilla.org/mozilla-central/rev/30cb1d37a3ff https://hg.mozilla.org/mozilla-central/rev/2ae98c304c5e
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
You need to log in
before you can comment on or make changes to this bug.
Description
•