Closed
Bug 1025475
Opened 10 years ago
Closed 10 years ago
SIMD backend: implement Int32x4 / Float32x4 constructors
Categories
(Core :: JavaScript Engine: JIT, defect)
Tracking
()
RESOLVED
FIXED
mozilla34
People
(Reporter: bbouvier, Assigned: bbouvier)
References
Details
Attachments
(4 files, 4 obsolete files)
11.40 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
3.95 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
29.65 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
2.31 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
int32x4(a,b,c,d) float32x4(a,b,c,d) I'll probably add MSimdConstant, for when a, b, c, and d are constant literals.
Assignee | ||
Comment 1•10 years ago
|
||
This implements Int32x4(a,b,c,d) and Float32x4(a,b,c,d), where inputs are arbitrary MIR nodes. Code generation does the following: - reserve enough stack space for storing a SIMD value (as of now, 4 * 32 bits) - move each input in a 32-bits stack slot - (aligned) load into the destination register (should be sane, as SP is aligned from the function prologue) - free stack An improvement in the future could be to have this stack space reserved per function, every time we see we have at least N SimdValueX4 in the MIRGraph, where N is a constant >= 2.
Attachment #8459652 -
Flags: feedback?(sunfish)
Assignee | ||
Comment 2•10 years ago
|
||
asm.js needs SIMD literals, for variable initializers in functions. E.g. function f() { var x = int32x4(1,2,3,4); } So we need a way to have the equivalent of js::Value, for storing SIMD constants. Typed objects (being SIMD objects polyfills in the interpreter) are not convenient because they are JSObject and thus involve GC. So I've created an equivalent of constant js::Value for SIMD. This is called SimdConstant in this patch, and there is an equivalent MSimdConstant.
Attachment #8459656 -
Flags: feedback?(sunfish)
Assignee | ||
Comment 3•10 years ago
|
||
Attachment #8459657 -
Flags: feedback?(sunfish)
Assignee | ||
Comment 4•10 years ago
|
||
(for the record, splat, which can also be folded from SimdValueX4, will be implemented in another bug)
Comment 5•10 years ago
|
||
Comment on attachment 8459656 [details] [diff] [review] 2 - Constants Review of attachment 8459656 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/assembler/assembler/X86Assembler.h @@ +2917,5 @@ > + } > +#else > + void movdqa_mr(const void* address, XMMRegisterID dst) > + { > + spew("movss %p, %s", copy+pasto in the string? @@ +4006,5 @@ > + void int32x4Constant(const int32_t s[4]) > + { > + m_buffer.ensureSpace(4 * sizeof(int32_t)); > + for (size_t i = 0; i < 4; ++i) > + m_buffer.putIntUnchecked(s[i]); It's odd that we can call floatConstant below, but there's no corresponding method for int32 yet. Could you add one, so that the code here in int32x4Constant can just call int32Constant? ::: js/src/jit/IonTypes.h @@ +299,5 @@ > + return false; > + // Comparing as one kind is enough. > + for (size_t i = 0; i < 4; ++i) { > + if (u.i32x4[i] != rhs.u.i32x4[i]) > + return false; I like this, but it is technically a violation of the C++ union rules. I suggest calling memcmp instead. ::: js/src/jit/shared/MacroAssembler-x86-shared.h @@ +649,5 @@ > + static SimdConstant zero(0,0,0,0); > + if (v == zero) { > + pxor(dest, dest); > + return true; > + } Another value that's nice and easy to inline here is (-1,-1,-1,-1) since you can do that with just pcmpeqw(reg, reg) (Agner Optimizing Assembly table 13.10). ::: js/src/jit/x64/MacroAssembler-x64.cpp @@ +157,5 @@ > bind(&flt.uses); > masm.floatConstant(flt.value); > } > > + // SIMD memory values must be 16-bits aligned s/bits/bytes/. Actually, it may be better to avoid saying the number here at all, as it may change. How about just saying that SIMD memory must be suitably aligned (here and in the x86 version too). ::: js/src/jit/x86/MacroAssembler-x86.cpp @@ +157,5 @@ > + SimdData *f4 = getSimdData(v); > + if (!f4) > + return; > + JS_ASSERT(f4->type() == SimdConstant::Float32x4); > + masm.movdqa_mr(reinterpret_cast<const void *>(f4->uses.prev()), dest.code()); This should be movaps instead of movdqa. Functionally equivalent, but avoids a domain-crossing stall on some micro-architectures.
Attachment #8459656 -
Flags: feedback?(sunfish) → feedback+
Comment 6•10 years ago
|
||
Comment on attachment 8459652 [details] [diff] [review] 1 - Constructors Review of attachment 8459652 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/MIR.h @@ +1294,5 @@ > + bool congruentTo(const MDefinition *ins) const { > + if (!ins->isSimdValueX4()) > + return false; > + if (ins->type() != type()) > + return false; congruentIfOperandsEqual already checks that the op() and type() match, so these two checks are redundant.
Attachment #8459652 -
Flags: feedback?(sunfish) → feedback+
Comment 7•10 years ago
|
||
Comment on attachment 8459656 [details] [diff] [review] 2 - Constants Review of attachment 8459656 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/MIR.h @@ +1300,5 @@ > } > }; > > +// A constant SIMD value. > +class MSimdConstant : public MNullaryInstruction Looking at the third patch which folds to this made me question this a little. The alternative to having an MSimdConstant class is to have lowering do the pattern-matching to recognize an MSimdValueX4 with all-constant operands in visitMSimdValueX4 and lower it to LInt32x4 or LFloat32x4 right there. Offhand, the advantage of doing this in lowering is that we avoid creating as many MIRNodes, though perhaps it's not that big an advantage. The advantage of using MSimdConstant is that it would make it more convenient for us to do more pattern-matching and constant folding of SIMD values. This won't matter much for emscripten-generated code (which should already have all the constant-folding done already), but it may be nice for hand-written JS code. I wonder if there are other considerations.
Comment 8•10 years ago
|
||
Comment on attachment 8459656 [details] [diff] [review] 2 - Constants Review of attachment 8459656 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/MIR.h @@ +1300,5 @@ > } > }; > > +// A constant SIMD value. > +class MSimdConstant : public MNullaryInstruction Thinking about this a little more, I now think MSimdConstant is fine. The common case for MSimdConstant will be when all the operands are immediate constants, so if we want to reduce memory usage, we can have IonBuilder and/or AsmJS.cpp detect this pattern and create MSimdConstants directly instead of creating a bunch of nodes which will be folded away. We also don't need to do this right away, so the current patches are fine for now.
Comment 9•10 years ago
|
||
Comment on attachment 8459657 [details] [diff] [review] 3 - Fold SimdValue into SimdConstant Review of attachment 8459657 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/MIR.cpp @@ +593,5 @@ > +{ > + MIRType scalarType = SIMDTypeToScalarType(type()); > + for (size_t i = 0; i < 4; ++i) { > + MDefinition *op = getOperand(i); > + if (!op->isConstant() || op->type() != scalarType) MSimdValueX4's constructor asserts that its operands have the right type, so it shouldn't be necessary to check it again here. An assert for the types wouldn't be unwelcome though, of course. @@ +603,5 @@ > + case MIRType_Int32x4: { > + int32_t a[4]; > + for (size_t i = 0; i < 4; ++i) > + a[i] = getOperand(i)->toConstant()->value().toInt32(); > + cst = SimdConstant(a[0], a[1], a[2], a[3]); It might be nice to add a constructor to SimdConstant to allow |a| to be passed in directly here. That would also make it a little easier to generalize to SIMD types with different numbers of elements. @@ +610,5 @@ > + case MIRType_Float32x4: { > + float a[4]; > + for (size_t i = 0; i < 4; ++i) > + a[i] = getOperand(i)->toConstant()->value().toNumber(); > + cst = SimdConstant(a[0], a[1], a[2], a[3]); as above
Updated•10 years ago
|
Attachment #8459657 -
Flags: feedback?(sunfish) → feedback+
Assignee | ||
Comment 10•10 years ago
|
||
Thanks for the great feedback! pcmpeqw was already present in the Assembler-x86-shared, so that's bonus. I've added fill{Int,Float}32(x, y, z, w) functions to SimdConstant, and two constructors that take array inputs. (In reply to Dan Gohman [:sunfish] from comment #8) > Comment on attachment 8459656 [details] [diff] [review] > 2 - Constants > > Review of attachment 8459656 [details] [diff] [review]: > ----------------------------------------------------------------- > > The common case for MSimdConstant will be when all the operands are > immediate constants, so if we want to reduce memory usage, we can have > IonBuilder and/or AsmJS.cpp detect this pattern and create MSimdConstants > directly instead of creating a bunch of nodes which will be folded away. We > also don't need to do this right away, so the current patches are fine for > now. Actually, I am doing this pattern matching (partially at least) in my Odin SIMD patch, for variable initializers and globals, so that's already useful. The folding was just shown as a use of the MSimdConstant in this bug.
Assignee | ||
Comment 11•10 years ago
|
||
Flipping to r?
Attachment #8459652 -
Attachment is obsolete: true
Attachment #8469351 -
Flags: review?(sunfish)
Assignee | ||
Comment 12•10 years ago
|
||
Attachment #8469354 -
Flags: review?(sunfish)
Assignee | ||
Comment 13•10 years ago
|
||
oops
Attachment #8459656 -
Attachment is obsolete: true
Attachment #8469354 -
Attachment is obsolete: true
Attachment #8469354 -
Flags: review?(sunfish)
Attachment #8469355 -
Flags: review?(sunfish)
Assignee | ||
Comment 14•10 years ago
|
||
Attachment #8469356 -
Flags: review?(sunfish)
Assignee | ||
Comment 15•10 years ago
|
||
Attachment #8459657 -
Attachment is obsolete: true
Attachment #8469357 -
Flags: review?(sunfish)
Comment 16•10 years ago
|
||
Comment on attachment 8469351 [details] [diff] [review] 1 - Constructors Review of attachment 8469351 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/shared/CodeGenerator-x86-shared.cpp @@ +2076,5 @@ > + for (size_t i = 0; i < 4; ++i) { > + FloatRegister r = ToFloatRegister(ins->getOperand(i)); > + masm.storeFloat32(r, Address(StackPointer, i * sizeof(float))); > + } > + masm.loadAlignedFloat32x4(Address(StackPointer, 0), output); Another bonus opportunity here :). We can do this sequence with 3 unpcklps instructions and no stack space, though we'll need temporaries to be able to write intermediate values, and possibly copying: unpcklps %xmm3, %xmm1 # %xmm1 = %xmm1.x,%xmm3.x,%xmm1.y,%xmm3.y unpcklps %xmm2, %xmm0 # %xmm0 = %xmm0.x,%xmm2.x,%xmm0.y,%xmm2.y unpcklps %xmm1, %xmm0 # %xmm0 = %xmm0.x,%xmm1.x,%xmm0.y,%xmm1.y We can theoretically do the same thing for int32x4 with punpckldq, though it's less nice there because we need to move everything into xmm registers first. I'm also totally ok with just adding a comment about this for now and coming back to it later.
Attachment #8469351 -
Flags: review?(sunfish) → review+
Comment 17•10 years ago
|
||
Comment on attachment 8469355 [details] [diff] [review] 2 - SimdConstant Review of attachment 8469355 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/IonTypes.h @@ +241,5 @@ > + }; > + > + private: > + bool defined_; > + Type type_; Since we have our own Type enum, we could perhaps add an Undefined type, and then we wouldn't need a separate defined_ flag. Also, if this class is default-constructed, these fields aren't initialized anywhere. There should be a default constructor which initialized these. @@ +273,5 @@ > + SimdConstant cst; > + cst.fillInt32x4(x, y, z, w); > + return cst; > + } > + static SimdConstant Create(int32_t *array) { Would it make sense to rename this to CreateX4? It feels awkward to pass in a plain pointer with no explicit mention of how many elements will be read from the pointer. @@ +283,5 @@ > + SimdConstant cst; > + cst.fillFloat32x4(x, y, z, w); > + return cst; > + } > + static SimdConstant Create(float *array) { Same as above. @@ +324,5 @@ > + // SimdConstant is a HashPolicy > + typedef SimdConstant Lookup; > + static HashNumber hash(const SimdConstant &val) { > + const int32_t *a = val.u.i32x4; > + return mozilla::HashGeneric(a[0], a[1], a[2], a[3]); This is contrary to the C++ union rules when val is actually holding a float32x4. I recommend using HashBytes here instead.
Attachment #8469355 -
Flags: review?(sunfish) → review+
Comment 18•10 years ago
|
||
Comment on attachment 8469356 [details] [diff] [review] 3 - MSimdConstant Review of attachment 8469356 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/LIR-Common.h @@ +326,5 @@ > > +// Constant SIMD int32x4 > +class LInt32x4 : public LInstructionHelper<1, 0, 0> > +{ > + SimdConstant v_; It would save some memory here to use mir()->value() instead of storing an extra copy here. @@ +330,5 @@ > + SimdConstant v_; > + public: > + LIR_HEADER(Int32x4); > + > + LInt32x4(const SimdConstant &v) : v_(v) {} This constructor should get the explicit keyword. @@ +331,5 @@ > + public: > + LIR_HEADER(Int32x4); > + > + LInt32x4(const SimdConstant &v) : v_(v) {} > + SimdConstant getValue() const { return v_; } SimdConstant is big enough that it makes sense to return a const& here. @@ +337,5 @@ > + > +// Constant SIMD float32x4 > +class LFloat32x4 : public LInstructionHelper<1, 0, 0> > +{ > + SimdConstant v_; ditto @@ +341,5 @@ > + SimdConstant v_; > + public: > + LIR_HEADER(Float32x4); > + > + LFloat32x4(const SimdConstant &v) : v_(v) {} ditto @@ +342,5 @@ > + public: > + LIR_HEADER(Float32x4); > + > + LFloat32x4(const SimdConstant &v) : v_(v) {} > + SimdConstant getValue() const { return v_; } ditto ::: js/src/jit/shared/MacroAssembler-x86-shared.h @@ +662,5 @@ > } > > + bool maybeInlineInt32x4(const SimdConstant &v, const FloatRegister &dest) { > + static SimdConstant zero = SimdConstant::Create(0, 0, 0, 0); > + static SimdConstant minusOne = SimdConstant::Create(-1, -1, -1, -1); These variables can be const, which helps people auditing code for non-const statics. @@ +674,5 @@ > + } > + return false; > + } > + bool maybeInlineFloat32x4(const SimdConstant &v, const FloatRegister &dest) { > + static SimdConstant zero = SimdConstant::Create(0.f, 0.f, 0.f, 0.f); Another const. @@ +675,5 @@ > + return false; > + } > + bool maybeInlineFloat32x4(const SimdConstant &v, const FloatRegister &dest) { > + static SimdConstant zero = SimdConstant::Create(0.f, 0.f, 0.f, 0.f); > + if (v == zero) { This == works because SimdConstant's operator== uses memcmp, rather than elementwise == because -0.0 == 0.0. A brief comment about this would be good. ::: js/src/jit/x64/MacroAssembler-x64.cpp @@ +79,5 @@ > masm.setNextJump(j, prev); > } > > +MacroAssemblerX64::SimdData * > +MacroAssemblerX64::lookupOrAddSimdConstant(const SimdConstant &v) The corresponding function on x86 is named getSimdData. I'm ok with either name, but it'd be nice to use the same name for both, unless there's some subtle difference I'm not seeing. If the code is actually the same, it'd be nice to move it to shared/MacroAssembler-x86-shared.*
Attachment #8469356 -
Flags: review?(sunfish) → review+
Updated•10 years ago
|
Attachment #8469357 -
Flags: review?(sunfish) → review+
Comment 19•10 years ago
|
||
(In reply to Dan Gohman [:sunfish] from comment #16) > unpcklps %xmm3, %xmm1 # %xmm1 = %xmm1.x,%xmm3.x,%xmm1.y,%xmm3.y > unpcklps %xmm2, %xmm0 # %xmm0 = %xmm0.x,%xmm2.x,%xmm0.y,%xmm2.y > unpcklps %xmm1, %xmm0 # %xmm0 = %xmm0.x,%xmm1.x,%xmm0.y,%xmm1.y > > We can theoretically do the same thing for int32x4 with punpckldq, though > it's less nice there because we need to move everything into xmm registers > first. Also with SSE4.1, int32x4 construction can be done nicely with pinsrd: movd %edi, %xmm0 pinsrd $1, %esi, %xmm0 pinsrd $2, %edx, %xmm0 pinsrd $3, %ecx, %xmm0 Again, not imperative to do now.
Assignee | ||
Comment 20•10 years ago
|
||
See question below regarding the default constructor. (In reply to Dan Gohman [:sunfish] from comment #17) > Comment on attachment 8469355 [details] [diff] [review] > 2 - SimdConstant > > Review of attachment 8469355 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: js/src/jit/IonTypes.h > @@ +241,5 @@ > > + }; > > + > > + private: > > + bool defined_; > > + Type type_; > > Since we have our own Type enum, we could perhaps add an Undefined type, and > then we wouldn't need a separate defined_ flag. Good idea. That makes sizeof(SimdConstant) == Simd128DataSize, which is good. > > Also, if this class is default-constructed, these fields aren't initialized > anywhere. There should be a default constructor which initialized these. That actually cannot be done, and that's why we have the CreateX4 functions. If we have a user-defined ctor, we cannot include SimdConstant into unions, as it has a non trivial ctor. C++11 allows to create default ctors, but all our targeted platforms don't support it (B2G simulators in particular complained about it, during a try build i made a few weeks ago). I will add a comment to make it clearer. Is there anything else we can do here? > > @@ +273,5 @@ > > + SimdConstant cst; > > + cst.fillInt32x4(x, y, z, w); > > + return cst; > > + } > > + static SimdConstant Create(int32_t *array) { > > Would it make sense to rename this to CreateX4? It feels awkward to pass in > a plain pointer with no explicit mention of how many elements will be read > from the pointer. I named it CreateX4 everywhere, for consistency purposes. Thanks. > > @@ +324,5 @@ > > + // SimdConstant is a HashPolicy > > + typedef SimdConstant Lookup; > > + static HashNumber hash(const SimdConstant &val) { > > + const int32_t *a = val.u.i32x4; > > + return mozilla::HashGeneric(a[0], a[1], a[2], a[3]); > > This is contrary to the C++ union rules when val is actually holding a > float32x4. I recommend using HashBytes here instead. Thanks, I didn't know this existed! Fixed.
Flags: needinfo?(sunfish)
Comment 21•10 years ago
|
||
(In reply to Benjamin Bouvier [:bbouvier] from comment #20) > > Also, if this class is default-constructed, these fields aren't initialized > > anywhere. There should be a default constructor which initialized these. > That actually cannot be done, and that's why we have the CreateX4 functions. > If we have a user-defined ctor, we cannot include SimdConstant into unions, > as it has a non trivial ctor. C++11 allows to create default ctors, but all > our targeted platforms don't support it (B2G simulators in particular > complained about it, during a try build i made a few weeks ago). I will add > a comment to make it clearer. Is there anything else we can do here? Oh, I think I forgot about this when I saw the defined_ flag, because I though that that meant that we always knew if we were defined or not, but then I noticed that it wasn't actually initialized. I don't know of anything else we can easily do here at present, so the code is fine, with a comment :).
Flags: needinfo?(sunfish)
Assignee | ||
Comment 22•10 years ago
|
||
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/4a53363b2efd remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/d9f23f57085d remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/850663302692 remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/eb1d969ec449
Assignee | ||
Comment 23•10 years ago
|
||
duh, build bustage: https://hg.mozilla.org/integration/mozilla-inbound/rev/043edc92814e
Comment 24•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/4a53363b2efd https://hg.mozilla.org/mozilla-central/rev/d9f23f57085d https://hg.mozilla.org/mozilla-central/rev/850663302692 https://hg.mozilla.org/mozilla-central/rev/eb1d969ec449 https://hg.mozilla.org/mozilla-central/rev/043edc92814e
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
You need to log in
before you can comment on or make changes to this bug.
Description
•