Closed Bug 1135039 Opened 9 years ago Closed 9 years ago

SIMD: Generalize shuffles/swizzles for Ion

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla39

Tracking Flags:

Tracking

Status

firefox39

---

fixed

People

(Reporter: bbouvier, Assigned: bbouvier)

References

Details

Attachments

(7 files, 3 obsolete files)

1.polyfill.patch 9 years ago Benjamin Bouvier [:bbouvier] (inactive) 7.64 KB, patch		Details \| Diff \| Splinter Review
swizzle.patch 9 years ago Benjamin Bouvier [:bbouvier] (inactive) 24.52 KB, patch	sunfish : feedback+	Details \| Diff \| Splinter Review
swizzle.patch 9 years ago Benjamin Bouvier [:bbouvier] (inactive) 25.47 KB, patch	sunfish : review+	Details \| Diff \| Splinter Review
1.throw-oob.patch 9 years ago Benjamin Bouvier [:bbouvier] (inactive) 10.95 KB, patch	sunfish : review+	Details \| Diff \| Splinter Review
2.jit.patch 9 years ago Benjamin Bouvier [:bbouvier] (inactive) 9.41 KB, patch	sunfish : review+	Details \| Diff \| Splinter Review
3.shuffle.patch 9 years ago Benjamin Bouvier [:bbouvier] (inactive) 28.06 KB, patch	sunfish : feedback+	Details \| Diff \| Splinter Review
3. LVariadicInstruction 9 years ago Benjamin Bouvier [:bbouvier] (inactive) 5.71 KB, patch	sunfish : review+	Details \| Diff \| Splinter Review
4. Shuffle 9 years ago Benjamin Bouvier [:bbouvier] (inactive) 36.60 KB, patch	sunfish : review+	Details \| Diff \| Splinter Review
5. bonus: Factor LInstructionHelper/LVariadicInstruction code 9 years ago Benjamin Bouvier [:bbouvier] (inactive) 10.59 KB, patch	sunfish : review+	Details \| Diff \| Splinter Review
6. Bonus: Factor codegen with templates 9 years ago Benjamin Bouvier [:bbouvier] (inactive) 10.37 KB, patch	sunfish : review+	Details \| Diff \| Splinter Review

Benjamin Bouvier [:bbouvier] (inactive)

Assignee

Description

•

9 years ago

Currently, our SIMD shuffles / swizzles MIR nodes expect constant patterns (masks?), so they need to generalized before being inlinable within Ion.

Benjamin Bouvier [:bbouvier] (inactive)

Assignee

Updated

•

9 years ago

Assignee: nobody → benj

Benjamin Bouvier [:bbouvier] (inactive)

Assignee

Comment 1

•

9 years ago

Attached patch 1.polyfill.patch (obsolete) — Details — Splinter Review

This will make the interpreter behave more like the polyfill: if the lane index is out of bounds ([0..3] for float32x4/int32x4, [0..1] for float64x2), just load the coerced undefined value in it.  Might be not intended in the polyfill (see also [0]), but it's actually pretty convenient to implement in the JITs...

[0] https://github.com/johnmccutchan/ecmascript_simd/issues/111#issuecomment-75968636

Attachment #8569307 - Flags: review?(sunfish)

Benjamin Bouvier [:bbouvier] (inactive)

Assignee

Comment 2

•

9 years ago

Attached patch swizzle.patch (obsolete) — Details — Splinter Review

This patch implements a generalized swizzle for int32x4 and float32x4, from inlining to codegen on x86. Implementation does the following:
- reserve stack space for 2 vectors
- spill the input vector onto the stack
- load each value addressed by the lane index into a temp register
- spill this temp register at its final position in the output vector (onto the stack)
Not really efficient (with sse4.1, we could probably insert the value directly into the lane), but It Works (tm).

Attachment #8569310 - Flags: feedback?(sunfish)

Dan Gohman [:sunfish]

Comment 3

•

9 years ago

Comment on attachment 8569310 [details] [diff] [review]
swizzle.patch

Review of attachment 8569310 [details] [diff] [review]:
-----------------------------------------------------------------

Nice and straight-forward.

::: js/src/jit-test/tests/SIMD/swizzle.js
@@ +37,5 @@
> +        var r = SIMD.int32x4.swizzle(i4, i % 4, (i + 1) % 4, (i + 2) % 4, (i + 3) % 4);
> +        assertEqX4(r, compI[i % 4]);
> +
> +        // Out of bounds variables lanes
> +        assertEqX4(SIMD.int32x4.swizzle(i4, i + 4, 3, 2, 1), [0, 4, 3, 2]);

It'd be good to test negative lane indices too.

::: js/src/jit/MIR.h
@@ +1821,5 @@
>  
>      ALLOW_CLONE(MSimdSwizzle)
>  };
>  
> +class MSimdGeneralSwizzle :

Please add a brief comment here describing what a "general" swizzle is.

::: js/src/jit/TypePolicy.h
@@ +331,5 @@
>  };
>  
> +class SimdSwizzlePolicy MOZ_FINAL : public TypePolicy
> +{
> +    public:

Style nit: this should be at 2 spaces according to SpiderMonkey style

::: js/src/jit/shared/CodeGenerator-x86-shared.cpp
@@ +2363,5 @@
> +{
> +    FloatRegister input = ToFloatRegister(ins->base());
> +    Register temp = ToRegister(ins->temp());
> +
> +    masm.reserveStack(Simd128DataSize * 2);

It might be nice to have a comment somewhere around here mentioning that this isn't going to generate amazingly fast code, but that it's ok because it's expected that users will use constant swizzle indices.

@@ +2374,5 @@
> +        Label loadZero, go, join;
> +        masm.cmp32(lane, Imm32(0));
> +        masm.j(Assembler::LessThan, &loadZero);
> +        masm.cmp32(lane, Imm32(4));
> +        masm.j(Assembler::LessThan, &go);

An unsigned comparison would allow this to be done in a single branch. Not that performance is a big concern here, but it would make the code a little simpler.

@@ +2416,5 @@
> +        Label loadNaN, go, join;
> +        masm.cmp32(lane, Imm32(0));
> +        masm.j(Assembler::LessThan, &loadNaN);
> +        masm.cmp32(lane, Imm32(4));
> +        masm.j(Assembler::LessThan, &go);

Ditto about unsigned comparison.

@@ +2419,5 @@
> +        masm.cmp32(lane, Imm32(4));
> +        masm.j(Assembler::LessThan, &go);
> +
> +        masm.bind(&loadNaN);
> +        masm.loadConstantFloat32(0.f, ScratchFloat32Reg);

This is loading zero, rather than NaN.

Attachment #8569310 - Flags: feedback?(sunfish) → feedback+

Dan Gohman [:sunfish]

Comment 4

•

9 years ago

Comment on attachment 8569307 [details] [diff] [review]
1.polyfill.patch

Review of attachment 8569307 [details] [diff] [review]:
-----------------------------------------------------------------

The current polyfill behavior isn't really intentional. I'm inclined to wait for resolution on the above-mentioned #111 before making changes here, but I could be convinced otherwise.

Attachment #8569307 - Flags: review?(sunfish)

Benjamin Bouvier [:bbouvier] (inactive)

Assignee

Comment 5

•

9 years ago

Attached patch swizzle.patch — Details — Splinter Review

Thanks for the feedback! Switching to review, even if it will introduce new differences between the JIT and the interpreter. My hope is to get this landed by next week, so that SIMD in Ion can be demo'ed at GDC (otherwise, it's fine if it's not, because we can just have a try-build and demo it at GDC, but well, having it on nightlies would make more impact!).

Attachment #8569310 - Attachment is obsolete: true

Attachment #8569813 - Flags: review?(sunfish)

Benjamin Bouvier [:bbouvier] (inactive)

Assignee

Comment 6

•

9 years ago

Note to self, I'll also implement shuffle later, as it's not used in any of the SIMD demos...

Keywords: leave-open

Dan Gohman [:sunfish]

Comment 7

•

9 years ago

Comment on attachment 8569813 [details] [diff] [review]
swizzle.patch

Review of attachment 8569813 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/MIR.cpp
@@ +933,5 @@
> +MSimdGeneralSwizzle::foldsTo(TempAllocator &alloc)
> +{
> +    int32_t lanes[4];
> +    for (size_t i = 0; i < 4; i++) {
> +        if (!lane(i)->isConstant() || !lane(i)->type() == MIRType_Int32)

The right side there should be lane(i)->type() != MIRType_Int32.

@@ +936,5 @@
> +    for (size_t i = 0; i < 4; i++) {
> +        if (!lane(i)->isConstant() || !lane(i)->type() == MIRType_Int32)
> +            return this;
> +        lanes[i] = lane(i)->toConstant()->value().toInt32();
> +        if (lanes[i] < 0 || lanes[i] > 4)

The right side here should be >= 4.

Attachment #8569813 - Flags: review?(sunfish) → review+

Benjamin Bouvier [:bbouvier] (inactive)

Assignee

Comment 8

•

9 years ago

Thanks for the review! Unfortunately, I realized I can't land it without the interpreter patch, because on ARM where we don't have SIMD in Ion, the code stays in the interpreter and thus the test doesn't pass (difference of behavior between engines... it's pretty bad).

Do you prefer me to rework the patch so as to throw when a lane index is out of bounds, or should we check in the first patch (at least, temporarily, until the spec polyfill gets updated)?

Flags: needinfo?(sunfish)

Dan Gohman [:sunfish]

Comment 9

•

9 years ago

(In reply to Benjamin Bouvier [:bbouvier] from comment #8)
> Do you prefer me to rework the patch so as to throw when a lane index is out
> of bounds, or should we check in the first patch (at least, temporarily,
> until the spec polyfill gets updated)?

Throwing when a lane out of bounds sounds to me like desirable semantics. I just commented on issue #111 to this effect. I'd be ok implementing that here for now, and may even hope that we can keep it that way.

Flags: needinfo?(sunfish)

Benjamin Bouvier [:bbouvier] (inactive)

Assignee

Comment 10

•

9 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/b5cac236faad

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 11

•

9 years ago

https://hg.mozilla.org/mozilla-central/rev/b5cac236faad

Benjamin Bouvier [:bbouvier] (inactive)

Assignee

Comment 12

•

9 years ago

Attached patch 1.throw-oob.patch — Details — Splinter Review

Following the behavior described in https://github.com/johnmccutchan/ecmascript_simd/issues/120#issuecomment-77085936

Attachment #8573218 - Flags: review?(sunfish)

Benjamin Bouvier [:bbouvier] (inactive)

Assignee

Comment 13

•

9 years ago

Attached patch 2.jit.patch — Details — Splinter Review

And the patch for swizzle in the JITs (note that i somehow forgot to add inlining of SIMD.float32x4.swizzle in the previous patch, shame on me!).

Attachment #8569307 - Attachment is obsolete: true

Attachment #8573222 - Flags: review?(sunfish)

Benjamin Bouvier [:bbouvier] (inactive)

Assignee

Comment 14

•

9 years ago

Attached patch 3.shuffle.patch (obsolete) — Details — Splinter Review

So while looking at the codegen for swizzle, i realized that the same codegen could be used for shuffles, and for other vector types as well, so I've tried to generalize the MIR node as much as possible. Codegen is ready to support int8x16 and int16x8 and float64x2.

This means that LSimdShuffleBase becomes variadic. If the approach looks fine, I could create a LVariadicInstruction and have LSimdShuffleBase inherit from it, so that LVariadicInstruction can be reused for other operations...

Another solution was to just have a fixed amount of operands, which would be an upper bounds, but that would mean this upper bound would need to be 18 (for SIMD.int8x16.swizzle). Thoughts?

Attachment #8573350 - Flags: feedback?(sunfish)

Attachment #8573350 - Flags: feedback?(jdemooij)

Jan de Mooij [:jandem]

Comment 15

•

9 years ago

Comment on attachment 8573350 [details] [diff] [review]
3.shuffle.patch

Review of attachment 8573350 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry, I'm overloaded with reviews and needinfos these weeks, so I'll leave this to sunfish. He's also more familiar with the SIMD instructions here.

Attachment #8573350 - Flags: feedback?(jdemooij)

Dan Gohman [:sunfish]

Updated

•

9 years ago

Attachment #8573222 - Flags: review?(sunfish) → review+

Dan Gohman [:sunfish]

Comment 16

•

9 years ago

Comment on attachment 8573218 [details] [diff] [review]
1.throw-oob.patch

Review of attachment 8573218 [details] [diff] [review]:
-----------------------------------------------------------------

yay!

Attachment #8573218 - Flags: review?(sunfish) → review+

Benjamin Bouvier [:bbouvier] (inactive)

Assignee

Comment 17

•

9 years ago

remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/cc4786da89db
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/18c84fcee487

Ryan VanderMeulen [:RyanVM]

Comment 18

•

9 years ago

Backed out for Linux32 jit-test asserts.
https://hg.mozilla.org/integration/mozilla-inbound/rev/b79d608ca174

https://treeherder.mozilla.org/logviewer.html#?job_id=7487175&repo=mozilla-inbound

Ryan VanderMeulen [:RyanVM]

Comment 19

•

9 years ago

Also WinXP jit-test timeouts. Looking at your last Try push, all of these issues were present there as well. Please verify that you're green on Try before pushing in the future.

Flags: needinfo?(benj)

Dan Gohman [:sunfish]

Comment 20

•

9 years ago

Comment on attachment 8573350 [details] [diff] [review]
3.shuffle.patch

Review of attachment 8573350 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to Benjamin Bouvier [:bbouvier] from comment #14)
> Created attachment 8573350 [details] [diff] [review]
> 3.shuffle.patch
> 
> So while looking at the codegen for swizzle, i realized that the same
> codegen could be used for shuffles, and for other vector types as well, so
> I've tried to generalize the MIR node as much as possible. Codegen is ready
> to support int8x16 and int16x8 and float64x2.

Great!

> This means that LSimdShuffleBase becomes variadic. If the approach looks
> fine, I could create a LVariadicInstruction and have LSimdShuffleBase
> inherit from it, so that LVariadicInstruction can be reused for other
> operations...

Creating an LVariadicInstruction sounds nice, to keep the mundane operand-accessor kind of code somewhat separate from the interesting operation-specific code, even if we only have one subclass of LVariadicInstruction.

> Another solution was to just have a fixed amount of operands, which would be
> an upper bounds, but that would mean this upper bound would need to be 18
> (for SIMD.int8x16.swizzle). Thoughts?

I think variadic things are fine here. We mainly want to keep the code simple, since this is code that we hope won't be used much anyway.

::: js/src/jit/LIR-Common.h
@@ +376,5 @@
>  
> +class LSimdGeneralSwizzleBase : public LInstruction
> +{
> +    LDefinition def_;
> +    mozilla::Vector<LAllocation> operands_;

You can use the FixedList class here instead of Vector.

Attachment #8573350 - Flags: feedback?(sunfish) → feedback+

Benjamin Bouvier [:bbouvier] (inactive)

Assignee

Comment 21

•

9 years ago

(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #19)
> Also WinXP jit-test timeouts. Looking at your last Try push, all of these
> issues were present there as well. Please verify that you're green on Try
> before pushing in the future.

Sorry for being an idiot... I've somehow lost track of this try push, so there are a few things I need to enhance in my bugzilla workflow...

https://treeherder.mozilla.org/#/jobs?repo=try&revision=9dcc46796f42

Benjamin Bouvier [:bbouvier] (inactive)

Assignee

Comment 22

•

9 years ago

Attached patch 3. LVariadicInstruction — Details — Splinter Review

Here we go!

Attachment #8573350 - Attachment is obsolete: true

Attachment #8576657 - Flags: review?(sunfish)

Benjamin Bouvier [:bbouvier] (inactive)

Assignee

Comment 23

•

9 years ago

Attached patch 4. Shuffle — Details — Splinter Review

Attachment #8576658 - Flags: review?(sunfish)

Benjamin Bouvier [:bbouvier] (inactive)

Assignee

Comment 24

•

9 years ago

Attached patch 5. bonus: Factor LInstructionHelper/LVariadicInstruction code — Details — Splinter Review

Totally not mandatory

Attachment #8576659 - Flags: review?(sunfish)

Benjamin Bouvier [:bbouvier] (inactive)

Assignee

Comment 25

•

9 years ago

Attached patch 6. Bonus: Factor codegen with templates — Details — Splinter Review

Totally not mandatory as well

Attachment #8576661 - Flags: review?(sunfish)

Benjamin Bouvier [:bbouvier] (inactive)

Assignee

Comment 26

•

9 years ago

And a try build for this new set of patches as well...
https://treeherder.mozilla.org/#/jobs?repo=try&revision=9161b0d10d3f

Benjamin Bouvier [:bbouvier] (inactive)

Assignee

Comment 27

•

9 years ago

Pushing the two changesets, as try-run in comment 21.

remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/36be55e9a4ff
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/6a42e09d9a26

Flags: needinfo?(benj)

Benjamin Bouvier [:bbouvier] (inactive)

Assignee

Comment 28

•

9 years ago

And a new try run for the other patches: https://treeherder.mozilla.org/#/jobs?repo=try&revision=3352d70d9bc4

Carsten Book [:Tomcat]

Comment 29

•

9 years ago

https://hg.mozilla.org/mozilla-central/rev/36be55e9a4ff
https://hg.mozilla.org/mozilla-central/rev/6a42e09d9a26

Benjamin Bouvier [:bbouvier] (inactive)

Assignee

Comment 30

•

9 years ago

Comment on attachment 8576658 [details] [diff] [review]
4. Shuffle

Review of attachment 8576658 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/LIR-Common.h
@@ +376,5 @@
>  
> +class LSimdGeneralShuffleBase : public LVariadicInstruction<1, 1>
> +{
> +  public:
> +    LSimdGeneralShuffleBase(const LDefinition &temp) {

(fixed locally: made this one and the two other ones below explicit)

Dan Gohman [:sunfish]

Comment 31

•

9 years ago

Comment on attachment 8576657 [details] [diff] [review]
3. LVariadicInstruction

Review of attachment 8576657 [details] [diff] [review]:
-----------------------------------------------------------------

Cool!

Attachment #8576657 - Flags: review?(sunfish) → review+

Dan Gohman [:sunfish]

Comment 32

•

9 years ago

Comment on attachment 8576658 [details] [diff] [review]
4. Shuffle

Review of attachment 8576658 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good.

::: js/src/jit/MIR.cpp
@@ +950,5 @@
>  
>  MDefinition *
> +MSimdGeneralShuffle::foldsTo(TempAllocator &alloc)
> +{
> +    mozilla::Vector<uint32_t> lanes;

It'd be slightly nicer to use FixedList here, since we know the size up front.

@@ +963,2 @@
>              return this;
> +        lanes.infallibleAppend(uint32_t(temp));

so this becomes lanes[i] = uint32_t(temp);

Attachment #8576658 - Flags: review?(sunfish) → review+

Dan Gohman [:sunfish]

Comment 33

•

9 years ago

Comment on attachment 8576659 [details] [diff] [review]
5. bonus: Factor LInstructionHelper/LVariadicInstruction code

Review of attachment 8576659 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. The details namespace pattern is probably something we should use more :).

Attachment #8576659 - Flags: review?(sunfish) → review+

Dan Gohman [:sunfish]

Updated

•

9 years ago

Attachment #8576661 - Flags: review?(sunfish) → review+

Benjamin Bouvier [:bbouvier] (inactive)

Assignee

Comment 34

•

9 years ago

Cool, thanks!

https://treeherder.mozilla.org/#/jobs?repo=try&revision=a66df632fffa

Status: NEW → ASSIGNED

Keywords: leave-open

Benjamin Bouvier [:bbouvier] (inactive)

Assignee

Updated

•

9 years ago

Flags: needinfo?(benj)

Benjamin Bouvier [:bbouvier] (inactive)

Assignee

Comment 35

•

9 years ago

Try link in comment 34.
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/12c2f0b35afe
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/bcad11e292db
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/71a8d6e735ef
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/6abbe0f83479

Flags: needinfo?(benj)

Phil Ringnalda (:philor)

Comment 36

•

9 years ago

https://hg.mozilla.org/mozilla-central/rev/12c2f0b35afe
https://hg.mozilla.org/mozilla-central/rev/bcad11e292db
https://hg.mozilla.org/mozilla-central/rev/71a8d6e735ef
https://hg.mozilla.org/mozilla-central/rev/6abbe0f83479

Status: ASSIGNED → RESOLVED

Closed: 9 years ago

status-firefox39: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla39

You need to log in before you can comment on or make changes to this bug.