The default bug view has changed. See this FAQ.

OdinMonkey: experimental SIMD support

RESOLVED FIXED in mozilla34

Status

()

Core
JavaScript Engine: JIT
RESOLVED FIXED
3 years ago
2 years ago

People

(Reporter: bbouvier, Assigned: bbouvier)

Tracking

(Depends on: 1 bug, Blocks: 1 bug, {dev-doc-complete})

unspecified
mozilla34
x86_64
All
dev-doc-complete
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(4 attachments, 28 obsolete attachments)

9.58 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
139.06 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
38.73 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
65.46 KB, patch
luke
: review+
Details | Diff | Splinter Review
(Assignee)

Description

3 years ago
In parallel of bug 904913, we can begin work on SIMD integration in OdinMonkey. The plan is to restrict ourselves to x86 first. By doing this, we have no dependency on other SIMD work (bailout system integration, register allocation, that now has to take overlapping registers into account, esp. on ARM and MIPS platforms).

This will allow us to make some speed experiments and to compare performance with native code that uses SIMD (for instance, the asm.js bullet demo could use it). This is also important in terms of API testing, as this will let us know whether the API is complete enough for supporting operations also used in C++ code. As the Float32 optimization, this is not planned to be an asm.js / Odin feature only and general support in IonMonkey is necessary before enabling it everywhere.

I will post my WIP patches here.

Comment 1

3 years ago
Great to see progress on that end! The current SIMD implementation in Nightly is IMHO fairly restrictive in its int32x4 implementation. Specifically, int32x4 does neither provide for a min, lessThan, or abs function. lessThan, greaterThan, etc. are present in the polyfill (https://github.com/johnmccutchan/ecmascript_simd), but are missing from the Firefox implementation. Any chance to have it added? 

Without these functions, comparing two int32x4 vectors becomes unnecessary difficult and comes down to subtracting them, reading out the signMask from the result vector, and bit-shifting that signMask to get the comparisons for the individual lanes. The latter operation is obviously sequential, negatively affecting performance.
(Assignee)

Comment 2

3 years ago
(In reply to Soeren.Balko from comment #1)
> Great to see progress on that end! The current SIMD implementation in
> Nightly is IMHO fairly restrictive in its int32x4 implementation.
> Specifically, int32x4 does neither provide for a min, lessThan, or abs
> function. lessThan, greaterThan, etc. are present in the polyfill
> (https://github.com/johnmccutchan/ecmascript_simd), but are missing from the
> Firefox implementation. Any chance to have it added? 
> 
> Without these functions, comparing two int32x4 vectors becomes unnecessary
> difficult and comes down to subtracting them, reading out the signMask from
> the result vector, and bit-shifting that signMask to get the comparisons for
> the individual lanes. The latter operation is obviously sequential,
> negatively affecting performance.

Thanks for the report! For what it's worth, I opened bug 996076, as it seems to be a separate topic :)
(Assignee)

Comment 3

3 years ago
Some news: I have a local patch that allows setting a SIMD int32x4 variable and then retrieving a lane. During implementation, I figured out a few things:
- we don't have any coercion that says "this is a Foox4 variable". As all SIMD operations in the interpreter are such that they return a new value of one precise kind (good news is there shouldn't be any int32x4-ish type, as we don't have int32x4 arrays... for now), we can just track types of operations (for instance, int32x4.add takes 2 int32x4 arguments and returns a int32x4, no possible variations). However, if we want to pass SIMD arguments to another asm.js function or to return a SIMD value, we currently cannot. My proposal would be to add a "copy constructor" for Foox4 types, so that we can validate the type (as Math.fround currently does for Float32):

function f(x) {
  x = int32x4(x); // would be the new ctor that says "x is an int32x4".
}

The only issue is, if this function f is exported and called with an actual argument which *isn't* an int32x4, that will be a problem. My idea is that when calling asm.js code, the glue code that converts the argument could emit an exception if the actual argument type is not compatible with the formal argument type. Bonus, we could even emit the same as the x4 constructor when it has only one argument.
- we should think about how to store the Foox4 value internally (in the compiler). Using the TypedObject representation has advantages as the interpreter uses them to polyfill the SIMD api (better integration with Baseline / Ion in the future, shared MIR nodes, and so on) but implies that we rely on the current representation of Foox4, which, if I understand and remember correctly, is supposed to change to a ValueType in the future. Thoughts?
- for x4 variables, should we allow initialization by an expression? As the SIMD api returns a new value everytime it's called, we could just allow operations to be legal initializers for variables, instead of creating temporary garbage values. To make my point clearer, should we allow:

  var x = int32x4(1,2,3,4);
  var y = int32x4(4,3,2,1);
  var z = int32x4add(x, y); // rather than 'var z = int32x4(0,0,0,0); z = int32x4add(x,y);'
- technicality: at the global var level, when loading x4 kinds and x4 operations, should we allow this:
   var int32x4 = global.SIMD.int32x4;
   var int32x4add = int32x4.add; // rather than global.SIMD.int32x4.add
The latter takes less chars but needs int32x4 to be defined before; however I can't see how we'd have int32x4add defined without having defined int32x4 earlier.

Comment 4

3 years ago
Glad to know your progress :)

(In reply to Benjamin Bouvier [:bbouvier] from comment #3)
> Some news: I have a local patch that allows setting a SIMD int32x4 variable
> and then retrieving a lane. During implementation, I figured out a few
> things:
> - we don't have any coercion that says "this is a Foox4 variable". As all
> SIMD operations in the interpreter are such that they return a new value of
> one precise kind (good news is there shouldn't be any int32x4-ish type, as
> we don't have int32x4 arrays... for now), we can just track types of
> operations (for instance, int32x4.add takes 2 int32x4 arguments and returns
> a int32x4, no possible variations). However, if we want to pass SIMD
> arguments to another asm.js function or to return a SIMD value, we currently
> cannot. My proposal would be to add a "copy constructor" for Foox4 types, so
> that we can validate the type (as Math.fround currently does for Float32):
> 
> function f(x) {
>   x = int32x4(x); // would be the new ctor that says "x is an int32x4".
> }
> 
> The only issue is, if this function f is exported and called with an actual
> argument which *isn't* an int32x4, that will be a problem. My idea is that
> when calling asm.js code, the glue code that converts the argument could
> emit an exception if the actual argument type is not compatible with the
> formal argument type. Bonus, we could even emit the same as the x4
> constructor when it has only one argument.

Is this copy constructor needed for the asm.js type system? We might also need to 
define the coercion from float32x4 to others either, such as Object(Foox4), 
Boolean(Foox4) and Number(Foox4) as we discussed in the last meeting.  
 
> - we should think about how to store the Foox4 value internally (in the
> compiler). Using the TypedObject representation has advantages as the
> interpreter uses them to polyfill the SIMD api (better integration with
> Baseline / Ion in the future, shared MIR nodes, and so on) but implies that
> we rely on the current representation of Foox4, which, if I understand and
> remember correctly, is supposed to change to a ValueType in the future.
> Thoughts?

I agree. In the V8 implementation, the Float32x4/Int32x4 value in heap is 
represented with a head (4 bytes for ia32 and 8 bytes for x64) and 128-bit 
payload. We discussed this with Niko before, it is ideal we could introduce 
the Float32x4/Int32x4 in the JSValue. Now we are working to allocate 
Float32x4/Int32x4 value inline in the Typed Object, hopefully this would allow
us to allocate them directly in heap without calling C++ function.

> - for x4 variables, should we allow initialization by an expression? As the
> SIMD api returns a new value everytime it's called, we could just allow
> operations to be legal initializers for variables, instead of creating
> temporary garbage values. To make my point clearer, should we allow:
> 
>   var x = int32x4(1,2,3,4);
>   var y = int32x4(4,3,2,1);
>   var z = int32x4add(x, y); // rather than 'var z = int32x4(0,0,0,0); z =
> int32x4add(x,y);'

We allow that. The above code is supported by the VM.

> - technicality: at the global var level, when loading x4 kinds and x4
> operations, should we allow this:
>    var int32x4 = global.SIMD.int32x4;
>    var int32x4add = int32x4.add; // rather than global.SIMD.int32x4.add
> The latter takes less chars but needs int32x4 to be defined before; however
> I can't see how we'd have int32x4add defined without having defined int32x4
> earlier.

var int32x4add = global.SIMD.int32x4.add; We support inline int32x4add in
V8 (we could inline both SIMD method call and function call). In Ion, I 
also think so as int32x4add should be a native function.
(Assignee)

Comment 5

3 years ago
(In reply to haitao from comment #4)
> 
> Is this copy constructor needed for the asm.js type system? We might also
> need to 
> define the coercion from float32x4 to others either, such as Object(Foox4), 
> Boolean(Foox4) and Number(Foox4) as we discussed in the last meeting.

Yes, indeed I was implicitly talking about the asm.js type system, as OdinMonkey is Spidermonkey's implementation of asm.js :). As asm.js type system doesn't contain notions of Object / Boolean / Number, these are not necessary for Odin / asm.js. However, it is indeed a fair question for the rest of JS land.

> 
> I agree. In the V8 implementation, the Float32x4/Int32x4 value in heap is 
> represented with a head (4 bytes for ia32 and 8 bytes for x64) and 128-bit 
> payload. We discussed this with Niko before, it is ideal we could introduce 
> the Float32x4/Int32x4 in the JSValue. Now we are working to allocate 
> Float32x4/Int32x4 value inline in the Typed Object, hopefully this would
> allow
> us to allocate them directly in heap without calling C++ function.

I see. As I suspect we won't integrate x4 types into the JSValue and the underlying representation of x4 types is to change in the next months (from a typedobject to a valuetype), the plan is to add a layer of abstraction over it so that we can change it easily.

> We allow that. The above code is supported by the VM.

> var int32x4add = global.SIMD.int32x4.add; We support inline int32x4add in
> V8 (we could inline both SIMD method call and function call). In Ion, I 
> also think so as int32x4add should be a native function.

The two questions were related to the asm.js validation for OdinMonkey and not related to the VM. As asm.js is a subset of JS, everything that works in Odin will work in the VM; the contrary is not true, as asm.js code requires a specific format.

Comment 6

3 years ago
(In reply to Benjamin Bouvier [:bbouvier] from comment #3)
Great thoughts and questions!

> function f(x) {
>   x = int32x4(x); // would be the new ctor that says "x is an int32x4".
> }

Yes, I think this is what we'd ideally have from an asm.js POV.  Just like String(expr) and Number(expr) coerce *anything* to a string/number (or throw), it makes sense for int32x4(x) do the same: if you pass it an object {x:1,y:1,z:1,w:1} or an array [1,1,1,1], it'd build an in32x4 with the given components.  This seems to be the regular thing for a value-type constructor to do, so perhaps we can propose that and for now assume it for the Odin type checker.  As a backup plan, we could always do something analogous to |0 (e.g., x = float32x4.add(x, 0)), but the coercion seems preferable.

> - we should think about how to store the Foox4 value internally (in the
> compiler).

The impl strategy I'm in favor of is:
 - add a single new js::Value take for "value type"
 - this value type means the js::Value holds a JSObject pointer
 - the shape of the JSObject determines the specific value type (so, e.g., one shape for all int32x4s)
 - the shape will specify zero slots (so nothing is marked), but the GC finalize kind (which determines the actual allocation size) will be FINALIZE_OBJECT2 (so 2*sizeof(Value) = 16 bytes) which will give us the space to store the contents inline

This means that, in general, we'll have to handle the new "value type" value kind everywhere where we do a case analysis over values, but hopefully we can introduce abstractions so that, as we add more value type shapes, we don't have to do too much extra work.

>   var z = int32x4add(x, y); // rather than 'var z = int32x4(0,0,0,0); z =
> int32x4add(x,y);'

That's a good idea.  I think simple SSA analysis will make a noop initialization dead, so not terribly important, but it does seem like a good general enhancement to asm.js to let *any* variable initializer hold a general expression.  The one hard part is that Odin currently first parses variables, then calls prepareToEmitMIR, which assumes that locals_.count() is thereafter constant.  We could work around this by having varInitializers_ store the ParseNode* of the initializer and we could re-walk the initializer to emit MIR in prepareToEmitMIR.

>    var int32x4add = int32x4.add; // rather than global.SIMD.int32x4.add

This seems fine and it's analogous to how float32 global variable initializers depend on the Math.fround import.
(In reply to Luke Wagner [:luke] from comment #6)
> The impl strategy I'm in favor of is:
>  - add a single new js::Value take for "value type"
>  - this value type means the js::Value holds a JSObject pointer

It's pretty unclear to me that a JSObject* is what you want here. That seems to carry a fair amount of unnecessary baggage very little (if any) of which we actually want. (For example, it's not clear to me that shapes are necessary or sufficient; or at least not shapes as they are today.) I suppose this will to some extent hinge on the precise details of the "user-defined value types" spec, which is still very much up in the air.

Comment 8

3 years ago
Agreed, we probably want a new GC-thing type which could be much simpler than a JSObject.
(Assignee)

Comment 9

3 years ago
Created attachment 8414554 [details] [diff] [review]
wip

A first draft, that shows the architecture in AsmJS.cpp. This implements int32x4 literal constants in asm.js (var x = int32x4(1,2,3,4)), additions for these constants (z = int32x4add(x,y)) and lane retrievals (val.x, val.y, etc.).

I would like to have the multiply operation implemented too, as well as float32x4 support and add and mul for float32x4, before asking for feedback and splitting my patch.

A few remarks:
- I've used movdqu for moving between addresses and registers, in the case of register spilling for instance. Agner's instruction tables say it's less efficient than movdqa on Core 2 architectures, but aligning the stack slots is painful, as far as I've quickly tested. I'll look deeper into that.
- we can't store any MIR nodes as lanes of a SIMD unit so far (for instance: function f(){var x=42;var y=int32x4(0,0,0,0);y=int32x4(x, x+1|0, g()|0, 13)). The right way to do it would be to allocate temporarily (or even better, pre allocate) space on the stack so that we can 1) move each of the 4 values in a 32 bits slot in the stack 2) use an aligned move from the stack to the SIMD register. Thoughts?
- Odin specific: I made the type (in the sense of asm.js type system) of any int32x4 be Int, even though it could make sense to make it signed by default. Thoughts?
(Assignee)

Comment 10

3 years ago
Forgot to say: it will only work on x64 platforms right now, didn't bother with x86 and ARM yet.
Comment on attachment 8414554 [details] [diff] [review]
wip

Review of attachment 8414554 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/assembler/assembler/X86Assembler.h
@@ +2574,5 @@
> +        m_formatter.prefix(PRE_SSE_66);
> +        m_formatter.twoByteOp(OP2_PADDDW_VdqWdq, (RegisterID)dst, (RegisterID)src);
> +    }
> +
> +    void pxorqw_rr(XMMRegisterID src, XMMRegisterID dst)

I'm curious why you appended "qw" onto the name here.

@@ +2848,5 @@
>  
> +#ifdef WTF_CPU_X86_64
> +    JmpSrc movdqa_ripr(XMMRegisterID dst)
> +    {
> +        spew("movdqa \?(%%rip), %s",

? does not need to be escaped.

@@ +2889,5 @@
> +    }
> +
> +    void movdqa_rr(XMMRegisterID src, XMMRegisterID dst)
> +    {
> +        spew("movdqa %s, %s",

Please add whitespace before the first operand so that it's consistent with other instructions.

::: js/src/jit/LIR-Common.h
@@ +247,5 @@
> +    }
> +};
> +
> +// Adds two packed integers operands
> +class LPackedAddI : public LInstructionHelper<1, 2, 0>

Since we're going to be adding a bunch more operators, would it make sense to have a common class instead of one class per operation, similar to LMathD for floating-point operators?

::: js/src/jit/MIR.h
@@ +1024,5 @@
> +      : MUnaryInstruction(obj), lane_(lane)
> +    {
> +        JS_ASSERT(IsSIMDType(obj->type()));
> +        JS_ASSERT(lane < SIMDTypeToArity(obj->type()));
> +        JS_ASSERT(!IsSIMDType(type));

One additional thing you could assert here is that the scalar type of obj->type() is the same as type.

::: js/src/jit/StackSlotAllocator.h
@@ +36,5 @@
> +            normalSlots.append(height_ += 4);
> +        if (height_ % 16 != 0)
> +            doubleSlots.append(height_ += 8);
> +        // XXX that doesn't ensure that StackSlotAllocator.height + sp is 16
> +        // bytes aligned => use movups in the meanwhile

I suspect the problem here is that the JIT currently doesn't keep the stack pointer aligned.
(Assignee)

Comment 12

3 years ago
Thanks for your early feedback!

(In reply to Dan Gohman [:sunfish] from comment #11)
> Comment on attachment 8414554 [details] [diff] [review]
> wip
> 
> Review of attachment 8414554 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I'm curious why you appended "qw" onto the name here.

pxor has a MMX variant on 64 bits, but we actually don't care about MMX so i'll remove that.

> 
> @@ +2848,5 @@
> >  
> > +#ifdef WTF_CPU_X86_64
> > +    JmpSrc movdqa_ripr(XMMRegisterID dst)
> > +    {
> > +        spew("movdqa \?(%%rip), %s",
> 
> ? does not need to be escaped.
> 
As far as I can tell, all of the RIP accesses have escaped question marks in this file. Not sure whether it's actually used as a marker that gets replaced by the actual offset later or just a typo we could fix.

> Please add whitespace before the first operand so that it's consistent with
> other instructions.
Fixed.

> 
> Since we're going to be adding a bunch more operators, would it make sense
> to have a common class instead of one class per operation, similar to LMathD
> for floating-point operators?
Yeah, figured out the same thing when implementing subtract. However, I wouldn't extend this to *all* operations of a given kind, as it would just be equivalent to have another visitor pattern in the visitor pattern. Grouping basic arithmetic ops together (+, -, /, *) seems to make sense, withFlag{X,Y,Z,W} should share a LIR class (as getLane{X,Y,Z,W} does in this patch). Does it seem fair?

> 
> ::: js/src/jit/MIR.h
> @@ +1024,5 @@
> > +      : MUnaryInstruction(obj), lane_(lane)
> > +    {
> > +        JS_ASSERT(IsSIMDType(obj->type()));
> > +        JS_ASSERT(lane < SIMDTypeToArity(obj->type()));
> > +        JS_ASSERT(!IsSIMDType(type));
> 
> One additional thing you could assert here is that the scalar type of
> obj->type() is the same as type.
> 
Good idea, I'll add that.

> ::: js/src/jit/StackSlotAllocator.h
> @@ +36,5 @@
> > +            normalSlots.append(height_ += 4);
> > +        if (height_ % 16 != 0)
> > +            doubleSlots.append(height_ += 8);
> > +        // XXX that doesn't ensure that StackSlotAllocator.height + sp is 16
> > +        // bytes aligned => use movups in the meanwhile
> 
> I suspect the problem here is that the JIT currently doesn't keep the stack
> pointer aligned.

Yeah, that's it. A lot of places in the compiler assumes that the worse case needed alignment is 8 bits (not bytes as I wrote in this comment :/), so hacking around is not enough and a deeper analysis must be done.
(In reply to Benjamin Bouvier [:bbouvier] from comment #12)
> > @@ +2848,5 @@
> > >  
> > > +#ifdef WTF_CPU_X86_64
> > > +    JmpSrc movdqa_ripr(XMMRegisterID dst)
> > > +    {
> > > +        spew("movdqa \?(%%rip), %s",
> > 
> > ? does not need to be escaped.
> > 
> As far as I can tell, all of the RIP accesses have escaped question marks in
> this file. Not sure whether it's actually used as a marker that gets
> replaced by the actual offset later or just a typo we could fix.

The spew output is not updated. Looks like a typo. I filed bug 1004221 (with a patch) to fix this.

> Yeah, figured out the same thing when implementing subtract. However, I
> wouldn't extend this to *all* operations of a given kind, as it would just
> be equivalent to have another visitor pattern in the visitor pattern.
> Grouping basic arithmetic ops together (+, -, /, *) seems to make sense,
> withFlag{X,Y,Z,W} should share a LIR class (as getLane{X,Y,Z,W} does in this
> patch). Does it seem fair?

Yeah, that seems fine.
(Assignee)

Updated

3 years ago
Depends on: 1014083
(Assignee)

Updated

3 years ago
Depends on: 1019831
(Assignee)

Comment 14

3 years ago
Created attachment 8435653 [details] [diff] [review]
bug992267-wip.patch

Updated wip. Doesn't address all comments yet, but allows compilations on x86 *and* x64 and tests passing, with patches from bug 1019831 applied.
Attachment #8414554 - Attachment is obsolete: true
(Assignee)

Updated

3 years ago
Depends on: 1023404
(Assignee)

Comment 15

3 years ago
Created attachment 8463485 [details] [diff] [review]
WIP

So now it's in a better shape. It needs other patches from numerous bugs to be applied (bug 1019831, bug 1023404, bug 1021716, bug 1025475, bug 1043337). I will post a rolled up patch so that it's easier to connect the parts and to test directly.

Globals are not finished to be implemented yet (it's in a separate patch -- I am sorry I haven't split feature implementation before, i feel bad about that).

There are a few XXX which need to be discussed. And also other things, like for instance, the initializers for SIMD values are in a separate vector as scalar initializers, as SimdConstant (from bug 1025475) cannot fit in a union, because it has a non-trivial, user-defined constructor. See also [1]. The solution was either to do what I've done, or to add default generated ctor and dtor to SimdConstant (as it's done for Value in Value.h), or make all structs initialize explicitly as in [1]'s best answer. The issue is shown again when implementing Globals, so opinions will be very appreciated.

I've already added a few tests but the number of paths to test is becoming quite high. I'll add more, but would appreciate feedback about obvious paths that wouldn't be tested (for instance, storing and loading from the heap, probably).

[1] http://stackoverflow.com/questions/10693913/c11-anonymous-union-with-non-trivial-members
Attachment #8435653 - Attachment is obsolete: true
Attachment #8463485 - Flags: feedback?(luke)
(Assignee)

Comment 16

3 years ago
Created attachment 8463487 [details] [diff] [review]
Rolled up patch for testing

Rolled up patch. Applies cleanly on m-i  196363:ba89f7ce6ff2.

Comment 17

3 years ago
Comment on attachment 8463485 [details] [diff] [review]
WIP

Review of attachment 8463485 [details] [diff] [review]:
-----------------------------------------------------------------

Here's my first batch of comments.  Still reading through AsmJS.cpp.

Overall, this looks fantastic, really fits with the Odin style of doing things, no major changes requested.  Great work!

Question:
Does the jitSupportsFloatingPoint check in EstablishPreconditions imply we have sufficient HW support for SIMD?  If so, it'd be nice if this was somehow made a bit more explicit, perhaps adding a second flag that is initialized right alongside jitSupportsFloatingPoint so that we could see, in the arch-specific probing code, that we are probing for SSE/NEON.

::: js/src/jit-test/tests/asm.js/testSIMD.js
@@ +65,5 @@
> +
> +assertAsmTypeFail('glob', USE_ASM + F32 + "function f() {var x=f4;} return f");
> +assertAsmTypeFail('glob', USE_ASM + F32 + "function f() {var x=f4();} return f");
> +assertAsmTypeFail('glob', USE_ASM + F32 + "function f() {var x=f4(1,2,3);} return f");
> +assertAsmTypeFail('glob', USE_ASM + F32 + "function f() {var x=f4(1,2,3,4);} return f");

I think we should accept int intializers, in symmetry with how float32 variable initializers accept ints.

@@ +176,5 @@
> +assertEq(f32x4.y, Math.fround(42.42));
> +assertEq(f32x4.z, -0);
> +assertEq(f32x4.w, NaN);
> +
> +// TODO what if the actual argument isn't an int32x4 or a float32x4

I think we just do what the JS semantics say for "int32x4(x)" which will either produce a new SIMD value or throw.  I'm a bit fuzzy on the specifics of what happens in the non-throwing case (like, will int32x4({x:1,y:1,z:1,w:1}) coerce to an int32x4 or throw?), but for now we can just factor out a C++ ToInt32x4/ToFloat32x4 (in SIMD.cpp) and call these in CallAsmJS/CoerceInPlace_*.

@@ +276,5 @@
> +assertAsmTypeFail('glob', USE_ASM + I32 + "var sin=glob.Math.sin; function f() {var x=i4(1,2,3,4); return +sin(x);} return f");
> +assertAsmTypeFail('glob', USE_ASM + I32 + "var ceil=glob.Math.ceil; function f() {var x=i4(1,2,3,4); return +ceil(x);} return f");
> +
> +// 3.2. FFI calls
> +// Checking that FFI calls can't be inlined (yet) with Ion

I see we haven't got any responses yet on http://discourse.specifiction.org/t/simd-should-they-be-returned-from-passed-to-ffi.  Perhaps we can re-pinging people for opinions.

::: js/src/jit/AsmJS.cpp
@@ +480,5 @@
>          }
>          MOZ_ASSUME_UNREACHABLE("Invalid Type");
>      }
>  
> +    Type toScalarType() const {

How about naming these simdToScalarType()/simdToSimdType()?

::: js/src/jit/AsmJSLink.cpp
@@ +223,5 @@
> +    AsmJSSimdType type;
> +    RootedPropertyName simdTypeName(cx);
> +    if (global.which() == AsmJSModule::Global::SimdCtor) {
> +        type = global.simdCtorType();
> +        simdTypeName = global.simdCtorName();

Seems like, with simdCtorType and SimdTypeToName, we wouldn't need simdCtorName.  With that, you could lower the simdTypeName decl+init to after the 'if'.

@@ +237,5 @@
> +    if (!v.isObject())
> +        return LinkFail(cx, "bad SIMD type");
> +
> +    RootedObject x4desc(cx, &v.toObject());
> +    if (!x4desc->hasClass(&X4TypeDescr::class_))

I think you can instead use x4desc->is<XTypeDesc>().

@@ +243,5 @@
> +
> +    int32_t typeDescr;
> +    switch (type) {
> +      case AsmJSSimdType_int32x4:   typeDescr = Int32x4::type;   break;
> +      case AsmJSSimdType_float32x4: typeDescr = Float32x4::type; break;

Perhaps you could initialize the enum value of AsmJSSimdType_* with the *::type values and then have
  int32_t AsmJSSimdTypeToDescriptor(AsmJSSimdType);
in AsmJSModule.h?

@@ +247,5 @@
> +      case AsmJSSimdType_float32x4: typeDescr = Float32x4::type; break;
> +      default: MOZ_ASSUME_UNREACHABLE("unexpected SIMD type"); break;
> +    }
> +
> +    if (typeDescr != x4desc->getReservedSlot(JS_DESCR_SLOT_TYPE).toInt32())

Could you have a JSObject-derived class that provided this as a typed accessor?

@@ +298,5 @@
> +{
> +    RootedValue v(cx);
> +    // SIMD operations are loaded from the SIMD type, so the type must have been
> +    // validated before the operation.
> +    JS_ALWAYS_TRUE(ValidateSimdType(cx, global, globalVal, &v));

Can you put the RootedValue after the comment?

@@ +507,2 @@
>      // (in the low word) or double value, with the coercions specified by the
>      // asm.js signature. The external entry point unpacks this array into the

Could you add ", or a SIMD vector value, with the coercions...".

@@ +605,5 @@
> +      }
> +      case AsmJSModule::Return_Float32x4: {
> +            float *data = (float*)&coercedArgs[0];
> +            // XXX Canonicazlie?
> +            RootedObject x4obj(cx, Create<Float32x4>(cx, data));

nit: 'Create' might be a bit overly-general of a name, perhaps CreateSimd?

::: js/src/jit/shared/CodeGenerator-x86-shared.cpp
@@ +328,5 @@
> +                masm.storeDouble(ToFloatRegister(ins->arg()), dst);
> +                return true;
> +              case MIRType_Int32x4:
> +                // XXX not sure dst (SP + spOffset()) will be always aligned
> +                masm.storeAlignedInt32x4(ToFloatRegister(ins->arg()), dst);

If we keep sp 16-byte-aligned and ABIArgGenerator does its job, then it seems like this would be aligned?
It'd be good if storeAlignedInt32x4 did a DEBUG-only alignment check, though.

::: js/src/jit/x64/Assembler-x64.cpp
@@ +32,5 @@
>      if (regIndex_ == NumIntArgRegs) {
>          current_ = ABIArg(stackOffset_);
>          stackOffset_ += sizeof(uint64_t);
> +        if (IsSimdType(type))
> +            stackOffset_ += sizeof(uint64_t);

Might be a bit more clear to have:
  if (IsSimdType(type))
      stackOffset_ += Simd128DataSize;
  else
      stackOffset_ += sizeof(uint64_t);
for later 'grep Simd128DataSize' when 32 and 64-byte vectors are added.
(Assignee)

Comment 18

3 years ago
Created attachment 8465665 [details] [diff] [review]
globals (WIP)

This one adds support for SIMD globals. I finally preferred making SimdConstant having default ctors, for the union issue.
(Assignee)

Comment 19

3 years ago
Created attachment 8466061 [details] [diff] [review]
WIP v2
Attachment #8463485 - Attachment is obsolete: true
Attachment #8463485 - Flags: feedback?(luke)
Attachment #8466061 - Flags: feedback?(luke)
(Assignee)

Comment 20

3 years ago
Thanks for your review and comments!

Interdiff encountered some errors, there is way more than what's indicated here:
https://bugzilla.mozilla.org/attachment.cgi?oldid=8463485&action=interdiff&newid=8466061&headers=1

(In reply to Luke Wagner [:luke] from comment #17)
> Comment on attachment 8463485 [details] [diff] [review]
> WIP
> 
> Review of attachment 8463485 [details] [diff] [review]:
> -----------------------------------------------------------------
> Question:
> Does the jitSupportsFloatingPoint check in EstablishPreconditions imply we
> have sufficient HW support for SIMD?  If so, it'd be nice if this was
> somehow made a bit more explicit, perhaps adding a second flag that is
> initialized right alongside jitSupportsFloatingPoint so that we could see,
> in the arch-specific probing code, that we are probing for SSE/NEON.

It's strange. On x86, checking for floating point is verifying whether we have SSE2 support, so it's fine; but x64 unconditionally returns true for floating point support, so we need another function jitSupportsSimd(), down to the assembler/MacroAssembler*. However, we can't have this check in EstablishPreconditions, as having SIMD should be optional, not a prerequisite (at least during experimenting). As a matter of fact, I've put the check for jitSupportsSimd() during GenerateCode, which is also checking for a static const bool SupportsSimd, which has the same semantics but is manually defined.

> 
> ::: js/src/jit-test/tests/asm.js/testSIMD.js
> @@ +65,5 @@
> > +
> > +assertAsmTypeFail('glob', USE_ASM + F32 + "function f() {var x=f4;} return f");
> > +assertAsmTypeFail('glob', USE_ASM + F32 + "function f() {var x=f4();} return f");
> > +assertAsmTypeFail('glob', USE_ASM + F32 + "function f() {var x=f4(1,2,3);} return f");
> > +assertAsmTypeFail('glob', USE_ASM + F32 + "function f() {var x=f4(1,2,3,4);} return f");
> 
> I think we should accept int intializers, in symmetry with how float32
> variable initializers accept ints.
Makes sense. Done.

> 
> @@ +176,5 @@
> > +assertEq(f32x4.y, Math.fround(42.42));
> > +assertEq(f32x4.z, -0);
> > +assertEq(f32x4.w, NaN);
> > +
> > +// TODO what if the actual argument isn't an int32x4 or a float32x4
> 
> I think we just do what the JS semantics say for "int32x4(x)" which will
> either produce a new SIMD value or throw.  I'm a bit fuzzy on the specifics
> of what happens in the non-throwing case (like, will
> int32x4({x:1,y:1,z:1,w:1}) coerce to an int32x4 or throw?), but for now we
> can just factor out a C++ ToInt32x4/ToFloat32x4 (in SIMD.cpp) and call these
> in CallAsmJS/CoerceInPlace_*.
I made a ToVectorObject<V> function in SIMD.cpp, which currently just checks if the input HandleValue is actually a Vector of the expected kind and throws otherwise. We can change it in the future so that it can handle scalar types and such, as the specification moves on.

And of course, I've added plenty of tests for this case now \o/

> I think you can instead use x4desc->is<XTypeDesc>().
Done, thanks for the hint.

> Perhaps you could initialize the enum value of AsmJSSimdType_* with the
> *::type values and then have
>   int32_t AsmJSSimdTypeToDescriptor(AsmJSSimdType);
> in AsmJSModule.h?
In this case, AsmJSSimdTypeToDescriptor(AsmJSSimdType) would just convert its input to an int32_t, right? I just did it directly in this function, with a comment.

> Could you have a JSObject-derived class that provided this as a typed
> accessor?
Yes, there is actually already a function for that. Done.

> 
> @@ +605,5 @@
> > +      }
> > +      case AsmJSModule::Return_Float32x4: {
> > +            float *data = (float*)&coercedArgs[0];
> > +            // XXX Canonicazlie?
> > +            RootedObject x4obj(cx, Create<Float32x4>(cx, data));
> 
> nit: 'Create' might be a bit overly-general of a name, perhaps CreateSimd?
Will file a follow-up for that, as Create is already in SIMD.{h,cpp}.

> 
> ::: js/src/jit/shared/CodeGenerator-x86-shared.cpp
> @@ +328,5 @@
> > +                masm.storeDouble(ToFloatRegister(ins->arg()), dst);
> > +                return true;
> > +              case MIRType_Int32x4:
> > +                // XXX not sure dst (SP + spOffset()) will be always aligned
> > +                masm.storeAlignedInt32x4(ToFloatRegister(ins->arg()), dst);
> 
> If we keep sp 16-byte-aligned and ABIArgGenerator does its job, then it
> seems like this would be aligned?
Right. I was fuzzy with the AsmJSCall which handles its own needed stack depth, but the stack args are passed before the call (of course!) so the issue can't show up. Plus, it didn't show up on testing on x86, so that's fine.

> It'd be good if storeAlignedInt32x4 did a DEBUG-only alignment check, though.
Not sure this debug check would add a lot of value: if there's an alignment issue, there'll be a segfault with SIGBUS error, which sounds more explicit than a hlt;

All other style nits fixed.
(Assignee)

Comment 21

3 years ago
Created attachment 8466088 [details] [diff] [review]
WIP v3

Same patch, but we check that the graph doesn't contain SIMD instructions at lowering, not code generation (which is actually already too late).

(see bigger comment above)
Attachment #8466061 - Attachment is obsolete: true
Attachment #8466061 - Flags: feedback?(luke)
Attachment #8466088 - Flags: feedback?(luke)
(Assignee)

Comment 22

3 years ago
Created attachment 8466089 [details] [diff] [review]
ARM stubs
(Assignee)

Comment 23

3 years ago
Created attachment 8466090 [details] [diff] [review]
Testing function isAsmJSSimdCompilationAvailable

Very practical for skipping tests on ARM.
Attachment #8466090 - Flags: feedback?(luke)
(Assignee)

Comment 24

3 years ago
Created attachment 8466094 [details] [diff] [review]
Rolled up patch for testing v2

Seems to work on all platforms, with the latest patches applied.
Attachment #8463487 - Attachment is obsolete: true
(Assignee)

Comment 25

3 years ago
Created attachment 8466217 [details] [diff] [review]
Rolled up patch for testing v3

So Try wasn't really agreeing with me that it would compile on all platforms. I made some changes to a few patches, and i hope my third try round will return greener.

For what it's worth, I've settled the SimdConstant issue by just making static SimdConstant::Create functions rather than constructors. This way, SimdConstant doesn't have user defined ctors and can thus be used in unions.

To wit: there's strictly nothing to change for non-nightly builds. The SIMD object isn't defined in the global scope, in non-nightly builds. So module compilation will work if you use SIMD, but linking will fail as there's no SIMD object in the global scope. This means we could easily ship it and make it a Nightly-only feature \o/
Attachment #8466094 - Attachment is obsolete: true

Comment 26

3 years ago
Comment on attachment 8466088 [details] [diff] [review]
WIP v3

Review of attachment 8466088 [details] [diff] [review]:
-----------------------------------------------------------------

This is getting close to something we can land!

So, given what you said in the above comment about the absence of the SIMD constructors from non-NIGHTLY builds effectively disabling this feature, what else needs to happen before we can land this?

One global nit: I think it's important we regularly go with SIMD or Simd.  There seems to be precedent already for SIMD (builtin/SIMD.cpp, lots of SIMD use in SIMD.cpp) and no Simd that I can see.  There is also a light precedent for all-caps acronyms in Odin (re: FFI), so let's go with SIMD everywhere in this patch too.

::: js/src/jit/AsmJS.cpp
@@ +525,5 @@
> +          case Intish:
> +          case Void:
> +            break;
> +        }
> +        MOZ_CRASH("Invalid SIMD Type");

For now, everywhere uses MOZ_ASSUME_UNREACHABLE.

@@ +1139,5 @@
>          unsigned column;
>      };
>  
>      typedef HashMap<PropertyName*, MathBuiltin> MathNameMap;
> +    typedef HashMap<PropertyName*, AsmJSSimdOperation> SimdOperationNameMap;

Could you instead add a new case to MathBuiltin by adding a case to MathBuiltin::Kind and a new addStandardLibraryMathName overload?

@@ +1832,5 @@
>             (pn->isKind(PNK_NEG) && UnaryKid(pn)->isKind(PNK_NUMBER));
>  }
>  
> +static const ModuleCompiler::Global *
> +IsCallToGlobal(ModuleCompiler &m, ParseNode *pn)

For this type of situation, I'd go with the more traditional signature:
  static inline bool
  IsCallToGlobal(ModuleCompiler &m, ParseNode *pn, const ModuleCompiler::Global **global);

@@ +2024,5 @@
> +            break;
> +          case NumLit::Fixnum:
> +          case NumLit::NegativeInt:
> +          case NumLit::BigUnsigned:
> +            val[i] = float(lit.toInt32());

I think this handles BigUnsigned incorrectly (it'll be treated as a negative which will be exposed in the float).  If so, could you add a test-case.

@@ +2543,5 @@
>          CallSiteDesc callDesc(lineno, column, CallSiteDesc::Relative);
>          curBlock_->add(MAsmJSInterruptCheck::New(alloc(), &m().syncInterruptLabel(), callDesc));
>      }
>  
> +    MDefinition *SimdExtractElement(SimdLane lane, MDefinition *base, MIRType type)

First letter should be lowercase.  With suggested SIMD renaming, this could be extractSIMDElement.

@@ +2552,5 @@
> +        curBlock_->add(ins);
> +        return ins;
> +    }
> +
> +    MDefinition *SimdBinaryArith(MDefinition *lhs, MDefinition *rhs, MSimdBinaryArith::Operation op,

Perhaps just 'binarySIMD' and put it below 'binary'?

@@ +2565,5 @@
> +    template<typename T>
> +    MDefinition *SimdCtorX4(MDefinition *x, MDefinition *y, MDefinition *z, MDefinition *w,
> +                            MIRType type)
> +    {
> +        JS_ASSERT(IsSimdType(type));

constructSIMD?  (X4 is in the signature)

Also, I think all 3 of these functions need an inDeadCode() check.  Can you added a test-case for this?

@@ +3524,5 @@
> +
> +        if (!IsUseOfName(global, m.module().globalArgumentName()))
> +            return m.fail(base, "expecting global.*");
> +        if (mathOrSimd != m.cx()->names().Math && mathOrSimd != m.cx()->names().SIMD)
> +            return m.fail(base, "expecting global.Math or global.SIMD");

I realized that, for slightly better error messages, we can now do:
  return m.failName(base, "expecting %s.*", m.module().globalArgumentName());

@@ +3534,5 @@
> +        JS_ASSERT(mathOrSimd == m.cx()->names().SIMD);
> +        AsmJSSimdType simdType;
> +        if (!IsSimdTypeName(m, field, &simdType))
> +            return m.failName(initNode, "'%s' is not a standard SIMD type", field);
> +        return m.addSimdCtor(varName, simdType, field);

For symmetry, how about using the control flow:
  if (mathOrSimd == m.cx()->names().Math)
     return CheckGlobalMathImport(...)
  if (mathOrSimd == m.cx()->names().SIMD)
     return CheckGlobalSIMDImport(...);
  return m.failName("expecting %s.(Math|SIMD)", m.module().globalArgumentName());
?

@@ +3549,5 @@
>      if (IsUseOfName(base, m.module().importArgumentName()))
>          return m.addFFI(varName, field);
>  
> +    if (base->isKind(PNK_NAME))
> +        return CheckGlobalSimdOperationImport(m, initNode, varName, base->name(), field);

I was thinking that, since the last 3 cases here are all handling names, we could use the following structure:

  if (!base->isKind(PNK_NAME))
      return m.fail(base, "expecting name of variable or parameter");

  if (base->name() == m.module().globalArgumentName())
      ...
  if (base->name() == m.module().importArgumentName())
      ....
  
  const ModuleCompiler::Global *global = m.lookupGlobal(base->name());
  if (!global)
      return m.failName(base, "%s not found in module global scope", base->name());

  switch (global->kind()) {
    case ModuleCompiler::Global::SimdCtor:
      return CheckGlobalSIMDOperationImport(m, global, field);
    case ...
      return m.failName(field, "expecting SIMD constructor name, got %s", base->name());
  }

  MOZ_ASSUME_UNREACHABLE("Bad global kind");
}

@@ +3738,5 @@
> +        return f.addVariable(var, name, literal.varType(), literal.value());
> +    }
> +
> +    const ModuleCompiler::Global *global = nullptr;
> +    if (IsSimdCtorCall(f.m(), initNode, &global)) {

Ideally, I think, all the SIMD-literal handling would occur within IsNumericLiteral and ExtractNumericLiteral.  That means we'd need to extend NumLit to have a SIMD case and possibly hold a SimdConstant, but that seems fine to me.  Storing a Value was just a convenient hack, and probably we should switch to a big union that stores the unboxed types directly.  We could reuse the same union type for TypedValue to avoid the varInitializers/simdVarInitializers split.

The good thing about handling SIMD literals with general literals is it'll force us to consider "what about SIMD literals" at ever case.

Along the same lines, it'd be nice to generalize IsFloatCoercion to "IsTypeCoercion" (that handles both the Float and Simd cases) and call that from the non-float-specific cases: CheckTypeAnnotation, CheckUncoercedCall.

@@ +3991,5 @@
> +    Type baseType;
> +    if (!CheckExpr(f, base, &baseDef, &baseType))
> +        return false;
> +    if (!baseType.isSimd())
> +        return f.fail(base, "dot access base must be a SIMD type");

Maybe add "expected SIMD type, got %s"?

@@ +3998,5 @@
> +    PropertyName *field = DotMember(elem);
> +    if (field == m.cx()->names().signMask) {
> +        // TODO signMask
> +        MOZ_CRASH("signMask NYI");
> +        return true;

I'm not sure what this is, probably should remove for initial landing.

@@ +4470,5 @@
>  
>  static bool CheckCall(FunctionCompiler &f, ParseNode *call, RetType retType, MDefinition **def, Type *type);
>  
>  static bool
> +PromoteToFloat32(FunctionCompiler &f, ParseNode *inputNode, Type inputType, MDefinition *inputDef,

How about CheckFloatCoercionArg?

@@ +4541,5 @@
>          // definition and return types should be ignored by the caller
>          return true;
> +      case RetType::Int32x4:
> +      case RetType::Float32x4:
> +        MOZ_ASSUME_UNREACHABLE("Math.fround can't possibly return SIMD types");

I think this is reachable since retType is the coercion applied by the parent expression, which could be anything.  I *think* an example would be int32x4(fround(3)).  If so, could you add a test?

@@ +4568,5 @@
>  CheckMathBuiltinCall(FunctionCompiler &f, ParseNode *callNode, AsmJSMathBuiltinFunction func,
>                       RetType retType, MDefinition **def, Type *type)
>  {
> +    if (retType.toType().isSimd())
> +        return f.fail(callNode, "math builtin cannot be used as SIMD");

I would think this test wouldn't be necessary (with the above CheckMathFRound change); everywhere should handle a SIMD ret type like any other bad return type.

@@ +5264,5 @@
> +    MOZ_ASSUME_UNREACHABLE("unexpected SIMD type");
> +}
> +
> +static bool
> +CheckSimdCtorCall(FunctionCompiler &f, ParseNode *expr, MDefinition **def, Type *type,

Could all this call logic be refactored so that SIMD calls used the same CheckCall path as other calls?  That is, when you did a float32x4(f()), it'd call CheckCall(), passing retType = Type::Float32x4 etc.  This would match up with the above suggestion to generalize IsFloatCoercion to IsTypeCoercion (which we'll further use in the future for Typed Objects).

::: js/src/jit/AsmJSLink.cpp
@@ +224,5 @@
> +    if (global.which() == AsmJSModule::Global::SimdCtor) {
> +        type = global.simdCtorType();
> +    } else {
> +        JS_ASSERT(global.which() == AsmJSModule::Global::SimdOperation);
> +        type = global.simdOperationType();

Probably don't need the which() assert since simdOperationType() immediately does the same assert.

@@ +239,5 @@
> +    if (!x4desc->is<X4TypeDescr>())
> +        return LinkFail(cx, "bad SIMD type");
> +
> +    // AsmJSSimdType-s are actually initialized to the typeDescr value.
> +    if (int32_t(type) != int32_t(x4desc->as<X4TypeDescr>().type()))

I'm a bit worried about a cast like this since it silences the normal compile-time errors/warnings.  What about if we don't actually introduce a new enum and just use X4TypeDescr::Type?  If that didn't work, I'd rather have the explicitly type conversion function:
  X4TypeDescr::Type AsmJSSimdTypeToTypeDescrType(AsmJSSimdType);
that we could use here.

@@ +522,5 @@
>              if (!RoundFloat32(cx, v, (float *)&coercedArgs[i]))
>                  return false;
>              break;
> +          case AsmJS_ToInt32x4: {
> +            if (!ToVectorObject<Int32x4>(cx, v))

For symmetry with ToInt32/ToNumber, could you instead have a function that takes the raw memory as an outparam pointer?  Perhaps you could hoist the definition of SimdSizedArg (currently in AsmJSModule.h) into SIMD.h and use this as the outparam type.  With that, you could write:

  if (!ToVector<Float32x4>(cx, v, (SIMDSizedType *)&coercedArgs[i])))
      return false;

which is nice and symmetric-looking.  Alternatively, could SimdConstant be used?  I can't find the definition

@@ +533,5 @@
> +          case AsmJS_ToFloat32x4: {
> +            if (!ToVectorObject<Float32x4>(cx, v))
> +                return false;
> +
> +            // XXX canonicalize NaN

IIUC, NaN-canonicalization will happen when reading the lanes of a SIMD vector, and never sooner.  This is important since people sometimes to weird bitwise things with float32x4 vectors and we don't want to scramble their bits on them.  That is, float32x4 vector operations would *never* canonialize and *any* float32x4->{value,number} operation *would* canonicalize.

@@ +587,5 @@
>          callArgs.rval().set(NumberValue(*(double*)&coercedArgs[0]));
>          break;
> +      case AsmJSModule::Return_Int32x4: {
> +            int32_t *data = (int32_t*)&coercedArgs[0];
> +            RootedObject x4obj(cx, Create<Int32x4>(cx, data));

Technically, if any case uses { }, all should.  Instead, you could hoist the RootedObject decl out of the switch and then write:
  obj = Create<Int32x4>(cx, (int32_t*)&coercedArgs[0]);
and avoid the { } in the first place.  Up to you.

@@ +595,5 @@
> +            break;
> +      }
> +      case AsmJSModule::Return_Float32x4: {
> +            float *data = (float*)&coercedArgs[0];
> +            // XXX Canonicazlie?

ditto

::: js/src/jit/x64/Assembler-x64.cpp
@@ +47,5 @@
>          current_ = ABIArg(FloatArgRegs[regIndex_++]);
>          break;
> +      case MIRType_Int32x4:
> +      case MIRType_Float32x4:
> +        // XXX >64 bits args might have to be on the stack on win64

According to http://msdn.microsoft.com/en-us/library/zthk2dkh.aspx, these are passed as pointers (in GPRs) to 16-byte caller-allocated stack memory.  That is actually kindof a pain to fit into the current ABIArgGenerator design but, fortunately, this issue only comes up if we call compiled C++ which we avoid if we make SIMD types not be extern.  I've been meaning to add an 'internal'/'external' enum argument to ABIArgGenerator which would allow us, e.g., to use a better calling ABI on x86.  With this flag, we could assert here that we never passed a SIMD type for an 'external' call.

::: js/src/vm/HelperThreads.cpp
@@ +853,5 @@
>          if (!OptimizeMIR(asmData->mir))
>              break;
>  
> +        if (asmData->mir->usesSimd() && !SupportsSimd)
> +            break;

It makes sense that we can't require this in EstablishPreconditions.  How about, instead, ModuleCompiler checks rt->jitSupportsSimd(), stores that in a bool field of ModuleCompiler and we check that as we type-check SIMD operations.  That way, on failure, we could give an error pointing at the offending ParseNode.
Attachment #8466088 - Flags: feedback?(luke)

Comment 27

3 years ago
Comment on attachment 8466090 [details] [diff] [review]
Testing function isAsmJSSimdCompilationAvailable

Review of attachment 8466090 [details] [diff] [review]:
-----------------------------------------------------------------

Could there instead be an isSIMDAvailable()?
(Assignee)

Comment 28

3 years ago
Created attachment 8470824 [details] [diff] [review]
WIP v4

(In reply to Luke Wagner [:luke] from comment #26)
> Comment on attachment 8466088 [details] [diff] [review]
> WIP v3
> 
> Review of attachment 8466088 [details] [diff] [review]:
> -----------------------------------------------------------------
> @@ +1139,5 @@
> >          unsigned column;
> >      };
> >  
> >      typedef HashMap<PropertyName*, MathBuiltin> MathNameMap;
> > +    typedef HashMap<PropertyName*, AsmJSSimdOperation> SimdOperationNameMap;
> 
> Could you instead add a new case to MathBuiltin by adding a case to
> MathBuiltin::Kind and a new addStandardLibraryMathName overload?
> 
We actually can't, since there would be name collisions between Math builtin and SIMD operations, because of min/max: Math.min vs SIMD.{int,float}32x4.min, ditto for max.

> 
> @@ +5264,5 @@
> > +    MOZ_ASSUME_UNREACHABLE("unexpected SIMD type");
> > +}
> > +
> > +static bool
> > +CheckSimdCtorCall(FunctionCompiler &f, ParseNode *expr, MDefinition **def, Type *type,
> 
> Could all this call logic be refactored so that SIMD calls used the same
> CheckCall path as other calls?  That is, when you did a float32x4(f()), it'd
> call CheckCall(), passing retType = Type::Float32x4 etc.  This would match
> up with the above suggestion to generalize IsFloatCoercion to IsTypeCoercion
> (which we'll further use in the future for Typed Objects).
Yes, this has been done. There was something unclear with my patch, as CheckSimdCtorCall had both the coercion role (e.g. int32x4(x)) and the construction role (e.g. int32x4(1,2,3,4)). Having IsTypeCoercion helped split the roles, so now SimdCtorCall is only used for *really* constructing values.

> 
> @@ +239,5 @@
> > +    if (!x4desc->is<X4TypeDescr>())
> > +        return LinkFail(cx, "bad SIMD type");
> > +
> > +    // AsmJSSimdType-s are actually initialized to the typeDescr value.
> > +    if (int32_t(type) != int32_t(x4desc->as<X4TypeDescr>().type()))
> 
> I'm a bit worried about a cast like this since it silences the normal
> compile-time errors/warnings.  What about if we don't actually introduce a
> new enum and just use X4TypeDescr::Type?  If that didn't work, I'd rather
> have the explicitly type conversion function:
>   X4TypeDescr::Type AsmJSSimdTypeToTypeDescrType(AsmJSSimdType);
> that we could use here.

You're right about the wild casts. I'm in favor with having a new function, otherwise the compiler will complain about Float64x2 (which are being implemented) not being handled in switches all over the places.

> 
> @@ +533,5 @@
> > +          case AsmJS_ToFloat32x4: {
> > +            if (!ToVectorObject<Float32x4>(cx, v))
> > +                return false;
> > +
> > +            // XXX canonicalize NaN
> 
> IIUC, NaN-canonicalization will happen when reading the lanes of a SIMD
> vector, and never sooner.  This is important since people sometimes to weird
> bitwise things with float32x4 vectors and we don't want to scramble their
> bits on them.  That is, float32x4 vector operations would *never*
> canonialize and *any* float32x4->{value,number} operation *would*
> canonicalize.
Good point. I'll add this directly in the SimdExtractElement operation, as it seems we want to always have this behavior.

I've addressed the other comments as well, but there are a few API design issues that need discussion (marked with XXX in the patch).
Attachment #8466088 - Attachment is obsolete: true
Attachment #8470824 - Flags: feedback?(luke)
(Assignee)

Comment 29

3 years ago
Created attachment 8470826 [details] [diff] [review]
IsSimdAvailable()

That works well too.
Attachment #8466090 - Attachment is obsolete: true
Attachment #8466090 - Flags: feedback?(luke)
Attachment #8470826 - Flags: review?(luke)
(Assignee)

Comment 30

3 years ago
Created attachment 8470836 [details] [diff] [review]
Globals (wip)

This adds imported globals and module-defined globals.
Attachment #8465665 - Attachment is obsolete: true
Attachment #8470836 - Flags: feedback?(luke)
(Assignee)

Comment 31

3 years ago
Created attachment 8470846 [details] [diff] [review]
Globals (wip)

forgot to qref
Attachment #8470836 - Attachment is obsolete: true
Attachment #8470836 - Flags: feedback?(luke)
Attachment #8470846 - Flags: feedback?(luke)

Comment 32

3 years ago
Comment on attachment 8470824 [details] [diff] [review]
WIP v4

Review of attachment 8470824 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for all the updates!  We're definitely converging on the right final patch.  A few more ideas came up answering your questions:

::: js/src/asmjs/AsmJSValidate.cpp
@@ +1907,5 @@
> +//          == MathBuiltinFunction etc.} }
> +//          template<typename T> IsTypeCoercion(ExpectedTypeConstraint<T>...)
> +//     - another way?
> +static bool
> +IsTypeCoercion(ModuleCompiler &m, Type expectedType, ParseNode *pn, ParseNode **coercedExpr)

What I'd do is to change 'expectedType' from an inparam to an outparam 'type'.  That is, if IsTypeCoercion returns 'true', *type is the type it was coerced to.  Then the caller can decide what to do.

Also, I'd make the type of 'type' be AsmJSCoercionType since:
 - that is what CheckTypeAnnotation wants
 - CheckCoercionArg would take this as a parameter and be able to construct the RetType directly

Also, you could switch on the AsmJSCoercionType.  You'd have to handle the float/int cases (marking them as unreachable), but that's fine.  Actually, this points out that we're not query for *any* coercion, just coercion *calls*, so perhaps we could use the names IsCoercionCall/CheckCoercionCallArg?

@@ +1975,5 @@
> +// IsTypeCoercion should be used.
> +// - constructor, with X arguments, X > 1. In this case, IsSimdCtorCall is the
> +// one to use.
> +static bool
> +IsSimdCtorCall(ModuleCompiler &m, ParseNode *pn, const ModuleCompiler::Global **global)

I think you're right that this function makes sense to keep separate from IsTypeCoercion.  I think you could change the 'global' outparam to an AsmJSSimdType outparam ('ctorType') since that is all the callers ever need.

@@ +3632,5 @@
> +                               ParseNode *initNode, PropertyName *varName, PropertyName *ctorVarName,
> +                               PropertyName *opName)
> +{
> +    if (!m.supportsSimd())
> +        return m.failSimd(initNode);

IIUC, for any other SIMD thing to validate, CheckGlobalSimdImport has to have validated and thus checked m.supportsSimd().  Thus, I think you can remove all the supportsSimd() checks except the one in CheckGlobalSimdImport.

@@ +3889,5 @@
> +
> +        if (literal.isSimd()) {
> +            if (!f.supportsSimd())
> +                return f.failSimd(initNode);
> +            return f.addVariable(initNode, name, literal.varType(), literal.simd());

As mentioned above, I don't think you need to check supportsSimd().  Also, I realized that TypedValue is really redundant with NumLit and that VarInitializerVector should store a vector of NumLit.  With this, addVariable could take a const NumLit& and then you could have a single call to addVariable here w/o the branch.

Ideally, AsmJSModule::Global::constLiteralValue() would return a NumLit, but it is declared in AsmJSModule.h.  However, looking at the globals WIP patch, I think it'd be nice to store a NumLit in AsmJSModule::Global, so perhaps you can hoist NumLit in AsmJSModule.h (AsmJSNumLit) and use it for the globals patch.  That way there is a single, well-defined concept of "numeric literal", not this js::Value goofiness.  It'll also help in the f.constant() case mentioned next.

Lastly, it'd also be good to rename literal.value() to literal.nonSimdValue(); after these changes, there should only be uses in FunctionCompiler::constant and addVariable.

@@ +3932,5 @@
> +        if (!f.supportsSimd())
> +            return f.failSimd(num);
> +        *def = f.constant(literal.simd(), literal.type());
> +    } else {
> +        *def = f.constant(literal.value(), literal.type());

Here, I think you could add a 'constant' overload that takes a NumLit.  This overload would have one other users (global->constLiteralValue() in CheckVarRef).

@@ +5489,5 @@
> +    if (IsSimdCtorCall(f.m(), expr, &global))
> +        return CheckSimdCtorCall(f, expr, def, type, global);
> +
> +    if (IsSimdOperationCall(f.m(), expr, &global))
> +        return CheckSimdOperationCall(f, expr, def, type, global);

If we were symmetric with the Math functions, we'd require callsite coercions, like:

+Math_sin(a);
int32x4(int32x4_add(a, b));

which means wouldn't have this IsSimdOperationCall() case in CheckUncoercedCall, you'd just add another case to CheckCall().  However, the above is pretty gnarly and totally unnecessary, given that the type of the operation is completely known from the name.  But the same is true for Math standard library functions too (and this is something people occasionally complain about).  So perhaps we could generalize the Math stdlib to be callable here also?  The important thing is that SIMD is symmetric with the other Math stdlibs.  If you agree, it seems like you could make the Math changes in an independent bug blocking this and then, in this patch, you'd just be adding a extra case.

@@ +6816,5 @@
> +
> +static const unsigned FramePushedAfterSaveSimd =
> +   SupportsSimd ? NonVolatileRegs.gprs().size() * sizeof(intptr_t) +
> +                  NonVolatileRegs.fpus().size() * Simd128DataSize
> +                : FramePushedAfterSave;

I think all uses of FramePushedAfterSave are renamed to FramePushedAfterSaveSimd, so might as well just have 1 named FramePushedAfterSave.

@@ +6839,5 @@
>  
>      // In constrast to the system ABI, the Ion convention is that all registers
>      // are clobbered by calls. Thus, we must save the caller's non-volatile
>      // registers.
> +    masm.PushRegsInMask(NonVolatileRegs, /* simdRegs = */ NonVolatileSimdRegs);

You're probably fine w/o the "/* simdRegs = */" since "Simd" is in the name.

::: js/src/jit/x64/Assembler-x64.cpp
@@ +48,5 @@
>          break;
> +      case MIRType_Int32x4:
> +      case MIRType_Float32x4:
> +        // XXX >64 bits args might have to be on the stack on win64
> +        current_ = ABIArg(FloatArgRegs[regIndex_++]);

I don't think you need XXX now; you can just say:
 // On Win64, >64 bit args need to be passed by reference, but asm.js doesn't allow passing
 // SIMD values to FFIs, so we can break the ABI in this case.
and put this comment in the "if (regIndex_ == NumIntArgRegs)" case above.

I just realized, though, we need to align stackOffset_ to be a multiple of Simd128DataSize before doing stackOffset_ += Simd128DataSize, or else the stack address will be misaligned (and we'll need this on both Win and !Win).  It'd be good to add a test for this.
Attachment #8470824 - Flags: feedback?(luke)

Updated

3 years ago
Attachment #8470826 - Flags: review?(luke) → review+
(Assignee)

Updated

3 years ago
Depends on: 1052325
(Assignee)

Updated

3 years ago
Depends on: 1052514
(Assignee)

Comment 33

3 years ago
Created attachment 8471671 [details] [diff] [review]
WIP v5

(globals will be added afterwards, due to recent changes)
Attachment #8470824 - Attachment is obsolete: true
Attachment #8470846 - Attachment is obsolete: true
Attachment #8470846 - Flags: feedback?(luke)
Attachment #8471671 - Flags: feedback?(luke)

Updated

3 years ago
Attachment #8471671 - Flags: feedback?(luke)
Created attachment 8477704 [details] [diff] [review]
WIP 6

This is a rebased version of bbouvier's "WIP 5" patch from Comment 33. It passes testSIMD.js, but is unfinished -- global variable support is missing, and I haven't checked that Comment 32 is adequately addressed.

While bbouvier is on PTO, I've been asked to get SIMD in shape for a demo at the IDF on 9 Sept. Odin support is the only important piece missing. Given the timeframe, it would be ideal to have Odin support landed the week of 25 Aug. That's probably enough time to get the code into good shape, but not enough time for several back-and-forth reviews.

Luke -- is there anything here that strikes you as particularly egregious, that would be a barrier for quick review?
Attachment #8471671 - Attachment is obsolete: true
Attachment #8477704 - Flags: feedback?(luke)

Comment 35

3 years ago
This patch is almost done, but there were some changes bbouvier and I had discussed right before he left (dependent bugs) that I think will lead to changes in this patch (in the validation rules).  I'm certain we can get it landed next week, though, assuming bbouvier is getting back on Monday.
(Assignee)

Updated

3 years ago
Blocks: 1058593

Comment 36

3 years ago
Comment on attachment 8477704 [details] [diff] [review]
WIP 6

New WIP is coming.
Attachment #8477704 - Flags: feedback?(luke)
(Assignee)

Comment 37

3 years ago
Created attachment 8479137 [details] [diff] [review]
WIP 7

To apply atop of dependent bug's patches. No globals yet.
Attachment #8477704 - Attachment is obsolete: true
Attachment #8479137 - Flags: review?(luke)
(Assignee)

Comment 38

3 years ago
Created attachment 8479202 [details] [diff] [review]
Globals (wip)

With the NumLit refactoring, this patch is so much easier!
To ensure alignment of SIMD datum, SIMD values are placed before scalar values in the global section.
Attachment #8479202 - Flags: review?(luke)

Comment 39

3 years ago
Comment on attachment 8479137 [details] [diff] [review]
WIP 7

Review of attachment 8479137 [details] [diff] [review]:
-----------------------------------------------------------------

This looking great, almost done.  Mostly nits, but a few suggested type rule changes.  It'd be nice to look over the patch again with these fixed.

::: js/src/asmjs/AsmJSLink.cpp
@@ +324,5 @@
> +    switch (global.simdOperationType()) {
> +      case AsmJSSimdType_int32x4:
> +        return ValidateInt32x4Operation(cx, global, v);
> +      case AsmJSSimdType_float32x4:
> +        return ValidateFloat32x4Operation(cx, global, v);

Seems like you could inline the two Validate*x4Operation functions and share the error-checking suffix w/o making this function too long.

@@ +533,5 @@
>      const AsmJSModule::ExportedFunction &func = FunctionToExportedFunction(callee, module);
>  
>      // The calling convention for an external call into asm.js is to pass an
> +    // array of 16-byte values where each value contains either a coerced int32
> +    // (in the low word), a double value or a SIMD vector value, with the

For symmetry, could you add "a double value (in the low dword)"

@@ +610,5 @@
>          callArgs.rval().set(ObjectValue(*obj));
>          return true;
>      }
>  
> +    RootedObject x4obj(cx);

The object pointer is immediately stored in callArgs.rval() so it's not live across a GC so you can just use a raw JSObject* and avoid the Rooted push/pop overhead.

@@ +622,5 @@
>        case AsmJSModule::Return_Double:
>          callArgs.rval().set(NumberValue(*(double*)&coercedArgs[0]));
>          break;
> +      case AsmJSModule::Return_Int32x4:
> +        x4obj = CreateSimd<Int32x4>(cx, (int32_t*) &coercedArgs[0]);

For symmetry with above 2 lines, could you take out the space between cast and &?

::: js/src/asmjs/AsmJSModule.h
@@ +136,5 @@
> +        return lit;
> +    }
> +
> +    static AsmJSNumLit Create(Which w, jit::SimdConstant c)
> +    {

Curly at end of previous line.

Could you also add JS_ASSERT(isSimd()) at the end of this function (and JS_ASSERT(!isSimd()) at the end of the other ctor?

@@ +381,5 @@
> +    struct SimdSizedArg {
> +        uint64_t lo;
> +        uint64_t hi;
> +    };
> +    JS_STATIC_ASSERT(sizeof(SimdSizedArg) >= jit::Simd128DataSize);

Perhaps we could just name the type EntryArg?

::: js/src/asmjs/AsmJSValidate.cpp
@@ +1273,5 @@
>              return false;
>          MathBuiltin builtin(cst);
>          return standardLibraryMathNames_.putNew(atom->asPropertyName(), builtin);
>      }
> +    bool addStandardSimdOpName(const char *name, AsmJSSimdOperation op) {

For symmetry, we should probably rename this to addStandardLibrarySimdOpName and rename standardSimdOpNames_ to standardLibrarySimdOpNames_.

@@ +1602,5 @@
> +                          PropertyName *typeVarName, PropertyName *opName)
> +    {
> +        if (!module_->addSimdOperation(type, op, opName))
> +            return false;
> +

I know I *usually* ask for \n after conditionals, but I think it looks a little nicer for all these short inline member function definitions to take them out.

@@ +1854,1 @@
>          return false;

I thought global had to be non-null here.

@@ +1904,5 @@
> +// IsCoercionCall should be used.
> +// - constructor, with X arguments, X > 1. In this case, IsSimdCtorCall is the
> +// one to use.
> +static bool
> +IsSimdCtorCall(ModuleCompiler &m, ParseNode *pn, AsmJSSimdType *type)

Instead of having a comment, arising from the two meanings of a ctor call, how about renaming to IsSimdTuple?

@@ +1910,5 @@
> +    const ModuleCompiler::Global *global;
> +    if (!IsCallToGlobal(m, pn, &global))
> +        return false;
> +
> +    if (!global || !global->isSimdCtor())

I think 'global' is non-null, check me though.

@@ +1938,5 @@
> +    MOZ_MAKE_COMPILER_ASSUME_IS_UNREACHABLE("Unexpected AsmJSSimdType");
> +}
> +
> +static bool IsNumericLiteral(ModuleCompiler &m, ParseNode *pn);
> +static AsmJSNumLit ExtractNumericLiteral(ModuleCompiler &m, ParseNode *pn);

Function names should start on a new line.

@@ +1941,5 @@
> +static bool IsNumericLiteral(ModuleCompiler &m, ParseNode *pn);
> +static AsmJSNumLit ExtractNumericLiteral(ModuleCompiler &m, ParseNode *pn);
> +
> +static bool
> +IsNumericSimdLiteral(ModuleCompiler &m, ParseNode *pn)

"Numeric" kinda doesn't make sense in this name, I think.  (On the subject, the "Numeric" in IsNumericFloatLiteral seems redundant; IsFloatLiteral would be fine.)

@@ +1954,5 @@
> +        if (!IsNumericLiteral(m, arg))
> +            return false;
> +
> +        AsmJSNumLit lit = ExtractNumericLiteral(m, arg);
> +        if (!IsValidSimdLiteralLaneInitializer(lit, type))

I'd inline this function and, for the int32x4 case accept IsLiteralInt() and, for the float32x4 case accept IsNumericNonFloatLiteral.  Technically this accepts BigUnsigned, but I think that's fine and more regular; it all gets cast to an int32 anyhow.

@@ +1995,5 @@
> +{
> +    ParseNode *arg = argList;
> +    int32_t val[4];
> +    for (size_t i = 0; i < 4; i++, arg = NextNode(arg))
> +        val[i] = ExtractNumericLiteral(m, arg).toInt32();

With the above change, you could use JS_ALWAYS_TRUE(IsLiteralInt(...)).

@@ +2009,5 @@
> +    float val[4];
> +    for (size_t i = 0; i < 4; i++, arg = NextNode(arg)) {
> +        AsmJSNumLit lit = ExtractNumericLiteral(m, arg);
> +        switch (lit.which()) {
> +          case AsmJSNumLit::Float:

With the above change, you could use 'val[i] = float(ExtractNumericNonFloatValue(pn));'  (ExtractNumericNonFloatValue currently mutates *pn for the benefit of ExtractNumericLiteral; you probably don't want that in a for loop so I'd make ExtractNumericNonFloatValue take an optional second outparam which is mutated if non-null and make the first arg an inparam.

@@ +2073,5 @@
> +        SimdConstant c = ExtractSimdValue(m, pn);
> +        if (c.type() == SimdConstant::Int32x4)
> +            return AsmJSNumLit::Create(AsmJSNumLit::Int32x4, c);
> +        if (c.type() == SimdConstant::Float32x4)
> +            return AsmJSNumLit::Create(AsmJSNumLit::Float32x4, c);

ExtractSimdValue already cases on the simd type, so perhaps you could just have ExtractSimdValue return an AsmJSNumLit so here you could 'return ExtractSimdValue'?

@@ +2298,5 @@
>          for (unsigned i = 0; i < varInitializers_.length(); i++) {
>              AsmJSNumLit &lit = varInitializers_[i];
> +
> +            MInstruction *ins;
> +            MIRType type = Type::Of(lit).toMIRType();

It's nice to put an uninitialized variable right before the if/else that initializes it, so I'd switch the order of these decls and possibly move the \n to be between 'type' and 'ins'.

@@ +2304,5 @@
> +               JS_ASSERT(IsSimdType(type));
> +               ins = MSimdConstant::New(alloc(), lit.simdValue(), type);
> +            } else {
> +               JS_ASSERT(!IsSimdType(type));
> +               ins = MConstant::NewAsmJS(alloc(), lit.scalarValue(), type);

MSimdConstant::New already asserts !IsSimdType(type), so we can remove that assert.  It'd be good for MConstant::NewAsmJS to assert !IsSimdType(type), so perhaps we can remove the second assert and the curlies.

@@ +2433,5 @@
> +    {
> +        if (inDeadCode())
> +            return nullptr;
> +
> +        JS_ASSERT(IsSimdType(lhs->type()) && rhs->type() == lhs->type());

Could one also assert lhs->type() == type?

@@ +3497,5 @@
> +        return m.addMathBuiltinConstant(varName, mathBuiltin.u.cst, field);
> +      default:
> +        break;
> +    }
> +    MOZ_MAKE_COMPILER_ASSUME_IS_UNREACHABLE("unexpected or uninitialized math builtin type");

In the cases that aren't small inlined helpers, I'd stick with just MOZ_CRASH.

@@ +3578,5 @@
> +      case ModuleCompiler::Global::FFI:
> +      case ModuleCompiler::Global::ArrayView:
> +      case ModuleCompiler::Global::MathBuiltinFunction:
> +      case ModuleCompiler::Global::SimdOperation:
> +        return m.failName(base, "expecting SIMD constructor name, got %s", field);

I probably recommended using a switch but, I see now that, given that there's only one handled case, perhaps best just to do "if (global->which() != SimdCtor) return fail".

@@ +4007,5 @@
> +        return false;
> +    if (!baseType.isSimd())
> +        return f.failf(base, "expected SIMD type, got %s", baseType.toChars());
> +
> +    ModuleCompiler &m = f.m();

If you're going to hoist the common subexpression, how about:
  JSAtomState &names = f.m().cx()->names();
?

@@ +4444,5 @@
>      PropertyName *calleeName = CallCallee(callNode)->name();
>  
>      if (retType == RetType::Float)
>          return f.fail(callNode, "FFI calls can't return float");
> +    if (IsSimdType(retType.toMIRType()))

It seems strange to use the MIR type in this context; could you instead use toType().isSimd()?

@@ +4462,5 @@
>      *type = retType.toType();
>      return true;
>  }
>  
>  static bool CheckCoercedCall(FunctionCompiler &f, ParseNode *call, RetType retType, MDefinition **def, Type *type);

Need to start 'CheckCoercedCall' on a new line.

@@ +4507,5 @@
> +}
> +
> +static bool
> +CheckCoercionArg(FunctionCompiler &f, ParseNode *arg, MDefinition **def, Type *type,
> +                 AsmJSCoercion expected)

Usually the inparams go before the outparams, so could you move 'expected' before 'def'?

@@ +4528,5 @@
> +        if (!CheckInt32x4CoercionArg(f, arg, inputType, inputDef, def))
> +            return false;
> +        break;
> +      case AsmJS_ToFloat32x4:
> +        if (!CheckFloat32x4CoercionArg(f, arg, inputType, inputDef, def))

For these two cases I'd just inline the functions since they are so short and not called anywhere else.

@@ +4557,5 @@
>          return CheckCoercedCall(f, argNode, RetType::Float, def, &_);
>  
>      MDefinition *argDef;
>      Type argType;
> +    if (!CheckCoercionArg(f, argNode, &argDef, &argType, AsmJS_FRound))

The first thing CheckCoercionArg does is handle the PNK_CALL case so I think you can remove the PNK_CALL test right above this.  With that gone, you can remove the *retType assignment above and just end the function with:
  return CheckCoercionArg(f, argNode, AsmJS_FRound, def, type);

@@ +4567,5 @@
>  }
>  
>  static bool
>  CheckMathBuiltinCall(FunctionCompiler &f, ParseNode *callNode, AsmJSMathBuiltinFunction func,
>                       MDefinition **def, MathRetType *mathRetType)

In this or the preceding patch, could you rename these 'retType' and 'mathRetType' outparams to just 'type'?  (They are serving the role of the usual 'type' outparam, they just represent a subset.

@@ +4645,5 @@
>  }
>  
>  static bool
> +CheckBinarySimd(FunctionCompiler &f, ParseNode *call, MDefinition **def, Type *type,
> +                AsmJSSimdType simdType, MSimdBinaryArith::Operation op)

Can you put the inparams before the outparams?

@@ +4657,5 @@
> +
> +    MDefinition *lhsDef, *rhsDef;
> +    Type lhsType, rhsType;
> +
> +    if (!CheckExpr(f, lhs, &lhsDef, &lhsType))

Can you remove \n before CheckExpr?

@@ +4675,5 @@
> +    if (simdType == AsmJSSimdType_float32x4) {
> +        if (!lhsType.isFloat32x4() || !rhsType.isFloat32x4())
> +            return f.fail(lhs, "arguments to SIMD.float32x4.add should both be float32x4");
> +        *type = Type::Float32x4;
> +        *def = f.binarySimd(lhsDef, rhsDef, op, MIRType_Float32x4);

If you added a MOZ_IMPLICIT Type constructor accepting AsmJSSimdType, then you could replace these two branches with a single:
  Type retType = simdType;
  if (retType != lhsType || retType != rhsType)
      return f.fail(...);

  *def = f.binarySimd(lhsDef, rhsDef, op, retType.toMIRType());
  *type = retType;
  return true;

That leaves the !Mul/Div assert in the int32x4 case which could be a JS_ASSERT_IF.

@@ +4703,5 @@
> +
> +static bool
> +CheckSimdCtorCall(FunctionCompiler &f, ParseNode *expr, MDefinition **def, Type *type,
> +                  AsmJSSimdType simdType)
> +{

Can you rename 'expr' to 'call' and JS_ASSERT(call->isKind(PNK_CALL));?  Also, could you put simdType before 'def' (inparams before outparams).

@@ +4715,5 @@
> +    if (numArgs != length)
> +        return f.failName(expr, "invalid number of arguments in call to '%s'", CallCallee(expr)->name());
> +
> +    MDefinition *defs[4];
> +    Type types[4];

If you are generally using SimdTypeToLength to abstract over length, it seems like here you should use a Vector<MDefinition*, 4> that is .resize()d to be SimdTypeToLength(simdType).  Also, I don't think you need a vector of 'types'; one loop-local Type variable should be sufficient.

@@ +4717,5 @@
> +
> +    MDefinition *defs[4];
> +    Type types[4];
> +    size_t i = 0;
> +    argument = CallArgList(expr);

Perhaps argNode for regularity with CheckCallArgs?

@@ +4724,5 @@
> +        JS_ASSERT(i < 4);
> +        if (!CheckExpr(f, argument, &defs[i], &types[i]))
> +            return false;
> +
> +        if (simdType == AsmJSSimdType_int32x4 && !types[i].isSigned())

I think we could accept intish here, since the semantics of constructing an int32x4 would perform ToInt32() on each component (right?)

@@ +4727,5 @@
> +
> +        if (simdType == AsmJSSimdType_int32x4 && !types[i].isSigned())
> +            return f.failf(argument, "argument %d of Int32x4 ctor isn't a subtype of signed", i);
> +
> +        if (simdType == AsmJSSimdType_float32x4) {

Can you switch on simdType?

@@ +4730,5 @@
> +
> +        if (simdType == AsmJSSimdType_float32x4) {
> +            if (types[i].isFloat())
> +                continue;
> +            if (!CheckFloatCoercionArg(f, argument, types[i], defs[i], &defs[i]))

There shouldn't be any deoptimization from calling CheckFloatCoercionArg if type.isFloat(), so I'd take out the isFloat special case.

@@ +4738,5 @@
> +    JS_ASSERT(i == 4);
> +
> +    if (simdType == AsmJSSimdType_int32x4) {
> +        *def = f.constructSimd<MSimdValueX4>(defs[0], defs[1], defs[2], defs[3], MIRType_Int32x4);
> +        *type = Type::Int32x4;

With the above Type constructor accepting AsmJSSimdType; you could replace these two if's with one:
  *def = f.constructSimd<MSimdValueX4>(defs[0], defs[1], defs[2], defs[3], Type(simdType).toMIRType());
  *type = simdType;
  return true;

@@ +4789,5 @@
>  
>      switch (retType.which()) {
> +      case RetType::Int32x4:
> +      case RetType::Float32x4:
> +        return f.fail(callNode, "math builtin can't return vector values");

How about "cannot %s is not a subtype of %s", actualType.type().chars(), retType.type().chars()?

@@ +4881,5 @@
>              return f.failName(callee, "'%s' is not callable function", callee->name());
> +          // No SIMD functions return scalar results that can be coerced with +, |0 or fround.
> +          case ModuleCompiler::Global::SimdCtor:
> +          case ModuleCompiler::Global::SimdOperation:
> +            return f.failName(callee, "'%s' can't be converted into a scalar type.", callee->name());

IIUC, this case would correspond to:
  int32x4(int32x4(f()))
  int32x4(int32x4_add(a, b))
  int32x4(int32x4(1,2,3,4))
?  This seems symmetric to
  fround(fround(f())
  fround(fround(a))
  fround(fround(1))   
which are legal.  I think we have to accept these, assuming we use the same type rules for all stdlib, the same way we handle CheckCoercedMathBuiltinCall.  Fortunately, the SIMD situation is way way simpler: the coercions either are noops or they have type errors.

@@ +6686,5 @@
>  static const RegisterSet NonVolatileRegs =
>      RegisterSet(GeneralRegisterSet(Registers::NonVolatileMask),
>                  FloatRegisterSet(FloatRegisters::NonVolatileMask));
>  #endif
> +static const FloatRegisterSet NonVolatileSimdRegs = (SupportsSimd) ? NonVolatileRegs.fpus()

Are the () necessary around SupportsSimd?

@@ +6765,5 @@
>  
>      // Copy parameters out of argv and into the registers/stack-slots specified by
>      // the system ABI.
>      for (ABIArgTypeIter iter(func.sig().args()); !iter.done(); iter++) {
> +        unsigned argOffset = iter.index() * Simd128DataSize;

Can you replace this with AsmJSModule::SimdSizedArg (or AsmJSModule::EntryArg, as proposed)?

@@ +6798,3 @@
>              } else {
> +                JS_ASSERT(IsSimdType(type));
> +                if (type == MIRType_Int32x4) {

Instead of this nested if structure, could you use a switch like you did in the FPU case above?

::: js/src/builtin/SIMD.h
@@ +171,5 @@
> +template<typename V>
> +bool ToSimdConstant(JSContext *cx, HandleValue v, jit::SimdConstant *out);
> +
> +template<typename V>
> +V TypedObjectMemory(HandleValue v);

I think you no longer need to extern TypedObjectMemory or IsVectorObject.

::: js/src/jit/shared/CodeGenerator-x86-shared.cpp
@@ +327,5 @@
> +              case MIRType_Float32:
> +                masm.storeDouble(ToFloatRegister(ins->arg()), dst);
> +                return true;
> +              // Aligned accesses: stack pointer is aligned before passing
> +              // the stack argument.

Perhaps, more specifically, you can say "StackPointer is SimdStackAlignment-aligned and ABIArgGenerator guarantees stack offsets are SimdStackAlignment-aligned."

::: js/src/jit/x64/Assembler-x64.cpp
@@ +52,5 @@
> +      case MIRType_Int32x4:
> +      case MIRType_Float32x4:
> +        // On Win64, >64 bit args need to be passed by reference, but asm.js
> +        // doesn't allow passing SIMD values to FFIs. The only way to reach
> +        // here is asm to asm calls, so we can break the ABI here.

I'd duplicate the comment (or the version I suggested in Assembler-x86.cpp) in the IsSimdType(type) branch above.

::: js/src/jit/x86/Assembler-x86.cpp
@@ +31,5 @@
>          stackOffset_ += sizeof(uint64_t);
>          break;
> +      case MIRType_Int32x4:
> +      case MIRType_Float32x4:
> +        stackOffset_ = AlignBytes(stackOffset_, SimdStackAlignment);

Is this the actual x86 ABI?  http://agner.org/optimize/calling_conventions.pdf seemed to indicate that the first 3 regs were in registers.  Assuming it isn't the exact ABI, I'd add a comment:
  // SIMD values aren't passed in or out of C++, so we can make up whatever
  // internal ABI we like. visitAsmJSPassArg assumes SimdStackAlignment.

::: js/src/js.msg
@@ +290,5 @@
>  MSG_DEF(JSMSG_STRICT_CODE_WITH,        0, JSEXN_SYNTAXERR, "strict mode code may not contain 'with' statements")
>  MSG_DEF(JSMSG_STRICT_FUNCTION_STATEMENT, 0, JSEXN_SYNTAXERR, "in strict mode code, functions may be declared only at top level or immediately within another function")
>  MSG_DEF(JSMSG_SYNTAX_ERROR,            0, JSEXN_SYNTAXERR, "syntax error")
>  MSG_DEF(JSMSG_TEMPLSTR_UNTERM_EXPR,    0, JSEXN_SYNTAXERR, "missing } in template string")
> +MSG_DEF(JSMSG_SIMD_NOT_A_VECTOR,       0, JSEXN_TYPEERR, "value isn't a SIMD typed object")

I think it should say "SIMD value object" instead of "SIMD typed object".

::: js/src/vm/CommonPropertyNames.h
@@ +165,5 @@
>      macro(scripts, scripts, "scripts") \
>      macro(sensitivity, sensitivity, "sensitivity") \
>      macro(set, set, "set") \
>      macro(shape, shape, "shape") \
> +    macro(signMask, signMask, "signMask") \

I don't see where this is used.
(Assignee)

Comment 40

3 years ago
Created attachment 8479753 [details] [diff] [review]
WIP v7 rebased

still needs to apply comments; rebased to make collaborative work easier
Attachment #8479137 - Attachment is obsolete: true
Attachment #8479137 - Flags: review?(luke)
(Assignee)

Comment 41

3 years ago
Created attachment 8479758 [details] [diff] [review]
Globals (wip)
Attachment #8479202 - Attachment is obsolete: true
Attachment #8479202 - Flags: review?(luke)
Attachment #8479758 - Flags: review?(luke)
(In reply to Benjamin Bouvier [:bbouvier] from comment #41)
> Created attachment 8479758 [details] [diff] [review]
> Globals (wip)

This assertion fails on ARM builds:
JS_STATIC_ASSERT(2 * jit::SimdStackAlignment >= 2 * sizeof(void*) + 2 * sizeof(double));
Comment on attachment 8479753 [details] [diff] [review]
WIP v7 rebased

Review of attachment 8479753 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/arm/Assembler-arm.h
@@ +1552,5 @@
>      static bool SupportsFloatingPoint() {
>          return HasVFP();
>      }
> +    static bool SupportsSimd() {
> +        return ::SupportsSimd;

The '::' prefix causes a build failure. Remove them looks good.
(Assignee)

Comment 44

3 years ago
Created attachment 8479942 [details] [diff] [review]
WIP v8 (ahem)

Thanks for all the comments!

Nice catch on the int32x4(int32x4_add(x,y)) thing. I've added a few tests for that.

int32x4 shouldn't take intish as it doesn't perform ToInt32 on its input. Opened a follow up bug for SIMD.js, bug 1059321.
Attachment #8479753 - Attachment is obsolete: true
Attachment #8479942 - Flags: review?(luke)
(Assignee)

Comment 45

3 years ago
Created attachment 8479943 [details] [diff] [review]
Diff v7 / v8

fwiw, the diff between wip v7 and wip v8.
(Assignee)

Comment 46

3 years ago
Created attachment 8479985 [details] [diff] [review]
Globals
Attachment #8479758 - Attachment is obsolete: true
Attachment #8479758 - Flags: review?(luke)
Attachment #8479985 - Flags: review?(luke)
(Assignee)

Updated

3 years ago
Blocks: 1059408

Comment 47

3 years ago
Comment on attachment 8479942 [details] [diff] [review]
WIP v8 (ahem)

Review of attachment 8479942 [details] [diff] [review]:
-----------------------------------------------------------------

Great work!

::: js/src/asmjs/AsmJSValidate.cpp
@@ +2031,5 @@
> +        return AsmJSNumLit::Create(AsmJSNumLit::Int32x4, cst);
> +      case AsmJSSimdType_float32x4:
> +        JS_ASSERT(length == 4);
> +        cst = ExtractFloat32x4Value(m, argList);
> +        return AsmJSNumLit::Create(AsmJSNumLit::Float32x4, cst);

Now that Extract*x4Value are so simple, and only called once, I'd inline them into the switch cases.

@@ +2034,5 @@
> +        cst = ExtractFloat32x4Value(m, argList);
> +        return AsmJSNumLit::Create(AsmJSNumLit::Float32x4, cst);
> +    }
> +
> +    MOZ_MAKE_COMPILER_ASSUME_IS_UNREACHABLE("Unexpected SIMD type.");

I'd MOZ_CRASH instead for almost everything in this patch except the tiny inline helper member functions.

@@ +2047,5 @@
> +        // Float literals are explicitly coerced and thus the coerced literal may be
> +        // any valid (non-float) numeric literal.
> +        if (CallArgListLength(pn) == 1) {
> +            pn = CallArgList(pn);
> +            double d = ExtractNumericNonFloatValue(pn, &pn);

'pn' is dead, so you can leave off the second arg here.

@@ +4494,5 @@
> +{
> +    ParseNode *argNode = nullptr;
> +    AsmJSCoercion coercion;
> +    if (!IsCoercionCall(f.m(), callNode, &coercion, &argNode) || coercion != AsmJS_FRound)
> +        return f.fail(callNode, "invalid call to fround");

I just realized, if we're here, we already know IsCoercionCall(callNode) and it's fround (that's why we were called).  In that case, we can replace this call with the standard:

    if (CallArgListLength(call) != 1)
        return f.fail(call, "Math.fround must be passed 1 argument");

    ParseNode *arg = CallArgList(call);

that we see in all the other CheckMath*.

@@ +4496,5 @@
> +    AsmJSCoercion coercion;
> +    if (!IsCoercionCall(f.m(), callNode, &coercion, &argNode) || coercion != AsmJS_FRound)
> +        return f.fail(callNode, "invalid call to fround");
> +
> +    // Make sure to do this before calling CheckCoercedCall

I think this comment is dead now.

@@ +4636,5 @@
> +static bool
> +CheckSimdCtorCall(FunctionCompiler &f, ParseNode *call, AsmJSSimdType simdType, MDefinition **def,
> +                  Type *type)
> +{
> +    JS_ASSERT(call->isKind(PNK_CALL));

I'd put \n after this.

@@ +4652,5 @@
> +    if (!defs.resize(length))
> +        return false;
> +
> +    size_t i = 0;
> +    Type argType;

Can you move this to right before CheckExpr?

@@ +4662,5 @@
> +            return false;
> +
> +        switch (simdType) {
> +          case AsmJSSimdType_int32x4:
> +            if (!argType.isInt())

I think this should accept intish.  (Could you add a test for it?)

@@ +4696,5 @@
> +
> +        if (global->isSimdCtor())
> +            return CheckSimdCtorCall(f, expr, global->simdCtorType(), def, type);
> +        if (global->isSimdOperation())
> +            return CheckSimdOperationCall(f, expr, global, def, type);

For symmetry, I'd pass both of them 'global' instead of the former global->simdCtorType().

@@ +4795,5 @@
> +    switch (retType.which()) {
> +      case RetType::Signed:
> +      case RetType::Double:
> +      case RetType::Float:
> +        return f.failf(call, "SIMD call returns %s, used as scalar", type->toChars());

Perhaps add JS_ASSERT(type->isSimd()); ?
Attachment #8479942 - Flags: review?(luke) → review+
(Assignee)

Comment 48

3 years ago
Created attachment 8480403 [details] [diff] [review]
odin-simd.patch

\o/
Attachment #8466217 - Attachment is obsolete: true
Attachment #8479942 - Attachment is obsolete: true
Attachment #8479943 - Attachment is obsolete: true
Attachment #8480403 - Flags: review+
(Assignee)

Comment 49

3 years ago
Created attachment 8480409 [details] [diff] [review]
odin-simd.patch

- Modified check for dead code in CheckCoercedSimdCall.
- move the test parts of IsSimdAvailable here
Attachment #8480403 - Attachment is obsolete: true
Attachment #8480409 - Flags: review+
(Assignee)

Updated

3 years ago
Blocks: 1059710

Comment 50

3 years ago
Comment on attachment 8479985 [details] [diff] [review]
Globals

Review of attachment 8479985 [details] [diff] [review]:
-----------------------------------------------------------------

Very nice!

::: js/src/asmjs/AsmJSLink.cpp
@@ +97,5 @@
>      return true;
>  }
>  
>  static bool
> +IsValidGlobalImport(JSContext *cx, HandleValue v)

I think the heart of this check is making sure we have a value that can be safely coerced without causing user-visible side effects.  So perhaps we could name this HasPureCoercion?

@@ +99,5 @@
>  
>  static bool
> +IsValidGlobalImport(JSContext *cx, HandleValue v)
> +{
> +    // Ideally, we'd reject all non-primitives, but Emscripten has a bug that

"all non-SIMD non-primitives".

I'd structure this function:

  if (is int32x4 or float32x4)
     return true;

  // this long comment
  if (big test)
     return true;

  return false;

::: js/src/asmjs/AsmJSModule.h
@@ +746,5 @@
>          size_t                            functionBytes_; // just the function bodies, no stubs
>          size_t                            codeBytes_;     // function bodies and stubs
>          size_t                            totalBytes_;    // function bodies, stubs, and global data
>          uint32_t                          minHeapLength_;
>          uint32_t                          numGlobalVars_;

Can you rename this to numGlobalScalarVars_?

@@ +1210,5 @@
>      // are laid out in this order:
>      //   0. a pointer to the current AsmJSActivation
>      //   1. a pointer to the heap that was linked to the module
> +    //   2. the double float constant NaN
> +    //   3. the float32 constant NaN, padded to sizeof(double)

Probably now we can say "padded up to sizeof(Simd128DataSize)" and change the sizeof(double) to sizeof(float) below (since the AlignBytes takes care of it either way).

@@ +1227,5 @@
>      }
>      size_t globalDataBytes() const {
> +        JS_STATIC_ASSERT(jit::AsmJSGlobalVariableOffset
> +                         >= 2 * sizeof(void*) + 2 * sizeof(double));
> +        return jit::AsmJSGlobalVariableOffset +

I think you can get away without using a global variable (in the case of AsmJSActivationGlobalDataOffset, we actually need the constant in the MacroAssembler) and instead having:
  size_t globalSimdVarsOffset() const {
      return AlignBytes(/* 0 */ sizeof(void*) +
                        /* 1 */ sizeof(void*) +
                        /* 2 */ sizeof(double) +
                        /* 3 */ sizeof(float),
                        Simd128DataSize);
  }
  size_t globalDataBytes() const {
      return offsetOfGlobalSimdVars() +
             /* 4 */ pod.numGlobalSimdVars_ * jit::Simd128DataSize +
             /* 5 */ pod.numGlobalVars * sizeof(uint64_t) +
             /* 6 */ pod.funcPtrTableAndExitBytes_;
  }

and replacing globalVariableOffset with globalSimdVarsOffset below.

@@ +1279,3 @@
>                 i * sizeof(uint64_t);
>      }
> +    unsigned globalSimdVarIndexToGlobalDataOffset(unsigned i) const {

Can you put this before globalScalarVarIndexToGlobalDataOffset (b/c it goes before it in memory)?

@@ +1283,5 @@
> +        JS_ASSERT(i < pod.numGlobalSimdVars_);
> +        return globalVariableOffset() +
> +               i * jit::Simd128DataSize;
> +    }
> +    void *globalVarToGlobalDatum(const Global &g) const {

Can you put this function after global(Scalar|Simd)VarIndexToGlobalDatum (b/c it depends on them)?

@@ +1288,5 @@
> +        unsigned index = g.varIndex();
> +        if (g.varInitKind() == Global::VarInitKind::InitConstant) {
> +            return g.varInitNumLit().isSimd()
> +                    ? globalSimdVarIndexToGlobalDatum(index)
> +                    : globalScalarVarIndexToGlobalDatum(index);

SM style is to align the ? and : with the first char of the condition, so the 'g'.

@@ +1294,5 @@
> +
> +        JS_ASSERT(g.varInitKind() == Global::VarInitKind::InitImport);
> +        return IsSimdCoercion(g.varInitCoercion())
> +                ? globalSimdVarIndexToGlobalDatum(index)
> +                : globalScalarVarIndexToGlobalDatum(index);

ditto

::: js/src/asmjs/AsmJSValidate.cpp
@@ +2512,5 @@
>      {
>          if (inDeadCode())
>              return nullptr;
>  
>          uint32_t index = global.varOrConstIndex();

Could you inline this into the expression as you did with storeGlobalVar?

::: js/src/jit/x64/CodeGenerator-x64.cpp
@@ +348,5 @@
>  CodeGeneratorX64::visitAsmJSLoadGlobalVar(LAsmJSLoadGlobalVar *ins)
>  {
>      MAsmJSLoadGlobalVar *mir = ins->mir();
>  
>      CodeOffsetLabel label;

Can you put this 'label' on the line before the below switch?  (Also in visitAsmJSStoreGlobalVar)

@@ +362,1 @@
>          label = masm.loadRipRelativeDouble(ToFloatRegister(ins->output()));

This is pre-existing, but it seems like the Float32 case should use masm.loadRipRelativeFloat32 (which doesn't exist yet).  I think this works because, in global data, the high 32-bits are zero, so loading the 64-bit value is equivalent.  It'd probably be good to fix this, but that can be a separate bug; perhaps you could just add a comment here with TODO?

@@ +395,1 @@
>          label = masm.storeRipRelativeDouble(ToFloatRegister(ins->value()));

ditto
Attachment #8479985 - Flags: review?(luke) → review+

Comment 51

3 years ago
Created attachment 8481175 [details] [diff] [review]
alignment

This patch should hopefully address the Win32 alignment issues we were talking about yesterday:
  https://tbpl.mozilla.org/?tree=Try&rev=a5c8f804305f
Attachment #8481175 - Flags: review?(benj)
(Assignee)

Comment 52

3 years ago
Comment on attachment 8481175 [details] [diff] [review]
alignment

Review of attachment 8481175 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks a lot for doing it. It is very green on Try. With this, the Odin patch could land today \o/

Out of curiosity, I will try to rebase it so that it lands before the odin-simd patch, which will be nice for bisecting. Ideally the order of patches should be: isSimdAvailable(), then this alignment patch, then the two odin patches.

::: js/src/asmjs/AsmJSValidate.cpp
@@ +6654,5 @@
>                    NonVolatileRegs.fpus().size() * Simd128DataSize
>                  : NonVolatileRegs.gprs().size() * sizeof(intptr_t) +
>                    NonVolatileRegs.fpus().getPushSizeInBytes();
>  #endif
> +static const unsigned FramePushedForEntrySP = FramePushedAfterSave + sizeof(void*);

This sizeof(void*) is for argv, right? Is it the amount of bytes we think of when using AsmJSFrameBytesAfterReturnAddress? (and as a matter of fact, could this variable be used here?)

@@ +6694,5 @@
>  #endif
>  
> +    // Put the 'argv' argument into a non-argument/return register so that we
> +    // can use it while we fill in the arguments for the asm.js callee. Use a
> +    // second non-argument/return register as temporary scratch. Also, save it

nit:"it" is referring to the argv register i guess, but the sentence about the second nonargretregister makes it fuzzy.

@@ +6722,5 @@
>      // Bump the stack for the call.
>      PropertyName *funcName = m.module().exportedFunction(exportIndex).name();
>      const ModuleCompiler::Func &func = *m.lookupFunction(funcName);
> +    unsigned stackBytes = StackArgBytes(func.sig().args());
> +    masm.reserveStack(stackBytes + ComputeByteAlignment(stackBytes, AsmJSStackAlignment));

you can also use the shorter masm.reserveStack(AlignBytes(stackBytes, AsmJSStackAlignment));

@@ +6993,5 @@
>      //   | return address | descriptor | callee | argc | this | arg1 | arg2 | ...
>      unsigned offsetToIonArgs = MaybeRetAddr;
>      unsigned ionArgBytes = 3 * sizeof(size_t) + (1 + exit.sig().args().length()) * sizeof(Value);
>      unsigned totalIonBytes = offsetToIonArgs + ionArgBytes + savedRegBytes;
> +    unsigned ionFrameSize = StackDecrementForCall(masm, AsmJSStackAlignment, totalIonBytes);

It could be ABIStackAlignment for the Ion path, right? I agree it's simpler to also prepare Ion for SimdStackAlignment for the future, that will help SIMD work in Ion.

@@ +7191,5 @@
>          i++;
>          JS_ASSERT(i.done());
>  
>          // Call coercion function
> +        AssertStackAlignment(masm, ABIStackAlignment);

Do we have somewhere a static assert that AsmJSStackAlignment % ABIStackAlignment == 0?

::: js/src/jit/CodeGenerator.cpp
@@ +8773,4 @@
>  
>  #ifdef DEBUG
> +    static_assert(AsmJSStackAlignment >= ABIStackAlignment,
> +                  "The asm.js stack alignment should subsume the ABI-required alignment");

even better, how about static-asserting AsmJSStackAlignment % ABIStackAlignment == 0? This will ensure one of the assertions in AsmJSValidate.cpp makes sense.

::: js/src/jit/shared/CodeGenerator-shared.cpp
@@ -97,5 @@
>      }
>  }
>  
> -void
> -CodeGeneratorShared::setupSimdAlignment(unsigned fixup)

Even if I am the person who wrote this code, I'm happy to see it go away!
Attachment #8481175 - Flags: review?(benj) → review+
(Assignee)

Comment 53

3 years ago
Posted a try build with the suggested reordering (this stuff is so sensitive, i prefer to be sure not to have introduced a rebasing mistake).

remote:   https://tbpl.mozilla.org/?tree=Try&rev=67ef4ba962a9
remote: Alternatively, view them on Treeherder (experimental):
remote:   https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=67ef4ba962a9
(Assignee)

Comment 54

3 years ago
Created attachment 8481287 [details] [diff] [review]
1. IsSimdAvailable

Moved some stuff from the big main patch into this one, so that it can compile independently without depending on the big odin patch. Carrying forward r+ from luke.
Attachment #8470826 - Attachment is obsolete: true
Attachment #8481287 - Flags: review+
(Assignee)

Comment 55

3 years ago
Created attachment 8481288 [details] [diff] [review]
2. Alignment patch

Luke's alignment patch. Haven't addressed the comments I've written. Luke, see r+ and comments above and feel free to update this version of the patch instead.
Attachment #8481175 - Attachment is obsolete: true
(Assignee)

Comment 56

3 years ago
Created attachment 8481289 [details] [diff] [review]
3. Main patch

Carrying r+ from luke. To apply after the alignment patch.
Attachment #8466089 - Attachment is obsolete: true
Attachment #8480409 - Attachment is obsolete: true
Attachment #8481289 - Flags: review+
(Assignee)

Comment 57

3 years ago
Created attachment 8481292 [details] [diff] [review]
4. Globals

Addressed comments (fixed the float32 x64 issue btw). Carrying forward r+.

And an ultimate try build (with the globals patch in addition to the previous one in the previous try build). If this is all green, this can be landed even if i'm not around, as all these patches (except for luke's one) are the latest version :)

https://tbpl.mozilla.org/?tree=Try&rev=29874f211630
Attachment #8479985 - Attachment is obsolete: true
Attachment #8481292 - Flags: review+
(Assignee)

Comment 58

3 years ago
Comment on attachment 8481288 [details] [diff] [review]
2. Alignment patch

(forgot to carry forward r+, sorry for the noise)
Attachment #8481288 - Flags: review+

Comment 59

3 years ago
(In reply to Benjamin Bouvier [:bbouvier] from comment #52)
Thanks, great suggestions!

> This sizeof(void*) is for argv, right? Is it the amount of bytes we think of
> when using AsmJSFrameBytesAfterReturnAddress? (and as a matter of fact,
> could this variable be used here?)

It happens to be the same size, but it is logically a different constant.  AsmJSFrameBytesAfterReturnAddress sizeof(AsmJSFrame) - sizeof(return address).  The sizeof(void*) is the size of the return address that we manually pushed above (or x86 call pushed).  I'll clarify this, though.

> It could be ABIStackAlignment for the Ion path, right? I agree it's simpler
> to also prepare Ion for SimdStackAlignment for the future, that will help
> SIMD work in Ion.

You're right and yes that was my intention.

Comment 60

3 years ago
Created attachment 8481430 [details] [diff] [review]
2. Alignment patch

With comments addressed; carrying forward r+.
Attachment #8481288 - Attachment is obsolete: true
Attachment #8481430 - Flags: review+

Updated

3 years ago
Attachment #8481430 - Attachment description: 8481288: 2. Alignment patch → 2. Alignment patch

Comment 61

3 years ago
Looks green!  From IRC, landing for Benjamin now that the tree is open:
https://hg.mozilla.org/integration/mozilla-inbound/rev/fc9f83afab31
https://hg.mozilla.org/integration/mozilla-inbound/rev/9afc72a12cb9
https://hg.mozilla.org/integration/mozilla-inbound/rev/55fb5688e85c
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/991b62ff5461 for OS X only, non-unified only bustage like https://tbpl.mozilla.org/php/getParsedLog.php?id=47055370&tree=Mozilla-Inbound

Comment 63

3 years ago
Oops, missing explicit template instantiation covered up by unified builds.  Do non-unified builds not get exercised on try by default?

https://hg.mozilla.org/integration/mozilla-inbound/rev/4773d0ec1ee8
https://hg.mozilla.org/integration/mozilla-inbound/rev/a6ebf2ffec53
https://hg.mozilla.org/integration/mozilla-inbound/rev/b45a94bb2c63
https://hg.mozilla.org/mozilla-central/rev/b45a94bb2c63
https://hg.mozilla.org/mozilla-central/rev/a6ebf2ffec53
https://hg.mozilla.org/mozilla-central/rev/4773d0ec1ee8
Status: ASSIGNED → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
Depends on: 1061383

Updated

2 years ago
Keywords: dev-doc-needed
The SIMD API is documented here now: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/SIMD

Let me know if there is anything specifically for Odin/asm.js to write about. Closing ddn for now.
Keywords: dev-doc-needed → dev-doc-complete
(Assignee)

Comment 66

2 years ago
(In reply to Florian Scholz [:fscholz] (MDN) from comment #65)
> The SIMD API is documented here now:
> https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/
> Global_Objects/SIMD
> 
> Let me know if there is anything specifically for Odin/asm.js to write
> about. Closing ddn for now.

Thanks for all the amazing documentation! The asm.js specifics are documented on Specifiction [0] and should end up in [1], would they be accepted. As a matter of fact, no specific documentation on MDN is needed, thanks!

[0] http://discourse.specifiction.org/t/request-for-comments-simd-js-in-asm-js/676
[1] http://asmjs.org/spec/latest/
You need to log in before you can comment on or make changes to this bug.