Closed Bug 1131613 Opened 5 years ago Closed 5 years ago

Support float32 and float64 atomics

Categories

(Core :: JavaScript Engine: JIT, defect)


Tracking


RESOLVED WONTFIX

People

(Reporter: lth, Assigned: lth)

References

(Blocks 1 open bug)

Details

Attachments

(12 files, 4 obsolete files)

25.21 KB, patch (Waldo: review+)
11.43 KB, patch (Waldo: review+)
33.87 KB, patch (dougc: review+)
3.35 KB, patch (h4writer: review+)
4.39 KB, patch (rankov: review+)
2.45 KB, patch (h4writer: review+)
38.16 KB, patch (dougc: review+)
24.66 KB, patch
20.71 KB, patch
43.25 KB, patch
37.72 KB, patch (h4writer: review+)
40.31 KB, patch (h4writer: review+)
It is trivial-ish (using aliasing) to implement float32 atomics by means of int32 atomics, and thus absurd not to provide float32 atomics directly.

Requires modest support in the VM, the JITs, and in the specs.
Obviously this is trivial only for compareExchange, load, and store - for add and subtract there's typically no direct atomic support in the hardware, as the FPU is involved.
So for add and subtract, this is actually tricky.
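
A rough sketch of the aliasing idea for load, store, and compareExchange, built on the int32 atomics over an aliased Int32Array view (illustration only; the helper names and scratch buffer are mine):

  // A Float32Array and an Int32Array view the same shared memory; the int
  // view carries the atomicity, the float view exists only conceptually.
  var sab = new SharedArrayBuffer(1024);
  var i32 = new Int32Array(sab);

  // Non-atomic scratch space for bit reinterpretation.
  var scratch = new DataView(new ArrayBuffer(4));
  function floatToBits(x) { scratch.setFloat32(0, x); return scratch.getInt32(0); }
  function bitsToFloat(b) { scratch.setInt32(0, b); return scratch.getFloat32(0); }

  function atomicLoadFloat32(i) { return bitsToFloat(Atomics.load(i32, i)); }
  function atomicStoreFloat32(i, v) { Atomics.store(i32, i, floatToBits(v)); return v; }
  function atomicCasFloat32(i, expected, replacement) {
      // Note: the equality the CAS performs is bitwise, so NaN payloads
      // and +0/-0 matter; see the discussion below.
      var old = Atomics.compareExchange(i32, i, floatToBits(expected), floatToBits(replacement));
      return bitsToFloat(old);
  }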

On some hardware, eg x86, a compare-and-swap loop is typically required, and the exit condition for that loop is floating point equality, with all the complications that entails.

On other hardware, eg ARM, a LDREX/STREX loop is typically required, and the exit condition for that loop is simply that the store was successful.

On either platform, the interpreter and JIT must agree on the exit condition, so we can't really have portable code in the interpreter that simply uses a CAS loop.  (The general version of that problem exists already but has been papered over so far by closing our eyes and hoping the C++ compiler will do as the JIT does, see https://bugzilla.mozilla.org/show_bug.cgi?id=1131624#c0.  Something better is needed.)

An alternative is to define Atomics.add as something that uses a CAS loop with a defined equality test; I suspect that may be the way to go but that it's premature.

FWIW, C++11 only defines fetch_add and fetch_sub for integral types; for floats you have to build your own from the provided compare_exchange primitive.
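
For concreteness, here is roughly what such a CAS-loop add looks like for float32, in the same style C++ code uses to build fetch_add from compare_exchange (a sketch only; it assumes i32 is an Int32Array over a SharedArrayBuffer, as in the aliasing sketch above, and the helper names are mine):

  var scratch2 = new DataView(new ArrayBuffer(4));
  function f32ToBits(x) { scratch2.setFloat32(0, x); return scratch2.getInt32(0); }
  function bitsToF32(b) { scratch2.setInt32(0, b); return scratch2.getFloat32(0); }

  function atomicAddFloat32(i32, i, v) {
      for (;;) {
          var oldBits = Atomics.load(i32, i);
          var newBits = f32ToBits(bitsToF32(oldBits) + v);
          // The loop exits when the CAS succeeds, i.e. on *bitwise* equality
          // of oldBits with the cell contents - exactly the "defined
          // equality test" question raised above.
          if (Atomics.compareExchange(i32, i, oldBits, newBits) === oldBits)
              return bitsToF32(oldBits);
      }
  }
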
See Also: → 1131624
(In reply to Lars T Hansen [:lth] from comment #2)

> An alternative is to define Atomics.add as something that uses a CAS loop
> with a defined equality test; I suspect that may be the way to go but that
> it's premature.

We could also use the spinlock for add and subtract on the floating types on all platforms.  However that would force us to use it also for load, store, and compareExchange.  It is probably better to have load, store, and compareExchange run at full speed where possible and pay for a loop only for add and subtract, even if that complicates the loop because the exit condition is tricky.
Just adding fuel to this fire: on platforms that support CAS with two results (cell value and update success), so that we can in principle test the update without comparing values, the natural comparison of the floating-point values for loop exit will presumably be bitwise, not by value, if we want the best performance.  But that's wrong for NaN values and for +0/-0.

In C++ the compare_exchange primitive has two results, allowing the program to use the flag.  Furthermore (and unsurprisingly) it defines equality of CAS as bitwise equality.

I think backing off on add and subtract is the right thing, for now.

Another concern is that C++ code translated to asm.js is going to have to be careful around floating-point CAS since only the old value is returned by Atomics.compareExchange, and a floating-point comparison of that old value with the expected value is not the same as what the C++ code had in mind if it tests the result flag of the CAS.
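
For concreteness, the two places where value comparison and bit comparison disagree, in plain JS terms (illustration only):

  // +0 and -0: value-equal, bit-distinct.  A bitwise CAS expecting +0.0
  // fails when the cell actually holds -0.0.
  var dv = new DataView(new ArrayBuffer(8));
  dv.setFloat64(0, -0);
  console.log(+0 === -0);                  // true  (value equality)
  console.log(dv.getUint32(0) === 0);      // false (DataView reads default to
                                           // big-endian, so word 0 holds the
                                           // sign bit, which is set)

  // NaN: never value-equal to itself, though two NaNs may be bit-identical,
  // and arithmetic may also produce bit-distinct NaNs.
  console.log(NaN === NaN);                // false (value equality)
  console.log(Object.is(NaN, NaN));        // true  (SameValue)
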
I think that having only atomic load, store and CAS for float32 would be fine. I imagine those are straightforward (as aliased operations to 32-bit int addresses)? Actually, for current pthreads support in Emscripten, that is what I emulate, see

https://github.com/juj/emscripten-fastcomp/commit/150c6c74115e6db353f2a0d2cf1fa5e96816c8e6

and

https://github.com/juj/emscripten/commit/69c311196495a77719848aea5afadf6b99af3c2e
(In reply to Jukka Jylänki from comment #5)
> I think that having only atomic load, store and CAS for float32 would be
> fine. I imagine those are straightforward (as aliased operations to 32-bit
> int addresses)?

CAS is not quite unproblematic because it's really a semantic issue how we want to define "equality".  I think, however, that given what these primitives are for and how they are expected to be used the correct definition of equality is "bitwise", and that makes the implementation unproblematic in practice.
Yeah, that's true. Bitwise equality (the NaN payload being undefined) sounds ok. However, even if we did nothing in terms of the spec, I think the current form that the Emscripten/LLVM code emulates will be ok, since, as you mention, it gets us more or less the same thing.
An additional (though slight) complication for CAS and LOAD is that any NaN value has to be canonicalized after loading.

Semantically however, this means that a NaN returned from CAS or LOAD is not bitwise equal to what's in memory, which might lead to problems for further use of CAS.

For straightforward code this will not be a problem, so it's not a big deal, but it'll be worth noting in the spec.
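
A float32 illustration of the hazard (hypothetical helper code; it assumes an engine that canonicalizes NaN on float loads, as SpiderMonkey does for NaN-boxing reasons):

  var sab = new SharedArrayBuffer(4);
  var i32 = new Int32Array(sab);
  var dv = new DataView(new ArrayBuffer(4));

  i32[0] = 0x7fc00001;              // a NaN with a nonzero payload bit

  // A canonicalizing float load returns *a* NaN, but writing that JS value
  // back out typically produces the canonical NaN bit pattern:
  dv.setInt32(0, Atomics.load(i32, 0));
  var seen = dv.getFloat32(0);      // NaN; the engine may canonicalize here
  dv.setFloat32(0, seen);
  var seenBits = dv.getInt32(0);    // then 0x7fc00000, not 0x7fc00001

  // So feeding the loaded value back into a bitwise CAS can fail (or make
  // a retry loop spin) even though "the same NaN" was just read:
  Atomics.compareExchange(i32, 0, seenBits, 0);
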
Summary: Support float32 atomics → Support float32 and float64 atomics
Duplicate of this bug: 1131624
Attached patch WIP: Test cases (obsolete) — Splinter Review
Depends on: 1142593
See Also: → 1155176
The twelve patches in this set together implement float32 and float64 atomics for the interpreter, Ion, and Odin - a relatively large feature.  They are meant to apply to m-i on top of the patches for bug 1154705 and bug 1154714, in that order.  The patches in the set are not quite independent; there are some forward references.

Patch 1: Float32 and Float64 atomics in the interpreter, plus some naming cleanup in AtomicsObject.cpp.
Attachment #8579375 - Attachment is obsolete: true
Attachment #8579376 - Attachment is obsolete: true
Attachment #8593816 - Flags: review?(jwalden+bmo)
Attachment #8593817 - Flags: review?(jwalden+bmo)
This is low-level code to support the Ion changes that will appear shortly.

Included here is also Simulator support for UXTH and UXTB; the need for those was discovered only with the improved test cases in bug 1154704.
Attachment #8593818 - Flags: review?(dtc-moz)
Attached patch Intel support for Plain JS (obsolete) — Splinter Review
This is x86 and x64 support code for the Ion changes that will appear shortly.
Attachment #8593819 - Flags: review?(hv1989)
Stubs to keep the None platform compiling.
Attachment #8593820 - Flags: review?(hv1989)
Stubs to keep the MIPS platform compiling (I hope).
Attachment #8593821 - Flags: review?(branislav.rankov)
These are the Ion changes to support inlining of float32/float64 atomics in plain JS.  (There's also a bugfix for a regalloc usage issue that I discovered with my better tests; two uses of useRegisterAtStart have been changed to useRegister.)

One design choice made here is to handle float32 compareExchange by generating MIR that reinterprets the float32 to an int32, then uses the int32 compareExchange, then reinterprets the resulting int32 bits back as float32.  This looks messy in MCallOptimize.cpp but reduces the amount of code, since every platform realization would have to do just that anyway.  Contrast this with the situation for float64, where different strategies are needed on different platforms.  Anyway, the new nodes for float32->int32 and int32->float32 reinterpretation also come in handy later.
Attachment #8593825 - Flags: review?(hv1989)
These tests are mostly useful when studying the generated code with eg -D in the shell.
Attachment #8593829 - Flags: review?(hv1989)
ARM support for the asm.js patch that will appear shortly.
Attachment #8593830 - Flags: review?(dtc-moz)
Intel support for the asm.js patch that will appear shortly.
Attachment #8593832 - Flags: review?(luke)
This patch also addresses the general return type issue explained in bug 1155176 comment 1, by falling back on TypedArrayLoadType where a value read from the array is returned, and by returning the rhsType, in the case of Atomics.store.
Attachment #8593835 - Flags: review?(luke)
In addition to adding test cases for float32 and float64 this change cleans up the asm.js atomics tests generally and adds tests for unsigned types and for halfword types.
Attachment #8593836 - Flags: review?(luke)
Comment on attachment 8593819 [details] [diff] [review]
Intel support for Plain JS

Review of attachment 8593819 [details] [diff] [review]:
-----------------------------------------------------------------

I think there is an issue for x86 MCompareExchangeFloat64TypedArrayElement where we won't have enough registers...

::: js/src/jit/x64/MacroAssembler-x64.h
@@ +1256,5 @@
>      }
>  
> +    void moveDoubleToInt64Bits(FloatRegister src, Register dest) {
> +        vmovq(src, dest);
> +    }

s/moveDoubleToInt64Bits/moveDoubleToInt64

@@ +1260,5 @@
> +    }
> +
> +    void moveInt64BitsToDouble(Register src, FloatRegister dest) {
> +        vmovq(src, dest);
> +    }

s/moveInt64BitsToDouble/moveInt64ToDouble

::: js/src/jit/x86-shared/Assembler-x86-shared.h
@@ +1711,5 @@
> +          default:
> +            MOZ_CRASH("unexpected operand kind");
> +        }
> +    }
> +#endif

Move this to "jit/x64/Assembler-x64.h"

::: js/src/jit/x86-shared/BaseAssembler-x86-shared.h
@@ +1393,5 @@
> +    {
> +        spew("cmpxchgq   %s, " MEM_obs, GPReg64Name(src), ADDR_obs(offset, base, index, scale));
> +        m_formatter.twoByteOp64(OP2_CMPXCHG_GvEw, offset, base, index, scale, src);
> +    }
> +#endif

Can you move this to: "jit/x64/Assembler-x64.h"

::: js/src/jit/x86-shared/MacroAssembler-x86-shared.h
@@ +1292,5 @@
>      }
>  
> +    void moveInt32BitsToFloat(Register src, FloatRegister dest) {
> +        vmovd(src, dest);
> +    }

-> moveInt32ToFloat32
And can you add a description of this instruction saying that it doesn't convert the input, but copies it bit by bit, and that convertInt32ToFloat32 should get used when a value conversion is wanted?

@@ +1296,5 @@
> +    }
> +
> +    void moveFloatToInt32Bits(FloatRegister src, Register dest) {
> +        vmovd(src, dest);
> +    }

moveFloat32ToInt32
And can you add a description of this instruction saying that it doesn't convert the input, but copies it bit by bit, and that convertXXX should get used when a value conversion is wanted?

::: js/src/jit/x86/Lowering-x86.cpp
@@ +200,5 @@
> +    const LDefinition newLo = tempFixed(ebx);
> +    const LUse elements = useRegister(ins->elements());
> +    const LAllocation index = useRegisterOrConstant(ins->index());
> +    const LAllocation oldval = useRegister(ins->oldval());
> +    const LAllocation newval = useRegister(ins->newval());

This will fail!

In x86 we normally have 7 registers available, but when profiler is enabled ebp is also taken. So we can only take 6 registers at a time in x86.

In this particular case we request 8 registers. I assume this isn't failing, since "index" is a constant in the testcases. Can you add a testcase where "index" isn't a constant? That should fail.

So some magic will need to get used to get this working properly. One potential solution is to use useRegisterAtStart for inputs.

::: js/src/jit/x86/MacroAssembler-x86.h
@@ +1007,5 @@
> +        vmovd(src, destLo);
> +        vmovapd(src, ScratchDoubleReg);
> +        vpsrlq(Imm32(32), ScratchDoubleReg, ScratchDoubleReg);
> +        vmovd(ScratchDoubleReg, destHi);
> +    }

moveDoubleToInt32x2
And can you add description about this instruction and say it doesn't convert the input, but copies it bit by bit.

@@ +1013,5 @@
> +    void moveInt32x2BitsToDouble(Register srcHi, Register srcLo, FloatRegister dest) {
> +        vmovd(srcLo, dest);
> +        vmovd(srcHi, ScratchDoubleReg);
> +        vpsllq(Imm32(32), ScratchDoubleReg, ScratchDoubleReg);
> +        vorpd(ScratchDoubleReg, dest, dest);

moveInt32x2ToDouble
And can you add description about this instruction and say it doesn't convert the input, but copies it bit by bit.
Attachment #8593819 - Flags: review?(hv1989)
Comment on attachment 8593820 [details] [diff] [review]
None support for Plain JS

Review of attachment 8593820 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/none/MacroAssembler-none.h
@@ +410,5 @@
>      Register extractBoolean(ValueOperand, Register) { MOZ_CRASH(); }
>      template <typename T> Register extractTag(T, Register) { MOZ_CRASH(); }
>  
> +    void moveFloatToInt32Bits(FloatRegister src, Register dest) { MOZ_CRASH(); }
> +    void moveInt32BitsToFloat(Register src, FloatRegister dest) { MOZ_CRASH(); }

Do we get breakage if this isn't defined?
If yes, please update the names to moveFloat32ToInt32 and moveInt32ToFloat32.
If no, don't add them ;).
Attachment #8593820 - Flags: review?(hv1989) → review+
Attachment #8593829 - Flags: review?(hv1989) → review+
(In reply to Hannes Verschore [:h4writer] from comment #24)
> Comment on attachment 8593819 [details] [diff] [review]
> Intel support for Plain JS
> 
> Review of attachment 8593819 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I think there is an issue for x86 MCompareExchangeFloat64TypedArrayElement
> where we won't have enough registers...

I worry about that too but so far I've not run into any problems, despite my best efforts.  The instruction needs at least six integer registers.  I could see address optimization pushing that number higher, though.  (For Odin it seemed worse, so that back-end always calls out to C++ at the moment.)

(More below.)

> ::: js/src/jit/x64/MacroAssembler-x64.h
> @@ +1256,5 @@
> >      }
> >  
> > +    void moveDoubleToInt64Bits(FloatRegister src, Register dest) {
> > +        vmovq(src, dest);
> > +    }
> 
> s/moveDoubleToInt64Bits/moveDoubleToInt64

I will make this change (here and in the later patches) if you want, but the point here is that this is a bit copy, not a value conversion.  In a later remark you ask for a comment clarifying that, so I guess the "Bits" suffix was not clear enough.

> ::: js/src/jit/x86/Lowering-x86.cpp
> @@ +200,5 @@
> > +    const LDefinition newLo = tempFixed(ebx);
> > +    const LUse elements = useRegister(ins->elements());
> > +    const LAllocation index = useRegisterOrConstant(ins->index());
> > +    const LAllocation oldval = useRegister(ins->oldval());
> > +    const LAllocation newval = useRegister(ins->newval());
> 
> This will fail!
> 
> In x86 we normally have 7 registers available, but when profiler is enabled
> ebp is also taken. So we can only take 6 registers at a time in x86.
> 
> In this particular case we request 8 registers. I assume this isn't failing,
> since "index" is a constant in the testcases. Can you add a testcase where
> "index" isn't a constant? That should fail.

I will write more tests and see what falls out of that, but I currently do test with non-constant indices.

Note that oldval, newval, and output are floating point registers, and "only" 6 integer registers are actually being requested.

> So some magic will need to get used to get this working properly. Potential
> solutions is to use useRegisterAtStart for inputs.

Possibly.  I've had bad luck with that in these primitives; the register allocator asserts a lot, so probably I need to be a little more conservative.
Comment on attachment 8593825 [details] [diff] [review]
Ion changes to implement float32/float64 atomics

Review of attachment 8593825 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. Though can I have another look, since I requested some (functionality) changes?

::: js/src/jit/CodeGenerator.cpp
@@ +364,5 @@
> +void
> +CodeGenerator::visitInt32BitsToFloat(LInt32BitsToFloat* lir)
> +{
> +    masm.moveInt32BitsToFloat(ToRegister(lir->input()), ToFloatRegister(lir->output()));
> +    if (lir->canonicalizeNaN())

lir->mir()->canonicalizeNaN(), I added a comment about this later in the patch

::: js/src/jit/IonBuilder.h
@@ +906,1 @@
>                                     AtomicCheckResult=DoCheckAtomicResult);

The last argument here doesn't have a name... (Should be 'checkResult')

::: js/src/jit/LIR-Common.h
@@ +3520,5 @@
> +
> +  public:
> +    LIR_HEADER(Int32BitsToFloat)
> +
> +    LInt32BitsToFloat(const LAllocation& input, bool canonicalizeNaN)

Don't give "bool canonicalizeNaN" as an argument. Just define a function:

MInt32ToFloat32* mir() const {
    return mirRaw_->toInt32ToFloat32();
}

And use lir->mir()->canonicalizeNaN() where you need it during codegen.

::: js/src/jit/LOpcodes.h
@@ +162,5 @@
>      _(Float32ToDouble)              \
>      _(DoubleToFloat32)              \
>      _(Int32ToFloat32)               \
> +    _(FloatToInt32Bits)             \
> +    _(Int32BitsToFloat)             \

Please rename to:
Float32ToInt32
Int32ToFloat32

::: js/src/jit/Lowering.cpp
@@ +1846,5 @@
> +
> +void
> +LIRGenerator::visitInt32BitsToFloat(MInt32BitsToFloat* mir)
> +{
> +    define(new(alloc()) LInt32BitsToFloat(useRegister(mir->input()), mir->canonicalizeNaN()), mir);

No need to pass mir->canonicalizeNaN(), see comment in LIR-Common.h

::: js/src/jit/MCallOptimize.cpp
@@ +2779,5 @@
> +        }
> +        if (newval->type() != MIRType_Int32) {
> +            newval = MTruncateToInt32::New(alloc(), newval);
> +            current->add(newval->toInstruction());
> +        }

Ok. I got some comments here:

1) We have IntPolicy: a no-op if the input is Int32, otherwise it tries to unbox, and bails if that's not possible.
2) We have ConvertToInt32Policy: a no-op if the input is Int32, otherwise it tries to convert to Int32; if the conversion would lose information, a bailout occurs.

So (2) is already more relaxed than (1) and could also be used. Now the comment says you want to go even further and do this via truncation. Can you create a TruncateIntPolicy and use that in "MCompareExchangeTypedArrayElement" as the type policy for the 3rd and 4th operands?

This shouldn't be a problem for Scalar::Float32, since the MTruncate will get added between MFloat32toInt32 and MCompareExchangeTypedArrayElement and will become a no-op.

And also for MStoreUnboxedScalar?

@@ +2793,5 @@
> +        MInt32BitsToFloat* i2f = MInt32BitsToFloat::New(alloc(), cas);
> +        current->add(i2f);
> +        current->push(i2f);
> +    } else {
> +        cas->setResultType(arrayType == Scalar::Uint32 ? MIRType_Double : MIRType_Int32);

cas->setResultType(getInlineReturnType());
Should still be correct, right?

::: js/src/jit/MOpcodes.h
@@ +115,5 @@
>      _(TruncateToInt32)                                                      \
>      _(ToString)                                                             \
>      _(ToObjectOrNull)                                                       \
> +    _(FloatToInt32Bits)                                                     \
> +    _(Int32BitsToFloat)                                                     \

Float32ToInt32
Int32ToFloat32

@@ +203,5 @@
>      _(StoreTypedArrayElementHole)                                           \
>      _(StoreTypedArrayElementStatic)                                         \
>      _(CompareExchangeTypedArrayElement)                                     \
> +    _(CompareExchangeFloat64TypedArrayElement)                              \
> +    _(AtomicLoadFloat)                                                      \

Please rename to AtomicLoadFloatingPoint
Attachment #8593825 - Flags: review?(hv1989)
If I'm reading correctly, the float ops are basically equivalent to int ops (equality is bitwise and there isn't atomic arith), so what is the reason for having float ops?  If they have int semantics, it seems more honest in the API to express them as ints.  For float64, you could make the argument that it provides a larger atomic datum, but NaN canonicalization will make it unsuitable for anything but real float64 usage.
Comment on attachment 8593819 [details] [diff] [review]
Intel support for Plain JS

Review of attachment 8593819 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/x86/Lowering-x86.cpp
@@ +200,5 @@
> +    const LDefinition newLo = tempFixed(ebx);
> +    const LUse elements = useRegister(ins->elements());
> +    const LAllocation index = useRegisterOrConstant(ins->index());
> +    const LAllocation oldval = useRegister(ins->oldval());
> +    const LAllocation newval = useRegister(ins->newval());

So, as Lars explained, two of these 8 registers are floats, so they don't count. As a result only 6 integer registers are requested. So that should be ok ;)
Comment on attachment 8593818 [details] [diff] [review]
ARM support for Plain JS

Review of attachment 8593818 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/arm/Lowering-arm.cpp
@@ +690,5 @@
> +// For float32:
> +// We'll use LDR/STR, as int loads/stores are single-copy atomic.
> +//
> +// For float64:
> +// We'll use LDREXD/STREXD for the operation, for now.  On chips where

I assume the simulator already supports LDREXD and STREXD?
Does it also check that the address is double word aligned?
Comment on attachment 8593830 [details] [diff] [review]
ARM support for asm.js

Review of attachment 8593830 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/arm/CodeGenerator-arm.cpp
@@ +2053,5 @@
> +    Label goahead;
> +    BufferOffset bo = masm.ma_BoundsCheck(ptr);
> +    maybeCmpOffset = bo.getOffset();
> +    masm.ma_b(&goahead, Assembler::Below);
> +    memoryBarrier(MembarFull);

Is this barrier only needed when a bounds check is needed?

@@ +2163,5 @@
> +
> +    masm.setupAlignedABICall(3);
> +    masm.passABIArg(ptr);
> +    masm.passABIArg(oldval, MoveOp::DOUBLE);
> +    masm.passABIArg(newval, MoveOp::DOUBLE);

Might there be any utility reordering these arguments to pass the ptr last, so that the doubles can be packed in r0,r1 r2,r3?

::: js/src/jit/arm/Simulator-arm.cpp
@@ +2274,5 @@
>            }
> +          case Args_Double_IntDoubleDouble: {
> +            double dval0, dval1;
> +            int32_t ival;
> +            getFpArgs(&dval0, &dval1, &ival);

Could you double check that the use of getFpArgs is correct?
(In reply to Douglas Crosher [:dougc] from comment #31)
> Comment on attachment 8593830 [details] [diff] [review]
> ARM support for asm.js
> 
> Review of attachment 8593830 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/jit/arm/CodeGenerator-arm.cpp
> @@ +2053,5 @@
> > +    Label goahead;
> > +    BufferOffset bo = masm.ma_BoundsCheck(ptr);
> > +    maybeCmpOffset = bo.getOffset();
> > +    masm.ma_b(&goahead, Assembler::Below);
> > +    memoryBarrier(MembarFull);
> 
> Is this barrier only needed when a bounds check is needed?

Yes, because when the bounds check passes the primitive contains the necessary barriers.

> @@ +2163,5 @@
> > +
> > +    masm.setupAlignedABICall(3);
> > +    masm.passABIArg(ptr);
> > +    masm.passABIArg(oldval, MoveOp::DOUBLE);
> > +    masm.passABIArg(newval, MoveOp::DOUBLE);
> 
> Might there be any utility reordering these arguments to pass the ptr last,
> so that the doubles can be packed in r0,r1 r2,r3?

I don't think that kind of micro-optimization will make much difference here, and the values will have to be rearranged on the C++ side in any case.  It's probably better if they're just on the stack.  I'm open to arguments, if you have them :)

> ::: js/src/jit/arm/Simulator-arm.cpp
> @@ +2274,5 @@
> >            }
> > +          case Args_Double_IntDoubleDouble: {
> > +            double dval0, dval1;
> > +            int32_t ival;
> > +            getFpArgs(&dval0, &dval1, &ival);
> 
> Could you double check that the use of getFpArgs is correct?

I did check before, but I will do so again.
(In reply to Douglas Crosher [:dougc] from comment #30)
> Comment on attachment 8593818 [details] [diff] [review]
> ARM support for Plain JS
> 
> Review of attachment 8593818 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/jit/arm/Lowering-arm.cpp
> @@ +690,5 @@
> > +// For float32:
> > +// We'll use LDR/STR, as int loads/stores are single-copy atomic.
> > +//
> > +// For float64:
> > +// We'll use LDREXD/STREXD for the operation, for now.  On chips where
> 
> I assume the simulator already supports LDREXD and STREXD?

It does (search for "exclusive", you want the "1" case in the subsequent switches).

> Does it also check that the address is double word aligned?

It does - it uses readDW, which contains the necessary check.
(In reply to Hannes Verschore [:h4writer] from comment #24)
> Comment on attachment 8593819 [details] [diff] [review]
> Intel support for Plain JS
> 
> Review of attachment 8593819 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/jit/x86-shared/BaseAssembler-x86-shared.h
> @@ +1393,5 @@
> > +    {
> > +        spew("cmpxchgq   %s, " MEM_obs, GPReg64Name(src), ADDR_obs(offset, base, index, scale));
> > +        m_formatter.twoByteOp64(OP2_CMPXCHG_GvEw, offset, base, index, scale, src);
> > +    }
> > +#endif
> 
> Can you move this to: "jit/x64/Assembler-x64.h"

Hannes: I can do that, but the pattern we have right now really is that all calls to m_formatter are in the BaseAssembler, if necessary with ifdefs around them like here.
(In reply to Hannes Verschore [:h4writer] from comment #27)
> Comment on attachment 8593825 [details] [diff] [review]
> Ion changes to implement float32/float64 atomics
> 
> Review of attachment 8593825 [details] [diff] [review]:
> -----------------------------------------------------------------

> ::: js/src/jit/MCallOptimize.cpp
> @@ +2779,5 @@
> > +        }
> > +        if (newval->type() != MIRType_Int32) {
> > +            newval = MTruncateToInt32::New(alloc(), newval);
> > +            current->add(newval->toInstruction());
> > +        }
> 
> Ok. I got some comments here:
> 
> 1) We have IntPolicy: which is no-op if input is Int32 else tries to unbox.
> But bails if not possible
> 2) We have ConvertToInt32Policy: which is no-op if input is Int32 and else
> tries to convert to Int32. If conversion would yield with loss of
> information a bailout occurs.
> 
> So (2) is already more relaxed than (1) so could also be used. Now the
> comment says you want to go even further and do this upon truncation.

Yes, that follows from the spec, which in turn follows ES6 as far as conversion behavior for typed array insertion is concerned.

> Can
> you create a TruncateIntPolicy and use that in
> "MCompareExchangeTypedArrayElement" as type for 3rd and 4th operand?

I will attempt that, it's a nice solution so long as order of evaluation issues do not get in the way.

> This shouldn't be a problem for Scalar::Float32, since the MTruncate will
> get added between MFloat32toInt32 and MCompareExchangeTypedArrayElement and
> will become a no-op.

That sounds plausible, and is easy to test.

> And also for MStoreUnboxedScalar?

No, MStoreUnboxedScalar already has its own type policy that actually contains the same code that I've used in compareExchange (using truncate).  I probably stole the code I have from an older incarnation of that code.
Hannes: Here's the code for compareExchange on x86 with a variable array index.  As you can see, six integer registers are live (edx:eax, ecx:ebx, ebp for the elements pointer, and esi for the index).  I had thought -D would enable the profiler and hence reserve ebp, but in any case, should ebp be taken, edi should be usable for the elements pointer in this situation.

I'll add this test case to the tests that you've already looked at.

[Pointer:NON_GC_THING]
movl       $0xf76c9000, %ebp

[CompareExchangeFloat64TypedArrayElement]
movd       %xmm0, %eax
movapd     %xmm0, %xmm7
psrlq      $32, %xmm7
movd       %xmm7, %edx
movd       %xmm1, %ebx
movapd     %xmm1, %xmm7
psrlq      $32, %xmm7
movd       %xmm7, %ecx
lock
cmpxchg8b  edx:eax, 0x0(%ebp,%esi,8)
movd       %eax, %xmm2
movd       %edx, %xmm7
psllq      $32, %xmm7
orpd       %xmm7, %xmm2
ucomisd    %xmm2, %xmm2
jnp        .Lfrom842
movsd      0xffffffff, %xmm2
(In reply to Hannes Verschore [:h4writer] from comment #27)
> Comment on attachment 8593825 [details] [diff] [review]
> Ion changes to implement float32/float64 atomics
> 
> Review of attachment 8593825 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> @@ +2793,5 @@
> > +        MInt32BitsToFloat* i2f = MInt32BitsToFloat::New(alloc(), cas);
> > +        current->add(i2f);
> > +        current->push(i2f);
> > +    } else {
> > +        cas->setResultType(arrayType == Scalar::Uint32 ? MIRType_Double : MIRType_Int32);
> 
> cas->setResultType(getInlineReturnType());
> Should still be correct, right?

Unfortunately it is not still correct.  The reason is that the check in atomicsMeetsPreconditions is looser now: if the view type is an integer type other than Uint32, there is no guard on the inline return type.  Notably the inline return type can be MIRType_Value.  Hence the result type should be set to the precise type in the code quoted above.

I can make the check in atomicsMeetsPreconditions stricter again, but I'm not sure why it should be (arguments welcome, of course).
Originally reviewed in comment 24.

Followups in comment 26, comment 29, comment 34.

I have renamed and commented as suggested but had to choose slightly different names, using a "Move" prefix in several places (LMoveInt32ToFloat32), because there was already an LInt32ToFloat32, namely the conversion.  Ditto for later patches.
Attachment #8593819 - Attachment is obsolete: true
Attachment #8595232 - Flags: review?(hv1989)
Originally reviewed in comment 27.

Followup in comment 35.
Attachment #8593825 - Attachment is obsolete: true
Attachment #8595235 - Flags: review?(hv1989)
(In reply to Luke Wagner [:luke] from comment #28)
> If I'm reading correctly, the float ops are basically equivalent to int ops
> (equality is bitwise and there isn't atomic arith), so what is the reason
> for having float ops?  If they have int semantics, it seems more honest in
> the API to express them as ints.  For float64, you could make the argument
> that it provides a larger atomic datum, but NaN canonicalization will make
> it unsuitable for anything but real float64 usage.

For float32, in asm.js, it does not matter a great deal whether specialized operations are provided, since the int operations can be used, although I don't know that there's a convenient floatbits-to-intbits primitive in asm.js that does not force the translation to go through memory.  So the operations are not quite equivalent and a native implementation will likely beat the emulation.

For float64 it does make a difference, since the only alternative is some sort of spinlocked loop.

NaN and +0/-0 make the semantics complicated; NaN isn't a concern IMO but the fact that CAS(+0.0, v) will fail if the value in the cell is -0.0 is annoying, to say the least.

(In any case C++ provides float32 and float64 atomics with the same semantics as here, and there's at least an argument to be made that we should support the translation well for asm.js.)

For handwritten-JS it seems more useful to me to actually support these operations on floating point arrays directly.

How useful are they?  Probably moderately.  I've implemented these because they aid emscripten and because they round out the feature, so to speak.  Clearly one can imagine the utility of eg a float64 counter that is updated atomically by multiple threads, using CAS, or other ways of communicating floating point data among threads using atomics.
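
For example, a float64 counter of that sort can be built from a 64-bit integer CAS by aliasing (a sketch only, assuming a 64-bit integer CAS is reachable via a BigInt64Array view over the shared buffer; helper names are mine):

  var sab = new SharedArrayBuffer(8);
  var i64 = new BigInt64Array(sab);
  var dv = new DataView(new ArrayBuffer(8));
  function f64ToBits(x) { dv.setFloat64(0, x); return dv.getBigInt64(0); }
  function bitsToF64(b) { dv.setBigInt64(0, b); return dv.getFloat64(0); }

  // Atomically add v to the float64 counter at index i; returns the new value.
  function addToCounter(i, v) {
      for (;;) {
          var oldBits = Atomics.load(i64, i);
          var newBits = f64ToBits(bitsToF64(oldBits) + v);
          if (Atomics.compareExchange(i64, i, oldBits, newBits) === oldBits)
              return bitsToF64(newBits);
      }
  }
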
Comment on attachment 8593821 [details] [diff] [review]
MIPS support for Plain JS

Review of attachment 8593821 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/mips/Lowering-mips.cpp
@@ +567,5 @@
>      MOZ_CRASH("NYI");
>  }
>  
>  void
> +LIRGeneratorMIPS::visitCompareExchangeFloat64TypedArrayElement(MCompareExchangeFloat64TypedArrayElement* ins)

nit: This line is too long

::: js/src/jit/mips/Lowering-mips.h
@@ +102,5 @@
>      void visitSimdSelect(MSimdSelect* ins);
>      void visitSimdSplatX4(MSimdSplatX4* ins);
>      void visitSimdValueX4(MSimdValueX4* ins);
>      void visitCompareExchangeTypedArrayElement(MCompareExchangeTypedArrayElement* ins);
> +    void visitCompareExchangeFloat64TypedArrayElement(MCompareExchangeFloat64TypedArrayElement* ins);

nit: line too long
Attachment #8593821 - Flags: review?(branislav.rankov) → review+
(In reply to Lars T Hansen [:lth] from comment #40)
> although I
> don't know that there's a convenient floatbits-to-intbits primitive in
> asm.js that does not force the translation to go through memory.

There isn't but it's separately valuable and I've been thinking for a while that we should add it as a Math builtin b/c this comes up from time to time.

> How useful are they?  Probably moderately.  I've implemented these because
> they aid emscripten and because they round out the feature, so to speak. 
> Clearly one can imagine the utility of eg a float64 counter that is updated
> atomically by multiple threads, using CAS, or other ways of communicating
> floating point data among threads using atomics.

I ask this b/c I've never seen or heard of any uses of atomic operations on floats.  I wonder if it wouldn't be better to wait until we actually have someone complain (given that, from what Jukka says, Emscripten is able to support this already).  The advantage is minimizing the work for everyone to implement v.1 which enables faster shipping and general surface area reduction.
In e.g. Unity there are extensive uses of 32-bit and 64-bit atomic float loads and stores. In the pthreads branch, the Emscripten compiler has been set to emit custom emulated instructions for these. See https://github.com/juj/emscripten-fastcomp/commit/3f91518e5ef21c67f69651022babddc7f697bc84#diff-7212822d603be752dbb5c5d4619a359eR849 and https://github.com/juj/emscripten/commit/cd721d1aebc3ebc17a2e4c9fa3e56b7e3649fb4c . The biggest drawback of these is most likely performance. For 32-bit floats the emulation via int atomics is probably fine, but native 64-bit atomic ops would be nice to have, so that float64 atomic ops could be sanely implemented.

I think it would be nice to have these ops in the spec for ease of use, although it is not critical.
(In reply to Jukka Jylänki from comment #43)
Ok, I stand corrected.  Just for my own benefit: what are the float atomic ops being used for?
Comment on attachment 8593816 [details] [diff] [review]
Float32 and float64 atomics in the runtime

Review of attachment 8593816 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/builtin/AtomicsObject.cpp
@@ +200,5 @@
> +{
> +    DoubleLongOverlay oldval;
> +    DoubleLongOverlay newval;
> +    oldval.d = oldCandidate;
> +    newval.d = newCandidate;

Rather than one-off, ad hoc overlays, I'd prefer you use mozilla::BitwiseCast<>:

  int64_t oldval = mozilla::BitwiseCast<int64_t>(oldCandidate);
  int64_t newval = mozilla::BitwiseCast<int64_t>(newCandidate);

  ... rest of the body, with s/(old|new)val.l/\1val/g...

  return mozilla::BitwiseCast<double>(oldval);

That isolates the bitwise casting to one already-existing location, rather than adding more of them.  |using mozilla::BitwiseCast| is fine if you want it.

@@ +201,5 @@
> +    DoubleLongOverlay oldval;
> +    DoubleLongOverlay newval;
> +    oldval.d = oldCandidate;
> +    newval.d = newCandidate;
> +    int64_t* addr = (int64_t*)viewData + offset;

General preference for constructor-style casts, or failing that static_cast<>, over C-style casts, everywhere in this patch.

@@ +212,5 @@
> +            *addr = newval.l;
> +        buffer->regionLock.release<8>(addr);
> +        oldval.l = x;
> +    }
> +    // Note, canonicalization must take place in the caller if appropriate.

Hmm.  This is going to be...odd...when a compare-exchange operation meant to perform an exchange if the existing value is NaN has no effect even though the existing value *is* NaN (just with different bits).  (I agree with not equating -0 and +0.)  And it's largely impossible to write code to ensure bitwise-identical NaNs appear everywhere.

We could make the comparison use SameValue semantics, if we added a well-predicted branch on |IsNaN(oldCandidate)|, guarding a more elaborate loop-based approach.  Even that overhead should be dwarfed by cost to synchronize.

Should we?  Not all users want instruction-level granularity.  asm.js probably does, but anyone working in plain handwritten JS could easily be surprised.  There *will* be some such people.  Granted, you'd have to access and modify a typed array's underlying buffer.  But I'm not sure I can discard the possibility quite so immediately.

Suppose you wanted compare-and-exchange to use SameValue equality.  Could you use a bitwise version and check for the existing, returned value being NaN and retrying the compare-exchange with that?  Not really, because the returned value might be canonicalized.  I think you'd *have* to do it by examining the underlying bytes.  But you'd have to repeatedly loop in case someone was mutating bytes underneath you.  Very tricky.

I almost wonder if providing both operations might be the answer.  I'm not really convinced providing just bitwise, or just SameValue equality, is the right answer here.  :-\

@@ +225,5 @@
> +    oldval.f = oldCandidate;
> +    newval.f = newCandidate;
> +    oldval.i = jit::AtomicOperations::compareExchangeSeqCst((int32_t*)viewData + offset, oldval.i, newval.i);
> +    // Note, canonicalization must take place in the caller if appropriate.
> +    return oldval.f;

Same BitwiseCast changes.  Oh, and you'll want #include "mozilla/Casting.h" for this.

@@ +246,5 @@
>      bool inRange;
>      if (!GetSharedTypedArrayIndex(cx, idxv, view, &offset, &inRange))
>          return false;
> +
> +    if (view->type() == Scalar::Float32 || view->type() == Scalar::Float64) {

We could share a bunch more code here if ToNumber and inRange were used by everyone, then the non-floating-point code did a separate ToInt32 atop that.  Given that we control the spec, I think we should push that change into the spec, then make it here.

@@ +260,5 @@
> +            return atomics_fence_impl(cx, r);
> +
> +        if (view->type() == Scalar::Float32) {
> +            r.setNumber(JS::CanonicalizeNaN(CompareExchangeSeqCst((float)oldCandidate,
> +                                                                  (float)newCandidate,

Either constructor- or C++-style casts.

@@ +264,5 @@
> +                                                                  (float)newCandidate,
> +                                                                  view->viewData(),
> +                                                                  offset)));
> +        } else {
> +            r.setNumber(JS::CanonicalizeNaN(CompareExchangeSeqCst(oldCandidate,

I think you want setDouble for these two halves, probably.  setNumber would return a varying-type value, when really the contents of these arrays are floating point and should always be represented that way, and never as int32_t.

@@ +303,5 @@
> +LoadSeqCst(void* viewData, uint32_t offset, SharedArrayRawBuffer* buffer)
> +{
> +    DoubleLongOverlay v;
> +    int64_t* addr = (int64_t*)viewData + offset;
> +    if (jit::AtomicOperations::isLockfree8()) {

int64_t* addr = static_cast<int64_t*>(viewData) + offset;
if (jit::AtomicOperations::isLockfree8())
  return BitwiseCast<double>(jit::AtomicOperations::loadSeqCst(addr));

buffer->regionLock.acquire<8>(addr);
int64_t v = *addr;
buffer->regionLock.release<8>(addr);
return BitwiseCast<double>(v);

@@ +367,5 @@
>        }
> +      case Scalar::Float32: {
> +          FloatIntOverlay v;
> +          v.i = jit::AtomicOperations::loadSeqCst((int32_t*)view->viewData() + offset);
> +          r.setNumber(JS::CanonicalizeNaN(v.f));

More BitwiseCast.

@@ +390,5 @@
> +    if (jit::AtomicOperations::isLockfree8()) {
> +        jit::AtomicOperations::storeSeqCst(addr, value.l);
> +    } else {
> +        buffer->regionLock.acquire<8>(addr);
> +        *addr = value.l;

And again BitwiseCast.

@@ +808,5 @@
>  
> +double
> +js::atomics_cmpxchgd_asm_callout(int32_t offset, double oldval, double newval)
> +{
> +    MOZ_ASSERT(!(offset & 7));

Mild preference for (offset % 8) == 0 as more readably implying 8-byte aligned, same elsewhere.

@@ +815,5 @@
> +    SharedArrayRawBuffer* rawBuffer;
> +    GetCurrentAsmJSHeap(&heap, &heapLength, &rawBuffer);
> +    if ((size_t)offset >= heapLength)
> +        return JS::GenericNaN();
> +    return CompareExchangeSeqCst(oldval, newval, heap, offset>>3, rawBuffer);

I assume (without checking) that all these functions take int32_t because they come from JITs, Values, etc.  But that's not really how they treat the argument.  Instead of passing signed integers everywhere, and having to deal with all the quirks of signed integers (such as right-shift being implementation-defined if the LHS is negative), could you either immediately assign them to uint32_t, or make the arguments themselves uint32_t?  Followup is fine.

(I ask this because I really would rather read |offset / 8| here and let the compiler do the optimization, than have readers take a mental speedbump for it.)
Attachment #8593816 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8595232 [details] [diff] [review]
Intel support for Plain JS, v2

Review of attachment 8595232 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/x64/LOpcodes-x64.h
@@ +18,5 @@
>      _(ModPowTwoI)                   \
>      _(PowHalfD)                     \
> +    _(CompareExchangeFloat64TypedArrayElement) \
> +    _(AtomicLoadFloatingPoint)      \
> +    _(AtomicStoreFloat)             \

s/AtomicStoreFloat/AtomicStoreFloatingPoint

::: js/src/jit/x86-shared/Assembler-x86-shared.h
@@ +1727,5 @@
> +          default:
> +            MOZ_CRASH("unexpected operand kind");
> +        }
> +    }
> +#endif

This should still get moved to jit/x64/Assembler-x64.h.
Or was there also a reason you didn't do that?
(I agreed about not moving cmpxchgq, but there is no 'JS_CODEGEN_X64' in jit/x86-shared/Assembler-x86-shared.h and we should try to keep it that way. We have these platform-specific Assemblers for that.)

::: js/src/jit/x86-shared/CodeGenerator-x86-shared.cpp
@@ +3208,5 @@
> +
> +    Label fail;
> +    if (lir->index()->isConstant()) {
> +        Address source(elements, ToInt32(lir->index()) * width);
> +        masm.loadFromTypedArray(vt, source, out, InvalidReg, &fail, /*canonicalize=*/ true);

Style nit: add some breathing space: /* canonicalize = */

@@ +3211,5 @@
> +        Address source(elements, ToInt32(lir->index()) * width);
> +        masm.loadFromTypedArray(vt, source, out, InvalidReg, &fail, /*canonicalize=*/ true);
> +    } else {
> +        BaseIndex source(elements, ToRegister(lir->index()), ScaleFromElemWidth(width));
> +        masm.loadFromTypedArray(vt, source, out, InvalidReg, &fail, /*canonicalize=*/ true);

Style nit: add some breathing space: /* canonicalize = */

::: js/src/jit/x86/LOpcodes-x86.h
@@ +19,5 @@
>      _(ModPowTwoI)               \
>      _(PowHalfD)                 \
> +    _(CompareExchangeFloat64TypedArrayElement) \
> +    _(AtomicLoadFloatingPoint)  \
> +    _(AtomicStoreFloat)         \

s/AtomicStoreFloat/AtomicStoreFloatingPoint
Attachment #8595232 - Flags: review?(hv1989) → review+
Comment on attachment 8595235 [details] [diff] [review]
Ion changes to implement float32/float64 atomics, v2

Review of attachment 8595235 [details] [diff] [review]:
-----------------------------------------------------------------

Maybe InterpretFloat32AsInt32 instead of MoveFloat32ToInt32 (for lir/mir name)?
Attachment #8595235 - Flags: review?(hv1989) → review+
(In reply to Hannes Verschore [:h4writer] from comment #46)
> Comment on attachment 8595232 [details] [diff] [review]
> Intel support for Plain JS, v2
> 
> Review of attachment 8595232 [details] [diff] [review]:
> -----------------------------------------------------------------

+1 on all of those, they are all oversights on my part.
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #45)
> Comment on attachment 8593816 [details] [diff] [review]
> Float32 and float64 atomics in the runtime
> 
> Review of attachment 8593816 [details] [diff] [review]:
> -----------------------------------------------------------------

Selected followups, assume +1 or "will try to make that work out" for the rest.

> @@ +212,5 @@
> > +            *addr = newval.l;
> > +        buffer->regionLock.release<8>(addr);
> > +        oldval.l = x;
> > +    }
> > +    // Note, canonicalization must take place in the caller if appropriate.
> 
> Hmm.  This is going to be...odd...when compare-exchange operations
> performing an exchange if the existing value is NaN, have no effect when the
> existing value *is* NaN.  (I agree with not equating -0 and +0.)  And it's
> largely impossible to write code to ensure bitwise-identical NaNs appear
> everywhere.

Ah, so we don't have a canonical (quiet) NaN representation (canonical on one platform, I mean)?  Tricky.  But if, as you say, "largely impossible", then understandable :)

Note that for non-asm.js, all callers of the CompareExchangeSeqCst function immediately canonicalize.  The reason I've not done that for asm.js is that asm.js generally does not seem to do canonicalization (except at the boundary with plain JS, presumably).

> We could make the comparison use SameValue semantics, if we added a
> well-predicted branch on |IsNaN(oldCandidate)|, guarding a more elaborate
> loop-based approach.  Even that overhead should be dwarfed by cost to
> synchronize.

Certainly the overhead is dwarfed if we inline the operation; if we choose to bail out the overhead might be noticeable, but that may not be a dealbreaker if NaN is a corner case (as I expect it is).

> Should we?  Not all users want instruction-level granularity.  asm.js
> probably does, but anyone working in plain handwritten JS, could easily be
> surprised.  There *will* be some such people.  And given Granted, you'd have
> to access and modify a typed array's underlying buffer.  But I'm not sure I
> can discard the possibility quite so immediately.

The assumption is that asm.js wants to translate a CAS on float64 into one of ours, and C++ uses storage equality, for better or worse.  So asm.js wants the current semantics.

I completely agree that it would be surprising to a JS programmer, provided it is the case that our arithmetic operations can produce several different NaN representations on any given CPU.  (Anyone creating a custom NaN representation I'm still not all that worried about.)

> Suppose you wanted compare-and-exchange to use SameValue equality.  Could
> you use a bitwise version and check for the existing, returned value being
> NaN and retrying the compare-exchange with that?  Not really, because the
> returned value might be canonicalized.  I think you'd *have* to do it by
> examining the underlying bytes.  But you'd have to repeatedly loop in case
> someone was mutating bytes underneath you.  Very tricky.

Yes, loop while the bits are NaN, exit when you succeed or the bits are no longer NaN.

On that note, no CPU that I know of supports CAS to floating point registers, everything goes through integer registers (see the subsequent patches for examples) and nothing is canonicalized on load.  We have access to the bits, directly, at this level, at no extra cost.
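
Concretely, a sketch of that bit-level retry loop for float32, in user-level JS terms (helper names are mine; i32 is assumed to be an Int32Array view over the shared memory):

  var dv = new DataView(new ArrayBuffer(4));
  function f32ToBits(x) { dv.setFloat32(0, x); return dv.getInt32(0); }
  function bitsToF32(b) { dv.setInt32(0, b); return dv.getFloat32(0); }
  function isNaNBits(b) {
      return (b & 0x7f800000) === 0x7f800000 && (b & 0x007fffff) !== 0;
  }

  // A CAS that treats *any* expected NaN as matching *any* NaN in the cell.
  function casFloat32SameValueNaN(i32, i, expected, replacement) {
      var newBits = f32ToBits(replacement);
      if (!Number.isNaN(expected))
          return bitsToF32(Atomics.compareExchange(i32, i, f32ToBits(expected), newBits));
      for (;;) {
          var bits = Atomics.load(i32, i);
          if (!isNaNBits(bits))
              return bitsToF32(bits);     // cell no longer holds a NaN: fail
          if (Atomics.compareExchange(i32, i, bits, newBits) === bits)
              return bitsToF32(bits);     // swapped; report the old value
      }
  }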

> I almost wonder if providing both operations might be the answer.  I'm not
> really convinced providing just bitwise, or just SameValue equality, is the
> right answer here.  :-\

There's another dimension.  As specified, compareExchange only has the one return value, namely the old value in the cell.  In C++ it has two return values, the old value and a result for the operation (did it succeed or not).  For integers that bit about operation success is pretty much redundant (and for asm.js multiple return values is not great), but for floating point numbers that may be NaN in plain JS it appears not to be.  So suppose we add a primitive that returns both those values:

  var [flag,cellval] = Atomics.compareExchange2(mem, index, oldval, newval)

Then it would be possible, without inspecting the bits, for the JS program to decide what to do:

  if (!flag && isNaN(cellval) && isNaN(oldval)) {
    // special case
  }

Arguably the programmer would have to be fairly aware to use this strategy in the first place, so perhaps that's not a great argument.  And the programmer still can't update that cell with CAS because the non-canonical NaN has no representation in JS.

I probably agree that for plain JS there ought to be a SameValue-based variant that handles NaN.  I'll make a note on the spec.
Comment on attachment 8593817 [details] [diff] [review]
Test cases for plain JS

Review of attachment 8593817 [details] [diff] [review]:
-----------------------------------------------------------------

Unrelated to this, do we have any tests yet that verify proper synchronization behavior, global sequence order, &c.?

::: js/src/jit-test/tests/atomics/basic-tests.js
@@ +45,1 @@
>  	// val = 0

Arguably you should update all these val= comments with a +c in them.  If you're going for cleaner-looking code, I could buy that rationale.

@@ +68,5 @@
> +	if (!isFloat) {
> +	    // val = 0
> +	    assertEq(Atomics.add(a, x, 3), 0+c); // Don't add c again
> +	    // val = 3
> +	    assertEq(Atomics.sub(a, x, 2), 3+c); // Ditto

With c===0 it hardly seems worthwhile putting +c in *only* these places.

@@ +88,1 @@
>  	// val = 0

Given that here you revert back to no +c, I think that even if cleanliness was the idea behind not adding +c everywhere above, you should probably give it up here.

@@ +121,2 @@
>  	// val = 0
> +	assertEq(gAtomics_compareExchange(a, x, 0, 37+c), 0);

Is this testing a different |this| or something?  Just not entirely sure I see the point of it, but fine.

...oh, looking later on, I guess so.  Guess I wouldn't have bothered, but eh.

I guess all the same comments as before keep applying, here.

@@ +227,5 @@
> +	Atomics.add(a, 0, 1);
> +    }
> +    catch (e) {
> +	thrown = true;
> +	assertEq(e instanceof TypeError, true);

This hits the ReportBadArrayType in atomics_binop_impl, right?  I would (separate patch) add float32/float64 cases to that switch with a custom error message to the effect that "binops not supported on floating-point element types" or somesuch.

@@ +411,5 @@
>  
>      assertEq(sum, k);
> +
> +    // Values that are Uint32 but not Int32 and which trigger
> +    // different paths within the JIT.

Might be worth adding a few non-constant tests as well -- I could see this being converted into int32_t somehow and avoiding the input-is-double JIT code at some time.

@@ +506,5 @@
> +    CLONE(testThrowBinop)(f32);
> +    CLONE(testThrowBinop)(f64);
> +
> +    // Clear this, or the range tests fail
> +    v32[0] = 0;

Maybe add an assert that Atomics.load(a, 0) is 0 at the start of testRangeCAS as documentation?
Attachment #8593817 - Flags: review?(jwalden+bmo) → review+
Dougc: review ping on the patch for plain JS (comment 14, comment 30, comment 33).  Note that the code for UXTH/UXTAH landed separately and need not be reviewed as part of this patch.
Flags: needinfo?(dtc-moz)
Jukka, can you followup to Luke's question in comment 44?  I think he's waiting on a response before proceeding with reviews.
Flags: needinfo?(jujjyl)
(In reply to Lars T Hansen [:lth] from comment #32)
> (In reply to Douglas Crosher [:dougc] from comment #31)
> > Comment on attachment 8593830 [details] [diff] [review]
> > ARM support for asm.js
> > 
> > Review of attachment 8593830 [details] [diff] [review]:
> > -----------------------------------------------------------------

> > ::: js/src/jit/arm/Simulator-arm.cpp
> > @@ +2274,5 @@
> > >            }
> > > +          case Args_Double_IntDoubleDouble: {
> > > +            double dval0, dval1;
> > > +            int32_t ival;
> > > +            getFpArgs(&dval0, &dval1, &ival);
> > 
> > Could you double check that the use of getFpArgs is correct?
> 
> I did check before, but I will do so again.

Ah, it's arguably correct for hardfp but not for softfp (and the argument rearrangement you suggested would have made it correct in both cases).  Will fix this.
Comment on attachment 8593818 [details] [diff] [review]
ARM support for Plain JS

Review of attachment 8593818 [details] [diff] [review]:
-----------------------------------------------------------------

Not scrutinized, but it looks plausible and contained.
Attachment #8593818 - Flags: review?(dtc-moz) → review+
Comment on attachment 8593830 [details] [diff] [review]
ARM support for asm.js

Review of attachment 8593830 [details] [diff] [review]:
-----------------------------------------------------------------

Not scrutinized, but it looks plausible with the argument passing issue addressed, and it's contained. Good to see this support added to the ARM backend, thank you.
Attachment #8593830 - Flags: review?(dtc-moz) → review+
Oh sorry, made a note in the back of my mind to analyze closer where the places are, but then forgot. Took a closer look now and sent email to you guys.
Flags: needinfo?(jujjyl)
(In reply to Lars T Hansen [:lth] from comment #20)
> Created attachment 8593830 [details] [diff] [review]
> ARM support for asm.js
> 
> ARM support for the asm.js patch that will appear shortly.

There are actually other bugs here.  I've started testing in earnest with ARMv6 (only ARMHWCAPS=vfp) and I'm running into issues with the way I use the callout protocol with Float64 returns; defineFixed does not like to be used with a floating-point register.
(In reply to Lars T Hansen [:lth] from comment #57)
> (In reply to Lars T Hansen [:lth] from comment #20)
> > Created attachment 8593830 [details] [diff] [review]
> > ARM support for asm.js
> > 
> > ARM support for the asm.js patch that will appear shortly.
> 
> There are actually other bugs here.  I've started testing in earnest with
> ARMv6 (only ARMHWCAPS=vfp) and I'm running into issues with the way I use
> the callout protocol with Float64 returns, defineFixed does not like to be
> used with a floating-point register.

Turned out this was only a problem with trying to capture the return value of a void instruction.  With that change, and with the proper fix for softfp, the ARM code passes all tests configured both as ARMv6 and ARMv7.
Flags: needinfo?(dtc-moz)
Comment on attachment 8593832 [details] [diff] [review]
Intel support for asm.js

Canceling for now to see if we ultimately end up needing this.
Attachment #8593832 - Flags: review?(luke)
Attachment #8593835 - Flags: review?(luke)
Attachment #8593836 - Flags: review?(luke)
Test case cleanup for asm.js was moved to bug 1175494.
We won't support these.  The generic code in these patches has been recycled in other contexts, and some of the rest will come back if/when we do int64 atomics.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WONTFIX
See Also: → 1377576