Closed Bug 1077318 Opened 10 years ago Closed 10 years ago

ARM: Generate atomics code for ARMv6 for 1 and 2 byte operations

Categories

(Core :: JavaScript Engine: JIT, defect)

Hardware: ARM
OS: All
Type: defect
Priority: Not set
Severity: normal

Tracking

Status: RESOLVED FIXED
Target Milestone: mozilla40
Tracking Status: firefox40 --- fixed

People

(Reporter: lth, Assigned: lth)

References

(Blocks 1 open bug)

Details

Attachments

(2 files, 2 obsolete files)

Follow-up work to bug 979594. On ARMv6 there are no byte and halfword load-exclusive / store-exclusive operations; instead we must implement those operations as read-modify-write (RMW) sequences built from word-wide operations. This affects compareExchange and add/sub/and/or/xor, but not load, store, or fence. See stubs in jit/arm/MacroAssembler-arm.cpp.
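For concreteness, the technique looks like this at the C++ level (a sketch only, assuming little-endian byte order; the real code is emitted by the MacroAssembler, and compare_exchange_weak stands in for the ldrex/strex retry loop):

    #include <atomic>
    #include <cstdint>

    // Emulate an 8-bit atomic fetch-add using only word-wide atomics.
    uint8_t byteFetchAdd(uint8_t* addr, uint8_t value) {
        // Mask off the low bits to find the containing aligned word, and
        // compute the byte's bit offset within that word (little-endian).
        uintptr_t a = reinterpret_cast<uintptr_t>(addr);
        auto* word = reinterpret_cast<std::atomic<uint32_t>*>(a & ~uintptr_t(3));
        unsigned shift = unsigned(a & 3) * 8;
        uint32_t mask = uint32_t(0xFF) << shift;

        uint32_t old = word->load(std::memory_order_relaxed);
        for (;;) {
            uint8_t oldByte = uint8_t(old >> shift);
            uint32_t repl =
                (old & ~mask) | (uint32_t(uint8_t(oldByte + value)) << shift);
            // On ARM this compiles to an ldrex/strex loop; on failure `old`
            // is refreshed with the current word and we retry.
            if (word->compare_exchange_weak(old, repl))
                return oldByte;
        }
    }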
Could I ask if there are any potential alignment issues in the ARM atomics code? Some devices appear to enable alignment checks, and we have had to disable an irregexp optimization that was generating unaligned halfword instructions, so could this be an issue in the atomics code, or are all operations guaranteed to be aligned? This is not just an ARMv6 issue: all the load-exclusive / store-exclusive operations must be aligned even on ARMv7, but if we need to use a word-wide operation on ARMv7 as a substitute for a halfword- or byte-sized operation then there might be further alignment issues. If we add back the alignment checks to the simulator, we can catch some issues there.
(In reply to Douglas Crosher [:dougc] from comment #1)

> Could I ask if there are any potential alignment issues in the ARM atomics
> code?

There should not be any issues. All operations are restricted (by construction) to being properly aligned. There is no SharedDataView at this time, and if there's going to be one I think atomics will probably use some spinlock solution. Personally I don't want one.

> This is not just an ARMv6 issue, all the load-exclusive / store-exclusive
> operations must be aligned even on ARMv7, but if we need to use a word-wide
> operation on the ARMv7 as a substitute for a halfword or byte sized
> operation then there might be further alignment issues.

My plan is to mask off the low bits of the address before the operation; the only requirements will be that the start of the array is aligned on at least a four-byte boundary and that the allocation is a multiple of four bytes in length. Neither is an issue in practice. Frankly, performance on ARMv6 is probably not that important, so we can afford to do what it takes to stay aligned.

> If we add back the alignment checks to the simulator we can catch some
> issues there.

Interesting point. I have to create a bug for some ARMv6 atomics support anyway (CP15 support for barriers), and at a minimum I can note there that I should add alignment checks to the atomics instructions in general.
Code for atomicFetchOp, support for the BFI instruction in the assembler, a better test case, and some temporary test scaffolding. The in-line code for an atomicFetchOp is fairly large. I debated whether to just call into the VM, but this code will likely be at the core of all synchronization ops for asm.js, so I decided to in-line expand after all. I've noted one optimization opportunity in the code (when the output is not needed); others are when the 'value' operand is quite small and when the index operand is known. If they're worth pursuing, it would be when similar optimizations are implemented for ARMv7.
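For reference, BFI (Bit Field Insert) is what lets the new byte or halfword be merged into the loaded word without a separate mask-and-or sequence. Its semantics, written out as plain C++ (illustrative only, not assembler output):

    #include <cstdint>

    // BFI dst, src, #lsb, #width: insert the low `width` bits of src into
    // dst at bit position `lsb`, leaving dst's other bits unchanged.
    uint32_t bfi(uint32_t dst, uint32_t src, unsigned lsb, unsigned width) {
        uint32_t mask = (width == 32 ? ~0u : ((1u << width) - 1)) << lsb;
        return (dst & ~mask) | ((src << lsb) & mask);
    }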
Attachment #8504061 - Flags: feedback?(dtc-moz)
Looks like we can avoid the mov that sets up the result because sxtb and uxtb can both perform the rotate as part of the instruction. Nice.
(In reply to Lars T Hansen [:lth] from comment #4)

> Looks like we can avoid the mov that sets up the result because sxtb and
> uxtb can both perform the rotate as part of the instruction. Nice.

And of course load-immediate-and-multiply is an incompetent way to multiply a register by 8. Sigh.
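The sxtb/uxtb observation from comment 4, in C++ terms (a sketch for illustration): the extract-and-extend that would otherwise need a separate rotate plus an extend collapses into one instruction, because SXTB/UXTB accept a ROR #8/#16/#24 on their source operand.

    #include <cstdint>

    // What "SXTB rd, rm, ROR #ror" computes: rotate rm right by ror bits,
    // then sign-extend the low byte of the result.
    int32_t sxtbRor(uint32_t rm, unsigned ror) {   // ror in {0, 8, 16, 24}
        uint32_t rotated = (rm >> ror) | (rm << ((32 - ror) & 31));
        return int8_t(rotated & 0xFF);
    }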
This is basically complete (and sits on top of my patch queue). I'll ask for reviews when that queue has landed. To test this on any platform or on the simulator, make HasLDSTREXBHD() in Architecture-arm.cpp return false. compareExchange() is a little clunky because I elected not to provide it with any temp registers; as a result it recomputes a couple of values.
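That is, to force the ARMv6 code paths everywhere (a sketch of the temporary testing hack; the real function consults the detected CPU features):

    // In js/src/jit/arm/Architecture-arm.cpp, for testing only:
    bool HasLDSTREXBHD() {
        return false;  // pretend ldrexb/ldrexh/ldrexd are unavailable
    }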
Attachment #8504061 - Attachment is obsolete: true
Attachment #8504061 - Flags: feedback?(dtc-moz)
Assignee: nobody → lhansen
FWIW, according to bug 1085599 ARMv6 is now a Tier II platform (and somebody will likely run into test failures on ARMv6 soon). That status seems to be supported by https://developer.mozilla.org/en/docs/Supported_build_configurations, though that page appears to me to be a tiny bit stale (I thought Win64 had been promoted out of Tier III).
decoder told me he's hitting the NYI crashes a lot when fuzzing the ARM simulator. Can we easily disable the code for now on ARM?
Flags: needinfo?(lhansen)
It can be disabled in moz.build by removing ENABLE_SHARED_ARRAY_BUFFER. A better solution is probably not to inline the Atomics primitives on ARMv6; this will be a little hacky but not hard. I'll do so today.
Flags: needinfo?(lhansen)
See Also: → 1146902
On a related note, the prospective in-line code for ARMv6 is painfully large, so on that platform we'll likely want to make calls to C++ for both regular JS and asm.js.
I'm offering this patch now because I'd like to solve bug 1146902, where testers are running into NYI crashes because I've not implemented ARMv6 support for the atomics. For Ion it's easy to disable inlining and fall back on the runtime functions, and the patch coming shortly on bug 1146902 will do so, but for asm.js there needs to be an implementation, and that's what's in this patch.
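To illustrate the shape of such a callout (hypothetical names and signature, not the actual patch, which wires these up through the asm.js/Odin machinery and the current module's heap; the out-of-bounds behavior shown is also an assumption):

    #include <atomic>
    #include <cstddef>
    #include <cstdint>

    // A C++ callout the generated ARMv6 code could branch to instead of
    // inlining an ldrex/strex sequence: an 8-bit compareExchange against
    // the asm.js heap, returning the old value.
    static int32_t
    AtomicCmpXchg8(uint8_t* heap, size_t heapLength, uint32_t offset,
                   int32_t oldval, int32_t newval) {
        if (offset >= heapLength)   // out-of-bounds access: no-op, return 0
            return 0;
        auto* addr = reinterpret_cast<std::atomic<uint8_t>*>(heap + offset);
        uint8_t expected = uint8_t(oldval);
        addr->compare_exchange_strong(expected, uint8_t(newval));
        return expected;            // old value, per compareExchange semantics
    }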
Attachment #8582593 - Flags: review?(luke)
There's a small bug in this patch: the return type in the construction of FunctionCompiler::Call (two instances) should be Signed, not Double.
Comment on attachment 8582593 [details] [diff] [review]
ARMv6 atomics for asm.js via callouts to C++

Review of attachment 8582593 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/asmjs/AsmJSValidate.cpp
@@ +5027,5 @@
> +        return false;
> +
> +    f.finishCallArgs(&call);
> +
> +    if (!f.builtinCall(callee, call, MIRType_Int32, def))

Could all this be done in the ARM backend codegen for the Atomic MIR nodes instead of in Odin? That's what we've generally done for other ARMv6 workarounds.
(In reply to Luke Wagner [:luke] from comment #13)

> Could all this be done in the ARM backend codegen for the Atomic MIR nodes
> instead of in Odin? That's what we've generally done for other ARMv6
> workarounds.

You're thinking of e.g. the UDiv code? Probably. I'll look into it.
Substantially the same patch, but the bifurcation logic has been moved out of AsmJSValidate.cpp and into Lowering-arm.cpp, where it gives rise to two new platform-specific LIR nodes to handle the callout cases.
Attachment #8582593 - Attachment is obsolete: true
Attachment #8582593 - Flags: review?(luke)
Attachment #8584484 - Flags: review?(luke)
Comment on attachment 8584484 [details] [diff] [review]
ARMv6 atomics for asm.js via callouts to C++, v2

Review of attachment 8584484 [details] [diff] [review]:
-----------------------------------------------------------------

Great, thanks!

::: js/src/builtin/AtomicsObject.cpp
@@ +709,5 @@
> +// value (for binops) and oldval/newval (for cmpxchg) are the values
> +// to be operated upon.
> +
> +static void
> +getCurrentAsmJSHeap(void **heap, size_t *length)

SM style is to capitalize 'g'.

::: js/src/jit/arm/CodeGenerator-arm.cpp
@@ +1969,5 @@
> +    Scalar::Type viewType = mir->accessType();
> +    Register ptr = ToRegister(ins->ptr());
> +    Register oldval = ToRegister(ins->oldval());
> +    Register newval = ToRegister(ins->newval());
> +    int32_t itemSize = (int32_t)byteSize(viewType) * (isSignedIntType(viewType) ? -1 : 1);

Instead of this implicit int coding scheme, could you pass the viewType directly (so the callee could simply cast it back to a Scalar::Type)?

@@ +1978,5 @@
> +    masm.passABIArg(ptr);
> +    masm.passABIArg(oldval);
> +    masm.passABIArg(newval);
> +
> +    masm.callWithABI(AsmJSImm_AtomicCmpXchg);

Can you assert the output reg is ReturnReg?

::: js/src/jit/arm/Lowering-arm.cpp
@@ +659,5 @@
> +    LAsmJSCompareExchangeCallout *lir =
> +        new(alloc()) LAsmJSCompareExchangeCallout(useRegister(ptr),
> +                                                  useRegister(ins->oldValue()),
> +                                                  useRegister(ins->newValue()));
> +    defineFixed(lir, ins, LAllocation(AnyRegister(r0)));

How about s/r0/ReturnReg/ here and below to make the intention explicit?
Attachment #8584484 - Flags: review?(luke) → review+
Thanks. The itemSize hack was pointlessly gross. With all comments addressed: https://hg.mozilla.org/integration/mozilla-inbound/rev/9193aac3a945
(In reply to Tooru Fujisawa [:arai] from comment #18)

> Fixed the order of #include in AsmJSModule.cpp
>
> https://hg.mozilla.org/integration/mozilla-inbound/rev/fe024e9b9095

arai: thanks!
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla40