Closed
Bug 1077318
Opened 10 years ago
Closed 10 years ago
ARM: Generate atomics code for ARMv6 for 1 and 2 byte operations
Categories
(Core :: JavaScript Engine: JIT, defect)
Tracking
()
RESOLVED
FIXED
mozilla40
Tracking | Status | |
---|---|---|
firefox40 | --- | fixed |
People
(Reporter: lth, Assigned: lth)
References
(Blocks 1 open bug)
Details
Attachments
(2 files, 2 obsolete files)
17.66 KB,
patch
|
Details | Diff | Splinter Review | |
30.78 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
Followup work to bug 979594.
On ARMv6 there are no byte and halfword load-exclusive / store-exclusive operations, instead we must implement those operations with RMW and word-wide operations.
This affects compareExchange and add/sub/and/or/xor, not load, store, or fence.
See stubs in jit/arm/MacroAssembler-arm.cpp.
Comment 1•10 years ago
|
||
Could I ask if there are any potential alignment issues in the ARM atomics code?
Some devices appear to enable the alignment checks, and we have had to disable an irregexp optimization that was generating unaligned half word instructions, so could this be an issue in the atomics code or are all operations guaranteed to be aligned?
This is not just an ARMv6 issue, all the load-exclusive / store-exclusive operations must be aligned even on ARMv7, but if we need to use a word-wide operation on the ARMv7 as a substitute for a halfword or byte sized operation then there might be further alignment issues.
If we add back the alignment checks to the simulator we can catch some issues there.
Assignee | ||
Comment 2•10 years ago
|
||
(In reply to Douglas Crosher [:dougc] from comment #1)
> Could I ask if there are any potential alignment issues in the ARM atomics
> code?
There should not be any issues. All operations are restricted (by construction) to being properly aligned. There is no SharedDataView at this time and if there's going to be one I think atomics will probably use some spinlock solution. Personally I don't want one.
> This is not just an ARMv6 issue, all the load-exclusive / store-exclusive
> operations must be aligned even on ARMv7, but if we need to use a word-wide
> operation on the ARMv7 as a substitute for a halfword or byte sized
> operation then there might be further alignment issues.
My plan is to mask off the low bits of the address before the operation, and the only requirement will be that the start of the array is aligned at at least four bytes and that the allocation is a multiple of four bytes in length; neither is an issue in practice. Frankly performance on ARMv6 is probably not that important so we can afford to do what it takes to stay aligned.
> If we add back the alignment checks to the simulator we can catch some
> issues there.
Interesting point. I have to create a bug for some ARMv6 atomics support anyway (CP15 support for barriers) and at a minimum I can note there that I should add alignment checks to the atomics instructions in general.
Assignee | ||
Comment 3•10 years ago
|
||
Code for atomicFetchOp, support for the BFI instruction in the assembler, a better test case, and some temporary test scaffolding.
The in-line code for an atomicFetchOp is fairly large. I debated whether to just call into the VM, but this code will likely be at the core of all synchronization ops for asm.js so I decided to in-line expand after all.
I've noted one optimization opportunity in the code (when the output is not needed), others are when the 'value' operand is quite small and when the index operand is known. If they're worth pursuing it would be when similar optimizations are implemented for ARMv7.
Attachment #8504061 -
Flags: feedback?(dtc-moz)
Assignee | ||
Comment 4•10 years ago
|
||
Looks like we can avoid the mov that sets up the result because sxtb and uxtb can both perform the rotate as part of the instruction. Nice.
Assignee | ||
Comment 5•10 years ago
|
||
(In reply to Lars T Hansen [:lth] from comment #4)
> Looks like we can avoid the mov that sets up the result because sxtb and
> uxtb can both perform the rotate as part of the instruction. Nice.
And of course load-immediate-and-multiply is an incompetent way to multiply a register by 8. Sigh.
Assignee | ||
Comment 6•10 years ago
|
||
This is basically complete (and sits on top of my patch queue). I'll ask for reviews when that queue has landed.
To test this on any platform or on the simulator, make HasLDSTREXBHD() in Architecture-arm.cpp return false.
compareExchange() is a little clunky because I elected not to provide it with any temp registers; as a result it recomputes a couple of values.
Attachment #8504061 -
Attachment is obsolete: true
Attachment #8504061 -
Flags: feedback?(dtc-moz)
Assignee | ||
Updated•10 years ago
|
Assignee: nobody → lhansen
Assignee | ||
Comment 7•10 years ago
|
||
FWIW, according to bug 1085599 ARMv6 is now a Tier II platform (and somebody will likely run into test failures on ARMv6 soon).
That status seems to be supported by https://developer.mozilla.org/en/docs/Supported_build_configurations, though that page appears to me to be a tiny bit stale (I thought Win64 had been promoted out of Tier III).
Comment 8•10 years ago
|
||
decoder told me he's hitting the NYI crashes a lot when fuzzing the ARM simulator. Can we easily disable the code for now on ARM?
Flags: needinfo?(lhansen)
Assignee | ||
Comment 9•10 years ago
|
||
It can be disabled in moz.build by removing ENABLE_SHARED_ARRAY_BUFFER.
A better solution is probably not to inline the Atomics primitives when on ARMv6, this will be a little hacky but not hard. I'll do so today.
Flags: needinfo?(lhansen)
Assignee | ||
Comment 10•10 years ago
|
||
On a related note, the prospective in-line code for ARMv6 is painfully large, so on that platform we'll likely want to make calls to C++ for both regular JS and asm.js.
Assignee | ||
Comment 11•10 years ago
|
||
I'm offering this patch now because I'd like to solve bug 1146902, where testers are running into NYI crashes because I've not implemented ARMv6 support for the atomics. For Ion it's easy to disable inlining and fall back on the runtime functions, and the patch coming shortly on bug 1146902 will do so, but for asm.js there needs to be an implementation, and that's what's in this patch.
Attachment #8582593 -
Flags: review?(luke)
Assignee | ||
Comment 12•10 years ago
|
||
There's a small bug in this patch: the return type in the construction of FunctionCompiler::Call (two instances) should be Signed, not Double.
Comment 13•10 years ago
|
||
Comment on attachment 8582593 [details] [diff] [review]
ARMv6 atomics for asm.js via callouts to C++
Review of attachment 8582593 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/asmjs/AsmJSValidate.cpp
@@ +5027,5 @@
> + return false;
> +
> + f.finishCallArgs(&call);
> +
> + if (!f.builtinCall(callee, call, MIRType_Int32, def))
Could all this be done in the ARM backend codegen for the Atomic MIR nodes instead of in Odin? That's what we've generally done for other ARMv6 workarounds.
Assignee | ||
Comment 14•10 years ago
|
||
(In reply to Luke Wagner [:luke] from comment #13)
> Could all this be done in the ARM backend codegen for the Atomic MIR nodes
> instead of in Odin? That's what we've generally done for other ARMv6
> workarounds.
You're thinking of eg the UDiv code? Probably. I'll look into it.
Assignee | ||
Comment 15•10 years ago
|
||
Substantially the same patch, but the bifurcation logic has been moved out of AsmJSValidate.cpp and into Lowering-arm.cpp where it gives rise to two new platform-specific LIR nodes to handle the callout cases.
Attachment #8582593 -
Attachment is obsolete: true
Attachment #8582593 -
Flags: review?(luke)
Attachment #8584484 -
Flags: review?(luke)
Comment 16•10 years ago
|
||
Comment on attachment 8584484 [details] [diff] [review]
ARMv6 atomics for asm.js via callouts to C++, v2
Review of attachment 8584484 [details] [diff] [review]:
-----------------------------------------------------------------
Great, thanks!
::: js/src/builtin/AtomicsObject.cpp
@@ +709,5 @@
> +// value (for binops) and oldval/newval (for cmpxchg) are the values
> +// to be operated upon.
> +
> +static void
> +getCurrentAsmJSHeap(void **heap, size_t *length)
SM style is to capitalize 'g'.
::: js/src/jit/arm/CodeGenerator-arm.cpp
@@ +1969,5 @@
> + Scalar::Type viewType = mir->accessType();
> + Register ptr = ToRegister(ins->ptr());
> + Register oldval = ToRegister(ins->oldval());
> + Register newval = ToRegister(ins->newval());
> + int32_t itemSize = (int32_t)byteSize(viewType) * (isSignedIntType(viewType) ? -1 : 1);
Instead of this implicit int coding scheme, could you pass the viewType directly (so the callee could simply cast it back to a Scalar::Type)?
@@ +1978,5 @@
> + masm.passABIArg(ptr);
> + masm.passABIArg(oldval);
> + masm.passABIArg(newval);
> +
> + masm.callWithABI(AsmJSImm_AtomicCmpXchg);
Can you assert the output reg is ReturnReg?
::: js/src/jit/arm/Lowering-arm.cpp
@@ +659,5 @@
> + LAsmJSCompareExchangeCallout *lir =
> + new(alloc()) LAsmJSCompareExchangeCallout(useRegister(ptr),
> + useRegister(ins->oldValue()),
> + useRegister(ins->newValue()));
> + defineFixed(lir, ins, LAllocation(AnyRegister(r0)));
How about s/r0/ReturnReg/ here and below to make the intention explicit?
Attachment #8584484 -
Flags: review?(luke) → review+
Assignee | ||
Comment 17•10 years ago
|
||
Thanks. The itemSize hack was pointlessly gross.
With all comments addressed:
https://hg.mozilla.org/integration/mozilla-inbound/rev/9193aac3a945
Comment 18•10 years ago
|
||
Fixed the order of #include in AsmJSModule.cpp
https://hg.mozilla.org/integration/mozilla-inbound/rev/fe024e9b9095
Assignee | ||
Comment 19•10 years ago
|
||
(In reply to Tooru Fujisawa [:arai] from comment #18)
> Fixed the order of #include in AsmJSModule.cpp
>
> https://hg.mozilla.org/integration/mozilla-inbound/rev/fe024e9b9095
arai: thanks!
Comment 20•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/9193aac3a945
https://hg.mozilla.org/mozilla-central/rev/fe024e9b9095
Status: NEW → RESOLVED
Closed: 10 years ago
status-firefox40:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla40
You need to log in
before you can comment on or make changes to this bug.
Description
•