Closed Bug 1077318 Opened 10 years ago Closed 10 years ago

ARM: Generate atomics code for ARMv6 for 1 and 2 byte operations

Categories

(Core :: JavaScript Engine: JIT, defect)

Hardware: ARM
OS: All
Type: defect
Priority: Not set
Severity: normal

Tracking

Status: RESOLVED FIXED
Target Milestone: mozilla40
Tracking Status: firefox40 --- fixed

People

(Reporter: lth, Assigned: lth)

References

(Blocks 1 open bug)

Details

Attachments

(2 files, 2 obsolete files)

Follow-up work to bug 979594. On ARMv6 there are no byte and halfword load-exclusive / store-exclusive operations; instead we must implement those operations as read-modify-write (RMW) sequences built from word-wide operations. This affects compareExchange and add/sub/and/or/xor, but not load, store, or fence. See stubs in jit/arm/MacroAssembler-arm.cpp.
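For concreteness, the technique looks like this at the C++ level (a sketch only, assuming little-endian byte order; the real code is emitted by the MacroAssembler, and compare_exchange_weak stands in for the ldrex/strex retry loop):

    #include <atomic>
    #include <cstdint>

    // Emulate an 8-bit atomic fetch-add using only word-wide atomics.
    uint8_t byteFetchAdd(uint8_t* addr, uint8_t value) {
        // Mask off the low bits to find the containing aligned word, and
        // compute the byte's bit offset within that word (little-endian).
        uintptr_t a = reinterpret_cast<uintptr_t>(addr);
        auto* word = reinterpret_cast<std::atomic<uint32_t>*>(a & ~uintptr_t(3));
        unsigned shift = unsigned(a & 3) * 8;
        uint32_t mask = uint32_t(0xFF) << shift;

        uint32_t old = word->load(std::memory_order_relaxed);
        for (;;) {
            uint8_t oldByte = uint8_t(old >> shift);
            uint32_t repl =
                (old & ~mask) | (uint32_t(uint8_t(oldByte + value)) << shift);
            // On ARM this compiles to an ldrex/strex loop; on failure `old`
            // is refreshed with the current word and we retry.
            if (word->compare_exchange_weak(old, repl))
                return oldByte;
        }
    }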
Could I ask if there are any potential alignment issues in the ARM atomics code? Some devices appear to enable alignment checks, and we have had to disable an irregexp optimization that was generating unaligned halfword instructions, so could this be an issue in the atomics code, or are all operations guaranteed to be aligned? This is not just an ARMv6 issue: all the load-exclusive / store-exclusive operations must be aligned even on ARMv7, but if we need to use a word-wide operation on ARMv7 as a substitute for a halfword- or byte-sized operation then there might be further alignment issues. If we add back the alignment checks to the simulator, we can catch some issues there.
(In reply to Douglas Crosher [:dougc] from comment #1)

> Could I ask if there are any potential alignment issues in the ARM atomics
> code?

There should not be any issues. All operations are restricted (by construction) to being properly aligned. There is no SharedDataView at this time, and if there's going to be one I think atomics will probably use some spinlock solution. Personally I don't want one.

> This is not just an ARMv6 issue, all the load-exclusive / store-exclusive
> operations must be aligned even on ARMv7, but if we need to use a word-wide
> operation on the ARMv7 as a substitute for a halfword or byte sized
> operation then there might be further alignment issues.

My plan is to mask off the low bits of the address before the operation; the only requirements will be that the start of the array is aligned on at least a four-byte boundary and that the allocation is a multiple of four bytes in length. Neither is an issue in practice. Frankly, performance on ARMv6 is probably not that important, so we can afford to do what it takes to stay aligned.

> If we add back the alignment checks to the simulator we can catch some
> issues there.

Interesting point. I have to create a bug for some ARMv6 atomics support anyway (CP15 support for barriers), and at a minimum I can note there that I should add alignment checks to the atomics instructions in general.
Code for atomicFetchOp, support for the BFI instruction in the assembler, a better test case, and some temporary test scaffolding. The in-line code for an atomicFetchOp is fairly large. I debated whether to just call into the VM, but this code will likely be at the core of all synchronization ops for asm.js, so I decided to in-line expand after all. I've noted one optimization opportunity in the code (when the output is not needed); others are when the 'value' operand is quite small and when the index operand is known. If they're worth pursuing, it would be when similar optimizations are implemented for ARMv7.
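For reference, BFI (Bit Field Insert) is what lets the new byte or halfword be merged into the loaded word without a separate mask-and-or sequence. Its semantics, written out as plain C++ (illustrative only, not assembler output):

    #include <cstdint>

    // BFI dst, src, #lsb, #width: insert the low `width` bits of src into
    // dst at bit position `lsb`, leaving dst's other bits unchanged.
    uint32_t bfi(uint32_t dst, uint32_t src, unsigned lsb, unsigned width) {
        uint32_t mask = (width == 32 ? ~0u : ((1u << width) - 1)) << lsb;
        return (dst & ~mask) | ((src << lsb) & mask);
    }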
Attachment #8504061 - Flags: feedback?(dtc-moz)
Looks like we can avoid the mov that sets up the result because sxtb and uxtb can both perform the rotate as part of the instruction. Nice.
(In reply to Lars T Hansen [:lth] from comment #4)

> Looks like we can avoid the mov that sets up the result because sxtb and
> uxtb can both perform the rotate as part of the instruction. Nice.

And of course load-immediate-and-multiply is an incompetent way to multiply a register by 8. Sigh.
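The sxtb/uxtb observation from comment 4, in C++ terms (a sketch for illustration): the extract-and-extend that would otherwise need a separate rotate plus an extend collapses into one instruction, because SXTB/UXTB accept a ROR #8/#16/#24 on their source operand.

    #include <cstdint>

    // What "SXTB rd, rm, ROR #ror" computes: rotate rm right by ror bits,
    // then sign-extend the low byte of the result.
    int32_t sxtbRor(uint32_t rm, unsigned ror) {   // ror in {0, 8, 16, 24}
        uint32_t rotated = (rm >> ror) | (rm << ((32 - ror) & 31));
        return int8_t(rotated & 0xFF);
    }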
This is basically complete (and sits on top of my patch queue). I'll ask for reviews when that queue has landed. To test this on any platform or on the simulator, make HasLDSTREXBHD() in Architecture-arm.cpp return false. compareExchange() is a little clunky because I elected not to provide it with any temp registers; as a result it recomputes a couple of values.
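That is, to force the ARMv6 code paths everywhere (a sketch of the temporary testing hack; the real function consults the detected CPU features):

    // In js/src/jit/arm/Architecture-arm.cpp, for testing only:
    bool HasLDSTREXBHD() {
        return false;  // pretend ldrexb/ldrexh/ldrexd are unavailable
    }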
Attachment #8504061 - Attachment is obsolete: true
Attachment #8504061 - Flags: feedback?(dtc-moz)
Assignee: nobody → lhansen
FWIW, according to bug 1085599 ARMv6 is now a Tier II platform (and somebody will likely run into test failures on ARMv6 soon). That status seems to be supported by https://developer.mozilla.org/en/docs/Supported_build_configurations, though that page appears to me to be a tiny bit stale (I thought Win64 had been promoted out of Tier III).
decoder told me he's hitting the NYI crashes a lot when fuzzing the ARM simulator. Can we easily disable the code for now on ARM?
Flags: needinfo?(lhansen)
It can be disabled in moz.build by removing ENABLE_SHARED_ARRAY_BUFFER. A better solution is probably not to inline the Atomics primitives on ARMv6; this will be a little hacky but not hard. I'll do so today.
Flags: needinfo?(lhansen)
See Also: → 1146902
On a related note, the prospective in-line code for ARMv6 is painfully large, so on that platform we'll likely want to make calls to C++ for both regular JS and asm.js.
I'm offering this patch now because I'd like to solve bug 1146902, where testers are running into NYI crashes because I've not implemented ARMv6 support for the atomics. For Ion it's easy to disable inlining and fall back on the runtime functions, and the patch coming shortly on bug 1146902 will do so, but for asm.js there needs to be an implementation, and that's what's in this patch.
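To illustrate the shape of such a callout (hypothetical names and signature, not the actual patch, which wires these up through the asm.js/Odin machinery and the current module's heap; the out-of-bounds behavior shown is also an assumption):

    #include <atomic>
    #include <cstddef>
    #include <cstdint>

    // A C++ callout the generated ARMv6 code could branch to instead of
    // inlining an ldrex/strex sequence: an 8-bit compareExchange against
    // the asm.js heap, returning the old value.
    static int32_t
    AtomicCmpXchg8(uint8_t* heap, size_t heapLength, uint32_t offset,
                   int32_t oldval, int32_t newval) {
        if (offset >= heapLength)   // out-of-bounds access: no-op, return 0
            return 0;
        auto* addr = reinterpret_cast<std::atomic<uint8_t>*>(heap + offset);
        uint8_t expected = uint8_t(oldval);
        addr->compare_exchange_strong(expected, uint8_t(newval));
        return expected;            // old value, per compareExchange semantics
    }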
Attachment #8582593 - Flags: review?(luke)
There's a small bug in this patch: the return type in the construction of FunctionCompiler::Call (two instances) should be Signed, not Double.
Comment on attachment 8582593 [details] [diff] [review]
ARMv6 atomics for asm.js via callouts to C++

Review of attachment 8582593 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/asmjs/AsmJSValidate.cpp
@@ +5027,5 @@
> +        return false;
> +
> +    f.finishCallArgs(&call);
> +
> +    if (!f.builtinCall(callee, call, MIRType_Int32, def))

Could all this be done in the ARM backend codegen for the Atomic MIR nodes instead of in Odin? That's what we've generally done for other ARMv6 workarounds.
(In reply to Luke Wagner [:luke] from comment #13)

> Could all this be done in the ARM backend codegen for the Atomic MIR nodes
> instead of in Odin? That's what we've generally done for other ARMv6
> workarounds.

You're thinking of e.g. the UDiv code? Probably. I'll look into it.
Substantially the same patch, but the bifurcation logic has been moved out of AsmJSValidate.cpp and into Lowering-arm.cpp, where it gives rise to two new platform-specific LIR nodes to handle the callout cases.
Attachment #8582593 - Attachment is obsolete: true
Attachment #8582593 - Flags: review?(luke)
Attachment #8584484 - Flags: review?(luke)
Comment on attachment 8584484 [details] [diff] [review]
ARMv6 atomics for asm.js via callouts to C++, v2

Review of attachment 8584484 [details] [diff] [review]:
-----------------------------------------------------------------

Great, thanks!

::: js/src/builtin/AtomicsObject.cpp
@@ +709,5 @@
> +// value (for binops) and oldval/newval (for cmpxchg) are the values
> +// to be operated upon.
> +
> +static void
> +getCurrentAsmJSHeap(void **heap, size_t *length)

SM style is to capitalize 'g'.

::: js/src/jit/arm/CodeGenerator-arm.cpp
@@ +1969,5 @@
> +    Scalar::Type viewType = mir->accessType();
> +    Register ptr = ToRegister(ins->ptr());
> +    Register oldval = ToRegister(ins->oldval());
> +    Register newval = ToRegister(ins->newval());
> +    int32_t itemSize = (int32_t)byteSize(viewType) * (isSignedIntType(viewType) ? -1 : 1);

Instead of this implicit int coding scheme, could you pass the viewType directly (so the callee could simply cast it back to a Scalar::Type)?

@@ +1978,5 @@
> +    masm.passABIArg(ptr);
> +    masm.passABIArg(oldval);
> +    masm.passABIArg(newval);
> +
> +    masm.callWithABI(AsmJSImm_AtomicCmpXchg);

Can you assert the output reg is ReturnReg?

::: js/src/jit/arm/Lowering-arm.cpp
@@ +659,5 @@
> +    LAsmJSCompareExchangeCallout *lir =
> +        new(alloc()) LAsmJSCompareExchangeCallout(useRegister(ptr),
> +                                                  useRegister(ins->oldValue()),
> +                                                  useRegister(ins->newValue()));
> +    defineFixed(lir, ins, LAllocation(AnyRegister(r0)));

How about s/r0/ReturnReg/ here and below to make the intention explicit?
Attachment #8584484 - Flags: review?(luke) → review+
Thanks. The itemSize hack was pointlessly gross. With all comments addressed: https://hg.mozilla.org/integration/mozilla-inbound/rev/9193aac3a945
(In reply to Tooru Fujisawa [:arai] from comment #18)

> Fixed the order of #include in AsmJSModule.cpp
>
> https://hg.mozilla.org/integration/mozilla-inbound/rev/fe024e9b9095

arai: thanks!
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla40