Status

()

Core
JavaScript Engine: JIT
P3
normal
RESOLVED DUPLICATE of bug 1313336
3 years ago
4 months ago

People

(Reporter: lth, Assigned: lth)

Tracking

(Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 1 obsolete attachment)

(Assignee)

Description

3 years ago
Support the atomic operations for the ARM64 JIT (Odin, Ion, assembler).
(Assignee)

Comment 1

3 years ago
Created attachment 8676185 [details] [diff] [review]
bug1216486-arm64-masm-atomics.patch

This compiles, but my impression is that Ion is not in an operational state so I've not attempted to test.

I'm looking for feedback on style / appropriateness, in addition to any remarks about (in)correctness.  The code has been translated from the corresponding ARMv7 code.  I believe I could use LDAR/STLR and so on to get rid of one fence, but that's for later.

(I've only done compareExchange so far but the others are easy once we agree on this baseline.)
Attachment #8676185 - Flags: feedback?(jolesen)
Comment on attachment 8676185 [details] [diff] [review]
bug1216486-arm64-masm-atomics.patch

Review of attachment 8676185 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/arm64/MacroAssembler-arm64.cpp
@@ +261,5 @@
>  }
>  
> +template<>
> +ARMRegister
> +MacroAssemblerCompat::computePointer<BaseIndex>(const BaseIndex& src, ARMRegister r)

Rather than having two template specializations in a .cpp file, I would prefer to simply provide two overloads of a non-template function:

ARMRegister computePointer(const BaseIndex& src, ARMRegister r);
ARMRegister computePointer(const Address& src, ARMRegister r);

I get confused about linkage when declaring a template in a header with specializations in a .cpp file.

@@ +301,5 @@
> +    const ARMRegister oldval(oldval_, 64);
> +    const ARMRegister newval(newval_, 64);
> +    const ARMRegister output(output_, 64);
> +
> +    Dmb(vixl::InnerShareable, vixl::BarrierWrites);

I don't understand these matters well enough, but Appendix J10.3 of the ARMv8 ARM suggests that many synchronization primitives can be implemented without a barrier in ARMv8:

"The ARMv8 architecture adds the acquire and release semantics to Load-Exclusive and Store-Exclusive instructions, which allows them to gain ordering acquire and/or release semantics."

@@ +344,5 @@
> +        Stxr(scratch, newval, ptr);
> +        break;
> +    }
> +    Cmp(scratch, Operand(1));
> +    B(&again, Assembler::Equal);

Note that Aarch64 has cbz and cbnz instructions which fuse compare against 0 and branch into one instruction.

@@ +354,5 @@
> +MacroAssemblerCompat::compareExchange(int nbytes, bool signExtend, const Address& address,
> +                                      Register oldval, Register newval, Register output);
> +template void
> +MacroAssemblerCompat::compareExchange(int nbytes, bool signExtend, const BaseIndex& address,
> +                                      Register oldval, Register newval, Register output);

What's going on here?
Attachment #8676185 - Flags: feedback?(jolesen) → feedback+
(Assignee)

Comment 3

3 years ago
(In reply to Jakob Stoklund Olesen [:jolesen] from comment #2)
> Comment on attachment 8676185 [details] [diff] [review]
> bug1216486-arm64-masm-atomics.patch
> 
> Review of attachment 8676185 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/jit/arm64/MacroAssembler-arm64.cpp
> @@ +261,5 @@
> >  }
> >  
> > +template<>
> > +ARMRegister
> > +MacroAssemblerCompat::computePointer<BaseIndex>(const BaseIndex& src, ARMRegister r)
> 
> Rather than having two template specializations in a .cpp file, I would
> prefer to simply provide two overloads of a non-template function:
> 
> ARMRegister computePointer(const BaseIndex& src, ARMRegister r);
> ARMRegister computePointer(const Address& src, ARMRegister r);
> 
> I get confused about linkage when declaring a template in a header with
> specializations in a .cpp file.

I can change this.  We already use templates like this pretty extensively though (and this code is just copied over from the ARM port).  Also see below.

> @@ +301,5 @@
> > +    const ARMRegister oldval(oldval_, 64);
> > +    const ARMRegister newval(newval_, 64);
> > +    const ARMRegister output(output_, 64);
> > +
> > +    Dmb(vixl::InnerShareable, vixl::BarrierWrites);
> 
> I don't understand these matters well enough, but Appendix J10.3 of the
> ARMv8 ARM suggests that many synchronization primitives can be implemented
> without a barrier in ARMv8:
> 
> "The ARMv8 architecture adds the acquire and release semantics to
> Load-Exclusive and Store-Exclusive instructions, which allows them to gain
> ordering acquire and/or release semantics."

Yes, that's what I wrote in the submission comment: I may be able to get rid of one of these Dmb instructions by using the release/acquire instructions (but not both Dmbs, since rel/acq is strictly weaker than sequential consistency).  But I'd like the code to run correctly in its current form first and consider it an optimization.  Also see bug 1077321.

> @@ +354,5 @@
> > +MacroAssemblerCompat::compareExchange(int nbytes, bool signExtend, const Address& address,
> > +                                      Register oldval, Register newval, Register output);
> > +template void
> > +MacroAssemblerCompat::compareExchange(int nbytes, bool signExtend, const BaseIndex& address,
> > +                                      Register oldval, Register newval, Register output);
> 
> What's going on here?

Explicit template instantiation.  Most compilers require this.
(In reply to Lars T Hansen [:lth] from comment #3)
> > I get confused about linkage when declaring a template in a header with
> > specializations in a .cpp file.
> 
> I can change this.  We already use templates like this pretty extensively
> though (and this code is just copied over from the ARM port).  Also see
> below.

Don't change existing practice on my behalf.

The benefit of using template specializations here isn't really clear to me, though. And if I call a template function with the wrong type of argument, I think the error message will be deferred to link time.

Maybe it has to do with MacroAssembler.h being target independent?

> > I don't understand these matters well enough, but Appendix J10.3 of the
> > ARMv8 ARM suggests that many synchronization primitives can be implemented
> > without a barrier in ARMv8:
> > 
> > "The ARMv8 architecture adds the acquire and release semantics to
> > Load-Exclusive and Store-Exclusive instructions, which allows them to gain
> > ordering acquire and/or release semantics."
> 
> Yes, that's what I wrote in the submission comment: I may be able to get rid
> of one of these Dmb instructions by using the release/acquire instructions
> (but not both Dmbs, since rel/acq is strictly weaker than sequential
> consistency).  But I'd like the code to run correctly in its current form
> first and consider it an optimization.  Also see bug 1077321.

Yes, I didn't read your comment properly until after pushing "submit" ;-)

It makes sense to build something that works first.
(Assignee)

Comment 5

3 years ago
(In reply to Jakob Stoklund Olesen [:jolesen] from comment #4)
> (In reply to Lars T Hansen [:lth] from comment #3)
> > > I get confused about linkage when declaring a template in a header with
> > > specializations in a .cpp file.
> > 
> > I can change this.  We already use templates like this pretty extensively
> > though (and this code is just copied over from the ARM port).  Also see
> > below.
> 
> Don't change existing practice on my behalf.
> 
> The benefit of using template specializations here isn't really clear to me,
> though. And if I call a template function with the wrong type of argument, I
> think the error message will be deferred to link time.
> 
> Maybe it has to do with MacroAssembler.h being target independent?

I doubt it since this is a private method, used only by the atomics.  The templatification for the memory operand type is used extensively across the platforms, though I'm not sure how consistently - and as you observe, there's scope for some error, but at least it's a static error (which is probably why templating is preferable to subtyping).  This particular private method just follows the pattern.

I'm not a big fan of the extensive use of templates myself, since they tend to force code into header files and thus help create bizarre header dependencies that are hard to break (vm/TypedArrayCommon.h is nasty), but when in Rome...
(Assignee)

Comment 6

3 years ago
Created attachment 8676758 [details] [diff] [review]
bug1216486-arm64-masm-atomics.patch, v2

This should be the complete masm change (still untested).  No asm changes are needed.

Still to come: lowering, code generation.  There's a dustup around the high-level structure of that on the MIPS side that may spill over to other platforms so I'll wait until that settles down.
Attachment #8676185 - Attachment is obsolete: true
(Assignee)

Updated

2 years ago
Priority: -- → P5
(Assignee)

Updated

2 years ago
Assignee: lhansen → nobody
Status: ASSIGNED → NEW
(Assignee)

Updated

8 months ago
Blocks: 1407535
(Assignee)

Comment 7

5 months ago
New patches forthcoming.
Assignee: nobody → lhansen
Status: NEW → ASSIGNED
Priority: P5 → P3
Hardware: Other → ARM64
(Assignee)

Comment 8

4 months ago
Let's land these methods with the other MacroAssembler implementations.
Status: ASSIGNED → RESOLVED
Last Resolved: 4 months ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1313336
You need to log in before you can comment on or make changes to this bug.