Closed Bug 1425149 Opened 2 years ago Closed 2 years ago

Clean up JIT APIs to atomic operations

Categories

(Core :: JavaScript Engine: JIT, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
mozilla59
Tracking Status
firefox59 --- fixed

People

(Reporter: lth, Assigned: lth)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

(Paying off technical debt.)

When the atomics were implemented there was a lot of experimentation, including around float atomics and ARMv6 support and 64-bit.  Later we removed some of that, and added Wasm support.  As a consequence the JIT back-ends diverged a bit and common patterns were not extracted and exposed.  Now is a good time to clean it up.
I apologize for the size of this but by and large it only moves a lot of code around.  See commit msg for more info.
Attachment #8936701 - Flags: review?(nicolas.b.pierron)
Blocks: 1423491
Blocks: wasm-mips
Comment on attachment 8936701 [details] [diff] [review]
bug1425149-revamp-masm-atomic-apis.patch

Review of attachment 8936701 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/AtomicOp.h
@@ +71,5 @@
> +{
> +    const MemoryBarrierBits barrierBefore;
> +    const MemoryBarrierBits barrierAfter;
> +
> +    explicit Synchronization(MemoryBarrierBits before, MemoryBarrierBits after)

nit: no need for explicit when given 2 arguments.
Could this be a constexpr as well?

::: js/src/jit/MacroAssembler.h
@@ +1521,5 @@
>      inline void clampIntToUint8(Register reg) PER_SHARED_ARCH;
>  
> +  public:
> +    // ========================================================================
> +    // Primitive atomic operations.

nit: Add a comment to suggest using the *JS version of these functions, if the value might be packed in a js::Value after it is being loaded/swapped.

@@ +1535,5 @@
> +    // whether we wrote or not.
> +    //
> +    // x86-shared: `output` must be eax.
> +
> +    void compareExchange32(Scalar::Type type, const Synchronization& sync, const Address& mem,

The 32 suffix is quite miss-leading.  All mov* suffixes are references to the memory size, while here this is a reference to the output/input register size.

Maybe naming it:
 - compareExchange32OrLess(…)
 - compareExchangeType(…)
 - compareExchange(…)

::: js/src/jit/arm/CodeGenerator-arm.cpp
@@ +2290,5 @@
>      MOZ_ASSERT(ins->addrTemp()->isBogusTemp());
>  
>      BaseIndex srcAddr(HeapReg, ptrReg, TimesOne, mir->access().offset());
> +    masm.atomicFetchOp32(vt, Synchronization::Full(), op, ToRegister(value), srcAddr, flagTemp,
> +                         ToRegister(ins->output()));

nit: Move the ToRegister(value) call with the other above. (idem before and after)

::: js/src/jit/arm/MacroAssembler-arm.cpp
@@ +5104,5 @@
> +            masm.as_uxth(scratch, oldval, 0);
> +        }
> +        break;
> +      case 4:
> +        masm.as_ldrex(output, ptr);

Q: just making sure I understand. The assertion MOZ_ASSERT(!signExtend) got removed because IsSignedIntType is true for Int32 and the code does not need any sign extension because ARM registers are 32 bits anyway, right?

nit: Add a "default: MOZ_CRASH();" case.

@@ +5260,5 @@
> +            masm.as_sxth(output, output, 0);
> +        break;
> +      case 4:
> +        masm.as_ldrex(output, ptr);
> +        break;

nit: Add a  "default: MOZ_CRASH();"  case.

@@ +5351,5 @@
> +        masm.as_ldrexh(scratch, ptr);
> +        break;
> +      case 4:
> +        masm.as_ldrex(scratch, ptr);
> +        break;

nit: Add a "default: MOZ_CRASH();".

::: js/src/jit/mips32/CodeGenerator-mips32.cpp
@@ +521,5 @@
>          masm.ma_load(output.high, BaseIndex(HeapReg, ptr, TimesOne, INT64HIGH_OFFSET), SizeWord);
>          masm.append(mir->access(), masm.size() - 4 , masm.framePushed());
>      }
>  
> +    masm.memoryBarrierAfter(mir->access().sync());

existing issue: There is a missing memoryBarrierAfter before the early return in this function.

follow-up nit: Create a BarrierSection RAII?

@@ +600,5 @@
>          masm.append(mir->access(), masm.size() - 4, masm.framePushed());
>          masm.ma_store(value.low, BaseIndex(HeapReg, ptr, TimesOne), SizeWord);
>      }
>  
> +    masm.memoryBarrierAfter(mir->access().sync());

existing issue: same here.

::: js/src/jit/mips64/CodeGenerator-mips64.cpp
@@ +461,5 @@
>      masm.ma_load(ToOutRegister64(lir).reg, BaseIndex(HeapReg, ptr, TimesOne),
>                   static_cast<LoadStoreSize>(8 * byteSize), isSigned ? SignExtend : ZeroExtend);
>      masm.append(mir->access(), masm.size() - 4, masm.framePushed());
>  
> +    masm.memoryBarrierAfter(mir->access().sync());

existing issue: missing memoryBarrierAfter before the early return.

@@ +526,5 @@
>      masm.ma_store(ToRegister64(lir->value()).reg, BaseIndex(HeapReg, ptr, TimesOne),
>                    static_cast<LoadStoreSize>(8 * byteSize), isSigned ? SignExtend : ZeroExtend);
>      masm.append(mir->access(), masm.size() - 4, masm.framePushed());
>  
> +    masm.memoryBarrierAfter(mir->access().sync());

existing issue: idem.

::: js/src/jit/x64/MacroAssembler-x64.cpp
@@ +977,5 @@
> +        masm.negq(output);
> +        masm.lock_xaddq(output, Operand(mem));
> +    } else {
> +        Label again;
> +        MOZ_ASSERT(output == rax);

follow-up nit: I think it would be better for our register allocator if temp == rax && output != rax, while replacing temp by output in the following code sequence.

It would be better because then the register allocator would not have to move the content of rax into another register, in case multiple atomicFetch instructions are following each others.
Attachment #8936701 - Flags: review?(nicolas.b.pierron) → review+
(In reply to Nicolas B. Pierron [:nbp] from comment #3)
> Comment on attachment 8936701 [details] [diff] [review]
> bug1425149-revamp-masm-atomic-apis.patch
> 
> Review of attachment 8936701 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> > +    explicit Synchronization(MemoryBarrierBits before, MemoryBarrierBits after)
> 
> Could this be a constexpr as well?

Good point.

> @@ +1535,5 @@
> > +    // whether we wrote or not.
> > +    //
> > +    // x86-shared: `output` must be eax.
> > +
> > +    void compareExchange32(Scalar::Type type, const Synchronization& sync, const Address& mem,
> 
> The 32 suffix is quite miss-leading.  All mov* suffixes are references to
> the memory size, while here this is a reference to the output/input register
> size.
> 
> Maybe naming it:
>  - compareExchange32OrLess(…)
>  - compareExchangeType(…)
>  - compareExchange(…)

I chose the latter of these three, that wasn't working before but probably adding the 'JS' suffix to the JS-specific methods cleared the way.

> ::: js/src/jit/arm/MacroAssembler-arm.cpp
> @@ +5104,5 @@
> > +            masm.as_uxth(scratch, oldval, 0);
> > +        }
> > +        break;
> > +      case 4:
> > +        masm.as_ldrex(output, ptr);
> 
> Q: just making sure I understand. The assertion MOZ_ASSERT(!signExtend) got
> removed because IsSignedIntType is true for Int32 and the code does not need
> any sign extension because ARM registers are 32 bits anyway, right?

Technically, it got removed because we may now se Uint32 here where we couldn't see that before; that's a result of how the API evolved.  Int32 and Uint32 are handled the same: no sign extension needed.

> ::: js/src/jit/mips32/CodeGenerator-mips32.cpp
> @@ +521,5 @@
> >          masm.ma_load(output.high, BaseIndex(HeapReg, ptr, TimesOne, INT64HIGH_OFFSET), SizeWord);
> >          masm.append(mir->access(), masm.size() - 4 , masm.framePushed());
> >      }
> >  
> > +    masm.memoryBarrierAfter(mir->access().sync());
> 
> existing issue: There is a missing memoryBarrierAfter before the early
> return in this function.

Nice catch!

> follow-up nit: Create a BarrierSection RAII?

Sigh, yeah.  I keep thinking that, but then I start wondering if it's really necessary because the platforms are so different.  But for ARM, ARM64, MIPS, MIPS64 it would probably be the right thing.  I'll file a bug.

> ::: js/src/jit/x64/MacroAssembler-x64.cpp
> @@ +977,5 @@
> > +        masm.negq(output);
> > +        masm.lock_xaddq(output, Operand(mem));
> > +    } else {
> > +        Label again;
> > +        MOZ_ASSERT(output == rax);
> 
> follow-up nit: I think it would be better for our register allocator if temp
> == rax && output != rax, while replacing temp by output in the following
> code sequence.
> 
> It would be better because then the register allocator would not have to
> move the content of rax into another register, in case multiple atomicFetch
> instructions are following each others.

OK, noted - I will file a bug to investigate, I see what you mean.
https://hg.mozilla.org/mozilla-central/rev/7a87f1ff89f3
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
You need to log in before you can comment on or make changes to this bug.