Closed Bug 1420838 Opened 7 years ago Closed 6 years ago

[MIPS] Add 64-bit atomics JIT support

Categories

(Core :: JavaScript Engine: JIT, enhancement, P5)

enhancement

Tracking

()

RESOLVED FIXED
mozilla60
Tracking Status
firefox60 --- fixed

People

(Reporter: dragan.mladjenovic, Assigned: dragan.mladjenovic)

References

Details

Attachments

(3 files, 14 obsolete files)

35.43 KB, patch
dragan.mladjenovic
: review+
Details | Diff | Splinter Review
5.06 KB, patch
dragan.mladjenovic
: review+
Details | Diff | Splinter Review
233.20 KB, patch
dragan.mladjenovic
: review+
Details | Diff | Splinter Review
User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.106 Safari/537.36
Temporary workaround to get build passing on mips32(64) until we get proper wasm baseline tier support for MIPS platforms in a week or so.
Attachment #8932041 - Flags: review?(lhansen)
If you are making MIPS changes to the wasm baseline compiler now, you will likely experience a fair amount of merge pain, since I have several large changes up for review (with possibly more coming) and hope to land them this week and next.  Since they are fundamental changes having to do with the stack frame and how register allocator platform dependencies are handled I will probably wish to block other large changes (such as support for new platforms) until my changes have landed.
Flags: needinfo?(dragan.mladjenovic)
Flags: needinfo?(dragan.mladjenovic)
(In reply to Lars T Hansen [:lth] from comment #2)
> If you are making MIPS changes to the wasm baseline compiler now, you will
> likely experience a fair amount of merge pain, since I have several large
> changes up for review (with possibly more coming) and hope to land them this
> week and next.  Since they are fundamental changes having to do with the
> stack frame and how register allocator platform dependencies are handled I
> will probably wish to block other large changes (such as support for new
> platforms) until my changes have landed.

Hi,

I am aware of refactoring effort for wasm baseline. Are you willing to accept small changes like this attachment that just make minimal changes to get build running on MIPS in transitional period ?
Flags: needinfo?(lhansen)
Comment on attachment 8932041 [details] [diff] [review]
bug1420838_wasm_baseline_ifdef.patch

Review of attachment 8932041 [details] [diff] [review]:
-----------------------------------------------------------------

Even if they're intended for short-term work this is not the way to fix them; the proper fix (short term) is to provide stubs in the macroassembler that MOZ_CRASH.

I'm happy to discuss cleaning up this API (indeed the atomics API really should be defined in MacroAssembler.h) but then that's a different discussion.

::: js/src/wasm/WasmBaselineCompile.cpp
@@ +3661,4 @@
>          masm.atomicExchange64(srcAddr, rv, rd);
> +#else
> +     MOZ_CRASH("BaseCompiler platform hook: xchg64");
> +#endif

The better fix would be to define atomicExchange64 in the MIPS layer and to stub that out, for now.  If, later, the API to that function does not match what the other platforms have then let's talk about how we should fix that.

@@ +3702,5 @@
>      {
>          prepareMemoryAccess(access, check, tls, ptr);
>          ATOMIC_PTR(srcAddr, access, tls, ptr);
>  
> +#if defined(JS_CODEGEN_X64) || defined(JS_CODEGEN_X86) || defined(JS_CODEGEN_ARM)

Same for atomicRMW.

@@ +3779,1 @@
>          switch (op) {

And again, and so on.
Attachment #8932041 - Flags: review?(lhansen) → review-
(In reply to Dragan Mladjenovic from comment #3)
> (In reply to Lars T Hansen [:lth] from comment #2)
> > If you are making MIPS changes to the wasm baseline compiler now, you will
> > likely experience a fair amount of merge pain, since I have several large
> > changes up for review (with possibly more coming) and hope to land them this
> > week and next.  Since they are fundamental changes having to do with the
> > stack frame and how register allocator platform dependencies are handled I
> > will probably wish to block other large changes (such as support for new
> > platforms) until my changes have landed.
> 
> Hi,
> 
> I am aware of refactoring effort for wasm baseline. Are you willing to
> accept small changes like this attachment that just make minimal changes to
> get build running on MIPS in transitional period ?

Not really, sorry :(  If we land code it should be clean code, and this takes us in the wrong direction.  I don't think it should be a massive amount of work for you to stub out the API in the MacroAssembler interface, see eg ARM64 and the 'none' target; they do compile even though they do not have this support.

As I wrote in my comment on the patch, it's possible we should really clean up the MASM interface for atomics, esp if it does not work well for a platform.
Flags: needinfo?(lhansen)
Assignee: nobody → dragan.mladjenovic
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Priority: -- → P3
Attached patch bug1420838_stubs.patch (obsolete) — Splinter Review
I did the stubs. Hope it is ok to piggyback a small additional #ifdef change to baseline to make build running.
Attachment #8932041 - Attachment is obsolete: true
Attachment #8932183 - Flags: review?(lhansen)
Comment on attachment 8932183 [details] [diff] [review]
bug1420838_stubs.patch

Review of attachment 8932183 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good.  Thanks!
Attachment #8932183 - Flags: review?(lhansen) → review+
The patch fails to apply. Could you please take a look?
Flags: needinfo?(dragan.mladjenovic)
Keywords: checkin-needed
Attached patch bug1420838_stubs2.patch (obsolete) — Splinter Review
Sorry was missing a squash with local commit.
Attachment #8932183 - Attachment is obsolete: true
Flags: needinfo?(dragan.mladjenovic)
Attachment #8932421 - Flags: review+
Keywords: checkin-needed
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d6bc31bb337b
Stub out wasm baseline atomics for MIPS. r=lth
Keywords: checkin-needed
Attached patch bug1420838_poc.patch (obsolete) — Splinter Review
Hi,
I've made a POC implementation of 64-bit atomics for mips32/mips64. For mips32 it uses a single gloabl spinlock which is visible from jit. I use SymbolicAccess relocations to encode its address into the code. I'm not sure is this ok. I know this mechanism is used for access to c++ wasm builtins. One major open issue is the simulator build. Should we now include mips AtomicOparation.h instead of platform one (by platform I mean x86) on simulator builds ? Implementation for mips32 is a bit tight on registers in order to adhere to API used by wasm baseline. It is still missing mips64 simulator support for new lld, scd instructions.
Attachment #8935328 - Flags: feedback?(yuyin-hf)
Attachment #8935328 - Flags: feedback?(lhansen)
Comment on attachment 8935328 [details] [diff] [review]
bug1420838_poc.patch

Review of attachment 8935328 [details] [diff] [review]:
-----------------------------------------------------------------

From a first look this seems about right.

I think the handling of the SymbolicAddress is appropriate though we should ask Luke (eventually) if it's OK to use the mechanism for this.  The alternative is to hang the pointer to the lock off the tls, but this is harder since the changes will tend to percolate up through lowering and into WasmIonCompile and WasmBaselineCompile, and it's nice to avoid that.

You can use weaker barriers in the MIPS32 atomics - you don't need a full barrier both before and after the lock-acquire and before and after the lock-release.  I have to investigate this (you could do it too) but I think it's sufficient to have a full fence before the lock-acquire and before the lock-release; the two other fences can be an acquire and a release, respectively.  But I'm not 100% sure what's right and what's best, and what you have should be conservatively safe.

About the AtomicOperations.h - no, you should not include the mips ones, you need to use one that is appropriate for the hardware that the simulator is running on.  The simulator should be using those operations to simulate LL and SC.  Then, when you're simulating MIPS32, the JIT will additionally generate code to acquire and release the lock, and that's fine too, the atomic instructions generated there will also be simulated using correct atomics.

We don't really expect the simulators to simulate multiprocessing / atomicity very well.  The ARM simulator tries hard to do a good job and even emulates the CPU's "exclusive monitor" a little bit.
Attachment #8935328 - Flags: feedback?(lhansen) → feedback+
(In reply to Lars T Hansen [:lth] from comment #13)
> Comment on attachment 8935328 [details] [diff] [review]
> bug1420838_poc.patch
> 
> Review of attachment 8935328 [details] [diff] [review]:
> -----------------------------------------------------------------

> You can use weaker barriers in the MIPS32 atomics - you don't need a full
> barrier both before and after the lock-acquire and before and after the
> lock-release.  I have to investigate this (you could do it too) but I think
> it's sufficient to have a full fence before the lock-acquire and before the
> lock-release; the two other fences can be an acquire and a release,
> respectively.  But I'm not 100% sure what's right and what's best, and what
> you have should be conservatively safe.
> 

Thank you for the prompt response.

I actually wanted to use MembarSynchronizing. The thing is that light-weight ordering barriers
(like sync-release) on most of hardware (I'm mainly speaking for mips32 here) will result in same
effect as full completion barrier. The only core series in whose documentation I could find that they
implement ordering barriers are 74K. And even there they give some examples that "imply" that these barriers
only enforce ordering between memory accesses that miss the cache [0]. I'd rather want to be conservative in
this case given that clang/gcc will always emit full sync barrier ignoring the memory model specifiers.
We might get away w/o any barrier before lock-acquire, but I rather not delve on that.
Also note that for Loongson we full sync barrier is always emitted [0]. Not sure on exact reason for this, 
but We maybe should do that for all MIPS targets in MacroAssembler::memoryBarrier. 

[0] http://cdn2.imgtec.com/documentation/MD00605-2B-CMPCOHERE-AFP-01.01.pdf page 26
[1] https://searchfox.org/mozilla-central/source/js/src/jit/mips-shared/Assembler-mips-shared.cpp#1874


> About the AtomicOperations.h - no, you should not include the mips ones, you
> need to use one that is appropriate for the hardware that the simulator is
> running on.  The simulator should be using those operations to simulate LL
> and SC.  Then, when you're simulating MIPS32, the JIT will additionally
> generate code to acquire and release the lock, and that's fine too, the
> atomic instructions generated there will also be simulated using correct
> atomics.

The problem with that approach is that I'v put the the global spinlock and required
code in AtomicOparations-mips-shared.h. And also VM wont be able to synchronize with JIT
because it for example VM uses jit::AtomicOperations::compareExchangeSeqCst<long long> 
which will hardware cas8 while JIT will use a spinlock (effectively racy reads and writes).
It's not that big of an issue I would just have to find a place where to define the spinlock
and required code.
Comment on attachment 8935328 [details] [diff] [review]
bug1420838_poc.patch

Review of attachment 8935328 [details] [diff] [review]:
-----------------------------------------------------------------

yes, on currently loongson cpu, if stype != 0, 'sync' instruction will cause a reserved instruction exception.

we will add lld/scd on mips64 simulator later? like about wasm baseline we said before, support mips32 first, and then support full mips64?

the patch look good to me, thank you.
Attachment #8935328 - Flags: feedback?(yuyin-hf) → feedback+
(In reply to yuyin from comment #15)
> Comment on attachment 8935328 [details] [diff] [review]
> bug1420838_poc.patch
> 
> Review of attachment 8935328 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> yes, on currently loongson cpu, if stype != 0, 'sync' instruction will cause
> a reserved instruction exception.
> 
> we will add lld/scd on mips64 simulator later? like about wasm baseline we
> said before, support mips32 first, and then support full mips64?
> 
> the patch look good to me, thank you.

Thank you for your response.

Yes I will add lld/scd support to mips64 simulator in this patch. Sorry for the delay for mips32 baseline it should land hopefully  on Monday.
Attached patch bug1420838_jit.patch (obsolete) — Splinter Review
Small cleanup. Supports use of 64-bit atomics under mips32 simulator now.
Attachment #8935328 - Attachment is obsolete: true
Attachment #8936440 - Flags: review?(lhansen)
Attached patch bug1420838_sim_mips32.patch (obsolete) — Splinter Review
Attachment #8936441 - Flags: review?(lhansen)
Attached patch bug1420838_sim_mips64.patch (obsolete) — Splinter Review
Atomics support for mips64 simulator w/o some additional fixes to make wasm tests runnable under simulator. While doing this I discovered a problem with simulator redirects for mips64. For functions that return int32 MIPS64 expects int32 value sign extended to full register width while calling convention on X86-64 makes no such guaranties. This causes problems with new wait/wake functions that return -1 in error conditions. I've made workarounds that cover those cases in the simulator.
I'll look at plausibility of fixing this from jit's side when running under simulator.
Attachment #8936566 - Flags: feedback?(yuyin-hf)
Attachment #8936566 - Flags: feedback?(lhansen)
I'll try to get to these patches by the end of next week, but if I don't have time (I'm at a meeting this week and mostly off next week) it'll be January, sorry :(

You'll end up reworking some of this anyway, because I am in the process of cleaning up the interface to the atomic operations (in order to reduce randome machine dependencies elsewhere).  There's no bug for this yet but one will appear this week.
Attached patch bug1420838_atomic32.patch (obsolete) — Splinter Review
This patch removes the unnecessary use of temp registers for 32-bit atomics so we can now have same API surface as other architectures for non subword atomics. Removes 32-bit stubs introduces in the first patch.
Comment on attachment 8936566 [details] [diff] [review]
bug1420838_sim_mips64.patch

Review of attachment 8936566 [details] [diff] [review]:
-----------------------------------------------------------------

Since I usually use a real machine to debug js code, so I know little about simulator.
But I think it is right and reasonable that process the return int32 value on mips64 like you do in function Args_Int_GeneralGeneralInt64Int64.

another topic, it is a bit like this, you can see ecma_6/TypedArray/sort_bascis.js failed on mips64, both real hardware and simulator. it is failed because there are some bugs to handle JSValueTypeInt32. see bug 1424978
Attachment #8936566 - Flags: feedback?(yuyin-hf) → feedback+
Blocks: wasm-mips
I will get to these reviews this week for sure; sorry about the delay.

The patches that just landed for bug 1425149 (changeset 398238:7a87f1ff89f3) changed the masm API for the atomics.  You will need to adapt to this (and indeed, currently mozilla-inbound probably does not compile on MIPS due to these changes).  Those adaptations probably aren't hard, but you'll also need to adapt the patch that's on this bug.  I'm sorry about that but those changes were long overdue.
Comment on attachment 8936440 [details] [diff] [review]
bug1420838_jit.patch

Review of attachment 8936440 [details] [diff] [review]:
-----------------------------------------------------------------

f+ instead of r+ because I'd like to see this again once you've implemented the new porting APIs in the MacroAssembler, bug tenerally this looks good.  I agree with your solution for the AtomicOperations.

::: js/src/jit/mips-shared/AtomicOperations-mips-shared.h
@@ +9,5 @@
>  // NOTE, MIPS32 unlike MIPS64 doesn't provide hardware support for lock-free
>  // 64-bit atomics. We lie down below about 8-byte atomics being always lock-
> +// free in order to support wasm jit. The 64-bit atomic for MIPS32 do not use
> +// __atomic intrinsic and therefore do not relay on -latomic.
> +// Access to aspecific 64-bit variable in memory is protected by an AddressLock

"a specific"

@@ +72,4 @@
>      return true;
> +# else
> +    return false;
> +# endif

Any reason you can't just return true here?  This file is used on mips-32 and mips-64 hardware, or on x86 with the mips-32 simulator.  In both 32-bit cases JS_CODEGEN_MIPS32 will be true.  In the 64-bit case JS_64BIT will be true.

@@ +86,5 @@
>      return true;
> +# elif defined(JS_CODEGEN_MIPS32)
> +    return true;
> +# else
> +    return false;

Really the same question here as above.

@@ +552,5 @@
> +    while (!__atomic_compare_exchange(&spinlock, &zero, &one, true, __ATOMIC_SEQ_CST,
> +          __ATOMIC_SEQ_CST))
> +    {
> +        zero = 0;
> +        continue;

Surely the 'continue' is not necessary?

::: js/src/jit/mips-shared/LIR-mips-shared.h
@@ +435,5 @@
> +        return mir_->toWasmAtomicExchangeHeap();
> +    }
> +};
> +
> +class LWasmAtomicBinopI64 : public LInstructionHelper<INT64_PIECES, 1 + INT64_PIECES, 2>

The "2" implies there are some temps here, but I don't see any below.

::: js/src/jit/mips32/LIR-mips32.h
@@ +204,5 @@
> +    {
> +        setOperand(0, ptr);
> +        setInt64Operand(1, value);
> +        setTemp(0, tmp);
> +

Spurious blank line.

::: js/src/jit/mips32/MacroAssembler-mips32.cpp
@@ +2505,5 @@
> +    as_lbu(zero, addr, 7); // Force memory trap on invalid access before we enter the spinlock.
> +
> +    Label tryLock;
> +
> +    asMasm().memoryBarrier(MembarFull);

This should change in the same way that it has changed in the other back-ends: there should be a parameter that is of type js::jit::Synchronization that specifies how to sync, and here we should just pass that to memoryBarrierBefore() and memoryBarrierAfter() and let it be interpreted there.  That way we can represent various memory orders at higher levels of the compiler.

I'm also fairly sure you don't need a full barrier both before and after the spinlock acquisition but I have a vague memory that we talked about that before, so ignore that if we have.  But it does raise an issue of how the barrier we need here interacts with the barrier that should be emitted by memoryBarrierBefore() and memoryBarrierAfter().  If you just want to keep MembarFull here then that is always correct I think, and I'm OK with it for MIPS.

::: js/src/jit/mips64/MacroAssembler-mips64.cpp
@@ +2607,5 @@
> +
> +template <typename T>
> +void
> +MacroAssemblerMIPS64Compat::compareExchange64(const T& mem, Register64 expect, Register64 replace,
> +                                            Register64 output)

Indentation.

::: js/src/jit/mips64/MacroAssembler-mips64.h
@@ +1094,5 @@
>          MOZ_CRASH();
>      }
> +  protected:
> +        template<typename T>
> +    void atomicFetchOp64(AtomicOp op, Register64 value, const T& mem, Register64 temp,

Looks like an indentation problem.
Attachment #8936440 - Flags: review?(lhansen) → feedback+
Comment on attachment 8936441 [details] [diff] [review]
bug1420838_sim_mips32.patch

Review of attachment 8936441 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/mips32/Simulator-mips32.cpp
@@ +2027,5 @@
>  
> +static int64_t
> +MakeInt64(int32_t first, int32_t second)
> +{
> +    // Little-endian order.

Maybe you want to assert that, since I know there's somebody working on big-endian support.  Or is this something the ABI handles so that we don't have to worry about it here?
Attachment #8936441 - Flags: review?(lhansen) → review+
Comment on attachment 8936566 [details] [diff] [review]
bug1420838_sim_mips64.patch

Review of attachment 8936566 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/mips64/Simulator-mips64.cpp
@@ +1642,5 @@
> +    void* pc = (void*)get_pc();
> +    void* fp = (void*)getRegister(Register::fp);
> +
> +    JitActivation* activation = TlsContext.get()->activation()->asJit();
> +    const wasm::CodeSegment* segment = wasm::LookupCodeSegment(pc);

This code has changed in the ARM simulator, there's now a utility for this that you may be able to use.

@@ +1682,5 @@
> +    wasm::Instance* instance = wasm::LookupFaultingInstance(*segment, pc, fp);
> +    if (!instance || !instance->memoryAccessInGuardRegion((uint8_t*)addr, numBytes))
> +        return false;
> +
> +    LLBit_ = false;

Nice.

@@ +1754,5 @@
>  {
> +    if (handleWasmFault(addr, 2))
> +        return 0xffff;
> +
> +    if ((addr & 1) == 0 || wasm::InCompiledCode(reinterpret_cast<void *>(get_pc()))) {

It looks like unaligned accesses are always allowed?  Is that true also on hardware?  If so, why are there then special unaligned load/store paths in the JIT back-ends?

::: js/src/wasm/WasmBuiltins.cpp
@@ +577,5 @@
>        case SymbolicAddress::WaitI64:
>          *abiType = Args_Int_GeneralGeneralInt64Int64;
>          return FuncCast(Instance::wait_i64, *abiType);
>        case SymbolicAddress::Wake:
> +        *abiType = Args_General3;

Oh, nice.
Attachment #8936566 - Flags: feedback?(lhansen) → feedback+
Attached patch bug1420838_jit2.patch (obsolete) — Splinter Review
Sorry for the delay. This patch includes port to new atomic framework. Had to push JS atomic helpers down to platform specific macroassemblers  due to MIPS vs everybody else signature mismatch. Not sure if that is ok.
Attachment #8936440 - Attachment is obsolete: true
Attachment #8936665 - Attachment is obsolete: true
Attachment #8943278 - Flags: review?(lhansen)
(In reply to Lars T Hansen [:lth] from comment #24)
> Comment on attachment 8936440 [details] [diff] [review]
> bug1420838_jit.patch
> 
> Review of attachment 8936440 [details] [diff] [review]:
> -----------------------------------------------------------------
> ::: js/src/jit/mips-shared/LIR-mips-shared.h
> @@ +435,5 @@
> > +        return mir_->toWasmAtomicExchangeHeap();
> > +    }
> > +};
> > +
> > +class LWasmAtomicBinopI64 : public LInstructionHelper<INT64_PIECES, 1 + INT64_PIECES, 2>

I use "raw" setTemp interface to set thease temps while lowering. Is that ok ?
  
> ::: js/src/jit/mips32/MacroAssembler-mips32.cpp
> @@ +2505,5 @@
> > +    as_lbu(zero, addr, 7); // Force memory trap on invalid access before we enter the spinlock.
> > +
> > +    Label tryLock;
> > +
> > +    asMasm().memoryBarrier(MembarFull);
> 
> This should change in the same way that it has changed in the other
> back-ends: there should be a parameter that is of type
> js::jit::Synchronization that specifies how to sync, and here we should just
> pass that to memoryBarrierBefore() and memoryBarrierAfter() and let it be
> interpreted there.  That way we can represent various memory orders at
> higher levels of the compiler.
> 
> I'm also fairly sure you don't need a full barrier both before and after the
> spinlock acquisition but I have a vague memory that we talked about that
> before, so ignore that if we have.  But it does raise an issue of how the
> barrier we need here interacts with the barrier that should be emitted by
> memoryBarrierBefore() and memoryBarrierAfter().  If you just want to keep
> MembarFull here then that is always correct I think, and I'm OK with it for
> MIPS.

Those barriers are needed to be there irrespective of what js::jit::Synchronization 
specifies for that operation. They are for ordering lock_acquire -> lock_release.
They could be made lighter but that would make no difference especially now that I did 
the change to always emit a full sync barrier.
(In reply to Lars T Hansen [:lth] from comment #26)
> Comment on attachment 8936566 [details] [diff] [review]
> bug1420838_sim_mips64.patch
> 
> Review of attachment 8936566 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/jit/mips64/Simulator-mips64.cpp
> @@ +1642,5 @@
> > +    void* pc = (void*)get_pc();
> > +    void* fp = (void*)getRegister(Register::fp);
> > +
> > +    JitActivation* activation = TlsContext.get()->activation()->asJit();
> > +    const wasm::CodeSegment* segment = wasm::LookupCodeSegment(pc);
> 
> This code has changed in the ARM simulator, there's now a utility for this
> that you may be able to use.
If it is ok I'll leave that from these patches and post required changes on bug
1428453. 
> 
> @@ +1754,5 @@
> >  {
> > +    if (handleWasmFault(addr, 2))
> > +        return 0xffff;
> > +
> > +    if ((addr & 1) == 0 || wasm::InCompiledCode(reinterpret_cast<void *>(get_pc()))) {
> 
> It looks like unaligned accesses are always allowed?  Is that true also on
> hardware?  If so, why are there then special unaligned load/store paths in
> the JIT back-ends?

The hardware traps but that gets fixed by kernel. Separate unaligned load/store paths
are there to avoid context switch in case we get a hint that memory address will be unaligned.
Attached patch bug1420838_jit3.patch (obsolete) — Splinter Review
Sorry, forgot to include nits from your AtomicOperations feedback.
Attachment #8943278 - Attachment is obsolete: true
Attachment #8943278 - Flags: review?(lhansen)
Attachment #8943295 - Flags: review?(lhansen)
Attached patch bug1420838_jit4.patch (obsolete) — Splinter Review
More nits fixed.
Attachment #8943295 - Attachment is obsolete: true
Attachment #8943295 - Flags: review?(lhansen)
Attachment #8943932 - Flags: review?(lhansen)
Looking at this, but it will be a couple more days - I am trying to find a way to avoid duplicating all the code that was previously in MacroAssembler.cpp and is now in all platform files except mips.  It's not the first time that kind of problem comes up.
nbp, waldo: If you look at the last patch here there's a situation that I think I've seen before: there is code that is shared among most of the platforms but one platform is different, and the result is that we end up duplicating the common code across per-platform files.  In the case here, code that was in MacroAssembler.cpp is moved into all the per-platform MacroAssemblers.

I don't think we have a completely standard way to avoid this - normally we'd just ifdef the code in MacroAssembler.cpp but check_macroassembler_style will hopefully not allow that - but it basically creates a maintenance burden.  

I suppose we could have shim code in all the platform macroassemblers that call into common code that is under an ifdef, and we could store that common code in a file in jit/shared/ maybe?  A narrower example is that lowerCompareExchangeTypedArrayElement() currently is in x86-shared but is called from both x86 and x64 specific back-ends.  Same principle.

Suggestions?  It's not a massive amount of code so we can just go ahead with the duplication but it feels wrong.
Flags: needinfo?(nicolas.b.pierron)
Flags: needinfo?(jwalden+bmo)
The issue here is something which I have been thinking about for some time, i.e. :

 ** Removing all temporary registers from all MacroAssembler functions **

Currently many of our assembler instruction are using a scratch register, or even pushing values on the stack because our API does not give us the ability to give more unused/scratch registers.

If our MacroAssembler was aware of the set of allocated/allocatable registers, it could use a simple linear register allocator to temporarily spill some register on the stack to free more.  Thus removing the need for extra temporary register as arguments.
Flags: needinfo?(nicolas.b.pierron)
(In reply to Nicolas B. Pierron [:nbp] from comment #34)

> If our MacroAssembler was aware of the set of allocated/allocatable
> registers, it could use a simple linear register allocator to temporarily
> spill some register on the stack to free more.  Thus removing the need for
> extra temporary register as arguments.

Moved to bug 1434368, since it's a longer discussion.
Comment on attachment 8943932 [details] [diff] [review]
bug1420838_jit4.patch

Review of attachment 8943932 [details] [diff] [review]:
-----------------------------------------------------------------

Basically this is fine except for the code duplication resulting from all the extra temps in the macroassembler interface.  Discussion about that is ongoing on bug 1434368.  I know you've waited a very long time to land this already, but if you can stand waiting just a little longer we can try to find a solution that works better than duplicating the code.  If the other discusion stalls I'll just accept this patch and live with the duplication.

::: js/src/jit/MacroAssembler.h
@@ +1538,5 @@
>      //
>      // The 8-bit and 16-bit operations zero-extend or sign-extend the result to
> +    // 32 bits, according to `type`. On MIPS64, the upper 32 bits of the result
> +    // will be sign extended from the lower 32 bits. On other 64-bit systems the
> +    // upper 32 bits will be zero.

After thinking about this, I *believe* this is OK.  But let's change the wording a little bit to say that "On 64-bit systems, the upper 32 bits of the result will be zero on some platforms (eg, on x64) and will be the sign extension of the lower bits on other platforms (eg, MIPS)."  That way there will be no inclination to start guessing about it.

@@ +1667,4 @@
>  
>      void atomicLoad64(const Synchronization& sync, const Address& mem, Register64 temp,
>                        Register64 output)
> +        DEFINED_ON(arm, x86, mips32);

Keep this list alphabetical here and elsewhere, if you can.  (I know, in many cases they are not, but let's try.)

@@ +1681,4 @@
>  
>      void compareExchange64(const Synchronization& sync, const Address& mem, Register64 expected,
>                             Register64 replacement, Register64 output)
> +        DEFINED_ON(arm, arm64, x64, x86, mips32, mips64);

Rewrite as PER_ARCH here and several cases below.

::: js/src/jit/shared/LIR-shared.h
@@ +6433,5 @@
>          setOperand(2, value);
>          setTemp(0, temp1);
>          setTemp(1, temp2);
>      }
>      LAtomicTypedArrayElementBinop(const LAllocation& elements, const LAllocation& index,

You should probably add a "// MIPS32, MIPS64" comment here and a "// ARM, ARM64, x86, x64" comment to the other constructor, and similar for many others in this file, so that there's no confusion.  In fact, if it's possible to use ifdefs maybe we should, even if those are not common in this file.

::: js/src/jit/x86-shared/MacroAssembler-x86-shared.cpp
@@ +1234,5 @@
>  }
>  
>  void
>  MacroAssembler::atomicEffectOpJS(Scalar::Type arrayType, const Synchronization& sync, AtomicOp op,
> +                           Register value, const BaseIndex& mem, Register temp)

Indentation broken.

@@ +1241,5 @@
> +}
> +
> +void
> +MacroAssembler::atomicEffectOpJS(Scalar::Type arrayType, const Synchronization& sync, AtomicOp op,
> +                           Register value, const Address& mem, Register temp)

Indentation broken.
It's a little messy finding exactly where the duplication is here, IMO -- at least going from comment-reading plus smallish stabs into a 200K+ patch.

But assuming I'm reading correctly, could shared code be put into some sort of js/src/jit/*-shared{.h} files, then the particular architectures that need them can #include and use them to define things?  That doesn't seem absolutely wrong to me.  (Albeit a little odd, maybe, to have function templates defined in the .h, but only specialized in platform-specific files.  So it goes.)
Flags: needinfo?(jwalden+bmo)
Er, instantiated (or instantiations declared), I mean.
Comment on attachment 8943932 [details] [diff] [review]
bug1420838_jit4.patch

Let's just ship this and worry about cleaning it up later.  I think my proposal on the other bug is good; collect temps in a structure so that we don't duplicate the interface.  For now, duplicating the interface is how it's done.  And the code that's duplicated in all the files for the JS atomics is not very large, so we can probably live with that for now.  Again, it can be tidied up with a better abstraction later.
Attachment #8943932 - Flags: review?(lhansen) → review+
Attached patch bug1420838_jit5.patch (obsolete) — Splinter Review
Attachment #8943932 - Attachment is obsolete: true
Attachment #8947758 - Flags: review+
Attachment #8936566 - Attachment is obsolete: true
Attachment #8947759 - Flags: review+
Attached patch bug1420838_sim_mips32_2.patch (obsolete) — Splinter Review
Attachment #8947760 - Flags: review+
Dragan Mladjenovic could you please mark the obsolete attachments as such? Also we are encountering issues when landing the patches to inbound.
Flags: needinfo?(dragan.mladjenovic)
Keywords: checkin-needed
Attachment #8932421 - Attachment is obsolete: true
Attachment #8936441 - Attachment is obsolete: true
Attachment #8947760 - Attachment is obsolete: true
Flags: needinfo?(dragan.mladjenovic)
Attachment #8947774 - Flags: review+
(In reply to Arthur Iakab [arthur_iakab] from comment #43)
> Dragan Mladjenovic could you please mark the obsolete attachments as such?
> Also we are encountering issues when landing the patches to inbound.

Sorry,

Can you tray again?
Keywords: checkin-needed
Pushed by archaeopteryx@coole-files.de:
https://hg.mozilla.org/integration/mozilla-inbound/rev/663444bb705a
[MIPS] Add 64-bit atomics JIT support; r=lth
https://hg.mozilla.org/integration/mozilla-inbound/rev/8ca4961a705c
[MIPS64] Add atomic support into simulator; r=lth
https://hg.mozilla.org/integration/mozilla-inbound/rev/490e4771b990
[MIPS32] Add atomic.wait callout support into simulator; r=lth
Keywords: checkin-needed
Backed out for bustage: check_macroassembler_style.py:

https://hg.mozilla.org/integration/mozilla-inbound/rev/6de176d6d364ebb474447b8af1dbe3dcb85de830

Push with bustage: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=997c15a7ba086970bb77af8dcf0b214cf3db409b&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=runnable
Build log: https://treeherder.mozilla.org/logviewer.html#?job_id=160164832&repo=mozilla-inbound

[task 2018-02-02T22:55:23.356Z] 22:55:23     INFO -      is defined in x86/MacroAssembler-x86.cpp
[task 2018-02-02T22:55:23.356Z] 22:55:23     INFO -  void setupWasmABICall();
[task 2018-02-02T22:55:23.356Z] 22:55:23     INFO -      is defined in MacroAssembler.cpp
[task 2018-02-02T22:55:23.357Z] 22:55:23     INFO - +void spectreMaskIndex(Register, Register, Register);
[task 2018-02-02T22:55:23.357Z] 22:55:23     INFO - +    is defined in MacroAssembler.cpp
[task 2018-02-02T22:55:23.357Z] 22:55:23     INFO - +void spectreMaskIndex(Register, const Address&, Register);
[task 2018-02-02T22:55:23.357Z] 22:55:23     INFO - +    is defined in MacroAssembler.cpp
[task 2018-02-02T22:55:23.357Z] 22:55:23     INFO - +void spectreMaskIndex(int32_t, Register, Register);
[task 2018-02-02T22:55:23.357Z] 22:55:23     INFO - +    is defined in MacroAssembler.cpp
[task 2018-02-02T22:55:23.357Z] 22:55:23     INFO - +void spectreMaskIndex(int32_t, const Address&, Register);
[task 2018-02-02T22:55:23.358Z] 22:55:23     INFO - +    is defined in MacroAssembler.cpp
[task 2018-02-02T22:55:23.358Z] 22:55:23     INFO -  void storeRegsInMask(LiveRegisterSet, Address, Register) PER_SHARED_ARCH;
[task 2018-02-02T22:55:23.358Z] 22:55:23     INFO -      is defined in arm/MacroAssembler-arm.cpp
[task 2018-02-02T22:55:23.358Z] 22:55:23     INFO -      is defined in arm64/MacroAssembler-arm64.cpp
[task 2018-02-02T22:55:23.358Z] 22:55:23  WARNING - TEST-UNEXPECTED-FAIL | check_macroassembler_style.py | actual output does not match expected output;  diff is above
[task 2018-02-02T22:55:23.358Z] 22:55:23     INFO - Makefile:64: recipe for target 'check-masm' failed
[task 2018-02-02T22:55:23.359Z] 22:55:23     INFO - make[2]: *** [check-masm] Error 1
Flags: needinfo?(dragan.mladjenovic)
Attached patch bug1420838_jit6.patch (obsolete) — Splinter Review
Attachment #8947758 - Attachment is obsolete: true
Flags: needinfo?(dragan.mladjenovic)
Attachment #8948336 - Flags: review+
Keywords: checkin-needed
Sorry for pushing (no pun intended). In previous rebase I erroneously removed check_macroassembler_style marker causing the make check failures. This issue is fixed now. Can these patches be pushed now?
Applying jit patch6 first fails:
applying bug1420838_jit6.patch
patching file js/src/jit/mips-shared/MacroAssembler-mips-shared.cpp
Hunk #2 FAILED at 1634
1 out of 2 hunks FAILED -- saving rejects to file js/src/jit/mips-shared/MacroAssembler-mips-shared.cpp.rej

Please have a look. Thank you
Flags: needinfo?(dragan.mladjenovic)
Keywords: checkin-needed
Attachment #8948336 - Attachment is obsolete: true
Attachment #8948669 - Flags: review+
(In reply to Sebastian Hengst [:aryx][:archaeopteryx] (needinfo on intermittent or backout) from comment #50)
> Applying jit patch6 first fails:
> applying bug1420838_jit6.patch
> patching file js/src/jit/mips-shared/MacroAssembler-mips-shared.cpp
> Hunk #2 FAILED at 1634
> 1 out of 2 hunks FAILED -- saving rejects to file
> js/src/jit/mips-shared/MacroAssembler-mips-shared.cpp.rej
> 
> Please have a look. Thank you

Sorry. I've done a fresh rebase now. Hopefully it should work.
Flags: needinfo?(dragan.mladjenovic)
Keywords: checkin-needed
Pushed by archaeopteryx@coole-files.de:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5364873c063c
[MIPS] Add 64-bit atomics JIT support; r=lth
https://hg.mozilla.org/integration/mozilla-inbound/rev/0d5297d75f8e
[MIPS64] Add atomic support into simulator; r=lth
https://hg.mozilla.org/integration/mozilla-inbound/rev/172f8f653720
[MIPS32] Add atomic.wait callout support into simulator; r=lth
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/5364873c063c
https://hg.mozilla.org/mozilla-central/rev/0d5297d75f8e
https://hg.mozilla.org/mozilla-central/rev/172f8f653720
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
Depends on: 1492897
Priority: P3 → P5
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: