Closed Bug 1394420 Opened 7 years ago Closed 6 years ago

Provide reliable implementation of jit/AtomicOperations for tier-1 platforms

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla66

Tracking Flags:

Tracking

Status

firefox66

---

fixed

People

(Reporter: lth, Assigned: lth)

References

(Blocks 1 open bug)

Details

Attachments

(2 files, 8 obsolete files)

bug1394420-jit-generate-atomics.patch 6 years ago Lars T Hansen [:lth] 31.08 KB, patch		Details \| Diff \| Splinter Review
bug1394420-jit-generate-atomics-v2.patch 6 years ago Lars T Hansen [:lth] 52.21 KB, patch		Details \| Diff \| Splinter Review
bug1394420-jit-generate-atomics-v3.patch 6 years ago Lars T Hansen [:lth] 65.11 KB, patch		Details \| Diff \| Splinter Review
arm64-fixes.patch 6 years ago Lars T Hansen [:lth] 2.52 KB, patch		Details \| Diff \| Splinter Review
bug1394420-jit-generate-atomics-v4.patch 6 years ago Lars T Hansen [:lth] 66.15 KB, patch		Details \| Diff \| Splinter Review
bug1394420-arm-volatile-hacks.patch 6 years ago Lars T Hansen [:lth] 3.20 KB, patch		Details \| Diff \| Splinter Review
bug1394420-jit-generate-atomics-v5.patch 6 years ago Lars T Hansen [:lth] 84.07 KB, patch		Details \| Diff \| Splinter Review
bug1394420-jit-generate-atomics-v6.patch 6 years ago Lars T Hansen [:lth] 87.72 KB, patch		Details \| Diff \| Splinter Review
bug1394420-jit-generate-atomics-v7.patch 6 years ago Lars T Hansen [:lth] 90.75 KB, patch	nbp : review+ froydnj : review+	Details \| Diff \| Splinter Review
bug1394420-cleanup-atomics.patch 6 years ago Lars T Hansen [:lth] 63.25 KB, patch	froydnj : review+	Details \| Diff \| Splinter Review

Lars T Hansen [:lth]

Assignee

Description

•

7 years ago

Our current implementations of jit/AtomicOperations.h use compiler intrinsics, such as _InterlockedCompareExchange for MSVC and __atomic and __sync for GCC and clang. The MSVC implementations *might* be OK because they use intrinsics to prevent reordering in the compiler and by extension might tell the compiler not to assume anything about UB (who knows?). The GCC/Clang implementations are known not to be OK, because those intrinsics are documented as assuming race-free programs and the whole point of our primitives is to support racy programs. It's even known (believed?) that if a program is race free, TSO lets the C++ compiler engage in some reordering we want to avoid (by not emitting a post-fence for seq_cst stores), and we must fix this. Basically, the primitives have to be rewritten in in-line assembler or code has to be generated by the JIT for them at startup for our tier-1 platforms. They are not (should not be...) very performance-sensitive and the cost of a call for each of them is probably just fine. We should not worry about tier-3 platforms, notably MIPS, or ambiguous-tier platforms, such as ARM64, at this time.

Lars T Hansen [:lth]

Assignee

Updated

•

7 years ago

Priority: -- → P2

Lars T Hansen [:lth]

Assignee

Comment 1

•

7 years ago

I have some initial sketches for how to jit-compile these primitives for tier-1 platforms, so taking.

Assignee: nobody → lhansen

Status: NEW → ASSIGNED

Lars T Hansen [:lth]

Assignee

Updated

•

7 years ago

Blocks: 1409713

Lars T Hansen [:lth]

Assignee

Comment 2

•

6 years ago

Attached patch bug1394420-jit-generate-atomics.patch (obsolete) — Details — Splinter Review

This is a sketch of the high level, not jitting yet. Although this is now phrased in terms of x86-shared, it is *probably* the case that we can share most code here when we're done, with very modest ifdeffery / platform-specificity to handle corner cases, most notably x86 (everything turns into cmpxchg8b) and maybe mips32 (no 64-bit atomics at all). In the worst case we'll share everything among other platforms and fork something for these two. I haven't dealt with memcpy and memmove; in principle they are easy but we can't afford performance to be awful so ... we'll see.

Lars T Hansen [:lth]

Assignee

Comment 3

•

6 years ago

Attached patch bug1394420-jit-generate-atomics-v2.patch (obsolete) — Details — Splinter Review

Generates code on x86 and x64 and has a clear path to generalization for other tier-1 platforms, and is independent of the compiler we're using.

Attachment #9002887 - Attachment is obsolete: true

Lars T Hansen [:lth]

Assignee

Comment 4

•

6 years ago

Attached patch bug1394420-jit-generate-atomics-v3.patch (obsolete) — Details — Splinter Review

Supports all tier-1 platforms. Not tested yet on ARM and ARM64 as I need to do that on hardware; jitted code is not compatible with simulators.

Attachment #9003186 - Attachment is obsolete: true

Lars T Hansen [:lth]

Assignee

Comment 5

•

6 years ago

Attached patch arm64-fixes.patch (obsolete) — Details — Splinter Review

Additional fixes for arm64. Unfortunately arm32 fixes must wait until my dev system is in a usable state. (FF build requirements changes -> Ubuntu 14 no longer sufficient -> Upgrade to Ubuntu 16 -> "Welcome to Emergency Mode!" -> now in systemd dependency hell.)

Lars T Hansen [:lth]

Assignee

Comment 6

•

6 years ago

Attached patch bug1394420-jit-generate-atomics-v4.patch (obsolete) — Details — Splinter Review

Now with ARM64 and not-very-well-tested-but-seems-to-be-working ARM support.

Attachment #9003460 - Attachment is obsolete: true

Attachment #9004203 - Attachment is obsolete: true

Lars T Hansen [:lth]

Assignee

Comment 7

•

6 years ago

Attached patch bug1394420-arm-volatile-hacks.patch (obsolete) — Details — Splinter Review

A totally gross hack to make things link on ARM because ARM code uses volatile in some places in the TypedArray code to work around an old compiler bug. Ideally we should check that code to see if we can stop using volatile. If that's not possible we need something better than this mess here.

Lars T Hansen [:lth]

Assignee

Comment 8

•

6 years ago

My ARM system is still in a bad state and I've not been able to test the ARM code properly. (It's in a different bad state today. I have something that resembles Ubuntu 16 yet strangely boots an old kernel that is probably sitting on a secret flash drive somewhere. Additionally the system is hard-float and so test cases don't generally pass at all, even without my changes. We've known in the past that this is a problem but I can't find the bug for it.)

Lars T Hansen [:lth]

Assignee

Comment 9

•

6 years ago

OK, so, I can cross-compile from x64 for ARM and run on my dev board, so some progress is to be expected here. Basically two things remaining apart from debugging, quoting myself from above: "I haven't dealt with memcpy and memmove; in principle they are easy but we can't afford performance to be awful so ... we'll see." "A totally gross hack to make things link on ARM because ARM code uses volatile in some places in the TypedArray code to work around an old compiler bug. Ideally we should check that code to see if we can stop using volatile. If that's not possible we need something better than this mess here."

Lars T Hansen [:lth]

Assignee

Updated

•

6 years ago

Depends on: 1495731

Lars T Hansen [:lth]

Assignee

Updated

•

6 years ago

Attachment #9004851 - Attachment is obsolete: true

Lars T Hansen [:lth]

Assignee

Comment 10

•

6 years ago

Attached patch bug1394420-jit-generate-atomics-v5.patch (obsolete) — Details — Splinter Review

Apart from some minor cleanup and one more round of testing this is believed to be complete. A follow-up patch will remove a lot of C++ code that now supports tier-1 platforms, leaving us with jitted atomics for tier-1 (and probably mips) and a feeling-lucky solution for everyone else. This is pretty much the best we can do, and has the smallest maintenance burden we can hope for. A follow-up to the follow-up might consider whether we want to jit the atomics even on the simulators, to simplify testing, but I sort of doubt that it's worth it.

Attachment #9004850 - Attachment is obsolete: true

Lars T Hansen [:lth]

Assignee

Comment 11

•

6 years ago

Attached patch bug1394420-jit-generate-atomics-v6.patch (obsolete) — Details — Splinter Review

Passes testing on x64 and x86 and has bug fixes and extensive commit comments.

Attachment #9015541 - Attachment is obsolete: true

Lars T Hansen [:lth]

Assignee

Updated

•

6 years ago

Depends on: 1497778

Lars T Hansen [:lth]

Assignee

Comment 12

•

6 years ago

Attached patch bug1394420-jit-generate-atomics-v7.patch — Details — Splinter Review

See commit msg for most information. Also be aware of the following patch, which cleans up a lot of the mess in the Atomics layer. Review guidance / requests: - nbp, check the JIT bits please - waldo, please check overall fit with the engine and note performance remarks in the commit message - froydnj, verify that this makes sense to you please

Attachment #9015623 - Attachment is obsolete: true

Attachment #9016313 - Flags: review?(nicolas.b.pierron)

Attachment #9016313 - Flags: review?(nfroyd)

Attachment #9016313 - Flags: review?(jwalden+bmo)

Lars T Hansen [:lth]

Assignee

Comment 13

•

6 years ago

Attached patch bug1394420-cleanup-atomics.patch — Details — Splinter Review

Consolidates many of the files in the atomics layer and prevents most future expansion of it. Largely, this patch moves or removes files. I have tried hard to carry over all the platform-specific bits from the files that are removed, even if that may strictly be redundant, as these files are fallbacks.

Attachment #9016314 - Flags: review?(nfroyd)

Nathan Froyd [:froydnj]

Comment 14

•

6 years ago

Comment on attachment 9016313 [details] [diff] [review] bug1394420-jit-generate-atomics-v7.patch Review of attachment 9016313 [details] [diff] [review]: ----------------------------------------------------------------- I did not look super-closely at the generators, but from a superficial look, they looked fine. ::: js/src/jit/shared/AtomicOperations-shared-jit.cpp @@ +295,5 @@ > +#endif > + default: > + MOZ_CRASH("Unknown size"); > + } > + masm.memoryBarrier(sync.barrierAfter); I think the semantics imposed by unconditionally generating barriers is stronger than we need on e.g. x86. So this approach is not currently as performant as we could be? @@ +302,5 @@ > + return start; > +} > + > +static uint32_t > +GenStore(MacroAssembler& masm, Scalar::Type size, Synchronization sync) `sync` is apparently unused in this function, which seems suspicious. @@ +383,5 @@ > +#endif > + default: > + MOZ_CRASH("Unknown size"); > + } > + offset += direction == CopyDir::DOWN ? 1 : -1; Pull the check out of the loop, perhaps? @@ +385,5 @@ > + MOZ_CRASH("Unknown size"); > + } > + offset += direction == CopyDir::DOWN ? 1 : -1; > + } > + masm.memoryBarrier(sync.barrierAfter); We only need barriers before and after all the instructions, not in between? (I'm not sure what the semantics are here, so asking mostly for my own edification? @@ +750,5 @@ > + uint32_t nop = GenNop(masm); > +#endif > + > + uint32_t load8SeqCst = GenLoad(masm, SIZE8, Synchronization::Full()); > + uint32_t load16SeqCst = GenLoad(masm, SIZE16, Synchronization::Full()); Is it worth having local synchronization objects that can be passed in here, rather than continuously repeating Sychronization::FOO()? @@ +875,5 @@ > + AtomicCompilerFence = (void(*)())(code + nop); > +#endif > + > + AtomicLoad8SeqCst = (uint8_t(*)(const uint8_t* addr))(code + load8SeqCst); > + AtomicLoad16SeqCst = (uint16_t(*)(const uint16_t* addr))(code + load16SeqCst); Drive-by comments, but this is not code I own, so feel free to ignore! 1. I think the use of variable names in the function type clutters things up. 2. I think all of these would be better served with a template like: template<typename R, typename... Args> R(*)(Args...) Frob(uint8_t* ptr, uint32_t offset) { return (R(*))(Args...)(code + offset); } if that's even valid syntax. Then these turn into things like: AtomicLoad8SeqCst = Frob<uint8_t(const uint8_t*)>(code, load8SeqCst); which seems a little more readable? I guess the main shortening there comes from taking out the variable name, so maybe it's not worth much.

Attachment #9016313 - Flags: review?(nfroyd) → review+

Nicolas B. Pierron [:nbp]

Comment 15

•

6 years ago

Comment on attachment 9016313 [details] [diff] [review] bug1394420-jit-generate-atomics-v7.patch Review of attachment 9016313 [details] [diff] [review]: ----------------------------------------------------------------- r=me with AtomicOperations-shared-jit.h changes. ::: js/src/jit/shared/AtomicOperations-shared-jit.cpp @@ +46,5 @@ > + > +// Selected registers match the argument registers exactly, and none of them > +// overlap the result register. > + > +static const LiveRegisterSet NonVolatileRegs; nit: Prefix these with their namespaces: ::js::jit::NonVolatileRegs as there is already a NonVolatileRegs in wasm namespace. @@ +167,5 @@ > + ABIArg arg = iter->abi.next(t); > + switch (arg.kind()) { > + case ABIArg::GPR: { > + if (arg.gpr() != reg) { > + masm.movePtr(arg.gpr(), reg); While I agree that with the predefined registers enumerated above these functions are going to behave correctly given the current set of arguments. However, I suspect that this might become a unnoticed pit-fall for the next person adding a new function here, which is erasing the argument value with an already loaded argument. Instead of having GenGprArg, I would recommend to add "getABIArg" to the MacroAssembler, using the MoveResolver like done in passABIArg. Then Add a function to resolve all the Move instructions [1]. [1] https://searchfox.org/mozilla-central/source/js/src/jit/x64/MacroAssembler-x64.cpp#342-352 @@ +875,5 @@ > + AtomicCompilerFence = (void(*)())(code + nop); > +#endif > + > + AtomicLoad8SeqCst = (uint8_t(*)(const uint8_t* addr))(code + load8SeqCst); > + AtomicLoad16SeqCst = (uint16_t(*)(const uint16_t* addr))(code + load16SeqCst); if-you-want-nit: I wish that the C++ type system we could properly assert the function types against to the generate code. Therefore I will suggest to change the offset to: - A structure to hold a typed offset: > template <T> > struct TypedOffset { > const uint32_t start; > TypedOffset(uint32_t start) : start_(start) {} > }; - Returned by each function: > template <typename V> > static TypedOffset<V(*)(const V*)> > GenLoad(MacroAssembler& masm, Synchronization sync) > { > Scalar::Type size = Scalar::FromType<V>(); > ArgIterator iter; > uint32_t start = GenPrologue(masm, &iter); > GenGprArg(masm, MIRType::Pointer, &iter, AtomicPtrReg); > > // … > } - A structure to cast TypedOffset into its typed pointer: > template <typename T> > SetFromTypedOffset(T& val, uint8_t* code, TypedOffset<T> offset) { > val = T(code + offset.start); > } - And used to set in the correct variable: > auto load8SeqCst = GenLoad<uint8_t>(masm, Synchronization::Full()); > > […] > > SetFromTypedOffset(AtomicLoad8SeqCst, code, load8SeqCst); ::: js/src/jit/shared/AtomicOperations-shared-jit.h @@ +210,5 @@ > + T oldval = *addr; /* good initial approximation */ \ > + for (;;) { \ > + T nextval = (T)AtomicCmpXchg64SeqCst((uint64_t*)addr, \ > + (uint64_t)val, \ > + (uint64_t)oldval); \ Swap oldval and val in this function call. @@ +338,5 @@ > + T oldval = *addr; /* Good initial approximation */ \ > + for (;;) { \ > + T nextval = (T)AtomicCmpXchg64SeqCst((uint64_t*)addr, \ > + (uint64_t)(oldval + val), \ > + (uint64_t)oldval); \ Swap these 2 lines in the function call: (oldval + val) is the new value. @@ +408,5 @@ > + T oldval = *addr; \ > + for (;;) { \ > + T nextval = (T)AtomicCmpXchg64SeqCst((uint64_t*)addr, \ > + (uint64_t)(oldval OP val), \ > + (uint64_t)oldval); \ Swap these 2 lines in the function call to AtomicCmpXchg64SeqCst.

Attachment #9016313 - Flags: review?(nicolas.b.pierron) → review+

Nathan Froyd [:froydnj]

Updated

•

6 years ago

Attachment #9016314 - Flags: review?(nfroyd) → review+

Lars T Hansen [:lth]

Assignee

Updated

•

6 years ago

Attachment #9016313 - Flags: review?(jwalden+bmo)

Lars T Hansen [:lth]

Assignee

Comment 16

•

6 years ago

(In reply to Nathan Froyd [:froydnj] from comment #14)

Comment on attachment 9016313 [details] [diff] [review]
bug1394420-jit-generate-atomics-v7.patch

Review of attachment 9016313 [details] [diff] [review]:

I did not look super-closely at the generators, but from a superficial look,
they looked fine.

::: js/src/jit/shared/AtomicOperations-shared-jit.cpp
@@ +295,5 @@
+#endif
 default:
   MOZ_CRASH("Unknown size");
}

masm.memoryBarrier(sync.barrierAfter);
I think the semantics imposed by unconditionally generating barriers is
stronger than we need on e.g. x86. So this approach is not currently as
performant as we could be?

It doesn't do that. The sync object has a number of bitvectors specifying the barriers that are actually required for the context of the Synchronization. So if the Synchronization comes from an unordered access, barrierAfter will be zero, and no fences will be emitted. If the Synchronization comes from Synchronization::Load(), no code will be emitted on x86, due to the definition of memoryBarrier(). Only for Synchronization::Store and Synchronization::Full will there be a bit here that forces a fence emitted, as desired.

@@ +302,5 @@

return start;
+}

+static uint32_t
+GenStore(MacroAssembler& masm, Scalar::Type size, Synchronization sync)

sync is apparently unused in this function, which seems suspicious.

Nice catch.

@@ +385,5 @@
       MOZ_CRASH("Unknown size");
   }
   offset += direction == CopyDir::DOWN ? 1 : -1;
}

masm.memoryBarrier(sync.barrierAfter);
We only need barriers before and after all the instructions, not in between?
(I'm not sure what the semantics are here, so asking mostly for my own
edification?

A good question. Right now all the callers of GenCopy() pass Synchronization::None(), so it's a moot point. For WebAssembly I expect the spec will say that the copy will have a barrier at the end. I'll remove the sync for now.

@@ +750,5 @@

uint32_t nop = GenNop(masm);
+#endif

uint32_t load8SeqCst = GenLoad(masm, SIZE8, Synchronization::Full());

uint32_t load16SeqCst = GenLoad(masm, SIZE16, Synchronization::Full());

Is it worth having local synchronization objects that can be passed in here,
rather than continuously repeating Sychronization::FOO()?

Well... matter of taste? Let me think about it.

@@ +875,5 @@

AtomicCompilerFence = (void(*)())(code + nop);
+#endif

AtomicLoad8SeqCst = (uint8_t()(const uint8_t addr))(code + load8SeqCst);

AtomicLoad16SeqCst = (uint16_t()(const uint16_t addr))(code + load16SeqCst);

Drive-by comments, but this is not code I own, so feel free to ignore!

I think the use of variable names in the function type clutters things up.

Fair point, though they're a useful mnemonic device.

I think all of these would be better served with a template like:

template<typename R, typename... Args>
R()(Args...)
Frob(uint8_t ptr, uint32_t offset)
{
return (R(*))(Args...)(code + offset);
}

if that's even valid syntax. Then these turn into things like:

AtomicLoad8SeqCst = Frob<uint8_t(const uint8_t*)>(code, load8SeqCst);

which seems a little more readable? I guess the main shortening there comes
from taking out the variable name, so maybe it's not worth much.

Hm, yes. I guess the code's been written and is unlikely to be rewritten, so I may give this one a pass. Removing the variable names in the casts would help by itself, will give that a try.

Lars T Hansen [:lth]

Assignee

Comment 17

•

6 years ago

(In reply to Nicolas B. Pierron [:nbp] from comment #15)

Comment on attachment 9016313 [details] [diff] [review]
bug1394420-jit-generate-atomics-v7.patch

Review of attachment 9016313 [details] [diff] [review]:

r=me with AtomicOperations-shared-jit.h changes.

::: js/src/jit/shared/AtomicOperations-shared-jit.cpp
@@ +46,5 @@

+// Selected registers match the argument registers exactly, and none of them
+// overlap the result register.
+
+static const LiveRegisterSet NonVolatileRegs;

nit: Prefix these with their namespaces: ::js::jit::NonVolatileRegs as there
is already a NonVolatileRegs in wasm namespace.

Renamed as AtomicNonVolatileRegs.

@@ +167,5 @@

> > +    ABIArg arg = iter->abi.next(t);
> > +    switch (arg.kind()) {
> > +      case ABIArg::GPR: {
> > +        if (arg.gpr() != reg) {
> > +            masm.movePtr(arg.gpr(), reg);

While I agree that with the predefined registers enumerated above these
functions are going to behave correctly given the current set of arguments.
However, I suspect that this might become a unnoticed pit-fall for the next
person adding a new function here, which is erasing the argument value with
an already loaded argument.

I agree this is a little brittle.

Instead of having GenGprArg, I would recommend to add "getABIArg" to the
MacroAssembler, using the MoveResolver like done in passABIArg. Then Add a
function to resolve all the Move instructions [1].

[1] https://searchfox.org/mozilla-central/source/js/src/jit/x64/MacroAssembler- x64.cpp#342-352

Given the very fixed functionality of this code and how long this patch has been languishing, I won't do so now, but I agree with this in principle. ISTR there were in fact some bugs during development resulting from the unchecked register constraints.

For now, I will:

beef up the comments a little to make it even clearer that these are hard constraints
add a pointer to the language you use above so that if the complexity of this code grows much (eg, MIPS will tend to need more registers) we will reconsider.

> ::: js/src/jit/shared/AtomicOperations-shared-jit.h
> @@ +210,5 @@
> > +        T oldval = *addr; /* good initial approximation */            \
> > +        for (;;) {                                                    \
> > +            T nextval = (T)AtomicCmpXchg64SeqCst((uint64_t*)addr,     \
> > +                                                   (uint64_t)val,     \
> > +                                                   (uint64_t)oldval); \
> 
> Swap oldval and val in this function call.

Nice catch, this was a cut-and-paste from MSVC code, which has reversed the arguments for this function according to what's normal.

Ditto the next two.

Pulsebot

Comment 18

•

6 years ago

Pushed by lhansen@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/2f5be1913934 jit-generate atomic ops to be called from c++. r=nbp, r=froydnj https://hg.mozilla.org/integration/mozilla-inbound/rev/b2ffeeac7326 Consolidate feeling-lucky atomics. r=froydnj

Lars T Hansen [:lth]

Assignee

Updated

•

6 years ago

Comment 19

•

6 years ago

Backed out 2 changesets (bug 1394420) for failing testAtomicOperations.cpp, ESling and jit failures

push that caused the backout: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&resultStatus=testfailed%2Cbusted%2Cexception&classifiedState=unclassified&revision=b2ffeeac7326d673ff09474ce3007e84687c5a62

failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&resultStatus=testfailed%2Cbusted%2Cexception&classifiedState=unclassified&selectedJob=223016953&searchStr=android%2C4.3%2Capi16%2B%2Copt%2Ctest-android-em-4.3-arm7-api-16%2Fopt-cppunit%2C%28cpp%29

https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&resultStatus=testfailed%2Cbusted%2Cexception&classifiedState=unclassified&selectedJob=223016950&searchStr=android%2C8.0%2Cpixel2%2Copt%2Ctest-android-hw-p2-8-0-arm7-api-16%2Fopt-jittest-9%2C%28jit9%29

backout: https://hg.mozilla.org/integration/mozilla-inbound/rev/c311812c530c03588f7ac2fac4fd9f6c38103529

Flags: needinfo?(lhansen)

Lars T Hansen [:lth]

Assignee

Comment 20

•

6 years ago

•

Edited

The problem here is that while ReturnReg (as defined in the assembler) is identical to the C/C++ return register on x86, ReturnReg64 is not, at least on Linux. So more register definitions are needed. There is a risk that some of this is platform-dependent of course. Investigating further.

Specifically, the C/C++ 64-bit return register is edx:eax, while our ReturnReg64 is edi:eax.

Flags: needinfo?(lhansen)

Lars T Hansen [:lth]

Assignee

Comment 21

•

6 years ago

A better link: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=b2ffeeac7326d673ff09474ce3007e84687c5a62. The apparent ARM failures are emulator failures, thus the same bug. Thus it looks like only x86 is affected.

Lars T Hansen [:lth]

Assignee

Comment 22

•

6 years ago

•

Edited

OK, several problems found:

on x86 (32-bit), the ReturnReg64 is incorrect, this is bug 1521512, I've worked around it
on ARM (32-bit), the LR is also a temp register that is used by the atomic core operations so the return address must be saved and restored
on ARM and ARM64, the argBase value (for reading arguments from the stack) incorrectly incorporated an adjustment for the pushed return address that is only appropriate for x86/x64

In general, prologue and epilogue are a little platform-dependent and therefore now have proper ifdef nestes with an #error for unsupported platforms, just like the register definitions.

Pulsebot

Comment 23

•

6 years ago

Pushed by lhansen@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/2d921f6ce0a9 jit-generate atomic ops to be called from c++. r=nbp, r=froydnj https://hg.mozilla.org/integration/mozilla-inbound/rev/4e928a279f87 Consolidate feeling-lucky atomics. r=froydnj

Oana Pop-Rus

Comment 24

•

6 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/2d921f6ce0a9
https://hg.mozilla.org/mozilla-central/rev/4e928a279f87

Status: ASSIGNED → RESOLVED

Closed: 6 years ago

status-firefox66: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla66

Mike de Boer [:mikedeboer]

Comment 25

•

6 years ago

https://hg.mozilla.org/projects/cedar/rev/2d921f6ce0a911d403a5fa9cabc3e9eabe24e645 Bug 1394420 - jit-generate atomic ops to be called from c++. r=nbp, r=froydnj https://hg.mozilla.org/projects/cedar/rev/4e928a279f87eb139d5683e1b5aba7ffbff59de1 Bug 1394420 - Consolidate feeling-lucky atomics. r=froydnj

Lars T Hansen [:lth]

Assignee

Updated

•

6 years ago

Blocks: 1225033

You need to log in before you can comment on or make changes to this bug.