Provide 8-byte implementations of jit/AtomicOperations

RESOLVED FIXED in Firefox 57

Status

()

defect
P1
normal
RESOLVED FIXED
4 years ago
2 years ago

People

(Reporter: lth, Assigned: lth)

Tracking

(Blocks 2 bugs)

unspecified
mozilla57
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox57 fixed)

Details

Attachments

(14 attachments, 2 obsolete attachments)

49.52 KB, patch
sstangl
: review+
Details | Diff | Splinter Review
10.98 KB, patch
sstangl
: review+
Details | Diff | Splinter Review
10.04 KB, patch
sstangl
: review+
Details | Diff | Splinter Review
14.14 KB, patch
sstangl
: review+
Details | Diff | Splinter Review
19.93 KB, patch
sstangl
: review+
Details | Diff | Splinter Review
5.79 KB, patch
sstangl
: review+
Details | Diff | Splinter Review
8.35 KB, patch
sstangl
: review+
Details | Diff | Splinter Review
5.91 KB, patch
sstangl
: review+
Details | Diff | Splinter Review
11.31 KB, patch
sstangl
: review+
Details | Diff | Splinter Review
22.92 KB, patch
sstangl
: review+
Details | Diff | Splinter Review
9.13 KB, patch
dragan.mladjenovic
: review+
Details | Diff | Splinter Review
6.68 KB, patch
sstangl
: review+
Details | Diff | Splinter Review
1.55 KB, patch
sstangl
: review+
Details | Diff | Splinter Review
10.46 KB, patch
sstangl
: review+
Details | Diff | Splinter Review
Assignee

Description

4 years ago
Clang with -m32 does not have support for 8-byte atomics, it generates calls to (undefined) library functions.

I'm not sure how important this case is, but it would be nice to provide lock-free 8-byte atomics even in this case, since all CPUs will have the necessary instructions and the JIT will be able to use them.
Assignee

Comment 1

3 years ago
This is not needed by the atomics compatibility layer (jit/AtomicOperations.h), so let's worry about it when it becomes an issue.
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → WONTFIX
Assignee

Updated

2 years ago
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Assignee

Comment 2

2 years ago
We will need this for implementing WebAssembly threads: the spec requires 8-byte atomics to be lock-free.

That clang with -m32 does not support 8-byte atomics was observed with clang 3.5 according to comments in the code (near the top of jit/x86-shared/AtomicOperations-x86-shared.h):

  // This pertains to Clang compiling with -m32, in this case the 64-bit
  // __atomic builtins are not available (observed on various Mac OS X
  // versions with Apple Clang and on Linux with Clang 3.5).
  //
  // For now just punt: disable lock-free 8-word data.  The JIT will
  // call isLockfree8() to determine what to do and will stay in sync.
  // (Bug 1146817 tracks the work to improve on this.)

  # if defined(__clang__) && defined(__i386)
  #  undef LOCKFREE8
  # endif

ARM also needs to be investigated for the compilers we use.

(Really we should abandon the not-really-well-defined C++ implementations of these functions and just drop to assembler, either hand-written or JIT-generated.)
Blocks: 1389458
Assignee

Comment 3

2 years ago
Clang 4.0 appears to have the same problem.
Assignee

Comment 4

2 years ago
I'm renaming this bug.  For Wasm we need (lock-free) 8-byte atomics for all platforms and compilers we support.
Assignee: nobody → lhansen
Status: REOPENED → ASSIGNED
Summary: Clang/x86: Provide 8-byte lock-free atomics for Clang compiling with -m32 → Provide 8-byte implementations of jit/AtomicOperations
Assignee

Comment 5

2 years ago
As far as 32-bit Clang is concerned, on Mac it "just works" now (MacPorts Clang-3.9, Mac OS X 10.12.5), and on Linux it is sufficient to link with -latomic to include the necessary callouts.
(In reply to Lars T Hansen [:lth] from comment #4)
> I'm renaming this bug.  For Wasm we need (lock-free) 8-byte atomics for all
> platforms and compilers we support.

Hi,

Just saw this issue. I wasn't able to find the discussion where it states that 8-byte atomic needed to be lock free for wasm. Is this only for wasm64 or wasm on 64-bit platforms? On MIPS32 we cannot implement lock-free 8-byte atomics. How much of a issue is this? 

Thanks in advance.
Assignee

Comment 7

2 years ago
(In reply to Dragan Mladjenovic from comment #6)
> (In reply to Lars T Hansen [:lth] from comment #4)
> > I'm renaming this bug.  For Wasm we need (lock-free) 8-byte atomics for all
> > platforms and compilers we support.
> 
> Just saw this issue. I wasn't able to find the discussion where it states
> that 8-byte atomic needed to be lock free for wasm.

This was discussed at the May 2017 CG meeting, I was not there myself but I called in.  The meeting notes are here: https://github.com/WebAssembly/meetings/blob/master/2017/CG-05.md#webassembly-threads-proposal.  You should search for the strings "Lock free guarantees" and (a little further down) "64 bit lock free".  

There was a poll, which was strongly in favor of making 64-bit lock-free.  (I believe I voted against it, since I'm quoted directly after the poll results saying that I'm worried about more obscure architectures, and MIPS was uppermost in my mind; Filip Pizlo brought up PowerPC as another.  Common among them is that nobody in the room really could say much about the commercial relevance of those platforms.)  The consensus (also reflected in the minutes) was that people who have a problem with it should come to the group and state their problems.  So, you could file a bug against the threads spec on https://github.com/WebAssembly/threads/ saying that this is a problem for you.  Since the issue with MIPS was mostly known already I don't know if this will do much good, but it doesn't hurt to try.

> Is this only for wasm64 or wasm on 64-bit platforms?

It is for wasm on all platforms.  The intent, I think, is that having to (potentially) specialize algorithms to support both lock-free and non-lock-free 64-bit atomics when most platforms have lock-free 64-bit atomics is both a burden and a problem that will be less and less relevant as time goes on, and it's better to look to the future.

> On MIPS32 we cannot implement lock-free 8-byte atomics. How much of a issue is this? 

I don't think it's a big deal.  If it were me I would just implement a spinlock and lie about it.  SpiderMonkey has some support for locked atomics already, and I won't remove it now if you think you'll want to use it.  I don't know about v8.
Assignee

Updated

2 years ago
Blocks: 1377576
Assignee

Updated

2 years ago
Depends on: 1389669
Assignee

Comment 8

2 years ago
The following patch set (eight patches) generalizes jit/AtomicOperations to handle 8-byte atomics, since we need those for WebAssembly.  The first patch clears the ground, the next six patches change all our implementations of the atomic operations, and the last patch contains test cases.

We will still rely on compiler intrinsics (such as __atomic_whatever and __sync_whatever) on all platforms, even though this is undefined behavior.  Later patches (on other bugs) will fix that by moving to inline assembler for tier-1 platforms.  The setup here is meant to be compatible with that, which is one reason why I'm splitting the GCC and MSVC files for x86 - they are diverging here, and will diverge ever more.
Assignee

Comment 9

2 years ago
See comment 8 for a general introduction to this patch set.

This patch is large but contains very little actual work:

- comment cleanup here and there

- split jit/x86-shared/AtomicOperations-x86-shared.h into ...-shared-msvc.h and
  ...-shared-gcc.h, because these implementations are diverging in this
  patch set and will diverge even more

- rename AtomicOperations::isLockfree() as AtomicOperations::isLockfreeJS()
  because it implements behavior that is now specific to JS and different
  from the behavior of WebAssembly
Attachment #8901806 - Flags: review?(sstangl)
Assignee

Comment 10

2 years ago
See comment 8 for a general introduction to this patch set.

This changes the definitions in jit/none/AtomicOperations-feeling-lucky.h (the file we use for tier-3 platforms that don't want to support their own atomic operations) to support 8-byte atomics:

- default implementations that crash are provided for platforms that don't
  declare that they support 8-byte atomics
- misc assertions are refined to handle the 8-byte case
- comment cleanup

More can be done here, notably to support 32-bit platforms that want to support non-lock-free 64-bit atomics, but the only one I know about right now is MIPS and MIPS has its own implemenation of the atomic operations.
Attachment #8901807 - Flags: review?(sstangl)
Assignee

Comment 11

2 years ago
See comment 8 for a general introduction to this patch set.

Adapt our (tier-1) ARM implementation of atomic operations.  In addition to some comment and #ifdef cleanup, I have removed the support for older GCC through the __sync class of primitives.  We now require gcc > 4.8 and those all have the newer __atomic primitives.
Attachment #8901808 - Flags: review?(sstangl)
Assignee

Comment 12

2 years ago
See comment 8 for a general introduction to this patch set.

This upgrades our definitions for (tier-1) x86 and x64 when compiling with GCC and Clang to 8-byte atomics.  As for ARM, I have taken the opportunity to remove the support for older __sync intrinsics since we require a gcc that has __atomic intrinsics.

Also I've removed the support for non-lockfree 8-byte atomics since the platform will have them, even on 32-bit x86.  I have verified that compiling for 32-bit with Clang has the necessary primitives (on Linux when linking with -latomic, on Mac it "just works" now).
Attachment #8901811 - Flags: review?(sstangl)
Assignee

Comment 13

2 years ago
See comment 8 for a general introduction to this patch set.

This upgrades our support for MSVC on x86 and x64 to 8-byte atomics.  This is a somewhat elaborate rewrite, because it uses the MSVC _Interlocked intrinsics, and those are platform-specific: on x86, only _InterlockedCompareExchange64 exists, the other 64-bit routines have to be implemented by means of that.

On the other hand, given that we have intrinsics like _ReadWriteFence which restricts reordering in the compiler, there's a fighting change that the MSVC implementation does not need to be rewritten in assembler to avoid C++ undefined behavior, or at least avoid UB that matters.
Attachment #8901815 - Flags: review?(sstangl)
Assignee

Comment 14

2 years ago
See comment 8 for a general introduction to this patch set.

Rewrite our (tier-1? tier-2? tier-3?) arm64 support for atomic operations to be 8-byte aware.

Since this layer already uses the __atomic primitives and since we're not serious about arm64 yet I've done a superficial job here, mainly upgrading the assertions so that we can compile.
Attachment #8901820 - Flags: review?(sstangl)
Assignee

Comment 15

2 years ago
See comment 8 for a general introduction to this patch set.

As a public service, I've tried to clean up the MIPS primitives a little to work with 8-byte atomics, but these will be used only on hardware, not even on the MIPS simulator, so it's not been possible to test.
Attachment #8901822 - Flags: review?(sstangl)
Assignee

Comment 16

2 years ago
See comment 8 for a general introduction to this patch set.

This introduces jsapi-test test cases for the atomic operations at 1, 2, 4, and 8 bytes.  They test that the functions exist and return expected values and perform expected operations on memory but do not test atomicity.

The 8-byte tests will only pass if 8-byte atomics are actually supported, I'm not sure if I should do something about that.  It is possible to introduce, say, AtomicOperations::hasSupport8(), to test, and to avoid the test cases if that returns false, this might be nice for some tier-3 32-bit platforms.  Opinions welcome.

(end of patch set)
Attachment #8901823 - Flags: review?(sstangl)
Assignee

Updated

2 years ago
See Also: → 1394420
Comment on attachment 8901806 [details] [diff] [review]
bug1146817-64bit-atomic-preliminaries.patch

Review of attachment 8901806 [details] [diff] [review]:
-----------------------------------------------------------------

Much nicer organization.
Attachment #8901806 - Flags: review?(sstangl) → review+
Attachment #8901807 - Flags: review?(sstangl) → review+
Attachment #8901808 - Flags: review?(sstangl) → review+
Attachment #8901811 - Flags: review?(sstangl) → review+
Comment on attachment 8901815 [details] [diff] [review]
bug1146817-64bit-atomic-x86-msvc.patch

Review of attachment 8901815 [details] [diff] [review]:
-----------------------------------------------------------------

This patch is more complicated. I tried to read it carefully, and didn't notice any errors.

::: js/src/jit/x86-shared/AtomicOperations-x86-shared-msvc.h
@@ +135,3 @@
>      template<> inline T                                              \
> +    AtomicOperations::exchangeSeqCst(T* addr, T val) {               \
> +        static_assert(sizeof(*addr) == 8, "8-byte variant");         \

sizeof(T) here and elsewhere?
Attachment #8901815 - Flags: review?(sstangl) → review+
Attachment #8901820 - Flags: review?(sstangl) → review+
Attachment #8901822 - Flags: review?(sstangl) → review+

Comment 19

2 years ago
Pushed by lhansen@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/9727d970a03d
Rename isLockfree as isLockfreeJS, plus preparation for 64-bit aware atomics. r=sstangl
https://hg.mozilla.org/integration/mozilla-inbound/rev/a71d0947dfd6
Make none/AtomicOperations-feeling-lucky.h 64-bit aware, and cleaner. r=sstangl
https://hg.mozilla.org/integration/mozilla-inbound/rev/b6401fe90774
Make arm/AtomicOperations-arm.h 64-bit aware, and cleaner. r=sstangl
https://hg.mozilla.org/integration/mozilla-inbound/rev/284002382c21
Make x86-shared/AtomicOperations-x86-shared-gcc.h 64-bit aware, and cleaner. r=sstangl
https://hg.mozilla.org/integration/mozilla-inbound/rev/cf818881916e
Make x86-shared/AtomicOperations-x86-shared-msvc.h 64-bit aware, and cleaner. r=sstangl
https://hg.mozilla.org/integration/mozilla-inbound/rev/48df1dbecdd8
Make arm64/AtomicOperations-arm64.h 64-bit aware, and cleaner. r=sstangl
https://hg.mozilla.org/integration/mozilla-inbound/rev/800291e5aee9
Make mips-shared/AtomicOperations-mips-shared.h cleaner (though not 64-bit aware). r=sstangl
Assignee

Comment 20

2 years ago
Dragan, a couple things you should be aware of:

The effect of bug 1389669 was that jit/mips-shared/AtomicOperations-mips-shared.h is only included when building on MIPS hardware (or a VM that emulates it), not when building for the MIPS simulator.  The MIPS simulator gets a file appropriate for the hardware the simulator is running on.  As a consequence, the simulator needs to be compatible for its atomics with that file.  I checked the case for x86 and x64 and it seemed OK to me, so I don't think you have much to worry about.

The effect of the changes I pushed in this bug is that the MIPS primitives may fail to compile or run on MIPS hardware because they do not properly support 8-byte atomics: they may fail to compile because of static_asserts that assert the size of the pointed-to element is 4 bytes or smaller, and they may fail to run because of MOZ_ASSERTS that check the same thing.  You should be able to fix this by looking at either jit/x86-shared/AtomicOperations-x86-shared-gcc.h or jit/none/AtomicOperations-feeling-lucky.h, I don't think it's much work, and of course it is not a problem for the simulator.

Obviously, feel free to ask if you have questions about anything at all, I'm happy to help.  And sorry for the turbulence.
Flags: needinfo?(dragan.mladjenovic)
Keywords: leave-open
Assignee

Comment 21

2 years ago
See comment 8 for overall info, etc.

This introduces AtomicOperations::hasAtomic8() for all platforms, so that we can avoid failing tests on platforms where we don't have those, and don't need them.
Attachment #8903518 - Flags: review?(sstangl)
Assignee

Comment 22

2 years ago
See comment 8, comment 16 for background.

This updates the test cases: we now use hasAtomic8() to guard the 8-byte test cases, so that they don't crash if 8-byte atomics are not available.
Attachment #8901823 - Attachment is obsolete: true
Attachment #8901823 - Flags: review?(sstangl)
Attachment #8903520 - Flags: review?(sstangl)
Attachment #8903518 - Flags: review?(sstangl) → review+
Comment on attachment 8903520 [details] [diff] [review]
bug1146817-64bit-atomic-test-cases-v2.patch

Review of attachment 8903520 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsapi-tests/hidePointer.cpp
@@ +17,5 @@
> +mozilla::Atomic<void*> g_hidden_pointer;
> +
> +// Call this to install a pointer into the global.
> +
> +MOZ_NEVER_INLINE void hidePointer1(void* p)

setHiddenPointer()?

@@ +24,5 @@
> +}
> +
> +// Call this to retrieve the pointer.
> +
> +MOZ_NEVER_INLINE void* hidePointer2()

getHiddenPointer()?

::: js/src/jsapi-tests/testAtomicOperations.cpp
@@ +92,5 @@
> +// bindings holding values of that type.
> +//
> +// No bytes of A and B should be 0 or FF.  A+B and A-B must not overflow.
> +
> +#define ATOMIC_TESTS(T, A, B) \

Not a fan of templates? :-)
Attachment #8903520 - Flags: review?(sstangl) → review+
Assignee

Comment 25

2 years ago
(In reply to Sean Stangl [:sstangl] from comment #23)
> Comment on attachment 8903520 [details] [diff] [review]
> bug1146817-64bit-atomic-test-cases-v2.patch
> 
> Review of attachment 8903520 [details] [diff] [review]:
> ::: js/src/jsapi-tests/testAtomicOperations.cpp
> @@ +92,5 @@
> > +// bindings holding values of that type.
> > +//
> > +// No bytes of A and B should be 0 or FF.  A+B and A-B must not overflow.
> > +
> > +#define ATOMIC_TESTS(T, A, B) \
> 
> Not a fan of templates? :-)

Huh, that's a fair point.  I avoided templates because of the BEGIN_TEST and END_TEST things, hoping that with a macro we'd get better call stacks in case something failed.  But I don't know if it really matters.  I'll clean it up.
Assignee

Comment 26

2 years ago
Uh, I take that back.  CHECK depends on being between BEGIN_TEST and END_TEST, due to the way those macros are implemented.  (BEGIN_TEST and END_TEST wrap the code in a class that inherits from another class that exports information that the expansion of CHECK needs.)

Comment 27

2 years ago
Pushed by lhansen@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/39eb143530ce
Introduce AtomicOperations::hasAtomic8(). r=sstangl
https://hg.mozilla.org/integration/mozilla-inbound/rev/2cdac401110f
Test cases for jit::AtomicOperations. r=sstangl
Assignee

Updated

2 years ago
Keywords: leave-open

Comment 28

2 years ago
Pushed by lhansen@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/ae143cf168fe
Addendum to squash MSVC warning, avoid a static arithmetic overflow. r=me
Assignee

Comment 29

2 years ago
Even with the last patch it is orange on Windows XP (!) because of a failed test, but it seems OK on all other platforms.  I'll investigate further.
Assignee

Comment 30

2 years ago
Best guess at the moment is that the test data were not aligned to 64 bits, but TBD.

Comment 31

2 years ago
Pushed by lhansen@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/cd4cb8329586
Correct order of arguments to _InterlockedCompareExchange in AtomicOperations.  Align test data.  Add alignment assertions to x86 and x64.  r=me
(In reply to Lars T Hansen [:lth] from comment #20)
> Dragan, a couple things you should be aware of:
> 
> The effect of bug 1389669 was that
> jit/mips-shared/AtomicOperations-mips-shared.h is only included when
> building on MIPS hardware (or a VM that emulates it), not when building for
> the MIPS simulator.  The MIPS simulator gets a file appropriate for the
> hardware the simulator is running on.  As a consequence, the simulator needs
> to be compatible for its atomics with that file.  I checked the case for x86
> and x64 and it seemed OK to me, so I don't think you have much to worry
> about.
> 
> The effect of the changes I pushed in this bug is that the MIPS primitives
> may fail to compile or run on MIPS hardware because they do not properly
> support 8-byte atomics: they may fail to compile because of static_asserts
> that assert the size of the pointed-to element is 4 bytes or smaller, and
> they may fail to run because of MOZ_ASSERTS that check the same thing.  You
> should be able to fix this by looking at either
> jit/x86-shared/AtomicOperations-x86-shared-gcc.h or
> jit/none/AtomicOperations-feeling-lucky.h, I don't think it's much work, and
> of course it is not a problem for the simulator.
> 
> Obviously, feel free to ask if you have questions about anything at all, I'm
> happy to help.  And sorry for the turbulence.

Hi,
Sorry for not replying sooner. Should I lie about mips32 supporting 8-byte lock-free atomics inside 
AtomicOperations::isLockfree8? This way the value will be same for mips32 hardware and mips32 simulator (x86). 
Will this check become part of wasm::HasCompilerSupport in the future?
Another options is to return false here for mips32 and to maybe skip testAtomicLockFree8 altogether on mips32.
Assignee

Comment 33

2 years ago
That's a good question, about the simulator.

In my opinion you should figure out what's the best solution for you on hardware, and if that is to not have lock-free 8-byte atomics and to not lie about that, then we can add ifdefs in the x86 file so that the simulator has the same behavior as hardware, I think this is the most reasonable solution.

If you do want to lie about it then yes, inside isLockfree8() is the right place to do it.

I just landed some code earlier today that avoids the run-time failure in the testAtomicOperations cases, so you should not have to worry about that: there's a new predicate, AtomicOperations::hasAtomic8(), which says whether the platform has 8-byte atomics or not; if it does, isLockfree8() can return true or false, depending on what is appropriate (your decision).
Assignee

Comment 34

2 years ago
One more adjustment coming.
Keywords: leave-open
Flags: needinfo?(dragan.mladjenovic)
Attachment #8904450 - Flags: review?(lhansen)
Assignee

Comment 36

2 years ago
In comment 31 I landed some ad-hoc alignment assertions (with r=me) to the x86/x64 implementations, since there are actually alignment requirements on the operations and I was trying to hunt down a bug that could have been alignment-related.

This patch cleans up those assertions and implements them for all tier-1 platforms.  I abstracted them as a template function whose result we assert; this is not a perfect solution since release builds will no longer get the static_assert on the atomic cell size they had before, but it removes a lot of redundant clutter and I think it strikes a good balance.
Attachment #8904466 - Flags: review?(sstangl)
Assignee

Comment 37

2 years ago
Comment on attachment 8904450 [details] [diff] [review]
bug1146817-atomic64-mips.patch

Review of attachment 8904450 [details] [diff] [review]:
-----------------------------------------------------------------

Looks OK.  I don't think the last word's been said on the semantics of the atomics anyway, we'll have further adjustments.

::: js/src/jit/mips-shared/AtomicOperations-mips-shared.h
@@ +13,5 @@
>  
> +// NOTE, MIPS32 unlike MIPS64 doesn't provide hardware support for lock-free 64-bit atomics.
> +// We lie down below about 8-byte atomics being always lock-free in order to support wasm jit.
> +// It is necessary to link with -latomic to get the 64-bit atomic intrinsics on MIPS32.
> +

Comments should wrap at 79 columns.  (Code wraps at 99 but comments at 79, for reasons I don't understand.)

@@ +132,5 @@
> +template<>
> +inline float
> +js::jit::AtomicOperations::loadSafeWhenRacy(float* addr)
> +{
> +    return *addr;

Why are there specializations here for float and double?  (And also below.)
Attachment #8904450 - Flags: review?(lhansen) → review+
Attachment #8904450 - Attachment is obsolete: true
Attachment #8904480 - Flags: review+
(In reply to Lars T Hansen [:lth] from comment #37)
> Comment on attachment 8904450 [details] [diff] [review]
> bug1146817-atomic64-mips.patch
> 
> Review of attachment 8904450 [details] [diff] [review]:
> -----------------------------------------------------------------
> Comments should wrap at 79 columns.  (Code wraps at 99 but comments at 79, for reasons I don't understand.)
Done.
> @@ +132,5 @@
> > +template<>
> > +inline float
> > +js::jit::AtomicOperations::loadSafeWhenRacy(float* addr)
> > +{
> > +    return *addr;
> 
> Why are there specializations here for float and double?  (And also below.)

The comments for these functions state that they are not access-atomic for floats so I chose not to use relaxed atomic loads for them. Using relaxed atomic load in case for float (and double on MIPS64) will result in load to GP and move to FP register and for double on MIPS32 it will result to call into libatomic helper both of which doesn't seem necessary. This is only tested for GCC, LLVM might have a slighty optimized behavior in first case.
Assignee

Comment 41

2 years ago
And one more memo to self:

At the moment, 8-byte accesses are implemented as full atomic accesses for tier-1 platforms even on 32-bit systems in order to make them access-atomic.  This is probably horrifically expensive on x86 and ARM and impossible on MIPS32, say, and it's also only meaningful if we change the JITs to emit comparable code (ie our x86 JIT would need to use CMPXCHG8B to load and store 64-bit data).  So right now we're in an inbetween state where we can't stay.

I've put in a request for clarification (https://github.com/WebAssembly/threads/issues/9) with an aim to loosen this requirement in wasm, because we probably don't need it, certainly not for wasm32.  For wasm64 we'll want 8-byte accesses to be access-atomic for sure, but for wasm32 it's a luxury we probably can't afford.

I've also put in a request for clarification (https://github.com/tc39/proposal-bigint/issues/78) around what accesses to the Int64 views that are coming in JS should be doing in this regard, also with the aim to keep it loose.
Assignee

Comment 42

2 years ago
(In reply to Dragan Mladjenovic from comment #40)
> (In reply to Lars T Hansen [:lth] from comment #37)
> > 
> > Why are there specializations here for float and double?  (And also below.)
> 
> The comments for these functions state that they are not access-atomic for
> floats so I chose not to use relaxed atomic loads for them. Using relaxed
> atomic load in case for float (and double on MIPS64) will result in load to
> GP and move to FP register and for double on MIPS32 it will result to call
> into libatomic helper both of which doesn't seem necessary. This is only
> tested for GCC, LLVM might have a slighty optimized behavior in first case.

I see.  I had actually forgotten that loadSafeWhenRacy and storeSafeWhenRacy need to work for floating types.  I have no test cases for that, and I expect that several of my implementations would not work properly for those cases because they cast values in ways that would convert them.  Thank you for reminding me of my own rules...

Code changes and test cases coming up :-/
(In reply to Lars T Hansen [:lth] from comment #41)
> And one more memo to self:

Hi, 
Sorry for bothering You again,
Forgot to ask this. What is the relation of memcpySafeWhenRacy to loadSafeWhenRacy and storeSafeWhenRacy? Does it need to provide same guaranties regarding access atomicity? Right now it is not documented. I assume they do not. Most memcpy implementations (that I know) provide access atomicity up to pointer size (if src and dest are properly aligned) but beyound is highly C library/ architecture dependent.
Assignee

Comment 44

2 years ago
(In reply to Dragan Mladjenovic from comment #43)

> What is the relation of memcpySafeWhenRacy to
> loadSafeWhenRacy and storeSafeWhenRacy? 

They are related only in that the C++ compiler is supposed to be able to call them to access memory racily without triggering C++ undefined behavior.

> Does it need to provide same guaranties regarding access atomicity?

It is access-atomic only on the byte level (ie, trivially).

> Right now it is not documented.

I will add a note in that file so that the next person to walk down this path will not have to ask.

Thanks.
Assignee

Comment 45

2 years ago
This primarily adds test cases for the float and double cases of loadSafeForRaces and storeSafeForRaces.  It turns out I did not need to change the code on any platform, mostly by accident though.  I *could* change the code to use plain loads and stores on x86-gcc, x64-gcc, arm, and arm64, instead of using relaxed atomics as now, but since those primitives will be rewritten soon (bug 1394420) with inline assembler or jit-generated code I didn't bother - they work OK for the time being, and they are compatible with whatever code the JIT generates.

Secondarily it adds test cases for the uint8_clamped case of loadSafeForRaces and storeSafeForRaces.  Finally it adds some comments to clarify the API.
Attachment #8904567 - Flags: review?(sstangl)
Attachment #8904466 - Flags: review?(sstangl) → review+
Comment on attachment 8904567 [details] [diff] [review]
bug1146817-safe-for-races-tweaks.patch

Review of attachment 8904567 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsapi-tests/testAtomicOperations.cpp
@@ +283,5 @@
> +    ATOMIC_FLOAT_TESTS(double, A, B);
> +}
> +END_TEST(testAtomicOperationsF64)
> +
> +#define ATOMIC_CLAMPED_TESTS(T, A, B) \

Matching #undef?
Attachment #8904567 - Flags: review?(sstangl) → review+

Comment 47

2 years ago
Pushed by lhansen@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/43cf87ac46c7
Clean up assertions for tier-1 platforms. r=sstangl
https://hg.mozilla.org/integration/mozilla-inbound/rev/b62fb112ddd8
Add float and double test cases for loadSafeWhenRacy / storeSafeWhenRacy. r=sstangl
Assignee

Comment 48

2 years ago
Leaving this open until comment 41 is resolved (access-atomicity for 64-bit accesses on 32-bit systems).
Assignee

Comment 49

2 years ago
I had occasion to compile on(!) ARM hardware today and was using Clang because my gcc is too old and no newer is available for the device.  Clang, unlike gcc, does not want to specialize the templates for loadSafeWhenRacy and storeSafeWhenRacy when the type is uint8_clamped, it wants explicit specializations with some magic casts.  Here they are.

(Apparently we compile for ARM HW only with gcc, since I've not received any complatints.)

I could do this for arm64 too but since that platform is all but unsupported and, again, there have been no complaints, and I can't test compilation for that platform myself, I don't think I'll do so now.
Attachment #8905023 - Flags: review?(sstangl)
Assignee

Updated

2 years ago
Priority: -- → P1
Assignee

Comment 50

2 years ago
Re comment 48, both the JS committee and the Wasm committee are moving in the direction of making 8-byte integer accesses non-access-atomic on 32-bit platforms.
Attachment #8905023 - Flags: review?(sstangl) → review+
(In reply to Lars T Hansen [:lth] from comment #50)
> Re comment 48, both the JS committee and the Wasm committee are moving in
> the direction of making 8-byte integer accesses non-access-atomic on 32-bit
> platforms.

Good to hear!

Comment 53

2 years ago
Pushed by lhansen@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a1edfa0dfd73
Specialize for uint8_clamped for ARM hardware, to benefit clang. r=sstangl
Assignee

Comment 54

2 years ago
(I hope this is the last one.)

Since the sentiment is clearly in favor of not having access-atomic 8-byte accesses on 32-bit systems, let's just implement that change.
Attachment #8905943 - Flags: review?(sstangl)
Comment on attachment 8905943 [details] [diff] [review]
bug1146817-non-access-atomic-cleanup.patch

Review of attachment 8905943 [details] [diff] [review]:
-----------------------------------------------------------------

At the same time that I think this is the right decision for default accesses, it would still be nice for the developer to have a way to explicitly use a spinlock on 32-bit systems. Otherwise, if they want atomicity, they have to use a full futex, right?
Attachment #8905943 - Flags: review?(sstangl) → review+
Assignee

Comment 57

2 years ago
(In reply to Sean Stangl [:sstangl] from comment #56)
> 
> At the same time that I think this is the right decision for default
> accesses, it would still be nice for the developer to have a way to
> explicitly use a spinlock on 32-bit systems. Otherwise, if they want
> atomicity, they have to use a full futex, right?

The developer should be able to use the loadSeqCst and storeSeqCst primitives for ordered accesses to have the same effect.  Ignoring the ordering semantics, on x86 those primitives should be the same speed as the LOCK; CMPXCHG8B that would be used for access-atomic loadSafeWhenRacy and storeSafeWhenRacy.  On ARM those primitives will be quite a bit slower than LDRD/STRD (if available), and probably somewhat slower than LDREXD/STREXD loops we'd use for loadSafeWhenRacy and storeSafeWhenRacy, since there will be additional fences.  On MIPS they'd probably be implemented with a spinlock behind the scenes.

I think what you're saying is "perhaps we should add AtomicOperations::loadRelaxed and AtomicOperations::storeRelaxed, at least, and clarify how these are different from loadSafeWhenRacy and storeSafeWhenRacy".  At the moment we have no need for those since they would not be exposed to JS, but I sort of expect us to ship wasm v1 (or v1.1) threads with at least release-acquire atomics, so the matter will come up again soon.

Comment 58

2 years ago
Pushed by lhansen@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e7ce80185031
Improve implementation of non-access-atomic safe-for-races accesses. r=sstangl
Assignee

Updated

2 years ago
Keywords: leave-open
Assignee

Updated

2 years ago
Hardware: x86 → All
https://hg.mozilla.org/mozilla-central/rev/e7ce80185031
Status: ASSIGNED → RESOLVED
Closed: 3 years ago2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57

Updated

2 years ago
Depends on: 1409196

Updated

2 years ago
Depends on: 1409713
You need to log in before you can comment on or make changes to this bug.