Closed Bug 1107919 Opened 5 years ago Closed 5 years ago

Direct support for spinlock'd atomic operations (for larger data and as optimization)


(Core :: JavaScript Engine, defect)

Not set





(Reporter: lth, Unassigned)


(Blocks 1 open bug)


Currently atomics are supported on 8, 16, and 32 bit data (ints only but float32 could be supported).  64-bit data are not supported because the instructions for atomic operations on 64-bit data are not available on all platforms.

The generally accepted technique for handling 64-bit data is either a two-word CAS or a spinlock.  The spinlock requires memory, however, and managing that memory is not really part of the API.  So the best solution that has presented itself so far is a dedicated word per SAB for this spinlock.  That word will tend to become very hot, however, and is essentially a global resource: it does not allow unrelated locations to have unrelated spinlock locations.  So even if this is the "best" solution it is not remotely a good solution, and 64-bit atomics have remained unspecified.

Elsewhere, Aleksandar Zlicic points out that there are reasons to use the spinlock technique for smaller data (8 and 16 bit) as well; he doesn't say so explicitly, but my impression is that it leads to better-looking code, and there may also be performance implications I don't know about (e.g. contention on adjacent bytes would be less of a problem).

An argument similar to the one for float64 applies to 8-bit and 16-bit data: a single global spinlock location for all atomic locations is a terrible idea.

So a couple of thoughts:

- we could remove the 8-bit and 16-bit versions and just require programs to roll
  their own using spinlocks or read-modify-write, but this means we can't use fast
  instructions when they are available

- we could introduce a memory "management" API that allows the use of spinlocks,
  this would enable 64-bit and larger atomics
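
To make the first option concrete, here is a minimal sketch of what "rolling your own" 8-bit CAS by read-modify-write on the containing 32-bit word might look like.  The name compareExchange8 is hypothetical, and the byte-position arithmetic assumes a little-endian platform; this only illustrates the technique and is not a proposed API.

```javascript
// Hypothetical helper: emulate an 8-bit compare-exchange using only the
// 32-bit Atomics.compareExchange.  `i32` is an Int32Array over the shared
// buffer and `byteIndex` is a byte offset into that same buffer.
// Assumes little-endian byte order within each 32-bit word.
function compareExchange8(i32, byteIndex, expected, replacement) {
  const wordIndex = byteIndex >>> 2;      // which 32-bit word holds the byte
  const shift = (byteIndex & 3) * 8;      // bit position of the byte in that word
  const mask = 0xff << shift;
  for (;;) {
    const oldWord = Atomics.load(i32, wordIndex);
    const oldByte = (oldWord & mask) >>> shift;
    if (oldByte !== (expected & 0xff)) {
      return oldByte;                     // CAS fails: report the value seen
    }
    const newWord = (oldWord & ~mask) | ((replacement & 0xff) << shift);
    if (Atomics.compareExchange(i32, wordIndex, oldWord, newWord) === oldWord) {
      return oldByte;                     // CAS succeeded
    }
    // A neighbouring byte (or this one) changed in the meantime; retry.
  }
}
```

Note that the retry loop is exactly where contention on adjacent bytes shows up: a write to any of the other three bytes in the word forces another iteration.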

So here's a proposal that is not at all pretty (I need to see what PNaCl does here but I don't have time right now).  Suppose dta is a Float64 array and iab is an Int32 array.  Then:

   Atomics.compareExchangeSpin(dta, k, iab, s, expected, replacement)

will CAS dta[k] using iab[s] as a spinlock /if it needs to spin/.  On x86 this would be a CMPXCHG instruction as for 32-bit.  On ARM and MIPS it would use the spinlock.
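
For illustration, the spinning path could be emulated in user code roughly as follows.  This is a sketch only, under the assumption that iab[s] holds 0 when the lock is free and 1 when it is held (the proposal above does not fix an encoding), and it compares values with === rather than bitwise, so NaN and -0 behave differently than they would with a hardware CAS.

```javascript
// Sketch of the spinlock fallback for the proposed operation.  This is a
// plain user-level function, not an existing Atomics method.
// Assumed lock encoding: iab[s] === 0 means free, 1 means held.
function compareExchangeSpin(dta, k, iab, s, expected, replacement) {
  // Acquire: spin until we are the one that flips the lock word 0 -> 1.
  while (Atomics.compareExchange(iab, s, 0, 1) !== 0) {
    // busy-wait
  }
  const old = dta[k];
  // Value comparison, not bit comparison: NaN never matches and
  // -0 matches +0, unlike a real 64-bit CAS.
  if (old === expected) {
    dta[k] = replacement;
  }
  // Release the lock.
  Atomics.store(iab, s, 0);
  return old;
}
```

This is only atomic with respect to other accesses that take the same lock word, so every reader and writer of dta[k] has to agree on s.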

I'm thinking load and store would have similar APIs but add/sub/etc might not.

We could introduce similar APIs for 8-bit and 16-bit data (and could remove the existing 8-bit and 16-bit APIs), though without performance measurements I sort of doubt that we want to do this, even if it can in principle reduce contention on adjacent bytes.

We could also introduce similar APIs for arbitrary n-bit data ranges, which is more interesting; in that case the specific 64-bit version is probably not needed.  Suppose ata is any typed array, k is the starting index, and n is the number of elements:

   Atomics.compareExchangeSpin(ata, k, n, iab, s, expected, replacement)
   Atomics.load(ata, k, n, iab, s)
   Atomics.store(ata, k, n, iab, s, ata2, l)

(In the store case this replaces ata[k]..ata[k+n-1] with ata2[l]..ata2[l+n-1].)
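
A rough user-level sketch of the range load and store, always taking the spinlock, might look like this.  The names withSpinlock, loadRange and storeRange are hypothetical, and as before the lock word is assumed to be 0 when free and 1 when held.

```javascript
// Hypothetical helper: run `body` while holding the spinlock at iab[s].
// Assumed lock encoding: 0 = free, 1 = held.
function withSpinlock(iab, s, body) {
  while (Atomics.compareExchange(iab, s, 0, 1) !== 0) {
    // busy-wait
  }
  try {
    return body();
  } finally {
    Atomics.store(iab, s, 0);   // always release, even if body throws
  }
}

// Returns a private copy of ata[k]..ata[k+n-1] taken under the lock.
function loadRange(ata, k, n, iab, s) {
  return withSpinlock(iab, s, () => ata.slice(k, k + n));
}

// Replaces ata[k]..ata[k+n-1] with ata2[l]..ata2[l+n-1] under the lock.
function storeRange(ata, k, n, iab, s, ata2, l) {
  withSpinlock(iab, s, () => ata.set(ata2.subarray(l, l + n), k));
}
```

Here the lock is taken unconditionally; an engine-level version could presumably skip it whenever the whole range fits in a single natively atomic access.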

Certainly there is still some overhead in that the ...Spin version needs to pass extra arguments and compute values for them.  We could optimize that with feature tests, so that a program uses a simple expression when the platform supports the CAS directly, but it becomes sort of absurd at that point.

The underlying problem is of course that an "atomic" cell is a complex, type-dependent datum with possibly more than one field and fields of possibly several types.  Our Atomics don't capture that at all, operating only on arrays.
Moved to the spec tracker:
Closed: 5 years ago
Resolution: --- → WONTFIX