Closed Bug 1084248 Opened 10 years ago Closed 9 years ago

Split SharedTypedArray methods from TypedArray methods

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking


RESOLVED FIXED
mozilla44
Tracking Status
firefox44 --- fixed

People

(Reporter: lth, Assigned: lth)

References

(Blocks 1 open bug)

Details

Attachments

(6 files)

It is "mostly wrong" for methods on TypedArray and SharedTypedArray to share implementations as they do now: importantly, STA methods must be aware that they are operating on racy memory and secondarily, do not (yet?) need to worry about neutering.  Shared implementations will tend to become overly complex and to not take into account all the constraints.

Self-hosting the code is attractive.
(In reply to Lars T Hansen [:lth] from comment #0)
> Self-hosting the code is attractive.

It certainly is. Perhaps it'd be good to first do self-hosted implementations of the TypedArray versions and then copy and modify those for SharedTypedArray? Perhaps we could make these Good First (or Second) Bug-fodder, too, if the things to watch out for can be described clearly enough.
Unless the shared methods end up with different semantics, which seems unlikely to me (and highly questionable), I think we can just have one self-hosted method that handles both.  The JITs will have to handle accesses to shared typed array elements differently from non-shared typed array elements, but that's basically no skin off their backs.
I've started looking at this, going about it by renaming the viewData() and dataPointer() methods on shared data as viewDataShared() and dataPointerShared() and tracking usages up through the call hierarchy.  It doesn't look completely awful yet.

To handle the C++ code that can't be removed I'll likely have to provide relaxed atomic accesses in the AtomicOperations layer, or something roughly equivalent that C++ can call safely to perform racy accesses without stepping into undefined territory.
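For example, something along these lines (a sketch only, assuming GCC/Clang atomic intrinsics; names are not final, and the eventual implementation may instead use hand-written per-platform assembler):

    // Hypothetical shape of a racy-but-defined accessor for integral T: a
    // relaxed atomic load provides no ordering guarantees but is not UB when
    // it races with other atomic accesses to the same location.
    template<typename T>
    inline T
    loadSafeForRaces(T* addr)
    {
        return __atomic_load_n(addr, __ATOMIC_RELAXED);
    }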

One part I'm not sure about yet is the interaction with PodMove and PodCopy, which are used heavily by the shared TypedArray code that is still in C++.  Presumably the data we're copying and moving have trivial constructors and destructors (i.e., these stand in for memcpy / memmove).
Duh, "POD".
See Also: → 1176214
Expose an API that can be used from C++ to access memory that might be accessed concurrently by other threads, without synchronization but also without creating undefined behavior due to races.  See a large comment in the code; also see info on the next patch.
Attachment #8655421 - Flags: feedback?(luke)
Attachment #8655421 - Flags: feedback?(jwalden+bmo)
Assignee: nobody → lhansen
Status: NEW → ASSIGNED
This patch is a mouthful, but it's easier than it looks.  The central idea is that accessor functions that will or might return shared memory are given names that distinguish them as such.  These accessors are primarily the definitions of dataPointerShared (SharedArrayObject.h), dataPointerMaybeShared (ArrayBufferObject.h, ditto -inl), and viewDataShared (SharedTypedArrayObject.h), within this patch.

Once the name changes propagate outward we come to points where the data values either are used or escape.  Where they are used we can determine whether to apply the safe-for-races accessors (see previous patch); often the pointer is only used for its value, but if there's a memory access it must be made safe.  Where the pointers escape it becomes trickier, but normally the solution is to rename the function that returns the pointer (incorporating "shared" in the name) and/or to change the signature (adding a bool* isShared flag that the client is required to be aware of).

This is all in preparation for a monster patch in bug 1176214, where SharedTypedArray will be removed, and TypedArray will (safely) map SharedArrayBuffer throughout Firefox, with opt-in use of shared memory within the DOM.

What I've done here with name and signature changes isn't the only possible solution.  An alternative might be to introduce a notion of a "pointer to shared memory" as a value type, perhaps shared_ptr<T>, requiring it to be unwrapped for use, say.  That seemed more elaborate than what I have now; also, what I have now seems to work well when exposed through the SpiderMonkey API to the DOM code in general, since the extra isShared flag on SM API functions conveys all the necessary information and helps track down all call sites.  Even so I'm opening this up for discussion, in case anyone has any opinions or suggestions.
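To illustrate the signature pattern (purely a sketch; the function and variable names here are hypothetical, not the ones in the patch):

    // An API entry point whose returned pointer may point into shared memory.
    // The callee sets *isSharedMemory; if it is true the caller must route all
    // accesses through the safe-for-races operations.
    uint8_t* GetViewDataMaybeShared(JSObject* view, uint32_t* length, bool* isSharedMemory);

    void Consumer(JSObject* view)
    {
        uint32_t length;
        bool isSharedMemory;
        uint8_t* data = GetViewDataMaybeShared(view, &length, &isSharedMemory);
        if (isSharedMemory) {
            // Only the safe-for-races accessors may touch *data here.
        } else {
            // Plain C++ loads and stores are fine.
        }
    }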
Attachment #8655426 - Flags: feedback?(luke)
Attachment #8655426 - Flags: feedback?(jwalden+bmo)
I know the goal of the *SafeForRaces functions is to keep us out of theoretical UB territory.  However, are there any practical concerns for what the supported compiler/hardware configurations will do?  I can imagine a compiler might do some local load/store reordering based on the assumption that the memory wasn't being concurrently mutated.  I know it's evil to suggest that 'volatile' can fix any problem, but would 'volatile' be sufficient to avoid this case in practice?  Perhaps another goal is to silence what would otherwise be TSan/Helgrind errors?
(In reply to Luke Wagner [:luke] from comment #7)

The current wisdom seems to be that aggressive compiler optimization and hardware optimizations together can have surprising effects that are hard to predict (also hard to predict into the future).  I've also been told that C++ compilers have started analyzing programs with an eye to enabling optimizations that are legal under the assumption that the program does not trigger UB, and that this has surprising effects.

As I mentioned in a comment in the code, TSAN is your enemy here.  I doubt I would be able to land removal of SharedTypedArray without an actually non-racy solution.  Before the summer and at Whistler I polled several co-workers on this issue and there was broad agreement that actually avoiding races is very worthwhile.

(More broadly people worry about hardware with race detection, though I don't lose sleep over that - it's just TSAN with more teeth.)

Even if 'volatile' is a solution, which I do not think it is, it would be most convenient to wrap it in accessors as I have here, I expect.  It would be a potential implementation technique to add to the list perhaps.
One nice thing about volatile is that it annotates the type so you can tag it at the source (data(), maybeHeap(), etc) and have it flow out.  However, you could do the same thing w/o volatile by having Racy<T*> which is just a simple wrapper around a pointer and then you have the *SafeForRaces functions take a Racy<T*>.
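Roughly (illustrative only; the name Racy and the details are made up here):

  template<typename T>
  class Racy
  {
      T ptr_;

    public:
      explicit Racy(T ptr) : ptr_(ptr) {}

      // Only the *SafeForRaces functions should ever look at the raw pointer.
      T unwrap() const { return ptr_; }
  };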
(In reply to Luke Wagner [:luke] from comment #9)
> One nice thing about volatile is that it annotates the type so you can tag
> it at the source (data(), maybeHeap(), etc) and have it flow out.  However,
> you could do the same thing w/o volatile by having Racy<T*> which is just a
> simple wrapper around a pointer and then you have the *SafeForRaces
> functions take a Racy<T*>.

Yes, that's what I was proposing in my other comment (shared_mem<T>); volatile is too ambiguous I think.  Anyway I have been experimenting with that this morning.  It's pretty clean and I've even found two bugs.

Whether it makes a big difference from what I have now I'm not sure, but it's arguably cleaner.  I'll post a patch when I have something that's tidy.
A smart-pointer based solution: it wraps pointers that are possibly-to-shared memory in SharedMem<T*>, and passes those around.  Atomic operations offer APIs that take these pointers (and currently also plain pointers but the goal is to get rid of the latter).
Attachment #8656552 - Flags: feedback?(luke)
Attachment #8656552 - Flags: feedback?(jwalden+bmo)
Changes that make use of the smart pointers.  The goal here is to avoid using unwrap() to get at the plain pointer as much as possible, and when we do, to annotate the unwrapping (currently with just a comment) explaining why it is safe, or how it might not be safe.  The latter case will require an eventual fix.

Currently I have only four possibly-unsafe unwrappings, continuing to investigate.  (Of those, only usage in asm.js worries me.)

BTW, the smart pointer is currently carrying a flag that says whether the pointer is thought to be to shared memory or is known not to be.  I'd like to demote this flag to DEBUG-only but I've kept it because it may still be needed when I apply the DOM changes.
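To give a flavor of what an access site looks like with the smart pointer (adapted from the shared typed array getIndex path in the patch; details may differ slightly):

    static NativeType
    getIndex(SharedTypedArrayObject& tarray, uint32_t index)
    {
        MOZ_ASSERT(index < tarray.length());
        // The element pointer stays wrapped; the racy load goes through the
        // AtomicOperations overload that accepts SharedMem.
        SharedMem<NativeType*> data = SharedMem<NativeType*>(tarray.viewDataShared());
        return jit::AtomicOperations::loadSafeForRaces(data + index);
    }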
Attachment #8656554 - Flags: feedback?(luke)
Attachment #8656554 - Flags: feedback?(jwalden+bmo)
Comment on attachment 8656552 [details] [diff] [review]
Safe-for-races operations, exposed to C++, v2

Review of attachment 8656552 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/SharedMem.h
@@ +25,5 @@
> +
> +    SharedMem(const SharedMem<T>& that) : ptr_(that.ptr_), sharedness_(that.sharedness_) {}
> +
> +    template<typename U>
> +    explicit SharedMem(const SharedMem<U>& that) : ptr_((T)that.unwrap()), sharedness_((SharedMem<T>::Sharedness)that.sharedness()) {}

The '(T)that.unwrap()' means that this ctor will effectively be an implicit cast from any ptr type to any ptr type.  Probably better to take out the casts (so only the ones that are implicitly allowed succeed) and, if necessary have an explicit 'cast<T>()' member function.
Attachment #8656552 - Flags: feedback?(luke) → feedback+
(In reply to Luke Wagner [:luke] from comment #13)
> Comment on attachment 8656552 [details] [diff] [review]
> Safe-for-races operations, exposed to C++, v2
> 
> Review of attachment 8656552 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/vm/SharedMem.h
> @@ +25,5 @@
> > +
> > +    SharedMem(const SharedMem<T>& that) : ptr_(that.ptr_), sharedness_(that.sharedness_) {}
> > +
> > +    template<typename U>
> > +    explicit SharedMem(const SharedMem<U>& that) : ptr_((T)that.unwrap()), sharedness_((SharedMem<T>::Sharedness)that.sharedness()) {}
> 
> The '(T)that.unwrap()' means that this ctor will effectively be an implicit
> cast from any ptr type to any ptr type.  Probably better to take out the
> casts (so only the ones that are implicitly allowed succeed) and, if
> necessary have an explicit 'cast<T>()' member function.

The intent is very much to allow free casts between pointer types here; it is used this way extensively in the second patch.  At first blush it would be a pain to do without.  But perhaps there's a good solution for that I haven't seen yet.
Comment on attachment 8656554 [details] [diff] [review]
Avoid undefined behavior in accessing shared memory, v2

Review of attachment 8656554 [details] [diff] [review]:
-----------------------------------------------------------------

Great, this seems like a good degree of protection.  I believe the asm.js uses are safe for the reasons mentioned below:

::: js/src/asmjs/AsmJSModule.cpp
@@ +847,5 @@
> +    // Careful here - if the memory in heapDatum() is shared and may be accessed concurrently
> +    // from other threads it must not be accessed directly from C++ code, but must go via
> +    // safe accessors.  Often the best strategy is to wrap the pointer in a SharedMem when
> +    // it's read, see eg maybeHeap().
> +    heapDatum() = heap->dataPointerMaybeShared().unwrap(/*unsafe? - unclear*/);

heapDatum()'s memory is in the global data section, which is attached to the AsmJSModule, which is thread-local.  Still, probably a good idea to make the heapDatum() method private.

@@ +852,3 @@
>  
>  #if defined(JS_CODEGEN_X86)
> +    uint8_t* heapOffset = heap->dataPointerMaybeShared().unwrap(/*safe - single-threaded initialization*/);

I think the reason, here and below, is "safe - only reference": we're just copying this address into JIT code; the single-thread-ness of initialization doesn't seem necessary (note: 'heap' may be in concurrent use by other threads at this point).

::: js/src/asmjs/AsmJSModule.h
@@ +1583,3 @@
>          MOZ_ASSERT(isDynamicallyLinked());
> +        // IsShared is a conservative approximation, we don't have the information.
> +        return SharedMem<uint8_t*>(heapDatum(), SharedMem<uint8_t*>::IsShared);

Can you use (hasArrayView() && isSharedView()) to determine shared-ness?

::: js/src/asmjs/AsmJSSignalHandlers.cpp
@@ +574,5 @@
>      uintptr_t result = address.disp();
>  
>      if (address.hasBase()) {
>          uintptr_t base;
> +        StoreValueFromGPReg(SharedMem<void*>(&base, SharedMem<void*>::IsUnshared), sizeof(uintptr_t),

IIUC, here and below, all these accesses are either shared or unshared (depending on the module.isSharedView()).

::: js/src/vm/ArrayBufferObject-inl.h
@@ +22,5 @@
> +ArrayBufferObjectMaybeShared::dataPointerMaybeShared() {
> +    ArrayBufferObjectMaybeShared* buf = this;
> +    if (buf->is<ArrayBufferObject>()) {
> +        return SharedMem<uint8_t*>(buf->as<ArrayBufferObject>().dataPointer(),
> +                                   SharedMem<uint8_t*>::IsUnshared);

Perhaps worth adding dataPointerShared() to ArrayBufferObject to factor these out?
Attachment #8656554 - Flags: feedback?(luke) → feedback+
Comment on attachment 8655426 [details] [diff] [review]
Avoid undefined behavior in accessing shared memory

Review of attachment 8655426 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/asmjs/AsmJSSignalHandlers.cpp
@@ +381,5 @@
>  SetFPRegToLoadedValue(const void* addr, size_t size, void* fp_reg)
>  {
>      MOZ_RELEASE_ASSERT(size <= Simd128DataSize);
>      memset(fp_reg, 0, Simd128DataSize);
> +    AtomicOperations::memcpySafeForRaces(fp_reg, addr, size);

Uh, isn't the memset *directly above this* obviously not safe for races?  Same for the other Set*RegToLoadedValue* methods here.

Without having deep familiarity with these methods or their users, I assume this is going to be called on up-to-128-bit things.  What's the story going to be about racing on sets of them?  My understanding is SIMD stuff is just aligned bags-o-bits greater than max atomic size, and you can access them as uint8_t, uint16_t, uint32_t, double, and so on.  Is the racing guarantee going to be that *if* you ever access one of these, every byte will be either an old value or a new value, and larger-sized components may be any value composed of those old/new values, respecting the sequencing of byte (and larger) operations performed?  This seems really finicky, and almost certainly it's worth some sort of extended comment here and in the other methods.

::: js/src/builtin/AtomicsObject.cpp
@@ +245,5 @@
>  
>      switch (view->type()) {
>        case Scalar::Uint8:
>        case Scalar::Uint8Clamped: {
> +        uint8_t v = jit::AtomicOperations::loadSeqCst((uint8_t*)view->viewDataShared() + offset);

Yeah, the more I see stuff like this, the more I'd really like to see the pointer encapsulated a bit, so that only a very, very small bit of auditable code is ever given access to shared memory.  (Plus the few JIT places, but oh well.)  Could you make that a quick followup, or redo of these patches, or whatever?  This code gives me the (wait for it) jitters.

::: js/src/jit/IonBuilder.cpp
@@ +8897,5 @@
>      else if (obj->resultTypeSet())
>          tarr = obj->resultTypeSet()->maybeSingleton();
>  
>      if (tarr) {
> +        // 'data' may point to shared memory but this usage is safe unless the 'data' pointer

s/memory but/memory, but/

::: js/src/jit/MIR.cpp
@@ +4538,5 @@
>  
>  void*
>  MLoadTypedArrayElementStatic::base() const
>  {
> +    // The return value may point to shared memory but this usage is safe unless the 'data' pointer

s/memory but/memory, but/

@@ +4569,5 @@
>  
>  void*
>  MStoreTypedArrayElementStatic::base() const
>  {
> +    // The return value may point to shared memory but this usage is safe unless the 'data' pointer

Same.

::: js/src/vm/ArrayBufferObject-inl.h
@@ +20,5 @@
> +// See warning on the definition in ArrayBufferObject.h: exercise
> +// caution if the memory is shared.
> +
> +inline uint8_t*
> +ArrayBufferObjectMaybeShared::dataPointerMaybeShared() {

{ on new line.

::: js/src/vm/ArrayBufferObject.h
@@ +84,5 @@
>          return AnyArrayBufferByteLength(this);
>      }
>  
> +    // Note, dataPointerMaybeShared() should be used sparingly.  If
> +    // the memory that is returned is shared then C++ intrinsics must

s/ then C++ intrinsics/, C++ arithmetic operations/

::: js/src/vm/SharedArrayObject.h
@@ +79,1 @@
>          return ((uint8_t*)this) + sizeof(SharedArrayRawBuffer);

Remove the unnecessary |inline| and make that a C++ cast while you're here.

::: js/src/vm/SharedTypedArrayObject.cpp
@@ +366,5 @@
>      getIndex(JSObject* obj, uint32_t index)
>      {
>          SharedTypedArrayObject& tarray = obj->as<SharedTypedArrayObject>();
>          MOZ_ASSERT(index < tarray.length());
> +        return jit::AtomicOperations::loadSafeForRaces(static_cast<const NativeType*>(tarray.viewDataShared())+index);

Spaces around +.

@@ +398,5 @@
>      static void
>      setIndex(SharedTypedArrayObject& tarray, uint32_t index, NativeType val)
>      {
>          MOZ_ASSERT(index < tarray.length());
> +        jit::AtomicOperations::storeSafeForRaces(static_cast<NativeType*>(tarray.viewDataShared())+index, val);

ditto

@@ +674,5 @@
>            return nullptr;                                                                   \
>                                                                                              \
>        SharedTypedArrayObject* tarr = &obj->as<SharedTypedArrayObject>();                    \
>        *length = tarr->length();                                                             \
> +      *data = static_cast<ExternalType*>(tarr->viewDataShared()); /* safe: our caller knows */  \

...it's being given a shared pointer?  This comment should be elaborated.

Slightly more generally, are there users of C++ methods exposing this stuff yet?  It'd be nice if we didn't have to assume people would do the right thing with this.

::: js/src/vm/TypedArrayCommon.h
@@ +20,4 @@
>  #include "vm/SharedTypedArrayObject.h"
>  #include "vm/TypedArrayObject.h"
>  
> +#include "jit/AtomicOperations-inl.h"

It's a no-no to include -inl.h files in non-inl.h files.  Maybe TypedArrayCommon.h is an exception to that principle, tho -- and if so perhaps we should rename it to -inl.h.  Please do that.

@@ +108,5 @@
>      return obj->as<SharedTypedArrayObject>().layout();
>  }
>  
> +// Note, AnyTypedArrayViewData() should be used sparingly.  If the
> +// memory that is returned is shared then C++ intrinsics must never be

Same comma/arithmetic operations change as noted elsewhere in this patch.  (And I guess if this comment's duplicated elsewhere, make the same modifications there.)

@@ +143,5 @@
>      return IsTypedArrayClass(clasp) || IsSharedTypedArrayClass(clasp);
>  }
>  
> +class SharedOps {
> +public:

class SharedOps
{
  public:

Make the same changes to SharedOps below, too.

@@ +145,5 @@
>  
> +class SharedOps {
> +public:
> +    template<typename U>
> +    static inline U read(U* addr) {

Remove the inlines on these, they're implied syntactically by the inline definitions.

@@ +167,5 @@
> +    static inline U read(U* addr) { return *addr; }
> +    template<typename V>
> +    static inline void write(V* addr, V value) { *addr = value; }
> +    static inline void memcpy(void* dest, void* src, size_t size) { ::memcpy(dest, src, size); }
> +    static inline void memmove(void* dest, void* src, size_t size) { ::memmove(dest, src, size); }

Use PodCopy/PodMove for these, with uint8_t casts as needed.  (This'll catch the invalid overlapping uses and assert.)
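Roughly like this, assuming the byte-count argument stays as it is (untested):

    // UnsharedOps, sketched with Pod* instead of raw memcpy/memmove.  PodCopy
    // asserts that the ranges do not overlap; PodMove permits overlap.
    static void memcpy(void* dest, const void* src, size_t size) {
        mozilla::PodCopy(static_cast<uint8_t*>(dest),
                         static_cast<const uint8_t*>(src), size);
    }
    static void memmove(void* dest, const void* src, size_t size) {
        mozilla::PodMove(static_cast<uint8_t*>(dest),
                         static_cast<const uint8_t*>(src), size);
    }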

@@ +204,4 @@
>          uint32_t count = AnyTypedArrayLength(source);
>  
>          if (AnyTypedArrayType(source) == target->type()) {
> +            Ops::memcpy(dest, AnyTypedArrayViewData(source), count*sizeof(T));

Spaces around *.

@@ +222,2 @@
>              for (uint32_t i = 0; i < count; ++i)
> +                Ops::write(dest++, T(Ops::read(src++)));

I think this is going to run into problems on ARM, where JS_VOLATILE_ARM expands to volatile and so you're passing |volatile int8_t*| to Ops::read, which requires the argument be T* as I recall.  *Maybe* you can get rid of JS_VOLATILE_ARM to fix this.  Tryserver this patch (and any necessary modifications) hard.

Maybe you could ameliorate this (and possibly fake out the ARM compiler enough?) with a standalone template class with a method that does all this stuff once, then have all these cases just call that method.  That at least gets rid of a massive amount of duplication.  Although maybe it's not germane here.

@@ +310,5 @@
>              const Value* srcValues = source->as<NativeObject>().getDenseElements();
>              for (; i < bound; i++) {
>                  if (!canConvertInfallibly(srcValues[i]))
>                      break;
> +                Ops::write(dest+i, infallibleValueToNative(srcValues[i]));

Spaces around +.

@@ +359,4 @@
>          uint32_t len = source->length();
>  
>          if (source->type() == target->type()) {
> +            Ops::memmove(dest, static_cast<T*>(AnyTypedArrayViewData(source)), len*sizeof(T));

Spaces around *.

@@ +496,5 @@
>  };
>  
> +template<class SomeTypedArray>
> +inline bool
> +EitherShared(Handle<SomeTypedArray*> target, HandleObject source)

Bah.  This breaks down if |source| is a cross-compartment wrapper to a shared typed array.  I really need to debug the last perf issues with self-hosted %TypedArray%.prototype.set and enable it so we can just assume shared in all these methods (or at least I think we can, once we've made that change).
Attachment #8655426 - Flags: feedback?(jwalden+bmo) → feedback+
Comment on attachment 8655421 [details] [diff] [review]
Safe-for-races operations, exposed to C++

Review of attachment 8655421 [details] [diff] [review]:
-----------------------------------------------------------------

This isn't horrible, but it'd be nice to encapsulate the actual pointers so that C++ *can't* get a direct address to read from or write to, except inside a class that clearly does everything using non-C++ operations, like maybe

  ArrayBufferData<Uint8>& data = ...
  uint32_t v = data.loadUnsynchronized(0); // data[0] or something

approximately.  Or maybe the pointer-wrapping thing instead, if that works out cleaner.  I think maybe I concluded in the other patch-feedback that wrapping the raw pointer would be easier/simpler and would provide as much benefit.

So it looks like luke and I basically both would prefer the pointer-wrapping thing.  So I guess I'm kind of okay with this as an intermediate step, but the pointer-wrapping thing would be preferable, so probably you should just do the pointer-wrapping thing directly.

::: js/src/jit/AtomicOperations.h
@@ +98,5 @@
> +    // access memory without synchronization and can't guarantee that
> +    // there won't be a race on the access.  In principle these
> +    // methods will not be written in C++, thus making the races legal
> +    // if *all* accesses to racy addresses from C++ go via these
> +    // functions.  (Jitted code is safe by construction.)

"Jitted code uses machine instructions that provide the same guarantees as these functions."

@@ +109,5 @@
> +    //  - hand-written assembler (maybe inline);
> +    //  - using compiler intrinsics;
> +    //  - trusting the C++ compiler not to generate code that blows
> +    //    up on a race (not really recommended, and remember TSAN
> +    //    is your enemy).

I'd remove the last and note that

    // Trusting the compiler not to generate code that blows up on a race
    // definitely won't work in the presence of TSan, or even of optimizing
    // compilers in seemingly-"innocuous" conditions.  (See
    // https://www.usenix.org/legacy/event/hotpar11/tech/final_files/Boehm.pdf
    // for details.)

@@ +123,5 @@
> +    // Drop-in safe replacement for memcpy()
> +    static inline void* memcpySafeForRaces(void* dest, const void* src, size_t nbytes);
> +
> +    // Drop-in safe replacement for memmove()
> +    static inline void* memmoveSafeForRaces(void* dest, const void* src, size_t nbytes);

I suspect these should be T* and count-based versus void* and size-based, and that the former will be more usable than the latter.

::: js/src/jit/none/AtomicOperations-none.h
@@ +87,5 @@
>  {
>      MOZ_CRASH();
>  }
>  
> +// Not sure what to do for the safe-for-races methods.  Do we ever run

The "none" backend exists for platforms that lack a JIT, and everything that happens in that backend should be unreachable code because CanEnterBaseline or whatever should say No.  So these can all just MOZ_CRASH().
Attachment #8655421 - Flags: feedback?(jwalden+bmo) → feedback+
Oh feh, you obsoleted those patches with the approach we both wanted.  :-\  Looking at those now, hopefully I can get this done before you show up tomorrow/today.
Comment on attachment 8656552 [details] [diff] [review]
Safe-for-races operations, exposed to C++, v2

Review of attachment 8656552 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/AtomicOperations.h
@@ +63,5 @@
>      template<typename T>
>      static inline T loadSeqCst(T* addr);
>  
> +    template<typename T>
> +    static inline T loadSeqCst(SharedMem<T*> addr) {

Remove the gratuitous inlines while you're here.

Don't all the pointer-taking methods want to be private, so that people are forced to go through the SharedMem versions?

@@ +156,5 @@
> +    //  - hand-written assembler (maybe inline);
> +    //  - using compiler intrinsics or directives;
> +    //  - trusting the C++ compiler not to generate code that blows
> +    //    up on a race, perhaps by cleverly using volatile (not
> +    //    really recommended, and remember TSAN is your enemy).

See my comments on all this in the old patch.

@@ +165,5 @@
> +    // is a race on the access.
> +
> +    // Defined for the int types as well as for float32 and float64.
> +    template<typename T>
> +    static inline T loadSafeForRaces(T* addr);

s/SafeForRaces/SafelyRaces/g for a skosh of brevity perhaps.

@@ +204,5 @@
> +    static inline void* memmoveSafeForRaces(void* dest, const void* src, size_t nbytes);
> +
> +    template<typename T>
> +    static inline T memmoveSafeForRaces(SharedMem<T> dest, SharedMem<T> src, size_t nbytes) {
> +        return static_cast<T>(memmoveSafeForRaces(static_cast<void*>(dest.unwrap()), static_cast<void*>(src.unwrap()), nbytes));

Can these memcpy/memmove-like functions just return void?  Gives me jitters seeing side-effectful methods returning things, particularly when the return value is something the user already provided.

::: js/src/jit/none/AtomicOperations-none.h
@@ +88,5 @@
>      MOZ_CRASH();
>  }
>  
> +// Not sure what to do for the safe-for-races methods.  Do we ever run
> +// "none" code?  If we do, then these need to have meaningful

...or wait, you can't make these crash, can you, because they'd be called on JIT-less platforms.  I guess the problem is that this stuff really doesn't belong in the JIT backends, it belongs in some sort of platform-specific yet *not* JIT-related place in the code, with no "none" case at all, just a compile error if no implementation of these for the architecture.

I guess really we want a non-JIT-related place for platform-specific implementations of things, then this can be put in there.  Maybe spin up such a thing in a followup bug?  I'm fine landing with the current locations, as long as we put it aright quickly.

::: js/src/vm/SharedMem.h
@@ +6,5 @@
> +
> +#ifndef vm_SharedMem_h
> +#define vm_SharedMem_h
> +
> +// T should be a pointer type

No need for this, just add

  static_assert(mozilla::IsPointer<T>::value, "SharedMem encapsulates pointer types");

to the class body, and #include "mozilla/TypeTraits.h".

@@ +9,5 @@
> +
> +// T should be a pointer type
> +
> +template<typename T>
> +class SharedMem {

{ on new line for class definitions, in general.

@@ +18,5 @@
> +    };
> +  private:
> +    T ptr_;
> +    Sharedness sharedness_;
> +  public:

Some breathing room (a blank line) between the access sections of this class would be more readable.

@@ +25,5 @@
> +
> +    SharedMem(const SharedMem<T>& that) : ptr_(that.ptr_), sharedness_(that.sharedness_) {}
> +
> +    template<typename U>
> +    explicit SharedMem(const SharedMem<U>& that) : ptr_((T)that.unwrap()), sharedness_((SharedMem<T>::Sharedness)that.sharedness()) {}

I agree -- add a cast<T>() member function.

@@ +27,5 @@
> +
> +    template<typename U>
> +    explicit SharedMem(const SharedMem<U>& that) : ptr_((T)that.unwrap()), sharedness_((SharedMem<T>::Sharedness)that.sharedness()) {}
> +
> +    SharedMem<T>& operator =(const SharedMem<T>& that) { ptr_ = that.ptr_; sharedness_ = that.sharedness_; return *this; }

Multiple lines for this.  Also, I think this would be more readable using s/SharedMem<T>/SharedMem/g, and elsewhere in the rest of this class.  No need to specify particular template arguments if they'll be assumed to be the default ones if you do nothing.

@@ +31,5 @@
> +    SharedMem<T>& operator =(const SharedMem<T>& that) { ptr_ = that.ptr_; sharedness_ = that.sharedness_; return *this; }
> +
> +    SharedMem<T> operator +(size_t offset) { return SharedMem<T>(ptr_ + offset, sharedness_); }
> +
> +    bool operator !() { return ptr_ == nullptr; }

Not |explicit operator bool() { return ptr_ != nullptr; }|?

@@ +34,5 @@
> +
> +    bool operator !() { return ptr_ == nullptr; }
> +
> +    bool isShared() const { return sharedness_ == IsShared; }
> +    Sharedness sharedness() const { return sharedness_; }

I'd kind of prefer exactly one of these, ideally the enum one.  I assume a bool passes in somewhere where the meaning isn't quite so obvious the way an enum would be.

@@ +36,5 @@
> +
> +    bool isShared() const { return sharedness_ == IsShared; }
> +    Sharedness sharedness() const { return sharedness_; }
> +    T unwrap() const { return ptr_; }
> +    T unwrapUnshared() const { MOZ_ASSERT(sharedness_ == IsUnshared); return ptr_; }

I don't think unwrap* should be public like this.  If you do it right, can't these methods be private, exposed *only* to the AtomicOperations struct encapsulating all the load/store/exchange/etc. operations via a couple friend declarations?  That seems like the ultimate goal of SharedMem to me.

@@ +41,5 @@
> +};
> +
> +template<typename T>
> +inline bool operator >=(const SharedMem<T>& a, const SharedMem<T>& b) {
> +    return a.unwrap() >= b.unwrap();

template<typename T>
inline bool
operator >=(...)
{
   ...
}

and so on for the rest.

That said, are these added because of actual need for them, or just for smart-pointer-completeness?  Aside from ++/-- and ==, I probably wouldn't add any of these.  It's hard for me to think of too many places where you'd need these operators (and moreover, the dodgy unwraps inside them).
Attachment #8656552 - Flags: feedback?(jwalden+bmo) → feedback+
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #19)
> Comment on attachment 8656552 [details] [diff] [review]
> Safe-for-races operations, exposed to C++, v2
> 
> Review of attachment 8656552 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Don't all the pointer-taking methods want to be private, so that people are
> forced to go through the SharedMem versions?

Eventually, they do want to be private.

> @@ +165,5 @@
> > +    // is a race on the access.
> > +
> > +    // Defined for the int types as well as for float32 and float64.
> > +    template<typename T>
> > +    static inline T loadSafeForRaces(T* addr);
> 
> s/SafeForRaces/SafelyRaces/g for a skosh of brevity perhaps.

They tend not to be called directly that many places so clarity over brevity, but perhaps there are better names out there...

> @@ +204,5 @@
> > +    static inline void* memmoveSafeForRaces(void* dest, const void* src, size_t nbytes);
> > +
> > +    template<typename T>
> > +    static inline T memmoveSafeForRaces(SharedMem<T> dest, SharedMem<T> src, size_t nbytes) {
> > +        return static_cast<T>(memmoveSafeForRaces(static_cast<void*>(dest.unwrap()), static_cast<void*>(src.unwrap()), nbytes));
> 
> Can these memcpy/memmove-like functions just return void?  Gives me jitters
> seeing side-effectful methods returning things, particularly when the return
> value is something the user already provided.

Yes!  I've had the same reaction, will change this.

> ::: js/src/jit/none/AtomicOperations-none.h
> @@ +88,5 @@
> >      MOZ_CRASH();
> >  }
> >  
> > +// Not sure what to do for the safe-for-races methods.  Do we ever run
> > +// "none" code?  If we do, then these need to have meaningful
> 
> ...or wait, you can't make these crash, can you, because they'd be called on
> JIT-less platforms.  I guess the problem is that this stuff really doesn't
> belong in the JIT backends, it belongs in some sort of platform-specific yet
> *not* JIT-related place in the code, with no "none" case at all, just a
> compile error if no implementation of these for the architecture.

No, that's exactly the problem - these functions expose JIT /behavior/ to C++, i.e., these /must/ do what jitted code does on the platform.  When that coincides with what the C++ compiler does, great.  When it doesn't, it's trickier.

On JIT-less platforms talking about "jit behavior" doesn't make sense, but here these primitives must still expose behavior that may be outside what C++ allows for.  The SafeForRaces methods are in principle not implementable in C++.

> I guess really we want a non-JIT-related place for platform-specific
> implementations of things, then this can be put in there.  Maybe spin up
> such a thing in a followup bug?  I'm fine landing with the current
> locations, as long as we put it aright quickly.

OK.  Really, this is JIT functionality, so it's wrong to see it as a platform thing per se, but we'll need to resolve the conundrum.

Also I may need to document this better.

> ::: js/src/vm/SharedMem.h
> @@ +6,5 @@
> > +
> > +#ifndef vm_SharedMem_h
> > +#define vm_SharedMem_h
> > +
> > +// T should be a pointer type
> 
> No need for this, just add
> 
>   static_assert(mozilla::IsPointer<T>::value, "SharedMem encapsulates
> pointer types");
> 
> to the class body, and #include "mozilla/TypeTraits.h".

Sweet.

> @@ +25,5 @@
> > +
> > +    SharedMem(const SharedMem<T>& that) : ptr_(that.ptr_), sharedness_(that.sharedness_) {}
> > +
> > +    template<typename U>
> > +    explicit SharedMem(const SharedMem<U>& that) : ptr_((T)that.unwrap()), sharedness_((SharedMem<T>::Sharedness)that.sharedness()) {}
> 
> I agree -- add a cast<T>() member function.

Will investigate, still questioning usability but hope to be pleasantly surprised.

> @@ +31,5 @@
> > +    SharedMem<T>& operator =(const SharedMem<T>& that) { ptr_ = that.ptr_; sharedness_ = that.sharedness_; return *this; }
> > +
> > +    SharedMem<T> operator +(size_t offset) { return SharedMem<T>(ptr_ + offset, sharedness_); }
> > +
> > +    bool operator !() { return ptr_ == nullptr; }
> 
> Not |explicit operator bool() { return ptr_ != nullptr; }|?

I tried that, and it blew up badly, but it could be because I left off the explicit?  Can try again.  Regardless, clearly this set of methods can be cleaned up.

> @@ +34,5 @@
> > +
> > +    bool operator !() { return ptr_ == nullptr; }
> > +
> > +    bool isShared() const { return sharedness_ == IsShared; }
> > +    Sharedness sharedness() const { return sharedness_; }
> 
> I'd kind of prefer exactly one of these, ideally the enum one.  I assume a
> bool passes in somewhere where the meaning isn't quite so obvious the way an
> enum would be.

isShared() will go away and the sharedness_ member will become DEBUG-only, I've experimented with the big follow-on patch and that seems to be sufficient.

> @@ +36,5 @@
> > +
> > +    bool isShared() const { return sharedness_ == IsShared; }
> > +    Sharedness sharedness() const { return sharedness_; }
> > +    T unwrap() const { return ptr_; }
> > +    T unwrapUnshared() const { MOZ_ASSERT(sharedness_ == IsUnshared); return ptr_; }
> 
> I don't think unwrap* should be public like this.  If you do it right, can't
> these methods be private, exposed *only* to the AtomicOperations struct
> encapsulating all the load/store/exchange/etc. operations via a couple
> friend declarations?

Not really.  Unwrapping is currently needed in a substantial number of places, and almost always it can be done safely.  It is used at the engine boundary, where it appears most convenient to export the pointer and the isShared flag separately.  It is used where the pointer is needed for its value, though in many cases that's been handled by the operator overloads so it's not as bad as it could be.  There are other cases.  We can argue about each of those uses, but in the end it will come down to whether to add (more) single-use methods or operators for corner cases, or to expose the unwrapping.  I'm trying to achieve a balance.

> That seems like the ultimate goal of SharedMem to me.

IMO it is not.  The goal is to reduce the risk of passing shared memory pointers around, but sometimes it becomes highly desirable to just handle the pointer, because it is safe to do so.  The intent is not to create an insurmountable obstacle, just to prevent you from shooting yourself in the foot without trying really hard.

It's always desirable to remove in-line unwraps when possible, but sometimes there's a lot of work for no actual gain.

> 
> @@ +41,5 @@
> > +};
> > +
> > +template<typename T>
> > +inline bool operator >=(const SharedMem<T>& a, const SharedMem<T>& b) {
> > +    return a.unwrap() >= b.unwrap();
> 
> template<typename T>
> inline bool
> operator >=(...)
> {
>    ...
> }
> 
> and so on for the rest.
> 
> That said, are these added because of actual need for them,

These were added because they were needed as an alternative to explicit unwraps in-line.  Many are used in assertions, for example.

> or just for
> smart-pointer-completeness?  Aside from ++/-- and ==, I probably wouldn't
> add any of these.  It's hard for me to think of too many places where you'd
> need these operators (and moreover, the dodgy unwraps inside them).

See the next patch for many uses of these. 

What makes you think the unwraps are dodgy?  They are well defined, the pointers are used only for their values, not dereferenced.  Or are you just objecting to the uses of unwraps in the first place?
Comment on attachment 8656554 [details] [diff] [review]
Avoid undefined behavior in accessing shared memory, v2

Review of attachment 8656554 [details] [diff] [review]:
-----------------------------------------------------------------

Ugh, this is so messy.  Hopefully a cast<> method will help some.  I hope.  This is so fugly, but I'm at the point in the night/day where I just want to make the fugly code go away so I don't have to think about it, so (gulp) f+.

::: js/src/asmjs/AsmJSSignalHandlers.cpp
@@ +382,4 @@
>  {
>      MOZ_RELEASE_ASSERT(size <= Simd128DataSize);
>      memset(fp_reg, 0, Simd128DataSize);
> +    AtomicOperations::memcpySafeForRaces(fp_reg, addr, size);

You still have the bad memsets in all these methods.

...or wait, no, these aren't writing into shared memory, they're writing into memory a *copy* of the value loaded from the shared address.  So memsets are fine.  Never mind that red herring!

@@ +574,5 @@
>      uintptr_t result = address.disp();
>  
>      if (address.hasBase()) {
>          uintptr_t base;
> +        StoreValueFromGPReg(SharedMem<void*>(&base, SharedMem<void*>::IsUnshared), sizeof(uintptr_t),

Ugh again.  Hopefully luke has a nicer incantation (or at least a [wait for it] shared method to call, or create for calling) that will hide this bogosity.

@@ +655,5 @@
>      // fully wrapped address, and perform the load or store on it.
>      //
>      // Taking a signal is really slow, but in theory programs really shouldn't
>      // be hitting this anyway.
> +    intptr_t unwrappedOffset = accessAddress - module.maybeHeap().unwrap(/*safe - for value*/);

Bah, this should have been size_t and PointerRangeSize, so you don't have any concerns about signed-integer wraparound undefinedness.  Oh well.  Not relevant here.

@@ +682,5 @@
>          MOZ_RELEASE_ASSERT(wrappedAddress + size > wrappedAddress);
>          MOZ_RELEASE_ASSERT(wrappedAddress + size <= module.maybeHeap() + module.heapLength());
>          switch (access.kind()) {
>            case Disassembler::HeapAccess::Load:
> +            SetRegisterToLoadedValue(context, SharedMem<void*>(wrappedAddress), size, access.otherOperand());

Maybe an asUntyped() method that does this void* cast?

::: js/src/builtin/AtomicsObject.cpp
@@ +121,5 @@
>  }
>  
>  static int32_t
> +CompareExchange(Scalar::Type viewType, int32_t oldCandidate, int32_t newCandidate,
> +                SharedMem<void*> viewData, uint32_t offset, bool* badArrayType=nullptr)

Spaces around =.

::: js/src/builtin/SIMD.cpp
@@ +1174,5 @@
>      if (!IsVectorObject<V>(args[2]))
>          return ErrorBadArgs(cx);
>  
>      Elem* src = TypedObjectMemory<Elem*>(args[2]);
> +    SharedMem<Elem*> dst = SharedMem<Elem*>(SharedMem<char*>(AnyTypedArrayViewData(typedArray)) + byteStart);

Ugh that's disgusting.  :-)  Maybe add a member method that converts to bytes for addition, rather than this monstrosity?  (And above as well.)

::: js/src/jit/IonBuilder.cpp
@@ +9260,5 @@
>      if (!tarrObj)
>          return true;
>  
> +    SharedMem<void*> viewData = AnyTypedArrayViewData(tarrObj);
> +    if (tarrObj->runtimeFromMainThread()->gc.nursery.isInside(viewData.unwrap(/*safe - for value*/)))

This unwrap being missed in the other patch is a good reason to have

inline bool
TypedArrayDataIsInsideNursery(const SharedMem<T>& ptr)
{
    ....
}

and similar things, to force the unwrapping into friended classes/methods, IMO.  You could/should use such a method in the addTypedArrayLengthAndData case as well.

::: js/src/jit/MIR.cpp
@@ +4576,5 @@
>  void*
>  MLoadTypedArrayElementStatic::base() const
>  {
> +    SharedMem<void*> viewData = AnyTypedArrayViewData(someTypedArray_);
> +    return viewData.unwrap(/*unsafe? - unsafe if the pointer flows out of jitted code*/);

I'd make these //-style comments above the line in question, throughout the patch.

::: js/src/vm/ArrayBufferObject-inl.h
@@ +24,5 @@
> +    if (buf->is<ArrayBufferObject>()) {
> +        return SharedMem<uint8_t*>(buf->as<ArrayBufferObject>().dataPointer(),
> +                                   SharedMem<uint8_t*>::IsUnshared);
> +    }
> +    return buf->as<SharedArrayBufferObject>().dataPointerShared();

So actually, I was imagining making SharedMem *the* fundamental primitive for exposing typed array/arraybuffer memory at all.  Everything would deal in terms of it.  And when the memory in question *wasn't* actually shared, you'd just pass the pointer to a NonAtomicOps struct, with methods/signatures identical to that in AtomicOperations, and there'd just be dynamic dispatch based on the sharedness of the wrapped pointer.
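Hypothetical sketch of that dispatch (names made up here; in the envisioned design the unshared branch would go to the NonAtomicOps struct):

    template<typename T>
    static T
    loadAny(SharedMem<T*> addr)
    {
        // Fork on the wrapped flag: racy-safe path for shared memory, plain
        // C++ access (i.e., the NonAtomicOps equivalent) otherwise.
        if (addr.isShared())
            return AtomicOperations::loadSafeForRaces(addr);
        return *addr.unwrap();
    }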

::: js/src/vm/ArrayBufferObject.h
@@ +14,1 @@
>  #include "vm/Runtime.h"

Alphabetical.

@@ +90,5 @@
> +    // never be used to access it, as that will lead to undefined
> +    // behavior.  Instead use the "SafeForRaces" operations in
> +    // AtomicOperations.
> +
> +    inline SharedMem<uint8_t*> dataPointerMaybeShared();

If unwrap is never exposed except to friends, and it's the only way to get typed array data pointers, there's not really any danger about this thing existing.  Make operations take SharedMem, then fork based on sharedness.  Again, this seems the ultimate desirable state to me.

::: js/src/vm/SharedTypedArrayObject.cpp
@@ +71,5 @@
>       * N.B. The base of the array's data is stored in the object's
>       * private data rather than a slot to avoid the restriction that
>       * private Values that are pointers must have the low bits clear.
>       */
> +    MOZ_ASSERT(buffer->dataPointerShared().unwrap(/*safe - for value*/) != nullptr);

Just do != nullptr?  (inline bool operator==(SM<T>, decltype(nullptr)) seems like a reasonable operator to define, to me -- and its opposite and inverse.)
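That is, roughly, in the style of the existing relational operators (plus the != counterpart):

    template<typename T>
    inline bool
    operator==(const SharedMem<T>& a, decltype(nullptr))
    {
        return a.unwrap() == nullptr;
    }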

@@ +366,5 @@
>      getIndex(JSObject* obj, uint32_t index)
>      {
>          SharedTypedArrayObject& tarray = obj->as<SharedTypedArrayObject>();
>          MOZ_ASSERT(index < tarray.length());
> +        return jit::AtomicOperations::loadSafeForRaces(SharedMem<NativeType*>(tarray.viewDataShared())+index);

spaces around +, and elsewhere

::: js/src/vm/TypedArrayCommon.h
@@ +142,5 @@
>  
> +class SharedOps {
> +public:
> +    template<typename U>
> +    static inline U read(SharedMem<U*> addr) {

Given we use load/store everywhere else, super-mild preference for those terms.

As before/elsewhere, remove the inlines on these.  Also it's non-obvious why you use U/V/T rather than just T, for these methods.

@@ +233,4 @@
>          switch (AnyTypedArrayType(source)) {
>            case Scalar::Int8: {
>              JS_VOLATILE_ARM
> +            SharedMem<int8_t*> src = SharedMem<int8_t*>(data);

Even more strongly unconvinced that JS_VOLATILE_ARM is going to work for this any more.  And definitely a casting method would be good, and probably having one templated class with a method to do this would be better than N different fugly copies.

@@ +325,5 @@
>              const Value* srcValues = source->as<NativeObject>().getDenseElements();
>              for (; i < bound; i++) {
>                  if (!canConvertInfallibly(srcValues[i]))
>                      break;
> +                Ops::write(dest+i, infallibleValueToNative(srcValues[i]));

dest + i

@@ +383,5 @@
>          if (!data)
>              return false;
> +        Ops::memcpy(SharedMem<uint8_t*>(static_cast<uint8_t*>(data), SharedMem<uint8_t*>::IsUnshared),
> +                    SharedMem<uint8_t*>(AnyTypedArrayViewData(source)),
> +                    sourceByteLen);

Eeeugh, this is the ugliest thing I've ever seen.  Could you make |data| a SharedMem, maybe, to avoid at least a little mess, and then to cast it however needed later below?
Attachment #8656554 - Flags: feedback?(jwalden+bmo) → feedback+
Attachment #8655426 - Flags: feedback?(luke)
Attachment #8655421 - Flags: feedback?(luke)
Waldo, select followups:

Can you clarify what you (and Luke) mean by a cast<T> member function?  I sense a pattern I'm not familiar with, or I'm missing some vital piece of information.  That said, I changed the constructor in question from (T)ptr to static_cast<T>(ptr); this required only a single fixup, and that was later simplified, so maybe this is not such a big deal any more.

You also mention that you'd like to see my .unwrap(/*justification*/) comments on the previous line.  That's what I'm trying to avoid - the justification disappearing in the noise.  I was toying with the idea of requiring an actual argument for the unwrapping but it seemed ... excessive.  How about we leave it like it is until the rest of the patch queue falls into place and we see how many unwrap calls are left?  Like you I'd prefer to get rid of as many as possible.

JS_VOLATILE_ARM...  yeah, something is missing there, will investigate.

Beyond that, thanks for copious feedback, I've addressed most everything else.
I'd like a review of this with an eye toward landing it even if there are FIXME comments in some of the platform implementations.  (They will be augmented with appropriate bug references before landing.)  The purpose is to provide the basis for the next patch, which is the important one.

It is indeed possible that there should be additional FIXME comments.  The pre-existing implementations of the sequentially consistent primitives are written in C++ using compiler intrinsics, but those primitives are non-racy only if the accesses they conflict with are other such primitives operating on the same address with the same size.  If, say, an atomic store of uint8 conflicts with an atomic store of uint16 there is an actual race, hence UB.  However, the use of compiler intrinsics on a per-platform basis may give us an out; I'm not sure, but I will continue to investigate.  In the meantime, nothing in this patch makes anything any worse than it already is, and it isolates the horrific parts.
Attachment #8658188 - Flags: review?(jwalden+bmo)
In this patch the solution for the ARM volatile is to move volatile into the pointer type of the smart pointer.  I'm going to be testing this extensively.

The number of unwrap() operations has been reduced by adding other accessors for other purposes (e.g., asValue() returns a uintptr_t and can be used in many places where the pointer is not used as a pointer) and by propagating the smart pointer throughout.  I'm pretty happy with this as it stands.
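For reference, asValue() is just a value-only view of the address (shape only; the landed version may differ):

    // Expose the address purely as a number, for hashing, printing, and
    // comparisons, without handing out a dereferenceable pointer.
    uintptr_t asValue() const {
        return reinterpret_cast<uintptr_t>(ptr_);
    }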
Attachment #8658189 - Flags: review?(jwalden+bmo)
I'll get to reviews here today.  Believe it or not, this is the newest review in my queue right now (although I've flushed one or two much-simpler, more-recent reviews since these were posted, in an attempt to make myself feel like I've made progress on reviewing stuff).
Comment on attachment 8658188 [details] [diff] [review]
Safe-for-races operations, exposed to C++, v2 cleaned up

Review of attachment 8658188 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/AtomicOperations.h
@@ +87,3 @@
>  class AtomicOperations
>  {
> +    friend class RegionLock;

Hmm, is this actually needed?  Maybe in a subsequent patch?  'cause right now I'm not seeing why this is here, or why the forward-decl is needed above.

@@ +90,4 @@
>  
> +  private:
> +    // The following functions are defined for T = int8_t, uint8_t,
> +    // int16_t, uint16_t, int32_t, uint32_t, int64_t, and uint64_t.

It'd be nice to statically assert that these are defined only for these types, but I guess it's hard to do without platform-duplicating the assertion.  :-\

@@ +104,5 @@
>      template<typename T>
>      static inline T exchangeSeqCst(T* addr, T val);
>  
>      // Atomically check that *addr contains oldval and if so replace it
> +    // with newval, in any case return the old contents of *addr.

s/return/returning/

@@ +130,5 @@
>      static inline T fetchXorSeqCst(T* addr, T val);
> +
> +    // The NoSync functions are to be used when C++ code has to access
> +    // memory without synchronization and can't guarantee that there
> +    // won't be a race on the access.

I imagine you chose NoSync over Racing/Racy for a reason?  The latter sound more dangerous, ergo less likely for the unwary to pick up and use.

@@ +132,5 @@
> +    // The NoSync functions are to be used when C++ code has to access
> +    // memory without synchronization and can't guarantee that there
> +    // won't be a race on the access.
> +
> +    // Defined for all the int types as well as for float32 and float64.

integral

@@ +136,5 @@
> +    // Defined for all the int types as well as for float32 and float64.
> +    template<typename T>
> +    static inline T loadNoSync(T* addr);
> +
> +    // Defined for all the int types as well as for float32 and float64.

integral

@@ +160,5 @@
> +    // If the return value is true then a call to the 64-bit (8-byte)
> +    // routines below will work, otherwise those functions will assert in
> +    // debug builds and may crash in release build.  (See the code in
> +    // ../arm for an example.)  The value of this call does not change
> +    // during a run.

s/a run/execution/

::: js/src/vm/SharedMem.h
@@ +11,5 @@
> +
> +template<typename T>
> +class SharedMem
> +{
> +    static_assert(mozilla::IsPointer<T>::value, "SharedMem encapsulates pointer types");

Wrap the string to align the first quote underneath the "m" in mozilla -- these reason-strings are more like comments (that we wrap to 79ch) than code (that we wrap to 99ch), IMO.

@@ +127,5 @@
> +    }
> +
> +    // Cast to char*, add nbytes, and cast back to T.  Simplifies code in a few places.
> +    SharedMem addBytes(size_t nbytes) {
> +        return SharedMem(reinterpret_cast<T>(reinterpret_cast<char*>(ptr_) + nbytes), *this);

MOZ_ASSERT(nbytes % sizeof(typename mozilla::RemovePointer<T>::Type) == 0);

I assume, right?
Attachment #8658188 - Flags: review?(jwalden+bmo) → review+
(In reply to Lars T Hansen [:lth] from comment #22)
> Can you clarify what you (and Luke) mean by a cast<T> member function?  I
> sense a pattern I'm not familiar with, or I'm missing some vital piece of
> information.  That said, I changed the constructor in question from doing
> (T)ptr to static_cast<T>(ptr), this required only a single fixup and that
> was later simplified, so maybe this is not such a big deal any more.

Something like

  template<typename U>
  inline SharedMem<U>
  cast()
  {
    return SharedMem<U>(reinterpret_cast<U>(unwrap()),
                        (SharedMem<U>::Sharedness)sharedness());
  }

with whatever grunge is needed to make the inside code work, using whatever private constructor is needed.  Then you'd have |SharedMem<uint32_t*> sm; sm.cast<int32_t*>()| if you wanted to reinterpret_cast one sort of shared pointer to another.

Or somesuch like that.

(I guess I forgot to mention this in the previous patch-review, didn't I?  Feh.)
If MOZ_ASSERT(nbytes % sizeof(typename mozilla::RemovePointer<T>::Type) == 0); fails because T is sometimes void* and sizeof(void) is a no, use mozilla::Conditional and mozilla::IsVoid to substitute in char or somesuch, to keep the assertion whenever possible.
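Something along these lines, assuming mfbt's Conditional/IsVoid/RemovePointer traits (untested):

  // Fall back to char (element size 1) when T is void*, so the assertion
  // still compiles and is trivially true for untyped pointers.
  typedef typename mozilla::RemovePointer<T>::Type Raw;
  typedef typename mozilla::Conditional<mozilla::IsVoid<Raw>::value, char, Raw>::Type Elem;
  MOZ_ASSERT(nbytes % sizeof(Elem) == 0);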
Comment on attachment 8658189 [details] [diff] [review]
Avoid undefined behavior in accessing shared memory, v2 cleaned up

Review of attachment 8658189 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/asmjs/AsmJSSignalHandlers.cpp
@@ +382,4 @@
>  {
>      MOZ_RELEASE_ASSERT(size <= Simd128DataSize);
>      memset(fp_reg, 0, Simd128DataSize);
> +    AtomicOperations::memcpyNoSync(fp_reg, addr, size);

...I guess this answers the "no, it really sometimes is untyped memcpy" question raised elsewhere.  Maybe the alignment assertion could be downgraded somehow.

Or, looking at the rest of this code, is there a reason this can't be a template method?  Then |addr| could be not-void*-typed, couldn't it?

::: js/src/builtin/AtomicsObject.cpp
@@ +292,5 @@
>  
>      switch (viewType) {
>        case Scalar::Int8: {
>          int8_t value = (int8_t)numberValue;
> +        INT_OP(SharedMem<int8_t*>(viewData) + offset, value);

viewData.cast<int8_t*>()

is what a cast method would look like for these.

::: js/src/builtin/SIMD.cpp
@@ +1147,5 @@
>      Rooted<TypedObject*> result(cx, TypedObject::createZeroed(cx, typeDescr, 0));
>      if (!result)
>          return false;
>  
> +    SharedMem<Elem*> src = SharedMem<Elem*>(AnyTypedArrayViewData(typedArray).addBytes(byteStart));

AnyTypedArrayViewData(typedArray).addBytes(byteStart).cast<Elem*>() once the cast<T> method is added.  And I guess this does mean you need the IsVoid gunky things.

And I guess, come to think of it, cast<T>() should be asserting that the unwrapped pointer is properly T*-aligned.  Please add that into the cast method.

@@ +1174,5 @@
>      if (!IsVectorObject<V>(args[2]))
>          return ErrorBadArgs(cx);
>  
>      Elem* src = TypedObjectMemory<Elem*>(args[2]);
> +    SharedMem<Elem*> dst = SharedMem<Elem*>(AnyTypedArrayViewData(typedArray).addBytes(byteStart));

Same sort of transforms again.

@@ +1175,5 @@
>          return ErrorBadArgs(cx);
>  
>      Elem* src = TypedObjectMemory<Elem*>(args[2]);
> +    SharedMem<Elem*> dst = SharedMem<Elem*>(AnyTypedArrayViewData(typedArray).addBytes(byteStart));
> +    js::jit::AtomicOperations::memcpyNoSync(dst, src, sizeof(Elem) * NumElem);

If the memcpy/memmove functions are always called with typed pointers (are they?), I would rename them to copyElements/moveElements and make the third argument just a count.  This sort of size-multiplication gives me the willies.

::: js/src/jit/IonBuilder.cpp
@@ +8894,5 @@
>      else if (obj->resultTypeSet())
>          tarr = obj->resultTypeSet()->maybeSingleton();
>  
>      if (tarr) {
> +        SharedMem<void*> data = AnyTypedArrayViewData(tarr);

This is pretty nice!  (My brain is clearly demented.)

::: js/src/jit/MIR.cpp
@@ +1064,5 @@
>  void
>  MConstantElements::printOpcode(GenericPrinter& out) const
>  {
>      PrintOpcodeName(out, op());
> +    out.printf(" %" PRIxPTR, value().asValue());

I think %p typically expands to "0x...." whereas %PRIxPTR is typically "....", so you might want an 0x in the format string.

::: js/src/jit/MIR.h
@@ +7749,5 @@
>  
>      void printOpcode(GenericPrinter& out) const override;
>  
>      HashNumber valueHash() const override {
> +        return (HashNumber)(size_t) value_.asValue();

Pre-existing, but that's a very ugh algorithm for hashing a 64-bit pointer.  Worth a followup, or poking people on IRC and choosing something else drive-by.
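
One drive-by possibility, if pulling in mfbt's generic hasher here is acceptable (just a suggestion, not something this patch needs to do):

HashNumber valueHash() const override {
    // mozilla::HashGeneric (mfbt/HashFunctions.h) mixes all the bits of a
    // pointer-sized value instead of truncating it.
    return mozilla::HashGeneric(value_.asValue());
}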

::: js/src/jit/x86/CodeGenerator-x86.cpp
@@ +301,5 @@
>          else
>              bailoutIf(Assembler::AboveOrEqual, ins->snapshot());
>      }
>  
> +    Operand srcAddr(ptr, int32_t(mir->base().asValue()) + int32_t(offset));

Ugh, unsigned-signed cast.  Seems like it'd be nice if there were another Operand constructor that accepted the unsigned type here, then converted to a signed displacement internally.  Not necessary here, though a mini-patch for it would be nice.
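
Something as small as this would do (a sketch; the helper name is made up and it assumes the existing (Register, int32_t) Operand constructor):

static inline Operand
OperandFromUnsignedDisp(Register base, uintptr_t disp)
{
    // Verify the displacement fits the signed 32-bit field before narrowing,
    // rather than casting blindly at the call site.
    MOZ_ASSERT(disp <= uintptr_t(INT32_MAX));
    return Operand(base, int32_t(disp));
}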

::: js/src/vm/ArrayBufferObject-inl.h
@@ +18,5 @@
>  
>  namespace js {
>  
> +inline SharedMem<uint8_t*>
> +ArrayBufferObjectMaybeShared::dataPointerMaybeShared() {

{ on its own line.

::: js/src/vm/SharedArrayObject.cpp
@@ +168,3 @@
>  #       endif
>  #else
> +        UnmapMemory(p.unwrap(/*safe - only reference*/), this->length + AsmJSPageSize);

Seems like you could unwrap just after the SharedMem decl, once, for all of this.  Mildly clearer that every place definitely uses the same pointer, IMO.

@@ +394,2 @@
>      *length = obj->as<SharedTypedArrayObject>().byteLength();
> +    *data = static_cast<uint8_t*>(obj->as<SharedTypedArrayObject>().viewDataShared().unwrap(/*safe - caller knows*/));

"caller knows", lol.  I think we probably should bring in bz or someone to verify this, because who the heck knows  what'll happen if you pass a SharedArrayBufferView being raced upon into, say, WebGL calls that need to verify the data first.

It would be really, really nice to not have this API.  I'm not sure how to make that happen, exactly.  Maybe we can convince someone to forbid Shared stuff in the APIs that happen to accept this.

@@ +399,5 @@
>  js::GetSharedArrayBufferLengthAndData(JSObject* obj, uint32_t* length, uint8_t** data)
>  {
>      MOZ_ASSERT(obj->is<SharedArrayBufferObject>());
>      *length = obj->as<SharedArrayBufferObject>().byteLength();
> +    *data = obj->as<SharedArrayBufferObject>().dataPointerShared().unwrap(/*safe - caller knows*/);

And, same for SAB accesses as before.

@@ +415,5 @@
>  {
>      obj = CheckedUnwrap(obj);
>      if (!obj)
>          return nullptr;
> +    return obj->as<SharedArrayBufferObject>().dataPointerShared().unwrap(/*safe - caller knows*/);

Same again.

::: js/src/vm/SharedArrayObject.h
@@ +74,5 @@
>      void setWaiters(FutexWaiter* waiters) {
>          waiters_ = waiters;
>      }
>  
> +    inline SharedMem<uint8_t*> dataPointerShared() const {

Remove inline here; it's implied.

@@ +77,5 @@
>  
> +    inline SharedMem<uint8_t*> dataPointerShared() const {
> +        return SharedMem<uint8_t*>::shared(reinterpret_cast<uint8_t*>(
> +                                               const_cast<SharedArrayRawBuffer*>(this)) +
> +                                           sizeof(SharedArrayRawBuffer));

If the argument's this long, I'd rather see a temporary so there's no awkward linebreak.
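
E.g. (same code as in the hunk, just with a temporary; the comment is mine):

uint8_t* base = reinterpret_cast<uint8_t*>(const_cast<SharedArrayRawBuffer*>(this));
// The data region starts immediately after the SharedArrayRawBuffer header.
return SharedMem<uint8_t*>::shared(base + sizeof(SharedArrayRawBuffer));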

::: js/src/vm/SharedTypedArrayObject.cpp
@@ +675,5 @@
>            return nullptr;                                                                   \
>                                                                                              \
>        SharedTypedArrayObject* tarr = &obj->as<SharedTypedArrayObject>();                    \
>        *length = tarr->length();                                                             \
> +      *data = static_cast<ExternalType*>(tarr->viewDataShared().unwrap(/*safe - caller knows*/)); \

Feh as well for this (WebGL or something else needing to access a shared typed array's raw data, and almost certainly unsafely wrt racing).  Again consult bz or someone (see the other comment mentioning him).

@@ +1010,5 @@
>      if (!obj)
>          return nullptr;
>      SharedTypedArrayObject* tarr = &obj->as<SharedTypedArrayObject>();
>      MOZ_ASSERT((int32_t) tarr->type() == Scalar::Int8);
> +    return static_cast<int8_t*>(tarr->viewDataShared().unwrap(/*safe - caller knows*/));

And, all these methods are dodgy too.

::: js/src/vm/TypedArrayCommon.h
@@ +137,5 @@
>      return IsTypedArrayClass(clasp) || IsSharedTypedArrayClass(clasp);
>  }
>  
> +class SharedOps {
> +public:

class SharedOps
{
  public:
    template...

and same for UnsharedOps.
Attachment #8658189 - Flags: review?(jwalden+bmo) → review+
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #29)
> Comment on attachment 8658189 [details] [diff] [review]
> Avoid undefined behavior in accessing shared memory, v2 cleaned up
> 
> Review of attachment 8658189 [details] [diff] [review]:
> -----------------------------------------------------------------

Thanks for the comments, I thought I would allow myself one followup:

> @@ +394,2 @@
> >      *length = obj->as<SharedTypedArrayObject>().byteLength();
> > +    *data = static_cast<uint8_t*>(obj->as<SharedTypedArrayObject>().viewDataShared().unwrap(/*safe - caller knows*/));
> 
> "caller knows", lol.  I think we probably should bring in bz or someone to
> verify this, because who the heck knows  what'll happen if you pass a
> SharedArrayBufferView being raced upon into, say, WebGL calls that need to
> verify the data first.
> 
> It would be really, really nice to not have this API.  I'm not sure how to
> make that happen, exactly.  Maybe we can convince someone to forbid Shared
> stuff in the APIs that happen to accept this.

So what's going on here (and in several other changes that you objected to in the same way) is that (1) "Shared" is part of the function name, so the caller *really* knows (because these are called only from new code that can handle shared memory; all existing code uses the unshared API), and (2) the signatures of these functions will change to propagate the isShared flag once SharedTypedArray is removed, at which point the situation becomes unambiguous.  There is a monstrous patch on bug 1176214 that demonstrates this in practice.

I've been over this with bz in the past (at Whistler) and I think he's OK with where it's going, but there will be further reviews on those patches when we get that far, from the DOM people.

Note that the above changes don't change anything in whether DOM races or not.  Racy DOM code is converted as part of the monster patch to use the safe-for-races functions, when necessary.


OK, a second followup: we had "loadSafeForRaces" which is fairly clear, given the C++ prohibition on races - Take This Road to Avoid the Dragons.  It's not pretty though, so it became "loadNoSync" because these are unsynchronized. "loadRacy" is misleading IMO, it looks like a warning - Here Be Dragons - when it's really meant to convey just the opposite.  "NoSync" is unfortunately not conveying that properly (and then relaxed accesses are also not synchronizing, confusing the matter further).  "NotRacy" is also wrong.  I might go back to "SafeForRaces"; I could shorten to "Safe", but safe from what?

Jim Blandy suggests maybe tying it in with the C++ language definition somehow, perhaps using the term for the externally observable effect that must be preserved by the language implementation, whatever that might be.  Probably doesn't quite capture what we're after, either.  ("loadNeverUB"?  I don't think so.  "loadAlwaysWellDefined"?  Ugh.)
Maybe "SafeWhenRacy" is the sweet spot.
Blocks: 1208663
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #29)
> Comment on attachment 8658189 [details] [diff] [review]
> Avoid undefined behavior in accessing shared memory, v2 cleaned up
> 
> Review of attachment 8658189 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> @@ +1175,5 @@
> >          return ErrorBadArgs(cx);
> >  
> >      Elem* src = TypedObjectMemory<Elem*>(args[2]);
> > +    SharedMem<Elem*> dst = SharedMem<Elem*>(AnyTypedArrayViewData(typedArray).addBytes(byteStart));
> > +    js::jit::AtomicOperations::memcpyNoSync(dst, src, sizeof(Elem) * NumElem);
> 
> If the memcpy/memmove functions are always called with typed pointers (are they?), 

Sadly no, they are predominantly not, for the time being.  This may change once I start massaging the DOM code though.

> ... I would rename them to copyElements/moveElements and make the third
> argument just a count.  This sort of size-multiplication gives me the
> willies.

Amen to that.  I will file a followup bug.
(In reply to Lars T Hansen [:lth] from comment #24)
> Created attachment 8658189 [details] [diff] [review]
> Avoid undefined behavior in accessing shared memory, v2 cleaned up
> 
> In this patch the solution for the ARM volatile is to move volatile into the
> pointer type of the smart pointer.  I'm going to be testing this extensively.

Alas I am unable to repro the original problem (bug 1097253) without the patch on local hardware even with GCC 4.7 (Ubuntu/Linaro 4.7.3-12ubuntu1), in a release build, using the test case provided on that bug.  This is not very surprising since I may have a different GCC or I may be compiling with different settings since this is on a Linux dev board, not for Android, but it makes it hard to know if I'll be introducing a new problem.

Try runs with the patch are going here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=97ea15846cbd.  Some failures need investigation, but they are likely not the fault of this patch.
Blocks: 1210399
See Also: → 1210456