Closed Bug 1389669 Opened 7 years ago Closed 7 years ago

Change definitions of AtomicOperations so as to create fewer headaches

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox57 --- fixed

People

(Reporter: modmaker, Assigned: lth)

References

Details

Attachments

(1 file, 1 obsolete file)

User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36

Steps to reproduce:

Go to js/src directory.  Run ./configure --disable-ion and build mozjs.  Run js repl (dist/bin/js).

js> new Uint8Array([1,2,3])


Actual results:

Segmentation fault caused by MOZ_CRASH because of the lack of AtomicOperations.


Expected results:

TypedArray types should be usable without ion.
Seems like this should work, yes.
Flags: needinfo?(lhansen)
This is actually behaving as intended, even if it is surprising.  For a possible workaround see the fifth paragraph below.  Anyhow the argument is this:

When we load from or store to typed arrays that might be mapping shared memory we must use primitives that are safe for racy access, we can't use plain C++ because doing so would be potentially undefined behavior.

So we call jit::AtomicOperations::loadSafeWhenRacy in this case (because the shell tries to print the value we've just constructed and is calling getIndex on the new array).

The AtomicOperations are provided by jit/AtomicOperations.h.  Near the end of that file there is a fearsome nest of #ifdefs that selects the actual operations to provide based on the hardware we're compiling for, note the case for JS_CODEGEN_NONE which applies for --disable-ion.  Here, we provide definitions for some jitless third-tier platforms (ppc64, arm64, alpha, hppa, super-h), but unless it's one of these special cases we fall back to jit/none/AtomicOperations-none.h, which MOZ_CRASHes for all of the primitives.  Hence we crash here, because the case for JS_CODEGEN_NONE does not have a subcase for x64.

Note carefully that the definitions that we provide for ppc64 et al, in jit/none/AtomicOperations-feeling-lucky.h, ARE NOT TECHNICALLY CORRECT AND CAN NEVER BE CORRECT.  They are (and must be, since this is portable code) C++ renditions of the atomic primitives that will cause C++ undefined behavior in the presence of shared memory.  Mostly this UB is benign, which is how they get away with it, but we have no guarantee that it will remain benign, it depends on how smart the compiler is, for one thing.  We don't really care, though, since these are tier-3 platforms.

If the build we're talking about in this bug is an experimental build for private use (ie the code is not being upstreamed to Firefox) then just adding an x64 case in that ifdef nest and including jit/none/AtomicOperations-feeling-lucky.h will usually be OK.  Including jit/x86-shared/AtomicOperations-x86-shared.h will also work at the moment (and have the same effect), but that code is about to change and may in fact end up using the JIT to generate code at startup, and if so will cease to work in a --disable-ion setting.

(Comments in jit/AtomicOperations.h pretty much explain what I've just explained.)

------

I should mention that right now the platform implementations in jit/{arm,mips-shared,x86-shared}/AtomicOperations-*.h are implemented in C++, which has the UB we want to avoid.  But that will change, as I alluded to just above; there does not seem to be a bug open for that at the moment, but 8-byte atomics for WebAssembly will force the issue.

------

I'm going to close this as INVALID because things are working as designed: a --disable-ion build is always tier-3 and unsupported.  I understand it's frustrating, and I could change the code so that builds always fall back to the nonportable and possibly-wrong implementation, but I don't want to do that for architectures where we have Ion.  I will do it for tier-3 platforms where the maintainers have requested it though, which is why there are several cases as mentioned.
Status: UNCONFIRMED → RESOLVED
Closed: 7 years ago
Flags: needinfo?(lhansen)
Resolution: --- → INVALID
I understand the reasoning for atomic operations, but wouldn't it be much better if we disabled SABs entirely in this case instead of crashing?

More importantly, this isn't about an attempt to use atomic operations, but about simple TypedArray usage. IIUC, these are deeply enough intertwined in our implementation that this doesn't make a difference on that level. It does, however, make a big difference to the usability of these builds. I'd go so far as to say that it makes --disable-ion builds entirely unusable, so we might as well remove support for them.

Given that, I think it'd be vastly preferable to enable the possibly-wrong behavior we have for other tier-3 platforms, but perhaps I'm not fully grasping why that'd be too problematic.
Flags: needinfo?(lhansen)
(In reply to Till Schneidereit [:till] from comment #3)
> I understand the reasoning for atomic operations, but wouldn't it be much
> better if we disabled SABs entirely in this case instead of crashing?

IMO no, because what you have here is the case of a missing platform API implementation where no (truly correct) portable implementation can be provided.

> More importantly, this isn't about an attempt to use atomic operations, but
> about simple TypedArray usage. IIUC, these are deeply enough intertwined in
> our implementation that this doesn't make a difference on that level. It
> does, however, make a big difference to the usability of these builds. I'd
> go so far as to say that it makes --disable-ion builds entirely unusable, so
> we might as well remove support for them.

I think that overstates the problem.  The tier-3 platforms are able to work within this framework.  If it turned out that, say, on SPARC the C++ compiler was so smart as to cause bugs resulting from UB, or if there were a platform that has some kind of hardware race detection that can only be disabled by dropping into assembler, then those tier-3 platforms would provide hand-written implementations of these atomic accessors, but they would still be configured with --disable-ion.

The case is similar for a platform that does not use gcc, clang, or msvc, since the C++ implementations I've provided work only with compilers that have intrinsics compatible with those.  This hasn't come up yet though.

> Given that, I think it'd be vastly preferable to enable the possibly-wrong
> behavior we have for other tier-3 platforms, but perhaps I'm not fully
> grasping why that'd be too problematic.

I don't want there to be a default that appears to work, because then the platform maintainer will not be aware that there might be an issue.  I want this to be opt-in.  (Really I did not want to have the ifdef nest for JS_CODEGEN_NONE at all; I wanted any workarounds for this case to be maintained outside the Firefox tree.  But I can't always have what I want.)

The threat here is a C++ compiler that does whole-program optimization and takes advantage of UB to remove and rearrange code.  C++ allows no races, so it can assume these accesses are not racy or it can prove that they might be, and in either case use that knowledge to rearrange code in strange ways.

At some point I'll probably just throw in the towel and stop fighting this, but I'm not there yet...

(I'm still not clear on why somebody would configure with --disable-ion on a platform that has Ion or why we would worry about it - we don't run this configuration on the build bots, or didn't last I looked, and frequently that configuration does not compile.)
Flags: needinfo?(lhansen)
(In reply to Lars T Hansen [:lth] from comment #4)
> (In reply to Till Schneidereit [:till] from comment #3)
> > I understand the reasoning for atomic operations, but wouldn't it be much
> > better if we disabled SABs entirely in this case instead of crashing?
> 
> IMO no, because what you have here is the case of a missing platform API
> implementation where no (truly correct) portable implementation can be
> provided.

Either I don't understand this answer, or we've been mis-communicating earlier :) What I'm suggesting is to say "SABs are only available in builds that support Ion, or otherwise explicitly opt in to having them enabled regardless of the risk of incorrect behavior." Normal TypedArrays would continue to just work, as they don't rely on said behavior. In this scenario, the code in comment 0 would just work.

> I don't want there to be a default that appears to work, because then the
> platform maintainer will not be aware that there might be an issue.  I want
> this to be opt-in.  (Really I did not want to have the ifdef nest for
> JS_CODEGEN_NONE at all; I wanted any workarounds for this case to be
> maintained outside the Firefox tree.  But I can't always have what I want.)

This, I very much understand.

I guess what I'm mainly saying is that I think a build should not succeed at all if it leads to an unusable engine - and I'd consider an engine that crashes as soon as you try to instantiate any TypedArray unusable.

> (I'm still not clear on why somebody would configure with --disable-ion on a
> platform that has Ion or why we would worry about it - we don't run this
> configuration on the build bots, or didn't last I looked, and frequently
> that configuration does not compile.)

I'm also curious about that. Modmaker, can you say a bit more about what you're trying to use this for?
Flags: needinfo?(modmaker)
(In reply to Till Schneidereit [:till] from comment #5)
> (In reply to Lars T Hansen [:lth] from comment #4)
> > (In reply to Till Schneidereit [:till] from comment #3)
> > > I understand the reasoning for atomic operations, but wouldn't it be much
> > > better if we disabled SABs entirely in this case instead of crashing?
> > 
> > IMO no, because what you have here is the case of a missing platform API
> > implementation where no (truly correct) portable implementation can be
> > provided.
> 
> Either I don't understand this answer, or we've been mis-communicating
> earlier :)

I don't think we've been mis-communicating, not exactly, but I'm not making my point very well.  I'll give it another whirl, but also see the end of this comment for a possible way forward.

Up until now, JS has been implementable in fairly straightforward C++.  The good implementations have played a lot of tricks with representations that are strictly outside the C++ standard but the language is more or less set up to handle that.  That is not true for the data races that come with SAB.  You can implement SAB in portable (ie race-free) C++ but only if you serialize all accesses to a SAB's memory (ie always take a lock; I don't think atomic loads and stores are sufficient), and nobody wants to do that.  Hence there is a platform API inside the JS engine that is separate from the JIT where we provide nonportable definitions of the primitives we use to access to SAB memory.

Since our implementation is structured the way it is, and we don't want to check sharedness at the TypedArray::getIndex level so that we can take one of two paths (use regular accesses if the buffer is nonshared, use special accesses if it is shared), we end up relying on those primitives for access to nonshared memory too.

That API must be implemented for the platform we're compiling for, regardless of the presence of Ion.  That the primitives are in the jit/ directories is a historical artifact.  There is really no reason why Ion is *needed* to support SharedArrayBuffer.  The required primitives could be written in assembler for the platform, which they would have to be if the compiler did not support the necessary intrinsics to write a hacky solution of the kind I have written for gcc, clang, and MSVC.  It may be that Ion is the best solution, if it is available, so that we can generate code on the fly for these primitives, but even that is hard to say for sure - maybe inline assembler is more performant than calls to this generated code?

> What I'm suggesting is to say "SABs are only available in builds
> that support Ion, or otherwise explicitly opt in to having them enabled
> regardless of the risk of incorrect behavior." Normal TypedArrays would
> continue to just work, as they don't rely on said behavior. In this
> scenario, the code in comment 0 would just work.

We can absolutely do that.  Now you have something that is less than ECMAScript if you don't have a JIT, but maybe that's the right tradeoff.  I didn't think so at the time, but I could have been wrong :)

> > I don't want there to be a default that appears to work, because then the
> > platform maintainer will not be aware that there might be an issue.  I want
> > this to be opt-in.  (Really I did not want to have the ifdef nest for
> > JS_CODEGEN_NONE at all; I wanted any workarounds for this case to be
> > maintained outside the Firefox tree.  But I can't always have what I want.)
> 
> This, I very much understand.
> 
> I guess what I'm mainly saying is that I think a build should not succeed at
> all if it leads to an unusable engine - and I'd consider an engine that
> crashes as soon as you try to instantiate any TypedArray unusable.

Indeed, and we are in agreement here.  I would have preferred an #error directive in that ifdef nest, myself, so that the build would fail if suitable primitives for the platform are not present, but since we use --disable-ion builds for testing that Ion can be disabled that's not so great.

Having written all of the preceding, I am coming around to the idea that the bug here - in some sense - is that I've tied the presence of these primitives to the JS_CODEGEN_WHATEVER macros.  Really the primitives should be tied to the concrete architecture (and then the implementation method may depend on the compiler or the presence of Ion or whatever).  In that case, if I've provided an implementation that works for x64, and does not depend on Ion to work (eg say I wrote it in assembler), it will work both with and without Ion enabled.  That is,

#if defined __x64__
  #if defined JS_CODEGEN_ION
    #include "AtomicOperations-IonGenerated.h"
  #elif defined __gcc__ || defined __clang__
    #include "AtomicOperations-feeling-lucky.h"
  #else
    #error "You're on your own"
  #endif
#elif ...
> > (I'm still not clear on why somebody would configure with --disable-ion on a
> > platform that has Ion or why we would worry about it - we don't run this
> > configuration on the build bots, or didn't last I looked, and frequently
> > that configuration does not compile.)
> 
> I'm also curious about that. Modmaker, can you say a bit more about what you're trying to use this for?

I am trying to build SpiderMonkey for iOS (which doesn't support dynamic codegen).  I am aware (if I read the docs/discussions correctly) that iOS is not officially supported in the main library.  But I am able to get builds of SpiderMonkey mostly working.

> Up until now, JS has been implementable in fairly straightforward C++.  The good
> implementations have played a lot of tricks with representations that are strictly
> outside the C++ standard but the language is more or less set up to handle that.
> That is not true for the data races that come with SAB.  You can implement SAB
> in portable (ie race-free) C++ but only if you serialize all accesses to a SAB's
> memory (ie always take a lock; I don't think atomic loads and stores are sufficient),
> and nobody wants to do that.  Hence there is a platform API inside the JS engine
> that is separate from the JIT where we provide nonportable definitions of the
> primitives we use to access to SAB memory.
> 
> Since our implementation is structured the way it is, and we don't want to check
> sharedness at the TypedArray::getIndex level so that we can take one of two paths
> (use regular accesses if the buffer is nonshared, use special accesses if it is
> shared), we end up relying on those primitives for access to nonshared memory too.

Couldn't you just fallback to C++11 <atomic> on platforms that support it?  It would allow cross-platform atomics that also affect code reordering, memory barriers, etc.  What about just using locks when you don't have platform support?
Flags: needinfo?(modmaker)
(In reply to modmaker from comment #7)

> Couldn't you just fallback to C++11 <atomic> on platforms that support it? 
> It would allow cross-platform atomics that also affect code reordering,
> memory barriers, etc.  What about just using locks when you don't have
> platform support?

The gcc/clang workarounds basically use C++11 atomics (the __atomic_whatever intrinsics), and other compilers that support those will indeed work.  Of course this is not "correct", we still have undefined behavior, but pragmatically it works for existing compilers, so far as I know.  (We only fall back to the older __sync_whatever intrinsics when we have to.)

(You should be able to make progress without waiting for me, by just adding an #ifdef for IOS in the nest for JS_CODEGEN_NONE as described earlier and including the same file used for eg ppc64.)
I'll see if I can't restructure the code a little to make this less painful.
Status: RESOLVED → REOPENED
Ever confirmed: true
Resolution: INVALID → ---
Assignee: nobody → lhansen
Status: REOPENED → ASSIGNED
Summary: Can't use TypedArrays with --disable-ion → Change definitions of AtomicOperations so as to create fewer headaches
Work in progress, but probably roughly right.  This inverts the include tree so that building with or without JIT on hardware X gets definitions for X, if they exist in a JIT-independent way.   I think this will be much more resilient and easier to expand.  It also allows for JIT-dependent definitions to exist when that is to our advantage, and is arguably more correct for the simulator builds.  This is also a better basis for providing 8-byte atomics, which we need for wasm.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=73966049341c7652d4693760cfc3caa06d4eb586
Blocks: 1146817
As explained at the end of comment 6 and in comment 10 this changes the tree of conditions so that we dispatch on hardware platform (and later also on compiler within the platform, this will appear in bug 1146817), and no longer on JIT platform.  Thus if there exists a jit-independent implementation of the atomic ops for the current hardware platform and compiler we will use it regardless of how or whether the JIT is configured.

Thus a build with --disable-ion now behaves properly for the case reported here.  This will continue to be true even if I reimplement the atomic primitives in assembler, say.
Attachment #8900617 - Attachment is obsolete: true
Attachment #8901152 - Flags: review?(till)
Try run that verifies that this also works on Windows (the first one failed).
https://treeherder.mozilla.org/#/jobs?repo=try&revision=0bbb24e071eb48529559ecefe4bcb80e7fded5e7
Comment on attachment 8901152 [details] [diff] [review]
bug1389669-shrink-atomicops-minefield.patch

Review of attachment 8901152 [details] [diff] [review]:
-----------------------------------------------------------------

This is much nicer, and easier to reason about, too. Thank you in particular for the clear explanation in the comments.
Attachment #8901152 - Flags: review?(till) → review+
Pushed by lhansen@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/66fd793e7005
make inclusion of AtomicOperations definitions switch on hardware platform, not JIT platform, and remove crashing 'none' case.  r=till
https://hg.mozilla.org/mozilla-central/rev/66fd793e7005
Status: ASSIGNED → RESOLVED
Closed: 7 years ago7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: