Setting JS_DISABLE_SLOW_SCRIPT_SIGNALS=1 can DoS operation callbacks

RESOLVED FIXED in mozilla33

Status

()

defect
RESOLVED FIXED
5 years ago
4 years ago

People

(Reporter: cjones, Assigned: bbouvier)

Tracking

unspecified
mozilla33
x86_64
Linux
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [rr])

Attachments

(5 attachments, 8 obsolete attachments)

9.20 KB, patch
jandem
: review+
Details | Diff | Splinter Review
49.11 KB, patch
luke
: review+
Details | Diff | Splinter Review
30.60 KB, patch
luke
: review+
Details | Diff | Splinter Review
9.14 KB, patch
luke
: review+
Details | Diff | Splinter Review
1.12 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
Ben helped me debug https://github.com/mozilla/rr/issues/971 .  The problem was that a worker thread was never being terminated, just chewing CPU in the background.  The problem reduces to http://mxr.mozilla.org/mozilla-central/source/js/src/vm/Runtime.cpp#669

669     if (!SignalBasedTriggersDisabled()) {
670         RequestInterruptForAsmJSCode(this);
671         jit::RequestInterruptForIonCode(this, mode);
672     }

The thread in question was running |while (true);| in JS.  The JIT'd code was

=> 0x6b071ff6:  jmp    0x6b071ff6

I ran the mochitest harness using --debugger=foo.  When that option is passed, then the harness sets JS_DISABLE_SLOW_SCRIPT_SIGNALS=1.  That made the above !SignalBasedTriggersDisabled() condition false, and the mprotect interrupts were disabled.

So putting this all together, the worker thread became invincible to operation callbacks.  That forces a choice between interruptible threads or being able to use a debugger.

Ben tells me he thinks the intention was to fall back on inserting op-callback checks when JS_DISABLE_SLOW_SCRIPT_SIGNALS=1 is set.
I think we need some kind of solution for being able to use a debugger and being able to control runaway script...
We could just make EnsureAsmJSSignalHandlersInstalled return false if that env variable is set. Then we won't use mprotect and Ion will emit an explicit branch at loop heads. I think this will also disable Odin/asm.js though, we probably don't want that.

An alternative is to set a boolean on the runtime if that env var is set, then we check that bool in LIRGenerator::visitInterruptCheck and fall back to the slower interrupt mechanism. I can fix this next week if nobody beats me to it.
(In reply to Jan de Mooij [:jandem] from comment #2)
> We could just make EnsureAsmJSSignalHandlersInstalled return false if that
> env variable is set. Then we won't use mprotect and Ion will emit an
> explicit branch at loop heads. I think this will also disable Odin/asm.js
> though, we probably don't want that.

This sounds like a good idea; we can avoid disabling Odin by having Odin emit operation-callback checks if signal handlers were disabled.  This would also help on the handful of Android systems where signal handling is broken (IsSignalHandlingBroken() from the elf loader).
(In reply to Luke Wagner [:luke] from comment #3)
> (In reply to Jan de Mooij [:jandem] from comment #2)
> > We could just make EnsureAsmJSSignalHandlersInstalled return false if that
> > env variable is set. Then we won't use mprotect and Ion will emit an
> > explicit branch at loop heads. I think this will also disable Odin/asm.js
> > though, we probably don't want that.
> 
> This sounds like a good idea; we can avoid disabling Odin by having Odin
> emit operation-callback checks if signal handlers were disabled.  This would
> also help on the handful of Android systems where signal handling is broken
> (IsSignalHandlingBroken() from the elf loader).

Further the detection of 'broken' signal handlers on Android includes the detection of slow handlers and this test depends on timing and can fail even on fast devices if they are loaded when doing the check.  So this fall back would help in this case too.
Posted patch Patch for ionSplinter Review
So I followed your second approach in comment 2. This way, we can still use signals for triggering the slow script dialog *and* have Ion emit manual interrupt checks in the meanwhile.

Odin needs more work than that:
- independently from signalHandlersInstalled, we can emit bound checks on x64 if !canUseSignalHandlers
- then in a later patch, we can generalize: if !signalHandlersInstalled, we can just add manual interrupt checks
Attachment #8445203 - Flags: review?(jdemooij)
Comment on attachment 8445203 [details] [diff] [review]
Patch for ion

Review of attachment 8445203 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for fixing this.
Attachment #8445203 - Flags: review?(jdemooij) → review+
Comment on attachment 8445203 [details] [diff] [review]
Patch for ion

Review of attachment 8445203 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/Runtime.cpp
@@ +263,5 @@
> +SignalBasedTriggersDisabled()
> +{
> +  // Don't bother trying to cache the getenv lookup; this should be called
> +  // infrequently.
> +  return !!getenv("JS_DISABLE_JIT_SIGNALS");

I didn't notice this change earlier, but I think it's better to keep the old name (JS_DISABLE_SLOW_SCRIPT_SIGNALS), to not annoy developers again :)

Also, this var is set in /testing/mochitest/runtests.py and some other files, so if we do change the name we'd have to fix those places too.
(In reply to Jan de Mooij [:jandem] from comment #7)
> I didn't notice this change earlier, but I think it's better to keep the old
> name (JS_DISABLE_SLOW_SCRIPT_SIGNALS), to not annoy developers again :)

I agree it makes sense to keep the old name, but it'd be nice to add a new, short, memorable name (I always have to relookup the current name).  How about JS_NO_SIGNALS?
(In reply to Luke Wagner [:luke] from comment #8)
> (In reply to Jan de Mooij [:jandem] from comment #7)
> > I didn't notice this change earlier, but I think it's better to keep the old
> > name (JS_DISABLE_SLOW_SCRIPT_SIGNALS), to not annoy developers again :)
>
> I agree it makes sense to keep the old name, but it'd be nice to add a new,
> short, memorable name (I always have to relookup the current name).  How
> about JS_NO_SIGNALS?

I changed the name as the real meaning changes too (we're not disabling slow script signals anymore, just the ion tricks in the first patch) and was planning to modify all places + send a mail to dev-platform. However, keeping the same name is simpler for everyone. Having JS_NO_SIGNALS is indeed shorter, but sounds like a lie to the user, in the sense that we don't effectively deactivate all signals. But it's only bikeshedding :)
Renaming it (and keeping the old one for now) sounds good; I agree JS_DISABLE_SLOW_SCRIPT_SIGNALS is not the most memorable name :)
Comment on attachment 8445203 [details] [diff] [review]
Patch for ion

Review of attachment 8445203 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/Runtime.cpp
@@ +262,5 @@
> +static bool
> +SignalBasedTriggersDisabled()
> +{
> +  // Don't bother trying to cache the getenv lookup; this should be called
> +  // infrequently.

Had cause to worry about the thread safety of getenv recently, and might it be prudent to change this to cache the result.
With this patch, we can activate Odin even if JS_NO_SIGNALS=1.
A few notes:
- as i've reused cmplWithPatch, heap length has to be 32-bits. Thinking about it, i could relax that restriction, as there's plenty of addressable space on x64.
- OOL heap access classes and visit method are now shared between x64 and x86.
- prepareForAsmJS needs a new ArrayBufferObject flag that tells whether we mmapped the asm.js heap, to choose the right delete method.

Once this patch gets a r+ and nits / behaviors are fixed, I think we could land the two patches and consider the issue as solved. Implementing the third part will let the possibility to use Odin without signal handlers, but isn't a prerequisite for this bug, which just wants to split the use of signals for JIT tricks and for the slow script dialog. I'll open a follow up bug for the last one.

I am building the entire browser to check that it behaves as expected.
Assignee: nobody → benj
Status: NEW → ASSIGNED
Attachment #8445929 - Flags: review?(luke)
A patch that actually compiles everywhere (OutOfLineLoadTypedArray was defined in the cpp file -- because of unified build, it somehow worked locally but not on try).

For what it's worth, Mochitests seems to pass fine with these two patches applied and another one (not uploaded) that always sets canUseSignalHandlers to false. I expect one of the mochitests to have a test for the timeout dialog feature, as there's a way to pass a "slowscript" argument to the mach harness for mochi-tests. Couldn't trigger the dialog on my machine, but apparently it never shows up at all even on Nightly (tested an infinite empty loop).
Attachment #8445929 - Attachment is obsolete: true
Attachment #8445929 - Flags: review?(luke)
Attachment #8446427 - Flags: review?(luke)
Comment on attachment 8446427 [details] [diff] [review]
Patch for Odin x64 bound checks

Review of attachment 8446427 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for doing this!  A few nits and requests:

::: js/src/jit/AsmJS.cpp
@@ +6997,5 @@
>  {
>      if (!cx->jitSupportsFloatingPoint())
>          return Warn(parser, JSMSG_USE_ASM_TYPE_FAIL, "Disabled by lack of floating point support");
>  
> +    if (!cx->signalHandlersInstalled())

IIUC, you can remove this condition here and also in IsAsmJSCompilationAvailable, right?  (I'd re-run tests with JS_DISABLE_SLOW_SCRIPT_SIGNALS set; isAsmJSCompilationAvailable = false effectively turns off tests.)

::: js/src/jit/AsmJSModule.cpp
@@ +605,5 @@
>          JSC::X86Assembler::setPointer(addr, (void *)(heapOffset + disp));
>      }
> +#elif defined(JS_CODEGEN_X64)
> +    int32_t heapLength = int32_t(intptr_t(heap->byteLength()));
> +    if (cx->canUseSignalHandlers())

I know the value of cx->canUseSignalHandlers() can't change (currently), but I was thinking that it might in the future (maybe someone wants to be able to set the env var dynamically) and that could lead to situations where we don't patch when we need to or vice versa.  How about instead putting a bool 'usesSignalHandlers' on the AsmJSModule, initialized in the constructor from cx->canUseSignalHandlers().  Similarly, we can have a ModuleCompiler::canuseSignalHandlers() const { return module_->usesSignalHandlers(); } instead of testing cx_->canUseSignalHandlers() during compilation.

::: js/src/jit/MIR.h
@@ +10248,5 @@
>  class MAsmJSHeapAccess
>  {
>      ArrayBufferView::ViewType viewType_;
>      bool skipBoundsCheck_;
> +    bool useSignalHandlers_;

It seems like useSignalHandlers_ has the same meaning as skipBoundsCheck_ on x64.  Could we reuse that flag instead?

::: js/src/vm/ArrayBufferObject.cpp
@@ +414,5 @@
>  JS_STATIC_ASSERT(AsmJSAllocationGranularity == AsmJSPageSize);
>  #endif
>  
> +/* static */ bool
> +ArrayBufferObject::prepareForAsmJSNoSignals(JSContext *cx, Handle<ArrayBufferObject*> buffer)

Instead of adding this can you hoist and reuse the x86 version?
Attachment #8446427 - Flags: review?(luke)
(In reply to Luke Wagner [:luke] from comment #14)
> Comment on attachment 8446427 [details] [diff] [review]
> Patch for Odin x64 bound checks
> 
> Review of attachment 8446427 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Thanks for doing this!  A few nits and requests:
> 
> ::: js/src/jit/AsmJS.cpp
> @@ +6997,5 @@
> >  {
> >      if (!cx->jitSupportsFloatingPoint())
> >          return Warn(parser, JSMSG_USE_ASM_TYPE_FAIL, "Disabled by lack of floating point support");
> >  
> > +    if (!cx->signalHandlersInstalled())
> 
> IIUC, you can remove this condition here and also in
> IsAsmJSCompilationAvailable, right?  (I'd re-run tests with
> JS_DISABLE_SLOW_SCRIPT_SIGNALS set; isAsmJSCompilationAvailable = false
> effectively turns off tests.)
I think we can't: in the case where we don't have installed the signal handlers (as on platforms where signal handling is broken, e.g.), during a requestInterrupt we would mprotect asm.js code and then as there are no signals installed, we would simply crash when running the code. Am I missing something?

We could also not RequestInterruptForAsmJSCode if signal handlers are disabled, but that would mean that we just shift the current issue (e.g., an infinite asm.js loop would DoS the browser). The correct fix will be to add manual callback interrupt checks when signal handlers aren't installed, and only at that time we should be able to relax the signal handlers constraint.

> I know the value of cx->canUseSignalHandlers() can't change (currently), but
> I was thinking that it might in the future (maybe someone wants to be able
> to set the env var dynamically) and that could lead to situations where we
> don't patch when we need to or vice versa.  How about instead putting a bool
> 'usesSignalHandlers' on the AsmJSModule, initialized in the constructor from
> cx->canUseSignalHandlers().  Similarly, we can have a
> ModuleCompiler::canuseSignalHandlers() const { return
> module_->usesSignalHandlers(); } instead of testing
> cx_->canUseSignalHandlers() during compilation.
Good idea. I am trying to think if this can even be a bug because of serialization:
- if we compile with no signal handlers, and run with signals: in this case, we'll have bounds checks in the code, so protecting the array will be superfluous but things will work.
- if we compile with signals in mind, and run without signals: the array won't be mprotect'd and there won't be bounds checks in the generated code. This sounds like a security issue, which convinced me that the module's canUseSignalHandlers boolean should be serialized *and* used for deciding which variant of prepareAsmJSHeap we should use. Sorry, that makes the patch even bigger. Another patch with tests will come.

> 
> ::: js/src/jit/MIR.h
> @@ +10248,5 @@
> >  class MAsmJSHeapAccess
> >  {
> >      ArrayBufferView::ViewType viewType_;
> >      bool skipBoundsCheck_;
> > +    bool useSignalHandlers_;
> 
> It seems like useSignalHandlers_ has the same meaning as skipBoundsCheck_ on
> x64.  Could we reuse that flag instead?
In Lowering-x64.cpp, a condition tests for useSignalHandlers_ && !skipBoundsCheck_. In CodeGenerator-x64.cpp, a condition tests for !useSignalHandlers_ && !skipBoundsCheck_, so their meanings are different.
> 
> ::: js/src/vm/ArrayBufferObject.cpp
> @@ +414,5 @@
> >  JS_STATIC_ASSERT(AsmJSAllocationGranularity == AsmJSPageSize);
> >  #endif
> >  
> > +/* static */ bool
> > +ArrayBufferObject::prepareForAsmJSNoSignals(JSContext *cx, Handle<ArrayBufferObject*> buffer)
> 
> Instead of adding this can you hoist and reuse the x86 version?
That's what I've done in the patch: in the preprocessor "else" section, prepareForAsmJS calls prepareForAsmJSNoSignals, as these platforms don't use signals. I've added a comment to make that clearer.
Attachment #8446427 - Attachment is obsolete: true
Attachment #8447119 - Flags: review?(luke)
Posted patch Tests (obsolete) — Splinter Review
And while we're there, let's add one test for Ion too. In this test, we force generation of manual interrupt checks, rather than using signalling.
Attachment #8447120 - Attachment is obsolete: true
Attachment #8447120 - Flags: review?(luke)
Attachment #8447123 - Flags: review?(luke)
Comment on attachment 8447123 [details] [diff] [review]
Tests

Review of attachment 8447123 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit-test/tests/asm.js/testHeapAccess.js
@@ +30,5 @@
> +        setJitCompilerOption("signal.handlers.enable", 0);
> +        // Cloned modules should fail on linking if the initial module has been
> +        // compiled with signals but signals are deactivated.
> +        var code = asmCompile('glob', 'imp', 'b', USE_ASM + HEAP_IMPORTS + 'function f(i) {i=i|0; i32[0] = i; return i8[0]|0}; return f');
> +        assertAsmLinkFail(code, this, null, new ArrayBuffer(4096));

Actually, this assert will trigger on all platforms other than x64. I thought there would be a way to test for the platform in the test, but there seems to be none (i've tried |skip-if(processor != 'x64')| but it doesn't work).

Should we generalize the link failure to all platforms (but have only x64 set needsMappedBuffer to true)?
Attachment #8447123 - Flags: review?(luke)
(In reply to Benjamin Bouvier [:bbouvier] from comment #15)
> The correct fix will be to add
> manual callback interrupt checks when signal handlers aren't installed, and
> only at that time we should be able to relax the signal handlers constraint.

Ah, I assumed incorrectly that this patch did that already.  I think that is the main thing we want to do here to allow JS_DISABLE_SLOW_SCRIPT_SIGNALS to avoid OOM while still enabling asm.js.

> > It seems like useSignalHandlers_ has the same meaning as skipBoundsCheck_ on
> > x64.  Could we reuse that flag instead?
> In Lowering-x64.cpp, a condition tests for useSignalHandlers_ &&
> !skipBoundsCheck_. In CodeGenerator-x64.cpp, a condition tests for
> !useSignalHandlers_ && !skipBoundsCheck_, so their meanings are different.

In the Lowering-x64 case, it seems like, there is not a meaningful distinction and you could just write:
  if (ins->skipBoundsCheck() && ptrValue >= 0)
    useConstant = true;
if useSignalHandlers was replaced by setting skipBoundsChecks.

For the CodeGenerator-x64 case, you're right, there is a distinction: if we skipBoundsCheck, we don't add an AsmJSHeapAccess (b/c it can't fail), but we do want an AsmJSHeapAccess when !canUseSignalHandlers, since we need to patch.  When we start implementing AB neutering properly, even constant heap accesses can fault, so we should just note all heap accesses.  This should allow us have
  skipBoundsCheck = canUseSignalHandlers || isConstantAccess

> > Instead of adding this can you hoist and reuse the x86 version?
> That's what I've done in the patch: in the preprocessor "else" section,
> prepareForAsmJS calls prepareForAsmJSNoSignals, as these platforms don't use
> signals.

Hah, so you have, sorry.
(In reply to Luke Wagner [:luke] from comment #19)
> For the CodeGenerator-x64 case, you're right, there is a distinction: if we
> skipBoundsCheck, we don't add an AsmJSHeapAccess (b/c it can't fail), but we
> do want an AsmJSHeapAccess when !canUseSignalHandlers, since we need to
> patch.  When we start implementing AB neutering properly, even constant heap
> accesses can fault, so we should just note all heap accesses.  This should
> allow us have
>   skipBoundsCheck = canUseSignalHandlers || isConstantAccess

But that would give change the semantics of skipBoundsCheck, right? It'd be wrong on other platforms than x64 to have this, as e.g. x86 can use signal handlers (for the interrupt checks) but needs the bounds checks in all cases (unless we're #ifdef'ining one of the canUseSignalHandlers functions, which sounds creepy). Other than that, I have a patch that implements it on x64 and it works well on this platform.
Flags: needinfo?(luke)
(In reply to Benjamin Bouvier [:bbouvier] from comment #20)
> >   skipBoundsCheck = canUseSignalHandlers || isConstantAccess
> 
> But that would give change the semantics of skipBoundsCheck, right?

On non-x64, canUseSignalHandlers would effectively be 'false' (in the context of loads/stores, cx->canUseSignalHandlers would still be true).  That is, in AsmJS.cpp, m.canUseSignalHandlers would be initialized to cx->canUseSignalHandlers() ifdef JS_CODEGEN_X64 and 'false' otherwise and m.canUseSignalHandlers would propagate elsewhere (module_->usesSignalHandlers, skipBoundsCheck).
Flags: needinfo?(luke)
Alrighty, then.
Attachment #8447119 - Attachment is obsolete: true
Attachment #8447119 - Flags: review?(luke)
Attachment #8448102 - Flags: review?(luke)
Comment on attachment 8448102 [details] [diff] [review]
Patch for Odin x64 bound checks

Review of attachment 8448102 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/AsmJS.cpp
@@ +1151,5 @@
> +        bool canUseSignalHandlers = cx_->canUseSignalHandlers();
> +#else
> +        bool canUseSignalHandlers = false;
> +#endif // JS_CODEGEN_X64
> +        module_ = cx_->new_<AsmJSModule>(parser_.ss, funcStart, offsetToEndOfUseAsm, strict,

Can you put a \n before the new_?

::: js/src/jit/AsmJSLink.cpp
@@ +334,5 @@
> +    // function returns true if, and only if, we will use signal handlers, i.e.
> +    // we want a mapped buffer.
> +    bool noBoundsChecksEmitted = module.usesSignalHandlers();
> +    bool signalHandlersActivated = cx->canUseSignalHandlers();
> +    *needsMappedArrayBuffer = noBoundsChecksEmitted && signalHandlersActivated;

If noBoundsChecksEmitted && !signalHandlersActivated, then this will return false and we won't call prepareAsmJSBuffer.  In that case, we can just call prepareForAsmJS(cx, heap, module.usesSignalHandlers()) and avoid the outparam here.

Next, if module.usesSignalHandlers(), then cx->canUseSignalHandlers was true during compilation.  If it got toggled to false, the someone must have used a debugger to change it.  They could just as well switch off signals after linking succeeds and we don't protect against that, so I think we shouldn't bother here.  In that case, we can remove this whole function and the call below.

::: js/src/jit/MIR.h
@@ +10325,5 @@
>  
>  class MAsmJSStoreHeap : public MBinaryInstruction, public MAsmJSHeapAccess
>  {
> +    MAsmJSStoreHeap(ArrayBufferView::ViewType vt, MDefinition *ptr,
> +                    MDefinition *v)

I expect 'v' could fit on the preceding line.

::: js/src/jit/x64/CodeGenerator-x64.cpp
@@ +251,5 @@
>  
>      if (ptr->isConstant()) {
>          int32_t ptrImm = ptr->toConstant()->toInt32();
>          // Note only a positive index is accepted here because a negative offset would
>          // not wrap back into the protected area reserved for the heap.

Pre-existing: this comment restates the comment in Lowering-x64.cpp; could you remove it?

@@ +260,5 @@
>      }
>  
> +    bool skipBoundsCheck = mir->skipBoundsCheck();
> +    OutOfLineLoadTypedArrayOutOfBounds *ool = nullptr;
> +    uint32_t maybeCmpOffset = UINT32_MAX; // guard value in AsmJSHeapAccess ctor

Instead of a comment, can you add a 'static const uint32_t NoLengthCheck = UINT32_MAX' inside AsmJSHeapAccess and use that within AsmJSHeapAccess instead of 'UINT32_MAX'?

@@ +264,5 @@
> +    uint32_t maybeCmpOffset = UINT32_MAX; // guard value in AsmJSHeapAccess ctor
> +    if (!skipBoundsCheck) {
> +        // On x64, we have bounds check iff we're not using signals handlers
> +        // for OOB accesses, in which case we need to note the access to patch
> +        // it at link time.

From a CodeGenerator pov, we don't need to mention signal handlers; the need for patching naturally falls out of the fact we're not skipping bounds checks so we need a cmp with a patched immediate.  So I'd remove this comment.

@@ +274,5 @@
> +        CodeOffsetLabel cmp = masm.cmplWithPatch(ToRegister(ptr), Imm32(0));
> +        masm.j(Assembler::AboveOrEqual, ool->entry());
> +        maybeCmpOffset = cmp.offset();
> +    }
> +    JS_ASSERT(skipBoundsCheck == (maybeCmpOffset == UINT32_MAX));

With the above, this assert isn't really necessary.  Also, there'd only be one use of 'skipBoundsCheck', so you could inline 'mir->skipBoundsCheck()' into that one use.

@@ +314,4 @@
>      }
>  
> +    bool skipBoundsCheck = mir->skipBoundsCheck();
> +    uint32_t maybeCmpOffset = UINT32_MAX; // guard value in AsmJSHeapAccess ctor

ditto above

@@ +323,5 @@
> +        CodeOffsetLabel cmp = masm.cmplWithPatch(ToRegister(ptr), Imm32(0));
> +        masm.j(Assembler::AboveOrEqual, &rejoin);
> +        maybeCmpOffset = cmp.offset();
> +    }
> +    JS_ASSERT(skipBoundsCheck == (maybeCmpOffset == UINT32_MAX));

ditto above

::: js/src/jit/x64/Lowering-x64.cpp
@@ +137,5 @@
> +    if (ptr->isConstant()) {
> +        int32_t ptrValue = ptr->toConstant()->value().toInt32();
> +        if (ins->skipBoundsCheck() && ptrValue >= 0) {
> +            // The X64 does not inline an explicit bounds check so has no need
> +            // to keep the index in a register, however only a positive index

Pre-existing nit: everything before the comma seems besides irrelevant, so could you remove it and start the sentence with "Only a positive index..."?

::: js/src/vm/ArrayBufferObject.cpp
@@ +432,5 @@
> +
> +void
> +ArrayBufferObject::releaseAsmJSArrayNoSignals(FreeOp *fop)
> +{
> +    fop->free_(dataPointer());

Can you JS_ASSERT(!buffer->isAsmJSMappedArrayBuffer());?

@@ +530,4 @@
>  {
> +    // Platforms other than x64 don't use signalling for bounds checking, so
> +    // just use the variant with no signals.
> +    return prepareForAsmJSNoSignals(cx, buffer);

Can you add
  JS_ASSERT(!usesSignalHandlers);
(and can you rename this parameter here and elsewhere to usesSignalHandlers)?

@@ +804,5 @@
>  #if defined (JS_CPU_X64)
>          // On x64, ArrayBufferObject::prepareForAsmJS switches the
>          // ArrayBufferObject to use mmap'd storage.
> +        if (buffer.isAsmJSMappedArrayBuffer())
> +            sizes->nonHeapElementsAsmJS += buffer.byteLength();

I think you need an
  else
    sizes->mallocHeapElementAsmJS += mallocSizeOf(buffer.dataPointer());
more generally, since isAsmJSMappedArrayBuffer() is correct on all platforms, you can remove the #ifdef.
Attachment #8448102 - Flags: review?(luke) → review+
I forgot to say: thanks a lot!
Thanks for the review!

(In reply to Luke Wagner [:luke] from comment #23) 
> ::: js/src/jit/AsmJSLink.cpp
> @@ +334,5 @@
> > +    // function returns true if, and only if, we will use signal handlers, i.e.
> > +    // we want a mapped buffer.
> > +    bool noBoundsChecksEmitted = module.usesSignalHandlers();
> > +    bool signalHandlersActivated = cx->canUseSignalHandlers();
> > +    *needsMappedArrayBuffer = noBoundsChecksEmitted && signalHandlersActivated;
> 
> If noBoundsChecksEmitted && !signalHandlersActivated, then this will return
> false and we won't call prepareAsmJSBuffer.  In that case, we can just call
> prepareForAsmJS(cx, heap, module.usesSignalHandlers()) and avoid the
> outparam here.
Ha, nice.

> 
> Next, if module.usesSignalHandlers(), then cx->canUseSignalHandlers was true
> during compilation.  If it got toggled to false, the someone must have used
> a debugger to change it.  They could just as well switch off signals after
> linking succeeds and we don't protect against that, so I think we shouldn't
> bother here.  In that case, we can remove this whole function and the call
> below.

That's not the case I was thinking about: the issue can also show up if a module was compiled without bounds checks, then *cached*, then reloaded without signaling.  That makes sense IMO, as a developer will first try to run an asm.js program in the browser and only *then* exit it, and run it with disabled signals when trying to debug it.  It's a better experience to have a Link error rather than no message and having failing bounds checks, if not crashes.  I guess that's also an edge-case, but the whole point of this bug is to make Firefox developers lives easier.  What do you think?
(In reply to Benjamin Bouvier [:bbouvier] from comment #25)
Ah, good point about the caching!  In that case, perhaps we don't need a full function call, but rather
  if (module.usesSignalHandlers() && !cx->canUseSignalHandlers())
with a little comment mentioning caching?
WIP patch for the last part, with debugging printf statements and race conditions included. This cannot yet handle infinite loops (while (1){}), but inserts checks using SP at function prologues and thus can stop the code at function head. Well, almost can, as tests/asm.js/testTimeout3.js fails on x64 and I don't know yet why.
I fixed most of the issues for x86 and x64 but I still need to figure out what's going on for ARM. There are still issues on x64 where $ebp gets randomly clobbered in parallel compilation mode.

There aren't still any loop headers check. Luke, what do you think about having a MInstruction, like MInterruptCheck, but for AsmJS? That would help choosing registers for testing the stack limit value and calling into the operation callback exit. In this case, should we patch at dynamic linking the position of the operation callback label?

If not, does it sound OK to just have the check and the call in the loop header? Any other idea?

Also, we have to generate the operation callback exit and the check for the stack limit value in all cases, whatever we're running with or without signals. Otherwise, during runtime, the stack limit check will exit and a stack overflow error will be thrown, while the real issue is e.g. a timeout. This generates situations of benign races in the engine, if we run with signals: the check could get triggered before the signal is handled, and in this case the stack trace would be different. Is it fine?

For people who'd want to try this patch, as of now on x86 and ARM, it will only work if JS_NO_SIGNALS=1, as we're not making a distinction (yet) between canUseSignalHandlers and usesSignalHandlersForBoundsChecks at several places.
Attachment #8448784 - Attachment is obsolete: true
Flags: needinfo?(luke)
(In reply to Benjamin Bouvier [:bbouvier] from comment #28)
> There aren't still any loop headers check. Luke, what do you think about
> having a MInstruction, like MInterruptCheck, but for AsmJS? That would help
> choosing registers for testing the stack limit value and calling into the
> operation callback exit. In this case, should we patch at dynamic linking
> the position of the operation callback label?

MAsmJSInterruptCheck sounds good.  One thing that occurred to me is that, unlike Ion, we aren't especially concerned about the performance of this interrupt check, since it's only present in weird cases.  That suggests reusing MAsmJSInterruptCheck for both loops and the prologue; I don't know if that simplifies things.

For linking to the exit, I think you can pass in a pointer to ModuleCompiler::interruptLabel_ to the MIR node constructor (rather like how we pass in the Label* of the callee to MAsmJSCall).

> If not, does it sound OK to just have the check and the call in the loop
> header? Any other idea?

That will force spilling everything at loop headers (MAsmJSInterruptCheck::isCall would have to be true).  I said above that perf isn't too important, but this could significantly affect codegen/perf which could limit the utility of JS_NO_SIGNALS for debugging purposes.

> Also, we have to generate the operation callback exit and the check for the
> stack limit value in all cases, whatever we're running with or without
> signals. Otherwise, during runtime, the stack limit check will exit and a
> stack overflow error will be thrown, while the real issue is e.g. a timeout.
> This generates situations of benign races in the engine, if we run with
> signals: the check could get triggered before the signal is handled, and in
> this case the stack trace would be different. Is it fine?

I don't quite follow this, but hopefully MAsmJSInterruptCheck (which would only be generated when !canUseSignals) would avoid this problem.
Flags: needinfo?(luke)
(In reply to Luke Wagner [:luke] from comment #29)
> That suggests reusing MAsmJSInterruptCheck for both loops and the prologue; I don't know
> if that simplifies things.
That actually cannot be done: as we use the same trick as in Ion for the interrupt check (which is, setting the stack limit to -1), we would jump to the stack overflow right away. Instead of that, if the stack overflow check is triggered, we first have to check if the stack limit is -1: if it's true, we call the operation callback exit; otherwise, we jump to the stack overflow exit.

> 
> For linking to the exit, I think you can pass in a pointer to
> ModuleCompiler::interruptLabel_ to the MIR node constructor (rather like how
> we pass in the Label* of the callee to MAsmJSCall).
Sounds good to me.

> 
> > Also, we have to generate the operation callback exit and the check for the
> > stack limit value in all cases, whatever we're running with or without
> > signals. Otherwise, during runtime, the stack limit check will exit and a
> > stack overflow error will be thrown, while the real issue is e.g. a timeout.
> > This generates situations of benign races in the engine, if we run with
> > signals: the check could get triggered before the signal is handled, and in
> > this case the stack trace would be different. Is it fine?
> 
> I don't quite follow this, but hopefully MAsmJSInterruptCheck (which would
> only be generated when !canUseSignals) would avoid this problem.

Sorry I made myself unclear. The case I was thinking about is the following:
- the shell is running with signals (i.e. we have *not* defined the env var JS_NO_SIGNALS).
- a thread calls RequestOperationCallback, which will set the stack limit to -1 and then mprotects code.
- in Odin code, we check for a stack overflow at function headers and loop headers. Now that the stack overflow limit can be -1, we have to check whether the stack limit was -1, in case of stack overflow (otherwise, we would jump to the stack overflow exit and terminate everytime a interrupt request is thrown).

Because of the last bullet, we need to generate the stack limit check in all cases, and thus the operation callback exit as well. I'll try to comment this thoroughly in the affected code.
Still a WIP, but very close to the end: I just need to add loop headers checks and we should be good. I'll split the patch before handing for review.
Attachment #8451036 - Attachment is obsolete: true
Unless I made something wrong, here's a final implementation. It doesn't use the stack limit trick (it was making asm.js prologue too complicated) and instead makes use of MAsmJSInterruptCheck at all places.

We now need to distinguish between usesSignalHandlers() and usesSignalHandlersOOB() in the Odin checker, as now even x86 and ARM can have usesSignalHandlers() but they will still need the OOB checks on arrays.

Two try builds:
- with signals (as usual): https://tbpl.mozilla.org/?tree=Try&rev=3155b835bd1b
- without signals (force canUseSignal to return false at the highest level): https://tbpl.mozilla.org/?tree=Try&rev=7d20b53c68af
Attachment #8451764 - Attachment is obsolete: true
Attachment #8452482 - Flags: review?(luke)
For what it's worth, both try runs look good (after re-triggering some tests).
Comment on attachment 8452482 [details] [diff] [review]
3 - Make Odin not depend on signal handlers

Review of attachment 8452482 [details] [diff] [review]:
-----------------------------------------------------------------

Great work!  Thanks for being so receptive and sorry for the runaround with the stack limit stuff.

::: js/src/jit/AsmJS.cpp
@@ +1148,1 @@
>          bool canUseSignalHandlers = cx_->canUseSignalHandlers();

Can you inline cx_->canUseSignalHandlers() into the use below?

@@ +1230,5 @@
>      bool hasError() const { return errorString_ != nullptr; }
>      const AsmJSModule &module() const { return *module_.get(); }
>      uint32_t srcStart() const { return module_->srcStart(); }
>      bool usesSignalHandlers() const { return module_->usesSignalHandlers(); }
> +    bool usesSignalHandlersOOB() const { return module_->usesSignalHandlersOOB(); }

Can you s/usesSignalHandlers/usesSignalHandlersForInterrupt/ and s/usesSignalHandlersOOB/usesSignalHandlersForOOB/?

@@ +2181,5 @@
> +        m().tokenStream().srcCoords.lineNumAndColumnIndex(pn->pn_pos.begin, &lineno, &column);
> +        CallSiteDesc callDesc(lineno, column);
> +        MInstruction *check = MAsmJSInterruptCheck::New(alloc(), &m().operationCallbackLabel(),
> +                                                        callDesc);
> +        curBlock_->add(check);

Can you put the MAsmJSInterruptCheck::New() call as the argument to add()?  With that change, you'll (barely) be able to fit 'callDesc' on the same line.

@@ +6879,5 @@
> +    i++;
> +
> +    JS_ASSERT(i.done());
> +
> +    JS_ASSERT(((masm.framePushed() + AsmJSFrameSize) % StackAlignment) == 0);

If you AssertStackAlignment() then you'll get this assert and a dynamic one.

@@ +6884,5 @@
> +    masm.call(AsmJSImmPtr(AsmJSImm_HandleExecutionInterrupt));
> +    masm.branchIfFalseBool(ReturnReg, throwLabel);
> +
> +    masm.freeStack(stackDec);
> +    masm.PopRegsInMask(VolatileRegs);

Can you assert that ReturnReg is not in VolatileRegs?

@@ +6950,5 @@
>      }
>  
>      if (!GenerateInterruptExit(m, &throwLabel))
>          return false;
> +    if (!GenerateOperationCallbackExit(m, &throwLabel))

How about renaming to GenerateAsyncInterruptExit and GenerateSyncInterruptExit?  The term "operation callback" has generally been expunged from SM since Jason renamed it to "interrupt" a while back.  Could you propagate the name "syncInterruptExit" (or just "interruptExit" in the MIR/LIR cases where there is no async) throughout the rest of the patch?

Second, could you only generate these exits if the label is used:
if (m().syncInteruptLabel().used() && !GenerateSyncInterruptExit(m, &throwLabel))
?

::: js/src/jit/AsmJSModule.h
@@ +558,5 @@
>      }
>      bool usesSignalHandlers() const {
>          return pod.usesSignalHandlers_;
>      }
> +    bool usesSignalHandlersOOB() const {

How about usesSignalHandlersForOOB()?

::: js/src/jit/MIR.h
@@ +5325,5 @@
>          return AliasSet::None();
>      }
>  };
>  
> +// Check whether we need to fire the interrupt handler at loop headers in asm.js

Also function prologues.  Also, could you mention that this is only emitted if signal handlers are disabled?
Attachment #8452482 - Flags: review?(luke) → review+
(In reply to Luke Wagner [:luke] from comment #34)
> Comment on attachment 8452482 [details] [diff] [review]
> 3 - Make Odin not depend on signal handlers
> 
> Review of attachment 8452482 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Great work!  Thanks for being so receptive and sorry for the runaround with
> the stack limit stuff.

Thanks for your review! I made all the renaming changes, the sync/async interrupt renaming make much more sense with the future profiling work.

> @@ +6884,5 @@
> > +    masm.call(AsmJSImmPtr(AsmJSImm_HandleExecutionInterrupt));
> > +    masm.branchIfFalseBool(ReturnReg, throwLabel);
> > +
> > +    masm.freeStack(stackDec);
> > +    masm.PopRegsInMask(VolatileRegs);
> 
> Can you assert that ReturnReg is not in VolatileRegs?
At this point, we already have checked the return value above, and we don't care about it in CodeGenerator::visitAsmJSInterruptCheck. Do we really need that check?

> Second, could you only generate these exits if the label is used:
> if (m().syncInteruptLabel().used() && !GenerateSyncInterruptExit(m,
> &throwLabel))
> ?
We can totally do this for the sync interrupt label. Regarding the async interrupt label, there are two issues:
1) the label isn't effectively used (as in Label::used()) during the module compilation. It is only bound when generating the async interrupt exit. The label offset is used when finishing the AsmJSModule (to store the async interrupt exit in the static link data), so the label needs to be bound.
2) We could work around this by having the async interrupt exit generated only if usesSignalsForInterrupt() (and then, not store the interrupt label offset etc.) but in this case, it would mean that we can't run a module compiled without signals if signals are actually activated (i.e. we compiled with JS_NO_SIGNALS=1, stored in the cache and then load without JS_NO_SIGNALS=1). That sounds like a limitation we can avoid.

Per 1) and 2), I would generate the async interrupt exit in all cases, making things simpler. Opinions?
Flags: needinfo?(luke)
Posted patch 4 - TestsSplinter Review
Quick tests:
- testHeapAccess: added a test that a module compiled without JS_NO_SIGNALS=1 cannot be linked if signals are deactivated; added a test that a module compiled with JS_NO_SIGNALS=1 runs fine for OOB checks.
- testTimeout*-nosignals.js: duplicated all timeout tests so that they test without signal handlers. On platforms which can't have signal handlers, this is just the same thing as the initial test, but it's worth checking.
- testTimeout-deactivate-reactivate-signals: tests that a module compiled with JS_NO_SIGNALS=1 can still run if signals are activated.
Attachment #8457959 - Flags: review?(luke)
Attachment #8447123 - Attachment is obsolete: true
(In reply to Benjamin Bouvier [:bbouvier] from comment #35)
> At this point, we already have checked the return value above, and we don't
> care about it in CodeGenerator::visitAsmJSInterruptCheck. Do we really need
> that check?

Oh you're right!  I was under the impression that the sync interrupt was itself returning a bool to the caller but of course it doesn't need to do that; if it returns, you continue.

> 1) the label isn't effectively used (as in Label::used()) during the module
> compilation.

Oops, you're right.  Just the sync label, then.
Flags: needinfo?(luke)
Comment on attachment 8457959 [details] [diff] [review]
4 - Tests

Review of attachment 8457959 [details] [diff] [review]:
-----------------------------------------------------------------

Nice!

::: js/src/jsapi.h
@@ +4926,5 @@
>      Register(ION_USECOUNT_TRIGGER, "ion.usecount.trigger")              \
>      Register(ION_ENABLE, "ion.enable")                                  \
>      Register(BASELINE_ENABLE, "baseline.enable")                        \
> +    Register(OFFTHREAD_COMPILATION_ENABLE, "offthread-compilation.enable") \
> +    Register(SIGNAL_HANDLERS_ENABLE, "signal.handlers.enable")

How about just 'signals' instead of 'signal.handlers'?
Attachment #8457959 - Flags: review?(luke) → review+
Backed out tests as they were failing in mysterious ways. Investigating.
https://hg.mozilla.org/integration/mozilla-inbound/rev/c65ea1add688
Duh, if asm.js compilation isn't available (for instance, if --no-fpu is set), testing asm.js caching makes no sense.

Try run for sanity before relanding:
https://tbpl.mozilla.org/?tree=Try&rev=625b782ba68d
Ha, I've created an intermittent: according to the order in which tests are run, as the code in the testTimeoutX-nosignals and testTimeout-deactivate-reactivate-signals are simple copy-pasto from testTimeoutX tests, the asm.js code might be present in the cache and thus reused. I've added simple comments in the source of the variants to ensure that they don't hit cached sources.

https://tbpl.mozilla.org/?tree=Try&rev=bb0d16906388
Oh hah, wow.  There is a --js-cache-per-process flag that creates a new dir for each pid and clears it out when the test finishes, but that is only specified when the tests run in parallel.  I'm guessing that's why you didn't hit it when you ran locally.  Perhaps this should be the default (to avoid this sort of frustration) and --no-js-cache-per-process would be an option?
(In reply to Luke Wagner [:luke] from comment #43)
> Oh hah, wow.  There is a --js-cache-per-process flag that creates a new dir
> for each pid and clears it out when the test finishes, but that is only
> specified when the tests run in parallel.  I'm guessing that's why you
> didn't hit it when you ran locally.  Perhaps this should be the default (to
> avoid this sort of frustration) and --no-js-cache-per-process would be an
> option?

That sounds like the best solution. I think I've hit this again because even a single file is also run several times (under different flags -- this is what --tbpl does). Opened bug 1040823.

Trying to deactivate the first test should do the trick, in the meanwhile:
https://tbpl.mozilla.org/?tree=Try&rev=3751f9ccb1f6
Oops, I think we want this :)
Attachment #8458955 - Flags: review?(benj)
Comment on attachment 8458955 [details] [diff] [review]
fix-interrupt-check

Review of attachment 8458955 [details] [diff] [review]:
-----------------------------------------------------------------

DOH, great catch! I suppose I didn't hit this during testing as most of the time, the pointer mustn't be 0xFF-aligned and thus the sync interrupt check is just called more often than necessary.
Attachment #8458955 - Flags: review?(benj) → review+
Luke, I've pushed your patch with the tests, to be sure they don't fail in an intermittent fashion.

remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/d5b344fdd838
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/7fd0b1dfab4d
Depends on: 1099080
No longer depends on: CVE-2015-0817
You need to log in before you can comment on or make changes to this bug.