Closed Bug 857189 Opened 11 years ago Closed 11 years ago

AddressSanitizer's SIGSEGV handler is incompatible with asm.js

Categories

(Core :: JavaScript Engine, defect)

All
Linux
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla24

People

(Reporter: decoder, Assigned: decoder)

References

(Blocks 1 open bug, )

Details

(Keywords: sec-want, Whiteboard: [asan][asan-test-failure][security-assurance-q2])

Attachments

(1 file, 1 obsolete file)

Attached patch Patch (obsolete) — Splinter Review
Right now, the ASan builds on TBPL are orange because some asm.js tests fail. I investigated this and it seems that these are tests where the internal asm.js SIGSEGV handler is triggered. However, under ASan, the ASan SIGSEGV handler is used instead, causing an abort.

The straightforward fix for this is to disable the ASan SIGSEGV handler when we're using asm.js. There are several ways to do this, however I think since Firefox would quickly get unusable with this handler active, I'd go for overriding the __asan_default_options method in js/src/ion/AsmJSSignalHandlers.cpp to always use "handle_segv=0". It would still be possible to explicitly re-enable it at runtime using ASAN_OPTIONS.

Note that the ASan SIGSEGV handler is not necessary for detecting bugs. The only purpose it has is to show a nice backtrace in case of a "normal" segfault. With the handler disabled, it will just crash regularly if it's a "real" segfault.

The attached patch is green on try with ASan.
Attachment #732426 - Flags: review?(luke)
Hmm, so does this mean the ASan signal handler is getting installed after the asm.js handler?  The asm.js handler gets installed on the first use of asm.js which usually means it's last (and therefore gets first dibs on faults).  Perhaps you could put a breakpoint on sigaction and see why ASan is installing its handler first?
(In reply to Luke Wagner [:luke] from comment #1)
> Perhaps you could put a breakpoint on sigaction and see why ASan
> is installing its handler first?

I guess you mean why it *isn't* installing it first? If asm.js installs the handler after ASan, then ASan wouldn't see the segfaults that asm.js is handling, right?
No, I meant "is installing its handler first": when asm.js installs its SIGSEGV handler, it saves the sPrevHandler and calls that if the asm.js handler doesn't recognize the fault as coming from asm.js code at an expected location.
I don't understand why I should check that ASan is installing its handler first, because that is intended behavior. The ASan runtime runs at program startup and installs the handler.

However, if only stuff that the asm.js handler does *not* recognize is passed on, then wouldn't the same test crash without ASan? Because it doesn't. I'm just running the testZOOB.js from jit-tests. Maybe asm.js is passing on the segfault somewhere although it recognized it?
I'm pretty sure what's happening here is that, for some reason, ASan is clobbering the asm.js handler (so, calling sigaction after EnsureAsmJSSignalHandlersInstalled) and thus the ASan handler is swallowing the fault before the asm.js handler gets to see it.  I've already tested that the asm.js handler correctly interacts with previously-installed handlers, like breakpad.

Btw, is this on all platforms or just Unix?
I've checked in GDB what the order is, and it is:

1. ASan calls sigaction through __asan::MaybeInstallSigaction
2. The JS shell calls sigaction twice (through ScheduleWatchdog/__bsd_signal)
3. asm.js calls sigaction three times through EnsureAsmJSSignalHandlersInstalled
4. Crash:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7e8b018 in ?? ()
(gdb) bt
#0  0x00007ffff7e8b018 in ?? ()
#1  0x00007ffff7e8c052 in ?? ()
#2  0x00007ffffffcb188 in ?? ()
#3  0x00007ffffffca6e0 in ?? ()
#4  0x00007ffffffcb630 in ?? ()
#5  0x00007ffffffca7a0 in ?? ()
#6  0x00007ffffffca7e0 in ?? ()
#7  0x00007ffffffca720 in ?? ()
#8  0x00007ffffffca760 in ?? ()
#9  0x0000000004776056 in CallAsmJS (cx=0x603a0000fdc0, argc=0, vp=0x7ffff66480b0) at /srv/repos/mozilla-central/js/src/ion/AsmJSLink.cpp:330
[...]
(gdb) x /i $pc
=> 0x7ffff7e8b018:      movsbl (%r15,%rax,1),%eax
(gdb) info reg rax r15
rax            0x7fffffff       2147483647
r15            0x7ffef6400000   140733029810176


This is just unix because ASan isn't supported yet on Windows.
Should have mentioned that continuing after that SIGSEGV triggers the ASan SIGSEGV handler of course.
Hmm, interesting.  That fault is expected and the AsmJSFaultHandler should be handling it.  Could you break on AsmJSFaultHandler, see whether it is hit and, if it is, see where in HandleSignal we 'return false'?
Just tried that, and it isn't even called.
That sounds like the bug.  If, after gdb stops at the SIGSEGV, you step (with 's'), what handler to you step into?
I just get "Cannot find bounds of current function" when stepping after that. When I use "stepi" instead, I end up in __asan::ASAN_OnSIGSEGV immediately.
Comment on attachment 732426 [details] [diff] [review]
Patch

Review of attachment 732426 [details] [diff] [review]:
-----------------------------------------------------------------

Hmm, that is strange; the order of the sigaction calls is such that asm.js's should be the one called, but perhaps ASan is doing more trickery.  Anyhow, we've done our due diligence, r+.

::: js/src/ion/AsmJSSignalHandlers.cpp
@@ +917,5 @@
> +static const char mozAsanDefaultOptionsAsmJS[] = "handle_segv=0";
> +
> +extern "C" MOZ_ASAN_BLACKLIST
> +const char* __asan_default_options() {
> +    return mozAsanDefaultOptionsAsmJS;

Can you just write
  return "handle_segv=0"
and remove mozAsanDefaultOptionsAsmJS instead?
Attachment #732426 - Flags: review?(luke) → review+
Thanks. I confirmed that with handle_segv=0, I'm getting AsmJSFaultHandler called after stepping. I'll try to dig a bit deeper and see if I can find the actual reason for this behavior, as I would feel better if I understood what's going on. But if we cannot figure it out, I'll just land the patch.
If I interpret the source code of ASan correctly, then on Linux, it will intercept sigaction and not allow installing additional signal handlers if it's already handling the signal by itself. I've filed a bug for ASan to find out why this is the case.
LLVM r180255 implements a new option for ASan that allows us to use our own SIGSEGV handler on top of the ASan one. I will investigate if this is working now, and if so, come up with a patch. Of course this will only be effective once we upgraded our clang to that revision.
Marking this as a Q2 goal for security assurance. This goal however cannot include the Clang upgrade, but should be limited to fixing the two issues in our code (one is using the new LLVM setting, which will have no effect on older builds, the other is fixing the signal handler chaining in ASM.js which is broken right now).

I'll come up with a patch soon.
Whiteboard: [asan][asan-test-failure] → [asan][asan-test-failure][security-assurance-q2]
Attached patch Better patchSplinter Review
This patch uses a new option introduced by the ASan developers that allows the program to use its own signal handlers. However, in addition to that we need to fix a bug in ASM.js: sigaction is called twice to install handlers for SIGSEGV and SIGBUG, however, there is only one variable to store the previous handler, which means that the previous SIGSEGV handler (in this case ASan) is not stored. Instead, the previous SIGBUS handler (default handler) is used for both.

The patch solves this by using to variables for the previous handlers.
Assignee: general → choller
Attachment #732426 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #744134 - Flags: review?(luke)
Comment on attachment 744134 [details] [diff] [review]
Better patch

Review of attachment 744134 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, thanks!

::: js/src/ion/AsmJSSignalHandlers.cpp
@@ +915,5 @@
>      // it's previous disposition and return. This will cause the faulting op to
>      // be re-executed which will crash in the normal way. The advantage to
>      // doing this is that we remove ourselves from the crash stack which
>      // simplifies crash reports. Note: the order of these tests matter.
> +    struct sigaction* sPrevHandler = NULL;

Could you name this prevHandler (since it's a local variable now)?

@@ +920,5 @@
> +    if (signum == SIGSEGV) {
> +        sPrevHandler = &sPrevSegvHandler;
> +    } else if (signum == SIGBUS) {
> +        sPrevHandler = &sPrevBusHandler;
> +    }

No { } around single-statement then/else bodies.  Also, could you turn the "else if" into "else"?

@@ +1003,5 @@
>  #endif
>  }
> +
> +#if defined(JS_ASMJS)
> +# ifdef MOZ_ASAN

I think you can remove the #if defined(JS_ASMJS).
Attachment #744134 - Flags: review?(luke) → review+
Blocks: 872565
https://hg.mozilla.org/mozilla-central/rev/835e5b42aa10
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla24
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: