Bug 1091912 (Closed): Opened 10 years ago, Closed 10 years ago

stop using mprotect to halt Ion/asm.js execution

Categories

(Core :: JavaScript Engine, defect)


Tracking


RESOLVED FIXED
mozilla36

People

(Reporter: luke, Assigned: luke)

References

Details

Attachments

(1 file, 1 obsolete file)

The use of mprotect/VirtualProtect to take away execution privileges currently causes some problems:
 - some OSes don't like taking away *just* PROT_EXEC, so we also have to take away READ, which means there are a lot more places we can fault, not just when executing Ion code
 - gdb/lldb break *before* running the user-defined SIGSEGV handler and offer no way to break *after* a user-defined SIGSEGV handler has passed on the exception, so there's no .gdbinit that ignores the harmless SIGSEGVs while catching the real ones.  Instead we have the JS_NO_SIGNALS hack.

The whole point of taking away execution privileges is to stop the main thread so we can do something to it (patch backedges in Ion, move the pc if in asm.js code).  We can achieve the same effect using SuspendThread on Windows and pthread_kill on other platforms.  (But won't pthread_kill still break by default in gdb/lldb?  Not if we choose a benign signal that isn't already claimed by NSPR, pthreads, or libc.  Is there such a signal?  I think so: SIGVTALRM.)
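For reference, a minimal sketch of the pthread_kill half of this (illustrative only, not the actual SpiderMonkey code; InterruptHandler, TriggerInterrupt, and interruptRequested are made-up names):

  #include <atomic>
  #include <cstring>
  #include <pthread.h>
  #include <signal.h>

  // C++ code polls this flag at safe points; the signal only exists to kick a
  // thread that is spinning in JIT code into the handler below.
  static std::atomic<bool> interruptRequested(false);

  // Runs on the interrupted thread.  In the real scheme, the handler would use
  // the SA_SIGINFO context to inspect the interrupted register state and move
  // the pc if it is in Ion/asm.js code; here it is just a stub.
  static void InterruptHandler(int, siginfo_t *, void *context)
  {
      (void) context;
  }

  static void InstallInterruptHandler()
  {
      struct sigaction act;
      memset(&act, 0, sizeof(act));
      act.sa_flags = SA_SIGINFO;
      act.sa_sigaction = InterruptHandler;
      sigemptyset(&act.sa_mask);
      // SIGVTALRM, assumed unclaimed by NSPR/pthreads/libc as argued above.
      sigaction(SIGVTALRM, &act, nullptr);
  }

  // Called from the requesting thread to interrupt 'target'.
  static void TriggerInterrupt(pthread_t target)
  {
      interruptRequested.store(true);
      pthread_kill(target, SIGVTALRM);
  }

On Windows, the analogous sequence would be SuspendThread, then GetThreadContext to inspect/redirect the pc, then ResumeThread.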

I think this should also allow some simplification of the current interrupt-lock scheme since we have a lot less to protect against.
Depends on: 1091916
Blocks: 1063188
Attached patch rm-exec-fault (obsolete) — Splinter Review
This patch does what comment 0 said and removes about 400 lines of locky code in the process.

Everything looks mostly good except B2G and Android 2.3.  In the B2G case, the logs seem to show that the seccomp filter isn't happy with the pthread_kill; in theory that's not hard to work around.  For the Android 2.3 bustage, given that it doesn't show up on 4.0, I assume it's some bug in the emulator's handling of pthread_kill; I need to run the emulator to reproduce.
Depends on: 1093893
Attached patch rm-exec-faultSplinter Review
This patch is looking green on try.  Bug 1093893 added tkill to the seccomp filter, which fixed the B2G orange.
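For illustration of what "allowing tkill" means, here is roughly the same rule expressed with the libseccomp API; the actual fix in bug 1093893 was to Gecko's own seccomp-bpf whitelist, so treat this purely as a sketch of the idea:

  #include <seccomp.h>

  // Sketch: permit tkill/tgkill so pthread_kill (which is implemented on top
  // of them) is not rejected by the sandbox.  A real filter would of course
  // whitelist many more syscalls before loading.
  int AllowThreadSignals()
  {
      scmp_filter_ctx ctx = seccomp_init(SCMP_ACT_KILL);  // default action for unlisted syscalls
      if (!ctx)
          return -1;
      seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(tkill), 0);
      seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(tgkill), 0);
      int rc = seccomp_load(ctx);
      seccomp_release(ctx);
      return rc;
  }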

For the Android 2.3 orange, I'm pretty darn sure it's a bug in Android 2.3 or at least in the emulator.  I pushed a bunch of variations of the patch to try and found that:
 - the patch makes 1 out of 3 unit test jobs fail on Android 2.3 with a crash, almost always while tearing down a thread (it looks like thread data structure corruption, especially the pthread TLS data)
 - a variant that makes the signal handler a no-op has the same crashes
 - a variant that additionally comments out the pthread_kill call has no crashes
Wikipedia shows <20% of devices on <=2.3 (http://en.wikipedia.org/wiki/Android_version_history), so this patch just disables signal handling on versions <=2.3.  I expect the bug is actually in the emulator; if we ever cared (and had real hardware to test on; tb is all emulators) we could additionally test for that, but it doesn't seem worth it atm.
Attachment #8516467 - Attachment is obsolete: true
Attachment #8518620 - Flags: review?(bhackett1024)
Comment on attachment 8518620 [details] [diff] [review]
rm-exec-fault

Review of attachment 8518620 [details] [diff] [review]:
-----------------------------------------------------------------

Nice!

::: js/src/asmjs/AsmJSSignalHandlers.cpp
@@ +913,5 @@
> +    if (jit::JitRuntime *jitRuntime = rt->jitRuntime()) {
> +        // If we are in the middle of mutating the backedge list, we must be in
> +        // C++ and thus not locked into a JIT iloop. We assume that we'll check
> +        // the interrupt flag at least once before entering JIT code (if not,
> +        // the interrupt will just be triggered again next second).

typo

@@ +1029,5 @@
>  
> +// JSRuntime::requestInterrupt sets interrupt_ (which is checked frequently by
> +// C++ code at every Baseline JIT loop backedge) and jitStackLimit_ (which is
> +// checked at every Baseline and Ion JIT function prologue). That leaves Ion
> +// loop backedges and all asm.js code. These last two cases are handled by this

'That leaves ion loop backedges'?

::: js/src/asmjs/AsmJSSignalHandlers.h
@@ +29,5 @@
>  namespace js {
>  
> +// Set up any signal/exception handlers needed to execute code in the given
> +// runtime. Return whether it is safe to rely on signal handling for asm.js
> +// out-of-bounds checks.

If this returns false it's also unsafe to use signal handling for Ion interrupts, right?
Attachment #8518620 - Flags: review?(bhackett1024) → review+
(In reply to Brian Hackett (:bhackett) from comment #3)
> If this returns false it's also unsafe to use signal handling for Ion
> interrupts, right?

Right, thanks.

https://hg.mozilla.org/integration/mozilla-inbound/rev/51572056e797
Talos dromaeojs also became nearly permafail on OS X 10.6 after this landed.
https://treeherder.mozilla.org/ui/logviewer.html#?job_id=3774284&repo=mozilla-inbound
The OSX+Dromaeo failures seem to have been caused by bug 1091916 (as evidenced by the failures continuing after the backout of this bug and stopping only after the backout of bug 1091916).  The fix and relanding of bug 1091916 appear to have resolved them.

It's weird that the 4.0 crashes didn't show up in previous try pushes, but with enough retriggers they show up now.  The crash signatures are the same as the Android 2.3 crashes above, so I guess it's a general Android issue.

On a whim, I put some logging statements into NSPR thread creation/destruction and found my first actual lead: on jobs that crash, _pt_thread_death_internal is being called while the thread is still running!  This frees the PRThread, which is why we crash when the thread returns and NSPR attempts to access that PRThread.  More logging...
More logging shows that what's happening is that pthread_kill is, in rare cases, causing a pthread_join to return before the joined-on thread actually completes.  This causes the NSPR PRThread to be destroyed while the thread is still running, and then Bad Things happen.  Spurious wakeup + signals sounds like someone isn't checking the return value of pthread_cond_wait for EINTR, and the bionic source for pthread_join shows exactly this bug:
  https://android.googlesource.com/platform/bionic/+/9d23e04/libc/bionic/pthread_join.cpp
Fortunately, someone already reported and fixed this bug:
  https://android-review.googlesource.com/#/c/52333/
in Android 4.4, so it sounds like I can just disable the optimization when ANDROID && __ANDROID_API__ < 19.
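To illustrate the bug class: a wait like this has to re-check its condition and retry when a signal interrupts it, instead of treating the early return as completion.  A minimal sketch of the correct pattern around a raw futex wait (illustrative only, not bionic's actual pthread_join code):

  #include <cerrno>
  #include <linux/futex.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  // Wait until *state changes away from 'expected'.  If a signal (e.g. our
  // SIGVTALRM interrupt) makes the futex wait return with EINTR, loop and
  // re-check instead of returning.  Pre-4.4 bionic's pthread_join effectively
  // skipped this re-check, so a stray pthread_kill could make it return while
  // the joined-on thread was still running.
  static void WaitForChange(volatile int *state, int expected)
  {
      while (*state == expected) {
          long rc = syscall(SYS_futex, state, FUTEX_WAIT, expected, nullptr, nullptr, 0);
          if (rc == -1 && errno != EINTR && errno != EAGAIN)
              break;  // real error: give up rather than spin
          // On EINTR/EAGAIN or a spurious wakeup, fall through and re-check *state.
      }
  }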
So I think I can just update the previous patch to use "19" instead of "10" for the SDK version query (that way the optimization kicks in whenever the phone is running 4.4, rather than only when we eventually update our build dependency).  Re-pushing to try now; I hope to land if it's green.
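A rough sketch of the runtime gate this describes (the property name ro.build.version.sdk and the function EnsureSignalHandlersInstalled are from this bug; the helper below and its exact shape are illustrative):

  #include <cstdlib>
  #if defined(ANDROID)
  # include <sys/system_properties.h>
  #endif

  // Return false on Android devices older than KitKat (SDK 19), where the
  // bionic pthread_join bug makes pthread_kill-based interrupts unsafe;
  // callers then simply leave signal handling disabled.
  static bool SignalHandlersSafeOnThisDevice()
  {
  #if defined(ANDROID)
      char value[PROP_VALUE_MAX];
      if (__system_property_get("ro.build.version.sdk", value) > 0)
          return atoi(value) >= 19;
      return false;  // unknown version: be conservative
  #else
      return true;
  #endif
  }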
https://hg.mozilla.org/mozilla-central/rev/7db30249d1d8
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla36
Blocks: 1099493
Depends on: 1099339
This patch likely caused a ~25% regression on octane-zlib on Firefox OS devices (unagi & flame).

http://arewefastyet.com/#machine=26&view=single&suite=octane&subtest=zlib&start=1415802570&end=1415924708
Flags: needinfo?(luke)
I'm guessing that's the sdk version being less than 19 (tested via ro.build.version.sdk in EnsureSignalHandlersInstalled).  IIUC, new FFOS releases will be based on KK (Android v.4.4, SDK v.19) which will pass this test.  What is the Android version being tested in these devices?
Flags: needinfo?(luke)
(In reply to Luke Wagner [:luke] from comment #13)
> I'm guessing that's the sdk version being less than 19 (tested via
> ro.build.version.sdk in EnsureSignalHandlersInstalled).  IIUC, new FFOS
> releases will be based on KK (Android v.4.4, SDK v.19) which will pass this
> test.  What is the Android version being tested in these devices?

We are seeing the same regression for Fennec on Android.  This is Android 4.3 and api11.  (According to https://developer.android.com/about/dashboards/index.html, only 44% of devices are on KitKat or higher; not sure if we have data about Fennec usage.)  So do we just hope its usage declines quickly, and should I actually look into upgrading the flame to Android 4.4?  Or is there something we can do for 4.3 and earlier?
The old strategy carried a fair amount of complexity and developer frustration (at the Portland work week, something like 3 different people out of the blue complained about "those crappy random JS seg faults"; all were most pleased we stopped doing that), so I'd much prefer we just wait it out for 4.4.
(Oh, but I forgot to answer your direct question: yes, let's update awfy to 4.4 :)
Depends on: 1163891