Closed Bug 1505271 Opened 6 years ago Closed 6 years ago
Stop creating 1 mach exception handler thread per JSRuntime
Categories: Core :: JavaScript: WebAssembly, enhancement
Status: RESOLVED FIXED
Target Milestone: mozilla65
Tracking: firefox65 fixed
People
(Reporter: luke, Assigned: luke)
Attachments (2 files, 2 obsolete files)
2.96 KB, patch, bbouvier: review+
33.36 KB, patch, bbouvier: review+
On OSX, we currently eagerly create one extra thread per JSRuntime (so one for the main thread and one for each worker) that sits waiting for Mach exceptions from that JSRuntime's owner thread. Instead, we should create at most one such thread per process, and lazily, only when wasm is used.
The reason for the current eagerness is that, for a while, the signal handlers were used for all JS Ion code's async interrupts, requiring eager installation; that hasn't been the case since bug 864220.
The reason for the current auxiliary-thread-per-JSRuntime is that there wasn't, until recently, a good way to find the faulting JSContext from a process-wide handler thread. Nowadays, wasm code preserves fp, which allows, at valid wasm trap sites, the JSContext to be found via fp->tls->instance->realm->runtime->cx.
Having a single global handler simplifies a bunch of other things too, because it can simply stay alive for the lifetime of the process.
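For illustration, a stripped-down sketch of that lookup chain (hypothetical, simplified struct layouts; the real SpiderMonkey types carry many more fields and live across several headers):

```cpp
// Hypothetical, simplified mirror of the pointer chain described above.
struct JSContext;
struct JSRuntime { JSContext* cx; };
struct Realm    { JSRuntime* runtime; };
struct Instance { Realm* realm; };
struct TlsData  { Instance* instance; };
struct Frame    { TlsData* tls; };  // wasm preserves fp, which points here

// At a valid wasm trap site, a process-wide handler can recover the faulting
// thread's JSContext from fp alone:
static JSContext* ContextFromFP(Frame* fp) {
  return fp->tls->instance->realm->runtime->cx;
}
```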
Assignee
Comment 1 • 6 years ago
This simple patch switches to fp as the source of truth.
Assignee: nobody → luke
Attachment #9023176 - Flags: review?(bbouvier)
Assignee
Comment 2 • 6 years ago
This patch kills the thread-per-JSRuntime, making the debug thread lazily created, once per process. Net removal of ~100 lines of code.
Attachment #9023177 - Flags: review?(bbouvier)
Comment 3 • 6 years ago
Comment on attachment 9023176 [details] [diff] [review]
use-fp-not-cx
Review of attachment 9023176 [details] [diff] [review]:
-----------------------------------------------------------------
Nice.
Attachment #9023176 - Flags: review?(bbouvier) → review+
Comment 4 • 6 years ago
Comment on attachment 9023177 [details] [diff] [review]
single-debug-thread
Review of attachment 9023177 [details] [diff] [review]:
-----------------------------------------------------------------
Won't there be an issue if the port is partially initialized and fails in the middle of initialization? There was code to clean up this situation in uninstall() (and a ScopeExit in install() calls this uninstall()), but it's been removed as far as I can tell.
::: js/src/wasm/WasmSignalHandlers.cpp
@@ +610,2 @@
> {
> + while(true) {
pre-existing: space between while and opening parenthesis
@@ +610,3 @@
> {
> + while(true) {
> + kern_return_t kret;
nit: fuse declaration and definition one line below?
@@ +622,5 @@
> MOZ_CRASH();
> }
>
> + // Magic constant taken from mach_exc in /usr/include/mach/mach_exc.defs.
> + if (request.body.Head.msgh_id != 2405) {
nit: prefer an ALL_UPPERCASE constant here for the msg id?
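For illustration only, a sketch of what the receive loop could look like with such a named constant (hypothetical names and a simplified message buffer, not the actual patch):

```cpp
#include <cstdlib>
#include <mach/mach.h>

// Magic constant from mach_exc in /usr/include/mach/mach_exc.defs: the
// msgh_id of a mach_exception_raise request (the MACH_EXCEPTION_CODES
// variant of exception_raise).
static const mach_msg_id_t sMachExceptionRaiseId = 2405;

// Simplified receive buffer; the real code declares the full MIG request
// struct plus a trailer.
struct ExceptionRequest {
  mach_msg_header_t Head;
  char data[1024];  // room for the rest of the exception message
};

static void MachExceptionHandlerLoop(mach_port_t port) {
  while (true) {
    ExceptionRequest request;
    kern_return_t kret =
        mach_msg(&request.Head, MACH_RCV_MSG, /* send size */ 0,
                 static_cast<mach_msg_size_t>(sizeof(request)), port,
                 MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL);
    if (kret != KERN_SUCCESS) {
      abort();  // stand-in for the patch's MOZ_CRASH()
    }
    if (request.Head.msgh_id != sMachExceptionRaiseId) {
      continue;  // not an exception-raise message; ignore it
    }
    // ... recover fp from the faulting thread's state and handle the trap ...
  }
}
```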
@@ +711,5 @@
>
> +struct ProcessSignalHandlerState
> +{
> + bool tried;
> + bool have;
These names, albeit short, are not very descriptive; how about `triedSetup` and `areSetup` (or just `setup`)?
@@ +780,5 @@
> + }
> + kret = mach_port_insert_right(mach_task_self(), sMachDebugPort, sMachDebugPort,
> + MACH_MSG_TYPE_MAKE_SEND);
> + if (kret != KERN_SUCCESS) {
> + return false;
What about all the cleanup code that was in uninstall() above? Seems it still should be run if anything goes wrong?
@@ +865,1 @@
> return false;
Note the cleanup code would be needed here too!
Attachment #9023177 - Flags: review?(bbouvier)
Assignee
Comment 5 • 6 years ago
(In reply to Benjamin Bouvier [:bbouvier] from comment #4)
> Won't there be an issue if the port is partially initialized, and fails in
> the middle of initialization? There was code to cleanup this situation in
> uninstall() (and a ScopeExit in install() call this uninstall()), but it's
> been removed as far as I can tell.
The worst that can happen is that the partially-initialized port lives until the end of the process, which is what a fully-initialized port does anyway. In the original code, this thorough cleanup was necessary because the port was per-JSRuntime, not per-process. The rest of the per-process signal-handler initialization works this way, incidentally. Notice also that ProcessSignalState::tried is eagerly set to true, so if the init fails once, it won't be retried; thus at most one port is leaked.
> What about all the cleanup code that was in uninstall() above? Seems it
> still should be run if anything goes wrong?
...
> Note the cleanup code would be needed here too!
... thus I don't think any of this cleanup code is needed.
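For illustration, a minimal sketch of this at-most-once pattern (hypothetical names, and using std::mutex rather than SpiderMonkey's own locking primitives):

```cpp
#include <mutex>

// Hypothetical sketch of the at-most-once installation described above.
struct ProcessSignalHandlerState {
  std::mutex lock;
  bool tried = false;  // set eagerly, even on failure, so we never retry
  bool have = false;   // true only if installation fully succeeded
};

static ProcessSignalHandlerState sState;

// Allocates the Mach port and spawns the handler thread (not shown).
bool InstallMachExceptionHandler();

bool EnsureProcessSignalHandlers() {
  std::lock_guard<std::mutex> guard(sState.lock);
  if (sState.tried) {
    return sState.have;
  }
  sState.tried = true;  // a failed attempt leaks at most one half-built port
  sState.have = InstallMachExceptionHandler();
  return sState.have;
}
```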
Assignee
Comment 6 • 6 years ago
This addresses the comments and also changes the logic so that the non-OSX signal/exception handlers continue to be installed eagerly (by JS_Init()) and only the OSX exception-handler thread is created lazily (for the reason commented above wasm::InitEagerProcessSignalHandlers()).
Note that the LazyInstallState field names remain short/terse because they are always read in the context of `lazyInstallState->tried`, which I think conveys the meaning well.
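For illustration, a sketch of the eager, non-OSX half of that split (hypothetical names and a stubbed handler body; the real handler inspects the fault address and pc and redirects execution to a wasm trap stub):

```cpp
#include <signal.h>
#include <cstring>

// Hypothetical stand-in for the real wasm trap handler: only handles faults
// in wasm code, otherwise defers to whatever handler was there before.
static void WasmTrapHandler(int signum, siginfo_t* info, void* context) {
  // ... decode pc/fp from `context` and redirect to the trap stub ...
}

// Sketch of eager per-process installation; the OSX Mach exception-handler
// thread, by contrast, is created lazily on first wasm use.
bool InitEagerProcessSignalHandlers() {
  struct sigaction act;
  std::memset(&act, 0, sizeof(act));
  act.sa_flags = SA_SIGINFO | SA_NODEFER;
  act.sa_sigaction = WasmTrapHandler;
  return sigaction(SIGSEGV, &act, nullptr) == 0;
}
```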
Attachment #9023177 - Attachment is obsolete: true
Attachment #9023747 - Flags: review?(bbouvier)
Assignee
Comment 7 • 6 years ago
Try showed that, in some configurations, breakpad signal handlers were getting installed after the wasm ones (which means breakpad was getting called first, triggering a crash, before wasm got a chance). This updated patch goes back to the previous scheme of waiting to install the wasm signal handlers until the first JSRuntime/JSContext is created.
Attachment #9023747 - Attachment is obsolete: true
Attachment #9023747 - Flags: review?(bbouvier)
Attachment #9023820 - Flags: review?(bbouvier)
Comment 8 • 6 years ago
Comment on attachment 9023820 [details] [diff] [review]
single-debug-thread
Review of attachment 9023820 [details] [diff] [review]:
-----------------------------------------------------------------
Nice, thanks.
::: js/src/wasm/WasmSignalHandlers.cpp
@@ +883,5 @@
> + // are also task-level (i.e. process-level) exception ports, those are
> + // "claimed" by breakpad and chaining Mach exceptions is dark magic that we
> + // avoid by instead intercepting exceptions at the thread level before they
> + // propagate to the process-level. This works because there are no other
> + // uses of thread-level exception ports.
Thanks, this comment explains the situation much better.
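For illustration, a sketch of the thread-level installation that comment describes (hypothetical function name; error handling simplified):

```cpp
#include <mach/mach.h>

// Hypothetical sketch: register the process-wide handler port for this
// thread's EXC_BAD_ACCESS exceptions, so they are intercepted at the thread
// level before breakpad's task-level ports ever see them.
static bool InstallThreadExceptionPort(mach_port_t handlerPort) {
  kern_return_t kret = thread_set_exception_ports(
      mach_thread_self(),
      EXC_MASK_BAD_ACCESS,
      handlerPort,
      EXCEPTION_DEFAULT | MACH_EXCEPTION_CODES,  // matches msgh_id 2405
      THREAD_STATE_NONE);
  return kret == KERN_SUCCESS;
}
```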
::: js/src/wasm/WasmSignalHandlers.h
@@ +45,5 @@
> +// Assuming EnsureEagerProcessSignalHandlers() has already been called,
> +// this function performs the full installation of signal handlers which must
> +// be performed per-thread/JSContext. This operation may incur some overhead and
> +// so should be done only when needed to use wasm. Currently, this is done in
> +// wasm::HasSupport() which is called when decideing whether to expose the
nit: "deciding", I think?
also nit: wasm::HasCompilerSupport()
Attachment #9023820 - Flags: review?(bbouvier) → review+
Comment 9 • 6 years ago
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/dfdd63aae0c5
Baldr: find the JSContext via fp in the signal handler (r=bbouvier)
https://hg.mozilla.org/integration/mozilla-inbound/rev/0abbc0f316d3
Baldr: only create one wasm exception handler thread per process, and lazily (r=bbouvier)
Comment 10 • 6 years ago
Backed out changeset 0abbc0f316d3 (Bug 1505271) for raptor perma failures
push that caused the backout: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&group_state=expanded&searchStr=ugl&revision=0abbc0f316d3d1b4c47338ff34e8a2e02e6479f0
failure: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&group_state=expanded&fromchange=8ea9a919916f7dfd45d9638d9b1f16bf2eb9e97c&searchStr=linux%2Cx64%2Cquantumrender%2Copt%2Craptor%2Cperformance%2Ctests%2Con%2Cfirefox%2Cwith%2Ce10s%2Ctest-linux64-qr%2Fopt-raptor-unity-webgl-firefox-e10s%2Crap-e10s%28ugl%29&selectedJob=210919086
backout: https://hg.mozilla.org/integration/mozilla-inbound/rev/111655db0b164cbf3eaaac6feccf3c98e8d8066d
Flags: needinfo?(luke)
Comment 11 • 6 years ago
Backed out changeset 0abbc0f316d3 (Bug 1505271) for raptor perma failures CLOSED TREE
Backed out changeset dfdd63aae0c5 (bug 1505271) for raptor perma failures CLOSED TREE
Comment 12 • 6 years ago
bugherder
Status: NEW → RESOLVED
Closed: 6 years ago
status-firefox65: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla65
Assignee
Comment 13 • 6 years ago
Oops, I guess these didn't run because I usually try with "-t none".
Joel: sorry for ni? out of the blue, and feel free to forward me to someone more appropriate, but two questions:
1. I don't see raptor in mozilla-releng.net/trychooser; does it fall under one of the existing labels or is it missing?
2. A symbolicated crash stack would be super-helpful (this looks like it might be a MOZ_CRASH()); is there any way to enable that for raptor runs?
Thanks!
Flags: needinfo?(luke) → needinfo?(jmaher)
Assignee
Comment 14 • 6 years ago
||
Actually, n/m; I thought this might require some arcane build steps to repro, but it repros on any build, and raptor was delightfully easy to run (`./mach raptor-test --test raptor-unity-webgl`, no other setup!).
The bug was that I added some MOZ_RELEASE_ASSERT(cx->wasmHasSignalHandlers) calls at the last moment to asm.js paths, forgetting that asm.js doesn't ensure or require signal handlers (anymore).
(Still, the two feature requests in comment 13 might be nice for future devs. I'd add a third: it'd be useful to capture the crash string of MOZ_CRASH() if that isn't covered by #2.)
Flags: needinfo?(jmaher)
Comment 15 • 6 years ago
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5e300fa7763d
Baldr: find the JSContext via fp in the signal handler (r=bbouvier)
https://hg.mozilla.org/integration/mozilla-inbound/rev/cbe3c9df4f9c
Baldr: only create one wasm exception handler thread per process, and lazily (r=bbouvier)
Comment 16 • 6 years ago
bugherder