[meta] Support specialization / partial evaluation of C++ interpreter running inside a Wasm module
Categories
(Core :: JavaScript Engine, enhancement, P5)
Tracking
()
People
(Reporter: cfallin, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: meta)
This meta-bug is meant to track whatever upstreaming is agreeable for the work I've done on "weval", the WebAssembly partial evaluator (weval, WIP SpiderMonkey branch).
The basic idea of weval is to take a snapshot of an interpreter and its bytecode (using, e.g., the Wizer Wasm snapshotting tool) and then "specialize" calls to the Interpret function on particular, constant PC values. With the help of some intrinsics noting that, e.g., reads of bytecode from memory can be assumed to remain constant, the tool can "unroll" the interpreter loop, so the CFG of the bytecode becomes the CFG of the produced specialized function body. We take a slightly modified interpreter and we get a template JIT.
With my prototype, I'm seeing a geomean speedup on Octane over the pure-C++-interpreter-on-Wasm baseline of 1.83x (that is, an 83% speedup), up to 3.5x on crypto. Separately, a Markdown renderer gets a 1.38x speedup. I suspect further improvements will need ICs in the Wasm (and I'm happy to talk about our thoughts there), but we want to try to ship this speedup first.
I'm working on a clean patch series; in rough order, I think it'll take the following:
- A patch to add some basic Wizer support to the JS shell itself, for testing/development in vanilla upstream. The deployment scenario where we would use this has its own Wizer integration already, but testing only out-of-tree is pretty unwieldy.
- A modification to the C++ interpreter (under ifdefs to avoid impact in the non-weval case) to make a recursive call at the C++ level for each JS call. (Currently JS-to-JS calls avoid this.) This is necessary because weval's basic primitive is "specialize this function with these constant args", so we need a separate function call which we can specialize with the callee's PC / bytecode body. Separable into:
  - A patch to pull out the locals in Interpret into a context struct;
  - A patch for the above JS-call-is-C++-call behavior, under a config option;
- A series of patches to add annotations (calls to weval intrinsics) to the interpreter:
  - Pull in weval headers to third_party;
  - Add a notion of "specialized" and "generic" invocations to Interpret as an enum arg, with the former tail-calling to the latter on errors and for misc other cases like generators (this is what lets an "interpreter bailout" fall out of the weval transform);
  - Add the intrinsics to say "bytecode memory is constant" (just a few lines);
  - Add the intrinsics to say "specialize the interpreter loop" and "this is the current PC" (just a few lines);
  - Add the specialized-variant function pointer to the script data structures, code to make the "specialization request" that weval finds in the snapshot, and code to invoke the specialized function if it exists.
I'd love to get feedback on which of the above could be easily upstreamed and which are too weird or difficult to maintain. I'd prefer to get at least the refactors in step 2 in, because they either should have no impact (context struct, should get SROA'd in native build) or are ifdef'd off, and mean that we'll have substantially less rebase pain. If we can get the actual weval intrinsics in (step 3), all the better. How to test this is also sort of an open question. If it's untested/unsupported upstream, and we validate that everything still works when we upgrade our vendored SM, that's fine; maybe there's appetite for more though.
Patch for step 1 is in bug 1831030 and patches for the rest will be up soon!
Updated•2 years ago

Reporter | Comment 1•2 years ago
Since the patchset is not likely to be upstreamable, unfortunately I will go ahead and close this bug.