Closed Bug 1499324 (BaselineInterpreter) Opened 6 years ago Closed 5 years ago

[meta] Consider compiling the interpreter dynamically

Categories

(Core :: JavaScript Engine: JIT, enhancement, P2)

enhancement

Tracking


RESOLVED FIXED
Performance Impact ?
Tracking Status
relnote-firefox --- 70+
firefox64 --- wontfix
firefox70 --- fixed

People

(Reporter: jandem, Assigned: jandem)

References

(Depends on 2 open bugs)

Details

(Keywords: meta)

Attachments

(1 file)

Our current C++ interpreter has a number of issues, most importantly:

(1) It's pretty slow and has no IC support.
(2) It has its own stack layout and calling convention. This makes calls and OSR into Baseline slower and more complicated than they should be.

Baseline is pretty fast but compilation time sometimes shows up in profiles.

Over the past few weeks we have been talking about generating an interpreter dynamically. It would work a lot like Baseline, but instead of compiling each script to JIT code it would interpret bytecode at runtime. Concretely:

* Frame layout exactly like Baseline, reusing BaselineFrame and JIT calling convention.
* ICs will be attached to JSScript instead of BaselineScript (completely shared by interpreter and Baseline).
* Code to compile a given JSOp will be factored out of BaselineCompiler (with some policy template magic; see the sketch after this list). Implementing a new op will then ensure it works in both Baseline and the generated interpreter.
* The interpreter trampoline we already have will contain the interpreter code instead of the current call-into-C++-interpreter machinery.
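
To illustrate the "policy template" sharing, here is a minimal sketch (all names here are hypothetical stand-ins, not the real API): op implementations are written once against a handler policy and instantiated for both the compiler and the interpreter generator.

```
// Minimal policy-template sketch; all names are hypothetical.
struct MacroAssembler {};  // stand-in for the real assembler type

template <typename Handler>
class BaselineCodeGen {
 protected:
  Handler handler;
  MacroAssembler masm;

 public:
  // Fully shared op: identical codegen in compiler and interpreter.
  bool emit_JSOP_ADD() {
    // ...pop operands, emit the shared IC call, push the result...
    return true;
  }

  // Op whose codegen differs per mode: delegate to the policy.
  bool emit_JSOP_INT8() { return handler.emitPushInt8Operand(masm); }
};

struct CompilerHandler {
  // The compiler knows the pc statically and can bake the operand in.
  bool emitPushInt8Operand(MacroAssembler&) { return true; }
};

struct InterpreterHandler {
  // The interpreter must load the operand from the bytecode at runtime.
  bool emitPushInt8Operand(MacroAssembler&) { return true; }
};

using CompilerSketch = BaselineCodeGen<CompilerHandler>;
using InterpreterGeneratorSketch = BaselineCodeGen<InterpreterHandler>;
```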

This has some nice benefits:

(a) The interpreter will use exactly the same ICs and stubs (code + data) as Baseline.

(b) With a faster interpreter, we can consider increasing the Baseline compile threshold so we spend less time Baseline compiling.

(c) Interpreter => Baseline OSR becomes trivial because the frame layout is the same, so at loop entries we could just change the instruction pointer to the Baseline code (see the sketch after this list).

(d) Ion bailouts could probably be simplified a bit by resuming into the interpreter instead of Baseline (until the next loop edge where we can then use (c) to get back into Baseline).
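
A rough sketch of (c), in hypothetical pseudocode (the real OSR path would also need to map the pc to native code and keep the frame in a synced state):

```
// Hypothetical sketch of interpreter -> Baseline OSR at a loop entry.
// Because the frame layout is identical, no frame conversion is needed:
// look up the Baseline native address for the current pc and jump there.
uint8_t* osrTarget = baselineScript->nativeCodeForPC(loopEntryPC);
jumpTo(osrTarget);  // continue running Baseline code on the same frame
```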

Some more things we discussed:

(A) The current C++ interpreter could still be used for platforms without a JIT, for differential testing and debugging, or (at least initially) for functions that have > 20000 arguments or so (because the C++ interpreter allocates them on the heap instead of on the much smaller native stack). This interpreter could then be simplified a lot.

(B) To determine the current bytecode pc for a given Baseline frame, we currently use the native code => pc map. An interpreter would have to maintain this bytecode pc on the BaselineFrame when we make calls. BaselineFrame already has an "override pc" field that we could probably use.

(C) To load the ICEntry for the current bytecode pc we have a few options (a sketch of both follows the list):

    (1) Store the IC index in the bytecode as an operand.
    (2) Flag all bytecode ops that have an IC as JOF_IC and have the bytecode emitter only store the IC index for jump target ops. This means the interpreter could then keep an ICEntry* pointer that is bumped between ops.
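
A rough sketch of what the two options could look like from the interpreter's point of view (all names and the operand encoding are hypothetical):

```
#include <cstdint>
#include <cstring>

struct ICEntry { /* stub chain, fallback data, ... */ };
struct BaselineFrame { ICEntry* interpreterICEntry; };

// Option (1): read a 32-bit IC index straight from the bytecode operand.
ICEntry* icEntryFromOperand(ICEntry* icEntries, const uint8_t* pc) {
    uint32_t icIndex;
    std::memcpy(&icIndex, pc + 1, sizeof(icIndex));
    return &icEntries[icIndex];
}

// Option (2): keep a running cursor on the frame. Jump-target ops reset
// it from a stored index; every JOF_IC op bumps it when it finishes.
ICEntry* icEntryFromCursor(BaselineFrame* frame) {
    ICEntry* entry = frame->interpreterICEntry;
    frame->interpreterICEntry = entry + 1;  // bump at end of a JOF_IC op
    return entry;
}
```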

(D) Most code on websites is very cold (executed once or twice). Using ICs for such code might actually make us slower because we will waste time attaching stubs that are then never used. One idea is to compile two interpreters, one that uses ICs and one that doesn't, and then switch dynamically between the two. A similar mechanism could maybe be used when toggling debug mode for on-stack scripts.
Depends on: 1494618, 1494590
My biggest concerns are about the development cost needed to maintain the interpreter.

As I mentioned in the meeting we had in April 2018, generating an Interpreter implies that everybody who deals with JavaScript language features has to know how to write code for the Interpreter, and writing code with our current MacroAssembler is by far worse than writing C++.

* Interpreter Optimization

From what I understand, the concern here is that we have no control over the layout of the interpreter stack. Maybe there is an option that does not involve rewriting everything with our MacroAssembler but still gives us full control over the stack layout.

My understanding is that C++ compilers give up because of the heavy use of goto branches, and create a single stack frame in which all stack variables are live at all times.

* Inline Caches (in C++)

About having Inline Caches in the interpreter: we can implement most of our Inline Caches as statically compiled functions with associated data. Doing so would avoid having to generate code at runtime, while still getting a data specialization of the C++ code, such as pre-computed property indexes based on known shapes.

CASE(JSOP_GETPROP) {
    lhs = ...;
    CppInlineCache* ic = getCppICForPC(pc);
    if (!ic->call(ic, lhs)) { goto error; }
}

Here the CppInlineCache stubs would be chained, and the data for each specialization stub would be attached to the CppInlineCache pointer.

bool jsop_getprop_NativeKnownShape(CppInlineCache* ic, MutableHandleValue obj) {
    auto* knownShapeIC = ic->downcast<GetPropNativeKnownShapeCppIC>();
    if (obj->shape() != knownShapeIC->shape()) {
        // Shape guard failed: try the next stub in the chain.
        CppInlineCache* next = ic->next();
        return next->call(next, obj);
    }

    obj = obj->getProp(knownShapeIC->slot());
    return true;
}

bool jsop_getprop_fallback(CppInlineCache* ic, MutableHandleValue lhs) {
    ...
}
I think it should be stressed (at least this is my understanding) that the intent would be to *not* rewrite things into MacroAssembler. The existing BaselineCompiler (with 99% coverage) would be leveraged directly. We would leave in VM calls for anything that is not on the critical path. If we needed to rewrite all the opcodes to make this work, this would probably not make sense to even consider.
Keywords: meta
This sounds promising!
I am in general highly supportive of this idea - mostly because it seems to deliver a number of different short-term wins while still moving us towards a single source of optimization truth.  CacheIR was a good step in that direction, unifying large chunks of Ion and Baseline ICs (with some remaining work to be done).

Moving the interpreter optimization behaviour to start unifying with CacheIR is a good direction.

(In reply to Jan de Mooij [:jandem] from comment #0)
> Our current C++ interpreter has a number of issues, most importantly:
> 
> (1) It's pretty slow and has no IC support.
> (2) It has its own stack layout and calling convention. This makes calls
> and OSR into Baseline slower and more complicated than they should be.

I'd like to add that on pages with heavy load-time JS (e.g. GDocs), we also see a lot of Interpreter time related to setting up calls. Not only is interpreter speed a big deal, but the call and ABI boundaries between Interp => Interp, Interp => JIT, and JIT => Interp calls are insane, and constitute a legitimately heavy component of the performance issues we are trying to address today.

> * Frame layout exactly like Baseline, reusing BaselineFrame and JIT calling
> convention.
> * ICs will be attached to JSScript instead of BaselineScript (completely
> shared by interpreter and Baseline).

I have a slightly tangential question: currently when we compile a script with Ion, and decide to emit an Ion IC for a particular instruction, do we automatically convert any existing baseline CacheIR IC stubs into Ion IC stubs? 

> * Code to compile a given JSOp will be factored out of BaselineCompiler
> (with some policy template magic). Implementing a new op will then ensure it
> works in both Baseline and the generated interpreter.

Additionally, we should be able to optimize the interpreter dispatch here.  If we're generating the interpreter op implementations, we can easily place them all in memory at well-aligned contiguous addresses (e.g. 256-byte chunks), starting at some base address, and turn our dispatch into:

```
   // DISPATCH NEXT OP: jump directly to this op's 256-byte-aligned chunk.
   // (Computed-goto pseudocode; in generated code this is a single
   // indirect jump instruction.)
   goto *(void*)(InterpreterBaseAddress + (op << 8));
```
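
For reference, here is a minimal runnable sketch of threaded dispatch using the GCC/Clang computed-goto extension (table-based for simplicity; the aligned-chunk variant above computes the target directly from the opcode and avoids even the table load):

```
#include <cstdint>
#include <cstdio>

// Minimal threaded-dispatch sketch (GCC/Clang computed-goto extension).
static int run(const uint8_t* pc) {
    static void* const table[] = { &&op_push1, &&op_add, &&op_ret };
    int acc = 0;
#define DISPATCH() goto *table[*pc++]
    DISPATCH();
op_push1:
    acc += 1;
    DISPATCH();
op_add:
    acc += acc;
    DISPATCH();
op_ret:
#undef DISPATCH
    return acc;
}

int main() {
    const uint8_t code[] = { 0, 0, 1, 2 };  // push1, push1, add, ret
    printf("%d\n", run(code));              // prints 4
}
```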

> (C) To load the ICEntry for the current bytecode pc we have a few options:
> 
>     (1) Store the IC index in the bytecode as an operand.
>     (2) Flag all bytecode ops that have an IC as JOF_IC and have the
> bytecode emitter only store the IC index for jump target ops. This means the
> interpreter could then keep an ICEntry* pointer that is bumped between ops.
> 

I would strongly advocate choice (1). There is just too much weird potential for things to get out of sync. It's also very nice to have an O(1) way of going from a bytecode op to its corresponding IC.

> (D) Most code on websites is very cold (executed once or twice). Using ICs
> for such code might actually make us slower because we will waste time
> attaching stubs that are then never used. One idea is to compile two
> interpreters, one that uses ICs and one that doesn't, and then switch
> dynamically between the two. A similar mechanism could maybe be used when
> toggling debug mode for on-stack scripts.

Couldn't we just make sure not to generate an optimized stub on the first run through an IC?  We would always jump immediately to the fallback stub on first run anyway.  That stub has a hit-counter, and could easily choose to skip the optimized stub generation on first execution.
Depends on: 1499644
Priority: -- → P2
Whiteboard: [qf:p1]
Attached patch: Prototype (Splinter Review)
This is a very primitive/hackish/throw-away prototype to test some of these ideas:

* Disclaimer 1: it doesn't share any code with BaselineCompiler, a more serious version would share most opcode implementations.

* Disclaimer 2: it's slow because the pc => IC index lookup requires a C++ call. With bytecode format changes we can make this way faster. It's also slow because the dispatch loop is just an if-else chain on the current JSOp instead of a tableswitch or something.

* It works: we can enter Baseline ICs without having a BaselineScript. We attach optimized stubs and can call into them, etc.

* One part that doesn't work is type monitoring, because it relies on the bytecode map stored in BaselineScript. This shouldn't be too hard to fix, but the prototype just hacks around it.

* Stack walking and GC work as expected. For this, BaselineFrame stores the bytecode pc if we're in interpreter mode.

* It supports enough bytecode ops to interpret "f" in the script below:

---


function f() { // Must be on line 3 to run in prototype interpreter.
    var x = 2;
    for (var i = 0; i < 3; i = i + 1) {
	x *= Math.sqrt(x);
    }
    gc();
    print(getBacktrace());
    return 1;
}
for (var i = 0; i < 50; i++) {}
for (var i = 0; i < 5; i++) {
    print(f());
}
---

* It applies on top of the patch in bug 1499644, inbound rev ca965ec1bc91.
I did an audit of all ops implemented in BaselineCompiler. The majority of op implementations there can be easily shared with the interpreter. A number of them can be shared after some minor refactoring (for instance, we can just pass the jsbytecode* pc to slow VM functions instead of a bytecode operand). The ones that are harder to share can often still reuse the most complicated codegen, VM calls, etc.

Because we will reuse most of Baseline (frame layout, ICs, op implementations, VM/GC/Debugger interaction), the interpreter-specific code should be pretty small - most of it should be the interpreter loop, (bits of) the prologue and unshared opcodes.

For debugger breakpoints/stepping, the simplest option is to do something like the C++ interpreter's opMask trick (see the sketch below). Having a special debug-mode interpreter we can switch to is more complicated (because of scripts on the stack when the debugger is enabled) and probably not worth it initially (but it's fun to think about).
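
A sketch of the opMask idea (loosely modeled on the C++ interpreter; the names and the 0xff value are assumptions): dispatch on `op | opMask`, so that in debug mode every dispatch lands on a single pseudo-op handler that checks for breakpoints and stepping before running the real op.

```
#include <cstdint>

// Hypothetical sketch of opMask-style dispatch. With mask == 0 each op
// dispatches to its own handler; with mask == 0xff (a table slot no real
// opcode uses) every dispatch lands on the debug pseudo-op handler.
constexpr uint32_t EnableInterruptsPseudoOp = 0xff;

uint32_t dispatchIndex(uint8_t op, bool debugMode) {
    uint32_t mask = debugMode ? EnableInterruptsPseudoOp : 0;
    return uint32_t(op) | mask;
}
```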

At this point my main concerns are:

(1) Perf/memory overhead from allocating ICScript and TypeScript eagerly (and using ICs immediately) instead of waiting for Baseline compilation, because most scripts are cold. However:

(a) This will hopefully be mitigated by execution becoming faster (because ICs and optimized interpreter loop), Baseline compilation becoming faster (because no ICs to allocate), and Baseline compiling less code (because faster interpreter).

(b) TypeScript allocation we could potentially do more lazily.

(c) The "cold interpreter" design mentioned in comment 0 is an option: generate two copies and have one of them use VM calls for everything, then when the script warms up (use count of 3 for example) we allocate the ICScript and switch to the other interpreter. This would be very similar to tiering up to Baseline. The main downside is that it will require some extra code for ops that use an IC.

(d) Shorter-term we could also use the C++ interpreter as "cold interpreter". That requires keeping the "OSR into Baseline (Interpreter)" machinery and it would be really nice to get rid of that, but it's something to consider.

(2) Bytecode size overhead from storing IC indexes. I'll collect some data so we can see what the overhead is %-wise.
One of the possibilities to address the size concern is this: with a native-stack JIT-ABI-compatible interpreter, we can more seriously consider compiling with baseline on a per-block basis.

That's a performance no-go right now because the ABI differences between baseline and the interpreter mean we would lose efficiency to the overhead of moving values between the interpreter stack and the C stack. But once this code lands, jumping between baseline and the interpreter should become a simple jump instruction (we can ensure that stack slots are synced whenever a new basic block is entered in baseline).

This follow-up would have the following benefits:

1. Less time spent compiling baseline code (and BaselineCompile is showing up in profiles).

2. Less memory used for jitcode.

This should mitigate (if not eliminate) the memory costs that are of concern here.  Baseline JITcode is very heavyweight.
(In reply to Kannan Vijayan [:djvj] from comment #7)

That's a great idea. Same frame layout has so many benefits (for calls, OSR, debug-mode OSR, Ion bailouts, more fine-grained Baseline compilation) that I'm increasingly thinking that alone makes this worth doing. In this world Baseline (the compiler [0]) really just becomes "unroll the interpreter loop and do some trivial optimization".

I'm also hoping we can increase the Baseline compile threshold in general to compensate for extra time spent upfront. Next week I'll try to get some data on GDocs about our current interpreter and Baseline behavior.

[0] Terminology can get a bit confusing; it will be Baseline = Baseline Interpreter + Baseline Compiler.
Storing the IC index as a 4-byte bytecode operand for each op that has an IC will, for most scripts, be a 40-65% bytecode size regression (with outliers in both directions). I like the simplicity of it, but unfortunately that's quite significant, so it might make sense to store the IC index only for JSOP_JUMPTARGET ops, keep an interpreterICIndex/ICEntry* in BaselineFrame, and increment it at the end of all JOF_IC ops. If everything is based on JOF_IC, it's less likely for things to go out of sync.

I also logged all JSOP_GETPROPs running in the C++ interpreter when starting Nightly, opening a Google Doc and scrolling a bit. Out of 132403 "locations", 82455 (62%) were executed only once. This suggests we shouldn't try to attach IC stubs on the first hit but wait for the second one. Note that the remaining 38% of GETPROP locations (the ones executed more than once) account for > 83% of *all* GETPROPs in the log, so it seems most GETPROPs executed in the interpreter would still benefit from ICs.
An idea Ted and I just had: once all fallback stub codes are runtime-wide trampolines (bug 1501310, something we want to do anyway for perf/simplicity), then EmitCallIC in the *cold* interpreter could just emit a call to that with a nullptr stub.

That makes very cold code fast because (1) we don't have to allocate an ICScript/TypeScript for it and (2) we don't try to attach IC stubs. When the script's warm-up count reaches 2-3 we allocate the ICScript/TypeScript and jump to the warm interpreter.

This is similar to the cold/warm interpreter mentioned before, but in this design the only difference between the two is in EmitCallIC and that's really nice for simplicity and code size.
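
A sketch of how the two EmitCallIC variants could differ (MacroAssembler-style pseudocode; every name here is hypothetical, not the real API):

```
// Hypothetical pseudocode, not the real MacroAssembler API.
void EmitCallIC(MacroAssembler& masm, bool coldInterpreter) {
    if (coldInterpreter) {
        // No ICScript exists yet: pass a nullptr stub and call the shared,
        // runtime-wide fallback trampoline. It performs the VM call but
        // never tries to attach an optimized stub.
        masm.movePtr(ImmPtr(nullptr), ICStubReg);
        masm.call(sharedFallbackTrampoline);
    } else {
        // Warm interpreter: load the current ICEntry's first stub and
        // call its stub code, just like Baseline-compiled code does.
        masm.loadPtr(Address(ICEntryReg, offsetOfFirstStub), ICStubReg);
        masm.call(Address(ICStubReg, offsetOfStubCode));
    }
}
```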
Depends on: 1501310
Depends on: 1503496
Depends on: 1503542
Summary: Consider compiling the interpreter dynamically → [meta] Consider compiling the interpreter dynamically
Alias: BaselineInterpreter
Depends on: 1507066
Depends on: 1507120
Depends on: 1508962
Assignee: nobody → jdemooij
Status: NEW → ASSIGNED
Depends on: 1509537
See Also: → 1507271
Depends on: 1511739
Depends on: 1511747
Depends on: 1511837
Depends on: 1511891
Depends on: 1519378
Depends on: 1521491
Depends on: 1522048
Depends on: 1522068
Depends on: 1522394
Depends on: 1524499
Depends on: 1537906
Depends on: 1541810
Whiteboard: [qf:p1] → [qf:meta]
Depends on: 1548775
Blocks: 1552154
Depends on: 1562129
Depends on: 1562830
Depends on: 1563510
Depends on: 1564017
Depends on: 1564337
Depends on: 1564887
Depends on: 1565788
Depends on: 1565807
Blocks: 1566330
Depends on: 1566332
Blocks: 1567388
Blocks: 1567438
Blocks: 1567920
Blocks: 1570241
Depends on: 1576567

Adding relnote for 70 beta as: "The Baseline Interpreter for JavaScript bytecode execution is now enabled" with a link to the hacks.m.o write-up: https://hacks.mozilla.org/2019/08/the-baseline-interpreter-a-faster-js-interpreter-in-firefox-70/

(In reply to Liz Henry (:lizzard) from comment #11)

> Adding relnote for 70 beta as: "The Baseline Interpreter for JavaScript bytecode execution is now enabled" with a link to the hacks.m.o write-up: https://hacks.mozilla.org/2019/08/the-baseline-interpreter-a-faster-js-interpreter-in-firefox-70/

Thanks! I think we can close this bug now.

Note: initially we wanted the Baseline Interpreter to replace the C++ Interpreter in most cases, but we decided to keep the C++ Interpreter for cold code (because trying to use ICs for cold code would waste some memory and add runtime overhead). A lot of the win from the Baseline Interpreter now comes from delaying Baseline JIT compilation. Nothing is set in stone and this might change in the future, but we're pretty happy with this setup atm.

Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Performance Impact: --- → ?
Whiteboard: [qf:meta]