Closed Bug 1855321 Opened 1 year ago Closed 11 months ago

Add a portable (no codegen/JIT) baseline interpreter to SpiderMonkey

Categories

(Core :: JavaScript Engine, enhancement, P3)

enhancement

Tracking

()

RESOLVED FIXED
120 Branch
Tracking Status
firefox120 --- fixed

People

(Reporter: cfallin, Assigned: cfallin)

References

(Blocks 1 open bug)

Details

Attachments

(17 files, 1 obsolete file)

48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review

Currently, SpiderMonkey's tiers above the C++ interpreter -- including the "baseline interpreter" and "baseline compiler", which can execute inline caches (ICs) and significantly speed up execution -- work only on platforms that support runtime code generation. This is true even for the baseline interpreter, because (i) its interpreter body is generated at startup, by using the same logic that the baseline compiler uses, and (ii) it invokes compiled IC bodies.

Not every platform or environment supports runtime codegen: for example, SpiderMonkey inside a Wasm module (wasm32-wasi target) does not, some operating systems prohibit it, some CPU architectures are not supported by SpiderMonkey's JIT backends, and some use-cases may require the security of disabling JIT. It would be a significant enhancement to SpiderMonkey's usefulness if it could make use of the engineering that has gone into CacheIR-based ICs in these use-cases.

I have implemented this, and the current implementation is seeing nice speedups in production in our use-case: 42% wallclock time reduction for one use-case, 17% for another, a geomean speedup on Octane benchmarks of 1.26x and up to 1.87x on one benchmark (DeltaBlue). I'll attach patches momentarily.

Here is a public version of the design document: link

This commit adds --enable-portable-baseline and
--enable-portable-baseline-force. The former builds SpiderMonkey with
PBL included, but part of the normal tiering hierarchy; when the latter
is included, all JS execution (that is able) is executed by PBL eagerly.

Assignee: nobody → chris
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true

This commit adds JitOptions and corresponding JS shell command-line
options to control whether PBL is enabled and what its tiering threshold
is.

Depends on D189294

PBL is going to use some of these same helpers, so let's put them in
Interpreter-inl.h rather than Interpreter.cpp.

Depends on D189295

These field accessors will be needed by PBL's CacheIR interpreter.

Depends on D189296

This is needed to support a JitStackAlignment of 4, which can happen
on 32-bit builds with PBL enabled (e.g. on wasm32-wasi).

Depends on D189297

This is needed by the static branch prediction (next-opcode prediction)
in PBL's CacheIR interpreter.

Depends on D189299

PBL's CacheIR interpreter is measurably faster if we avoid bounds-checks
on every CacheIRReader read of an opcode or operand/immediate. We assume
that we will be executing well-formed CacheIR (this assumption is
already relied upon elsewhere) and thus we instead stop interpreting
when we reach a ReturnFromIC or hit an earlier guard failure.

Depends on D189300

While this will increase the size of CacheIR slightly in memory, it is a
nontrivial performance improvement when interpreting CacheIR opcodes,
because it avoids a hard-to-predict branch. The small size increase
appears worth it, at least in PBL applications, and has not been
observed to lead to any measurable macro-scale memory bloat.

Depends on D189301

This stack is needed by PBL to store baseline stack frames, because it
cannot use the machine stack (which will be managed by the host's C++
compiler instead).

Depends on D189302

These miscellaneous accessors are needed by the PBL interpreter.

Depends on D189303

PBL diverges slightly from the native baseline implementation's stack
state at IC invocation time (though the stack signatures of all JSOps
match when including the pre-IC and post-IC logic) and thus PBL must
avoid invocation of the expression decompiler for error messages in some
cases. This commit makes the fallback IC logic more tolerant of missing
"stack indices" for this purpose.

Depends on D189304

  • Sometimes there is no JitRuntime, even if we are in baseline-related
    code paths.
  • Sometimes a jit-code or stub-code pointer is null.

Depends on D189305

This commit adds the body of PBL, the Portable Baseline Interpreter.
A more detailed description is included in the [SMDOC] in the header
file.

Almost all of this commit was authored by the named commit author; two
CacheIR guard opcodes, the constructor-call fastpath, and several
bugfixes are due to Jamey Sharp <jsharp@fastly.com> (thanks!).

Depends on D189306

Several of the jit-tests need to be modified for all tests to pass with
PBL enabled and eager (test with --portable-baseline --portable-baseline-eager). With these changes, all jit-tests do pass.

The changes fall into several main categories:

  • Because PBL diverges slightly in decompiler usage, error messages are
    sometimes more generic (e.g., "undefined value" rather than "x is
    undefined", that sort of thing). Several of these changes loosen
    error-message expectations or exclude the relevant tests from PBL.
    All actual error statuses (code works or throws exception) remain
    correct.

  • PBL's auxiliary stack and recursion limit are smaller than native
    baseline supports, at least on some platforms; recursion counts that
    are meant to invoke overflow or not invoke overflow (but whose
    specific value is otherwise immaterial) are thus adjusted in places.

  • A few other miscellaneous changes, supported at each point with a
    comment.

Depends on D189309

On PBL, alignment is only to native word size, to avoid the need to add
additional alignment padding.

Depends on D189310

Severity: -- → N/A
Priority: -- → P3
Attachment #9355192 - Attachment description: Bug 1855321 part 1 / 18 (PBL): add configure options. r=jandem → Bug 1855321 part 1 / 17 (PBL): add configure options. r=jandem
Attachment #9355193 - Attachment description: Bug 1855321 part 2 / 18 (PBL): add runtime options to enable/disable PBL. r=jandem → Bug 1855321 part 2 / 17 (PBL): add runtime options to enable/disable PBL. r=jandem
Attachment #9355194 - Attachment description: Bug 1855321 part 3 / 18 (PBL): move some C++ interpreter helpers into inline .h file. r=jandem → Bug 1855321 part 3 / 17 (PBL): move some C++ interpreter helpers into inline .h file. r=jandem
Attachment #9355195 - Attachment description: Bug 1855321 part 4 / 18 (PBL): code motion: put CacheIR field accessors in header. r=jandem → Bug 1855321 part 4 / 17 (PBL): code motion: put CacheIR field accessors in header. r=jandem
Attachment #9355196 - Attachment description: Bug 1855321 part 5 / 18 (PBL): `AbstractFramePtr` tagging: fit within two bits. r=jandem → Bug 1855321 part 5 / 17 (PBL): `AbstractFramePtr` tagging: fit within two bits. r=jandem
Attachment #9355197 - Attachment description: Bug 1855321 part 6 / 18 (PBL): make CacheOp a `uint16_t` explicitly, and expose `NumOpcodes`. r=jandem → Bug 1855321 part 6 / 17 (PBL): make CacheOp a `uint16_t` explicitly, and expose `NumOpcodes`. r=jandem
Attachment #9355198 - Attachment description: Bug 1855321 part 7 / 18 (PBL): CacheIRReader: add "peek" method. r=jandem → Bug 1855321 part 7 / 17 (PBL): CacheIRReader: add "peek" method. r=jandem
Attachment #9355200 - Attachment description: Bug 1855321 part 9 / 18 (PBL): CacheIR: use fixed 16-bit values for ops rather than packed 15-bit values (for speed). r=jandem → Bug 1855321 part 8 / 17 (PBL): CacheIR: use fixed 16-bit values for ops rather than packed 15-bit values (for speed). r=jandem
Attachment #9355201 - Attachment description: Bug 1855321 part 10 / 18 (PBL): add an auxiliary stack. r=jandem → Bug 1855321 part 9 / 17 (PBL): add an auxiliary stack. r=jandem
Attachment #9355202 - Attachment description: Bug 1855321 part 11 / 18 (PBL): Add some needed accessors. r=jandem → Bug 1855321 part 10 / 17 (PBL): Add some needed accessors. r=jandem
Attachment #9355203 - Attachment description: Bug 1855321 part 12 / 18 (PBL): CacheIR fallbacks: make more tolerant of no-decompiler case. r=jandem → Bug 1855321 part 11 / 17 (PBL): CacheIR fallbacks: make more tolerant of no-decompiler case. r=jandem
Attachment #9355204 - Attachment description: Bug 1855321 part 13 / 18 (PBL): miscellaneous ifdefs and null-checks to handle running in new configuration. r=jandem → Bug 1855321 part 12 / 17 (PBL): miscellaneous ifdefs and null-checks to handle running in new configuration. r=jandem
Attachment #9355205 - Attachment description: Bug 1855321 part 14 / 18 (PBL): Add the main implementation of PBL (not yet invoked). r=jandem → Bug 1855321 part 13 / 17 (PBL): Add the main implementation of PBL (not yet invoked). r=jandem
Attachment #9355206 - Attachment description: Bug 1855321 part 15 / 18 (PBL): invoke the PBL tier from tiering logic. r=jandem → Bug 1855321 part 14 / 17 (PBL): invoke the PBL tier from tiering logic. r=jandem
Attachment #9355207 - Attachment description: Bug 1855321 part 16 / 18 (PBL): Tweak profiling to deal with PBL. r=jandem → Bug 1855321 part 15 / 17 (PBL): Tweak profiling to deal with PBL. r=jandem
Attachment #9355208 - Attachment description: Bug 1855321 part 17 / 18 (PBL): update tests. r=jandem → Bug 1855321 part 16 / 17 (PBL): update tests. r=jandem
Attachment #9355209 - Attachment description: Bug 1855321 part 18 / 18 (PBL): Fix JitStackAlignment. r=jandem → Bug 1855321 part 17 / 17 (PBL): Fix JitStackAlignment. r=jandem
Attachment #9355199 - Attachment is obsolete: true
Pushed by iireland@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/418d4dfe5fa0
part 1 / 17 (PBL): add configure options. r=jandem
https://hg.mozilla.org/integration/autoland/rev/5a041bffa082
part 2 / 17  (PBL): add runtime options to enable/disable PBL. r=jandem
https://hg.mozilla.org/integration/autoland/rev/206c10d873b3
part 3 / 17 (PBL): move some C++ interpreter helpers into inline .h file. r=jandem
https://hg.mozilla.org/integration/autoland/rev/7e4aa1404ab4
part 4 / 17 (PBL): code motion: put CacheIR field accessors in header. r=jandem
https://hg.mozilla.org/integration/autoland/rev/598525a0055a
part 5 / 17 (PBL): `AbstractFramePtr` tagging: fit within two bits. r=jandem
https://hg.mozilla.org/integration/autoland/rev/07686583c076
part 6 / 17 (PBL): make CacheOp a `uint16_t` explicitly, and expose `NumOpcodes`. r=jandem
https://hg.mozilla.org/integration/autoland/rev/90aa66a98802
part 7 / 17 (PBL): CacheIRReader: add "peek" method. r=jandem
https://hg.mozilla.org/integration/autoland/rev/90ddd9ea8b06
part 8 / 17 (PBL): CacheIR: use fixed 16-bit values for ops rather than packed 15-bit values (for speed). r=jandem
https://hg.mozilla.org/integration/autoland/rev/bc9431e5c954
part 9 / 17 (PBL): add an auxiliary stack. r=jandem
https://hg.mozilla.org/integration/autoland/rev/41a81b07f3e1
part 10 / 17 (PBL): Add some needed accessors. r=jandem
https://hg.mozilla.org/integration/autoland/rev/3364d2c6a3b0
part 11 / 17 (PBL): CacheIR fallbacks: make more tolerant of no-decompiler case. r=jandem
https://hg.mozilla.org/integration/autoland/rev/768de7534ff5
part 12 / 17 (PBL): miscellaneous ifdefs and null-checks to handle running in new configuration. r=jandem
https://hg.mozilla.org/integration/autoland/rev/145f913e7033
part 13 / 17 (PBL): Add the main implementation of PBL (not yet invoked). r=jandem
https://hg.mozilla.org/integration/autoland/rev/56a1e3b40bbe
part 14 / 17 (PBL): invoke the PBL tier from tiering logic. r=jandem
https://hg.mozilla.org/integration/autoland/rev/0dd9a6719b68
part 15 / 17 (PBL): Tweak profiling to deal with PBL. r=jandem
https://hg.mozilla.org/integration/autoland/rev/146015287659
part 16 / 17 (PBL): update tests. r=jandem
https://hg.mozilla.org/integration/autoland/rev/31abb9c9df15
part 17 / 17 (PBL): Fix JitStackAlignment. r=jandem

Backed out for causing sm bustages in JSScript.cpp

Flags: needinfo?(chris)

Sorry about that! Moved that definition inside the PBL ifdef; will try landing again.

Flags: needinfo?(chris)
Pushed by iireland@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/37aabeaab4ce
part 1 / 17 (PBL): add configure options. r=jandem
https://hg.mozilla.org/integration/autoland/rev/50465202ecdb
part 2 / 17  (PBL): add runtime options to enable/disable PBL. r=jandem
https://hg.mozilla.org/integration/autoland/rev/94fefa578027
part 3 / 17 (PBL): move some C++ interpreter helpers into inline .h file. r=jandem
https://hg.mozilla.org/integration/autoland/rev/adeeb6a497ec
part 4 / 17 (PBL): code motion: put CacheIR field accessors in header. r=jandem
https://hg.mozilla.org/integration/autoland/rev/f59766bd98d4
part 5 / 17 (PBL): `AbstractFramePtr` tagging: fit within two bits. r=jandem
https://hg.mozilla.org/integration/autoland/rev/0e4c337e8a63
part 6 / 17 (PBL): make CacheOp a `uint16_t` explicitly, and expose `NumOpcodes`. r=jandem
https://hg.mozilla.org/integration/autoland/rev/60365e2a3f44
part 7 / 17 (PBL): CacheIRReader: add "peek" method. r=jandem
https://hg.mozilla.org/integration/autoland/rev/baf78b746d01
part 8 / 17 (PBL): CacheIR: use fixed 16-bit values for ops rather than packed 15-bit values (for speed). r=jandem
https://hg.mozilla.org/integration/autoland/rev/53e0f3409fae
part 9 / 17 (PBL): add an auxiliary stack. r=jandem
https://hg.mozilla.org/integration/autoland/rev/f7079573e6a6
part 10 / 17 (PBL): Add some needed accessors. r=jandem
https://hg.mozilla.org/integration/autoland/rev/021270427b94
part 11 / 17 (PBL): CacheIR fallbacks: make more tolerant of no-decompiler case. r=jandem
https://hg.mozilla.org/integration/autoland/rev/0f792335cac1
part 12 / 17 (PBL): miscellaneous ifdefs and null-checks to handle running in new configuration. r=jandem
https://hg.mozilla.org/integration/autoland/rev/6b64d50f990c
part 13 / 17 (PBL): Add the main implementation of PBL (not yet invoked). r=jandem
https://hg.mozilla.org/integration/autoland/rev/72c978283dd4
part 14 / 17 (PBL): invoke the PBL tier from tiering logic. r=jandem
https://hg.mozilla.org/integration/autoland/rev/f69064f64ced
part 15 / 17 (PBL): Tweak profiling to deal with PBL. r=jandem
https://hg.mozilla.org/integration/autoland/rev/81239954a114
part 16 / 17 (PBL): update tests. r=jandem
https://hg.mozilla.org/integration/autoland/rev/2c28daffd2b8
part 17 / 17 (PBL): Fix JitStackAlignment. r=jandem
Regressions: 1858823
Blocks: 1859010
Regressions: 1859204
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: