Closed Bug 1831676 Opened 2 years ago Closed 2 years ago

Refactor JS C++ interpreter in support of partial evaluation / specialization

Status:

RESOLVED WONTFIX

People

(Reporter: cfallin, Assigned: cfallin)

Attachments

(5 files)

Bug 1831676: Part 1: Refactor C++ interpreter to put function-local state in InterpretContext. r=jandem (2 years ago, Chris Fallin [:cfallin], 48 bytes, text/x-phabricator-request)
Bug 1831676: Part 2: Move interpreter-loop body into InterpretInner(). r=jandem (2 years ago, Chris Fallin [:cfallin], 48 bytes, text/x-phabricator-request)
Bug 1831676: Part 3: add option to perform a C++-level call of InterpretInner() for each JS call. r=jandem (2 years ago, Chris Fallin [:cfallin], 48 bytes, text/x-phabricator-request)
Bug 1831676: Part 4: Interpreter: cache PC in a local. r=jandem (2 years ago, Chris Fallin [:cfallin], 48 bytes, text/x-phabricator-request)
Bug 1831676: Part 5: add a notion of "specialization mode" to the interpreter. r=jandem (2 years ago, Chris Fallin [:cfallin], 48 bytes, text/x-phabricator-request)

Chris Fallin [:cfallin]

Assignee

Description

•

2 years ago

This bug captures the effort to refactor the C++ interpreter in SpiderMonkey in order to more easily allow partial evaluation for speedups (bug 1831399).

The two main difficulties I found when prototyping this functionality, at least in relation to the interpreter's current design, are that:

1. The interpreter currently optimizes handling of JS-to-JS calls by remaining within the same C++ call-frame. JS state is updated and the interpreter loop then dispatches to the next opcode (the first opcode in the callee), without any call at the C++ level.

   In order to partially evaluate the interpreter over a function body, we need a separate activation of the interpreter function for every interpreted function. Hence, when configured in this mode, we need to always make a C++ call on a JS call: a call that is recursive at the C++ level, back into the interpreter itself.

2. Certain error paths and a few kinds of control flow, like generators, lead to a "computed target PC" that is only known in practice at runtime, not from the bytecode, or at least not easily. For example, catching an exception leads to a handler denoted by the try-notes, and the logic to find the right handler is very challenging to partially evaluate. Or, when resuming a generator, the next PC to resume at is a value loaded from runtime state. This is challenging because partial evaluation "unrolls" the interpreter loop into a new specialization for each bytecode, so we need to know statically what the CFG is (what the next PC is from a given opcode, or the next PCs for each branch if it is conditional).
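To make the second point concrete, here is a minimal sketch of the distinction between statically known and computed next-PCs. It uses a made-up toy bytecode: the `Op` enum, `Insn`, and `staticSuccessors` are hypothetical names for illustration only and do not correspond to anything in SpiderMonkey.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical toy opcodes; these are not SpiderMonkey's JSOps.
enum class Op { Add, Jump, BranchIfTrue, ResumeGenerator, Return };

struct Insn {
  Op op;
  std::size_t target = 0;  // Only meaningful for Jump and BranchIfTrue.
};

// Returns the successor PCs that can be read directly off the bytecode, or
// std::nullopt when the next PC is only known at runtime.
std::optional<std::vector<std::size_t>> staticSuccessors(
    const std::vector<Insn>& code, std::size_t pc) {
  assert(pc < code.size());
  const Insn& insn = code[pc];
  switch (insn.op) {
    case Op::Add:
      return std::vector<std::size_t>{pc + 1};  // Falls through.
    case Op::Jump:
      return std::vector<std::size_t>{insn.target};
    case Op::BranchIfTrue:
      return std::vector<std::size_t>{pc + 1, insn.target};
    case Op::Return:
      return std::vector<std::size_t>{};  // Leaves the function.
    case Op::ResumeGenerator:
      return std::nullopt;  // Resume PC is loaded from runtime state.
  }
  return std::nullopt;
}
```

A partial evaluator can unroll the loop along the edges this function reports; the `std::nullopt` case is the kind of opcode that needs some other treatment.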

In order to make this workable, we have a notion of "entry kind" for the C++ interpreter, and enter in "specialized" mode. If an exception is thrown, a generator is resumed, or any other case occurs that's difficult to partially evaluate, we "bail out" by tail-calling the interpret function in a "generic" mode (one of several actually). Semantically this is meaningless at the pure C++ level -- we replace the current activation with another that dispatches to the place we would have anyway -- but the distinction is very useful for a partial evaluator, because it specializes only those calls that are in "specialized" mode, and we get bailouts.
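A rough sketch of the bailout idea, on a toy interpreter rather than the real one. `EntryKind`, `State`, the opcodes, and `interpret` are all hypothetical names chosen for this example; the actual patches have several generic modes, and only two modes are modelled here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical entry kinds; the real patch series has several generic modes.
enum class EntryKind { Specialized, Generic };

// Hypothetical toy opcodes.
enum class Op { Inc, JumpToRuntimePC, Halt };

struct State {
  int acc = 0;
  std::size_t runtimePC = 0;  // A next-PC that is only known at runtime.
  int bailouts = 0;           // Counts specialized-to-generic transitions.
};

int interpret(EntryKind kind, const std::vector<Op>& code, std::size_t pc,
              State& st) {
  for (;;) {
    assert(pc < code.size());
    switch (code[pc]) {
      case Op::Inc:
        st.acc++;
        pc++;  // Statically known successor.
        break;
      case Op::JumpToRuntimePC:
        if (kind == EntryKind::Specialized) {
          // Bail out: replace this activation with a generic one that
          // continues at the computed PC.
          st.bailouts++;
          return interpret(EntryKind::Generic, code, st.runtimePC, st);
        }
        pc = st.runtimePC;  // Generic mode just follows the computed PC.
        break;
      case Op::Halt:
        return st.acc;
    }
  }
}
```

Both entry kinds compute the same result for the same program; the only observable difference in this sketch is the `bailouts` counter, which mirrors the point that the distinction matters to the partial evaluator and not to the C++ semantics.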

An important goal of this set of patches is to have zero impact on the current interpreter when not using it for my (admittedly weird) use-case. The first patch in the series pulls some state from stack-locals into a struct (which the compiler should, hopefully, turn back into locals via scalar replacement of aggregates, SROA), but otherwise everything is behind an ifdef. If I've missed something and there is impact, that's a bug and I'll try to fix it.

Chris Fallin [:cfallin]

Assignee

Updated

•

2 years ago

Blocks: 1831399

Chris Fallin [:cfallin]

Assignee

Comment 1

•

2 years ago

Attached file Bug 1831676: Part 1: Refactor C++ interpreter to put function-local state in InterpretContext. r=jandem

This is necessary for later parts of this patch-series to split
different parts of the interpretation across multiple invocations of
Interpret().

Phabricator Automation

Updated

•

2 years ago

Assignee: nobody → chris

Status: UNCONFIRMED → ASSIGNED

Ever confirmed: true

Chris Fallin [:cfallin]

Assignee

Comment 2

•

2 years ago

Attached file Bug 1831676: Part 2: Move interpreter-loop body into InterpretInner(). r=jandem

This is necessary to separate out the one-time setup logic for the
InterpretContext from the actual body of the interpreter loop, which
we may want to split across multiple call-frames.
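As an illustration of the shape of Parts 1 and 2 (not the actual patch), the sketch below splits a toy stack-machine interpreter into one-time setup and a loop body. The names `Interpret`, `InterpretInner`, and `InterpretContext` follow the patch titles, but the fields, opcodes, and signatures are assumptions made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy opcodes.
enum class Op { PushOne, Add, Return };

// Stand-in for the state that used to live in function locals. The real
// struct's contents are not described in this bug; these fields are invented.
struct InterpretContext {
  const std::vector<Op>& code;
  std::vector<int> stack;
  std::size_t pc = 0;
};

// The loop body only. Because all state lives in ctx, this function could be
// re-entered from another call-frame without repeating the setup.
int InterpretInner(InterpretContext& ctx) {
  for (;;) {
    assert(ctx.pc < ctx.code.size());
    switch (ctx.code[ctx.pc++]) {
      case Op::PushOne:
        ctx.stack.push_back(1);
        break;
      case Op::Add: {
        assert(ctx.stack.size() >= 2);
        int rhs = ctx.stack.back();
        ctx.stack.pop_back();
        ctx.stack.back() += rhs;
        break;
      }
      case Op::Return:
        assert(!ctx.stack.empty());
        return ctx.stack.back();
    }
  }
}

// One-time setup, then hand off to the loop body.
int Interpret(const std::vector<Op>& code) {
  InterpretContext ctx{code, {}, 0};
  ctx.stack.reserve(16);
  return InterpretInner(ctx);
}
```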

Depends on D177326

Chris Fallin [:cfallin]

Assignee

Comment 3

•

2 years ago

Attached file Bug 1831676: Part 3: add option to perform a C++-level call of InterpretInner() for each JS call. r=jandem

This creates a 1-to-1 correspondence between JS calls and C++ calls, when
enabled, allowing a partial evaluation / specialization tool to process
one invocation of InterpretInner() (with constant bytecode PC) to
create a compiled function body for one JS function.
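The difference between the two call strategies can be sketched on a toy interpreter. Everything here is hypothetical (the opcodes, `Frame`, `Stats`, and the `cppCallPerJSCall` flag, which stands in for the build-time option); it only illustrates the 1-to-1 correspondence, not the real implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy opcodes.
enum class Op { PushOne, Add, Call, Return };

struct Insn {
  Op op;
  std::size_t callee = 0;  // Index of the callee script; only used by Call.
};

using Script = std::vector<Insn>;

struct Frame {
  std::size_t script;
  std::size_t pc;
  std::vector<int> stack;
};

struct Stats {
  int cppActivations = 0;  // Number of C++-level calls of InterpretInner().
};

int InterpretInner(const std::vector<Script>& scripts, std::size_t entry,
                   bool cppCallPerJSCall, Stats& stats) {
  stats.cppActivations++;
  std::vector<Frame> frames;
  frames.push_back({entry, 0, {}});
  for (;;) {
    Frame& f = frames.back();
    assert(f.pc < scripts[f.script].size());
    const Insn& insn = scripts[f.script][f.pc++];
    switch (insn.op) {
      case Op::PushOne:
        f.stack.push_back(1);
        break;
      case Op::Add: {
        assert(f.stack.size() >= 2);
        int rhs = f.stack.back();
        f.stack.pop_back();
        f.stack.back() += rhs;
        break;
      }
      case Op::Call:
        if (cppCallPerJSCall) {
          // One C++ activation per toy "JS" call: recurse into ourselves.
          f.stack.push_back(
              InterpretInner(scripts, insn.callee, true, stats));
        } else {
          // Stay in this C++ frame: push a callee frame and keep looping.
          frames.push_back({insn.callee, 0, {}});
        }
        break;
      case Op::Return: {
        assert(!f.stack.empty());
        int result = f.stack.back();
        frames.pop_back();
        if (frames.empty()) {
          return result;  // Leaving this C++ activation.
        }
        frames.back().stack.push_back(result);
        break;
      }
    }
  }
}
```

With the flag set, each activation of `InterpretInner()` only ever sees one script, starting from PC 0, which is the property a specialization tool would rely on.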

Depends on D177327

Chris Fallin [:cfallin]

Assignee

Comment 4

•

2 years ago

Attached file Bug 1831676: Part 4: Interpreter: cache PC in a local. r=jandem

This change is a potential optimization in itself, but also allows
partial-specialization tools to understand the interpreter's current
PC without having to analyze the memory slot for REGS.pc specially.
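A small sketch of the pattern, on a toy interpreter: the PC is loaded from the register struct once, kept in a local for the duration of the loop, and written back where other code could observe it. `Regs` here is a made-up stand-in with a single field, and the opcodes and `run` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical toy opcodes.
enum Op : std::uint8_t { OP_INC, OP_DEC, OP_HALT };

// Made-up stand-in for the interpreter's register state.
struct Regs {
  const std::uint8_t* pc = nullptr;
};

int run(Regs& regs) {
  // Cache the PC in a local so the loop does not go through memory
  // (regs.pc) on every dispatch.
  const std::uint8_t* pc = regs.pc;
  int acc = 0;
  for (;;) {
    switch (*pc++) {
      case OP_INC:
        acc++;
        break;
      case OP_DEC:
        acc--;
        break;
      case OP_HALT:
        regs.pc = pc;  // Write the cached PC back before leaving.
        return acc;
      default:
        assert(false && "unknown opcode");
        regs.pc = pc;
        return acc;
    }
  }
}
```

In a real interpreter the write-back would also be needed before any call that might read the PC from the register state; this toy loop has no such calls.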

Depends on D177328

Chris Fallin [:cfallin]

Assignee

Comment 5

•

2 years ago

Attached file Bug 1831676: Part 5: add a notion of "specialization mode" to the interpreter. r=jandem

This mode does not actually perform any specialization, but it does
distinguish different sorts of calls to InterpretInner(): those for
which pc will always follow some statically-known CFG (e.g.,
conditionals and switches), and those for which it may
be computed arbitrarily, e.g. after resuming a generator. This allows
partial specialization to focus on where it works well and bail out to
a generic interpreter otherwise.

Depends on D177329

Iain Ireland [:iain]

Comment 6

•

2 years ago

(In reply to Chris Fallin [:cfallin] from comment #0)

> The interpreter currently optimizes handling of JS-to-JS calls by remaining within the same C++ call-frame. JS state is updated and the interpreter loop then dispatches to the next opcode (the first opcode in the callee), without any call at the C++ level.
>
> In order to partially evaluate the interpreter over a function body, we need a different activation of the interpreter function for every interpreted function. Hence, we need to (when configured in this mode) always make a C++ call -- recursive at the C++ level, back to the interpreter itself -- on a JS call.

This is vaguely similar to an issue that the performance team was having, where we wanted the (perf-based) profiler to attribute samples to the function being interpreted, but that couldn't be determined just by looking at the captured stack. The solution in that case was to add profiler-only trampoline frames when entering a new script in the C++ interpreter. That's not identical to what you need, but it sounds close enough that you might benefit from taking a look.

Chris Fallin [:cfallin]

Assignee

Comment 7

•

2 years ago

> This is vaguely similar to an issue that the performance team was having, where we wanted the (perf-based) profiler to attribute samples to the function being interpreted, but that couldn't be determined just by looking at the captured stack. The solution in that case was to add profiler-only trampoline frames when entering a new script in the C++ interpreter. That's not identical to what you need, but it sounds close enough that you might benefit from taking a look.

Very interesting, thanks! I do remember bumping into this when rebasing lately (it looks like it was added recently) and wondering what it was for. I agree that the mechanism here is a bit different in other ways, at least in terms of optimizations: e.g., the first patch pulls out common state so that the "shared" rooted things don't get re-rooted on each call and don't take up extra stack space.

Nicolas B. Pierron [:nbp]

Updated

•

2 years ago

Severity: -- → N/A

Priority: -- → P1

Chris Fallin [:cfallin]

Assignee

Updated

•

2 years ago

Blocks: 1832406

BugBot [:suhaib / :marco/ :calixte]

Comment 8

•

2 years ago

The following patches are waiting for review from a reviewer who resigned from the review:

ID	Title	Author	Reviewer Status
D177326	Bug 1831676: Part 1: Refactor C++ interpreter to put function-local state in InterpretContext. r=jandem	cfallin	jandem: Resigned from review
D177327	Bug 1831676: Part 2: Move interpreter-loop body into InterpretInner(). r=jandem	cfallin	jandem: Resigned from review
D177328	Bug 1831676: Part 3: add option to perform a C++-level call of InterpretInner() for each JS call. r=jandem	cfallin	jandem: Resigned from review
D177329	Bug 1831676: Part 4: Interpreter: cache PC in a local. r=jandem	cfallin	jandem: Resigned from review
D177330	Bug 1831676: Part 5: add a notion of "specialization mode" to the interpreter. r=jandem	cfallin	jandem: Resigned from review

:cfallin, could you please find another reviewer?

For more information, please visit BugBot documentation.

Flags: needinfo?(chris)

Chris Fallin [:cfallin]

Assignee

Comment 9

•

2 years ago

:jandem communicated to me out-of-band that most of the patch series is not likely to be upstreamable, so I'll close the bug.

Flags: needinfo?(chris)

Chris Fallin [:cfallin]

Assignee

Updated

•

2 years ago

Status: ASSIGNED → RESOLVED

Closed: 2 years ago

Resolution: --- → WONTFIX

Categories

(Core :: JavaScript Engine, task, P1)
