Closed Bug 2023217 Opened 1 month ago Closed 17 days ago

Rebase JS promise integration onto stack switching proposal

Categories

(Core :: JavaScript: WebAssembly, task, P1)

task

Tracking

()

RESOLVED FIXED
151 Branch
Tracking Status
firefox151 --- fixed

People

(Reporter: rhunt, Assigned: rhunt)

References

(Blocks 2 open bugs)

Details

Attachments

(8 files, 17 obsolete files)

48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
557.77 KB, patch
Details | Diff | Splinter Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review

In the process of reviewing and refactoring our JS-PI implementation I've rewritten most of the code. The new version implements a subset of the core stack switching proposal [1] and uses that in self-hosted code to implement the JS-PI interfaces.

Some things improved:

  1. The old implementation would give incorrect stack traces when you switched to the main thread
  2. Suspendable stack memory now has guard pages before/after the allocation
  3. Lots of comments, documentation, and assertions
  4. Tracing is a little bit simpler

[1] https://github.com/webassembly/stack-switching

Attachment #9554746 - Attachment is obsolete: true
Assignee: nobody → rhunt
Attachment #9555301 - Attachment description: WIP: Bug 2023217 - wasm: Add assert* and Printf utilities to MacroAssembler. → Bug 2023217 - wasm: Add assert* and Printf utilities to MacroAssembler. r?yury
Status: NEW → ASSIGNED
Attachment #9555302 - Attachment description: WIP: Bug 2023217 - wasm: Add stack switching preference and feature flag. → Bug 2023217 - wasm: Add stack switching preference and feature flag. r?yury
Attachment #9555303 - Attachment description: WIP: Bug 2023217 - wasm: Add stack switching type system. → Bug 2023217 - wasm: Add stack switching type system. r?yury
Attachment #9555306 - Attachment description: WIP: Bug 2023217 - wasm: Add validation and decoding for stack-switching opcodes. → Bug 2023217 - wasm: Add validation and decoding for stack-switching opcodes. r?yury
Attachment #9555316 - Attachment description: WIP: Bug 2023217 - wasm: Add tests for JS-PI with stack switching. → Bug 2023217 - wasm: Add tests for JS-PI with stack switching. r?yury

These patches are almost ready, I just need to find some way to split the second-to-last patch into more reviewable chunks. It's tricky though, it's all connected.

This commit doesn't compile on it's own. It has been split for ease-of-review.

Add ContStack, ContObject, Handlers, Handler, StackTarget, and SwitchTarget.

Update wasm::Context to track the set of created ContStacks, the current
ContStack, the base Handlers, and the current stack limits.

See the big [SMDOC] for an overview of the architecture.

This commit doesn't compile on it's own. It has been split for ease-of-review.

Add builtins for creating builtins and unwinding a handler.

This commit doesn't compile on it's own. It has been split for ease-of-review.

Adds a 'ThrowSuspendError' trap that will be used later.
Adds a 'Unimplemented' trap that will be used later.

This commit doesn't compile on it's own. It has been split for ease-of-review.

A DebuggerFrame can point at a wasm frame using a cloned FrameIter object. The
FrameIter is tied to the activation the wasm frame lived in. If a wasm frame is
suspended on a continuation and no longer has an activation, the FrameIter is
invalid and cannot be used anymore.

JS-PI previously handled this by iterating the stack to search for debugger
frames to detach, and then reattach on resume. This patch takes an alternate
approach the changes a DebuggerFrame created for a frame on a continuation to
have a 'WASM_FRAME_PTR_SLOT' instead of 'FRAME_ITER_SLOT'. The AbstractFramePtr
is used lazily to re-create a FrameIter. We also return false for isOnStack
if the frame is suspended.

This commit doesn't compile on it's own. It has been split for ease-of-review.

Continuations are not exposable across the JS boundary, but technically
a null can still be passed to a nullable contref.

One hack here is that JS-PI has a builtin which takes a contref,
which uses these functions. So we allow the contref through even
though we technically shouldn't. We will need a proper solution
before we ship stack-switching.

This commit doesn't compile on it's own. It has been split for ease-of-review.

JS-PI will need a globally defined JS Promise Tag type which can
be used for passing a promise to react to and resume the
continuation.

This commit doesn't compile on it's own. It has been split for ease-of-review.

See the [SMDOC] in WasmStacks.cpp for more details.

This commit doesn't compile on it's own. It has been split for ease-of-review.

ProfilingFrameIter needs to be aware of continuations. It must
know how to handle the ContBaseFrame which links the continuation
stack to the previous stack.

This commit doesn't compile on it's own. It has been split for ease-of-review.

  • Rework the dynamic switch to the main stack
    • Push 2 slots for saving/restoring the (baseHandlers, currentStack)
    • Move alignment handling into GenerateExit* to centralize the logic
  • Remove the wasmExitSuspender on JitActivation
    • No longer needed for saving/restoring from main stack switches, we have
      the dedicated slots.
    • Use wasm::Context to find a continuation whenever you have an FP
  • Update FrameIter to support continuations
    • ContBaseFrame unwinding
    • Tracking the current continuation stack

This commit doesn't compile on it's own. It has been split for ease-of-review.

Add tracing for continuation stacks.

When a stack is active, it will be found and traced through the normal
FrameIter loop.

When a stack is suspended, we need to manually trace it. See the
[SMDOC] in WasmStacks.cpp for the full GC considerations.

This commit doesn't compile on it's own. It has been split for ease-of-review.

Add a resume barrier builtin which immediately traces a suspended
stack if an incremental GC is in progress. This ensures all stack
slots have been traced before an unbarriered write could happen
to them.

This commit doesn't compile on it's own. It has been split for ease-of-review.

Update unwinding so it:

  • destroys an active stack when we unwind over it
  • switches to an active stack when it catches an exception

This commit doesn't compile on it's own. It has been split for ease-of-review.

Add

  • MWasmFindHandler - searches for a handler that matches a tag
  • MWasmSuspend - suspends the current continuation and jumps to a handler
  • MWasmResume - resumes a suspended continuation

All of these will call masm helpers in WasmStacks.cpp to do
the actual work. This will happen in the next patch.

Anything that's not supported yet will result in an unimplemented
trap in ion.

This commit doesn't compile on it's own. It has been split for ease-of-review.

Add the masm helpers for FindHandlers, Suspend, and Resume.

This commit now finally compiles and passes tests.

Update the self-hosted JS-PI code to use stack-switching instructions,
and remove the previous instructions. The comments are updated for the
new structure of the suspending/promising modules.

Some of the JS-PI promise builtins were re-organized too.

Attachment #9555310 - Attachment is obsolete: true

This commit was split into easier to review parts that don't compile on their
own and then merged back together. The following are the commit messages for
the pieces.

  1. Add basic continuation types.

Add ContStack, ContObject, Handlers, Handler, StackTarget, and SwitchTarget.

Update wasm::Context to track the set of created ContStacks, the current
ContStack, the base Handlers, and the current stack limits.

See the big [SMDOC] for an overview of the architecture.

  1. Add continuation builtins.

Add builtins for creating builtins and unwinding a handler.

  1. Add new errors messages and traps.

Adds a 'ThrowSuspendError' trap that will be used later.
Adds a 'Unimplemented' trap that will be used later.

  1. Rework debugger integration.

A DebuggerFrame can point at a wasm frame using a cloned FrameIter object. The
FrameIter is tied to the activation the wasm frame lived in. If a wasm frame is
suspended on a continuation and no longer has an activation, the FrameIter is
invalid and cannot be used anymore.

JS-PI previously handled this by iterating the stack to search for debugger
frames to detach, and then reattach on resume. This patch takes an alternate
approach the changes a DebuggerFrame created for a frame on a continuation to
have a 'WASM_FRAME_PTR_SLOT' instead of 'FRAME_ITER_SLOT'. The AbstractFramePtr
is used lazily to re-create a FrameIter. We also return false for isOnStack
if the frame is suspended.

  1. Add ToJSValue/ToWebAssemblyValue support for cont.

Continuations are not exposable across the JS boundary, but technically
a null can still be passed to a nullable contref.

One hack here is that JS-PI has a builtin which takes a contref,
which uses these functions. So we allow the contref through even
though we technically shouldn't. We will need a proper solution
before we ship stack-switching.

  1. Add JS Promise Tag.

JS-PI will need a globally defined JS Promise Tag type which can
be used for passing a promise to react to and resume the
continuation.

  1. Add cont base frame stubs.

See the [SMDOC] in WasmStacks.cpp for more details.

  1. Update ProfilingFrameIter.

ProfilingFrameIter needs to be aware of continuations. It must
know how to handle the ContBaseFrame which links the continuation
stack to the previous stack.

  1. Frame unwinding and dynamic switches.
  • Rework the dynamic switch to the main stack
    • Push 2 slots for saving/restoring the (baseHandlers, currentStack)
    • Move alignment handling into GenerateExit* to centralize the logic
  • Remove the wasmExitSuspender on JitActivation
    • No longer needed for saving/restoring from main stack switches, we have
      the dedicated slots.
    • Use wasm::Context to find a continuation whenever you have an FP
  • Update FrameIter to support continuations
    • ContBaseFrame unwinding
    • Tracking the current continuation stack
  1. Tracing for continuation stacks.

When a stack is active, it will be found and traced through the normal
FrameIter loop.
t rebase --continue

When a stack is suspended, we need to manually trace it. See the
[SMDOC] in WasmStacks.cpp for the full GC considerations.

  1. Add resume barrier.

Add a resume barrier builtin which immediately traces a suspended
stack if an incremental GC is in progress. This ensures all stack
slots have been traced before an unbarriered write could happen
to them.

  1. Update unwinding for continuation stacks.

Update unwinding so it:

  • destroys an active stack when we unwind over it
  • switches to an active stack when it catches an exception
  1. Add MIR/LIR nodes for stack switching.

Add

  • MWasmFindHandler - searches for a handler that matches a tag
  • MWasmSuspend - suspends the current continuation and jumps to a handler
  • MWasmResume - resumes a suspended continuation

All of these will call masm helpers in WasmStacks.cpp to do
the actual work. This will happen in the next patch.

Anything that's not supported yet will result in an unimplemented
trap in ion.

  1. Add codegen for stack switching instructions.

Add the masm helpers for FindHandlers, Suspend, and Resume.

  1. Update JS-PI to use stack-switching.

Update the self-hosted JS-PI code to use stack-switching instructions,
and remove the previous instructions. The comments are updated for the
new structure of the suspending/promising modules.

Some of the JS-PI promise builtins were re-organized too.

Attachment #9559033 - Attachment is obsolete: true
Attachment #9559034 - Attachment is obsolete: true
Attachment #9559035 - Attachment is obsolete: true
Attachment #9559036 - Attachment is obsolete: true
Attachment #9559037 - Attachment is obsolete: true
Attachment #9559038 - Attachment is obsolete: true
Attachment #9559039 - Attachment is obsolete: true
Attachment #9559040 - Attachment is obsolete: true
Attachment #9559041 - Attachment is obsolete: true
Attachment #9559042 - Attachment is obsolete: true
Attachment #9559043 - Attachment is obsolete: true
Attachment #9559044 - Attachment is obsolete: true
Attachment #9559045 - Attachment is obsolete: true
Attachment #9559046 - Attachment is obsolete: true
Attachment #9559047 - Attachment is obsolete: true
Pushed by agoloman@mozilla.com: https://github.com/mozilla-firefox/firefox/commit/32da16028144 https://hg.mozilla.org/integration/autoland/rev/98145be0b1bd Revert "Bug 2023217 - wasm: Add tests for JS-PI with stack switching. r=yury" for causing SM bustages @CodeGenerator-shared.h.

Backed out for causing SM bustages @CodeGenerator-shared.h.

Flags: needinfo?(rhunt)

Mid-aired with a change to OOM handling with LambdaOutOfLineCode.

Flags: needinfo?(rhunt)
Pushed by agoloman@mozilla.com: https://github.com/mozilla-firefox/firefox/commit/54da5de19ccb https://hg.mozilla.org/integration/autoland/rev/e101be680848 Revert "Bug 2023217 - wasm: Add tests for JS-PI with stack switching. r=yury" for causing build bustages @WasmStacks.cp.

Backed out for causing build bustages @WasmStacks.cp.

Flags: needinfo?(rhunt)

Non-unified build bustage on Win64 (didn't know we had that job)

Flags: needinfo?(rhunt)
Pushed by agoloman@mozilla.com: https://github.com/mozilla-firefox/firefox/commit/23ce253e39de https://hg.mozilla.org/integration/autoland/rev/b14579dfdfce Revert "Bug 2023217 - wasm: Add tests for JS-PI with stack switching. r=yury" for causing sm bustages @many-outstanding.js.

Backed out for causing sm bustages @many-outstanding.js.

Flags: needinfo?(rhunt)

A real failure in concurrent marking. I thought that was tier-2 though? I can try and figure out what's going on with it. The CGC issue looks unrelated.

Flags: needinfo?(rhunt)

The wasm JS-PI resume barrier calls PerformIncrementalPreWriteBarrierAllChildren
to trace a suspended continuation's stack before resuming it, maintaining the
incremental GC snapshot-at-beginning invariant. It skipped this work if the
ContObject was already marked black, assuming its children had been traced.

With concurrent marking, this assumption is wrong. The background marking thread
marks objects black upon discovery but defers trace hooks to a main-thread buffer
(mainThreadBuffer) which is only drained during the next GC slice. A ContObject
can therefore be black while its trace hook hasn't yet run, meaning the suspended
stack's GC references are untraced. When the resume barrier saw isMarkedBlack()
and returned early, those references could be collected, causing a SIGSEGV when
wasm code later accessed them.

Fix by only skipping the barrier when the Concurrent barrier flag is not set.
This flag is set during incremental GC with concurrent marking enabled, exactly
the condition under which black objects may have unprocessed trace hooks.

Attachment #9569313 - Attachment description: Bug 2023217 - wasm: Fix resume barrier race with concurrent marking. r?jonco → Bug 2023217 - wasm: Fix resume barrier race. r?jonco
Pushed by amarc@mozilla.com: https://github.com/mozilla-firefox/firefox/commit/a786eea6432d https://hg.mozilla.org/integration/autoland/rev/124456804ca7 Revert "Bug 2023217 - wasm: Add tests for JS-PI with stack switching. r=yury" for causing sm bustages @ testMathLib

Backed out for causing sm bustages

Flags: needinfo?(rhunt)

Ugh. Intermittent issue in CGC builds (didn't show up in my previous try pushes). The issue is real and serious though.

The saved stack slots in the trap exit were being statically addressed through FP, which is totally incorrect (trap exit runs at a dynamic offset from FP, not static). This resulted in us writing a random 0 word into the stack. For some reason this only showed up on that one asm.js test with CGC (CGC triggers interrupts which uses the trap exit). The fix is to use SP to address the stack slots.

Flags: needinfo?(rhunt)
QA Whiteboard: [qa-triage-done-c152/b151]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: