Open Bug 1467846 Opened 6 years ago Updated 2 months ago

Implement the Atomics.waitAsync proposal

Categories

(Core :: JavaScript Engine, enhancement, P3)

57 Branch
enhancement

Tracking

()

ASSIGNED

People

(Reporter: alex.fdm, Assigned: iain)

References

(Blocks 2 open bugs, )

Details

(Keywords: dev-doc-needed, parity-chrome, parity-safari)

Attachments

(27 files)

48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
The proposal is currently in Stage 2.
We probably should not implement this *yet*.  The proposal is on hold until SharedArrayBuffer becomes a thing again (I am the champion).
Priority: -- → P3

The proposal is now on Stage 3.

The first round of tests for Atomics.waitAsync have landed in Test262: https://github.com/tc39/test262/tree/master/test/built-ins/Atomics/waitAsync

Would be cool to see this for Firefox as well. Chrome is shipping waitAsync in the next release (87) and will be enabled by default. For reference https://bugs.chromium.org/p/v8/issues/detail?id=10239.

Any updates on this?

There is still some spec work that we are waiting on, to ensure that this can integrate well with wasm.

(In reply to Yulia Startsev from comment #5)

There is still some spec work that we are waiting on, to ensure that this can integrate well with wasm.

To clarify, wasm needs to do some work to support a compatible feature (and it's very early days for that), but I doubt that that work will have any significant impact on what the JS spec looks like.

Ah, my mistake. Then it might appear sooner rather than later.

Assignee: nobody → ystartsev
Assignee: ystartsev → nobody

:ccullen had some interest in this feature, so I will write a few pointers on getting started. This is a tricky feature, but doable and might be really interesting.

For Inspiration, here is how chrome approached the problem, and this revision gives an overall snapshot. Their design document and design revision is also good reading to see what problems they encountered.

V8 did a nice write up for this: https://v8.dev/features/atomics -- there is a good example there to help us get started. Some historical examples can be found here: https://github.com/tc39/proposal-atomics-wait-async/blob/master/example.html

Host hook: https://github.com/whatwg/html/pull/4613/files -- referenced in the specification here: https://tc39.es/proposal-atomics-wait-async/#sec-hostresolveinagent.

The polyfill for this uses a worker to implement the non-blocking behavior on the main thread: https://github.com/tc39/proposal-atomics-wait-async/blob/master/polyfill.js but I think we want to do a full implementation for this. But it might be a good place to look, just to get an idea of the intended behavior.

You can compare async.wait with doWait to get a grasp of things have changed, in relation to our code, as well as how the challenges that chrome ran into while doing their implementation. Also useful reading is the spec text on shared memory access.

Flags: needinfo?(ccullen)
Assignee: nobody → ccullen
Status: NEW → ASSIGNED
Attachment #9273932 - Attachment description: Bug 1467846 - Part 1: Add option for Atomics.waitAsync. r?iain! → Bug 1467846 - Part 1: Add option for Atomics.waitAsync.
Attachment #9273935 - Attachment description: Bug 1467846 - Part 4: Implement Atomics.waitAsync. r?iain! → WIP: Bug 1467846 - Part 4: Implement Atomics.waitAsync.
Attachment #9273935 - Attachment description: WIP: Bug 1467846 - Part 4: Implement Atomics.waitAsync. → Bug 1467846 - Part 4: Implement Atomics.waitAsync.
Attachment #9273933 - Attachment description: Bug 1467846 - Part 2: Update futex waiter to support asynchronous waits. r?iain! → Bug 1467846 - Part 2: Update futex waiter to support asynchronous waits.
Attachment #9273934 - Attachment description: Bug 1467846 - Part 3: Add error for specifying timeout in Atomics.waitAsync and disallow test262 waitAsync tests in the browser. r?iain! → Bug 1467846 - Part 3: Add error for specifying timeout in Atomics.waitAsync and disallow test262 waitAsync tests in the browser.
Attachment #9273936 - Attachment description: Bug 1467846 - Part 5: Implement notify for async waits. r?iain! → Bug 1467846 - Part 5: Implement notify for async waits.
Attachment #9273937 - Attachment description: Bug 1467846 - Part 6: Add shell debugging functions for shared array buffer atomic operations. r?iain! → Bug 1467846 - Part 6: Add shell debugging functions for shared array buffer atomic operations.
Attachment #9273938 - Attachment description: Bug 1467846 - Part 7: Enable waitAsync Atomics test. r?iain! → Bug 1467846 - Part 7: Enable waitAsync Atomics test.
Attachment #9273932 - Attachment description: Bug 1467846 - Part 1: Add option for Atomics.waitAsync. → WIP: Bug 1467846 - Part 1: Add option for Atomics.waitAsync. r?iain!
Attachment #9273933 - Attachment description: Bug 1467846 - Part 2: Update futex waiter to support asynchronous waits. → WIP: Bug 1467846 - Part 2: Update futex waiter to support asynchronous waits. r?iain!
Attachment #9273934 - Attachment description: Bug 1467846 - Part 3: Add error for specifying timeout in Atomics.waitAsync and disallow test262 waitAsync tests in the browser. → WIP: Bug 1467846 - Part 3: Add error for specifying timeout in Atomics.waitAsync and disallow test262 waitAsync tests in the browser. r?iain!
Attachment #9273935 - Attachment description: Bug 1467846 - Part 4: Implement Atomics.waitAsync. → WIP: Bug 1467846 - Part 4: Implement Atomics.waitAsync. r?iain!
Attachment #9273936 - Attachment description: Bug 1467846 - Part 5: Implement notify for async waits. → WIP: Bug 1467846 - Part 5: Implement notify for async waits. r?iain!
Attachment #9273937 - Attachment description: Bug 1467846 - Part 6: Add shell debugging functions for shared array buffer atomic operations. → WIP: Bug 1467846 - Part 6: Add shell debugging functions for shared array buffer atomic operations. r?iain!
Attachment #9273938 - Attachment description: Bug 1467846 - Part 7: Enable waitAsync Atomics test. → WIP: Bug 1467846 - Part 7: Enable waitAsync Atomics test. r?iain!
Attachment #9273932 - Attachment description: WIP: Bug 1467846 - Part 1: Add option for Atomics.waitAsync. r?iain! → WIP: Bug 1467846 - Part 1: Add option for Atomics.waitAsync.
Attachment #9273933 - Attachment description: WIP: Bug 1467846 - Part 2: Update futex waiter to support asynchronous waits. r?iain! → WIP: Bug 1467846 - Part 2: Update futex waiter to support asynchronous waits.
Attachment #9273934 - Attachment description: WIP: Bug 1467846 - Part 3: Add error for specifying timeout in Atomics.waitAsync and disallow test262 waitAsync tests in the browser. r?iain! → WIP: Bug 1467846 - Part 3: Add error for specifying timeout in Atomics.waitAsync and disallow test262 waitAsync tests in the browser.
Attachment #9273935 - Attachment description: WIP: Bug 1467846 - Part 4: Implement Atomics.waitAsync. r?iain! → WIP: Bug 1467846 - Part 4: Implement Atomics.waitAsync.
Attachment #9273936 - Attachment description: WIP: Bug 1467846 - Part 5: Implement notify for async waits. r?iain! → WIP: Bug 1467846 - Part 5: Implement notify for async waits.
Attachment #9273937 - Attachment description: WIP: Bug 1467846 - Part 6: Add shell debugging functions for shared array buffer atomic operations. r?iain! → WIP: Bug 1467846 - Part 6: Add shell debugging functions for shared array buffer atomic operations.
Attachment #9273938 - Attachment description: WIP: Bug 1467846 - Part 7: Enable waitAsync Atomics test. r?iain! → WIP: Bug 1467846 - Part 7: Enable waitAsync Atomics test.

Redirect a needinfo that is pending on an inactive user to the triage owner.
:sdetar, since the bug has recent activity, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(ccullen) → needinfo?(sdetar)
Assignee: ccullen → iireland
Flags: needinfo?(sdetar)
Severity: normal → S3

The waiter list for Atomics.wait is homogeneous. With the addition of Atomics.waitAsync, the waiter list will contain both sync and async waiters. (Waiters must be notified in the order in which they waited, so it has to be a single list.) While we're refactoring this code, I added a dedicated header node to the waiter list, which lets us remove a node from the list without needing a reference to the SharedArrayRawBuffer.

Depends on D159743

We currently wait until all OffThreadPromiseTasks are resolved before shutting down the runtime. This is not ideal for something like waitAsync: if the notify never comes, then we will never shut down. This patch adds a mechanism for OffThreadPromiseTasks to opt into being cancellable, in which case we will delete them immediately on shutdown (after calling prepareForCancel on them) instead of blocking.

Depends on D159744

Atomics.notify can be called from any context, but the promise created for Atomics.waitAsync must be resolved in the context that created it. OffThreadPromiseTask will dispatch the message to the correct event loop.

When the runtime shuts down -- for example, when a worker is terminated -- we clear out the async waiters associated with that runtime. The interaction between agent termination and waiters is currently somewhat underspecified (see https://github.com/whatwg/html/issues/8276). This approach is consistent with Chrome's existing implementation.

Depends on D159745

This moves some code around and renumbers steps in preparation for the next patch.

Depends on D159746

This implements Atomics.waitAsync, excluding cases with non-zero timeouts. The remainder of this patch stack is dedicated to adding timeout support.

Breaking out the critical section into its own function made locking easier. The AutoEnterUnsafeOOMRegion is not strictly necessary yet, but a subsequent patch will add code to initialize timeouts, and my pitiful human brain was not up to the task of safely cleaning up if we OOMed halfway through initializing.

Depends on D159747

Synchronous timeouts can wake up the sleeping thread, but we need a separate thread to be woken when asynchronous timeouts fire. In the future we may be able to use the DOM TimeoutManager in browser builds, but for now (and for shell builds) I'm adding an internal TimeoutManager, with a single process-wide timeout thread.

Depends on D159748

We want to use a priority queue to keep track of pending timeouts, but the natural way to prioritize them (lowest timestamp first) is not compatible with taking the highest priority item. Instead of messing around with negative TimeDurations, it seems nicest to just update PriorityQueue to be more flexible.

This patch also adds a highest helper to peek at the highest priority item without removing it.

Depends on D159749

When a waiter is notified, or when the corresponding runtime shuts down, we have to be able to cancel a timeout. A linear scan is not the most efficient way to implement this, but it looks like the DOM TimeoutManager has O(n) insertions (it uses a linked list), so this is probably fast enough for now.

If it turns out that O(n) removal is too slow, we can get O(log(n)) removal by keeping a hashtable mapping keys to heap indices, and updating the keys whenever we swap elements. This would require refactoring PriorityQueue to expose swaps, so I'm leaving it for later.

Depends on D159751

assertHoldsCancelLocks is intended to minimize footguns from a tricky potential race condition.

CancelTimeout is useful when you have a timeout task that is racing with some other event. (In the case of Atomics.waitAsync, this is a call to Atomics.notify.) When the event occurs, we want to cancel the timeout. If the timeout fires, we want to disable the event. Because the timeout fires on a separate thread, the run method presumably needs to claim a lock to disable the event. (For waitAsync, this is the FutexAPI lock.)

However, consider what happens in the following series of events:

  1. Thread T calls Atomics.notify and claims the FutexAPI lock.
  2. A timeout expires and the timeout thread is woken up.
  3. The timeout task is removed from the priority queue. Its run method is called.
  4. The run method must claim the FutexAPI lock to clean up the waiter. We block waiting for thread T to release the lock.
  5. On thread T, we call CancelTimeout to cancel the timeout.

The timeout task has already been removed from the priority queue, so we can't cancel it. We need some way to decide whether the notify or the timeout won the race, and coordinate the cleanup.

I think the least error-prone solution is to say that the timeout always loses the race. We set a cancelled flag on the TimeoutTask, and the run method is responsible for checking that flag before doing anything. Awkwardly, that flag must be guarded by the implementation-specific mutex (eg FutexAPI), not the internal timeout manager lock. To make sure implementations of TimeoutTask get this right, assertHoldsCancelLocks is called when cancelling a timeout, and again when checking the flag.

The next patch has an example of what this looks like in practice.

Alternatives I considered and rejected:

  1. We could instead decide that the timeout always wins this race. In this case the return value of CancelTimeout would indicate whether the timeout had already fired, and the caller (eg Atomics.notify) would be responsible for doing the right thing. The current approach seemed like it put less of a burden on users of the API.
  2. It would be nice if we could return opaque cancellation tokens from AddTimeoutToTimeoutManager and pass those back in to CancelTimeout, instead of the TimeoutTask itself. I couldn't find a way to make that work, because we need access to the TimeoutTask to call assertHoldsCancelLocks.

Depends on D159752

Depends on D159753

Right now, we don't support await in worker threads in the shell: we never drain the job queue, so enqueued promise resolutions never fire.

Depends on D159754

The alternative to this patch is having each AsyncWaiter increment the refcount of the underlying SharedArrayRawBuffer, keeping it alive as long as there's an unnotified waiter.

Depends on D159755

Depends on D159756

Hey! Curious, what is the status on this?

The status is that I wrote a stack of patches that pass the test262 tests, but was not 100% confident of correctness, because there's a bunch of tricky multithreading going on. I got distracted with other work and didn't have time to shepherd them through review. We currently aren't prioritizing this very heavily, since AFAIK there's not a lot of real-world demand, but the work is mostly done, so if we had a compelling reason to prioritize this, I don't think it would take too long to get it into shape.

I am the team lead at StackBlitz for WebContainer, a Node.js runtime for the browser that allows you to run Node.js natively in the browser. I think WebContainer makes for a compelling use case especially because this has been shipped in Chrome and Safari. WebContainer is very performance critical and the fact that Firefox doesn't yet implement Atomics.waitAsync means that our kernel has to rely on a polyfill which uses a dedicated worker. The problem is that every worker adds more pressure to the tab and consumes resources, and userland code often times spawns a handful of processes (workers) as well. So anything that is more "native" such as Atomics.waitAsync that would avoid spawning more workers has a positive effect on the resource consumption of WebContainer and would also positively contribute to the performance. A decent chunk of our users use Firefox and it's unfortunate that for Firefox we still need to rely on the polyfill.

CC @bholley

Safari currently doesn't support this feature. The only engine supporting it at the moment is v8. If your web container is working without a polyfill in both Safari and Chrome, perhaps you mean a different feature? Can you give us more details about the difference in how you are running it in Safari and Firefox?

Yea it's not yet in stable for Safari but it's been shipped to TP. I mean we are still working on support for Safari and for the time being we still have to reply on the polyfill because waitAsync hasn't been shipped for stable yet.

Part of the reason for that may be because the waitAsync specification is still stage 3. They do not ship until a feature is in stage 4 for complex features, as they expressed in committee recently.

As you mention in the safari bug, you are able to polyfill in firefox, but were unable to do so in safari -- https://bugs.webkit.org/show_bug.cgi?id=241414 -- in other words the situation was more severe in the case of safari in that your web containers did not work via the polyfill. In the case here, you have a working polyfill, but it is not as performant as it should be. This is of course not ideal.

We consider performance to be an important part of the web. However at the moment we have our focus on performance elsewhere, meaning that our ability to do feature work related to TC39 is impacted for a large chunk of this year. Given that this is a stage 3 proposal, and an adequate though not ideal solution exists, it is unlikely that our current prioritization will change right now. We will discuss your case at our next meeting. As Iain mentioned, the feature is almost finished and once we have time for it again it will likely go quickly. We of course welcome contributions, and if you have some engineers with expertise in this space who have cycles to address Iain's correctness concerns, it may go faster.

From reading the safari bug, you still have blockers on safari for fully implementing your web components work, namely shared array buffer cloning. Is this still true?

Note that the current synchronous implementation of Atomics.wait is causing troubles to DevTools which is unable to execute its JS codebase while the worker is paused by this method. I provided details, with a minimal test reproducing the issue in bug 1821250 comment 16.

If the async implementation prevent these troubles, it would be an argument to move forward with the async flavor.

Otherwise some help would be appreciated to make the DevTools/DOM codebase work with both API flavors.
I'm wondering if the direct usage of pthread_cond_wait isn't bypassing the dom/ codebase, which ends up being paused unexpectedly.

Note that wasm-bindgen, which is the typical/popular way to use Rust in a web page through WASM uses a shim to provide async wait based on sync wait:
https://github.com/rustwasm/wasm-bindgen/blob/9958e2ec44651fd7b02b023439118328181b1733/crates/futures/src/task/worker.js#L4
https://github.com/rustwasm/wasm-bindgen/blob/9958e2ec44651fd7b02b023439118328181b1733/crates/futures/src/task/wait_async_polyfill.rs#L55-L56
This may broaden the issue with DevTools.

Attachment #9299320 - Attachment description: WIP: Bug 1467846: Part 13: Drain job queue in worker thread → WIP: Bug 1467846: Part 13: Support `await` in test262 worker threads

Depends on D159757

Depends on D186818

Hey all. I wanted mention that Atomics.waitAsync is now fully supported in both Chrome and Safari. We are running into issues with WebContainer due to the missing support for a native version of this and the polyfill that uses a dedicated worker doesn't fully work as waitAsync cannot be accurately polyfilled. It also causes an overhead of dedicated workers which is very unfortunate for an environment like WebContainer which is already putting some strain on the process. I think it'd be really nice if Firefox would support this natively as well.

Blocks: 1884148
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: