SharedArrayBuffer infrastructure doesn't work properly
Categories
(Core :: Web Audio, defect)
Tracking
()
People
(Reporter: ebay, Unassigned)
Details
Attachments
(2 files)
Steps to reproduce:
I ported an old page of mine to the "modern" AudioWorklet infrastruture and more specifically to an implementation that uses SharedArrayBuffer based communication between the AudioWorkletNode and the AudioWorkletProcessor parts.
The respective page can be found here: https://www.wothke.ch/webV2M_worklet/
(I originally designed the page based on Chrome's idiotic limitations and I did not check if things might ideally be done differently in Firefox - but I would not want to waste my time with browser specific hacks anyway and if push comes to shove I'll have to tell my users to just always use the more dominant browser.)
Actual results:
The page always worked fine in Chrome but it originally completely hung the Firefox browser tab (Firefox 118.0.2 and older). The page now also works in Firefox but only thanks to an idiotic hack in my code which should NOT be necessary (and which depends on undocumented behavior that might no longer work with the next update of Firefox).
The problem is that the UI-side code (AudioWorkletNode in this instance - see "What should have happend" below) does NOT see the updates made to a SharedArrayBuffer from the Worker-side (an AudioWorkletProcessor) a LONG TIME ago (30 seconds later the browser reports that the UI code "is hanging" since it STILL DOES NOT see the respective "data update" that the Processor-side had made almost immediately).
The attached code fragment shows the logic that is used on the UI side to wait for the updated SharedArrayBuffer. It is the hacked version that now also works in Firefox but the "console.log" and timer related add-ons in the "busy wait" loop SHOULD NOT be necessary for Firefox to actually propagate SharedArrayBuffer updates within a reasonable period of time! Firefox should definitely NOT wait so long that it reports a "hanging UI thread" before it even tried to propagate the respective SharedArrayBuffer updates to the UI!
Expected results:
To achieve API backward compatibility with the older non-AudioWorklet based implementation my new implementation encapsulates the now asynchonous behavior of the AudioWorklet infrastructure behind some synchronous wrappers.
scenario: The WorkletNode side needs to synchronously return some status information that comes from the WorkletProcessor side.
The sequence used by the implementation to achieve this behavior is this (no, a purely messaging based approach would obviously be useless in this scenrio where a synchronous response is required):
-
The Node-side sends a respective request ("postMessage") to the Processor-side asking for the required information. As part of the request a SharedArrayBuffer (4-byte in length) is sent that has been initialized to some "untouched"-maker value.
The Node-side code then enters a "busy wait" loop that waits for the marker to be changed to anything else, i.e. it expects to be "woken up" eventually by any change that the Processor-side performs on the respective shared memory location. -
The Processor eventually receives the message and updates the shared memory location to signal that it has completed its work, which then should cause the Node-side to eventually exit its "busy wait" loop. (The update in the SharedArrayBuffer must be propated to the Node-side in a reasonanly timely manner, i.e. NOT hours later and before the browser thinks that it has crashed 30 secs later!)
Comment 1•2 years ago
|
||
Setting the component so that engineering can take a look. Thank you for your report.
Comment 2•2 years ago
|
||
The Bugbug bot thinks this bug should belong to the 'Firefox::Messaging System' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.
| Comment hidden (advocacy) |
Updated•2 years ago
|
Comment 4•2 years ago
|
||
Please provide runnable code that demonstrate the issue you're seeing.
The type of behavior that you seem to want has been used in production for years on major websites, and we have unit testing in place to ensure it's working at the browser level. There are libraries implementing well known lock-free data structure and algorithm using SharedArrayBuffer and Atomics, and they are continuously tested against various browser versions by their authors.
Attached is a simple test-case that shows what I think you want to do, working in all browsers that support the features in use. It is to be run served with the appropriate headers and protocol, so that a SharedArrayBuffer can be instantiated.
In the future, please be respectful in your communication in the Mozilla community, this is my last and only warning on the matter.
The original broken version (i.e. broken in Firefox) of the example page that I had mentioned in my original post can be found here: https://www.wothke.ch/webV2M_worklet_ff/ (a zipped version of that complete folder can be downloaded here: http://www.wothke.ch/tmp/webV2M_worklet_ff.zip )
That page is identical to the patched version except that it doesn't use the log output and timer comparison in its "busy wait" loop (see worker_backend.js: synchronCall() used in the above code).
That page runs perfectly fine in Chrome but it crashes miserably in Firefox: When calling the page locally via http or via https using the above link the page just hangs to then (after a long time) report "This page is slowing down Firefox. To speed up your browser, stop this page."
The example code that you provided is obviously totally useless with regard to what I "want to do": The whole point is that the UI side code wants wants to perform a synchonous communication with the worker, i.e. it blocks until that response is actually received. PS: as far as I could tell Atomics are specifically NOT available in this scenario - at least not in Chrome.
It seems quite obvious to me that whatever test cases you might already have, do not cover the scenario that is relevant here and I'd suggest you fix that flaw.
Comment 6•2 years ago
|
||
You're doing a naive busy loop on the main thread by doing a non-atomic load on a SharedArrayBuffer in a loop on a memory location that is written to using an atomic store on another thread, this is a programming error.
The JIT compiler hoists the non-atomic load outside the loop and doesn't perform another load inside the loop. This is similar to what a native optimizing compiler would do if a memory location isn't marked volatile, or that kind of thing. Chrome's JIT compiler doesn't optimize the load outside the loop currently (it could very well do in a previous or future version), and it seems to "work", but the program is not correct in the first place.
Please see the specification: https://tc39.es/ecma262/#sec-shared-memory-guidelines, Note 2, end of section, emphasis mine:
Examples of transformations that remain valid are: merging multiple non-atomic reads from the same location, reordering non-atomic reads, introducing speculative non-atomic reads, merging multiple non-atomic writes to the same location, reordering non-atomic writes to different locations, and hoisting non-atomic reads out of loops even if that affects termination. Note in general that aliased TypedArrays make it hard to prove that locations are different.
I see.. thanks for the information. Indeed it seems that my busy-loop now also works in FF when I replace the direct Array access with an Atomics.load based access.
Actually I had not expected that any of the Atomics APIs would actually be available in my context - due to the fact that I had first tried to use "Atomics.wait" (which would be the logical choice for my usecase) only to find that that API is specifically disabled for use in my context - which seems to be an unbelievably poor design decision. I would have loved to hear the twisted reasoning why an Atomics.wait() based implementation is "sooo bad" that the API has to be blocked and why it is "preferable" when people then use Atomics.load in a busy-loop to work around that limitation.. it seems just so ludicrous.
Comment 8•2 years ago
|
||
https://github.com/WebAssembly/threads/issues/106#issuecomment-428891383 has the reasoning behind this. The main thread should never block because that has extremely severe implications, very much like on desktop apps (OSes go as far as displaying their own "such and such app are not responding, do you want to kill them", this is the same here).
The main thread can do any other Atomics operations because those never block (and in fact that be used to implement wait-free and lock-free algorithms). It's not "preferable" to busy-wait compared to doing Atomics.wait on the main, it's equally as bad, but by definition you cannot prevent busy waiting if you're offering mutable shared memory, so it's possible. You wouldn't wait on a condition variable on the main thread of a well written native app, same for the web.
The web platform is built on message passing and asynchronous notifications, all APIs are designed in a way that the main thread is never blocked for long periods of time, because that's where all the interactive bits are, like on other platforms. It's trivial to send an async message back to the main thread to inform it that the setup phase is done in the worklet, there's no need to use atomics or any of that. Instead of busy looping, the threads will be parked nicely waiting for the message back, doing nothing.
In the grand scheme of things, anybody can do an infinite loops on the main thread, we can't really prevent that folks from writing buggy code, but can make it clear that blocking the main thread is not something that should be done, no exception.
Web Workers can synchronously wait and perform large synchronous tasks, that's what they are for. Those threads can do Atomics.wait or busy loop or whatever.
It is silly to pretend that blocking is automatically always so super bad that it automatically leads to unresponsive applications - regardless of the specific usage scenario ("blocking" does not automatically mean "long blocking"!). It is equally silly to pretend that code is always written on the green field and freshly designed from the ground up. There may well be existing synchronous application code that would require major redesign due to some previously synchronous browser API "suddenly" having been replaced with some new API that introduces asychronous behavior in all the wrong places (like in the case of ScriptProcessorNode's "recommended" replacement with AudioWorkletNode).
Contrary to the designers of the web browser, the application developer actually knows his specific usage scenario and the different trade-offs involved. Contrary to what some people might believe, concurrent programming has not been invented by web browser designers and the trade-offs involved in multi-threaded applications are nothing new. It is just presumptuous when the designers of a web browser pretend to have a one size fits all solution that is best for everybody!
An API like Atomics where some functions are "randomly" not available is just poorly designed. As the example shows clearly the blocking of "Atomics.wait" has zero added value since the same "undesired" functionality can be achieved using the other APIs. Except that the workaround makes things actually worse: it reduces code's readability, it makes the Atomics API less predictable/requires special case API documentation, and wastes any chance to actually do something useful with the user's CPU core while the thread is waiting.
Description
•