Closed
Bug 1131953
Opened 8 years ago
Closed 8 years ago
Add futex support for the JS shell
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla38
Tracking | Status | |
---|---|---|
firefox38 | --- | fixed |
People
(Reporter: lth, Assigned: lth)
References
(Blocks 1 open bug)
Details
Attachments
(3 files, 3 obsolete files)
6.06 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
6.45 KB,
patch
|
Details | Diff | Splinter Review | |
5.13 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
Currently the futex support is only installed by DOM (main thread and workers), but since the JS shell supports simple workers with evalInWorker there's no reason why we should not add futex functionality to the shell. That would allow us to have shell tests for futexes. Those tests would not really test any of the DOM code, but they would test the engine logic, and would allow some tests to be written that are hard to write now, esp around how interrupts and futexes interact.
Assignee | ||
Comment 1•8 years ago
|
||
Adding the futex functionality to the shell is easy enough, but for anything meaningful to happen it needs to be possible to transfer (at a minimum) a shared-memory object to the worker. Nothing like that seems to exist. Candidate solutions: - Pass one or more SAB objects to the worker by means of evalInWorker; the worker would obtain these objects by calling some intrinsic (getWorkerParam(n) maybe). Easy enough to extend to other types, and to use structured cloning to pass values. - Register a SAB on the master side with setWorkerSAB(n, sab) and obtain it on the worker side with getWorkerSAB(n). I like the first one, so I'll play around with that.
Assignee | ||
Comment 2•8 years ago
|
||
This compiles and links but has not been tested due to the lack of a way to transmit shared memory objects.
Assignee | ||
Comment 3•8 years ago
|
||
Actually the solution that I'll probably implement is to have a single global mailbox that will hold one SharedArrayBuffer, this mailbox can be read and written by main thread and workers alike. If they need more coordination than that they can coordinate via shared memory. New shell functions, then: setSharedArrayBuffer(sab) getSharedArrayBuffer()
Assignee | ||
Comment 4•8 years ago
|
||
Implements the one-element mailbox. Clean, but could use a little more testing before review.
Attachment #8562680 -
Attachment is obsolete: true
Assignee | ||
Comment 5•8 years ago
|
||
Code an test case. I have an additional test case attached to bug 1074237 that I can land once the jsapi changes on that bug have landed and this patch has landed.
Attachment #8562927 -
Attachment is obsolete: true
Attachment #8563370 -
Flags: review?(luke)
![]() |
||
Comment 6•8 years ago
|
||
Comment on attachment 8563370 [details] [diff] [review] Patch Review of attachment 8563370 [details] [diff] [review]: ----------------------------------------------------------------- I like the global mailbox solution. Once you have shared memory, you can even futexWait on the result of an evalInWorker() from the main thread, which could allow building some nice testing primitives later on. ::: js/src/shell/js.cpp @@ +323,5 @@ > + wait = PR_MillisecondsToInterval((timeout_us + 999) / 1000); // Round up always > + } > + gFutexLockHeld = false; > + if (PR_WaitCondVar(cond_, wait) == PR_FAILURE) > + interrupted = true; IIUC, other than user errors (calling w/o lock held), the only valid PR_FAILURE we expect here is spurious condvar wakeup (EINTR). In that case, we *do* want to continue and try again which is what will happen if we simply ignore the return value (and delete interrupted). @@ +333,5 @@ > + else > + timeout_us -= (after - before); > + } > + } while (!(interrupted || wokenForJSInterrupt_ || woken_ || timedout)); > + waiting_ = false; Can you MOZ_ASSERT(waiting_); before clearing? @@ +337,5 @@ > + waiting_ = false; > + if (interrupted) > + return Uncatchable; > + if (wokenForJSInterrupt_) > + return WokenForJSInterrupt; I think you need to clear wokenForJSInterrupt_ here or else all future waits() will return WokenForJSInterrupt. @@ +354,5 @@ > + case WakeExplicit: > + woken_ = true; > + break; > + case WakeForJSInterrupt: > + wokenForJSInterrupt_ = true; Instead of having two independent bools (with 4 possible pair states, 1 of which is unobservable (both false), 1 of which is impossible (both true)), how about just having a WakeReason as the field and switching on that? To catch bugs, you could set it to WakeReason(-1) before the PR_WaitCondVar and assert the reason is WakeReason(-1) in wake() before clobbering it. @@ +2896,5 @@ > JS_SetErrorReporter(rt, my_ErrorReporter); > > + ShellFutexAPI *fx = js_new<ShellFutexAPI>(); > + if (!fx || !fx->init()) { > + js_delete(fx); Random note, I just noticed that ~JSRuntime has 'delete futexAPI_;'; it should use js_delete(futexAPI_) (global SM rule; allows the embedding to switch allocators). That means that the embedding must also use JS_malloc() to allocate the memory which is a pain and now I'm starting to think that having ~JSRuntime delete the FutexAPI is a bad idea (which was probably mine). It also prevents the FutexAPI object from being a global or member variable (which I would expect them to be for the main-thread and worker-thread, resp.). @@ +4429,5 @@ > + return false; > + } > + > + SharedArrayRawBuffer *newBuffer = args[0].toObject().as<SharedArrayBufferObject>().rawBufferObject(); > + newBuffer->addReference(); Could you have a corresponding dropReference() in main(), right after the JS_DestroyRuntime(). That way, in theory, all SharedArrayRawBuffers would be dead by the time we JS_Shutdown(). Then, to test let this condition be tested/fuzzed, you could keep an #ifdef DEBUG count of total live SharedArrayRawBuffers and assert that number is zero right from inside JS_Shutdown().
![]() |
||
Comment 7•8 years ago
|
||
Actually, taking my above comment further, you could merge waiting_ in by having a tri-state enum: Waiting, Woken, WokenForJSInterrupt and this would map 1:1 with the FutexAPI states.
![]() |
||
Comment 8•8 years ago
|
||
Higher-level question: if JS_RequestInterruptCallback() Just Works for interrupting a blocked futex, do we even need a FutexAPI? I'm having trouble coming up with what it gives us just using PRLock/PRCondVar directly inside AtomicsObject.cpp. I know workers are doing some exciting stuff with nested event loops; maybe I just don't understand the constraints.
Assignee | ||
Comment 9•8 years ago
|
||
(In reply to Luke Wagner [:luke] from comment #6) > Comment on attachment 8563370 [details] [diff] [review] > Patch > > Review of attachment 8563370 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: js/src/shell/js.cpp > @@ +323,5 @@ > > + wait = PR_MillisecondsToInterval((timeout_us + 999) / 1000); // Round up always > > + } > > + gFutexLockHeld = false; > > + if (PR_WaitCondVar(cond_, wait) == PR_FAILURE) > > + interrupted = true; > > IIUC, other than user errors (calling w/o lock held), the only valid > PR_FAILURE we expect here is spurious condvar wakeup (EINTR). In that case, > we *do* want to continue and try again which is what will happen if we > simply ignore the return value (and delete interrupted). A spurious wakeup is not reported as a PR_FAILURE, at least not according to the documentation in prcvar.h: PR_FAILURE should only be reported for an explicit PR_Interrupt(), which nobody uses, or for incorrect use of the API (forgetting to lock). (Indeed I expect "spurious wakeup" and "wakeup" are necessarily indistinguishable, or there would be no reason to have a notion of "spurious wakeup"; the cvar could take care of never reporting that.) > @@ +333,5 @@ > > + else > > + timeout_us -= (after - before); > > + } > > + } while (!(interrupted || wokenForJSInterrupt_ || woken_ || timedout)); > > + waiting_ = false; > > Can you MOZ_ASSERT(waiting_); before clearing? I suppose so. > @@ +337,5 @@ > > + waiting_ = false; > > + if (interrupted) > > + return Uncatchable; > > + if (wokenForJSInterrupt_) > > + return WokenForJSInterrupt; > > I think you need to clear wokenForJSInterrupt_ here or else all future > waits() will return WokenForJSInterrupt. No, because it's cleared before the wait loop, but I agree it is good hygiene to clear it after as well. > @@ +354,5 @@ > > + case WakeExplicit: > > + woken_ = true; > > + break; > > + case WakeForJSInterrupt: > > + wokenForJSInterrupt_ = true; > > Instead of having two independent bools (with 4 possible pair states, 1 of > which is unobservable (both false), 1 of which is impossible (both true)), > how about just having a WakeReason as the field and switching on that? To > catch bugs, you could set it to WakeReason(-1) before the PR_WaitCondVar and > assert the reason is WakeReason(-1) in wake() before clobbering it. Maybe - need to think about the consequences. > @@ +2896,5 @@ > > JS_SetErrorReporter(rt, my_ErrorReporter); > > > > + ShellFutexAPI *fx = js_new<ShellFutexAPI>(); > > + if (!fx || !fx->init()) { > > + js_delete(fx); > > Random note, I just noticed that ~JSRuntime has 'delete futexAPI_;'; it > should use js_delete(futexAPI_) (global SM rule; allows the embedding to > switch allocators). That means that the embedding must also use JS_malloc() > to allocate the memory which is a pain and now I'm starting to think that > having ~JSRuntime delete the FutexAPI is a bad idea (which was probably > mine). It was indeed your idea :) It was a nice simplification at the time. I'll file a followup bug for that. > It also prevents the FutexAPI object from being a global or member > variable (which I would expect them to be for the main-thread and > worker-thread, resp.). Not sure - depends on the relative deletion order of things and the protocol for that we eventually end up with. > @@ +4429,5 @@ > > + return false; > > + } > > + > > + SharedArrayRawBuffer *newBuffer = args[0].toObject().as<SharedArrayBufferObject>().rawBufferObject(); > > + newBuffer->addReference(); > > Could you have a corresponding dropReference() in main(), right after the > JS_DestroyRuntime(). That way, in theory, all SharedArrayRawBuffers would > be dead by the time we JS_Shutdown(). Then, to test let this condition be > tested/fuzzed, you could keep an #ifdef DEBUG count of total live > SharedArrayRawBuffers and assert that number is zero right from inside > JS_Shutdown(). Sure thing.
Assignee | ||
Comment 10•8 years ago
|
||
(In reply to Luke Wagner [:luke] from comment #8) > Higher-level question: if JS_RequestInterruptCallback() Just Works for > interrupting a blocked futex, do we even need a FutexAPI? I'm having > trouble coming up with what it gives us just using PRLock/PRCondVar directly > inside AtomicsObject.cpp. I know workers are doing some exciting stuff with > nested event loops; maybe I just don't understand the constraints. The DOM code probably illustrates the need better than the shell code. In the DOM, the worker code needs to interact with a nested event loop that has its own (existing) condition variable that it uses for Runnable dispatch and some timeout logic for handling runaway threads that have been canceled. On the main thread, the FutexAPI uses its own condition variable but needs to manage timeouts that exceed the range of the interval timer used for conditional variable waits. The FutexAPI looks like a bit of overhead but it has actually allowed for flexibility in how waiting is handled in different contexts. It's possible that an extended interrupt mechanism could take the place of all that. (Or perhaps, it's possible the interrupt mechanism should be extended to allow for all of that.) I doubt it would be much simpler but it might be useful for other things as well.
![]() |
||
Comment 11•8 years ago
|
||
I'm sorry, I still don't see the need for a nested event loop in either the main-thread or worker case. In both of those cases, if a futexWait() needs to be interrupted, a watchdog thread requests an interrupt and it is the *interrupt callback* that does the nested event loop. E.g.: http://mxr.mozilla.org/mozilla-central/source/dom/workers/WorkerPrivate.cpp#4698 This is really no different then how this all works with an ilooping 'while(true){}'. I think Ben gave the advice to use a nested event loop to implement futexWait() when he thought that we'd unconditionally pthread_cond_wait, but since RequestInterruptCallback will wake the cond var (and we can distinguish this wake from a real futexWake()) everything will Just Work. I talked with bent about this on irc and so far we can't see any counter-examples.
![]() |
||
Comment 12•8 years ago
|
||
(Also, in addition to massively simplifying the whole path, it means we can review this all from within js/src.)
(In reply to Luke Wagner [:luke] from comment #11) > I think Ben gave the advice to use a nested event loop to > implement futexWait() when he thought that we'd unconditionally > pthread_cond_wait, but since RequestInterruptCallback will wake the cond var > (and we can distinguish this wake from a real futexWake()) everything will > Just Work. Yep, that was my advice, and my thought process. I didn't know RequestInterruptCallback would break us out of the wait. As long as it does I think this sounds fine. Even the memory reporter stuff should continue to work correctly.
Assignee | ||
Comment 14•8 years ago
|
||
(In reply to Luke Wagner [:luke] from comment #12) > (Also, in addition to massively simplifying the whole path, it means we can > review this all from within js/src.) A simplification would be most welcome, in this case at the cost of an additional watchdog thread for workers presumably (one should be enough). I'll try to rework the code, and then we'll see.
Assignee | ||
Comment 15•8 years ago
|
||
A very nice simplification indeed. And no extra thread needed, that was a misconception on my part. All my tests are passing in a debug build; will continue to test for a little while before posting patches. (I don't have any tests that check that the memory reporter works correctly, though.)
Assignee | ||
Comment 16•8 years ago
|
||
Shared memory mailboxes only (cleaned up) since all the other code moved to bug 1074237.
Attachment #8563370 -
Attachment is obsolete: true
Attachment #8563370 -
Flags: review?(luke)
Attachment #8566024 -
Flags: review?(luke)
![]() |
||
Comment 17•8 years ago
|
||
Comment on attachment 8566024 [details] [diff] [review] Shared memory mailboxes Review of attachment 8566024 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/shell/js.cpp @@ +4302,5 @@ > + newObj = SharedArrayBufferObject::New(cx, buf); > + } > + PR_Unlock(sharedArrayBufferMailboxLock); > + > + args.rval().setObjectOrNull(newObj); I think you need to "if (!newObj) return false" (since SharedArrayBufferObject::New() will have reported an out of memory error). Then this can be setObject(*newObj).
Attachment #8566024 -
Flags: review?(luke) → review+
Assignee | ||
Comment 18•8 years ago
|
||
(In reply to Luke Wagner [:luke] from comment #17) > Comment on attachment 8566024 [details] [diff] [review] > Shared memory mailboxes > > Review of attachment 8566024 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: js/src/shell/js.cpp > @@ +4302,5 @@ > > + newObj = SharedArrayBufferObject::New(cx, buf); > > + } > > + PR_Unlock(sharedArrayBufferMailboxLock); > > + > > + args.rval().setObjectOrNull(newObj); > > I think you need to "if (!newObj) return false" (since > SharedArrayBufferObject::New() will have reported an out of memory error). > Then this can be setObject(*newObj). Not quite, but you're right about that bug. The function needs to successfully return null if there is no object in the mailbox.
Assignee | ||
Comment 19•8 years ago
|
||
This does a better job of handling various argument types to setSharedArrayBuffer, notably the buffer argument can be cleared with null. It also properly allows null to be returned from getSharedArrayBuffer. Not asking for re-review, just noting the adjustments.
Assignee | ||
Comment 20•8 years ago
|
||
The futex tests are not yet comprehensive but this covers some interesting cases at least.
Attachment #8566400 -
Flags: review?(luke)
Assignee | ||
Comment 21•8 years ago
|
||
Memo to self: the tests can be run locally but alone and with the harness, but on try they cause errors in the "J" test suite, see here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=f1b2d292aac8 Possibly this is to do with timeout (test takes 5-6 seconds to run), or a missing annotation, or a bug in the tests when run on try - need to investigate.
![]() |
||
Comment 22•8 years ago
|
||
Comment on attachment 8566400 [details] [diff] [review] Test cases for shell futexes and shared memory mailbox Review of attachment 8566400 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/tests/shell/shell.js @@ +2,5 @@ > + * This Source Code Form is subject to the terms of the Mozilla Public > + * License, v. 2.0. If a copy of the MPL was not distributed with this > + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */ > + > +if (typeof assertThrowsInstanceOf === 'undefined') { Ah, so jit-tests/lib/asserts.js isn't available to js/src/tests? Most of us default to putting tests in jit-tests/tests/some_new_dir so I don't have to worry about test harness strangeness (it's weird and mostly historic that we have two test harnesses anyways; they have been incrementally merged; and should one day be one). Anyhow, if you have a reason for putting these tests in js/src/tests, there is already an assertThrows() in js/src/tests/shell.js, so you could put this right under.
Attachment #8566400 -
Flags: review?(luke) → review+
Assignee | ||
Comment 23•8 years ago
|
||
Try is green: https://treeherder.mozilla.org/#/jobs?repo=try&revision=8a3d11fa7087 Landed: https://hg.mozilla.org/integration/mozilla-inbound/rev/628b43ac056c https://hg.mozilla.org/integration/mozilla-inbound/rev/159b306674c8
Comment 24•8 years ago
|
||
conflict(?) with bug 1101662. shell/futex.js fails with --tbpl option (especially --no-thread). https://hg.mozilla.org/integration/mozilla-inbound/rev/e47a04e3dbe6
Assignee | ||
Comment 25•8 years ago
|
||
(In reply to Tooru Fujisawa [:arai] from comment #24) > conflict(?) with bug 1101662. > shell/futex.js fails with --tbpl option (especially --no-thread). > > https://hg.mozilla.org/integration/mozilla-inbound/rev/e47a04e3dbe6 I'll take a look at that in the morning. --tbpl shouldn't fail but the feature absolutely requires threads (shell workers) so it may just be a matter of disabling the test for some configs.
https://hg.mozilla.org/mozilla-central/rev/628b43ac056c https://hg.mozilla.org/mozilla-central/rev/159b306674c8 https://hg.mozilla.org/mozilla-central/rev/e47a04e3dbe6
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox38:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla38
Updated•8 years ago
|
Assignee: nobody → lhansen
You need to log in
before you can comment on or make changes to this bug.
Description
•