Closed Bug 1276390 Opened 3 years ago Closed 3 years ago
Use blocking polling in worker to handle subprocess IO
58 bytes, text/x-review-board-request
This will allow us to poll with a much longer timeout, but requires careful synchronization between the parent process and worker to make sure messages are received by the worker without unreasonable delay.
Review commit: https://reviewboard.mozilla.org/r/67516/diff/#index_header See other reviews: https://reviewboard.mozilla.org/r/67516/
Attachment #8775348 - Flags: review?(aswan)
Comment on attachment 8775348 [details] Bug 1276390 - Use blocking polling in worker to handle subprocess IO. https://reviewboard.mozilla.org/r/67516/#review64842 As always, the Windows bits are a mystery to me, you might want to get a sanity check from somebody with more depth there. Did you consider using eventfd/kqueue instead of a pipe? Those seem like a more natural fit for the application here but at the cost of having to maintain 3 worker implementations instead of 2, so the choice you made seems reasonable. I continue to be concerned about long-term maintenance of the js-types based implementation but I guess we don't have any realistic alternative given where we are right now. ::: toolkit/modules/subprocess/subprocess_unix.jsm:22 (Diff revision 1) > > Cu.import("resource://gre/modules/ctypes.jsm"); > Cu.import("resource://gre/modules/osfile.jsm"); > Cu.import("resource://gre/modules/Services.jsm"); > Cu.import("resource://gre/modules/Task.jsm"); > +Cu.import("resource://gre/modules/XPCOMUtils.jsm"); I don't see where you're actually using this? ::: toolkit/modules/subprocess/subprocess_worker_unix.js:15 (Diff revision 1) > > importScripts("resource://gre/modules/subprocess/subprocess_shared.js", > "resource://gre/modules/subprocess/subprocess_shared_unix.js", > "resource://gre/modules/subprocess/subprocess_worker_common.js"); > > -const POLL_INTERVAL = 50; > +const POLL_TIMEOUT = 5000; why have a timeout at all now? it should be safe to pass -1 to block until there is any i/o
https://reviewboard.mozilla.org/r/67516/#review64842 I don't think we'd really gain much by using kqueue. And since we need the poll-based version regardless, we'd just be adding more code paths that we need to test and maintain. I'm actually pretty happy with the maintainability of this implementation so far. From experience, I can tell you that this change was a lot easier to implement this way than it would have been in C++. I'm still open to the possibility of reimplementing the worker side of this in Rust when that's a viable option, but for now I don't see any particular reason we should. > I don't see where you're actually using this? Oops. It was used in an intermediate revision. > why have a timeout at all now? it should be safe to pass -1 to block until there is any i/o In theory it should, but it's not a great idea to block the worker event loop for too long. It still needs to do things like run the GC and CC periodically, for one thing. And there's always a chance that there's some other important event that we'll miss notifying the worker of, and I'd rather we execute them after some predictable delay than whenever the next IO operation happens.
https://hg.mozilla.org/integration/mozilla-inbound/rev/48cff1c9b619bb224dd763332836522fd60c160f Bug 1276390 - Use blocking polling in worker to handle subprocess IO. r=aswan
This broke xpcshell tests like https://treeherder.mozilla.org/logviewer.html#?job_id=32855439&repo=mozilla-inbound Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/6e95ea056bf5
https://hg.mozilla.org/integration/mozilla-inbound/rev/97b1b6e04bc92e1efa31b66603fe4f060d161aca Bug 1276390 - Use blocking polling in worker to handle subprocess IO. r=aswan
You need to log in before you can comment on or make changes to this bug.