Closed Bug 1276390 Opened 3 years ago Closed 3 years ago

Use blocking polling in worker to handle subprocess IO

Categories

(Toolkit :: Async Tooling, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla50
Tracking Status
firefox50 --- fixed

People

(Reporter: kmag, Assigned: kmag)

References

Details

Attachments

(1 file)

This will allow us to poll with a much longer timeout, but requires careful synchronization between the parent process and worker to make sure messages are received by the worker without unreasonable delay.
Blocks: 1288912
Attachment #8775348 - Flags: review?(aswan) → review+
Comment on attachment 8775348 [details]
Bug 1276390 - Use blocking polling in worker to handle subprocess IO.

https://reviewboard.mozilla.org/r/67516/#review64842

As always, the Windows bits are a mystery to me, you might want to get a sanity check from somebody with more depth there.
Did you consider using eventfd/kqueue instead of a pipe?  Those seem like a more natural fit for the application here but at the cost of having to maintain 3 worker implementations instead of 2, so the choice you made seems reasonable.
I continue to be concerned about long-term maintenance of the js-types based implementation but I guess we don't have any realistic alternative given where we are right now.

::: toolkit/modules/subprocess/subprocess_unix.jsm:22
(Diff revision 1)
>  
>  Cu.import("resource://gre/modules/ctypes.jsm");
>  Cu.import("resource://gre/modules/osfile.jsm");
>  Cu.import("resource://gre/modules/Services.jsm");
>  Cu.import("resource://gre/modules/Task.jsm");
> +Cu.import("resource://gre/modules/XPCOMUtils.jsm");

I don't see where you're actually using this?

::: toolkit/modules/subprocess/subprocess_worker_unix.js:15
(Diff revision 1)
>  
>  importScripts("resource://gre/modules/subprocess/subprocess_shared.js",
>                "resource://gre/modules/subprocess/subprocess_shared_unix.js",
>                "resource://gre/modules/subprocess/subprocess_worker_common.js");
>  
> -const POLL_INTERVAL = 50;
> +const POLL_TIMEOUT = 5000;

why have a timeout at all now?  it should be safe to pass -1 to block until there is any i/o
https://reviewboard.mozilla.org/r/67516/#review64842

I don't think we'd really gain much by using kqueue. And since we need the poll-based version regardless, we'd just be adding more code paths that we need to test and maintain.

I'm actually pretty happy with the maintainability of this implementation so far. From experience, I can tell you that this change was a lot easier to implement this way than it would have been in C++. I'm still open to the possibility of reimplementing the worker side of this in Rust when that's a viable option, but for now I don't see any particular reason we should.

> I don't see where you're actually using this?

Oops. It was used in an intermediate revision.

> why have a timeout at all now?  it should be safe to pass -1 to block until there is any i/o

In theory it should, but it's not a great idea to block the worker event loop for too long. It still needs to do things like run the GC and CC periodically, for one thing. And there's always a chance that there's some other important event that we'll miss notifying the worker of, and I'd rather we execute them after some predictable delay than whenever the next IO operation happens.
https://hg.mozilla.org/integration/mozilla-inbound/rev/48cff1c9b619bb224dd763332836522fd60c160f
Bug 1276390 - Use blocking polling in worker to handle subprocess IO. r=aswan
https://hg.mozilla.org/integration/mozilla-inbound/rev/97b1b6e04bc92e1efa31b66603fe4f060d161aca
Bug 1276390 - Use blocking polling in worker to handle subprocess IO. r=aswan
https://hg.mozilla.org/mozilla-central/rev/97b1b6e04bc9
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
Flags: needinfo?(kmaglione+bmo)
You need to log in before you can comment on or make changes to this bug.