Move process launch off the I/O thread and into a thread pool
Categories
(Core :: IPC, defect, P1)
Tracking
()
Tracking | Status | |
---|---|---|
firefox67 | --- | fixed |
People
(Reporter: jld, Assigned: jld)
References
Details
(Keywords: perf:responsiveness)
Attachments
(4 files)
Process launch can take nontrivial amounts of time, for reasons that aren't all under our control, which is why we're trying to move it off the main thread for responsiveness. However, the parent process I/O thread mediates all IPDL communication with child processes, so blocking it is also not ideal. It looks like what actually needs to run on the I/O thread is, at most, the creation of the channel; the launch itself should be able to run on any thread. This doesn't need its own thread pool or dedicated threads, just a place to send runnables that isn't in contention with anything important, but giving this its own thread pool may be the easiest thing to do with the infrastructure we currently have.
Assignee | ||
Comment 1•6 years ago
|
||
Actually, on Windows we do need a dedicated thread, and specifically *a* thread: https://searchfox.org/mozilla-central/rev/43969de34f5d4b113133d090f024e5eed7a82af0/security/sandbox/chromium/sandbox/win/src/broker_services.cc#323 (Whereas on Mac, we'll probably want thread pool size >1 because of the Mach IPC setup.)
Assignee | ||
Comment 2•6 years ago
|
||
Also, AutoSetProfilerEnvVarsForChildProcess will break if there are overlapping launches on multiple threads (and in general it's preferred to pass strings to LaunchApp instead of side-effecting the parent process).
Updated•6 years ago
|
Assignee | ||
Comment 3•6 years ago
|
||
We can directly set environment variables for the child process on all platforms now, instead of changing the parent's environment and inheriting the changes. This simplifies memory management, but more importantly it's necessary for thread safety to allow launching processes from a thread pool. Depends on D8944
Assignee | ||
Comment 4•6 years ago
|
||
Launching processes takes enough time that we should avoid blocking the parent process's IPC I/O thread for it; it's less bad for responsiveness than blocking the main thread, but it's not good. On Windows we need to use a dedicated thread, because the sandbox isn't thread-safe and it asserts that the same thread is used for every launch. Otherwise, a thread pool is used. Depends on D8945
Updated•6 years ago
|
Pushed by jedavis@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/8fa5e81ad801 Set profiler env vars in child processes without side-effecting the parent process. r=mstange https://hg.mozilla.org/integration/autoland/rev/8b1f88d7bfeb Move child process launch off the I/O thread. r=mccr8
Assignee | ||
Comment 6•6 years ago
|
||
This was backed out as part of the big async launch patch stack, and I believe it's responsible for both of the failures that were seen: 1. This means that the I/O thread can be woken up for OnChannel{Connected,Error} before the launch thread has finished the launch and set the state to PROCESS_CREATED. I added some assertions about the state in bug 1446161, inspired by review comments, which caught this. 2. Web Replay launches content processes from the middleman process, which doesn't initialize SharedThreadPool. This should be simple to fix.
Assignee | ||
Comment 7•6 years ago
|
||
Also, RegisterChildCrashAnnotationFileDescriptor isn't thread-safe.
Assignee | ||
Comment 8•5 years ago
|
||
Depends on D8945
Assignee | ||
Comment 9•5 years ago
|
||
https://phabricator.services.mozilla.com/D8946 needs re-review for the changes I made; I tried to set that in Phabricator but I don't know how well it worked.
Updated•5 years ago
|
Comment 10•5 years ago
|
||
Pushed by jedavis@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/e818109ab4e1 Set profiler env vars in child processes without side-effecting the parent process. r=mstange https://hg.mozilla.org/integration/autoland/rev/5a5e1801bb36 Fix thread-safety of crash reporter pid map. r=gsvelto
Comment 11•5 years ago
|
||
Pushed by jedavis@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/a0cf88b1fe5b Move child process launch off the I/O thread. r=mccr8
Comment 12•5 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/e818109ab4e1
https://hg.mozilla.org/mozilla-central/rev/5a5e1801bb36
https://hg.mozilla.org/mozilla-central/rev/a0cf88b1fe5b
Assignee | ||
Comment 13•5 years ago
|
||
Backed out the main patch (D8946) for causing bug 1519145.
Comment 14•5 years ago
|
||
Backout by jedavis@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/191057a8eccf Backed out changeset a0cf88b1fe5b
Updated•5 years ago
|
Updated•5 years ago
|
Assignee | ||
Comment 15•5 years ago
|
||
In order to enable asynchronous launch, destruction of
GeckoChildProcessHost (and its subclasses) has to be delayed until after
launching (or anything else that might be made asynchronous in the
future) has completed, to prevent use-after-free. However, there are
other dependencies on process hosts always being destroyed on the I/O
thread, so refcounting would be difficult to use.
Instead, GeckoChildProcessHost now may not be destroyed directly, but
must go through a method that handles the scheduling.
There are also some minor cleanups to the affected headers (removed
duplicate access modifiers, and made PluginProcessParent final).
Depends on D18010
Comment 16•5 years ago
|
||
Pushed by jedavis@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/242953b250a5 Synchronize GeckoChildProcessHost destruction with launching. r=mccr8 https://hg.mozilla.org/integration/autoland/rev/073aa330041c Move child process launch off the I/O thread. r=mccr8
Comment 17•5 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/242953b250a5
https://hg.mozilla.org/mozilla-central/rev/073aa330041c
Comment 18•5 years ago
|
||
Are you intending uplift to beta 66? I notice the target milestone is set to 66 here though this landed on 67 nightly. Maybe a little complicated for uplift though.
Assignee | ||
Comment 19•5 years ago
|
||
The flags probably got set like that because this got backed out twice and needed some time to fix. It's a risky change, and it's for a performance improvement that's probably not significant pre-Fission (and not well quantified in general), so no uplift.
Comment 20•5 years ago
|
||
Perfect, that's what I like to hear right now in beta. Thanks.
Updated•2 years ago
|
Description
•