Closed Bug 1487287 Opened 6 years ago Closed 6 years ago

Move process launch off the I/O thread and into a thread pool

Categories

(Core :: IPC, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla67
Performance Impact medium
Tracking Status
firefox67 --- fixed

People

(Reporter: jld, Assigned: jld)

References

Details

(Keywords: perf:responsiveness)

Attachments

(4 files)

Process launch can take nontrivial amounts of time, for reasons that aren't all under our control, which is why we're trying to move it off the main thread for responsiveness. However, the parent process I/O thread mediates all IPDL communication with child processes, so blocking it is also not ideal. It looks like what actually needs to run on the I/O thread is, at most, the creation of the channel; the launch itself should be able to run on any thread. This doesn't need its own thread pool or dedicated threads, just a place to send runnables that isn't in contention with anything important, but giving this its own thread pool may be the easiest thing to do with the infrastructure we currently have.
Actually, on Windows we do need a dedicated thread, and specifically *a* thread: https://searchfox.org/mozilla-central/rev/43969de34f5d4b113133d090f024e5eed7a82af0/security/sandbox/chromium/sandbox/win/src/broker_services.cc#323 (Whereas on Mac, we'll probably want thread pool size >1 because of the Mach IPC setup.)
Also, AutoSetProfilerEnvVarsForChildProcess will break if there are overlapping launches on multiple threads (and in general it's preferred to pass strings to LaunchApp instead of side-effecting the parent process).
Whiteboard: [qf] → [qf:p1:f64]
We can directly set environment variables for the child process on all platforms now, instead of changing the parent's environment and inheriting the changes. This simplifies memory management, but more importantly it's necessary for thread safety to allow launching processes from a thread pool. Depends on D8944
Launching processes takes enough time that we should avoid blocking the parent process's IPC I/O thread for it; it's less bad for responsiveness than blocking the main thread, but it's not good. On Windows we need to use a dedicated thread, because the sandbox isn't thread-safe and it asserts that the same thread is used for every launch. Otherwise, a thread pool is used. Depends on D8945
Priority: -- → P1
Pushed by jedavis@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/8fa5e81ad801 Set profiler env vars in child processes without side-effecting the parent process. r=mstange https://hg.mozilla.org/integration/autoland/rev/8b1f88d7bfeb Move child process launch off the I/O thread. r=mccr8
This was backed out as part of the big async launch patch stack, and I believe it's responsible for both of the failures that were seen: 1. This means that the I/O thread can be woken up for OnChannel{Connected,Error} before the launch thread has finished the launch and set the state to PROCESS_CREATED. I added some assertions about the state in bug 1446161, inspired by review comments, which caught this. 2. Web Replay launches content processes from the middleman process, which doesn't initialize SharedThreadPool. This should be simple to fix.
Also, RegisterChildCrashAnnotationFileDescriptor isn't thread-safe.

https://phabricator.services.mozilla.com/D8946 needs re-review for the changes I made; I tried to set that in Phabricator but I don't know how well it worked.

Flags: needinfo?(continuation)
Flags: needinfo?(continuation)
Pushed by jedavis@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/e818109ab4e1 Set profiler env vars in child processes without side-effecting the parent process. r=mstange https://hg.mozilla.org/integration/autoland/rev/5a5e1801bb36 Fix thread-safety of crash reporter pid map. r=gsvelto
Pushed by jedavis@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/a0cf88b1fe5b Move child process launch off the I/O thread. r=mccr8

Backed out the main patch (D8946) for causing bug 1519145.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: [qf:p1:f64] → [qf:p2:responsiveness]

In order to enable asynchronous launch, destruction of
GeckoChildProcessHost (and its subclasses) has to be delayed until after
launching (or anything else that might be made asynchronous in the
future) has completed, to prevent use-after-free. However, there are
other dependencies on process hosts always being destroyed on the I/O
thread, so refcounting would be difficult to use.

Instead, GeckoChildProcessHost now may not be destroyed directly, but
must go through a method that handles the scheduling.

There are also some minor cleanups to the affected headers (removed
duplicate access modifiers, and made PluginProcessParent final).

Depends on D18010

No longer blocks: 1521839
Pushed by jedavis@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/242953b250a5 Synchronize GeckoChildProcessHost destruction with launching. r=mccr8 https://hg.mozilla.org/integration/autoland/rev/073aa330041c Move child process launch off the I/O thread. r=mccr8

Are you intending uplift to beta 66? I notice the target milestone is set to 66 here though this landed on 67 nightly. Maybe a little complicated for uplift though.

Flags: needinfo?(jld)

The flags probably got set like that because this got backed out twice and needed some time to fix. It's a risky change, and it's for a performance improvement that's probably not significant pre-Fission (and not well quantified in general), so no uplift.

Flags: needinfo?(jld)
Target Milestone: mozilla66 → mozilla67

Perfect, that's what I like to hear right now in beta. Thanks.

Regressions: 1544412
Regressions: 1607153
Performance Impact: --- → P2
Whiteboard: [qf:p2:responsiveness]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: