Don't block process launch waiting for the child to send a Mach port

RESOLVED WONTFIX

Status

()

enhancement
P3
normal
RESOLVED WONTFIX
11 months ago
7 months ago

People

(Reporter: jld, Assigned: jld)

Tracking

(Depends on 1 bug, Blocks 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [qf:p1:f67])

Short version: we're blocking the main thread for on the order of 50-100ms every time we launch a child process on Mac.

Long version: When we launch a child process on OS X, we need to exchange Mach ports with it, so the parent process registers a port and waits for the child to introduce itself.  This means it has to wait for the child process to load and relocate all its libraries and get into XRE_InitChildProcess.  Profiling locally I saw 70-80ms spent blocking this way, and only a few ms of other work occurring within the window measured by the CONTENT_PROCESS_LAUNCH_TIME_MS histogram.

That histogram, on Beta, has quantiles as follows:

5th Percentile     27.53
25th Percentile    41.46
Median             58.69
75th Percentile   104.94
95th Percentile   486.19

So this is kind of bad.  I don't know if we have support for including Mach ports in our event loop or if it would be necessary to spawn a separate thread for this.

The child port is stored in GeckoChildProcessHost::mChildTask and used by the crash reporter, so if there's a crash before the port was sent then we already weren't going to get it.  The other use of the Mach negotiation is for our shared memory implementation; shared memory sending would have to block, and if any of that happens in ContentParent::Init{,Internal} then we run into the same problems as bug 1446161.  However, see also bug 1465669; it's not out of the question that we could replace that code.
Depends on: 1465669
Bug 1265824 indicates that graphics is going to need the core port-wrangling functionality in SharedMemoryBasic_mach, so we may not be able to get rid of that even if shm_open is an adequate shared memory backend (bug 1465669).

However, that might not be a problem: that code is already creating a parent-process thread for each child process to handle communications, and maintains per-other-process state, so that could be factored out and extended to use that thread to asynchronously wait for the setup message.

One failure mode here is if we try to send the child process shared memory immediately after it's launched, in which case we'd still block for unacceptably long (like the problem we're having with async launch and pids/handles).  If that's a problem — and I don't know whether it is — it could be worked around by using shm_open for those items in particular, even if we end up preferring to use Mach memory directly in general. 

(Note that the ongoing work to reduce per-process overhead by distributing global data to content processes in read-only shared memory, which sends some of that data early in startup, is already doing its own thing because IPC shared memory doesn't quite handle its use case.  If that's eventually merged into IPC we'd continue using POSIX APIs for it instead of Mach.)

Updated

10 months ago
Whiteboard: [qf] → [qf:p1:f67]
There's another way to fix this: Don't.

Instead, throw OS X in the same bucket as approximately half of the Windows userbase (i.e., a lot more people) that has similar process launch times according to telemetry, treat this delay like an opaque thing we can't control, and try to avoid blocking either the main or I/O thread.  (In the latter case a general-purpose thread pool can be used, so we'd only need to consume a thread for each process currently launching and not the ones that are already launched.)

This also avoids adding another dependency on the internals of the current shared memory implementation, which we may not want to keep.
Not really actionable at this point.
Priority: -- → P3
I'm going to call this WONTFIX.  As discussed in triage this morning, this latency won't matter much until Fission happens and we're launching a lot more content processes, but Fission will always use async launch, so at that point it won't matter at all.

See also bug 1446161, which adds basic support for async content process launch, and bug 1487287, which moves the launching into a thread pool (instead of sharing a single thread with all IPDL IPC, where this type of jank is still problematic).  What I was vaguely proposing in comment #2 is now actual patches that are up for review.  We'll still have a couple of sync launches at startup for the time being, but Fission should take care of those.
Status: NEW → RESOLVED
Last Resolved: 7 months ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.