Closed
Bug 529234
Opened 16 years ago
Closed 16 years ago
Clean up child processes
Categories
(Core :: IPC, defect)
Core
IPC
Tracking
()
RESOLVED
FIXED
People
(Reporter: cjones, Assigned: cjones)
References
Details
This kinda sorta depends on bug 529005, but not in such a way that I'll mark the dependency.
We should be careful about cleaning up child processes, as leaving zombies around leaks PIDs and probably a small amount of kernel memory per process. Following on bug 529005, after our IPC code thinks a child process should be dead (after IPDL shutdown), it can enqueue a CleanupChildProcess() event for perhaps 100ms later. This would check for child death with waitpid(WNOHANG), and if the child wasn't yet dead, deliver SIGTERM to the child and enqueue another CleanupChildProcess() event for the future. If the child *still* wasn't dead after the second cleanup, we should assume something is wrong and deliver SIGKILL, enqueuing CleanupChildProcess() until the child was harvested. This is similar to how UNIX's "init" shutdown works, except ...
Personally, I think using SIGCHLD to detect child death, rather than timeouts, is preferable, but we don't currently have library support for signal handling, and I'm not 100% sure how windows signals child death (notify event?). We could piggyback on chromium's imported libevent on POSIX, but we need to be careful not to get too tied into chromium code. Regardless of how we detect death, we still need the SIGTERM/SIGKILL timeouts.
| Assignee | ||
Comment 1•16 years ago
|
||
(In reply to comment #0)
> We should be careful about cleaning up child processes, as leaving zombies
> around leaks PIDs and probably a small amount of kernel memory per process.
Also if a child process just hangs during termination, i.e. doesn't even make it to the zombie state, and has open shmem fds, e.g., then the resource leak could be much worse.
| Assignee | ||
Updated•16 years ago
|
Assignee: nobody → jones.chris.g
Windows is easy here - the process object becomes signaled when the process dies.
| Assignee | ||
Comment 3•16 years ago
|
||
The best solution for the time being is to use the chromium process cleanup code, which spawns off a new thread to wait. As long as we ensure the cleanup threads are "detached", it will solve bug 528411.
Blocks: 528411
| Assignee | ||
Comment 4•16 years ago
|
||
Pushed http://hg.mozilla.org/projects/electrolysis/rev/82ce02c1243b. I /want/ to leave this bug open for the sake of cleaning up ContentProcessParent shutdown, but as that's more of a refactoring it should probably be done in a followup.
| Assignee | ||
Updated•16 years ago
|
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Comment 5•16 years ago
|
||
This patch doesn't seem to have helped at all because we don't actually call ~GeckoChildProcessHost when shutting down a browser with remote plugins...
| Assignee | ||
Comment 6•16 years ago
|
||
It's supposed to be (and used to be). There was a hack pushed recently that may have messed things up, looking into it.
| Assignee | ||
Comment 7•16 years ago
|
||
Is there any evidence other than the tinderbox "missing line" error that leads you to believe the dtor isn't being called? It's working correctly on my local machine.
You need to log in
before you can comment on or make changes to this bug.
Description
•