[e10s] Don't deadlock on a child process for a CPOW

RESOLVED FIXED in mozilla26

Status

()

Core
IPC
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: dvander, Assigned: dvander)

Tracking

unspecified
mozilla26
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 1 obsolete attachment)

Created attachment 791083 [details] [diff] [review]
timeouts.patch

Whether this approach will be good long-term is unknown - we may want to have a slow-script dialog, or to allow the child process to continue, or have a timeout backoff. For now let's see what happens if we just kill the process, since that's strictly better than a UI deadlock.

I put the logic in RPCChannel.cpp since ContentParent doesn't have access to RPCChannel, but it feels gross... if you prefer I could just static cast and hook ShouldContinueFromReplyTimeout in ContentParent.
Attachment #791083 - Flags: review?(cjones.bugs)
Comment on attachment 791083 [details] [diff] [review]
timeouts.patch

>diff --git a/ipc/glue/RPCChannel.cpp b/ipc/glue/RPCChannel.cpp
>--- a/ipc/glue/RPCChannel.cpp
>+++ b/ipc/glue/RPCChannel.cpp
>@@ -634,11 +634,41 @@ RPCChannel::OnChannelErrorFromLink()
>     mMonitor->AssertCurrentThreadOwns();
> 
>     if (0 < StackDepth())
>         NotifyWorkerThread();
> 
>     SyncChannel::OnChannelErrorFromLink();
> }
> 
>+static bool sCheckedForDebuggingChildProcess = false;
>+static bool sDebuggingChildProcesses = false;

Nit:

  static enum { UNKNOWN, NOT_DEBUGGING, DEBUGGING } sDebuggingChildren;

or something to that effect.

>+bool
>+RPCChannel::ShouldContinueFromTimeout()

I don't understand why we're overriding this.  In the PContent case,
the only way the parent will block on the child is through urgent
messages, so the SyncChannel::ShouldContinueFromTimeout()
implementation seems fine for now.  If we need something fancier down
the road, I'd rather expose the "IPC stack" to the
ShouldContinueFromReplyTimeout() hook in actors than bake that kind
of logic into the IPC layer.

Thanks for the test :).
Attachment #791083 - Flags: review?(cjones.bugs)
Created attachment 793244 [details] [diff] [review]
v2

w/ nits fixed
Attachment #791083 - Attachment is obsolete: true
Attachment #793244 - Flags: review?(cjones.bugs)
Comment on attachment 793244 [details] [diff] [review]
v2

>diff --git a/dom/ipc/ContentParent.cpp b/dom/ipc/ContentParent.cpp

>+bool
>+ContentParent::ShouldContinueFromReplyTimeout()
>+{
>+  // The only time ContentParent sends blocking messages is for CPOWs, so
>+  // timeouts should only ever occur in electrolysis-enabled sessions.

If there's some pref where pref <=> CPOWs enabled, it would be nice to
assert that here.

>diff --git a/ipc/glue/SyncChannel.cpp b/ipc/glue/SyncChannel.cpp

>+    static enum { UNKNOWN, NOT_DEBUGGING, DEBUGGING } sDebuggingChildren = UNKNOWN;

No need for the UNKNOWN, but I don't mind being explicit about the init.

r=me, would be nice to get CPOW assertion in.
Attachment #793244 - Flags: review?(cjones.bugs) → review+
(In reply to Chris Jones [:cjones] [:warhammer] from comment #3)
> >+    static enum { UNKNOWN, NOT_DEBUGGING, DEBUGGING } sDebuggingChildren = UNKNOWN;
> 
> No need for the UNKNOWN, but I don't mind being explicit about the init.

No need for the | = UNKNOWN| init, I meant.  But I don't mind.

Comment 6

5 years ago
https://hg.mozilla.org/mozilla-central/rev/f3dce1d8c04f
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla26
You need to log in before you can comment on or make changes to this bug.