Bug 1087537 (closed). Opened 10 years ago, closed 2 years ago.

Linux e10s Mochitest 2 and 3 intermittently send a message to child process too late

Categories

(Core :: DOM: Content Processes, defect, P5)

Tracking

RESOLVED WORKSFORME
Tracking Status
e10s + ---

People

(Reporter: mccr8, Unassigned)

Linux e10s M3 is intermittently failing to produce a log. The one time I've managed to reproduce it on TBPL so far, it was sending a bad message, with this stack for the error:

###!!! [Child][OnMaybeDequeueOne] Error: Channel closing: too late to send/recv, messages will be lost
[Child 1870] ###!!! ASSERTION: Dropping message!: 'false', file /builds/slave/try-lx-d-000000000000000000000/build/dom/ipc/ContentChild.cpp, line 1660
#01: mozilla::dom::ContentChild::ProcessingError(mozilla::ipc::HasResultCodes::Result) [dom/ipc/ContentChild.cpp:1677]
#02: mozilla::dom::PContentChild::OnProcessingError(mozilla::ipc::HasResultCodes::Result) [obj-firefox/ipc/ipdl/PContentChild.cpp:5760]
#03: mozilla::ipc::MessageChannel::ReportConnectionError(char const*) const [ipc/glue/MessageChannel.cpp:1435]
#04: mozilla::ipc::MessageChannel::DequeueOne(IPC::Message*) [ipc/glue/MessageChannel.cpp:996]
#05: mozilla::ipc::MessageChannel::OnMaybeDequeueOne() [ipc/glue/MessageChannel.cpp:1019]
#06: RunnableMethod<mozilla::ipc::MessageChannel, bool (mozilla::ipc::MessageChannel::*)(), Tuple0>::Run() [ipc/chromium/src/base/tuple.h:383]
#07: mozilla::ipc::MessageChannel::DequeueTask::Run() [ipc/glue/MessageChannel.h:404]
#08: MessageLoop::RunTask(Task*) [ipc/chromium/src/base/message_loop.cc:362]
#09: MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) [ipc/chromium/src/base/message_loop.cc:372]
#10: MessageLoop::DoWork() [ipc/chromium/src/base/message_loop.cc:447]
#11: mozilla::ipc::DoWorkRunnable::Run() [ipc/glue/MessagePump.cpp:234]
#12: nsThread::ProcessNextEvent(bool, bool*) [xpcom/threads/nsThread.cpp:830]
#13: NS_ProcessNextEvent(nsIThread*, bool) [xpcom/glue/nsThreadUtils.cpp:265]
#14: mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) [ipc/glue/MessagePump.cpp:100]
#15: MessageLoop::RunInternal() [ipc/chromium/src/base/message_loop.cc:233]
#16: MessageLoop::Run() [ipc/chromium/src/base/message_loop.cc:508]
#17: nsBaseAppShell::Run() [widget/xpwidgets/nsBaseAppShell.cpp:166]
#18: XRE_RunAppShell [toolkit/xre/nsEmbedFunctions.cpp:713]
#19: mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate*) [ipc/glue/MessagePump.cpp:272]
#20: MessageLoop::RunInternal() [ipc/chromium/src/base/message_loop.cc:233]

I think this indicates the parent process tries to send something too late. There's no easy way to tell what is sending it from this stack, so I'm not sure if this is a dupe of bug 1062472 or not.
I've now seen this 3 or so times, and every time there are precisely 54 such errors.
I've also seen this in one M2 run, where there's a similar failure stack over 100 times.
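
To compare the error counts between runs (54 in the M3 logs, over 100 in the M2 run), something like the following could tally the "too late to send/recv" lines in a saved test log. This is only a rough sketch: the function name `count_late_sends` is made up for illustration, and the pattern assumes every occurrence has the same `###!!! [Side][Routine] Error: ...` shape as the line quoted above.

```python
import re
from collections import Counter

# Pattern modelled on the single error line quoted in this bug; the
# assumption is that other occurrences follow the same layout.
LATE_SEND = re.compile(
    r"###!!! \[(\w+)\]\[(\w+)\] Error: Channel closing: too late to send/recv"
)


def count_late_sends(log_lines):
    """Count 'too late to send/recv' errors, keyed by (side, routine)."""
    counts = Counter()
    for line in log_lines:
        match = LATE_SEND.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_late_sends.py path/to/test.log
    if len(sys.argv) > 1:
        with open(sys.argv[1], errors="replace") as log:
            for (side, routine), n in sorted(count_late_sends(log).items()):
                print(f"{side} {routine}: {n}")
```

Keying on both the side and the routine keeps child-side and parent-side drops separate, which might help when deciding whether two logs show the same failure.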
Summary: Linux e10s Mochitest 3 intermittently sends a message to child process too late → Linux e10s Mochitest 2 and 3 intermittently send a message to child process too late
Blocks: 1067633
No longer blocks: 1083897
Intermittent e10s test failure
Priority: -- → P5

e10s content process leak checking has been working for a long time.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → WORKSFORME