Closed Bug 1433856 Opened 2 years ago Closed 2 years ago

Crash in mozilla::ipc::MessageChannel::Close

Categories

(Core :: Security: Process Sandboxing, defect, P1, critical)

Unspecified
Windows 7
defect

Tracking

()

RESOLVED FIXED
mozilla60
Tracking Status
firefox-esr52 --- unaffected
firefox58 --- unaffected
firefox59 --- unaffected
firefox60 --- fixed

People

(Reporter: calixte, Assigned: handyman)

References

(Blocks 1 open bug)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file, 1 obsolete file)

This bug was filed from the Socorro interface and is
report bp-7ee0399a-d6d4-43c0-b442-483310180129.
=============================================================

Top 10 frames of crashing thread:

0 xul.dll mozilla::ipc::MessageChannel::Close ipc/glue/MessageChannel.cpp:2711
1 xul.dll mozilla::plugins::PluginModuleChromeParent::CleanupFromTimeout dom/plugins/ipc/PluginModuleParent.cpp:817
2 xul.dll mozilla::ipc::TaskFactory<mozilla::plugins::PluginModuleChromeParent>::TaskWrapper<mozilla::ipc::TaskFactory<mozilla::plugins::PluginModuleChromeParent>::RunnableMethod<void  ipc/glue/TaskFactory.h:43
3 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1040
4 xul.dll nsThread::Shutdown xpcom/threads/nsThread.cpp:796
5 xul.dll mozilla::dom::workers::RuntimeService::ShutdownIdleThreads dom/workers/RuntimeService.cpp:1882
6 xul.dll nsTimerImpl::Fire xpcom/threads/nsTimerImpl.cpp:701
7 xul.dll nsTimerEvent::Run xpcom/threads/TimerThread.cpp:286
8 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1040
9 xul.dll nsThread::Shutdown xpcom/threads/nsThread.cpp:796

=============================================================

There are 2 crashes (from are 2 installations) in nightly 60 with buildid 20180128220152. In analyzing the backtrace, the regression may have been introduced by patch [1] to fix bug 1382251.

[1] https://hg.mozilla.org/mozilla-central/rev?node=0dd2318482b5
Flags: needinfo?(davidp99)
I think the important part of the crashed stack (from the chrome process) is:

  xul.dll!mozilla::ipc::MessageChannel::Close() Line 2711 C++
  xul.dll!mozilla::plugins::PluginModuleChromeParent::CleanupFromTimeout(const bool aFromHangUI=false) Line 819 C++
  xul.dll!mozilla::ipc::TaskFactory<mozilla::plugins::PluginModuleChromeParent>::TaskWrapper<mozilla::ipc::TaskFactory<mozilla::plugins::PluginModuleChromeParent>::RunnableMethod<void (__cdecl mozilla::plugins::PluginModuleChromeParent::*)(bool) __ptr64,Tuple1<bool> > >::Run() Line 44 C++
  xul.dll!nsThread::ProcessNextEvent(bool aMayWait, bool * aResult=0x00000043a2ffa170) Line 1041  C++
  xul.dll!NS_ProcessNextEvent(nsIThread * aThread, bool aMayWait=true) Line 517 C++
  xul.dll!nsThread::Shutdown() Line 796 C++
  xul.dll!mozilla::plugins::FunctionBrokerThread::`scalar deleting destructor'(unsigned int)  C++
> xul.dll!mozilla::plugins::FunctionBrokerParent::~FunctionBrokerParent() Line 54 C++
  xul.dll!mozilla::plugins::FunctionBrokerParent::`scalar deleting destructor'(unsigned int)  C++
  xul.dll!mozilla::plugins::PluginModuleChromeParent::ActorDestroy(mozilla::ipc::IProtocol::ActorDestroyReason why=NormalShutdown) Line 1617  C++
  xul.dll!mozilla::plugins::PPluginModuleParent::OnChannelClose() Line 1325 C++
  xul.dll!mozilla::ipc::MessageChannel::NotifyChannelClosed() Line 2747 C++
  xul.dll!mozilla::ipc::MessageChannel::Close() Line 2724 C++
  xul.dll!mozilla::plugins::PluginModuleChromeParent::CleanupFromTimeout(const bool aFromHangUI=false) Line 819 C++
  xul.dll!mozilla::ipc::TaskFactory<mozilla::plugins::PluginModuleChromeParent>::TaskWrapper<mozilla::ipc::TaskFactory<mozilla::plugins::PluginModuleChromeParent>::RunnableMethod<void (__cdecl mozilla::plugins::PluginModuleChromeParent::*)(bool) __ptr64,Tuple1<bool> > >::Run() Line 44 C++
  xul.dll!nsThread::ProcessNextEvent(bool aMayWait, bool * aResult=0x00000043a2ffab68) Line 1041  C++
  xul.dll!XPTC__InvokebyIndex() Line 99 Unknown

Note the cyclic calls to PluginModuleChromeParent::CleanupFromTimeout.  This appears to be because the nsThread event system doesn't remove a job from its queue before running it _and_ because the nsThread::Shutdown operation chooses to spin the event queue.  Both of these are questionable (the first is really weird although the method probably doesn't intent to be re-entrant) but this is _very_ low level behavior that we may not want to tamper with.  Instead, I've hacked CleanupFromTimeout so it won't run itself recursively.
Assignee: nobody → davidp99
Flags: needinfo?(davidp99)
Priority: -- → P1
Comment on attachment 8946534 [details] [diff] [review]
Block PluginModuleChromeParent::CleanupFromTimeout from recursing

Tests: https://treeherder.mozilla.org/#/jobs?repo=try&revision=73c3217d44222a9a8b6445260b70ab0ae355faf1&selectedJob=159254510

jimm: There is no STR for this -- there are a number of conditions under which this happens but they are all based on something like IPDL timing out.  See comment 1 for my argument for why this should work.
Attachment #8946534 - Flags: review?(jmathies)
Comment on attachment 8946534 [details] [diff] [review]
Block PluginModuleChromeParent::CleanupFromTimeout from recursing

Review of attachment 8946534 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/plugins/ipc/PluginModuleParent.cpp
@@ +810,5 @@
> +    if (mIsCleaningFromTimeout) {
> +      return;
> +    }
> +
> +    mIsCleaningFromTimeout = true;

consider using AutoRestore.
Attachment #8946534 - Flags: review?(jmathies) → review+
Keywords: checkin-needed
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/cda4b83616fc
Block PluginModuleChromeParent::CleanupFromTimeout from recursing. r=jimm
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/cda4b83616fc
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
You need to log in before you can comment on or make changes to this bug.