
Conflict between destruction of GeckoChildProcessHost and crash reporter

NEW
Unassigned

Status


Core
IPC
P3
normal
a year ago
19 days ago

People

(Reporter: sinker, Unassigned)

Tracking

Trunk
Unspecified
Linux
Points:
---

Firefox Tracking Flags

(firefox52 wontfix)

Details

(Reporter)

Description

a year ago
Gecko hangs when the destruction of GeckoChildProcessHost, nsExternalHelperAppService::GetFromTypeAndExtension(), and PContentChild::SendPCrashReporterConstructor() (in a content process) happen at the same time; together they create a deadlock.

I ran into this bug while running reftest with only one test on a Linux desktop.

The sequence of actions is
 1. Destruction of GeckoChildProcessHost
 2. GetFromTypeAndExtension()
 3. SendPCrashReporterConstructor()

The destructor of GeckoChildProcessHost waits for the content process to die, and to do so it changes the signal handler for SIGCHLD.  However, the Linux implementation of GetFromTypeAndExtension() runs an external program to check an environment variable (tricky), and that also depends on SIGCHLD, with its own handler.  This is what causes the problem.

GeckoChildProcessHost overrides the signal handler, so GetFromTypeAndExtension() never returns because it is never signaled.  It blocks the parent's main thread, which keeps waiting for the external program.  At the same time, SendPCrashReporterConstructor(), in the content process, is waiting for the parent's main thread (blocked in GetFromTypeAndExtension()) to handle the IPC message.
So there is a deadlock.
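
The handler conflict described above can be reproduced in isolation. The following is only a minimal sketch, assuming a single-threaded POSIX process; it is not the NSPR or libevent code, and every name in it (FirstOwnerHandler, SecondOwnerHandler, RunSigchldOverrideDemo) is hypothetical. It installs one SIGCHLD handler, then a second one on top of it, and counts which of them is invoked when a child exits.

```cpp
#include <cassert>
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-ins for the two SIGCHLD users in the report:
// the code waiting for the external program, and the process watcher.
static volatile sig_atomic_t gFirstOwnerCalls = 0;
static volatile sig_atomic_t gSecondOwnerCalls = 0;

static void FirstOwnerHandler(int) { gFirstOwnerCalls = gFirstOwnerCalls + 1; }
static void SecondOwnerHandler(int) { gSecondOwnerCalls = gSecondOwnerCalls + 1; }

struct DemoResult {
  bool ok;               // true if the child was forked and reaped
  int firstOwnerCalls;   // invocations of the handler installed first
  int secondOwnerCalls;  // invocations of the handler installed last
};

// Installs |handler| for SIGCHLD; optionally returns the previous disposition.
static bool InstallSigchldHandler(void (*handler)(int), struct sigaction* old) {
  struct sigaction sa = {};
  sa.sa_handler = handler;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = 0;
  return sigaction(SIGCHLD, &sa, old) == 0;
}

DemoResult RunSigchldOverrideDemo() {
  DemoResult result = {false, 0, 0};
  gFirstOwnerCalls = 0;
  gSecondOwnerCalls = 0;

  struct sigaction original;
  if (!InstallSigchldHandler(FirstOwnerHandler, &original)) {
    return result;
  }
  // A process has only one SIGCHLD disposition, so this call replaces
  // the first handler instead of adding a second listener.
  if (InstallSigchldHandler(SecondOwnerHandler, nullptr)) {
    pid_t child = fork();
    if (child == 0) {
      _exit(0);  // the child exits immediately, raising SIGCHLD in the parent
    }
    if (child > 0) {
      int status = 0;
      pid_t reaped;
      do {
        reaped = waitpid(child, &status, 0);  // retry if the signal interrupts us
      } while (reaped < 0 && errno == EINTR);
      result.ok = (reaped == child);
    }
  }

  sigaction(SIGCHLD, &original, nullptr);  // restore the caller's disposition
  result.firstOwnerCalls = gFirstOwnerCalls;
  result.secondOwnerCalls = gSecondOwnerCalls;
  return result;
}
```

In this sketch only the handler installed last is expected to run. Code that relies on the first handler to learn that its child has exited would keep waiting, which matches the hang in GetFromTypeAndExtension() described above.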

The parent process stops at
nsExternalHelperAppService::GetFromTypeAndExtension()

#0  0x00007ffff7bcb04f in pthread_cond_wait@@GLIBC_2.3.2 ()
    at /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007ffff68ea1b3 in PR_WaitCondVar (cvar=0x7fffb67a6340, timeout=timeout@entry=4294967295)
    at /home/thinker/progm/mozilla-central/nsprpub/pr/src/pthreads/ptsynch.c:396
#2  0x00007ffff68da8b3 in _MD_WaitUnixProcess (process=0x7fffb8d2cfa0, exitCode=exitCode@entry=0x7fffffff60bc)
    at /home/thinker/progm/mozilla-central/nsprpub/pr/src/md/unix/uxproces.c:832
#3  0x00007ffff68df7fa in PR_WaitProcess (process=<optimized out>, exitCode=exitCode@entry=0x7fffffff60bc)
    at /home/thinker/progm/mozilla-central/nsprpub/pr/src/misc/prinit.c:734
#4  0x00007fffe818a6ac in nsProcess::Monitor(void*) (aArg=aArg@entry=0x7fffc45b7620)
    at /home/thinker/progm/mozilla-central/xpcom/threads/nsProcessCommon.cpp:281
#5  0x00007fffe818a921 in nsProcess::RunProcess(bool, char**, nsIObserver*, bool, bool) (this=0x7fffc45b7620, aBlocking=<optimized out>, aMyArgv=0x7fffb6520800, aObserver=0x0, aHoldWeak=<optimized out>, aArgsUTF8=<optimized out>)
    at /home/thinker/progm/mozilla-central/xpcom/threads/nsProcessCommon.cpp:543
#6  0x00007fffe818aa67 in nsProcess::CopyArgsAndRunProcess(bool, char const**, unsigned int, nsIObserver*, bool) (this=0x7fffc45b7620, aBlocking=<optimized out>, aArgs=0x7fffffff6388, aCount=2, aObserver=0x0, aHoldWeak=<optimized out>)
    at /home/thinker/progm/mozilla-central/xpcom/threads/nsProcessCommon.cpp:378
#7  0x00007fffe896e8dc in nsOSHelperAppService::GetHandlerAndDescriptionFromMailcapFile(nsAString_internal const&, nsAString_internal const&, nsAString_internal const&, nsAString_internal&, nsAString_internal&, nsAString_internal&) (aFilename=..., aMajorType=..., aMinorType=..., aHandler=..., aDescription=..., aMozillaFlags=...)
    at /home/thinker/progm/mozilla-central/uriloader/exthandler/unix/nsOSHelperAppService.cpp:1091
#8  0x00007fffe896eb3d in nsOSHelperAppService::DoLookUpHandlerAndDescription(nsAString_internal const&, nsAString_internal const&, nsAString_internal&, nsAString_internal&, nsAString_internal&, bool) (aMajorType=..., aMinorType=..., aHandler=..., aDescription=..., aMozillaFlags=..., aUserData=aUserData@entry=false)
    at /home/thinker/progm/mozilla-central/uriloader/exthandler/unix/nsOSHelperAppService.cpp:913

The stack trace when the destructor of GeckoChildProcessHost sets a signal handler:
#0  0x00007fffe8510e7a in _evsig_set_handler (base=base@entry=0x7ffff6b20800, evsignal=evsignal@entry=17, handler=handler@entry=0x7fffe8510c7e <evsig_handler>)
    at /home/thinker/progm/mozilla-central/ipc/chromium/src/third_party/libevent/signal.c:221
#1  0x00007fffe8511088 in evsig_add (base=0x7ffff6b20800, evsignal=17, old=<optimized out>, events=<optimized out>, p=<optimized out>)
    at /home/thinker/progm/mozilla-central/ipc/chromium/src/third_party/libevent/signal.c:303
#2  0x00007fffe850cd0e in evmap_signal_add (base=base@entry=0x7ffff6b20800, sig=<optimized out>, ev=ev@entry=0x7fffbc343ce0)
    at /home/thinker/progm/mozilla-central/ipc/chromium/src/third_party/libevent/evmap.c:433
#3  0x00007fffe850f571 in event_add_internal (ev=0x7fffbc343ce0, tv=0x0, tv_is_absolute=0)
    at /home/thinker/progm/mozilla-central/ipc/chromium/src/third_party/libevent/event.c:2075
#4  0x00007fffe8510113 in event_add (ev=0x7fffbc343ce0, tv=tv@entry=0x0)
    at /home/thinker/progm/mozilla-central/ipc/chromium/src/third_party/libevent/event.c:1966
#5  0x00007fffe84fe3a6 in base::MessagePumpLibevent::CatchSignal(int, base::MessagePumpLibevent::SignalEvent*, base::MessagePumpLibevent::SignalWatcher*) (this=0x7ffff6bccd00, sig=sig@entry=17, sigevent=sigevent@entry=0x7fffba0f8f08, delegate=delegate@entry=0x7fffba0f8f00)
    at /home/thinker/progm/mozilla-central/ipc/chromium/src/base/message_pump_libevent.cc:314
#6  0x00007fffe84fe3ed in MessageLoopForIO::CatchSignal(int, base::MessagePumpLibevent::SignalEvent*, base::MessagePumpLibevent::SignalWatcher*) (this=this@entry=0x7fffe6365d30, sig=sig@entry=17, sigevent=sigevent@entry=0x7fffba0f8f08, delegate=delegate@entry=0x7fffba0f8f00)
    at /home/thinker/progm/mozilla-central/ipc/chromium/src/base/message_loop.cc:576
#7  0x00007fffe8509420 in ProcessWatcher::EnsureProcessTerminated(int, bool) (process=19597, force=force@entry=false)
    at /home/thinker/progm/mozilla-central/ipc/chromium/src/chrome/common/process_watcher_posix_sigchld.cc:211
#8  0x00007fffe8516857 in mozilla::ipc::GeckoChildProcessHost::~GeckoChildProcessHost() (this=0x7fffc50cf820, __in_chrg=<optimized out>)
    at /home/thinker/progm/mozilla-central/ipc/glue/GeckoChildProcessHost.cpp:132
#9  0x00007fffe85168b6 in mozilla::ipc::GeckoChildProcessHost::~GeckoChildProcessHost() (this=0x7fffc50cf820, __in_chrg=<optimized out>)
    at /home/thinker/progm/mozilla-central/ipc/glue/GeckoChildProcessHost.cpp:139
(Reporter)

Comment 1

a year ago
The question is why |GetFromTypeAndExtension()| runs an external program.  It calls |/bin/sh -c "test -n \"$DISPLAY\""| to detect whether the environment variable |DISPLAY| is set.  I am wondering why it uses a shell rather than |PR_GetEnv()|.
That shell command is probably from a mailcap file — each mailcap entry can specify an arbitrary shell command that has to be run to determine whether the entry is enabled.
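
For the one check quoted in Comment 1, a direct environment lookup would give the same answer as `test -n "$DISPLAY"`. The sketch below is an illustration only: it uses the standard `std::getenv` as a stand-in for |PR_GetEnv()|, and the helper name EnvVarIsNonEmpty is hypothetical. It does not generalize to arbitrary mailcap test commands, which can be any shell command.

```cpp
#include <cassert>
#include <cstdlib>

// Equivalent of `test -n "$NAME"`: true only when the variable is set
// and its value is not the empty string.
bool EnvVarIsNonEmpty(const char* name) {
  const char* value = std::getenv(name);  // stand-in for PR_GetEnv()
  return value != nullptr && value[0] != '\0';
}
```

A caller would use `EnvVarIsNonEmpty("DISPLAY")` instead of spawning `/bin/sh`, which avoids any dependency on SIGCHLD for that particular check.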
See Also: → bug 227246
Too late for firefox 52, mass-wontfix.
status-firefox52: affected → wontfix

Updated

19 days ago
Priority: -- → P3