Closed Bug 705543 Opened 13 years ago Closed 9 years ago

chromehang | libpthread-2.13.so@0xc04c in _MD_WaitUnixProcess with icedtea-web

Categories

(Core :: XPCOM, defect)

x86_64
Linux
defect
Not set
critical

RESOLVED INCOMPLETE

People

(Reporter: jerome, Unassigned)

Details

(Keywords: hang)

Attachments

(1 file)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:11.0a1) Gecko/20111126 Firefox/11.0a1
Build ID: 20111126031027

Steps to reproduce:

Watch a (flash) video from http://www.megavideo.com


Actual results:

crash
Crash Signature: chromehang | libpthread-2.13.so@0xc04c
Component: General → XPCOM
Product: Firefox → Core
Version: 11 Branch → Trunk
Summary: Just regular browsing with video, first time I see it with the last nightly 20111126 → Crash Report [@ chromehang | libpthread-2.13.so@0xc04c ]
Stack traces in thread 0 look like:
Frame 	Module 	Signature 	Source
0 	libpthread-2.13.so 	libpthread-2.13.so@0xc04c 	
1 	libnspr4.so 	PR_WaitCondVar 	nsprpub/pr/src/pthreads/ptsynch.c:417
2 	libnspr4.so 	_MD_WaitUnixProcess 	nsprpub/pr/src/md/unix/uxproces.c:861
3 	libxul.so 	nsProcess::Monitor 	xpcom/threads/nsProcessCommon.cpp:293
4 	libxul.so 	nsProcess::RunProcess 	xpcom/threads/nsProcessCommon.cpp:549
5 	libxul.so 	nsProcess::CopyArgsAndRunProcess 	xpcom/threads/nsProcessCommon.cpp:386
6 	libxul.so 	nsOSHelperAppService::GetHandlerAndDescriptionFromMailcapFile 	uriloader/exthandler/unix/nsOSHelperAppService.cpp:1136
7 	libxul.so 	nsOSHelperAppService::DoLookUpHandlerAndDescription 	uriloader/exthandler/unix/nsOSHelperAppService.cpp:956
8 	libxul.so 	nsOSHelperAppService::LookUpHandlerAndDescription 	uriloader/exthandler/unix/nsOSHelperAppService.cpp:901
9 	libxul.so 	nsOSHelperAppService::GetFromExtension 	uriloader/exthandler/unix/nsOSHelperAppService.cpp:1347
10 	libxul.so 	nsOSHelperAppService::GetMIMEInfoFromOS 	uriloader/exthandler/unix/nsOSHelperAppService.cpp:1543
11 	libxul.so 	nsExternalHelperAppService::GetFromTypeAndExtension 	uriloader/exthandler/nsExternalHelperAppService.cpp:2491
12 	libxul.so 	nsExternalHelperAppService::DoContent 	uriloader/exthandler/nsExternalHelperAppService.cpp:746
13 	libxul.so 	nsDocumentOpenInfo::DispatchContent 	uriloader/base/nsURILoader.cpp:567
14 	libxul.so 	nsDocumentOpenInfo::OnStartRequest 	uriloader/base/nsURILoader.cpp:294
15 	libxul.so 	nsHttpChannel::CallOnStartRequest 	netwerk/protocol/http/nsHttpChannel.cpp:750
16 	libxul.so 	nsHttpChannel::ContinueProcessNormal 	netwerk/protocol/http/nsHttpChannel.cpp:1244
17 	libxul.so 	nsHttpChannel::ProcessNormal 	netwerk/protocol/http/nsHttpChannel.cpp:1181
18 	libxul.so 	nsHttpChannel::ProcessResponse 	netwerk/protocol/http/nsHttpChannel.cpp:1083
19 	libxul.so 	nsHttpChannel::OnStartRequest 	netwerk/protocol/http/nsHttpChannel.cpp:4102
20 	libxul.so 	nsInputStreamPump::OnStateStart 	netwerk/base/src/nsInputStreamPump.cpp:441
21 	libxul.so 	nsInputStreamPump::OnInputStreamReady 	netwerk/base/src/nsInputStreamPump.cpp:397
22 	libxul.so 	nsInputStreamReadyEvent::Run 	xpcom/io/nsStreamUtils.cpp:114
23 	libxul.so 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:625
24 	libxul.so 	NS_ProcessNextEvent_P 	obj-firefox/xpcom/build/nsThreadUtils.cpp:245
25 	libxul.so 	mozilla::ipc::MessagePump::Run 	ipc/glue/MessagePump.cpp:134
26 	libxul.so 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:208
27 	libxul.so 	nsBaseAppShell::Run 	widget/src/xpwidgets/nsBaseAppShell.cpp:189
28 	libxul.so 	nsAppStartup::Run 	toolkit/components/startup/nsAppStartup.cpp:221
29 	libxul.so 	XRE_main 	toolkit/xre/nsAppRunner.cpp:3558
30 	firefox 	main 	browser/app/nsBrowserApp.cpp:201
31 	libc-2.13.so 	libc-2.13.so@0x2130c 	
32 	firefox 	firefox@0x1b9f 

More reports at:
https://crash-stats.mozilla.com/report/list?signature=chromehang%20|%20libpthread-2.13.so%400xc04c
Status: UNCONFIRMED → NEW
Crash Signature: chromehang | libpthread-2.13.so@0xc04c → [@ chromehang | libpthread-2.13.so@0xc04c ]
Ever confirmed: true
Keywords: crash, regression
This isn't surprising: we're calling nsIProcess::Run with blocking set to true.
It is odd, though, that something in mailcap would take 30 seconds to run.

Jerome, can you attach console output please for a debug build run with NSPR_LOG_MODULES=HelperAppService:5, and/or try "strace -e process /path/to/firefox -no-remote" to see whether that finds which command is exec'd?

This should correspond to a "test=" line in /etc/mailcap or ~/.mailcap.

Debug builds are available from http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2011-11-28-mozilla-central-debug/
The console output was obtained from the debug build with the following commands:
$ cd /home/jerome/Bureau/firefox/firefox
$ export NSPR_LOG_MODULES=HelperAppService:5
$ strace -e process /home/jerome/Bureau/firefox/firefox -no-remote &> strace_log

I visited a megavideo URL for a few seconds and no crash occurred.
Thank you, but the log is less useful if the crash is not happening.

Is the crash intermittent, or did using the debug build avert the crash, or is the crash no longer reproducible with a nightly build?  (Perhaps the web site has changed.)

Might be interesting to see whether "grep test= /etc/mailcap ~/.mailcap" shows any unusual test commands.
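
As an alternative to the grep above, the "test=" fields can be pulled out per MIME type with a short script. This is a minimal sketch, not the parser Gecko uses: the function name mailcap_test_commands is made up for illustration, and it assumes one entry per line with no backslash-escaped semicolons or line continuations, which real mailcap files may contain.

```python
def mailcap_test_commands(text):
    """Return (mime_type, test_command) pairs for entries that have a test= field."""
    results = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = [field.strip() for field in line.split(";")]
        mime_type = fields[0]
        # fields[1] is the view command; named options such as test= follow it.
        for field in fields[2:]:
            name, sep, value = field.partition("=")
            if sep and name.strip() == "test":
                results.append((mime_type, value.strip()))
    return results


SAMPLE = (
    "# sample entry\n"
    "application/pdf; xpdf '%s'; test=test -n \"$DISPLAY\"; "
    "description=Portable Document Format; nametemplate=%s.pdf\n"
)
print(mailcap_test_commands(SAMPLE))
# [('application/pdf', 'test -n "$DISPLAY"')]
```

To check real files, read /etc/mailcap and ~/.mailcap and pass their contents to the function; any test command that is not a quick check like the one above would be worth a closer look.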
(In reply to Karl Tomlinson (:karlt) from comment #3)
> or try "strace -e process
> /path/to/firefox -no-remote" to see whether that finds which command is
> exec'd?

Looks like we'll need to add "-f" as an strace option to get the execve syscall after the fork, but the lack of any HelperAppService logging in attachment 577703 suggests that mailcap is now not being parsed.
Severity: normal → critical
Keywords: crash, regression → hang
Summary: Crash Report [@ chromehang | libpthread-2.13.so@0xc04c ] → chromehang | libpthread-2.13.so@0xc04c in nsProcess::Monitor
I saw this same hang myself with GLib 2.30.2 when opening the Applications tab in Preferences.
WaitPidDaemonThread is polling, waiting for pr_SigchldHandler to wake it up so that it can reap the process.

Unfortunately GLib has installed its own signal handler and doesn't chain up.
http://git.gnome.org/browse/glib/tree/glib/gmain.c?id=2.30.2#n4374

I haven't yet discovered what caused that to be called with SIGCHLD.

There have been some changes in that code since 2.28, including the new public function g_unix_signal_source_new.
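
The effect of a handler that replaces another without chaining up can be shown with a small Python sketch. It is not NSPR or GLib code: first_handler and second_handler are hypothetical stand-ins for pr_SigchldHandler and GLib's handler, and deliver_sigchld is a name made up for this illustration. It assumes a POSIX system, Python 3.8 or later, and that it runs on the main thread, since Python only allows signal handlers to be installed there.

```python
import signal


def deliver_sigchld(chain):
    """Install two SIGCHLD handlers in turn, raise SIGCHLD, and report which ran."""
    calls = []

    def first_handler(signum, frame):
        # Stand-in for the library that installed its handler first.
        calls.append("first")

    def second_handler(signum, frame):
        # Stand-in for the library that installed its handler later.
        calls.append("second")
        if chain and callable(previous):
            previous(signum, frame)  # chain up to the handler that was replaced

    original = signal.signal(signal.SIGCHLD, first_handler)
    try:
        # signal.signal returns the handler it replaces, here first_handler.
        previous = signal.signal(signal.SIGCHLD, second_handler)
        signal.raise_signal(signal.SIGCHLD)
    finally:
        # Restore whatever was installed before the demonstration.
        restore = original if original is not None else signal.SIG_DFL
        signal.signal(signal.SIGCHLD, restore)
    return calls


print(deliver_sigchld(chain=False))  # ['second']
print(deliver_sigchld(chain=True))   # ['second', 'first']
```

Without chaining, the first handler never runs, so anything waiting to be woken by it keeps waiting. That matches the symptom described above, under the assumption that the waiting thread relies only on that wake-up.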
Depends on: 678369
Summary: chromehang | libpthread-2.13.so@0xc04c in nsProcess::Monitor → chromehang | libpthread-2.13.so@0xc04c in _MD_WaitUnixProcess with GLib >= 2.30
I see GLib steal the SIGCHLD handler when icedtea-web 1.1.4 is loaded (in the browser process):

#0  ensure_unix_signal_handler_installed_unlocked (signum=17) at gmain.c:4376
#1  0x00007f54f69f5a60 in g_child_watch_source_init () at gmain.c:4620
#2  g_child_watch_source_new (pid=6629) at gmain.c:4667
#3  0x00007f54f69f5b34 in g_child_watch_add_full (priority=0, pid=<optimized out>, 
    function=0x7f547b1d9930 <appletviewer_monitor()>, data=0x19e5, notify=0) at gmain.c:4721
#4  0x00007f547b1d9057 in start_jvm_if_needed ()
   from /usr/lib64/icedtea7-web/lib64/IcedTeaPlugin.so
#5  0x00007f547b1da0df in ITNP_New () from /usr/lib64/icedtea7-web/lib64/IcedTeaPlugin.so
#6  0x00007f54fbb160a9 in mozilla::PluginPRLibrary::NPP_New (this=<optimized out>, 
    pluginType=<optimized out>, instance=<optimized out>, mode=<optimized out>, 
    argc=<optimized out>, argn=<optimized out>, argv=0x7f547d847100, saved=0x0, 
    error=0x7fff68a2221a)
    at /var/tmp/portage/www-client/firefox-10.0.1-r1/work/mozilla-release/dom/plugins/base/PluginPRLibrary.cpp:218
#7  0x00007f54fbb04b35 in nsNPAPIPluginInstance::InitializePlugin (this=0x7f54619d5c40)
    at /var/tmp/portage/www-client/firefox-10.0.1-r1/work/mozilla-release/dom/plugins/base/nsNPAPIPluginInstance.cpp:463
#8  0x00007f54fbb110c6 in nsPluginHost::TrySetUpPluginInstance (this=0x7f547ce57980, 
    aMimeType=0x7f5462fd6da8 "application/x-java-vm", aURL=0x7f545caf2280, aOwner=0x7f5446129060)
    at /var/tmp/portage/www-client/firefox-10.0.1-r1/work/mozilla-release/dom/plugins/base/nsPluginHost.cpp:1313
#9  0x00007f54fbb111ee in nsPluginHost::SetUpPluginInstance (this=0x7f547ce57980, 
    aMimeType=0x7f5462fd6da8 "application/x-java-vm", aURL=0x7f545caf2280, aOwner=0x7f5446129060)
    at /var/tmp/portage/www-client/firefox-10.0.1-r1/work/mozilla-release/dom/plugins/base/nsPluginHost.cpp:1193
#10 0x00007f54fbb119b6 in nsPluginHost::InstantiateEmbeddedPlugin (this=0x7f547ce57980, 
    aMimeType=<optimized out>, aURL=0x7f545caf2280, aOwner=0x7f5446129060)
    at /var/tmp/portage/www-client/firefox-10.0.1-r1/work/mozilla-release/dom/plugins/base/nsPluginHost.cpp:1069
#11 0x00007f54fb4bcea0 in nsObjectFrame::InstantiatePlugin (this=0x7f547d69b1a0, 
    aPluginHost=0x7f547ce57980, aMimeType=0x7f5462fd6da8 "application/x-java-vm", 
    aURI=0x7f545caf2280)
    at /var/tmp/portage/www-client/firefox-10.0.1-r1/work/mozilla-release/layout/generic/nsObjectFrame.cpp:728
#12 0x00007f54fb4bf505 in nsObjectFrame::Instantiate (this=0x7f547d69b1a0, 
    aMimeType=0x7f5462fd6da8 "application/x-java-vm", aURI=0x7f545caf2280)
    at /var/tmp/portage/www-client/firefox-10.0.1-r1/work/mozilla-release/layout/generic/nsObjectFrame.cpp:2237
#13 0x00007f54fb5ea25e in nsObjectLoadingContent::Instantiate (this=0x7f545c89ed00, 
    aFrame=0x7f547d69b1f0, aMIMEType=..., aURI=0x7f545caf2280)
    at /var/tmp/portage/www-client/firefox-10.0.1-r1/work/mozilla-release/content/base/src/nsObjectLoadingContent.cpp:1900
#14 0x00007f54fb5ea5fd in nsAsyncInstantiateEvent::Run (this=0x7f547224bd00)
    at /var/tmp/portage/www-client/firefox-10.0.1-r1/work/mozilla-release/content/base/src/nsObjectLoadingContent.cpp:172
#15 0x00007f54fbcb120f in nsThread::ProcessNextEvent (this=0x7f54ea9151c0, mayWait=false, 
    result=0x7fff68a22a2f)
    at /var/tmp/portage/www-client/firefox-10.0.1-r1/work/mozilla-release/xpcom/threads/nsThread.cpp:631
#16 0x00007f54fbc84f43 in NS_ProcessNextEvent_P (thread=<optimized out>, mayWait=false)
    at /var/tmp/portage/www-client/firefox-10.0.1-r1/work/mozilla-release/obj-x86_64-unknown-linux-gnu/xpcom/build/nsThreadUtils.cpp:245
#17 0x00007f54fbc26ca2 in mozilla::ipc::MessagePump::Run (this=0x7f54ea913640, 
    aDelegate=0x7f54fcedc3d0)
    at /var/tmp/portage/www-client/firefox-10.0.1-r1/work/mozilla-release/ipc/glue/MessagePump.cpp:110
#18 0x00007f54fbccd2b9 in RunHandler (this=0x7f54fcedc3d0)
    at /var/tmp/portage/www-client/firefox-10.0.1-r1/work/mozilla-release/ipc/chromium/src/base/message_loop.cc:201
#19 MessageLoop::Run (this=0x7f54fcedc3d0)
    at /var/tmp/portage/www-client/firefox-10.0.1-r1/work/mozilla-release/ipc/chromium/src/base/message_loop.cc:175

(Comparing code in GLib 2.28.5, I expect the same issues.)
Summary: chromehang | libpthread-2.13.so@0xc04c in _MD_WaitUnixProcess with GLib >= 2.30 → chromehang | libpthread-2.13.so@0xc04c in _MD_WaitUnixProcess with icedtea-web
This is the line in /etc/mailcap that firefox is processing at the hang:

> text/*; gview '%s'; edit=gvim -f '%s'; compose=gvim -f '%s'; test=test "$DISPLAY" != ""

I haven't managed to reproduce in my trunk build.
The Applications tab is somewhat less full there, perhaps finding fewer plugins than the system Firefox.
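
For checking the test command from that mailcap line outside the browser, it can be run through the shell with an upper bound on the wait. This is only a sketch under stated assumptions: mailcap_test_passes is a hypothetical helper, /bin/sh is assumed to exist, and the timeout is an addition for illustration that is not claimed to be part of nsProcess.

```python
import subprocess


def mailcap_test_passes(command, env=None, timeout=5.0):
    """Run a mailcap test command with `sh -c`.

    Returns True if it exits with status 0, False for any other status,
    and None if it does not finish within `timeout` seconds.
    """
    try:
        completed = subprocess.run(
            ["/bin/sh", "-c", command],
            env=env,                   # None means inherit the caller's environment
            stdin=subprocess.DEVNULL,  # the test should never wait for input
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode == 0


# The test from the mailcap line above, with DISPLAY set and then empty.
print(mailcap_test_passes('test "$DISPLAY" != ""', env={"DISPLAY": ":0"}))
print(mailcap_test_passes('test "$DISPLAY" != ""', env={"DISPLAY": ""}))
```

The first call reports success and the second reports failure, and both return almost immediately, which supports the view that the test command itself is not what takes the time.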
(In reply to Karl Tomlinson (:karlt) from comment #10)
> This is the line in /etc/mailcap that firefox is processing at the hang:
> 
> > text/*; gview '%s'; edit=gvim -f '%s'; compose=gvim -f '%s'; test=test "$DISPLAY" != ""

I got this problem several times with Iceweasel (Debian's version of Firefox) 19, 20 and 21 (the latest version). If I understand correctly, the freeze occurs on the "test" command. This command is just a simple test, which terminates very quickly; I wonder whether this can be the cause of the bug. Example of the beginning of a backtrace:

(gdb) bt full
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
No locals.
#1  0x00007fec5514dd90 in PR_WaitCondVar (cvar=0x7febc36ca3c0, timeout=timeout@entry=4294967295) at ptsynch.c:385
        rv = <optimized out>
        thred = 0x7fec5aa59260
#2  0x00007fec55156384 in _MD_WaitUnixProcess (process=0x7fec05fca080, exitCode=0x7fff136bb234) at uxproces.c:824
        pRec = 0x7febbeb2d260
        retVal = PR_SUCCESS
        interrupted = 0
#3  0x00007fec595bce0c in nsProcess::Monitor (arg=arg@entry=0x7febd99e27f0) at /tmp/buildd/iceweasel-19.0.2/xpcom/threads/nsProcessCommon.cpp:264
        process = {mRawPtr = 0x7febd99e27f0}
        exitCode = -1
#4  0x00007fec595bcfc9 in nsProcess::RunProcess (this=this@entry=0x7febd99e27f0, blocking=blocking@entry=true, my_argv=my_argv@entry=0x7fec225c13c0, observer=observer@entry=0x0, holdWeak=holdWeak@entry=false, argsUTF8=argsUTF8@entry=false) at /tmp/buildd/iceweasel-19.0.2/xpcom/threads/nsProcessCommon.cpp:512
        ptrProc = <optimized out>
#5  0x00007fec595bd212 in nsProcess::CopyArgsAndRunProcess (this=0x7febd99e27f0, blocking=true, args=0x7fff136bb468, count=<optimized out>, observer=0x0, holdWeak=false) at /tmp/buildd/iceweasel-19.0.2/xpcom/threads/nsProcessCommon.cpp:357
        my_argv = 0x7fec225c13c0
        rv = <optimized out>
#6  0x00007fec592add37 in nsOSHelperAppService::GetHandlerAndDescriptionFromMailcapFile (aFilename=..., aMajorType=..., aMinorType=..., aTypeOptions=..., aHandler=..., aDescription=..., aMozillaFlags=...) at /tmp/buildd/iceweasel-19.0.2/uriloader/exthandler/unix/nsOSHelperAppService.cpp:1102
        testCommand = {<nsFixedCString> = {<nsCString> = {<nsACString_internal> = {mData = 0x7fff136bb508 "test -n \"$DISPLAY\"", mLength = 18, mFlags = 65553}, <No data fields>}, mFixedCapacity = 63, mFixedBuf = 0x7fff136bb508 "test -n \"$DISPLAY\""}, mStorage = "test -n \"$DISPLAY\"\000Y\354\177\000\000\270\265k\023\377\177\000\000w\303*Y\354\177\000\000\200a[\034\354\377\372\377\270\265k\023\377\177\000\000`P\\\034\354\377\372\377"}
        process = {<nsCOMPtr_base> = {mRawPtr = 0x7febd99e27f0}, <No data fields>}
        file = {<nsCOMPtr_base> = {mRawPtr = 0x7febd03310c0}, <No data fields>}
        exitValue = 0
        args = {0x7fec59a68ddc "-c", 0x7fff136bb508 "test -n \"$DISPLAY\""}
        optionName = {<nsAString_internal> = {mData = 0x7febcee9d5b0, mLength = 4, mFlags = 0}, <No data fields>}
        match = <optimized out>
        end_executable_iter = <optimized out>
        start_option_iter = <optimized out>
        end_optionname_iter = {mStart = <optimized out>, mEnd = <optimized out>, mPosition = 0x7febcee9d5b8}
        equal_sign_iter = <optimized out>
        equalSignFound = false
        semicolon_iter = {mStart = 0x7febcee9d578, mEnd = 0x7febcee9d654, mPosition = 0x7febcee9d5de}
        end_iter = {mStart = 0x7febcee9d578, mEnd = 0x7febcee9d654, mPosition = 0x7febcee9d654}
        majorTypeEnd = {mStart = 0x7febcee9d578, mEnd = 0x7febcee9d654, mPosition = 0x7febcee9d58e}
        minorTypeEnd = {mStart = 0x7febcee9d578, mEnd = 0x7febcee9d654, mPosition = 0x7febcee9d596}
        start_iter = {mStart = 0x7febcee9d578, mEnd = 0x7febcee9d654, mPosition = 0x7febcee9d59a}
        majorTypeStart = {mStart = 0x7febcee9d578, mEnd = 0x7febcee9d654, mPosition = 0x7febcee9d578}
        minorTypeStart = {mStart = 0x7febcee9d578, mEnd = 0x7febcee9d654, mPosition = 0x7febcee9d590}
        rv = 4294966784
        file = {<nsCOMPtr_base> = {mRawPtr = 0x7febcf2a4e80}, <No data fields>}
        mailcapFile = {<nsCOMPtr_base> = {mRawPtr = 0x7febd80267e0}, <No data fields>}
        mailcap = {<nsCOMPtr_base> = {mRawPtr = 0x7febd80267e8}, <No data fields>}
        buffer = {<nsAString_internal> = {mData = 0x7febbfab0d38, mLength = 110, mFlags = 5}, <No data fields>}
        cBuffer = {<nsFixedCString> = {<nsCString> = {<nsACString_internal> = {mData = 0x7fec05eec208 "application/pdf; xpdf '%s'; test=test -n \"$DISPLAY\"; description=Portable Document Format; nametemplate=%s.pdf", mLength = 110, mFlags = 65541}, <No data fields>}, mFixedCapacity = 63, mFixedBuf = 0x7fff136bb4a8 ""}, mStorage = "\000 .mailcap file (for Firefox)", '\000' <repeats 11 times>"\270, \265k\023\377\177\000\000V\204\334Z\354\177\000\000\016\216\246Y\354\177\000"}
        more = true
        entry = {<nsAString_internal> = {mData = 0x7febcee9d578, mLength = 110, mFlags = 5}, <No data fields>}
[...]

My Debian bug report: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=703472
The following is likely closely related too - http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=608856
Yes, the Debian report would be the same thing.
The "test" in mailcap is not really the cause of the bug, but removing the "test" is a workaround to delay the symptoms of the problem.
The problem is that both NSPR and GLib think they are the only libraries to deal with child processes.
Bug 678369 is the way to fix this for Gecko.
Yes, it seems much more general than mailcap things. The bug sometimes occurs at startup, where I have only HTML pages in tabs of the saved session. And the tabs that haven't finished loading are lost when I need to restart again!
Chromehang signature generation was removed a while ago. This bug is
not actionable at this point without reliable STR, so I'm closing it as
incomplete. Feel free to reopen this bug with STR if it still reproduces.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → INCOMPLETE