Bug 681026 - glxtest should wait() for its child to exit
Status: RESOLVED FIXED
Product: Core
Classification: Components
Component: Graphics
Version: Trunk
Platform: x86 OpenBSD
Importance: -- normal
Target Milestone: mozilla9
Assigned To: Benoit Jacob [:bjacob] (mostly away)
Reported: 2011-08-22 13:09 PDT by Landry Breuil (:gaston)
Modified: 2011-09-20 04:16 PDT (History)


Attachments
call GetData in GetShouldAccelerate (4.07 KB, patch)
2011-08-25 18:03 PDT, Benoit Jacob [:bjacob] (mostly away)
joe: review+

Description Landry Breuil (:gaston) 2011-08-22 13:09:39 PDT
As Nigel Taylor found out the hard way while testing Enigmail, glxtest not wait()ing on its child process seems to cause problems with later wait() calls.

For reference, see: http://mozdev.org/pipermail/enigmail/2011-August/014276.html
which leads to http://lists.freebsd.org/pipermail/freebsd-gecko/2011-August/001711.html and a fix which apparently fixes the issue for both people experiencing it in http://lists.freebsd.org/pipermail/freebsd-gecko/2011-August/001716.html
Comment 1 Benjamin Smedberg [:bsmedberg] 2011-08-22 13:41:17 PDT
It should be waiting here: http://mxr.mozilla.org/mozilla-central/source/widget/src/xpwidgets/GfxInfoX11.cpp#107

But I suppose that if we never use nsIGfxInfo in a particular runtime then we'll never wait for it. However, we specifically don't want to wait for the output synchronously at startup, so the proposed patches are not acceptable. We'll need to either fork a thread to do the wait, or postpone it for a bit. In any case this also exposes a bug in enigmail which I hope will be fixed separately, since there are plenty of ways for "unknown" processes to show up in wait(), and that code should also be using waitpid or the equivalent.
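
To illustrate the wait() versus waitpid() point in general terms: a process that has several children can collect one particular child by pid, so that an unrelated child (a helper such as glxtest, say) cannot be mistaken for it. The sketch below is plain POSIX, not Mozilla or Enigmail code, and the function names SpawnChildExiting and ReapSpecificChild are made up for the example.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child that exits immediately with the given exit code.
// Returns the child's pid in the parent, or -1 if fork() failed.
pid_t SpawnChildExiting(int exitCode) {
    pid_t pid = fork();
    if (pid == 0) {
        _exit(exitCode);  // child: leave without running the parent's atexit handlers
    }
    return pid;
}

// Reaps exactly the child identified by pid and returns its exit code.
// Returns -1 if waitpid() failed or the child did not exit normally.
// Unlike wait(), this never consumes the status of some other child.
int ReapSpecificChild(pid_t pid) {
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With two children outstanding, ReapSpecificChild(worker) returns the worker's exit code regardless of which child exited first, whereas a bare wait() would return whichever child happened to be available.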
Comment 2 Benoit Jacob [:bjacob] (mostly away) 2011-08-22 14:30:28 PDT
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #1)
> It should be waiting here:
> http://mxr.mozilla.org/mozilla-central/source/widget/src/xpwidgets/
> GfxInfoX11.cpp#107

Indeed.

> 
> But I suppose that if we never use nsIGfxInfo in a particular runtime then
> we'll never wait for it.

This was bug 677531 and is fixed now in mozilla-central. However, the only negative effect of this bug that I'm aware of was that we had a zombie process staying around, not doing any harm besides using an entry in the process table. I don't understand how this bug could have resulted in the application staying hung on startup.

> However, we specifically don't want to wait for the
> output synchronously at startup, so the proposed patches are not acceptable.

Indeed. The GLXtest process will typically take between 50 ms and 200 ms to run, depending on your driver (Mesa drivers are fastest, NVIDIA is slower, FGLRX is slowest). So we really don't want to have startup waiting for it.

> We'll need to either fork a thread to do the wait, or postpone it for a bit.

I still don't understand why the application is hanging. Do you understand that?
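
On the point about not blocking startup on the glxtest process: one generic way to avoid both a blocking wait and a lingering zombie is to check for the child with WNOHANG and come back later if it is still running. This is only a sketch of that POSIX technique, not what GfxInfoX11.cpp does; ChildState and PollChild are hypothetical names.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum class ChildState { Running, Reaped, Error };

// Checks once, without blocking, whether the child has exited.
// If it has, the child is reaped and *exitCode receives its exit code
// (or -1 if it did not exit normally).
ChildState PollChild(pid_t pid, int* exitCode) {
    int status = 0;
    pid_t r = waitpid(pid, &status, WNOHANG);
    if (r == 0) {
        return ChildState::Running;  // still running; nothing was reaped
    }
    if (r == pid) {
        *exitCode = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
        return ChildState::Reaped;
    }
    return ChildState::Error;  // e.g. ECHILD: not our child, or already reaped
}
```

A caller could invoke PollChild from a timer or an idle callback until it stops returning Running, which keeps the cost on the startup path to a single non-blocking system call.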
Comment 3 Benjamin Smedberg [:bsmedberg] 2011-08-22 14:33:28 PDT
See the second link, it is a bug in enigmail from what I can tell. But it's bad manners to leave zombie processes around, so we should definitely make sure that is fixed. So I think this is a dup?
Comment 4 Benoit Jacob [:bjacob] (mostly away) 2011-08-22 15:17:38 PDT
Is the zombie issue somehow the same issue as this application hang? If yes, then this is indeed a duplicate of bug 677531. Is there a need to take it in Aurora or in Beta?
Comment 5 Patrick Brunschwig 2011-08-22 23:41:55 PDT
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #3)
> See the second link, it is a bug in enigmail from what I can tell.

"Kill" in Enigmail (or better ipc-pipe) does essentially this:

status = PR_KillProcess(mProcess);
if (status != PR_SUCCESS) {
   ...
   return;
}
status = PR_WaitProcess(mProcess, &mExitCode); // the hang is most likely here

Is this in any way incorrect?
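
For comparison, the same kill-then-wait sequence written directly against POSIX might look like the sketch below. It is not the ipc-pipe or NSPR implementation, and KillAndReap is a hypothetical name; the point is only that the wait targets the specific pid that was just signalled.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Sends SIGKILL to the child, then blocks until that specific child has
// been reaped. On success returns true and stores the raw wait status,
// which the caller can decode with WIFSIGNALED()/WTERMSIG().
// This minimal version reports an interrupted wait (EINTR) as failure
// instead of retrying.
bool KillAndReap(pid_t pid, int* status) {
    if (kill(pid, SIGKILL) != 0) {
        return false;  // e.g. ESRCH: no such process
    }
    return waitpid(pid, status, 0) == pid;
}
```

Because waitpid() is given the pid explicitly, the call returns as soon as that child has been collected and is not affected by other unreaped children of the same process.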
Comment 6 Benoit Jacob [:bjacob] (mostly away) 2011-08-23 08:04:45 PDT
I propose we approach the problem from the other side: does the problem persist with current mozilla-central? If you can't use mozilla-central easily, can you at least check if the patch from bug 677531 fixes it?
Comment 7 Benjamin Smedberg [:bsmedberg] 2011-08-23 08:16:03 PDT
Patrick, I'm just looking at this quote: "Having looked further it seems that the call is actually stuck in the following sequence: nsIPCService::RunPipe -> nsPipeTransport::Terminate -> nsPipeTransport::Kill -> IPC_WaitProcess == _MD_WaitUnixProcess

In the _MD_WaitUnixProcess the following path is taken: FindPidTable returns NULL
and the thread gets stuck forever on PR_WaitCondVar(pRec->reapedCV, ...)"

Maybe this is an NSPR bug. The "waitpid thread" may be notifying the monitor in a racy way, I'm not sure. In any case, this should be fixed on trunk by the other bug.
Comment 8 nigel 2011-08-23 11:39:28 PDT
(In reply to Benoit Jacob [:bjacob] from comment #6)
> I propose we approach the problem from the other side: does the problem
> persist with current mozilla-central? If you can't use mozilla-central
> easily, can you at least check if the patch from bug 677531 fixes it?

I have rebuilt thunderbird 6.0 removing the patch in
http://lists.freebsd.org/pipermail/freebsd-gecko/2011-August/001711.html
and using the patch in bug 677531 in its place.

The problem has returned.
Comment 9 Benoit Jacob [:bjacob] (mostly away) 2011-08-23 12:01:05 PDT
(In reply to nigel from comment #8)
> I have rebuilt thunderbird 6.0 removing the patch in
> http://lists.freebsd.org/pipermail/freebsd-gecko/2011-August/001711.html
> and using the patch in bug 677531 in its place.
> 
> The problem has returned.

Can you please set a breakpoint in nsBaseWidget::GetShouldAccelerate(), in widget/src/xpwidgets/nsBaseWidget.cpp, at the place where we call gfxInfo->GetFeatureStatus? It's currently at

http://hg.mozilla.org/mozilla-central/file/8f2530ae725a/widget/src/xpwidgets/nsBaseWidget.cpp#l838

Is this line of code hit?
 - if no, then that's the bug. Why is it not hit? It's always hit in Firefox, why not in other apps? Is this because other apps don't create a window, and this code is only run when creating a window? In this case, can you point me to another line of code in mozilla-central, that's guaranteed to always be hit on startup and that happens as late as possible just before a browser like Firefox would want to start creating windows and drawing stuff?
Comment 10 nigel 2011-08-25 06:50:47 PDT
(In reply to Benoit Jacob [:bjacob] from comment #9)
> Can you please set a breakpoint in nsBaseWidget::GetShouldAccelerate(), in
> widget/src/xpwidgets/nsBaseWidget.cpp, at the place where we call
> gfxInfo->GetFeatureStatus? It's currently at
> 
> http://hg.mozilla.org/mozilla-central/file/8f2530ae725a/widget/src/xpwidgets/
> nsBaseWidget.cpp#l838
> 
> Is this line of code hit?
>  - if no, then that's the bug. Why is it not hit? It's always hit in
> Firefox, why not in other apps? Is this because other apps don't create a
> window, and this code is only run when creating a window? In this case, can
> you point me to another line of code in mozilla-central, that's guaranteed
> to always be hit on startup and that happens as late as possible just before
> a browser like Firefox would want to start creating windows and drawing
> stuff?

The line of code is being hit.

[Switching to process 17558, thread 0x2032c0000]

Breakpoint 2, nsBaseWidget::GetShouldAccelerate (this=0x205bdb400)
    at /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/widget/src/xpwidgets/nsBaseWidget.cpp:838
838         if (NS_SUCCEEDED(gfxInfo->GetFeatureStatus(nsIGfxInfo::FEATURE_OPENGL_LAYERS, &status))) {
Comment 11 Benoit Jacob [:bjacob] (mostly away) 2011-08-25 12:17:42 PDT
(In reply to nigel from comment #10)
> The line of code is being hit.
> 
> [Switching to process 17558, thread 0x2032c0000]
> 
> Breakpoint 2, nsBaseWidget::GetShouldAccelerate (this=0x205bdb400)
>     at
> /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/widget/src/xpwidgets/
> nsBaseWidget.cpp:838
> 838         if
> (NS_SUCCEEDED(gfxInfo->GetFeatureStatus(nsIGfxInfo::FEATURE_OPENGL_LAYERS,
> &status))) {


Holy smokes! That's quite a surprise. I was 99% sure that it wouldn't be hit, given your problem. Can you please step through this code to see what happens? In particular:

1) in GfxInfoX11.cpp, in GetFeatureStatusImpl, is the GetData() call reached?
http://hg.mozilla.org/mozilla-central/file/d0700ba932b4/widget/src/xpwidgets/GfxInfoX11.cpp#l252

2) in GetData(), is the waitpid() call reached?
http://hg.mozilla.org/mozilla-central/file/d0700ba932b4/widget/src/xpwidgets/GfxInfoX11.cpp#l107

Thanks a lot for debugging this.
Comment 12 nigel 2011-08-25 16:11:39 PDT
(In reply to Benoit Jacob [:bjacob] from comment #11)
> Holy smokes! That's quite a surprise. I was 99% sure that it wouldn't be
> hit, given your problem. Can you please step through this code to see what
> happens? In particular:
> 
> 1) in GfxInfoX11.cpp, in GetFeatureStatusImpl, is the GetData() call reached?
> http://hg.mozilla.org/mozilla-central/file/d0700ba932b4/widget/src/xpwidgets/
> GfxInfoX11.cpp#l252
> 
> 2) in GetData(), is the waitpid() call reached?
> http://hg.mozilla.org/mozilla-central/file/d0700ba932b4/widget/src/xpwidgets/
> GfxInfoX11.cpp#l107
> 
> Thanks a lot for debugging this.

Neither breakpoint was hit. I set them at lines 105 and 238, which is where the waitpid() and GetData() calls are in my source tree.

$ thunderbird -g
/usr/local/lib/thunderbird-6.0/run-mozilla.sh -g /usr/local/lib/thunderbird-6.0/thunderbird-bin
MOZILLA_FIVE_HOME=/usr/local/lib/thunderbird-6.0
  LD_LIBRARY_PATH=/usr/local/lib/thunderbird-6.0:/usr/local/lib/thunderbird-6.0/plugins:/usr/local/lib/thunderbird-6.0
DISPLAY=:0
DYLD_LIBRARY_PATH=/usr/local/lib/thunderbird-6.0:/usr/local/lib/thunderbird-6.0
     LIBRARY_PATH=
       SHLIB_PATH=/usr/local/lib/thunderbird-6.0:/usr/local/lib/thunderbird-6.0
          LIBPATH=/usr/local/lib/thunderbird-6.0:/usr/local/lib/thunderbird-6.0
       ADDON_PATH=
      MOZ_PROGRAM=/usr/local/lib/thunderbird-6.0/thunderbird-bin
      MOZ_TOOLKIT=
        moz_debug=1
     moz_debugger=
moz_debugger_args=
/usr/bin/gdb  --args /usr/local/lib/thunderbird-6.0/thunderbird-bin
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-unknown-openbsd5.0"...
(gdb) b GfxInfoX11.cpp:105
No source file named GfxInfoX11.cpp.
Make breakpoint pending on future shared library load? (y or [n]) y

Breakpoint 1 (GfxInfoX11.cpp:105) pending.
(gdb) b GfxInfoX11.cpp:238
No source file named GfxInfoX11.cpp.
Make breakpoint pending on future shared library load? (y or [n]) y

Breakpoint 2 (GfxInfoX11.cpp:238) pending.
(gdb) run
Starting program: /usr/local/lib/thunderbird-6.0/thunderbird-bin
Breakpoint 3 at 0x210f998ce: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/widget/src/xpwidgets/GfxInfoX11.cpp, line 105.
Pending breakpoint "GfxInfoX11.cpp:105" resolved
Breakpoint 4 at 0x210f9a00c: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/widget/src/xpwidgets/GfxInfoX11.cpp, line 238.
Pending breakpoint "GfxInfoX11.cpp:238" resolved
nsStringStats
 => mAllocCount:              5
 => mReallocCount:            2
 => mFreeCount:               3  --  LEAKED 2 !!!
 => mShareCount:              1
 => mAdoptCount:              0
 => mAdoptFreeCount:          0
nsNativeModuleLoader::LoadModule("/usr/local/lib/thunderbird-6.0/components/libmozgnome.so.19.0") - load FAILED, rv: 80004005, error:
        File not found
nsNativeModuleLoader::LoadModule("/usr/local/lib/thunderbird-6.0/components/libipc.so.19.0") - load FAILED, rv: 80004005, error:
        File not found
nsNativeModuleLoader::LoadModule("/usr/local/lib/thunderbird-6.0/components/libenigmime.so.19.0") - load FAILED, rv: 80004005, error:
        File not found
WARNING: Re-registering a CID?: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/xpcom/components/nsComponentManager.cpp, line 467
WARNING: Re-registering a CID?: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/xpcom/components/nsComponentManager.cpp, line 467
++DOCSHELL 0x207a86800 == 1
WARNING: NS_ENSURE_TRUE(aURI) failed: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/content/base/src/nsContentUtils.cpp, line 5066
++DOMWINDOW == 1 (0x2020a9878) [serial = 1] [outer = 0x0]
enigmail.js: Registered components
++DOCSHELL 0x20819a800 == 2
++DOMWINDOW == 2 (0x2093b5c78) [serial = 2] [outer = 0x0]
WARNING: Subdocument container has no content: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/layout/base/nsDocumentViewer.cpp, line 2398
++DOMWINDOW == 3 (0x2221d4478) [serial = 3] [outer = 0x2093b5c00]
WARNING: Subdocument container has no content: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/layout/base/nsDocumentViewer.cpp, line 2398
++DOMWINDOW == 4 (0x207f95478) [serial = 4] [outer = 0x2020a9800]
WARNING: Subdocument container has no content: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/layout/base/nsDocumentViewer.cpp, line 2398
++DOCSHELL 0x20a24cc00 == 3
++DOMWINDOW == 5 (0x20a24c478) [serial = 5] [outer = 0x0]
++DOCSHELL 0x20c2f9400 == 4
++DOMWINDOW == 6 (0x20c2f9878) [serial = 6] [outer = 0x0]
++DOCSHELL 0x20c34c000 == 5
++DOMWINDOW == 7 (0x2054eb078) [serial = 7] [outer = 0x0]
++DOCSHELL 0x223818000 == 6
++DOMWINDOW == 8 (0x223818878) [serial = 8] [outer = 0x0]
++DOCSHELL 0x203ed4800 == 7
++DOMWINDOW == 9 (0x223837478) [serial = 9] [outer = 0x0]
++DOCSHELL 0x208496800 == 8
++DOMWINDOW == 10 (0x208496c78) [serial = 10] [outer = 0x0]
++DOMWINDOW == 11 (0x202ba5478) [serial = 11] [outer = 0x223837400]
WARNING: NS_ENSURE_TRUE(aURI) failed: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/content/base/src/nsContentUtils.cpp, line 5066
WARNING: NS_ENSURE_TRUE(aURI) failed: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/content/base/src/nsContentUtils.cpp, line 5066
WARNING: NS_ENSURE_TRUE(aURI) failed: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/content/base/src/nsContentUtils.cpp, line 5066
WARNING: NS_ENSURE_TRUE(aURI) failed: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/content/base/src/nsContentUtils.cpp, line 5066
WARNING: NS_ENSURE_TRUE(aURI) failed: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/content/base/src/nsContentUtils.cpp, line 5066
++DOMWINDOW == 12 (0x208551478) [serial = 12] [outer = 0x20a24c400]
WARNING: Subdocument container has no frame: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/layout/base/nsDocumentViewer.cpp, line 2418
++DOMWINDOW == 13 (0x2020fbc78) [serial = 13] [outer = 0x20c2f9800]
++DOMWINDOW == 14 (0x20cf62c78) [serial = 14] [outer = 0x2054eb000]
WARNING: Subdocument container has no frame: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/layout/base/nsDocumentViewer.cpp, line 2418
++DOMWINDOW == 15 (0x20cf9e078) [serial = 15] [outer = 0x223818800]
++DOMWINDOW == 16 (0x2085a5c78) [serial = 16] [outer = 0x223837400]
++DOMWINDOW == 17 (0x20cff4c78) [serial = 17] [outer = 0x208496c00]
WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80550013: file nsLocalMailFolder.cpp, line 300
WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80550013: file nsLocalMailFolder.cpp, line 300
WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80550013: file nsLocalMailFolder.cpp, line 300
WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80550013: file nsLocalMailFolder.cpp, line 300
WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80550013: file nsLocalMailFolder.cpp, line 300
WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80550013: file nsLocalMailFolder.cpp, line 300
JavaScript strict warning: chrome://messenger-smime/content/msgHdrViewSMIMEOverlay.js, line 176: reference to undefined property msgWindow.msgHeaderSink
JavaScript strict warning: chrome://messenger-smime/content/msgHdrViewSMIMEOverlay.js, line 176: reference to undefined property msgWindow.msgHeaderSink
JavaScript strict warning: chrome://messenger-smime/content/msgHdrViewSMIMEOverlay.js, line 176: reference to undefined property msgWindow.msgHeaderSink
[New process 6305]
++DOMWINDOW == 18 (0x20863d878) [serial = 18] [outer = 0x20a24c400]
--DOMWINDOW == 17 (0x202ba5478) [serial = 11] [outer = 0x223837400] [url = about:blank]
--DOMWINDOW == 16 (0x208551478) [serial = 12] [outer = 0x20a24c400] [url = about:blank]
[New process 6305, thread 0x2017d6000]
Failed to load jar:file:///usr/local/lib/thunderbird-6.0/omni.jar!/chrome/messenger/content/messenger/AccountManager.js
WARNING: redundant multiplexed document?: 'docMapEntry->mString == nsnull', file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/xpcom/io/nsFastLoadFile.cpp, line 1338
WARNING: NS_ENSURE_SUCCESS(rv, 0) failed with result 0x8000FFFF: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/content/base/src/nsContentUtils.cpp, line 2883
WARNING: NS_ENSURE_TRUE(pusher.Push(aBoundElement)) failed: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/content/xbl/src/nsXBLProtoImplMethod.cpp, line 327
++DOMWINDOW == 17 (0x20e119078) [serial = 19] [outer = 0x20a24c400]
[New process 6305, thread 0x20e0f5800]
WARNING: OpenGL-accelerated layers are not supported on this system.: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/widget/src/xpwidgets/nsBaseWidget.cpp, line 852
[New process 6305, thread 0x20a7db000]
++DOCSHELL 0x20d12d800 == 9
++DOMWINDOW == 18 (0x20d12dc78) [serial = 20] [outer = 0x0]
WARNING: Subdocument container has no content: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/layout/base/nsDocumentViewer.cpp, line 2398
++DOMWINDOW == 19 (0x20f216478) [serial = 21] [outer = 0x20d12dc00]
WARNING: Subdocument container has no content: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/layout/base/nsDocumentViewer.cpp, line 2398
WARNING: OpenGL-accelerated layers are not supported on this system.: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/widget/src/xpwidgets/nsBaseWidget.cpp, line 852
[New process 6305, thread 0x200d4d800]
WARNING: NS_ENSURE_SUCCESS(rv, 0) failed with result 0x8000FFFF: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/content/base/src/nsContentUtils.cpp, line 2883
WARNING: NS_ENSURE_TRUE(pusher.Push(aBoundElement)) failed: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/content/xbl/src/nsXBLProtoImplMethod.cpp, line 327
WARNING: NS_ENSURE_SUCCESS(rv, 0) failed with result 0x8000FFFF: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/content/base/src/nsContentUtils.cpp, line 2883
WARNING: NS_ENSURE_TRUE(pusher.Push(aBoundElement)) failed: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/content/xbl/src/nsXBLProtoImplMethod.cpp, line 327
--DOCSHELL 0x20d12d800 == 8
[New process 6305, thread 0x20c412800]
--DOMWINDOW == 18 (0x20d12dc78) [serial = 20] [outer = 0x0] [url = chrome://global/content/commonDialog.xul]
2011-08-25 23:40:51.135 [DEBUG] enigmailMessengerOverlay.js: updateOptionsDisplay:
2011-08-25 23:40:51.138 [DEBUG] commonFuncs.jsm: collapseAdvanced:
WARNING: OpenGL-accelerated layers are not supported on this system.: file /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/widget/src/xpwidgets/nsBaseWidget.cpp, line 852
--DOMWINDOW == 17 (0x20f216478) [serial = 21] [outer = 0x0] [url = about:blank]
2011-08-25 23:40:57.638 [DEBUG] enigmailCommon.js: prefWindow
2011-08-25 23:40:57.646 [DEBUG] enigmailCommon.js: this.enigmailSvc = [xpconnect wrapped nsIEnigmail @ 0x2064b8c80 (native @ 0x222fcc7a0)]
2011-08-25 23:40:57.646 [DEBUG] enigmailCommon.jsm: getVersion
JavaScript strict warning: resource://enigmail/enigmailCommon.jsm, line 249: reference to undefined property Components.classes['@mozilla.org/extensions/manager;1']
2011-08-25 23:40:57.647 [DEBUG] enigmailCommon.jsm: installed version: 1.3
2011-08-25 23:40:57.648 [DEBUG] enigmail.js: Enigmail.initialize: START
2011-08-25 23:40:57.652 [ERROR] enigmail.js: CreateFileStream: Failed to create /home/njtayl01/enigmail_log/enigdbug.txt
2011-08-25 23:40:57.652 [DEBUG] enigmail.js: Logging debug output to /home/njtayl01/enigmail_log/enigdbug.txt
2011-08-25 23:40:57.652 [DEBUG] enigmail.js: Enigmail version 1.3
2011-08-25 23:40:57.652 [DEBUG] enigmail.js: OS/CPU=OpenBSD amd64
2011-08-25 23:40:57.652 [DEBUG] enigmail.js: Platform=X11
2011-08-25 23:40:57.653 [DEBUG] enigmail.js: composeSecure=true
2011-08-25 23:40:57.656 [DEBUG] enigmail.js: Enigmail.initialize: gEnvList = DISPLAY=:0,HOME=/home/ntayl01,LOGNAME=ntayl01,LD_LIBRARY_PATH=/usr/local/lib/thunderbird-6.0:/usr/local/lib/thunderbird-6.0/plugins:/usr/local/lib/thunderbird-6.0,MOZILLA_FIVE_HOME=/usr/local/lib/thunderbird-6.0,PATH=/home/ntayl01/bin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/X11R6/bin:/usr/local/bin:/usr/local/sbin:/usr/games:.,SHELL=/bin/ksh,USER=ntayl01
2011-08-25 23:40:57.658 [DEBUG] enigmail.js: ResolvePath: filePath=gpg
2011-08-25 23:40:57.663 [CONSOLE] EnigmailAgentPath=/usr/local/bin/gpg
(Hung)

^C
Program received signal SIGINT, Interrupt.
0x000000020e67b78a in poll () from /usr/lib/libc.so.60.1
(gdb) info b
Num Type           Disp Enb Address            What
3   breakpoint     keep y   0x0000000210f998ce in mozilla::widget::GfxInfo::GetData()
                                               at /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/widget/src/xpwidgets/GfxInfoX11.cpp:105
4   breakpoint     keep y   0x0000000210f9a00c in mozilla::widget::GfxInfo::GetFeatureStatusImpl(int, int*, nsAString_internal&, mozilla::widget::GfxDriverInfo*)
                                               at /usr/ports/pobj/thunderbird-6.0/comm-release/mozilla/widget/src/xpwidgets/GfxInfoX11.cpp:238
(gdb) list GfxInfoX11.cpp:105
100         int glxtest_status = 0;
101         bool wait_for_glxtest_process = true;
102         bool waiting_for_glxtest_process_failed = false;
103         while(wait_for_glxtest_process) {
104             wait_for_glxtest_process = false;
105             if (waitpid(glxtest_pid, &glxtest_status, 0) == -1) {
106                 if (errno == EINTR)
107                     wait_for_glxtest_process = true;
108                 else
109                     waiting_for_glxtest_process_failed = true;
(gdb) list GfxInfoX11.cpp:238
233     }
234
235     nsresult
236     GfxInfo::GetFeatureStatusImpl(PRInt32 aFeature, PRInt32 *aStatus, nsAString & aSuggestedDriverVersion, GfxDriverInfo* aDriverInfo /* = nsnull */)
237     {
238         GetData();
239         *aStatus = nsIGfxInfo::FEATURE_NO_INFO;
240         aSuggestedDriverVersion.SetIsVoid(PR_TRUE);
241
242     #ifdef MOZ_PLATFORM_MAEMO
(gdb)
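
The listing at line 105 above retries waitpid() when it fails with EINTR, using flag variables to drive the loop. The same idea can be written as a small standalone helper; this is a sketch for illustration, and WaitForChild is a hypothetical name rather than anything in GfxInfoX11.cpp.

```cpp
#include <cerrno>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Blocks until the given child has been reaped, retrying whenever the
// call is interrupted by a signal (EINTR). Returns true if the child was
// reaped and *status was filled in; returns false on any other
// waitpid() failure, leaving errno set by waitpid().
bool WaitForChild(pid_t pid, int* status) {
    for (;;) {
        if (waitpid(pid, status, 0) != -1) {
            return true;
        }
        if (errno != EINTR) {
            return false;
        }
        // Interrupted by a signal before the child changed state: try again.
    }
}
```

A second call for the same pid returns false with errno set to ECHILD, since a child's exit status can be collected only once.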
Comment 13 Benoit Jacob [:bjacob] (mostly away) 2011-08-25 16:41:35 PDT
Great, thanks! So, you hit GetFeatureStatus but not GetFeatureStatusImpl. That's strange, as the former is supposed to call the latter. I'm looking at the code; meanwhile it would be really helpful if you could set your breakpoint again on the GetFeatureStatus call that's actually reached, and step from there until GetFeatureStatus returns, to understand why GetFeatureStatusImpl is not reached.
Comment 14 Benoit Jacob [:bjacob] (mostly away) 2011-08-25 16:53:43 PDT
Ah, here's what's quite likely the cause of the problem. GetFeatureStatus is implemented here:

http://hg.mozilla.org/mozilla-central/file/49884897bb5c/widget/src/xpwidgets/GfxInfoBase.cpp#l548

As you can see, before calling GetFeatureStatusImpl, this function checks if the status is already cached as a pref. (This is known to be a wrong thing to do, see bug 653102).

Can you please try reproducing this bug with a clean profile, or at least resetting all your gfx.blacklist.* preferences? If my theory is correct, that will cause GetFeatureStatusImpl to be reached, fixing your problem.

Meanwhile let me patch this so we always call GetData().
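
The failure mode described in this comment, where a cached answer short-circuits a function whose implementation also carries a necessary side effect, can be shown in miniature. The class below is a hypothetical stand-in and none of its names come from the Mozilla tree; mCleanupRan plays the role of "the helper process was waited for".

```cpp
#include <map>

// Hypothetical illustration only: a status lookup with a cache in front
// of an implementation that also performs a one-time cleanup.
class FeatureInfo {
public:
    // Public entry point: consults the cache first and falls back to
    // ComputeStatus() only on a cache miss.
    int GetStatus(int feature) {
        auto it = mCache.find(feature);
        if (it != mCache.end()) {
            return it->second;  // cache hit: ComputeStatus() never runs
        }
        int status = ComputeStatus(feature);
        mCache[feature] = status;
        return status;
    }

    // Simulates a value persisted by an earlier run (think: a stored pref).
    void SeedCache(int feature, int status) { mCache[feature] = status; }

    bool CleanupRan() const { return mCleanupRan; }

private:
    int ComputeStatus(int /*feature*/) {
        mCleanupRan = true;  // side effect that only happens on this path
        return 0;
    }

    std::map<int, int> mCache;
    bool mCleanupRan = false;
};
```

With an empty cache, GetStatus() runs the cleanup; with a seeded cache it returns the stored value and the cleanup is silently skipped, which matches the behaviour a clean profile versus a profile with stored gfx.blacklist.* prefs would be expected to show under this theory.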
Comment 15 Benoit Jacob [:bjacob] (mostly away) 2011-08-25 18:03:15 PDT
Created attachment 555909 [details] [diff] [review]
call GetData in GetShouldAccelerate

This patch makes GetShouldAccelerate directly call GetData, just before the place where we call GetFeatureStatus, which we know is reached.

Initially I considered instead calling GetData from GfxInfo::Init() but that turned out to be a bad idea: Init() is called by the factory constructor, which is called significantly earlier in the startup process. We want to call GetData as late as possible, just when we need it, to maximize the chances that the glxtest process has already finished by the time we waitpid() it, so that we don't end up wasting time waiting for it.
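
The approach described here depends on the reaping call being cheap to repeat, so that it can be placed on any code path known to run. A minimal sketch of that property, assuming a helper that owns a single child pid, is given below; LazyChildReaper and EnsureReaped are hypothetical names and this is not the attached patch.

```cpp
#include <cerrno>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: owns one child pid and reaps it at most once,
// however many times EnsureReaped() is called.
class LazyChildReaper {
public:
    explicit LazyChildReaper(pid_t pid) : mPid(pid) {}

    // The first call blocks until the child has exited (retrying on
    // EINTR); later calls return the remembered result immediately.
    bool EnsureReaped() {
        if (mDone) {
            return mOk;
        }
        mDone = true;
        pid_t r;
        do {
            r = waitpid(mPid, &mStatus, 0);
        } while (r == -1 && errno == EINTR);
        mOk = (r == mPid);
        return mOk;
    }

    // Raw wait status; meaningful only after EnsureReaped() returned true.
    int Status() const { return mStatus; }

private:
    pid_t mPid;
    int mStatus = 0;
    bool mDone = false;
    bool mOk = false;
};
```

Calling EnsureReaped() as late as possible gives the child the most time to finish on its own, so the first call is more likely to return without actually blocking.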
Comment 16 nigel 2011-08-25 18:57:27 PDT
(In reply to Benoit Jacob [:bjacob] from comment #15)
> Created attachment 555909 [details] [diff] [review]
> call GetData in GetShouldAccelerate
> 
> This patch makes GetShouldAccelerate directly call GetData, just before the
> place where we call GetFeatureStatus, which we know is reached.
> 
> Initially I considered instead calling GetData from GfxInfo::Init() but that
> turned out to be a bad idea: Init() is called by the factory constructor,
> which is called significantly earlier in the startup process. We want to
> call GetData as late as possible, just when we need it, to maximize chances
> that the glxtest process be already finished by the time we waitpid() it, so
> that we don't end up wasting time waiting for it.

I went and looked at a backup taken when switching from thunderbird-5 to thunderbird-6; its prefs.js had no gfx.blacklist.* entries. I removed them manually from the current prefs.js with an editor. That seems to have cleared the hanging. The question now is how they got there. I stumbled on the fact that Tools/Add-on was adding the gfx.blacklist.* preferences.

I used a different account with a clean profile and tried the same Tools/Add-On step, which didn't do anything. I tried a few other things, then managed to get the gfx.blacklist.* preferences set by changing the printer page setting from US letter to A4. After exiting and restarting, selecting Enigmail preferences hung Thunderbird.
Comment 17 nigel 2011-08-25 19:05:32 PDT
(In reply to nigel from comment #16)

Oops - that was meant to be a reply to the earlier comment, comment 14. I will try the diff out tomorrow.
Comment 18 Benoit Jacob [:bjacob] (mostly away) 2011-08-25 19:25:06 PDT
Pushed to tryserver: http://tbpl.allizom.org/?tree=Try&usebuildbot=1&rev=0d4eb7bd5719

When the builds are complete (in ~2 hours) they will be available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/bjacob@mozilla.com-0d4eb7bd5719
Comment 19 Mozilla RelEng Bot 2011-08-25 22:30:50 PDT
Try run for 0d4eb7bd5719 is complete.
Detailed breakdown of the results available here:
    http://tbpl.allizom.org/?tree=Try&usebuildbot=1&rev=0d4eb7bd5719
Results (out of 2 total builds):
    success: 2
Builds available at http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/bjacob@mozilla.com-0d4eb7bd5719
Comment 20 nigel 2011-08-26 12:17:26 PDT
(In reply to Benoit Jacob [:bjacob] from comment #15)
> Created attachment 555909 [details] [diff] [review]
> call GetData in GetShouldAccelerate
> 
> This patch makes GetShouldAccelerate directly call GetData, just before the
> place where we call GetFeatureStatus, which we know is reached.
> 
> Initially I considered instead calling GetData from GfxInfo::Init() but that
> turned out to be a bad idea: Init() is called by the factory constructor,
> which is called significantly earlier in the startup process. We want to
> call GetData as late as possible, just when we need it, to maximize chances
> that the glxtest process be already finished by the time we waitpid() it, so
> that we don't end up wasting time waiting for it.

I built a new version of the Thunderbird and Enigmail packages. I first ran the older installed version of Thunderbird, and it hung. I then uninstalled the Thunderbird and Enigmail packages and installed the new versions. With the new Thunderbird, opening Enigmail preferences and viewing e-mails, which used to hang, no longer did. Is this sufficient, or do I need to run and check with breakpoints again?
Comment 21 Benoit Jacob [:bjacob] (mostly away) 2011-08-26 12:27:59 PDT
Thanks a lot, I think that's conclusive enough to say that the patch fixes the bug.
Comment 22 Benoit Jacob [:bjacob] (mostly away) 2011-09-07 14:21:25 PDT
http://hg.mozilla.org/integration/mozilla-inbound/rev/5e90688c720e
Comment 23 Phil Ringnalda (:philor) 2011-09-07 15:23:38 PDT
Backed out for Windows burning like https://tbpl.mozilla.org/php/getParsedLog.php?id=6318404
Comment 24 Daniel Holbert [:dholbert] 2011-09-07 15:37:16 PDT
Comment on attachment 555909 [details] [diff] [review]
call GetData in GetShouldAccelerate

>diff --git a/widget/public/nsIGfxInfo.idl b/widget/public/nsIGfxInfo.idl
>+  [notxpcom] void GetData();
> };

>diff --git a/widget/src/xpwidgets/GfxInfoBase.h b/widget/src/xpwidgets/GfxInfoBase.h
>+  // only useful on X11
>+  virtual void GetData() {}

I think this (in the .h file) needs to be declared as NS_IMETHOD (and any existing impls in .cpp files need to be labeled as NS_IMETHODIMP).
Comment 25 Benoit Jacob [:bjacob] (mostly away) 2011-09-18 17:40:57 PDT
http://hg.mozilla.org/integration/mozilla-inbound/rev/432a30ebd148
Comment 26 Benoit Jacob [:bjacob] (mostly away) 2011-09-18 20:24:02 PDT
Backed out from inbound:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b916b514a499

and re-landed on inbound:
https://hg.mozilla.org/integration/mozilla-inbound/rev/bb708067dd57
