Closed Bug 245820 Opened 20 years ago Closed 20 years ago

event queue disappearing on SMP Linux(?)

Categories

(Thunderbird :: General, defect)

x86
Linux
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 234620

People

(Reporter: mozbugs, Assigned: mscott)

Details

(Keywords: crash)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7) Gecko/20040606 Firefox/0.8.0+
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7) Gecko/20040606 Firefox/0.8.0+

Recent thunderbird builds from the aviary branch on Linux (RHEL3) have been
crashing often (1-2 times per day). Having just built with debug on, here's the
first trace:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1568760912 (LWP 26214)]
0x00000011 in ?? ()
(gdb) where
#0  0x00000011 in ?? ()
#1  0xb747a699 in nsSupportsArray::ElementAt (this=0x81951c0, aIndex=0) at
nsSupportsArray.cpp:327
#2  0xb747b36b in nsSupportsArray::GetElementAt (this=0x81951c0, aIndex=0,
result=0xa27e9500) at nsSupportsArray.h:65
#3  0xb7472ec5 in ObserverListEnumerator::GetNext (this=0x83a8548,
aResult=0xa27e9500) at nsObserverList.cpp:195
#4  0xb7474261 in nsObserverService::NotifyObservers (this=0x81ae228,
aSubject=0x8ea6f58, aTopic=0xb75274de "nsIEventQueueDestroyed", someData=0x0) at
nsObserverService.cpp:205
#5  0xb74d7595 in nsEventQueueImpl::NotifyObservers (this=0x8ea6f58,
aTopic=0xb75274de "nsIEventQueueDestroyed") at nsEventQueue.cpp:228
#6  0xb74d6fa4 in ~nsEventQueueImpl (this=0x8ea6f58) at nsEventQueue.cpp:138
#7  0xb74d730d in nsEventQueueImpl::Release (this=0x8ea6f58) at nsEventQueue.cpp:195
#8  0xb7463c56 in ~nsCOMPtr (this=0x964a82c) at nsCOMPtr.h:509
#9  0xb74dfd2c in ~nsProxyObjectCallInfo (this=0x964a810) at nsProxyEvent.cpp:109
#10 0xb74e0eb8 in nsProxyObject::Post (this=0x97541a0, methodIndex=36,
methodInfo=0x857fe20, params=0xa27e9730, interfaceInfo=0x8578d88) at
nsProxyEvent.cpp:499
#11 0xb74e480e in nsProxyEventObject::CallMethod (this=0x8493b70,
methodIndex=36, info=0x857fe20, params=0xa27e9730) at nsProxyEventObject.cpp:546
#12 0xb7503642 in PrepareAndDispatch (methodIndex=36, self=0x8493b70,
args=0xa27e97d4) at xptcstubs_gcc_x86_unix.cpp:100
#13 0xb4bc9802 in nsImapProtocol::ReleaseUrlState (this=0xa6c80688) at
nsImapProtocol.cpp:897
#14 0xb4bcb44d in nsImapProtocol::ProcessCurrentURL (this=0xa6c80688) at
nsImapProtocol.cpp:1418
#15 0xb4bca731 in nsImapProtocol::ImapThreadMainLoop (this=0xa6c80688) at
nsImapProtocol.cpp:1163
#16 0xb4bc998e in nsImapProtocol::Run (this=0xa6c80688) at nsImapProtocol.cpp:931
#17 0xb74dac6a in nsThread::Main (arg=0xac0ce7d8) at nsThread.cpp:118
#18 0xb73d502e in _pt_root (arg=0xac078fb0) at ptthread.c:214
#19 0xb738adec in start_thread () from /lib/tls/libpthread.so.0
#20 0xb6d73e8a in clone () from /lib/tls/libc.so.6
(gdb) 

I wasn't interacting with thunderbird when this happened - I was working in
another virtual desktop, and returned to find it dead ...


Reproducible: Didn't try
I have not seen this - it looks like some problem with the event queue going
away - I don't know why it would go away other than a ref-counting problem.
Hmm, it looks like there are a few talkback reports with similar imap-based
stack traces:

http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=1&searchby=stacksig&match=contains&searchfor=nsSupportsArray%3A%3AElementAt
Another event-queue-ish looking crash. This happened overnight. I've noticed
that thunderbird has been crashing consistently overnight for the last week
or so.


###!!! ASSERTION: View is hidden but widget is visible!: '!visible', file
nsViewManager.cpp, line 1662
Break: at file nsViewManager.cpp, line 1662
[Thread -1344730192 (LWP 29234) exited]
[Thread -1453507664 (LWP 29232) exited]
###!!! ASSERTION: You can't dereference a NULL nsRefPtr with operator->().:
'mRawPtr != 0', file ../../../dist/include/xpcom/nsAutoPtr.h, line 1041
Break: at file ../../../dist/include/xpcom/nsAutoPtr.h, line 1041
Program received signal SIGSEGV, Segmentation fault.
0xb7463f56 in nsCOMPtr<nsIEventQueue>::get (this=0x8) at nsCOMPtr.h:693
        in nsCOMPtr.h
(gdb) cont
                                                                               
                                                    
Program /export/stuff/mozilla/tb-debug/mozilla/dist/bin/thunderbird-bin (pid =
26735) received signal 11.
Stack:
nsProfileLock::FatalSignalHandler(int)+0x00000137
[/export/stuff/mozilla/tb-debug/mozilla/dist/bin/thunderbird-bin +0x00029E61]
UNKNOWN [/lib/tls/libpthread.so.0 +0x0000AD28]
nsCOMPtr<nsIEventQueue>::operator nsDerivedSafe<nsIEventQueue>*()
const+0x0000001E [/export/stuff/mozilla/tb-debug/mozilla/dist/bin/libxpcom.so
+0x00073E98]
nsProxyObject::Post(unsigned, nsXPTMethodInfo*, nsXPTCMiniVariant*,
nsIInterfaceInfo*)+0x0000002A
[/export/stuff/mozilla/tb-debug/mozilla/dist/bin/libxpcom.so +0x000F1C96]
UNKNOWN [/export/stuff/mozilla/tb-debug/mozilla/dist/bin/libxpcom.so +0x000F580E]
UNKNOWN [/export/stuff/mozilla/tb-debug/mozilla/dist/bin/libxpcom.so +0x00114642]
nsImapProtocol::OnCreateServerSourceFolderPathString()+0x00000091
[/export/stuff/mozilla/tb-debug/mozilla/dist/bin/components/libmail.so +0x003A368B]
UNKNOWN [/export/stuff/mozilla/tb-debug/mozilla/dist/bin/components/libmail.so
+0x003980BF]
nsImapIncomingServer::GetImapConnection(nsIEventQueue*, nsIImapUrl*,
nsIImapProtocol**)+0x000002E2
[/export/stuff/mozilla/tb-debug/mozilla/dist/bin/components/libmail.so +0x0035687E]
UNKNOWN [/export/stuff/mozilla/tb-debug/mozilla/dist/bin/components/libmail.so
+0x003556A0]
nsImapService::GetImapConnectionAndLoadUrl(nsIEventQueue*, nsIImapUrl*,
nsISupports*, nsIURI**)+0x000001DF
[/export/stuff/mozilla/tb-debug/mozilla/dist/bin/components/libmail.so +0x003C9B11]
UNKNOWN [/export/stuff/mozilla/tb-debug/mozilla/dist/bin/components/libmail.so
+0x003C5E2B]
UNKNOWN [/export/stuff/mozilla/tb-debug/mozilla/dist/bin/components/libmail.so
+0x0036E188]
UNKNOWN [/export/stuff/mozilla/tb-debug/mozilla/dist/bin/components/libmail.so
+0x0036407F]
UNKNOWN [/export/stuff/mozilla/tb-debug/mozilla/dist/bin/components/libmail.so
+0x00364313]
UNKNOWN [/export/stuff/mozilla/tb-debug/mozilla/dist/bin/components/libmail.so
+0x00364313]
UNKNOWN [/export/stuff/mozilla/tb-debug/mozilla/dist/bin/components/libmail.so
+0x00374648]
UNKNOWN [/export/stuff/mozilla/tb-debug/mozilla/dist/bin/components/libmail.so
+0x00357B0F]
nsMsgBiffManager::PerformBiff()+0x0000012E
[/export/stuff/mozilla/tb-debug/mozilla/dist/bin/components/libmail.so +0x001E1F54]
OnBiffTimer(nsITimer*, void*)+0x00000024
[/export/stuff/mozilla/tb-debug/mozilla/dist/bin/components/libmail.so +0x001E0E94]
nsTimerImpl::Fire()+0x00000250
[/export/stuff/mozilla/tb-debug/mozilla/dist/bin/libxpcom.so +0x000EDBBC]
handleTimerEvent(TimerEventType*)+0x0000010C
[/export/stuff/mozilla/tb-debug/mozilla/dist/bin/libxpcom.so +0x000EDDA2]
PL_HandleEvent+0x00000054
[/export/stuff/mozilla/tb-debug/mozilla/dist/bin/libxpcom.so +0x000E59F4]
PL_ProcessPendingEvents+0x000000DC
[/export/stuff/mozilla/tb-debug/mozilla/dist/bin/libxpcom.so +0x000E5895]
UNKNOWN [/export/stuff/mozilla/tb-debug/mozilla/dist/bin/libxpcom.so +0x000E8B94]
UNKNOWN
[/export/stuff/mozilla/tb-debug/mozilla/dist/bin/components/libwidget_gtk2.so
+0x00031088]
UNKNOWN [/usr/lib/libglib-2.0.so.0 +0x00043F8F]
UNKNOWN [/usr/lib/libglib-2.0.so.0 +0x00022C30]
__float128+0x00000098 [/usr/lib/libglib-2.0.so.0 +0x00023C98]
UNKNOWN [/usr/lib/libglib-2.0.so.0 +0x00023FAD]
__float128+0x0000019F [/usr/lib/libglib-2.0.so.0 +0x000246CF]
__float128+0x000000BF [/usr/lib/libgtk-x11-2.0.so.0 +0x000D445F]
UNKNOWN
[/export/stuff/mozilla/tb-debug/mozilla/dist/bin/components/libwidget_gtk2.so
+0x0003177A]
UNKNOWN
[/export/stuff/mozilla/tb-debug/mozilla/dist/bin/components/libnsappshell.so
+0x0003FE47]
xre_main(int, char**, nsXREAppData const*)+0x000017EC
[/export/stuff/mozilla/tb-debug/mozilla/dist/bin/thunderbird-bin +0x00017D24]
unsigned long+0x00000032
[/export/stuff/mozilla/tb-debug/mozilla/dist/bin/thunderbird-bin +0x000110EA]
__libc_start_main+0x000000F8 [/lib/tls/libc.so.6 +0x00015768]
Sleeping for 5 minutes.
Type 'gdb /export/stuff/mozilla/tb-debug/mozilla/dist/bin/thunderbird-bin 26735'
to attach your debugger to this thread.
Done sleeping...
 
Program exited with code 013.
(gdb)
Iain, do you have a multi-processor system? This crash has been around for a
while, now that I look at it. I looked into something similar a couple months
ago. At the time, I thought it was some sort of race condition in the observer
service. 
Status: UNCONFIRMED → NEW
Ever confirmed: true
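For anyone following along, here is a minimal sketch of the kind of observer-list race suspected above, and one common way to guard against it. It is an illustration under assumptions only: `ObserverList`, `Add`, `Remove` and `Notify` are hypothetical names, not the nsObserverService code, and this is not necessarily the fix that went into bug 234620. The idea is that the notifier copies the list under a lock and then walks the copy, so another thread (or an observer itself) can remove entries without invalidating the walk.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical observer list, for illustration only; not nsObserverService.
class ObserverList {
public:
    using Observer = std::function<void(const std::string&)>;

    // Registers an observer and returns an id that Remove() accepts.
    int Add(Observer obs) {
        std::lock_guard<std::mutex> guard(mLock);
        mObservers.push_back(Entry{mNextId, std::move(obs)});
        return mNextId++;
    }

    // Unregisters the observer with the given id, if it is still present.
    void Remove(int id) {
        std::lock_guard<std::mutex> guard(mLock);
        mObservers.erase(
            std::remove_if(mObservers.begin(), mObservers.end(),
                           [id](const Entry& e) { return e.id == id; }),
            mObservers.end());
    }

    // Copies the list while holding the lock, then calls observers from the
    // copy with the lock released. Returns how many observers were called.
    std::size_t Notify(const std::string& topic) {
        std::vector<Entry> snapshot;
        {
            std::lock_guard<std::mutex> guard(mLock);
            snapshot = mObservers;
        }
        for (const Entry& e : snapshot) {
            e.callback(topic);
        }
        return snapshot.size();
    }

private:
    struct Entry {
        int id;
        Observer callback;
    };

    std::mutex mLock;
    std::vector<Entry> mObservers;
    int mNextId = 0;
};
```

The trade-off is that an observer removed during a notification may still be called once from the snapshot, so observers have to tolerate one late call.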
(In reply to comment #4)
> Iain, do you have a multi-processor system? 

Indeed I do. I have two Xeons, which, with hyperthreading, appear as four
"virtual" CPUs with the Linux SMP kernel.

Yeah, that's what the other guy had, iirc. My recollection was that he had some
sort of experimental version of the hyperthreading or something, and when he
turned that off, the problem went away - my recollection is fuzzy on this,
however. I'm still looking for that bug. However, my guess is that this is a
client thread-safety problem exposed by the hyperthreading...
bug 234620 is the other bug.  You have a better stack trace, I think. If I'm
reading correctly, it says that the event queue is getting destroyed, which
seems to point to a ref-counting problem, though I'm not sure if thread-safety
issues could cause a ref-counting problem - I'm sure the eventqueue impl uses
threadsafe ref-counting.
Keywords: crash
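To make the ref-counting question concrete, here is a small sketch of what thread-safe ref-counting has to guarantee. It is an assumption-laden illustration, not the nsEventQueueImpl implementation: `RefCounted` and `StressRefCount` are hypothetical names. With a plain integer counter, two CPUs can read the same value and both write back value+1, losing a reference; once that happens a later Release can hit zero too early and destroy the object while another thread still uses it. An atomic read-modify-write avoids the lost update.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical ref-counted object, for illustration only; not Mozilla code.
class RefCounted {
public:
    // fetch_add is one atomic read-modify-write, so concurrent AddRef calls
    // on different CPUs cannot overwrite each other's increment.
    std::uint32_t AddRef() {
        return mRefCnt.fetch_add(1, std::memory_order_relaxed) + 1;
    }

    // Returns the remaining count. A real object would delete itself when
    // this reaches zero; the sketch only reports the value.
    std::uint32_t Release() {
        std::uint32_t previous = mRefCnt.fetch_sub(1, std::memory_order_acq_rel);
        assert(previous > 0 && "Release called more often than AddRef");
        return previous - 1;
    }

    std::uint32_t Count() const { return mRefCnt.load(); }

private:
    std::atomic<std::uint32_t> mRefCnt{1};  // the creator holds one reference
};

// Runs balanced AddRef/Release pairs from several threads against one object
// and returns the final count, which should equal the creator's reference.
std::uint32_t StressRefCount(unsigned threads, unsigned iterations) {
    RefCounted obj;
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&obj, iterations] {
            for (unsigned i = 0; i < iterations; ++i) {
                obj.AddRef();
                obj.Release();
            }
        });
    }
    for (std::thread& th : pool) {
        th.join();
    }
    return obj.Count();
}
```

Note that an atomic counter only protects the count itself. If some code path drops a reference it never took, or hands out a raw pointer without an AddRef, the object can still be destroyed early even though every individual AddRef and Release is thread-safe.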
Not sure where to go from here ... aviary-branch thunderbird still crashes on
average once a day (more than once some days). Fortunately, it generally happens
when I'm not interacting, so the only data I lose is half-composed messages that
I intend to finish later.

Unless I'm doing something stupid in my build and/or runtime environment, I think
this should be a blocker for 1.0?

 
Flags: blocking-aviary1.0?
I meant to also note that aviary firefox has been very stable in the same
build and runtime environment...
Flags: blocking-aviary1.0RC1?
More event-queue-vanishing-weirdness. This is on my dual-PentiumIII system, so
it's not a hyperthreading issue, but there is still SMP in the picture...

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1126026160 (LWP 22140)]
0x00000011 in ?? ()
(gdb) where
#0  0x00000011 in ?? ()
#1  0x00512699 in nsSupportsArray::ElementAt(unsigned) (this=0x9fa1240,
aIndex=0) at nsSupportsArray.cpp:327
#2  0x0051336b in nsSupportsArray::GetElementAt(unsigned, nsISupports**)
(this=0x9fa1240, aIndex=0, result=0x431dc1a0)    at nsSupportsArray.h:65
#3  0x0050aec5 in ObserverListEnumerator::GetNext(nsISupports**)
(this=0xbe780730, aResult=0x431dc1a0)
    at nsObserverList.cpp:195
#4  0x0050c261 in nsObserverService::NotifyObservers(nsISupports*, char const*,
unsigned short const*) (
    this=0x9f7bf68, aSubject=0xbef4e518, aTopic=0x5bf5c7
"nsIEventQueueActivated", someData=0x0)
    at nsObserverService.cpp:205
#5  0x0056f5b1 in nsEventQueueImpl::NotifyObservers(char const*) (this=0xbef4e518,
    aTopic=0x5bf5c7 "nsIEventQueueActivated") at nsEventQueue.cpp:228
#6  0x0056f1b5 in nsEventQueueImpl::InitFromPRThread(PRThread*, int)
(this=0xbef4e518, thread=0xa4cb490, aNative=0)
    at nsEventQueue.cpp:179
#7  0x00571777 in nsEventQueueServiceImpl::MakeNewQueue(PRThread*, int,
nsIEventQueue**) (this=0x9eda6e8,
    thread=0xa4cb490, aNative=0, aQueue=0x431dc340) at nsEventQueueService.cpp:173
#8  0x0057184e in nsEventQueueServiceImpl::CreateEventQueue(PRThread*, int)
(this=0x9eda6e8, aThread=0xa4cb490,
    aNative=0) at nsEventQueueService.cpp:193
#9  0x00571681 in nsEventQueueServiceImpl::CreateMonitoredThreadEventQueue()
(this=0x9eda6e8)
    at nsEventQueueService.cpp:144
#10 0x00578944 in nsProxyObject::PostAndWait(nsProxyObjectCallInfo*)
(this=0xacaa6a0, proxyInfo=0xbef274e8)
    at nsProxyEvent.cpp:353
#11 0x00578e9a in nsProxyObject::Post(unsigned, nsXPTMethodInfo*,
nsXPTCMiniVariant*, nsIInterfaceInfo*) (
    this=0xacaa6a0, methodIndex=24, methodInfo=0xa63eaf8, params=0x431dc4e0,
interfaceInfo=0xa637440)
    at nsProxyEvent.cpp:495
#12 0x0057c82a in nsProxyEventObject::CallMethod(unsigned short, nsXPTMethodInfo
const*, nsXPTCMiniVariant*) (
    this=0xa8abb08, methodIndex=24, info=0xa63eaf8, params=0x431dc4e0) at
nsProxyEventObject.cpp:546
#13 0x0059b65e in PrepareAndDispatch (methodIndex=24, self=0xa8abb08,
args=0x431dc584)
    at xptcstubs_gcc_x86_unix.cpp:100
#14 0x086db455 in nsImapProtocol::OnCreateServerSourceFolderPathString()
(this=0xa4cdcb0) at nsImapProtocol.cpp:5312
#15 0x086db531 in nsImapProtocol::CreateServerSourceFolderPathString(char**)
(this=0xa4cdcb0, result=0x431dc7e4)
    at nsImapProtocol.cpp:5327
#16 0x086d067f in nsImapProtocol::ProcessSelectedStateURL() (this=0xa4cdcb0) at
nsImapProtocol.cpp:1855
#17 0x086cee4b in nsImapProtocol::ProcessCurrentURL() (this=0xa4cdcb0) at
nsImapProtocol.cpp:1375
#18 0x086ce4b1 in nsImapProtocol::ImapThreadMainLoop() (this=0xa4cdcb0) at
nsImapProtocol.cpp:1166
#19 0x086cd70e in nsImapProtocol::Run() (this=0xa4cdcb0) at nsImapProtocol.cpp:934
#20 0x00572c86 in nsThread::Main(void*) (arg=0xa4cb410) at nsThread.cpp:118
#21 0x001c405e in _pt_root (arg=0xa4cb490) at ptthread.c:214
#22 0x001d87fc in start_thread () from /lib/tls/libpthread.so.0
#23 0x00d72aba in clone () from /lib/tls/libc.so.6
(gdb)
Summary: thunderbird aviary branch crash → event queue disappearing on SMP Linux(?)
Minus for now. If more information becomes available, please renominate.
Flags: blocking-aviary1.0PR?
Flags: blocking-aviary1.0PR-
Flags: blocking-aviary1.0?
Flags: blocking-aviary1.0-
dup of a bug to be fixed shortly.

*** This bug has been marked as a duplicate of 234620 ***
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → DUPLICATE