Closed Bug 343280 Opened 15 years ago Closed 5 years ago

Crash While Thunderbird Is Idle [@ hashKey] - [@ PL_DHashTableOperate] - [@ nsLDAPConnectionLoop::Run] ldap

Categories

(MailNews Core :: LDAP Integration, defect)

x86
Windows XP
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: mscott, Unassigned)

Details

(Keywords: crash)

Attachments

(2 files)

Marcia has filed a couple of Talkback reports that all have the same stack trace. The LDAP connection loop seems to be causing something odd to happen.

TB20459172

hashKey()  [/builds/tinderbox/Tb-Moz1.8.0-universal-Release/Darwin_8.5.0_Depend/build/unifox/ppc/xpcom/ds//builds/tinderbox/Tb-Moz1.8.0-universal-Release/Darwin_8.5.0_Depend/mozilla/xpcom/ds/nsHashtable.cpp, line 87]
PL_DHashTableOperate()
nsSupportsHashtable::EnumerateCopy()  [/builds/tinderbox/Tb-Moz1.8.0-universal-Release/Darwin_8.5.0_Depend/build/unifox/ppc/xpcom/ds//builds/tinderbox/Tb-Moz1.8.0-universal-Release/Darwin_8.5.0_Depend/mozilla/xpcom/ds/nsHashtable.cpp, line 873]
PL_DHashTableEnumerate()
nsSupportsHashtable::Clone()  [/builds/tinderbox/Tb-Moz1.8.0-universal-Release/Darwin_8.5.0_Depend/build/unifox/ppc/xpcom/ds//builds/tinderbox/Tb-Moz1.8.0-universal-Release/Darwin_8.5.0_Depend/mozilla/xpcom/ds/nsHashtable.cpp, line 885]
nsLDAPConnectionLoop::Run()
nsThread::Main()  [/builds/tinderbox/Tb-Moz1.8.0-universal-Release/Darwin_8.5.0_Depend/build/unifox/ppc/xpcom/threads//builds/tinderbox/Tb-Moz1.8.0-universal-Release/Darwin_8.5.0_Depend/mozilla/xpcom/threads/nsThread.cpp, line 713]
_pt_root()  [/builds/tinderbox/Tb-Moz1.8.0-universal-Release/Darwin_8.5.0_Depend/build/unifox/ppc/nsprpub/pr/src/pthreads//builds/tinderbox/Tb-Moz1.8.0-universal-Release/Darwin_8.5.0_Depend/mozilla/nsprpub/pr/src/pthreads/ptthread.c, line 223]
_pthread_body()
Keywords: crash
Summary: Crash at allaccess.com [@ hashKey] → Crash While Thunderbird Is Idle [@ hashKey]
yeah, I've seen this over the years. I think it's some sort of race condition in the ldap code. I see a similar crash which I thought had to do with a patch I'm trying out on the 2.0 branch, but maybe not...I'll try w/o my patch.
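A minimal sketch of the kind of race suspected here, assuming (the thread does not confirm this) that the loop thread clones a table while another thread is modifying it. `PendingTable`, `RunContendedClone`, and the use of `std::map` with `std::mutex` are illustrative stand-ins, not the actual nsSupportsHashtable or nsLDAPConnection code. The idea is only that the clone and every write take the same lock, so the enumeration never walks a table that is mid-update.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

// Hypothetical stand-in for the table the connection loop clones;
// this is not the real nsSupportsHashtable.
class PendingTable {
public:
    void Put(int id, std::string value) {
        std::lock_guard<std::mutex> guard(mLock);
        mEntries[id] = std::move(value);
    }

    void Remove(int id) {
        std::lock_guard<std::mutex> guard(mLock);
        mEntries.erase(id);
    }

    // Copy the table while holding the same lock the writers use, so the
    // copy never observes a half-updated table.
    std::map<int, std::string> Clone() const {
        std::lock_guard<std::mutex> guard(mLock);
        return mEntries;
    }

private:
    mutable std::mutex mLock;
    std::map<int, std::string> mEntries;
};

// Runs a writer thread and a cloning "loop" thread side by side and returns
// the number of entries left once both have finished.
std::size_t RunContendedClone(int writes, int clones) {
    PendingTable table;
    std::thread writer([&] {
        for (int i = 0; i < writes; ++i) {
            table.Put(i, "op");
            if (i % 2 == 1) table.Remove(i);  // keep only the even ids
        }
    });
    std::thread loop([&] {
        for (int i = 0; i < clones; ++i) {
            auto snapshot = table.Clone();  // taken under the lock
            assert(snapshot.size() <= static_cast<std::size_t>(writes));
        }
    });
    writer.join();
    loop.join();
    return table.Clone().size();
}
```

Calling `RunContendedClone(1000, 200)` exercises both threads against one table. Without the lock in `Clone()`, the copy could run concurrently with `Put()` or `Remove()`, which is undefined behavior and could plausibly surface as a crash deep inside the hashing code.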
Attached file Apple crash report
Attaching the apple report in case it is useful.
Yikes, this just happened to me again today - TB20964552H. I don't think I often leave my machine idle for long periods of time so that may be why I haven't seen it, but this still seems to be popping up more frequently.
Crashed once again today. Another report (TB21139559Q) indicates a crash in hashKey() in the LDAP connection loop.  I am using the 1.5.0.5 RC3 candidate build. I will switch to the new build and see if I continue to see this issue.
Keywords: qawanted
Crashed again today - http://talkback-public.mozilla.org/search/start.jsp?search=2&type=iid&id=TB21215103Z. I am beginning to think that this is 100% reproducible if my machine goes idle.
It looks like this is fixed on the Thunderbird 2 branch and the trunk.

I see 26 crashes with Marcia's stack trace in 1.5.0.4. 73rd most common crash.

For Thunderbird 2 reports, there are 0 crashes with her stack trace. Ditto for the trunk.

Then again, we get fewer submissions than we do for 1.5.0.4.
While testing the Alpha1 build today, I have already crashed twice. I will paste the Apple crash report; it looks like it is crashing in __spin_lock_relinquish + 0x18, although hashKey shows up in the report as well.
This is the stack I received today on my second crash. In both instances I was not using Tbird at the time of the crash.
Assignee: mscott → nobody
Severity: normal → critical
Component: General → MailNews: LDAP Integration
Product: Thunderbird → Core
QA Contact: general → ldap-integration
Version: 2.0 → 1.8 Branch
I don't want to lose track of this bug.
Assignee: nobody → mscott
Product: Core → MailNews Core
Assignee: mscott → nobody
Interesting: there are a couple of variations on Talkback, one or more of which is a topcrash. (The aggregate for [@ hashKey] is the #4 crasher, but I'm not going to weed through which ones are topcrashes.)

Common comment for the ones that match this bug's stack is "Thunbird crashed at take off" (start). For example TB50222687 and TB50401921
hashKey  [mozilla/xpcom/ds/nsHashtable.cpp, line 87]
nsHashtable::Put  [mozilla/xpcom/ds/nsHashtable.cpp, line 220]
nsSupportsHashtable::EnumerateCopy  [mozilla/xpcom/ds/nsHashtable.cpp, line 872]
nsSupportsHashtable::Clone  [mozilla/xpcom/ds/nsHashtable.cpp, line 884]

and just one on crash stats, bp-d75d910f-442b-11dd-adaa-001a4bd43e5c
0  	xpcom_core.dll  	hashKey  	 mozilla/xpcom/ds/nsHashtable.cpp:87
1 	xpcom_core.dll 	PL_DHashTableOperate 	pldhash.c:588
2 	xpcom_core.dll 	nsHashtable::Put 	mozilla/xpcom/ds/nsHashtable.cpp:217
3 	xpcom_core.dll 	nsSupportsHashtable::EnumerateCopy 	mozilla/xpcom/ds/nsHashtable.cpp:868
4 	xpcom_core.dll 	PL_DHashTableEnumerate 	pldhash.c:724
5 	xpcom_core.dll 	nsSupportsHashtable::Clone 	mozilla/xpcom/ds/nsHashtable.cpp:881
6 	thunderbird.exe 	nsLDAPConnectionLoop::Run 	mozilla/directory/xpcom/base/src/nsLDAPConnection.cpp:848
7 	xpcom_core.dll 	nsThread::ProcessNextEvent 	mozilla/xpcom/threads/nsThread.cpp:510
8 	xpcom_core.dll 	NS_ProcessNextEvent_P 	nsThreadUtils.cpp:227
9 	xpcom_core.dll 	nsThread::ThreadFunc 	mozilla/xpcom/threads/nsThread.cpp:254
10 	nspr4.dll 	_PR_NativeRunThread 	mozilla/nsprpub/pr/src/threads/combined/pruthr.c:436
11 	xpcom_core.dll 	nsQueryInterfaceWithError::operator 	nsCOMPtr.cpp:75
12 	msvcr80.dll 	msvcr80.dll@0x29ba 	
13 	msvcr80.dll 	msvcr80.dll@0x2a46 	

Will file a bug for the other stack(s)
Keywords: topcrash
I filed bug 461074 for one of the different stacks, but it has the same line number in nsHashtable. philor comments there "the line in nsMsgGroupView.cpp where we head off the rails was removed on the trunk in bug 384490"

Anyway, is this one related to bug 343332 (thread-safety assertion and crash with nsWeakRefPtr and nsLDAPConnection)?
I think the LDAP stack is in the minority of signatures with hashKey and is not a topcrash, so removing topcrash.

need to revisit this and bug 461074 after bug 495978 is fixed, perhaps with timeless' tool
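
One way to separate the LDAP variant from the other hashKey stacks, sketched under the assumption that each stack is available as a list of frame names with the top frame first. `IsLdapHashKeyStack` is a hypothetical helper, not part of Talkback, crash-stats, or any existing tool.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Returns true when the crashing (top) frame is hashKey and
// nsLDAPConnectionLoop::Run appears somewhere below it on the stack.
bool IsLdapHashKeyStack(const std::vector<std::string>& frames) {
    if (frames.empty() || frames.front() != "hashKey") {
        return false;
    }
    return std::find(frames.begin() + 1, frames.end(),
                     "nsLDAPConnectionLoop::Run") != frames.end();
}
```

Matching on frame names rather than line numbers keeps the check stable across builds, since the same frames show up at slightly different lines of nsHashtable.cpp in different reports.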


FWIW, perhaps related, with nsLDAPConnectionLoop::Run in the stack ...

3.0b2  bp-d6ac83fd-f9fd-442c-8b4a-b0f382090602

0	thunderbird.exe	nsLDAPConnectionLoop::Run	nsLDAPConnection.cpp:847
1	xpcom_core.dll	nsThread::ProcessNextEvent	xpcom/threads/nsThread.cpp:510
2	xpcom_core.dll	NS_ProcessNextEvent_P	nsThreadUtils.cpp:227
3	xpcom_core.dll	nsThread::ThreadFunc	xpcom/threads/nsThread.cpp:254
4	nspr4.dll	_PR_NativeRunThread	nsprpub/pr/src/threads/combined/pruthr.c:436
5	nspr4.dll	pr_root	nsprpub/pr/src/md/windows/w95thred.c:122 


3.0b2  bp-a2470421-8e7f-4a20-80ce-541ee2090528
0	xpcom_core.dll	nsCOMPtr_base::~nsCOMPtr_base	nsCOMPtr.cpp:81
1	thunderbird.exe	nsLDAPConnectionLoop::Run	nsLDAPConnection.cpp:860
2	xpcom_core.dll	nsThread::ProcessNextEvent	xpcom/threads/nsThread.cpp:510
3	xpcom_core.dll	NS_ProcessNextEvent_P	nsThreadUtils.cpp:227
4	xpcom_core.dll	nsThread::ThreadFunc	xpcom/threads/nsThread.cpp:254
Keywords: topcrash
Summary: Crash While Thunderbird Is Idle [@ hashKey] → Crash While Thunderbird Is Idle [@ hashKey - PL_DHashTableOperate] ldap
bp-05d9746e-a1d1-4671-9565-348a82100402 says "adding an address"
0	libxpcom_core.dylib	hashKey	 xpcom/ds/nsHashtable.cpp:87
1	libxpcom_core.dylib	PL_DHashTableOperate	pldhash.c:599
2	libxpcom_core.dylib	nsHashtable::Put	xpcom/ds/nsHashtable.cpp:217
3	libxpcom_core.dylib	nsSupportsHashtable::EnumerateCopy	xpcom/ds/nsHashtable.cpp:868
4	libxpcom_core.dylib	PL_DHashTableEnumerate	pldhash.c:735
5	libxpcom_core.dylib	nsSupportsHashtable::Clone	xpcom/ds/nsHashtable.cpp:881
6	thunderbird-bin	nsLDAPConnectionLoop::Run	directory/xpcom/base/src/nsLDAPConnection.cpp:868
7	libxpcom_core.dylib	nsThread::ProcessNextEvent	xpcom/threads/nsThread.cpp:521 

bp-c2a3c7b8-aa4e-494d-8d41-11d672100331 (sriram)
"Crashed while writing reply."
Version: 1.8 Branch → Trunk
Hey Wayne, I have hit this crash every time I've tried to start the Thunderbird app in the last 2 days! Totally unusable right now! The submit report option on the crash_reporter also fails 50% of the time. If it's a familiar stack trace, is there any known workaround to get TB started?
OS: Win XP
sriram's crash is bp-c2a3c7b8-aa4e-494d-8d41-11d672100331 v3.0
crash exists in 3.1. example bp-a2edc20b-f66d-42dd-87e5-60dd12100604

bienvenu in comment #1
> yeah, I've seen this over the years. I think it's some sort of race condition
> in the ldap code. I see a similar crash which I thought had to do with a patch
> I'm trying out on the 2.0 branch, but maybe not...I'll try w/o my patch.

bienvenu, still got that patch?
Summary: Crash While Thunderbird Is Idle [@ hashKey - PL_DHashTableOperate] ldap → Crash While Thunderbird Is Idle [@ hashKey] - [@ PL_DHashTableOperate] ldap
(In reply to comment #15)

> 
> bienvenu, still got that patch?

No, we decided to go a different way for fixing the thread-safety issues in the LDAP code (Standard8 has the bug, I believe).
Depends on: 343332
reporter of bp-77645b0a-101e-45c9-9ccf-ee2942100601 (Andrea) writes
"Whenever Lanikai crashes (frequently before my present version  3.1 [RC1 I think], but I do not think I had any crashes since last update)  then it will crash again immediately unless I start it in safe mode. Once restarted in safe mode, then it will work again with full extensions enabled. ... I never had any crashes before 3.0, all started in between [3.0 and 3."
Summary: Crash While Thunderbird Is Idle [@ hashKey] - [@ PL_DHashTableOperate] ldap → Crash While Thunderbird Is Idle [@ hashKey] - [@ PL_DHashTableOperate] - [@ nsLDAPConnectionLoop::Run] ldap
Crash Signature: [@ hashKey] [@ PL_DHashTableOperate] [@ nsLDAPConnectionLoop::Run]
Depends on: 716345
If anyone still sees a crash, please add a comment.

Everything for the past week in crash-stats is v3, so -> WFM
Status: NEW → RESOLVED
Crash Signature: [@ hashKey] [@ PL_DHashTableOperate] [@ nsLDAPConnectionLoop::Run] → [@ hashKey] [@ PL_DHashTableOperate] [@ nsLDAPConnectionLoop::Run]
Closed: 5 years ago
Keywords: qawanted
Resolution: --- → WORKSFORME