Closed Bug 239729 Opened 20 years ago Closed 20 years ago

M17rc1 crash [@ nsLDAPConnection::RemovePendingOperation] trying to do LDAP email address lookup

Categories

(MailNews Core :: LDAP Integration, defect)

x86
All
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: jay, Assigned: Bienvenu)

References

Details

(Keywords: crash, topcrash, verified1.7)

Crash Data

Attachments

(2 files, 1 obsolete file)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6) Gecko/20040206 Firefox/0.8
Build Identifier: 

This is a topcrasher for Mozilla 1.7 beta.  Most users are crashing doing LDAP
lookups/address completions while composing mail.  Here is the latest Talkback data:

Count   Offset    Real Signature
[ 7   nsLDAPConnection::RemovePendingOperation d6727d71 -
nsLDAPConnection::RemovePendingOperation ]
 
     Crash date range: 01-APR-04 to 31-MAR-04
     Min/Max Seconds since last crash: 10365 - 195266
     Min/Max Runtime: 20510 - 577678
 
     Count   Platform List 
     7   [Windows NT 5.1 build 2600]   
 
     Count   Build Id List 
     7   2004031615
 
     No of Unique Users         6
 
 Stack trace(Frame) 

	 nsLDAPConnection::RemovePendingOperation
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/directory/xpcom/base/src/nsLDAPConnection.cpp
 line 438] 
	 nsLDAPOperation::AbandonExt
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/directory/xpcom/base/src/nsLDAPOperation.cpp
 line 416] 
	 nsLDAPOperation::Abandon
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/directory/xpcom/base/src/nsLDAPOperation.cpp
 line 423] 
	 nsLDAPAutoCompleteSession::OnStopLookup
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/xpfe/components/autocomplete/src/nsLDAPAutoCompleteSession.cpp
 line 279] 
	 XPCWrappedNative::CallMethod
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/js/src/xpconnect/src/xpcwrappednative.cpp
 line 2029] 
	 XPC_WN_CallMethod
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/js/src/xpconnect/src/xpcwrappednativejsops.cpp
 line 1288] 
	 js_Invoke
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/js/src/jsinterp.c 
line 943] 
	 js_Interpret
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/js/src/jsinterp.c 
line 2963] 
	 js_Invoke
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/js/src/jsinterp.c 
line 959] 
	 js_InternalInvoke
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/js/src/jsinterp.c 
line 1036] 
	 JS_CallFunctionValue
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/js/src/jsapi.c  line
3591] 
	 nsJSContext::CallEventHandler
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/dom/src/base/nsJSEnvironment.cpp
 line 1269] 
	 nsJSEventListener::HandleEvent
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/dom/src/events/nsJSEventListener.cpp
 line 181] 
	 nsXBLPrototypeHandler::ExecuteHandler
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/content/xbl/src/nsXBLPrototypeHandler.cpp
 line 461] 
	 nsXBLEventHandler::HandleEvent
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/content/xbl/src/nsXBLEventHandler.cpp
 line 88] 
	 nsEventListenerManager::HandleEventSubType
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/content/events/src/nsEventListenerManager.cpp
 line 1435] 
	 nsEventListenerManager::HandleEvent
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/content/events/src/nsEventListenerManager.cpp
 line 1512] 
	 nsXULElement::HandleDOMEvent
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/content/xul/content/src/nsXULElement.cpp
 line 2851] 
	 nsXULElement::HandleDOMEvent
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/content/xul/content/src/nsXULElement.cpp
 line 2870] 
	 nsXULElement::HandleDOMEvent
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/content/xul/content/src/nsXULElement.cpp
 line 2870] 
	 nsGenericElement::HandleDOMEvent
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/content/base/src/nsGenericElement.cpp
 line 1989] 
	 nsHTMLInputElement::HandleDOMEvent
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/content/html/content/src/nsHTMLInputElement.cpp
 line 1399] 
	 PresShell::HandleEventInternal
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/layout/html/base/src/nsPresShell.cpp
 line 6019] 
	 PresShell::HandleEventWithTarget
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/layout/html/base/src/nsPresShell.cpp
 line 5973] 
	 nsTextControlFrame::FireOnInput
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/layout/html/forms/src/nsTextControlFrame.cpp
 line 2822] 
	 nsTextInputListener::EditAction
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/layout/html/forms/src/nsTextControlFrame.cpp
 line 369] 
	 nsEditor::NotifyEditorObservers
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/editor/libeditor/base/nsEditor.cpp
 line 1622] 
	 nsEditor::EndPlaceHolderTransaction
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/editor/libeditor/base/nsEditor.cpp
 line 805] 
	 nsPlaintextEditor::TypedText
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/editor/libeditor/text/nsPlaintextEditor.cpp
 line 509] 
	 nsPlaintextEditor::HandleKeyPress
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/editor/libeditor/text/nsPlaintextEditor.cpp
 line 467] 
	 nsTextEditorKeyListener::KeyPress
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/editor/libeditor/text/nsEditorEventListeners.cpp
 line 251] 
	 DispatchToInterface
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/content/events/src/nsEventListenerManager.cpp
 line 128] 
	 nsEventListenerManager::HandleEvent
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/content/events/src/nsEventListenerManager.cpp
 line 1524] 
	 nsGenericElement::HandleDOMEvent
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/content/base/src/nsGenericElement.cpp
 line 1959] 
	 nsHTMLInputElement::HandleDOMEvent
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/content/html/content/src/nsHTMLInputElement.cpp
 line 1399] 
	 PresShell::HandleEventInternal
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/layout/html/base/src/nsPresShell.cpp
 line 6053] 
	 PresShell::HandleEvent
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/layout/html/base/src/nsPresShell.cpp
 line 5942] 
	 nsViewManager::HandleEvent
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/view/src/nsViewManager.cpp
 line 2235] 
	 nsViewManager::DispatchEvent
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/view/src/nsViewManager.cpp
 line 2025] 
	 HandleEvent
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/view/src/nsView.cpp 
line 79] 
	 nsWindow::DispatchEvent
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/widget/src/windows/nsWindow.cpp
 line 1068] 
	 nsWindow::DispatchWindowEvent
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/widget/src/windows/nsWindow.cpp
 line 1085] 
	 nsWindow::DispatchKeyEvent
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/widget/src/windows/nsWindow.cpp
 line 2957] 
	 nsWindow::OnChar
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/widget/src/windows/nsWindow.cpp
 line 3143] 
	 nsWindow::ProcessMessage
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/widget/src/windows/nsWindow.cpp
 line 3850] 
	 nsWindow::WindowProc
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/widget/src/windows/nsWindow.cpp
 line 1347] 
	 USER32.dll + 0x3a50 (0x77d43a50)  
	 USER32.dll + 0x3b1f (0x77d43b1f)  
	 USER32.dll + 0x3d79 (0x77d43d79)  
	 USER32.dll + 0x3ddf (0x77d43ddf)  
	 nsAppShellService::Run
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/xpfe/appshell/src/nsAppShellService.cpp
 line 524] 
	 main1
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/xpfe/bootstrap/nsAppRunner.cpp
 line 1308] 
	 main
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/xpfe/bootstrap/nsAppRunner.cpp
 line 1712] 
	 WinMain
[c:/builds/tinderbox/Mozilla1.7b/WINNT_5.0_Clobber/mozilla/xpfe/bootstrap/nsAppRunner.cpp
 line 1734] 
	 WinMainCRTStartup()  
	 kernel32.dll + 0x214c7 (0x77e814c7)   
 
     (9885)	Comments: Crash during pinpoint addressing lookup in mail
     (9323)	Comments: composing a message. was in the email address completion 
pulldown.  
     (8437)	Comments: writing a name into the mail compose window
 

And some more user comments:

(12138)	URL: Again crashes when using ldap email address lookup on 1.7beta
     (10077)	URL: Frequent crashes when using ldap email address lookup on 1.7beta
     (9346)	Comments: 1.7 beta crashes frequently when doing ldap
lookups/address completitions in the mail module. Did not happen with 1.6.
     (7559)	Comments: Writing an email. I was trying to add another email
address from the LDAP book. 
     (5894)	Comments: compose email
 (9401)	Comments: I just tried to open a new email compose window and it died.
     (5641)	Comments: Crashed immediately after resolving an address from ldap.
(10812)	Comments: Backspacing again in the address line crashed the app again.
     (5911)	URL: http://onderwijs.cs.utwente.nl/Studenten/INF/2003/VakkenPropedeuse
     (5911)	Comments: composing a new mail: typing first letters in the to-field


Reproducible: Didn't try
Steps to Reproduce:
Keywords: crash, topcrash
This also looks like a similar crash reported a while back.  See Bug 179814.
I have the same problem with mozilla 1.7b and 1.7rc1 und linux. We are using ms
active directory as ldap server.
The application exactly crashes when I try to delete characters during
autocompletion.
cc mscott and bienvenu
taking - I've not been able to reproduce this, but I'll keep trying.
Assignee: sspitzer → bienvenu
440	// remove the operation if it's still there.  
441	//
442	if (!mPendingOperations->Remove(key)) {
	     ^^^^^^^^^^^^^^^^^^ i'm guessing that's null.
Attachment #147148 - Flags: superreview?(dmose)
Attachment #147148 - Flags: review?(dmose)
Going back one rev of the file with cvsblame, the call through
mPendingOperations is revealed to be on line 438, so timeless has found the
symptom, I think.  The only way that that pointer can be null, I think, is if
nsLDAPConnection::Init() hasn't completed executing or nsLDAPConnection::Close()
is fairly far along.  What I suspect is going on is that nsLDAPConnect::Abort is
being called on the main thread (since it's coming from the compose window via
the DOM via XPConnect), while at the same time the connection destructor is
being executed on another thread.  I'm wondering if perhaps the patch in 206018
isn't in some way being too aggressive and causing a refcount to drop to 0
before it should.

Also, bug 179814 looks like a different bug to me: that appears to be in the
addressbook, whereas this is in autocomplete.

Er, I meant nsLDAPOperation::Abort().
*** Bug 240508 has been marked as a duplicate of this bug. ***
Reporter's comment from duped bug, maybe should help with repro:
Fairly regularly, usually on first LDAP query which takes the longest time, if I
type in the address line and hit backspace before it has had a chance to match
against the LDAP query, the whole app crashes. 
Updating summary with M17rc1 since this continues to be a topcrash for Mozilla
1.7 RC1.
Summary: M17beta crash [@ nsLDAPConnection::RemovePendingOperation] trying to do LDAP email address lookup → M17rc1 crash [@ nsLDAPConnection::RemovePendingOperation] trying to do LDAP email address lookup
Also had the problem a few times (TB40876Q, TB39018X, TB34686W). I also have the
impression that it happened when *no* match was to be found, but my 3 samples
are maybe not sufficient to speak in general.
As I never had this before, and as it made me lose a mail I was writing, I'd add
the "regression" and "dataloss" keywords if I could and I am and asking for this
bug to block 1.7
Flags: blocking1.7?
Keywords: zt4newcrash
Dan and I were operating on the theory that there's a race condition between the
ldap thread and the ui thread such that the ldap connection is getting deleted 
before the call to nsLDAPConnection::RemovePendingOperation because of the
nsLDAPOperation::Clear call I added to cut the circular references. That call
releases a reference to the connection. Furthermore, we speculated that
FinishAutoCompleteLookup was getting called and releasing the
nsLDAPAutoCompleteSession's reference to the connection, making the ref count go
to 0, causing the connection to get deleted. However, I don't see how that could
be the problem - FinishAutoCompleteLookup clears the operation before it clears
the connection, so nsLDAPAutoCompleteSession::OnStopLookup would not call
abandon on the operation after FinishAutoCompleteLookup was called. And both
OnStopLookup and FinishAutoCompleteLookup are called on the UI thread. Since the
autocomplete session is holding a reference to the connection object, it's very
hard to imagine the connection object going away because of the Clear call.

Timeless's idea that mPendingOperations might be null is equally hard to believe
- we'd have to have closed the connection, which we only do in the connection
destructor, or not initialized it, but if we haven't initialized it, there
shouldn't be any operations...
Attached patch null check fixes (obsolete) — Splinter Review
My theory is that the if (mConnection) check in nsLDAPOperation::AbandonExt is
the one that will help - I suspect that the mConnection is getting cleared on
another thread, making this a race...
Attachment #148152 - Flags: review?(dmose)
Attachment #148152 - Flags: superreview?(mscott)
Attachment #148152 - Flags: superreview?(mscott) → superreview+
Comment on attachment 148152 [details] [diff] [review]
null check fixes

I would like to propose that we land this patch only on the 1.7 branch, since
it's important that we ship something stable there, and we can get talkback
feedback from it.  Assuming that this does make the crashes go away for 1.7,
I'd suggest that this bug then be retargetted for 1.8beta, where we do the
right thing: actually lock the data before changing it, rather than trying to
clean up after potential race conditions.

The patch itself looks fine, modulo a few style nits:

>Index: base/src/nsLDAPConnection.cpp
>===================================================================
>RCS file: /cvsroot/mozilla/directory/xpcom/base/src/nsLDAPConnection.cpp,v
>retrieving revision 1.50
>diff -u -w -r1.50 nsLDAPConnection.cpp
>--- base/src/nsLDAPConnection.cpp	25 Apr 2004 21:07:12 -0000	1.50
>+++ base/src/nsLDAPConnection.cpp	11 May 2004 00:58:23 -0000
>@@ -520,6 +521,7 @@
> 
>     // invoke the callback 
>     //
>+    if (listener)
>     listener->OnLDAPMessage(aMsg);
> 

Please make this match module style for indentation and bracing, a la:

if (listener) {
    ...
}

Also, can you add a comment mentioning why this is necessary, with a reference
to this bug.

>Index: base/src/nsLDAPOperation.cpp
>===================================================================
>RCS file: /cvsroot/mozilla/directory/xpcom/base/src/nsLDAPOperation.cpp,v
>retrieving revision 1.33
>diff -u -w -r1.33 nsLDAPOperation.cpp
>--- base/src/nsLDAPOperation.cpp	25 Apr 2004 21:07:12 -0000	1.33
>+++ base/src/nsLDAPOperation.cpp	11 May 2004 00:58:24 -0000
>@@ -405,6 +405,8 @@
>     // succeeded (and there's nothing else the caller can reasonably do), 
>     // so we only pay attention to this in debug builds.
>     //
>+    if (mConnection)
>+    {
>     rv = NS_STATIC_CAST(nsLDAPConnection *, NS_STATIC_CAST(
>         nsILDAPConnection *, mConnection.get()))->RemovePendingOperation(this);
> 
>@@ -416,6 +418,7 @@
>         //
>         NS_WARNING("nsLDAPOperation::AbandonExt: "
>                    "mConnection->RemovePendingOperation(this) failed.");
>+      }
>     }

Please add a comment and make this match module style also, which I think will
involve re-indenting the stuff inside the new braces.
Attachment #148152 - Flags: review?(dmose) → review+
thx, I wanted to land this on the trunk to see if its worth landing on the
branch, but I'll leave that up to drivers.  The indentation is correct; it's
just a -uw diff. I'll add a comment and add the k&r braces around the one line
if statement (but I'll feel dirty doing it :-) )
Attachment #148152 - Flags: approval1.7?
Comment on attachment 148152 [details] [diff] [review]
null check fixes

a=mkaply
Attachment #148152 - Flags: approval1.7? → approval1.7+
can people who can recreate this bug try tomorrow's 1.7 nightly build? thx.
Keywords: fixed1.7
Will do, however my experience with the bug is that it is fairly random and
rarely happens on first use of the browser - in my case it usually happens after
a fair amount of use, so it maybe a case of "well, I haven't gotten the crash in
the last 3 days or so....", unless someone else has a consistent test case not
reported yet ....
(In reply to comment #19)
> Will do, however my experience with the bug is that it is fairly random and
> rarely happens on first use of the browser - in my case it usually happens after
> a fair amount of use, so it maybe a case of "well, I haven't gotten the crash in
> the last 3 days or so....", unless someone else has a consistent test case not
> reported yet ....

I believe I can recreate at will. In my case reply-all to a message that has
several addresses, highlight one and backspace to delete, crash.

It is still happening with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US;
rv:1.8a) Gecko/20040512  Is patch on this or do I need a different build?
Wayne, I don't think that's an ldap crash at all...
Attachment #147148 - Flags: superreview?(dmose)
Attachment #147148 - Flags: review?(dmose)
bienvenu:  Did this make it into the Trunk or 1.7 branch before rc2 went out
(5/14)?  I didn't see any "checked in" comments, so wasn't sure.  Let me know so
i can keep an eye out in the latest talkback data.  Thanks.
Jay, it made it into the 1.7 branch on 05/11, I think, but not the trunk.
mozilla has an ldap server at the office now. talk to me if you want to try to
reproduce.
FWIW, I haven't had a LDAP crash since 5/14 when I installed Mozilla/5.0
(Windows; U; Windows NT 5.1; en-US; rv:1.7) Gecko/20040513.
Michael, thx for the info.

Thx, Ben, I'd like to get access to the mozilla server for other reasons, but
lack of access to servers isn't why I haven't been able to reproduce this crash.
I have access to other ldap servers and have not been able to reproduce the
crash on those servers, so it's just a difficult thing to recreate with a debug
build, on my machine, with my network connection...
*** Bug 243953 has been marked as a duplicate of this bug. ***
Flags: blocking1.7? → blocking1.7-
According the the latest Talkback data, the fix on the 1.7 branch looks good.  I
don't see any incidents with Mozilla 1.7 rc2.  There are still crashes showing
up with Thunderbird and Mozilla Trunks though.
I think this should get checked into the trunk - we can leave the bug open to
fix this in another way, but I'm just not going to have time to do it near term,
and I don't think the trunk should suffer...
I agree David.
Used latest 1.7 branch Win06-24, Mac06-30 & Linux06-29 builds:

I am not sure what's the original exact scenario for causing this crash...
But, based on the info I got from the dup bugs.
I have verified crash does not occur when entering recipient email address and
lookup in address book or LDAP.
Changing keywords from fixed1.7 to verified1.7.
Leave this bug status "as is" until this bug be verified on trunk again...
Keywords: fixed1.7verified1.7
Comment on attachment 148152 [details] [diff] [review]
null check fixes

mozilla/directory/xpcom/base/src/nsLDAPConnection.cpp	1.51
mozilla/directory/xpcom/base/src/nsLDAPOperation.cpp	1.34
I don't see any crashes in the latest Talkback data for recent releases or on
the Trunk.  Marking WFM.  If anyone is able to reproduce this, please reopen.
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → WORKSFORME
Product: MailNews → Core
Product: Core → MailNews Core
Crash Signature: [@ nsLDAPConnection::RemovePendingOperation]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: