Closed
Bug 308655
Opened 19 years ago
Closed 18 years ago
crash in the middle of a ldap search, also core dump will be produced
Categories
(Directory :: LDAP C SDK, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: shuo.li, Assigned: mcs)
Details
(Keywords: crash)
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.10) Gecko/20050716 Firefox/1.0.6
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.10) Gecko/20050716 Firefox/1.0.6
We have built a simple LDAP client to download configuration info from a
Netscape directory server. Due to the large amount of data (>100,000 entries),
we decided to split the traffic into smaller searches by using filters like
'*0', '*1', '*2'... '*9'.
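The partitioning described above can be sketched as follows. This is only an illustration, not the reporter's code: the helper name build_partition_filter is hypothetical, and the attribute names are copied from the filter string visible in the stack trace.

```c
#include <stdio.h>

/* Hypothetical helper (not part of the LDAP C SDK): writes the search
 * filter for one last-digit partition (0..9) into buf. The attribute
 * names come from the filter shown in the stack trace.
 * Returns 0 on success, -1 on a bad argument or a too-small buffer. */
int build_partition_filter(char *buf, size_t buflen, int digit)
{
    int n;

    if (buf == NULL || digit < 0 || digit > 9) {
        return -1;
    }
    n = snprintf(buf, buflen,
                 "(&(radio-serial-number=*%d)(raduser-sd-en-flag=1))",
                 digit);
    /* snprintf reports the length it needed; reject truncated output. */
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

A caller would loop over the digits 0 through 9 and issue one search per filter, so that each search returns roughly a tenth of the entries.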
Every time we run this test client, it core dumps after about 10-15 minutes
and reports a segmentation fault. From the gdb core dump output below, we
believe the problem is in the Mozilla library, and it is very likely a
performance or memory management problem.
#0 0x0fc3a114 in ber_flush (sb=0x0, ber=0x1003a2f0, freeit=1) at io.c:333
#1 0x0fc1b3d8 in nsldapi_ber_flush (ld=0x1003a2f0, sb=0x0, ber=0x1003a2f0,
freeit=1, async=0) at request.c:328
#2 0x0fbfc83c in do_abandon (ld=0x1003a2f0, origid=8, msgid=8,
serverctrls=0x0, clientctrls=0x0) at abandon.c:226
#3 0x0fbfc0e4 in ldap_abandon_ext (ld=0x1003a2f0, msgid=8, serverctrls=0x0,
clientctrls=0x0) at abandon.c:88
#4 0x0fbfbe24 in ldap_abandon (ld=0x1003a2f0, msgid=8) at abandon.c:61
#5 0x0fc271b0 in nsldapi_search_s (ld=0x1003a2f0,
base=0x10041458 "ddi-object-class=radio-users,zone-id=99", scope=1,
filter=0x10041548 "(&(radio-serial-number=*2)(raduser-sd-en-flag=1))",
attrs=0x0, attrsonly=0, serverctrls=0x0, clientctrls=0x0,
localtimeoutp=0x10039958, timelimit=10000, sizelimit=0, res=0x7ffff5b8)
at search.c:997
#6 0x0fc2700c in ldap_search_ext_s (ld=0x1003a2f0,
base=0x10041458 "ddi-object-class=radio-users,zone-id=99", scope=1,
filter=0x10041548 "(&(radio-serial-number=*2)(raduser-sd-en-flag=1))",
attrs=0x0, attrsonly=0, serverctrls=0x0, clientctrls=0x0,
timeoutp=0x10039958, sizelimit=0, res=0x7ffff5b8) at search.c:942
#7 0x0ffb29bc in DDISession::ddi_search_records (this=0x10039950,
p_ddiObjectClass=0x10041458 "ddi-object-class=radio-users,zone-id=99",
p_filter=0x10041548 "(&(radio-serial-number=*2)(raduser-sd-en-flag=1))",
p_attrs=0x0, p_res=0x7ffff5b8) at ../src/DDISession.cpp:260
#8 0x1000b87c in full_sync_radio_users (m_ddiSession=0x10039950,
m_logger=0x10031680) at ctk_test.cpp:354
#9 0x1000b4f0 in main (argc=5, argv=0x7ffffb24) at ctk_test.cpp:297
#10 0x0f90d130 in __libc_start_main () from /lib/libc.so.6
Reproducible: Always
Comment 1•19 years ago
It does seem likely to me that this is a bug in libldap, given that your client
appears to crash inside ldap_search_ext_s(). At a glance, it looks like your
timelimit has been exceeded and libldap is trying to abandon the search. Is
that what you expect to be happening?
One thing that might work better, or work around this bug, would be to avoid
using any of the _s search functions when you expect a lot of LDAP entries to
be returned.
What version of the Mozilla LDAP code are you running?
Thank you for your reply.
The timelimit is set to 10000 in the program. If the unit is seconds, that is
about 2.8 hours. It has never taken that long to crash. Most of the time it
takes just 10-20 minutes.
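As a quick check of the arithmetic above: ldap_search_ext_s() takes its timeout as a struct timeval pointer, so a value of 10000 in tv_sec is 10000 seconds. The sketch below is illustrative only; the helper names are hypothetical and the reporter's actual setup code is not shown in this bug.

```c
#include <sys/time.h>

/* Hypothetical helper: converts a timeout in seconds to hours. */
double seconds_to_hours(long seconds)
{
    return (double)seconds / 3600.0;
}

/* Hypothetical helper: builds the kind of struct timeval a caller could
 * pass as the timeoutp argument of ldap_search_ext_s(). */
struct timeval make_search_timeout(long seconds)
{
    struct timeval tv;

    tv.tv_sec = seconds;  /* whole seconds */
    tv.tv_usec = 0;       /* no fractional part */
    return tv;
}
```

With 10000 seconds, seconds_to_hours() gives a little under 2.8 hours, which matches the estimate above and is far longer than the 10-20 minutes observed before the crash.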
The Mozilla C-SDK we are using is 5.0.
(In reply to comment #1)
> It does seem likely to me this is a bug in libldap, given that your client
> appears to crash inside ldap_search_ext_s(). At a glance, it looks like your
> timelimit has been exceeded and libldap is trying to abandon the search. Is
> that what you expect to be happening?
> One thing that might work better or work around this bug would be to avoid
> using any of the _s search functions when you expect a lot of LDAP entries to
> be returned.
> What version of the Mozilla LDAP code are you running?
Assignee
Comment 3•19 years ago
The stack trace you provided tells me that a timeout did occur. Look at
nsldapi_search_s() here:
http://lxr.mozilla.org/mozilla/source/directory/c-sdk/ldap/libraries/libldap/search.c#963
ldap_abandon() is only called if a timeout occurs. So there are 2 questions:
1) Why is a timeout occurring (or why does libldap think one occurred)?
2) Why does the code crash inside ldap_abandon()?
Can you be more specific about which libldap you are using? I'd like to pull
the source tree for the exact library you are using and take a look. The
premature timeout you might be seeing reminds me vaguely of a bug that may have
been fixed but I can't put my finger on it.
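The control flow described in this comment can be modelled roughly as below. This is a simplified, hypothetical model and not the SDK source: the type and function names are invented for illustration, and the real nsldapi_search_s() works on an LDAP handle rather than on these structs. It only shows that the abandon step is reached on the timeout path and nowhere else.

```c
/* Hypothetical outcome of waiting for the search result. */
enum wait_outcome { WAIT_GOT_RESULT, WAIT_TIMED_OUT, WAIT_ERROR };

/* Hypothetical status returned to the caller of the synchronous search. */
enum search_status { SEARCH_OK, SEARCH_TIMEOUT, SEARCH_ERROR };

/* Records what the model did, so the behaviour can be inspected. */
struct search_model {
    int abandon_calls;    /* how many times an abandon was issued */
    int abandoned_msgid;  /* message ID of the last abandoned request */
};

/* Model of a synchronous search: an abandon is issued only on timeout. */
enum search_status search_s_model(struct search_model *m, int msgid,
                                  enum wait_outcome outcome)
{
    switch (outcome) {
    case WAIT_GOT_RESULT:
        return SEARCH_OK;
    case WAIT_TIMED_OUT:
        m->abandon_calls++;          /* stands in for ldap_abandon() */
        m->abandoned_msgid = msgid;
        return SEARCH_TIMEOUT;
    default:
        return SEARCH_ERROR;
    }
}
```

In this model, seeing an abandon for message ID 8 means the wait for request 8 ended in the timed-out branch, which is the inference drawn from the stack trace.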
(In reply to comment #3)
This is the version info I found in the CVS Entries file:
/.cvsignore/5.0/Tue Mar 26 20:53:26 2002//
/Makefile.client/5.0/Tue Mar 26 20:53:26 2002//
/Makefile.in/5.6/Fri Jul 12 18:29:52 2002//
/abandon.c/5.0/Tue Mar 26 20:53:28 2002//
/add.c/5.0/Tue Mar 26 20:53:28 2002//
/bind.c/5.0/Tue Mar 26 20:53:28 2002//
/cache.c/5.0/Tue Mar 26 20:53:28 2002//
/charray.c/5.1/Thu Apr 18 18:00:30 2002//
/charset.c/5.0/Tue Mar 26 20:53:28 2002//
/cldap.c/5.0/Tue Mar 26 20:53:30 2002//
/compare.c/5.0/Tue Mar 26 20:53:30 2002//
/compat.c/5.0/Tue Mar 26 20:53:30 2002//
/control.c/5.0/Tue Mar 26 20:53:30 2002//
/countvalues.c/5.0/Tue Mar 26 20:53:30 2002//
/delete.c/5.0/Tue Mar 26 20:53:30 2002//
/disptmpl.c/5.1/Fri Jun 7 19:09:12 2002//
/dllmain.c/5.0/Tue Mar 26 20:53:32 2002//
/dsparse.c/5.0/Tue Mar 26 20:53:32 2002//
/error.c/5.0/Tue Mar 26 20:53:32 2002//
/extendop.c/5.0/Tue Mar 26 20:53:34 2002//
/fdsetsize.txt/5.0/Tue Mar 26 20:53:36 2002//
/free.c/5.0/Tue Mar 26 20:53:36 2002//
/freevalues.c/5.0/Tue Mar 26 20:53:36 2002//
/friendly.c/5.1/Fri Jun 7 19:09:12 2002//
/getattr.c/5.0/Tue Mar 26 20:53:36 2002//
/getdn.c/5.0/Tue Mar 26 20:53:36 2002//
/getdxbyname.c/5.0/Tue Mar 26 20:53:36 2002//
/getentry.c/5.0/Tue Mar 26 20:53:36 2002//
/getfilter.c/5.1/Fri Jun 7 19:09:12 2002//
/getoption.c/5.0/Tue Mar 26 20:53:36 2002//
/getvalues.c/5.0/Tue Mar 26 20:53:38 2002//
/globals.c/5.0/Tue Mar 26 20:53:38 2002//
/ldap-int.h/5.2/Thu Apr 18 18:01:18 2002//
/ldapfilter.conf/5.0/Tue Mar 26 20:53:40 2002//
/ldapfriendly/5.0/Tue Mar 26 20:53:40 2002//
/ldapsearchprefs.conf/5.0/Tue Mar 26 20:53:40 2002//
/ldaptemplates.conf/5.0/Tue Mar 26 20:53:40 2002//
/memcache.c/5.0/Tue Mar 26 20:53:40 2002//
/message.c/5.0/Tue Mar 26 20:53:42 2002//
/modify.c/5.0/Tue Mar 26 20:53:42 2002//
/mozock.c/5.0/Tue Mar 26 20:53:42 2002//
/nsprthreadtest.c/5.0/Tue Mar 26 20:53:42 2002//
/open.c/5.1/Wed Apr 17 20:53:44 2002//
/os-ip.c/5.5/Fri Apr 26 00:33:00 2002//
/proxyauthctrl.c/5.0/Tue Mar 26 20:53:44 2002//
/psearch.c/5.0/Tue Mar 26 20:53:44 2002//
/pthreadtest.c/5.0/Tue Mar 26 20:53:44 2002//
/referral.c/5.0/Tue Mar 26 20:53:46 2002//
/regex.c/5.0/Tue Mar 26 20:53:46 2002//
/rename.c/5.0/Tue Mar 26 20:53:46 2002//
/request.c/5.1/Fri Jun 21 19:14:02 2002//
/reslist.c/5.0/Tue Mar 26 20:53:46 2002//
/result.c/5.0/Tue Mar 26 20:53:48 2002//
/saslbind.c/5.0/Tue Mar 26 20:53:48 2002//
/sbind.c/5.0/Tue Mar 26 20:53:48 2002//
/search.c/5.0/Tue Mar 26 20:53:48 2002//
/setoption.c/5.1/Tue Apr 30 00:23:58 2002//
/sort.c/5.0/Tue Mar 26 20:53:50 2002//
/sortctrl.c/5.0/Tue Mar 26 20:53:50 2002//
/srchpref.c/5.1/Fri Jun 7 19:09:12 2002//
/svrcore.c/5.0/Tue Mar 26 20:53:52 2002//
/test.c/5.1/Fri Jun 7 19:09:12 2002//
/tmplout.c/5.0/Tue Mar 26 20:53:52 2002//
/tmpltest.c/5.0/Tue Mar 26 20:53:52 2002//
/ufn.c/5.0/Tue Mar 26 20:53:52 2002//
/unbind.c/5.0/Tue Mar 26 20:53:54 2002//
/unescape.c/5.0/Tue Mar 26 20:53:54 2002//
/url.c/5.1/Thu Apr 18 00:48:24 2002//
/utf8.c/5.0/Tue Mar 26 20:53:56 2002//
/vlistctrl.c/5.0/Tue Mar 26 20:53:56 2002//
If there are other places to get the version info, please let me know.
On the other hand, I have to admit that the LDAP server is not fine-tuned. If I
search for a specific entry, it takes 7-10 minutes to get the result.
(In reply to comment #4)
Just to clarify, the 7-10 minute delay is the worst case, in which the LDAP
server has 100,000+ entries.
Assignee
Comment 7•19 years ago
I looked at this some this morning. I am still not sure what the problem is,
but from reading the code and stack trace you posted, it sure looks like the 8th
request (LDAP message ID 8) timed out inside ldap_search_ext_s() and caused an
abandon request to be sent. The crash occurs because the code in do_abandon()
is trying to send a message on a NULL liblber Sockbuf pointer (and I am not sure
why it is NULL).
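The failure mode described here, a flush attempted on a NULL Sockbuf pointer, is the kind of thing a defensive check can turn into an error return instead of a crash. The sketch below is a stand-alone illustration with hypothetical names (sockbuf_model, flush_checked); it is not the liblber code and not the actual change made in the SDK.

```c
#include <stddef.h>

/* Hypothetical stand-in for a connection buffer; not the real Sockbuf. */
struct sockbuf_model {
    int bytes_sent;
};

/* Sends nbytes on sb. Returns 0 on success, or -1 if there is no
 * connection to send on, rather than dereferencing a NULL pointer. */
int flush_checked(struct sockbuf_model *sb, int nbytes)
{
    if (sb == NULL) {
        return -1;  /* caller can report an error instead of crashing */
    }
    sb->bytes_sent += nbytes;
    return 0;
}
```

A check like this does not explain why the pointer was NULL in the first place; it only keeps the abandon path from faulting while that question is investigated.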
It does look like the revised abandon code committed as part of the work done
for bug 140182 is better about checking for NULL pointers and such, so picking
up at least revision 5.1 of abandon.c is a good idea (I recommend you upgrade
to the latest code for all files if at all possible).
You can view the changes between rev. 5.0 and 5.1 on abandon.c by following this
link:
http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&file=abandon.c&branch=&root=/cvsroot&subdir=mozilla/directory/c-sdk/ldap/libraries/libldap&command=DIFF_FRAMESET&rev1=5.0&rev2=5.1
Comment 8•19 years ago
Yes, please try out the latest code from HEAD in cvs.
Comment 9•18 years ago
We are doing a lot of work on the LDAP C SDK right now, so please verify that the latest code in CVS HEAD fixes your problem. If not, we need to do some further investigation.
Reporter
Comment 10•18 years ago
We haven't seen the problem since we started using the latest version.
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED