Closed Bug 374361 Opened 17 years ago Closed 17 years ago

LDAP searching is broken on OS X

Categories

(Thunderbird :: Address Book, defect)

x86
macOS
defect
Not set
major

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 391499

People

(Reporter: luke.taylor, Assigned: mscott)

References

Details

(Keywords: regression)

User-Agent:       Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.2) Gecko/20070219 Firefox/2.0.0.2
Build Identifier: HEAD

I can't get LDAP searching to work using a trunk build of TB on OS X (10.4.8 on Intel).

As well as building locally, I've checked nightly builds to track it back, and the last one that works with the same test profile is 2007-03-01-04-trunk. 2007-03-02-04-trunk fails (though it often shows an OS X crash report on starting up). I've checked several nightly builds after that, including the latest (2007-03-17-04-trunk).

Reproducible: Always

Steps to Reproduce:
1. Set up a profile with an LDAP server configured
2. Open TB address book
3. Select the directory in the list of address books
4. Type "A" into "Name or Email" quick search box.

Actual Results:
No results returned. Also no network activity seen in server log or tcpdump output monitoring traffic to the server.

Expected Results:  
Network traffic seen and 100 (maximum configured) results returned.

With NSPR_LOG_MODULES=ldap:5, the only logging output from build 2007-03-17-04-trunk is:

1610559552[1a071b0]: nsLDAPOperation::SearchExt(): called with aBaseDn = 'ou=myaddressbook,cn=My Name,ou=people,dc=monkeymachine,dc=eu'; aFilter = '(|(mail=*A*)(cn=*A*)(givenName=*A*)(sn=*A*))', aAttrCounts = 64, aSizeLimit = 100

My local build (with LDAP_DEBUG defined) has the following output:

-1610559552[2107050]: WARNING: empty langgroup: file /Users/luke/Work/mozilla/gfx/thebes/src/gfxFont.cpp, line 472
47002112[3e47fb10]: nsLDAPConnection::Run() entered
-1610559552[2107050]: nsLDAPOperation::SimpleBind(): called; bindName = 'cn=My Name,ou=people,dc=monkeymachine,dc=eu'; 
-1610559552[2107050]: nsLDAPOperation::SearchExt(): called with aBaseDn = 'ou=myaddressbook,cn=My Name,ou=people,dc=monkeymachine,dc=eu'; aFilter = '(|(mail=*A*)(cn=*A*)(givenName=*A*)(sn=*A*))', aAttrCounts = 64, aSizeLimit = 100
-1610559552[2107050]: WARNING: NS_ENSURE_TRUE(NS_SUCCEEDED(rv)) failed: file /Users/luke/Work/mozilla/directory/xpcom/base/src/nsLDAPOperation.cpp, line 421
-1610559552[2107050]: WARNING: NS_ENSURE_TRUE(NS_SUCCEEDED(rv)) failed: file /Users/luke/Work/mozilla/mailnews/addrbook/src/nsAbLDAPDirectory.cpp, line 392
-1610559552[2107050]: WARNING: NS_ENSURE_TRUE(NS_SUCCEEDED(rv)) failed: file /Users/luke/Work/mozilla/mailnews/addrbook/src/nsAbLDAPDirectory.cpp, line 265

(note that this currently has the patch from bug 316170 applied).
Version: unspecified → Trunk
Forgot to mention a couple of things:

1. The same behaviour is seen without a bind DN configured (I've seen quite a few bugs mentioning problems with authentication).

2. I successfully ran the Windows 2007-03-02-03-trunk build in a Parallels VM on the same box against the same server.
I can't see anything in CVS that would have stopped this working (and my Linux builds work fine), but that doesn't mean something hasn't broken (or that it wasn't broken in a separate area of the build).

It'd be useful if you could get the value of retVal from line 420 of nsLDAPOperation - a printf should be enough (http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/directory/xpcom/base/src/nsLDAPOperation.cpp&rev=1.40&mark=420-421#419). Line 421 is failing because it hasn't managed to translate the error code in line 420, so getting retVal may help.

I'll file a patch to allow that value to be output even if it can't be translated.
Depends on: 374508
(In reply to comment #2)
> It'd be useful if you could get the value of retVal from line 420 of
> nsLDAPOperation - a printf should be enough
...
> I'll file a patch to allow that value to be output even if it can't be
> translated.

I've actually just checked in this change - it should be in nightly builds in the next day or so, but it'll be in cvs straight away. If you turn on NSPR LDAP logging (export/set NSPR_LOG_MODULES=ldap:5 before startup of SeaMonkey) you should get that error value being printed out.
Thanks a lot for the response. I've been stepping through the code and the
return code from the bind attempt is 91, indicating a connection failure.

When nsLDAPConnection::OnLookupComplete is called, it calls PR_NetAddrToString
which is failing, leaving garbage in the host string in addrbuf: 

http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/directory/xpcom/base/src/nsLDAPConnection.cpp&mark=905#888

This is later passed to the ldap library code which tries to connect to the
supplied host and obviously fails:

2007-03-20 02:17:48.153 Xcode[354] SRC Doc added new bp to proj with an absolute file ref for /Users/luke/Work/mozilla/nsprpub/pr/src/misc/prnetdb.c
ldap_init
ldap_simple_bind
nsldapi_send_initial_request
nsldapi_send_server_request
nsldapi_connect_to_host:
∂5H¡ø5‡´T>(B<¨Âˇø‡´T>‡´T>‡û1x¡ø∆«5‹´T>ÃL;à¡ø
Q[, port: 389

The error occurs at this line:

http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/nsprpub/pr/src/misc/prnetdb.c&mark=2261#2255

So the call to getnameinfo is failing. It returns code "4" which usefully means
"non-recoverable failure in name resolution" according to netdb.h :).

Revision 3.52 of prnetdb.c was made on 2007-03-01 which matches up well with
when the nightly build stopped working.
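
For anyone reproducing this, a minimal sketch of turning a getnameinfo return code into readable text with the standard gai_strerror call, rather than looking it up in netdb.h. The helper name format_gai_error is hypothetical and is not part of NSPR or the LDAP code. The numeric value of each EAI_* constant differs between platforms, so the "4" above is specific to the OS X headers.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* expose the EAI_* codes under strict -std modes */
#endif

#include <assert.h>
#include <netdb.h>
#include <stdio.h>

/* Hypothetical helper (not NSPR code): write a readable description of a
   getnameinfo()/getaddrinfo() return code into out. Returns the value
   returned by snprintf(). */
static int format_gai_error(int code, char *out, size_t outlen)
{
    assert(out != NULL && outlen > 0);
    return snprintf(out, outlen, "getnameinfo failed: code %d (%s)",
                    code, gai_strerror(code));
}
```

Calling format_gai_error(EAI_FAIL, msg, sizeof msg) fills msg with the code and the platform's own wording for a non-recoverable name resolution failure.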

I'll try again tomorrow with a build using the previous version of that file.
It would be nice if someone else using OS X could check if they can still
access a directory so I know I'm definitely not doing something dumb.
(In reply to comment #4)
> It  would be nice if someone else using OS X could check if they can still
> access a directory so I know I'm definitely not doing something dumb.

David, would you be able to give ldap a quick test with a recentish Mac trunk build?
yes, definitely.
I just tried the 17th March nightly build on a G4 PowerBook and it worked OK against the same directory. The machine it's failing on is a dual-processor Mac Pro (now running 10.4.9).
I've rebuilt on the Intel Mac with prnetdb.c rolled back to revision 3.51 and confirmed that it now works as expected.
Yes, I think you are right. In PRNetAddr, the first two bytes are the family and are set to "2" (PR_AF_INET). After casting, this ends up in sockaddr.sa_len with sockaddr.sa_family=0.
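
A small sketch of the layout mismatch described in the previous comment. The two structs are hypothetical stand-ins, not the real PRNetAddr or the system sockaddr: one starts with a 16-bit family field, the other with a one-byte length followed by a one-byte family, as on platforms where sockaddr has an sa_len member. Copying the bytes across models what the pointer cast does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the start of PRNetAddr: a 16-bit family first. */
struct mock_prnetaddr {
    uint16_t family;    /* PR_AF_INET is 2 */
    uint16_t port;
    uint32_t ip;
};

/* Hypothetical stand-in for a BSD-style sockaddr: length byte, then family. */
struct mock_bsd_sockaddr {
    uint8_t sa_len;
    uint8_t sa_family;
    uint8_t sa_data[6];
};

/* Returns 1 on a little-endian host (e.g. Intel), 0 on a big-endian one. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first_byte;

    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* Reinterpret the bytes of the mock PRNetAddr as the mock BSD sockaddr,
   which is effectively what casting the pointer does. */
static struct mock_bsd_sockaddr view_as_bsd_sockaddr(uint16_t family)
{
    struct mock_prnetaddr src = { family, 0, 0 };
    struct mock_bsd_sockaddr dst;

    assert(sizeof dst == sizeof src);   /* both mocks are 8 bytes */
    memcpy(&dst, &src, sizeof dst);
    return dst;
}
```

On a little-endian host, view_as_bsd_sockaddr(2) yields sa_len == 2 and sa_family == 0, matching what was seen on the Intel Mac. On a big-endian host the same bytes read as sa_len == 0 and sa_family == 2, which might be why the G4 PowerBook was unaffected, although nothing in this bug confirms that.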
And it works OK if I transfer the data to a local sockaddr_in struct and pass this to "getnameinfo" instead.
I've just tried out Thunderbird 2.0.0.4 and this no longer seems to be a problem. I haven't had a chance to try with the latest code, but will do so when I have some spare time.
Christian, Luke, I believe your comments on 2007-03-20 are right.
Sorry that this bug fell through the cracks.

If you look at how we pass PRNetAddr to the other socket functions
(see pt_Bind for bind, pt_Connect for connect, and pt_SendTo for
sendto) in ptio.c:
http://lxr.mozilla.org/nspr/source/nsprpub/pr/src/pthreads/ptio.c

you'll see that we copy the PRNetAddr data to a local sockaddr struct,
properly setting its sa_len field, rather than casting for Mac OS X
(where _PR_HAVE_SOCKADDR_LEN is defined), so I believe we should do
the same for getnameinfo.
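
A minimal IPv4-only sketch of the copy-instead-of-cast approach described above. This is not the patch from bug 391499 and not NSPR code: numeric_host_from_ipv4 and HAVE_SOCKADDR_LEN_FIELD are hypothetical names, the latter standing in for NSPR's _PR_HAVE_SOCKADDR_LEN, and the platform list used to define it is an assumption.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* expose getnameinfo() under strict -std modes */
#endif

#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Assumption: these platforms have a length byte at the start of sockaddr. */
#if defined(__APPLE__) || defined(__FreeBSD__) || defined(__NetBSD__) || defined(__OpenBSD__)
#define HAVE_SOCKADDR_LEN_FIELD 1
#endif

/* Hypothetical helper: build a native sockaddr_in from an IPv4 address and
   port (both in network byte order), then ask getnameinfo() for the numeric
   host string. Returns 0 on success or an EAI_* code on failure. */
static int numeric_host_from_ipv4(uint32_t ip_net, uint16_t port_net,
                                  char *buf, size_t buflen)
{
    struct sockaddr_in sin;

    assert(buf != NULL && buflen > 0);

    memset(&sin, 0, sizeof sin);
#ifdef HAVE_SOCKADDR_LEN_FIELD
    sin.sin_len = sizeof sin;           /* set explicitly, never via a cast */
#endif
    sin.sin_family = AF_INET;
    sin.sin_port = port_net;
    sin.sin_addr.s_addr = ip_net;

    return getnameinfo((const struct sockaddr *)&sin, sizeof sin,
                       buf, (socklen_t)buflen, NULL, 0, NI_NUMERICHOST);
}
```

For example, numeric_host_from_ipv4(htonl(0x7F000001), htons(389), buf, sizeof buf) should return 0 and leave "127.0.0.1" in buf. An IPv6 address would need the same treatment with sockaddr_in6.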
Depends on: 391499
So did the patch for bug 391499 fix this?
I just checked in a new NSPR tag on Sunday, August 26.
Luke, could you test Monday's build and see if the
patch for bug 391499 (which is in the new NSPR tag)
fixes this bug?
(In reply to comment #15)
> I just checked in a new NSPR tag on Sunday, August 26.
> Luke, could you test Monday's build and see if the
> patch for bug 391499 (which is in the new NSPR tag)
> fixes this bug?

No response from Luke, so I'm resolving this as a dup of bug 391499 as I've not heard of any other problems yet.

Luke: if you do still have problems when you're able to test it, please feel free to reopen.
Status: UNCONFIRMED → RESOLVED
Closed: 17 years ago
Resolution: --- → DUPLICATE