LDAP XPCOM SDK needs to pre-resolve IP addresses async

VERIFIED FIXED in mozilla0.9.5

Status

P2
major
VERIFIED FIXED
18 years ago
16 years ago

People

(Reporter: dmose, Assigned: leif)

Tracking

other
mozilla0.9.5

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: PDT+)

Attachments

(3 attachments, 9 obsolete attachments)

(Reporter)

Description

18 years ago
Right now, the C SDK is allowed to do domain name resolution under the hood. 
This has the potential of causing the caller to hang if the network is slow or
horked or whatever, since gethostbyname() is used.  We need to get the XPCOM SDK
to do its own resolution using Necko's async resolver calls, probably before
calling into the LDAP C SDK.  It may be possible and worthwhile to use the C
SDK's DNS lookup function hooks to make any calls into gethostbyname assert, so
that we know we're not doing any synchronous calls.
(Reporter)

Updated

18 years ago
Blocks: 82317

Comment 1

18 years ago
What is the timeframe for moving to the 5.0-ish LDAP C SDK code?  I ask because 
the newer code has new extended I/O callback functions that can be used to 
replace the entire "resolve host and try to connect" sequence with one callback. 
 I am not sure how best to fit into the async DNS functions though (I'd have to 
look at how that works in Mozilla).
(Assignee)

Updated

18 years ago
Status: NEW → ASSIGNED
Priority: -- → P3
(Reporter)

Comment 2

18 years ago
The timeframe is a bit unclear; maybe in the next few months.  We're sorting out
some of our scheduling priorities.
(Reporter)

Updated

18 years ago
Target Milestone: --- → mozilla0.9.2
(Reporter)

Comment 3

18 years ago
This change is to boy
Target Milestone: mozilla0.9.2 → mozilla0.9.3

Comment 4

18 years ago
reassign Olga as QA contact 
QA Contact: yulian → olgac

Updated

18 years ago
Priority: P3 → P2
Target Milestone: mozilla0.9.3 → mozilla0.9.4

Updated

18 years ago
Keywords: nsenterprise

Updated

18 years ago
Keywords: nsenterprise → nsenterprise+
(Assignee)

Comment 5

18 years ago
Created attachment 47556 [details] [diff] [review]
Initial patch, "proof of concept"
(Assignee)

Comment 6

18 years ago
Created attachment 47615 [details] [diff] [review]
2nd patch, fixes a crasher in autocomplete
(Assignee)

Comment 7

17 years ago
Hi Peter,

can you take a look at this patch? I've turned |nsLDAPConnection::Init()| into
an asynchronous function. But, I have not a clue how the LDAP datasource work. I
did some changes which I think is what needs to be done, but I'm pretty sure
(ok, I am sure) it is not correct. :)

Can you maybe try this, and see if you can get the datasource to work with the
async Init() function? I'd really appreciate it.

Thanks,

-- Leif

Comment 8

17 years ago
Hi leif, I haven't tried your code but note that I've converted the datasource
to get the connection from the LDAP service (the patch is in bug 83360). I'm
unclear if I need to change anything in that case, and if we could get the same
benefit when using the service. If I understand the patch correctly, that
doesn't seem to be the case. I've experienced the hangs, and I would like to get
rid of them though, so if not using the service fixes that I'll gladly not use
the service and  I'll try out your patch.
(Assignee)

Comment 9

17 years ago
Hmmm, well, the LDAP Service is kinda busted right now :(. It's on the ToDo list
after we finish eMojo. You are correct though, using the LDAP service implies
that all this stuff is automatically taken care off.

So, depending on when you plan to land that change to switch over to the
Service, we have a couple of options:

1) We leave the Data source busted (as with my patch :), until we get the
Service in shape again (and land your patch).

2) We try to get the Data source working with my patch in this bug.


Thanks!

-- Leif

Comment 10

17 years ago
Ok, then I'll try out the patch. By the time dmose reviews my patch (hah ;))
I'll get a ton of complaints from all the datasource users if it's broken ;).
(Assignee)

Comment 11

17 years ago
Created attachment 47895 [details] [diff] [review]
Patch with LDAP Addressbook changes as well
(Assignee)

Comment 12

17 years ago
Adding a couple of Sun Ireland people to the Cc: list. If you guys could take a
look at the changes I've done to your code, that'd be swell.

Thanks,

-- leif

Comment 13

17 years ago
Leif: turns out the datasource as it is in the CVS is already broken. I'll
attach a band-aid patch that fixes it. The real patch is over in bug 83360. It's
your call really if you want to leave it as is, or fix it. I think you should
not let it hold up checking in the rest of this patch.

Comment 14

17 years ago
Created attachment 48007 [details] [diff] [review]
LDAP datasource only patch
(Assignee)

Comment 15

17 years ago
Ok, lets keep this easy, and not mess up the datasource more than it is? If it's
already broken, I'm making it any worse, right? :)

Dan, if that's ok, can you review my last patch please?

Thanks,

-- Leif
(Assignee)

Comment 16

17 years ago
Created attachment 48320 [details] [diff] [review]
Patch without datasource
(Reporter)

Comment 17

17 years ago
Review comments on 9/05 "patch without datasource":

nsILDAPConnection.idl
---------------------
a) why is it necessary to forward-declare nsIDNSListener?

nsILDAPConnection.cpp
---------------------
b) why get the DNS service by CID rather than ContractID?

c) at the end of the init() there are a couple of places where rv is
propagated up the stack.  I've been avoiding doing this in the XPCOM
SDK, except when it's likely that there is going to be enough context
up the stack for the message to be useful/meaningful.  And those cases
are all enumerated in the IDL so that callers will know all possible
return codes that they can expect.  There are a number of errors where
it is likely to be meaningful, such as NS_ERROR_OFFLINE,
NS_ERROR_UNKNOWN_HOST, and NS_ERROR_OUT_OF_MEMORY.  I'd suggest
checking for (at least) those in a switch statement and propagating
them, and for anything else returning the best fit error you can find
up the stack (often NS_ERROR_FAILURE; also try searching lxr for
"#define NS_ERROR" to see a list of interesting error codes defined in
various spots).  

Alternately, just document in the IDL any places where an unknown set
of errors may be getting propagated up the stack

I'm open to discussion on this, as my approach is has its own modes of
suckiness.

d)
+        // Make sure that we have an LDAP connection handle, if not, we
+        // sleep for a bit, and try again later.
+        //
+        if (!rawConn->mConnectionHandle) {
+            PR_Sleep(sleepTime);
+            continue;
+        }

I'm not sure that this is bad, per se, but it seems like just a hack
to avoid using a condition variable.  Even better, how about not
spinning up the connection thread at all until we know that the DNS
lookup has succeeded?  That should make this code fragment
unnecessary, and helps avoid doing unnecessary work if the server name
is bogus.

nsLDAPConnection::OnFound()
---------------------------
e) I think you need to hang any errors in a member variable for
possible use when calling back in OnStopLookup when calling back
OnLDAPInit.  Since this is a callback interface, any errors that you
return here will probably be ignored in a non-DEBUG build (just
verified that this is true for nsDnsService).  This is part because I
presume we want to treat this:

+        // XXX: What should happen here?  The GetHostByName(...) succeeded but
+        //      there are *no* A records..

as NS_ERROR_UNKNOWN_HOST, which has certain specific UI implications
that other errors (eg NS_ERROR_OUT_OF_MEMORY) don't have.

f) This looks like it's going to leak one or more instances of
netAddress.  But why allocate one or more instances of netAddress from
the heap at all, rather than just having a single automatic on the
stack?

g) Can you get gordon or darin or somebody who knows the DNS service
to review the IPv6 conversion goop here?  Is it really necessary (and
portable) to call into the regular resolver library's inet_ntoa()
function?

nsLDAPConnection::OnStopLookup
------------------------------
h) We might wanna do the same thing discussed in point c) when calling
LDAPInit()... ie enumerate the possible errors that are likely to
occur.

i) Why is the following line commented out?

+    // mInitListener = 0;

nsLDAPOperation::AbandonExt()
-----------------------------
j) Having three different return values (none of which are commented)
makes this particularly hard to follow.  I think only one int variable
and one nsresult variable are actually necessary to do what you want.

nsLDAPAutoCompleteSession
-------------------------
k) we should add a new state called RESOLVING, instead of reusing
BINDING to have two meanings.
(Reporter)

Comment 18

17 years ago
review comments on peterv's datasource change patch:

generate*Callback() functions
-----------------------------
a) I assume the reason why the |outer| args to these functions going
away is because we've already got access to that info in the form of
aOuter from the lexical parent scope (CreateDelegate).  Is that right?

b) Why are we losing the callerObject arg as well?  I assume it must
still be available from somewhere, seeing as how you're still using it
in the code, but I don't see where.  Is that something that's always
available thanks to JS engine black magic?

c) Why is it necessary to manually implement QueryInterface on these
classes?  Since there's only one non-nsISupports interface, shouldn't
xpconnect take care of that for you automagically?

Comment 19

17 years ago
Hi Leif,

	John is away on, and i am just back from, holiday...

	Code looks OK.

	I don't fully understand the new LDAP changes so these
	observations below may not be relevant:

	1) The connection is now initiated every time a query is performed.
	   Is this necessary? Presumably it only needs to be performed
	   once.

	2) There is also the potential that the DoQuery method could be
	   called by multiple threads (depending on how the component is
	   used e.g. in OpenOffice address book driver). Connection
	   initiation on every query may cause problems.

Paul.

Comment 20

17 years ago
Dmose:
a) and b) You're right about the scoping. This was broken sometime in the past,
which is why I passed in callerObject and outer to the constructors. It has
since been fixed, and so both aOuter and caller (because of the |var caller =
this;| that i added inside the CreateDelegate function) are available to the
inner objects.
c) Without the QueryInterface function, proxyMgr.getProxyForObject fails when
trying to QI the objects to nsILDAPMessageListener. I didn't know XPConnect
supported an explicit QI to an interface on an arbitrary JS object that just
implements the interface without having a QI function. Are you sure that is the
case?
(Reporter)

Comment 21

17 years ago
peterv: OK, I understand why the QIs are necessary now, thanks to jband.  Can
you add a comment next to 'var caller = this' to explain a bit about what's
going on?  With those changes, r=dmose.
(Assignee)

Comment 22

17 years ago
Created attachment 48899 [details] [diff] [review]
Changes as per reviewer comments
(Assignee)

Comment 23

17 years ago
Requesting SR from David

-- Leif

Comment 24

17 years ago
sr=bienvenu, have you checked that no memory leaks have been introduced?
(Assignee)

Comment 25

17 years ago
Created attachment 49145 [details] [diff] [review]
Fixes some "state" issues in LDAP autocomplete session

Comment 26

17 years ago
Is this landing soon?
(Reporter)

Updated

17 years ago
Attachment #47556 - Attachment is obsolete: true
(Reporter)

Updated

17 years ago
Attachment #47615 - Attachment is obsolete: true
(Reporter)

Updated

17 years ago
Attachment #47895 - Attachment is obsolete: true
(Reporter)

Updated

17 years ago
Attachment #48320 - Attachment is obsolete: true
(Reporter)

Updated

17 years ago
Attachment #48899 - Attachment is obsolete: true
(Reporter)

Comment 27

17 years ago
OK, here's a few more comments on the most recent patch.  They're all minor, so
once you change them in your code (which shouldn't take more than a few
minutes), you've got r=dmose; I don't need to re-review it.

nsLDAPAutoCompleteSession.cpp
-----------------------------
a) In the very first change in the patch, please add a comment
explaining why it's reasonable to return NS_OK if the state is already
INITIALIZING (I believe it is, but it's worth commenting that it's
ONLY reasonable to do this at that point in the function, after
mSearchString has been re-set)

b) The error/log messages that used to be in InitLDAPConnection but
are now in OnLDAPInit need to have their text changed to indicate
this.

c) In the error handling after mConnection->Init() in InitConnection,
a PR_LOG statement has been removed; please put it back with any
appropriate adjustments.

nsLDAPConnection.cpp
--------------------
d) Reading through nsLDAPConnection::Init, you've asked for the DNS
listener to be called back on the UI thread.  I think it should really
be called back on the current thread, rather than the UI thread.  In
many cases, the current thread will be the UI thread, but at least
we're not forcing it on the caller.

e) How about wrapping the 

+    nsCOMPtr<nsIDNSService> pDNSService(do_GetService(kDNSServiceContractId, &rv));

something like this:

+    nsCOMPtr<nsIDNSService>
+
pDNSService(do_GetService(kDNSServiceContractId, &rv));

so it's under 80 cols.
(Assignee)

Comment 28

17 years ago
Created attachment 49375 [details] [diff] [review]
Fixes as per dmose review
(Assignee)

Updated

17 years ago
Target Milestone: mozilla0.9.4 → mozilla0.9.5

Comment 29

17 years ago
Nominating nsbranch+, adding chofmann to CC.
Keywords: nsbranch+
(Assignee)

Comment 30

17 years ago
Checked into trunk.

-- Leif
(Assignee)

Comment 31

17 years ago
Jussi, should I mark this bug keyword "PDT" ? It's been checked into the trunk.

-- Leif
(Assignee)

Updated

17 years ago
Attachment #49145 - Attachment is obsolete: true

Updated

17 years ago
QA Contact: olgac → yulian
Whiteboard: PDT

Comment 32

17 years ago
Please come to the pdt tomorrow at noon to discuss approval
QA Contact: yulian → olgac
Whiteboard: PDT
(Assignee)

Updated

17 years ago
Whiteboard: PDT

Comment 33

17 years ago
Pls update the final Patch Status with r and sr =. 

Let's get Hong to run it through a test plan. Let's get it through its paces.

Comment 34

17 years ago
+        mState = (mState == BINDING ? UNBOUND : BOUND);


I would do mState = (mState == BINDING) ? UNBOUND : BOUND; 

though they're  equivalent.

+    rv = ldapOperation->SimpleBind(NULL);

should be nsnull.

don't need another patch, sr=bienvenu

Comment 35

17 years ago
Using 20010920 & 20010921 Linux trunk builds.

LDAP autocomplete working well against various Directory servers.

Comment 36

17 years ago
Please test this on windows and mac as well.  Also, please ask cpd folks to
enable LDAP based mail address autocompletion on their trunk builds so that we
get broader QA coverage.  Also, please check Talkback data to see if there are
any new crashers since you checked this in on the trunk that could be caused by
these changes.  Thanks!

Comment 37

17 years ago
Verified with 20010924 trunk builds on Wins NT, 2k and XP and MacOS 9.2.

Comment 38

17 years ago
Reassign QA contact to myself
QA Contact: olgac → yulian

Comment 39

17 years ago
check it in - PDT+
Whiteboard: PDT → PDT+
(Assignee)

Comment 40

17 years ago
Created attachment 50812 [details] [diff] [review]
There's code on the branch not on the trunk which conflicts with my patch

Comment 41

17 years ago
there is some code that does 

if (NS_FAILED(rv)) {
...
return NS_ERROR_FAILURE.

Unless there's a good reason to drop the actual error code, it would be better
to return the real rv error.

Other than that, sr=bienvenu

Comment 42

17 years ago
    // get a proxy object so the callback happens on the main thread
     //
-    rv = NS_GetProxyForObject(NS_CURRENT_EVENTQ,
+    rv = NS_GetProxyForObject(NS_UI_THREAD_EVENTQ,
                               NS_GET_IID(nsILDAPMessageListener), 
-                              NS_STATIC_CAST(nsILDAPMessageListener *, this),
+                              NS_STATIC_CAST(nsILDAPMessageListener *, this), 
                               PROXY_ASYNC | PROXY_ALWAYS, 
                               getter_AddRefs(selfProxy));


Why do you want to change it to UI_THREA_EVENTQ? I think there is no UI thread
when this autoconfig->prefldap stuff happens and that's why I used NS_CURRENT_EVENTQ


    if (NS_FAILED(rv)) {
-        FinishLDAPQuery();
+        NS_ERROR("nsPrefLDAP::InitConnection(): couldn't "
+                 "create proxy to this object for callback");
         return NS_ERROR_FAILURE;
     }

FinishLDAPQuery() release the variable which is used to control the event loop
so it must be called in all error conditions otherwise we will hang when that
particular error happens. 

Rest looks fine. 
(Assignee)

Updated

17 years ago
Attachment #50812 - Attachment is obsolete: true
(Assignee)

Comment 43

17 years ago
Created attachment 50821 [details] [diff] [review]
Fix to nsPrefsLDAP, code on branch, not trunk
(Assignee)

Updated

17 years ago
Attachment #50821 - Attachment is obsolete: true
(Assignee)

Comment 44

17 years ago
Created attachment 50822 [details] [diff] [review]
3rd attempt to get the nsPrefsLDAP branch code in shape...
(Assignee)

Updated

17 years ago
Attachment #50822 - Attachment is obsolete: true
(Assignee)

Comment 45

17 years ago
Created attachment 50823 [details] [diff] [review]
Argh, wrong file :( Last patch for nsPrefsLDAP

Comment 46

17 years ago
Comment on attachment 50823 [details] [diff] [review]
Argh, wrong file :( Last patch for nsPrefsLDAP

looks good. r=mitesh
(let me try to submit it right away before you post another patch :-)
(Assignee)

Comment 47

17 years ago
Yeah, sorry about that Mitesh, I rushed the first patch, and I was cutting and
pasting too much. :( Thanks for the quick review.

-- leif

Comment 48

17 years ago
The only nit I have is this line:

+    if NS_FAILED(rv) {

...looks quite naked without parentheses around the condition.  I'd bet some
compiler out there will find a reason to choke on that.

Fix that, and you have sr=hewitt
(Assignee)

Comment 49

17 years ago
Nice catch! :)

Checked in on 0.9.4 branch.
(Assignee)

Comment 50

17 years ago
Checked in on branch as well.
Status: ASSIGNED → RESOLVED
Last Resolved: 17 years ago
Resolution: --- → FIXED

Comment 51

17 years ago
Verified with 20010928 branch builds on windows, Linux and Mac.

Configure to unreachable server as following:

216.38.134.163
216.38.134.164
yulian.mcom.com 208.12.32.156

Autocomplete got LDAP server connection failed: error code 91, can't connect to
the LDAP server.
Status: RESOLVED → VERIFIED

Comment 52

17 years ago
I was testing against bug 83023 and noticed that this patch now causes an 
"Unhandled exception in mozilla.exe (MOZLDAP.DLL) 0xC0000005: Access Violation"
on Windows (NT) only in DEBUG mode. The exception happens on the following line.

nsLDAPConnection::OnFound(...)

if(!aHostEnt->hostEnt.h_add_list || !aHostEnt->hostEnt.h_addr_list[0])
{
	...
}

This also happens if you manually add an LDAP Address book to the prefs.js file
against the head and attempt to access the LDAP Address Book. I was unable to
re-create this on Linux or WIN optimized build.

I want to put down this marker. I will investigate this closer when I get a
chance.
(Reporter)

Comment 53

17 years ago
John: thanks for the note... we think we've got a handle on that crasher; see
bug 102227 for details.

Comment 54

17 years ago
I re-tested against the head today and I can confirm that this problem is now
gone with the fix for bug 102227.
You need to log in before you can comment on or make changes to this bug.