Closed Bug 282470 Opened 21 years ago Closed 9 months ago

LDAP* handles don't allow free-threading

Categories

(Directory Graveyard :: LDAP C SDK, defect)

x86
Windows XP
defect
Not set
minor

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: david, Assigned: mcs)

Details

User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
Build Identifier:

Not sure if this is 'by design' or not, but I discovered it while modifying Directory Server 7.1's replication code to allow async sending of updates to a consumer (async with respect to the results coming back). What I did was to send operations on the ld in one thread, while reading the results in a second thread (calling ldap_result()). I was using the prldap code.

What I saw was that the application deadlocked. This was because the SDK blocks in poll while holding one or more locks associated with the ld. The second thread attempts to acquire one of those locks, and the poll never returns: it is waiting for a response that will never come, because the first thread is blocked.

I was able to work around this with some code in the application: call the SDK with zero timeout and implement my own backoff timer mechanism. However, I felt it was worthwhile to file a bug in case someone felt motivated to 'fix' the SDK to allow free-threading of its handles.

Reproducible: Always

Steps to Reproduce:
1. Write an application that opens an LDAP* handle, then sends operations in thread A, e.g. by calling ldap_add(), and receives results in thread B by calling ldap_result().
2. This application will deadlock after a while, especially if there's high latency on the network path, resulting in many operations in transit on the TCP connection.

Actual Results: I re-wrote my code ;)

Expected Results: Not deadlocked. (Or possibly a refusal to let me access the same handle from two threads, with some useful error code, if that's just not supposed to be allowed.)

This is a low priority to fix: I was able to work around it, albeit with some rather nasty code in the application, and suboptimally, since I'm now sleeping between calls to avoid a busy-wait spin around the SDK when calling it with zero timeout.
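For anyone hitting the same problem, the workaround described above (zero-timeout calls plus an application-side backoff timer) might look roughly like the sketch below. It is not the reporter's actual code. The names `poll_once_fn`, `wait_with_backoff`, `sleep_ms` and `fake_poll` are hypothetical, and the callback is assumed to follow the ldap_result() return convention (0 = nothing available yet, positive = a result arrived, negative = error). It also assumes a POSIX `nanosleep()`; on Windows a different sleep call would be needed.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <time.h>

/* Hypothetical callback: makes ONE non-blocking attempt to fetch a result.
 * Assumed convention (same as ldap_result()): 0 = nothing yet,
 * > 0 = a result was received, < 0 = error. */
typedef int (*poll_once_fn)(void *ctx);

/* Sleep for the given number of milliseconds (POSIX nanosleep assumed). */
static void sleep_ms(long ms)
{
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

/* Poll without blocking inside the SDK, sleeping between attempts.
 * The delay starts at min_ms and doubles up to max_ms. Returns the first
 * non-zero value from poll_once, or 0 once roughly budget_ms has been
 * spent sleeping without a result. */
int wait_with_backoff(poll_once_fn poll_once, void *ctx,
                      long min_ms, long max_ms, long budget_ms)
{
    long delay, waited = 0;

    if (min_ms < 1)
        min_ms = 1;              /* avoid a zero-delay busy spin */
    if (max_ms < min_ms)
        max_ms = min_ms;
    delay = min_ms;

    for (;;) {
        int rc = poll_once(ctx);
        if (rc != 0)
            return rc;           /* result or error: hand it to the caller */
        if (waited >= budget_ms)
            return 0;            /* gave up: nothing arrived in time */
        sleep_ms(delay);         /* no SDK lock is held while sleeping */
        waited += delay;
        delay *= 2;
        if (delay > max_ms)
            delay = max_ms;
    }
}

/* Stand-in poller for illustration only: reports "nothing yet" until the
 * counter pointed to by ctx reaches zero, then reports a result. */
int fake_poll(void *ctx)
{
    int *remaining = ctx;
    if (*remaining > 0) {
        (*remaining)--;
        return 0;
    }
    return 1;
}
```

In a real application the callback would presumably wrap a call such as ldap_result(ld, LDAP_RES_ANY, 0, &tv, &msg) with tv set to {0, 0}, so the SDK returns immediately instead of sitting in poll with the ld's locks held. The sending thread then gets a chance to acquire those locks while the receiver sleeps.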
I am not surprised there are problems in this area. When locks were originally added to libldap to make it thread safe, I think things worked OK (I remember testing async ldap_search plus ldap_result at one point). Then someone complained about the lack of fine-grained locking, and a lot of complexity (and many more locks) was added. This is not a trivial problem to solve because of the messy result handling code (which may need to be rewritten). I will try to find time to do some analysis soon to see how we should fix this. Input is welcome.
Status: NEW → RESOLVED
Closed: 9 months ago
Resolution: --- → INCOMPLETE