Closed
Bug 131447
Opened 23 years ago
Closed 23 years ago
races in SimpleBind and SearchExt
Categories
(Directory :: LDAP XPCOM SDK, defect, P1)
Directory
LDAP XPCOM SDK
Tracking
(Not tracked)
VERIFIED
FIXED
mozilla1.0
People
(Reporter: dmosedale, Assigned: dmosedale)
References
Details
Attachments
(1 file, 1 obsolete file)
17.49 KB,
patch
|
dmosedale
:
review+
Bienvenu
:
superreview+
asa
:
approval+
|
Details | Diff | Splinter Review |
While testing the LDAP C SDK 5.0 changes in the browser, I found a race condition in the XPCOM version of SearchExt where the first entry or entries from the directory server can be lost. Inspecting the code reveals that SimpleBind has the same race. The problem is that the new nsILDAPOperation isn't added to the connection's PendingOperations array until after the operation has been kicked off. So if a result is received from the server quickly enough, the connection won't be able to figure out where to call the result back to, and it just drops the result on the floor.
Assignee | ||
Comment 1•23 years ago
|
||
Apparently the LDAP C SDK v5 is faster at certain things, and as a result, the SearchExt race gets hit in autocomplete in debug builds but not optimized builds on my machine (800 Mhz Linux box). How to reproduce: In the autocomplete dropdown, type several letters that will return a bunch of hits. After the menu drops down, type one more letter that should narrow the search but still include the first person on the list. Notice that as the menu updates, that first entry incorrectly disappears. Inside of Netscape, I've been using "duc" and "duca", both of which should return ducarroz. For whatever reason, I've only seen the SearchExt race being lost so far. If the SimpleBind race were to be lost, LDAP functionality would just silently fail. The fact that I'm only seeing the SearchExt race in debug builds and have not yet seen the SimpleBind race may just be because I've got a somewhat fast machine; it's possible that on lower end machines, these would get hit more. The fix, however, is pretty easy.
Updated•23 years ago
|
Assignee | ||
Comment 2•23 years ago
|
||
Ugh. The easy fix that I had in mind won't work. So far, all the other ways I've thought of are either fairly complex, or require changing fundamental assumptions of the way the XPCOM SDK work.
Assignee | ||
Comment 3•23 years ago
|
||
OK, I've figured out a way to fix this that's not too byzantine. Right now, nsLDAPConnectionLoop calls ldap_result with LDAP_MSG_ANY (or whatever it's called) to get any results back that the C SDK has. Instead, I can make nsLDAPConnectionLoop enumate mPendingOperations and call ldap_result() once for each operation. This way it'll never get any messages back from ldap_result before it's prepared to handle them.
Assignee | ||
Comment 4•23 years ago
|
||
Comment 5•23 years ago
|
||
Comment on attachment 75304 [details] [diff] [review] race condition fix, v1 r=sspitzer
Attachment #75304 -
Flags: review+
Comment 6•23 years ago
|
||
In addition to reviewing the patch, I've tested the patch (autocomplete, ldap replication and ldap addressbook search) and it works fine.
Comment 7•23 years ago
|
||
Comment on attachment 75304 [details] [diff] [review] race condition fix, v1 I take that back, I think this patch is causing my CPU to ping.
Attachment #75304 -
Flags: review+
Comment 8•23 years ago
|
||
after doing some ldap work, my CPU is pinged. when I break, I'm in the while (1) loop in nsLDAPConnectionLoop::Run() it looks like we're not exiting out. PL_HashTableEnumerateEntries(PLHashTable * 0x05666c20, int (PLHashEntry *, int, void *)* 0x10029590 _hashEnumerate(PLHashEntry *, int, void *), void * 0x08f9ff08) line 427 + 8 bytes nsHashtable::Enumerate(int (nsHashKey *, void *, void *)* 0x01c715d2 CheckLDAPOperationResult(class nsHashKey *,void *,void *), void * 0x05bd7588) line 361 + 21 bytes nsLDAPConnectionLoop::Run(nsLDAPConnectionLoop * const 0x05bd7588) line 831 nsThread::Main(void * 0x067823f0) line 120 + 26 bytes _PR_NativeRunThread(void * 0x03c5fe08) line 433 + 13 bytes _threadstartex(void * 0x0522caf8) line 212 + 13 bytes KERNEL32! 77e92ca8()
Comment 9•23 years ago
|
||
Comment on attachment 75304 [details] [diff] [review] race condition fix, v1 needs work. after an LDAP operation (like autocomplete), the CPU pings. investigating...
Attachment #75304 -
Flags: needs-work+
Comment 10•23 years ago
|
||
with out this patch, I see the same behaviour, it just doesn't ping the CPU. once started, it runs until exit. at exit, it breaks out here: if (!conn) { mWeakConn = 0; return NS_OK; } It might be that the new code, with it's clone() is too expensive, so the CPU is kept busy. I'll continue to investigate. nsLDAPConnectionLoop::Run(nsLDAPConnectionLoop * const 0x041a8b80) line 654 nsThread::Main(void * 0x041a8ca0) line 120 + 26 bytes _PR_NativeRunThread(void * 0x041a8dc0) line 433 + 13 bytes _threadstartex(void * 0x041a8fa0) line 212 + 13 bytes KERNEL32! 77e92ca8()
Comment 11•23 years ago
|
||
yes, the code with the patch behaves the same way. perhaps (as dmose suggested privately over aim) we're doing to much processing on that thread, even when nothing is going on, and that is pinging the CPU.
Comment 12•23 years ago
|
||
Attachment #75304 -
Attachment is obsolete: true
Comment 13•23 years ago
|
||
when dmose gets better, I'll ask him to review my change.
Assignee | ||
Comment 14•23 years ago
|
||
Comment on attachment 75684 [details] [diff] [review] same patch, but only clone the pending operations hashtable if it has entries. otherwise, sleep briefly. we don't want the connection thread to eat the CPU. r=dmose for sspitzer's changes
Assignee | ||
Updated•23 years ago
|
Attachment #75684 -
Flags: review+
Comment 15•23 years ago
|
||
Comment on attachment 75684 [details] [diff] [review] same patch, but only clone the pending operations hashtable if it has entries. otherwise, sleep briefly. we don't want the connection thread to eat the CPU. sr=bienvenu for Seth's change as well (and the rest is sr=sspitzer, I believe)
Attachment #75684 -
Flags: superreview+
Comment 16•23 years ago
|
||
Comment on attachment 75684 [details] [diff] [review] same patch, but only clone the pending operations hashtable if it has entries. otherwise, sleep briefly. we don't want the connection thread to eat the CPU. sorry, make that sr=bienvenu on the whole patch, r=sspitzer on all of it except his change, which was r=dmose.
Comment 17•23 years ago
|
||
Comment on attachment 75684 [details] [diff] [review] same patch, but only clone the pending operations hashtable if it has entries. otherwise, sleep briefly. we don't want the connection thread to eat the CPU. a=asa (on behalf of drivers) for checkin to the 1.0 trunk
Attachment #75684 -
Flags: approval+
Assignee | ||
Comment 18•23 years ago
|
||
Fix checked in.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Comment 19•23 years ago
|
||
20020422 trunk and branch builds on Wins. Autocomplete and ldap addressbook search work fine.
You need to log in
before you can comment on or make changes to this bug.
Description
•