Closed Bug 128086 Opened 23 years ago Closed 23 years ago

LDAP replication of all entries

Categories

(MailNews Core :: LDAP Integration, defect, P1)

All
Windows NT

Tracking

(Not tracked)

VERIFIED FIXED
mozilla1.0

People

(Reporter: rdayal, Assigned: rdayal)

References

Details

Attachments

(3 files, 14 obsolete files)

65.44 KB, patch
Details | Diff | Splinter Review
57.58 KB, patch
sspitzer
: review+
Bienvenu
: superreview+
Details | Diff | Splinter Review
9.96 KB, patch
dmosedale
: review+
sspitzer
: superreview+
Details | Diff | Splinter Review
This bug tracks implementation of ldap replication used for downloading all entries to the local DB.
First cut of the LDAP replication implementation. A new replication thread is created for replicating all entries of a directory. This thread uses do_CreateInstance for creating objects it needs to use but this throws an exception for nsNativeComponentLoader being not thread safe. This patch is here so that the new ldap replication interfaces, classes and implementation can be taken a look at as well as code from this patch could be used later once nsNativeComponentLoader is fixed to be thread safe.
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla1.0
The above patch also changes the ISupports impl in nsNativeComponentLoader to avoid the exception (debug assertion) mentioned above and work for replicating all entries. However just changing the ISupports impl would not really make the nsNativeComponentLoader thread safe and could lead to problems when used in real cases doing other things besides replication. This XPCOM problem is scheduled to be fixed for 1.1. The current solution for this problem is to use "new" to create the objects needed and manage them locally.
Several XPCOM classes are not thread safe so using XPCOM to create objects or using nsCOMPtr on a separate / multiple thread is not a good idea. However on the replication thread we need to create LDAP Connection and Operation object but the header files for the class implementing interfaces for these are not exported. I am trying to see if i can create the objects on main thread by the replication service class and used in the replication thread using get functions, we manage them appropriately so that they are not released till replication is complete. Most processing is anyway done by the nsILDAPMessageListener implementer (a replication class) which will be created using 'new' and proxied to the replication thread. If the strategy doesnot work, i want to know if we can export the LDAP XPCOM SDK header files ?
Rajiv: I wouldn't support that workaround, no. I've talked to dougt, and he's agreed to take a look at the threadsafety problems sooner; I've marked this bug as depending on that tracking bug, and am emailing drivers to find out their feelings on accepting such a patch for 1.0.
Depends on: 101976
No longer depends on: 101976
Yep i too feel exporting LDAP XPCOM SDK headers is not good, clients should use the interfaces. Anyway, i have the stuff working using the startegy mentioned above in comment #3. I create the nsILDAPOperation and nsILDAPConnection object in the main thread using do_CreateInstance and then the replication thread just gets these and use them to fire the query. Most of the job of results processing is done by the nsReplicationDataProcessor : nsILDAPMsgListener which is created and used in the replication thread.
No longer depends on: 101976
As per the comments above this patch creates the LDAP objects in the main thread which is used by the replication thread to avoid the usage of do_CreateInstance. Also improved code with more checks for null, etc. A working version. Three things to be taken care in the next patch : 1. Need to use a weak reference of Query object in the DataProcessor object to avoid any leaks. 2. Mork is not thread safe so need to change code to proxy the nsAddrDatabase calls to the main thread. Doing this also gives debug assertion though, in CreateNewCardAndAddToDB call, need to figure out if the the card is also proxied to the main thread will this go away. 3. Need to delete the entries if the Mork DB file already exists for the replicated DB file.
Attachment #72004 - Attachment description: initial patch not using new rather than do_CreateInstance → initial patch using new rather than do_CreateInstance
Since Mork is not thread safe we need to proxy all the AddrDatabase calls as well as the card objects it uses to the main thread, this pretty much means that all the expensive tasks of file processing and disk usage (opening and commiting) will be happening in the main thread. Also the LDAP data retrieval happens in the separate connection thread, thus what replication thread really does is fires the query and passes the data from the LDAP connection thread to Mork in the main thread. The question then arises is there really a point in the current situation, when do_CreateInstance and related XPCOM classes as well as the Mork classes are not thread safe, to have a separate replication thread ?
This is updated patch that takes care of all the 3 items in comment #6 above. This patch now proxies / queues all AddrDatabase calls to the main thread since Mork is not thread safe. It however gives a annoying but not harmful debug assertion for the card property, even though the card is proxied to the main thread, the NS_GetProxyForObject does not change the owningthread for the object and hence when the CreateNewCardAndAddToDB function does a QueryReference the assertion is thrown. The processing is however happening on the main thread so there would not be a Mork related problem.
Attachment #72004 - Attachment is obsolete: true
The more i look deeper into it, the more i am realizing that infact doing replication on another thread could be risky since several of the XPCOM classes and implementation are not thread safe as well as all the AB classes and Mork is also not thread safe. Even do_GetWeakReference is not safe (bug # 46358), wrapping the call to it in a lock will not really solve the problem if there are multiple threads calling it at the same time. Also do_GetService is not thread safe and could lead to problem if the real object is not created on the main thread. I use it for getting the nsIAddrDatabase (db factory) object, so i will have to create this too on the main thread and pass it to the replication. With this pretty much nothing is left for the poor replication thread except to add delay by proxing / queing all calls to the main thread. Thus having a separate replication thread is not a good idea as well as risky. I talked to Dan and explained him all the scenarios, he agrees with not having this other replication thread, Seth can you please also look into this, thanks.
Keywords: nsbeta1+
Priority: -- → P1
it sounds like rdayal is suggesting we do this: do the ldap searches (that we need for LDAP replication) on another thread proxy results back to the UI thread, which will be the thread that accesses the addressbook database. this makes sense to me, especially since this is how all the current LDAP / AB code works. I was suprised to see all the bugs about component manager and various things non being threadsafe. given that, and how AB database code is also not thread safe, this seems like the right approach. I'm a little worried about the ldap thread posting too many events back to the UI thread, and starving the UI thread.
Replication in this patch is done on the main thread, tested downloading all entries (around 25,000) from netscape phone book using this patch. This patch uses DirPrefs (although it does not really provide any advantage for reading or even saving directory prefs) for accessing the prefs for the selected directory. Although only a couple of prefs are used in this part of code, several more will be used in the derived ChangeLog implementation which will use the same data memeber defined here. Currently doing more testing for cancelReplication implemented in this patch. Meanwhile Dan can u please start taking a look at this patch. thanks.
In the above patch in nsAbLDAPReplicationService::StartReplication i should initialize rv=NS_ERROR_NOT_IMPLEMENTED and not rv=NS_OK so that if DecideProtocol() returns something that is not implemented/supported, an error is returned.
This is the updated patch that takes care of following : - Cancel replication related changes to Abort and Done to avoid any mutex deadlock. - makes a backup of existing replication file so that the user gets the original replicated file if he/she cancels and avoid any already downloaded data loss. - Use Commit and ForceClosed to clear up MDB cache when Done. - tested with more than 25,000 records multiple times on Windows and Linux - change nsDirPrefs so that a DIR_Server on stack/data-member can be used, thus not requiring a malloc/calloc. I think this patch is all set for review. Dan, can u please review this patch. thanks. - Rajiv.
Attachment #71709 - Attachment is obsolete: true
Attachment #72757 - Attachment is obsolete: true
some quick comments: 1) PR_Sleep (10) ; // give others a chance if that sleep call is valid (I'm not sure it is, I'll leave that to dmose for his review) you'll want to do something like this: PR_Sleep(PR_MillisecondsToInterval(10/* msecs */)); 2) return rv ; Please don't add that space. I've noticed the sun contributed code does it, it drives me crazy. I think a lot of this code is duplicated sun code (which is acceptable, there's another bug, or there will be, about moving all this LDAP connection duplicated code to one place.) back to the space, I've been removing it when ever I touch their code that has it. Please don't add any more code that does it. (.h, .js and .cpp) 3) how did you generate your uuids? two of them look way to similar, based on my experience. did you generate one, and then change one number? +// {5414fff0-263b-11d6-b791-00b0d06e5f27} +// {5414fff1-263b-11d6-b791-00b0d06e5f27} if so, use uuidgen or in #mozilla, do /msg mozbot uuid. remember this CID for later... +// {ece81280-2639-11d6-b791-00b0d06e5f27} 4) there are some alerts in there alert ("in state change") ; alert ("Unable to create Replication Service") ; 5) +<button id="replicate-button" label="ReplicateNow" oncommand="Replicate();" /> +<button id="cancelReplicate-button" label="Cancel Replication" oncommand="CancelReplicate();" /> is that just test UI, or is that real UI? If real, those need to be moved to a string bundle. 6) readonly attribute PRInt32 replicationState ; readonly attribute PRInt32 protocolUsed ; you mean: readonly attribute long replicationState; readonly attribute long protocolUsed; right? (notice, no extra space) 7) + void Init (in nsIAbLDAPReplicationQuery query, in nsIWebProgressListener progressListener) ; interCaps, so this should be void init(); 8) hmm, where have I seen these UUID before? +[scriptable, uuid(5414FFF0-263B-11d6-B791-00B0D06E5F27)] +[scriptable, uuid(ECE81280-2639-11d6-B791-00B0D06E5F27)] don't use the same uuid for the interface as for the implemention! 9) + if (NS_FAILED (rv) || !mConnection) return rv ; remember, do: if (NS_FAILED (rv) || !mConnection) return rv ; so we can set the break point on the return. 10) instead of: +NS_IMETHODIMP nsAbLDAPProcessReplicationData::GetProtocolUsed(PRInt32 *aProtocolUsed) +{ + if (aProtocolUsed) { + *aProtocolUsed = mProtocol ; + return NS_OK ; + } + else + return NS_ERROR_NULL_POINTER; +} do NS_IMETHODIMP nsAbLDAPProcessReplicationData::GetProtocolUsed(PRInt32 *aProtocolUsed) { NS_ENSURE_ARG_POINTER(aProtocolUsed); *aProtocolUsed = mProtocol; return NS_OK; } 11) + if (!aMessage) return NS_ERROR_NULL_POINTER; assuming this is not supposed to happen, make that: NS_ENSURE_ARG_POINTER(aMessage); so that we'll assert.
> is that just test UI, or is that real UI? If real, those need to be moved > to a string bundle. I meant .dtd file.
FYI, my fix for #126134 has landed. so if you update, you'll be able to test your replication code by using addressbook and autocomplete by going offline.
Depends on: 126134
1) PR_Sleep is required in order to not starve the other threads. At this moment there is LDAP Connection thread running which queues up all the entries recieved to this proxy object. Putting this sleep ensures that other threads / UI does not starve. 2) I will change the style, No none of the code here is duplicated sun code, if you can point out the specifics i would like to reuse existing code rather than duplicate any. Also i too use a similiar style of putting the semicolon after the statement after a space, i did not get any comments earlier and did not see it in Mozilla style guide. I will change the entire code to put the semicolon immediately after the statement. 3) No i have not generated these by hand, i use guidgen to generate these guids / uuids. If u generate two guids/uuids immediately one after the other their difference would be only 1. 4) and 5) this is only for testing, the UI part and the progress dialog will be taken care in bug # 120421. 6) and 9) will make the styling change. Will add this in the Mozilla style guide too. 8) These UUIDs are used only in the idl file to define the interface and in nsAbBaseCID.h to define a macro (CID) that can be used to refer to it and the interface. They are not used for any implementation. 10) and 11) will make the style change to use the macro NS_ENSURE_AGR_POINTER.
Comment on attachment 73186 [details] [diff] [review] updated patch replicating on main thread I'm about a third of the way through the patch, here's an initial set of comments; more to come: As far as the PR_Sleep being necessary, have you tested without it since moving to PROXY_SYNC? I would have expected PROXY_SYNC to be sufficient. a) Why is the uriloader dependency necessary? For WebProgressListener? nsAbBaseCID.h ------------- b) how about using a hyphen and skipping the interCaps in the various contract-ids to more closely fit the predominant pattern here? nsIAbLDAPProcessReplicationData.idl ----------------------------------- c) Why is this interface necessary at all? As far as I can tell, none of these methods or attributes are accessed from another module or from JS. Why not just use nsILDAPMessageListener directly? nsIAbLDAPProcessReplicationQuery.idl ------------------------------------ d) Why is this interface necessary at all? src/Makefile.in --------------- e) Why add the .h files here to EXPORTS? Is someone outside the addressbook module is likely to need or want them? src/nsAbLDAPProcessReplicationData.cpp -------------------------------------- f) the destructor needs to call PR_DestroyLock; not sure if it's necessary to check mLock for 0 before calling that -- check the NSPR docs g) in Init(), why check mLock before calling PR_NewLock() other than with an assertion? Also, if the allocation fails, it seems like NS_ERROR_NOT_AVAILABLE would be a more appropriate return code. h) just so you're aware of it, returning errors through OnLDAPMessage() doesn't really cause them be propagated anywhere interesting. Right now, they're ignored, and they could be fixed to log the error somewhere, but that's as good as it will get. i) in OnLDAPInit: using NULL in C++ (not C) code in the mozilla code base causes issues for some weirdo compilers. Prefer nsnull or 0. nsAbLDAPReplicationService.h ---------------------------- j) the initial #ifndef has a misspelled identifier k) use |NS_IMETHOD| instead of |virtual nsresult| for interface methods to make sure that you get the right calling conventions on all compilers. General ------- l) Some of the new files created have the wrong version of the license boilerplate as well as the wrong copyright years; please use the 1.2 tri-license header (see <http://www.mozilla.org/MPL/>. m) Use DEBUG_rdayal instead of RAJIV_DEBUG. The former will be automagically defined by the build system for you. Even better, use an NSPR log module for logging, so that other people can easily debug problems in this code. n) Instead the idl |string| concept, use the new |ACString|, this avoids unnecessary allocations, and will eventually transparently support ref-counted buffer goodness. o) The patch introduces a bunch of ^Ms at the end of lines, which probably means you're using the cygwin version of CVS. I think there's some conversion-related setting you need to change, or, alternately just grab the non-cygwin version of CVS for windows. p) Don't use C-style casts, they make it harder for the compiler to detect potential casting problems. Instead, use the NS_{STATIC,REINTERPRET,CONST}_CAST macros.
Here's some more comments; I'm about 2/3 done reviewing, so there'll be more to come. general ------- q) let's get rid of the DIRPrefs usage r) Unless you know otherwise for a specific XPCOM interface getter, you shouldn't need to check anything other than rv. As an example, the || !mConnection here isn't necessary: + mConnection = do_CreateInstance(NS_LDAPCONNECTION_CONTRACTID, &rv); + if (NS_FAILED (rv) || !mConnection) return rv ; s) There's no need to incur the cost of CreateInstance instead of just new for objects in the same module, unless there's some reasonable expectation of being able to swap out the object you're creating for a compatible one (which is awfully rare around here, I think). ProcessReplData --------------- t) nsAbLDAPProcessReplicationData::OnLDAPBind(): how about adding a comment here about how if this code ever changes to receive callbacks on something other than the main thread, calling Query->QueryAllEntries before CreateABForReplicationDir can will no longer be ok. u) ::CreateABForReplicatedDB(): use of nsFileSpec is deprecated. Can't we just NS_GetSpecialSystemDirectory directly instead of going through what appears to be a pointless wrapper for it and then converting back to an nsIFile? v) ::CreateABForReplicatedDB(): does this line: + (*dbPath) += mReplInfo->fileName ; really do the right thing? w) ::CreateABForReplicatedDB(): how about replicating to the new file name, and postpone this MoveTo until after you've gotten a succesfully flushed database (ie in OnLDAPSearchResult)? This should simplify various cleanup code a bunch. x) ::OnLDAPSearchResult(): none of the calls to mReplicationDB methods check their return values, which might cause problems if something bad happens, like a disk full error. y) ::Done(): OnStateChange won't get called if QueryReferent fails. Is this the intent? z) ::Abort(): even if mOperation->Abandon() fails, shouldn't the rest of the code in this function be executed (ie don't return immediately)? ReplQuery --------- A) nsAbLDAPReplicationQuery::Init(): There are problems here where this method could fail in the middle of or after the call to InitLDAPData, but mInitialized won't be set, so the destructor won't clean things up correctly. However, getting rid of the DIR_Server stuff may make these issues go away. B) ::StartReplication(): NS_ERROR_NOT_AVAILABLE would probably be more correct than NS_ERROR_OUT_OF_MEMORY C) ::ConnectToLDAPServer(): the mConnection->Init() call should use NS_ConvertUTF8toUCS2, not NS_ConvertASCIItoUCS2. D) ::Done(): Check rv after the getService, not replicationService itself.
a) That is correct, uriloader dependency is needed for WebProgressListener. b) sure i will change the contract id names. c) and d) The interfaces provide me the XPCOM framework to manage objects including the useful weak reference that i need to avoid any leaks due to cyclical references. e) i will remove the headers from export in makefile f) and g) i check the mLock to be 0/nsnull for safe programming, if mLock is already allocated we dont need to allocate again, in this case however since we are sure this will be the first function to be called (even in future) before any other methods of the class we could avoid the check. h) ok. i will still keep them as it is just in case the LDAP code is modified to log the errors in future. i) sure ... i use nsnull/0 most of places but might have this one just out of past 9 yrs of habit. j) thanks i will change the misspelled identifier. k) sure i will change to use the NS_IMETHOD macro instead. l) i will change the license boilerplate version #. m) sure i will change the debug tag to DEBUG_rdayal instead of RAJIV_DEBUG. n) cool i didnt know we can use ACString in idl now, i will change to use ACString in the idls and change the code accordingly. o) i will create the patch on linux and checkin from there so that there are no ^M characters. p) sure i will change to use the NS_...._CAST macros. q) i myself donot feel good about nsDirPrefs (see my bug # 129326) but i am using here since it would be required for me to access the authDN and authPswd for ChangeLog implementation which will be deriving from this class. Also i can use it to write/update several prefs for the change log. I discussed this with Dan already. r) this is just safe programming practice i follow, a caller should always check the pointer data returned from absolutely any function call. s) doing a new and then assigning to an interface's nsCOMPtr object or calling do_CreateInstance is nearly the same overhead. t) Better, i would change to call CreateABForReplicationDir before the query call so that it will not matter even if it ever changes to multithreaded in the future. u) yes i am aware of it that is why i am converting it to nsIFile (see my comments where i am doing the conversion). It is here since the nsIAddrDatabase::Open takes a nsIFileSpec object !! v) Yep it does the right thing. w) will look into this alternate way. x) That is because the AB DB calls made in OnLDAPSearchResult() are Commit and Close, and even if they fail the follow through code should still be processed. y) and z) i will change that, thanks. A) oohh i had mInitialized in the destructor because i was earlier releasing the mOperation object (it is commented now). I can remove the mInitialized check from the destructor since nsDirPrefs.DeleteServerContents checks the pointers before freeing them. Using nsDirPrefs is required as mentioned above. B) i will change to use NS_ERROR_NOT_AVAILABLE instead of NS_ERROR_OUT_OF_MEMORY C) ok i will change to use NS_ConvertUTF8toUCS2 instead of NS_ConvertASCIItoUCS2. D) i think i should check both rv and replicationService.
The rest of my comments: nsAbLDAPReplServer ------------------ E) I notice the service only allows one replication at once. Is there some reason not to allow mutiple replications at once, as long they're not replicating to the same file? If this is the right thing, we're probably gonna need specific failure information communicated to the user, which NS_ERROR_FAILURE probably won't be enough to carry. F) ::StartReplication and ::CancelReplication In both of these methods, a PR_Lock is being held while calling out of module, which is generally considered risky unless you know and can guarantee the exact locking semantics of the function you're calling, Given that we've fallen back to doing all the replication work itself on the main thread, the easiest way to fix this is probably just get rid of mLock entirely. This will be a very minor perf win as well, probably. F) ::CancelReplication() + nsCAutoString prefName (aPrefName) ; + + [ ...] + + if (prefName == mPrefName ) Changing that code to this: + if (nsDependentCString(aPrefName) == mPrefName) will save a copy and some stack space. G) From reading the code, it looks like if the user triggers a cancel, and then the query finishes before the cancel event gets triggered, the cancel will return a failure code. If this is ok, please document it. General ------- H) Why hold mLock for so long? If it's even worth keeping the lock at all (since we're back in a single threaded world), wouldn't holding it only for the call to do_GetWeakReference be sufficient? There are a bunch of places where it's being held somewhat longer than that. I) In some places, mState gets reset when a query is done or aborted, in other places it doesn't. Keeping this consistent would probably be good. It's probably also worthwhile to put in a few NS_ASSERTIONS to make sure that the states are being hit at the right times. J) I don't entirely understand why ProcessReplicationData and Query need to be separate objects from each other, given that there seems to be a one-to-one mapping. Would merging these two make the need for weak references go away?
e) Infact i need to export the headers to use the replication constructors for the nsAbFactory which is in the build directory.
E) Since we are doing the replication data processing on main thread replicating multiple ABs at the same time may actually be an overkill and may give an impression of frozen UI. F) Although replication is happening on the main thread both StartReplication and CancelReplication methods are re-enterant. In order to avoid any conflicts for accessing member data, these locks are required. Further the locking semantics of the calling functions are guaranteed in this case so as not to lead to any deadlocks. Will take another look to see if the scope of the locks could be reduced. F) sure, still learning using the right string classes, last week's techtalk on Strings was useful, tried to apply learnings to the ChangeLog implementation. G) that is correct, i will document the information in the code. H) the lock for do_WeakReference and do_QueryReferent are used since there could be simultaneous async calls. that's right i have changed the ChangeLog implementation already to reduce to hold the lock for it. Will apply it here too. I) Ok will look into it. J) ReplicationProcess and ReplicationQuery perform different functionality. Moreover, once nsILDAPService is enhanced to do a Query using a LDAP URL it would become easy to modify this, most of the changes would be in the Query class, maybe complete code there could be replaced by a few calls to the nsILDAPService.
I wrote: > > nsIAbLDAPProcessReplicationData.idl > > ----------------------------------- > > c) Why is this interface necessary at all? As far as I can tell, none > > of these methods or attributes are accessed from another module or > > from JS. Why not just use nsILDAPMessageListener directly? > > > > nsIAbLDAPProcessReplicationQuery.idl > > ------------------------------------ > > d) Why is this interface necessary at all? > > > > J) I don't entirely understand why ProcessReplicationData and Query > > need to be separate objects from each other, given that there seems to > > be a one-to-one mapping. Would merging these two make the need for > > weak references go away? rdayal replied: > c) and d) The interfaces provide me the XPCOM framework to manage objects > including the useful weak reference that i need to avoid any leaks due to > cyclical references. and > J) ReplicationProcess and ReplicationQuery perform different functionality. That's just the semantics of where you're viewing from: 100 feet up, or 10000 feet up. They're certainly all part of the same action. > Moreover, once nsILDAPService is enhanced to do a Query using a LDAP URL it > would become easy to modify this, most of the changes would be in the Query > class, maybe complete code there could be replaced by a few calls to the > nsILDAPService. The above statement is true, but I don't see why that is an argument for keeping the objects separate. Once the service is available, we'll get to simplify the existing code. What does that have to do with whether the existing code consists of one or two objects? Given that keeping them separate has some disadvantages (extra getters required, weak references, two object allocations instead of one), it seems to me that it would make the code significantly simpler to merge these into a single object that implements nsILDAPMessageListener.
I wrote: > > f) the destructor needs to call PR_DestroyLock; not sure if it's > > necessary to check mLock for 0 before calling that -- check the NSPR > > docs > > > g) in Init(), why check mLock before calling PR_NewLock() other than > > with an assertion? Also, if the allocation fails, it seems like > > NS_ERROR_NOT_AVAILABLE would be a more appropriate return code. rdayal said: > f) and g) i check the mLock to be 0/nsnull for safe programming, if > mLock is already allocated we dont need to allocate again, in this > case however since we are sure this will be the first function to be > called (even in future) before any other methods of the class we > could avoid the check. That comment only really addresses point g, not point f. Anyway, I'm just suggesting this would be better to check for using an assertion, since it should only happen if there's actually a bug in the code, so we shouldn't take the performance hit just to protect us from ourselves. > > x) ::OnLDAPSearchResult(): none of the calls to mReplicationDB methods > > check their return values, which might cause problems if something bad > > happens, like a disk full error. > > x) That is because the AB DB calls made in OnLDAPSearchResult() are Commit > and Close, and even if they fail the follow through code should still be > processed. But if something goes wrong, don't we need to propagate this info to the caller, so the caller can propagate an error message to the UI? > > F) ::StartReplication and ::CancelReplication > > > > In both of these methods, a PR_Lock is being held while calling out of > > module, which is generally considered risky unless you know and can > > guarantee the exact locking semantics of the function you're calling, > > Given that we've fallen back to doing all the replication work itself > > on the main thread, the easiest way to fix this is probably just get > > rid of mLock entirely. This will be a very minor perf win as well, > > probably. > > F) Although replication is happening on the main thread both StartReplication > and CancelReplication methods are re-enterant. In order to avoid any > conflicts for accessing member data, these locks are required. Further the > locking semantics of the calling functions are guaranteed in this case so > as not to lead to any deadlocks. Will take another look to see if the > scope of the locks could be reduced. Rajiv and I just talked about this face to face, and it looks like this locking really only needs to be here in case we someday are able to move this code off to another thread rather than having it all run on the UI thread. I'm not sure whether leaving it in now is the right thing to do, but if so, can we at least document this in some comments somewhere?
>> x) That is because the AB DB calls made in OnLDAPSearchResult() are Commit >> and Close, and even if they fail the follow through code should still be >> processed. >But if something goes wrong, don't we need to propagate this info to >the caller, so the caller can propagate an error message to the UI? The caller here as u know is the LDAP Connection thread / nsLDAPConnection, i dont think it displays error messages or log messages, but i will still pass them so that sometime when nsLDAPConnection is enhanced to log error messages, it would. >> F) Although replication is happening on the main thread both StartReplication >> and CancelReplication methods are re-enterant. In order to avoid any >> conflicts for accessing member data, these locks are required. Further the >> locking semantics of the calling functions are guaranteed in this case so >> as not to lead to any deadlocks. Will take another look to see if the >> scope of the locks could be reduced. >Rajiv and I just talked about this face to face, and it looks like >this locking really only needs to be here in case we someday are >able to move this code off to another thread rather than having it all >run on the UI thread. I'm not sure whether leaving it in now is the >right thing to do, but if so, can we at least document this in some comments >somewhere? And also in case if there are multiple callers calling this from different threads. Since it is a service class it should be self sufficient in this respect. I would put comments as suggested by Dan.
>> J) ReplicationProcessor and ReplicationQuery perform different functionality. >That's just the semantics of where you're viewing from: 100 feet up, >or 10000 feet up. They're certainly all part of the same action. I guess we should look from 100 feet up when we are doing implementation. One class is performing the query and the other is processing the results from the query, clearly two different functionalities. >> Moreover, once nsILDAPService is enhanced to do a Query using a LDAP URL it >> would become easy to modify this, most of the changes would be in the Query >> class, maybe complete code there could be replaced by a few calls to the >> nsILDAPService. > The above statement is true, but I don't see why that is an argument > for keeping the objects separate. Once the service is available, > we'll get to simplify the existing code. What does that have to do > with whether the existing code consists of one or two objects? Yes this is definitly not an argument for having one or two objects. What i meant here (Moreover,) is when we do implementation we also think about code managebility, if one knows that code might/would change in future, keeping code that would largely change organized separately would help at the time doing the change, more so if a third person is doing the change. > Given that keeping them separate has some disadvantages (extra getters > required, weak references, two object allocations instead of one), it > seems to me that it would make the code significantly simpler to merge > these into a single object that implements nsILDAPMessageListener. If getters are inline they are not really any overhead and an extra object allocation is hardly any overhead considering the cleanly organized code we get with each class doing only what it is supposed to do. I think merging the code would not really be a good object oriented design. I think there are several better places (also in Mozilla in general) which we can optimize, for eg : usage of nsCOMPtr and do_CreateInstance for local objects, for an implementation nsA:nsIA in an function we do : foo() { nsCOMPtr <nsIA> a = do_CreateInstance... } instead using this : foo() { nsA a; } is what can save unnecessary heap allocation (since the object is created on stack instead) and overhead of going thru do_CreateInstance, if we can access the class directly. In fact i will change the nsLDAPSearchEntry() for it if possible. Also if we donot use nsCOMPtr and make sure to addref and release an object appropriately we donot really need to use weak references. I will also change the code for this.
Patch coming soon with the changes for all the above comments needing changes. Thank you very much Dan for a thorough and useful review. - Rajiv.
Attached patch updated patch (obsolete) — Splinter Review
Hi Dan and Seth, Can u please take a look at this updated patch. thanks, - Rajiv.
Attachment #73186 - Attachment is obsolete: true
rdayal wrote: > > e) In fact i need to export the headers to use the replication constructors > for the nsAbFactory which is in the build directory. Ewww, gross. I've filed bug 131849 about this buildsystem bogosity. >>> x) That is because the AB DB calls made in OnLDAPSearchResult() are Commit >>> and Close, and even if they fail the follow through code should still be >>> processed. >> >> But if something goes wrong, don't we need to propagate this info to >> the caller, so the caller can propagate an error message to the UI? > > > The caller here as u know is the LDAP Connection thread / nsLDAPConnection, i > dont think it displays error messages or log messages, but i will still pass > them so that sometime when nsLDAPConnection is enhanced to log error > messages, it would. Oh, I'd forgotten about that. I think we need to do better than this for major errors, though. Can you file another bug against yourself to add some code to OnLDAPSearchResult to put up an error dialog for the user (perhaps by calling the prompt service on the UI thread)? >>> J) ReplicationProcessor and ReplicationQuery perform different >>> functionality. >> >> That's just the semantics of where you're viewing from: 100 feet up, >> or 10000 feet up. They're certainly all part of the same action. > > I guess we should look from 100 feet up when we are doing implementation. One > class is performing the query and the other is processing the > results from the query, clearly two different functionalities. I think I was attempting to prematurely optimize where it's unlikely to be necessary, so I withdraw this request to merge the two classes. > Thank you very much Dan for a thorough and useful review. - Rajiv. You're quite welcome! I'll try and look at your new patch later today.
Here's a few new comments, but I realized that several of the comments from the previous review still haven't been addressed. So can you finish with both the old and new comments, and then I'll go through the whole patch again? Thanks. Unaddressed points from the original review ------------------------------------------- c, d: these interfaces are still unnecessary. Just use nsISupports (if needed) and nsILDAPMessageListener, moving the documentation comments to the C++ files. n: nsACString instead of string w: postpone the MoveTo in order to simplify the code General ------- K) Looking at the diff, it appears that all the long filenames are named with lowercase letters rather than interCaps. L) It looks like many of the NS_REINTERPRET_CASTs could actually be done less drastically with one or two NS_STATIC_CASTs. M) For member variables of an object, nsCString is preferred to nsCAutoString. nsAbLDAPProcessReplicationData.cpp ---------------------------------- N) The boilerplate seems to have extra newlines in it and is missing a copyright year. O) mQuery should still be an nsCOMPtr, not a raw pointer; this should not affect the ability to avoid weak references P) Why the |if (progressListener)| test? Always assigning it, even if 0, should be fine. nsAbLDAPReplicationQuery::InitLDAPData -------------------------------------- Q) Instead of + nsCAutoString uri (mDirServer.uri); + rv = mURL->SetSpec(uri); Try + rv = mURL->SetSpec(nsDependentCString(mDirServer.uri)); since this wraps the existing string wrather than allocating a new one and copying.
> c, d: these interfaces are still unnecessary. Just use nsISupports > (if needed) and nsILDAPMessageListener, moving the documentation > comments to the C++ files. I discussed with you about this, i have these interfaces since there could be other classes for change log and LCUP which should implement these interfaces. If I donot have these interfaces i will still make the methods here as virtual in the class so not having these interfaces would not be of any advantage, having these interfaces define what the implementations need to do and is useful for future implementation. >>n: nsACString instead of string i am not sure if i have nsACString instead of string in the idl i can use it from javascript since i am not sure if i can create nsACString object in javascript. I am not an expert in javascript so please enlighten me here. > w: postpone the MoveTo in order to simplify the code this would not be of an advantage when i need to use this for change log since change log code updates the existing file and in that case making a backup before doing the updates would be a better idea. > O) mQuery should still be an nsCOMPtr, not a raw pointer; this should > not affect the ability to avoid weak references Yes i will do so i didnot know that i can release a reference by assigning the nsCOMPtr reference to null or zero.
Attached patch wrong patch ... ignore this one. (obsolete) — Splinter Review
Attachment #74804 - Attachment is obsolete: true
Attachment #74928 - Attachment description: updated patch including idl file change to use ACString → wrong patch ... ignore this one.
Attachment #74928 - Attachment is obsolete: true
wrong patch in previous comment, use this.
1) +#define NS_ABLDAP_REPLICATIONSERVICE_CONTRACTID \ + "@mozilla.org/ldap-replication-service;1?type=addrbook" +#define NS_ABLDAP_REPLICATIONQUERY_CONTRACTID \ + "@mozilla.org/ldap-replication-query;1?type=addrbook" +#define NS_ABLDAP_PROCESSREPLICATIONDATA_CONTRACTID \ + "@mozilla.org/ldap-process-replication-data;1?type=addrbook" why do your contract id's have a type? Do we ever plan on replicating anything other than type=addrbook? 2) +#include "nsISupports.idl" +#include "nsIAbLDAPReplicationQuery.idl" +#include "nsILDAPMessageListener.idl" +#include "nsIWebProgressListener.idl" I think all you need to include here is nsILDAPMessageListener.idl and maybe nsISupports.idl. The other two you can just do interface nsIWebProgressListener; interface nsIAbLDAPReplicationQuery; 3) +[scriptable, uuid(5414FFF1-263B-11d6-B791-00B0D06E5F27)] Like I've mentioned before, don't use the same UUID as you used for the CID. Generate a new one. 4a) + void Init (in nsIAbLDAPReplicationQuery query, in nsIWebProgressListener progressListener) ; interCaps. This should be void init(), not void Init() 4b) + void abort(in PRBool aIsCallDone) ; what is aIsCallDone? That parameter name doesn't make any sense to me. can you come up with something better, and add a comment to what it does? 5) +[scriptable, uuid(5414FFF0-263B-11d6-B791-00B0D06E5F27)] Again, don't use the same UUID as you used for the CID. Generate a new one. 6) +#include "nsIWebProgressListener.idl" you can just do interface nsIWebProgressListener; 7) +[scriptable, uuid(ECE81280-2639-11d6-B791-00B0D06E5F27)] Again, don't use the same UUID as you used for the CID. Generate a new one. 8) + nsIAbLDAPReplicationQuery * Query = mQuery ; + mReplInfo = NS_STATIC_CAST(nsAbLDAPReplicationQuery*,Query)->GetReplicationPrefInfo(); + if (!mReplInfo) { + mQuery = 0 ; + return rv; + } seems like you want to return failure here, instead of rv, which will be NS_OK; + nsCOMPtr<nsILDAPOperation> operation; + nsIAbLDAPReplicationQuery * Query = mQuery ; + rv = NS_STATIC_CAST(nsAbLDAPReplicationQuery*,Query)->GetOperation(getter_AddRefs(operation)); + nsCOMPtr<nsILDAPConnection> connection; + rv = NS_STATIC_CAST(nsAbLDAPReplicationQuery*,Query)->GetConnection(getter_AddRefs(connection)); + nsIAbLDAPReplicationQuery * Query = mQuery ; + rv = NS_STATIC_CAST(nsAbLDAPReplicationQuery*, Query)->QueryAllEntries(); Don't do these static casts. Either don't use COM, and just use C++. Or (and this would be simpler) add these methods to the nsIAbLDAPReplicationQuery interface. For replicationInfo, you'll need to use noscript and something like: [ptr] native DIR_ReplicationInfo(DIR_ReplicationInfo); %{C++ #include "nsDirPrefs.h" %} x) + if (!mInitialized ) return NS_ERROR_NOT_INITIALIZED; do + if (!mInitialized) + return NS_ERROR_NOT_INITIALIZED; this way you can set the breakpoint on the return. + if (!mReplicationDB || !mDBOpen) return NS_ERROR_FAILURE; same thing. x) + if (!(mCount % 50)) // inform the listener every 5 entries your comment doesn't match your mod. x) + PR_Sleep (10); // give others a chance Like I mentioned in my first review, if you are going to do this, you need to do: PR_Sleep(PR_MillisecondsToInterval(10 /* msecs */)); You wrote: "PR_Sleep is required in order to not starve the other threads. At this moment there is LDAP Connection thread running which queues up all the entries recieved to this proxy object. Putting this sleep ensures that other threads / UI does not starve." How does 10 millisecond sleep solve this problem? Does that cause this thread to yield to the UI thread? If so, is it only for 10 milliseconds? Is that enough? Is that too much? How'd you come at this number? It seems like it would depend on your network speed and the speed of your machine. x) + if (!mInitialized ) return NS_ERROR_NOT_INITIALIZED; should be + if (!mInitialized ) + return NS_ERROR_NOT_INITIALIZED; x) + nsCOMPtr<nsIAddrBookSession> abSession = do_GetService(NS_ADDRBOOKSESSION_CONTRACTID, &rv); + if (NS_FAILED(rv) || !abSession) { + Done(); + return rv; + } you don't need to check abSession. rv will never be NS_OK and abSession will be null. + if (mReplicationDB && mDBOpen) { + rv = mReplicationDB->Commit(nsAddrDBCommitType::kLargeCommit); + rv = mReplicationDB->ForceClosed(); // forced close since this flushes the Mork + mDBOpen = PR_FALSE; + // once we have saved the new replication file, delete the backup file + if (mBackupReplicationFile) + rv = mBackupReplicationFile->Remove (PR_FALSE); do you want to null out mBackupReplicationFile? + } + return rv; note, return rv here might return failure if you fail to remove the replication backup or if ForceClosed() failed. Is that what you want? Or, do you want to return NS_OK, and add assertions if any of the calls (to Commit(), ForceClosed(), or Remove() fail?) x) Note, nsFileSpec is deprecated, from http://lxr.mozilla.org/seamonkey/source/xpcom/io/nsFileSpec.h#42 THESE CLASSES ARE DEPRECIATED AND UNSUPPORTED! USE |nsIFile| and |nsILocalFile|. I'll log a bug to fix GetUserProfileDirectory(), and all the callers. Just something for you to keep in mind. + nsFileSpec* dbPath; + rv = abSession->GetUserProfileDirectory(&dbPath); + nsCOMPtr<nsIAddrDatabase> addrDBFactory = + do_GetService(NS_ADDRDATABASE_CONTRACTID, &rv); + if (NS_FAILED(rv) || !addrDBFactory) { + Done(); + return rv; + } again, you only need to check rv, you don't need to check both. + rv = addrDBFactory->Open(dbPath, PR_TRUE, getter_AddRefs(mReplicationDB), PR_TRUE); + delete dbPath; + if (NS_FAILED(rv) || !mReplicationDB) { + Done(); + return rv; + } Note, you'll leak dbPath on earlier failures. This is why we need to fix GetUserProfileDirectory() so we can use a comptr. x) + if (!mInitialized ) return NS_ERROR_NOT_INITIALIZED; again, return on a seperate line. x) + nsCOMPtr<nsILDAPOperation> operation; + nsIAbLDAPReplicationQuery * Query = mQuery ; + rv = NS_STATIC_CAST(nsAbLDAPReplicationQuery*,Query)->GetOperation(getter_AddRefs(operation)); Don't cast like this. x) + + nsIAbLDAPReplicationQuery * Query = mQuery ; + NS_STATIC_CAST(nsAbLDAPReplicationQuery*, Query)->Done(aSuccess); Static cast, fix this. + // since this is called when all is done here, either on success, failure or abort + // releas the query now. + mQuery = 0; Please do mQuery = nsnull; nsnull reminds the reader it's a ptr, 0 (while the same thing as NULL), doesn't imply that. x) + DIR_ReplicationInfo * mReplInfo ; +nsAbLDAPReplicationQuery::~nsAbLDAPReplicationQuery() +{ + DIR_DeleteServerContents (&mDirServer); +nsresult nsAbLDAPReplicationQuery::Init(const nsCString & aPrefName, nsIWebProgressListener *aProgressListener) +{ + if (aPrefName.IsEmpty()) + return NS_ERROR_NULL_POINTER ; that's not a null pointer. do NS_ERROR_UNEXPECTED. + mDataProcessor = do_CreateInstance (NS_ABLDAP_PROCESSREPLICATIONDATA_CONTRACTID, &rv); + if (NS_FAILED (rv) || !mDataProcessor ) return rv; again, seperate lines, and you don't need to check mDataProcessor. x) + nsIAbLDAPProcessReplicationData* dataProcessor = mDataProcessor; + rv = ( NS_STATIC_CAST(nsAbLDAPProcessReplicationData*, dataProcessor) )->Init (this, aProgressListener); + if (NS_FAILED (rv)) return rv; no cast! +nsresult nsAbLDAPReplicationQuery::InitLDAPData(const nsCString & aDirPref) +{ + if (aDirPref.IsEmpty()) + return NS_ERROR_NULL_POINTER ; use NS_ERROR_UNEXPECTED. x) add a comment to why this is needed. (include the bug number to the bug that was setting filenames to the personal address book) + if (!nsCRT::strcasecmp(mDirServer.fileName,kPersonalAddressbook) || !mDirServer.fileName) + { + mDirServer.fileName = nsnull; + DIR_SetServerFileName (&mDirServer, mDirServer.serverName); + } x) + mURL = do_CreateInstance(NS_LDAPURL_CONTRACTID, &rv); + if (NS_FAILED (rv) || !mURL) return rv; you don't need to check mURL + mConnection = do_CreateInstance(NS_LDAPCONNECTION_CONTRACTID, &rv); + if (NS_FAILED (rv) || !mConnection) return rv; you don't need to check mConnection + mOperation = do_CreateInstance(NS_LDAPOPERATION_CONTRACTID, &rv); + if (NS_FAILED (rv) || !mOperation) return rv; you don't need to check mConnection x) + nsCAutoString host; + rv = aURL->GetHost(host); + if (NS_FAILED (rv) || !host.get() ) return rv; this will return OK if host is empty. is that what you want? seems like you want: + nsCAutoString host; + rv = aURL->GetHost(host); + if (NS_FAILED(rv)) + return rv; + if (host.IsEmpty()) + return NS_ERROR_UNEXPECTED; same with port and dn. x) + nsCOMPtr<nsIAbLDAPReplicationService> replicationService = + do_GetService(NS_ABLDAP_REPLICATIONSERVICE_CONTRACTID, &rv); + if (NS_SUCCEEDED(rv) && replicationService) you don't need to check both rv and replicationService x) + { + // ( (nsAbLDAPReplicationService *)((nsIAbLDAPReplicationService *) replicationService) )->OnReplicationDone(); + nsIAbLDAPReplicationService * service = replicationService; + ( NS_REINTERPRET_CAST(nsAbLDAPReplicationService *, service) )->OnReplicationDone(); + } don't do the cast. x) + +inline DIR_ReplicationInfo * nsAbLDAPReplicationQuery::GetReplicationPrefInfo () +{ + if (!mInitialized ) return nsnull ; + + return mDirServer.replInfo ; +} + +inline nsresult nsAbLDAPReplicationQuery::GetOperation (nsILDAPOperation ** retval) +{ + if (!mInitialized ) return NS_ERROR_NOT_INITIALIZED ; + if (!retval) return NS_ERROR_NULL_POINTER; + + NS_IF_ADDREF(* retval = mOperation); + + return NS_OK ; +} + +inline nsresult nsAbLDAPReplicationQuery::GetConnection (nsILDAPConnection ** retval) +{ + if (!mInitialized ) return NS_ERROR_NOT_INITIALIZED ; + if (!retval) return NS_ERROR_NULL_POINTER; + + NS_IF_ADDREF(* retval = mConnection); + + return NS_OK ; +} See my comments about how to do specify these in the idl. you won't be able to keep them as inline. instead of + if (!mInitialized ) return NS_ERROR_NOT_INITIALIZED ; + if (!retval) return NS_ERROR_NULL_POINTER; do + if (!mInitialized) + return NS_ERROR_NOT_INITIALIZED; + NS_ENSURE_ARG_POINTER(retval); x) Why are you using a lock in nsAbLDAPReplicationService? + if (!mLock) { + mLock = PR_NewLock(); + if (!mLock) return NS_ERROR_NOT_AVAILABLE; + } x) + mQuery = do_CreateInstance (NS_ABLDAP_REPLICATIONQUERY_CONTRACTID, &rv); + if (NS_SUCCEEDED (rv) && mQuery) don't need to check both. x) + nsIAbLDAPReplicationQuery * query = (nsIAbLDAPReplicationQuery *) mQuery; + rv = NS_STATIC_CAST(nsAbLDAPReplicationQuery *, query)->Init (mPrefName, progressListener); no cast, fix the code, see earlier comments. x) + if (aPrefName.IsEmpty()) + return NS_ERROR_NULL_POINTER ; not a null pointer, use NS_ERROR_UNEXPECTED or NS_ERROR_FAILURE; x) + if (mQuery) + mQuery = 0; // release query obj use nsnull; x) + if (mQuery) + mQuery = 0; // release query obj use nsnull; x) -static nsresult dir_DeleteServerContents (DIR_Server *server) +nsresult DIR_DeleteServerContents (DIR_Server *server) Can you explain why you had to make this internal function public? This seems flawed. Why was DIR_DeleteServer() not enough? Here's some code that I wrote to do something similar (but it never got check in, since I fixed the code to read prefs directly) + // use the prefName to get the DIR_Server, so we can get the fileName + DIR_Server *server; + server = (DIR_Server *) calloc(1, sizeof(DIR_Server)); + if (!server) + return NS_ERROR_OUT_OF_MEMORY; + + DIR_InitServer(server); + + server->prefName = strdup(directoryServerPrefName.get()); + if (!server->prefName) + return NS_ERROR_OUT_OF_MEMORY; + + DIR_GetPrefsForOneServer(server, PR_FALSE, PR_FALSE); + // use server->fileName; + DIR_DeleteServer(server); + server = nsnull; x) one last comment, you seem to switch back and forth when adding unnecessary spaces. if (foo ) { bar = x ; Cheese (); } From http://www.mozilla.org/hacking/reviewers.html, #20: "When reviewing code, verify that the prevailing module style, if any, has been observed ("When in Rome...") in the patch. If the module is a mess, identify an owner and call for him or her to assert a coding style. Messy files are a telltale of poor or non-existent ownership." I'm asserting a coding style. Please don't do this. As I've mentioned before, SUN did this is their code, and it drives me crazy. Please go over your patch and fix this.
general question: why do we need to make a .backup file? Can't we just open the mab file, and if we fail or cancel at any point, can we just close it without committing?
> I'll log a bug to fix GetUserProfileDirectory(), and all the callers. http://bugzilla.mozilla.org/show_bug.cgi?id=132180
> Rajiv and I just talked about this face to face, and it looks like > this locking really only needs to be here in case we someday are > able to move this code off to another thread rather than having it all > run on the UI thread. I'm not sure whether leaving it in now is the > right thing to do, but if so, can we at least document this in some comments > somewhere? if this is the case, then defintely remove the locking code and add a comment to the code about the issues with moving the code off the UI thread. Let's keep this code as simple and clean as possible.
about the imap thread and sleeping, bienvenu writes: "Well, the imap thread never sleeps. That I know of, we don't see the IMAP thread starving the UI thread. The ui updates - we proxy over to the ui thread all the time. and we have necko on yet another thread The main thing that starves the ui thread is the ui thread Too much processing on the UI thread." I'd say take out the sleep, and see if you notice any problems. If you do, try yielding, by doing PR_Sleep(PR_INTERVAL_NO_WAIT)
Attached patch updated patch (obsolete) — Splinter Review
Attachment #74930 - Attachment is obsolete: true
at a quick read through, it is looking great! I've found five small issues, can you address them? I'll continue to review the code in detail. 1) a nit, it should should be mDirServer(nsnull), not mDirServer(0), right? I notice you switch between nsnull and 0 in your constructors. Can you use nsnull for pointers and leave 0 for ints, to help the reader here, and in the other constructors? +nsAbLDAPReplicationQuery::nsAbLDAPReplicationQuery() + : mConnection (0), + mOperation (0), + mDataProcessor (0), + mURL (0), + mDirServer(0), + mInitialized (PR_FALSE) 2) this isn't needed: + DIR_ReplicationInfo * GetReplicationPrefInfo (); 3) this either +inline DIR_ReplicationInfo * nsAbLDAPReplicationQuery::GetReplicationPrefInfo () +{ + if (!mInitialized ) return nsnull; + + return mDirServer->replInfo; +} 4) you don't want to return nsnull here, you want to return NS_ERROR_NOT_INITIALIZED +NS_IMETHODIMP nsAbLDAPReplicationQuery::GetReplicationInfo(DIR_ReplicationInfo * *aReplicationInfo) +{ + NS_ENSURE_ARG_POINTER (aReplicationInfo); + if (!mInitialized ) return nsnull; + + *aReplicationInfo = mDirServer->replInfo; + + return NS_OK; +} 5) + if (!nsCRT::strcasecmp(mDirServer->fileName,kPersonalAddressbook) || ! mDirServer->fileName) + DIR_SetServerFileName (mDirServer, mDirServer->serverName); again, please include a comment as to why this check for kPersonalAddressBook is neeeded. (the reason is bug http://bugzilla.mozilla.org/show_bug.cgi? id=99124)
I noticed that you cleaned up some fo the "foo ()" or "foo() ;" style issues. I appreciated it. There are still some in there, if you happen to come across them in your next patch, please squash them.
Attached patch updated patch (obsolete) — Splinter Review
Attachment #75328 - Attachment is obsolete: true
Attachment #75346 - Attachment is obsolete: true
Attachment #75361 - Attachment is obsolete: true
1) suggestion for your switch statement: + switch(messageType) + { + case nsILDAPMessage::RES_BIND: + rv = OnLDAPBind(aMessage); + break; + case nsILDAPMessage::RES_SEARCH_ENTRY: + rv = OnLDAPSearchEntry(aMessage); + break; + case nsILDAPMessage::RES_SEARCH_RESULT: + rv = OnLDAPSearchResult(aMessage); -> break; + default: -> rv = NS_ERROR_UNEXPECTED; // or rv = NS_OK; + break; + } + + return rv; depending on what you want to return for the messageTypes you haven't added case statements for. 2) mDataProcessor(nsnull), you don't need to initialize nsCOMPtrs to null. Sorry I didn't catch that the first time. If it's a raw pointer, like mDirServer, then "mDirServer(nsnull)," is correct. nsCOMPtrs don't need to be initialized, the are null by default. 3) + switch(DecideProtocol()) + { + case nsIAbLDAPProcessReplicationData::kDefaultDownloadAll : + mQuery = do_CreateInstance(NS_ABLDAP_REPLICATIONQUERY_CONTRACTID, &rv); + if(NS_SUCCEEDED(rv) && mQuery) + { + rv = mQuery->Init(aPrefName, progressListener); + if(NS_SUCCEEDED(rv)) + { + rv = mQuery->DoReplicationQuery(); + if(NS_SUCCEEDED(rv)) + { + mReplicating = PR_TRUE; + return rv; + } + } + } + break; + default : + break; + } + + if(progressListener && NS_FAILED(rv)) + progressListener->OnStateChange(nsnull, nsnull, nsIWebProgressListener::STATE_STOP, PR_FALSE); + + return rv; does that OnStateChange() ever get reached? It doesn't look like it.
> does that OnStateChange() ever get reached? It doesn't look like it. my mistake, it will be reached if DoReplicationQuery() fails. please just fix #1 and #2.
Attached patch updated patch (obsolete) — Splinter Review
Attachment #75365 - Attachment is obsolete: true
I did some performance testing and analysis and observed that the function MozillaLdapPropertyRelator::createCardPropertyFromLDAPMessage in nsLDAPProperties is leaking. I looked further into it and found out that the nsLDAPMessage::GetAttributes calls nsLDAPMessage::IterateAttributes which allocates memory for the attributes, however this memory is never released. I am in the process of figure out when exactly should this memory be released whether in nsLDAPMessage itself or in the MozillaLdapPropertyRelator.
I'm trying out the patch, so that I can work on the UI. four things: 1) when replication completes, I get a mork assertion: NTDLL! 77f9f9df() nsDebug::Assertion(const char * 0x04588d6c, const char * 0x04589040, const char * 0x0458900c, int 57) line 291 + 13 bytes mork_assertion_signal(const char * 0x04588d6c) line 57 + 31 bytes morkEnv::NewError(const char * 0x04588b80) line 405 + 19 bytes morkNode::NonNodeError(morkEnv * 0x0447dbd0) line 340 morkNode::CutWeakRef(morkEnv * 0x0447dbd0) line 651 morkNode::SlotWeakNode(morkNode * 0x00000000, morkEnv * 0x0447dbd0, morkNode * * 0x05213df8) line 505 morkStore::SlotWeakStore(morkStore * 0x00000000, morkEnv * 0x0447dbd0, morkStore * * 0x05213df8) line 770 + 20 bytes morkTable::CloseTable(morkEnv * 0x0447dbd0) line 190 + 18 bytes morkTable::CloseMorkNode(morkEnv * 0x0447dbd0) line 100 morkTable::~morkTable() line 108 morkTable::`scalar deleting destructor'(unsigned int 1) + 15 bytes morkObject::Release(morkObject * const 0x05213dd8) line 68 + 162 bytes morkTable::Release(morkTable * const 0x05213dd8) line 178 + 12 bytes nsAddrDatabase::Release(nsAddrDatabase * const 0x04477db0) line 173 nsCOMPtr<nsIAddrDatabase>::~nsCOMPtr<nsIAddrDatabase>() line 491 nsAbLDAPProcessReplicationData::~nsAbLDAPProcessReplicationData() line 76 + 33 bytes nsAbLDAPProcessReplicationData::`scalar deleting destructor'(unsigned int 1) + 15 bytes nsAbLDAPProcessReplicationData::Release(nsAbLDAPProcessReplicationData * const 0x0447a8e0) line 52 + 135 bytes nsCOMPtr_base::assign_assuming_AddRef(nsISupports * 0x00000000) line 436 nsCOMPtr_base::assign_with_AddRef(nsISupports * 0x00000000) line 74 nsCOMPtr<nsISupports>::operator=(nsISupports * 0x00000000) line 821 nsProxyObject::~nsProxyObject() line 297 nsProxyObject::`scalar deleting destructor'(unsigned int 1) + 15 bytes ProxyDestructorEventHandler(PLEvent * 0x015f1a50) line 590 + 31 bytes PL_HandleEvent(PLEvent * 0x015f1a50) line 590 + 10 bytes PL_ProcessPendingEvents(PLEventQueue * 0x043fc860) line 520 + 9 bytes _md_EventReceiverProc(HWND__ * 0x000c031c, unsigned int 49545, unsigned int 0, long 71288928) line 1071 + 9 bytes USER32! 77e13eb0() USER32! 77e1401a() USER32! 77e192da() nsXULWindow::ShowModal(nsXULWindow * const 0x040f6ef0) line 287 nsWebShellWindow::ShowModal(nsWebShellWindow * const 0x040f6ef0) line 1089 nsContentTreeOwner::ShowAsModal(nsContentTreeOwner * const 0x041ada74) line 442 nsWindowWatcher::OpenWindowJS(nsWindowWatcher * const 0x01bc728c, nsIDOMWindow * 0x01c53fcc, const char * 0x040dbfa0, const char * 0x03f33600, const char * 0x01b48c00, int 1, unsigned int 1, long * 0x01b8acd8, nsIDOMWindow * * 0x0012c0a0) line 705 GlobalWindowImpl::OpenInternal(GlobalWindowImpl * const 0x01c53fc8, const nsAString & {...}, const nsAString & {...}, const nsAString & {...}, int 1, long * 0x01b8accc, unsigned int 4, nsISupports * 0x00000000, nsIDOMWindow * * 0x0012c46c) line 3866 + 129 bytes GlobalWindowImpl::OpenDialog(GlobalWindowImpl * const 0x01c53fd0, nsIDOMWindow * * 0x0012c46c) line 2743 + 59 bytes XPTC_InvokeByIndex(nsISupports * 0x01c53fd0, unsigned int 14, unsigned int 1, nsXPTCVariant * 0x0012c46c) line 106 XPCWrappedNative::CallMethod(XPCCallContext & {...}, XPCWrappedNative::CallMode CALL_METHOD) line 2025 + 42 bytes XPC_WN_CallMethod(JSContext * 0x01c2bf60, JSObject * 0x01594c88, unsigned int 4, long * 0x01b8accc, long * 0x0012c748) line 1266 + 14 bytes js_Invoke(JSContext * 0x01c2bf60, unsigned int 4, unsigned int 0) line 788 + 23 bytes js_Interpret(JSContext * 0x01c2bf60, long * 0x0012d584) line 2745 + 15 bytes js_Invoke(JSContext * 0x01c2bf60, unsigned int 1, unsigned int 2) line 805 + 13 bytes js_InternalInvoke(JSContext * 0x01c2bf60, JSObject * 0x03315878, long 66237944, unsigned int 0, unsigned int 1, long * 0x0012d7dc, long * 0x0012d6ac) line 880 + 20 bytes JS_CallFunctionValue(JSContext * 0x01c2bf60, JSObject * 0x03315878, long 66237944, unsigned int 1, long * 0x0012d7dc, long * 0x0012d6ac) line 3412 + 31 bytes nsJSContext::CallEventHandler(nsJSContext * const 0x01c2eb68, void * 0x03315878, void * 0x03f2b5f8, unsigned int 1, void * 0x0012d7dc, int * 0x0012d7e0, int 0) line 1016 + 33 bytes nsJSEventListener::HandleEvent(nsJSEventListener * const 0x0321c000, nsIDOMEvent * 0x041ef030) line 180 + 77 bytes nsEventListenerManager::HandleEventSubType(nsListenerStruct * 0x0321bd98, nsIDOMEvent * 0x041ef030, nsIDOMEventTarget * 0x0305ef78, unsigned int 8, unsigned int 2) line 1217 + 20 bytes nsEventListenerManager::HandleEvent(nsEventListenerManager * const 0x0321c088, nsIPresContext * 0x02fc02c0, nsEvent * 0x0012f3f4, nsIDOMEvent * * 0x0012f348, nsIDOMEventTarget * 0x0305ef78, unsigned int 2, nsEventStatus * 0x0012f718) line 1385 + 36 bytes nsXULElement::HandleDOMEvent(nsXULElement * const 0x0305ef70, nsIPresContext * 0x02fc02c0, nsEvent * 0x0012f3f4, nsIDOMEvent * * 0x0012f348, unsigned int 2, nsEventStatus * 0x0012f718) line 3457 nsXULElement::HandleDOMEvent(nsXULElement * const 0x03e1e220, nsIPresContext * 0x02fc02c0, nsEvent * 0x0012f3f4, nsIDOMEvent * * 0x0012f348, unsigned int 2, nsEventStatus * 0x0012f718) line 3474 + 50 bytes nsXULElement::HandleDOMEvent(nsXULElement * const 0x03e1ef20, nsIPresContext * 0x02fc02c0, nsEvent * 0x0012f3f4, nsIDOMEvent * * 0x0012f348, unsigned int 2, nsEventStatus * 0x0012f718) line 3474 + 50 bytes nsXULElement::HandleDOMEvent(nsXULElement * const 0x03de8d50, nsIPresContext * 0x02fc02c0, nsEvent * 0x0012f3f4, nsIDOMEvent * * 0x0012f348, unsigned int 2, nsEventStatus * 0x0012f718) line 3474 + 50 bytes nsXULElement::HandleDOMEvent(nsXULElement * const 0x032d31d8, nsIPresContext * 0x02fc02c0, nsEvent * 0x0012f3f4, nsIDOMEvent * * 0x0012f348, unsigned int 2, nsEventStatus * 0x0012f718) line 3474 + 50 bytes nsXULElement::HandleDOMEvent(nsXULElement * const 0x03215730, nsIPresContext * 0x02fc02c0, nsEvent * 0x0012f3f4, nsIDOMEvent * * 0x0012f348, unsigned int 1, nsEventStatus * 0x0012f718) line 3474 + 50 bytes PresShell::HandleEventInternal(nsEvent * 0x0012f3f4, nsIView * 0x00000000, unsigned int 1, nsEventStatus * 0x0012f718) line 6055 + 44 bytes PresShell::HandleEventWithTarget(PresShell * const 0x01c876f0, nsEvent * 0x0012f3f4, nsIFrame * 0x00000000, nsIContent * 0x03215730, unsigned int 1, nsEventStatus * 0x0012f718) line 6024 + 22 bytes nsEventStateManager::CheckForAndDispatchClick(nsEventStateManager * const 0x0324fd60, nsIPresContext * 0x02fc02c0, nsMouseEvent * 0x0012f90c, nsEventStatus * 0x0012f718) line 2629 + 66 bytes nsEventStateManager::PostHandleEvent(nsEventStateManager * const 0x0324fd68, nsIPresContext * 0x02fc02c0, nsEvent * 0x0012f90c, nsIFrame * 0x03f8df6c, nsEventStatus * 0x0012f718, nsIView * 0x03e6f668) line 1684 + 28 bytes PresShell::HandleEventInternal(nsEvent * 0x0012f90c, nsIView * 0x03e6f668, unsigned int 1, nsEventStatus * 0x0012f718) line 6075 + 43 bytes PresShell::HandleEvent(PresShell * const 0x01c876f4, nsIView * 0x03e6f668, nsGUIEvent * 0x0012f90c, nsEventStatus * 0x0012f718, int 1, int & 1) line 5978 + 25 bytes nsViewManager::HandleEvent(nsView * 0x03e6f668, nsGUIEvent * 0x0012f90c, int 1) line 2064 nsView::HandleEvent(nsViewManager * 0x02fc0da0, nsGUIEvent * 0x0012f90c, int 1) line 306 nsViewManager::DispatchEvent(nsViewManager * const 0x02fc0da0, nsGUIEvent * 0x0012f90c, nsEventStatus * 0x0012f808) line 1870 + 23 bytes HandleEvent(nsGUIEvent * 0x0012f90c) line 83 nsWindow::DispatchEvent(nsWindow * const 0x03e6f1ac, nsGUIEvent * 0x0012f90c, nsEventStatus & nsEventStatus_eIgnore) line 865 + 10 bytes nsWindow::DispatchWindowEvent(nsGUIEvent * 0x0012f90c) line 886 nsWindow::DispatchMouseEvent(unsigned int 301, unsigned int 0, nsPoint * 0x00000000) line 4711 + 21 bytes ChildWindow::DispatchMouseEvent(unsigned int 301, unsigned int 0, nsPoint * 0x00000000) line 4963 nsWindow::ProcessMessage(unsigned int 514, unsigned int 0, long 655442, long * 0x0012fd20) line 3596 + 28 bytes nsWindow::WindowProc(HWND__ * 0x000d031e, unsigned int 514, unsigned int 0, long 655442) line 1130 + 27 bytes USER32! 77e13eb0() USER32! 77e1401a() USER32! 77e192da() nsAppShellService::Run(nsAppShellService * const 0x01bc56c8) line 309 main1(int 2, char * * 0x003047e0, nsISupports * 0x00000000) line 1350 + 32 bytes main(int 2, char * * 0x003047e0) line 1698 + 37 bytes mainCRTStartup() line 338 + 17 bytes KERNEL32! 77e87903() 2) I made this change locally: -NS_IMPL_ISUPPORTS2(nsAbLDAPProcessReplicationData, nsIAbLDAPProcessReplicationData, nsILDAPMessageListener) +NS_IMPL_THREADSAFE_ISUPPORTS2(nsAbLDAPProcessReplicationData, nsIAbLDAPProcessReplicationData, nsILDAPMessageListener) 3) I ran into this crash, when I had a .backup file that already existed. (see #4) void nsAbLDAPProcessReplicationData::Done(PRBool aSuccess) { if(!mInitialized) return; + if (mQuery) mQuery->Done(aSuccess); 4) // XXX todo // check if backup file exists, if so delete it. rv = mBackupReplicationFile->MoveTo(nsnull, backupFile.get());
1) This is happening at the time when the nsCOMPtr<nsIAddrDatabase> mReplicationDB is getting released when the ReplicationDataProcessor object is released. Not sure why, let me take a look and also talk to the Mork experts. 2) nsAbLDAPProcessReplicationData is not really thread safe, should we make its ISupports implementation thread safe ? 3) oh when going thru the code i added a call to Done(FAIL) if CreateABForReplication fails, in fact CreateABForReplication itself calls Done if it fails internally. I will change and add a comment there. 4) What if this backup file belongs to the user, i will change so that it generates a new filename if it already exists.
once we have replication, people will start complaining about #132729 (perf with large .mab files)
Blocks: 132729
Attached patch updated patch (obsolete) — Splinter Review
Attachment #75467 - Attachment is obsolete: true
I've updated and rebuilt with the new files. I'm getting this assertion: NTDLL! 77f9f9df() nsDebug::Assertion(const char * 0x01ad259c, const char * 0x1012a070, const char * 0x1012a040, int 528) line 291 + 13 bytes NS_CheckThreadSafe(void * 0x00302ca0, const char * 0x01ad259c) line 528 + 34 bytes nsAbLDAPProcessReplicationData::AddRef(nsAbLDAPProcessReplicationData * const 0x04357408) line 52 + 61 bytes nsAbLDAPProcessReplicationData::QueryInterface(nsAbLDAPProcessReplicationData * const 0x04357408, const nsID & {...}, void * * 0x0573fc10) line 52 + 172 bytes nsQueryInterface::operator()(const nsID & {...}, void * * 0x0573fc10) line 47 + 25 bytes nsCOMPtr_base::assign_from_helper(const nsCOMPtr_helper & {...}, const nsID & {...}) line 80 + 18 bytes nsCOMPtr<nsISupports>::nsCOMPtr<nsISupports>(const nsCOMPtr_helper & {...}) line 803 nsProxyEventObject::~nsProxyEventObject() line 450 nsProxyEventObject::`scalar deleting destructor'(unsigned int 1) + 15 bytes nsProxyEventObject::Release(nsProxyEventObject * const 0x04357510) line 497 + 34 bytes nsProxyEventObject::~nsProxyEventObject() line 464 + 27 bytes nsProxyEventObject::`scalar deleting destructor'(unsigned int 1) + 15 bytes nsProxyEventObject::Release(nsProxyEventObject * const 0x04357630) line 497 + 34 bytes nsCOMPtr<nsILDAPMessageListener>::~nsCOMPtr<nsILDAPMessageListener>() line 491 nsLDAPOperation::~nsLDAPOperation() line 52 + 22 bytes nsLDAPOperation::`scalar deleting destructor'(unsigned int 1) + 15 bytes nsLDAPOperation::Release(nsLDAPOperation * const 0x043572d8) line 54 + 135 bytes nsCOMPtr<nsILDAPOperation>::~nsCOMPtr<nsILDAPOperation>() line 491 nsLDAPMessage::~nsLDAPMessage() line 116 + 22 bytes nsLDAPMessage::`scalar deleting destructor'(unsigned int 1) + 15 bytes nsLDAPMessage::Release(nsLDAPMessage * const 0x043a8e68) line 43 + 135 bytes nsCOMPtr<nsILDAPMessage>::assign_assuming_AddRef(nsILDAPMessage * 0x00000000) line 473 nsCOMPtr<nsILDAPMessage>::assign_with_AddRef(nsISupports * 0x00000000) line 965 nsCOMPtr<nsILDAPMessage>::operator=(nsILDAPMessage * 0x00000000) line 585 nsLDAPConnectionLoop::Run(nsLDAPConnectionLoop * const 0x043c39c8) line 671 nsThread::Main(void * 0x043c3b00) line 120 + 26 bytes _PR_NativeRunThread(void * 0x043c3c00) line 433 + 13 bytes _threadstartex(void * 0x043c3dc0) line 212 + 13 bytes KERNEL32! 77e92ca8()
I should say that I get that assertion on failure. If everything goes smoothly, I don't get any assertions (including the mork assertion that I mentioned in http://bugzilla.mozilla.org/show_bug.cgi? id=128086#c50)
according to dmose: "that assertion you're seeing is because objects called through proxies need to have THREADSAFE_ISUPPORTS" see bug http://bugzilla.mozilla.org/show_bug.cgi?id=101252 I've made the change and checked it in, since this isn't part of the build.
note, because of the mac file name limit, you are going to have to rename 3 files: nsAbLDAPProcessReplicationData.cpp nsAbLDAPProcessReplicationData.h nsIAbLDAPProcessReplicationData.idl
talking with rdayal over aim, we're worried that appending .backup to the exiting .mab file will result in errors on the mac, if the .mab filename is long. one solution, use nsIFile's createUnique() to generate a unique file. for the xxx.mab.backup file doesn't exist, and it isn't too long, that will create it. if it does exist, or the name is too long, that will create another file. this will make the code that removes the .mab.backup file obsolete (sorry rajiv.) the side effect is that if we don't properly remove the "backup" file, we could end up littering the users profile directory with backup files.
Another solution is that we create our own backup file by appending .bak or whatever, if the name is too long (>31) strip the last 4 chars before the .bak. This way we can delete our created backup file if it already exists (in case we are not able to delete on prior replication completion/cancel due to any crash or any other fatal error). This will guarantee that we donot litter up user's Profile dir with these backup files even in cases of crashes.
Attachment #75552 - Attachment is obsolete: true
Comment on attachment 75683 [details] [diff] [review] updated patch to take care of filename limitation on mac r=sspitzer nice work, rajiv.
Attachment #75683 - Flags: review+
new copyright notices should probably be 2002. Some of them are 2001 + /** + * this method is the callback when query is done, failed or successful + */ + void done(in PRBool aSuccess); why not use in boolean instead of PRBool? Or are we going towards using NSPR types instead of basic idl types? You're dropping the return value here, because there's no break before the default: + case nsILDAPMessage::RES_SEARCH_RESULT: + rv = OnLDAPSearchResult(aMessage); + default: + // for messageTypes we donot handle return NS_OK to LDAP and move ahead. + rv = NS_OK; + break; for things like this, we prefer IsEmpty(): if (!dn.get()) + if (!mInitialized) + return NS_ERROR_NOT_INITIALIZED; + + nsresult rv = mDataProcessor->Abort(); + return rv; just return mDataProcessor->Abort() directly
Ran Rajiv's build on my Win2K. All the entries in Netscape phone book were downloaded into local DB successfully.
Replication works on Linux too.
yulian, do you have a test plan for LDAP replication? If so, can you provide the link for it? I've been doing some ad-hoc testing, it would be useful to known the official test plan.
Attached patch updated patchSplinter Review
Hi David, Can u please super review this updated patch. thanks, - Rajiv.
Attachment #75683 - Attachment is obsolete: true
Comment on attachment 75806 [details] [diff] [review] updated patch sr=bienvenu
Attachment #75806 - Flags: superreview+
sr=bienvenu, but... please fix this before you check in: + // donot call Done here since CreateABForReplicationDir calls it. typo - should be don't or do not. Also, for future reference (and you might consider filing a bug on this and fixing it, or trying it and seeing if it makes a difference): + if(!(mCount % 10)) // inform the listener every 10 entries + { + mListener->OnProgressChange(nsnull,nsnull,mCount, -1, mCount, -1); + // in case if the LDAP Connection thread is starved and causes problem + // uncomment this one and try. + // PR_Sleep(PR_INTERVAL_NO_WAIT); // give others a chance + } a better technique here is to inform the user every 1/2 or 1 second - that way, you won't ping the cpu updating status.
> a better technique here is to inform the user every 1/2 or 1 second - that > way, you won't ping the cpu updating status. Yes, I like this suggestion. perhaps instead of using mCount, we could use a PRTime, say mLastUpdate. We could use PR_Now() and if (mLastUpdate - PR_Now() > 1 second or .5 seconds) { update UI } note, you need to use the PRTime / 64 bit friendly math operator. I'd suggest spinning that improvement off to another bug. I think it'd be worth doing this same trick for the news header downloading code. I'll log a bug on me for that.
Comment on attachment 75806 [details] [diff] [review] updated patch r=sspitzer if we get a= for this, remember to land my patch in #120427 to show the offline UI panel, which has the button to kick off replication. I'll go work on the issues in bug #120421. but those won't hold up this bug.
Thank you Seth and David for your r and sr. I have filed a bug (bug # 133310) as suggested by David in his last comment above.
Thanks to JF Ducarroz who had helped me with the mac build.
Comment on attachment 76127 [details] [diff] [review] xml mac project changes r=dmose
Attachment #76127 - Flags: review+
Comment on attachment 76127 [details] [diff] [review] xml mac project changes sr=sspitzer
Attachment #76127 - Flags: superreview+
Comment on attachment 75806 [details] [diff] [review] updated patch a=asa (on behalf of drivers) for checkin to the 1.0 trunk
Attachment #75806 - Flags: approval+
Comment on attachment 76127 [details] [diff] [review] xml mac project changes a=asa (on behalf of drivers) for checkin to the 1.0 trunk
Attachment #76127 - Flags: approval+
feature landed, marking as fixed.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Verified with trunk builds on various platforms.
Status: RESOLVED → VERIFIED
Product: MailNews → Core
Product: Core → MailNews Core
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: