implement LDIF export

VERIFIED FIXED in mozilla0.9.8

Status

VERIFIED FIXED
17 years ago
14 years ago

People

(Reporter: sspitzer, Assigned: sspitzer)

Tracking

Trunk
mozilla0.9.8

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 12 obsolete attachments)

32.61 KB, patch
dmose
: review+
Bienvenu
: superreview+
Details | Diff | Splinter Review
36.26 KB, patch
dmose
: review+
Details | Diff | Splinter Review
implement LDIF export

we've got support for .tab and .csv, but not ldif.

Comment 1

17 years ago
*** Bug 118957 has been marked as a duplicate of this bug. ***
coming soon, I hope.

as this is the quickest way to fix the bug where import / export loses mailing 
lists.
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla0.9.9
Created attachment 64195 [details] [diff] [review]
initial patch, still need to work on the non-ascii data part, and test.
pulling back into 0.9.8, I should be able to finish this work before then.
Target Milestone: mozilla0.9.9 → mozilla0.9.8
Created attachment 64198 [details] [diff] [review]
more work for ldif export...still more to come
Attachment #64195 - Attachment is obsolete: true
Created attachment 64210 [details] [diff] [review]
improved patch, handle non-ASCII data.
Attachment #64198 - Attachment is obsolete: true
Created attachment 64211 [details] [diff] [review]
whoops, that last patch had other fix mixed in.
Attachment #64210 - Attachment is obsolete: true
here are some known issues with this patch:

1) the following columns will not get written out to ldif:

+  {k2ndEmailColumn, nsnull, PR_TRUE},
+  {kWorkPhoneColumn, nsnull, PR_TRUE}, 
+  {kWorkAddressColumn, nsnull, PR_TRUE},
+  {kWorkAddress2Column, nsnull, PR_TRUE}, 
+  {kWorkCityColumn, nsnull, PR_TRUE}, 
+  {kWorkStateColumn, nsnull, PR_TRUE}, 
+  {kWorkZipCodeColumn, nsnull, PR_TRUE}, 
+  {kWorkCountryColumn, nsnull, PR_TRUE}, 
+  {kBirthMonthColumn, nsnull, PR_TRUE}, 
+  {kBirthDayColumn, nsnull, PR_TRUE}, 

this is bad, as Addressbook -> LDIF -> Addressbook is lossy. 

on possible solution, add some more "xmozilla*" attributes, and fix import 
handle them.

I'll log a new bug on this.

2) we're exporting the kPreferMailFormatColumn column as 
{html,unknown,plaintext}.  It's really a boolean (true or false).
I'll add code to do this:

if == html
 I'll do "xmozillausehtmlmail: true"
else if == plaintext 
 I'll do "xmozillausehtmlmail: false"
else
  I won't do anything.  (that should leave it as unknown on import.)

I'll fix this before checking in.

3)

we loose generic columns on export, like _AimScreenName

I'll log a new bug on this.

4) this code sort of lives in a three places:

export, import, and LDAP.

comments from the code:

// We have two copies of this function in the code, one here for import and 
// the other one in addrbook/src/nsAddressBook.cpp for migrating.  If ths 
// function need modification, make sure change in both places until we resolve
// this problem.

and:

// XXX todo, merge with what's in nsAbLDAPProperties.cpp, so we can
// use this for LDAP and LDIF export
// XXX todo what about _AimScreenName, or other generic columns?

I'll log a new bug on merging all the code.
Created attachment 64362 [details] [diff] [review]
fix, addresses the prefer format problem, fixes import.
Attachment #64211 - Attachment is obsolete: true
Created attachment 64363 [details] [diff] [review]
complete patch.
Attachment #64362 - Attachment is obsolete: true
Created attachment 64365 [details] [diff] [review]
complete, unified diff
Attachment #64363 - Attachment is obsolete: true
dmose points out privately that I'm violating not one, but two rfcs.

1) the LDIF attribute names are non standard

2) need to base64 encode data if it begins with CR,LF,SPACE,:,<

I've got a patch, but I need to double check that I've got the attribute names 
right.

then, I've got to make sure the import code will handle the rfc complaint 
attribute names, as well as 4.x attributes names.

here comes a new patch.
Created attachment 64408 [details] [diff] [review]
new patch, changes to import code are next.
Attachment #64365 - Attachment is obsolete: true
Created attachment 64418 [details] [diff] [review]
fixed the table (per dmose's suggestions), next step make sure this works with the import code.
Attachment #64408 - Attachment is obsolete: true
Created attachment 64447 [details] [diff] [review]
updated patch, import and export code fixed.
Attachment #64418 - Attachment is obsolete: true
Created attachment 64471 [details] [diff] [review]
updated patch (fixed xmozillausehtmlmail import)
Attachment #64447 - Attachment is obsolete: true
OS = all.

adding nsbeta1 keyword.  this is important as exporting to LDIF / importing 
LDIF properly allows users a way to move addressbooks between profiles (an 
often requested feature).

exporting to LDIF is also the only way to export and not lose mailing lists and 
non-ASCII data.
Keywords: nsbeta1
OS: Windows 2000 → All
Hardware: PC → All
Comment on attachment 64471 [details] [diff] [review]
updated patch (fixed xmozillausehtmlmail import)

First set of comments (almost entirely nits):

nsAddressBook.cpp
-----------------
1) replace PL_strstr with Right() from nsAReadableUtils.h and
.Equals() to fix the XXX issue.  Also, does a
caseInsensitiveComparator for the file extension strings need to be
used?

2) can exportingLDIF be const, since it doesn't change after
initialization?

3) prefer strlen to PL_strlen(), since some libc's hand-optimize this
sort of routine to be better than NSPR's version ever can.  Actually,
each string's length in the table shouldn't be hard to compute
with pointer arithmetic to avoid doing the traversals at all, but it's
probably not worth the time it would take to change the code to do
that.

4) +	    if (i + 1 < EXPORT_ATTRIBUTES_TABLE_COUNT) {

It's possibly better to do this:

if ( i < EXPORT_ATTRIBUTES_TABLE_COUNT - 1 )  {

So that the constant value on the right can be directly compared to i,
which will probably be in a register.  I suspect most optimizers can
figure this out on their own, however.	Even better, if you make your
loops count down to 0 instead of up from 0, compares with the loop
control variable will take advantage of the fact that many processors
have special "branch-if-zero" instructions.

5) how about making the append functions take the more general
nsAFlatCString &, and have callers use nsCAutoString rather than
nsCString since these are living on the stack anyhow.

6) prefer strcmp() to PL_strcmp, as in (3)

7) using operator= here:

+		   if (value.Equals(NS_LITERAL_STRING("html").get()))
+		     value.Assign(NS_LITERAL_STRING("true").get());
+		   else if (value.Equals(NS_LITERAL_STRING("plaintext").get()))
+		     value.Assign(NS_LITERAL_STRING("false").get());
+		   else
+		     value.Assign(NS_LITERAL_STRING("").get());  // unknown.
+		 }
+ 

seems like it's easier to read than .Assign

8) On Mac, MSG_LINEBREAK will end up being CR only, which is not
permitted according to the LDIF RFC.  MSG_LINEBREAK can only be used
on platforms where it evaluates to CR/LF or LF, so we probably need a
bit of special-case code for platforms where that's not true.
Comment on attachment 64471 [details] [diff] [review]
updated patch (fixed xmozillausehtmlmail import)

The remaining comments:

9) the indenting in ::ExportAddressBook is messed up, especially some of the
braces, making it somewhat hard to read.

10) ::AppendLDIFForMailList(): traditionally, objectclass: attributes
are printed immediately after the dn.  This doesn't appear to be
required, but it's a trivial change.  :-)

11) nsAddressBook::AppendDnForCard(): see comments 5 and 7.  Also, you
potentially avoid extra mallocs by using the concatenation operation
on strings: eg instead of

+    cnStr.Append(NS_LITERAL_STRING("cn=").get());
+    cnStr.Append(displayName.get());

consider:

+   cnStr = NS_LITERAL_STRING("cn=") + displayName.get();

12) nsAddressBook::AppendBasicLDIFForCard(): see previous comment
about combining multiple appends with string-concat operator.

13) Same function.  Seems like we should also be appending
organizationalPerson and inetOrgPerson objectclasses.

14) ::AppendProperty(): the PR_FREEIF at the end doesn't need to be
conditional.  You already tested for a null pointer immediately
before.
Corrected point 11:

11) nsAddressBook::AppendDnForCard(): see comments 5 and 7.  Also, you
potentially avoid extra mallocs by using the concatenation operation
on strings: eg instead of

+    cnStr.Append(NS_LITERAL_STRING("cn=").get());
+    cnStr.Append(displayName.get());

consider:

+   cnStr = NS_LITERAL_STRING("cn=") + displayName;

> 1) replace PL_strstr with Right() from nsAReadableUtils.h and
> .Equals() to fix the XXX issue.  Also, does a
> caseInsensitiveComparator for the file extension strings need to be
> used?

fixed.  good suggestion

> 2) can exportingLDIF be const, since it doesn't change after
> initialization?

good catch.  I always for get to const stuff.  this should allow the compiler
to generate better code, right?

> 3) prefer strlen to PL_strlen(), since some libc's hand-optimize this
> sort of routine to be better than NSPR's version ever can.  Actually,
> each string's length in the table shouldn't be hard to compute
> with pointer arithmetic to avoid doing the traversals at all, but it's
> probably not worth the time it would take to change the code to do
> that.

What's the official ruling on this?  I saw recently how we switched from
nsCRT::memcpy to memcpy.  Is PL_strlen officially out? I'm going to continue to 
use it until an official edict comes out.  I feel we should make a seamonkey
wide decision on this, and let the modular owners so we can fix all the code at 
once.

note, this code is hit when exporting to .csv to write out the column headers 
once.
not a performance issue, so PL_strlen is better than hard coding the lengths in 
the
table, as if the column names change, the table needs to be updated.  plus, 
this makes
the table larger at run time, right?

> 4) +	    if (i + 1 < EXPORT_ATTRIBUTES_TABLE_COUNT) {
> It's possibly better to do this:
> if ( i < EXPORT_ATTRIBUTES_TABLE_COUNT - 1 )  {

fixed.

> Even better, if you make your
> loops count down to 0 instead of up from 0, compares with the loop
> control variable will take advantage of the fact that many processors
> have special "branch-if-zero" instructions.

I'll spin up another bug about this optimization.  
the order is important (for .tab, .csv and .txt).  I claim that would
be a very small optimization and not worth holding up this patch for 0.9.8
when I get to quantifying export,  I can re-evaluate.

> 5) how about making the append functions take the more general
> nsAFlatCString &

fixed

> and have callers use nsCAutoString rather than
> nsCString since these are living on the stack anyhow.

> 6) prefer strcmp() to PL_strcmp, as in (3)

again, I'll wait for an edict about this.

> 7) using operator= here:

good suggestion, done.

> 8) On Mac, MSG_LINEBREAK will end up being CR only, which is not
permitted according to the LDIF RFC.  MSG_LINEBREAK can only be used
on platforms where it evaluates to CR/LF or LF, so we probably need a
bit of special-case code for platforms where that's not true.

fixed.  thanks for catching where I violated the RFC.

> 9) the indenting in ::ExportAddressBook is messed up, especially some of the
> braces, making it somewhat hard to read.

fixed

> 10) ::AppendLDIFForMailList(): traditionally, objectclass: attributes
> are printed immediately after the dn.  This doesn't appear to be
> required, but it's a trivial change.  :-)

fixed

> 11) nsAddressBook::AppendDnForCard(): see comments 5 and 7.  

fixed

> Also, you potentially avoid extra mallocs by using the concatenation operation
> on strings: eg instead of

fixed

> 12) nsAddressBook::AppendBasicLDIFForCard(): see previous comment
> about combining multiple appends with string-concat operator.

fixed

> 13) Same function.  Seems like we should also be appending
> organizationalPerson and inetOrgPerson objectclasses.

done, but can you elaborate on why?

> 14) ::AppendProperty(): the PR_FREEIF at the end doesn't need to be
> conditional.  You already tested for a null pointer immediately
> before.

fixed

new patch on the way.  I'm testing it out now.
Created attachment 64917 [details] [diff] [review]
updated patch, testing now.
Attachment #64471 - Attachment is obsolete: true
Created attachment 64923 [details] [diff] [review]
updated patch, fixed the problem with my call to Right()

dmose, please re-review

the other thing this patch does is I've turned ExportDirectory() to
ExportDirectoryToLDIF() and ExportDirectoryToDelimitedText().  this allows me
to remove the boolean (exportingLDIF), make the code easier to read, and solve
the LDIF_LINEBREAK issue.
Attachment #64917 - Attachment is obsolete: true

Comment 24

17 years ago
Comment on attachment 64923 [details] [diff] [review]
updated patch, fixed the problem with my call to Right()

sr=bienvenu
Attachment #64923 - Flags: superreview+

Comment 25

17 years ago
One comment for Dan, ultimately, the difference between PR_Free and PR_FREEIF is
not the null check, but the fact that PR_FREEIF will set the pointer to null
afterwards. In other words, PR_Free checks for null. But you're right that
PR_Free is better here.
>> 2) can exportingLDIF be const, since it doesn't change after
>> initialization?
>
> good catch.  I always for get to const stuff.  this should allow the compiler
> to generate better code, right?

Exactly.  Plus if somebody in the future accidently adds some code that
overwrites it, the compiler will tell them.

>> 3) prefer strlen to PL_strlen(), since some libc's hand-optimize this
>> sort of routine to be better than NSPR's version ever can.  Actually,
>> each string's length in the table shouldn't be hard to compute
>> with pointer arithmetic to avoid doing the traversals at all, but it's
>> probably not worth the time it would take to change the code to do
>> that.
>
> What's the official ruling on this?  I saw recently how we switched from
> nsCRT::memcpy to memcpy.  Is PL_strlen officially out? I'm going to continue
> to use it until an official edict comes out.  I feel we should make a 
> seamonkey wide decision on this, and let the modular owners so we can fix all 
> the code at once.

Sure, that sounds fine.  I don't think there's anything official just yet.  This
is just what I've been hearing from shaver.

> note, this code is hit when exporting to .csv to write out the column headers 
> once.  not a performance issue, so PL_strlen is better than hard coding the 
> lengths in the table, as if the column names change, the table needs to be 
> updated.  plus, this makes the table larger at run time, right?

Actually, what I really meant here was to do pointer arithmetic to figure this
stuff out from the array pointers that already exist, no need for extra data. 
But you're right, this isn't actually worth the time it would take to write the
code, since it's not in a performance critical area.

>> Even better, if you make your
>> loops count down to 0 instead of up from 0, compares with the loop
>> control variable will take advantage of the fact that many processors
>> have special "branch-if-zero" instructions.
>
> I'll spin up another bug about this optimization.  
> the order is important (for .tab, .csv and .txt).  I claim that would
> be a very small optimization and not worth holding up this patch for 0.9.8
> when I get to quantifying export,  I can re-evaluate.

Definitely.  It's an undoubtedly a very tiny win, and shouldn't hold up this patch.

>> 13) Same function.  Seems like we should also be appending
>> organizationalPerson and inetOrgPerson objectclasses.
>
> done, but can you elaborate on why?

Some of the attributes we use are defined in those schemas.  People who want to
type-check the values need that info in order to do so.

With the few tweaks that we discussed in AIM, r=dmose@netscape.com.
Created attachment 64955 [details] [diff] [review]
revised patch
Attachment #64923 - Attachment is obsolete: true
Attachment #64923 - Attachment is obsolete: false
Attachment #64923 - Flags: review+
this has r=dmose, sr=bienvenu.

note, AB -> LDIF -> AB is not loss-less, but it's close.  see bug #119360 for 
details.
Status: ASSIGNED → RESOLVED
Last Resolved: 17 years ago
Resolution: --- → FIXED

Comment 30

17 years ago
Trunk build 2002-02-01-03: WinMe
Trunk build 2002-02-01-08: Linux RH 7.1
Trunk build 2002-01-01-08: Mac 10.1
Verified Fixed.
Status: RESOLVED → VERIFIED
Product: Browser → Seamonkey
You need to log in before you can comment on or make changes to this bug.