Closed Bug 35509 Opened 25 years ago Closed 17 years ago

Hooking up automatic 4.x address book import in the mozilla code base


(MailNews Core :: Address Book, defect, P3)



(Not tracked)



(Reporter: mscott, Unassigned)



(Whiteboard: workaround comment 52)


(2 files, 2 obsolete files)

This would be a great starter bug for someone looking to get their feet wet in mozilla.

In 4.x we used a proprietary database for your address book, so there's no way to upgrade a user's 4.x address book using mozilla. You need the netscape commercial build.

Here's John Friend's nifty posting to the newsgroup summarizing an approach you could take in the mozilla build:

Here's an idea for mozilla. I wonder if a very clever mozilla installer could run the "-export [LDIF_file_name]" command line switch on the Windows version of 4.x to export its address book to LDIF before it installs mozilla, and then after installing mozilla automatically suck that in. All of this without ever having to read the Neologic databases directly in mozilla, using the export capabilities already built into the copy of 4.x already on the user's hard disk.

Some more comments of my own: during the migration process of a 4.x profile, you want to call -export using the netscape 4.x client. You want to export the address book to an ldif file called abook.mab and put this file in the same directory as the 5.0 profile we are migrating to. Then you are done! mozilla will automatically detect that abook.mab is an ldif file, magically convert it to the new address book format, and that will be the new personal address book for the new profile.
re-assigning to help wanted
Component: Back End → Address Book
Summary: [HELP WANTED] Hooking up automatic 4.x address book import in the mozilla code base → Hooking up automatic 4.x address book import in the mozilla code base
QA Contact: lchiang → pmock
Assign it to myself..
QA Contact: pmock → fenella
I know there are lots of bugs to be getting on with, but I really do believe this bug is important because the missing import makes things difficult (or complicated) for Communicator 4.x users. Many users will discard a new product if they cannot import their previous settings and info. To assist the transition from older versions of Communicator to mozilla, implementing this feature would be very much welcomed.
QA Contact: fenella → nbaca
Can someone who works on the mozilla profile migration/installer do this?
BTW, this bug should be All/All. Linux users should also be able to import Netscape 4.x addressbook..
the suggestion from comment 0 requires Netscape 4 still to be present... Wouldn't it be better if Mozilla was directly able to import the address book?
Until this task is completed, the warning messages and documentation should be changed. Mozilla just warns you in stderr that "the addressbook migrator is only in the commercial builds". Initially, I didn't know what that meant, and couldn't find anything about commercial builds in Bugzilla. There are many bug reports on importing address books, so it was a while before I found this one, which explained why it didn't work. The Mozilla help text on importing address books currently says you can import Communicator 4.x (pab.na2) formats, but you obviously can't, at least not directly. The only option on the Import screen is "Text file (LDIF, .tab, .csv, .txt)". Finally, I hope that someone contacted Vialogix (the new name for Neologix) to discuss this issue, rather than just assuming they wouldn't allow it. Or was it Netscape Corporation that decided Mozilla couldn't convert 4.x address books, to help retain their market share?
In Mozilla 1.2.1 (Solaris/sparc build), with a fresh profile, I see the following message in the console when selecting File > Send Link...: 'the addressbook migrator is only in the commercial builds'. Otherwise, nothing happens. To what extent is that message related to this bug? (Hardware/OS changed to ALL/ALL per Frederic's comment)
OS: Windows NT → All
Hardware: PC → All
Flags: blocking1.6?
Flags: blocking1.6? → blocking1.6-
Reverse Engineering of NA2 File Format Part 1 - Introduction

The NA2 file format (used by Netscape 4.79 and others) is a big allocable space for data. Very little of the structure seems fixed, except for the very beginning of the file, and maybe a few spots associated with indexing and allocation. I believe allocation of space happens in 8 byte chunks, due to the persistence of 8 byte data arrangements. Fortunately, we have no desire to write to this format, which would be miserable.

The most common basic data structures present are a big-endian long integer and a null-terminated string. I have noticed that the designers of this structure have specified the length of most variable-length fields I have encountered.

I am going to approach the structure of the file from the front end (even though I am picking it apart from the other direction) because the front is the end from which to write an import utility.
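The two primitives described above can be sketched roughly as follows. This is a minimal illustration working on a file already loaded into memory, not code from any attachment on this bug; the names Bytes, readBE32 and readString are made up for the example, and the bounds checks are an added precaution rather than something the format description requires.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical alias: the whole address book file held in memory.
using Bytes = std::vector<std::uint8_t>;

// Decode the 4-byte big-endian unsigned integer stored at buf[pos..pos+3].
std::uint32_t readBE32(const Bytes& buf, std::size_t pos) {
    if (pos > buf.size() || buf.size() - pos < 4)
        throw std::out_of_range("readBE32: read past end of buffer");
    return (std::uint32_t{buf[pos]} << 24) | (std::uint32_t{buf[pos + 1]} << 16) |
           (std::uint32_t{buf[pos + 2]} << 8) | std::uint32_t{buf[pos + 3]};
}

// Read `length` bytes at `offset` as text, trusting the stored length rather
// than scanning for the terminator, then drop any trailing NUL bytes.
std::string readString(const Bytes& buf, std::size_t offset, std::size_t length) {
    if (offset > buf.size() || buf.size() - offset < length)
        throw std::out_of_range("readString: read past end of buffer");
    const auto first = buf.begin() + static_cast<std::ptrdiff_t>(offset);
    std::string s(first, first + static_cast<std::ptrdiff_t>(length));
    while (!s.empty() && s.back() == '\0') s.pop_back();
    return s;
}
```

Stripping trailing NULs means the helper gives the same result whether or not the stored length counts the terminator, which the description leaves open.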
Reverse Engineering of NA2 File Format Part 2 - File Header

The first part I will discuss is the header of the file.

In 0136-0139 hex there is a big-endian long which specifies the offset of a data structure I will call the Other Information List. This offset (0136 hex) may vary, but I doubt it. If it does, the thing I am looking at starts on the 12th byte after the first appearance of the word null. If it does vary in position, then it is 36 hex bytes into a structure which starts at 0100 hex in my example, and the starting offset would be given by the big-endian long in bytes 0008-000b.

In 01fb-01ff hex there is a big-endian long which specifies the offset of a data structure I will call the Email Address List. This offset may also vary. I am looking at the 24th-27th bytes after the second occurrence of the word null (not #null). If this offset varies, then it probably varies with the offset described above; there is nothing that looks like it might be the start of a record between the previous offset and this one. More on that later.

I suspect that there are more addresses encoded in this header portion of the file, including some for mailing list structures and indices. We might be interested in the former, but not the latter.
Reverse Engineering of NA2 File Format Part 3 - Extracting E-mail Addresses

Find the start of the Email Address List:

In 01fb-01ff hex there is a big-endian long which specifies the offset of a data structure I will call the Email Address List. For example only, in my file the bytes in 01fb-01ff hex are 00 00 0d 54 hex, which specify the offset 0000:0d54, where we will find the Email Address List.

At the example offset 0000:0d54 I have the data:

d0 01 00 00 00 01 00 02

I believe this little piece is a header to a record in the database. We'll run into a lot more like this. It starts with one big byte (a record type?) on an offset ending with a 0 or 8 hex, then a little byte, then usually the sequence 00 00 00 01.

Find Email Addresses:

Now we have a list of the offsets at which we will find email address information. My example continues like this:

00 00 0e 60 00 00 00 02

The first 4 bytes are another big-endian long which is the offset of the first email address. The next 4 bytes (00 00 00 02) are the big-endian long representation of the number 2. This might be a primary key.

The list continues in the next 8 bytes. For example only:

00 00 44 77 00 00 00 03

The first 4 bytes are the offset for the next email address. The second 4 bytes are the big-endian number 3.

The list ends when the offset for the next email address is 00 00 00 00. I don't know what happens when the list exceeds the space allocated. I will try that in a bit.

Read out an email address:

The offsets listed in the Email Address List point to another data structure. It starts something like this:

c0 02 00 00 00 01

Then there is a 4 byte big-endian long which is the offset of the display name!
Then there is a 4 byte big-endian long which is the length of the display name!
Then there is a 4 byte big-endian long which is the offset of the nickname!
Then there is a 4 byte big-endian long which is the length of the nickname!
Then there are two bytes (mine are 00 02)
Then there is a 4 byte big-endian long which is the offset of the email address!
Then there is a 4 byte big-endian long which is the length of the email address!

The display name, nickname, and email address are encoded in null-terminated strings. However, I would suggest reading them using the length provided.

So now we can read names and email addresses out of NA2 address books!

This structure that had the locations and lengths of the email addresses continues (though we lose interest) with:

8 bytes of 00s
A 4 byte gobbledygook big-endian big number (probably important)
A 4 byte big-endian little number
A 4 byte big-endian long - the offset of Mystery Structure 1
A 4 byte big-endian long - the length of Mystery Structure 1
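As a rough illustration of the layout described in Part 3, the sketch below walks one Email Address List and decodes one address record from an in-memory copy of the file. It is not taken from the attachments on this bug. The names Bytes, AddressRecord, listAddressOffsets and parseAddressRecord are hypothetical, and the field positions (+6, +10, +14, +18, +24, +28) are simply what the byte counts above add up to, assuming a 6-byte record header and an 8-byte list header.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;  // hypothetical: whole file in memory

std::uint32_t readBE32(const Bytes& buf, std::size_t pos) {
    if (pos > buf.size() || buf.size() - pos < 4)
        throw std::out_of_range("readBE32: read past end of buffer");
    return (std::uint32_t{buf[pos]} << 24) | (std::uint32_t{buf[pos + 1]} << 16) |
           (std::uint32_t{buf[pos + 2]} << 8) | std::uint32_t{buf[pos + 3]};
}

// Read `length` bytes at `offset`, then drop any trailing NUL terminator.
std::string readString(const Bytes& buf, std::size_t offset, std::size_t length) {
    if (offset > buf.size() || buf.size() - offset < length)
        throw std::out_of_range("readString: read past end of buffer");
    const auto first = buf.begin() + static_cast<std::ptrdiff_t>(offset);
    std::string s(first, first + static_cast<std::ptrdiff_t>(length));
    while (!s.empty() && s.back() == '\0') s.pop_back();
    return s;
}

struct AddressRecord {
    std::string displayName, nickname, email;
};

// Collect the record offsets stored in one Email Address List.
// Assumption: an 8-byte list header, then up to 32 entries of
// (4-byte offset, 4-byte number); a zero offset ends the list.
std::vector<std::uint32_t> listAddressOffsets(const Bytes& buf, std::size_t listOffset) {
    std::vector<std::uint32_t> offsets;
    for (std::size_t i = 0; i < 32; ++i) {
        const std::uint32_t recordOffset = readBE32(buf, listOffset + 8 + 8 * i);
        if (recordOffset == 0) break;
        offsets.push_back(recordOffset);
    }
    return offsets;
}

// Decode one address record: 6 header bytes, then (offset, length) pairs.
// Two unexplained bytes sit between the nickname pair and the email pair.
AddressRecord parseAddressRecord(const Bytes& buf, std::size_t rec) {
    AddressRecord r;
    r.displayName = readString(buf, readBE32(buf, rec + 6), readBE32(buf, rec + 10));
    r.nickname = readString(buf, readBE32(buf, rec + 14), readBE32(buf, rec + 18));
    r.email = readString(buf, readBE32(buf, rec + 24), readBE32(buf, rec + 28));
    return r;
}
```

Stopping at the first zero offset follows the prose description; a reader that instead skips zero entries and checks all 32 slots would also be consistent with what has been observed so far.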
Reverse Engineering NA2 File Format Part 4 - "iNoD"s, Address books with more than 32 email addresses.

In the previous section I posed: "what happens when the [Email Address] list exceeds the space allocated"? Here is the answer (after entering a seemingly endless stream of fake email addresses).

When the number of email addresses exceeds the capacity of the email address list (32 in my case), Netscape creates an "iNoD" - maybe it means index node. The header of the file points to the new "iNoD", a list of address lists, and the new "iNoD" points to the previously existing address list and to new ones created to fill the expanding needs of the database.

An "iNoD" looks like this:

54 81 00 00 00 01 00 02 (might change ...)
69 4e 6f 44 (ASCII "iNoD")
4 byte big-endian long, total number of records the iNoD points to

Then it has a bunch of these:

4 byte big-endian long, address of first address list (or perhaps another iNoD if the iNoD overflows?)
4 byte big-endian long, number of addresses stored in that address list. (This is probably used for efficiency in writing to the address book.)

Once again it seems to end when the address specified for the next address list is 00 00 00 00.

There does not appear to be any data anywhere indicating the lengths of the email address lists or the "iNoD"s. Both the email address list and the iNoD have room for 32 entries, so this must be the case.

Coming up next ... Algorithm for extracting Display Names, Nicknames, and email addresses.
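The iNoD indirection described above might be handled along these lines. This is only a sketch over an in-memory copy of the file: isINoD and collectAddressLists are hypothetical names, the tag is matched against the ASCII characters "iNoD", and the nesting limit of 8 is an arbitrary safety guard against corrupt files, not something observed in the format.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

using Bytes = std::vector<std::uint8_t>;  // hypothetical: whole file in memory

std::uint32_t readBE32(const Bytes& buf, std::size_t pos) {
    if (pos > buf.size() || buf.size() - pos < 4)
        throw std::out_of_range("readBE32: read past end of buffer");
    return (std::uint32_t{buf[pos]} << 24) | (std::uint32_t{buf[pos + 1]} << 16) |
           (std::uint32_t{buf[pos + 2]} << 8) | std::uint32_t{buf[pos + 3]};
}

// True if the node at `offset` carries the ASCII tag "iNoD" right after
// its 8-byte header.
bool isINoD(const Bytes& buf, std::size_t offset) {
    static const char tag[4] = {'i', 'N', 'o', 'D'};
    if (offset > buf.size() || buf.size() - offset < 12) return false;
    return std::memcmp(buf.data() + offset + 8, tag, 4) == 0;
}

// Append to `out` the offset of every plain address list reachable from
// `offset`, following iNoDs (and nested iNoDs, if they exist) on the way.
void collectAddressLists(const Bytes& buf, std::size_t offset,
                         std::vector<std::uint32_t>& out, int depth = 0) {
    if (depth > 8) throw std::runtime_error("iNoD nesting too deep");  // assumed limit
    if (!isINoD(buf, offset)) {
        out.push_back(static_cast<std::uint32_t>(offset));
        return;
    }
    // Header (8 bytes) + tag (4) + total count (4), then up to 32 pairs of
    // (address of list, number of addresses in that list).
    for (std::size_t i = 0; i < 32; ++i) {
        const std::uint32_t child = readBE32(buf, offset + 16 + 8 * i);
        if (child == 0) break;  // a zero address ends the iNoD
        collectAddressLists(buf, child, out, depth + 1);
    }
}
```

Calling collectAddressLists on the offset read from the file header yields one or more address list offsets whether or not the book has grown past 32 entries, so the caller does not need to special-case small address books.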
Reverse Engineering NA2 File Format Part 5 - Algorithm for extracting names and email addresses.

First off we need to be able to convert those big-endian longs that show up all over the place into longs for the current system. We are going to deal with these annoying big-endian longs so many times that we might as well make a way to read and convert them easily:

    #include <fstream>
    #include <iostream>
    #include <string>

    // Read a 4-byte big-endian long at the current file position.
    unsigned long readBElong(std::ifstream& file) {
        unsigned char BElong[4] = {0, 0, 0, 0};
        file.read(reinterpret_cast<char*>(BElong), 4);
        return BElong[3] + 256UL * (BElong[2] + 256UL * (BElong[1] + 256UL * BElong[0]));
    }

    // Read `length` bytes at `offset` and print them after `label`.
    void exportField(std::ifstream& ABfile, const char* label,
                     unsigned long offset, unsigned long length) {
        std::string value(length, '\0');
        ABfile.seekg(offset);
        ABfile.read(&value[0], length);
        std::cout << label << ": " << value.c_str() << '\n';
    }

Now, given an offset in an address book file reference (ABfile), we need to be able to read out the email address, etc.

    void importAddress(unsigned long offset, std::ifstream& ABfile) {
        // Read the offset and length for the display name, then export it
        ABfile.seekg(offset + 6);
        unsigned long displayNameOffset = readBElong(ABfile);
        ABfile.seekg(offset + 10);
        unsigned long displayNameLength = readBElong(ABfile);
        exportField(ABfile, "display name", displayNameOffset, displayNameLength);

        // Read the offset and length for the nickname, then export it
        ABfile.seekg(offset + 14);
        unsigned long nicknameOffset = readBElong(ABfile);
        ABfile.seekg(offset + 18);
        unsigned long nicknameLength = readBElong(ABfile);
        exportField(ABfile, "nickname", nicknameOffset, nicknameLength);

        // Read the offset and length for the email address, then export it
        ABfile.seekg(offset + 24);
        unsigned long emailAddressOffset = readBElong(ABfile);
        ABfile.seekg(offset + 28);
        unsigned long emailAddressLength = readBElong(ABfile);
        exportField(ABfile, "email address", emailAddressOffset, emailAddressLength);
    }

Now let's deal with those email address lists (not mailing lists):

    void importAddressList(unsigned long offset, std::ifstream& ABfile) {
        for (int i = 1; i <= 32; ++i) {
            ABfile.seekg(offset + 8 * i);
            unsigned long addressOffset = readBElong(ABfile);
            if (addressOffset != 0) {
                importAddress(addressOffset, ABfile);
            }
        }
    }

And those "iNoD"s:

    void importInod(unsigned long offset, std::ifstream& ABfile) {
        ABfile.seekg(offset + 8);
        if (readBElong(ABfile) != 0x694e6f44UL) {  // ASCII "iNoD"
            importAddressList(offset, ABfile);
        } else {
            for (int i = 1; i <= 32; ++i) {
                ABfile.seekg(offset + 8 + 8 * i);
                unsigned long addressOffset = readBElong(ABfile);
                if (addressOffset != 0) {
                    importInod(addressOffset, ABfile);
                }
            }
        }
    }

And finally the main procedure:

    void importAddressbookEmails(const char* addressBookFileName) {
        std::ifstream ABfile(addressBookFileName, std::ios::in | std::ios::binary);
        ABfile.seekg(0x01fb);
        importInod(readBElong(ABfile), ABfile);
    }

Anyone want to do this?

I may get around to reverse engineering the other information and mailing lists. If I don't, then here's what I know: that hex offset I talked about at the start (0136) contains the offset for "iNoD"s or lists of addresses for Mystery Structure 2. Mystery Structure 2 contains:

a. addresses to records with addresses and lengths of strings
b. some sort of checksum of the string.

Neither the entry nor Mystery Structure 2 specifies which string is which part (organization / home phone number / etc). Their order varies in Mystery Structure 2. I am certain Mystery Structure 1 is related to all this, as it is very small for address book entries with no other information. I haven't looked at mailing lists at all.
Help Please, I need the following files to continue to reverse engineer the na2 file format: An na2 address book containing some mailing lists (my Netscape crashes when I try to make one). I need the following to begin reverse engineering of the nab file format: An nab address book containing more than 32 email addresses, some mailing lists, and a contact filled in with the field names (except email address which should be filled in with So First Name should be First Name, Notes should be Notes, URL should be URL, etc... Your assistance in these matters is greatly appreciated. Thanks in advance for your time and help, Cedric
This is a C++ demonstration of a method for extracting display names, email addresses, and nicknames from an na2 address book. To compile this program using gcc type: g++ -o na2dump To run it, supply as the first command line argument the path to an na2 address book, as in: ./na2dump ~/.netscape/pab.na2
Reverse Engineering NA2 File Format Part 6 - The other information.

I have reverse engineered the email address lists and the other associated information. I can't do the mailing lists without some help (see previous comment).

The data in the na2 file is divided in 2 parts: the names and email addresses, and tables of strings. We pick up now where part 4 left off, in the structures of the email addresses.

Mystery Structure 2 is a table of strings associated with an email address. Its entries are each 8 bytes long. The first 4 bytes identify which piece of information the string is. For example, the Notes field is coded 4e 00 00 00. The second 4 bytes are a LITTLE-endian identifier for the string. For example only, one of mine was c3 6a 55 00.

Now for the structure of the lists of strings:

A 4 byte big-endian long at address 01d8 (the address I reported previously is incorrect) is the address of either a table of locations of strings, or an iNoD of tables of locations of strings. If it is an iNoD, its structure is the same as the structure for iNoDs to tables of locations of email addresses. Otherwise it looks something like this:

90 81 00 00 00 01 00, then 1 small byte

Then there are up to 32 addresses to string records and their identifiers. They are a total of 8 bytes long:

4 bytes big-endian long address of string record
4 bytes BIG-endian long identifier of string record

Following my example, we could have the entry:

00 00 0b 80 00 55 6a c3

The first 4 bytes are the big-endian address of the string record, and the second 4 bytes are the BIG-endian string identifier (00 55 6a c3 here, was c3 6a 55 00 in the table following the email address, etc.). Once again a record with address 0 is to be ignored.

The string records have a structure like this:

c0 02 00 00 00 01 (These bytes vary some - there are always 6 though)
4 bytes address of string
4 bytes length of string

Then more stuff, usually: 00 00 00 01 00 00 00 00 00 00

Reading these strings and connecting them to the email addresses can at best have algorithmic complexity of O(n*log(n)), and in the most memory efficient method has complexity of O(n^2).
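One way to connect the per-contact identifiers to the string records without rescanning the table for every field is to index the table once by identifier. The sketch below does that for a single 32-entry table held in memory. It is an illustration, not the attached converter's approach; readLE32 and indexStringTable are hypothetical names, and it assumes entries start right after an 8-byte table header, as they do in the email address lists. A hash table trades extra memory for expected constant-time lookups; a sorted container would give the O(n*log(n)) behaviour mentioned above.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <unordered_map>
#include <vector>

using Bytes = std::vector<std::uint8_t>;  // hypothetical: whole file in memory

std::uint32_t readBE32(const Bytes& buf, std::size_t pos) {
    if (pos > buf.size() || buf.size() - pos < 4)
        throw std::out_of_range("readBE32: read past end of buffer");
    return (std::uint32_t{buf[pos]} << 24) | (std::uint32_t{buf[pos + 1]} << 16) |
           (std::uint32_t{buf[pos + 2]} << 8) | std::uint32_t{buf[pos + 3]};
}

// Same four bytes, least significant byte first. Used for the identifier
// stored next to each field code in the per-contact table.
std::uint32_t readLE32(const Bytes& buf, std::size_t pos) {
    if (pos > buf.size() || buf.size() - pos < 4)
        throw std::out_of_range("readLE32: read past end of buffer");
    return std::uint32_t{buf[pos]} | (std::uint32_t{buf[pos + 1]} << 8) |
           (std::uint32_t{buf[pos + 2]} << 16) | (std::uint32_t{buf[pos + 3]} << 24);
}

// Build identifier -> string record address for one table of string
// locations. Entries are (4-byte BE address, 4-byte BE identifier);
// entries whose address is 0 are ignored.
std::unordered_map<std::uint32_t, std::uint32_t> indexStringTable(const Bytes& buf,
                                                                  std::size_t tableOffset) {
    std::unordered_map<std::uint32_t, std::uint32_t> index;
    for (std::size_t i = 0; i < 32; ++i) {
        const std::size_t entry = tableOffset + 8 + 8 * i;  // assumed 8-byte header
        const std::uint32_t address = readBE32(buf, entry);
        if (address == 0) continue;
        index[readBE32(buf, entry + 4)] = address;
    }
    return index;
}
```

With the index built, a contact's 8-byte field entry is resolved by reading its second half with readLE32 and looking the result up in the map; the example bytes c3 6a 55 00 and 00 55 6a c3 both decode to the same number this way.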
Attached file na2 format reading example (obsolete)
Attached is a C++ program which converts an na2 address book to an LDIF file. It serves as an example of how to read from an na2 file.

It implements the following: Reads email addresses and all contact fields from contacts in an na2 file and outputs them in LDIF format.

It does not implement: Reading and exporting mailing lists. If you would like a version that does, please provide an na2 file here which contains at least 2 mailing lists.

To compile: g++ -o na2toldif
To run: ./na2toldif [path to an na2 file]

Examples:
./na2toldif ~/.netscape/pab.na2
Write it to an ldif file: ./na2toldif ~/.netscape/pab.na2 > pab.ldif

What we need to do to swat this bug, and what you can do to help:

1. Implement mailing list import: Please provide sample address books from Netscape 4.x containing mailing lists. (I can't make them because the Netscape I have crashes whenever I try to.)
2. Reverse engineer the .nab format: Please provide sample address books from whenever Netscape was using .nab files. I don't have a version that does this.
3. Move code into the Mozilla tree: I don't have a clue what to do ... I'll need a lot of help and advice once we get to this point.
Attachment #137075 - Attachment is obsolete: true
> 3. Move code into the Mozilla tree the license of the attachment here prevents that. mozilla code needs to be MPL/GPL/LGPL tri-licensed.
Attached is an example of how to read from the Netscape na2 address book format, in the form of an na2 to ldif converter.

Features: Released under the triple licence (MPL/GPL/LGPL) required for incorporation into Mozilla. Reads email addresses, associated strings, and mailing lists. Now reads the header correctly.

Compilation: g++ na2toldif.cpp -o na2toldif
Usage: ./na2toldif [path to na2 file] > [ldif file to save as]
Example: ./na2toldif ~/.netscape/pab.na2 > Netscape4.ldif

Notes: The previous discussion here about the format of the file header, the initial offsets for email addresses, etc. is incorrect. These issues and the format of the mailing lists are documented in the provided program.

Thanks to Christian Biesinger for pointing out the licence issue.

What do we do now?

--Cedric
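For readers who have not seen the target format, a single contact written as an LDIF record could look roughly like what the sketch below produces. This is not the attachment's output routine: toLdifEntry is a hypothetical helper, the choice of object classes and attributes is an assumption, and it only handles values that are plain ASCII with no commas, line breaks, or leading space, colon, or '<', so that no DN escaping, base64 encoding, or line folding is needed.

```cpp
#include <sstream>
#include <string>

// Format one contact as an LDIF record followed by the blank line that
// separates records. Assumes "safe" values only (see the note above);
// anything else would need escaping or base64 encoding first.
std::string toLdifEntry(const std::string& displayName, const std::string& email) {
    std::ostringstream out;
    out << "dn: cn=" << displayName << ",mail=" << email << '\n'
        << "objectclass: top\n"
        << "objectclass: person\n"
        << "objectclass: organizationalPerson\n"
        << "objectclass: inetOrgPerson\n"
        << "cn: " << displayName << '\n'
        << "mail: " << email << '\n'
        << '\n';
    return out.str();
}
```

Writing the result of this function for each decoded contact to standard output would give a file that can be redirected the same way as in the usage example above.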
Attachment #137100 - Attachment is obsolete: true
Not sure, if this is still needed. Adding Navigator 4 address book containing 4 entries plus a list, containing 3 of the 4.
Product: Browser → Seamonkey
Just one comment: why isn't this in the Installer component? Installer Component: Import Wizard. Wouldn't it be more thorough just to ensure that Thunderbird supports Netscape 4.x using the import wizard in the manner described in comment 1, and then port the Import wizard to the Mozilla Suite? Why not make a separate executable utility that does the job? We link to it from the release notes as a temporary workaround. This doesn't seem to be impacting that many people anyway, because they've figured out that they can export from Netscape 4.x as LDIF and import into whatever mail client they desire. Shouldn't that workaround also be in the release notes? --Sam
Component: Address Book → MailNews: Address Book
Product: Mozilla Application Suite → Core
QA Contact: nbaca → addressbook
(In reply to comment #54) > *** Bug 348541 has been marked as a duplicate of this bug. *** > This is obviously a huge flaw and as a commercial user, I can't use this software. I was scolded for reporting two bugs on one post "no address import from Netscape 4 and contact card screen sizing." This is one problem. Your address book doesn't work. Instead of acting like bug police, please address your flaw. Thanks
(In reply to comment #55) > (In reply to comment #54) > > *** Bug 348541 has been marked as a duplicate of this bug. *** > > This is obviously a huge flaw and as a commercial user, I can't use this software. I was scolded for reporting two bugs on one post "no address import from Netscape 4 and contact card screen sizing." This is one problem. Your address book doesn't work. > Instead of acting like bug police, please address your flaw. > Thanks Further note: The ldif transfer successfully imports the Netscape 4 address book, however in a different format, i.e. groups are alphabetized with names. Thanks for your help. Still need relief with bug #63941 >
per bienvenu "Definitely wontfix - we would only do this if we had some large group of users asking us to do it, and that hasn't happened. And if it happens, we can always change our mind" => WONTFIX as several others concur
Closed: 17 years ago
Resolution: --- → WONTFIX
Whiteboard: workaround comment 52
xref Import Address Book from Another E-Mail Client
Product: Core → MailNews Core
