Closed Bug 382876 Opened 17 years ago Closed 4 years ago

[meta] remove mork usage from address book

Categories

(MailNews Core :: Address Book, task, P2)

Tracking

(thunderbird_esr78 wontfix)

RESOLVED FIXED
82 Branch
Tracking Status
thunderbird_esr78 --- wontfix

People

(Reporter: stever, Unassigned)

References

(Blocks 1 open bug, )

Details

(Keywords: meta, Whiteboard: [gs])

Attachments

(1 file)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)
Build Identifier: sqlite address book support

It would be nice to have the address book in the sqlite format.  This would make it easier for applications to retrieve, update one address book.

Reproducible: Always
Status: UNCONFIRMED → NEW
Component: Address Book → MailNews: Address Book
Ever confirmed: true
Product: Thunderbird → Core
QA Contact: address-book → addressbook
Version: unspecified → Trunk
(In reply to comment #0)
> It would be nice to have the address book in the sqlite format.  This would
> make it easier for applications to retrieve, update one address book.

It would possibly make it easier for the address book, but not necessarily multiple applications. Last time I checked there were locking problems using the sqlite format. See http://developer.mozilla.org/en/docs/Storage#How_to_corrupt_your_database for more info.

However I'm not against it especially if someone steps up to do the work.
Keywords: helpwanted
Blocks: 11050
Taking this bug, re bug 11050 comment #25
Assignee: nobody → Pidgeot18
Keywords: helpwanted
Mark within bug 11050 you mentioned flexibility. Does that also give us the option for user defined fields? Or even fields which should exist while accessing a system address book e.g. under OS X for the moment. That would be further a great enhancement and should be planned carefully side by side with the schema. 
(In reply to comment #3)
> Mark within bug 11050 you mentioned flexibility. Does that also give us the
> option for user defined fields? Or even fields which should exist while
> accessing a system address book e.g. under OS X for the moment. That would be
> further a great enhancement and should be planned carefully side by side with
> the schema. 

It is intended that we'll have flexibility in the schema to allow for future functionality. Just adding flexibility to the schema won't fix us having user defined fields - indeed although its difficult, for current mork books it is possible and would probably need some UI work, for other AB types again it probably just needs a bit of UI work and some backend work.

There are various changes that I expect we'll need to do alongside implementing this schema that will enhance the flexibility of the address book overall.
Bug 118665 could be a bit if a special case for "user defined fields". (Just to keep in mind.)
Current state of things:

I haven't had much time to work on it the past week, so it's not terribly far. The big holdup is that I wish to duplicate interfaces as little as possible. Only three interfaces need to be duplicated (the MDB card and directory, as well as the database). Of those, the database is the largest roadblock; if the differences between the SQL and the MDB backend could be merged together, the largest need for separate instances of the other two would be destroyed. I don't want to actually get around to implementing the SQL backend until after the interfaces are settled.

So, in the meantime, here is my roadmap for work:
1. Fix up the MDB card and directory impls and interfaces. I see some various places where I could tune these interfaces up that don't fall under the scope of this bug, but which would be desirable to fix in the course of this bug.
2. Massage the nsIAddrDatabase to hopefully be SQL/MDB agnostic (would probably also simplify interface).
3. Implement the SQL backend.
4. Change the UI to accommodate SQL.
5. Change the assumption elsewhere.

An example of some of the changes in the first step are as follows:
* Use AString instead of wstring and ACString instead of string
* Simplify nsIAbMDBCard to only have one UID to better mesh with SQL.
* Remove nsAbMDBCardProperty

Some refactoring of mailing lists are also desirable here.

I hope to have much of step 1 posted in a patch this weekend. Other factors will probably intervene, though...
Status: NEW → ASSIGNED
(In reply to comment #6)
> 1. Fix up the MDB card and directory impls and interfaces. I see some various
> places where I could tune these interfaces up that don't fall under the scope
> of this bug, but which would be desirable to fix in the course of this bug.
> 2. Massage the nsIAddrDatabase to hopefully be SQL/MDB agnostic (would probably
> also simplify interface).

I would be quite happy to drop the nsIAddrDatabase interface altogether - all functions should really be accessible via nsIAbDirectory or nsIAbCard (inserting SQL/MDB/whatever where necessary) - so don't limit yourself unnecessarily. Though I've got a feeling when I thought about this the other day, I ended up thinking we'd have to keep it anyway. The point is, if a user of the directory is using something derived from nsIAddrDatabase, then their code must be MDB/SQL specific, which is what I really don't like here, for example see how we've limited ourselves in palm sync and in the import/export code by using nsIAddrDatabase. It may also be that we could not expose the interface, but just have a separate object in the back-end to handle the database functions.

Anyway, you're a bit closer to that bit of the code than I am at the moment. I just wanted to point out that we shouldn't limited ourselves to the existing interfaces.

> An example of some of the changes in the first step are as follows:
> * Use AString instead of wstring and ACString instead of string

I'm definitely not against those, it should help the l18n in address book as well.

> * Remove nsAbMDBCardProperty

I'm not against this, but be aware of the palm sync extension code that I think uses it as well.
This is a rough listing of what I would like to see in terms of interfaces. Some more notes:

nsIAbMDBCard becomes nsIAbDBCard (with MDB and SQL implementations, among others in import/). I also envision seeing it tied only to its parent DB and being impossible to move it from its DB.

nsIAbMDBDirectory similarly becomes nsIAbDBDirectory. I have looked at this the least, so I have practically no opinions on how to change.

It is nsIAddrDatabase that I focus on the most. One of the opening functions is dropped, a few methods are renamed, nsIMdbRow is purged, and the addCardXXX methods are dropped to be replaced with one of two functions. I also got rid of what may be some unused functions.

Questions/comments/concerns?
(In reply to comment #7)
> (In reply to comment #6)
> ...
> > * Remove nsAbMDBCardProperty
> 
> I'm not against this, but be aware of the palm sync extension code that I think
> uses it as well.

if there is a better way in http://mxr.mozilla.org/mozilla/source/mailnews/extensions/palmsync/src/nsAbIPCCard.h then I'm all for it.

and nsIAbMDBCard :
http://lxr.mozilla.org/mozilla/source/mailnews/extensions/palmsync/src/nsAbPalmSync.h#143
http://lxr.mozilla.org/mozilla/source/mailnews/extensions/palmsync/src/nsAbPalmSync.cpp#159
From the database corruption link in comment #2, it mentions the following as one of the activities that can cause corruption:

---
Open the database from an external program while it is open in Mozilla. Our caching breaks the normal file-locking in sqlite that allows this to be done safely.
---

It seems that this places a great limitation on the utility of the sqlite backend.  Is there a document which can explain why Mozilla needs to cache the database such that external programs cannot perform concurrent access?
Depends on: 413260
I would like to add one more thing.  I have been following the use of Mork in Mozilla applications since 2005 when I first wrote a Mork parser in Python.  I needed a way to read and write address book files, and given that sqlite was "right around the corner", I wrote a quick script to take care of my immediate need.

It is now nearly 3 years later and there is yet to be an upgrade to the Mork format.  From a technical perspective, this is puzzling, since the Mork format is certainly not a superior database format in any respect.  On the contrary, the choice to Mork as a database backend is questionable at best.

I would greatly appreciate if we could start a discussion on what exactly is hanging up the replacement of Mork in Mozilla.  If sqlite has technical limitations, it would be good to air them.  Conversely, if alternative formats such as Berkeley DB are not suitable, that would be useful information.

At a bare minimum, it would seem that if nothing is suitable, at least an alternative database format can be implemented that is not nearly so painful to parse and process with external programs.

Besides merely the replacement of Mork, it would be useful to discuss the use cases that are relevant to users of Mozilla.  For example, many users want to read and write the database files while Mozilla is running.  However, the current implementation of sqlite seems to disallow this mode of operation.  It would be unfortunate to rush ahead with a Mork replacement, but still impose unreasonable limitations that will cause problems in the long run.

As a comparison, the KDE productivity apps do a very good job of external application access.  For example, Kontact can store the address book in a standard VCF file, dynamically monitoring changes by external applications.  This makes writing an addressbook synchronizer very straightforward.

I welcome everyone's thoughts on this issue.  With the community's input, I'm sure that this problem can be resolved effectively.
(In reply to comment #11)
> I would greatly appreciate if we could start a discussion on what exactly is
> hanging up the replacement of Mork in Mozilla.  If sqlite has technical
> limitations, it would be good to air them.  Conversely, if alternative formats
> such as Berkeley DB are not suitable, that would be useful information.

In short, MailNews hasn't replaced Mork as no-one until now has volunteered to do the work. I believe some of the issues/problems with the current sqlite implementation may have been relieved on some of the trunk builds. Work is currently blocked by the fact we need to improve some of our internal interfaces first, otherwise we'll end up doing the work twice.

However, the sort of discussion you want is best suited to the newsgroups, the best one for this being: mozilla.dev.apps.thunderbird.
Unless anyone has a major objection, I'd like the comments to be put in this bug, since it is directly relevant to the task.  I may have some time to do this, but probably not for the next couple of months given the size of the task.  If some other folks would like to help, I can lend a hand in getting it done.

I understand that the interfaces are less than perfect.  There is always a good argument against doing things twice.  However, in this case, it seems to have become paralysis by analysis.  If an interim solution had been implemented 3 years ago, there could have been much incremental improvement by today.

Here is my proposal:  How about replacing the lowest level file operations of Mork with a sqlite backend?  In other words, don't worry about the higher-level functionality.  Simply swap out the convoluted text database format with sqlite so that the data is easily accessible.  Then, as the sqlite backend gets improved, the use cases such as concurrent access can be realized.

Mork seems to just be a set of tables and records with fixed fields.  A draft schema for sqlite might be the following:

TABLE   -> (field, field, ...)

field   -> (id, name)
scope   -> (id, name)
table   -> (id, name, scope_id)
record  -> (id, table_id, scope_id)
content -> (id, record_id, field_id, value)

This is similar to the in-memory approach that I had used with my Mork script.  With it, I was able to de-construct and construct the Mork format for at least the Thunderbird addressbook.

Does anybody else agree with this approach?
1) A Bug is no discussion forum, bug reports should focus on the work to fixing an issue, not on the discussions of what possible ways of solving there are, because bugs don't have threading, marking of read/unread posts, etc. Newsgroups are better suited for that.

2) The mozStorage backed by SQLite as the database engine are already integrated in core Mozilla code and shipped with Thunderbird and SeaMonkey, so that's the obvious choice for the database backend.

3) Many of our current interfaces are tied to the Mork backend, and a list of currently known bugs are caused by the choice of the current interfaces, so it's an obvious thing to first get an impression of what the targeted interfaces look before creating new database schemes. Changing database structure after it already has been used intensively is always lot of work which nobody wants to take on, we should avoid that if in any way possible.

4) As Mark has said (and he probably knows best), the problem here it that's a lot of tedious and unspectacular work, which, in volunteer-driven projects as ours, is something most people don't find a lot of motivation in, and so it isn't being done.
I apologize for responding so late, but gmail has been marking my bugmail as spam for the past week...

(In reply to comment #13)
> Does anybody else agree with this approach?

I would tend to disagree with this approach. The main reason is that, having written code using the Mork interfaces, I find that the interfaces are shoddy and highly prone to leaks and unexpected bugs. Not to mention that I don't think the database has had anyone maintaining it in a long while. The sole uses of Mork, in fact, are address book and message database now that places uses SQLite.

As the main person involved with doing the rewrite, I will be adding the SQLite backend as soon as the interfaces stabilize (see the bug blocking this one). This is a high priority bug, but unfortunately a large-scale one being blocked by another large scale bug.
Assignee: Pidgeot18 → nobody
Status: ASSIGNED → NEW
Flags: wanted-thunderbird3?
(sorry about the spam, hand slipped)
Assignee: nobody → Pidgeot18
Depends on: 424446
Product: Core → MailNews Core
Blocks: 453975
Just Updating and reminding all off this enhancement.
Can't leave ideas like this in closet.
(In reply to comment #18)
> Just Updating and reminding all off this enhancement.
> Can't leave ideas like this in closet.

If you've read the comments, you'll know that I have already started working on this. It almost certainly won't be finished by TB 3, since there's a lot of effort elsewhere that needs to be finished first.
Thanks Joshua, good to know there's progress
I don't know if anyone has mentioned it - but the address book needs to live on the IMAP store. Many of us are reading our email from multiple computers - sometimes simultaneously. I have had to resort to using the Sync Kolab add-on to keep my address books up to date.

However, the whole idea of keeping an address book local to a PC in a networked world is just obsolete. The address book should live in the IMAP store (POP is also an obsolete protocol).

Is anyone working on having the address book kept in the IMAP store?
(In reply to comment #21)
> I don't know if anyone has mentioned it - but the address book needs to live on
> the IMAP store.

That would be bug 160523.

> However, the whole idea of keeping an address book local to a PC in a networked
> world is just obsolete. The address book should live in the IMAP store (POP is
> also an obsolete protocol).

IMAP doesn't work for everyone in all situations. It isn't the only solution. However let us not discuss that here as it isn't relevant to this bug.
Before inventing a new wheel, one should have a look at alpine/pine (mail software). It uses remote config and remote address book (via IMAP) for quite some years and it would be nice to have ONE address book when using both applications. "alpine" is still under active development, have a look here:

http://sourceforge.net/projects/re-alpine/
(In reply to comment #23)
> Before inventing a new wheel, one should have a look at alpine/pine (mail
> software). It uses remote config and remote address book (via IMAP) for quite
> some years and it would be nice to have ONE address book when using both
> applications. "alpine" is still under active development, have a look here:

There is no desire to shackle the Thunderbird or SeaMonkey address books to the constraints of other address book formats (e.g., Apple's OS X Contacts, etc.).

What this bug seeks to do is to remove the mork implementation for local address books in favor of an SQLite implementation, due to the desire to rid ourselves of a crappy database implementation. If we change our minds about this, we will probably close this bug and use another one, so please stop trying to suggest other alternatives for implementation here.
I realise that you may want to focus on the SQLlite implementation, but I definitely support the sentiment of Stefan in wanting an address book that can be synchronized over IMAP.

Perhaps this could at least be allowed for in the new SQLlite implementation by leaving space for some fields giving a last updated time (or sequence #) and unique ID to each entry. That could make the task of synchronizing simpler.
There are a whole load of things we can do if we rework the database structure and we will be starting to look at them soon.

Rather than trying to discuss/pre-empt it here. I suggest you keep an eye on the news.mozilla.org newsgroup "mozilla.dev.apps.thunderbird" or via the equivalent mailing list here: https://lists.mozilla.org/listinfo/dev-apps-thunderbird

When we start fleshing out more details and ideas, that is where we'll be posting them.
Flags: wanted-thunderbird3?
Blocks: 387334
This is not just an enhancement, as the Mork address book is prone to corruption (see bug 366457).
(In reply to VanillaMozilla from comment #28)
> This is not just an enhancement, as the Mork address book is prone to
> corruption (see bug 366457).

Yup, I agree.

I had read several times the article at https://developer.mozilla.org/en/Mork_Why since several years.  There is sentence: "Mork was invented out of whole cloth to satisfy requirements for flexibility, resistance to corruption, and brevity in file format for expected content."

Let's see the points:
* flexibility -- visibly yes, because we can add a new field at the beginning of the file

* brevity in file format for expected content -- yes, by design... well, if you don't try to read it as a text file....

* resistance to corruption -- well, well, well, I've been trying very hard to see what justify this point, but I can't ...

Take an example: The Mork format relies on pointers at the end of the file, those ^XX^YY

Let's say I have ^84^A1 binding LastName to something, but a corruption changes A1 to B1, we still have wrong data.

I really don't see how Mork is resistant to corruption.
Unless you're offering to do the replacement, this kind of discussion is inappropriate for a bug. All it is doing is cluttering up the bug and making it less likely that someone will work on it.
Assignee: Pidgeot18 → nobody

I think IndexedDB would be preferable, as sqlite is not web compatible.

Type: enhancement → task

Let's use this as the tracker bug for mork removal from address book.

Depends on: 1572324
Summary: requesting conversion from mork db to sqlite for addressbooks → remove mork usage from address book
Depends on: 1581765
Keywords: meta
Summary: remove mork usage from address book → remove mork usage from address book [meta]
Summary: remove mork usage from address book [meta] → [meta] remove mork usage from address book
No longer blocks: 11050
Depends on: 1660126
Priority: -- → P2

Fixed by bug 1660126. Goodbye Mork address book, I won't miss you.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 82 Branch
You need to log in before you can comment on or make changes to this bug.