Closed Bug 364808 Opened 18 years ago Closed 15 years ago

Mail Store: Use a SQLite Database for Storing Mails

Categories

(Penelope Graveyard :: General, enhancement, P5)

enhancement

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 361807

People

(Reporter: kamikazow, Assigned: mdudziak)

Details

User-Agent:       Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en; rv:1.9a1) Gecko/20061031 Camino/1.2+
Build Identifier: 

Currently Thunderbird can lose lots of mails when crashing during a write
access in the mail database.
Since SQLite will be part of Mozilla 1.9 anyway (for Firefox's Places system)
Penelope should use this as well for storing mails.
Penelope's mail database would become "crash proof".

This is the Penelope version of bug 361807

This may also be a possible solution for bug 359311

Reproducible: Always
This would be a major, major change to the way Thunderbird (Penelope) works, but might be useful....

Let's see how many votes we get.
Severity: major → enhancement
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: Use a SQLite Database for Storing Mails → Mail Store: Use a SQLite Database for Storing Mails
I do not know about using SQLite in Penelope, but a crash recovery system is necessary.

Most times Eudora "rebuild mailbox" got the job done.

A couple of times I had to make a fresh Eudora2 installation, download all the mail on the server, and move the missing emails to a "transfer" mailbox.  I then copied the transfer mailbox to the folder of the old version of Eudora, opened the old Eudora, moved the emails to the inbox, and deleted the spam by hand.  Very tedious and time consuming.  Anything that would make that kind of process unnecessary has my vote.
NO! Please don't store mails in a database! The major reason why I use Eudora and Thunderbird is because they store emails in plain text files. This is not just an important feature for me in a mail program. Any program which doesn't store mails as plain text is for me completely out of the question.

Do I need to explain here why? Isn't it obvious? Well, in short:

- In case of a disk crash, I can easily get to mails which don't happen to be on one of the bad disk blocks. And I can use simple text tools to restore everything that is still readable on disk. (I know, I did that for myself, and for others. A few lines of Perl were all that was needed)

- I can use standard text tools to search for specific mails. This is useful for fuzzy searches in huge mailboxes.

- I don't need any sort of special program to be able to read old mail. In 20 years, on my new Mawinux version 9.3 OS, I will be able to read my old mails with the text editor available on that platform, even if nobody remembers what SQLite used to be. (well, if I have the files on a readable medium)

- etc.

SQLite should be used for the address book and such things, but not for the mails.

If the mail store needs to be improved, make it Maildir format (one file per mail instead of big files with many mails)
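The plain-text recovery described in the comment above (done there with a few lines of Perl, which are not shown in this bug) could look roughly like the following Python sketch. It assumes a standard mbox layout where each message begins with a "From " separator line; the function names are made up for illustration, and a body line that starts with an unescaped "From " would be treated as a message boundary.

```python
import re

# In mbox, every message starts with a "From " separator at the start of a line.
FROM_LINE = re.compile(rb"^From \S+.*$", re.MULTILINE)

def split_mbox(raw: bytes) -> list[bytes]:
    """Split raw mbox bytes into messages, skipping anything before the first separator."""
    starts = [m.start() for m in FROM_LINE.finditer(raw)]
    ends = starts[1:] + [len(raw)]
    return [raw[s:e] for s, e in zip(starts, ends)]

def readable_messages(raw: bytes) -> list[bytes]:
    """Keep only chunks that still have the blank line separating headers from body."""
    return [m for m in split_mbox(raw) if b"\n\n" in m]
```

Because the scan works on raw bytes, it can be pointed at whatever is still readable from a damaged file; chunks that were cut off before the header/body separator are simply dropped.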
i don't think M's points are valid.

"- In case of a disk crash":
restore the backup. how often does a HDD crash and how often does TB crash? while TB is not buggy, it can still crash or the OS can crash. a crash while receiving mails results in a corrupted mailbox. the usage of a transaction safe database like SQLite would prevent that.
one of the reasons MS outlook is that widely used is its usage of a database system. mails don't get lost when the mail client or the OS hangs.
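A minimal sketch of what "transaction safe" means here, using Python's built-in sqlite3 module. The `mail` table and the `fail_after` parameter are invented for the illustration, and the "crash" is only simulated with an exception; it is not a claim about how Thunderbird or Penelope would implement storage.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open a store with a hypothetical single-table layout."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS mail (id INTEGER PRIMARY KEY, sender TEXT, body TEXT)")
    return conn

def store_batch(conn, messages, fail_after=None):
    """Insert a batch in one transaction; fail_after simulates a crash mid-write."""
    try:
        with conn:  # commits if the block finishes, rolls back on an exception
            for i, (sender, body) in enumerate(messages):
                if fail_after is not None and i == fail_after:
                    raise RuntimeError("simulated crash")
                conn.execute("INSERT INTO mail (sender, body) VALUES (?, ?)", (sender, body))
    except RuntimeError:
        pass  # the partial batch has already been rolled back

def count(conn):
    return conn.execute("SELECT COUNT(*) FROM mail").fetchone()[0]
```

A batch that fails halfway leaves the table exactly as it was before the batch started, so earlier mail is not affected by the interrupted write.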

"- I can use standard text tools to search for specific mails.":
you use text search tools to look through a huge mbox file? with a SQL database you could search through the entire thing within seconds.

"- I don't need any sort of special program to be able to read old mail. In 20 years":
nobody is saying that TB can't have an option to export as mbox file.

"If the mail store needs to be improved, make it Maildir format (one file per
mail instead of big files with many mails)":
do you have an idea how **** this is for the file system? a typical cluster on a file system is 4 KB in size. now consider thousands of mails each 1-2 KB. you can't store 2 mails in a single cluster when they are separate files. this results in lots of wasted HDD space.
mbox stores binary attachments encoded as ASCII. this is also wasted space.
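The two space arguments above can be put into numbers with a small sketch. It assumes the 4 KB cluster size named in the comment and that every file occupies whole clusters; the sample sizes are illustrative, not measurements, and file systems that pack small files behave differently.

```python
import base64
import math

def slack_bytes(file_sizes, cluster=4096):
    """Bytes allocated but unused when every file occupies whole clusters."""
    allocated = sum(math.ceil(size / cluster) * cluster for size in file_sizes)
    return allocated - sum(file_sizes)

def base64_size(n_bytes):
    """Length of n_bytes of binary data after base64 encoding (without line breaks)."""
    return len(base64.b64encode(bytes(n_bytes)))

# 1000 mails of 1.5 KB each, stored one file per mail
print(slack_bytes([1536] * 1000))  # 2560000
# a 3000-byte attachment encoded as ASCII
print(base64_size(3000))           # 4000
```

Under these assumptions, one file per mail leaves about 2.5 MB unused for 1.5 MB of mail, and base64 encoding grows an attachment by roughly a third.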

a SQLite database is fast, reliable, and compact.
Status: NEW → ASSIGNED
NO. The reason I use Eudora is because its mailboxes are NOT in a horrendously large and opaque database. Moreover, I make daily incremental backups, so I only have to back up the mailboxes that actually have changes in them; not an entire humongously large file every day.
> Currently Thunderbird can lose lots of mails when crashing during a write
> access in the mail database.

I understand the theory, but does it happen in practice? In other words, is this a solution looking for a problem? 

I haven't encountered the problem in supporting several small businesses on Mozilla/TB mail clients (and Eudora), in my own usage, or providing end-user support in Mozillazine forums. In fact, they've been exceptionally stable -- one reason I use these mail clients -- and I've had more problem with db-based clients like Outlook.

There are other headaches, like the time it takes to reindex large mailboxes and mailbox size limits. Something that fixes those bugs, and has other attractions, might be useful.

I prefer the text format for compatibility, both now and in the future. In 10+ years, will I be able to read a SQLite db created today? Where can I find 10 year old db apps today? I can read any mbox file ever created, though, even 15 years old.

As they say KISS, or more precisely, let's keep it 'as simple as possible, and no simpler'.

(BTW, the maildir bug is Bug 58308)
I oppose a compulsory change for Thunderbird (see Bug 361807 Comment 6). There is, unfortunately, no mechanism for a vote against, hence this comment.

Markus S.: mbox format is not as vulnerable to corruption as you paint it to be, and even if it is corrupted you can follow your own advice and "restore the backup". Disk space is cheap and attachments can be detached/deleted if it's really a problem.
Pardon me if this is bugspam. Is there any possibility of leaving the mbox or single file/message storage formats, and using something like MozStorage just as an organized list of URIs (much like bookmarks) which points to either the local files, or the messages on the server?

From the extensions point of view, the current database just confuses me. Maybe it's fear of Mork, but trying to search or filter messages across all accounts seems difficult. SQLite I'm at least partially comfortable with, and that base makes digging into things much easier.

I can understand the push against creating a single mass database for everything, but it seems like there's a compromise in there where a more standard database format is used and standard mail storage formats are used at the same time.
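The compromise suggested in this comment, with mail staying in its existing files and a SQLite database holding only an index of URIs, might be sketched as follows. The table layout, function names, and URI strings are hypothetical and used only for illustration; this is not the MozStorage API.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS message_index (
    uri     TEXT PRIMARY KEY,  -- where the message really lives (local file or server)
    account TEXT NOT NULL,
    sender  TEXT,
    subject TEXT
)
"""

def open_index(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn

def add_entry(conn, uri, account, sender, subject):
    """Record one message; the message body itself is never copied into the database."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO message_index VALUES (?, ?, ?, ?)",
            (uri, account, sender, subject),
        )

def search_subject(conn, term):
    """Return URIs of matching messages across all accounts."""
    rows = conn.execute(
        "SELECT uri FROM message_index WHERE subject LIKE ? ORDER BY uri",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]
```

If the index is lost or corrupted, it could in principle be rebuilt from the mail files, which keeps the plain-text store as the authoritative copy.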
Priority: -- → P5
One SQLite database per mailbox is the best way to go. And it's quite easy to write a script to export from such a database, I already did it (in PHP).
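The PHP export script mentioned above is not attached to this bug. As a rough idea of what such an export could involve, here is a Python sketch using the standard sqlite3 and mailbox modules. It assumes a hypothetical `mail` table with `id`, `sender`, `subject`, and `body` columns; a real store would need to keep the full headers and attachments as well.

```python
import mailbox
import sqlite3
from email.message import EmailMessage

def export_to_mbox(conn, mbox_path):
    """Write every row of the assumed `mail` table to an mbox file; return the count."""
    box = mailbox.mbox(mbox_path)  # created if it does not exist yet
    exported = 0
    try:
        rows = conn.execute("SELECT sender, subject, body FROM mail ORDER BY id")
        for sender, subject, body in rows:
            msg = EmailMessage()
            msg["From"] = sender
            msg["Subject"] = subject
            msg.set_content(body)
            box.add(msg)
            exported += 1
    finally:
        box.close()  # flushes pending messages to disk
    return exported
```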
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → DUPLICATE