Closed Bug 1921394 Opened 1 year ago Closed 6 months ago

Database structure for folder hierarchy

Categories

(MailNews Core :: Database, task)

Tracking

(Not tracked)

RESOLVED FIXED
134 Branch

People

(Reporter: darktrojan, Assigned: darktrojan)

References

(Blocks 2 open bugs)

Details

Attachments

(24 files, 5 obsolete files)

48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review

In this bug I'll be posting my patches relating to storing folders in the global database. At this point it's experimental, does nothing with actual mail folders, and isn't connected to anything. Nothing about this is finalised, especially the names of various things, but we are pretty happy with the structure so far.

I've been thinking about this a bit, and here are some more thoughts:

  • Folders are mostly reflection of structure on a remote server or (in the case of local folders) the filesystem.
  • We need the concept of folders in a global database so we can associate messages with them (messages are inside one or more folders).
  • We get folder names and hierarchy from the server (or filesystem). But network traffic is slooooooow, so it makes sense to cache names/hierarchy locally once we have them.
  • The server always holds the Authoritative Truth (tm).
  • We have local, not-on-the-server data associated with folders too, eg ordering, view settings, localised names for special folders etc.
  • In the current database-per-folder system, there is a folderInfo struct which holds such stuff (although I don't think that captures hierarchy - I think the code uses the filesystem structure for hierarchy, updated with info from server when available).
  • For folders, we are synchronising three things: the server state, the in-memory state and the database state. Keeping the latter two in step isn't so bad as long as you're ruthless in guarding access to the mechanisms of change. It'll suck if you can update one representation without updating the other :-)

I would be trying to think of terms of the folders in the database being cached representations of server-side objects, plus some extra non-server app-side data stuck onto them (and presumably add-ons would like the ability to slap extra custom data onto folders too).

So to me, the interesting kinds of questions are:

  • how do we discover folders (from remote servers and local folder filesystem)
  • how do we deal with server-side folder changes? eg think of a yahoo IMAP account where the user renames a folder through the webmail interface while not using TB. What happens the next time TB runs?
    • I don't think a client can catch this with vanilla IMAP... It'll just be one folder disappearing (deleted!) a new one showing up, containing messages to (re)download.
    • I'm pretty sure ews and jmap would have some kind of immutable server-assigned id for folders so they can be moved around and reparented without fuss. So we should support that.
    • I'd bet there are IMAP extensions which support that too (objectID?).
  • How do we resolve local "speculative" operations with eventual server success/failure - eg a user reparents a folder locally:
    • we want to update our local representation immediately (we're a native app - everything should be instant, dammit!)
    • we kick off a network operation to do the same thing on the server
    • if the server succeeds, great, all done here.
    • if the server operation fails... then we need to roll things back and make sure our local folder data (and the messages therein) are still in sync with the "true" state on the server.
    • There's the ugly case where we don't know if it succeeded or failed! Start operation, network goes down... what to do in such cases? (we need general sync-with-server code to sort it out - such a sync mechanism should probably also be used by the initial connection to the server) What does the IMAP code handle this? No idea - I'm sure it mostly all just works, but I bet there's lots of duplication and ugly corner cases in there, with lots of potential to go horribly wrong. It'd be nice to think about this in a more protocol-neutral way.

Each server will be represented as a separate root in the folders table.

I'm still very nervous about the use of the nested set model when we have recursion at our disposal. Please read up on the risks of the large updates necessary when new folders are created on accounts with many folders. The overhead and risk involved in so many write operations just doesn't seem to be worth it when read performance is the same when using recursive CTEs.

https://lars.yencken.org/hierarchy-in-sql
https://en.wikipedia.org/wiki/Nested_set_model#Performance

I'm not concerned about the performance. This table will have a small number of rows in it – probably less than 10,000 in the worst cases (we know some users have 100 accounts, but have they got 100 folders in each?) and less than 100 for the vast majority of users. Even if there were 10,000 rows, SQLite can update that many in milliseconds.

As the quote in that Wiki article says, "It’s best used when you need to query a tree more frequently than you need to modify the tree." which is what we'll be doing.

That said, most of what I'm working on at the moment doesn't depend on which database model we use. We'd still want all of the functionality I'm building out around the database, and all of the test cases are still valid. We could change out the database model without a lot of work.

A folder in the database can either have an ordinal value or null. Folders with ordinals are sorted
first, in ascending order, then folders without ordinals in alphabetical order. The order of those
with null will be further refined in a later patch, as it needs to account for non-ASCII characters
and the ordering of special folder types.

I did end up moving away from the nested set model, but not for performance reasons. I realised that to have a folder's children be manually ordered (we store the order as specified by the user), automatically ordered (based on the folder type and name), or some combination of both. In the nested set model the children have to be stored in some order and that just made it too complicated. I'd rather store no order where necessary and deal with it as the database is read.

Attachment #9428984 - Attachment description: WIP: Bug 1921394 - Sort folders as they are read in from the database. → Bug 1921394 - Sort folders as they are read in from the database. r=BenC!,#thunderbird-reviewers
Attachment #9427667 - Attachment description: WIP: Bug 1921394 - Implement the core of the new database for folders. → Bug 1921394 - Implement the core of the new database for folders. r=BenC!,#thunderbird-reviewers
Attachment #9427668 - Attachment description: WIP: Bug 1921394 - Implement moving existing folders around. → Bug 1921394 - Implement moving existing folders around. r=BenC!,#thunderbird-reviewers
Attachment #9428200 - Attachment description: WIP: Bug 1921394 - Implement functions for flag handling. → Bug 1921394 - Implement functions for flag handling. r=BenC!,#thunderbird-reviewers
Attachment #9428201 - Attachment description: WIP: Bug 1921394 - Add functions and guards for working with multiple roots. → Bug 1921394 - Add functions and guards for working with multiple roots. r=BenC!,#thunderbird-reviewers
Attachment #9428985 - Attachment description: WIP: Bug 1921394 - Implement folder lookup by path. → Bug 1921394 - Implement folder lookup by path. r=BenC!,#thunderbird-reviewers
Blocks: 1846550
Keywords: leave-open
Target Milestone: --- → 134 Branch
Pushed by geoff@darktrojan.net: https://hg.mozilla.org/comm-central/rev/f89f51302a93 Implement the core of the new database for folders. r=BenC,aleca https://hg.mozilla.org/comm-central/rev/11e7ffe6e93e Sort folders as they are read in from the database. r=BenC,aleca https://hg.mozilla.org/comm-central/rev/1005945049c0 Implement moving existing folders around. r=BenC,aleca https://hg.mozilla.org/comm-central/rev/7e9b1129eb86 Implement functions for flag handling. r=BenC,aleca https://hg.mozilla.org/comm-central/rev/b517801bcbae Add functions and guards for working with multiple roots. r=BenC,aleca https://hg.mozilla.org/comm-central/rev/e1a1ba99d17c Implement folder lookup by path. r=BenC,aleca

PIDL will do most of the work for us, but not the string properties.

This does the database read off the main thread. Subsequent patches will add the initial reconciliation of on-disk folders to the task thread.

Pushed by john@thunderbird.net:
https://hg.mozilla.org/comm-central/rev/46abd41e1cc1
Make C++ shortcuts to infallible Folder properties. r=BenC
https://hg.mozilla.org/comm-central/rev/31f948947954
Load folders from the database off the main thread. r=BenC, mkmelin

All of this start-up code is probably going to get moved out of here eventually, but for now we need it.

Also featuring logging and comments.

Attachment #9436774 - Attachment description: WIP: Bug 1921394 - Initialise folder database entries from the filesystem at start-up. → Bug 1921394 - Initialise folder database entries from the filesystem at start-up. r=BenC!,#thunderbird-reviewers

These lines accidentally passed because null == undefined. I didn't realise I'd given the folders the wrong parent when rearranging the database.

Pushed by geoff@darktrojan.net:
https://hg.mozilla.org/comm-central/rev/eb3fa51e18d2
Clean up leaked objects in FolderDatabase. r=mkmelin
https://hg.mozilla.org/comm-central/rev/38219568f4e4
Fix a mistake in a test. r=mkmelin

We're going to want to use these functions for multiple tables soon, so I've dragged the common
stuff out into a separate class that can be shared. This class will be the only entry point to gain
access to database-handling classes (e.g. via DatabaseCore::GetFolders), which means we can be sure
there's only one of each of them.

  • Strings should be normalised to Unicode canonical composition form before being inserted into the database.
    This will prevent mistakes when comparing strings containing non-ASCII characters.
  • Folders (that have no explicit ordering) should be ordered in a case-sensitive and locale-aware way.
Attachment #9436772 - Attachment description: Bug 1921394 - Create panorama.sqlite if it doesn't exist when FolderDatabase starts. r=BenC!,#thunderbird-reviewers → Bug 1921394 - Create panorama.sqlite if it doesn't exist when FolderDatabase starts. r=BenC,aleca
Attachment #9436773 - Attachment description: Bug 1921394 - Add functions for inserting and deleting folders. r=BenC!,#thunderbird-reviewers → Bug 1921394 - Add functions for inserting and deleting folders. r=BenC
Attachment #9436774 - Attachment description: Bug 1921394 - Initialise folder database entries from the filesystem at start-up. r=BenC!,#thunderbird-reviewers → Bug 1921394 - Initialise folder database entries from the filesystem at start-up. r=BenC
Attachment #9441579 - Attachment description: Bug 1921394 - Move the common database functions into their own class. r=#thunderbird-back-end-reviewers → Bug 1921394 - Move the common database functions into their own class. r=mkmelin
Attachment #9441580 - Attachment description: Bug 1921394 - Handle non-ASCII characters when sorting folders and comparing paths. r=#thunderbird-back-end-reviewers → Bug 1921394 - Handle non-ASCII characters when sorting folders and comparing paths. r=mkmelin

Pushed by geoff@darktrojan.net:
https://hg.mozilla.org/comm-central/rev/e8ee36285fe3
Create panorama.sqlite if it doesn't exist when FolderDatabase starts. r=BenC,aleca
https://hg.mozilla.org/comm-central/rev/44b382f10ec7
Add functions for inserting and deleting folders. r=BenC
https://hg.mozilla.org/comm-central/rev/c66c2827746a
Initialise folder database entries from the filesystem at start-up. r=BenC
https://hg.mozilla.org/comm-central/rev/f7ee9d3b59a2
Move the common database functions into their own class. r=mkmelin
https://hg.mozilla.org/comm-central/rev/89a00dd512d7
Handle non-ASCII characters when sorting folders and comparing paths. r=mkmelin

Pushed by heather@thunderbird.net:
https://hg.mozilla.org/comm-central/rev/a8a69849071b
Clone aFile in FolderCollector::FindChildren to avoid Windows failure. r=leftmostcat

This is almost all temporary code so that development in other areas can continue.

This is the very first part of the LiveView class. It gets much more complicated from here.

This adds functions to SQLite for detecting a string in a space-separated list, and a LiveView filter
for selecting messages with or without a given tag. We'll only use the positive case for now but the
negative case will be useful later.

Attachment #9443336 - Attachment description: WIP: Bug 1921394 - Add a stub messages database. r=#thunderbird-back-end-reviewers → Bug 1921394 - Add a stub messages database. r=#thunderbird-back-end-reviewers
Attachment #9443337 - Attachment description: WIP: Bug 1921394 - Create a very basic live view. r=#thunderbird-back-end-reviewers → Bug 1921394 - Create a very basic live view. r=#thunderbird-back-end-reviewers
Attachment #9443760 - Attachment description: WIP: Bug 1921394 - Create a filter for messages with or without a tag. r=#thunderbird-back-end-reviewers → Bug 1921394 - Create a filter for messages with or without a tag. r=#thunderbird-back-end-reviewers

Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/1f012cc43b61
Add a stub messages database. r=mkmelin
https://hg.mozilla.org/comm-central/rev/999117f7a24a
Create a very basic live view. r=mkmelin
https://hg.mozilla.org/comm-central/rev/bf502ad204d4
Create a filter for messages with or without a tag. r=mkmelin

This is the very start of this class which power the message list. With this patch we get messages
from the back end and display them in descending date order. Subsequent patches will add sorting
by other fields and in ascending order as required.

This patch adds notification abilities to the live view and handles messages added to or removed
from the database.

This patch has the beginnings of the sort mechanism for live views.

This is the first textual property we'll be able to sort by.

This adds a SQL function for formatting addresses based on the existing preferences for doing so.
I've implemented it in this way so that SQLite can handle most of the sorting (except when new
messages are added) and we don't have to think about it, because it's just the same as any other
property. It's currently quite slow across a large number of messages but that's mostly due to the
header parsing being implemented in JS. We should fix that.

Comment on attachment 9459061 [details]
Bug 1921394 - Create an adapter for getting messages from a live view to the message list. r=#thunderbird-reviewers

Revision D233996 was moved to bug 1941471. Setting attachment 9459061 [details] to obsolete.

Attachment #9459061 - Attachment is obsolete: true

Comment on attachment 9459062 [details]
Bug 1921394 - Make the live view data adapter live. r=#thunderbird-reviewers

Revision D233997 was moved to bug 1941471. Setting attachment 9459062 [details] to obsolete.

Attachment #9459062 - Attachment is obsolete: true

Comment on attachment 9459063 [details]
Bug 1921394 - Enable sorting of the message list by ascending date order. r=#thunderbird-reviewers

Revision D233998 was moved to bug 1941471. Setting attachment 9459063 [details] to obsolete.

Attachment #9459063 - Attachment is obsolete: true

Comment on attachment 9459064 [details]
Bug 1921394 - Enable sorting of the message list by subject. r=#thunderbird-reviewers

Revision D233999 was moved to bug 1941471. Setting attachment 9459064 [details] to obsolete.

Attachment #9459064 - Attachment is obsolete: true

Comment on attachment 9459065 [details]
Bug 1921394 - Implement sender column formatting and sorting. r=#thunderbird-back-end-reviewers

Revision D234000 was moved to bug 1941471. Setting attachment 9459065 [details] to obsolete.

Attachment #9459065 - Attachment is obsolete: true

We're going to want these. I'd rather containsChildNamed was hasChildNamed, but it is what it is.

This is going to make it easier to find an nsIFolder's corresponding nsIMsgFolder, because we'll be
able to use GetChildNamed and the nsIFolder's name. It also means we have proper folders names in
cases where the file name is a hash (e.g. when forbidden characters are involved).

It doesn't help with localised special folder names, but that's a story for another time.

Pushed by geoff@darktrojan.net:
https://hg.mozilla.org/comm-central/rev/ec6098535e6f
Add containsChildNamed and getChildNamed to nsIFolder. r=mkmelin
https://hg.mozilla.org/comm-central/rev/3f755d1c1038
Change FolderCollector to use folder names instead of filenames. r=mkmelin

I've been using Date in JS and PRTime in C++ without any unit conversion because it just works and
I wasn't looking too closely. But Date is milliseconds and PRTime is meant to be microseconds.

This adds the unit conversion and cleans up some date handling in unit tests for better human readability.

Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/5294de3db27a
Fix discrepancies in date/time units. r=aleca

Pushed by john@thunderbird.net:
https://hg.mozilla.org/comm-central/rev/80287fb8def3
Add documentation for the Panorama code that exists so far. r=aleca

I guess we can call this one closed, it's unlikely to get any more patches.

Status: ASSIGNED → RESOLVED
Closed: 9 months ago
Keywords: leave-open
Resolution: --- → FIXED

Unlikely doesn't mean impossible.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

I deliberately left this bit until later as it wasn't relevant at the time. I guess now is later.

I've overhauled the comparator logic while here to make it a bit more readable. I fixed a bug
where we didn't update the ordinal member of a Folder object when moving the folder. And I added
a method for resetting the order of a folder's children to the natural order, in case the user
wants to do that after reordering in the UI.

Maybe this time it'll be the last time.

Bitrotted, please rebase.

Pushed by geoff@darktrojan.net:
https://hg.mozilla.org/comm-central/rev/e267083147d5
Include special folders in the sort order calculation. r=mkmelin

Status: REOPENED → RESOLVED
Closed: 9 months ago6 months ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: