
Convert panacea.dat from mork to another database format such as mozStorage(SQLite DB), IndexedDB(constructed on mozStorage in Mozilla), localStorage(DOMStorage, constructed on mozStorage in Mozilla)

NEW
Unassigned

Status

MailNews Core
Database
--
enhancement
9 years ago
3 months ago

People

(Reporter: jcranmer, Unassigned)

Tracking

(Blocks: 4 bugs)

Dependency tree / graph
Bug Flags:
wanted-thunderbird +

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 7 obsolete attachments)

(Reporter)

Description

9 years ago
Panacea.dat currently uses mork; this bug is to change it to use any other format instead. The file is controlled by ns{I}MsgFolder{CacheElement}; given its simplicity, this shouldn't take more than a few days to pound out.

Proposed SQLite schema:

CREATE TABLE entries (
  folderKey CHAR,
  propertyName CHAR,
  value CHAR
);

Proposed text file format:
[<folderKey>]
<propertyName1>=<value1>
<propertyName2>=<value2>
<propertyName3>=<value3>

I'm leaning towards SQLite myself. Least hassle involved.

Migrating these files isn't important since msgFolderCache already rebuilds the file if it is missing.
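
(For illustration only: a minimal sketch of the lookup pattern the proposed entries schema implies, written against the mozStorage JS API rather than the C++ that would actually back nsIMsgFolderCache. The file name and folder key are placeholders.)

// Sketch, not the patch: mozStorage from JS against the proposed schema.
Components.utils.import("resource://gre/modules/Services.jsm");
Components.utils.import("resource://gre/modules/FileUtils.jsm");
var db = Services.storage.openDatabase(
  FileUtils.getFile("ProfD", ["msgFolderCache.sqlite"]));  // placeholder file name
db.executeSimpleSQL(
  "CREATE TABLE IF NOT EXISTS entries (folderKey CHAR, propertyName CHAR, value CHAR)");

// Read one property of one folder -- the same shape of query a cache element would issue.
var stmt = db.createStatement(
  "SELECT value FROM entries WHERE folderKey = ?1 AND propertyName = ?2");
stmt.bindUTF8StringParameter(0, "mailbox://nobody@Local%20Folders/Inbox");  // placeholder key
stmt.bindUTF8StringParameter(1, "MRUTime");
var value = stmt.executeStep() ? stmt.getUTF8String(0) : null;
stmt.finalize();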
(Reporter)

Comment 1

9 years ago
Created attachment 305016 [details] [diff] [review]
Nearly complete patch

First patch, almost complete. Still to be done:
1. Logging the cache DB. (Necessary?)
2. Leak testing.
3. Documenting nsIMsgFolderCache{Element}
4. Change nsMsgFolderCacheElement ?

No migration handled, so password-protected IMAP servers may look a little off on first loading. Then again, it has the same effect as deleting panacea.dat, so it doesn't really matter too much.
(Reporter)

Comment 2

9 years ago
Created attachment 306726 [details] [diff] [review]
Patch #1

This is the patch ready for review. I've decided to not do logging, since mozStorage already logs SQL statements. That said, it does do printf DEBUG logging of cache initialization and cache removal, which won't spam up the output. What would spam up the output is an assertion in nsMsgFolderCache that triggers if a property is set with the same value it already has.
Attachment #305016 - Attachment is obsolete: true
Attachment #306726 - Flags: review?(bienvenu)

Comment 3

9 years ago
Just a side note:
> + * The default cache is a SQLite file, panacea.dat, located in the profile

I don't think it's a good idea to use the same file name here. People might be using their profile with various app versions, which might get confused (or rather think that the panacea.dat they find is just invalid).
Please adhere to the usual naming scheme and use panacea.sqlite or even better use a *meaningful* name like msgfoldercache.sqlite etc.

(Reporter)

Comment 4

9 years ago
Created attachment 306787 [details] [diff] [review]
Patch #1.5

Er, forget the previous patch. Relying on a broken build for testing doesn't work. Turns out that nsCOMPtr doesn't like mixing with raw pointers in the realm of nsInterfaceHashtable. Oh well, I got some practice with subverting code separation principles and direct memory access.

This also fixes some bugs with handling completely unexpected files, and changes the filename from panacea.dat. Eww, that's ugly.
Attachment #306726 - Attachment is obsolete: true
Attachment #306787 - Flags: review?(bienvenu)
Attachment #306726 - Flags: review?(bienvenu)

Comment 5

9 years ago
Comment on attachment 306787 [details] [diff] [review]
Patch #1.5

Very cool!

I hate dropping errors:

+  if (NS_FAILED(rv) || !connectionReady)
+    return NS_ERROR_FAILURE;

can you see your way to returning the actual rv? Or does this never fail?

For statements that we execute frequently, like SetStringProperty, does it make sense to cache the SQL statement instead of recreating it every time?



since you're storing the values as decimal, can't you use atoi or some mozilla equivalent?
     char C = resultStr.CharAt(index);
-    PRInt8 unhex = ((C >= '0' && C <= '9') ? C - '0' :
-    ((C >= 'A' && C <= 'F') ? C - 'A' + 10 :
-     ((C >= 'a' && C <= 'f') ? C - 'a' + 10 : -1)));
+    PRInt8 unhex = (C >= '0' && C <= '9') ? C - '0' : -1;
     if (unhex < 0)
       break;
-    result = (result << 4) | unhex;
+    result = result * 10 + unhex;
   }


Why not ignore attempts to set the value to what it already is, and return early?
+  // Find the current property value
+  nsCString currentValue;
+  nsresult rv = GetStringProperty(propertyName, currentValue);
+  // Update it if it exists...
+  if (NS_SUCCEEDED(rv))
   {
-    rv = m_owningCache->GetStore()->StringToToken(m_owningCache->GetEnv(), propertyName, &property_token);
-    if (NS_SUCCEEDED(rv))
-    {
-      struct mdbYarn yarn;
+    // Commented out to prevent large spamming of output.
+    //NS_ASSERTION(!currentValue.equals(propertyValue), "Should only set non-equal values");
Attachment #306787 - Flags: review?(bienvenu) → review-
(Reporter)

Updated

9 years ago
Flags: wanted-thunderbird3.0a1?
(Reporter)

Comment 6

9 years ago
Created attachment 310046 [details] [diff] [review]
Patch #2

(#3 in today's blizzard of patches)

Fixes bienvenu's comments, and is updated to take into account bug 420459.
Attachment #306787 - Attachment is obsolete: true
Attachment #310046 - Flags: review?(bienvenu)

Comment 7

9 years ago
Comment on attachment 310046 [details] [diff] [review]
Patch #2

surely that should be if (currentValue.Equals, not !currentValue.Equals...

+    //NS_ASSERTION(!currentValue.equals(propertyValue), "Should only set non-equal values");
+    if (!currentValue.equals(propertyValue))
+      return NS_OK;
+    rv = m_dbConnection->CreateStatement(NS_LITERAL_CSTRING(
Attachment #310046 - Flags: review?(bienvenu) → review-
(Reporter)

Comment 8

9 years ago
Created attachment 311158 [details] [diff] [review]
Patch #2.5

I blame Java. And global warming.

IMHO, camelCase is a better method-naming scheme than CamelCase, but I don't write the style specs...
Attachment #310046 - Attachment is obsolete: true
Attachment #311158 - Flags: review?(bienvenu)

Comment 9

9 years ago
Oops, I didn't even notice the .equals vs .Equals - I was more focussed on the !

In other words, you only want to return early if the old value Equals the new value, not !Equals. The way you have it now we would never change a value, unless I'm very confused...
(Reporter)

Comment 10

9 years ago
Created attachment 311185 [details] [diff] [review]
Patch #2.8

Hmm, didn't catch the ! thing... Guess the assertion tripped me up. In retrospect, that explains some of the unread counts I assumed were just a factor of news not having accurate unread counts.
Attachment #311158 - Attachment is obsolete: true
Attachment #311185 - Flags: review?(bienvenu)
Attachment #311158 - Flags: review?(bienvenu)

Comment 11

9 years ago
Comment on attachment 311185 [details] [diff] [review]
Patch #2.8

ok, thx for the quick turn-around!
Attachment #311185 - Flags: review?(bienvenu) → review+
(Reporter)

Updated

9 years ago
Attachment #311185 - Flags: superreview?(neil)
Comment on attachment 311185 [details] [diff] [review]
Patch #2.8

> 		  mork \
>+			storage \
> 		  txmgr \
I don't suppose you would mind copying the weird indentation?

>+  /**
>+   * Closes the cache.
>+   *
>+   * This will cause a compressed commit to occur.
>+   */
>   void close();
But it never actually closed the cache, did it ;-)

>+  nsCOMPtr<mozIStorageService> storageService = nsnull;
>+  storageService = do_GetService(MOZ_STORAGE_SERVICE_CONTRACTID, &rv);
Write this as one line. (The = nsnull is particularly silly.)

>+    NS_ASSERTION(schemaVersion == 0, "Unknown schema version...");
You should fail in this case, so the cache could be regenerated.

>+    nsMsgFolderCacheElement* element = new nsMsgFolderCacheElement(
>+        this->m_dbConnection);
>+    element->SetKey(value);
I wonder why this is a public method, rather than part of the constructor.

>+  if (*result)
>+    return NS_OK;
>+  else if (createIfMissing)
No else required. Leave a blank line though.

>+    nsCString hashKey(pathKey);
>+    m_cacheElements.Put(hashKey, folderCacheEl);
nsDependentCString (and you could inline it too).

>+  rv = statement->BindUTF8StringParameter(1, nsCString(propertyName));
nsDependentCString again. (Did I miss any others?)

>+  PRBool hasKey;
>+  if (NS_SUCCEEDED(statement->ExecuteStep(&hasKey)) && hasKey)
>+  {
>+    statement->GetUTF8String(0, result);
>+  }
>+  else
>+  {
>+    result.Truncate();
>+    return NS_ERROR_FAILURE;
>+  }
>+  return NS_OK;
Eww. Perhaps:

rv = statement->ExecuteStep(&hasKey);
NS_ENSURE_SUCCESS(rv, rv);
if (hasKey)
  return statement->GetUTF8String(0, result);

result.Truncate();
return NS_ERROR_FAILURE;

>+  *aResult = atoi(resultStr.get());
Use resultStr.ToInteger, although I don't see why you can't use statement->GetInt32 instead.

> 		  mork \
>+			storage \
> 		  necko \
(Ditto.)
Attachment #311185 - Flags: superreview?(neil) → superreview-
(Reporter)

Comment 13

9 years ago
Created attachment 311890 [details] [diff] [review]
Patch #3

Fixes Neil's problems. Also fixes a test failure (forgot about the addition of the directory service provider test).

In addition, it actually closes the cache when Close() is called. For that change, I am re-requesting r from bienvenu.
Attachment #311185 - Attachment is obsolete: true
Attachment #311890 - Flags: superreview?(neil)
Attachment #311890 - Flags: review?(bienvenu)

Comment 14

9 years ago
I think GetStringProperty is going to crash with a null m_dbConnection if Close is called first, because Close nulls out m_dbConnection.
Comment on attachment 311890 [details] [diff] [review]
Patch #3

>+    NS_ASSERTION(schemaVersion == 0, "Unknown schema version...");
>+    if (schemaVersion != 0)
>+      return NS_ERROR_FAILURE;
Nit: Consider NS_ENSURE_TRUE(schemaVersion == 0, NS_ERROR_FAILURE); which warns (rather than asserting) and returns if false.

As far as I can tell, ::Close really closing is safe because it will just make subsequent calls to nsMsgFolderCache fail as not initialised while nsMsgFolderCacheElement will fail because the underlying connection (to which it still owns a reference) has been closed.
Attachment #311890 - Flags: superreview?(neil) → superreview+

Comment 16

9 years ago
Comment on attachment 311890 [details] [diff] [review]
Patch #3

ok, thx, Joshua.
Attachment #311890 - Flags: review?(bienvenu) → review+
(Reporter)

Comment 17

9 years ago
Created attachment 312653 [details] [diff] [review]
patch to check in

Patch to check in, fixed Neil's nit. r/sr carried over from previous patch
Attachment #311890 - Attachment is obsolete: true
Attachment #312653 - Flags: superreview+
Attachment #312653 - Flags: review+
(Reporter)

Updated

9 years ago
Keywords: checkin-needed
Checking in mailnews/base/build/Makefile.in;
/cvsroot/mozilla/mailnews/base/build/Makefile.in,v  <--  Makefile.in
new revision: 1.84; previous revision: 1.83
done
Checking in mailnews/base/public/nsIMsgFolderCache.idl;
/cvsroot/mozilla/mailnews/base/public/nsIMsgFolderCache.idl,v  <--  nsIMsgFolderCache.idl
new revision: 1.11; previous revision: 1.10
done
Checking in mailnews/base/public/nsIMsgFolderCacheElement.idl;
/cvsroot/mozilla/mailnews/base/public/nsIMsgFolderCacheElement.idl,v  <--  nsIMsgFolderCacheElement.idl
new revision: 1.7; previous revision: 1.6
done
Checking in mailnews/base/src/Makefile.in;
/cvsroot/mozilla/mailnews/base/src/Makefile.in,v  <--  Makefile.in
new revision: 1.136; previous revision: 1.135
done
Checking in mailnews/base/src/nsMailDirProvider.cpp;
/cvsroot/mozilla/mailnews/base/src/nsMailDirProvider.cpp,v  <--  nsMailDirProvider.cpp
new revision: 1.7; previous revision: 1.6
done
Checking in mailnews/base/src/nsMsgFolderCache.cpp;
/cvsroot/mozilla/mailnews/base/src/nsMsgFolderCache.cpp,v  <--  nsMsgFolderCache.cpp
new revision: 1.59; previous revision: 1.58
done
Checking in mailnews/base/src/nsMsgFolderCache.h;
/cvsroot/mozilla/mailnews/base/src/nsMsgFolderCache.h,v  <--  nsMsgFolderCache.h
new revision: 1.24; previous revision: 1.23
done
Checking in mailnews/base/src/nsMsgFolderCacheElement.cpp;
/cvsroot/mozilla/mailnews/base/src/nsMsgFolderCacheElement.cpp,v  <--  nsMsgFolderCacheElement.cpp
new revision: 1.25; previous revision: 1.24
done
Checking in mailnews/base/src/nsMsgFolderCacheElement.h;
/cvsroot/mozilla/mailnews/base/src/nsMsgFolderCacheElement.h,v  <--  nsMsgFolderCacheElement.h
new revision: 1.11; previous revision: 1.10
done
Checking in mailnews/base/test/unit/test_nsMailDirProvider.js;
/cvsroot/mozilla/mailnews/base/test/unit/test_nsMailDirProvider.js,v  <--  test_nsMailDirProvider.js
new revision: 1.2; previous revision: 1.1
done
Checking in mailnews/build/Makefile.in;
/cvsroot/mozilla/mailnews/build/Makefile.in,v  <--  Makefile.in
new revision: 1.28; previous revision: 1.27
done
Status: ASSIGNED → RESOLVED
Last Resolved: 9 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9

Comment 19

9 years ago
First startup after this patch was "surprising" (could have used a little warning of this in the patch). I assume that MsgFolderCache had to be built for the first time:
+  // If we can't open a database, let's destroy it and rebuild it...
This happened only after expanding each account (Mail, Blogs, etc.) and took a bit of time (maybe 30 seconds for each account).
Everything seems to work OK after the initial build.
(resulting in a 183KB sqlite file from an original 44KB panacea.dat file)

I have no way to accurately measure this, but the overall startup time seems to have increased by 3 or 4 seconds after this patch.
I'm wondering how folks with many more accounts will be affected.
Here's what I have:
Pop3 mailbox with about 500 msgs
RSS feeds about 25
Gmail pop3 access
About 50 newsgroups in 3 separate accounts

Comment 20

9 years ago
More possibly related problems:
Long delays in getting new mail (SMTP) between the time "downloading Mail" appears in the status bar and the mail actually appears in the inbox.
The msgFolderCache file may be growing (up to 198KB now)

Updated

9 years ago
Depends on: 426658

Comment 21

9 years ago
I believe that this patch breaks the folder cache - the whole point of the folder cache is to avoid opening .msf files unless we need the headers. After testing an unrelated patch, I believe that operations like startup, building folder menus, etc., cause us to open the .msf files and leave them open, whereas before this change, we didn't.

I believe this will cause a whole host of issues, and remind folks of why it was called panacea.dat in the first place :-). Basically, Joshua, we should only be opening db's when we're getting the message headers, or adding messages to a folder. Displaying the folder pane shouldn't cause us to open every db.

Comment 22

9 years ago
I have a couple patches I'd like to land, but I'm concerned about their interaction with a fix for this regression.  In particular Bug 417354 will impact both startup and perceived-startup, which this patch regressed.  Also, the performance problems in this patch will be expressed further when Bug 413781 lands.

Do people think a fix for this is coming soon?  If not, I'd argue that the patch should be backed out until the regression can be solved.
That sounds right to me too.

Comment 24

9 years ago
When and if this is backed out, panacea.dat will be used again.
Any warnings for nightly testers who have used the sqlite version for several days (other than "you should have backed up your profile")?
(Reporter)

Comment 25

9 years ago
Created attachment 313465 [details]
lsof output of two files

This is the lsof output I get for thunderbird with and without the patch applied. The base build is trunk from approximately last Friday night with my WIP for 11054 applied. Both panacea.dat and msgFolderCache.sqlite preexist; the listing was taken as soon as the initial window popped up. I attempted to ensure that all newsserver twisties were closed, although the set of twisties that were open and closed may have differed between the two.

Servers:
2 IMAP (franklin.tjhsst.edu and imap.google.com, most folders subscribed)
1 Local Folders
3 NNTP (news.mozilla.org w/ m.a.d.thunderbird, m.test, m.s.firefox; three others with one each).

Comment 26

9 years ago
I don't know if MsgFolderCache.sqlite existed before or not (probably not)
If it would help, I could email you my 44kb panacea.dat file and my 203kb sqlite file. (I'd rather not make that public)

Comment 27

9 years ago
so I have to take back some of what I said - the folder cache works sometimes, but  not others (or at least it seems not to work in other cases). Joey's patch for xbl-ifying the move/copy menus definitely seems to bypass the folder cache, when trying to get the MRU time.  But I'm not sure why.
(Reporter)

Comment 28

9 years ago
Some clarifications as to working theories:

* People should be seeing an increased load time the first time a profile is loaded after this patch is applied, roughly equivalent to the effect of deleting panacea.dat
* Loading times afterwards should only be impacted to the extent that SQLite is faster or slower than mork (does anyone have definite comparisons?)
* Ditto for memory usage
* WRT file size, it appears that SQLite adds some overhead to file size (mork adds a surprisingly small amount). Considering that we are using it for a rather simple relationship--essentially a Map<Tuple<A,B>, Value> relationship--some of the latent database portions are probably inflating the size significantly. I will look into doing some simple optimizations on file size.
* msfs should only be loaded on the file recreation; if not, it may be due to the presence of a second patch in the tree (I have been blinded by such a combination before).

Google turned the following match up when looking into sqlite overhead:
http://www.mail-archive.com/sqlite-users@sqlite.org/msg19231.html
Apparently we're not the only ones to get hit by large increases in DB file size.

Comment 29

9 years ago
Joshua and I had a look at part of this tonight.  We were examining the folder-bindings patch, which shows a noticeable slowdown with this patch.  This was pinned down to a getStringProperty call on an nsIMsgFolder.  Most importantly, repeat calls to this function showed performance approximating the performance without the patch.  Only the initial calls to this were slow, which suggests some sqlite caching is going on.

Most of the other reports about performance impacts that I've read seem to follow a similar pattern, that the first call to a folder-cache function is slow, but repeat calls show normal performance.

One possible path to investigate is whether storing common folder properties in a single, multi-column row, rather than in a generic properties table, would allow this caching to take place sooner, and reduce the loss here.
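
(As an aside, a rough sketch of the single, multi-column row idea from the previous paragraph, with column names taken from the dense properties listed in the next comment; the table name is made up and this is not what the patch does.)

// Sketch only: a "dense" per-folder row for the properties every folder has,
// keeping the generic entries table for the sparse/optional ones.
Components.utils.import("resource://gre/modules/Services.jsm");
Components.utils.import("resource://gre/modules/FileUtils.jsm");
var db = Services.storage.openDatabase(
  FileUtils.getFile("ProfD", ["msgFolderCache.sqlite"]));
db.executeSimpleSQL(
  "CREATE TABLE IF NOT EXISTS folderProperties (" +
  "  folderKey CHAR PRIMARY KEY, flags INTEGER, totalMsgs INTEGER," +
  "  totalUnreadMsgs INTEGER, pendingUnreadMsgs INTEGER, pendingMsgs INTEGER," +
  "  expungedBytes INTEGER, folderSize INTEGER, charset CHAR)");

// One SELECT now pulls all of a folder's common properties at once,
// so the relevant pages get into SQLite's cache on the first lookup.
var stmt = db.createStatement(
  "SELECT flags, totalMsgs, totalUnreadMsgs FROM folderProperties WHERE folderKey = ?1");
stmt.bindUTF8StringParameter(0, "someFolderKey");  // placeholder
if (stmt.executeStep())
  var unread = stmt.getInt32(2);
stmt.finalize();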
(Reporter)

Comment 30

9 years ago
Expanding on what Joey said, this is a copy of the properties from a panacea.dat file:
(8A=charset)(8B=folderName)(8C=boxFlags)(8D=hierDelim)(8E=onlineName)
(8F=aclFlags)(90=MRUTime)(91=LastPurgeTime)
(92=curJunkFolderLastPurgeTime)(80=ns:msg:db:row:scope:folders:all)
(81=ns:msg:db:table:kind:folders)(82=key)(83=flags)(84=totalMsgs)
(85=totalUnreadMsgs)(86=pendingUnreadMsgs)(87=pendingMsgs)
(88=expungedBytes)(89=folderSize)

Simple investigation of the file leads me to believe that 82-8A are stored on every folder, while 8B-92 are more optional. If we do go the two-tabled route, the proposed schema would need to take this into account.

Alternatively, we may want to look into better fine-tuning of SQLite as well.
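
(A hedged sketch of the sort of SQLite fine-tuning that could be measured here; these are standard SQLite pragmas, not anything the patch currently sets, and the values are guesses to be tuned.)

// Sketch only: standard pragmas worth benchmarking for a small database.
Components.utils.import("resource://gre/modules/Services.jsm");
Components.utils.import("resource://gre/modules/FileUtils.jsm");
var db = Services.storage.openDatabase(
  FileUtils.getFile("ProfD", ["msgFolderCache.sqlite"]));
db.executeSimpleSQL("PRAGMA page_size = 1024");      // smaller pages for a small DB
db.executeSimpleSQL("PRAGMA auto_vacuum = FULL");    // return free pages as rows go away
db.executeSimpleSQL("PRAGMA synchronous = NORMAL");  // fewer fsyncs than the default FULL
db.executeSimpleSQL("VACUUM");                       // rebuild the file so page_size/auto_vacuum take effect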
Depends on: 427614
It looks to me as though the entries table is not indexed in any way.  That seems like a good candidate to speed things up...
The current schema implements an EAV model, which is usually slower than an EV model (as described in comment 29) but more flexible and often fast enough.  You could implement the hybrid suggested in comment 29 and comment 30, with dense properties stored in the EV table and sparse ones in the EAV table, but since that approach is more complicated, it's worth trying to optimize the current model first.

The patch contains two SELECT queries:

  SELECT DISTINCT folderKey FROM entries
  SELECT value FROM entries WHERE folderKey=?1 AND propertyName=?2

Neither of these are using indices.  An index across both folderKey and propertyName would speed up the latter query.  It might also speed up the former, if SQLite, like MySQL, is capable of using indices across a leftmost-superset of the columns being searched.  Otherwise, a second index on just folderKey would speed up the former.

As for disk space usage, you would save space by storing property names in a lookup table and referring to them by integer ID in the entries table.  And depending on what folder keys look like, you might save space by doing the same for them (if you adopt the hybrid approach, this would just be the ID of the folder in the EV table).

The content pref service in toolkit implements a similar EAV model and uses these techniques for performance.  Its schema is defined here:

http://lxr.mozilla.org/mozilla/source/toolkit/components/contentprefs/src/nsContentPrefService.js#631

Disclaimer: these are general recommendations that seem useful based on a cursory examination of the code, but I haven't examined the code closely enough to be sure that they would help in this particular case.
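
(To make the suggestion concrete, a minimal sketch of the property-name lookup table plus the combined index; the table and column names here are illustrative, not from the patch.)

// Sketch only: property names live in a small lookup table and entries refer
// to them by id; a combined index covers the WHERE clause of the per-property SELECT.
Components.utils.import("resource://gre/modules/Services.jsm");
Components.utils.import("resource://gre/modules/FileUtils.jsm");
var db = Services.storage.openDatabase(
  FileUtils.getFile("ProfD", ["msgFolderCache.sqlite"]));
db.executeSimpleSQL(
  "CREATE TABLE IF NOT EXISTS propertyNames (id INTEGER PRIMARY KEY, name CHAR UNIQUE)");
db.executeSimpleSQL(
  "CREATE TABLE IF NOT EXISTS entriesNormalized (folderKey CHAR, propertyId INTEGER, value CHAR)");
db.executeSimpleSQL(
  "CREATE UNIQUE INDEX IF NOT EXISTS entriesNormalizedIndex " +
  "ON entriesNormalized (folderKey, propertyId)");

// The per-property lookup then joins through the name table.
var stmt = db.createStatement(
  "SELECT e.value FROM entriesNormalized e JOIN propertyNames p ON e.propertyId = p.id " +
  "WHERE e.folderKey = ?1 AND p.name = ?2");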
Erm, s/EV/AV/g (attribute-value, the typical database schema model, as opposed to the entity-attribute-value model, which the patch currently uses).
One more thing: if you make the value column type "BLOB", then you can store and retrieve both integers and strings in the column (due to SQLite's manifest typing <http://www.sqlite.org/datatype3.html>) without SQLite performing any automatic type conversion that you have to undo, which may be more robust.
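
(A small sketch of what manifest typing buys here, assuming a value column declared BLOB; the table and key names are made up.)

// Sketch only: with BLOB (no affinity) on the value column, SQLite keeps the
// type you bind, so integers round-trip without any string conversion.
Components.utils.import("resource://gre/modules/Services.jsm");
Components.utils.import("resource://gre/modules/FileUtils.jsm");
var db = Services.storage.openDatabase(
  FileUtils.getFile("ProfD", ["msgFolderCache.sqlite"]));
db.executeSimpleSQL(
  "CREATE TABLE IF NOT EXISTS blobEntries (folderKey CHAR, propertyName CHAR, value BLOB)");

var ins = db.createStatement("INSERT INTO blobEntries VALUES (?1, ?2, ?3)");
ins.bindUTF8StringParameter(0, "someFolder");   // placeholder key
ins.bindUTF8StringParameter(1, "totalMsgs");
ins.bindInt32Parameter(2, 42);                  // stored as an integer, not the string "42"
ins.execute();
ins.finalize();

var sel = db.createStatement(
  "SELECT value FROM blobEntries WHERE folderKey = ?1 AND propertyName = ?2");
sel.bindUTF8StringParameter(0, "someFolder");
sel.bindUTF8StringParameter(1, "totalMsgs");
if (sel.executeStep() && sel.getTypeOfIndex(0) ==
    Components.interfaces.mozIStorageValueArray.VALUE_TYPE_INTEGER)
  var count = sel.getInt32(0);                  // comes back as an integer
sel.finalize();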
I have done some preliminary profiling to look into the before/after situation, but due to issues with dtrace entry-point probes against the nightly binaries not working, all I can say thus far is this:

When flipping between two IMAP INBOX folders probing the top-level sqlite3 calls, I see 14 separate setup/execute/teardown request sessions happen each time I change to the other INBOX.  I have some timing numbers, but they are meaningless without a basis for comparison.  (I will say min=83 uS, max=961 uS, avg=240 uS to show that they're not horrible, even on a now-595KiB msgFolderCache.sqlite, though most of that is from the gmail IMAP 680-byte folderKey values).

Although a change to an EV-based schema will reduce the query effort, and normalization and indices even more so, it is probably worth considering using some form of memory cache, even if only for the most recently looked-up folder, so we don't end up issuing the same (faster) call 14 times in a row.

I hope to have some better performance tooling going on soon so that I can actually put real comparison numbers up with an idea of where any slow-downs/improvements may be.
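
(For the memory-cache idea two paragraphs up, a minimal sketch of remembering only the most recently looked-up folder, written in JS for illustration; the real cache element is C++ and the query is against the original single entries table.)

// Sketch only: one query per folder instead of one per property.
Components.utils.import("resource://gre/modules/Services.jsm");
Components.utils.import("resource://gre/modules/FileUtils.jsm");
var db = Services.storage.openDatabase(
  FileUtils.getFile("ProfD", ["msgFolderCache.sqlite"]));

var lastFolderKey = null;
var lastFolderProps = null;   // property name -> value for lastFolderKey

function getStringPropertyCached(folderKey, propertyName) {
  if (folderKey != lastFolderKey) {
    var stmt = db.createStatement(
      "SELECT propertyName, value FROM entries WHERE folderKey = ?1");
    stmt.bindUTF8StringParameter(0, folderKey);
    lastFolderProps = {};
    while (stmt.executeStep())
      lastFolderProps[stmt.getUTF8String(0)] = stmt.getUTF8String(1);
    stmt.finalize();
    lastFolderKey = folderKey;
  }
  return lastFolderProps[propertyName];
}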
As per the discussion both here and in bug 427614, there are a number of possible performance / memory regressions that may be related to the changes that were landed here, and a bunch of potential ways to address them.  Past experience has shown that trying to clean up this sort of thing after landing often results in that work being very drawn out over time.  Furthermore, since there are at least some perceived regressions here, we have to either release 3.0a1 with these regressions (which won't make users happy) or gate 3.0a1 on getting them fixed (which seems likely to hurt people's productivity as well as potentially slow down the release).

We discussed at today's weekly Thunderbird meeting and came to the conclusion that it's probably better to back this out and then re-land when we think it's ready.  Because we don't currently have any numbers related to Mork to compare trunk changes with, it's impossible for us to have much of a feeling for which of the various things we care about are actual regressions from what's already shipped.

So if we do this, we'll want to figure out what the set of criteria are to re-land.  We'll definitely want some memory and performance instrumentation on Tinderbox, and I think we'll also want to have fairly specific ideas of how we think this patch is likely to affect the stuff that Joshua talks about in comment 28.

Assuming we go ahead and back out, we'll then have an unused msgFolderCache.sqlite and .sqlite-journal.  Presumably we can either detect this for people who have run nightlies and upgrade, or just suggest people remove it.  There will also be an outdated panacea.dat in the profile directory, which I don't entirely understand the consequences of.  Is this something that will be automagically detected and regenerated?

I have more ideas about how to get from backed-out to relanded (or something similar), but I think this piece is the important thing to deal with first.

Thoughts?
(Reporter)

Comment 37

9 years ago
WRT a backout and relanding:

1. TB will reuse your old panacea.dat file. Some statistics may be off, depending on how long ago you last used said file. If you're too concerned about that (practical stuff shouldn't matter), you can just rm it.
2. At this point, it seems likely that the relanded patch will include some sort of schema change. Upon relanding, the old msgFolderCache.sqlite file should be blown away as a result.
3. In response to some points brought up by myk, bienvenu, and other resources related to SQLite, I will probably be testing several different methods to improve performance.
4. Don't hold your breath waiting for relanding by 3.0a1.
5. The reland will still store the database in msgFolderCache.sqlite, although with an incremented schema version.

Comment 38

9 years ago
I've bounced back and forth between an sqlite build (current) and older builds and had no problem with panacea.dat working fine. However, I'm not so sure about the other way around. (How robust is the error recovery in sqlite?)
I used an older build using the same profile, then returned to an sqlite build with the result of crashing on startup.
http://crash-stats.mozilla.com/report/index/ee3f8b91-05d4-11dd-bcd5-001cc4e2bf68
 
deleting the msgFolderCache.sqlite file from my profile did not help.

I ended up using software update on the older build (full download) which seemed to fix my profile with a new sqlite file.

This seems to indicate that the sqlite file is not re-building properly.


(In reply to comment #38)
> I used an older build using the same profile, then returned to an sqlite build
> with the result of crashing on startup.
> http://crash-stats.mozilla.com/report/index/ee3f8b91-05d4-11dd-bcd5-001cc4e2bf68

This looks more like bug 425936 which already handles such stack frames. Would be nice if you could comment.
Note: if this is getting backed out and relanded, then it's worth reconsidering the schema.  If the attributes are sparse, and potentially many attributes will get added on the fly, then it makes sense to stick to the existing EAV model (but with the modifications previously discussed).  Otherwise, an AV model is better.

More info about EAV, including advantages and drawbacks, is available in Wikipedia <http://en.wikipedia.org/wiki/Entity-Attribute-Value_model>.
Given discussion here, I'd like to see this patch backed out and re-landed.  Given Wayne's & Joshua's comments and a bit of testing on my end, I think it's probably not worth suggesting that nightly users blow away their old file, unless we run into specific cases where there are likely to be problems.

I have meetings for the rest of the day, so jminta has graciously agreed to do the backout and watch the tree.

Before relanding, I think it's important to try some general understanding of:

* UI responsiveness (i.e. are we likely to block the UI thread for too long)
* speed of rebuilding & CPU usage
* memory usage
* disk footprint

prioritized in that order.  e.g. I think we can live with getting a little bigger on disk, but we can't live with getting less responsive.
We need this understanding for both the first-time (i.e. building the index) and subsequent (i.e. reading the index) cases.

In an ideal world, we'd write automated code to test all these things in place so that we could compare across mork & various possible schemas.  More realistically, I suspect some basic unit tests will get us a long way: in addition to the usual reasons for unit tests, we can profile & instrument the tests to give us some data about how this stuff will scale.  This, in combination with judicious dtrace usage on Thunderbird itself seems likely to be good enough.

For relanding, I think we want to at least have some basic memory tests and speed tests up and running on Tinderbox for startup & shutdown.  Ideally, we'd also have some sort of speed test(s) that will exercise this code.  We might be able to do that as part of Ts.
Status: RESOLVED → REOPENED
Flags: wanted-thunderbird3.0a1? → wanted-thunderbird3.0a1+
Resolution: FIXED → ---

Comment 42

9 years ago
I've now backed this patch out.

/cvsroot/mozilla/mailnews/base/build/Makefile.in,v  <--  Makefile.in
new revision: 1.85; previous revision: 1.84
done
Checking in mailnews/base/public/nsIMsgFolderCache.idl;
/cvsroot/mozilla/mailnews/base/public/nsIMsgFolderCache.idl,v  <--  nsIMsgFolderCache.idl
new revision: 1.12; previous revision: 1.11
done
Checking in mailnews/base/public/nsIMsgFolderCacheElement.idl;
/cvsroot/mozilla/mailnews/base/public/nsIMsgFolderCacheElement.idl,v  <--  nsIMsgFolderCacheElement.idl
new revision: 1.8; previous revision: 1.7
done
Checking in mailnews/base/src/Makefile.in;
/cvsroot/mozilla/mailnews/base/src/Makefile.in,v  <--  Makefile.in
new revision: 1.137; previous revision: 1.136
done
Checking in mailnews/base/src/nsMailDirProvider.cpp;
/cvsroot/mozilla/mailnews/base/src/nsMailDirProvider.cpp,v  <--  nsMailDirProvider.cpp
new revision: 1.8; previous revision: 1.7
done
Checking in mailnews/base/src/nsMsgFolderCache.cpp;
/cvsroot/mozilla/mailnews/base/src/nsMsgFolderCache.cpp,v  <--  nsMsgFolderCache.cpp
new revision: 1.60; previous revision: 1.59
done
Checking in mailnews/base/src/nsMsgFolderCache.h;
/cvsroot/mozilla/mailnews/base/src/nsMsgFolderCache.h,v  <--  nsMsgFolderCache.h
new revision: 1.25; previous revision: 1.24
done
Checking in mailnews/base/src/nsMsgFolderCacheElement.cpp;
/cvsroot/mozilla/mailnews/base/src/nsMsgFolderCacheElement.cpp,v  <--  nsMsgFolderCacheElement.cpp
new revision: 1.26; previous revision: 1.25
done
Checking in mailnews/base/src/nsMsgFolderCacheElement.h;
/cvsroot/mozilla/mailnews/base/src/nsMsgFolderCacheElement.h,v  <--  nsMsgFolderCacheElement.h
new revision: 1.12; previous revision: 1.11
done
Checking in mailnews/base/test/unit/test_nsMailDirProvider.js;
/cvsroot/mozilla/mailnews/base/test/unit/test_nsMailDirProvider.js,v  <--  test_nsMailDirProvider.js
new revision: 1.3; previous revision: 1.2
done
Checking in mailnews/build/Makefile.in;
/cvsroot/mozilla/mailnews/build/Makefile.in,v  <--  Makefile.in
new revision: 1.29; previous revision: 1.28
done

Comment 43

9 years ago
Downloaded an hourly zip and applied over my current trunk build.
All noted perf regressions are gone, with no impact on my profile.
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9pre) Gecko/2008040917 Thunderbird/3.0a1pre ID:2008040917
Moving from wanted‑thunderbird3.0a1+ flag to wanted‑thunderbird3.0a2? flag since code for Thunderbird 3 Alpha 1 has been frozen.
Flags: wanted-thunderbird3.0a1+ → wanted-thunderbird3.0a2?
(Reporter)

Comment 45

9 years ago
Conducting profiling tests on optimal schema for this code.

Profile code (Ctrl-V into error console): var Ci = Components.interfaces; var sum = [0,0]; for (var iter=0;iter<100;iter++) {var t = Date.now();function recurseFolder(folder) {  try { folder.getStringProperty("MRUTime"); } catch (e) {}  if (folder.hasSubFolders) {    var folders = folder.subFolders;    while (folders.hasMoreElements()) {      recurseFolder(folders.getNext().QueryInterface(Ci.nsIMsgFolder));    }  }}var am = Components.classes["@mozilla.org/messenger/account-manager;1"].getService(Ci.nsIMsgAccountManager);var servers = am.allServers;for (var i=0; i<servers.Count();i++) {  var s = servers.GetElementAt(i).QueryInterface(Ci.nsIMsgIncomingServer);  recurseFolder(s.rootFolder);} sum[1] += Date.now()-t; if (sum[0] == 0) sum[0] = sum[1] } sum[1] /= 100; sum

Testing procedure:
1. rm msgFolderCache.sqlite
2. Run TB
3. Type in gmail password, click on all newsgroups once.
4. Close TB
5. Start again
6. Enter query, run thrice
7. Take numbers from first run, look at the other two for more inquiries.
8. Observe file size

Results:
-- Name --              -- First Run --  -- 100 avg --
Old panacea.dat               7               4.53
Old 418551 patch              174             37.01
SQL with key index            56              10.52
SQL with key, prop index      55              10.15
SQL with key+prop index       47               9.51
SQL with key, key+prop index  52               9.7

Conclusions:
(file size for panacea.dat versus msgFolderCache.sqlite cannot be compared due to dogfooding)
* No indexes is very slow
* Best index appears to be a combined key and property index
* Subsequent runs are very fast
* From checking multiple times, it appears that run 101, 201 on the last few were high single digit (6-9), but less than average. Some sort of flushing may be going on in SQLite averaging tests, distorting them upwards.
* Multiple indexes on one table appear to do little to improve time and instead aggravate the size problem.
* Size varies around 68K-72K for msgFolderCache, 6K (high) for panacea.dat.

To do:
* Check better EAV schemas (splicing folderKey/property into separate tables)
* Check coding improvements. It looks like SQLite does a wonderful job of caching for us, so creating internal caching hashtables may not be necessary
* Check setting properties.

Other notes:
* Background processes: 3 xterms, vim, ssh, firefox, audacious, TB branch. Nothing started or exited after the panacea.dat started testing. Next tracks all changed in the downtime between tests, so that shouldn't impact times.
* Folders in cache:
** Gmail IMAP: INBOX, Draft, Trash, [GMail]/{Spam,Drafts,Trash,Sent Mail,Starred}, listarchive, gnomezilla, My Bugs
** news.mozilla.org: m.d.a.t, m.d.planning, m.d.platform, m.test
** localhost (news): test.group.2
** Local Folders: Drafts, Test, Unsent Messages, Trash, Inbox, Sent

The rest will be done tomorrow, with any luck...

Comment 46

9 years ago
(In reply to comment #45)
> Results:
> -- Name --              -- First Run --  -- 100 avg --
> Old panacea.dat               7               4.53
> Old 418551 patch              174             37.01
> SQL with key index            56              10.52
> SQL with key, prop index      55              10.15
> SQL with key+prop index       47               9.51
> SQL with key, key+prop index  52               9.7

So your numbers suggest that dropping Mork (here?) would have no positive effect, but a perf regression of about 114% on average?!
Or am I misinterpreting?
(Reporter)

Comment 47

9 years ago
(In reply to comment #46)
> So your numbers suggest that dropping Mork (here?) would have no positive
> effect, but a perf regression of about 114% on average?!
> Or am I misinterpreting?

I've only done one test, based on the current patch. I can't say for sure until I do more testing (which will happen after I finish catching up on last night's bug spam).
(Reporter)

Comment 48

9 years ago
Testing, part 2:
So I did probably a few dozen tests, but some of the numbers were skewed off because I changed some schemas around without modifying enough of the other stuff.

This test suite uses the following schema:
CREATE TABLE folders (
    folderName CHAR PRIMARY KEY,
    id INTEGER NOT NULL
);
CREATE TABLE entries (
    folder INTEGER REFERENCES folders(id),
    propertyName CHAR,
    value CHAR
);
CREATE UNIQUE INDEX folderIndex ON entries (folder, propertyName);

The size was cut down from 100352 to 24576 under this schema, compared with 6550 under panacea.dat; there is still room to go here.

Time runs:
      -- Name --         -- First Run --  -- 100 avg --
(From above table)
Old panacea.dat               7                4.53
SQL with key+prop index       47               9.51
(New results)
Adding schema                 48               9.1
Pulling out createStatement   11               4.72

In terms of speed, the new schema didn't improve anything beyond size (which is still admirable), but running createStatement fewer times (once per element instead of once per getProperty) seems to be the main speed gain. Similarly, I doubt that adding a property name->id map will improve speed, but it should aid on the size front.

The downside is that the speed gains come at the cost of some need to be careful with APIs. I'll place the comments in the relevant spaces. On the plus side, this enables me to recalibrate some of nsMsgFolderCacheElement internal hackery (namely almost everything calls getStringProperty at some point). I may also have to fix the account manager's incorrect usage of the cache.
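
(A minimal JS illustration of the "pulling out createStatement" win measured above: prepare the statement once per element and reuse it with reset(). Shown against the original single-table schema for brevity.)

// Sketch only: prepare once, reuse for every getStringProperty call.
Components.utils.import("resource://gre/modules/Services.jsm");
Components.utils.import("resource://gre/modules/FileUtils.jsm");
var db = Services.storage.openDatabase(
  FileUtils.getFile("ProfD", ["msgFolderCache.sqlite"]));

// Created once, e.g. when the cache element is created...
var getStmt = db.createStatement(
  "SELECT value FROM entries WHERE folderKey = ?1 AND propertyName = ?2");

// ...and reused for every lookup.
function getStringProperty(folderKey, propertyName) {
  getStmt.bindUTF8StringParameter(0, folderKey);
  getStmt.bindUTF8StringParameter(1, propertyName);
  try {
    return getStmt.executeStep() ? getStmt.getUTF8String(0) : null;
  } finally {
    getStmt.reset();   // leave it ready for the next call; finalize() only at shutdown
  }
}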
(Reporter)

Updated

9 years ago
Status: REOPENED → ASSIGNED
(Reporter)

Comment 49

9 years ago
Testing, part 3:
This test suite doesn't test the impact of the schema, but the impact of committal. Preliminary pre-testing results, elided here for brevity, indicate that the difference in commit times between small, large, session, and compress commits is within noise.

Like before, these are the steps I used:
1. Clear panacea.dat/msgFolderCache.sqlite before tests.
2. Start Thunderbird.
3. Reject IMAP password.
4. Open error console, paste code.
5. Run once to generate official results.
6. Run a few more times for variance tests (not posted with numbers).

Test code (beautified, unlike above):
var Ci = Components.interfaces;
var dbService = Components.classes["@mozilla.org/msgDatabase/msgDBService;1"]
                          .getService(Ci.nsIMsgDBService);
var am = Components.classes["@mozilla.org/messenger/account-manager;1"]
                   .getService(Ci.nsIMsgAccountManager);
var sum = [0, 0];
for (var iter=0; iter<100; iter++) {
function commitFolder(folder) {
  try {
    dbService.openFolderDB(folder, false, true).Commit(3);
  } catch (ex) {}
  if (folder.hasSubFolders) {
    var folders = folder.subFolders;
    while (folders.hasMoreElements()) {
      commitFolder(folders.getNext().QueryInterface(Ci.nsIMsgFolder));
    }
  }
}
var t = Date.now();
var servers = am.allServers;
for (var i=0; i<servers.Count();i++) {
  var s = servers.GetElementAt(i).QueryInterface(Ci.nsIMsgIncomingServer);
  commitFolder(s.rootFolder);
}
sum[1] += Date.now()-t;
if (sum[0] == 0)
  sum[0] = sum[1];
}
sum[1] /= 100;
sum

Results:
 --Test--                 --First Run--  --100 avg--  --Variance notes--
No folder cache commit         252          142.44      Subsequent get worse
Old panacea.dat                184          142.77      Subsequent get worse
SQLite                         213          128.39      Subsequent like before

The last test, unfortunately, has some testing errors: I was trying to fix some errors with SQLite that cropped up, and updated the IMAP mail; I also turned off my music, which appears to be the reason for the 14 ms impact, as subsequent trials, with the music, did show the same times (within noise) as the other trials did in the variance tests.

In conclusion: folder cache commit times have no bearing on overall database commits (we're talking 10 ms at best compared with a 140-150ms commit time). SQL-ification has no impact in these regards. Also, all the errors I was seeing were actually errors on my part that have been fixed.

There will probably only be one subsequent test run, that of an agglomerated create, get, set, delete, and commit time. Profiling code is such fun... :-(
Flags: wanted-thunderbird3.0a2? → wanted-thunderbird3?
(Assignee)

Updated

9 years ago
Product: Core → MailNews Core
(Reporter)

Comment 50

8 years ago
In interest of full disclosure:
1. I don't think those numbers take into account possible #pragma optimizations
2. The test was running on a Linux ext3 partition, so the numbers might be affected by fsync issues as compared to what one would see on Windows or Mac.
(In reply to comment #50)
> 2. The test was running on a Linux ext3 partition, so the numbers might be
> affected by fsync issues as compared to what one would see on Windows or Mac.
If you wrap it all in a transaction, you shouldn't be hitting fsync issues.
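
(A hedged sketch of the transaction wrapping suggested above, against the original entries schema; INSERT OR REPLACE assumes the unique (folderKey, propertyName) index discussed earlier, and the folder key and values are placeholders.)

// Sketch only: batch the property writes in one transaction so there is a
// single fsync for the whole batch rather than one per INSERT.
Components.utils.import("resource://gre/modules/Services.jsm");
Components.utils.import("resource://gre/modules/FileUtils.jsm");
var db = Services.storage.openDatabase(
  FileUtils.getFile("ProfD", ["msgFolderCache.sqlite"]));

var props = { MRUTime: "1207000000", totalMsgs: "500" };  // sample values
db.beginTransaction();
try {
  var stmt = db.createStatement(
    "INSERT OR REPLACE INTO entries (folderKey, propertyName, value) VALUES (?1, ?2, ?3)");
  for (var name in props) {
    stmt.bindUTF8StringParameter(0, "someFolder");   // placeholder folder key
    stmt.bindUTF8StringParameter(1, name);
    stmt.bindUTF8StringParameter(2, props[name]);
    stmt.execute();
  }
  stmt.finalize();
  db.commitTransaction();   // one fsync here instead of one per row
} catch (e) {
  db.rollbackTransaction();
  throw e;
}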
Severity: normal → enhancement
Summary: Convert panacea.dat from mork → Convert panacea.dat from mork to another database format
Flags: wanted-thunderbird3? → wanted-thunderbird+

Updated

7 years ago
Blocks: 453975
(Reporter)

Comment 52

6 years ago
I prototyped a simple reimplementation of this using Google's LevelDB, having heard it was a relatively fast <key, value> implementation. I also picked a fairer test for producing results:
1. Initialize the database to a current value of the panacea.dat from my large profile (~20 accounts, Linux/Windows cross-platform usage, several extensions)
2. Start up Thunderbird (about half of the accounts are gone), timing all folder cache accesses until it stabilizes
3. Get all mail, timing all folder cache accesses until it stabilizes
4. Shutdown, timing all folder cache accesses at death

Between each test, I flushed the page cache, so the values are roughly indicative of cold startup time. The first 5 tests did all of this; for tests 6-10, I initialized the database and then ran the other timing steps on the same database.

For rough ballpark numbers (I did no optimization on LevelDB), initialization is about 30% slower with LevelDB (lower 400ms versus lower 300ms), shutdown is about 50% faster (but the difference is like 10ms, so don't put a lot of weight on it), get all mail is fairly even (50ms across the board) while initial startup is about 100% slower, or lower 200ms versus upper 400ms. All of these numbers I just realized are from a debug build, so they may mean absolutely nothing.

Judging from these results, I suspect LevelDB just has a higher code initialization overhead than mork does, implying a sunk cost of perhaps 350ms or so just for initializing the database.

For the record, there is an incongruity in the API in that getStringProperty doesn't throw on nonexistence but getInt32Property does. When I assumed this wasn't the case (i.e., that they both threw), the resulting slowdown was on the order of 1000%, since I'm getting around 20K properties.

In short: right now, this is at about parity with Mork in the long run (most likely), with a sunk cost on startup. I can probably tweak some properties to fix the startup costs, but I don't think I can overcome the gap. LevelDB is also multithreaded, and I suspect that real-life code can mitigate some startup overhead by loading the cache async and doing other things while it loads (I think LevelDB may use other threads to run async). In fact, I'll make a note to run cold startup tests of Thunderbird itself before and after to measure the impact, although picking the right event to stop timing may be difficult.
mail-startup-done perhaps? That's when we consider ourselves to be ready for user interaction.
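
(A minimal sketch of timing against that, assuming the snippet is loaded early from chrome or extension code; "mail-startup-done" is the observer topic mentioned above.)

// Sketch only: rough cold-startup timing from chrome code, stopping at the
// "mail-startup-done" notification.
Components.utils.import("resource://gre/modules/Services.jsm");

var startupTimerStart = Date.now();
var startupObserver = {
  observe: function(subject, topic, data) {
    if (topic == "mail-startup-done") {
      Services.console.logStringMessage(
        "mail-startup-done after " + (Date.now() - startupTimerStart) + " ms");
      Services.obs.removeObserver(startupObserver, "mail-startup-done");
    }
  }
};
Services.obs.addObserver(startupObserver, "mail-startup-done", false);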
(Reporter)

Updated

4 years ago
Assignee: Pidgeot18 → nobody
(Reporter)

Updated

4 years ago
Status: ASSIGNED → NEW
Is there any way to make "current panacea.dat use or new msgfoldercache.sqlite use" selectable?
If possible, we can test msgfoldercache.sqlite any time, as we can test MaildirStore any time.
(In reply to WADA from comment #54)
> Is there any way to make "current panacea.dat use or new
> msgfoldercache.sqlite use" selectable?
> If possible, we can test msgfoldercache.sqlite any time, as we can test
> MaildirStore any time.

I should think this is doable.  Joshua, what do you think? And is there a better db choice today?

LevelDB was young at the time, and after 18 releases perf should be better https://github.com/google/leveldb/releases   And we now also have lmdb http://symas.com/mdb/
Flags: needinfo?(Pidgeot18)
Target Milestone: mozilla1.9 → ---
Blocks: 65086
(Reporter)

Comment 56

2 years ago
(In reply to Wayne Mery (:wsmwk) from comment #55)
> (In reply to WADA from comment #54)
> > Is there any way to make "current panacea.dat use or new
> > msgfoldercache.sqlite use" selectable?
> > If possible, we can test msgfoldercache.sqlite any time, as we can test
> > MaildirStore any time.
> 
> I should think this is doable.  Joshua, what do you think? And is there a
> better db choice today?

OS.File.writeAtomic + JSON store should be better. Most instances of panacea.dat should be on the order of ~10s of KB--that doesn't call for a high-perf database.
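
(A minimal sketch of that OS.File.writeAtomic + JSON approach; the file name is made up, and since reads are async the cache would be loaded once early and kept as a plain object afterwards.)

// Sketch only: a JSON-backed folder cache written atomically.
Components.utils.import("resource://gre/modules/osfile.jsm");

var cachePath = OS.Path.join(OS.Constants.Path.profileDir, "folderCache.json");  // made-up name

// Load once, early; afterwards every get/set is just an object lookup.
function loadCache() {
  return OS.File.read(cachePath, { encoding: "utf-8" })
                .then(text => JSON.parse(text),
                      e => ({}));   // missing/corrupt file: rebuild, like panacea.dat today
}

function saveCache(cache) {
  // writeAtomic writes to a temp file and renames it into place, so a crash
  // cannot leave a half-written cache behind.
  return OS.File.writeAtomic(cachePath, JSON.stringify(cache),
                             { encoding: "utf-8", tmpPath: cachePath + ".tmp" });
}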
Flags: needinfo?(Pidgeot18)
(In reply to Joshua Cranmer [:jcranmer] from comment #56)
> (In reply to Wayne Mery (:wsmwk) from comment #55)
> > (In reply to WADA from comment #54)
> > > Is there any way to make "current panacea.dat use or new
> > > msgfoldercache.sqlite use" selectable?
> > > If possible, we can test msgfoldercache.sqlite any time, as we can test
> > > MaildirStore any time.
> > 
> > I should think this is doable.  Joshua, what do you think? And is there a
> > better db choice today?
> 
> OS.File.writeAtomic + JSON store should be better. Most instances of
> panacea.dat should be on the order of ~10s of KB--that doesn't call for a
> high-perf database.

After Thunderbird restart my panacea.dat is 800K.  Before shutdown it was 18MB.  I don't think I have a huge message store.

Perhaps telemetry data in this area would be useful.
(In reply to comment #45)
> Results:
> -- Name --              -- First Run --  -- 100 avg --
> Old panacea.dat               7               4.53
> Old 418551 patch              174             37.01
> SQL with key index            56              10.52
> SQL with key, prop index      55              10.15
> SQL with key+prop index       47               9.51
> SQL with key, key+prop index  52               9.7

This is a pretty ordinary result: if not cached yet, it takes long; if already cached, it's never slow.
This is similar to:
    On the first run of Tb after a re-boot, startup takes pretty long, but the second or later startup is never slow because the modules are held in the disk cache etc.
A solution to this kind of issue is something like "prefetch". With an SQL DB, a "non-terminating task which keeps the SQLite DB open or keeps accessing the DB" can be a solution.
Adobe, Apple etc. already use this kind of technique for Adobe Reader, QuickTime etc.
FYI.
If a DB which amounts to a "collection of flat objects like { Key : Value }" is sufficient, IndexedDB and localStorage can be used.
   IndexedDB and localStorage in Fx/Tb utilize mozStorage. So, these can be called "wrappers of the SQLite DB utilizing JSON".
   IndexedDB   : Key = BLOB in SQLite, Value = BLOB in SQLite, JavaScript Object <-> JSON is perhaps done by IndexedDB.
                          Add data   : IndexedDB.add(Key, JavaScript Object) like call.
                          Read data :  var Obj = IndexedDB.get(Key) like call.
   localStorage : Key = TEXT in SQLite, Value = TEXT in SQLite. So, the user has to do the conversion of JavaScript Object <-> JSON text himself.
                          Add data   : localStorage.setItem(Key, Value);
                          Read data : var TextData = localStorage.getItem(Key);
                          Del data    : localStorage.removeItem(Key);
                          Del all data  : localStorage.clear();
   localStorage consists of a single table only (same for sessionStorage):
        CREATE TABLE webappsstore2 (scope TEXT, key TEXT, value TEXT, secure INTEGER, owner TEXT)
   A user can update key/value pairs only in the assigned scope=uri of site=origin under the "same origin policy", using getItem()/setItem()/removeItem()/clear() only.
   This is a replacement/enhancement of cookies. "value" can be JSON text, so a JavaScript object is saved after conversion to JSON by the user himself.
   So, if the FolderURL can be considered as the site URI=scope=origin of the "same origin policy", it's pretty easy to use.
(Reporter)

Comment 60

2 years ago
(In reply to WADA from comment #59)
> FYI.
> If DB which is identical to "collection of flat object like { Key : Value }"
> is sufficient, IndexedDB and localStorage can be used.

IndexedDB can't be used. The message folder cache is inherently synchronous, and IndexedDB only offers an asynchronous API. It's possible to do an early async load, but that implies loading all the values--in which case IndexedDB only offers lots of overhead for what's basically a JS get/set operation.

The folder cache, when I last measured it, was something like one database load followed by a thousand gets and sets in the startup path. The only way to not regress that performance from mork is to do a hashtable operation for each get/set, and databases are unlikely to get you that. The only benefit to using an SQL-based database (e.g., IndexedDB) is that we don't have to maintain mork--and quite frankly, an ad-hoc JSON file format would have better performance characteristics and still achieve the same benefit.
FYI.

localStorge test under Firefox.

const BR = "<" + "BR>" ;
var Test=new Array();
var NumRun = 10; var NumItem = 1000;
var Value = new Array(129).join("x");  // ~128-character sample value string
for(var rr=0;rr<NumRun;rr++)
{
     var Data = {};
         Data[ "T_0" ] = new Date();  for(var nn=0;nn<NumItem;nn++) { localStorage.setItem(nn,Value); }
         Data[ "T_1" ] = new Date();  for(var nn=0;nn<NumItem;nn++) { var ItemData = localStorage.getItem(nn); }
         Data[ "T_2" ] = new Date();  for(var nn=0;nn<NumItem;nn++) { localStorage.removeItem(nn); }
         Data[ "T_3" ] = new Date();
     Test[ Test.length ] = "Run = " + rr + ", NumItem = " + NumItem
          +    ", setItem = " + ( Data[ "T_1" ] - Data[ "T_0" ] ) 
          +    ", getItem = " + ( Data[ "T_2" ] - Data[ "T_1" ] ) 
          + ", removeItem = " + ( Data[ "T_3" ] - Data[ "T_2" ] ) ; 
}

Test result.

Run = 0, NumItem = 1000, setItem = 34, getItem = 10, removeItem = 35
Run = 1, NumItem = 1000, setItem = 22, getItem = 6, removeItem = 21
Run = 2, NumItem = 1000, setItem = 21, getItem = 6, removeItem = 22
Run = 3, NumItem = 1000, setItem = 21, getItem = 6, removeItem = 21
Run = 4, NumItem = 1000, setItem = 21, getItem = 5, removeItem = 23
Run = 5, NumItem = 1000, setItem = 21, getItem = 6, removeItem = 21
Run = 6, NumItem = 1000, setItem = 21, getItem = 6, removeItem = 21
Run = 7, NumItem = 1000, setItem = 21, getItem = 6, removeItem = 21
Run = 8, NumItem = 1000, setItem = 21, getItem = 6, removeItem = 21
Run = 9, NumItem = 1000, setItem = 22, getItem = 5, removeItem = 22

localStorage consists of a single, simple table (same for sessionStorage), and is synchronous.
       CREATE TABLE webappsstore2 (scope TEXT, key TEXT, value TEXT, secure INTEGER, owner TEXT)
The majority of the data volume in panacea.dat looks like "full path of file for mailbox <-> Mbox name" data. The above performance looks not so bad to me for such data.
If JavaScript object data is needed, conversion to JSON is possible. Although the value data size is limited in localStorage, it's usually sufficient for FolderCache-like use. As localStorage.setItem(Key, Value) / var Dat = localStorage.getItem(Key) is pretty simple to use, it's far easier than ad-hoc JSON file writing/reading. And, because it is an "SQLite DB", there is no need to pay attention to "file loss" nor "deletion by user" :-)
localStorage is a replacement/enhancement of cookies, but there is no reason to prohibit its use by Tb's messenger.
(In reply to Joshua Cranmer [:jcranmer] from comment #0)
> Proposed SQLite schema:
> CREATE TABLE entries (  folderKey CHAR, propertyName CHAR, value CHAR );

There is no difference from localStorage.
  localStorage consists of a single, simple table (same for sessionStorage), and is synchronous.
        CREATE TABLE webappsstore2 (scope TEXT, key TEXT, value TEXT, secure INTEGER, owner TEXT)
If actual localStorage is exposed to the JavaScript code of a Web site's HTML, the scope is hidden by the browser. So, if "primaryKey:SecondaryKey:Value" is needed, code like "Key = primaryKey + ':' + SecondaryKey; localStorage.setItem(Key, Value);" is needed.
However, in XUL JavaScript code, code like the following can be used.
   var ios = Components.classes["@mozilla.org/network/io-service;1"]
             .getService(Components.interfaces.nsIIOService);
   var ssm = Components.classes["@mozilla.org/scriptsecuritymanager;1"]
             .getService(Components.interfaces.nsIScriptSecurityManager);
   var dsm = Components.classes["@mozilla.org/dom/storagemanager;1"]
             .getService(Components.interfaces.nsIDOMStorageManager);

   // Same as preparation work for window.localStorage for a Web Site
   var url = "https://www.example.com"; // <== used for "scope"
   var uri = ios.newURI(url, "", null);
   var principal = ssm.getCodebasePrincipal(uri);
   var localStorage = dsm.getLocalStorageForPrincipal(principal, ""); // <-- "scope" is hidden

   // Same as Script code of a Web Site
   localStorage.setItem("MyKey1", "MyValue1");

So, if a folderURI-like value is used for "uri" in the above code, localStorage is absolutely the same as the SQLite table which you used for your test.
i.e. localStorage can be used as a pretty good wrapper of a simple mozStorage DB (== SQL DB) with a synchronous API.
Summary: Convert panacea.dat from mork to another database format → Convert panacea.dat from mork to another database format such as mozStorage(SQLite DB), IndexedDB(constructed on mozStorage in Mozilla), localStorage(DOMStorage, constructed on mozStorage in Mozilla)

Comment 63

2 years ago
(In reply to Joshua Cranmer [:jcranmer] from comment #60)
> The message folder cache is inherently synchronous

Sounds like that will need to change at least in the long run, as anything doing disk access should never be sync but always async. But of course I have no clue on how deep those changes would need to go, and it might make sense to do some step in between what we have now and a solution backed by async disk access.
FYI.
If localStorage is used in XUL, a "split FolderCache", "per-folder FolderCache", or "hierarchical FolderCache" is easily obtained with a synchronous API.

   FolderA = "mailbox://server/MboxName"                FolderB = "mailbox://server/MboxName/SubFolder"
    var uri = ios.newURI(FolderA , "", null);                    var uri = ios.newURI(FolderB , "", null);
    var principal = ssm.getCodebasePrincipal(uri);         var principal = ssm.getCodebasePrincipal(uri);
    var localStorage =                                                      var localStorage =
      dsm.getLocalStorageForPrincipal(principal, "");        dsm.getLocalStorageForPrincipal(principal, "");

    // Same as Script code of a Web Site                         // Same as Script code of a Web Site
    localStorage.setItem("MyKey1", "MyValue1");          localStorage.setItem("MyKey1", "MyValue1");

    following Table row is created                                    following Table row is created       
      scope=server/MboxName:mailbox:80                       scope=server/MboxName/SubFolder:mailbox:80   
      key=MyKey1                                                              key=MyKey1                         
      value=MyValue1                                                        value=MyValue1

Further, following is possible.
   Category = "foldercache://server/MboxName"                                        
    var uri = ios.newURI(Category , "", null);  var principal = ssm.getCodebasePrincipal(uri);          
    var localStorage =  dsm.getLocalStorageForPrincipal(principal, "");  

    // Same as Script code of a Web Site               
    localStorage.setItem("MyKey1", "MyValue1");        

    following Table row is created                     
      scope=server/MboxName:foldercache:8080, key=MyKey1, value=MyValue1 

If the following is done, there is no need for "file I/O for folderTree.json", as long as the folderTree.json size is not huge.
   Category =  "file://folderTree/json"                                        
    var uri = ios.newURI(Category , "", null);  var principal = ssm.getCodebasePrincipal(uri);          
    var localStorage =  dsm.getLocalStorageForPrincipal(principal, "");  
    // Same as Script code of a Web Site               
    localStorage.setItem("folderTree.json", content of folderTree.json);        

    following Table row is created                     
       scope=folderTree/json:file:80, key=folderTree.json, value=content of folderTree.json file
    Read is also easy: var folderTree_json_data = localStorage.getItem("folderTree.json");

So, as the API of localStorage is synchronous, mapping from the "current FolderCache based on Mork" to "localStorage" is easy, and the current FolderURI, messageURI etc. can be fully utilized.
However, because localStorage is based on mozStorage == an SQLite table, the same performance issue observed by Joshua Cranmer occurs.

The biggest volume of data in panacea.dat is the "full path of Mbox <-> MboxName" data.
If this data is moved to localStorage, I believe the file size of panacea.dat would be drastically reduced.
I believe that the "high performance of the Mork DB, which is an in-memory DB" is not needed for data like "full path of Mbox <-> MboxName".
And, this data is mainly used upon restart and folder re-discovery only. There is no need to keep it in memory always.

I believe there is no need to discard the Mork DB merely because it's not a new technology.
I believe the following is sufficient:
   Don't hold data which doesn't require the high performance of the Mork DB in the Mork DB.
   Hold such data in a cheap place such as an SQLite DB on the HDD instead of in expensive/valuable memory.
   Keep non-dynamic data on the HDD, e.g. in an SQLite DB. Why does data like the "thread column choice" require the Mork DB? (currently saved in .msf)
And, I think the following is possible:
   If "data loss or integrity loss by power failure/system failure" is not acceptable, move that data from the Mork DB to the SQLite DB.
       Integrity of the data of committed changes is guaranteed by SQLite, because SQLite's data is first written to the HDD.
       In SQLite, data in memory is simply cached data for read/update performance.
       The main reason for the "outdated msf condition" in local Berkeley folders is that Mork is an in-memory DB and the .msf file cannot be updated in a timely manner until DB close.
Blocks: 791581
FYI.

The following are already-used scope (and key/value) entries in the table webappsstore2 of webappsstore.sqlite on Tb 45 and Daily.
> INSERT INTO "someTable" VALUES ("","tnetnoc.:chrome","tnetnoc.:chrome","name","King Yatter");
> 
> INSERT INTO "someTable" VALUES ("","gro.allizom.snodda.secivres.:https:443",
>   "gro.allizom.snodda.secivres.:https:443","discopane-url",
>   "https://services.addons.mozilla.org/en-US/thunderbird/discovery/pane/53.0a1/WINNT#
>     {%22inspector@mozilla.org%22:{%22name%22:%22DOM%20Inspector%22,
>     %22version%22:%222.0.16.1-signed%22,%22type%22:%22extension%22,
>     %22userDisabled%22:false,%22isCompatible%22:true,%22isBlocklisted%22:false},
>     %22custombuttons@xsms.org%22:
>     {%22name%22:%22Custom%20Buttons%22,%22version%22:%220.0.5.8.9%22,
>     %22type%22:%22extension%22,%22userDisabled%22:false,%22isCompatible%22:true,
>     %22isBlocklisted%22:false},%22SQLiteManager@mrinalkant.blogspot.com%22:
>     {%22name%22:%22SQLite%20Manager%22,%22version%22:%220.8.3.1-signed.1-signed%22,
>     %22type%22:%22extension%22,%22userDisabled%22:false,%22isCompatible%22:true,
>     %22isBlocklisted%22:false},%22{f69e22c7-bc50-414a-9269-0f5c344cd94c}%22:
>     {%22name%22:%22Theme%20Font%20&%20Size%20Changer%22,
>     %22version%22:%2250.0%22,%22type%22:%22extension%22,%22userDisabled%22:false,
>     %22isCompatible%22:true,%22isBlocklisted%22:false},%22WinBackMyTrash@gmail.com%22:
>     {%22name%22:%22WinBackMyTrash%22,%22version%22:%220.1.0.020%22,
>     %22type%22:%22extension%22,%22userDisabled%22:false,%22isCompatible%22:true,
>     %22isBlocklisted%22:false},%22{972ce4c6-7e08-4474-a285-3208198ce6fd}%22:
>     {%22name%22:%22Default%22,%22version%22:%2253.0a1%22,%22type%22:%22theme%22,
>     %22userDisabled%22:false,%22isCompatible%22:true,%22isBlocklisted%22:false},
>     %22{e2fda1a4-762b-4020-b5ad-a41df1933103}%22:
>     {%22name%22:%22Lightning%22,%22version%22:%225.5a1%22,%22type%22:%22extension%22,
>     %22userDisabled%22:false,%22isCompatible%22:true,%22isBlocklisted%22:false},
>     %22{fdf1cc26-48c9-18af-3146-d16bf63810bd}%22:
>     {%22name%22:%22Shockwave%20Flash%22,%22version%22:%2224.0.0.186%22,
>     %22type%22:%22plugin%22,%22userDisabled%22:false,%22isCompatible%22:true,
>     %22isBlocklisted%22:false},%22custombutton://buttons/Thunderbird/custombuttons-button0%22:
>     {%22name%22:%22checkCachedDBs%22,%22type%22:%22custombuttons%22,
>     %22isCompatible%22:true,%22isBlocklisted%22:false},
>     %22custombutton://buttons/Thunderbird/custombuttons-button1%22:
>     {%22name%22:%22localStorage%22,%22type%22:%22custombuttons%22,
>     %22isCompatible%22:true,%22isBlocklisted%22:false},
>     %22custombutton://buttons/Thunderbird/custombuttons-button2%22:
>     {%22name%22:%22Adhoc%22,%22type%22:%22custombuttons%22,
>     %22isCompatible%22:true,%22isBlocklisted%22:false}}");
> 
> scope is reversed site domain due to same origin policy. 
>   uri=chrome://content
>     -> scope=tnetnoc.:chrome
>   uri=https://services.addons.mozilla.org:443
>     ->  scope=gro.allizom.snodda.secivres.:https:443

"Yatter King" is my name for Windows login ID.
A mozilla-central feature looks to already use localStorage for chrome: uri.

Addon Manager looks to already utilize localStorage using HTTP uri for addon and JSON data.

Following is chrome: uri seen in Thunderbird.
> documentURI example in Thunderbird.
>  chrome://global/content/console.xul
>  chrome://inspector/content/inspector.xul
>  chrome://sqlitemanager/content/sqlitemanager.xul
>  chrome://messenger/content/messenger.xul
>  chrome://messenger/content/messengercompose/messengercompose.xul
>  chrome://messenger/content/messageWindow.xul
>  chrome://messenger/content/addressbook/addressbook.xul
>  chrome://messenger/content/activity.xul
>  chrome://messenger/content/SearchDialog.xul

In Thunderbird, site uri of chrome is always chrome://messenger/content.
So, if chrome: uri is used for data such as panacea.dat content, structured Key is needed for simple/easy utilization of localStorage.
To utilize localStorage easily, following are needed.
- sheme/protocol=https: && faked unique domain name for Thunderbird features
- new scheme/protocol support such as mail-folder:, mail-message: in same origin policy.