Closed Bug 1396655 Opened 7 years ago Closed 7 years ago

Cleanup folder DB cache eviction logic and msgDBCacheManager debugging output

Categories

(MailNews Core :: Database, enhancement)

enhancement
Not set
normal

Tracking

(thunderbird_esr5260+ fixed)

RESOLVED FIXED
Thunderbird 58.0
Tracking Status
thunderbird_esr52 60+ fixed

People

(Reporter: aceman, Assigned: aceman)

References

Details

Attachments

(1 file, 2 obsolete files)

There are still bugs (e.g. bug 1388696) claiming the folder database eviction in msgDBCacheManager::checkCachedDBs() may work incorrectly.

I propose to improve the debug output from the DB cache manager and simplify the eviction logic.
Attached patch 1396655.patch (obsolete) — Splinter Review
This improves the logging output by summaring how many DBs are pinned in a window, how many are already partly close and how many are still open, how many we need to close.
It simplifies the eviction logic e.g. by not running .queryElementAt() (with internal queryInterface), IsFolderOpenInWindow() and .databaseOpen on some DBs twice, but already grouping which DBs are the ones for potential closing (normal, unpinned, not already closed) the the first pass.
Attachment #8904350 - Flags: review?(rkent)
Attachment #8904350 - Flags: review?(jorgk)
Attachment #8904350 - Flags: feedback?(vseerror)
Comment on attachment 8904350 [details] [diff] [review]
1396655.patch

Review of attachment 8904350 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/base/util/msgDBCacheManager.js
@@ +157,5 @@
>        // more than maxOpenDBs folders to stay open after this loop.
> +      let dbsToClose = dbs.length - Math.max(maxOpenDBs - pinnedDBs, 0);
> +      log.info("Need to close " + dbsToClose + " DBs");
> +      // Order databases by lowest lastUsedTime (oldest) at the end.
> +      dbs.sort((a, b) => b.lastUseTime - a.lastUseTime);

This shouldn't change behaviour compared to the original sort, but in this new algorithm we pop from the end of the array so the array must be sorted with lowest values last.

The original was intended to have lowest values first.
However, the original sort was:
dbs.sort((a, b) => a.lastUseTime > b.lastUseTime);

This may have worked in some way, however the compareFunction in the sort returning boolean (a>b) looks certainly bogus and is against the spec:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/sort
The compareFunction should return and integer.

So this part could fix a bug.
Thanks for looking at this. I've been wanting to get back to these issues.

What is a "partly closed" database?
What is "pinned"?

And (so I can target user feedback) what type of scenarios will this algorithm change help - both cases where too few databases are getting closed (and we see memory bloat), and cases where databases are being closed too aggressively?  I think we have users reporting both.
Flags: needinfo?(acelists)
"Partly closed" mean that we cut the connection to the database (db.folder.msgDatabase = null) but the DB is still hanging in the cache for some reason (we don't know why it happens). The algorithm skips those as if they are already closed.

I called "pinned" those DBs of folders that are open in some window. I can reword that if needed.

The algorithm change shouldn't help any scenarios, it mostly improves the debugging output and simplifies the loops.
Unless the the sorting algorithm before the fix ordered the DBs incorrectly and e.g. closed the most recently used DBs first (instead of the least recent ones). That could fix something. The sorting appeared to work correctly (that is why it was there) right now, but it could be luck. I think the syntax was illegal so the sort order could vary or end up undefined. So this resembles the second case you mention "and cases where databases are being closed too aggressively".

I don't think the algorithm closes "too few databases" in any case. It closes the right amount, they just may hang in the cache for some time due to the "partly closed" state as described above.
Flags: needinfo?(acelists)
Comment on attachment 8904350 [details] [diff] [review]
1396655.patch

Review of attachment 8904350 [details] [diff] [review]:
-----------------------------------------------------------------

I think this is an excellent clean-up effort. The original code got the DBs twice (gDbService.openDBs) and checked IsFolderOpenInWindow(db.folder) twice, inefficiency ahoy!

Since this is (semi-)complicated stuff, I think you need to add more comments and improve the variable names. Looking forward to v2 :-)

And sorry about the delay in review, I'm still beating my co-reviewer to it ;-)

::: mailnews/base/util/msgDBCacheManager.js
@@ +121,5 @@
>      let closeThreshold = (Date.now() - idleLimit) * 1000;
> +    let cachedDBs = this._dbService.openDBs;
> +    log.info("Periodic check of cached folder databases (DBs), count=" + cachedDBs.length);
> +    let uncachedDBs = 0;
> +    let pinnedDBs = 0;

I'm not so happy with those variable names. Also, we need a comment to explain them here, so the reader can check the definition against the code.
pinnedDB is one "pinned to a window". Also, 'DBs' suggests an array of them. Why not:
// Count databases connected to a folder which is open in a window.
numOpenInWindow = 0;

What about uncachedDBs? Here you count the ones which have
!db.folder.databaseOpen
and the ones where you close due to no window and timeout and set: db.folder.msgDatabase = null;
Looking at nsMsgDBFolder.cpp they will be the ones with !db.folder.databaseOpen, so it's OK to count them in the same variable.
How about: numXXX, hmm, I can't think of a good name right now.
Aren't they a little bit like zombies? The database still has a folder, but the folder claims that it has no database?

@@ +157,5 @@
>        // more than maxOpenDBs folders to stay open after this loop.
> +      let dbsToClose = dbs.length - Math.max(maxOpenDBs - pinnedDBs, 0);
> +      log.info("Need to close " + dbsToClose + " DBs");
> +      // Order databases by lowest lastUsedTime (oldest) at the end.
> +      dbs.sort((a, b) => b.lastUseTime - a.lastUseTime);

No need to sort if dbsToClose <= 0.
Attachment #8904350 - Flags: review?(jorgk)
Attachment #8904350 - Flags: feedback?(vseerror)
Attachment #8904350 - Flags: feedback+
(In reply to Jorg K (GMT+2) from comment #5)
> // Count databases connected to a folder which is open in a window.
Better:
  // Count databases whose folder is open in a window.
(In reply to Jorg K (GMT+2) from comment #5)
> ::: mailnews/base/util/msgDBCacheManager.js
> @@ +121,5 @@
> >      let closeThreshold = (Date.now() - idleLimit) * 1000;
> > +    let cachedDBs = this._dbService.openDBs;
> > +    log.info("Periodic check of cached folder databases (DBs), count=" + cachedDBs.length);
> > +    let uncachedDBs = 0;
> > +    let pinnedDBs = 0;
> 
> I'm not so happy with those variable names. Also, we need a comment to

OK, will do.
 
> Aren't they a little bit like zombies? The database still has a folder, but
> the folder claims that it has no database?

Yes, they are zombies we can't explain yet. Setting the link to a DB to null should release the DB but it doesn't always.
It may or may not be related the to assertion about leaving open databases at TB shutdown you are hunting elsewhere. We just know that opening some new TB window (or dialog) suddenly makes these zombies go away. So some link is cut at that moment (maybe like folderpane being defocused) allowing those to be destroyed.
Attached patch 1396655.patch v1.1 (obsolete) — Splinter Review
Attachment #8904350 - Attachment is obsolete: true
Attachment #8904350 - Flags: review?(rkent)
Attachment #8906268 - Flags: review?(rkent)
Attachment #8906268 - Flags: review?(jorgk)
Comment on attachment 8906268 [details] [diff] [review]
1396655.patch v1.1

Review of attachment 8906268 [details] [diff] [review]:
-----------------------------------------------------------------

Nice.

::: mailnews/base/util/msgDBCacheManager.js
@@ +160,2 @@
>        // Close some DBs so that we do not have more than maxOpenDBs.
> +      // However, we skipped DBs for a folders that are open in a window

Nit: grammar "a folders".

@@ +163,5 @@
>        // more than maxOpenDBs folders to stay open after this loop.
> +      log.info("Need to close " + dbsToClose + " more DBs");
> +      // Order databases by lowest lastUseTime (oldest) at the end.
> +      dbs.sort((a, b) => b.lastUseTime - a.lastUseTime);
> +      for (;dbsToClose > 0; dbsToClose--) {

Nit: I prefer the while you had before. Then we don't get into the discussion of whether it should be:
for ( ; dbsToClose > 0; dbsToClose--) {
Attachment #8906268 - Flags: review?(jorgk) → review+
(In reply to Jorg K (GMT+2) from comment #9)
> Nit: I prefer the while you had before. Then we don't get into the
> discussion of whether it should be:
> for ( ; dbsToClose > 0; dbsToClose--) {

No problem, I just found for() takes one less line ;)
Attachment #8906268 - Attachment is obsolete: true
Attachment #8906268 - Flags: review?(rkent)
Attachment #8906293 - Flags: review?(rkent)
Comment on attachment 8906293 [details] [diff] [review]
1396655.patch v1.2

Thanks for addressing the nits.
Attachment #8906293 - Flags: review+
Comment on attachment 8906293 [details] [diff] [review]
1396655.patch v1.2

Review of attachment 8906293 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM, and sorry for the slow review.
Attachment #8906293 - Flags: review?(rkent) → review+
Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/2c194c1a63a9
Cleanup folder DB cache eviction logic and the debugging output. r=jorgk,rkent
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 58.0
Ever since this landed, I get shutdown crashes where the last thing I see on the console is this:

[7708, Main Thread] WARNING: some msg dbs left open: '!m_dbCache.Length()', file c:/mozilla-source/comm-central/mailnews/db/msgdb/src/nsMsgDatabase.cpp, line 84
db left open 

Note that "open" is followed by a space and the cursor hangs there.

Have you seen that, too?
Flags: needinfo?(acelists)
Sometimes I get that, but rarely. But I think we also got this before the patch.
But is  NS_WARNING_ASSERTION() supposed to crash the app? Doesn't look like that per the comment in it (bug dxr is old by 2 weeks now).
The code following it intends to list the open databases.
The stack trace is strange, as if it stops in a printf()-like function.
Flags: needinfo?(acelists)
(In reply to Jorg K (GMT+2) from comment #15)
> [7708, Main Thread] WARNING: some msg dbs left open: '!m_dbCache.Length()',
> file c:/mozilla-source/comm-central/mailnews/db/msgdb/src/nsMsgDatabase.cpp,
> line 84
> db left open 

Sorry, I didn't notice this last line. So it doesn't crash on the NS_WARNING_ASSERSION but our printf code:
printf("db left open %s\n", (const char *) pMessageDB->m_dbName.get());

I can look at that, maybe the name is nullptr at that time.
Jorg, do you still observe this crash in any way?
Flags: needinfo?(jorgk)
No, not these days since these days it crashes on bug 1437323 :-(
Flags: needinfo?(jorgk)
Comment on attachment 8906293 [details] [diff] [review]
1396655.patch v1.2

[Approval Request Comment]
Risk to taking this patch (and alternatives if risky): probably none, riding on c-c since TB 58.
Reason: To get better debugging from ESR users not jumping to 60 immediately. Reports like bug 1240722 still exist.
Attachment #8906293 - Flags: approval-comm-beta?
See Also: → 1330872
Summary: Cleanup folder DB cache eviction logic and the debugging output → Cleanup folder DB cache eviction logic and msgDBCacheManager debugging output
Comment on attachment 8906293 [details] [diff] [review]
1396655.patch v1.2

I think you mean ers52
Attachment #8906293 - Flags: approval-comm-beta? → approval-comm-esr52?
See Also: → 1240722
Attachment #8906293 - Flags: approval-comm-esr52? → approval-comm-esr52+
See Also: → 1726319
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: