Closed Bug 1774072 Opened 3 years ago Closed 3 years ago

Folder shows phantom unread messages after restart due to mismatch of MSF/DB and folderCache.json

Categories

(MailNews Core :: General, defect)

Thunderbird 102
defect

Tracking

(thunderbird_esr102 fixed, thunderbird102 affected)

RESOLVED FIXED
103 Branch
Tracking Status
thunderbird_esr102 --- fixed
thunderbird102 --- affected

People

(Reporter: rachel, Assigned: mkmelin)

References

(Blocks 1 open bug)

Details

Attachments

(2 files, 1 obsolete file)

Attached file Drafts.zip (obsolete) —

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0

Steps to reproduce:

It seems that recent backend changes have cause a variety of issues, bug 1742975, bug 1773822 and this very puzzling issue of phantom unread messages in the folder tree for folders with no new messages in them. Looks like some sort of MSF corruption.
STR:
Incorporate the attached folder as local folder.
Start TB.
Folder will show one unread message.
Open the folder, no unread messages.
Switch to another folder.
Open TB again, unread message is back.

Fun fact: The folder properties show three messages, but there are only two.

Flags: needinfo?(benc)
Blocks: tb102found

Just happened again on a local folder to which a bunch of unread messages were moved which were subsequently read in that folder. The folder comes back with 16 supposedly unread messages after a restart.

Not even a folder repair fixes the issue.

Alice, can you reproduce this and find the regression? Unpack Drafts and Drafts.msf into some local folder, or replace Drafts(.msf) in your Local Folders. Then start TB, 2 unread messages show in TB 102 beta. Visit the folder, the unread messages should go away. Then start TB again, unread messages come back. At some older version the unread messages indicator should be permanently gone.

Flags: needinfo?(alice0775)
Summary: Folder shows phantom new messages after restart → Folder shows phantom unread messages after restart

I don't see the problem. I didn't have a Drafts folder in Local Folders so created it. Then, with tb shutdown, I replace the Drafts and Drafts.msf with yours. On tb start, I see no unread on folder Drafts. On opening Drafts, neither message show unread and both display OK. On tb restart, Drafts folder shows no unread.
This is with daily 103.0a1. Haven't tried the official beta or anything else on this.

Fun-er fact: My Drafts folder shows 2 messages in properties.

Hmm, we restored the folder from the ZIP file and the unread messages didn't show any more. We had a folder that showed 16 unread messages across many many restarts with no unread messages in the folder. Eventually the false unread messages went away and restoring it from a ZIP backup (taken when the 16 unread messages showed) now doesn't show any unread messages any more. Very mysterious. Certainly, like the IMAP MSF corruption, something that is hard to catch.

Flags: needinfo?(alice0775)

The phenomenon has happened on various folders so far, mostly ones were messages are moved to via filter (no true for drafts), too bad the faulty state can't be caught. The suggestion would be to set up POP3 and filters and see what happens.

Based on this experience, the issue seems to be in the folderCache.json which stores the unread count. Likely it got out of step with the database. So set the "totalUnreadMsgs" to any count and you'll see exactly what we're seeing. No idea how to reset this count. It does eventually become good again. You can also try to set "totalMsgs" for more fun.

Summary: Folder shows phantom unread messages after restart → Folder shows phantom unread messages after restart due to mismatch of MSF/DB and folderCache.json

We've debugged this is bit. Sadly, Mozilla logging on Windows doesn't appear to flush the last block when shutting down, so we had to implement our own basic logging. On a simple profile we see this at the end of the log:

=== nsMsgAccountManager::Shutdown, nulling m_msgFolderCache
=== nsMsgFolderCache::Flush 1
=== nsMsgFolderCache::SaveFolderCache
=== nsMsgFolderCache::SaveFolderCache - at the end

Nulling the last reference to the ref-counted folder cache object destroys the object and calls the Flush() from the DTOR as designed. However, on a more complex profile, we only see the first line. So there is some other reference held to the object somewhere and hence the DTOR isn't called and nothing gets flushed. Given that TB is know not to handle databases properly leaving some open when shutting down, flushing the cache in the DTOR doesn't appear to be a robust approach.

Attachment #9281070 - Attachment is obsolete: true
Assignee: nobody → mkmelin+mozilla
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true

Our suggestion is not the only possible solution. You could also investigate why the DTOR isn't reliably run, which still appears risky. You should also consider removing the Flush() call from the DTOR. In any case, this comment here needs to be fixed:
https://searchfox.org/comm-central/rev/96e876dbe3b2875e26f357152758d16c6908af4c/mailnews/base/public/nsIMsgFolderCache.idl#42-44

(In reply to Rachel Martin from comment #11)

Our suggestion is not the only possible solution. You could also investigate why the DTOR isn't reliably run, which still appears risky. You should also consider removing the Flush() call from the DTOR.

The patch gets things working for now, but yeah, the real fix would be to sort out the shutdown issues. There seem to be a fair few of them... (searching for "thunderbird shutdown")

Flags: needinfo?(benc)

Magnus, can you address
https://searchfox.org/comm-central/rev/96e876dbe3b2875e26f357152758d16c6908af4c/mailnews/base/public/nsIMsgFolderCache.idl#42-44
a little better before landing, like:

  /**
   * Write immediately to cacheFile if any data has been changed.
   * This happens in the cache dtor anyway, but we use it during shutdown and in
   * unit testing (so tests don't have to wait for JS garbage collection).
   */

Sure.

Target Milestone: --- → 103 Branch

Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/ad179d9246a7
Flush folder cache at Account Manager shutdown. r=benc

Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED

That took down a lot of tests with application crashed [@ nsMsgAccountManager::Shutdown()].

Looks like a if (m_msgFolderCache) m_msgFolderCache->Flush(); is necessary, especially for xpcshell tests.

Thanks for taking care of this. A better commit message would be:
Don't crash during account manager shutdown if folder cache isn't initialized.

Note that it's a folder cache, not a message cache.

Regressions: 1776451

Comment on attachment 9282095 [details]
Bug 1774072 - Flush folder cache at Account Manager shutdown. r=benc

[Approval Request Comment]
Regression caused by (bug #):
User impact if declined: Unread message counts may be wrong
Testing completed (on c-c, etc.): On 103.0b1
Risk to taking this patch (and alternatives if risky):

Attachment #9282095 - Flags: approval-comm-esr102?

Comment on attachment 9282095 [details]
Bug 1774072 - Flush folder cache at Account Manager shutdown. r=benc

[Triage Comment]
Approved for esr102

It should be a solid patch, but a little hesitant to take both this and bug 1773822 at the same time, with beta being live less than 24 hours. However, by the time 102.0.1 ships the patch will have been on beta for 5-6 days. So we can decide in a few days against taking it if need be.

Attachment #9282095 - Flags: approval-comm-esr102? → approval-comm-esr102+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: