Closed Bug 1428097 Opened 2 years ago Closed Last year

Slow opening large gmail IMAP folder, even when there are no new messages in that folder. (Slow Fetch?)

Categories

(MailNews Core :: Networking: IMAP, defect)

defect
Not set

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 64.0

People

(Reporter: andysem, Assigned: gds)

References

Details

(Keywords: perf)

Attachments

(8 files, 5 obsolete files)

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:57.0) Gecko/20100101 Firefox/57.0
Build ID: 20171227151206

Steps to reproduce:

Thunderbird is very slow when updating a very large (>100k emails) folder in an IMAP (Gmail) account. When I start Thunderbird and select an email in the large folder, the status line says "Sending login information", "Opening folder ..." or "Loading message..." or something like that, the progress bar spinning, and hangs in this state for dozens of seconds or a minute before showing the email contents. I can navigate the other folders and even open emails there, but the large folder stays "locked" this way for quite a while. When it unlocks, I can click on messages in the folder and they are shown almost immediately until I click "Get Messages" or Thunderbird checks for new email on timer. When synchronization starts, the folder gets "locked" again, and I can only view messages that I clicked *before* the sync started - any messages I didn't open in this folder since Thunderbird started are not displayed, any new (unread) messages in the mailbox are not shown either in the message list until the folder gets "unlocked".

So the steps to reproduce would be roughly this:

1. Have an IMAP account with multiple folders, one of which is very large (> 100k messages). In my case, this is a mailing list folder. In my case, the account is configured as Disconnected IMAP (i.e. the messages are stored locally and synchronized with the remote server).
2. Initiate synchronization of the local cache with the remote IMAP server. This can be done by either just starting Thunderbird (it starts sync on startup) or pressing "Get Messages" or waiting until it performs the periodic synchronization.
3. Try navigating the large folder and clicking on different messages. You will notice that contents of many or all messages are not displayed in the lower section of the interface.

While in this "locked" state, Thunderbird doesn't seem to be doing anything CPU-bound (the CPU is loaded by 2-5% by Thunderbird), but it looks like it is downloading something at ~150 kbit/s rate.

What I tried and it didn't help:

- Compacting the folder
- Repairing the folder
- Deleting and re-creating the account, which involved re-downloading all messages
- In the account settings, configuring "Synchronize the most recent" to 3 months or 30 days (at this point I already downloaded all messages from server)

I do need all messages locally so that I'm able to search for older messages from time to time, so not downloading and keeping messages in the local cache is not a solution for me.

Kubuntu 17.10 x86_64, Thunderbird 52.5.0, 64-bit.



Actual results:

The synchronization with the remote IMAP server apparently takes a lot of time (tens of seconds) and messages in the local folder are inaccessible for all this time.


Expected results:

I don't think that checking for new messages on the remote server should take that long. Even if it does, I would expect that the locally stored messages that were downloaded previously to be fully accessible while sync is in progress.
So it is slow in the "get new mail" process, not so much actually clicking on the folder, right?

> 1. Have an IMAP account with multiple folders, one of which is very large (> 100k messages). In my case, this is a mailing list folder. In my case, the account is configured as Disconnected IMAP (i.e. the messages are stored locally and synchronized with the remote server).

So you have something  on the imap folder filtering mailing messages to a non-Inbox folder?
Flags: needinfo?(andysem)
Keywords: perf
Summary: Slow on large IMAP folders → Slow syncing large IMAP folder
See Also: → 1428102
> So it is slow in the "get new mail" process, not so much actually clicking on the folder, right?

The slow part, I assume, is the synchronization with the remote server. But until this synchronization is complete, Thunderbird doesn't display messages from the problematic folder and this is the most critical part of the problem.

Clicking on folders and messages works, the UI is not blocked.

> So you have something  on the imap folder filtering mailing messages to a non-Inbox folder?

There are server-side filters configured in Gmail that move inbound mail from the mailing list to the corresponding folder inside Inbox. No filters on Thunderbird side.
Flags: needinfo?(andysem)
I've collected IMAP log from Thunderbird and it looks like it really fetches all 100k messages from the server on every update. The log contains lines like these for every message in the folder:

2018-01-04 23:08:27.373990 UTC - 1697613568[7f46612e4420]: ReadNextLine [stream=5875da00 nb=33 needmore=0]
2018-01-04 23:08:27.374024 UTC - 1697613568[7f46612e4420]: 553ea800:imap.googlemail.com:S-INBOX/boost/boost-dev:CreateNewLineFromSocket: * 1 FETCH (UID 1 FLAGS (\Seen))
2018-01-04 23:08:27.374057 UTC - 1697613568[7f46612e4420]: ReadNextLine [stream=5875da00 nb=33 needmore=0]
2018-01-04 23:08:27.374068 UTC - 1697613568[7f46612e4420]: 553ea800:imap.googlemail.com:S-INBOX/boost/boost-dev:CreateNewLineFromSocket: * 2 FETCH (UID 2 FLAGS (\Seen))
2018-01-04 23:08:27.374081 UTC - 1697613568[7f46612e4420]: ReadNextLine [stream=5875da00 nb=33 needmore=0]
2018-01-04 23:08:27.374089 UTC - 1697613568[7f46612e4420]: 553ea800:imap.googlemail.com:S-INBOX/boost/boost-dev:CreateNewLineFromSocket: * 3 FETCH (UID 3 FLAGS (\Seen))
...  [lots of similar lines here] ...
2018-01-04 23:09:00.626970 UTC - 1697613568[7f46612e4420]: ReadNextLine [stream=5875da00 nb=43 needmore=0]
2018-01-04 23:09:00.626988 UTC - 1697613568[7f46612e4420]: 553ea800:imap.googlemail.com:S-INBOX/boost/boost-dev:CreateNewLineFromSocket: * 103147 FETCH (UID 103149 FLAGS (\Seen))
2018-01-04 23:09:00.627013 UTC - 1697613568[7f46612e4420]: ReadNextLine [stream=5875da00 nb=43 needmore=0]
2018-01-04 23:09:00.627024 UTC - 1697613568[7f46612e4420]: 553ea800:imap.googlemail.com:S-INBOX/boost/boost-dev:CreateNewLineFromSocket: * 103148 FETCH (UID 103150 FLAGS (\Seen))
2018-01-04 23:09:00.627037 UTC - 1697613568[7f46612e4420]: ReadNextLine [stream=5875da00 nb=15 needmore=0]
2018-01-04 23:09:00.627047 UTC - 1697613568[7f46612e4420]: 553ea800:imap.googlemail.com:S-INBOX/boost/boost-dev:CreateNewLineFromSocket: 70 OK Success

After that follow lines that look like fetching a particular message from the server. Notice that the whole sequence of fetches took more than half a minute.
(In reply to andysem from comment #3)
> I've collected IMAP log from Thunderbird and it looks like it really fetches
> all 100k messages from the server on every update.

was afraid of that..
Component: Untriaged → Networking: IMAP
Product: Thunderbird → MailNews Core
Version: 52 Branch → 52
(In reply to andysem from comment #3)
> After that follow lines that look like fetching a particular message from
> the server. Notice that the whole sequence of fetches took more than half a
> minute.
What do you expect us to do about a slow connection or slow server?
As I said in the description, most importantly, I would like that synchronization does not block displaying messages that were downloaded before. I.e. even if TB is fetching the 100k messages from server, it should still allow me to read mail that were downloaded before (including during the previous runs of TB).

Second, I don't think that fetching 100k messages is necessary. If there's no way to fetch only changes from the server, I would like if there was a configurable limit for me to set to only synchronize the most recent messages and a separate option (a button) to force full synchronization.

I don't know IMAP, so if there is a more efficient way of synchronization like batching or requesting only the changes on the server, that could help as well.
Just for the record:

(In reply to andysem from comment #3)
> I've collected IMAP log from Thunderbird and it looks like it really fetches
> all 100k messages from the server on every update. The log contains lines

> * 1 FETCH (UID 1 FLAGS (\Seen))
[...]
> * 103148 FETCH (UID 103150 FLAGS (\Seen))

That is no message download. TB just fetches the Flags to synchronize the folder content.
This tells TB which mails have been deleted since the last connection.

> After that follow lines that look like fetching a particular message from
> the server.

There, the new mails were loaded.
1. Is it slow even when there are no new messages in that folder?
2. Do you have any special config settings for causing this folder to be checked for new mail, as described at http://kb.mozillazine.org/How_do_I_check_for_new_messages_in_other_folders ?


xref bug 324710
Flags: needinfo?(andysem)
1. Yes.
2. The folder is not marked in the properties but I found mail.server.default.check_all_folders_for_new = true in prefs.js. I do indeed want this and other folders checked for new mail.

In addition to the ways to initiate synchronization I mentioned earlier, apparently, the sync also begins if I just select the folder in the folder tree. So in my case the slow sync affects reading messages even if you're just switching between folders, whether there are new messages or other changes in the folders or not.
Flags: needinfo?(andysem)
I've found this article: https://www.ghacks.net/2013/09/07/fix-gmail-imap-slows-thunderbird-mails-arriving-timely-fashion/

I did the opposite it recommends (i.e. I enabled CONDSTORE by adding mail.server.default.use_condstore = true in prefs.js; the article says it should be disabled by default) and it seems like the synchronization is now much much faster. I didn't test it very much, so I don't know if this mode of operation has any problems. But it might be a useful insight on a possible solution/optimization.
Yeah, condstore was disabled in bug 912216 because it had been causing problems.  So, you may see some side effects - see the bugs blocking bug 912216

Note mail.server.default.check_all_folders_for_new = true hurts performance (to what degree I'm not certain)
Summary: Slow syncing large IMAP folder → Slow opening large gmail IMAP folder. (Slow Fetch?) OK with condstore enabled
Does your protocol log perhaps show we are hitting bug 46503?
Flags: needinfo?(andysem)
> Does your protocol log perhaps show we are hitting bug 46503?

I'm not sure, but I don't think so. I don't understand much of the logs, sorry.

When the sync starts, I can see the following lines:

2018-01-08 14:05:54.825323 UTC - -1689286912[7ff689b84420]: 84405800:imap.googlemail.com:S-INBOX/std-proposals:SendData: 64 select "INBOX/boost/boost-dev"
2018-01-08 14:05:55.923318 UTC - -1689286912[7ff689b84420]: ReadNextLine [stream=869665e0 nb=150 needmore=0]
2018-01-08 14:05:55.923360 UTC - -1689286912[7ff689b84420]: 84405800:imap.googlemail.com:S-INBOX/boost/boost-dev:CreateNewLineFromSocket: * FLAGS (\Answered \Flagged \Draft \Deleted \Seen $ATTACHMENT $Forwarded $NotPhishing $Phishing $REPLIED $SENT $SIGNED Junk NonJunk receipt-handled)
2018-01-08 14:05:55.923375 UTC - -1689286912[7ff689b84420]: ReadNextLine [stream=869665e0 nb=184 needmore=0]
2018-01-08 14:05:55.923380 UTC - -1689286912[7ff689b84420]: 84405800:imap.googlemail.com:S-INBOX/boost/boost-dev:CreateNewLineFromSocket: * OK [PERMANENTFLAGS (\Answered \Flagged \Draft \Deleted \Seen $ATTACHMENT $Forwarded $NotPhishing $Phishing $REPLIED $SENT $SIGNED Junk NonJunk receipt-handled \*)] Flags permitted.
2018-01-08 14:05:55.923391 UTC - -1689286912[7ff689b84420]: ReadNextLine [stream=869665e0 nb=42 needmore=0]
2018-01-08 14:05:55.923395 UTC - -1689286912[7ff689b84420]: 84405800:imap.googlemail.com:S-INBOX/boost/boost-dev:CreateNewLineFromSocket: * OK [UIDVALIDITY 605300024] UIDs valid.
2018-01-08 14:05:55.923419 UTC - -1689286912[7ff689b84420]: ReadNextLine [stream=869665e0 nb=17 needmore=0]
2018-01-08 14:05:55.923424 UTC - -1689286912[7ff689b84420]: 84405800:imap.googlemail.com:S-INBOX/boost/boost-dev:CreateNewLineFromSocket: * 103171 EXISTS
2018-01-08 14:05:55.923431 UTC - -1689286912[7ff689b84420]: ReadNextLine [stream=869665e0 nb=12 needmore=0]
2018-01-08 14:05:55.923435 UTC - -1689286912[7ff689b84420]: 84405800:imap.googlemail.com:S-INBOX/boost/boost-dev:CreateNewLineFromSocket: * 0 RECENT
2018-01-08 14:05:55.923439 UTC - -1689286912[7ff689b84420]: ReadNextLine [stream=869665e0 nb=43 needmore=0]
2018-01-08 14:05:55.923443 UTC - -1689286912[7ff689b84420]: 84405800:imap.googlemail.com:S-INBOX/boost/boost-dev:CreateNewLineFromSocket: * OK [UIDNEXT 103174] Predicted next UID.
2018-01-08 14:05:55.923448 UTC - -1689286912[7ff689b84420]: ReadNextLine [stream=869665e0 nb=30 needmore=0]
2018-01-08 14:05:55.923452 UTC - -1689286912[7ff689b84420]: 84405800:imap.googlemail.com:S-INBOX/boost/boost-dev:CreateNewLineFromSocket: * OK [HIGHESTMODSEQ 9527481]
2018-01-08 14:05:55.923457 UTC - -1689286912[7ff689b84420]: ReadNextLine [stream=869665e0 nb=62 needmore=0]
2018-01-08 14:05:55.923461 UTC - -1689286912[7ff689b84420]: 84405800:imap.googlemail.com:S-INBOX/boost/boost-dev:CreateNewLineFromSocket: 64 OK [READ-WRITE] INBOX/boost/boost-dev selected. (Success)
2018-01-08 14:05:55.926653 UTC - -1689286912[7ff689b84420]: 84405800:imap.googlemail.com:S-INBOX/boost/boost-dev:SendData: 65 getquotaroot "INBOX/boost/boost-dev"
2018-01-08 14:05:56.085438 UTC - -1689286912[7ff689b84420]: ReadNextLine [stream=869665e0 nb=40 needmore=0]
2018-01-08 14:05:56.085496 UTC - -1689286912[7ff689b84420]: 84405800:imap.googlemail.com:S-INBOX/boost/boost-dev:CreateNewLineFromSocket: * QUOTAROOT "INBOX/boost/boost-dev" ""
2018-01-08 14:05:56.085524 UTC - -1689286912[7ff689b84420]: ReadNextLine [stream=869665e0 nb=39 needmore=0]
2018-01-08 14:05:56.085534 UTC - -1689286912[7ff689b84420]: 84405800:imap.googlemail.com:S-INBOX/boost/boost-dev:CreateNewLineFromSocket: * QUOTA "" (STORAGE 2487368 15728640)
2018-01-08 14:05:56.085697 UTC - -1689286912[7ff689b84420]: ReadNextLine [stream=869665e0 nb=15 needmore=0]
2018-01-08 14:05:56.085711 UTC - -1689286912[7ff689b84420]: 84405800:imap.googlemail.com:S-INBOX/boost/boost-dev:CreateNewLineFromSocket: 65 OK Success
2018-01-08 14:05:56.085731 UTC - -1689286912[7ff689b84420]: 84405800:imap.googlemail.com:S-INBOX/boost/boost-dev:SendData: 66 UID fetch 1:* (FLAGS)
2018-01-08 14:05:56.290681 UTC - -1689286912[7ff689b84420]: ReadNextLine [stream=869665e0 nb=33 needmore=0]
2018-01-08 14:05:56.290722 UTC - -1689286912[7ff689b84420]: 84405800:imap.googlemail.com:S-INBOX/boost/boost-dev:CreateNewLineFromSocket: * 1 FETCH (UID 1 FLAGS (\Seen))
2018-01-08 14:05:56.290748 UTC - -1689286912[7ff689b84420]: ReadNextLine [stream=869665e0 nb=33 needmore=0]
2018-01-08 14:05:56.290759 UTC - -1689286912[7ff689b84420]: 84405800:imap.googlemail.com:S-INBOX/boost/boost-dev:CreateNewLineFromSocket: * 2 FETCH (UID 2 FLAGS (\Seen))
2018-01-08 14:05:56.290769 UTC - -1689286912[7ff689b84420]: ReadNextLine [stream=869665e0 nb=33 needmore=0]
2018-01-08 14:05:56.290777 UTC - -1689286912[7ff689b84420]: 84405800:imap.googlemail.com:S-INBOX/boost/boost-dev:CreateNewLineFromSocket: * 3 FETCH (UID 3 FLAGS (\Seen))
2018-01-08 14:05:56.290786 UTC - -1689286912[7ff689b84420]: ReadNextLine [stream=869665e0 nb=43 needmore=0]
2018-01-08 14:05:56.290794 UTC - -1689286912[7ff689b84420]: 84405800:imap.googlemail.com:S-INBOX/boost/boost-dev:CreateNewLineFromSocket: * 4 FETCH (UID 4 FLAGS (\Answered \Seen))
2018-01-08 14:05:56.290805 UTC - -1689286912[7ff689b84420]: ReadNextLine [stream=869665e0 nb=33 needmore=0]
2018-01-08 14:05:56.290813 UTC - -1689286912[7ff689b84420]: 84405800:imap.googlemail.com:S-INBOX/boost/boost-dev:CreateNewLineFromSocket: * 5 FETCH (UID 5 FLAGS (\Seen))

and so on for all messages in the folder. This sequence only shows once per sync (i.e. I don't see it fetching 100k messages twice).
Flags: needinfo?(andysem)
> Yeah, condstore was disabled in bug 912216 because it had been causing problems.  So, you may see some side effects - see the bugs blocking bug 912216

It looks like the problems are:

1. Notifications of new messages not working reliably. https://bugzilla.mozilla.org/show_bug.cgi?id=885220
2. Messages deleted on the server may not appear deleted in TB. https://bugzilla.mozilla.org/show_bug.cgi?id=1123094
3. New messages received on the server inside folders may not appear in TB or appear after a long delay. https://bugzilla.mozilla.org/show_bug.cgi?id=693204

Is that basically all of them?

Regarding 1, I don't believe notifications worked reliably for me without CONDSTORE either. Often I would have TB running in the background and not see any popups about new messages and then bring TB window forward to see that there actually are new messages. I suspect that this may be caused by the fact that my phone also checks for new mail periodically and it may affect how TB perceive messages as new and worth notifying about. This is probably a bug, but it doesn't look like it is caused by CONDSTORE (not exclusively by it, anyway).

Regarding 2 and 3, I did some limited testing. I tried sending new messages from a non-Gmail account so that they are received in Inbox or a folder inside Inbox. I wanted to see how TB, with CONDSTORE enabled, detects various events, such as:

- receipt of a new unread message to the server
- marking the message as read on the server (through the web interface of Gmail)
- deleting the message on the server (through the web interface of Gmail)

In all these cases, TB would notice the event after it performs a periodic synchronization with the server or I perform the sync manually (by clicking on the "Get Messages" button or selecting the folder). It seems like new messages are not detected instantly by TB (that's what IDLE is supposed to do, right?), or at least not always are. But that is still ok if the periodic sync gets the 
job done eventually. It is definitely more preferable to me than waiting 30 seconds to just read a message.

I think I will keep CONDSTORE enabled for now and see if any other problems appear.
After a few days of testing I can tell that enabling CONDSTORE does not fully solve the problem. It looks like TB will still fetch flags for all messages in the folder in some cases, which causes to the same kind of lockups that I described without CONDSTORE.

I'm not sure what are the exact situations when TB does the full sync. It doesn't happen just on any folder selection, periodical or manual synchronization. Maybe it happens when new messages arrive and TB performs periodical sync.
Summary: Slow opening large gmail IMAP folder. (Slow Fetch?) OK with condstore enabled → Slow opening large gmail IMAP folder. (Slow Fetch?)
> Maybe it happens when new messages arrive and TB performs periodical sync.

Yes, I've now confirmed this.
Gene, a lot of meat here - appears to be well documented and reproducible. What do you think of similar symptoms in bug 589310, and this bug?  N.B. comment 15

This is also similar to another bug you recently looked at?
Severity: normal → major
Flags: needinfo?(gds)
Summary: Slow opening large gmail IMAP folder. (Slow Fetch?) → Slow opening large gmail IMAP folder, even when there are no new messages in that folder. (Slow Fetch?)
There is another bug possibly from the same reporter, I think. I say this because his log shows a "boost" folder under INBOX for gmail. I don't remember the details of what the problem was. Well, just searched for it: Bug 1428102, not quite the same.
The reporter should check the setting for mail.server.serverX.force_select. It should not be enabled for gmail. This could cause more flag checking and could slow down access to huge mailboxes. It should be auto or no-auto but not yes or yes-auto for gmail. Probably a long-shot.
I don't have any mailboxes with anywhere near 100k messages, maybe 5k so I don't see a problem. Maybe I could download a mailing list archive and obtain a lot of messages?
In my case I have mail.server.default.force_select = auto, mail.server.server1.force_select = no-auto, mail.server.server3.force_select = no-auto and still have the problem.
Your force_select settings should be ok as is, I thought it was just something worth checking. 
So are the 100k message mostly in the inbox/boost/boost.dev mailbox? Yes, I think that is the mailing list you mention. You might enable the "timestamp" logging module along with IMAP:5 and see how long it takes to fetch the flags for the many messages in inbox/boost/boost.dev. (The log fragment above doesn't show timestamps.)
When a folder is selected (clicked on), tb always asks for the flags for each message in the folder. If there are 100k messages, it might take a while. You might try moving the older messages to a new folder and make sure it is not checked for new mail when new mail is fetched. This would reduced the fetch time for flags when the folder with newer and probably most relevant messages are selected. The "archived" older messages would still be searchable.
Another thing to check is if mail.imap.use_status_for_biff is true which is default. If false, this will cause new mail check to use imap select instead of imap status. Select will, as mentioned above, cause a lot more network activity like you report.
Flags: needinfo?(gds)
> So are the 100k message mostly in the inbox/boost/boost.dev mailbox?

There are >100k messages *just* in the boost-dev folder. There are other multi-thousand and tens-thousand folders, the problem is less apparent with those.

> how long it takes to fetch the flags for the many messages

I've watched the log while I was experimenting. For the majority of the delay Thunderbird was fetching flags.

> You might try moving the older messages to a new folder and make sure it is not checked for new mail when new mail is fetched.

This is too inconvenient for me to use. Occasionally, I need to search though messages in the folder. I don't use the search in TB for this but only the quick filter as it is more convenient to use. Also, managing two folders adds burden on me, and as other folders approach 100k messages, the burden will increase.

Sorry, but I need a proper solution to the problem, not a workaround. As a workaround I could already use the web interface of Gmail. Would it be possible to not fetch flags for every message? Apparently, even if enabled, CONDSTORE is not always used to synchronize flags.

> Another thing to check is if mail.imap.use_status_for_biff is true which is default.

mail.imap.use_status_for_biff is true.
(In reply to andysem from comment #22)
>...
> This is too inconvenient for me to use. Occasionally, I need to search
> though messages in the folder. I don't use the search in TB for this but
> only the quick filter as it is more convenient to use. Also, managing two
> folders adds burden on me, and as other folders approach 100k messages, the
> burden will increase.

(I know this is a workaround) enter, virtual folders.  I've used these to special case search scenarios and quick filter works great.
Reporter, I think you are saying enabling CONDSTORE doesn't help for the many emails in your INBOX/Boost/boost-dev mailbox on gmail. Tb still fetches all the flag and takes a long time. Is that your main problem? Also, maybe you see problems with email notification in that folder?

Anyhow, do you know if it is possible for me to obtain the mailing list files that I can import to my tb and test this myself? I will look into that possibility. (No folder I currently have has anywhere near 100k emails!)
Flags: needinfo?(andysem)
> Tb still fetches all the flag and takes a long time. Is that your main problem?

Yes, that seems to be the problem.

> Also, maybe you see problems with email notification in that folder?

Maybe. I can't say for sure if that folder is any special because new mail notifications are working unreliably in general. I often open TB and it has new mail (in INBOX/Boost/boost-dev or elsewhere) while there was no notification.

> Anyhow, do you know if it is possible for me to obtain the mailing list files that I can import to my tb and test this myself?

I don't think I'm at liberty to publish that correspondence. You understand there are lots of emails, I'm not sure I can publish all of them.

Also, to reproduce the issue the emails have to be on the server, not just in the local TB cache. I think, you could generate lots of emails in a test mailbox with a script or something.
Flags: needinfo?(andysem)
Yes, a script might work too. Or maybe I don't really need 100k emails to test. :}

I assume you are on linux. Same here. I also have noticed that sometimes there is no notification pop-up for new email. By notification I assume you mean the pop-up message saying you've mail and not just the highlighted unread count number beside the folder?
> I assume you are on linux.

Yes, Kubuntu 18.04.

> By notification I assume you mean the pop-up message saying you've mail and not just the highlighted unread count number beside the folder?

Yes. The number of unread messages beside the folder name works fine, although, naturally, it updates with a significant delay since it takes time to fetch flags for INBOX/Boost/boost-dev.
Since your last comment 27 I have been working on the problem and I think I am getting a handle on it. I obtained a couple of mbox files with random messages that I imported and got up to about 5000 messages in my gmail folder. Not nearly as much as your 100,000 messages but still a lot.

Without CONDSTORE (CS) enabled, all the flags are fetched for each message at startup. This can't be avoided in order to do a correct sync. Therefore, a solution to this bug will required having CS enabled.

CONDSTORE should help with the problem but, as you observed, it doesn't always help. You may see an initial improvement with CS enabled, but if new mails become present or are expunged at your boost-dev mailbox while tb is shutdown, all 100k flags will be fetched again on tb startup. A well known limitation of CS is that it does not detect expunged message. Therefore, if a message is expunge while tb is shut down, a full flag sync is definitely needed to determine which messages were expunged. But if only new messages were added or a flag (read/unread, starred etc) was changed on an existing message, then a full sync is not really needed. The current tb code does not distinguish between these cases and does a full sync if the message count has changed for any reason. I have modified the code to determine when only new mails are added so that the full sync can be avoided unless messages were also expunged. A patch is forthcoming. 

Another problems with CONDSTORE implementation in tb is that a message expunged by another client is not always detected in tb even when both clients are online. This is pointed out in Bug 1124569 and I have independently verified and fixed it. Since this is not directly related to this bug, I will probably submit the patch for this with Bug 1124569, although the changes are somewhat intertwined. 

These proposed changes don't directly address the multitasking and performance issues that you point out. However, avoiding the unnecessary flag fetches that occur should improve the performance. A future optimization to speed up the fetches needed to determine what was expunged might be to do an imap "search uid all" instead of a full flag fetch when CS is enabled. This will return just a list of existing message UIDs that would be much few bytes transferred, possibly fewer round trips and simpler and faster parsing than a full flag fetch.
Thanks for looking into this. Once the patches go upstream and if there is a nightly build of TB somewhere I will be happy to try it out.
Assignee: nobody → gds
Severity: major → normal
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
This is the raw prototype changes to fix this bug and Bug 1124569 combined containing lots of debug printfs and other temporary and alternate constructs. It is just for reference and does not require formal review, but any comments are welcome. My intention is to clean this up and break it into two separate patches addressing the two bugs independently.
I had to revise my original idea on this some but it now does what I want.

Jorg, Would it be possible to do a "try" build with this change so maybe the reporter can test it before any review changes are requested?

This should now only do a massive fetch of the reporter's 100k flags if a message was expunged by another client while tb is shutdown. With a mailing list archive like the user has, this probably wouldn't happen often. 

New messages arriving while tb is shutdown would probably be common. This fixes it so that all flags are not re-fetch and only the new flags and messages are downloaded.

In addition, without this fix, if just a flag is changed by another client while tb is shutdown (e.g., message marked read) the message flags at all messages above the changed message were re-downloaded unnecessarily.

I have also added a new logging type, called IMAP_CONDSTORE, that will help with testing.

Also, this fix requires that CONDSTORE pref is enabled.
Attachment #9007522 - Attachment is obsolete: true
Attachment #9007920 - Flags: review?(jorgk)
Here is a try build based on TB 60, that might be better than having to used a Daily. If no good, please let me know and I can do one for Daily:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=6df344a44c7cf5448c0a49ffe1207405f8568e83

You know how to pick the executables off that build, right? Click the "B" and download it.
I click the "B" and see info pop up at the bottom. Only thing I see to download is zip of what looks like all the comm files, link to the left of "B" called 6df344a44c7c. But don't see a tb executable in them. My build at home puts my tb executable at /home/gene/mozilla/obj-x86_64-pc-linux-gnu/dist/bin/thunderbird.
Ok, I finally see it. After clicking Linux's B, it's at the bottom under category "Job Details", item "target.tar.bz2" (see link below to download). I downloaded and and extracted it directly to ~/Downloads/. Then I run it in a terminal with bash command:
MOZ_LOG=IMAP_CONDSTORE:5,sync,timestamp MOZ_LOG_FILE=  /home/gene/Downloads/thunderbird/thunderbird -no-remote -profilemanager 2>&1 | tee ~/tb.log | grep "IMAP_CONDSTORE"

Looks like it's working since I see the expected IMAP_CONDSTORE log prints in the terminal and in ~/tb.log.

Andysem, Since you are on linux too, you can get the .tar.bz2 from this link and use it the same way without disturbing your official tb installation:
https://queue.taskcluster.net/v1/task/ACuD97u-TRuYkHL2bwZbIA/runs/0/artifacts/public/build/target.tar.bz2
andysem, When you watch the IMAP_CONDSTORE log messages and you see a message that says:

D/IMAP_CONDSTORE Other client expunged msgs, do full fetch to removed expunged msgs

it will indicate that a full flag fetch is occurring. You shouldn't see this if the fix is working right (unless a message was actually expunged while tb was shutdown). You *should* see messages like these below and, except for the last message listed, only at startup:

D/IMAP_CONDSTORE Sanity, #deleted(0), #prevExists(3), #exists(3)
D/IMAP_CONDSTORE Sanity, #addedByPartialFetch(0)
D/IMAP_CONDSTORE numToCheck = 0
D/IMAP_CONDSTORE Avoid unneeded fetches after just flag changes
D/IMAP_CONDSTORE Sanity, #deleted(0), #prevExists(50), #exists(50)
D/IMAP_CONDSTORE Sanity, #addedByPartialFetch(0)
D/IMAP_CONDSTORE numToCheck = 0
D/IMAP_CONDSTORE Avoid unneeded fetches after just flag changes
D/IMAP_CONDSTORE Check for new messages above UID=732   <-- occurs on each check for new mail in folder

Let me know if this helps.  Thanks!
I can see messages "Other client expunged msgs, do full fetch to removed expunged msgs" when a new email arrives and I check for new mail by clicking "Get Messages". Maybe it also appears when TB performs a periodic check.
You should only see "Check for new messages above UID=*" on the periodic check or manual check and nothing else with the log tag IMAP_CONDSTORE.
What do you see when you click "get new messages"?

You may need to enable standard IMAP logging to see which folder is getting fetched, or possibly you could use wireshark to see what is happening fully. This will enable both log types:

MOZ_LOG=IMAP_CONDSTORE:5,IMAP:5,sync,timestamp MOZ_LOG_FILE=  <your path to thunderbird>/thunderbird -no-remote -profilemanager 2>&1 | tee ~/tb.log | grep "IMAP_CONDSTORE"

Restart tb and then attach to this bug at least the IMAP_CONDSTORE log lines from the log. Also, if you could edit the log file with what was happening when the events were recorded it would definitely help.

This bug primarily fixes sync with CONDSTORE for server changes that occur while tb is shutdown or offline. So you should usually only see the CONDSTORE logs items right after tb starts up (except for check for new messages...).

Note: If you have accounts other than gmail they will also show up in the CONDSTORE (and IMAP) log. So the expunge message you saw may be coming from another account where an expunge may have actually occurred.

Anyhow, do you see any improvement in the performance with this change?
With tb running, I haven't seen any problems detecting new messages with IDLE or with "Get messages" button and this bug fix should not affect that. 

One problem with IDLE is that typically the IMAP server will disconnect after 30m if there is no activity and tb does not automatically reconnect. So the workaround for this is to set the periodic check for new message a few minutes shorter than the server timeout, such as 25m. This will ensure that the IDLE connection stays up. 

Another problem with IDLE is that, since there are, by default, 5 connections per account, only 5 folders can be monitored for IDLE. It appears that the selection for which folder gets monitored is somewhat random (maybe always INBOX is monitored, not sure and the currently selected folder), so it is possible that your folder of interest, e.g., boost-dev, is not being monitored. I think the workaround for this is to set folder property "always check folder for new mail" so that the periodic check will still detect new entries; but this will not be instant like if IDLE were in effect. 

So new features are needed:
1. Reconnect if an IDLE connection is dropped by the server and re-establish IDLE.
2. Add a new folder property that ensures that the folder is always monitored by IDLE, limited to a max of 5 (or whatever the advance server setting contains) folders having the setting.
(In reply to gene smith from comment #38)

> One problem with IDLE is that typically the IMAP server will disconnect
> after 30m 

I don't know, how the usual servers handle this timeout (<30 minutes is a suggestion from the IMAP-RFC). But in first place it is TB who's cutting the connection. You may set mail.server.server<n>.timeout to any value you want(default is 15 IIRC), but TB will lower it to 29 when he connects to that server.

>  if there is no activity and tb does not automatically reconnect.
> So the workaround for this is to set the periodic check for new message a
> few minutes shorter than the server timeout, such as 25m. This will ensure
> that the IDLE connection stays up. 

That's the usual workaround. No matter, who cuts the connection.

> 2. Add a new folder property that ensures that the folder is always
> monitored by IDLE, limited to a max of 5 (or whatever the advance server
> setting contains) folders having the setting.

Account -> server-setting -> advanced -> Maximum number of server connections to cache
(mail.server.server<n>.max_cached_connections)
handles this. 

There is/was another pref:
mail.imap.max_cached_connections
But I don't know, if it is still active (total max number???) or if it is obsolete.
(In reply to Alfred Peters from comment #39)
> (In reply to gene smith from comment #38)
> 
> > One problem with IDLE is that typically the IMAP server will disconnect
> > after 30m 
> 
> I don't know, how the usual servers handle this timeout (<30 minutes is a
> suggestion from the IMAP-RFC). But in first place it is TB who's cutting the
> connection. You may set mail.server.server<n>.timeout to any value you
> want(default is 15 IIRC), but TB will lower it to 29 when he connects to
> that server.

Specifically, the main imap RFC 3501 says that if the imap server has an inactivity logout timer, it MUST be at least 30m. The IDLE RFC 2177 recommends that the client terminate the IDLE after 29m and then reestablish the IDLE to keep from being probably timed out. (I think sending a NOOP at this point would also be useful to ensure new messages are detected.)

I just checked again how this works using the gmail server with the periodic check for new email disabled and idle enabled using wireshark and linux command lsof. What I see with gmail is that after the IDLE command is sent by tb and if nothing occurs to cause gmail to respond to the IDLE (e.g., no new mail), then after about 32m gmail sends the FIN and tb ACKs it. This leaves the connection in CLOSE_WAIT state per lsof command  "lsof -i4 -i6 -itcp -P | grep thunder". When another folder or an email is accessed by tb user, tb sends a FIN and gmail sends ACK or RST it to complete the disconnect, then a new connection is initiated by tb and IDLE is enabled on the new connection for the selected folder.

For each account/server there does exist a mail.server.serverX.timeout that gets set automatically to "29" (minutes I assume). However, I don't see that this ever does anything. It should terminate IDLE, maybe send a NOOP and then reestablish IDLE at the 29m point, but I don't see this happening. Therefore, IDLE is broken after the imap server sends the FIN at about the 30m point (until the user does something to cause the connection and IDLE to be reestablish).

> 
> >  if there is no activity and tb does not automatically reconnect.
> > So the workaround for this is to set the periodic check for new message a
> > few minutes shorter than the server timeout, such as 25m. This will ensure
> > that the IDLE connection stays up. 
> 
> That's the usual workaround. No matter, who cuts the connection.

Yes, not ideal but a workaround.

> 
> > 2. Add a new folder property that ensures that the folder is always
> > monitored by IDLE, limited to a max of 5 (or whatever the advance server
> > setting contains) folders having the setting.
> 
> Account -> server-setting -> advanced -> Maximum number of server
> connections to cache
> (mail.server.server<n>.max_cached_connections)
> handles this. 
> 
> There is/was another pref:
> mail.imap.max_cached_connections
> But I don't know, if it is still active (total max number???) or if it is
> obsolete.

Yes, advanced server setting let the user set this per account with default of 5. However, if you have 20 folders and you are filtering new mailing into them and you want instant notification when there is new email, you would have to set this to 20 cached connections or more to be sure that each interesting folder produces a quick notification (increased unread count in bold). Some servers may complain if there are too many connections.
So, assuming 5 connection, the folder property should allow an IDLE connection for up to the user's 5 highest priority folders. The remaining 15 would have to use a polled new mail check set to an acceptably low check interval. The selected high priority folders that use IDLE should (ideally) not need to poll at all except at the 29m point to keep the connection alive.
In debugging various problems, I have noticed that whatever folder is select does IDLE, even Junk and Trash. Sometimes INbox has IDLE stolen. I'm not sure how tb decides which folders use IDLE but, other than the currently selected folder, it seems kind of random. That's why I think it would be good to allow the user the select which folders will always use IDLE.
(In reply to Alfred Peters from comment #39)
> (In reply to gene smith from comment #38)
> 
> > One problem with IDLE is that typically the IMAP server will disconnect
> > after 30m 
> 
> I don't know, how the usual servers handle this timeout (<30 minutes is a
> suggestion from the IMAP-RFC). But in first place it is TB who's cutting the
> connection. You may set mail.server.server<n>.timeout to any value you
> want(default is 15 IIRC), but TB will lower it to 29 when he connects to
> that server.

It looks like you are theoretically right about tb cutting the connection. The function that checks for and handles the 29m timeout does a disconnect and kills the thread. I don't see anywhere that it does what it should do and stop and restart the IDLE.

Here's the function that checks the timeout and does the disconnect:
https://dxr.mozilla.org/comm-central/source/comm/mailnews/imap/src/nsImapIncomingServer.cpp#677

This will set the pref back to 29 if you try to set it above 29 with config editor. But the way I think it works in real time is that if you try to "steal" and reuse the connection when you select another folder and the connection was quiet for at least 29 minutes (e.g., waiting for IDLE response), then it disconnects and forces a new connection to be opened for the selected folder. Not sure how useful this is.

Anyhow, guess this discussion should be forked into a new bug and/or I think there are existing bugs that touch on these issues.
(In reply to gene smith from comment #37)
> You should only see "Check for new messages above UID=*" on the periodic
> check or manual check and nothing else with the log tag IMAP_CONDSTORE.
> What do you see when you click "get new messages"?

As I said, when a new message arrived and I clicked "Get Messages", I got "Other client expunged msgs" message in the log. I sent myself a message from another account to test this. That new message was received in the Inbox, not the large >100k folder, though. I'm sure I didn't delete any messages on the account.

Also, I started TB without "-no-remote -profilemanager" arguments, if it matters.

> You may need to enable standard IMAP logging to see which folder is getting
> fetched, or possibly you could use wireshark to see what is happening fully.

I will collect more logs later.

> Note: If you have accounts other than gmail they will also show up in the
> CONDSTORE (and IMAP) log. So the expunge message you saw may be coming from
> another account where an expunge may have actually occurred.

Indeed, I have multiple accounts set up in TB, but I'm sure no messages were deleted on the server while I was testing. Could TB falsely detect message deletion?

> Anyhow, do you see any improvement in the performance with this change?

I can't tell yet from my short experiment yet. The main problem is with my mailing list folder, and I didn't test how was the delay when I receive a message from the mailing list.
(In reply to gene smith from comment #38)
> With tb running, I haven't seen any problems detecting new messages with
> IDLE or with "Get messages" button and this bug fix should not affect that. 

In general, I don't rely on IDLE and could as well live with it disabled. What I do rely on is the periodic checks for the new mail and I expect them to always check all folders, unless specifically disabled. I did not disable checks for new mail on any of my folders.
(In reply to andysem from comment #42)
> 
> Also, I started TB without "-no-remote -profilemanager" arguments, if it
> matters.

No doesn't matter. It's OK to not use these. I think without these it just uses your usual/default profile instead of giving you a selection.

> 
> Indeed, I have multiple accounts set up in TB, but I'm sure no messages were
> deleted on the server while I was testing. Could TB falsely detect message
> deletion?

My code changes try to determine when a message is expunged by another client while tb was shutdown. (Of course, the "other client" could be another tb instance.) I haven't seen a false indication in my testing. That's why I need the IMAP_CONDSTORE log lines to see what is happening. 

> 
> > Anyhow, do you see any improvement in the performance with this change?
> 
> I can't tell yet from my short experiment yet. The main problem is with my
> mailing list folder, and I didn't test how was the delay when I receive a
> message from the mailing list.

The change should mostly affect the activities that occur at startup or possibly when you newly select a folder and it needs a new tcp/imap connection to be established. I.e., it should now not fetch 100k flags when a new message had arrived in your mailing list mailbox while tb was down.
(In reply to andysem from comment #43)
> 
> In general, I don't rely on IDLE and could as well live with it disabled.
> What I do rely on is the periodic checks for the new mail and I expect them
> to always check all folders, unless specifically disabled. I did not disable
> checks for new mail on any of my folders.

Yes, it looks like IDLE may be broken after 30m and you never know which folder(s) it will be used on, except for the currently selected folder.

Just to understand your setup, in a previous comment you said you have this set via config editor:
mail.server.default.check_all_folders_for_new = true
So that should check *all* folders on every account for new mail, maybe even Junk and Trash. I will check if this really works as advertised.

I think the recommended way is to set check_all_folders_for_new false and set each individual folder property to be checked for new. This way you can skip checking probably uninteresting folders like maybe Junk and Trash etc.

If this is working, when you click "get new messages" button you should see my CONDSTORE log message line "checking for new mail" for each of your folders. 

Just curious, how many accounts are you running and how many folders in each account?
One other thing that might be useful in testing the fix: You can add a column to your list of message in a folder that shows "order received" and sort on that. (Click the widget at the far right in the column headings.) The "order received" column actually shows the server assigned UID of each message. So you can correlate the UID you see in the IMAP_CONDSTORE log lines with the actual message. Messages are always added with increasing UID.
(In reply to gene smith from comment #45)
>
> Just to understand your setup, in a previous comment you said you have this
> set via config editor:
> mail.server.default.check_all_folders_for_new = true
> So that should check *all* folders on every account for new mail, maybe even
> Junk and Trash. I will check if this really works as advertised.

Yes, it works but it skips the check for folders that identify as Junk or Trash.

> 
> I think the recommended way is to set check_all_folders_for_new false and
> set each individual folder property to be checked for new. This way you can
> skip checking probably uninteresting folders like maybe Junk and Trash etc.
> 
> If this is working, when you click "get new messages" button you should see
> my CONDSTORE log message line "checking for new mail" for each of your
> folders. 

This is not always true. With you current setup, you will only see log messages like this
> D/IMAP_CONDSTORE Check for new messages above UID=5813
for connection to folders that are in imap SELECTED state. Assuming you have 5 connections (default) then on tb startup you will have INBOX and INBOX/boost/boost-dev selected plus any folder that you had last previously clicked on with tb user interface, e.g., All-mail folder. Then when you click to another folder, e.g., Send Mail, it will become selected in the 4th connection and then if you select another folder it will be imap SELECTED in the 5th connection. If you click to a another folder that you haven't visited, it will de-select one of the SELECTED non-Inbox folders(*) and SELECT that folder. 

If you don't click on other folders after startup, except for INBOX and children, the  connections never reach the imap SELECTED state but are still used to check for new emails at the periodic interval. 

The check for new emails on folders that aren't in SELECTED state uses the imap STATUS command and folders that are in SELECTED state used the fetch command which is reported via IMAP_CONDSTORE line quoted above.

If IDLE is enabled, it will and can only be used in connections to folders that are in SELECTED state. The IDLE will be interrupted in one or more connections to do the periodic STATUS check for the non-SELECTED folders and then IDLE is reissued.

(*) I don't know how it decides which folder to de-SELECT; maybe random?  P/S: Sorry for the long descriptions, but want to document this some before I forget it.
Ok, I have collected a log from my experiment:

1. Start TB, wait until it settles down checking for new mail.
2. In the web interface of my other mail account, send myself an email (to the account that is monitored by TB).
3. In TB, click "Get Messages". This is about where the attached log starts.
4. TB says "Other client expunged msgs" and starts fetching flags for INBOX. Then it downloads the received message and proceeds to checking other folders.
5. Sometime about 11:29:04 it starts to fetch flags for all messages in the "Sent Mail" folder. This happens after some delay after it has finished checking folders for new messages (at 11:29:02), so it may be due to a periodic check for new mail.
6. At the end it completes fetching flags and downloads the new message in the "Sent Mail" folder. This is because, apparently, Gmail understands that, even though from another account, it was I who sent that test message and puts it into the "Sent Mail" folder as well as the INBOX.
> Just curious, how many accounts are you running and how many folders in each account?

I have 2 IMAP accounts managed by TB, both hosted on Gmail. The third email account that I used to send test message is not managed by TB and is not on Gmail, but it is configured to synchronize with one of the Gmail accounts (i.e. Gmail periodically pulls new messages from it to one of its folders).

I have 15 folders on one account and 8 on the other, not counting INBOX or other special folders.
> The change should mostly affect the activities that occur at startup or possibly when you newly select a folder and it needs a new tcp/imap connection to be established. I.e., it should now not fetch 100k flags when a new message had arrived in your mailing list mailbox while tb was down.

Well, if it doesn't fix the full fetch when a new message arrives (while TB was running) then it is not much of improvement, unfortunately. I mean, if I have TB running the whole day, it will still get locked up as it fetches flags on new incoming messages. I would think that it doesn't need a full fetch regardless of when the new message has arrived - whether while TB was running or not.

> Assuming you have 5 connections (default)

I have 1 cached connection per account configured in TB. Even so, TB still occasionally shows a notification on startup saying that the server has closed a connection and I should probably reduce the number of cached connections.

BTW, the periodic checks for new mail are configured to 10 minutes, so in my Comment 48 the delay before fetching flags for the "Sent Mail" folder shouldn't have been because of the periodic check (10 minutes certainly didn't pass since TB startup until that point).
I'm not at a place to test it now but it looks like the previously recorded max UID stored for the folder INBOX was never updated from the last message received there on the previous tb run. Before you sent the test message it looks like you were visiting (selected on) another folder and then when the log starts you selected INBOX to see if the test message came in. When I tested it I was already selected on INBOX when I sent the test message from another client. So maybe my fix, as is, won't work. Unfortunately, I don't print the max stored UID in my new log lines so not sure.  I will check this asap. Thanks!
Been unable to see the same problem, so far. But may be seeing other related problems that cause a full fetch (maybe not properly recorded or produced by gmail "highest mod sequence"?).

A question regarding your log: Which folder were you selected to when tb started? Or did you change the selection to INBOX before you clicked on "get new messages"? (I've tried it different ways and don't really see a big difference.) Also, did you click on Sent Mail  or did it just start fetching all flags on its own?

Can you try this?: Select INBOX and shut down tb. Then start tb and keep INBOX selected. Then send the test messages from the other client. Do you see every flag in INBOX fetched when you do this?

I will keep trying tomorrow. Thanks for helping!
> Before you sent the test message it looks like you were visiting (selected on) another folder and then when the log starts you selected INBOX to see if the test message came in.

No, INBOX was selected since the very start of TB, all I did was to click "Get Messages".

> Which folder were you selected to when tb started?

INBOX.

> Or did you change the selection to INBOX before you clicked on "get new messages"?

No.

> Also, did you click on Sent Mail  or did it just start fetching all flags on its own?

I did not click on "Sent Mail". I think it started fetching flags for "Sent Mail" as part of the process of getting new mail when I clicked "Get Messages". It started happening after a small delay, but I can't tell what caused the delay.
> Can you try this?: Select INBOX and shut down tb. Then start tb and keep INBOX selected. Then send the test messages from the other client. Do you see every flag in INBOX fetched when you do this?

I redid the previous experiment (which is what you're asking above) and got the same result, so the answer is yes. However, I can add some new info:

1. I sent my test email at around 11:03 UTC. You can see some activity in the log at this time, but the message was *not* displayed in TB interface at this point.
2. At around 11:04 UTC I clicked "Get Messages", that's when the activity started, including fetching flags for INBOX and "Sent Mail". After this the new message appeared in INBOX and "Sent Mail".
3. Through the whole experiment, INBOX was the selected folder from the start and I didn't click anything except "Get Messages".

I've attached the new log.
(In reply to andysem from comment #49)
> > Just curious, how many accounts are you running and how many folders in each account?
> 
> I have 2 IMAP accounts managed by TB, both hosted on Gmail. The third email
> account that I used to send test message is not managed by TB and is not on
> Gmail, but it is configured to synchronize with one of the Gmail accounts
> (i.e. Gmail periodically pulls new messages from it to one of its folders).

Is that why new mail appeared in [gmail]/Sent Mail when you sent the test message from non-gmail and not-on-tb account? 

Another question: Is this sync you mention between the non-gmail and gmail servers a proprietary link or does it use IMAP or maybe POP too? Just curious, probably not related to this bug.

Also, you say you only have 1 connection configured per account. I have been testing with default 5 and haven't tried 1. Having just 1 should still work but it causes IDLE to  be interrupt and restarted each time new mail is checked in each folder.

Anyhow, thanks for the 2nd log that also shows what was happening a bit before you sent the test message I think. Looking though it now...
> Is that why new mail appeared in [gmail]/Sent Mail when you sent the test message from non-gmail and not-on-tb account? 

I don't know. Maybe.

> Is this sync you mention between the non-gmail and gmail servers a proprietary link or does it use IMAP or maybe POP too?

It's one of the Gmail features, it's using POP to fetch messages from the other account.

https://support.google.com/mail/answer/21289

I have tested multiple times in a row, and each time the test message appeared in INBOX and "Sent Mail", so I'm pretty sure this sync was not the reason why the message appeared in "Sent Mail".
In the 2nd log at line 1541, the fetch indicates that your test message with UID 59773 has been detected. Then at line 1547 you can see that the log indicates this is not new but is instead an existing message. The code determines it is new if the new uid (57773) is greater than the largest saved UID for the folder. Unfortunately, I failed to print highest saved UID in the log with the patch you are running.

However, earlier in the log at line 590 there is another fetch that obtains messages in INBOX above UID=103465. Then the actual highest UID is fetched from the server which is UID=58732. This indicates that tb "thinks" that the highest UID saved for the INBOX folder is 103465 which seems much too large.

So could you enable the "order received" column for INBOX and tell me the highest UID you see listed? I suspect you will see something like 59733. Somehow, it appears that tb's database has the completely wrong value stored for highest UID for INBOX.

I am seeing other problems, but something you could try is to repair the INBOX under its properties and try your test message again and see if that helps. The repair should clean up the database and, of course, it will re-download all the INBOX messages.
I was waiting for the discussion to subside and perhaps a new patch to appear, but that's not the case. So let's see whether this breaks any tests:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=b3b8aa34a75704408a7cbe276ac9ebed3037b3fc
> So could you enable the "order received" column for INBOX and tell me the highest UID you see listed?

58735. The test message I've sent for my last experiment is 58733.

> something you could try is to repair the INBOX under its properties and try your test message again and see if that helps

I tried repairing INBOX and "Sent Mail" and it seems to have helped. I've attached a new log.

1. TB started with INBOX selected by default.
2. At about 10:54 UTC I've sent myself a test message, as before. You can see some activity in the log at this time, however the message did not appear in the TB interface.
3. At about 10:55 UTC I clicked "Get Messages". After this the message appeared in INBOX and "Sent Mail". This time TB didn't fetch all the flags.

Nice, thanks for your help!

I'm a little puzzled though why the message is not displayed when it is detected by IDLE, because clearly there is some activity at that point.
(In reply to andysem from comment #59)
> 
> I'm a little puzzled though why the message is not displayed when it is
> detected by IDLE, because clearly there is some activity at that point.

With just one connection, when tb checked for new mail before you sent the test msg, the last folder it checked was INBOX/GitHub and it kept it SELECTed and resent IDLE at 10:54:11. Since tb was not SELECTed on INBOX, where the new message arrived, IDLE did not respond to it (gmail didn't send a "* <num msgs in mbox> EXISTS" response to IDLE). Instead, about a minute later, at 10:55:04 you must have clicked the "check for new mail" button. This caused tb to terminated the IDLE on INBOX/GitHub and then tb SELECTed INBOX and discovered the new message.

So with fewer "cached" connection configured, there is more chance that the folder of interest won't be SELECTed so it won't receive instant notification, via IDLE response, of new mail. This shows that even INBOX can have IDLE stolen from it. This illustrates why I favor having a folder property that says "if possible, IDLE on this folder".

Anyhow, glad the repair worked. I think the max saved UID for the folder should be reset only when the UIDVALIDITY is changed by the server. No idea how it got set to that very high value. Seems kind of bad that that happened!
(In reply to Jorg K (GMT+2) from comment #58)
> So let's see whether this breaks any tests:
> https://treeherder.mozilla.org/#/jobs?repo=try-comm-
> central&revision=b3b8aa34a75704408a7cbe276ac9ebed3037b3fc

Looks broken under the "Z". Don't know how to read this. Did my change cause all those items in "failure summary"?
Nope, all cool, you didn't cause this. I'll get to the review eventually.
That's good to hear! Will need to submit an updated patch soon and will request another "try" for reporter (andysem) to test.
Here's a refreshed patch. I found one bug, added some more logging lines and did a lot more testing.

Jorg, If you could kick off another "try" build so reporter andysem can run it I would appreciate it.

andysem, when you run the new "try" build I have changed the string to enable the condstore logging. It is now just "IMAP_CS", e.g..:

MOZ_LOG=IMAP_CS:5,sync,timestamp MOZ_LOG_FILE=  <path-to-try-build>/thunderbird 2>&1 | tee ~/tb.log | grep "IMAP_CS"

So far, I have only tested this with gmail. I will test other servers such as Dovecot soon.
Attachment #9007920 - Attachment is obsolete: true
Attachment #9007920 - Flags: review?(jorgk)
Attachment #9009494 - Flags: review?(jorgk)
I've tested this new build in two configurations: with 1 cached connection, as before, and with 3 cached connections. The test scenario was the same as before: I open TB with INBOX selected, send a test message from another account and press "Get Messages".

The 3-connection tests differed from 1-connection ones in that the test message appeared in the INBOX as soon as I sent it from the other account (i.e. it was detected by IDLE and displayed in the UI). The message did not appear in "Sent Mail" until I pressed "Get Messages", though.

In one of the experiments with 3 cached connections (chronologically, it was one of the first experiments I did) I've seen TB doing the full fetch on the "Sent Mail" folder after I pressed "Get Messages". I'm not sure what triggered it, and I could not reproduce it in the following experiments with either 1 or 3 cached connections. Other than that one test, full fetch did not happen and the test message was received normally.

I was also lucky enough to test the behavior when a message is received in my large mailing list folders (including while TB was shut down) and it didn't trigger the full fetch. Glad to see this!
This log was taken with 3 cached connections. It was one of the first experiments and it did show a full fetch on "Sent Mail".

The test message was sent on 10:36 UTC, I clicked "Get Messages" on 10:37. You can see it starts fetching flags after 10:37:31.741111.

Unfortunately, a non-test message has arrived in one of the folders, so I had to edit it out.
Attached file tb_1conn.log.xz
This log was taken with 1 cached connection. No full fetch here. Test message sent at 10:44, "Get Messages" clicked at 10:45.
Attached file tb_3conn.log.xz
This log was taken with 3 cached connections again. No full fetch here as well. Test message sent at 10:47, "Get Messages" clicked at 10:48.
(In reply to andysem from comment #67)
> Created attachment 9009572 [details]
> tb_3conn_full_fetch.log.xz
> 
> This log was taken with 3 cached connections. It was one of the first
> experiments and it did show a full fetch on "Sent Mail".
> 
> The test message was sent on 10:36 UTC, I clicked "Get Messages" on 10:37.
> You can see it starts fetching flags after 10:37:31.741111.

For some reason for "Sent Mail" tb obtained 0 (zero) for the highest stored modseq. So it did a fetch 1:* (CHANGEDSINCE 0) which is effectively a full flag fetch. I saw this myself here once while testing yesterday maybe on "All Mail" (which I don't think you have enabled or subscribed). I now have log items to print where stored highest modseq (and uid) is saved or if it is reset to empty, but I haven't seen a reset in the log. I don't see anything in your log that indicates it was reset during any of your 3 runs.

I don't know if this applies to you, but sometimes with tb running from the command line I just quickly shut it down with crtl-c. This may potentially cause file changes to not get completely flushed from RAM to the "summary" database files and may cause problems with inconsistent data. Therefore, when doing these CONDSTORE tests, I now try to always shutdown tb from the application. Anyhow, when I start tb again I will look for this problem of (CHANGEDSINCE 0).

I also did some testing with Dovecot yesterday. I see a problem where if a flag is changed by another client it doesn't show up in tb. This was long ago reported for gmail and was one of the reasons CONDSTORE was disable by default. Now gmail works OK in this respect. It looks like the main problem with Dovecot is that the flag change IDLE response does not include the UID, just the msg sequence number (not the same as MODSEQ). Gmail now does return the UID in the flag changed response. I may be able to fix this, TBD...
> I don't know if this applies to you, but sometimes with tb running from the command line I just quickly shut it down with crtl-c.

I always close TB with the close button of its main window. No Ctrl+C.
Found another problem that should maybe be forked to a new bug but will note it here. In testing with dovecot I have 5 idling connections but only 3 of them respond to the IDLE response sent by dovecot, e.g. "* 5 EXISTS". After adding some debug printfs to tb, it appears that all the setup is occurring error free but 2 of the selected folders don't respond completely to the IDLE message from the dovecot server, i.e.,:

" Assume 25 messages in the currently selected mailbox
tb: 1 IDLE
dc: + idling
" with another client, add a new message to this mailbox
dc: * 26 EXISTS    "dovecot reports the new existing msg count
tb: " receives the exists response OK but never sends DONE and never triggers mailbox update like it should
:
:
" other client adds another message to mailbox
dc: * 27 EXISTS  "dovecot reports another message exists
tb: ignores this and any more IDLE responses and new messages don't appear in tb list until biff or "new messages" button clicked.

What should happen is that after the "* 26 EXISTS" response tb should terminate IDLE by sending DONE and then do what is needed to update the mailbox/folder and go back to IDLE mode. But this doesn't happen on 2 of the 5 connections. It has me stumped.

I don't think this is dovecot specific since it seems to be doing what it should.

This may be related to Bug 693204 and/or its dups.
(In reply to gene smith from comment #72)
> Found another problem that should maybe be forked to a new bug but will note
> it here. In testing with dovecot I have 5 idling connections but only 3 of
> them respond to the IDLE response sent by dovecot,
> ...
> This may be related to Bug 693204 and/or its dups.

Perhaps this is related to the long undo bug 1241524 ?
(In reply to gene smith from comment #40)
> (In reply to Alfred Peters from comment #39)

Not really about this bug, but Bug 236584 attempted to fix the IDLE server timeout that I mentioned in comment 38. However, don't think it really fixed it.

Also, from "way back" this bug sounds very similar to this bug: Bug 479450. It seems to have been dropped and the underlying behavior sounds different.(In reply to Wayne Mery (:wsmwk) from comment #73)

> (In reply to gene smith from comment #72)
> > Found another problem that should maybe be forked to a new bug but will note
> > it here. In testing with dovecot I have 5 idling connections but only 3 of
> > them respond to the IDLE response sent by dovecot,
> > ...
> > This may be related to Bug 693204 and/or its dups.
> 
> Perhaps this is related to the long undo bug 1241524 ?

I don't see a relation to that. I have created a new bug report for the idle response problem: Bug 1493480
(In reply to gene smith from comment #70)
> 
> I also did some testing with Dovecot yesterday. I see a problem where if a
> flag is changed by another client it doesn't show up in tb. This was long
> ago reported for gmail and was one of the reasons CONDSTORE was disable by
> default. Now gmail works OK in this respect. It looks like the main problem
> with Dovecot is that the flag change IDLE response does not include the UID,
> just the msg sequence number (not the same as MODSEQ). Gmail now does return
> the UID in the flag changed response. I may be able to fix this, TBD...

Now back to *this* bug! The fix I did for dovecot (and other possible servers that don't return a UID Fetch response to flag changes) is to always do a CHANGEDSINCE flag fetch when CONDSTORE enabled and the most recently received modseq differs from the last saved modseq for the folder (unless there are other reasons to do a full flag fetch). Before, for some unknown reason, tb only did a CHANGEDSINCE fetch if the flag state arrays were empty and, if not, did a fetch from the highest saved UID, which doesn't catch any messages with changed flags at a lower UID.

Updated patch pending.
Jorg, Please do one more try build for andysem so he can test again on his gmail accounts. Also, this should be ready now for a review if it works for him.
Attachment #9009494 - Attachment is obsolete: true
Attachment #9009494 - Flags: review?(jorgk)
Attachment #9011335 - Flags: review?(jorgk)
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=d4a5e500be729802e2ab70a66c00b717e4f9006c
The last try had a compile problem on Mac:
nsImapMailFolder.cpp:2659:12: error: format specifies type 'unsigned long' but the argument has type 'uint64_t'
Comment on attachment 9011335 [details] [diff] [review]
1428097-slow-opening-large-folder-v0.patch(v2)

Even more compile problems on Mac now.
Attachment #9011335 - Flags: review?(jorgk)
Sorry about the osx build failures. Now I think I see what those " PRIxXX " things are about! Changed all my IMAP_CS log lines to use them instead of %d, %lu etc. Also fixed bool's to print string "true" or "false".
Attachment #9011335 - Attachment is obsolete: true
Attachment #9011570 - Flags: review?(jorgk)
Thanks. Shall we wait a day or two for the test results with the latest version?
Jorg, Yes if you don't mind trying another build with my latest v3 patch then andysem (reporter) can maybe test it. Or he can test the last build you did that failed on OSX if you prefer not to build again, which would be this (built OK for linux):
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=d4a5e500be729802e2ab70a66c00b717e4f9006c
Hopefully andysem is willing try another version.

Note: my v3 patch only touched the MOZ_LOG(IMAP_CS,...) lines compared to patch v2.
Flags: needinfo?(andysem)
I did my test with this new version of TB, as before, with 3 cached connections. TB did not perform a full fetch, everything seems to be working the same way as it did in my previous test. I will keep this version running for a while to see if a full fetch happens.
Flags: needinfo?(andysem)
Forgot to say that in this last log the test email was sent at 12:09 UTC, and I clicked "Get Messages" at 12:10 UTC.
After several hours of use TB didn't do a full fetch for any of my folders. During this period I received and sent messages in TB.
Comment on attachment 9011570 [details] [diff] [review]
1428097-slow-opening-large-folder-v0.patch(v3)

Review of attachment 9011570 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry about the delay. I could find a million excuses, like bustage, releasing TB 60.2.1, etc., but the truth is that this is not my favourite review since I'm no expert in the field. I looked at the changes carefully and they seem to make sense. Given that you and the reporter did a lot of testing we should accept it as it is with a few minor issues addressed and a question re. the sanity check previously using m_flagState->NumberOfDeletedMessages() answered.

Clearing the r? for now to get it out of my queue. I promise the next review will be a whole lot quicker.

::: mailnews/imap/src/nsImapProtocol.cpp
@@ +4148,5 @@
> +    // Print to log items affecting needFullFolderSync:
> +    MOZ_LOG(IMAP_CS, LogLevel::Debug, ("Do full sync?: mFolderHighestUID=%" PRIu32 ", added=%" PRId32 ", useCS=%s",
> +          mFolderHighestUID, added, useCS?"true":"false"));
> +    bool needFullFolderSync = !mFolderHighestUID || (flagStateEmpty && !useCS);
> +    bool needFolderSync;

You can initialise this to false ...

@@ +4178,5 @@
> +           GetServerStateParser().fHighestModSeq != mFolderLastModSeq);
> +    }
> +    else
> +    {
> +      needFolderSync = false;

... and then lose this dangling else clause.

@@ -4166,5 @@
>          // if we did a CHANGEDSINCE fetch, do a sanity check on the msg counts
>          // to see if some other client may have done an expunge.
>          if (m_flagState->GetPartialUIDFetch())
>          {
> -          if (m_flagState->NumberOfDeletedMessages() +

Before the sanity check was different, it looked at deleted messages. So that wasn't useful?

@@ +4207,5 @@
>          // to see if some other client may have done an expunge.
>          if (m_flagState->GetPartialUIDFetch())
>          {
> +          uint32_t exists = GetServerStateParser().NumberOfMessages();
> +          uint32_t prevExists = mFolderTotalMsgCount;

These variable names sound like for a bool. Are there better names? How about numExisting?

@@ +4261,5 @@
> +          // existing in the folder.
> +          bool expungeHappened = numNewUIDs != (exists - prevExists);
> +          if (expungeHappened)
> +          {
> +            // sanity check failed - need full fetch to removed expunged msgs.

typo: remove.

@@ +4263,5 @@
> +          if (expungeHappened)
> +          {
> +            // sanity check failed - need full fetch to removed expunged msgs.
> +            MOZ_LOG(IMAP_CS, LogLevel::Debug,
> +                ("Other client expunged msgs, do full fetch to removed expunged msgs"));

and here.

@@ +4305,5 @@
>        // if we're using CONDSTORE, and the parser hasn't seen any UIDs, use
> +      // the highest UID previously seen and saved for the folder instead.
> +      if (useCS)
> +      {
> +        if (!highestRecordedUID)

Please merge the two if blocks like it was before.
Attachment #9011570 - Flags: review?(jorgk) → feedback+
(In reply to Jorg K (GMT+2) from comment #86)
> Comment on attachment 9011570 [details] [diff] [review]
> 1428097-slow-opening-large-folder-v0.patch(v3)
> 
> ::: mailnews/imap/src/nsImapProtocol.cpp
> @@ +4148,5 @@
> > +    // Print to log items affecting needFullFolderSync:
> > +    MOZ_LOG(IMAP_CS, LogLevel::Debug, ("Do full sync?: mFolderHighestUID=%" PRIu32 ", added=%" PRId32 ", useCS=%s",
> > +          mFolderHighestUID, added, useCS?"true":"false"));
> > +    bool needFullFolderSync = !mFolderHighestUID || (flagStateEmpty && !useCS);
> > +    bool needFolderSync;
> 
> You can initialise this to false ...

Done

> 
> @@ +4178,5 @@
> > +           GetServerStateParser().fHighestModSeq != mFolderLastModSeq);
> > +    }
> > +    else
> > +    {
> > +      needFolderSync = false;
> 
> ... and then lose this dangling else clause.

It's lost.

> 
> @@ -4166,5 @@
> >          // if we did a CHANGEDSINCE fetch, do a sanity check on the msg counts
> >          // to see if some other client may have done an expunge.
> >          if (m_flagState->GetPartialUIDFetch())
> >          {
> > -          if (m_flagState->NumberOfDeletedMessages() +
> 
> Before the sanity check was different, it looked at deleted messages. So
> that wasn't useful?

Right, it was always zero in my tests. Also, right here was the primary reason so many flags (100k) were fetched when the reporter received a new mail. A more effective test for doing full flag fetch is to compare the previous to the current EXISTS count so you only have to do a full fetch of all 100k flags if an older message is expunged; but you have to look at the new UIDs too.

> 
> @@ +4207,5 @@
> >          // to see if some other client may have done an expunge.
> >          if (m_flagState->GetPartialUIDFetch())
> >          {
> > +          uint32_t exists = GetServerStateParser().NumberOfMessages();
> > +          uint32_t prevExists = mFolderTotalMsgCount;
> 
> These variable names sound like for a bool. Are there better names? How
> about numExisting?

Ok, exists->numExists, prevExists->numPrevExists. I prefer keeping just *Exists instead of *Existing since just "EXISTS" is a standard imap response token: https://tools.ietf.org/html/rfc3501#page-71

> 
> @@ +4261,5 @@
> > +          // existing in the folder.
> > +          bool expungeHappened = numNewUIDs != (exists - prevExists);
> > +          if (expungeHappened)
> > +          {
> > +            // sanity check failed - need full fetch to removed expunged msgs.
> 
> typo: remove.

Done

> 
> @@ +4263,5 @@
> > +          if (expungeHappened)
> > +          {
> > +            // sanity check failed - need full fetch to removed expunged msgs.
> > +            MOZ_LOG(IMAP_CS, LogLevel::Debug,
> > +                ("Other client expunged msgs, do full fetch to removed expunged msgs"));
> 
> and here.

Done

> 
> @@ +4305,5 @@
> >        // if we're using CONDSTORE, and the parser hasn't seen any UIDs, use
> > +      // the highest UID previously seen and saved for the folder instead.
> > +      if (useCS)
> > +      {
> > +        if (!highestRecordedUID)
> 
> Please merge the two if blocks like it was before.

Done
Jorg, Here's the updated patch that incorporates your comments.
Attachment #9011570 - Attachment is obsolete: true
Attachment #9013468 - Flags: review?(jorgk)
Comment on attachment 9013468 [details] [diff] [review]
1428097-slow-opening-large-folder.patch(v4)

Thanks, let's go with this. I'll fix some white-space issues before landing, like inconsistent indentation of MOZ_LOG().

I'll try to write this a little better:
+      needFolderSync = ((flagStateEmpty || added == deleted) && (!useCS ||
+        (GetShowDeletedMessages() &&
+           GetServerStateParser().NumberOfMessages() != mFolderTotalMsgCount))) ||
+        (useCS &&
+           GetServerStateParser().fHighestModSeq != mFolderLastModSeq);
Attachment #9013468 - Flags: review?(jorgk) → review+
Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/028f53e3c744
With CONDSTORE, eliminate unneeded flag fetches at startup. r=jorgk
Status: ASSIGNED → RESOLVED
Closed: Last year
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 64.0
Blocks: 583677
You need to log in before you can comment on or make changes to this bug.