Closed Bug 555171 Opened 14 years ago Closed 9 years ago

massive FETCH after EXPUNGE, when no new messages in the folder and using imap delete model

Categories

(Thunderbird :: Folder and Message Lists, defect)

x86
Windows Vista
defect
Not set
major

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 218075

People

(Reporter: andrey, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: perf)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 6.0; ru; rv:1.9.2.2) Gecko/20100316 Firefox/3.6.2 (.NET CLR 3.5.30729)
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.0; ru; rv:1.9.2.2pre) Gecko/20100302 Lightning/1.0b2pre Lanikai/3.1b1

In one of my IMAP folder there are 20000 messages. If I delete some, then expunge folder, sometimes Thunderbird starts re-read entire folder, or very large part of it (12000 messages in last time). Why I do this, if there no any new message in the folder, and entire folder was already readed many times before? 200 Mb of excess traffic every day :( Below is a log of such session (message bodies are skipped, my comments in []):

IN:  131 noop
OUT: [no any notifications]
OUT: 131 OK NOOP completed
IN:  132 UID fetch 24668:* (FLAGS)
OUT: 132 OK UID FETCH completed
IN:  133 IDLE
OUT: + idling
OUT: [no any notifications]
OUT: 133 OK IDLE completed
IN:  134 expunge
OUT: * 20706 EXPUNGE
OUT: 134 OK EXPUNGE completed
IN:  135 UID fetch 24668:* (FLAGS)
OUT: 135 OK UID FETCH completed
IN:  136 IDLE
OUT: + idling
OUT: [no any notifications]
OUT: 136 OK IDLE completed
IN:  137 noop
OUT: [no any notifications]
OUT: 137 OK NOOP completed
IN:  138 UID fetch 24668:* (FLAGS)
[no any new messages]
OUT: 138 OK UID FETCH completed
[so nothing new messages, flags changes etc, but TB starts to re-read messages]
IN:  139 UID fetch 24582,24583:24584,24597,24600:24601,24608,24621,24623,24634:24635 (UID RFC822.SIZE BODY.PEEK[])
OUT: * 20678 FETCH (UID 24582 RFC822.SIZE 3620 BODY[] {3620}
OUT: 139 OK UID FETCH completed
IN:  140 UID fetch 24665 (UID RFC822.SIZE BODY[])
OUT: * 20704 FETCH (UID 24665 RFC822.SIZE 6427 BODY[] {6427}
OUT: 140 OK UID FETCH completed
IN:  141 IDLE
OUT: + idling
OUT: [nothing new again]
OUT: 141 OK IDLE completed
IN:  142 UID fetch 24564,24568,24570:24572,24576:24577 (UID RFC822.SIZE BODY.PEEK[])
OUT: * 20669 FETCH (UID 24564 RFC822.SIZE 1762 BODY[] {1762}
OUT: 142 OK UID FETCH completed
IN:  143 UID fetch 24535,24536:24537,24541,24545,24560:24562 (UID RFC822.SIZE BODY.PEEK[])
OUT: * 20659 FETCH (UID 24535 RFC822.SIZE 2527 BODY[] {2527}
OUT: 143 OK UID FETCH completed
IN:  144 UID fetch 24531,24532:24534 (UID RFC822.SIZE BODY.PEEK[])
(...and more 12000 messages)

Reproducible: Sometimes

Steps to Reproduce:
1. EXPUNGE

Actual Results:  
Expunged and then massive FETCH.

Expected Results:  
Only expunge.

Very annoying and expensive.

I'm using Thunderbird since version 0.4, and seems this bug first appeared in v3.
Version: unspecified → Trunk
Is the offline store for this folder > 4GB? If so, Thunderbird is probably trying to "repair" your offline store, which looks broken because it's bigger than 4GB. If the folder online is significantly less than 4GB total, you could try removing the offline store and let it get regenerated.

3.1 beta 1 is supposed to have a fix for the 4GB offline store issue - bug 532323. If beta 1 is downloading the same messages over and over again, that implies there's something wrong with that fix. You've been running 3.1 b1 Lanikai builds when you saw this issue?
No, there are no files of >4Gb size. Size of the "inbox" file is about 1Gb (entire account - 9Gb). And there is big number of files with names like "nstmp-204":
26.03.2010  10:33 69Mb nstmp-200
26.03.2010  10:40 25Mb nstmp-201
26.03.2010  11:31 69Mb nstmp-202
26.03.2010  11:45 24Mb nstmp-203
26.03.2010  14:17 69Mb nstmp-204
26.03.2010  16:45 24Mb nstmp-205
...
07.12.2009  09:29  1Mb nstmp-1
> You've been running 3.1 b1 Lanikai builds when you saw this issue?

Yes, I see this issue multiple times every day with "Mozilla/5.0 (Windows; U; Windows NT 6.0; ru; rv:1.9.2.2pre) Gecko/20100302 Lightning/1.0b2pre Lanikai/3.1b1"

> If the folder online is significantly less than 4GB total, you could
try removing the offline store and let it get regenerated.

OK, its size is less than 4Gb, I will try this.
(In reply to comment #0)
> IN:  133 IDLE
> OUT: + idling
> OUT: [no any notifications]
> OUT: 133 OK IDLE completed
> IN:  134 expunge
> OUT: * 20706 EXPUNGE
> OUT: 134 OK EXPUNGE completed
(In reply to comment #2)
> And there is big number of files with names like "nstmp-204":
> 26.03.2010  10:33 69Mb nstmp-200
> 26.03.2010  10:40 25Mb nstmp-201
> 26.03.2010  11:31 69Mb nstmp-202
> 26.03.2010  11:45 24Mb nstmp-203
> 26.03.2010  14:17 69Mb nstmp-204
> 26.03.2010  16:45 24Mb nstmp-205
> ...
> 07.12.2009  09:29  1Mb nstmp-1

Expunge is issued by manual "Compact" operation by you.
Do you frequently request "Compact" from UI?
Do you frequently request "Compact" from UI while compact is in progress?
If yes, see bug 546384, please.

If "Compact" is interfered at critical step, offline-store may be currupted.
  1. copy living mail data to nstmp-xxx
  2. rebuild-index like process with nstmp-xxx => new .msf data
  3. try to replace old .msf & offline-store by new .msf data & nstmp-xxx
     3.1 delete old offline-store
                      <== interfere by other including next Compact request 
     3.2. rename nstmp-xxx to offline-store file name
  If 3.2 fails due to interfere by other, "no offline-store data" condition
  can happen. It may produce re-fetch of many mails.
> Expunge is issued by manual "Compact" operation by you.
Right. Not sure it is exactly 'Compact' - using Russian version, there it called 'Сжать' (compact or compress...).

> Do you frequently request "Compact" from UI?
Yes. After moving/deleting of new portion of messages arrived in INBOX. Using 'Compact' entry from context menu on the folder.

> Do you frequently request "Compact" from UI while compact is in progress?
Never, if I see 'compressing...' in the status bar or rolling 'working' icon in the toolbar.

> If "Compact" is interfered at critical step, offline-store may be currupted.
Yes, sometimes it becomes corrupted - bodies of some messages become not matching its headers (after EXPUNGE too). I've reported it as bug in October - bug#518997.

Hope it is possible to fix this. Use some mutex at 'critical step'. Can't live without EXPUNGE :)
> If yes, see bug 546384, please.

I use 'compact' command to force EXPUNGE on the IMAP server, not needed any local storage compacting. Does it possible to separate 'local compact' and 'remote expunge' in the UI?
Andrey, I take it you're using the imap delete model? There's no way to separate local compact and remote expunge in the UI as of now.

We've definitely heard reports of people noticing offline store corruption after compact, but I'm not sure if it's caused by the compaction, or we just notice it during compaction, which tries to clean up corruption.
Status: UNCONFIRMED → NEW
Ever confirmed: true
> Andrey, I take it you're using the imap delete model?

Yes, I'm using TB as IMAP-client and follow the IMAP usage model, mark-to-delete/expunge.
Tested with my other IMAP account with more than 13000 messages in folder: deleted (marked to delete) 5 messages, run 'compress' from context menu on that folder. nstmp-1 appeared in local folder, TB status bar 'downloading 50 from 756 messages', size of nstmp-1 increased to 1.5Mb, expunge finished, deleted messages disappered from message list (in UI). But nstmp-1 is still in the directory (now 12 minutes old). 

Waiting more, do nothing. No new messages in the folder, no new notifications from sever (looked into log), but TB started to re-fetch 12899 messages from 13828 in the folder (now 6078 from 12899). There no new 'nstmp' files in the folder, just increasing 'inbox' file (now 402 Mb).

The offline storage of this account was deleted yesterday (on your advice), then downloaded completely. And today it downloading again :(

I believe, this re-FETCH is triggered NOT by nested EXPUNGE. There was only _one_ EXPUNGE on this folder today, but already TWO massive refetch.
I've discovered why the nstmp-NN file is getting left in at least one case, and it does reveal an issue with the way we compact offline stores. I don't see how it could directly cause your issue, however, though it does leave us in an uncleaned-up state that could be problematic.

I suspect the fetching is just autosync trying to fetch messages that we don't think we have offline. Why we think that, I don't know yet.

The bug I found has to do with doing an expunge from the UI on the open folder, when the open folder has deleted but not expunged messages. We start the offline store compaction, which builds up a list of msg keys in the db, and then we start the imap expunge in parallel. If the imap expunge finishes before we finish the compaction, one or more of the msg keys will no longer be in the db, or on the server, which leads to an error trying to stream the message. We don't handle that error correctly. I have a patch that fixes our handling of that error in general. I can generate a test server build for you to try to see if it helps with your specific problem.

I need to write an xcpshell test for the bug, and also see if I can deal with the concurrency issue better.
> I can generate a test server build for you to try to see
> if it helps with your specific problem.

Yes, I ready to test it.
Summary: massive FETCH after EXPUNGE, when no new messages in the folder → massive FETCH after EXPUNGE, when no new messages in the folder and using imap delete model
Keywords: perf
Setting dependency to bug 546384 for ease of tracking, because same phenomenon as bug 546384 is also reported in this bug.
(In reply to comment #4)
> Yes, sometimes it becomes corrupted - bodies of some messages become not
> matching its headers (after EXPUNGE too).

Wrong mail data display problem with IMAP which I can reliablly reproduce is Bug 501851 only.  
> Bug 501851 If appending to IMAP offline-store file is interfered with
> other software, mail data at offset=0 in offline-store(length=102,256 bytes) is used as mail data
"Compact of Tb himself" may be the "other software" in that bug.
Andrev, do you still see this issue in v3.1, now that bug 546384 is fixed?

if it's not fixed, then perhaps Blu3's bug 541917 could be duped to this.
Severity: normal → major
I've not resisted this bug (and connections limit bug #555537) and switched to webmail in June, after 6 happy years with Thunderbird...
(In reply to David :Bienvenu from comment #9)
> I've discovered why the nstmp-NN file is getting left in at least one case,
> and it does reveal an issue with the way we compact offline stores. I don't
> see how it could directly cause your issue, however, though it does leave us
> in an uncleaned-up state that could be problematic.
> 
> I suspect the fetching is just autosync trying to fetch messages that we
> don't think we have offline. Why we think that, I don't know yet.
> 
> The bug I found has to do with doing an expunge from the UI on the open
> folder, when the open folder has deleted but not expunged messages. We start
> the offline store compaction, which builds up a list of msg keys in the db,
> and then we start the imap expunge in parallel. If the imap expunge finishes
> before we finish the compaction, one or more of the msg keys will no longer
> be in the db, or on the server, which leads to an error trying to stream the
> message. We don't handle that error correctly. I have a patch that fixes our
> handling of that error in general. I can generate a test server build for
> you to try to see if it helps with your specific problem.
> 
> I need to write an xcpshell test for the bug, and also see if I can deal
> with the concurrency issue better.

sounds to me like a duplicate of bug 218075, or highly related
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.