340265 - can't remove/delete large number of messages at the same time, timeout error (imap)

Reporter

Description

19 years ago

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4

I'm running courier imap on my server and some users have complained they can't remove messages from some of their folders. Apparently, their thunderbird (1.5.0.x) was set up to move deleted messages to the Trash folder and they had more than 6,000 messages they needed to remove. They were trying to remove the messages by selecting them all and pressing the delete button. The client would work for a while, then they would get a timeout error, reconnect and find all messages still in the folder. So they would try it again and it would fail again. If they later checked their Trash folder, they would find several copies of all those messages sitting there in addition to the messages still being in the original folder ;-/

Here's what happens behind the scenes:

1. User selects all messages in TB and clicks on delete.
2. TB sends a COPY [range] command to the server.
3. Server starts copying the messages to the Trash folder.
4. Since there are a lot of messages and some of them are really large, it takes the server a while to respond, so TB displays a "timeout" error message to the user.
5. Server *continues to process the request* though (imapd is still running) and copies all the messages to Trash.
6. TB never finds out the messages were copied and doesn't remove any messages from the original folder, so we end up with all the messages we wanted to remove still in the folder, plus a copy in Trash.
7. User notices the messages are not removed, and repeats the actions (goto [1]).

Why one would have 6,000+ messages in a folder which they wanted to remove at once is a whole other story, but it happens often enough to be annoying, unfortunately.

The workaround is to select up to 500-1000 messages at a time, or to try setting up TB to remove messages right away or to mark them as deleted, but users in general set up TB just once, and if their preference is to move messages to Trash, it's their choice.

What would be ideal is if TB, instead of processing all selected messages at once, would split them into groups if the number of messages exceeded a certain limit, which could be configurable in the client itself. I think groups of 100 messages would be reasonable, as that provides enough feedback to the user (counter is updating) and guarantees a timely server response. Same goes for removal (expunge) and such.

Reproducible: Always

Steps to Reproduce:
1. set up thunderbird to put deleted messages in Trash
2. choose a folder with a large number of messages in it (3,000 plus, depends on the imap server performance)
3. select all messages
4. click on delete
5. after getting a timeout error, check the folder again, see all messages still there
6. check the Trash folder in several minutes, find all messages there in addition to being in the original folder

Actual Results: messages are not removed from the original folder but copied into Trash

Expected Results: messages should be removed from the original folder and put into the Trash folder

This behavior is easy to reproduce with Maildir type folders on the imap server, specifically courier imap (the most recent 0.52.3).
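The grouping idea proposed above can be sketched roughly as follows. This is only an illustration in Python, not Thunderbird code: `batches`, `move_to_trash` and the `copy_and_flag` callback are hypothetical names, and the default group size of 100 is simply the figure suggested in the description.

```python
from typing import Callable, Iterator, List, Sequence


def batches(uids: Sequence[int], size: int = 100) -> Iterator[List[int]]:
    """Yield consecutive groups of at most `size` UIDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(uids), size):
        yield list(uids[start:start + size])


def move_to_trash(
    uids: Sequence[int],
    copy_and_flag: Callable[[List[int]], None],
    size: int = 100,
) -> int:
    """Process the selection one group at a time and return the count done.

    `copy_and_flag` is a hypothetical callback standing in for "COPY this
    group to Trash, then flag it as deleted in the source folder". If it
    raises (for example on a timeout), the groups already completed stay
    done and only the remainder needs another attempt.
    """
    done = 0
    for group in batches(uids, size):
        copy_and_flag(group)
        done += len(group)  # a natural place to update a progress counter
    return done
```

With 250 selected messages and the default size, the callback would be invoked three times (100, 100 and 50 UIDs), so a timeout on one group could duplicate at most that group rather than the whole selection.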

David :Bienvenu

Comment 1

19 years ago

I'd suggest that the user increase their timeout under tools | options | advanced | general, if they're going to do that, and the server is that slow.

Antony Gelberg

Comment 2

19 years ago

(In reply to comment #1)
> I'd suggest that the user increase their timeout under tools | options |
> advanced | general, if they're going to do that, and the server is that slow.

Increasing timeouts is a notorious programming bodge though. What value should the user increase the timeout to? The user relies on the client to get it right. Is it not possible to make the move (or delete) operation atomic?

Wayne Mery (:wsmwk)

Comment 3

18 years ago

Sergiy, what release were you running in comment 0? Has the problem gotten better or worse with automatic updates? Did comment 1 help?

I experienced "partial" delete results twice in the last few days. Once when our mail server was hammered and on its knees. And once today under normal load. But it was only ~200 messages I was deleting from trash - less than a dozen were not deleted.

Summary: can't remove large number of messages at the same time → can't remove/delete large number of messages at the same time

Wayne Mery (:wsmwk)

Comment 4

18 years ago

correction due to faulty memory:
attempted 40 messages, 2 didn't delete (mail server under severe load)
attempted ~200 messages - a dozen not deleted (server under normal load)

Sergiy Zhuk

Reporter

Comment 5

18 years ago

> Sergiy, what release were you running in comment 0?

Whatever stable TB version we had back then (1.5.0.4?)

> Has the problem gotten
> better or worse with automatic updates? did comment 1 help?

The behavior didn't change, so nothing has changed in that regard. One can increase the timeout, yes, but it's not a solution, since it's low by default and would require every user to do it.

I think processing selected messages in groups of N each (where N could be about 100 msgs) on delete and any other mass update would improve the user experience, especially if there's some kind of progress bar or status line (say at the bottom of the screen) showing the number of messages selected and the number of messages processed. Right now, even if you set the timeout high enough, most users would simply not wait long enough, thinking the app is hosed.

// serge

Sergiy Zhuk

Reporter

Comment 6

18 years ago

So, is this gonna be addressed?

The Trash folder handling seems to be terribly redundant. You're copying all messages into Trash and then don't even expunge messages from the original folder, but flag them as deleted. So essentially now you have two copies of the message which the user wants to get rid of in the first place. Then the user has to expunge his mailbox and empty the Trash.

Michael

Comment 7

18 years ago

I have also had this issue. I tried deleting mail in a folder that is sent via mailing list using Thunderbird/IceDove and had to have my WebHost delete my TMP folder because I was unable to delete it from the shell. They stated there were 36,000 messages in the Trash/tmp folder from trying to delete mail, and that they have seen this issue in the past.

Wayne Mery (:wsmwk)

Comment 8

18 years ago

(In reply to comment #6)
> So, is this gonna be addressed?
> The Trash folder handling seems to be terribly redundant.
> You're copying all messages into Trash and then don't
> even expunge messages from the original folder, but flag them
> as deleted.

You can change the delete model of course (though that doesn't address all the issues).
See also http://kb.mozillazine.org/Compacting_folders

Summary: can't remove/delete large number of messages at the same time → can't remove/delete large number of messages at the same time, timeout error (imap)

Nelson Bolyard (seldom reads bugmail)

Comment 9

18 years ago

Just FYI, holding down the shift key when deleting messages causes them to be marked deleted in the original folder and NOT copied to the trash folder. But you knew that already, right? :)

john fisher

Comment 10

17 years ago

I was just hit with this bug using ver 2.0.0.6 on Ubuntu. I *may* have created it using an earlier version from home.

I have an IMAP account. There were 1000 or so mails in the main inbox folder (a month's worth of saved mail) and I tried to highlight-delete about 100 or so in one move.

I don't know anything about traditional email file-handling (see http://kb.mozillazine.org/Compacting_folders) though it sounds, well, archaic. So maybe this weird sort of out-of-sync duplicative method is just what's required - I defer to experience.

But. Because it both corrupts your data and freezes you out of your account, I think this ought to be labeled critical. I'll be happy to confirm it for you too, just let me know.

thanks
John

Chris Wilson

Comment 11

17 years ago

I've experienced this bug twice. Both times required my host to manually log into the mail server and clear out the Trash TMP folder, leaving the mail account inaccessible until they get around to doing it.

Takanori MATSUURA

Comment 12

17 years ago

Dup. of bug 296453?

Brian Awood

Comment 13

17 years ago

(In reply to comment #12)
> Dup. of bug 296453?

I don't think so, although I suppose it is possible that there are actually 2 different problems that display very similar symptoms. I've been meaning to post on this bug for a while and haven't had a chance to, so here are my observations.

Our site has seen bug 340265 many times from the server side. Since we run a cyrus-imap installation with over 80k users, we still experience it on a regular basis. I suspect that the timeouts users are seeing are not actually timeouts (at our site anyway). I would also add that TB appears to retry the delete on its own after the timeout and, in most cases, it is not the user retrying.

Our cyrus configuration creates hardlinks of messages that are copied. Since a delete for most people is actually a copy to Trash, even large numbers of messages can be "deleted" quite rapidly. The combination of this bug and the speed at which hardlinks can be created has caused some users to end up with 500k-1M+ messages in their trash (or some other folder they were moving to), most of the messages being duplicates of other messages. Although this doesn't cause a quota issue for our users, it causes other problems and we generally have to disable their access to that mailbox while it is cleaned up.

In trying to recreate this bug in our test environment I found it wasn't strictly connected to the number of messages that were being deleted. Many users trigger it when trying to delete as few as 1000 messages, but I was able to delete 4-5000+ messages easily. The bug appears to occur when the OK response from the IMAP server is beyond a certain length. For example, if you delete 5000 messages and their UIDs are sequential from 1:5000, then the process goes something like:

<30 uid copy 1:5000 "Trash"
>30 OK [COPYUID 1:5000] Completed

And TB accepts that and continues. But if the UIDs are not sequential, and the range is something like [1:10,26,57,....] which can get very long, the server takes the range, does it and responds in a few seconds or less with:

>30 OK [COPYUID 1:10,26,57,.....] Completed

But when the range section is very long, TB appears to ignore or discard the response, waits for the timeout, then tries the whole thing over again. I can post more detailed debug output if there is a need for it.

You can guess how much we hate this bug when a user selects 1000 random messages to delete before going to bed at 2am and we usually get paged 1 or 2 hours later.

Brian
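To make the length difference described above concrete, here is a small Python sketch that collapses a list of UIDs into the comma-and-colon set notation used in the examples. It is an illustration only, not code from Thunderbird or Cyrus, and `uid_set` is a hypothetical helper name.

```python
from typing import Iterable, List


def uid_set(uids: Iterable[int]) -> str:
    """Collapse UIDs into set notation such as 1:3,7,9:10."""
    ordered = sorted(set(uids))
    parts: List[str] = []
    i = 0
    while i < len(ordered):
        j = i
        # Extend the run while the next UID is exactly one higher.
        while j + 1 < len(ordered) and ordered[j + 1] == ordered[j] + 1:
            j += 1
        parts.append(str(ordered[i]) if i == j else f"{ordered[i]}:{ordered[j]}")
        i = j + 1
    return ",".join(parts)
```

Under these assumptions, 5000 sequential UIDs collapse to the six characters `1:5000`, whereas 5000 UIDs with a gap after each one (say every odd UID below 10000) cannot be collapsed at all and produce a set well over 8 KB long, even though the number of messages is the same.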

David :Bienvenu

Comment 14

17 years ago

Brian, more detailed output might be helpful, or even better, access to a test account where I can recreate the problem. Yes, we do retry operations when they fail or time out, but we should only retry them once, though there are certainly reports where we retry multiple times.
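Why even a single retry matters here can be shown with a toy model in Python. Everything in it is hypothetical (`FakeServer`, `copy_with_retry`, the `lost_replies` knob); it only assumes what the reports in this bug describe, namely that the server completes the copy even when the client never sees the reply.

```python
from typing import List


class FakeServer:
    """Toy stand-in for a server that finishes COPY regardless of the client."""

    def __init__(self, lost_replies: int) -> None:
        self.trash: List[int] = []
        self.lost_replies = lost_replies  # replies the client will never see

    def copy(self, uids: List[int]) -> None:
        self.trash.extend(uids)  # the copy itself always completes
        if self.lost_replies > 0:
            self.lost_replies -= 1
            raise TimeoutError("client gave up waiting for the reply")


def copy_with_retry(server: FakeServer, uids: List[int], max_retries: int = 1) -> bool:
    """Return True if a reply was seen within 1 + max_retries attempts."""
    for _ in range(1 + max_retries):
        try:
            server.copy(uids)
            return True
        except TimeoutError:
            pass  # reply lost: fall through to the next attempt, if any
    return False
```

In this model, one lost reply followed by one retry already leaves two copies of every message in Trash, because the retried COPY is not idempotent.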

David :Bienvenu

Comment 15

17 years ago

Is it possible that the OK [COPYUID response might approach 8K in length, in the case where you were able to reproduce the problem? I do see an 8K buffer in the code that's used to create a line and it's possible that we might just spin when presented with a line that long. I'll try to test that here...

Status: UNCONFIRMED → NEW

Ever confirmed: true

David :Bienvenu

Comment 16

17 years ago

We have code to grow the line buffer dynamically, but I have a suspicion that it's not always invoked. See http://mxr.mozilla.org/seamonkey/source/mailnews/base/util/nsMsgLineBuffer.cpp#370
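For readers unfamiliar with the failure mode being suspected here, the general technique looks roughly like this in Python. This is not a transcription of nsMsgLineBuffer.cpp; `LineAssembler` is a hypothetical class that only illustrates holding on to a partial line across reads, so that a response line longer than one read still comes out whole.

```python
from typing import List


class LineAssembler:
    """Collect network chunks and hand back only complete CRLF-terminated lines."""

    def __init__(self) -> None:
        self._pending = bytearray()  # grows as needed, no fixed cap

    def feed(self, chunk: bytes) -> List[bytes]:
        """Add a chunk and return every line that is now complete."""
        self._pending.extend(chunk)
        lines: List[bytes] = []
        while True:
            end = self._pending.find(b"\r\n")
            if end < 0:
                return lines  # the partial line stays buffered for the next call
            lines.append(bytes(self._pending[:end]))
            del self._pending[:end + 2]
```

Feeding a 20 KB response in 8 KB pieces returns nothing for the first reads and the whole line once its terminator arrives. A reader that stopped growing at a fixed size would never find the terminator and would wait indefinitely, which would look like the timeout described earlier in this bug.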

Status: NEW → ASSIGNED

Brian Awood

Comment 17

17 years ago

An 8K response seems large, but possible. However it's been a while since I did the testing when I first reproduced it. I'll post full logs as soon as I can. I'll see if I can get you access to a test account also.

David :Bienvenu

Comment 18

17 years ago

I can also try shrinking the buffer size in my local build and see if I run into problems sooner...

Assignee: mscott → bienvenu

Status: ASSIGNED → NEW

David :Bienvenu

Comment 19

17 years ago

I think I was able to reproduce this by reducing the buffer size from 8K to 200, and copying a bunch of messages. I need to catch it in the debugger, though.

Brian Awood

Comment 20

17 years ago

Attached file A case where TB appears to have split the list up into several smaller lists — Details

Brian Awood

Comment 21

17 years ago

Attached file A case where TB sent a range, when UIDs where not sequential — Details

Brian Awood

Comment 22

17 years ago

It now appears as though I'm unable to reproduce the bug using TB 2.0.0.16. I'm not sure exactly which version I was using when I originally reproduced it, but it was probably 2.0.0.9. It also seems as though TB has changed the way it sends the COPY commands. Sometimes it appears to break the complete list of messages up into separate copy commands, and other times it just sends a range even though the UIDs are not sequential. In the latter case the server responds with a complete list of UIDs copied, not a range, and even with a large response (>8k) TB seems to handle it OK. I've attached some example logs.

David :Bienvenu

Comment 23

17 years ago

IMAP allows us to use ranges with "non-existent" uids, in other words, 1:10 is fine, even if 2-8 don't exist. But to do that, we need to know the set of existing uids, and at some point we extended the copy/move code to know about the set of existing uids. I thought that was pre 2.0, however. If the problem was that TB was generating a too long copy command, and the server errored out, and we retried over and over again, that would explain this. But it sounded earlier like you were pretty sure that the command was succeeding, but the server was replying with a very long copyuid string.
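The range trick described above can be sketched in Python as follows. This is an illustration rather than the actual copy/move code: `uid_set_spanning_gaps` is a hypothetical helper, and it assumes the client knows the full set of UIDs that exist in the folder.

```python
from typing import Iterable, List, Optional


def uid_set_spanning_gaps(selected: Iterable[int], existing: Iterable[int]) -> str:
    """Build a UID set whose ranges may cover UIDs that do not exist.

    Two selected UIDs end up in the same range when every existing UID
    between them is also selected. Selected UIDs that are not in
    `existing` are ignored.
    """
    chosen = set(selected)
    parts: List[str] = []
    start: Optional[int] = None
    prev: Optional[int] = None
    for uid in sorted(set(existing)):
        if uid in chosen:
            if start is None:
                start = uid
            prev = uid
        elif start is not None:
            # An existing but unselected UID ends the current range.
            parts.append(str(start) if start == prev else f"{start}:{prev}")
            start = prev = None
    if start is not None:
        parts.append(str(start) if start == prev else f"{start}:{prev}")
    return ",".join(parts)
```

For a folder holding only UIDs 1, 9 and 10, selecting all three yields `1:10`, matching the example in the comment. If UID 40 exists but is not selected, a selection of 1, 9, 10, 26 and 57 yields `1:26,57`, because the range has to stop before 40.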

Brian Awood

Comment 24

17 years ago

When I was able to reproduce before, the copy command was definitely successful on the server side but TB continued to retry. It looks like the ftp site only has 2.0.0.14 & 16 available, is there any place to download older versions for testing or would I have to check out a code branch from cvs?

David :Bienvenu

Comment 25

17 years ago

Yes, a log from 2.0.0.9 would be very useful: ftp://ftp.mozilla.org/pub/thunderbird/releases/2.0.0.9/

Jens Müller (:tessarakt)

Comment 26

16 years ago

IMO Thunderbird should do something like gray out the deleted messages and start deleting/moving them as a background activity.

David :Bienvenu

Comment 27

16 years ago

In 3.0, we delete them immediately as an offline operation and play back the offline operation in the background.

Wayne Mery (:wsmwk)

Comment 28

15 years ago

Sergiy and others, in version 3.0 or 3.1, is the problem gone for you, or do you still see it? Please comment.

Assignee: bienvenu → nobody

Component: General → Folder and Message Lists

QA Contact: general → folders-message-lists

Whiteboard: closeme 2010-09-25 WFM?

Joshua Cranmer [:jcranmer]

Comment 29

15 years ago

RESOLVED INCOMPLETE due to lack of response to previous question. If you feel this change was made in error, please respond to this bug with your reasons why.

Status: NEW → RESOLVED

Closed: 15 years ago

Resolution: --- → INCOMPLETE

satch

Comment 30

7 years ago

I don't know if I'm experiencing the same bug as others on this ticket.

Thunderbird 52.6.0 (64-bit) on CentOS 7.4
IMAP mail account running on server with Dovecot

1. Start with folder containing 5,000 messages
2. Select most messages (all but 8)
3. Push the DELETE button

Expected response:
progress bar in bottom bar
decrease in message count in bottom bar
no alerts

Observed response:
No change in bottom bar
Dialog box: "A script on this page may be busy, or it may have stopped responding. You can stop the script now, or you can continue to see if the script will complete. Script: chrome://messenger/content/folderPane.js:2113"
[ ] Don't ask me again (Stop script) (Continue)

The situation is repeatable.

When I press (Continue) the dialog does NOT disappear, but instead sits there. After a delay, the dialog will disappear and immediately re-appear. Repeat about 10 times.

When I press (Stop script) Thunderbird is in an unstable state; I have to quit and restart in order to be able to work with the folder pane again.

A case where TB appears to have split the list up into several smaller lists 17 years ago Brian Awood 11.39 KB, text/plain		Details
A case where TB sent a range, when UIDs where not sequential 17 years ago Brian Awood 16.50 KB, text/plain		Details