Closed Bug 476074 Opened 15 years ago Closed 13 years ago

deleting or moving large numbers of messages at once very slow with high cpu usage

Categories

(Thunderbird :: Folder and Message Lists, defect)

defect
Not set
major

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 296453

People

(Reporter: erase, Unassigned)

Details

(Keywords: perf)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1b2) Gecko/20081201 Firefox/3.1b2
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1b3pre) Gecko/20090129 Shredder/3.0b2pre

selecting a large number of messages in an IMAP folder (ie, 350+) then deleting them or moving them causes the application to hang for several seconds while using a large amount of cpu.

Reproducible: Always

Steps to Reproduce:
1. fill IMAP folder with 500 messages
2. select all messages in folder, choose delete
Actual Results:  
the application hangs until the operation is complete, and uses an extrodinary amount of cpu. attempting to navigate away to another element in the application results in no action, the user must wait - if the wait is long enough, an unresponsive script notification pops up.

Expected Results:  
more of move operation should happen in the background. perhaps a quick first step to update the internal folder contents, then return application control to the user while the move completes in the background.
Sounds to be Bug 296453 or combination of Bug 296453 and Bug 452221(spin-off of a problem in initial Bug 296453)

"deleting or moving" by Drag&Drop? Or problem occurs even when "delete" via menu or delete key? Deep/long thread of mails exists? Problem occurs on any mail folder?

> if the wait is long enough, an unresponsive script notification pops up.

I saw a bug which reported similar phenomenon when delete of mails, but I can't reach it again.
occurs on any mail folder. deleting via the delete key or moving via drag&drop produce the same behavior.

there were some threads, the most any of them had was about 30 messages. i'll generate a mailbox with a bunch of random email to eliminate the threaded possibility and report back.

the unresponsive script notifications always pop up as:
chrome://messenger/content/folderPane.js:276
and chrome://messenger/content/folderPane.js:999
tried with 500 and 1000 randomly generated non-threaded pieces of mail.

it's still slow, and deleting 1000 seemed more than twice as slow as 500. this test was against a quicker mailserver and it generally seemed faster than the slower mailserver to delete these messages - but that shouldn't matter, should it? (ie, the delete should be happening in the background, so it shouldn't matter how fast or slow the imap server is)
Because "delete by Drag&Drop" is not relevant to your problem, problem of Bug 296453 doesn't seem to be relevant to your problem.

"Mail delete" is consists of; (if IMAP, when "Move to Trash model")
  (1) Copy mails to Trash (if IMAP, UID uid COPY "Trash")
  (2) Indicates the mails are deleted (if IMAP, UID uid flag \Deleted)
Sorry but I don't know (1)/(2) is executed for each mail, or for some block of mails, or at once for whole mails to be deleted.
And, if IMAP, notification by IDLE can occur simultaneously while deletion of mails is in progress.
These will make "time takes to delete multiple mails" non-linear to number of mails to delete, if IMAP folder.
Get IMAP protocol log and check what flow really occurs during your delete of IMAP mails. See Bug 402793 Comment #1 for getting log.
If Bug 86396 will be fixed, get IMAP log with timestamp, using Trunk builds. You can probably get time related data by timestamp of NSPR log.   

Do you use "Move to Trash model" for deletion of mails in IMAP folder?
If yes, will other than "Move to Trash model" alter situation?
Do you enable IDLE support?
If yes, will disabling of "IDLE support" alter situation?

Note: As seen in Bug 452221, "CPU 100%" for a while and UI hang itself(during .msf update after deletion) can not be avoided currently. O(num_of_mails**2) time for Shift+Delete(nearly equals to time to take .msf update after deletion) is merely observed easily if relatively long thread(with test case of Bug 452221).
yes, "Move to Trash model" is enabled.

tbird is moving all messages / setting flags in a single imap operation and not an operation per message (i've confirmed this)

from tracing the imap process on the server side, i can see a lag between hitting delete and the imap command being sent through - it's likely the unavoidable wait time dealing with the .msf update you mentioned.

i don't see a way to disable IDLE support in tbird? i turned off automatic checking for new messages, but IDLE commands are still being sent.

there are actually 3 steps it's doing on a move to trash - before the two you mentioned it sets the \Seen flag on a message:
16:22:54 read(0, "17 uid store 12253 +Flags (\\Seen)\r\n", 3811) = 35
16:22:54 read(0, "19 uid copy 12253 \"Trash\"\r\n", 3776) = 27
16:22:55 read(0, "20 uid store 12253 +FLAGS (\\Deleted \\Seen)\r\n", 3749) = 44

this particular message had already been seen, so resetting the \Seen flag was somewhat pointless. had the message been unseen previously, obviously it makes sense - so that it gets copied into the trash. with a large number of messages (all already marked seen), the operation to remark them all seen took a not-insignificant amount of time - if tbird only did the seen step for messages that actually needed doing, it could reduce 10-20% of the time spent waiting for the imap server.

i'm still a little confused as to why a large delete operation blocks the entire application. since the delete operation is happening in a thread, shouldn't i be able to work in other folders while the delete operation continues? iirc, this is what happened with tbird 2 - a large delete would render the folder unusable until the operation completed, but other folders would still be usable (and the general UI wouldn't be blocked).
(In reply to comment #5)
> i don't see a way to disable IDLE support in tbird?

Via Advanced button in Account Settings/Server Settings.

> a large delete would render the folder unusable until the operation completed,
> but other folders would still be usable (and the general UI wouldn't be blocked).

Following part in bug summary of Bug 452221 is for the phenomenon.
> "CPU 100% by the Shift+Delete" locks UI while delete operation
I guess;
 - "final .msf update after delete" is CPU bound task.
 - "final .msf update after delete" is executed under UI thread for a window
   at where delete operation of many mails is executed.
 - If "final .msf update after delete" takes long, UI freeze is exposed to user.
(In reply to comment #5)
> 16:22:54 read(0, "17 uid store 12253 +Flags (\\Seen)\r\n", 3811) = 35
> 16:22:54 read(0, "19 uid copy 12253 \"Trash\"\r\n", 3776) = 27
> 16:22:55 read(0, "20 uid store 12253 +FLAGS (\\Deleted \\Seen)\r\n", 3749) = 44

Double "+Flags \Seen" is possibly a result of change by Bug 211439. (See also Bug 475911)
> Bug 211439 Delete should mark as read (also on POP and IMAP when offline)
Component: General → Folder and Message Lists
Keywords: perf
QA Contact: general → folders-message-lists
Severity: normal → major
Brendan, did you see this issue in version 2?
Summary: moving large numbers of messages at once very slow with high cpu usage → deleting or moving large numbers of messages at once very slow with high cpu usage
hm. it's been quite a while. i don't think it was as pronounced in version 2 as it is in 3... iirc it was that took longer after i switched that made me notice. i feel like i tested it in both to make sure it was a version 3 bug, but i can't remember well enough to definitively say so for sure.
I am the one that filed the duplicate issue 568807.  I don't ever recall seeing this in TB2.
OS: Linux → All
Hardware: x86 → All
Brendan, do you still see this when using version 3.1.7 or newer?
Whiteboard: [closeme 2011-02-21]
Just retested this on my side. Had a folder (mailinglist) which contained around 72000 messages. Marked the oldest 10k ones and hit the "del"-key. I assumed high memory-usage which I think I've seen in the past - but that didn't occur now (or not anymore?). CPU went to 100% on one core/process for around 2:30min which was not blazingly fast, but I think "quite okay" for the huge job it had to carry out. So to me it looks like something in this area might have improved meanwhile.

Tested on Thunderbird 3.1.7, Fedora 14.
Hmm, shortly afterwards (and after showing the reduced number of messages in the folder-tree-view) Thunderbird again went into a 100-percept-cpu-loop - maybe due to repacking the index or so? It's still working after a few minutes and doesn't react to task-switches or so (UI doesn't display, completely blocked).
Stefan, do you have global indexing turned on? Or desktop search integration? And was this an imap or local folder?
Global indexing is off (explicitly - to avoid even worse scenarios). And I also don't have a desktop-search (like Nepomuk-framework or so) running here.
Stefan, imap or local messsages? and if IMAP, is it an imap server running locally on your Fedora box?
sorry, i missed that detail. Remote IMAP (on server in the datacenter, not local network)
i'm getting the same results as stefan. tried with Message Synchronization + Global Indexing, only Global Indexing, and without either. i was deleting 1000 message chunks at a time

in all cases the UI becomes unresponsive for several seconds after hitting the delete button, then it comes back to life and the "moving messages to trash" indicator comes up.
Stefan, deletes are done in two steps - we do the initial delete as an offline operations, and delete the headers from the UI. Then, half a second later, we playback the offline operation, which sends the command to the server. It sounds like the second phase is blocking the UI, for some reason.
Well, if those are the two phases I faced, then the first phase took those 2:30min I mentioned and the second one quite some time longer. I can't say how long the server might have taken (I know it maintains an index on the server as well etc.) but from your reply I read that the second phase should be a "background-job"?
(In reply to comment #21)
> Well, if those are the two phases I faced, then the first phase took those
> 2:30min I mentioned and the second one quite some time longer. I can't say how
> long the server might have taken (I know it maintains an index on the server as
> well etc.) but from your reply I read that the second phase should be a
> "background-job"?

Yes, it is. I guess there's a phase in-between, which is any piece of code that's listening for delete events gets sent a notification, and does whatever it wants to do for that delete event. That can include updating saved searches on that folder, updating counts in the folder pane, global index updating, desktop search updating, etc (I realize you have the latter two turned off). We also do commit the .msf file for the folder you deleted messages from but I can't imagine that takes long.
Whiteboard: [closeme 2011-02-21] → [dupeme?]
For me the undercover behavior of this ticket is the same of bug #296453 

=> Duplicate
Status: UNCONFIRMED → RESOLVED
Closed: 13 years ago
Resolution: --- → DUPLICATE
Whiteboard: [dupeme?]
You need to log in before you can comment on or make changes to this bug.