Closed Bug 296453 Opened 19 years ago Closed 11 years ago

Racing CPU, slow performance moving/deleting/dragging large number of messages

Categories

(Thunderbird :: Mail Window Front End, defect)

defect
Not set
major

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 538378
Thunderbird 3.0b3

People

(Reporter: meroca, Assigned: Bienvenu)

References

(Blocks 1 open bug)

Details

(Keywords: perf, Whiteboard: [bulkoperations][imap AND local])

Attachments

(2 files, 6 obsolete files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.5) Gecko/20041217
Build Identifier: Thunderbird v. 1.0.2 (20050317)

When moving or deleting a large number of messages (6000-ish), the CPU spikes at
100% and hangs there for a long time before you even get GUI feedback.

This is mostly client end.  The IMAP move/delete barely taxes the IMAP server
running UW-IMAP, but brings my 2.8Ghz CPU on the Windows 2000 PC to its knees.

Moving 6000 messages takes about 3-5 minutes using Outlook Express Client.  This
takes well over 30 minutes with Mozilla Mail and Thunderbird.

Reproducible: Always

Steps to Reproduce:
1. Connect to any IMAP server
2. Move or delete 6000 messages
3. Check CPU usage and take a long break.

Actual Results:  
30+ minutes go by using 100% CPU without progress meter or GUI feedback.

Expected Results:  
Should take much less time and not use 100% CPU.

This task uses less than a second of process time on the IMAP server.
This is an automated message, with ID "auto-resolve01".

This bug has had no comments for a long time. Statistically, we have found that
bug reports that have not been confirmed by a second user after three months are
highly unlikely to be the source of a fix to the code.

While your input is very important to us, our resources are limited and so we
are asking for your help in focussing our efforts. If you can still reproduce
this problem in the latest version of the product (see below for how to obtain a
copy) or, for feature requests, if it's not present in the latest version and
you still believe we should implement it, please visit the URL of this bug
(given at the top of this mail) and add a comment to that effect, giving more
reproduction information if you have it.

If it is not a problem any longer, you need take no action. If this bug is not
changed in any way in the next two weeks, it will be automatically resolved.
Thank you for your help in this matter.

The latest beta releases can be obtained from:
Firefox:     http://www.mozilla.org/projects/firefox/
Thunderbird: http://www.mozilla.org/products/thunderbird/releases/1.5beta1.html
Seamonkey:   http://www.mozilla.org/projects/seamonkey/
This issue is still present in the current 1.5Beta1.

There are a variety of other bugs on the same general topic.  That is,
Thunderbird doesn't handle IMAP folders with large numbers of messages well.  It
takes orders of magnitude longer to process large numbers of messages in
Thunderbird than it does with other IMAP clients (Outlook Express).
Version: unspecified → 1.5
I will confirm this, as I have had this same problem, only this happens to me with less than 6000 messages - currently I'm trying to view and/or delete about 4000 and Thunderbird appears to be hosed.  It also acts slow with around 2000 messages.  Even 1000 is slow, though it doesn't appear to hose the application.  Perhaps it's the implementation of the LDAP server?

This is a major problem for me.  I'm using the latest 1.5 release and I often have to delete a few thousand emails from my inbox, and it's a very very painful process.  It seems to me that there is some inefficiency in the delete code that makes the algorithm O(n^2) or thereabouts.  Deleting 10 messages is quick, but deleting 100 messages seems to take more than 10 times as long.  The lack of GUI feedback is also a problem; it looks as though Thunderbird has locked up.
I think it might be related to "duplicate" subjects or senders.

I have a spam folder, moving and deleting emails in this folder is no problem (I regularly move 2000 emails at the same times from that box without any issue).

I have another folder which gets bounced messages. In this one, deleting just 100 emails uses all the CPU and 300MB+ of memory (instead of the usual 70-100MB). I don't know for how long, because I usually kill the app before it finishes (if it can finish!). Last time I tried, I waited 5 minutes before killing the app. Deleting ~30 emails sometimes takes 30s.

Because they are bounced messages, most emails have similar headers (subject: "Mail delivery failed: returning message to sender", sender: "Mail Delivery Subsystem <MAILER-DAEMON@cyb10.orb.com>").

For people deleting large amounts of email (like the reporter), the number of possible duplicates of subject/sender/<something else?> increases, slowing the operation accordingly.
I am using Thunderbird 1.5.0.7

I can confirm that moving/copying in Thunderbird with IMAP is completely bugged! I have huge, painful difficulties moving or copying messages!

It's the same problem for me; I'm handling around 800-2000 emails...

There are even some local folders of 80 emails that I have been trying to copy to the server for MONTHS; it's just impossible! It always crashes in the middle.

It depends a lot on the connection and on the servers; some work better. It seems that if the connection and the server are not absolutely perfect and high-speed, Thunderbird cannot work.

The worst problem with Thunderbird and IMAP is CERTAINLY the lack of any control or information on what's happening!

Copying 1000 messages takes something like one or two hours, but in Thunderbird you don't know at all if it is progressing, blocked, or if it failed.

Worse, I sometimes almost lost emails, because it copied 1942 messages instead of 1949, I had to check EVERY email manually to find which 5 were missing...

Maybe with more info on what's happening it would be easier to debug all this
Still having the same problem with TB version 2.0pre (20070304)
A few thousand mails and the performance is gone: CPU at 100% and no "normal" response from the TB client!
QA Contact: general
version 2.0.0.4 (20070604) on Windows

same issue.  1318 emails to delete from IMAP folder, hit control A to select them all, then delete (or shift-delete) and it sits there thinking about it, then 100% cpu, and (Not Responding) on the toolbar and time, time, time goes by before anything happens (usually a ctrl-shift-esc & end-task).

Shouldn't be a network lag issue as my IMAP server is on 100mb LAN connection in same office.

perhaps a visual progress indicator would be nice as well as add a process that can monitor lag or whatnot.
What does it take to mark this bug as "confirmed"?  I either have to deal with an out of control racing CPU or use Outlook Express to delete/move messages.
Severity: normal → major
OS: Windows 2000 → Windows XP
Version: 1.5 → 2.0
A slow response from the server usually makes TB "panic", and in most such cases it doesn't work well.
Thunderbird 2.0.0.9 (20071031) on either Windows or Linux.

I also find deletion of emails in Thunderbird extremely slow. My particular setup is a TLS IMAP connection to a Dovecot server.

I configure Thunderbird to mark messages as deleted, so when I delete 1000-odd messages, this just marks said messages as deleted on the server. This process isn't too painful and completes in a reasonable time, without taxing the client CPU.

The expunge takes an extraordinary amount of time, but doesn't tax the client CPU as in other reports here. (It also doesn't tax the server CPU)

The same operation in other email clients, e.g. MS Outlook, completes quickly (as in a few seconds, versus a few minutes in TB).
This has gotten significantly worse in the last release or two, happening even when deleting only a couple of hundred messages at a time (the previous workaround). Thunderbird is now effectively useless for me unless I clear the messages with another mail client.
I've been having this problem as well, just encountered it while running a debug build, and noticed it spit this out on the console right about the time the CPU usage went up:

WARNING: Dropping timer event because thread is dead, file /Users/dave/Source/MozillaBuild/mozilla/xpcom/threads/nsTimerImpl.cpp, line 510
Status: UNCONFIRMED → NEW
Ever confirmed: true
that was a build from the MOZILLA_1_8_BRANCH with today's source.
Assignee: mscott → nobody
OS: Windows XP → All
Hardware: PC → All
This problem (which one must admit is not well defined since its cause is unknown) predates version 2.0. So if your problem only _started_ in 2.0 then you may be in the wrong bug.

Dave, perhaps not related, but bug 253955 cites "Dropping timer event because thread is dead".
Keywords: perf, qawanted
Version: 2.0 → unspecified
Importing flags and assignee from the dupe.  There's a bunch of debugging work on the duped bug worthy of reference.
Assignee: nobody → bugmil.ebirol
Flags: blocking-thunderbird3?
xref bug 218075
not sure if relevant but from irc "dmose  stuart write the timer code initially"
I'm taking this bug.

There's a pecobro trace at http://clicky.visophyte.org/examples/pecobro/20080505-01/index.xml if anyone wants to see the side-effects of the underlying issue.  (Warning: You probably want to be using Firefox 3 and be okay if it stresses Firefox enough to crash it...)
Assignee: bugmil.ebirol → bugmail
So, the problem observed in the pecobro trace was an O(n^2) problem due to the deletion happening in exactly the wrong sequence.  Regardless of whether the message view is threaded, messages are threaded under the hood.  This wouldn't be quite so bad except that Thunderbird, by default, threads based on the subject.  (See my notes on threading that cover the default and preferences one can use at: http://wiki.mozilla.org/MailNews:Message_Threading).  Get a few thousand messages with the same subject, and they all get to become part of the same thread.  There's a good chance if you're deleting all the messages with the same subject that the first message of the thread will also be the root of the thread... you delete him, and every one of his children gets re-parented, resulting in notifications for each of them.  The new parent is the next one in line for deletion, so the process repeats itself with just one fewer message to notify on.
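
To make the quadratic behaviour concrete, here is a small Python sketch of the notification count. It is only a toy model, not Thunderbird code: the thread is assumed to be a single chain of survivors where the oldest is the root, and deleting the root is assumed to emit one notification per surviving message. The function name is hypothetical.

```python
def count_reparent_notifications(n, order):
    """Count re-parent notifications while deleting an n-message thread.

    Toy model (an assumption, simplified from the description above):
    alive[0] is the current thread root. Deleting the root re-parents
    every survivor and emits one notification per survivor. Deleting a
    non-root message emits nothing.
    """
    alive = list(range(n))
    notifications = 0
    for msg in order:
        if msg == alive[0]:
            alive.pop(0)
            notifications += len(alive)
        else:
            alive.remove(msg)
    return notifications


# Root-first deletion: (n-1) + (n-2) + ... + 0 notifications.
forward = count_reparent_notifications(1000, range(1000))
# Reverse order: the root goes last, so nothing is ever re-parented.
backward = count_reparent_notifications(1000, reversed(range(1000)))
print(forward, backward)  # prints: 499500 0
```

Under this model, 1000 messages deleted root-first cost n(n-1)/2 = 499500 notifications, while the same deletion in reverse order costs none.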

The solution to the problem is to delete the messages in an order that eliminates spurious re-parenting and notifications.  The quick hack approach to this would just be to delete things in reverse order; this would work much of the time because the message list/tree view and standard usage patterns tend to work in our favor.  However, it still leaves the potential for pathological cases/inconsistent performance, and so isn't a good solution.

This patch implements an O(n) topological-sort (preferring to minimize space rather than time, constant-wise) to ensure that we always delete children before their parents.  In local-folder testing, this netted a 7x-8x speed-up.  I was able to delete 11240 messages in 0:55 (min:secs) instead of 6:30, and 4496 messages in 0:08 instead of 1:06.  I want to say the IMAP speed-up was even more considerable, but I broke my IMAP testing account and so don't have solid numbers on that.  (It also isn't clear why IMAP would have different performance characteristics for the CPU-bound portion.)
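
For readers who want the ordering idea without reading the patch, here is a rough Python sketch. It is not the patch's algorithm: it swaps in a simpler technique (compute each message's depth in its thread, then bucket by depth, deepest first), and it uses a dictionary for memoization rather than minimizing space. The `parent_of` mapping and the function name are hypothetical, and the thread structure is assumed to be cycle-free.

```python
def children_first_order(selected, parent_of):
    """Order `selected` message keys so children come before parents.

    `parent_of` maps a message key to its parent key, or None for a
    thread root (hypothetical input shape). A child is always deeper
    than its parent, so emitting deepest-first is a valid order.
    """
    depth = {}
    for key in selected:
        # Walk up until we hit a root or a message whose depth is known.
        chain = []
        cur = key
        while cur is not None and cur not in depth:
            chain.append(cur)
            cur = parent_of.get(cur)
        base = depth[cur] if cur is not None else -1
        # Assign depths top-down along the chain we just walked.
        for node in reversed(chain):
            base += 1
            depth[node] = base

    # Bucket by depth (a counting sort, so no comparison sort is needed).
    max_depth = max((depth[k] for k in selected), default=-1)
    buckets = [[] for _ in range(max_depth + 1)]
    for key in selected:
        buckets[depth[key]].append(key)
    return [k for bucket in reversed(buckets) for k in bucket]


thread = {1: None, 2: 1, 3: 1, 4: 2}
print(children_first_order([1, 2, 3, 4], thread))  # prints: [4, 2, 3, 1]
```

Each message is visited a constant number of times thanks to the memoized depths, so the sketch stays linear in the number of messages touched.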

The logic in the patch was tested by prototyping things in Python and throwing many threading permutations against it.  It should be good, but I would like to add some form of unit test.

The bad news is that it looks like there is another problem somewhere that results in the ridiculous memory usage and potentially worse performance.  Using an (un-patched) stock nightly, I was able to observe two distinct modes of operation.  The first was that which I reported above; similar (reasonable) memory usage (ex: growth of 10 MiB) to my patched version, but 7-8x slower.  The second was an explosion of memory usage (ex: growth of 310 MiB) and performance I would give up on and kill thunderbird.  The second case seems to reliably happen the second time I try and delete a whole bunch of messages from the nightly, but not the first time.  (It could have to do with whether I copied the messages into the test folder that session, or in a prior session... which could in turn affect the threading in the folder, given the poor quality of my test messages; they are all clones.)

I am going to try and figure out what is going on in the second case before trying to get this onto the trunk.  Whatever is causing the memory consumption could potentially still occur with the patch.  (Although there's a good chance it's an outgrowth of the re-parenting/excessive notifications affecting some data structures.  In that case, the patch would avoid triggering the problem, but not correct the underlying issue.)
Okay, figured it out.  It turns out my pecobro trace was from case 2 (really really bad performance and memory gobbling).  Case 1 is actually without any listeners aggravating things, and so is just showing us the O(n^2) worst-case scenario without javascript entering the equation.

mail-folder-bindings.xml provides the XBL binding that gives us the "Move To" and "Copy To" menus that we love.  (Apparently recently XBLified on bug 413781, commit landed 2008/04/10.)  It adds itself as an nsIFolderListener to nsIMsgMailSession so that it can hear about changes to the folder structure that would invalidate its menu (additions/removals) or change the display of menu items (property changes, renames).

The fundamental problem with this is that the binding is fired for message-related events too, not just the folder events it cares about.  As far as I can see, there is no way to receive only events where the subject is a folder (rather than a message).  Given that we would expect at least an order of magnitude difference in terms of the number of events generated by messages and by folders, it seems like it would be appropriate to have such a mechanism.

There are two bugs in the implementation of mail-folder-bindings.xml that make things much worse.

The first bug is that the listener calls QueryInterface(C.i.nsIMsgFolder) on the passed-in item without using instanceof first.  This generates an exception for every message.  This is the cause of the memory explosion observed.

The second bug is that the listener adding/removal is inconsistent and can result in the listener being added multiple times (and possibly never removed; not sure if that is just a leak or would cause a crash).  (Although nsMsgMailSession implements operator== which makes duplicate-prevention possible, it uses AppendElement instead of AppendElementUnlessExists, so duplicates can happen.)  I feel very sorry for anyone who did a lot of folder re-naming/adding/removal and then tried to delete several thousand messages in the same thread...
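
As a rough illustration of the two fixes (not the actual XBL or C++ code), the Python sketch below uses hypothetical stand-in classes. The listener does a cheap type test before treating an item as a folder, so message events are ignored without raising, and the session only appends a listener that is not already registered.

```python
class Folder:
    """Stand-in for a folder item."""


class Message:
    """Stand-in for a message item."""


class MailSession:
    """Hypothetical session that fans events out to its listeners."""

    def __init__(self):
        self.listeners = []

    def add_listener(self, listener):
        # Append only if absent, so repeated registration cannot stack up.
        if listener not in self.listeners:
            self.listeners.append(listener)

    def notify_removed(self, item):
        for listener in list(self.listeners):
            listener.on_item_removed(item)


class FolderMenuListener:
    """Hypothetical listener that only cares about folder events."""

    def __init__(self):
        self.rebuilds = 0

    def on_item_removed(self, item):
        # Type test first: no exception is raised for the common
        # (message) case, unlike an unconditional cast that fails.
        if not isinstance(item, Folder):
            return
        self.rebuilds += 1


session = MailSession()
listener = FolderMenuListener()
session.add_listener(listener)
session.add_listener(listener)  # second registration is a no-op
for _ in range(5000):
    session.notify_removed(Message())
session.notify_removed(Folder())
print(len(session.listeners), listener.rebuilds)  # prints: 1 1
```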

The attached patch resolves those outright bugs, but it still may be appropriate to try and fix the world so that the listener only is triggered for folder events in the first place.
Attachment #320678 - Flags: review?(bugzilla)
(In reply to comment #21)
> The first bug is that the listener calls QueryInterface(C.i.nsIMsgFolder) on
> the passed-in item without using instanceof first.
Actually you don't need to call both as in the success case they both have the same side-effect. If it's never supposed to fail then use QueryInterface, but if as here you don't know then use instanceof.
(In reply to comment #21)
> The second bug is that the listener adding/removal is inconsistent and can
> result in the listener being added multiple times (and possibly never removed;
> not sure if that is just a leak or would cause a crash).  (Although
> nsMsgMailSession implements operator== which makes duplicate-prevention
> possible, it uses AppendElement instead of AppendElementUnlessExists, so
> duplicates can happen.)  I feel very sorry for anyone who did a lot of folder
> re-naming/adding/removal and then tried to delete several thousand messages in
> the same thread...

Are we confident that the Add/Removes will be balanced for menuitems/non-menuitems? In the short term I'm happy for this, but we should fix this comment at least:

623          - @note _ensureInitialized can be called repeatedly without issue, so
624          -       don't worry about it here.

I think it may be worth filing a bug on AppendElementUnlessExists, I almost changed it when I changed how nsIMsgMailSession handles its folder but was persuaded to leave it the same by Neil & David (Bienvenu).

> The attached patch resolves those outright bugs, but it still may be
> appropriate to try and fix the world so that the listener only is triggered for
> folder events in the first place.
> 
Please ensure there's a bug filed on that even if you don't fix it.
> Are we confident that the Add/Removes will be balanced for
> menuitems/non-menuitems? In the short term I'm happy for this, but we should
> fix this comment at least:
>
> 623          - @note _ensureInitialized can be called repeatedly without issue, so
> 624          -       don't worry about it here.

The note is still true.  _ensureInitialized can be called willy nilly because it uses "this._initialized" to guard against multiple-initialization.  The problem is that the _teardown method cleared "this._initialized" but did not remove the listener.
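
A minimal Python sketch of that failure mode, using hypothetical class names and a registry that (like the AppendElement behaviour mentioned earlier in the thread) does not check for duplicates. The `remove_on_teardown` switch exists only to compare the buggy and fixed behaviour side by side.

```python
class Session:
    """Hypothetical listener registry with no duplicate check."""

    def __init__(self):
        self.listeners = []

    def add_listener(self, listener):
        self.listeners.append(listener)

    def remove_listener(self, listener):
        self.listeners.remove(listener)


class MenuBinding:
    def __init__(self, session, remove_on_teardown):
        self._session = session
        self._remove_on_teardown = remove_on_teardown
        self._initialized = False

    def ensure_initialized(self):
        # Safe to call repeatedly: the flag guards re-initialization.
        if self._initialized:
            return
        self._session.add_listener(self)
        self._initialized = True

    def teardown(self):
        # Guard so the body runs at most once per initialization.
        if not self._initialized:
            return
        if self._remove_on_teardown:
            self._session.remove_listener(self)
        self._initialized = False


def listeners_after_rebuilds(remove_on_teardown, rebuilds=3):
    session = Session()
    binding = MenuBinding(session, remove_on_teardown)
    binding.ensure_initialized()
    binding.ensure_initialized()  # harmless, guarded by the flag
    for _ in range(rebuilds):
        binding.teardown()
        binding.ensure_initialized()
    return len(session.listeners)


print(listeners_after_rebuilds(remove_on_teardown=False))  # prints: 4
print(listeners_after_rebuilds(remove_on_teardown=True))   # prints: 1
```

With the flag cleared but the listener left in place, every teardown/initialize cycle leaves one more copy registered, and each copy then receives every message event.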

Status: NEW → ASSIGNED
Comment on attachment 320686 [details] [diff] [review]
[checked in] v2 fix bugs in mail-folder-bindings.xml, eliminate redundant QI

(In reply to comment #25)
> The note is still true.  _ensureInitialized can be called willy nilly because
> it uses "this._initialized" to guard against multiple-initialization.  The
> problem is that the _teardown method cleared "this._initialized" but did not
> remove the listener.

Ok, so I missed that. I'd be happy with this patch, except I've just realised this is in mail/ where I haven't got review privs. Suggest you try philor or dmose?
Attachment #320686 - Flags: review?(bugzilla)
Attachment #320686 - Flags: review?(dmose)
possible dupes:
bug 375584 - Problem moving or copying using drag and drop to nested imap 
414166 - Apparent memory leak when deleting large numbers of emails 
315691 - Excessive memory usage when deleting old messages from large folder 
365838 - Deleting large number of mails in local mail subfolder hangs program with high memory usage 
Bug 398684 – Virtual memory size of Thunderbird increases 40MB after each "Shift+Delete of all 40,000 mails and Compact folder"

?? a cousin: Bug 218075 – Compact of IMAP folder with lots of deleted message slow with IMAP delete model
I kinda suspect that the topo-sort bug is going to want fixing too, since mail-folder-bindings.xml didn't exist when this bug was filed.
Comment on attachment 320686 [details] [diff] [review]
[checked in] v2 fix bugs in mail-folder-bindings.xml, eliminate redundant QI

Looks good; r=dmose.
Attachment #320686 - Flags: review?(dmose) → review+
Comment on attachment 320686 [details] [diff] [review]
[checked in] v2 fix bugs in mail-folder-bindings.xml, eliminate redundant QI

Checking in mail/base/content/mail-folder-bindings.xml;
/cvsroot/mozilla/mail/base/content/mail-folder-bindings.xml,v  <--  mail-folder-bindings.xml
new revision: 1.4; previous revision: 1.3
Attachment #320686 - Attachment description: v2 fix bugs in mail-folder-bindings.xml, eliminate redundant QI → [checked in] v2 fix bugs in mail-folder-bindings.xml, eliminate redundant QI
"v2 fix bugs in mail-folder-bindings.xml" allowed _teardown's body to be executed multiple times per actual initialization.  This results in an assertion triggering, although no 'real' ill effects result.
Attachment #323216 - Flags: review?(philringnalda)
Attachment #323216 - Attachment is obsolete: true
Attachment #323216 - Flags: review?(philringnalda)
I had totally missed the use of _teardown by _build with clones.  Thanks to sid0 for assistance in testing/sanity-checking.
Attachment #323220 - Flags: review?(philringnalda)
Comment on attachment 323220 [details] [diff] [review]
v2 fix _teardown (to handle children clone creation and general case)

r=philringnalda
Attachment #323220 - Flags: review?(philringnalda) → review+
check-in of v2 fix _teardown needed, doesn't fix the bug.
Keywords: qawanted → checkin-needed
patch de-bitrotted (there was a change within the range of context).  carrying r=philor forward.
Attachment #323220 - Attachment is obsolete: true
Attachment #324938 - Flags: review+
Comment on attachment 324938 [details] [diff] [review]
[checked in] v2.1 fix _teardown, bitrot fix (minor change in context)

Checking in mail/base/content/mail-folder-bindings.xml;
/cvsroot/mozilla/mail/base/content/mail-folder-bindings.xml,v  <--  mail-folder-bindings.xml
new revision: 1.6; previous revision: 1.5
done
Attachment #324938 - Attachment description: v2.1 fix _teardown, bitrot fix (minor change in context) → [checked in] v2.1 fix _teardown, bitrot fix (minor change in context)
Do you have an SMP system (Dual Core, Quad Core, etc.)?

Then try to bind thunderbird to one CPU:

taskset -pc 0 <PID of /usr/lib/thunderbird/thunderbird-bin>

Then check again whether Thunderbird still hangs.

For me it seems that it solves the problem that thunderbird hangs often (multiple times a day).
Since I have bound it to one CPU, it is running some days without any problems. 

I hope that this makes it easier to find the bug.
possible dup or related bug 426367
Marking as blocking, since this is a really painful failure mode.  Andrew, what needs to happen to the topo-sorting patch before it can land?
Flags: blocking-thunderbird3? → blocking-thunderbird3+
Priority: -- → P2
Target Milestone: --- → Thunderbird 3.0b2
Patch "v1 topo-sort; avoid pathological removal order by using a topological sort" from asuth no longer applied to trunk, was trivial to unbitrot, attached. Not tested yet.

FYI, I also see this bug and it royally messes up my inbox, because I can't move mails (mostly spam) anymore.
Attachment #320521 - Attachment is obsolete: true
(In reply to comment #40)
> Marking as blocking, since this is a really painful failure mode.  Andrew, what
> needs to happen to the topo-sorting patch before it can land?

A unit test (and review).  I'm currently writing unit tests for gloda, so this should be able to get done pretty soon.
Some testing by BenB suggests that even adding the topo-sort fix only addresses part of the problem; I'll leave him to add detailed numbers.  In particular, it sounds like there are still backend performance issues (which could perhaps be dealt with in this bug, or spun off to another) as well as drag-n-drop code issues (which sounds like it needs a spin-off bug).  
One of the things that David mentioned might be going on in the backend is that an nsMsgDBHeader object is being created for every message.
Yes, with and without the patch, drag&drop completely kills TB and RAM.

Running with the patch, *and* avoiding drag&drop by using menu Message | Move | <server> | <folder>, I can at least get a few thousand msgs (7,300 msgs from a 200,000 msg folder to a 120,000 msg folder) moved. I am still unable to move 20,000 msgs without going over 2 GB RAM.

Also, since the 7,000 msg move, TB is CPU- and RAM-hungry on startup. 2 minutes with 100% CPU and 700 MB RAM, just for the first window to appear, and again the same (another 2 minutes, and 1.3 GB total) after I click on another folder and back on inbox.

When I then try to move msgs, RAM increases from there.
Attachment #335105 - Flags: review?(dmose)
What I saw when testing attachment 333196 [details] [diff] [review] on bug 243631 was a huge increase in cpu usage when scrolling the view. It appeared that each scroll was causing some of the nsMsgDBView functions to be called repeatedly, and as it was looking up items in (a large) address book it was just killing performance.

If we're doing that on scrolling, I could see something similar affecting us here.
I think the patch in bug 243631 was causing painting to be really slow, since every paint required a lookup in the address book, and scrolling causes a lot of painting. 
Blocks: 387998
It surely seems to me that trunk builds are considerably worse than the build from a month ago when it comes to CPU and RAM usage (see my last comment) even at startup (not msg move) - 2 min and 1 GB for startup are new even for me. I filed bug 451957 about it.
Given the existence of bug 452221, let's deal with the topo-sort piece there, and use this one for other backend work.
(In reply to comment #46)
> Yes, with and without the patch, drag&drop completely kills TB and RAM.

So, I looked into the difference between drag&drop and just using "move to/copy to."  The details follow, but I think it suffices to say that we are doing drag-and-drop wrong.  So very, very wrong.  The drag-and-drop machinery does not appear like it was intended to pass around thousands of items at a time.  We gain nothing from exploding the messages from just "hey, just use the given view's selection" to "here is each and every message the user selected in URI form" as our URIs are gibberish to anyone except the active thunderbird profile anyways.

From inspection, it appears that the use of drag&drop introduces the extra cost of:

1) Getting the URI for every selected message.

2) Storing the URIs in a nsDOMDataTransfer object.  Each URI is stored in its own nsTArray inside a containing nsTArray.  The outer array starts from a zero size, goes to 1, and then doubles every time.  Except for the logarithmic realloc-ing, this isn't so bad.  (It's just realloc, no operator= involved.)  The URI string is converted to an nsIVariant.

3) nsDOMDataTransfer::GetTransferables creates an nsITransferable for each of those URIs, converts the URI string (now an nsIVariant) to a WString and stuffs it into a newly created nsISupportsString.  Each nsTransferable owns an nsVoidArray which ends up with 1 DataStruct which holds the aforementioned string and a nsCAutoString (notable for having in-line storage) for the flavor of "text/x-moz-message".  Amusingly, the creation of this duplicate data does not result in the destruction of the original data.

4a) For each item, dragSession.getData is called.  This copies the data from the transferable corresponding to the item into the passed-in transferable.  The data is then extracted from the transferable, resulting in the flavor string being copied, not to mention the XPConnect overhead from the caller having to pass in 3 new objects to receive the 3 out parameters.

4b) Each URI for each selected message is converted back into a header.  This involves mapping to the relevant message service, having the service locate the folder (which may involve using the RDF service), and then retrieving the header based on the key.  While this is happening, none of the memory from previous steps is collected!  So at the conclusion of this loop, we will have all of the nsIMsgDBHdr instances plus both copies of the drag-and-drop data!

They both end up using the message copy service.  And as hypothesized by dmose's comment #45, this does involve retrieving a nsIMsgDBHdr for each message in the move-to/copy-to case as well.

I have a modest proposal.  The modest proposal is that we gut the current drag-and-drop in favor of just conveying the window whose view's selection should be moved.  (Or if we rule out cross-window dnd of this, we just need to know that we're talking about the view!)  We then call MsgMoveMessage/MsgCopyMessage, just like the move to/copy to menu would do.

Additionally, we modify nsMsgDBView::CopyMessages to take smaller bites; instead of translating all of the view indices to nsIMsgDBHdrs and passing them to the copy service, it can do 100-1000 at a time.  This would result in added complexity because the db view would need to listen to the copy service to know when the previous job completes and keep track of the state.  Thankfully, I understand that bienvenu has undertaken a rewrite of the class anyways and is not a violent person, so I figure we're good to go on that front.
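
The "smaller bites" idea could look roughly like the Python sketch below. It is a simplification under stated assumptions: `resolve_header` and `submit_batch` are hypothetical callables, and `submit_batch` is assumed to block until the copy job finishes, whereas the real view would have to wait for a copy-service completion callback between batches.

```python
from itertools import islice


def batches(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def copy_in_batches(view_indices, resolve_header, submit_batch, batch_size=500):
    """Resolve and submit headers one batch at a time.

    Only one batch of header objects is alive at any moment, instead of
    one object per selected message for the whole operation.
    """
    submitted = 0
    for batch in batches(view_indices, batch_size):
        headers = [resolve_header(index) for index in batch]
        submit_batch(headers)  # assumed synchronous in this sketch
        submitted += len(headers)
    return submitted


sizes = []
total = copy_in_batches(
    range(1250),
    resolve_header=lambda index: {"view_index": index},
    submit_batch=lambda headers: sizes.append(len(headers)),
)
print(total, sizes)  # prints: 1250 [500, 500, 250]
```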
Thanks, asuth, for looking into this. Funny indeed.
Ideally, the nsIMsgDBHdrs could also be removed. All we *really* need, in case of IMAP, is one command to the IMAP server, with the (I think) message numbers. So, all we need is the list of these numbers and source and target folder. The views can then be refreshed normally (we have to cope with this situation anyways, because the folder can change on the server anyways, due to other clients or new mail + server-side filters).
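
To illustrate how little data such a command needs, here is a Python sketch that compresses a list of UIDs into an IMAP sequence set and builds a single UID COPY line (COPY and sequence sets are defined in RFC 3501). The function names are hypothetical, the mailbox quoting is naive (real mailbox names need proper escaping and encoding), and this says nothing about how Thunderbird's IMAP code actually builds its commands.

```python
def to_sequence_set(uids):
    """Compress UIDs into an IMAP sequence set such as '1:3,5,7:8'."""
    ordered = sorted(set(uids))
    parts = []
    i = 0
    while i < len(ordered):
        start = end = ordered[i]
        # Extend the current run while the next UID is consecutive.
        while i + 1 < len(ordered) and ordered[i + 1] == end + 1:
            i += 1
            end = ordered[i]
        parts.append(str(start) if start == end else f"{start}:{end}")
        i += 1
    return ",".join(parts)


def uid_copy_command(tag, uids, mailbox):
    """Build one UID COPY command line for all the given UIDs."""
    if not uids:
        raise ValueError("need at least one UID")
    # Naive quoting: assumes the mailbox name contains no special characters.
    return f'{tag} UID COPY {to_sequence_set(uids)} "{mailbox}"'


print(to_sequence_set([1, 2, 3, 5, 7, 8]))           # prints: 1:3,5,7:8
print(uid_copy_command("A001", range(1, 6001), "Archive"))
```

A contiguous selection of 6000 messages collapses to the single range `1:6000`, so the command stays short no matter how many messages are selected.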
Comment on attachment 335105 [details] [diff] [review]
[moved to bug 452221] topo-sort, v3, avoid pathological removal order by using a topological sort

patch has found a new home on bug 452221.
Attachment #335105 - Attachment description: topo-sort, v3, avoid pathological removal order by using a topological sort → [moved to bug 452221] topo-sort, v3, avoid pathological removal order by using a topological sort
Attachment #335105 - Attachment is obsolete: true
Attachment #335105 - Flags: review?(dmose)
A few things to note - we'd like to support drag drop of messages to the desktop; I'm not sure what effect that would have on proposed solutions to this problem, but I doubt it's a simplification :-) Also, to Ben's comment, in cross-folder views (which may become the norm, if we go to smart folders), message keys are not sufficient. You need the message keys and the corresponding folders to identify each message. Also, I'm not rewriting the view classes; I haven't touched the move/copy code at all, thank goodness :-)

nsMsgSearchDBView is already a copy service listener, because when the user move/copies a selection that spans folders, we need to parcel those out into their respective folders, and we chain those copies.  But there's still an issue where we bungle selecting the next message, by prematurely trying to load a message before all the copies are done.
I'm going to move this to b2 - we're unlikely to redo drag drop in the next couple weeks, and it works OK for medium size selections.
Target Milestone: Thunderbird 3.0b1 → Thunderbird 3.0b2
I am not actively working this right now.  If anyone wants to pursue this immediately, please feel free.  Note: This is being tracked as a TB3 blocker and will not be forgotten about.
Assignee: bugmail → nobody
Status: ASSIGNED → NEW
Priority: P2 → --
Anyone interested in taking this?  It will have to bump off the b2 blocker list otherwise.
moving to b3 - assigning to myself, but not actively working on this, so contributions are welcome!
Assignee: nobody → bienvenu
Target Milestone: Thunderbird 3.0b2 → Thunderbird 3.0b3
Component: General → Message Compose Window
QA Contact: general → message-compose
Summary: With IMAP, racing CPU, Slow performance moving/deleting large number of messages → With IMAP, racing CPU, Slow performance moving/deleting/dragging large number of messages
actually, the new js folder pane has made this much much worse, because of all the notifications it receives in this situation, and every one crosses the js <=> c++ boundary. So long before the issues Andrew tracked down come into play, we slow down when you move/delete a thousand messages or so. I think batching is required in the notifications to fix this.
Component: Message Compose Window → Mail Window Front End
QA Contact: message-compose → front-end
bienvenu, with batching, you mean one notification every n (e.g. 1000) messages, or exactly one notification for the user action, all at once?
bug 236842 also appears to have a batching issue.
I mean simply the ability to pass multiple messages into the message notifications, instead of having to do multiple notifications if multiple messages are changed/moved/deleted, etc. E.g., nsIFolderListener::OnItemRemoved could have a flavor that takes an nsIArray of items removed.

http://mxr.mozilla.org/comm-central/source/mailnews/base/public/nsIFolderListener.idl#57

nsIMsgFolderListener was written with this in mind - it allows arrays to get passed into the notifications:

http://mxr.mozilla.org/comm-central/source/mailnews/base/public/nsIMsgFolderListener.idl#56

A single user action can cause multiple messages to change in ways that can't easily be described with a single notification.
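
To make the batching idea from the comments above concrete, here is a minimal sketch in Python. It is not Thunderbird code: `CountingListener`, `on_item_removed`, `on_items_removed`, `notify_per_item` and `notify_batched` are hypothetical names, only loosely modelled on the suggestion that nsIFolderListener::OnItemRemoved could have a flavor taking an array. It only shows how the number of listener calls (each of which would cross the js <=> c++ boundary) changes.

```python
from typing import List


class CountingListener:
    """Hypothetical listener that records how often it is called."""

    def __init__(self) -> None:
        self.calls = 0    # number of notifications received
        self.removed = 0  # total number of messages reported as removed

    def on_item_removed(self, item: str) -> None:
        # Per-item flavor: one call for every removed message.
        self.calls += 1
        self.removed += 1

    def on_items_removed(self, items: List[str]) -> None:
        # Batched flavor: one call carrying the whole list.
        self.calls += 1
        self.removed += len(items)


def notify_per_item(listener: CountingListener, items: List[str]) -> None:
    """Send one notification per removed message."""
    for item in items:
        listener.on_item_removed(item)


def notify_batched(listener: CountingListener, items: List[str]) -> None:
    """Send a single notification for all removed messages."""
    if items:
        listener.on_items_removed(list(items))


if __name__ == "__main__":
    messages = [f"msg{i}" for i in range(5000)]

    per_item = CountingListener()
    notify_per_item(per_item, messages)

    batched = CountingListener()
    notify_batched(batched, messages)

    # Both listeners learn about the same messages, but with very
    # different numbers of calls.
    print(per_item.calls, per_item.removed)
    print(batched.calls, batched.removed)
```

As noted above, a single user action may still need more than one notification, so a real design would probably batch per notification type rather than collapse everything into one call.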
(In reply to comment #65)
> nsIMsgFolderListener was written with this in mind - it allows arrays to get
> passed into the notifications:
> 
> http://mxr.mozilla.org/comm-central/source/mailnews/base/public/nsIMsgFolderListener.idl#56

I haven't seen the code we're talking about, but can we switch to nsIMsgFolderListener, at least for the notifications it supports?
if bug 330448 indeed is a dupe of this bug, 'with imap' should be removed, because the same problem also is happening with local storage.
if there is another bug with imap that makes even _viewing_ slow, then these are separate issues.
I pulled some discussion on this just now and found this conclusion.  

Because the 'thread_without_re' pref is now set to false by default, this case should only be a problem for those who change the pref in their about:config.  With that known, the number of people affected becomes significantly smaller; therefore this should likely remain wanted for tb3, but not blocking.
Flags: wanted-thunderbird3+
Flags: blocking-thunderbird3-
Flags: blocking-thunderbird3+
I have not changed thread_without_re, and I am not drag/dropping, but when I select ~5000 messages in an IMAP folder and then select Message > Move To > [another IMAP folder on the same server], Thunderbird eats CPU and freezes until an Unresponsive Script warning dialog appears citing chrome://messenger/content/folderPane.js:1011.

If I press Continue in that dialog, I eventually get additional ones until the move is complete, which has taken three and four dialogs the two times I've tried it.

Am I experiencing this bug, bug 476074, or some other bug?
You are experiencing the pain of doing 5000 calls from c++ to js via a folder listener so that the folder pane can update counts. We have a bug about doing batching for this but I don't know the number off the top of my head.
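
A listener-side variant of the same idea is to coalesce the count changes and update the folder pane once per folder. The following Python sketch is only an illustration under that assumption; `CountCoalescer`, `record`, `flush` and the `update_ui` callback are hypothetical names, not Thunderbird APIs.

```python
from collections import defaultdict
from typing import Callable, Dict


class CountCoalescer:
    """Accumulates per-folder count changes and applies them in one pass."""

    def __init__(self, update_ui: Callable[[str, int], None]) -> None:
        self._update_ui = update_ui
        self._pending: Dict[str, int] = defaultdict(int)

    def record(self, folder: str, delta: int) -> None:
        # Cheap bookkeeping only; no UI work happens here.
        self._pending[folder] += delta

    def flush(self) -> int:
        """Apply the accumulated changes; returns the number of UI updates."""
        calls = 0
        for folder, delta in self._pending.items():
            if delta != 0:
                self._update_ui(folder, delta)
                calls += 1
        self._pending.clear()
        return calls


if __name__ == "__main__":
    updates = []
    coalescer = CountCoalescer(lambda folder, delta: updates.append((folder, delta)))

    # Moving 5000 messages from one folder to another.
    for _ in range(5000):
        coalescer.record("Inbox", -1)
        coalescer.record("Archive", +1)

    print(coalescer.flush(), updates)
```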
Adding "Not IMAP-only-issue" in bug summary to avoid confusion, because DUP'ed bugs are not only for IMAP folder case.
Please change the summary to a shorter, more appropriate one that is suitable for searching and for understanding the problem.
Summary: With IMAP, racing CPU, Slow performance moving/deleting/dragging large number of messages → With IMAP, racing CPU, Slow performance moving/deleting/dragging large number of messages (Not IMAP-only issue. Occurs on local mail folder too.)
Summary: With IMAP, racing CPU, Slow performance moving/deleting/dragging large number of messages (Not IMAP-only issue. Occurs on local mail folder too.) → Racing CPU, slow performance moving/deleting/dragging large number of messages
FWIW: I'm getting frozen UI for ~5 seconds on compacting just one marked-as-deleted message in my IMAP inbox (but oddly no slow-down compacting in sub-folders). I searched bugzilla, but only found this bug that came closest.

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.4pre) Gecko/20090829 Lightning/1.0pre Shredder/3.0b4pre
Whiteboard: Not IMAP-only issue. local mail folder too.
i'm encountering serious pain trying to copy 42 messages with attachments from a Local Folders MBOX subfolders to Dovecot 1.2.9 IMAP server over VPN. drag n dropping the folder doesn't seem to work at all. context menu Copy/Move To results in TB shooting up to 100% in CPU usage and then either nothing gets actually copied or a seemingly random small number of (8) messages get copied. 

either way, expected result of having the same messages in the imap folder is very far from being achieved. same result happens consistently after fresh TB start.
details left out of comment 75:
Windows XP Pro
OpenVPN, about 60kB/sec upload bw
TB 3.0.4

i should perhaps mention that CPU gets killed to the point where the network connections such as vpn and remote desktop get timed out.

so yeah... this led to another discovery. couldnt take the dropped remote desktop connection anymore, zipped the TB profile folder up from the Windows machine, downloaded it to my LAN (re dovecot server) OS X machine.

opened the profile in OS X thunderbird, everything works without a single hitch. i can drag n drop MBOX folders -> IMAP server, i can use Move To, it doesn't matter what the message count is, everything from 42 to 2442 messages were moved without a single problem.

just testing the same thing on another XP Home laptop over VPN, seeing exactly the same thing as before: 695 messages in folder, 22 get copied, then CPU shoots up for a while, then comes down, then "Move To (MBOX) -> IMAP folder" command does not do anything at all, most likely until thunderbird restart.

i have no idea what to make of this. any ideas how to debug in more detail? i'm a bit green when it comes to sampling, profiling, whatnot running windows processes.
I'm hitting this on Windows and Linux (single core 32-bit, and dual core 64-bit, respectively):

Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.10pre) Gecko/20100422 Lightning/1.0b1 Shredder/3.0.5pre

I have about 4,000 "duplicate" messages (only differing by timestamp) which I created on my IMAP server while trying to test another bug.

The 'hang' occurs after selecting all (CTRL+A) and trying to drag them.
the "it's worse in version 3" from comment 62 is Bug 506509
Whiteboard: Not IMAP-only issue. local mail folder too. → [bulkoperations][imap AND local]
I believe this is related.  I have (well... had, before circumventing the Thunderbird delete bug with Mulberry) a folder of about 1707 messages (zone minder reports) and attempted to delete them.  Each has an attachment.  They never delete, even when the operation is run overnight.

Thunderbird 6.0.2 on Windows 7 Ultimate SP1, client 8GB RAM.
Steps:
Select folder (with a lot of messages - 1700 tested)
"ctrl a" (select all messages)
"del" (delete all selected)

Result:
Indefinite wait. 
No messages deleted from the selected folder
However, copies are made in the trash folder.

Detail:
It appears that copies are made of each of the files in inbox.trash, but the operation times out, then starts over, which creates new copies.  Ultimately this will fill up /usr on the server and cause a server-side crash.  After I turned on logging:  

 set NSPR_LOG_MODULES=all:5

Log file fills (e.g. 141MB of log in 10 minutes) with:

 8304[aab33c0]: ReadNextLine [stream=832cce8 nb=62 needmore=0]
 8304[aab33c0]: aab1800:mail.myserver.com:S-INBOX.Security:CreateNewLineFromSocket: PyEATE0m4m8AuJnJpQDEDAFQwmFkxN5MANw0cigAbho5ERf00BiCBfAC0B0C

This entry is repeated 311,240 times in the log file.  (created in between 11:46:22 and 11:56:25 - that is almost exactly 10 minutes).
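
For anyone who wants to check a log the same way, here is a minimal Python sketch. It assumes a plain-text log with one entry per line; the helper names `most_repeated` and `entries_per_second` are made up for illustration, and this is not how the figure above was obtained. Applied to the numbers above (311,240 entries between 11:46:22 and 11:56:25, i.e. 603 seconds), the rate works out to roughly 516 entries per second.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, List, Tuple


def most_repeated(lines: Iterable[str], top: int = 3) -> List[Tuple[str, int]]:
    """Return the most frequent non-empty lines with their counts."""
    counts = Counter(line.strip() for line in lines if line.strip())
    return counts.most_common(top)


def entries_per_second(count: int, start: str, end: str,
                       fmt: str = "%H:%M:%S") -> float:
    """Average number of entries per second between two same-day timestamps."""
    elapsed = (datetime.strptime(end, fmt)
               - datetime.strptime(start, fmt)).total_seconds()
    if elapsed <= 0:
        raise ValueError("end must be later than start")
    return count / elapsed


if __name__ == "__main__":
    # Figures taken from the comment above.
    rate = entries_per_second(311240, "11:46:22", "11:56:25")
    print(f"about {rate:.0f} entries per second")

    # To find the repeated entries in an actual log (the path is a placeholder):
    # with open("nspr.log", errors="replace") as log:
    #     for text, n in most_repeated(log):
    #         print(n, text)
```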

I can make the log file available for debug.

maillog on the server shows nothing too out of the ordinary

 Sep  8 11:56:36 servername imapd-ssl: DISCONNECTED, user=username, ip=[ip.add.re.ss], headers=503688, body=8313498, rcvd=5407, sent=9300938, time=417, starttls=1
 Sep  8 11:56:36 servername imapd-ssl: LOGOUT, user=username, ip=[ip.add.re.ss], headers=0, body=0, rcvd=1130, sent=908858, time=502, starttls=1
 Sep  8 11:56:36 servername imapd-ssl: LOGOUT, user=username, ip=[ip.add.re.ss], headers=0, body=0, rcvd=355, sent=44379, time=605, starttls=1
 Sep  8 11:56:36 servername imapd-ssl: LOGOUT, user=username, ip=[ip.add.re.ss], headers=0, body=0, rcvd=128, sent=620, time=508, starttls=1
 Sep  8 11:56:36 servername imapd-ssl: LOGOUT, user=username, ip=[ip.add.re.ss], headers=0, body=25144, rcvd=3651, sent=131177, time=596, starttls=1
 Sep  8 12:00:46 servername imapd-ssl: LOGIN, user=username, ip=[ip.add.re.ss], port=[57612], protocol=IMAP
 Sep  8 12:00:46 servername imapd-ssl: LOGIN, user=username, ip=[ip.add.re.ss], port=[57611], protocol=IMAP
 Sep  8 12:00:56 servername imapd-ssl: LOGIN, user=username, ip=[ip.add.re.ss], port=[57633], protocol=IMAP
 Sep  8 12:01:19 servername imapd-ssl: LOGIN, user=username, ip=[ip.add.re.ss], port=[57636], protocol=IMAP
 Sep  8 12:02:23 servername imapd-ssl: LOGIN, user=username, ip=[ip.add.re.ss], port=[57641], protocol=IMAP

I cron-d top per minute to a log file: peak imapd (courier-imap-4.5.0,2) utilization
   PID USERNAME     THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU   COMMAND
 72792 user          1  106    0  3148K  1712K  CPU2  2   1:18 67.04%     imapd

I also recorded System Monitor Log.blg for the thunderbird.exe process (available if useful); it shows thunderbird thread count jumps to 40 and sticks and averages 36.  I/O ops/sec averages 5,188 and peaks at 96,301 about every 90 seconds, which correlates with "126.563%" of processor time.  IO bytes per sec runs hard (peak 187 mbyte/sec over gigabit lan) for the first 2 minutes from launch, but drops to about zero after that (that is, the "delete" operation is not moving much data).  Memory usage maxes out at 412MB of pool nonpaged bytes.  This does not seem to max out either processor or memory, but the operation hangs and never completes.

Deleting the messages one at a time works as expected.  The message count in the source folder is decremented by each delete, the message count in trash is incremented by one count each delete.

With the Mulberry mail client:

Open mailbox takes about 2 seconds (listing all remaining messages accurately, 1680 after individual delete tests)
select all takes imperceptible amount of time 
"delete" takes imperceptible amount of time (mulberry overlines messages to indicate marked for delete)
"expunge" takes < 1 second, probably time to rewrite the folder view as blank.

Top on the server did not capture a spike in IMAPD (peak about 6% CPU during operation, though 1 second updates might have missed the request).

Thunderbird then almost immediately shows the folder as empty.

That is, there is something seriously broken about Thunderbird's delete function that is not broken in Mulberry.
my [Bug 610131] Moving a large message or multiple messages between IMAP folders causes TB to hang/crash, results in duplicate messages  
has been marked as a duplicate of this bug.

I would like to add information about our problem:
In our case all the problems with the duplicates and Thunderbird hangs disappeared after we replaced the external eSATA Buffalo RAID system with a simple internal hard disk in our Ubuntu 8.04 LTS dovecot-postfix mail server.

In some linux newsgroups there was a problem reported with linux eSata-Drivers, but I don't know any details.

Since that change no more overload situations within the mail server or the TB-Clients happened.
So I draw the following conclusion: it was the combination of Thunderbird and the eSATA RAID in the server that caused this behaviour. No other mail client caused these duplicates; other mail clients got relatively slow when moving mails, but finished the way they should. Thunderbird must be relying on some parameter in the server that was not reliable. Maybe this helps to reproduce this strange behaviour.
I saw that with Cyrus, server on fast LAN, used only by me. Neither client nor server uses eSATA disks.
The problem also persists in TB 11.0.

I'm testing TB with an IMAP srv.
I have a folder "a" with 10,000 mails. I copy all mails to the empty folder "b".
TB put a core of the CPU to 100% for about a mins (substantially no load
on IMAP srv).

If I repeat the copy (having in "b" the first 10,000 mails) after about 5min
the CPU is already at 100%.

Please note that during this long period NO commands are sent to IMAP srv.
Roberto, you're seeing Bug 538378, which has a patch and is just waiting for code review.
Are you sure, David? Bug 538378 only talks about a memory leak, nothing about consuming 100% CPU...
(In reply to Charles from comment #92)
> Are you sure, David? Bug 538378 only talks about a memory leak, nothing about
> consuming 100% CPU...

yes.
(In reply to David :Bienvenu from comment #91)
> Roberto, you're seeing Bug 538378, which has a patch and is just waiting for
> code review.

Is there something like a "nightly build" with this patch that I could try?  If yes,
is it the "standard" nightly build?

TIA
(In reply to Roberto Colmegna from comment #94)
> (In reply to David :Bienvenu from comment #91)
> > Roberto, you're seeing Bug 538378, which has a patch and is just waiting for
> > code review.
> 
> > Is there something like a "nightly build" with this patch that I could try?  If
> > yes,
> > is it the "standard" nightly build?
> 
> TIA

Not yet - but it should come after this coming weekend. And you'll find it at http://ftp.mozilla.org/pub/mozilla.org/thunderbird/nightly/latest-comm-central/
I think it should be in today's build, actually.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → DUPLICATE
Hi David,

thanks.  I have tested the daily-build (v. 14.0a1) moving 10,000 mails 
from a folder to an empty one. 

TB remains a CPU hog, but memory usage is substantially reduced. 
The CPU constantly remains at 15% (on a 4-core Phenom) with write ops on 
the local SSD but without ops against the IMAP srv. 

This CPU usage persists for some minutes while the status bar reports 
"compacting folder...".

Please note that I have DISABLED the folders' "off-line usage", so the compact 
operation should terminate quickly.

IMPORTANT: after folder compaction terminates, the CPU remains at 15% without any I/O ops and without any msg on the TB status bar.

If I close and reopen TB, the CPU goes back to 15% a few seconds after restart.

TIA
Hi,

I have been trying for several days to merge 4 different mail accounts into one.
All of them are Thunderbird IMAP mail accounts (on google) and I use a new Windows 8 computer (i5) that I just installed.
The disc used is a SATA standalone disc.

What happens (as I understand it) is that when I move the mails they are first uploaded to IMAP and then downloaded into the new account.

This would be OK, even if it takes several days.
What is not OK is that the process gets interrupted all the time, and at the moment I am moving 20-mail chunks that take 10 minutes or more each.
The delay is connected with the size of the mails.
The bigger the mail, the harder it is to get it through.
I have mails from 1 KB to 30MB size. (Design data from customers).
If I sort them by size, I can move chunks of several hundred of the small sized mails. But when I arrive at 3MB mails the process does not complete even with one single mail.
The only way I found to do that is to export into eml files and import one by one.

As I said, what makes it terrible is that if the whole chunk is not processed, the action does not complete. So the moved mails remain still in the original folder, even if some files were successfully copied.
It would be nice that at least those files that are copied, when the move fails, disappear from the original folder.

This also happens for exported eml files. The process can be very long (it doesn't make it faster).

As I said, I could accept that it takes days, but not that I have to sit at the PC for several weeks and move files every 10 minutes.

Is there a way to avoid such waste of time?
Especially considering that Outlook works fine for such things? (Importing the 14GB mbox file took just one hour!)
Thanks!
Orazio, I assume you use the newest Thunderbird?

REOPENing, because the issue is still not fixed.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Yes. It is release 17.0.7, brand new installed on a freshly installed Windows 8 Pro PC.

Just to help with the reason:
When it interrupts the copy process I see for 2 seconds at the bottom right of the PC a message that the server connection was lost.
I am unable to read the whole text (it is about 3 lines and I read just the first one), but it is always the same.
So I am pretty sure that while it copies, it connects to the server from time to time to get some information (in addition to the copying notification, from time to time I see that it communicates with the server for other kinds of data!). When the reply from the server takes longer, the copy process is interrupted, and everything in the original folder stays the same.
(In reply to Orazio di Bella from comment #98)
> What is not OK is that the process gets interrupted all the time, and at the
> moment I am moving 20-mail chunks that take 10 minutes or more each.
> The delay is connected with the size of the mails.
> The bigger the mail, the harder it is to get it through.
> I have mails from 1 KB to 30MB size. (Design data from customers).
> If I sort them by size, I can move chunks of several hundred of the small
> sized mails. But when I arrive at 3MB mails the process does not complete
> even with one single mail.

This does not sound like the issue that this bug is about.
Please file a new bug https://bugzilla.mozilla.org/enter_bug.cgi?product=Thunderbird
Status: REOPENED → RESOLVED
Closed: 12 years ago → 11 years ago
Resolution: --- → DUPLICATE