617839 - Switching to threads with a large number of emails is extremely slow

Reporter

Description

•

15 years ago

I use the Threads with Unread view to read my bugmail. The volume of bugmail that I receive on some bugs (such as the randomorange bug) is extremely large. I experience an about 30 seconds of delay whenever I press n to go to the next message, and the next message is in that thread. I haven't profiled it, but I should be able to do so if you need me to. This is completely reproducible.

WADA:World Anti-bad-Duping Agency

Comment 1

•

15 years ago

Next issues are known in "long or deep threads" case on real mail folder. (1) Rebuild-Index takes long. (2) Bug 452221 : Update of .msf after delete mails takes long. (3) Switching to threaded view takes long. If "many threads" condition is added, situation becomes worse. Problem seems to exist also in mail viewing at threaded view. IMAP? POP3 or Local Folders? What folder view? All Folders? Unified Folders? Or other? How many mails? How many threads? What is number of mails in a thread on average? How many unread mails on average? How many threads which contain unread mails on average? What is number of unread mails in a thread on average? Does problem occur with "All Mails" view? Does problem occur with explicitely defined Virtual Folder? Saved Search Folder for bugmail folder. Filter=unread mail.

(no longer active)

Reporter

Comment 2

•

15 years ago

(In reply to comment #1) > Next issues are known in "long or deep threads" case on real mail folder. > (1) Rebuild-Index takes long. > (2) Bug 452221 : Update of .msf after delete mails takes long. > (3) Switching to threaded view takes long. > If "many threads" condition is added, situation becomes worse. > > Problem seems to exist also in mail viewing at threaded view. Let's focus on this single issue in this bug and get other bugs for the rest of the issues. > IMAP? POP3 or Local Folders? IMAP. > What folder view? All Folders? Unified Folders? Or other? I'm using Unified Folders, but that doesn't make any difference. The perf problem is with the message list view. > How many mails? How many threads? What is number of mails in a thread on > average? Total number of emails right now is 149,411. I don't know how many threads but it's in the order of thousands. The average mail per thread number has no significance. This problem is not related to the number of threads shown, it's related to the number of emails in that single large thread. That number is 2,841 right now. > How many unread mails on average? How many threads which contain unread mails > on average? What is number of unread mails in a thread on average? I usually have something between 50-400 unread messages in this folder. Most of the thread has one or a handful at maximum of unread messages. But the number of unread emails in the thread in question is not significant. > Does problem occur with "All Mails" view? It happens in all of the views! > Does problem occur with explicitely defined Virtual Folder? > Saved Search Folder for bugmail folder. Filter=unread mail. I haven't tried that.

(no longer active)

Reporter

Comment 3

•

15 years ago

OK, I grabbed a shark profile of the entire time we take in expanding that thread. Here's the profile: http://people.mozilla.org/~eakhgari/tb-threadperf.mshark. Here's the summary: we spend 57% of our time in nsMsgThreadEnumerator::MsgKeyFirstChildIndex, and the rest in nsMsgThread::GetChildHdrAt. The first block boils down to 14%, 13% and 11% in morkTable::GetTableRow, morkTableRowCursor::NextRow and morkObject::Release, respectively. The second block boils down to 9%, 9% and 7% in morkTableRowCursor::NextRow, morkTable::GetTableRowCursor and morkObject::Release, respectively. Looking at the bottom-up profile, we spend a total of 10% of our time in morkTableRowCursor::NextRow, 8.7% in nsMsgDatabase::CreateMsgHdr, 8% in morkRow::AcquireRowObject, and about 10% of time allocating in orkinHeap::Alloc and freeing in morkObject::Release.

WADA:World Anti-bad-Duping Agency

Comment 4

•

15 years ago

(In reply to comment #2) > Let's focus on this single issue in this bug and get other bugs for the rest of the issues. All my questions is for this your single issue. And, I believe that, for analysis of this kind of problem, understanding about "why required time is far longer than O(number of mails in a thread) in other than your this single issue". > > What folder view? All Folders? Unified Folders? Or other? > I'm using Unified Folders, but that doesn't make any difference. >(snip) > This problem is not related to the number of threads shown, > it's related to the number of emails in that single large thread. > That number is 2,841 right now. > > Does problem occur with "All Mails" view? > It happens in all of the views! Virtual Folder or not(All Folders or Unified Folder, View:All or View:Unread, ...) sounds irrelevant. It seems only "Threaded View + long thread" is required condition. As I wrote in bug 452221 comment #4, bug 452221 was observed with large(long) but not so huge thread(250 to 2000 mails in a thread). As I tested with mails of same Subject:/no Reply-To:,Reference: etc., it's similar to your bugmails. And, IIRC, when I saw next phenomenon during test of that bug, I used "View:All" and I opened threaded view with "thread is expanded" status and all mails was Read status. > (3) Switching to threaded view takes long. (Real folder as test for that bug. Rebuild-index was always needed for test.) (In reply to comment #3) > the entire time we take in expanding that thread. When you enter in threaded view, is thread collapsed initially? If so, when you enter in threaded view with thread expanded, is threaded view shown quickly? Additional question. "Order of mails in a thread" seems affected currently by sort column, Date column or Order Received column. It may affect check results or measured time. Do you use Date column only? By the way, if problem occurs with simple "All Folders" folder view and View:All, please remove not-mandatory conditions such as Unified Folders, Unread view from your checking to make it simple.

WADA:World Anti-bad-Duping Agency

Comment 5

•

15 years ago

Woops. I wanted to say understanding about "why ... in other ... issue" is required.

(no longer active)

Reporter

Comment 6

•

15 years ago

(In reply to comment #4) > When you enter in threaded view, is thread collapsed initially? Yes. > Additional question. > "Order of mails in a thread" seems affected currently by sort column, Date > column or Order Received column. It may affect check results or measured time. > Do you use Date column only? I have Subject, User (coming from an extension) and Date visible. But the sorting is irrelevant. > By the way, if problem occurs with simple "All Folders" folder view and > View:All, please remove not-mandatory conditions such as Unified Folders, > Unread view from your checking to make it simple. I'm not sure what you mean here. I believe that given the STR in comment 0 and the profile I posted, we should have enough information to diagnose and fix this.

WADA:World Anti-bad-Duping Agency

Comment 7

•

15 years ago

Attached file Mail folder file, 2000 mails per a thread, one thread in mail folder — Details

This is same test case I attached to bug 452221 comment #4. All mails(2000 mails) has same subject. First mail : Subject: xxx All other mails : Subject: Re: xxx No In-Reply-To:, No References:. All mails is Read status. Order of mails in unix mbox : normal order. oldest is top, newest is bottom. Note: Original case was : All mails has "Subject: Re: xxx". As Tb 3.1 did not show as a thread with original test case, Subject of first mail was changed from "Subject: Re: xxx" to "Subject: xxx". Switching to threaded view(expanded status) takes sufficiently long if "2000 mails per a thread". Delay is observed with "1000 mails per a thread" too, but delay is not so large. Delay is also observed with "500 mails per thread", but it's small. This is the "(3) Switching to threaded view takes long" in my comment #1 which I saw during test of bug 452221. As I saw such report in other bugs, I didn't open separate bug for it. Similar time lag was observed with "collapse then re-expand by -/+ at threaded view". I didn't check it during test of bug 452221. I checked it after I read your bug. If collapsed status at threaded view, time lag was observed upon change back to non-threaded view too.

Wayne Mery (:wsmwk)

Comment 8

•

15 years ago

so dependent on bug 452221? If so, it has the start of a patch, so there may be hope :) This has more to do with thread navigation than the reader UI, perhaps should move to Folders and lists

Severity: normal → major

Wayne Mery (:wsmwk)

Comment 9

•

15 years ago

Ehsan, is speed/slowness the same if you click thread and it's not collapsed? (If it is the same, then bug 452221.)

Depends on: 452221

(no longer active)

Reporter

Comment 10

•

15 years ago

(In reply to comment #9) > Ehsan, is speed/slowness the same if you click thread and it's not collapsed? Do you mean if I just expand the thread, will I experience the same slow-down? Yes, I can.

Wayne Mery (:wsmwk)

Updated

•

15 years ago

Component: Message Reader UI → Folder and Message Lists

OS: Mac OS X → All

QA Contact: message-reader → folders-message-lists

(no longer active)

Reporter

Comment 11

•

14 years ago

Is there anything anyone can do for this bug? I'm estimating that this bug is wasting 5 minutes per average day for me. ;-)

David :Bienvenu

Assignee

Comment 12

•

14 years ago

Ehsan, this bug is really about how slow it is to expand a very large thread, right? It doesn't really have to do with unread navigation or threads with unread; it merely has to do expanding very large threads. If so, the reason for this is the the view code that expands threads does it in a recursive way to handle thread nesting, and has to build up a lot of state in each call. This could be sped up a lot, though the greater the speedup, the greater the chance for regressions. In other words, I could redo the algorithm so it's essentially linear, or I could just speed up the existing algorithm. Just to be clear, you're talking about threads with unread view, not simply the "unread" view? In the former, you'll see read and unread messages; in the latter, it's only unread messages. In general, I think some of your folders are way too big even for linear algorithms, essentially voiding the warranty, but this is a case where our algorithm is n-squared. And you're trying to mitigate the size of your folder by using the unread view, so I'm a bit sympathetic. I can try to come up with a patch and generate a try server build for you...

(no longer active)

Reporter

Comment 13

•

14 years ago

(In reply to comment #12) > Ehsan, this bug is really about how slow it is to expand a very large > thread, right? It doesn't really have to do with unread navigation or > threads with unread; it merely has to do expanding very large threads. Yes, it's only a problem with really slow expanding of a thread with a large number of messages. It has nothing to do with the state of the messages in the thread. > If > so, the reason for this is the the view code that expands threads does it in > a recursive way to handle thread nesting, and has to build up a lot of state > in each call. This could be sped up a lot, though the greater the speedup, > the greater the chance for regressions. In other words, I could redo the > algorithm so it's essentially linear, or I could just speed up the existing > algorithm. I should also note that the thread in question is 1-level deep, if it makes any difference. Can't we use tests to ensure that the fix here doesn't cause regressions? (BTW, see comment 3 where I profiled this) > Just to be clear, you're talking about threads with unread view, not simply > the "unread" view? In the former, you'll see read and unread messages; in > the latter, it's only unread messages. I'm talking about threads with unread view, but it's not relevant. Honestly, the only relevant comments in this bug are comment 0, comment 3 and this one. :/ > In general, I think some of your folders are way too big even for linear > algorithms, essentially voiding the warranty, but this is a case where our > algorithm is n-squared. And you're trying to mitigate the size of your > folder by using the unread view, so I'm a bit sympathetic. I can try to come > up with a patch and generate a try server build for you... That would be amazing. I also copied the mail folder in question on a USB disk and gave it to Blake. Here's my permission for you or anyone else to use it if you need to (it's my bugmail folder; there's a bunch of security bugs there, but aside from that, nothing personal in there).

Whiteboard: [see comments 3 and 13]

David :Bienvenu

Assignee

Comment 14

•

14 years ago

(In reply to comment #13) > > I should also note that the thread in question is 1-level deep, if it makes > any difference. It does, actually. > > Can't we use tests to ensure that the fix here doesn't cause regressions? Tests test what tests test :-) The existing tests can ensure that we don't break obvious stuff, but the edge cases worry me. > > That would be amazing. I also copied the mail folder in question on a USB > disk and gave it to Blake. Here's my permission for you or anyone else to > use it if you need to (it's my bugmail folder; there's a bunch of security > bugs there, but aside from that, nothing personal in there). I can probably simulate the bugmail case here :-)

David :Bienvenu

Assignee

Comment 15

•

14 years ago

A few comments - the single level thread case (i.e., one parent, a lot of replies at the same level) is typically pretty much linear. I tried a 443 message bugzilla thread, and it's pretty much instantaneous in release builds on my windows box. The shark profile is missing; it would be interesting to see that. It might also be useful to copy that thread to a new folder, and see if you see the same slowdown with that folder. If so, you could e-mail me the folder. We spend a lot of time figuring out if a message is in a killed sub-thread. We could optimize the case of no sub-threads being killed by simply setting a flag on the thread if nothing is killed.

(no longer active)

Reporter

Comment 16

•

14 years ago

You should now be able to access the shark profile. I'll give copying the thread to another folder a shot tomorrow, but any reason you don't want to use the existing folder I gave to Blake?

David :Bienvenu

Assignee

Comment 17

•

14 years ago

(In reply to comment #16) > You should now be able to access the shark profile. I'll give copying the > thread to another folder a shot tomorrow, but any reason you don't want to > use the existing folder I gave to Blake? Blake is in Toronto and I'm in SF so a USB key doesn't do me much good ;-) I do see the part of the algorithm that's n-squared, and it does explain the hot spots you saw in the shark profile. So I don't think I need the folder at this point. But fixing it will be a bit of a challenge, mainly due to our use of polymorphism and depth-first recursion.

David :Bienvenu

Assignee

Comment 18

•

14 years ago

Attached patch reduce memory allocations (obsolete) — Details — Splinter Review

This removes a redundant method on nsIMsgThread, in favor of one that's a bit more efficient, and changes the code that expands a thread to temporarily increase the db hdr cache to avoid creating and destroying the same headers over and over again. Ehsan, I'll try to get a try server build going with this patch so you can try it out.

Assignee: nobody → dbienvenu

Attachment #540257 - Flags: review?(neil)

neil@parkwaycc.co.uk

Comment 19

•

14 years ago

Comment on attachment 540257 [details] [diff] [review] reduce memory allocations > if (aIndex >= m_keys.Length()) > return NS_MSG_MESSAGE_NOT_FOUND; This should really bail out for negative indexes too. If you don't want to change the signature of GetChildHdrAt, you could cast aIndex to PRUint32. > PRUint32 childIndex = m_keys.IndexOf(msgKey); >- return (childIndex != kNotFound) ? GetChildAt(childIndex, aResult) : NS_MSG_MESSAGE_NOT_FOUND; >+ return (childIndex != kNotFound) ? >+ GetChildHdrAt(childIndex, aResult) : NS_MSG_MESSAGE_NOT_FOUND; ... and then you could simplify this to always call GetChildHdrAt because kNotFound would then be caught by the check. > if (aIndex >= m_folders.Count()) > return NS_MSG_MESSAGE_NOT_FOUND; Ditto. >- NS_ERROR("how did we get in the wrong thread?"); >+// NS_ERROR("how did we get in the wrong thread?"); ??? > if (aIndex >= (PRInt32) m_keys.Length()) > return NS_MSG_MESSAGE_NOT_FOUND; Ditto. >+ if (aIndex > (PRInt32) m_numChildren) >+ return NS_MSG_MESSAGE_NOT_FOUND; Ditto. Also, why is this one > when the others are >= ?

David :Bienvenu

Assignee

Comment 20

•

14 years ago

Attached patch fix addressing comments — Details — Splinter Review

I think this is what you had in mind - haven't had a chance to run the tests with this patch, though.

Attachment #540257 - Attachment is obsolete: true

Attachment #540266 - Flags: review?(neil)

Attachment #540257 - Flags: review?(neil)

neil@parkwaycc.co.uk

Comment 21

•

14 years ago

Comment on attachment 540266 [details] [diff] [review] fix addressing comments >+NS_IMETHODIMP nsMsgXFViewThread::GetChildHdrAt(PRInt32 aIndex, nsIMsgDBHdr **aResult) > { > if (aIndex >= (PRInt32) m_keys.Length()) > return NS_MSG_MESSAGE_NOT_FOUND; [Missed one?] >+ if (aIndex < 0 || aIndex >= (PRInt32) m_numChildren) [Actually I was suggesting PRUint32(aIndex) >= m_numChildren]

neil@parkwaycc.co.uk

Comment 22

•

14 years ago

Comment on attachment 540266 [details] [diff] [review] fix addressing comments r=me with that last check fixed.

Attachment #540266 - Flags: review?(neil) → review+

David :Bienvenu

Assignee

Comment 23

•

14 years ago

http://hg.mozilla.org/comm-central/rev/b7a5437581ef pushed to trunk. Ehsan, if you want to try tomorrow's nightly, and let me know if this is significantly better, that would be helpful.

(no longer active)

Reporter

Comment 24

•

14 years ago

(In reply to comment #23) > http://hg.mozilla.org/comm-central/rev/b7a5437581ef pushed to trunk. Ehsan, if > you want to try tomorrow's nightly, and let me know if this is significantly > better, that would be helpful. It is *a lot* faster! There is a noticeable delay (my guess is around 500ms) but not long enough for Thunderbird to beachball. Thanks a lot for the fix!

David :Bienvenu

Assignee

Comment 25

•

14 years ago

great, thx, I'm going to mark this fixed, then, because it sounds like it's sufficient, and is very safe.

Status: NEW → RESOLVED

Closed: 14 years ago

Resolution: --- → FIXED

David :Bienvenu

Assignee

Comment 26

•

14 years ago

not sure if this was in tb 5 or 6, but we don't have a plain old tb 5 tfv

Target Milestone: --- → Thunderbird 6.0

Mail folder file, 2000 mails per a thread, one thread in mail folder 15 years ago WADA:World Anti-bad-Duping Agency 1.95 MB, text/plain		Details
reduce memory allocations 14 years ago David :Bienvenu 15.15 KB, patch		Details \| Diff \| Splinter Review
fix addressing comments 14 years ago David :Bienvenu 10.16 KB, patch	neil : review+	Details \| Diff \| Splinter Review