Closed Bug 508978 Opened 11 years ago Closed 11 years ago

VirtualFolderChangeListener should defer notifications, only periodically commit

Categories

(MailNews Core :: Database, defect, P1)

defect

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 3.0rc1

People

(Reporter: asuth, Assigned: Bienvenu)

Details

(Whiteboard: [no l10n impact])

Attachments

(1 file, 1 obsolete file)

In an attempt to make my gmail All Mail folder more reasonable, I apparently deleted 49063 messages from my gmail account.  While this is one of the more hardcore pathological cases one could come up with, it does provide the opportunity to notice pain points.

For every single message of the 49063 messages, nsMsgDatabase::DeleteMessages is calling nsMsgDatabase::DeleteHeader with notify=true, which in turn results in a call to nsMsgDatabase::NotifyHdrDeletedAll.  Apparently, VirtualFolderChangeListener::OnHdrDeleted is one of the listeners because of the "Archives" smart folder.   He calls:
  m_virtualFolder->UpdateSummaryTotals(PR_TRUE); // force update from db.

He in turn likes to:
  if(oldTotalMessages != newTotalMessages)
    NotifyIntPropertyChanged(kTotalMessagesAtom, old, new);

which results in nsMsgMailSession::OnItemIntPropertyChanged who triggers folderPane.js's gFolderTreeView's OnItemIntPropertyChanged.  This results in a linear scan of the folder pane's rows and treeview row invalidation if it finds one.  This linear scan turns out to be pretty expensive.  While I think the folder pane needs to use nsIMsgFolderListener instead of nsIFolderListener, I don't think it's at fault here.  There is no way for it to get the notification it wants without the problem in this case.

VirtualFolderChangeListener should mark the virtual folder dirty and defer any form of notification until the batch completes.  This could be accomplished by marking itself dirty and posting an event to run on the event loop later, implicitly handling batches.  Or it could be accomplished by having the DB announcer explicitly announce the start/end of a batch via a new method or the new generic onEvent notification.

Given that this dude listens on gmail's All Mail folder for the archive virtual folder, I think this has to be considered blocking.
Flags: blocking-thunderbird3?
Forgot to mention that he does a Commit every time too.  So it should be doing a commit at the end of the batch instead and perhaps some interim modulo batches if it really wants.
Flags: blocking-thunderbird3? → blocking-thunderbird3+
Whiteboard: [no l10n impact]
Arbitrarily slotting this to beta 4.
Target Milestone: --- → Thunderbird 3.0b4
Target Milestone: Thunderbird 3.0b4 → Thunderbird 3.0rc1
We've got batching at the db level, and we have the ability to enable/disable notifications at the folder level. But apparently we aren't doing that in this case, because ::UpdateSummaryTotals would bail immediately...I'll need to look at the stack trace for this, once my build recovers from downloading a newsgroup with 90K msg hdrs...
Assignee: nobody → bienvenu
Priority: -- → P1
I'm liking the onEvent notification idea at the moment...
Status: NEW → ASSIGNED
Based on my sophisticated profiling technique, I suspect it's copying the offline stored messages that's really what's slowing us down here.
I think the offline store copy issue is worthy of its own bug - I can't tell from the initial report if the delete was a delete to trash or a shift delete. If the former, I think most of the time was spent copying the offline store messages to the trash.  Bug 519979 filed for that issue.

After spending a while implementing the onEvent notification idea, tweaking a bunch of code, and not getting much satisfaction because bug 519979 was dominating everything, I'm warming to the event posting idea because then I don't have to change all the call sites that loop over messages, just the listener, and I need to change the listener anyway. Maybe we'll end up doing both because other listeners might want to know this kind of stuff.
I have a small patch that implements the event posting idea, and it gets rid of all the extra VirtualFolderChangeListener events. However, it breaks one of the mailnews xpcshell tests, and we still have other change listeners sending a bunch of notifications. I'm a little reluctant to change the test to be aware that virtual folder count changes are now potentially async since other code (e.g., extensions) could easily make the same mistake. I'm a little surprised that the update event doesn't get run before the next js call into the backend. 

If I go with the explicit begin/end batch notifications, then I can make the count changes appear synchronous, at the cost of changing quite a bit more code.
Attached patch one approach (obsolete) — Splinter Review
this tweaks the one test that was broken, and rolls synchronous virtual folder changes into one slightly async change. 

I think I'd want to have some evidence that this makes a significant performance increase before getting reviews.
I could potentially test this, as I have many virtual folders (but not gmail on this machine).  However, I'm not sure I've seen this problem for a long time, at least for smallish numbers of messages.  What threshold do you suggest one starts seeing this?
I don't know, since  Bug 519979 really dominates the time it takes to do large deletes, and even with that fixed, the actual copying of the bytes in the offline store is still going to be the limiting factor, I bet.

This patch will still be helpful for things like portable thunderbird, since it does reduce the number of db commits we do.
Whiteboard: [no l10n impact] → [no l10n impact][has potential fix]
Comment on attachment 404110 [details] [diff] [review]
one approach

let's see what asuth thinks of this...
Attachment #404110 - Flags: review?(bugmail)
Comment on attachment 404110 [details] [diff] [review]
one approach

The strategy seems good.  A few notes:

- I don't think you ever set m_batchingEvents to true.
- I would suggest putting the logic that checks/modifies m_batchingEvents inside PostUpdateEvent.  Presumably it will get inlined automatically so the extra call will become moot, but you could also move the definition up to the declaration to force its hand a bit more.
- If you could use a commenting style doxygen would pick up, that would be ideal.  "/** blah */" or "/// blah" (single-line) are conventionally what we have been using in the mozilla codebase, although there are other syntaxes it understands too.  (Qt style, etc.)
Attachment #404110 - Flags: review?(bugmail) → review+
this addresses asuth's comments - carrying forward r+, requesting sr.

I think we still may want to do the event based batching for listeners other than the virtual folder listeners, but I'll save that for a different patch.
Attachment #404110 - Attachment is obsolete: true
Attachment #405732 - Flags: superreview?(bugzilla)
Attachment #405732 - Flags: review+
Comment on attachment 405732 [details] [diff] [review]
fix addressing asuth's comments

>+  /**
>+   *Posts an event to update the summary totals and commit the db.
>+   * We post the event to avoid committing each time we're called
>+   * in a synchronous loop.
>+   */
>+  nsresult PostUpdateEvent(nsIMsgFolder *folder, nsIMsgDatabase *db);

nit: Space before Posts.
Attachment #405732 - Flags: superreview?(bugzilla) → superreview+
fix checked in. I think there's more batching we can do to, but this particular bug is addressed.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Whiteboard: [no l10n impact][has potential fix] → [no l10n impact]
(In reply to comment #13)
>
> I think we still may want to do the event based batching for listeners other
> than the virtual folder listeners, but I'll save that for a different patch.

bienvenu, is a bug needed?  I don't see one.

asuth, v.fixed?
wsmwk, I don't know if there is an existing bug - there are still several issues with doing batch operations on large numbers of messages, some general, some specific to imap or local. I spun up a new bug for an imap issue - bug 538378, but there are definitely more batching things we could do.
You need to log in before you can comment on or make changes to this bug.