Closed Bug 441932 Opened 16 years ago Closed 10 years ago

Keep lists of NEW messages (m_newSet, m_saveNewMsgs, m_newMsgs) only in nsMsgDBFolder

Categories

(MailNews Core :: Backend, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: rkent, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [patchlove])

Attachments

(1 file)

Currently we maintain several lists of new messages, spread between nsMsgDatabase and nsMsgDBFolder:

1) m_newSet in nsMsgDatabase, "new messages since last open"

2) m_saveNewMsgs in nsMsgDBFolder, "set of new msgs, which we might want to run junk controls on."

3) m_newMsgs in nsMsgDBFolder, "set of new messages for a folder who has had its db closed"

Some of the problems with this are:

1) Keeping m_newMsgs and m_newSet in sync requires an array copy whenever the database is open and closed. 

2) The responsibility to maintain the counts of new messages are spread throughout the code, in calls to SetNumNewMessages and SetHasNewMessages.

3) One manifestation of this is bug 372372, where counts in virtual folders based on NEW flag do not work correctly. (Originally this work was destined for bug 372372, but the work needed affects many other issues as well, so I decided to break the work into several pieces in backend bugs like this one).

The proposal is to consolidate the handling of volatile lists in nsMsgDBFolder, removing m_newSet from nsMsgDatabase. The addition to the list will be done by setting the NEW flag. Maintenance of counts will be done entirely within nsMsgDBFolder.

I've seen comments from David Bienvenu proposing this instead be consolidated in nsMsgDatabase (sorry, can't find the bug reference at the moment.) The problem is that the nsMsgDatabase object's lifetime is directly connected with the opening and closing of the database file, so it cannot be relied on to hold these volatile counts. That might be fixed - but that seems like a riskier approach in the short run than just relying on the more stable nsMsgDBFolder object.

This bug is part of a larger proposal that I am considering making to separate BIFF notification from the NEW flag, but these changes are orthogonal to that requirement.

Patch with proposed solution to follow.
Blocks: 11040
This patch will solve a number of bugs associated with new message processing - although the primary intent is the reorganization, not the solving of specific problems. I'll try to document specific situations that are solved though.

Bug #1 (may be similar to bug 372372)

STR (using POP3 and a single shared local inbox folder):

1. Create a search folder on the inbox, called "New", with Status Is New criteria
2. Receive a new mail in the inbox.
3. Select the inbox, then select the Sent folder (which clears all new messages in the inbox)

Expected result: NEW decoration is removed from the New search folder name, and the number of unread for the New folder goes to 0

Actual results: there is no change to the New folder in the folder tree. When selected, the Unread counts go to zero. When selected then unselected, the NEW decoration is removed.

Bug #2 to be solved:

STR: (Using IMAP)

1. Create a filter on an IMAP account, tagging a message Important whose Subject Contains Important
2. Create a virtual search folder on the IMAP Inbox, with Status Is New, and saved under Local Folders.
3. Exit and re-enter TB (there are some issues with adding listeners with creation of virtual folders which are unrelated to this bug, but can cause issues in reproducing).
4. Send and receive a message with Important in the subject.

Expected results: the new message has New decoration in the message pane.
Actual results: the new messages does not have New decoration in the message pane.

I'm not sure what role the New virtual folder has in this, but it seems to be needed to reproduce.
Attached patch First lookSplinter Review
I'd like bienvenu to do a first look at this patch. After any initial comments, I'd like a different reviewer to get more eyes on it.

This patch is intended to be refactoring, that is I tried to change as little of the existing behavior as possible. There are several main changes:

1) Management of NEW status, to the maximum extent possible, is done simply by setting and clearing the MSG_FLAG_NEW flag. All of the calls to maintain new counts and hasnew status, that were previously scattered throughout the code, have been removed.

2) The lists of new counts (which are needed for efficient clearing of the NEW status from a folder) are now maintained in the stable nsMsgDBFolder object, rather than the volatile nsMsgDatabase object.

3) This refactoring lays the foundation for two followups: first, to separate BIFF notification status from the NEW status (which is defined as "This message was added to a folder since the last time you opened the folder"); second, to allow the possibility to store the NEW flag in the database so that it persists between program restarts. This will allow bug 11040 to be implemented easily.

4) The need for junk classification is now represented by a new volatile message flag. The justification for this is much weaker than the justification for separating BIFF NOTIFICATION from NEW, but the management code for the volatile flags makes more sense when there are multiple flags, so I left it in partially as a demonstration of the code. As the definition of NEW becomes crisper, the need to separate "Need to classify" from "haven't seen yet in a folder" may be needed.

This code incidently fixes bug 372372, but that is not the main intention. Preparation for cleanup of various BIFF bugs and enhancements is the main motiviation.
Attachment #331179 - Flags: superreview?(bienvenu)
Product: Core → MailNews Core
I've looked through this a bit - a few nits first:

this attribute name could be a little more descriptive:

hasClassifyMessages

I think it means hasMessagesToClassify, right?

MSG_FLAG_CLASSIFY - 

I hate having to use one of our finite number of bits for this status, which is pretty temporary. Why does this have to be a flag? Why can't code setting this flag keep track of the messages that need classification like we did before?


I don't get what what you mean by "savedFlags" in these methods. 
+  /**
+   * apply volatile flags to correct message flags
+   *
+   * @param msgKey   Key of the message whose flag to correct
+   * @param oldFlags message flag prior to correction
+   *
+   * @return        corrected flags
+   */
+  unsigned long getSavedFlags(in nsMsgKey msgKey, in unsigned long oldFlags);
+
+  /**
+   * set the volatile flags from a message flag. New flag
+   * bits to set must be applied using numerically increasing msgKey
+   * values to keep the underlying data structures sorted.
+   *
+   * @param msgKey   Key of the message whose flags to save
+   * @param flags    message flags prior to correction
+   *
+   * @return         corrected flags
+   */
+  void setSavedFlags(in nsMsgKey msgKey, in unsigned long flags);

I don't understand this comment either:
 
2) The lists of new counts (which are needed for efficient clearing of the NEW
status from a folder) are now maintained in the stable nsMsgDBFolder object,
rather than the volatile nsMsgDatabase object.

nsMsgDatabase is persistent, and I'm not sure why you consider it volatile. nsMsgDBFolder is transient...


This seems like an awkward hack:

+    
+    /**
+     * Temporarily set the flags without affecting the db, listeners, or
+     * volatile lists. Used in virtual folders to test for search changes
+     * in flags. Caller must restore flags back to their original value,
+     * using a second call to setRawFlags, to keep things in sync.
+     *
+     * @param flags    message flags to set
+     */  
+    void setRawFlags(in unsigned long flags);
 
more later, but those are my first impressions.
(In reply to comment #4)
> 
> MSG_FLAG_CLASSIFY - 
> 
> I hate having to use one of our finite number of bits for this status, which is
> pretty temporary. Why does this have to be a flag? Why can't code setting this
> flag keep track of the messages that need classification like we did before?
> 
The case for this flag is pretty weak I admit. The case is much stronger, IMHO, for a future NOTIFY flag, if we separate the concept of NEW decoration in folders from BIFF notify. But I didn't want to deal with that issue in this bug - though the original code I wrote included that, I just removed it to reduce the scope of an already oversize patch - but I left in the CLASSIFY flag because it was important to demonstrate the logic for dealing with multiple volatile flags. Perhaps I could leave that logic in, but only have a single flag in this bug?

There's also the argument that there are three flags that could be shared with the PRIORITY bits for the non-volatile cases like CLASSIFY - so there really isn't a shortage of available temporary flags.

But let me think about it.

> 
> I don't get what what you mean by "savedFlags" in these methods ... 
> 

"Saved" as in saved in memory in the current object (nsMsgDBFolder)

> I don't understand this comment either:
> 
> 2) The lists of new counts (which are needed for efficient clearing of the NEW
> status from a folder) are now maintained in the stable nsMsgDBFolder object,
> rather than the volatile nsMsgDatabase object.
> 
> nsMsgDatabase is persistent, and I'm not sure why you consider it volatile.
> nsMsgDBFolder is transient...
>
Don't you have this backwards? nsMsgDatabase objects are created and destroyed all of the time, but not nsMsgDBFolder objects. This is a really key point we must agree on though, as it is fundamental to the patch.

I tried in the debugger looking at calls to the destructors for nsMsgDatabase and nsMsgDBFolder. nsMsgDatabase is routinely destroyed while changing views (you did a patch a few years ago to do that to close the databases when not in use) but there were no calls to the nsMsgDBFolder destructors.
> 
> This seems like an awkward hack: ...
> 
> +    void setRawFlags(in unsigned long flags);
>
I chuckled at this, as I am replacing an older "awkward hack" (getRawFlags) with another. I believe it is necessary though to prevent trying to set a new flag on a header using an old key in the saved flag structures, which violates the requirement that they are always sorted.

>Don't you have this backwards? nsMsgDatabase objects are created and destroyed
>all of the time, but not nsMsgDBFolder objects. This is a really key point we
> must agree on though, as it is fundamental to the patch.

But nsMsgDatabases can persist values directly, and nsMsgFolder objects can't. If you want to make "new" persist across runs of the program, you're just going to need to move the new list back to nsMsgDatabase, and make it store the new list in the .msf file. Then it doesn't matter that the in memory representation of the db comes and goes. You could even make nsMsgDatabase persist the new flags without making new persist across runs of the program, as long as we did something like ignore newlists in db's that were opened when the folder didn't have the hasNew flag set.

Which means we might want to decide about whether we're going to make new persist across sessions or not...
I agree about the awkward hack, but I was hoping we could get rid of my awkward hack, instead of replacing it with an other :-) 

I know it's just a nit, but this seems fraught with peril:
+   * set the volatile flags from a message flag. New flag
+   * bits to set must be applied using numerically increasing msgKey
+   * values to keep the underlying data structures sorted.


(In reply to comment #6)
>
> If you want to make "new" persist across runs of the program, you're just going
> to need to move the new list back to nsMsgDatabase, and make it store the new
> list in the .msf file. Then it doesn't matter that the in memory representation
> of the db comes and goes.

I actually implemented this, then removed it to "simplify" this patch.

Here is what I thought the issue was.

It is a fairly common operation to iterate over a folder, and reset the new message flag. As currently designed (that is before this patch), the new message flags are stored in a structure in memory, so that you can go through this folder and get the list of new messages in a folder easily. What would happen if you eliminated this structure, relying completely on the message flags in the database? My fear is that for a very large folder, the operation of resetting the message flag would be slow as you scanned the entire folder to look for NEW flags set, and make the program sluggish to switch folders. After all, this was important enough that the binary search was added to the in-memory structure to prevent this sluggishness.

That is, we need an index into the message database keyed on the new message flag (and any other flag that is routinely reset over an entire folder). Now perhaps mork has this capability, if so we could add it there. But in my implementation, I kept the flag saving structure in nsMsgDBFolder as a new flag index to keep the reset operation fast.
> 
> I know it's just a nit, but this seems fraught with peril:
> +   * set the volatile flags from a message flag. New flag
> +   * bits to set must be applied using numerically increasing msgKey
> +   * values to keep the underlying data structures sorted.
> 
The key functionality that we are currently missing, that prevents things like fixing bug 372372 or bug 11040, is the ability to manipulate the NEW flag (and the proposed similar NOTIFY flag which currently is the same as NEW) through, well, manipulating the flag. (The current system of maintaining a new list, a count of new messages per folder, and a server NEW flag separately is also fraught with peril, and often fails, when they should all be directly derived from the message NEW flag).

In the current patch, the NEW flag really is a sort of reset-only flag, and perhaps should be checked to only allow reset. (except for the SetRaw hack). That works with NEW - but might not work with NOTIFY since you could imagine a scenario where people want NOTIFY off by default, but only set by message filters. In that case, if we want to keep the binary search feature, we should probably resort the structure whenever a key is added out of sequence. In normal operation, this structure is not that big, and messages are filtered in order anyway, so I don't think that would be a burden.
What I was suggesting was keeping the old mechanism of a "new set", stored in the db, and persisted, to address the issue of the db getting closed. The whole reason we used the new set is that it's quick to search, and it's easy to turn into a string and persist. That way we don't have to iterate over the whole set of msg hdrs clearing new flags.
The issue of persisting new state is obscuring what I really want to accomplish here. The real goal is to allow additional message key lists in addition to the new list (in particular a separate biff notify list) without being forced to deal with all of the complexity that has arisen with the new list. This bug was supposed to be some refactoring that made it easier to add volatile message key lists.

My understanding is that the nsMsgDbFolder object persists for the life of the application, and so I can store volatile lists there without dealing with the persistence issues that have resulted in three copies of the new list in various situations: m_newMsgs, m_saveNewMsgs, and m_newSet (two of which are already stored in the folder object). So the real issue in this bug is, shall we provide a unified method of storing volatile flags in nsMsgDbFolder rather than attempting to store those in nsMsgDatabase and dealing with persistence issues when the database object is closed? At the moment, the answer from review seems to be "no".

I'm trying to figure out a way to move forward with some of the work I want to do in BIFF management without getting mired in all of these issues. Here's another possible approach: leave the new flag stuff alone as much as possible, but create a separate MSG_FLAG_NOTIFY managed as a volatile list in nsMsgDBFolder. I'd just list the keys instead of the more complex but general approach here. So I would focus on separating NOTIFY functions from NEW folder decoration issues - but keep the equivalent of m_newSet (maybe m_notifySet) in nsMsgDBFolder.

Can I get some comments on that approach?
OK, I'm perfectly happy with adding the ability to keep track of sets of messages in nsMsgDBFolder. What makes me less happy is involving message flags and the db in that process. So if we had methods on nsIMsgFolder like the following, would it meet your needs?

void addMsgKeyToSet(in string aSetName, in nsMsgKey aKey);
void removeMsgKeyFromSet(in string aSetName, in nsMsgKey aKey);
boolean isKeyInSet(in string aSetName, in nsMsgKey aKey);
void clearSet(in string aSetName);
nsIMutableArray getSet(in string aSetName);

or maybe just a couple methods that get and set a mutable array for a given set name, i.e.,

nsIMutableArray getSet(in string aSetName);
void setSet(in string aSetName, in nsIMutableArray aSet);

then you could have as many of these different sets as you want.
Attachment #331179 - Flags: superreview?(bienvenu)
I haven't been working on new message handling for awhile, so I am going to unassign myself from this. I still think this is a good idea though.
Status: ASSIGNED → NEW
Assignee: kent → nobody
Kent, just to tie up loose ends, could you comment on the thoughts in comment 10?
Flags: needinfo?(kent)
Summary: Keep lists of new messages only in nsMsgDBFolder → Keep lists of NEW messages (m_newSet, m_saveNewMsgs, m_newMsgs) only in nsMsgDBFolder
Whiteboard: [patchlove]
(In reply to Wayne Mery (:wsmwk) from comment #12)
> Kent, just to tie up loose ends, could you comment on the thoughts in
> comment 10?

I opened a few bugs in the last week associated with some of these same issues, which may be creating loose ends more than tying them up. See Bug 1026135. My immediate goal though is to be able to fix testing issues that may be causing tests to fail with maildir (fixing of which is a GSOC project).

I agree with "What makes me less happy is involving message flags and the db in that process". See also Bug 1025976 - "Remove unused folder flag GotNew" that will reduce involvement of folder flags.

But with what I know today, I would probably attempt a more extensive rework. There is a general problem of managing message processing that is unresolved, that shows up from the user point of view as occasionally finding that filters or junk management does not get run on a message. This processing needs to be done async, so that filter search and actions can have access to attributes that require async processing such as attachment name and content-type, as well as have some ability to recover from errors without losing a message or processing step.

Since there is no reasonable intention of continuing this bug without a quite different approach, let's clean up the loose end by marking WONTFIX.
Status: NEW → RESOLVED
Closed: 10 years ago
Flags: needinfo?(kent)
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: