Closed Bug 170555 Opened 22 years ago Closed 22 years ago

Create Junk folder and add ability to purge messages from junk folder

Categories

(MailNews Core :: Backend, enhancement)

x86
Windows 2000
enhancement
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: naving, Assigned: naving)

Details

Attachments

(3 files, 4 obsolete files)

As we discussed, I have written code so that we create junk folder on demand 
when we are trying to filter messages to junk folder. This is for both pop and 
imap. Also I added ability to purge messages based on age as threshold and I 
tied this with auto-compact as you suggested. I think we need to do this just 
once per session because we age junk folder in days. patch coming.
Attached patch proposed fix (obsolete) — Splinter Review
See comments above
what if you leave the app running for multiple days?
instead of duplicating the code in nsMsgDatabase::PurgeMessagesOlderThan, could
you re-use it? You'll probably need to restructure the code a little so that you
can call nsMsgDatabse::GetMessagesToPurge(PRUint32 daysToKeepHdrs, PRBool
keepUnreadMessagesOnly, nsMsgKeyArray *keysToPurge) and then the callers get to
decide what to do with the keys - for your code, call DeleteMessages on the
folder, for the retention settings code, call DeleteMessages on the folder.
Do you really think we need to purge more than once per session. 
Severity: normal → enhancement
Status: NEW → ASSIGNED
My first take would be that we should purge once per day, which might be a lot
less than once per session if you have lots of sessions in a day, or might be
multiple times per session if you leave the client running...but if that's too
hard to code, then I guess we could do it once per session.
We would have to store the timestamp of last purge if we want to do it only once
per day. I think once per session is good enough because I think on an average
people would start their mail app atleast once per day. Scott, what do you think ?
there an issue that I think might be being overlooked:

for imap, you need to select the folder on the server in order to age messages -
iterating over the db won't give you any headers unless the folder has been
opened and the messages filtered into it downloaded to the db. So, in order to
correctly age the spam folder by iterating through the db, you need to update it
by doing a select. This can be a moderately expensive operation if there are
lots of headers to download so avoiding doing it every session might be worthwhile.
Do we need to do a select ? I think people will go to their junk folder to see
if there are any other msgs besides junk. I think we should not be doing a
select and deleting msgs because it doesn't even give them a chance to look at
their junk messages. It is dangerous. This is something like deleting from
underneath us!
I think if we say we're going to age the junk folder, we should age the junk
folder  - if we don't age it, it will grow without bounds if the user doesn't
open it.  And we'll only age away messages that were put in the junk mail folder
10 days since you last opened the junk mail folder. If we decide that you have
to open the junk mail folder to age it, then we should age it when the user goes
to another folder or shuts down. And finally, for pop3, aren't you going to age
the messages away without opening the folder?
I think once every 24 hours or greater (depending on the last time they logged
on) would be preferable.

I also think we can't depend on the fact that they will select their junk mail
folder through the UI.
I think Navin has a point that auto-aging messages the user hasn't seen might be
undesirable, unless we warn the user when they set the pref. Is auto-aging on by
default, or does the user have to turn it on?
we should make it off by default.

I guess if someone goes on vacation for a month we probably don't want to delete
mail they may not have seen.
I don't think we should do a select and delete it because it is drastic action
no matter what. 
if we are going to warn the user then it is ok. 
Plan was that users could decide if/what folder they wanted junk msgs moved to.
And if/when they wanted junk messages auto deleted.
Auto delete is Off by default. User must manually turn it on first.
thanks for the screen shot, jglick.  I'll be working on the UI changes as part 
of bug #169638
If UI isn't doable, let me know and we'll need to discuss what needs to be changed.
jglick: that dialog barely fits at 800x600 (into a web browser), and it has way
too many controls and way too much text for me to handle, and I spend lots of
time reading large amounts of both. I'd suggest:
1. ditch most of the documentation setences, if people need them they can click
help.
2. change the off..highest into a select.
3. rename them. "lowest" and "highest" are useless, "tolerant" and "aggressive"
are slightly better.
4. drop "When messages are ..."
5. Color junk mail <light grep |v>
6. drop display an indicator. always display one, either in the thread position
or perhaps in some col which the user can turn off.
7. We could combine the two move items, it'd buy space

Move to: <junk mail on Foo          |v>
         |junk mail on bar            |
         |...                         |
         |junk mail on local folders  |
         |----------------------------|
         |other                      >|
         +----------------------------+
I know we haven't in other places, but i think that's a bug (it costs us a lot
of space)

8. move always accept to the very top. (there are a few reasons, one is that
you're trapping grayspace by putting it where you did, another is that of all
the options it's one of the most important, it's also not remotely destructive
whereas move and delete are)

[x] Always accept mail from <personal address book|v>
<Aggressively|v> filter inbound mail for junkmail
...

9. The title should say Junk Mail Controls for <accountprettyname>

10. Shouldn't it be < _View_ junk mail log >?

--
As for deleting mail, I think it might be best if we wrote something like:
[ ] Automatically delete old junk mail after [  9] days of continuous use.

continuous use would have to be defined, but basically if I only use mozilla
three days out of twenty then the threshold wouldn't be met. Leaving mail open
would not meet the definition either. Reading a few messages or opening multiple
folders would.

Last usage consideration. Suppose I get 1 junk message a day every day for a
month. And suppose I use mozilla for days 20..28 of that month. On day 28
junkmail identified between 1 and 19 would be deleted but junk mail received on
day 20 would not be automatically deleted until the next day I use mozilla.
as far as the purging code is concerned, one easy way to solve the "select imap
spam folder before you can purge" problem is to use SEARCH to do the purging -
just construct a search term that gives you messages older than XX days, listen
for the results, and when the search is done, delete the result messages. Search
automatically updates the folder before doing the search, and is async so you
don't have to worry about blocking the UI...

The alternative is to, in the imap case, when you've decided that you need to
run an auto-age, run a select url, register yourself as a listener, and when you
get notified that the select url has finished, run the purge code from the
db...if the user happens to have selected the spam folder, and that has
triggered the auto-purge (which would be a highly unlikely event, I believe),
you could either just NOT do the auto-age, or still run the select url but
*after* the ::UpdateFolder select, which would cause the second select to block
until the first one was finished. However, I'm guessing you're going to try to
shove this into the already overloaded ::OnStopRunningUrl, and add yet another
bit of state to nsMsgImapFolder. I'd encourage you not to do that.
when is the junk mail auto-aging likely to run? Is it most likely to run when
the user starts up at the beginning of the day and selects their inbox? Should
we worry about the effect on start-up performance?
the search approach sounds nice.

I guess i'd suggest that aging happen after the user idles for a while.

The problem is that a user might open mail, do stuff, quit mozilla. so we can't
really do it that way.  How about aging five minutes after the user loads mail
and on a low priority thread with a status window which pops up somewhere. The
dialog should probably have stop and details buttons. details would bring up a
query list showing all the mails scheduled to be deleted.
If you look at the existing code auto-compact never gets called on start-up
because we wouldn't have been running any url (imap)! For pop we don't do it
until after we have opened the database.

I was thinking of adding a bit of state to the folder  (a boolean
mDeletingJunkMessages) so that we know that we know that we have to delete junk
messages after downloading junk messages (imap) or reparsing junk folder (local).
Both are updating folder actions so just call deleteJunkMessages from
nsMsgDBFolder::OnStopRunningURL when mDeletingJunkMessages is true.
This boolean is also needed to ensure that we don't go into infinite loop
because UpdateFolder called AutoPurge then again we call UpdateFolder from
AutoPurge before purging. 
yes, that was the kind of approach I was hoping to avoid. I'll have to think
about it for a while.
don't we load the inbox on mail startup, and doesn't that run an imap url?

use QueryElementAt here:
+        nsCOMPtr<nsISupports> support =
getter_AddRefs(m_connectionCache->ElementAt(i));
+        connection = do_QueryInterface(support);

I'm also not sure why auto compacting of folders and auto-purging of the junk
mail folder have been forced together like this. Is it for conceptual reasons,
or just ease of implementation? Isn't that causing some of the problems you need
the boolean to solve? It also looks to me like auto compact for imap only runs
when we're offline - is that correct?
When we are loading inbox on start-up we are not running any other url on Inbox
at that time, so AutoCompact doesn't get called. But it does get called if you
click on GetNewMessages twice very fast or some other url (not select) is
running  on a folder and you happen to select that folder. 

I think a better place of moving this auto-compact call would be in
nsMsgDBFolder::OnStopRunningURL when we happen to end updating folder. That
would definitely ensure AutoCompact getting called and we could remove the call
from all child implementations (imap, pop, news) 

I was tying both of them because they are both cleaning up methods and both have
to be performed after some time interval. 

I had another idea of simplyfying purging junk folder:

Only purge if you happen to select junk folder and time interval thing is met.
Since we are actually not purging but just deleting, that saving space argument
doesn't hold and users will some time or other select the junk folder to see if
there is anything besides spam ? agree. 
if you're offline, we'll never call ::OnStopRunningUrl.

I think your life would be easier if you broke these two things apart
(auto-compact and auto-purge). I believe you're painting yourself into a corner
by combining them, since they're not called at the same time.
QA Contact: gayatri → laurel
I'm about update the interface.

You'll want to use this from your code:

+  /**
+   * built from moveTargetMode, actionTargetAccount, actionTargetFolder
+   */
+  readonly attribute string spamFolderURI;

it builds up the uri for you, based on those attributes.  the folder is not 
guaranteed to exist, but that's ok, since your code creates it.

let me know if you have questions.
QA Contact: laurel → gayatri
QA Contact: gayatri → laurel
I've checked in, so you'll want to fix to use the new attribute.

if you're also working on setting / clearing the junk mail folder flag, you'll 
want to do that from nsMsgIncomingServer::SetSpamSettings().
Attached patch proposed fix (obsolete) — Splinter Review
This patch includes the junk folder creation part from last patch. What has
changed is how we purge junk folder. As discussed w/ david, I created a
purgeService that goes and purges junk folder at intervals of one day. I store
the last purge time in the folder cache. I see no point storing this info in
the db (msf file) at this point. So basically purges are scheduled when we load
accounts.  We iterate for servers that can have spam settings and add it to
purgeArray. If two purges happen to be at the same time then I delay it by 5
mins. For actual purging I have used search and created term AgeInDays etc...

Cavin, David, can you review ? thx.
+#define oneDay 86400000000
+#define delay 300000000  //minimum delay between two consecutive purges

these names need to follow the normal naming conventions for #defines, which is
all upper case. The names could also be more descriptive ONE_DAY_IN_MILLISECONDS
MIN_DELAY_BETWEEN_PURGES. perhaps you were copying the style in nsMsgDBFolder.cpp,

#define oneHour 3600000000
but that is also wrong.

copyright date should be 2002
+ * The Initial Developer of the Original Code is 
+ * Netscape Communications Corporation.
+ * Portions created by the Initial Developer are Copyright (C) 1999
+ * the Initial Developer. All Rights Reserved.
+ *

wouldn't it be cleaner here to use the Equals method on junkFolderURI?

+  nsXPIDLCString junkFolderURI;
+  spamSettings->GetSpamFolderURI(getter_Copies(junkFolderURI));
+  
+  *aIsJunkFolder = aCaseInsensitive ? !PL_strcasecmp(junkFolderURI.get(),
aFolderURI) : !PL_strcmp (junkFolderURI.get(), aFolderURI);
+  return NS_OK;


changing the comment doesn't make this OK :-) The imap url is not thread safe,
so this routine should not be called from the UI thread.

-// this method is only called from the imap thread
+// this method is only called from the imap thread AND the UI thread...
 NS_IMETHODIMP  nsImapUrl::CreateServerSourceFolderPathString(char **result)
 {
   NS_ENSURE_ARG_POINTER(result);
+  nsAutoCMonitor(this);
 	AllocateServerPath(m_sourceCanonicalFolderPathSubString,
kOnlineHierarchySeparatorUnknown, result);
   return NS_OK;
 
as far as just using the folder cache is concerned, that goes against all the
other code that uses the folder cache. We're supposed to have it such that
removing the folder cache and restarting works as if you hadn't removed the
folder at all (except that startup will be slower), because everything will get
regenerated from the db's. I know this was a convenient shortcut for you, but
I'm not sure it's the right thing to do.
I think Equals is not there for nsXPIDLCString. It is for nsCAutoString. Also
If it has Equals, I think I had problems w/ caseInsensitive comparisons. 

I made it threadsafe by adding a monitor, look at similar methods below. 

 {
   NS_ENSURE_ARG_POINTER(result);
+  nsAutoCMonitor(this);

lastPurgeTime is not that important that it has to be recreated for folder cache
from dbs because we are purging at interval of one day. We can just start fresh
i.e purge and create it again. 


I don't like this code going directly to the folder cache. No other code does
that (other than the code that has to be involved with the folder cache). No
other code should know about the folder cache. 

If it was too much bother to go through the nsIMsgFolder and the db, instead of
just bypassing the nsIMsgFolder and DB entirely, you should try to make it
easier for client code to set properties in the db via the nsIMsgFolder. You
could add get/set methods to the db, Get/SetFolderProperty(in string
propertyName, in string propertyVal), and have those methods get/set the string
property on the nsIDBFolderInfo. Then, you'd just need to add an attribute to
nsIMsgFolder for the last purge time stamp.
the problem is that other routines in nsImapUrl can access/change data that gets
changed in nsImapUrl::CreateServerSourceFolderPathString. Thread-safety isn't
just making it so that only one person can call that method at one time.
Thread-safety also involves making sure that any data access in that method
can't be access by any other method/thread at the same time.

The multiple connection code is in the wrong place. It's supposed to go in
nsImapProtocol::CanHandleUrl(). We talked about this before, but if you look at
that routine, it has two out parameters, canRunUrlImmediately and canRunButBusy.
If a connection is creating a folder and another request comes along that tries
to do an online copy or move into that folder, then CanHandleUrl should return
canRunButBusy as true, and canRunUrlImmediately as false. Hope this helps.
nsImapProtocol::CanHandleUrl() will not know about other imap connections!
As far as multiple connection code is concerned, I still think it is the right
place. So lets consider the simplest case, where we have just 2 connections one
imap inbox connection and other connection which is creating junk folder, so now
the copyUrl (filter moves) wants a connection so that it can be run. We iterate
through the connection cache and if we find Inbox connection first it can easily
handle this copyUrl. The code in nsImapProtocol::CanHandleURL doesn't have any
way to know that destination folder is being created on a separate connection
thread. Also for copyURL to be run we need the connection to be inSelected state
so it can never be run on the other connection which is creating junk folder, so
we will always hit Inbox connection. 

Scott, your comments would be appreciated to resolve this ? thx. 
If you'd look at the code a little more closely, you'd see that we iterate over
the connections. If a connection says it has to handle the url, but it's busy,
then we queue the url. If it says it can handle the url and it's not busy, that
means it's in the selected state for the desired folder.  So, the inbox
connection is not going to say that it can handle the url (if you write the code
correctly so that it doesn't!), and your reasoning and conclusion are wrong. Of
course, if you don't change CanHandleUrl as I said that you should, then it
won't work, but that's really not good argument on your part. I can say that any
approach isn't going to work if I write the code incorrectly!

What you've done is add an extra loop looking through all the connections. The
way the routine is designed and implemented is that it already loops through all
the connections calling CanHandleUrl. If you fix CanHandleUrl to handle this
case, then everything will work as designed.
CanHandleUrl approach will not work. I suggest you code it and try yourself. I
have  spent lot of time in coding this piece and this is the right way.
Regarding looping one more, it is much more cleaner not to put in that other
loop. I originally put in that existing loop but it was turning out to be more
messy because of the several conditions already there. 
OK, I give up; you've convinced me: doing it in the connection code was a very
dumb idea and I'm very sorry I suggested it in the first place. The code is very
sensitive and if anything goes wrong, the consequences are dire. The code needs
to be clean and maintainable as possible, and having this special case code
there turned out to be just a bad idea. So, just create it when the user picks a
spam folder, like you do for creating a normal filter with a new folder. Or
check in your code and I'll go clean it up later.
ok, CanHandleUrl is out but it would have never worked. I'll attach a patch for
just purging junk folder. Creating junk folder is going to be driven from UI
which has to be sorted out. 

Attached patch proposed fix (obsolete) — Splinter Review
I have added Get/SetStringProperty to get lastPurgeTime. I have added comments,
rest of the patch remains same as last one.
Attachment #100431 - Attachment is obsolete: true
Attachment #101179 - Attachment is obsolete: true
David, can I get review ? thx.
Comment on attachment 101483 [details] [diff] [review]
proposed fix

I think you can get rid of the init method on your purge service interface if
you have init get called automatically when the purge service is first invoked.
You can do this by using the nifty marcro:
NS_GENERIC_FACTORY_CONSTRUCTOR_INIT
in nsMsgFactory.

2) tabbing issues in nsMsgPurgeService.h

3) The Get/SetStringProperty stuff on a db folder looked good to me.
Scott, that Init() is needed because if you do a turbo shutdown the services are
not freed and when we start back up we have to call Init explicitly. We already
do that from account manager. This is something we do for biffManager also. 
this is wrong:

+
+//there is no spam settings for news folders - atleast for now
+NS_IMETHODIMP
+nsNntpIncomingServer::GetSpamSettings(nsISpamSettings **aSpamSettings)
+{
+  return NS_ERROR_NOT_IMPLEMENTED;  //no assertion because generic code will
not know
+}
+
+NS_IMETHODIMP
+nsNntpIncomingServer::SetSpamSettings(nsISpamSettings *aSpamSettings)
+{
+  return NS_ERROR_NOT_IMPLEMENTED;  //no assertion because generic code will
not know
+}

we are going to allow spam settings for all protocol types except none. 
(movemail, imap, pop3 and news)

note, all those show up in the account picker for the junk controls.

the difference with news is, you can move messages.

I'm working on fixing the junk controls UI to disable the move UI for news.

I haven;t read the full patch yet, but you can assume that for a news spam
settings, the boolean value for purge will never be true.  I'll take care of that.
a style nit:

you can save yourself from

if () {
  if () {
    if () {
      if () {

by doing this:

+  //Only add it if it hasn't been added already.
+  if (serverIndex != -1)
+    return NS_OK;

+  nsCOMPtr <nsISpamSettings> spamSettings;
+  nsresult rv = server->GetSpamSettings(getter_AddRefs(spamSettings));
+  NS_ENSURE_SUCCESS(rv,rv);
+
+  PRBool purgeSpam;
+  rv = spamSettings->GetPurge(&purgeSpam);
+  NS_ENSURE_SUCCESS(rv,rv);
+  if (!purgeSpam)
+    return NS_OK;

and so on. 

if you can avoid the nested if style, please do.  it's hard to read.
I also removed path exists check and changed GetDBFolderInfoAndDB for
imapFolder to do the right thing i.e open an existing db, not create one.
Attachment #100443 - Attachment is obsolete: true
Attachment #101483 - Attachment is obsolete: true
so you know that no code relies on GetDBFolderInfoAndDB to actually create the
.msf file?
Comment on attachment 101606 [details] [diff] [review]
patch addressing seth's comments

ok, sr=bienvenu if we go back to using exists for now, instead of changing
GetDBFolderInfoAndDB, just to be safe.
Attachment #101606 - Flags: superreview+
Comment on attachment 101606 [details] [diff] [review]
patch addressing seth's comments

r=mscott
Attachment #101606 - Flags: review+
fix checked in.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Even beyond the Linux bustage (the nsInt64 constructors should probably be
|explicit| to prevent that type of problem, although I'm not sure which compiler
is right), there will probably be bustage on some of the ports because you can't
initialize a 64-bit int based on a numeric literal, since not all ports have
64-bit ints.  That's why we have prlong.h to begin with, and nsInt64 is an
extension of that.
Comment on attachment 101773 [details] [diff] [review]
patch to fix bustage on some platforms

>+static const nsInt64 gOneDayInMilliseconds = 86400000000;
>+static const nsInt64 gMinDelayBetweenPurges = 300000000;

You can't do this since you can't use numeric literals this way.  The portable
way would probably be:

static const PRInt64 gOneDayInMilliseconds = LL_INIT(20, 500654080); /*
86400000000 */
static const PRInt64 gMinDelayBetweenPurges = LL_INIT(0, 300000000);

(fortunately 86400000000 mod (2^32) < (2^31), or else there wouldn't be a
portable way given the current (slightly incorrect, unless I'm misreading)
prlong.h macros)

Do check my math...

>+    nextPurgeTime += gOneDayInMilliseconds;

Then all the things like this would have to be:

nextPurgeTime += nsInt64(gOneDayInMilliseconds);
-  purgeEntry->nextPurgeTime += MIN_DELAY_BETWEEN_PURGES *
(mPurgeArray->Count()+1);  //let us stagger them out by 5 mins if they happen to
start at same time
+  for (PRInt32 i=0; i < mPurgeArray->Count()+1; i++)
+    purgeEntry->nextPurgeTime += nsInt64(gMinDelayBetweenPurges);  //let us
stagger them out by 5 mins if they happen to start at same time

This looks like you're just incrementing the same value by the same amount
multiple times.
Oh, never mind, that's what you meant to do, I think.
(But you could use the stuff in prlong.h to do it a much faster way.)
Update- Using trunk build 20030121 on winxp we are creating the Junk mail folder
for IMAP POP and Web type accounts when the user enables Junk Mail controls (see
bug 181397 for test scenarios).  However I haven't tested the purging based on
age yet which I believe is what the fix for this bug is.   
Using trunk build 20030218 on winxp, macosx and linux purging of messages older
than the value placed in the purge preference is working.   It seems to purge 5
minutes after launching, haven't tested the intervals if app left open for long
periods of time.  Example, app left open 30 minutes after initial purge and did
not purge again.  Can someone tell me what the interval of purge are if the app
is left opened for days (is it 24 hrs as suggested in comment 11).  Spin off
bugs will be logged for wording of purge feature and problems purging mail that
is labeled as Not Junk but resides in the designated Junk folder. 
Purging works, however spinoff bug 194090 where purging puges "not Junk" message
too that reside in the Junk folder.  this is verified
Status: RESOLVED → VERIFIED
QA Contact: laurel → esther
Product: MailNews → Core
Product: Core → MailNews Core
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: