Closed Bug 250084 Opened 20 years ago Closed 20 years ago

Change behavior for when Junk Mail controls are run

Categories

(Thunderbird :: Mail Window Front End, defect)

x86
Windows 2000
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird0.8

People

(Reporter: mscott, Assigned: mscott)

Details

Attachments

(1 file)

Scenario: 

1) On first launch of Thunderbird, use the new migration wizard to migrate mail
settings and data from another mail app.

2) When Thunderbird comes up for the first time, click on an IMAP Inbox.

3) Wait and watch us download every message body in your inbox, marking most of
them incorrectly because we don't have any tokens in the training set and all
these messages appear to be new to us.

*Yuck*
Status: NEW → ASSIGNED
Target Milestone: --- → Thunderbird0.8
Do we score every message body, or just the unread ones? It is ugly either way -
what's the fix? Don't score messages if there are no tokens?
Two things are going on here.

1) I'd like to see us not run the adaptive filter at all if it tells us that we
don't have any training tokens. Then we won't bother downloading all of the
message bodies.

2) By design, in order to encourage the user to train the mail app, if you don't
have any good tokens we automatically mark messages as junk. If you don't have
any bad tokens we automatically mark the message as not junk. See the comment here:

http://lxr.mozilla.org/mozilla/source/mailnews/extensions/bayesian-spam-filter/src/nsBayesianFilter.cpp#949

I wonder if we should rethink that behavior. 
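
For illustration, the two points above can be sketched roughly as follows. This
is not the code in nsBayesianFilter.cpp: the TrainingCounts struct, the Verdict
enum, and DecideWithoutScoring are hypothetical names, and the sketch assumes
the filter can report how many good and bad tokens it has.

```cpp
#include <cstdint>

// Hypothetical summary of the training set (not the real filter's types).
struct TrainingCounts {
  uint32_t goodTokens;  // tokens learned from messages marked "not junk"
  uint32_t badTokens;   // tokens learned from messages marked "junk"
};

enum class Verdict { Skip, Junk, NotJunk, Score };

// Point 1: with no tokens at all, skip the filter entirely, so no message
//          bodies need to be downloaded.
// Point 2: with only one kind of token, apply the by-design shortcut that
//          nudges the user to train the other kind.
constexpr Verdict DecideWithoutScoring(const TrainingCounts& t) {
  if (t.goodTokens == 0 && t.badTokens == 0) return Verdict::Skip;
  if (t.goodTokens == 0) return Verdict::Junk;
  if (t.badTokens == 0) return Verdict::NotJunk;
  return Verdict::Score;  // enough data: run the real Bayesian scoring
}
```

In this sketch only the Verdict::Score case would go on to fetch and tokenize a
message body.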
David, it's possible it is just doing unread ones and not all of them. I'll try
to pay more attention next time. 
ah yes it's just unread.

After migration, I had 100 unread messages and all of them were marked as junk
because of line 949. None of my read messages were touched.
Attached patch a fix
This bypasses the junk mail controls completely if the user has not classified
any messages and we are opening a folder. This helps make the first use
scenario much smoother after migrating a profile or just creating a new account
and downloading the messages for that account.
Comment on attachment 152471
a fix

What do you think about doing something like this, David?
Attachment #152471 - Flags: superreview?(bienvenu)
It's not obvious from the patch but this code is located right after we do:

  // if this is the junk folder, or the trash folder
  // don't analyze for spam, because we don't care
  //
  // if it's the sent, unsent, templates, or drafts, 
  // don't analyze for spam, because the user
  // created that message
  //
  // if it's a public imap folder, or another users
  // imap folder, don't analyze for spam, because
  // it's not ours to analyze
  if (mFlags & (MSG_FOLDER_FLAG_JUNK | MSG_FOLDER_FLAG_TRASH |
               MSG_FOLDER_FLAG_SENTMAIL | MSG_FOLDER_FLAG_QUEUE |
               MSG_FOLDER_FLAG_DRAFTS | MSG_FOLDER_FLAG_TEMPLATES |
               MSG_FOLDER_FLAG_IMAP_PUBLIC | MSG_FOLDER_FLAG_IMAP_OTHER_USER))
    return NS_OK;

So it seems like the right spot to throttle things based on whether the user has
classified a message or not. 
Comment on attachment 152471
a fix

looks good.
Attachment #152471 - Flags: superreview?(bienvenu) → superreview+
Fixed on branch and trunk.

This undermines the goal of forcing a user to train, but I think the usability
gain outweighs that.
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Suggestion - have some mechanism to determine and display the CONFIDENCE 
in the filter's discrimination ability.

Some sort of progress bar or icon that starts off red and becomes green 
as the database grows. Confidence could be calculated based on number of
messages in database, number of new messages seen in some rolling time
window and number of messages for which the user was forced to intervene
(mislabeled as junk / not labeled as junk).
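
One way such a score might be computed, purely as a sketch: the suggestion does
not give a formula, so the product used here, the FilterStats struct, and the
kTargetTrainedMessages threshold of 200 are all arbitrary assumptions.

```cpp
// Hypothetical inputs, named after the three quantities in the suggestion.
struct FilterStats {
  unsigned trainedMessages;    // messages in the training database
  unsigned recentMessages;     // new messages seen in the rolling window
  unsigned recentCorrections;  // of those, how many the user had to reclassify
};

// Arbitrary assumption: the database counts as fully grown at this size.
constexpr unsigned kTargetTrainedMessages = 200;

// Returns a value in [0, 1]; a UI could draw 0 as red and 1 as green.
constexpr double Confidence(const FilterStats& s) {
  // Grows linearly with database size, capped at 1.
  const double volume =
      s.trainedMessages >= kTargetTrainedMessages
          ? 1.0
          : static_cast<double>(s.trainedMessages) / kTargetTrainedMessages;
  // Clamp so that corrections can never exceed the messages seen.
  const unsigned corrections = s.recentCorrections < s.recentMessages
                                   ? s.recentCorrections
                                   : s.recentMessages;
  // Fraction of recent messages the filter got right without intervention.
  const double accuracy =
      s.recentMessages == 0
          ? 1.0
          : 1.0 - static_cast<double>(corrections) / s.recentMessages;
  return volume * accuracy;
}
```

With an empty database the score stays at 0 regardless of recent accuracy, which
matches the idea of the indicator starting off red.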