Open Bug 11036 Opened 25 years ago Updated 1 month ago

Filter by example (create a filter based on a message you're viewing)

Categories

(MailNews Core :: Backend, enhancement)

enhancement

Tracking

(Not tracked)

People

(Reporter: sspitzer, Unassigned)

References

Details

(Keywords: uiwanted)

Summary: Filter by example (create a filter based on a message you're viewing) → Filter by example (create a filter based on a message you're viewing)
Whiteboard: HELP WANTED
Summary: Filter by example (create a filter based on a message you're viewing) → [HELP WANTED] Filter by example (create a filter based on a message you're viewing)
Target Milestone: M15
QA Contact: lchiang → laurel
This would be OK for a single message if the filter came up and you could choose
which fields you wanted to use in the filter.

But even better would be to select a bunch of messages and have an intelligent
algorithm look through and see what's the same about them.  It would need to
know about such things as "to or cc", which fields to look on, ability to search
for substrings such as "@x.com", etc.  This would help novice users especially I
believe.
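
A minimal sketch of the "see what's the same about them" idea, assuming the
selected messages are available as raw RFC 822 text. The function names and
the (field, operator, value) tuple format are hypothetical and are not
Mozilla APIs.

```python
from email import message_from_string
from email.utils import getaddresses

# Headers examined for shared addresses (an assumed, illustrative list).
ADDRESS_HEADERS = ("From", "To", "Cc")


def addresses(msg, header):
    """Return the lower-cased addresses found in one header of a message."""
    pairs = getaddresses(msg.get_all(header, []))
    return {addr.lower() for _, addr in pairs if addr}


def common_conditions(raw_messages):
    """Suggest (field, operator, value) conditions shared by every message."""
    msgs = [message_from_string(raw) for raw in raw_messages]
    if not msgs:
        return []
    suggestions = []
    # Exact addresses present in the same header of every message.
    for header in ADDRESS_HEADERS:
        shared = set.intersection(*(addresses(m, header) for m in msgs))
        suggestions += [(header, "contains", a) for a in sorted(shared)]
    # "To or Cc": addresses that appear in either recipient header.
    either = [addresses(m, "To") | addresses(m, "Cc") for m in msgs]
    shared = set.intersection(*either)
    suggestions += [("To or Cc", "contains", a) for a in sorted(shared)]
    # Substrings such as "@x.com": a sender domain common to all messages.
    domains = [
        {a[a.index("@"):] for a in addresses(m, "From") if "@" in a}
        for m in msgs
    ]
    shared = set.intersection(*domains)
    suggestions += [("From", "contains", d) for d in sorted(shared)]
    return suggestions
```

The suggestions would still be shown to the user for confirmation rather
than applied directly.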
Bulk-resolving requests for enhancement as "later" to get them off the Seamonkey
bug tracking radar. Even though these bugs are not "open" in bugzilla, we
welcome fixes and improvements in these areas at any time. Mail/news RFEs
continue to be tracked on http://www.mozilla.org/mailnews/jobs.html
Reopen mail/news HELP WANTED bugs and reassign to nobody@mozilla.org
There is some useful research in Artificial Intelligence concerned with
automatic creation of Expert System rules which would be very relevant to this.
Basically such a system would be able to create filter rules by example.  If
anyone is interested in implementing this (I don't have the C++ skills or the
time unfortunately - but I do have the AI knowhow) send me an email
(I.Clarke@strs.co.uk).
I have thought about this some more.  There are several possibilities for
doing this.  Firstly there is a heuristic algorithm that can take numerous
inputs and corresponding outputs and work out a flow-chart type rule
system to get from the inputs to the outputs - this could be adapted to
the message filtering problem.  Another more extreme solution would be to
use a neural network.  A particular neural network could be considered a
single filter and could be trained by the user in a nice intuitive manner.
Details such as the number of hidden layers, and number of nodes, should
be configurable but hidden behind an "Advanced..." tag as most users won't
care about this.  This sounds complicated, but backward error propagation
can be implemented in a few hundred lines of code, and there are plenty of
freely available examples around.  Anyone interested in doing this should
contact me - I would be willing to handle the A.I stuff if they can handle
the interface to the rest of Mozilla.
And yet more ideas - after more thought I think I have settled on the best
technology to use, a genetic algorithm.  Essentially Mozilla could evolve a
suitable filter.  If this was written efficiently it could probably evolve a
suitable filter in a matter of seconds, however experiments would be needed to
confirm this.
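
A minimal sketch of what "evolving a filter" could look like, not GFilt or
any Mozilla code. It assumes messages are plain dicts of header name to
value, that a filter is a bit mask over a non-empty list of candidate
(header, substring) conditions, and that the user has labelled some example
messages. All names and parameters here are hypothetical.

```python
import random


def matches(mask, conditions, message):
    """A filter matches if any enabled (header, substring) condition matches."""
    return any(
        on and sub in message.get(header, "").lower()
        for on, (header, sub) in zip(mask, conditions)
    )


def fitness(mask, conditions, examples):
    """Count the labelled examples that the filter classifies correctly."""
    return sum(
        matches(mask, conditions, msg) == wanted for msg, wanted in examples
    )


def evolve(conditions, examples, generations=30, size=20, seed=0):
    """Evolve a bit mask selecting which candidate conditions to enable."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n = len(conditions)  # assumed to be at least 1

    def score(mask):
        return fitness(mask, conditions, examples)

    # Start from the empty filter plus random masks.
    population = [[False] * n] + [
        [rng.random() < 0.5 for _ in range(n)] for _ in range(size - 1)
    ]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: size // 2]  # truncation selection keeps the best
        children = []
        while len(parents) + len(children) < size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n) if n > 1 else 0  # one-point crossover
            child = a[:cut] + b[cut:]
            flip = rng.randrange(n)  # mutate exactly one bit
            child[flip] = not child[flip]
            children.append(child)
        population = parents + children
    return max(population, key=score)
```

Because the best half of each generation is carried over unchanged, the
best score never decreases from one generation to the next. How quickly
this converges on real mail would need the experiments mentioned above.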
I have written a proof-of-concept for this called GFilt; it is written in Java.
You can check it out at http://www.sanity.uklinux.net/java_projects.html.
Keywords: helpwanted
Summary: [HELP WANTED] Filter by example (create a filter based on a message you're viewing) → Filter by example (create a filter based on a message you're viewing)
Whiteboard: HELP WANTED
Target Milestone: M15
This would be a really good idea, having a smart filter :) Would it be possible
to have an API hook in mailnews to connect to a separate filter program?

There has been a good deal of research into mail filtering, including, perhaps
unfortunately, some that has led to patent filings, so it might be good to tread
carefully here.

My personal inclination would be to believe that strict machine learning
approaches, based on a single user's input, are unlikely to be completely
reliable. If I were to design such a system, though, I'd tend towards using a
rule induction system
(http://cora.whizbang.com/Artificial_Intelligence/Machine_Learning/Rule_Learning/index.html
) or a decision tree approach (e.g. C4.5).

The better approach, though, would seem to be a server or P2P based one. Most
spam isn't inherently bad, it's bad because it's sent to millions and therefore
is unlikely to reach the few (pathetic) souls whom it would actually interest.
It seems to me, therefore, that a "this is spam" button in the mail reader, that
sends out a warning (after aggregation with sufficiently many other "this is
spam" reports) that other mail clients can use for filtering, would be good.
Even better would be some sort of (either automated or user based ("this is get
rich quick/porn/sales/.... spam" pull down)) spam classification, so users can
subscribe to multiple classes of spam filtering.
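
A minimal sketch of the aggregation step, assuming a central collector that
receives "this is spam" reports. The class name, the body fingerprint, and
the threshold value are hypothetical choices for illustration; a real system
would need a fuzzier fingerprint and some protection against abuse.

```python
import hashlib
from collections import defaultdict


class SpamReportAggregator:
    """Flag a message once enough distinct users have reported it."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        # (category, fingerprint) -> set of reporter ids
        self.reporters = defaultdict(set)

    @staticmethod
    def fingerprint(body):
        """Hash the body after normalising case and whitespace."""
        normalized = " ".join(body.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def report(self, reporter_id, body, category="spam"):
        """Record one report; return True once the threshold is reached."""
        key = (category, self.fingerprint(body))
        self.reporters[key].add(reporter_id)  # repeat reports count once
        return len(self.reporters[key]) >= self.threshold

    def is_flagged(self, body, subscribed=("spam",)):
        """Check a message against the categories a user subscribes to."""
        fp = self.fingerprint(body)
        return any(
            len(self.reporters.get((category, fp), ())) >= self.threshold
            for category in subscribed
        )
```

Keying the reports by category is what would let users subscribe to some
classes of spam filtering and not others.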

It also seems to me that the best way of implementing this is by allowing (java)
filter plugins to mozilla - there are a bunch of good ways of doing this, and it
would be nice if there were a simple api that people can write to that would
support the evolution of multiple mail filter methods. (or perhaps XPI is the
right way, in the mozilla world, although it seems inherently less cross-platform).

I did not read the entire bug, here is what I want to add:
It would be nice to
a) right click a mail then select "declare as spam"
b) mozilla pops up a window where I can choose which parts of the mail are
relevant spam-markers, e.g. "To" (prefilled with Content of "To" Header), "From"
(prefilled with FROM information) etc.

I hope this has not already been covered in this bug.
*** Bug 65761 has been marked as a duplicate of this bug. ***
Depends on: 65761
*** Bug 68603 has been marked as a duplicate of this bug. ***
Michael Witbrock's "this-is-spam button" suggestion sounds like NoCeM. See bug
#73075.
It's a bit like NoCeM, except, because it's talking about spam mail instead of
netnews, it can take advantage of the fact that spams go to lots of people to do
some of what NoCeM's reputation system seems to be trying to do.

Certainly a solution based on the NoCeM outline would make me happy if it
achieved the same purpose.
Ah. That would be NoCeM for email (aka NoCeM-E) then.
http://www.novia.net/~doumakes/abuse/ It works in pretty much the same way as
NoCeM, except that notices are sent to a mailing list instead of a newsgroup.
The only application that supports it so far is a procmail application also
called NoCeM-E: http://www.novia.net/~doumakes/abuse/nocem-e.zip .

It still uses the reputation system however, AFAICT.
Filed bug #76773 for NoCeM-E support.
Blocks: 66425
*** Bug 89091 has been marked as a duplicate of this bug. ***
This would be useful for filtering mailing lists into folders.  Currently, the
only effective way to filter mail sent to a mailing list is to look at the full
message headers, go to the add filter dialog, type "X-mailing-list", and
then type the value of the "X-mailing-list" header from the "example"
message.  (Filtering mailing lists based on to: creates problems when a single
message is sent to two mailing lists you're on, or when someone replies to both
you and the mailing list to make sure you can reply quickly.)
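
A minimal sketch of doing those manual steps from an example message,
assuming the message is available as raw text. The rule dictionary is a
hypothetical format, and checking "List-Id" as a fallback is an addition
not mentioned above.

```python
from email import message_from_string

# Headers that identify a list; "List-Id" is an assumed extra candidate.
LIST_HEADERS = ("X-Mailing-List", "List-Id")


def list_filter_from_example(raw_message):
    """Build a rule from the first list header found in the example."""
    msg = message_from_string(raw_message)
    for header in LIST_HEADERS:
        value = msg.get(header)  # header lookup is case-insensitive
        if value:
            return {"header": header, "operator": "is", "value": value.strip()}
    return None  # the example carries no list header


def filter_matches(rule, raw_message):
    """Apply a rule produced by list_filter_from_example to a message."""
    msg = message_from_string(raw_message)
    return (msg.get(rule["header"]) or "").strip() == rule["value"]
```

Matching on the list header rather than To: sidesteps the cross-posting and
reply-to-both cases described above.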
95% of all spam can be filtered very simply: if the message does not have any of
your email addresses in its To or Cc fields, it's almost certainly spam. So
before you go into artificial intelligence, why not implement this simple filter
as an optional feature?
dmitry:

If this "instant spam filter" feature (which should probably be in its own RFE
bug if you want it) is implemented, it should be done in such a way as to
educate users and minimize the likelihood of data loss. I subscribe to many
email lists which send me the list mail without putting my address on To: or
Cc:. My address happens to appear in a Received: header, as the receiving
sendmail is configured to stuff the envelope recipient in there for me. Some
folks may not have their email address in *any* header.

Also, I receive a fair amount of legitimate BCC mail.

If I were to turn on this feature, I would have a great deal of legitimate mail
treated as spam. Therefore, the treatment should not be severe. Once labels are
available, perhaps the message could be colored yellow for "suspected spam" or
something like that. In any event, the default action should not be to delete
the message, and the user should be advised to monitor the suspected spam for
legitimate list traffic and BCC'ed mail.
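
A minimal sketch of the To/Cc check combined with the gentle treatment
described here: the message is only labelled, never deleted. The function
names and the "suspected-spam" label are hypothetical.

```python
from email import message_from_string
from email.utils import getaddresses


def addressed_to_me(raw_message, my_addresses):
    """True if any of the user's addresses appears in To or Cc."""
    msg = message_from_string(raw_message)
    recipients = msg.get_all("To", []) + msg.get_all("Cc", [])
    found = {addr.lower() for _, addr in getaddresses(recipients)}
    mine = {a.lower() for a in my_addresses}
    return not found.isdisjoint(mine)


def label_for(raw_message, my_addresses):
    """Return a label for review instead of taking a destructive action."""
    if addressed_to_me(raw_message, my_addresses):
        return None
    # List mail and BCC'ed mail land here too, so only mark the message.
    return "suspected-spam"
```

As noted above, legitimate list traffic and BCC'ed mail would also receive
the label, which is why the sketch stops at labelling.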

... but as I mentioned, I don't think that's what this bug is about.

-matt
Message/Create Filter is disabled in 2002020409?
To answer comment #20: yes, it's disabled in menu for news. It is incorrectly
enabled on header popup for news message, covered under bug 106520.  Both items
should not be enabled in news, but are indeed enabled and working in mail in
current builds.
Responding to dmitry's "instant spam filter" idea: not only would that get a lot
of false positives, it wouldn't work: spammers would just add the necessary
header to avoid getting filtered, and we'd be back where we started.
*** Bug 135353 has been marked as a duplicate of this bug. ***
mpt suggested the text "Filter Messages Like This..." for a context menu item
corresponding to this feature (http://www.mozilla.org/projects/ui/menus/shortcut/).
Status: NEW → ASSIGNED
I'd still like to see you being able to right click on the message which brings
up a menu of which one option is to set a filter according to that message.

In Windows, Becky2 has this and in Linux KMail has it, so it shouldn't be too
difficult to find a cross platform solution.
*** Bug 152967 has been marked as a duplicate of this bug. ***
Just a thought on ( semi- )automatic filtering:

I sort messages into folders immediately to keep the inbox clean.  I would like
to be able to drag a message from Inbox to a folder and have Moz pop up a dialog
asking if I want to always move messages from 'sender' to 'folder'.  If I say
yes, a filter should be created automatically with an intelligent name.  If I
say no, it should remember this and not ask me again.  I should be able to reset
the excluded list and start getting prompts again.
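
A minimal sketch of that prompt logic, with the dialog abstracted as a
callback. The class name, the `ask` callback, and the generated filter name
are hypothetical.

```python
class MoveFilterPrompter:
    """Offer to create a filter when a message is dragged to a folder."""

    def __init__(self, ask):
        self.ask = ask          # callback(sender, folder) -> bool, the dialog
        self.filters = {}       # sender -> destination folder
        self.declined = set()   # (sender, folder) pairs the user said no to

    def on_message_moved(self, sender, folder):
        """Return the name of a newly created filter, or None."""
        sender = sender.lower()
        key = (sender, folder)
        if sender in self.filters or key in self.declined:
            return None  # already filtered, or the user asked not to be asked
        if self.ask(sender, folder):
            self.filters[sender] = folder
            return f"Move mail from {sender} to {folder}"
        self.declined.add(key)  # remember the refusal
        return None

    def reset_declined(self):
        """Clear the excluded list so that prompts appear again."""
        self.declined.clear()
```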

This is aimed at helping to filter those messages that have gotten through the
comprehensive spam filters mentioned above and into your Inbox.

Dan Bush
I agree with last few comments - spam aside, it's useful to be able to quickly
set up a new filter for legitimate email that needs to be sorted to a folder.
Somebody please change bug status to NEW. A bug ASSIGNED to
nobody@mozilla.org is just ridiculous
thank you for volunteering, when do you expect to have this finished?
Assignee: nobody → danielwang5168
Status: ASSIGNED → NEW
back to nobody@mozilla.org

I didn't volunteer. I just pointed out that a bug with ASSIGNED status should
not be assigned to nobody@mozilla.org
Assignee: danielwang5168 → nobody
I hope this will cover the following situation:

I received a message that has a name in the Sender column where all email
messages are listed but when reading the message the From field is blank.  I
don't know what the difference is between Sender and From but leaving one or the
other of these fields blank is the sign of a spammer.

It would be great if in creating Message filters in general you could specify
that a message with a specified field left blank be filtered out.

Thanks,
Ray

P.S. If I'm using this bug reporting system in error I apologize.  Please
redirect as appropriate. RLL
What about using CRM114 for this?

See http://crm114.sourceforge.net/
No, crm114 is GPL, and Mozilla can't use GPL code :(
Well, perhaps not use the CRM114 code but its algorithms, as there are a few
papers published on it.
AI for non-spam filtering is covered in bug 168905.  

Instead of AI, I want a dialog that lists the message's headers and asks me
which header(s) to filter on.  I could select things like X-mailing-list or
X-bugzilla-reason from the dialog.  Bug 89091 covers a few cases (creating
filters based on the "from" or "to" addresses).
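
A minimal sketch of the data behind such a dialog, assuming the message is
available as raw text: list every header with its value, then turn the
user's selection into conditions. The function names and the condition
format are hypothetical.

```python
from email import message_from_string


def header_choices(raw_message):
    """List (name, value) pairs to show in the dialog, in message order."""
    msg = message_from_string(raw_message)
    # Collapse folded (multi-line) header values onto a single line.
    return [(name, " ".join(value.split())) for name, value in msg.items()]


def build_filter(choices, selected_names):
    """Turn the headers the user ticked into filter conditions."""
    wanted = {name.lower() for name in selected_names}
    return [
        {"header": name, "operator": "is", "value": value}
        for name, value in choices
        if name.lower() in wanted
    ]
```

Listing all headers, rather than a fixed set, is what makes headers such as
X-mailing-list or X-bugzilla-reason selectable.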
*** Bug 190709 has been marked as a duplicate of this bug. ***
Don't Mozilla's spam blocking features fix the problem in this bug?
'no'
"But even better would be to select a bunch of messages and have an intelligent
algorithm look through and see what's the same about them.  It would need to
know about such things as "to or cc", which fields to look on, ability to search
for substrings such as "@x.com", etc.  This would help novice users especially I
believe."

- Mozilla's bayesian filtering is an "intelligent algorithm" which parses the
document for what is common between spam messages. That was the feature request
and it has been implemented.
'no'

the intelligent algorithm is to decide which pieces of the message should be
used for the filter. not to decide that a message is junk. there's a difference.
*** Bug 240762 has been marked as a duplicate of this bug. ***
(In reply to comment #40)
> "But even better would be to select a bunch of messages and have an intelligent
> algorithm look through and see what's the same about them.
> 
> - Mozilla's bayesian filtering is an "intelligent algorithm" which parses the
> document for what is common between spam messages. That was the feature request
> and it has been implemented.

Can't the Bayesian filter be tweaked to sort mail based on the "move message
to:" command? It should be much more effective here than with spam.
Adding a "Move message to and learn" option could trigger training of the
Bayesian filter (or whichever filter mechanism is available).
Care has to be taken that it is easy to remove the filter for a specific folder
if the folder is deleted or split up into separate folders...
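
A minimal sketch of "move and learn" as a small naive Bayes classifier over
folders. This is not Mozilla's Bayesian junk filter; the class name, the
tokenizer, and the smoothing are hypothetical choices for illustration.

```python
import math
import re
from collections import Counter, defaultdict


class FolderClassifier:
    """Learn from manual moves and suggest a folder for new mail."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # folder -> token counts
        self.message_counts = Counter()          # folder -> messages learned

    @staticmethod
    def tokens(text):
        return re.findall(r"[a-z0-9@.]+", text.lower())

    def learn(self, text, folder):
        """Called when the user picks "Move message to and learn"."""
        self.word_counts[folder].update(self.tokens(text))
        self.message_counts[folder] += 1

    def forget_folder(self, folder):
        """Drop training data when a folder is deleted or split up."""
        self.word_counts.pop(folder, None)
        self.message_counts.pop(folder, None)

    def suggest(self, text):
        """Return the most probable folder, or None if nothing was learned."""
        if not self.message_counts:
            return None
        vocab = set().union(*self.word_counts.values())
        total = sum(self.message_counts.values())
        best, best_score = None, -math.inf
        for folder, count in self.message_counts.items():
            words = self.word_counts[folder]
            # Add-one smoothing, with one extra slot for unseen tokens.
            denom = sum(words.values()) + len(vocab) + 1
            score = math.log(count / total)
            for tok in self.tokens(text):
                score += math.log((words[tok] + 1) / denom)
            if score > best_score:
                best, best_score = folder, score
        return best
```

Keeping the counts per folder is what makes it easy to remove the training
for one folder without retraining the others.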
Product: MailNews → Core
*** Bug 257505 has been marked as a duplicate of this bug. ***
With roughly 350 filters and over 60,000 mails stored on IMAP, every day I have to add new rules/filters.

I want to have a "one-click-rule/filter"

Suggestion: When there is a new contact I will often mail in the future, I have to make a rule in several steps.
Example: a new customer is werner.warweg@kdv-dt.de
- add a new folder at inbox called "werner_warweg@kdv-dt_de" (because Windows has problems with dots)
- press "create a filter"
- press "+"-button because I need "from" and "to"
- add the missing information
- leave "edit rule"
- press "act this rule at active folder"

This should be done automatically.
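
A minimal sketch of those steps collapsed into one call. The function name
and the returned rule format are hypothetical; the folder naming follows
the example above (dots replaced by underscores).

```python
def one_click_filter(address):
    """Derive a folder and a from/to rule from a single email address."""
    address = address.strip().lower()
    folder = address.replace(".", "_")  # avoid dots in the folder name
    return {
        "name": f"Mail with {address}",
        "match": "any",  # either condition is enough
        "conditions": [
            ("from", "contains", address),
            ("to", "contains", address),
        ],
        "action": ("move to folder", f"Inbox/{folder}"),
    }
```

For werner.warweg@kdv-dt.de this yields the folder name
"werner_warweg@kdv-dt_de" used in the example above.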

Warm regards
Werner
this, plus bug 123375, would bring TB closer to tin's filter functionality.
Guys - I was the submitter of bug 401751 and I'm not sure exactly where this bug is going.  I want something akin to a ".newsrc" file for mail, from a view standpoint.  I want to be able to hide threads that I've killed in a mailbox.  So for instance, I receive one mailing list with approximately 100 messages per day, many of which are in as few as 2 threads.  I might not care about either of those threads, but still want to archive them.

Is this the right bug?
(In reply to comment #48)
>...
> Is this the right bug?

depends if first sentence of comment 0 is what you want.
based on my reading of your bug, the answer is yes.
Filter on "Nobody_NScomTLD_20080620"
QA Contact: laurel → backend
Product: Core → MailNews Core
If one message is active when the button is clicked, populate the fields with greyed-out text taken from the message itself. The 'smarter' option would offer a selection of optional text, or a popup that lets the user highlight (and auto-copy) text before manually editing a given field.

If multiple messages are selected, defer to prior comments.
Keywords: uiwanted
Severity: normal → enhancement
Keywords: helpwanted
Priority: P3 → --
See Also: → 123375
See Also: → 259746
Severity: normal → S3