11035 - [meta] junk/spam blocking filters features tracking

Bulk-resolving requests for enhancement as "later" to get them off the Seamonkey bug tracking radar. Even though these bugs are not "open" in bugzilla, we welcome fixes and improvements in these areas at any time. Mail/news RFEs continue to be tracked on http://www.mozilla.org/mailnews/jobs.html

Phil Peterson

Comment 4

•

25 years ago

Reopen mail/news HELP WANTED bugs and reassign to nobody@mozilla.org

Michael Lowe

Updated

•

25 years ago

Keywords: helpwanted

Phil Peterson

Updated

•

25 years ago

Summary: [HELP WANTED]spam blocking filters features → spam blocking filters features

Whiteboard: HELP WANTED

Target Milestone: M15

Scott MacGregor

Comment 5

•

25 years ago

moving out there.

Target Milestone: --- → Future

Håkan Waara

Comment 6

•

24 years ago

*** This bug has been marked as a duplicate of 71413 ***

Status: NEW → RESOLVED

Closed: 24 years ago

Resolution: --- → DUPLICATE

scottputterman

Comment 7

•

24 years ago

Håkan, I don't think this is a dup. This bug is talking about adding intelligent spam filters to the product that would automatically add messages to a junk mail folder. The other bug is talking about adding a feature that lets the user manually add a sender or domain to a block list. They have similar results but are different in how they do things. I'm going to reopen. If you feel differently, let's discuss.

Status: RESOLVED → REOPENED

Resolution: DUPLICATE → ---

scottputterman

Comment 8

•

24 years ago

Just reread the bug. This does mention block sender, so that is a dup of the other bug, but the other things mentioned, like algorithmic blocking and integration with RBL or NoCeM, are not.

Garth Wallace

Comment 9

•

24 years ago

Isn't RBL a server-side solution?

Garth Wallace

Comment 10

•

24 years ago

Marking dependent on bug #73075 (NoCeM support). I suppose algorithmic blocking should get its own bug too and be marked as a dependency, but I'm not quite sure what is meant by "algorithmic blocking".

Depends on: 73075

Garth Wallace

Comment 11

•

24 years ago

This may actually be a dup of bug #66425.

tbone

Comment 12

•

23 years ago

The right-click popup menu should show these options: "Move To", "Copy To", and "Filter To". The "Filter To" option would filter all mail from this sender to the selected folder (or trashcan).

ivo welch

Comment 13

•

23 years ago

In fact, it should be a big button, right next to the "SEND BUTTON": "BLOCK FUTURE EMAILS BY THIS SPAMMER". /iaw

Akkana Peck

Updated

•

23 years ago

Blocks: 101001

Eric Vaandering (no email)

Comment 14

•

23 years ago

In response to comment 9, RBL is usually used by servers of ISPs to actually block mail, however on their site I found a link to a perl script to be used with procmail to block spam using their (and two other) services. So, it would appear that they don't mind individual users using their service. I'll attach the perl script just as a reference. Basically, I think one just does a DNS lookup of the mail origination using a server from mail-abuse.org and looks at the output.

Eric Vaandering (no email)

Comment 15

•

23 years ago

Attached file Perl script to check if sender is in RBL (a spammer) — Details

This Perl script by Bjarni R. Einarsson checks to see if the sender of a mail is in the mail-abuse.org list or the orbz.org list. It also takes as input a file of good, but normally blocked IPs.

Raphael Wegmann

Comment 16

•

23 years ago

In response to comment #13, that would be:
*) Analyse the "Received" header lines to find out the originating IP.
*) Remove further mails with this IP in the "Received" header.
*) Do a whois lookup to find a responsible contact for this IP.
*) Send a spam complaint to that contact.
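The first step of that list can be sketched roughly as follows. This is only an illustration in Python, not Mozilla code: the function names are made up, and it assumes relays write the peer address in square brackets, which is common but not guaranteed. Each relay prepends its own "Received" header, so the last one in the message is the earliest hop. Anything below the first relay you trust can be forged by the sender, so the result is a hint, not proof.

```python
import re
from email import message_from_string

# Assumption: the relay recorded the peer address as "[a.b.c.d]".
BRACKETED_IPV4 = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def received_ips(raw_message):
    """Return bracketed IPv4 addresses from all Received headers, top to bottom."""
    msg = message_from_string(raw_message)
    ips = []
    for header in msg.get_all("Received", []):
        ips.extend(BRACKETED_IPV4.findall(header))
    return ips


def originating_ip(raw_message):
    """Return the address from the bottom-most Received header, or None."""
    ips = received_ips(raw_message)
    return ips[-1] if ips else None
```

The whois lookup and the complaint mail are left out on purpose, since both depend on services the sketch cannot assume.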

Akkana Peck

Updated

•

23 years ago

No longer blocks: 101001

Lapo Luchini

Comment 17

•

23 years ago

In reply to comment #9: well, RBL is based on DNS lookup (you do the lookup on the DNS of the chosen RBL service, e.g. relays.ordb.org or relays.osirusoft.com). A DNS lookup can easily be done by Mozilla itself, so why not? Having a "move if" - "listed in RBL given:" - "relays.ordb.org" would be quite useful. Shouldn't be too difficult, but I should take a look at the filter code before judging that...
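For reference, a DNS-based list lookup of this kind usually works by reversing the octets of the IPv4 address, appending the list's zone, and asking for an A record: an answer means "listed", a name-not-found error means "not listed". A minimal Python sketch of that convention follows. The function names are hypothetical, the zone is just the one named above (the sketch does not assume it still answers), and the resolver is passed in so the logic can be exercised without network access.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNSBL query name: reversed IPv4 octets followed by the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone, resolver=socket.gethostbyname):
    """Return True if the list's zone has an A record for this address.

    Note: this does not distinguish "not listed" from a resolver failure;
    both surface as socket.gaierror here.
    """
    try:
        resolver(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True
```

Because `socket.gethostbyname` blocks, a real filter would need to run this off the main thread or use an asynchronous resolver.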

David :Bienvenu

Comment 18

•

23 years ago

DNS lookups are async and the filter code is synchronous, so there's a big problem right there. Not unsolvable, but not easy either.

(not reading, please use seth@sspitzer.org instead)

Reporter

Comment 19

•

23 years ago

mscott's been expressing interest in improving our spam fighting features.

Assignee: nobody → mscott

Status: REOPENED → NEW

Phillip Oertel

Comment 20

•

23 years ago

Cloudmark offers a spam-filtering plugin for m$ outlook. the plugin relies on a central database maintained by the plugin's users. the plugin adds a 'block' and 'unblock' button to the email client. might be interesting to integrate support in mozilla or 'evangelize' cloudmark to do it themselves. check out http://theregus.com/content/6/25317.html for more technical details or visit http://www.cloudmark.com/ also, is there a spam filtering tracking bug in bugzilla? there are lots of bugs & ideas but no coordination, as it seems to me. i believe spam filtering is an important feature of a modern email client, and a feature that would make me switch email clients. phil

NorthMan

Comment 21

•

23 years ago

I have put together a few filters which should block at least 80-95% of spam. I hope it BLOCKS all spam, but that's just hopefulness. It uses Mozilla mail's filters, and it has several actual filters, each with their own subfilters/rules. In this way it's "modular" (actually not really, just more customizable) in that if you actually subscribe to, say, "porn mail" you can turn off the set of filters that targets porn mail. I was disappointed that the filters did not allow me to scan the complete headers, however I think my "filter package" is ready for use. I've named it "SpamSlayer", and it needs an easy installer package. Since it's a ruleset, it goes in the mail folder, located in slightly different paths on different computers. Usually it's something like: c:\windows\application data\mozilla/profiles/default user/(random string).slt/mail/(server IP address (don't know if this is the pop3 or smtp, since both of mine are the same)) It would be great if someone could set up an installer that could find this dynamic location and replace the current ruleset with the SpamSlayer ruleset, or even better, APPEND the SpamSlayer ruleset to the current one (so current filters aren't overwritten and lost).

NorthMan

Comment 22

•

23 years ago

Forgot to say this, but if anyone can work with me to set up an installer (and I'll be releasing new versions if necessary) please email me 6tsh7a001@sneakemail.com

NorthMan

Comment 23

•

23 years ago

OK everyone...here's the URL for the SpamSlayer project, which I described above: http://spamslayer.mozdev.org For right now, it should solve our spam problem.

Angus Davis

Comment 24

•

23 years ago

Why not integrate with something like SpamAssassin? I believe SA is recognized as the best tool out there today for this sort of thing. Some of it could be made relevant on the client side: http://spamassassin.taint.org/ For example, one aspect that would be fairly obvious to integrate would be Razor: http://razor.sourceforge.net/ SpamAssassin uses Razor and other rules to eliminate spam. It is the best out there, open source, so maybe there is someone on their project who would be keen to help integrate it with Mozilla in some fashion.

Eric Vaandering (no email)

Comment 25

•

23 years ago

SpamAssassin is a Perl module that one can very easily pass a message to and receive its "opinion" on whether the mail is spam or not. In fact, I've done this with a stand-alone IMAP client to deal with the mail as I see fit. Razor can be installed as one of the inputs into SpamAssassin. I don't know if it would be possible to integrate SA into Mozilla (maybe a plugin or something) but I can certainly vouch for its effectiveness.

NorthMan

Comment 26

•

23 years ago

I definitely would like to see SpamAssassin (and/or Razor) integrated into mozilla mail. However, with the not-so-important rating that this bug has, I don't think it will be getting any SpamAssassin or Razor integration anytime soon.

scottputterman

Comment 27

•

23 years ago

reassigning to dmose and raising priority. we need to start working on anti-spam features for Mozilla. Maybe someone cc'd on this bug knows if there's a better bug out there to serve as a Meta bug. If not, let's make this one and start adding other bugs as dependencies.

Assignee: mscott → dmose

Keywords: helpwanted

Priority: P3 → P1

Target Milestone: Future → mozilla1.2beta

Alec Flett

Comment 28

•

23 years ago

file a new bug on the SpamAssassin integration.. that way if someone implements new spam blocking which DOESN'T use SpamAssassin, then this bug can be marked fixed without causing a big brouhaha

Dan Mosedale (:dmosedale, :dmose)

Comment 29

•

23 years ago

Attached patch Checkpointing: msg filtering, spam-assassin: part 1: diffs — Details — Splinter Review

I'm just checkpointing here; this is far from being ready to land. However, it includes the beginning of a straw-man interface for a more generic message filter (lots of work still required on that), as well as the beginnings of an implementation of a filtering plugin which can read and use spam-assassin config files.

Dan Mosedale (:dmosedale, :dmose)

Comment 30

•

23 years ago

Attached patch Checkpointing: part 2: non-cvs diffs — Details — Splinter Review

Alec Flett

Comment 31

•

23 years ago

I'm not 100% sure how this fits into your filter plugin architecture, but here's one thing I was thinking: Right now there is only one type of filter, that uses the generic filter dialog to match on headers, etc. It would be cool if you could register your own type of filter, such that you could create one or more of these other types of filters.. so in your filter list, you might see your standard list of filters, but one of them might be "Spam Assassin" or something.
Since the filter plugin is merely a specific type of filter, you could have multiple instances of the plugin, such as 2 SpamAssassin plugins - one that moves high-threshold spam (i.e. stuff that SpamAssassin most certainly knows is spam) to one folder (maybe the trash), and low-threshold spam (i.e. stuff it's not sure about) to another (maybe a 'might be spam' folder). Each filter could have its own settings stored in the filter rules.dat file, so that you could store instance-specific data. Then, you could do stuff like select the filter and edit it, but when you edit it, chrome specific to that filter would appear - i.e. like the SpamAssassin dialog.
As for actually doing something with the filter, there should be lots of options beyond just moving it to a folder, etc. It would be nice if there were some generic interface like nsIMsgFilterSink where you could do macro-operations on the message, such as forward, reply, maybe edit it and reply, etc. SpamAssassin would mostly call things like sink.moveMessageTo(spamFolder) and so forth.
JavaScript filters could be implemented in very much the same way. You could have one or more JS filters.. the JS filter would be a function that gets called with the message headers, maybe other details about the message or message parts, and the message sink object. Then the implementation of this function would perform operations on the sink.
Whitelist filters would also work this way - you could have one or more whitelist filters that correspond to an entire address book, or maybe just a mailing list. Some smart wizard could even set up your whitelist filters for you. Each whitelist filter would correspond to a different set of people in your address book, and each one could have a specific action associated with it.
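To make the shape of that idea concrete, here is a rough Python sketch of "filter instances that act on a sink". Everything in it is hypothetical: `RecordingSink`, `ScoreFilter`, `run_filters` and the toy `exclamation_score` are invented names, and the scoring function is a stand-in, not anything SpamAssassin does. It only illustrates two instances of the same filter type with different thresholds and target folders.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class RecordingSink:
    """Hypothetical sink: records the macro-operations a filter requests."""
    actions: List[Tuple[str, str]] = field(default_factory=list)

    def move_message_to(self, folder: str) -> None:
        self.actions.append(("move", folder))


@dataclass
class ScoreFilter:
    """One filter instance: a scoring function plus instance-specific settings."""
    name: str
    score: Callable[[Dict[str, str]], float]
    threshold: float
    folder: str

    def apply(self, headers: Dict[str, str], sink: RecordingSink) -> bool:
        if self.score(headers) >= self.threshold:
            sink.move_message_to(self.folder)
            return True
        return False


def exclamation_score(headers: Dict[str, str]) -> float:
    """Toy stand-in for a real scorer: counts '!' in the subject."""
    return float(headers.get("Subject", "").count("!"))


def run_filters(filters: List[ScoreFilter], headers: Dict[str, str],
                sink: RecordingSink) -> Optional[str]:
    """Apply filters in list order; return the name of the first one that hits."""
    for f in filters:
        if f.apply(headers, sink):
            return f.name
    return None
```

Listing the high-threshold instance first means a message that clears both thresholds goes to the trash rather than to the "might be spam" folder.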

ivo welch

Comment 32

•

23 years ago

I think this bug is missing the big issue: we need to include the mozilla user community in feedback. This is so easy to do: a button next to the "STOP" sign would send a message to a user-selectable anti-spam site, with the base information of the particular email being flagged spam. This way, the anti-spam site could much faster detect new spam schemes. Implementation Cost: Low. Potential Value: Very High. /iaw

Alec Flett

Comment 33

•

23 years ago

That's Yet Another Bug (and another filter type) and you should file a bug on that.. and your cost analysis seems quite weak.. sure, the client-side work doesn't sound hard, but think about all the details beyond just adding the button in the UI. I mean, who runs this service? are there well-known anti-SPAM services out there? how does the client know to block future SPAM? how does it match current e-mail to spam on the service without downloading a whole bunch of spam from the service and without compromising the user's privacy? How does one handle failure (spam service unavailable, etc)? However it's done, it sounds expensive to me. But wait! don't answer me here. File another bug, make it dependent on this one.

Dan Mosedale (:dmosedale, :dmose)

Comment 34

•

23 years ago

alecf: in fact, this is quite similar to what I had in mind, but hadn't yet written down. If you've no objections, I'd like to start with your text and whip it up into a strawman proposal in HTML that we can go with. The spam-assassin bits that I've got running so far are implemented exactly along the lines you suggest. In particular, I've made a simple nsIMsgFilterPlugin interface, and modified the IMAP message header fetching code to call out to it once for each message. I've implemented this interface for spam assassin as a JS component, because much of the spam-assassin stuff is regexp based. Once it decides that something is spam, it just uses the existing nsIMsgFilterHitNotify::ApplyFilterHit info to tell the IMAP (or POP or whatever) code how to deal with the hit.

Status: NEW → ASSIGNED

Alec Flett

Comment 35

•

23 years ago

sure - go ahead and use whatever part of that you need!

Russell Odom

Comment 36

•

22 years ago

Check out bug 163188 (Bayesian filtering - very cool!) - looks like this bug should be dependent on it?

jglick

Comment 37

•

22 years ago

http://www.mozilla.org/mailnews/specs/filters/#Junk After discussing w/putterman, here is an idea of what the UI might look like. Maybe "Always accept messages from people in my AB" has a dropdown to select a specific AB. That AB becomes a White List of sorts.

URL: http://www.mozilla.org/mailnews/specs...

Russell Odom

Comment 38

•

22 years ago

A thought re "Always accept messages from people in my address books":
1) At the very least, this should exclude the 'collected addresses' address book!
2) More flexibility would be possible if this was: "Always accept messages from people in the following address book/group:" with a drop-down which enables people to select a particular address book, or just a particular list within an address book
3) Alternatively, this could be "Always accept messages from people in my 'white list' address book" - this address book would be a new top-level book, at the same level as 'Personal Address Book' and 'Collected addresses'.
Also, I think there's a bit too much granularity on the 'sensitivity' slider - nobody is really going to spend enough time tweaking and analysing to see the difference between e.g. 35 and 36. I suggest a 0-10 scale - enough to get reasonably fine degree of control without taking too much trial and error to find your 'optimum' level. Other than that, looks pretty good.

jglick

Comment 39

•

22 years ago

>2) More flexibility would be possible if this was: "Always accept messages from
>people in the following address book/group:" with a drop-down which enables
>people to select a particular address book, or just a particular list within an
>address book
Agree a dropdown address book selector would be better.
>Also, I think there's a bit too much granularity on the 'sensitivity' slider.
Agree.

Henry Jia

Comment 40

•

22 years ago

Making Mozilla intelligently block spam mail would be a good feature.

Blocks: 168902

Dan Mosedale (:dmosedale, :dmose)

Updated

•

22 years ago

Depends on: 169557

Dan Mosedale (:dmosedale, :dmose)

Updated

•

22 years ago

Depends on: 167561

Dan Mosedale (:dmosedale, :dmose)

Updated

•

22 years ago

Depends on: spamfe

Dvir

Comment 41

•

22 years ago

I suggest also adding a list of "non-spam domains" to the screen. This way people can specify domains of sites / companies they work for never to be filtered as spam.

NorthMan

Updated

•

22 years ago

Depends on: 156744

NorthMan

Comment 42

•

22 years ago

bug 156744 seems like an easy solution for this bug. TMDA is an open source project on sourceforge, and so all the code is there, you just would have to "port" it probably. It also is designed so that it absolutely blocks 100% of spam with virtually no false positives.

Michael Baffoni

Comment 43

•

22 years ago

Regarding Comment #38: How about turning the interface around - would it be so much more difficult to add a "whitelist" or "always accept email" property in individual addressbook cards, and in the top-level properties of an addressbook - especially in ldap-based addressbooks that are used for corporate addressbooks? The idea of no-spam domains, much like we currently have uses-HTML-domains, sounds like a good idea. Speaking as a corporate site, I would like to make sure that a SpamAssassin-style filter not be the ONLY filter available because, as someone else pointed out, it could present significant privacy/security issues. bug 163188 seems like a good alternative method. Also, does the SA software use any special ports that might get blocked by a firewall?

David :Bienvenu

Comment 44

•

22 years ago

we're adding a whitelist feature, based on the personal address book of your choice, IIRC.

Michael Baffoni

Comment 45

•

22 years ago

Bienvenu - what if you have multiple addressbooks that you want to function as whitelists? By setting a property in the addressbook, you have more flexibility than just selecting one AB (which it appears is all the drop-down would allow). For that reason you wouldn't be able to (e.g.) set both your PAB and an ldap-based AB as being on a whitelist. So maybe instead of a drop-down menu, you have an edit menu that allows you to check/uncheck your ABs, or even a button (that greys out the edit menu) that says all ABs except collected are whitelists. Or are you saying that design has moved past the simple dropdown mentioned above, and you can select _multiple_ "personal address book of your choice"?

Eric Krock

Comment 46

•

22 years ago

I don't think domain-based whitelists would work very well because spammers often fake email to you coming from your own domain or even from your own address. (Obviously if you are sending a spam to john@foo.com it's not hard to have your spambot mark it from jane@foo.com.) Address book-based whitelists are much better because then the spammer needs to know both your address and the address of a person on your whitelist--a much-harder (although not impossible) combination.
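A small Python sketch of an address-book-based whitelist, covering the "several address books, but not collected addresses" case discussed above. The function names and the dictionary layout for address books are assumptions made for illustration only. The check compares the full address from the From header, which is exactly as forgeable as the paragraph above describes.

```python
from email.utils import parseaddr


def build_whitelist(address_books, excluded=("Collected Addresses",)):
    """Union the addresses of all selected books, skipping excluded ones.

    address_books: hypothetical mapping of book name -> iterable of addresses.
    """
    whitelist = set()
    for name, addresses in address_books.items():
        if name in excluded:
            continue
        whitelist.update(a.lower() for a in addresses)
    return whitelist


def is_whitelisted(from_header, whitelist):
    """True if the address in the From header is in the whitelist."""
    _, address = parseaddr(from_header)
    return bool(address) and address.lower() in whitelist
```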

Nick Cross

Comment 47

•

22 years ago

Hi, I'm currently using 1.1 so apologies if this has been dealt with in 1.2alpha...I have various filters set up which look at the message body. One string I use is 'You received this email because you signed up with' but this wasn't caught today as the message source is base64 encoded with content type of text/html - mozilla seems to decode and display that fine but the filter is run on the non-decoded version. Thanks.
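The mismatch described here is between the transfer-encoded bytes on the wire and the decoded text the user sees. As an illustration only (Python's standard `email` package, not Mozilla's filter code; the function names are made up), a body filter would need to decode each text part before searching it:

```python
from email import message_from_string


def decoded_text_parts(raw_message):
    """Return the decoded text of every text/* part of the message."""
    msg = message_from_string(raw_message)
    texts = []
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        # decode=True undoes base64 / quoted-printable transfer encoding.
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            texts.append(payload.decode(charset, errors="replace"))
        except LookupError:
            # Unknown charset label: fall back to UTF-8 with replacement.
            texts.append(payload.decode("utf-8", errors="replace"))
    return texts


def body_contains(raw_message, needle):
    """True if any decoded text part contains the search string."""
    return any(needle in text for text in decoded_text_parts(raw_message))
```

For text/html parts the decoded text still contains markup, so a phrase split by tags would need tag stripping as well.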

Anthony DeRobertis

Updated

•

22 years ago

Depends on: 71413

NorthMan

Comment 48

•

22 years ago

Looks like we missed our milestone on this bug. What kind of system are we aiming at implementing? I've scanned the comments and can't seem to see a unified goal.

Aleksander Adamowski

Updated

•

22 years ago

Depends on: bayesian

Christian :Biesinger (don't email me, ping me on IRC)

Updated

•

22 years ago

Depends on: 179503

Christian :Biesinger (don't email me, ping me on IRC)

Updated

•

22 years ago

Depends on: 179504

(not reading, please use seth@sspitzer.org instead)

Reporter

Updated

•

22 years ago

Depends on: 179966

(not reading, please use seth@sspitzer.org instead)

Reporter

Comment 49

•

22 years ago

*** Bug 179984 has been marked as a duplicate of this bug. ***

(not reading, please use seth@sspitzer.org instead)

Reporter

Updated

•

22 years ago

Depends on: 179984

(not reading, please use seth@sspitzer.org instead)

Reporter

Updated

•

22 years ago

Depends on: 179997

(not reading, please use seth@sspitzer.org instead)

Reporter

Updated

•

22 years ago

Depends on: 179999

Jean-Francois Ducarroz

Updated

•

22 years ago

Depends on: 180004

(not reading, please use seth@sspitzer.org instead)

Reporter

Updated

•

22 years ago

Depends on: 180010

(not reading, please use seth@sspitzer.org instead)

Reporter

Updated

•

22 years ago

Depends on: 180029

Dan Mosedale (:dmosedale, :dmose)

Updated

•

22 years ago

Depends on: 179502, 179518, 179568, 179588, 179637, 179639

Dan Mosedale (:dmosedale, :dmose)

Updated

•

22 years ago

Depends on: 179162

(not reading, please use seth@sspitzer.org instead)

Reporter

Updated

•

22 years ago

Depends on: 180153

Dan Mosedale (:dmosedale, :dmose)

Updated

•

22 years ago

Depends on: 180167

(not reading, please use seth@sspitzer.org instead)

Reporter

Updated

•

22 years ago

Depends on: 180215

(not reading, please use seth@sspitzer.org instead)

Reporter

Updated

•

22 years ago

Depends on: 180231

Dan Mosedale (:dmosedale, :dmose)

Updated

•

22 years ago

Depends on: 180119

(not reading, please use seth@sspitzer.org instead)

Reporter

Updated

•

22 years ago

Depends on: 180477

(not reading, please use seth@sspitzer.org instead)

Reporter

Updated

•

22 years ago

Depends on: 179012

Dan Mosedale (:dmosedale, :dmose)

Updated

•

22 years ago

Depends on: 120599

Dan Mosedale (:dmosedale, :dmose)

Updated

•

22 years ago

Depends on: 180857

Dan Mosedale (:dmosedale, :dmose)

Updated

•

22 years ago

Depends on: 181193

Dan Mosedale (:dmosedale, :dmose)

Updated

•

22 years ago

Depends on: 181394

Dan Mosedale (:dmosedale, :dmose)

Updated

•

22 years ago

Depends on: 181531

Dan Mosedale (:dmosedale, :dmose)

Updated

•

22 years ago

Depends on: 181953

Robert Kaiser

Updated

•

22 years ago

Depends on: 182381

(not reading, please use seth@sspitzer.org instead)

Reporter

Updated

•

22 years ago

Depends on: 182386

Dan Mosedale (:dmosedale, :dmose)

Updated

•

22 years ago

Depends on: 183613

Dan Mosedale (:dmosedale, :dmose)

Updated

•

22 years ago

Depends on: 181534

Dan Mosedale (:dmosedale, :dmose)

Updated

•

22 years ago

Depends on: 182109

Ray Trent

Comment 50

•

22 years ago

Comment 46 says that domain whitelists are bad because spammers fake the sender as being from your domain, and says "specific email address" whitelists are better. However, the whitelist discussion ignores the fact that the most common address that spammers spoof is the addressee's address (i.e. *you*), and that's perhaps one of the most common ones that people would want to be in their whitelist (I send mail to myself all the time). So something more complicated would seem to be needed. Also, this whole feature doesn't seem to be working for me in the 2002120604 build. Nothing is marked as spam (though I have logged several emails as spam, and a bunch of them as "non spam"). Also, I can't seem to find my junkmail.js file, there doesn't seem to be anything in my junk filter log (though I did turn it on), and I don't see a file anywhere that obviously contains the Bayesian parameters. Is there any preliminary documentation/discussion on this kind of stuff I could look at? I'm really happy to see this feature, BTW.

Christian J. Callsen

Comment 51

•

22 years ago

Check out "training.dat".

David K

Comment 52

•

22 years ago

Two more thoughts on filtering: 1) Instead of/in addition to a white book, how about adding an 'automatically accept email from this person' checkbox to each card in your address book? 2) Allow filters to work on subfolders. I just submitted this as bug 184080 before reading this thread. In brief: Filters A, B, C... filter email to a quarantine folder, deleted and so on. Filter Z searches through your quarantine, deleted, ... and based on its criteria (such as friends' email addresses) moves mail back to your inbox. It's another way of implementing a white list and catches mail you want to read that may otherwise be deleted by your overzealous mail filters.

Ray Trent

Comment 53

•

22 years ago

BTW, I don't know how feasible this would be with the current Bayesian filter mechanism, but it would be really nice (for the curious among us if for no other reason) if the filter log indicated *why* the spam was filtered rather than just indicating that it was. The simple fact of the mail being filtered seems adequately conveyed by the junk mail icon and whether it's been moved into your spam box. So I'd say that the current filter log is pretty useless.
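As a sketch of what a "why" entry could contain: a Bayesian-style classifier combines per-token spam probabilities, so the tokens that pushed the score furthest from neutral are a natural explanation to log. The Python below is a simplified illustration, not Mozilla's implementation. The function names, the 0.4 value for unseen tokens, the 0.01/0.99 clamps and the `top_n` cutoff are all illustrative assumptions.

```python
import math


def token_spam_probability(token, spam_counts, ham_counts, n_spam, n_ham):
    """Estimate how spammy a token is from training counts (clamped)."""
    spam_rate = spam_counts.get(token, 0) / max(n_spam, 1)
    ham_rate = ham_counts.get(token, 0) / max(n_ham, 1)
    if spam_rate + ham_rate == 0:
        return 0.4  # assumed value for tokens never seen in training
    return min(0.99, max(0.01, spam_rate / (spam_rate + ham_rate)))


def classify_with_reasons(tokens, spam_counts, ham_counts, n_spam, n_ham, top_n=15):
    """Return (spam score, [(token, probability), ...]) for logging.

    The reasons are the top_n tokens furthest from the neutral 0.5,
    most decisive first; the score combines only those tokens.
    """
    scored = {
        t: token_spam_probability(t, spam_counts, ham_counts, n_spam, n_ham)
        for t in set(tokens)
    }
    reasons = sorted(scored.items(), key=lambda kv: (-abs(kv[1] - 0.5), kv[0]))[:top_n]
    spam_product = math.prod(p for _, p in reasons)
    ham_product = math.prod(1 - p for _, p in reasons)
    return spam_product / (spam_product + ham_product), reasons
```

A log line could then list the first few entries of `reasons` next to the score.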

David A. Wheeler

Comment 54

•

22 years ago

BTW, please see the related bug #187044, suggesting that it'd be nice to have a challenge/response anti-spam mechanism. See the bug report for more.

Russell Odom

Updated

•

22 years ago

Depends on: 184948

(not reading, please use seth@sspitzer.org instead)

Reporter

Comment 55

•

22 years ago

taking.

Assignee: dmose → sspitzer

Status: ASSIGNED → NEW

Ray Trent

Comment 56

•

22 years ago

I'm not sure whether this should go in this bug or another one, but speaking of the "move to folder" feature, something I find annoying about my Junk mail folder is that when I look at it, there's a (significant) delay while the spam filter appears to try to recategorize all the email. Maybe it's just parsing through the headers looking for some kind of "Mozilla thinks this is spam" flag, but it seems to take too much time for that (although I *do* have my email stored on another machine so it gets backed up automatically). Seems to me that mail could get marked "junk" in the index file, but perhaps that doesn't get regen-ed until you open the folder either... don't know enough about the internals...

David :Bienvenu

Comment 57

•

22 years ago

I'm working on code that will prevent the re-classification of messages moved to the junk folder, if your imap server supports user-defined keywords. The problem is that when we move an imap message, we really don't know what the message will be in the destination folder, due to the way imap works, so we can't "pre-mark" it as junk, other than by using imap keywords on the imap servers that support keywords.

Ray Trent

Comment 58

•

22 years ago

Re: comment 57, how about just not reclassifying folders that aren't inboxes? (I'm presuming Moz knows which those are, because it does a "get new mail" whenever I select my Inbox.) It didn't occur to me to mention that it was an IMAP folder... good catch. However, I can't see any real benefit (and quite a bit of annoyance/potential lossage) to running spam filters on secondary folders. Either the user moved it there (via a filter or by hand), or the Junk mail feature did, and in neither case does it make sense to reclassify it. To clarify my "potential lossage" comment: perhaps there are some desirable emails that look just like some class of spam, and a user would want to set up a filter to move those to a "safe" folder before spam classification. If we then reclassify those secondary folders, data loss occurs. The only good thing it would do is show off how good the filter is (or not :-) by displaying the "junk" tag in the message summary. This doesn't seem that useful to me unless requested by the user for some special purpose. Also, would this problem be solved by moving my junk into a local folder instead of an IMAP one?

David :Bienvenu

Comment 59

•

22 years ago

the reason we run spam filters on secondary folders is that your mail filters can filter mail to secondary folders, and this happens before the spam filter runs (and thus before the message body is downloaded). For example, if you have a filter that moves all messages addressed directly to you to a folder, and you get spam sent directly to you, you want the spam filters to run on that folder when you open it to catch the spam sent directly to you. The alternative is to run the spam filters first, and we don't do that (we might want to reconsider that, but not for this release).

Ray Trent

Comment 60

•

22 years ago

The 20th - 25th comments of bug 181394 raise an extremely important point: we have to do something about the "obviousness" of using the junk mail feature. If people who are savvy enough to be downloading nightlies and entering comments in b.m.o. have a hard time figuring out that you have to train ~50-100 emails as spam before it starts working, how will any normal user have a hope? Possible solution: pre-learn the training data (may be unpopular with this crowd)... any others? Evangelization isn't likely enough, because no one reads documentation...

Peter Lairo

Comment 61

•

22 years ago

RE comment #57:
> I'm working on code that will prevent the re-classification of messages
> moved to the junk folder,
How about not re-classifying *any* messages (no matter where they are moved to)?
> if your imap server supports user-defined keywords.
Or perhaps by giving each message in mozilla a "JunkClassified(Y/N)" flag.
> The problem is that when we move an imap message, we really don't know
> what the message will be in the destination folder, due to the way imap
> works, so we can't "pre-mark" it as junk, other than by using imap keywords
> on the imap servers that support key words.
So it is impossible to track a message's "JunkClassified(Y/N)" and "Label" state when moved from IMAP to local? That would really put a damper on things.

(not reading, please use seth@sspitzer.org instead)

Reporter

Updated

•

22 years ago

Depends on: 188940

David Grant

Comment 62

•

22 years ago

Is this bug going to push 1.2beta back, or are the dependencies of this bug going to be changed so it can make it into 1.2beta?

Michael Lefevre

Comment 63

•

22 years ago

it would be tricky to push 1.2beta anywhere, as it was released last October! 1.3beta won't be held back by general issues, but some of the individual bugs may be blockers, I don't know. target milestone isn't really relevant for tracking bugs like this anyway, so I hope Seth won't mind me taking the liberty of resetting it...

Keywords: meta

Target Milestone: mozilla1.2beta → ---

Ray Trent

Updated

•

22 years ago

Depends on: 191486

Brian Rogers

Updated

•

22 years ago

Depends on: 191723

testing

Comment 64

•

22 years ago

Would it be possible to disable HTML in the Junk Mail folder only? That way, when someone does go through any messages that might not be spam, they don't have to worry about HTML loading that might report their address as active.

Ray Trent

Comment 65

•

22 years ago

Attached file Hard to filter spam concept — Details

I was just thinking about possible ways that spammers could trick our filters, and this one came to me. Basically, this HTML is "M a k e M o n e y F a s t. P l e a s e t a k e o u t a l o a n f r o m u s.". It's just that the alpha letters are in "a few times 'big'" font and most of the spaces are in "many times 'small'" font, so it looks pretty much like normal text. I guess eventually that uncommon standalone characters like "k" would get trained as spam, but that seems dangerous in an engineering environment :-). But I can't think of a good way to avoid this problem except perhaps to include the frequencies of some subset of HTML tags in the list of trained terms... Maybe this kind of trick is covered by bug 181534, though. Anyway, this arms race promises to be an interesting one...
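One possible countermeasure, sketched in Python purely as an illustration: once the HTML tags are stripped (assumed to have happened already), the text is a long run of single letters separated by whitespace, which can be detected and collapsed before tokenizing. The names `GAPPY_RUN`, `is_gappy` and `collapse_gaps` and the "five or more letters" cutoff are assumptions. Collapsing loses the original word boundaries, so this gives a token like "MakeMoneyFast" rather than three words.

```python
import re

# A run of at least five single ASCII letters, each separated by whitespace.
GAPPY_RUN = re.compile(r"(?:\b[A-Za-z]\s+){4,}[A-Za-z]\b")


def is_gappy(text):
    """True if the text contains a long run of whitespace-separated single letters."""
    return GAPPY_RUN.search(text) is not None


def collapse_gaps(text):
    """Remove the whitespace inside each gappy run, leaving other text alone."""
    return GAPPY_RUN.sub(lambda m: re.sub(r"\s+", "", m.group(0)), text)
```

Ordinary prose rarely has five single-letter words in a row, though initials or spelled-out acronyms could trigger it.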

Alec Flett

Comment 66

•

22 years ago

there will ALWAYS be ways that spammers can trick our filters. I'd make references to bush's missile defense system, but they wouldn't really apply since the Bayesian filters are still actually effective.

Matthew Cline

Comment 67

•

22 years ago

Anyone working on spam filters should really look at the SpamAssassin code, since it has lots and lots of ideas to borrow; the trick referred to in comment 65 has the rule GAPPY_SUBJECT (it was found to not be worth looking for it in the body of messages). The rules/STATISTICS.txt file has information that can tell you what rules are worth spending time trying to imitate. It also has tools to check rules against mail archives (assuming you also archive your spam), and to give spam/not-spam ratios for the various rules; thus, SpamAssassin could be useful for prototyping and analyzing rules in Perl before doing them in C++/JavaScript.

Jon Granrose

Comment 68

•

22 years ago

is there a bug filed to have spam mail move to your junk folder automatically X seconds after you toggle the junk status of a message (assuming you have that pref set)? That's the one last feature I really miss. I hate having to train it with the spam it missed, then setting the junk view to delete them all. It would be so much easier to have them disappear automatically once I tag them as junk.

Jon Granrose

Updated

•

22 years ago

Depends on: 194273

Tobias

Comment 69

•

22 years ago

Hello all- (this seems to be the most appropriate bug for what I need to say, sorry if I bother you) I have been testing the spam filter for quite a while now, and I must say it is of little use to me. This might be for some specific reasons: I receive emails in German as well as in English, I am on some mailing lists, and I receive emails from people who are not in my contacts list. I tried to train the spam filter in many different ways, that is, marking all emails either as spam or not, marking only spam mail as spam, marking only the most annoying spam mails as spam, and so on. The results have just not been really satisfying. The best spam filter I have found is www.cloudmark.com spam fighter. See also bug 153522. I am using this spam filter for my business account and the results are great, that is, no real email was marked as spam mail! And that is what you need to rely on. The current spam filter might be of great use for some, and I also see applications in areas other than spam, but I recommend considering cloudmark support as well. Tobias

David Grant

Comment 70

•

22 years ago

I just wanted to reply to comment 69, from a fellow user who has just been trying out the spam filter for the past 2 months. What I'd like to say is that you have to be patient with the Mozilla filter. I have had to build up my Junk folder to 1000 messages of pure spam, until I really started to experience near-perfect spam filtering. And even now, once in a while, a spam gets through. The spam that gets through is sometimes in a foreign language, or it is one of those Nigerian spam messages. As it turns out, my Junk folder does not contain many messages in foreign languages, and I have not received many of those Nigerian spams, so it is not well-trained in this area. But I'm confident that if I get a few more of them, they will start to be picked up for sure. The fact that you get messages in German and English should not matter. And why did you try to test the spam filter by marking all messages as spam? Or, by only marking the most annoying messages as spam? This will just make the filter more inefficient, and you will be forced to have a larger data set, in order to filter out the spams you don't want. Like I said, I have 1000+ messages in my Junk folder right now (I think it's 1300) and I mark practically ALL unsolicited mail as junk/spam. Yet still, maybe 5% of the spams can make it through every day. But that has been steadily dropping every week... I think this brings up an important concern, and that is that this filter takes a long time to become functional in my opinion. Is it possible to make its effect non-linear? ie. make the spam filter weigh more heavily towards marking messages as spam if the Junk folder contains fewer than 100 spams? Many people want a populated training.dat file to be shipped with Mozilla, but I don't think that will ever be possible. A sex therapist, someone in the porn industry, or maybe even someone in the market for a penis enlargement may use Mozilla, and so there are obvious problems with doing this.
I can foresee a lot of people becoming frustrated with the Mozilla filter, as I did in the first week, when I did not see instant results. Is there a way around this? Maybe not, but if there is, then it should be looked into. Just some random thoughts...
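To illustrate why a small training set gives weak results, and what "weighing more heavily" could look like, here is a minimal sketch of a smoothed per-token spam probability. It blends a prior with the observed evidence, so a token seen once barely moves the estimate while a token seen fifty times dominates it. This is a generic smoothing scheme used for illustration, not necessarily what Mozilla's filter does; the function name and the `prior` and `strength` parameters are assumptions.

```python
def token_spam_probability(spam_hits: int, ham_hits: int,
                           n_spam: int, n_ham: int,
                           prior: float = 0.5,
                           strength: float = 1.0) -> float:
    """Smoothed estimate of how spammy a token is.

    spam_hits / ham_hits: training messages of each kind containing the token.
    n_spam / n_ham: total training messages of each kind.
    prior: assumed probability for a token with no evidence (hypothetical knob).
    strength: how many observations the prior is worth (hypothetical knob).
    """
    seen = spam_hits + ham_hits
    if seen == 0 or n_spam == 0 or n_ham == 0:
        return prior  # nothing to learn from yet
    spam_rate = spam_hits / n_spam
    ham_rate = ham_hits / n_ham
    raw = spam_rate / (spam_rate + ham_rate)
    # Weighted average of the prior and the raw estimate.
    return (strength * prior + seen * raw) / (strength + seen)
```

With the defaults, a token seen in 1 of 10 spams and no good mail scores 0.75, while one seen in 50 of 1000 spams scores above 0.99. Raising `prior` while the Junk folder is small is one way to express the non-linear weighting asked about above, at the cost of more false positives early on.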

Jacek Piskozub

Comment 71

•

22 years ago

Tobias: I have a similar situation receiving both spam and legitimate mail in both English and Polish (plus a lot of Chinese spam - dunno why). After a few weeks and maybe 1000 messages I have no problems in any language (except that probably any message in Chinese will be marked as spam - but that does not bother me). David is right that you need a lot of patience. But it is worth it!

Ray Trent

Updated

•

22 years ago

Depends on: 200190

Aaron McBride

Comment 72

•

22 years ago

It would be great if you could set mozilla to block only images in junk mail.

Nahor

Comment 73

•

22 years ago

certainly not, Aaron. Junk doesn't always detect a spam. I don't want my email to be confirmed on someone's spam list just because the spam filter didn't detect it. But it would be nice to have an option somewhere to download the images for the selected message.

Aaron McBride

Comment 74

•

22 years ago

Well then, maybe it should be an option:
1 Show all images
2 Don't download/display images on junk mail
3 Don't download/display any images
If you select 2 or 3, then there should be a button on the toolbar to download images that were blocked for a given message. I use a lot of email with images (Netflix and REI for example), but obviously I don't want to download the images of suspected junk mail. -Aaron
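The three-way option proposed in this comment, plus the per-message toolbar button, can be sketched as a small decision function. All names here (`ImagePolicy`, `should_load_images`, `user_override`) are hypothetical and do not correspond to an actual Mozilla preference.

```python
from enum import Enum


class ImagePolicy(Enum):
    SHOW_ALL = 1        # option 1: show all images
    BLOCK_IN_JUNK = 2   # option 2: no images in junk mail
    BLOCK_ALL = 3       # option 3: no images at all


def should_load_images(policy: ImagePolicy, is_junk: bool,
                       user_override: bool = False) -> bool:
    """Decide whether remote images are fetched for one message.

    user_override models the toolbar button that loads blocked
    images for the currently selected message.
    """
    if user_override or policy is ImagePolicy.SHOW_ALL:
        return True
    if policy is ImagePolicy.BLOCK_IN_JUNK:
        return not is_junk
    return False  # BLOCK_ALL
```

Under option 2, a spam that the filter misses still loads its images, which is the objection raised in comment 73; option 3 with the override button avoids that.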

David Grant

Comment 75

•

22 years ago

I have a concern about Mozilla's spam filter and I was wondering if there was a bug associated with this. I enable the "purge junk mail after x days" feature. Imagine the following scenario:
1) I get a message from my friend
2) I accidentally mark it as Junk.
3) I accidentally move it to my Junk folder.
4) 10 days later, it gets "purged" from my Junk folder
5) On the same day, it is automatically deleted from my Trash folder when I close Mozilla
6) Now, some messages from my friend get mistakenly marked as spam, because some keywords in his previous email were marked as "bad" words.
But how can I reverse the process, and "unmark" my friend's original message if it no longer exists anywhere on my hard drive? Possible solution: delete the training.dat file, and start over by marking messages in the Junk folder as spam, and going from there. However, if you purge the Junk folder every 10 days, then there will only be around 200 spams (for me) and this will not create a large enough training.dat file for effective spam filtering. So what is being done about this? Is there a bug relating to the issue that I describe? Thanks.

David :Bienvenu

Comment 76

•

22 years ago

all of these are optional and off by default - empty trash on exit, purging of the junk folder, and moving messages marked as junk to the junk folder. And furthermore, you can use whitelisting to prevent messages from your friend from being automatically marked as junk, no matter what words he uses. If you turn on all of those things, you really need to look at your junk folder occasionally to make sure it doesn't have any messages you want - it could have non-junk messages that were mis-categorized, without any errors on your part.

Jacek Piskozub

Comment 77

•

22 years ago

To rephrase the most important piece of advice from bienvenu: Add your friend to your personal addressbook and switch on the "Do not mark as junk if sender is in my address book" setting. As simple as that.
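A minimal sketch of how an address-book whitelist can override the statistical score, as described in the two comments above. The function name, the 0.9 threshold, and the data shapes are assumptions for illustration, not Mozilla's actual implementation.

```python
def is_junk(sender: str, spam_score: float, address_book: set[str],
            threshold: float = 0.9, whitelist_enabled: bool = True) -> bool:
    """Return True if the message should be marked as junk.

    address_book holds lower-cased email addresses. When whitelisting
    is enabled, a known sender is never marked as junk, whatever the
    score says.
    """
    if whitelist_enabled and sender.strip().lower() in address_book:
        return False
    return spam_score >= threshold
```

For example, with `address_book = {"friend@example.com"}`, a message from `Friend@Example.com` scoring 0.99 is kept out of junk, while the same score from an unknown sender is marked.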

Tom Sommer

Updated

•

22 years ago

Depends on: 208197

Ray Trent

Updated

•

22 years ago

Depends on: 212671

Steve Chapel

Updated

•

21 years ago

Blocks: majorbugs

Aleksander Adamowski

Updated

•

21 years ago

Depends on: 243430

Myk Melez [:myk] [@mykmelez]

Updated

•

20 years ago

Product: MailNews → Core

Jerry Baker

Updated

•

20 years ago

No longer blocks: majorbugs

Eyal Rozenberg

Updated

•

19 years ago

Blocks: 66425

(not reading, please use seth@sspitzer.org instead)

Comment 78

•

18 years ago

sorry for the spam. making bugzilla reflect reality as I'm not working on these bugs. filter on FOOBARCHEESE to remove these in bulk.

Assignee: sspitzer → nobody

Serge Gautherie (:sgautherie)

Comment 79

•

17 years ago

Filter on "Nobody_NScomTLD_20080620"

QA Contact: laurel → backend

Nobody; OK to take it and work on it

Assignee

Updated

•

17 years ago

Product: Core → MailNews Core

Wayne Mery (:wsmwk)

Updated

•

11 years ago

Summary: spam blocking filters features → spam blocking filters features tracking [meta]

Wayne Mery (:wsmwk)

Updated

•

8 years ago

Depends on: 223716

Magnus Melin [:mkmelin]

Updated

•

5 years ago

Priority: P1 → --

Summary: spam blocking filters features tracking [meta] → [meta] spam blocking filters features tracking

Wayne Mery (:wsmwk)

Updated

•

3 years ago

Summary: [meta] spam blocking filters features tracking → [meta] junk/spam blocking filters features tracking

u597032

Updated

•

2 years ago

URL: http://www.mozilla.org/mailnews/specs...

Wayne Mery (:wsmwk)

Comment 80

•

2 years ago

You should check the archives and replace http://www.mozilla.org/mailnews/specs/filters/#Junk with
https://www-archive.mozilla.org/mailnews/specs/filters/

URL: https://www-archive.mozilla.org/mailn...

BMO Automation

Updated

•

2 years ago

Severity: normal → S3

Wayne Mery (:wsmwk)

Comment 81

•

9 months ago

I'm sure this was useful in the early days of the feature, but today we have the "Filters" component, which encompasses junk and spam processing.

Status: NEW → RESOLVED

Closed: 24 years ago → 9 months ago

Resolution: --- → WORKSFORME

Wayne Mery (:wsmwk)

Updated

•

9 months ago

Component: Backend → Filters

Perl script to check if sender is in RBL (a spammer); 23 years ago; Eric Vaandering (no email); 9.14 KB, text/plain
Checkpointing: msg filtering, spam-assassin: part 1: diffs; 23 years ago; Dan Mosedale (:dmosedale, :dmose); 17.72 KB, patch
Checkpointing: part 2: non-cvs diffs; 23 years ago; Dan Mosedale (:dmosedale, :dmose); 17.83 KB, patch
Hard to filter spam concept; 22 years ago; Ray Trent; 6.46 KB, text/html