Open
Bug 294077
Opened 19 years ago
Updated 2 years ago
use spelling checker as input to bayesian spam filter
Categories
(MailNews Core :: Filters, enhancement)
Tracking
(Not tracked)
NEW
People
(Reporter: danm.moz, Unassigned)
Suggestion: I think you should hook up the spelling checker to the Bayesian filter. Some (configurable) small proportion of unrecognized words should weigh heavily in the Junk category. I believe that such a rule alone would have caught 3/4 of the last 20 messages currently in my Junk folder. It's even adaptable in its own way. If I find improperly junked messages from colleagues containing red underlined words, I should add those words to my dictionary, if my colleagues can spell.

Supporting material: I have the impression that Thunderbird's junk filter doesn't work terribly well any more, because spammers have caught on. One spam source sends me a lot of junk consisting of two and three letter word fragments that get built into human words by the HTML display engine. The fragments are short and somewhat random, limiting the usefulness of that part of the content to the current filter. A lot of junk uses 733t and randomly inserted digits to conceal the junk words. I get a lot of junk in languages and character sets I can't even read. In the past six months, at least half of the junk I've been sent contains a large block of random words, or stuff that looks to have been plagiarized from pulp lit.

That last point is a spammer's tool designed to confuse filters, and to trick people into reducing the future effectiveness of their filter by clogging it with random input. The spelling checker won't help with that scheme, but the other schemes in the above paragraph are all susceptible. I include the last example only to further illustrate that dishonest businessmen are intentionally fighting spam filters, and they have techniques reasonably effective against the one in Thunderbird.

I think the judgment of a spelling checker has a lot of potential to help, at least for people with friends and business associates who can spell. Its use of course should be a configurable option. Maybe all by itself it'll encourage people to learn how to spell.
In order for spammers to catch on to this defense, they must use actual recognizable words, not fragment subterfuge. That will force an increase in conformity, aiding the normal operation of the filter. And once the spammers have caught on, at least the junk I get will be a little less offensive to the eye.

A nice grammar checker would make the mail I read even less offensive, and also should catch the messages spammers craft with blocks of random words specifically to trip up naive Bayesian filters. What awesome excuses I could have for not going to meetings! "I'm sorry boss. I didn't get your memo because according to Thunderbird you write at a third grade level."

[ASCII mock-up: an "Adaptive Filter Controls" panel with a slider labeled "I wish to read mail from", its scale marked, left to right, "Philosophers / Poets", "Half-witted Illiterate Court Fops", and "Spammers / University Boys".]

Maybe just the spelling checker for now.
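A rough sketch of how the suggestion in the description might work, not Thunderbird's actual filter code. It assumes the spelling checker can be treated as a plain set of known words, and that the Bayesian filter accepts extra synthetic tokens alongside the message's own words. The names `unrecognized_ratio`, `spelling_token`, the token string, and the 0.25 threshold are all made up for illustration.

```python
import re

# Word-like runs of letters and digits, so "ch3ap" stays one token.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+")


def unrecognized_ratio(text, dictionary):
    """Fraction of word-like tokens that are not in the dictionary.

    `dictionary` is a hypothetical stand-in for the spelling checker:
    a set of lowercase words the user considers correctly spelled.
    Pure numbers are skipped so dates and prices do not count as typos.
    """
    tokens = [t.lower() for t in TOKEN_RE.findall(text) if not t.isdigit()]
    if not tokens:
        return 0.0
    unknown = sum(1 for t in tokens if t not in dictionary)
    return unknown / len(tokens)


def spelling_token(text, dictionary, threshold=0.25):
    """Return a synthetic token to feed the Bayesian filter, or None.

    The threshold is the "configurable small proportion" from the
    description; 0.25 is an arbitrary illustrative default.
    """
    if unrecognized_ratio(text, dictionary) >= threshold:
        return "*many-unrecognized-words*"
    return None


if __name__ == "__main__":
    words = {"the", "meeting", "is", "at", "noon"}
    print(spelling_token("The meeting is at noon", words))
    print(spelling_token("ch3ap m3ds at noon", words))
```

Feeding the result in as one more token, rather than hard-coding a junk score, would let the existing training decide how heavily it weighs. Adding a colleague's unusual words to the dictionary then lowers the ratio for their future mail, which is the adaptability the description mentions.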
Related to bug 222014?
Updated • 16 years ago
QA Contact: filters
Updated • 16 years ago
Product: Core → MailNews Core
Updated • 2 years ago
Severity: normal → S3