Closed Bug 222014 Opened 21 years ago Closed 19 years ago

Update Junk mail filtering to use the built in Spell Checker

Categories

(MailNews Core :: Filters, enhancement)

Type: enhancement
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 294077

People

(Reporter: nogwater, Assigned: sspitzer)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.5) Gecko/20030925
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.5) Gecko/20030925

I've been getting a lot of junk mail (spam) that looks like this:
<blockquote>
SEX ATTRACTING PHEROMONES!! AVAILABLE HERE!!
FOR MEN OR WOMEN!!
LOOK HERE FOR MORE INFO!!
gkxmyot c ctg neiyp cihvuiojplsiq fv sk utf ibrrcxohdoummjjthgy k
REMOVE FROM MAILLIST v ylhby vtxc hdbcyezdusdgyv og otfq sgtcbs tzbom lqjakp jg
pf ttypz 
</blockquote>
It looks to me like the spammers are trying to get around the Bayesian filters
by adding lots of junk words.  Paul Graham's "A Plan For Spam" calls for setting
the probability of these "new words" to 0.4.  I'm not sure what mozilla is
setting it to, but if we set words that aren't in the spelling dictionary to 0.6
(or maybe higher) I think we'd easily be able to recognize the above as spam.
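
A rough sketch of the idea, for illustration only: this is not Mozilla's filter code. It combines per-token probabilities the way "A Plan for Spam" describes, but gives never-before-seen tokens a different default depending on whether they are in a spelling dictionary. The function name, its parameters, and the plain `set` standing in for the spell checker are all assumptions, and unlike Graham's version it combines every token instead of only the most interesting ones.

```python
from math import prod


def combined_spam_probability(tokens, known_probs, dictionary,
                              unknown_in_dict=0.4, unknown_not_in_dict=0.6):
    """Combine per-token spam probabilities into one score for a message.

    known_probs: token -> spam probability learned from training,
                 assumed to lie strictly between 0 and 1.
    dictionary:  set of correctly spelled words (a stand-in for the
                 built-in spell checker; hypothetical interface).
    """
    probs = []
    for tok in tokens:
        word = tok.lower()
        if word in known_probs:
            probs.append(known_probs[word])      # trained token
        elif word in dictionary:
            probs.append(unknown_in_dict)        # new but real word
        else:
            probs.append(unknown_not_in_dict)    # new and misspelled/random
    spam = prod(probs)
    ham = prod(1.0 - p for p in probs)
    # Naive Bayes combination; an empty message comes out neutral (0.5).
    return spam / (spam + ham)
```

With an empty training set and an empty dictionary, the tokens "gkxmyot c ctg neiyp" score well above 0.5 when the not-in-dictionary default is 0.6, and well below 0.5 when it is 0.4.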

We should also make sure that these random letter words aren't being stored and
filling up our word list with junk.  Unfortunately, I'm not sure how you
discriminate "cihvuiojplsiq", which you'll never see again, from "v1agra", which
will probably show up in one of every 100 emails.
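
One possible heuristic, sketched under assumptions (nothing here reflects how Mozilla actually stores its tokens): track how often and how recently each token was seen, and periodically drop tokens that were seen only rarely and have not come back. A recurring token like "v1agra" builds up a count and survives; a one-off random string ages out. The `(count, last_seen_day)` layout and the thresholds are hypothetical.

```python
def prune_tokens(token_stats, today, min_count=2, max_age_days=30):
    """Return a copy of token_stats without stale, rarely seen tokens.

    token_stats: token -> (count, last_seen_day), where days are plain
                 integers (hypothetical storage layout).
    A token is kept if it has been seen at least min_count times, or if
    it was last seen within max_age_days of today.
    """
    return {
        tok: (count, last_seen)
        for tok, (count, last_seen) in token_stats.items()
        if count >= min_count or today - last_seen <= max_age_days
    }
```

The grace period matters: a token that has been seen only once might still turn out to be the next "v1agra", so it is kept until it has had a fair chance to recur.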

Reproducible: Always

Steps to Reproduce:
> I'm not sure what mozilla is setting it to, but if we set words that aren't in
> the spelling dictionary to 0.6 (or maybe higher) I think we'd easily be able to
> recognize the above as spam.

A value of 0.6 (Mozilla uses 0.4, btw) won't work for multilingual users: I read
and write mail in 3 languages, but only one of them has a dictionary for the
spell-checker.
I guess multi-lingual email would be a problem. :-/
Maybe it could be made optional.  I'd guess that by far most people only get
email in one language.

Maybe it would be better to just change all unrecognized words to 0.5 instead of
0.4.

My concern is that if we don't do something, ALL spammers will be stuffing the
email with junk words and the mozilla junk mail filter will be worthless.
*** Bug 258968 has been marked as a duplicate of this bug. ***
Product: MailNews → Core
This is an automated message, with ID "auto-resolve01".

This bug has had no comments for a long time. Statistically, we have found that
bug reports that have not been confirmed by a second user after three months are
highly unlikely to be the source of a fix to the code.

While your input is very important to us, our resources are limited and so we
are asking for your help in focussing our efforts. If you can still reproduce
this problem in the latest version of the product (see below for how to obtain a
copy) or, for feature requests, if it's not present in the latest version and
you still believe we should implement it, please visit the URL of this bug
(given at the top of this mail) and add a comment to that effect, giving more
reproduction information if you have it.

If it is not a problem any longer, you need take no action. If this bug is not
changed in any way in the next two weeks, it will be automatically resolved.
Thank you for your help in this matter.

The latest beta releases can be obtained from:
Firefox:     http://www.mozilla.org/projects/firefox/
Thunderbird: http://www.mozilla.org/products/thunderbird/releases/1.5beta1.html
Seamonkey:   http://www.mozilla.org/projects/seamonkey/
This bug has been automatically resolved after a period of inactivity (see above
comment). If anyone thinks this is incorrect, they should feel free to reopen it.
Status: UNCONFIRMED → RESOLVED
Closed: 19 years ago
Resolution: --- → EXPIRED
Product: Core → MailNews Core
Resolution: EXPIRED → DUPLICATE