User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.3a) Gecko/20021202
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.3a) Gecko/20021202

The Bayesian junk mail (spam) filters should have a user-changeable message size limit. The reasons for this:

- Most spam messages are relatively small. Otherwise the spammers would have trouble sending millions of them.
- Scanning large messages takes a lot of time and system resources, often for nothing, since large messages are typically not spam.
- With IMAP, every new message has to be downloaded just to evaluate its spam status. This is particularly bad on slow connections and/or with lots of large messages.

An option to skip messages larger than x kbytes would make spam classification quicker and would use less network bandwidth with IMAP.

Reproducible: Always

Steps to Reproduce:

Expected Results: Skip scanning messages over x kbytes and just classify them as "not junk". Make this an option that users can enable and tune themselves, but provide a reasonable default value (a 50k or 100k size limit is probably a good start).
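The requested behaviour can be sketched roughly as follows. This is only an illustration in Python, not Mozilla code: the names `should_scan`, `classify`, and `DEFAULT_LIMIT_KB` are hypothetical, and the 100 KB default is simply taken from the suggestion above.

```python
# Hypothetical sketch of a user-tunable size limit for junk scanning.
# None of these names come from the Mozilla source.

DEFAULT_LIMIT_KB = 100  # assumed default, from the "50/100k" suggestion


def should_scan(size_bytes, limit_kb=DEFAULT_LIMIT_KB, enabled=True):
    """Return True if a message of this size should go through the filter."""
    if not enabled:
        # Limit switched off by the user: scan everything, as today.
        return True
    return size_bytes <= limit_kb * 1024


def classify(size_bytes, scan, limit_kb=DEFAULT_LIMIT_KB, enabled=True):
    """Run scan() only for messages under the limit; otherwise 'not junk'."""
    if not should_scan(size_bytes, limit_kb, enabled):
        return "not junk"
    return scan()
```

With the limit enabled, `classify(2 * 1024 * 1024, scan)` would return "not junk" without ever calling `scan`, which is where the bandwidth saving on IMAP would come from.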
It would be a bad idea to skip messages over a certain size. Then messages with embedded images or (worse) viruses would not be classified. A preferable idea (for me) would be a user-configurable size that limits how much of a message is downloaded, e.g. "Download first xxx K of message." With a default value of 10k, Mozilla would have enough data to accurately classify the message without wasting time downloading the entire attachment.
I like the partial message download idea as well and it would work nicely with IMAP. I assume that the junk filter now does an IMAP FETCH BODY when it downloads the message for classification. The desired size limit would be implemented by changing this to FETCH BODY<0.n> where n is the desired size limit.
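As a rough illustration of the partial fetch idea: in RFC 3501 syntax the partial request is written with an explicit section, i.e. `BODY[]<0.n>`, and `BODY.PEEK[]` is the variant that does not set the \Seen flag. The helper below is a hypothetical Python sketch that only builds the fetch item string; it says nothing about how the junk filter actually fetches messages.

```python
# Hypothetical helper: build an IMAP partial-fetch data item for the
# first `limit_bytes` octets of a message (RFC 3501 partial syntax).

def partial_fetch_item(limit_bytes, peek=True):
    """Return e.g. 'BODY.PEEK[]<0.10240>' for limit_bytes=10240."""
    if limit_bytes <= 0:
        raise ValueError("limit_bytes must be positive")
    # BODY.PEEK avoids marking the message as \Seen while classifying it.
    name = "BODY.PEEK" if peek else "BODY"
    return f"{name}[]<0.{limit_bytes}>"
```

With Python's standard `imaplib`, such an item could be passed as the message parts argument, for example `conn.fetch(num, "(" + partial_fetch_item(10240) + ")")`, where `conn` and `num` are an already selected connection and a message number.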
Confirming as a valid RFE.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Assignee: naving → sspitzer
*** Bug 219413 has been marked as a duplicate of this bug. ***
Actually, in bug 219413 I had suggested a different mechanism. Since most spams are small in size, how about simply *not* running the spam filters on messages that are larger than a (user-configurable) size? This way, if there's a 1M message, it's certainly not spam, and there's no point in downloading any part of that message to run spam filters on.
> Actually, in bug 219413 I had suggested a different mechanism. Since most spams
> are small in size

This is not always true. Everyone gets spammed in his own way :-). About half of my spam is in the range of 50-200 KB, and overnight I receive about 10-15 messages of that size. I use a dial-up connection, so it takes the mailer a long time to check them all in the morning. Moreover, it is becoming more and more common among spammers to send text ads as a single image, precisely in order to fool Bayesian-like filters.

> This way, if there's a 1M
> message, it's certainly not spam

Why not? Though rarely, I actually have encountered Chinese spam of this size.
Regarding comments #6 and #7, and to some extent this bug itself: those approaches will encourage spammers to send larger and larger spams to defeat the filters. What I'd like to see is simply the ability of the spam filter to ignore everything except the text/plain and text/html sections of the email when scanning. This would be a great win for us users on IMAP, since it wouldn't download the entire message just to scan a sub-1-kbyte text/html section (well, plus headers). On POP I think you are probably stuck downloading the whole message, but at least it won't encourage behavior that exacerbates the problem. Maybe add the ability to add other attachment types (in case they send spam as word docs, pdfs, rtl, etc. in response).
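The part-selection idea might look something like this sketch, written with Python's standard `email` package. The names `scannable_parts` and `SCANNED_TYPES` are made up for illustration, and the sketch works on an already downloaded message; on IMAP the same selection would presumably be done by fetching individual body sections instead.

```python
# Hypothetical sketch: pick only the MIME parts a junk filter would scan.
from email import message_from_bytes

SCANNED_TYPES = {"text/plain", "text/html"}  # assumed default set


def scannable_parts(raw_bytes, extra_types=()):
    """Return (content_type, decoded_bytes) for each part worth scanning."""
    wanted = SCANNED_TYPES | set(extra_types)
    msg = message_from_bytes(raw_bytes)
    parts = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        ctype = part.get_content_type()
        if ctype not in wanted:
            continue  # images, executables, etc. are ignored
        payload = part.get_payload(decode=True)  # undo base64 / quoted-printable
        if payload is not None:
            parts.append((ctype, payload))
    return parts
```

Passing something like `extra_types=["application/pdf"]` would cover the "other attachment types" case, although such parts would still need their own text extraction before a Bayesian filter could use them.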
sorry for the spam. making bugzilla reflect reality as I'm not working on these bugs. filter on FOOBARCHEESE to remove these in bulk.
Assignee: sspitzer → nobody
Filter on "Nobody_NScomTLD_20080620"
QA Contact: laurel → filters
Product: Core → MailNews Core
rkent, I haven't tried a huge message to measure the pain. But in the current scheme, which has certainly changed since 2002, would there be a solid rationale for an upper bound, even if it's an outrageous one, like 20-30 MB?
Yes there is a rationale for having an upper limit. Probably would not be hard to do, either.