Closed
Bug 245168
Opened 21 years ago
Closed 16 years ago
Junk mail controls in 1.8 builds allow more junk mail than in 1.7
Categories
(MailNews Core :: Filters, defect)
Tracking
(Not tracked)
RESOLVED
INCOMPLETE
People
(Reporter: astrojny, Unassigned)
References
Details
(Whiteboard: needs testcase)
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8a1) Gecko/20040520
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8a1) Gecko/20040520
With Mozilla 1.7rc2, of the 100+ emails I received, the junk mail controls
would get rid of about 95%. Now with Mozilla 1.8a1 the junk mail controls only
remove about 60%.
Something in the junk mail filter appears to have changed to allow much more
junk mail to get through.
Reproducible: Always
Steps to Reproduce:
1. Junk mail control activated
2. Get mail
3. Much junk mail that was formerly filtered by junk mail controls now gets
through
Actual Results:
Much more junk mail gets through, i.e., is not labeled as junk, than with Mozilla 1.7rc2.
Expected Results:
Junk mail controls should be at least as effective as in Mozilla 1.7rc2.
Comment 1 • 21 years ago
Reporter, the Bayes algorithm has been improved (see bug 181534). This might
require that you retrain Mozilla a bit (mark more mails as junk, or as non-junk).
Note that the Junk Mail Algorithm is not automatic; it requires help from the user!
Comment 2 • 21 years ago
*** This bug has been marked as a duplicate of 243680 ***
Status: UNCONFIRMED → RESOLVED
Closed: 21 years ago
Resolution: --- → DUPLICATE
Comment 3 • 21 years ago (Reporter)
I have done considerable retraining using junk mail filter, but it appears to
have had little effect.
Status: RESOLVED → UNCONFIRMED
Resolution: DUPLICATE → ---
Comment 4 • 21 years ago (Reporter)
Note I have spent considerable time retraining and reidentifying junk mail.
To date, it appears to have had little effect.
Comment 5 • 21 years ago (Reporter)
(In reply to comment #1)
> Reporter, the Bayes algorithm has been improved (see bug 181534). This might
> require that you retrain Mozilla a bit (mark more mails as junk, or a
> non-junk).
> Note that the Junk Mail Algorithm is not automatic, it requires help from
> the user !
Note I have done considerable retraining and reidentifying of junk mail to
little noticeable effect.
Comment 6 • 21 years ago
The only way to prove this symptom is to set up two side-by-side installs of 1.7
and 1.8, give each the same training.dat file, set each up to access the same
account and not delete mails from server, and allow both installs to download
the same set of mail.
See bug 181534, bug 224318, bug 230093, bug 231873 (all of which *did* change
the Bayes algorithm, but should be in 1.7RCx as well as 1.8)
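A minimal sketch of how the results of such a side-by-side run could be tallied, assuming you have written down by hand which message IDs each install flagged as junk. The function name, the message IDs, and the sample data are hypothetical; nothing here reads Mozilla's own files.

```python
def compare_installs(known_junk, flagged_a, flagged_b):
    """Compare the junk flagged by two installs against a hand-labeled junk set.

    All three arguments are collections of message identifiers (hypothetical
    strings here). Install "a" could be 1.7 and install "b" could be 1.8.
    """
    known_junk = set(known_junk)
    flagged_a = set(flagged_a)
    flagged_b = set(flagged_b)
    return {
        "caught_by_both": known_junk & flagged_a & flagged_b,
        "caught_only_by_a": (known_junk & flagged_a) - flagged_b,
        "caught_only_by_b": (known_junk & flagged_b) - flagged_a,
        "missed_by_both": known_junk - flagged_a - flagged_b,
    }


# Hypothetical sample: four junk messages downloaded by both installs.
known_junk = {"m1", "m2", "m3", "m4"}
flagged_17 = {"m1", "m2", "m3"}   # what the 1.7 install marked as junk
flagged_18 = {"m1"}               # what the 1.8 install marked as junk

result = compare_installs(known_junk, flagged_17, flagged_18)
for name, ids in result.items():
    print(name, sorted(ids))
```

A non-empty "caught_only_by_a" set on identical mail and identical training data is the kind of evidence a testcase for this bug would need.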
Comment 7 • 21 years ago
Oh -- and see bug 245176, which has a new patch; check whether today's nightly
build works better.
Comment 8 • 21 years ago (Reporter)
Installed Mozilla 1.8a2, the June 1, 2004 nightly build, and now less than 50%
of my junk mail is filtered out. :=(
Comment 9 • 21 years ago (Reporter)
Installed Mozilla 1.8a2 June 3rd nightly build and junk mail filtering still
is not as good as in the 1.6 release. Don't know what the problem is, but one of
the great things about Mozilla was its great email junk mail filtering. Can't say
that anymore. At least so far for the 1.8a2 version.
Comment 10 • 21 years ago
FWIW, after upgrading my installed nightly build from 0522 to 0602, I'm noticing
more false positives -- in particular, Bugzilla mails are getting flagged as
junk where that never happened before. The two times I've seen this, there were
ten or more bugmails in the queue and a few of them (two or four) were junked.
Comment 11 • 21 years ago
See bug 245439.
Comment 12 • 21 years ago (Reporter)
I have downloaded the June 7th build of Mozilla 1.8a2. I uninstalled the
previous install, including the installation folder, before installing the June
7th build. I had not downloaded email all weekend and had over 450 mails, 99% of
which were junk. The June 7th build managed to determine 50 were junk. This
is after much training using the prior builds.
What has happened since version 1.6, where I had a similar experience and all but
29 emails were determined to be junk?
Comment 13 • 21 years ago
Build: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8a1) Gecko/20040520
I have been looking for a bug with symptoms similar to mine on which to comment.
I have "upgraded" my version of Mozilla from 1.7 RC2 to 1.8 Alpha 1 (2004052009)
and have experienced that the Junk Mail classification barely works at all now.
Here is what I did:
1. Backed up Mozilla (thank ye gods)
2. Uninstalled Mozilla 1.7 RC2
3. Installed Mozilla 1.8 Alpha 1 (2004052009)
Junk Mail Controls Checklist:
1. Junk mail controls ENABLED
2. Do not mark messages as junk mail if the sender is in my address book:
(Personal Address Book) ENABLED
3. Move incoming messages determined to be junk mail to: "Junk" folder on
(address) SELECTED
4. Automatically delete junk messages older than 1 days from this folder ENABLED
5. All other options disabled
Mozilla 1.8A1 picked up my old profile and everything looked ok. However the
Junk mail classification stopped classifying new Junk mail as junk. Before I
upgraded Junk mail was being classified with a 95% accuracy with only 3 or 4
junk mails getting through in a day. Now the accuracy is about 1-5% with
virtually all my junk mail getting through. There are however NO false positives
(no non-junk mail is being classified as junk). It should be noted that I
classify my junk mail using the "Unread" view for performance reasons.
When I try to manually train against the mail that passed the junk filter, the
filter runs very noticeably faster than it did before in 1.7 RC2 and no junk mail
is classified. When I classified and re-classified the same junk mail 20 times
in a row, I managed to get 2 of the junk mails classified as junk and moved to
the junk folder. There were 16 junk mails at the time I did that spot of training.
From my perspective it looks like the junk mail filtering isn't running at all;
however, with the very rare event of a junk mail being classified it does appear
to be running. It's that super fast now!
My training.dat file certainly exists and has grown to 2.84 MB (2,984,960 bytes)
in size and I have been training with it since junk filtering was first provided
in Mozilla way back when.
Comment 14 • 21 years ago (Reporter)
June 8th build doesn't improve junk mail filter performance. :=(
Comment 15 • 21 years ago (Reporter)
I have given up on Mozilla 1.8a2. I downloaded Mozilla 1.7rc3 and junk mail
filtering seems to be back to its old self and works well. Don't know what happened to
1.8a2 but 1.7 is fine, or so it seems so far.
Comment 16 • 21 years ago
(In reply to comment #15)
Yes I saw the news on a new 1.7 RC, and am downloading it right now to install.
Sorry 1.8a1, I tried to love you but just couldn't, I love my Junk mail
filtering more. *sniff*
Comment 17 • 21 years ago
I thought that I should probably say something useful instead. I have installed
1.7rc3 and with training on 12 SPAM messages, another 22 remaining SPAM messages
got classified as junk and moved to the junk folder.
Old school junk mail filtering is back and working nicely. If I had a choice I
would say, stick with the 1.7 stream of junk filtering. Personally I don't want
to throw away all my training I have accumulated, but it's the masses that need
to be satisfied I suppose.
Comment 18 • 21 years ago (Reporter)
(In reply to comment #17)
> I thought that I should probably say something useful instead. I have installed
> 1.7rc3 and with training on 12 SPAM messages, another 22 remaining SPAM messages
> got classified as junk and moved to the junk folder.
>
> Old school junk mail filtering is back and working nicely. If I had a choice I
> would say, stick with the 1.7 stream of junk filtering. Personally I don't want
> to throw away all my training I have accumulated, but its the masses that need
> to be satisfied I suppose.
For me the problem with the 1.8a2 junk filter was not the training; I spent over a
week training it. The problem was the training didn't seem to do any good.
Comment 19 • 21 years ago
Confirmed here with Mozilla 1.8a1.
The junk mail filters don't appear to be working at all.
Tools > Run Junk Mail Filter takes no action.
I get on avg 300 mails a day; about 170 of them are junk. Earlier Moz used to get
90% of it. Now I'm looking at LOTS of junk. After a week of training nothing
has changed.
Comment 20 • 21 years ago (Reporter)
Tried Mozilla 1.8a2 June 21st build. Junk mail controls still do not seem
effective.
Comment 21 • 21 years ago (Reporter)
Just tried Mozilla build for July 10th. Still does not filter junk mail as well
as Mozilla 1.7.1.
I access 2 accounts through Mozilla. After a time the junk mail move
function ceases to function on the 2nd account, i.e., not the default account.
The mail gets marked as junk but does not automatically get moved to the junk
mail account, even though the junk mail control settings are set to
automatically move junk mail to the junk mail folder.
Deleting the account and recreating it allows the junk mail move function to work
again, but after a few days the same problem recurs.
Comment 22 • 21 years ago (Reporter)
Tried Mozilla 1.8 alpha 2. The junk mail filter STILL does not appear to be
working very well. I tried Mozilla 1.8 alpha 2 after using Mozilla 1.7. Version
1.7 does a very nice job of filtering, generally filtering around 90% of my
mail, i.e., if I get 190 emails all but 10 are labeled junk and moved to the
junk mail file (a check of that file indicates mail labeled junk was indeed
junk). With Mozilla 1.8 alpha 2 the results were almost the opposite. Of 150
emails received all but 90 were labeled junk. Obviously quite a difference.
Most of the 90 upon inspection should have been labeled junk.
After about 4 days I uninstalled Mozilla 1.8 alpha 2 and went to Mozilla 1.7.1,
because Mozilla 1.8 alpha 2 did not appear to be learning and I spent
considerable time trying to teach it.
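For reference, the counts quoted above can be turned into percentages with a short calculation. This is only a worked-arithmetic sketch: the helper name is hypothetical, and the figure is the share of all received mail that was labeled junk, not a true catch rate, since the comment does not say how much of each batch was actually junk.

```python
def labeled_share(total_received, labeled_junk):
    """Percentage of received mail that the filter labeled as junk."""
    return 100.0 * labeled_junk / total_received


# Counts taken from the comment: "all but 10" of 190 and "all but 90" of 150.
share_17 = labeled_share(190, 190 - 10)   # 180 of 190 labeled junk under 1.7
share_18 = labeled_share(150, 150 - 90)   # 60 of 150 labeled junk under 1.8 alpha 2

print(f"Mozilla 1.7:         {share_17:.1f}% labeled junk")
print(f"Mozilla 1.8 alpha 2: {share_18:.1f}% labeled junk")
```

On these numbers the 1.7 figure is closer to 95% than to 90%, and the 1.8 alpha 2 figure is 40%.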
Comment 23 • 21 years ago
Andy Strojny: Until such point as someone posts a patch, it is pointless for you
to continue reporting that things are still not working for you.
Bug 230093 / bug 181534 comment 72 would appear to be the major junk-related
change since 1.7.
I again point to bug 245439, which may or may not be a dupe of this one -- or,
it might be a dupe of the problems about failing to update training.dat
(bug 243680, bug 245499).
A couple more dupes to this bug forthcoming shortly.
Summary: Junk mail controls in 1,8a1 allow more junk mail than in 1.7rc2 → Junk mail controls in 1.8 builds allow more junk mail than in 1.7
Comment 24 • 21 years ago
*** Bug 256366 has been marked as a duplicate of this bug. ***
Comment 25 • 21 years ago
*** Bug 256219 has been marked as a duplicate of this bug. ***
Comment 26 • 21 years ago (Reporter)
Sorry for the confusion; I thought I needed to report a new bug for the junk mail
problem in the new version of Mozilla, 1.8alpha3. Same problem that was in the
other 1.8alpha builds I've tried. Here is my experience -
When I used Mozilla 1.7.2, of 200+ emails downloaded over 176 were identified as
junk correctly.
With Mozilla 1.8alpha3, of 221 emails downloaded only some 80 were identified as junk.
Over 125 that were in fact junk were not identified as junk.
This was basically the same mail dump, as I examined the email using both
browsers, first 1.8alpha2 and then uninstalling it and installing 1.7.2.
Mozilla 1.7.2 identified much more junk mail. Clearly Mozilla 1.7.2's junk mail filter
is superior.
I'm back to using Mozilla 1.7.2. I'm afraid to try Thunderbird as I understand
it uses the same junk mail filter as the Mozilla 1.8 alphas.
Comment 27 • 21 years ago (Reporter)
(In reply to comment #23)
> Andy Strojny: Until such point as someone posts a patch, it is pointless for you
> to continue reporting that things are still not working for you.
>
> Bug 230093 / bug 181534 comment 72 would appear to be the major junk-related
> change since 1.7.
>
> I again point to bug 245439, which may or may not be a dupe of this one -- or,
> it might be a dupe of the problems about failing to update training.dat
> (bug 243680, bug 245499).
>
> A couple more dupes to this bug forthcoming shortly.
Again, sorry for the confusion. I thought you needed a new bug if the same problem
appeared in a new Mozilla build. How does one determine if a patch has been posted
for a reported bug?
Comment 28 • 21 years ago (Reporter)
The release notes for Mozilla 1.8 alpha 4 do not indicate anything was done
concerning the junk mail filter. Has anything happened?
Comment 29 • 21 years ago
I see this also on 1.8a4 Solaris Sparc build. As far as I remember, it regressed
in 1.8a1, was fixed in 1.8a3 (don't remember the bug number), and regressed
again in 1.8a4. Junk detection is more or less random now.
Updated • 21 years ago
Product: MailNews → Core
Updated • 21 years ago
Flags: blocking1.8a6?
Updated • 21 years ago
Flags: blocking1.8a6? → blocking1.8a6-
Comment 30 • 19 years ago
More slipping through than ever now.
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4 ID:2006050805
Comment 31 • 19 years ago
It is starting to catch a few junk emails again, but not like it used to (on branch: change Version to All or Other).
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1a3) Gecko/20060629 Thunderbird/2.0a1 ID:2006062906
Comment 32 • 19 years ago
*** Bug 283021 has been marked as a duplicate of this bug. ***
Comment 33 • 19 years ago
I have seen poor junk filtering in 1.8x builds too, as reported in bug #283021. A recent Seamonkey 1.5 nightly filtered about 1/3 of the junk, while Mozilla 1.7x typically filters 2/3 to 3/4 of my junk mail using the same profile and training.dat, on Win98 and now on Linux. I found the problem in the oldest 1.8a builds I could find, dating back to about when this report was written.
I suggested possible reasons in my many reports, but if the reason isn't easy to see, replacing the 1.8 filtering code with 1.7x code would be the simplest fix.
Comment 34 • 19 years ago
2.0 basically does not filter junk any longer. This should block a release of 2.0 completely.
Comment 35 • 19 years ago
As commented in bug #283021, the simplest fix might be to cut and paste---cut out the "new and degraded" junk-filtering code in 1.8/1.9 and paste in the code that works from Mozilla 1.7.13.
Comment 36 • 19 years ago (Reporter)
I pretty much solved the problem by letting my service provider, COX, identify spam, and I created a filter to place it in a sub-mailbox. I then identify the mail in it as junk (over 99% of it is junk), and slowly Thunderbird learns. So the COX junk filter has solved my problem.
Comment 37 • 19 years ago
Andy, leaving it to Cox is not a fix for the problem with Mozilla software spam-filtering. As long as there's code taking up space in Seamonkey and Thunderbird that purports to filter spam, it ought to do it well. If not, it's hardly better than no filter at all.
I'd rather receive and filter my own spam than let my ISP or another firm do it, because it's a known fact that filtering upstream from your inbox may not be honest. AOL has been caught filtering messages critical of its company policies, for example, though I'd have to search for the details.
Comment 38 • 19 years ago (Reporter)
I don't disagree. But COX gives me the option of having mail it marks as spam still be delivered. I set up a filter to dump email labeled by COX into a separate COX SPAM mailbox. It is dumped there and I go through it periodically and either label it as spam through Thunderbird or move it into my inbox.
Agree this is not an ideal situation, but the Thunderbird spam filter just was not doing it for me. Hopefully it will be fixed in version 2.
Comment 39 • 19 years ago
what's needed, *from one of you who is seeing these problems*, is a testcase from 1.5, 2.0 or trunk, and the relevant files (training.dat, sample emails, etc). also see comment 6
Worcester12345 in comment #34
> 2.0 basically does not filter junk any longer. This should block a release of
> 2.0 completely.
If junk is totally not working then it deserves a new bug or escalating one of the several existing bugs where junk does not work. This is about junk not working well (but it is working). FWIW I see no such issue running trunk and 2.0 - but I do have spamassassin as a first line of defense.
OS: Windows XP → All
Hardware: PC → All
Whiteboard: needs testcase
Version: Trunk → 1.8 Branch
Comment 40 • 19 years ago
I just had my first junk email message in the past week or two blocked. I continue marking, but it just doesn't work. Nightly 2.0 builds.
Updated • 19 years ago
Flags: blocking-thunderbird2?
Updated • 19 years ago
Flags: blocking-thunderbird2? → blocking-thunderbird2-
Comment 41 • 19 years ago
My ISP changed my domain name and I'm getting almost no junk to test on. In another report on this problem, I suggested that maybe the tail end of the training.dat file isn't getting read to process junk. The end of the file is where the "blacklist" is located and probably where new junk info is appended. If only the front part of the file is being used, that means only the whitelist and the old blacklist data are being used. Old blacklist data may not be a good match for current junk mail.
If that's not the problem, then there's a defect in the revised algorithms for processing junk mail. The differences between the new and old code need to be examined for flaws.
Comment 42 • 19 years ago
(In reply to comment #39)
> what's needed, *from one of you who is seeing these problems*, is a testcase
> from 1.5, 2.0 or trunk, and the relevant files (training.dat, sample emails,
> etc). also see comment 6
>
>
> Worcester12345 in comment #34
> > 2.0 basically does not filter junk any longer. This should block a release of
> > 2.0 completely.
>
> If junk is totally not working then it deserves a new bug or escalating one of
> the several existing bugs where junk does not work. This is about junk not
> working well (but it is working). FWIW I see no such issue running trunk and
> 2.0 - but I do have spamassassin as a first line of defense.
>
I also use SpamAssassin, and have the settings set up to honor the warning from SpamAssassin, yet it does not. This is with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.3pre) Gecko/20070308 Thunderbird/2.0pre ID:2007030803
Thunderbird maybe catches maybe 1-5% of incoming junk mail.
Comment 43 • 18 years ago
sorry for the spam. making bugzilla reflect reality as I'm not working on these bugs. filter on FOOBARCHEESE to remove these in bulk.
Assignee: sspitzer → nobody
Comment 44 • 18 years ago
(In reply to comment #40)
> I just had my first junk email message in the past week or two blocked. I
> continue marking, but it just doesn't work. Nightly 2.0 builds.
I've seen comments where people have been helped by resetting the training data and starting from scratch in marking some messages as junk and some as not junk
(In reply to comment #42)
> I also use SpamAssassin, and have the settings set up to honor the warning from
> SpamAssassin, yet it does not. This is with Mozilla/5.0 (Windows; U; Windows NT
> 5.1; en-US; rv:1.8.1.3pre) Gecko/20070308 Thunderbird/2.0pre ID:2007030803
"Trust" isn't working in 2.0, at least for some people - bug 381589
Comment 45 • 18 years ago
(In reply to comment #44)
> (In reply to comment #40)
> > I just had my first junk email message in the past week or two blocked. I
> > continue marking, but it just doesn't work. Nightly 2.0 builds.
>
> I've seen comments where people have been helped by resetting the training data
> and starting from scratch in marking some messages as junk and some as not junk
What's the use, then? I thought the effects were cumulative.
(In reply to comment #44)
> (In reply to comment #42)
> > I also use SpamAssassin, and have the settings set up to honor the warning from
> > SpamAssassin, yet it does not. This is with Mozilla/5.0 (Windows; U; Windows NT
> > 5.1; en-US; rv:1.8.1.3pre) Gecko/20070308 Thunderbird/2.0pre ID:2007030803
>
> "Trust" isn't working in 2.0, at least for some people - bug 381589
Hmmm. Sounds more and more like the junk mail controls are pretty much entirely broken.
Comment 46 • 18 years ago
The file training.dat holds both a whitelist and a blacklist, I understand. Presumably blacklist data is appended. If the process that reads the blacklist has an arbitrary limit on the size of the blacklist, appended data beyond that size would never be read. This would make the training of the junk controls appear not to work when the training.dat file or the blacklist it contains gets larger than whatever size limit may exist.
I would suggest that records at the front of the blacklist be dropped off after the blacklist reaches a "mature" size, which would eliminate old and often useless spam data, and make sure that the filtering process reads the entire blacklist as well as the whitelist.
If the problem is in the statistical methods instead of simple file processing, I would have no suggestions.
Comment 47 • 18 years ago (Reporter)
That certainly sounds like something to explore. But why not just expand the size of the blacklist data file or make it user configurable?
Hope you had a Merry Christmas and best wishes for a Happy New Year
Comment 48 • 18 years ago
For the last few weeks I have been studying the spam processing in some detail. I have an extension that I can use to read in an extensive SPAM corpus (TREC 2005) and run it through the TB spam filters. In that environment, which is "controlled" in the sense that I have precise control of what I train and analyze, the spam filter works quite well (meaning .05% false positives with 5% false negatives). These tests have used recent trunk builds, or TB 2.0. Also, the more I train the better it works, though the improvement is very slow. I am not aware of any limit to the size of the training database. My largest trials trained on 73,000 emails, generating over 300,000 tokens for both junk and good emails.
I think that the reason retraining is occasionally necessary is that mistakes creep into the database, that is, good emails that were trained as junk, and vice versa. I have not tested the sensitivity to that yet. That doesn't mean that the algorithms are ideal. In particular, there is no pruning of old tokens, so obviously obsolete material (like old dates and message IDs) ends up cluttering the database. But pruning will result in a big increase in speed, with a modest decrease in effectiveness. (It's amazing how those rare tokens are often the ones that tip the balance.) It probably won't help the effectiveness (unless there has been mistraining or corruption in training.dat), at least in cases where emails are selected at random. In the real case, of course, both spam and ham drift over time, and perhaps retraining would help (or some sort of token pruning, which I am working on).
There are a number of improvements that the spam filter needs - but still the basic algorithm is sound. However, the user interface does not provide much information, so if spam emails slip through (like the rash of big penis emails I've received in the last few days) then you are really very helpless. Or if you get little feedback, it's hard to tell if it is working at all, and if not then why not.
So, I don't believe there are any bugs or problems with the spam filtering, just a constant need for improvement to keep up with the increased skills of the spammers. I suppose I could run the tests requested in this bug, checking for regressions against 1.7 (what Thunderbird version is that?). But I've tested TB 2.0 and 3.0, and it works pretty well in a controlled environment. So I'm not sure I see the value of testing against an older version, which would take me hours or days to accomplish. If there is still someone convinced this is worth doing, then please make your case to me. I have the tools to do spam testing.
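A minimal sketch of how false-positive and false-negative figures like the ones quoted in this comment could be computed from a labeled test run. It assumes the false-positive rate means good mail flagged as junk divided by all good mail, and the false-negative rate means junk not flagged divided by all junk; the comment does not state its denominators, and the function name and sample data below are hypothetical.

```python
def error_rates(results):
    """Return (false_positive_rate, false_negative_rate) as fractions.

    results is an iterable of (is_junk, flagged_junk) boolean pairs, one per
    message: the hand-assigned label and what the filter decided.
    """
    good = junk = false_pos = false_neg = 0
    for is_junk, flagged_junk in results:
        if is_junk:
            junk += 1
            if not flagged_junk:
                false_neg += 1      # junk that slipped through
        else:
            good += 1
            if flagged_junk:
                false_pos += 1      # good mail wrongly junked
    fp_rate = false_pos / good if good else 0.0
    fn_rate = false_neg / junk if junk else 0.0
    return fp_rate, fn_rate


# Hypothetical run: 2000 good mails (1 wrongly junked), 1000 junk mails (50 missed).
results = (
    [(False, True)] * 1
    + [(False, False)] * 1999
    + [(True, False)] * 50
    + [(True, True)] * 950
)
fp_rate, fn_rate = error_rates(results)
print(f"false positives: {fp_rate:.2%}, false negatives: {fn_rate:.2%}")
```

Reporting both rates for the same mail set and the same training.dat under each build would make regression claims in this bug directly comparable.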
Updated • 17 years ago
QA Contact: filters
Updated • 17 years ago (Assignee)
Product: Core → MailNews Core
Updated • 16 years ago
Flags: wanted1.9.2?
Comment 50 • 16 years ago
In >5 years there is no testcase here, so I think this bug should be closed and issues that individuals have be pursued in individual bugs, e.g. comment 42. If Andy's issue still exists for him and a testcase can be developed then perhaps leave the bug open, but in its present state this bug is going nowhere.
Comment 51 • 16 years ago
WFM seems inappropriate, so => incomplete.
if you are seeing a problem, please create a specific bug for a specific issue based on using version 3, and after checking bugzilla
Status: NEW → RESOLVED
Closed: 21 years ago → 16 years ago
Resolution: --- → INCOMPLETE