Closed Bug 188232 Opened 22 years ago Closed 22 years ago

training.dat file is being updated yet mail is never classified as junk

Categories

(MailNews Core :: Filters, defect)

x86
Windows XP
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 194238
mozilla1.4alpha

People

(Reporter: mars, Assigned: sspitzer)

References

Details

Attachments

(1 file)

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.3a) Gecko/20021212
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.3a) Gecko/20021212
On a POP account, junk mail controls have been enabled. A training.dat file has been created and is growing (currently 157 KB). Inspection of this file shows words that are definitely spam-related. I have been training the app for approximately 5 days now, and no incoming mail is being classified as junk.
Reproducible: Always
WFM. My own experience suggests that until a significant number of junk mails (at least a few dozen) have been classified, the filtering is not great. I went through my trash folders and forwarded some junk from an old web-based account and some Hotmail accounts. I started training in early December and now have 90%+ spam detection and only one false positive this month. Are you still getting the ?junk icon, or is it classifying your junk as not junk? Try classifying an email as junk and then manually running the filters on that email again (it's on the Tools menu). It should get the idea. It might also be worth unchecking the whitelist ("sender is in my address book" option), since there are issues here that may be masking the spam.
Additional Info: This is on a POP account set up as a maildrop for a domain, and as such it gets upwards of 100 spam emails a day. Mozilla 1.3a set up at home, pointing to the same account with an identical setup, started catching junk mail within one or two sessions.
I have seen this bug as well. In Mozilla build 2003011412, training.dat appears to be updated, but newly junked messages are marked as not junk when the "Run Junk Mail Controls on Selected Messages" menu option is chosen. (Clearer? description of the above follows)
Set up a POP mail account in Mozilla, with the following boxes checked in the server settings dialog:
* Check for New Messages on startup
* Check for New Messages every 10 minutes
* Automatically Download New Messages
* Leave Messages on Server
Other boxes unchecked. Download some POP email from the server. Put a bunch of other junk emails in an "mbox format" file, and copy that file into the Local Folders directory (wherever that is on your box). Now, mark those messages in that Local Folder as junk. Select a few of them to run a test on, and "Run Junk Mail Controls on Selected Messages". Watch the little Junk status icon go away. I'll attach a copy of prefs.js soon, with a few minor alterations to remove non-public data.
user@host is one account. user2@host2 is another account. Training.dat contains 99K of data, last updated within the last few minutes. Junk controls trained on at least 90 spam and 100 ok messages.
Further investigation into training.dat file shows that the number of good tokens is 0. Number of bad tokens is 6553. Number of good messages is 0. Number of bad messages is 182. Running junk mail controls on any message just turns it to "Not Junk". I will attempt to add at least 1 "good" token to the file, though this will take some time. I am running: Mozilla 2003011412 Windows 2000
I added one "good" token to the training.dat file, with appropriate modifications to set the "good message count" to 1, and the "good token count" to 1. That seems to have fixed the problem, in that I can now click the junk icons on (or off), and "Run Junk Mail Controls on selected messages" no longer changes all of the junk icons back to OK. I took a look at the updated training.dat in my profile directory, and it appears that the "good message count" and good tokens have been updated properly. It appears that having the good token count and good message count both zero confuses something in the spam analyzer, or possibly prevents the training routines from working.
First few bytes of the old training.dat file (in hex):
  feed face 0000 0000 0000 00b6 0000 0000 0000 1999
followed by lots of tokens.
First few bytes of the new training.dat file (in hex):
  feed face 0000 0001 0000 00b6 0000 0001 0000 0001 0000 0008 6775 6172 6469 616e 0000 1999
followed by lots more tokens.
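For anyone who wants to check their own file without a hex editor, the layout implied by the two hex dumps in the comment above can be sketched in Python. The field order here is inferred only from those dumps and from the counts quoted earlier (0xb6 = 182 bad messages, 0x1999 = 6553 bad tokens); it is an assumption, not a documented format. The function and key names are made up for illustration.

```python
import struct

MAGIC = 0xFEEDFACE  # first four bytes in both dumps


def _u32(buf, off):
    """Read one big-endian unsigned 32-bit integer; return (value, next offset)."""
    (value,) = struct.unpack_from(">I", buf, off)
    return value, off + 4


def parse_training_prefix(buf):
    """Parse the start of a training.dat buffer up to the bad-token count.

    Assumed layout (inferred from the dumps, not from any specification):
      magic, good message count, bad message count,
      good token count, then per good token: hit count, byte length, bytes,
      then bad token count (bad tokens themselves are not read here).
    """
    magic, off = _u32(buf, 0)
    if magic != MAGIC:
        raise ValueError("unexpected magic number")
    good_messages, off = _u32(buf, off)
    bad_messages, off = _u32(buf, off)
    good_token_count, off = _u32(buf, off)
    good_tokens = []
    for _ in range(good_token_count):
        hits, off = _u32(buf, off)
        length, off = _u32(buf, off)
        word = buf[off:off + length].decode("latin-1")
        off += length
        good_tokens.append((word, hits))
    bad_token_count, off = _u32(buf, off)
    return {
        "good_messages": good_messages,
        "bad_messages": bad_messages,
        "good_tokens": good_tokens,
        "bad_token_count": bad_token_count,
    }


# The two prefixes quoted in the comment above.
OLD = bytes.fromhex("feed face 0000 0000 0000 00b6 0000 0000 0000 1999")
NEW = bytes.fromhex(
    "feed face 0000 0001 0000 00b6 0000 0001 "
    "0000 0001 0000 0008 6775 6172 6469 616e 0000 1999"
)

if __name__ == "__main__":
    print(parse_training_prefix(OLD))
    print(parse_training_prefix(NEW))
```

Under this reading, the old file reports zero good messages and zero good tokens, and the hand-added token in the new file decodes to the word "guardian" with a hit count of 1.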
*** Bug 191004 has been marked as a duplicate of this bug. ***
Perhaps this bug's status should change to "NEW", now that it has a duplicate? The current work-around is too hard for most end-users. The "IMAP junk mail controls work/POP doesn't" part of 191004 may be a clue to what went wrong. Then again, division by zero isn't a good idea in (almost) any case.
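To illustrate the division-by-zero concern: in a Graham-style Bayesian filter, each token's spam probability divides by the number of good and bad messages seen so far. Whether Mozilla's filter computes it exactly this way is not stated in this bug, so the sketch below is an assumption, simplified and with a hypothetical function name. It only shows why an empty "good" corpus has to be handled explicitly.

```python
def graham_token_probability(good_hits, bad_hits, ngood, nbad):
    """Simplified Graham-style spam probability for one token.

    good_hits / bad_hits: how often the token appeared in good / bad mail.
    ngood / nbad: number of good / bad messages trained on.
    Returns None when no probability can be computed.
    """
    # With ngood == 0 (or nbad == 0) the ratios below are undefined.
    # An unguarded implementation would divide by zero here.
    if ngood == 0 or nbad == 0:
        return None
    g = min(1.0, 2.0 * good_hits / ngood)  # good hits are weighted double
    b = min(1.0, bad_hits / nbad)
    if g + b == 0:
        return None  # token never seen in either corpus
    return max(0.01, min(0.99, b / (g + b)))


if __name__ == "__main__":
    # 182 bad messages and 0 good messages, as reported in this bug.
    print(graham_token_probability(0, 10, 0, 182))
    # Same token once some good mail has been trained (100 is illustrative).
    print(graham_token_probability(0, 10, 100, 182))
```

A guard like the first `if` is one way to make the untrained case explicit, so the caller can decide to leave the message unclassified instead of silently scoring it as not junk.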
I strongly agree that the status of this bug should change to NEW. Obviously the bug does not happen just once for a single user, so I think it deserves further attention and a real solution. I am happy now, as the workaround from mozilla.gv5r@snet.net is working perfectly, but as he said, hex-editing the training.dat file is fairly complicated for a standard user.
Even if I train on a junk mail and run Junk Mail Controls on the same message, it will be classified as "not junk" again. The problem happens on only one of my systems. Both are running Mozilla 1.3beta on Windows XP. I can't see any interesting difference between these systems.
The problem of the junk mail filtering not working as expected with a new account (specifically when there's no pre-existing training.dat) exists in 1.3b. After marking some messages as Junk, I expected it to be able to mark other junk messages as junk. However, the junk mail filters wouldn't even keep the junk status on the messages I used to train it. Comment 6 indicates that it's because the list of good tokens is of 0 length. The workaround (or perhaps documentation fix) (at least for 1.3b) is to mark some messages as non-junk. Hex editing the training.dat file worked for me a couple weeks ago, but I didn't try simply training on non-junk messages then. This worked for me, but I'm just a Mozilla user. YMMV.
(As has been noted, this problem only happens to a few people.) It might be related to getting junk mail from too many different sources. I think the problem is harder to fix than "mark some messages as non-junk". At least with the build I was using (20030125008), no messages were ever marked as junk, so it was impossible to mark them as "non-junk". I even tried the "Tools->Mark selected messages as non-junk" menu item, but that never added tokens to the list of good ones. I could probably try again, but you have to start from a blank slate to see it happen, i.e., no training.dat file AND all current messages marked as "not-junk". How did you mark messages as not junk?
I just selected a message (again, using 1.3 beta, not a nightly) from the mailbox and used the "Mark Selected Messages as Non-Junk" menu command. My (previously empty) training.dat was updated with good tokens. If it doesn't work for you, maybe another bug or a summary change is in order. It seems this should (and does) affect everyone who creates a new profile and enables junk filtering without tagging at least one good (and junk) email. I would suspect that the lack of good tokens is what's throwing off Daniel's use of the junk filters in comment 10.
At one point, I had a spam message get through my filters (which have been trained on both spam and non-spam). In the thread pane, I clicked on the icon in the Spam column to mark the message as spam (I'm sure the thing I clicked on has a name, but I'm not buzzword-compliant in this area :-). With the message still selected, I ran "Run Junk Mail controls against selected messages", and the message's status returned to non-spam. I then ran "mark as Junk" from the Tools menu, ran "Run Junk Mail controls against selected messages" again, and this time the message stayed marked as spam. Given the info at http://www.mozilla.org/mailnews/spam-howto.html, I assumed that toggling the thread pane icon was the same as using the "mark as Junk" menu option. However, at least in this one case, it wasn't. This was running one of the 1.3b nightlies (sorry, I don't know which one). I haven't had an opportunity to try to reproduce the problem lately, but if toggling the thread pane icon isn't really adding to the training, that could explain some of the problems reported here.
accepting, might be a dup of a bug twalker / esther logged recently.
Assignee: naving → sspitzer
Keywords: nsbeta1
Target Milestone: --- → mozilla1.4alpha
twalker/esther bug mentioned above is bug 194228
QA Contact: laurel → esther
sorry the twalker/esther bug is 194238
Mail triage team: nsbeta1-
Keywords: nsbeta1 → nsbeta1-
This bug as originally stated and some of its comments are duplicates of 194238 and 191042. I know this was logged earlier, but this bug has various scenarios listed which are addressed in these two bugs. When those bugs get fixed, this bug will be tested too, but it's important for those who contributed to this bug to try their scenarios again when bugs 194238 and 191042 get fixed.
both bug #194238 and bug #191042 are now fixed, so I believe this bug is fixed.
Status: NEW → ASSIGNED
actually, this was a dup of bug #194238 *** This bug has been marked as a duplicate of 194238 ***
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → DUPLICATE
verified
Product: MailNews → Core
Product: Core → MailNews Core