Closed
Bug 194238
Opened 22 years ago
Closed 22 years ago
junk mail controls won't analyze a message as junk until you mark a message as "not junk"
Categories
(SeaMonkey :: MailNews: Message Display, defect)
SeaMonkey
MailNews: Message Display
Tracking
(Not tracked)
VERIFIED
FIXED
mozilla1.4alpha
People
(Reporter: sspitzer, Assigned: sspitzer)
References
Details
(Keywords: regression, Whiteboard: [adt2])
Attachments
(1 file, 1 obsolete file)
1.45 KB, patch
junk mail controls stopped working for mac os x around 2/17?
more details coming...
Updated•22 years ago
Hardware: PC → Macintosh
Comment 1•22 years ago
I trained the build from 02/17 all week. No messages were ever filtered to the
junk folder as I had set up.
esther was seeing a problem where turning on delete message after _1__ days
caused filtering to not work. I had also set up my account to delete 2 day old
junk marked messages. I've since turned that off and am waiting for incoming junk.
Note: going back to M1.3b (ns) works
Assignee | ||
Comment 2•22 years ago
morphing the bug based on comments from twalker and esther:
from twalker:
"I think the delete messages issue that esther pointed out is the problem I
turned it off...and an incoming message was filtered to the junk folder"
from esther:
"I think I noticed it stopped working on my Winxp system yesterday, on a
Profile with a very well trained JMC (Junk Mail Control) when I selected to
delete junk messages after 1 day option. I have since disabled that option
and this morning I saw the JMC work again on that profile. Still
investigating."
OS: MacOS X → All
Hardware: Macintosh → All
Summary: junk mail controls stopped working for mac os x → enabling delete junk messages after [n] days causes junk filter to stop working.
Target Milestone: --- → mozilla1.4alpha
I'm sorry but after more investigation on my Profile mentioned in comment 2, I
am not seeing the problem when I enable the automatic delete option again using
the same profile and same build. So this option is working OK on my winxp system.
Assignee | ||
Comment 4•22 years ago
morphing this bug back, based on comments from esther.
Keywords: regression
OS: All → MacOS X
Summary: enabling delete junk messages after [n] days causes junk filter to stop working. → junk mail controls not work on OS X
Comment 5•22 years ago
I can no longer reproduce this bug. I've been running the build from 2003-02-20.
It is learning and filtering fine (for this early state in its learning curve)
with the delete junkmail after (n) days on or off.
Has anyone else besides esther and me seen this bug?
I think I found the problem. As mentioned before, my newly created POP and IMAP
profiles on Mac OS X were not evaluating incoming junk mail either (the training
file was being built and growing in size as I marked more messages as junk). I
found in both cases that after I marked a message Not Junk, the evaluation & move
started working. This is why those who have been using profiles and JMC on Mac
did not run into this problem. Tracy added this as a new profile to his Mac
system with no training file, so messages were not getting marked as Junk and he
had no need to mark them as Not Junk. Not sure if this is across platforms or
if it's a regression, since we all have had Junk mail working for a while.
Assignee | ||
Comment 7•22 years ago
morphing, based on comments from esther.
esther, does this happen on win32 or linux?
Summary: junk mail controls not work on OS X → junk mail controls not work on OS X until you mark a message as "not junk"
Yes, it happens on winxp and linux. Not sure if this is a regression; we may not
have realized this happened in our early testing because we weren't as familiar
with the feature as we are now. Our expectations may have been that the JMC
wasn't robust enough to catch the junk, so we were marking messages ourselves.
Once a user made the mistake of marking a message as junk when it shouldn't be,
they would unmark it, and from that point on JMC worked.
Note: The envelope feature should make it so the user marks messages as Not Junk
right from the start, so they probably won't run into this.
Updated•22 years ago
Flags: blocking1.3?
Comment 10•22 years ago
Bug 188232 may be the first report of this bug which goes back to the 12-12-2002
build.
Updated•22 years ago
OS: MacOS X → All
Summary: junk mail controls not work on OS X until you mark a message as "not junk" → junk mail controls not work until you mark a message as "not junk"
Assignee | ||
Comment 11•22 years ago
accepting. I hope to fix this soon.
Status: NEW → ASSIGNED
Keywords: nsbeta1
Updated•22 years ago
Flags: blocking1.3? → blocking1.3-
Comment 12•22 years ago
Mail triage team: nsbeta1+/adt2
Comment 13•22 years ago
If the number of good tokens is 0, there's a possible division by 0 in
mailnews/extensions/bayesian-spam-filter/src/nsBayesianFilter.cpp
near line #625
in function void nsBayesianFilter::classifyMessage(Tokenizer& tokenizer, const
char* messageURI, nsIJunkMailClassificationListener* listener)
I'm not sure what 'min (1, g / ngood)' returns when ngood=0, so this may not be
that important.
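Whether this matters depends on the operand types, which the comment above does not state: integer division by zero is undefined behavior in C++, while IEEE 754 floating-point division yields infinity or NaN instead of trapping. A minimal sketch of one way to guard the ratio follows. It is not the code in nsBayesianFilter.cpp; the helper name `cappedRatio` and the choice to return 0 for an empty corpus are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper for illustration; not taken from nsBayesianFilter.cpp.
// Computes min(1, count / total) without ever dividing by zero.
double cappedRatio(double count, double total) {
    assert(count >= 0.0 && total >= 0.0);  // counts are never negative
    if (total == 0.0) {
        // Assumption: with no trained messages of this kind there is no
        // evidence, so report a ratio of 0 instead of dividing.
        return 0.0;
    }
    return std::min(1.0, count / total);
}
```

With this guard, a call such as `cappedRatio(g, ngood)` stays well defined even when `ngood` is 0, at the cost of having to decide what an untrained state should mean.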
Comment 14•22 years ago
Not sure if this is the right bug, but... I had junk mail controls working great
on Win2k and WinXP installs of 1.3b against the same IMAP account. Both
installs had been trained pretty well and were working with high levels of
accuracy. Sometime about a week ago, the Win2k install of 1.3b stopped marking
incoming junk mail (previously it had near 100% accuracy).
The WinXP install continued marking incoming mail as spam for about 4 days
longer than the Win2K machine, until two days ago a friend sent me a false
positive (had "sexy" in the subject). I marked it as not junk, and from that
point on, the WinXP install failed to classify junk at all.
Upgrading both installs to 1.3final hasn't fixed the problem.
After reading comments in Bugzilla, I've marked a few e-mails as "Not Junk" and
will update comments if that fixes the problem.
Apologies for having few specifics on the Win2K install breaking; let me know if
anyone wants more specifics about my setup.
Assignee | ||
Comment 15•22 years ago
Assignee | ||
Comment 16•22 years ago
Attachment #117370 -
Attachment is obsolete: true
Assignee | ||
Comment 17•22 years ago
fixed. the patch has r/sr=bienvenu.
this should improve the initial experience, and encourage users to train both
good and bad, which is important for the algorithm to work.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 18•22 years ago
*** Bug 188232 has been marked as a duplicate of this bug. ***
Comment 19•22 years ago
I'm experiencing symptoms from this bug, but have marked many messages as
non-junk. Specifically, the symptom I'm seeing is that if I run the junkmail
controls on a spam message, moz marks it as non-junk. If I mark it as junk
manually, and rerun the control, it still marks it as non-junk.
Is there a tool to analyze the training.dat file? I'd like to provide more
feedback but I don't know how to analyze the training.dat.
I've trained mozilla on all my spam and non-spam since 2000 (yes, I even keep
all my spam :).
The training.dat file is about 1.6Mb. There are a zillion words in the file, but
how can I recognize which tokens it considers good and which ones it considers bad?
Assignee | ||
Updated•22 years ago
Summary: junk mail controls not work until you mark a message as "not junk" → junk mail controls won't analyze a message as junk until you mark a message as "not junk"
Comment 20•22 years ago
This seems to be a pretty important issue in making JMC work perfectly!
Now, there seems to be a 1.3.1 release in the works (bug 197105 and bug
185169), so how about seeing if it's possible to get this into the 1.3.1 build
too?
Comment 21•22 years ago
FYI - marking messages as non-junk fixed my problem reported in comment #14 on
the WinXP box. Haven't tested the Win2k box but assume that it will fix the
problem there, too.
Comment 22•22 years ago
here's how the code works now, for a first time user of this feature:
1) no training data, all incoming will be determined to be junk (except for
whitelisting).
2) user marks message as not junk, now training data has only "good" tokens.
3) all incoming messages will be determined to be not junk.
4) user marks message as junk, now training data has both "good" and "bad" tokens
5) all incoming messages properly analyzed (except for whitelisting)
this is as desired, since it forces users to train both junk and not junk (which
is needed for the JMC to work properly.)
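The steps above can be summarized as a small decision function. This is only a sketch of the described behavior, not the Mozilla implementation: the names `Verdict` and `firstRunVerdict` are hypothetical, treating a whitelisted sender as not junk in every state is an assumption, and the case where only junk has been trained is not covered by the steps, so it is reported as `Unspecified`.

```cpp
// Illustrative names only; none of these come from the Mozilla source.
enum class Verdict { Junk, NotJunk, Analyze, Unspecified };

// Maps the state of the training data to the outcome described in the steps.
constexpr Verdict firstRunVerdict(bool hasGoodTokens, bool hasBadTokens,
                                  bool whitelisted) {
    if (whitelisted) {
        return Verdict::NotJunk;      // assumption: whitelist always wins
    }
    if (!hasGoodTokens && !hasBadTokens) {
        return Verdict::Junk;         // step 1: no training data at all
    }
    if (hasGoodTokens && !hasBadTokens) {
        return Verdict::NotJunk;      // steps 2-3: only "good" tokens
    }
    if (hasGoodTokens && hasBadTokens) {
        return Verdict::Analyze;      // steps 4-5: real classification
    }
    return Verdict::Unspecified;      // only "bad" tokens: not described
}
```

Only the `Analyze` outcome involves the Bayesian scoring itself; the other outcomes are fixed answers that push a new user toward training both categories.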
Comment 23•22 years ago
Using trunk builds 20030318 on winxp and linux this is fixed per how it should
work in comment #22. Using trunk build 20030324 on mac osx this is fixed per
comment 22 also. Verified.
Status: RESOLVED → VERIFIED
Comment 24•22 years ago
*** Bug 198762 has been marked as a duplicate of this bug. ***
Comment 25•22 years ago
*** Bug 197801 has been marked as a duplicate of this bug. ***
Comment 26•22 years ago
I am running 2003052908 (1.4b). This bug appears to still be rearing its ugly
head in one form. I have no whitelist that I am aware of, and no address book
entries. If I delete training.dat and then run the JMC on my 7200-message
nonjunk corpus, it marks about 99% of them as junk. The ones not marked as junk
have no common features except that they really aren't junk. I have been told
this is not proper, and that is backed up by previous comments. It should mark
them all as junk, but doesn't, indicating a possible problem related to this bug.
And, slightly off topic, not really commenting on the bug, more using this as a
message board to get an answer from people who know how the JMC works:
In a proper bayesian filter you can start with a corpus of already sorted mail,
junk and non-junk, and create a dictionary of tokens ('words') from their
occurrences in each half of the corpus which can then be used as a starting point
for filtering. Is there any way to do this in Mozilla? So far the best results
I have had, after multiple failed training attempts, come from marking each half
INCORRECTLY, then deleting the training.dat, then marking them correctly, thus having
artificially 'received' and marked all the mails into their proper category and
theoretically producing a proper starting dictionary. However, my results after
using this method, as measured by false positive and false negative results on
new incoming mail, are FAR below those I would expect and have seen with other
bayesian filters (including one written myself for IRC). My initial corpus is
comprised of 7200 non-junk emails and 1200 junk emails, which I am aware is
slightly imbalanced. Out of the 87 emails I have received since I finished the
training, Mozilla has gotten 39 proper junk positives, 11 false negatives (junk
that didn't get marked), 31 proper negatives, and 6 false positives (nonjunk
marked as junk). This is many orders of magnitude worse than I have learned to
expect from a bayesian filter with this level of training, my expectations being
more along the lines of 50 0 36 1 respectively. If anyone could shed some light
on this I would be grateful.
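For reference, the four counts quoted above can be turned into rates with a few lines of C++. This is a sketch for checking the arithmetic only; the `Confusion` struct and the function names are made up for this example and have nothing to do with the Mozilla code.

```cpp
#include <cassert>

// Hypothetical container for the four counts reported in the comment.
struct Confusion {
    int truePos;   // junk correctly marked as junk
    int falseNeg;  // junk that was not marked
    int trueNeg;   // non-junk correctly left alone
    int falsePos;  // non-junk marked as junk
};

// Fraction of all messages that were classified correctly.
double accuracy(const Confusion& c) {
    int total = c.truePos + c.falseNeg + c.trueNeg + c.falsePos;
    assert(total > 0);
    return static_cast<double>(c.truePos + c.trueNeg) / total;
}

// Fraction of non-junk messages wrongly marked as junk.
double falsePositiveRate(const Confusion& c) {
    int nonJunk = c.falsePos + c.trueNeg;
    assert(nonJunk > 0);
    return static_cast<double>(c.falsePos) / nonJunk;
}

// Fraction of junk messages that slipped through unmarked.
double falseNegativeRate(const Confusion& c) {
    int junk = c.falseNeg + c.truePos;
    assert(junk > 0);
    return static_cast<double>(c.falseNeg) / junk;
}
```

For the observed counts `{39, 11, 31, 6}` this gives an accuracy of 70/87 (about 80.5%), a false positive rate of 6/37 (about 16.2%) and a false negative rate of 11/50 (22%). The expected counts `{50, 0, 36, 1}` would give 86/87 (about 98.9%), 1/37 (about 2.7%) and 0% respectively.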
Updated•20 years ago
Product: Browser → Seamonkey