Closed Bug 540385 Opened 10 years ago Closed 10 years ago
Manually marking a message as NOT JUNK is insufficient; message gets refiled as junk
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2) Gecko/20100105 Firefox/3.6 (.NET CLR 3.5.30729) Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:220.127.116.11) Gecko/20100111 Thunderbird/3.0.1 I am not the only one seeing this problem. See this URL http://getsatisfaction.com/mozilla_messaging/topics/marking_as_not_junk_issue I have a couple of messages that the mail server's spam-scanner marked as spam. In reality, they are not spam. TBird files them into the Junk folder. When I find them there and click the "Not Junk" button, they disappear from Junk and are moved to Inbox (all good). Unfortunately, as soon as I open the Inbox, the messages are moved from there and re-filed into the Junk folder. I even went so far as to add the sender to my Address Book (and verified in the Junk Mail settings of Accounts that Personal Address Book is selected under "Do not mark mail as junk if the sender is in". EVEN THAT does not prevent the messages from being moved back into the Junk folder. Reproducible: Always Actual Results: Messages keep going back into the Junk folder. Expected Results: Messages which I manually mark as "NOT JUNK" are moved to Inbox and stay in Inbox until I do something with them (like delete them, or refile them) [Perhaps Tbird's spam scanner is doing the right thing. Maybe a manual override flag needs to be added to the DB, so that it can remember that the user has purposely marked the message as Not Junk?] Here are the spam-related headers from four messages that keep being refiled into the Junk folder. Maybe these will provide some info?? X-Virus-Scanned: amavisd-new at tkmsoftware.net X-Spam-Flag: YES X-Spam-Score: 6.252 X-Spam-Level: ****** X-Spam-Status: Yes, score=6.252 tagged_above=-10 required=5 tests=[AWL=-0.894, BAD_ENC_HEADER=1.81, BAYES_50=0.001, FH_DATE_PAST_20XX=3.188, FROM_LOCAL_NOVOWEL=3.196, HTML_MESSAGE=0.001, HTML_MIME_NO_HTML_TAG=0.097, MIME_HTML_ONLY=1.457, MIME_QP_LONG_LINE=1.396, RCVD_IN_DNSWL_MED=-4] X-Virus-Scanned: amavisd-new at tkmsoftware.net X-Spam-Flag: YES X-Spam-Score: 6.699 X-Spam-Level: ****** X-Spam-Status: Yes, score=6.699 tagged_above=-10 required=5 tests=[AWL=-0.447, BAD_ENC_HEADER=1.81, BAYES_50=0.001, FH_DATE_PAST_20XX=3.188, FROM_LOCAL_NOVOWEL=3.196, HTML_MESSAGE=0.001, HTML_MIME_NO_HTML_TAG=0.097, MIME_HTML_ONLY=1.457, MIME_QP_LONG_LINE=1.396, RCVD_IN_DNSWL_MED=-4] X-Virus-Scanned: amavisd-new at tkmsoftware.net X-Spam-Flag: YES X-Spam-Score: 6.848 X-Spam-Level: ****** X-Spam-Status: Yes, score=6.848 tagged_above=-10 required=5 tests=[AWL=-0.298, BAD_ENC_HEADER=1.81, BAYES_50=0.001, FH_DATE_PAST_20XX=3.188, FROM_LOCAL_NOVOWEL=3.196, HTML_MESSAGE=0.001, HTML_MIME_NO_HTML_TAG=0.097, MIME_HTML_ONLY=1.457, MIME_QP_LONG_LINE=1.396, RCVD_IN_DNSWL_MED=-4] X-Virus-Scanned: amavisd-new at tkmsoftware.net X-Spam-Flag: YES X-Spam-Score: 7.687 X-Spam-Level: ******* X-Spam-Status: Yes, score=7.687 tagged_above=-10 required=5 tests=[BAYES_60=1, FH_DATE_PAST_20XX=3.188, HTML_IMAGE_ONLY_04=2.041, HTML_MESSAGE=0.001, MIME_HTML_ONLY=1.457] Thank you for reviewing my bug report. I appreciate Tbird being available. Thank you for making it even better.
A couple of questions: 1) What type of account (IMAP, POP3, or Local Folders) is the INBOX in, and what type of account is the incoming message coming into? 2) If you mark the message as not junk a second time, does the same thing happen?
Component: Security → Filters
Product: Thunderbird → MailNews Core
QA Contact: thunderbird → filters
Hi Kent: Sorry to mess-up in the categorization. (Part of JunkMail configuration is inside the Security Tab in options; I didn't know what else to call it; filtering certainly makes more sense.) 1) I am using IMAP servers. The original messages are being received and stored onto the IMAP server. I retrieve the messages from there. When they are filed as junk, they go into the Junk folder associated with the IMAP server. 2) I can go through the cycle as many times as I want. I tried over and over and over. (Find message in Junk-folder; mark message as "Not Junk"; message disappears, presumably moved to Inbox; I select the Inbox; message is not there and I note the counter has increased on the associated Junk-folder; I select the Junk-folder and the message is back there again). I rarely receive any Junk mail that is actually legitimate. Most of the time I just shift-delete the mail in the Junk-folder (after a cursory review). On the few occasions that I find a legitimate piece of E-mail in the Junk-folder, I am not able to move it back to the Inbox, it just won't stay there. I usually wind up archiving it, or saving it to an external file or something (no formal procedure, because I don't find many non-junk messages in the Junk-folder...) Thank you for your assistance. Please let me know if I can be of any help. Bob
I have not been able to confirm this directly for some reason, but looking at the code I can believe that it happens. The fix is fairly simple: to just change the definition of the filter file used for spam filter trust to include a junkstatusorigin field, then don't mark as junk if the junkstatusorigin was set by the user. The workaround is also simple: just don't use the "trust SpamAssassin" option but define your own filter to do the same thing. This is simply implemented as a predefined filter. So I am taking the unusual step of assigning it to myself, but leaving it unconfirmed.
Assignee: nobody → kent
Kent: Thank you for reviewing my bug report. As a semi-simple way of verifying the problem: perhaps you can take the example headers that I have previously submitted and paste them into the header section of a couple of newly received E-mail messages as they sit in your mailbox on your mail server (before your client grabs them). Then turn on "trust SpamAssassin" and retrieve your E-mail. Bob
Kent, et al: An additional observation: The "Do not mark mail as junk if sender is in 'Personal Address Book" automated-exemption method ALSO does not work. I have added the sender of some improperly Junked messages to my personal address book, verified that the check-box is marked in Account Settings and STILL the mail gets moved to Junk (based on the SpamAssassin header). Since SpamAssassin makes mistakes, it seems prudent for Thunderbird to HONOR both automated and manual exemptions (overriding the SpamAssassin header). Thanks, Bob
(In reply to comment #5) > > Since SpamAssassin makes mistakes, it seems prudent for Thunderbird to HONOR > both automated and manual exemptions (overriding the SpamAssassin header). > The issue of automated exemptions is more complex. Spammers regularly use the automated exemption of your own address, so they send you email from "you" hoping to get it through. This was fixed in TB3, but it might still be possible for a clever spammer to figure out sets of related addresses, so that they send email to you from a related address that is likely to be whitelisted. So I think that I would prefer that issue in a separate bug.
I just saw this today in many messages I received overnight. Most odd. I've not changed anything to cause this as far as I know. But I think I heard that spamassassin version was upgraded on the server. And a coworker showed me something similar a few days ago, now CC Has spamassassin changed headers? X-Virus-Scanned: clamav-milter 0.95.3 X-Virus-Status: Clean X-Spam-Status: No, score=-200.0 required=5.0 tests=SUBJECT_IN_WHITELIST, USER_IN_WHITELIST autolearn=disabled version=3.3.0 X-Spam-Checker-Version: SpamAssassin 3.3.0 (2010-01-18)
Status: UNCONFIRMED → NEW
Ever confirmed: true
I turned off trust SA and these messages are still labeled as junk.
perhaps a bit hasty, I'm not sure my issue is this bug. My messages are being marked junk (although spamassassin hasn't flagged them as such). but they are NOT moved to junk folder.
xheader of sample message labeled junk X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 X-Envelope-From: email@example.com additionally, it should have been whitelisted because the sender is in my AB
I discovered my issue was a second PC had adaptive junk mail controls enabled and because it wasn't trained it was marking a lot of of messages as junk. I disabled JMC on that PC and all is well unconfirming
Status: NEW → UNCONFIRMED
Ever confirmed: false
I can reproduce this now.
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Here's an update on status of this. I now have a unit test that demonstrates this bug, and a first attempt at a fix. Unfortunately it appears to me that the attempted fix uncovered another unrelated issue. Message summary database properties are supposed to be preserved on message moves, but that does not appear to be happening for online imap moves. I am investigating that issue at the moment.
I have a fix for this, but it turned out to be more complex than I expected. My fix also changes an interface, which prevents landing in 3.0.x When IMAP moves the junk message back to the inbox, it tries to rerun filters on that message. In general that is a good thing, so I did not want to prevent that. The trust SpamAssassin is implemented as a filter, so I simple extended that filter (generally so it works with others not just SpamAssassin) to detect whether the message has been previously marked as junk by the user. Unfortunately that uncovered a new issue. Pending attributes on Imap messages are applied quite late, in AddNewHeaderToDb, which has not happened at the time that filters function. So the pending junk attributes on the moved message were not being seen by my corrected filter. To fix that, I changed the timing of application of pending headers to be much earlier, as soon as message-id is detected and assigned in nsParseMailMessageState::FinalizeHeaders. This is the same location that the backupDatabase stuff is applied for cross-server moves and reindex. Unfortunately that is a fairly risky change, but I think it is the correct long-term answer since filters should be able to see the pending attributes of a message. As part of the debug process, I wrote a unit test to test that arbitrary attributes were being preserved on Imap moves, which I thought I had fixed earlier but was starting to doubt as I was trying to figure out this bug. Turns out it does work in the non-filter case. Since I am continuing to use the same simple Imap fake server setup, I moved that into a common routine as well, calling it imapPump.js. Asuth's message injection framework is cool and all, but when I am debugging low-level issues of imap I just need something that is simpler. I submit the property preserve test and creation of the imapPump as a separate patch attached to this bug. That is because those tests should probably land on 3.0.x (and I might want to use those in other patches), while the main bug is targeted at 3.1 The main bug requires the imapPump framework introduced by that patch.
This is a testing-only patch. It will be required by the main patch for this bug.
Attachment #436734 - Flags: review?(bienvenu)
>When IMAP moves the junk message back to the inbox, it tries to rerun filters >on that message. In general that is a good thing, Hmm, in general I would suspect this is a bad thing, and could be quite confusing and unpredictable for the user. I might be stuck in a world where filters run first, and then junk runs, which means we've already run filters on the message, but that order is optional now, iirc.
Comment on attachment 436734 [details] [diff] [review] imapPump and two tests with it you can lose this comment, I think. + /* + void CopyMessages(in nsIMsgFolder srcFolder, + in nsIArray messages, + in nsIMsgFolder dstFolder, + in boolean isMove, + in nsIMsgCopyServiceListener listener, + in nsIMsgWindow msgWindow, + in boolean allowUndo); + */ +
Attachment #436734 - Flags: review?(bienvenu) → review+
Comment on attachment 436752 [details] [diff] [review] main fix, depends on previous patch The change to AddNewHdrToDB is problematic - that means we won't automatically get the updatependinghdr call on headers added to a db, which means we have to have gone through nsParseMailboxMessageState::FinalizeHeaders, which I don't think is guaranteed...
(In reply to comment #18) > >When IMAP moves the junk message back to the inbox, it tries to rerun filters > >on that message. In general that is a good thing, > > Hmm, in general I would suspect this is a bad thing I guess that I was thinking of trying to recreate the original filtering that was interrupted by classifying the message as junk, but as I consider this some more perhaps that is not such a great idea. I suppose that an alternate approach to this would be to try to use some of the message processing flags to prevent rerunning of filters in this case. So instead of trying to have all of the database information available at filter time, I would only try to preserve the message processing flags for moved messages. Not running filters on moved messages would represent a change in behavior from current design which might have ramifications beyond the current bug though. But is that the approach you would prefer?
filters don't run on moved mail usually because the mail has been marked read, but in general I think it's a bug that filters run on unread messages moved back to the inbox. We could use a pending attribute on a msg hdr to make it so we don't apply filters to unread messages moved back to the inbox, which could be built on top of what you've already done. Re changing AddNewHdrToDB, could you make calling UpdatePendingHdr's a second time a no-op (perhaps by removing the pending information after applying it)? That way, AddNewHdrToDB can still call it...
If I am not going to reapply filters to messages, then the motivation to move the setting of pending attributes to be earlier largely goes away. I still need a mechanism to communicate the fact that we don't want to run message filters on a particular imap message though. I could do that very narrowly by adding a new database method getAttributeOnPendingHeader, and just check that one attribute prior to applying filters. Or I could be more general and add the UpdatePendingAttributes method with the changes you suggest, and update the header early like in my previous patch. In either case I would need to set a pending header somewhere, something like "dontApplyFilters" with string value "true". So would you prefer the narrower approach or the broader approach?
Input from dumb end-user: SpamAssassin (or Tbird's own junk detection mechanisms) may classify a message as junk and move it to the junk folder. This may occur regardless of special treatment that the user would normally afford to specific pieces of mail. Once a user has reviewed a given JUNK message and certified it as NOT JUNK, then the user would like the message to be processed AS NORMAL. That is, NOT just that the message go back to the inbox, but that the message be subjected to any filtering rules (such rules may tag or re-file the message on behalf of the user). For example, lets say that a certain type of message is normally afforded special treatment (colored or re-filed to given sub-folder). The user may already be burdened by correcting an erroneous JUNK classification of such a message. Once the user has reviewed the message and certified it as NOT-JUNK, if the message is not subsequently subjected to normal receiving filtering/processing, then the user will be double-burdened with the responsibility of manually implementing the filtering rules (coloring/tagging/re-filing the message by hand). Seems to me that a semaphor/flag is required for each message that indicates that a user has manually certified the mail as NOT JUNK. This would permanently exempt the message from junk processing, but STILL ALLOW full automatic input processing to occur (not just re-filing into the inbox). (Sorry if I have misused the technical terms; hope my meaning is still apparent.) Thank you for considering this input (and for making Tbird available to us!).
Isn't UpdatePendingHdr already a noop on a subsequent call, because of this code which was part of the original AddNewHdrToDb code that was copied? m_mdbAllPendingHdrsTable->CutRow(GetEnv(), pendingRow); pendingRow->CutAllColumns(GetEnv());
(In reply to comment #24) > > Once a user has reviewed a given JUNK message and certified it as NOT JUNK, > then the user would like the message to be processed AS NORMAL. That is, NOT > just that the message go back to the inbox, but that the message be subjected > to any filtering rules (such rules may tag or re-file the message on behalf of > the user). > That was my reasoning for my original statement "When IMAP moves the junk message back to the inbox, it tries to rerun filters on that message. In general that is a good thing." I guess I could go either way. Since you have already stated the pro case, let me add to the anti case: - Local Folders do not work that way, so right now IMAP works differently than Local (POP) messages. Also, refiltering only works for unread messages, which can be confusing and is a hidden fact from the user. - Any reapplication of filters opens the possibility of filter loops, where the same message gets repeatedly processed. - Filters are normally thought of as applying only once. Since junk processing happens after normal filters, the same message would get filtered twice: once on initial receipt, and once on marking as Good. The only filter action that really makes sense to repeat is a Move action. Bug 465794 - "annotate junked messages with info from what folder a junk mail came from" was an initial attempt to add the message metadata that would be needed to undo that move correctly, but we have not pursued that. I think that just restoring the junk message to its original folder, rather than reapply all filters, is really the correct solution long-term. Anyway I could go either way. I'll do whatever Bienvenu wants, but the current bug really should be fixed for tb 3.1 regardless of what we decide. Time is short for that.
A ok. I know some people (not me) have very complex filter rules. Helping them to benefit from those rules (even when a message is errantly mistaken for JUNK) is probably a good thing (however you wind up implementing it under the hood). Again, thank you all for making Tbird available!!
(In reply to comment #25) > Isn't UpdatePendingHdr already a noop on a subsequent call, because of this > code which was part of the original AddNewHdrToDb code that was copied? > > m_mdbAllPendingHdrsTable->CutRow(GetEnv(), pendingRow); > pendingRow->CutAllColumns(GetEnv()); Yes, I rather suspected it was. So you could easily revert the AddNewHdr part of your changes. I think I prefer the broader approach, because I buy the argument that having the pending attributes earlier on is better.
I believe this is what you requested.
Comment on attachment 436734 [details] [diff] [review] imapPump and two tests with it checked in as http://hg.mozilla.org/comm-central/rev/e677ea9438fa This is a testing framework only, and does not fix this bug.
See https://bugzilla.mozilla.org/show_bug.cgi?id=359263#c7 which is in a bug that specifically asks to do what my current patch is preventing.
So I think this is a good improvement and I'm willing to let it go in as is. But, after reading the comments and the other bug, now I'm wondering if we don't want to allow normal filters but disable the built-in spam filters, in this scenario. There are several ways we could do this, building on the core change of getting the pending attributes applied before the filters are run. One would be to tweak the built-in spam filters not to run if the non-junk property was already set on a message, and set a special property when "un-junking" a message that overrides the "dontapply" property. Thoughts? It might be nice to cleanup the "dontapply" property once we've looked at a message header, and decided not to run the filter on it, to avoid long term db bloat, but it would increase short term db bloat, so maybe not worth doing.
(In reply to comment #22) > filters don't run on moved mail usually because the mail has been marked read, > but in general I think it's a bug that filters run on unread messages moved > back to the inbox. (Q1) Due to \Recent flag? > http://tools.ietf.org/html/rfc3501#section-6.4.7 > The COPY command copies the specified message(s) to the end of the > specified destination mailbox. The flags and internal date of the > message(s) SHOULD be preserved, and the Recent flag SHOULD be set, > in the copy. Is "(-\Seen || \Recent) && new UID" condition of new mail? If so, can we ignore \Recent flag always? (reply on "no \Seen flag" only of newly added mail=unknown UID). If IMAP supports UID and is RFC compliant, and if \Recent is ignored, and if store \Seen is executed at Junk before move back to Inbox, filter won't be invoked on moved back mail by current logic. Following is a page found by Google search for "imap flag junk". > http://www.ietf.org/mail-archive/web/asrg/current/msg16301.html > Thunderbird adds a flag called Junk to a message if it's flagged as > junk, and a flag called NonJunk to a message that has the junk > indicator subsequently turned off. (Q2) Does Tb store NonJunk flag at Junk folder before "move back to Inbox"? If yes, and if IMAP is RFC compliant(preserve flag upon copy), and if "trust SpamAssassin" option respects NonJunk flag, problem due to "trust SpamAssassin" won't occur. (same concept as "dontapply" property) (Q3) Can we add new internal flag such as $label_0 for "mail which Tb already knows"? (also same concept as "dontapply" property) If user defined keyword is supported, it works even if server is not RFC compliant(Gmail IMAP probably resets \Seen upon uid copy if mail in [Gmail]/All Mail doesn't have \Seen flag yet). But if user defined keyword is not supported, other way is needed... Problem of this solution: No one will remove the $label_0.
The earlier attachment 436752 [details] [diff] [review] implemented the solution that leaves the filtering characteristics unchanged, but moves the pendingAttribute application earlier. It relies on extending the definition of the special spam filters to only function if the user has set the spam score. This is the approach to "allow normal filters but disable the built-in spam filters" that I still think is best (plus I already have the code written for that!). This is very similar to Bievenu's "One would be to tweak the built-in spam filters not to run if the non-junk property was already set on a message" or WADA's 'if "trust SpamAssassin" option respects NonJunk flag' So let's go forward with the assumption that this late in the game, it is too risky to change the behavior of filtering (the dontApplyFilters concept) and instead leave the filtering behavior unchanged, but correct the built-in spam filter definition to fix the current bug. So the issue becomes, do we use the local db property junkscoreorigin=='user' to narrowly detect cases where the user set the junkscore, or do we only apply these filters if junkscore is undefined? The latter I think would have the advantage of working in cases of people running multiple TB installations since junkscoreorigin is local, while I think that the local junkscore is set by IMAP from flags if available (the JunQuilla 'imap flag' in junkstatus+). But I would need to extend the junkscore filter to support the undefined case, which I want to do anyway (bug 540449). That is a fairly small change to add to this bug, but it does need doing soon if that is the approach we will take. I would suggest, Bienvenu, that you look briefly at the previous attachment and see if that is now the approach that you favor. I can extend that to use junkscore instead of junkscoreorigin.
OK I'm going to take back what I said in comment 34. That is, I really don't want to try to extend junkstatus to support undefined before 3.1 string freeze. That will force me to mess with mailWidgets.xml, which always puts me in a bad mood and takes much more time than I expect. The solution using junkscoreorigin works just fine AFAICT, and only differs in subtle ways from the solution using junkstatus/junkscore. Are there any objections to that? The patch is already posted - though I might want to check it again before landing.
Comment on attachment 436752 [details] [diff] [review] main fix, depends on previous patch I am unobsoleting this since we are now discussing this approach.
Attachment #436752 - Attachment is obsolete: false
Comment on attachment 436752 [details] [diff] [review] main fix, depends on previous patch This bug, with the interface change, is presumably subject to the feature freeze for tb 3.1 I'm going to re-enable review flags for the older attachment as a way of pinging :bienvenu that we need to make some decisions about this bug.
Comment on attachment 436752 [details] [diff] [review] main fix, depends on previous patch can you merge these patches, so you don't get rid of AddNewHdrToDb's updating of pending attributes?
Attachment #436752 - Attachment is obsolete: true
Attachment #437404 - Attachment is obsolete: true
Attachment #439636 - Flags: superreview?(bienvenu)
Attachment #439636 - Flags: review?(bienvenu)
Attachment #437404 - Flags: superreview?(bienvenu)
Attachment #437404 - Flags: review?(bienvenu)
Comment on attachment 439636 [details] [diff] [review] Combine previous patches Checked in as http://hg.mozilla.org/comm-central/rev/66b59dea21b7
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Whiteboard: [gs][needs r/sr bienvenu]
updated and notified http://gsfn.us/t/ou5i note: per updated procedure at https://wiki.mozilla.org/Thunderbird/Support/Workflow we retain [gs] and add [gssolved] to whiteboard v.fixed based on results of Bug 570943.
Status: RESOLVED → VERIFIED
I'm also having the problem where messages from senders in my address book are sometimes marked as junk, creating false positives. Did we get a resolution? I'm running TB 3.1.6. Thanks.
You need to log in before you can comment on or make changes to this bug.