Open Bug 19442 Opened 25 years ago Updated 5 months ago

Regular expressions in mail and news filters

Categories

(MailNews Core :: Filters, enhancement)

Tracking

(Not tracked)

People

(Reporter: gwalla, Unassigned)

Details

(Keywords: helpwanted, Whiteboard: [penelope_wants])

The ability to filter email and news posts by regular expression matching in headers would be nice. It could be used to filter out messages in ALL CAPS or ending with a number, among other things.
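As a rough illustration of the two cases named in the request, the checks could be written as JavaScript regular expressions. This is only a sketch: the function names `isAllCaps` and `endsWithNumber` are hypothetical, not part of any Mozilla filter API, and only ASCII letters are considered.

```javascript
// Hypothetical helpers for illustration; not part of any Mozilla API.

// True when the subject has at least one uppercase ASCII letter
// and no lowercase ASCII letters at all.
function isAllCaps(subject) {
  return /[A-Z]/.test(subject) && !/[a-z]/.test(subject);
}

// True when the subject ends with a run of digits
// (trailing whitespace is ignored).
function endsWithNumber(subject) {
  return /\d+\s*$/.test(subject);
}

console.log(isAllCaps("BUY NOW!!!"));
console.log(endsWithNumber("Cheap loans 48213"));
```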
QA Contact: lchiang → laurel
Assignee: phil → nobody
Summary: [FEATURE] regular expressions in mail and news filters → [HELP WANTED] regular expressions in mail and news filters
Whiteboard: [HELP WANTED]
Adding to [help wanted] list
Keywords: helpwanted
Summary: [HELP WANTED] regular expressions in mail and news filters → Regular expressions in mail and news filters
Whiteboard: [HELP WANTED]
Blocks: 66423
Blocks: 66425
That and it'd be nice to have a filter action that would move messages to folders like "bug $1", where the $1 gets substituted from the regex, creating a new folder if one doesn't already exist.
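For illustration only, the `$1` substitution described above amounts to reading a capture group and building a folder name from it. The sketch below assumes a subject-based rule; the function name `folderForSubject` and the pattern are hypothetical, and folder creation itself is not shown.

```javascript
// Hypothetical sketch: derive a target folder name from a capture group.
// Returns null when the subject does not mention a bug number.
function folderForSubject(subject) {
  // Group 1 captures the bug number, e.g. "19442" in "[Bug 19442] ...".
  const match = /\bbug\s+#?(\d+)\b/i.exec(subject);
  return match ? "bug " + match[1] : null;
}

console.log(folderForSubject("[Bug 19442] Regular expressions in filters"));
```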
Daniel: one issue = one bug. Let's keep this one nice & simple. Using backreferences for folder creation would be a new RFE depending on this one.
At the very least, there must be a case sensitivity flag for the matching strings. Now I have to enter a spam-filtering substring several times in all case combinations if I want to filter by it.
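With a regular expression engine, the case-sensitivity flag asked for here is a single option. A minimal sketch, assuming the user types a literal substring rather than a pattern (the helper name `containsIgnoringCase` is hypothetical):

```javascript
// Hypothetical helper: literal substring search that ignores case.
function containsIgnoringCase(text, needle) {
  // Escape regex metacharacters so the needle is matched literally.
  const escaped = needle.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // The "i" flag makes the comparison case-insensitive.
  return new RegExp(escaped, "i").test(text);
}

console.log(containsIgnoringCase("Get ViAgRa today", "viagra"));
```

One filter entry then covers every case combination of the string.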
I have an extensive set of pr0n and spam filters, and there are still two types of mail slipping through:
1) Subject: bla blah blah blah [many spaces here] 12451252 [AFXLKFJK whatever]
2) Received: someone@anyhost.cn [or other strange domains]
Case 1 could be handled by an additional [ends with] condition; case 2 and more sophisticated patterns could not. I also think it's faster to run one regex than to go through my 20+ filters per account.
I know discussion does not belong here, but I needed to say that it's impossible to filter out *much* spam without regex (I still get 2-10 per account a day, and I have a very good set of message filters so far AND I do not subscribe with my email on sex sites :) ), and that regexes might actually be faster in some cases.
Nominating mozilla1.1.
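The two kinds of mail described in this comment could be expressed roughly as follows. This is a sketch under assumptions: the thresholds (at least five spaces, at least four digits) are guesses, and `.cn` is simply the example domain from the comment.

```javascript
// Case 1: a long run of whitespace, then a number, then an optional junk
// token at the very end of the subject. Thresholds are assumptions.
const paddedNumberSubject = /\s{5,}\d{4,}(\s+\S+)?\s*$/;

// Case 2: an address whose host name ends in the unwanted TLD.
const unwantedTld = /@[\w.-]+\.cn\b/i;

console.log(paddedNumberSubject.test("bla blah blah          12451252 AFXLKFJK"));
console.log(unwantedTld.test("Received: someone@anyhost.cn"));
```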
Keywords: mozilla1.1
*** Bug 148430 has been marked as a duplicate of this bug. ***
Yes, as the number of spam mails I get has dramatically increased (around 50 per day), I have also come to the conclusion that regex filtering is urgently needed, for example to filter out mails with the subject "Gambling get over $100 and 100% guranteed" using gambling*$*. I think that Perl regexes (though I would like them) would be too overwhelming for the average user. Maybe bug 151226 is also of interest in this context.
I just want to second that the amount of spam is increasing dramatically; because of that, this should work on the body too, but that is another bug I guess...
updated keyword -> mozilla1.2
Keywords: mozilla1.1 → mozilla1.2
Depends on: killfile
No longer depends on: killfile
Great news about the RegExp filter ability!!! I have a file of 440 filters in WinEudora that look at the source IP of a spam mail, and with it I can survive 25 spams a day. I hope this filter can be applied to "all headers", "all Received: lines", "last Received: line" or "Body" (not so useful for antispam purposes).
Antispam steps:
1 - Look in the headers for the "real" source IP. It is not always the last Received: line.
2 - Whois this IP; get back the admin's email and the provider's IP range.
3 - Write to the admin and attach the mail with all headers.
4 - Create a RegExp filter based on the IP range that puts the mail in a folder.
I have written a C program that creates a RegExp from an IP range. I'm writing a second C program that parses the headers to find the real source IP... and that doesn't trust forged headers... :-))
How can I participate in developing MozillaMail? Bye, efa
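Steps 1 and 4 above can be sketched in JavaScript. Note that this sketch swaps in a different technique from the commenter's C program: instead of generating a regular expression from an IP range, it extracts the addresses with a regex and compares them numerically. It assumes unfolded `Received:` lines with the address in square brackets; all function names are hypothetical, and the addresses are documentation-only examples.

```javascript
// Convert a dotted-quad IPv4 address to a number (no bit operations,
// so the result never goes negative).
function ipToNumber(ip) {
  return ip.split(".").reduce((acc, octet) => acc * 256 + Number(octet), 0);
}

// Collect the first bracketed IPv4 address on each Received: line.
function receivedIps(headers) {
  const pattern = /^Received:.*?\[(\d{1,3}(?:\.\d{1,3}){3})\]/gim;
  const ips = [];
  let match;
  while ((match = pattern.exec(headers)) !== null) {
    ips.push(match[1]);
  }
  return ips;
}

// True when any Received: address lies within [firstIp, lastIp].
function anyInRange(headers, firstIp, lastIp) {
  const low = ipToNumber(firstIp);
  const high = ipToNumber(lastIp);
  return receivedIps(headers).some((ip) => {
    const value = ipToNumber(ip);
    return value >= low && value <= high;
  });
}

const sampleHeaders = [
  "Received: from mail.example.net (mail.example.net [192.0.2.45]) by mx.example.org",
  "Received: from unknown (HELO x) [198.51.100.7]",
  "Subject: hi [10.0.0.1]",
].join("\n");

console.log(receivedIps(sampleHeaders));
console.log(anyInRange(sampleHeaders, "192.0.2.0", "192.0.2.255"));
```

Deciding which `Received:` line is trustworthy (step 1) is the hard part and is not attempted here.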
While regular expressions are a great tool for identifying patterns, if each person takes the time to keep their regexes up to date with the latest spamming tactics, there will be a very large collective waste of time. There are projects like spamassassin that are probably a better use of your time. I know: I spent a lot of time with procmail and its weighted scoring, and even with that many spams got through and I had to keep updating it as time went along. You might consider supporting the spamassassin (spamd) protocol and having mozilla filter a message through that, but if you already have a spamd server then you can probably filter at the server level...
Last time I checked, mail filters weren't just for filtering spam.
I do not completely agree with Mike. Let me explain why. Previously I did not understand spamassassin; I have since studied it. I see that it is a weighted scoring system with a very big list of checks on the body and headers of a mail. I also tried Spamnix (a EudoraWin32 plugin for spamassassin). It works well. But these are programs that automatically identify a mail as spam. It's a great tool. But limiting ourselves to receiving, identifying and moving mail to a spam folder is really dangerous. If you do so, the amount of spam becomes really enormous. In the end you receive 99% spam. Yes, you don't see it, but internet traffic will be all spam. Don't think only of yourself. Spam must be killed before it damages us!
My idea is to get spammers kicked off the good providers (which really are most of them). Imagine having a spamassassin tool to identify spam, and then an automatic tool that does my previous points 1 to 3. The spam mail would be forwarded to the admin of the source server, who would kick out the spammer immediately. I do that manually about once per day. Every day I receive responses from admins that say: "We investigate. If we find that a customer is in violation of our policies, we will take the necessary action to stop the activity in question" or, best, "Thanks you for report. I have just now terminated the account responsible for the abuse." This method works; it is only slow because for now it is mostly manual and maybe I'm the only one using it. Point 4 is only for admins that don't reply or don't kick out the spammer, or for providers that are spammers themselves (really few).
Regex filtering requires some time to keep the list of filters updated, but does not require keeping up with the latest spamming tactics. They are the same in the headers. Most mail originates from the last (bottom) Received IP address. Forged headers originate from the last IP that has real DNS in the header. My C code for point 1 is a real alpha, but in the future I think it can identify the real source IP correctly. My C code for point 4 is a Candidate Release 2, really stable and bug free. Point 2 is Unix whois (it needs a port for Win32 systems). Not too much more: some parsing of the whois report to extract abuse@domain.tld and the IP range registered with IANA, ...
It seems to me that the trick to extinguishing spam is an automatic tool to recognise spam (spamassassin or the new MozillaMail 1.3 filter), and then a generalized automatic tool to stop the spammer. And if such a tool becomes really widespread (with point 4), admins will pay more attention to spammers, because most users can easily filter all the mail originating from a provider. In any case, RegEx filter ability is a great tool for text matching in filters and for searching in mailboxes. I want it. :-))
Valerio: that was entirely offtopic, except for the last line which was a simple "me too". Please don't comment unless you have something you have to contribute to the bug under consideration.
*** Bug 191261 has been marked as a duplicate of this bug. ***
*** Bug 184690 has been marked as a duplicate of this bug. ***
I'd apparently posted a dup of this bug. I was informed:
> > I want to sift out all emails with the subject line containing "jhotdraw". So I create a filter for subject = *jhotdraw*
> Why not simply do "subject" "contains" "jhotdraw" ?
I've been trying to do exactly this (simply do "subject" "contains" "jhotdraw") for as long as I've been running mozilla. As it stands, in moz1.3b no newly created filters are running at all (not connected to this bug, I know).
*** Bug 198273 has been marked as a duplicate of this bug. ***
Blocks: eudora
*** Bug 218298 has been marked as a duplicate of this bug. ***
There has been an increase in SPAM recently due to some very prolific viruses. I think that if wildcards (* and ?) were added to "Message Filtering", many of the SPAMs that I am receiving could be filtered out. Usually they are shotgun spams that are emailed to a dozen people with email addresses similar to mine. Wildcards would allow me to easily identify these shotgun spammers.
I need regex for matching SpamAssassin headers. Voting for this bug.
Sander: See also bug 224318 - "Bayes filtering should be aware of X-Spam Headers".
I would like to make my Mozilla even more effective. I am looking for a solution to the following problems: with obvious (and surely deliberate) misspellings, the filter is as leaky as a sieve. For example, I have blocked "Viagra", but the filter lets through V;agra V i a g r a Via.gra It would be helpful if all whitespace and special characters were eliminated *before* the pattern comparison. HTML emails are even more problematic: a text such as <br> V<big>i</big>agra<br> <br> Via<span style="color: rgb(102, 51, 102);">g</span>ra<br> is not recognized, although a human can read it perfectly well! Even funnier: <br> Vi<acd>agra<br> where <acd> can be any combination of characters that spammers sprinkle in arbitrarily.
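The normalization suggested in this comment (strip tags and special characters before comparing) can be sketched as below. The function names are hypothetical and the tag stripping is a crude regex, not an HTML parser.

```javascript
// Hypothetical sketch: normalize text before matching a blocked word.
function normalizeForMatching(html) {
  return html
    .replace(/<[^>]*>/g, "")           // drop anything that looks like a tag
    .replace(/[^\p{L}\p{N}]+/gu, "")   // drop whitespace and punctuation
    .toLowerCase();
}

function containsObfuscated(html, word) {
  return normalizeForMatching(html).includes(word.toLowerCase());
}

console.log(containsObfuscated("<br> V<big>i</big>agra<br>", "Viagra"));
console.log(containsObfuscated("V i a g r a", "Viagra"));
```

Two caveats apply. Removing characters does not catch substitutions, so `V;agra` normalizes to `vagra` and is still missed. Removing all spaces can also create accidental matches across word boundaries, so a real filter would need something more careful.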
*** Bug 213567 has been marked as a duplicate of this bug. ***
Keywords: mozilla1.2
In all honesty, the concept of Perl regular expressions doesn't even have to be implemented in its entirety... if I would like to send entire domains to a certain folder, it'd be nice to do *@somedomain.com and have that be my one filter. So a simple implementation of ? (single character match) or * (multiple character matches) would do the job for me, and would also be better known than full-blown regex.
The fact is that it already _is_ implemented in its entirety in JavaScript (AFAIK), so it's just a matter of using it. No sense in re-implementing a subset of regular expressions, I think.
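The `?` and `*` wildcards described here can be layered on top of a regular expression engine rather than implemented separately. A minimal sketch, assuming whole-string, case-insensitive matching (the function name `globToRegExp` is hypothetical):

```javascript
// Hypothetical sketch: translate a wildcard pattern into a RegExp.
// "*" matches any run of characters, "?" matches exactly one character,
// and everything else is matched literally.
function globToRegExp(glob) {
  const source = glob
    .split("")
    .map((ch) => {
      if (ch === "*") return ".*";
      if (ch === "?") return ".";
      return ch.replace(/[.+^${}()|[\]\\]/g, "\\$&");
    })
    .join("");
  return new RegExp("^" + source + "$", "i");
}

console.log(globToRegExp("*@somedomain.com").test("alice@somedomain.com"));
```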
(In reply to comment #25)
> if I would like to send entire domains to a certain folder, it'd be nice to do *@somedomain.com
The problem is not with real source addresses. It is with fake source addresses, as in spam. The spammer can easily fake the source email and domain, but cannot fake the source IP address in the Received header lines. With regex you can match the whole IANA-registered IP range of a known spam provider like chi....... or hana......
(In reply to comment #27) While I agree with the general concepts, my comments don't relate specifically to spam. I get plenty of mail that is legitimate AND needs to be filed away into a single folder, BUT comes from sources that are just a little bit different in different ways (for example, 61source_dev@domain.com, 61source_stg@domain.com, 81prod_srv2@domain.com). Filtering on subject lines would not be helpful as there have been unpleasant side effects. :) The ability to write a single expression for this would be helpful. I tested a handful of regular expressions that might've worked... they didn't. Eyal... if the JS filtering is there, it's undocumented. :-) On a related note, the reason this comes up is because I don't have the option to create a filter from a message in my inbox (much faster way to create filters). That is a separate feature, however.
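For the three example addresses in this comment, a single expression might look like the sketch below. The exact shape (two digits, letters, underscore, letters or digits) is inferred from those three samples only and is an assumption.

```javascript
// One pattern for senders such as 61source_dev@domain.com,
// 61source_stg@domain.com and 81prod_srv2@domain.com.
// The structure is inferred from the three examples in the comment.
const buildSender = /^\d{2}[a-z]+_[a-z0-9]+@domain\.com$/i;

console.log(buildSender.test("61source_dev@domain.com"));
console.log(buildSender.test("newsletter@domain.com"));
```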
(In reply to comment #25)
> if I would like to send entire domains to a certain folder, it'd be nice to do *@somedomain.com and have that be my one filter. So a simple implementation of ? (single character match) or * (multiple character matches) would do the job for me
I tried this just now with Mozilla Suite 1.7.3 and it works well. Just create a filter with "Sender", "Contains" and "@domain.com".
> > Eyal... if the JS filtering is there, it's undocumented. :-) > Here's how we use regexp's in BiDi Mail UI: http://www.mozdev.org/source/browse/~checkout~/bidiui/source/suite/chrome/content/bidimailpack/bidimailpack-common.js?rev=1.3 just define it with /..../ 's and then do myregexp.test(mystring) . Pretty straightforward.
(In reply to comment #28) > > The ability to write a single expression for this would be helpful. I tested a > handful of regular expressions that might've worked... they didn't. > > Eyal... if the JS filtering is there, it's undocumented. :-) It's not that regexp filtering has been implemented (if it was, this would be RESOLVED FIXED), but that JavaScript already has support for regular expressions so it's just a matter of modifying the filter code to use it. No need to implement a new parser.
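The point made in the last two comments, that JavaScript's built-in regular expressions only need to be called, can be sketched as follows. The function names are hypothetical; only `RegExp` and its `test` method are standard JavaScript. A user-typed pattern may be invalid, so the sketch treats a syntax error as "no match".

```javascript
// Hypothetical sketch: compile a user-typed pattern, tolerating bad syntax.
function compileUserPattern(source, ignoreCase) {
  try {
    return new RegExp(source, ignoreCase ? "i" : "");
  } catch (e) {
    // new RegExp throws a SyntaxError for an invalid pattern.
    return null;
  }
}

// True when the header value matches the pattern; false when it does not
// or when the pattern could not be compiled.
function headerMatches(headerValue, source, ignoreCase) {
  const pattern = compileUserPattern(source, ignoreCase);
  return pattern !== null && pattern.test(headerValue);
}

console.log(headerMatches("Re: [jhotdraw] release", "jhotdraw", false));
```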
Product: MailNews → Core
It's a pity nobody is working on this. Would be a killerfeature for Thunderbird.
*** Bug 261854 has been marked as a duplicate of this bug. ***
Free Eudora had regular expressions in filters a long time ago. Couldn't believe it when I discovered that Mozilla mail didn't.
*** Bug 275988 has been marked as a duplicate of this bug. ***
*** Bug 304428 has been marked as a duplicate of this bug. ***
Depends on: 213567
*** Bug 337229 has been marked as a duplicate of this bug. ***
If Thunderbird had basic "wild card intelligence" (which has been around since the DOS age, when we typed dir file.* and the program gave us the matching files...), it could (with not many more lines of code) be more efficient and less frustrating at finding INTELLIGENT MATCHES in the filter expression. I know a little about C++ (it's not my best language) and I analyzed the Thunderbird source, so I can offer the following CODE SUGGESTION. One just needs to transform the diagram into equivalent code. VERY STRAIGHTFORWARD and simple. Please take a look; I hope you give us a Thunderbird UPDATE with "wild card intelligence" in the CUSTOM FILTERS SOON... That would be basic, but would help us A LOT.
The ideal features would include:
1- The possibility of mixing OR and AND statements in the same RULE WINDOW. Nowadays you can use ONLY one of the options...
2- For advanced users, filters with Perl regexp. Ex: re:.+v.*i?a?.*g?.*r?.*a to match: Viagra Visagra Viagbbbrgra and so on...
Here is the algorithm I suggest to add "wild card intelligence" to Thunderbird filters. In the image I even indicate the cpp file and the function that needs to be improved. I hope it helps somehow... http://img509.imageshack.us/img509/2923/thunderbirdsuggestion6pa.gif
Best Regards
Sergio Abreu
*** Bug 358683 has been marked as a duplicate of this bug. ***
I've been taking a look at this and have hacked up some code. It'll need some beating into shape before it is suitable for inclusion, for which I will need some advice from someone more familiar with the mozilla codebase. (My current MUA is showing signs of age, and I'd *really* like to switch to Thunderbird. However, the lack of regex support in the filters means I can't bring over my existing sorting/blacklisting filters, which is an absolute showstopper for me.)
There are actually two existing regex implementations within the mozilla codebase:
1) directory/c-sdk/ldap/libraries/regex.c - this is a very limited implementation, far less capable than most people would expect in this age of ubiquitous PCRE support. It's also apparently extended from an old grep implementation in a slightly odd (non-standard) way.
2) js/src/jsregexp.c - this is the one mentioned above. It supports a much more powerful Perl-ish syntax which, I believe from the comments, is specified for Javascript by ECMA. Unfortunately it is heavily tied in with both the JS engine memory allocation routines and the JS engine's internal typedef range. Its symbols are not, without a lot of grotty hacking, even visible from the rest of the mozilla codebase. Even if they were visible, it requires a JSRuntime and JSContext object to operate, which are way too much overhead for a generic facility. It cannot be used directly.
3) (I know I said 2!) There's something in security/nss/lib/util/portreg.c *calling* itself a regexp, but it is clearly just a shell glob with maybe a little regexy-like extension. There is almost identical code in both modules/libjar/nsWildCard.cpp and xpfe/components/filepicker/src/nsWildCard.cpp, which is more accurately named.
God knows whether (1) or (3) are still live code.
I've been playing around and am now at the stage where I have working POP3 filtering based on regexes. I've taken a copy of jsregexp.c, converted it to use PR_ memory allocation and PR typedefs, and it seems to function fairly well. As a proof of concept it shows this bug (after 8 years!) could be addressed without too much pain. At the moment the regex engine is namespaced out of the way and #included directly into mailnews/base/search/src/nsMsgSearchTerm.cpp. Quick-n-dirty with minimal impact, but clearly not a great solution.
I suggest that after suitable cleanups it ought to be made a core facility available across mozilla/. The problem is I lack the familiarity (and authority!) to know where it is appropriate to put it so that it is available anywhere it might be needed. (I don't think it could be abstracted out of the JS engine, which means a parallel implementation. The JS version has to conform rigidly to ECMA, whereas I can see people wanting to extend a common internal facility in various ways. The different ways of handling memory management are also a block here. Not pretty, but necessary I think.)
The changes to filtering/UI to integrate it are relatively trivial by comparison (even if the mozilla codebase is a huge bloated monstrosity with no rhyme nor reason to the location of any particular file). However, there are a number of places where it may be useful and it might be nice to hit all of them in one go. Again, I lack enough familiarity to be able to enumerate all such places; my focus has so far been on filtering of POP3 mail. I could do with someone who knows the codebase better to give me a few pointers here...
Cheers, John
OK - thanks, Eyal. I wonder if John Sullivan is still around (see comment #40 above).
Priority: P3 → P1
(In reply to comment #43) > Peva: All TB development is very low-priority for MoFo/MoCo at the moment (and > in general); IIRC there are only 2 full-time TB/mailnews developers at the > moment, and Mitchell Baker wrote a blog entry basically rationalizing how TB > development won't be getting any of the googlebucks. So changing settings in a > bug page probably won't help much with getting feature added anytime soon... > what you (us) need is to find someone to work on this. > Mozilla may only have 2 people working on Thunderbird, but Qualcomm has 4 additional people working on the same basic code. Granted those 4 are not _full-time_ on Thunderbird / Penelope / Eudora, but they are submitting changes.... The more people we can get involved the better, IMHO. Matt
I am still around. I'd really like to get this done - it's what stops me from upgrading from my current MUA, which a spammer has unwittingly discovered how to crash with stupidly long subject lines. I have a workaround, but a local build of Thunderbird sorts it out completely. The situation is as above. If someone with commit rights wants to say where I ought to put generic library code in a way that won't interfere with other projects, I can make time. I guess I could just make something up, request review and sort it out from there, if any reviewer is listening and objects to a crude insertion. I should point out that I can make time before the end of the year: I have leave I need to take and could use a part of it for this, but I just don't feel up to using spare time during my normal working schedule when I could be unwinding from existing machine stuff. I'm sure you understand.
John: some related regexp interface design was done in bug 106590. (If you need that it might be worth arguing your case there to get it un-wontfixed.)
(In reply to comment #47) > ... (If you need that it might be worth arguing your case there to get it un-wontfixed.) That seems highly unlikely without core (pun intended) support from those formerly involved in that bug and their strategic (and probably smart) direction of ECMA-262 regexp's / JS_*RegExp API noted in bug 348642. John, Brian says in there "It's pretty easy". Perhaps they'd be glad to have your help. (plus the bugs dependent on bug 106593, bug 32641 and bug 80337, are quite dead so no help gonna come from there)
I want to interject one point into the debate: should we do full JS regexp or wildcard matching? Regexp is a much more powerful form of matching, and I have one use case. The most recent infusion of MI5 spam in Usenet can be filtered out with the following regexp for sender: ^[vief]+@.*$
Laying out other requirements:
* Support both wildcard and regexp? If not, which one?
* Matches, contains, or both?
* Strings? I like the idea `Sender' `matches regex' [==========]
* Needs to be implemented for all of IMAP, POP, NNTP
To John: If you need help with the filter code, I can easily provide it.
(In reply to comment #53) > full JS regexp or wildcard matching If we can grab some low hanging wildcard fruit, fine. But as soon as "heavy" coding is involved, we'd probably shoot for RegExp - although not necessarily JS based (having a scriptable interface for the JS RegExp would be truly cool, but I doubt we'll see that). (I've tinkered a bit with how to implement user-defined JS filter actions, but I'm not sure how slow such extensive XPCOM boundary crossing would get.) > * Matches, contains, or both? Not much of a difference with RegExp. > * Needs to be implemented for all of IMAP, POP, NNTP And "movemail" and "none" (local folders).
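The remark that "matches" and "contains" differ little with regular expressions can be made concrete: "matches" is "contains" with the pattern anchored at both ends. A minimal sketch (the function names `contains` and `matches` are hypothetical; the sender pattern is the one quoted in comment #53, and the addresses are made up):

```javascript
// "contains": the pattern may match anywhere in the text.
function contains(source, text) {
  return new RegExp(source).test(text);
}

// "matches": the pattern must cover the whole text, so wrap it in
// a non-capturing group and anchor both ends.
function matches(source, text) {
  return new RegExp("^(?:" + source + ")$").test(text);
}

const senderSource = "[vief]+@.*";

console.log(matches(senderSource, "eveif@example.invalid"));
console.log(matches(senderSource, "steve@example.invalid"));
console.log(contains(senderSource, "steve@example.invalid"));
```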
Flags: wanted-thunderbird3?
Flags: blocking-thunderbird3?
Filter on "Nobody_NScomTLD_20080620"
QA Contact: laurel → backend
Product: Core → MailNews Core
Flags: wanted-thunderbird3? → wanted-thunderbird3+
Flags: blocking-thunderbird3? → blocking-thunderbird3-
Marking as wanted-, as per the revised driving rules <https://wiki.mozilla.org/Thunderbird:Release_Driving>.
Flags: wanted-thunderbird3+ → wanted-thunderbird3-
I am *not* a thunderbird driver, but I think the way they're doing things is fair. If users, QAers, extension devs, and non-driver devs "want" a bug, they should vote for it. Obviously by virtue of the feature bug existing in the first place, somebody wants it. So unless the "wanted" flag is only for "special" people, it's pretty redundant.
blocking and wanted have always been part of a mechanism for thunderbird-drivers to help shepherd the highest-impact bugs/features into the tree; nothing has changed there. The wiki page changes did not happen at random; I personally made them on behalf of thunderbird-drivers, to clarify policy changes we've made regarding use of that flag: we didn't think that the way it was being used was really sufficiently helpful w.r.t. helping get the highest impact bugs into the tree. It sounds like you think there should be some separate mechanism for use by the broader community, but it's not clear to me how you envision that working. That bug isn't really the best place for that discussion, I think, but you're welcome to post a proposal to m.d.a.thunderbird or m.d.planning.
Er, "This bug isn't really the best place...."
(In reply to comment #59)
> Votes are not version-related. Plus, by now voting is pretty deprecated AFAICT since votes have been generally ignored.
To be fair, I don't think anyone *currently* associated with Thunderbird has discouraged voting by users (though there are certainly detractors in the mozilla community). And some of us do use votes, for example when attempting to differentiate within an overwhelming number of bugs looking for worthy nominations. But voting's usefulness is limited, and it has its problems - for example it does not equate directly or well to severity nor need. And you can't productively rank bugs against each other, for example a 9 year old bug with 90 votes (like this bug) against a 1 year old bug with 20 votes.
Getting back to this bug...
> Eh, what the hell, let'em do whatever they want(ed). Nobody seems to listen to what I/people like me say anyway.
I'm guessing your ultimate concern is for this bug to make progress, which is dependent more on the suggested blocker or someone taking interest (anyone touched base with John Sullivan?), and less so on its status per drivers. Relevant bit also in bug 106590 comment 37.
(In reply to comment #62)
> But voting's usefulness is limited, and it has its problems - for example it does not equate directly or well to severity nor need. And you can't productively rank bugs against each other, for example a 9 year old bug with 90 votes (like this bug) against a 1 year old bug with 20 votes.
I counted the # of TB bugs with > 50 votes filed in the last:
1 year: 0
2 years: 1
3 years: 2
4 years: 4
all time: 44
No bugs in the past 4 years have > 100 votes, the newest being just under 5 years old. Votes have a strong bias towards older bugs, which means that they are a poor approximation of wanted features.
> I'm guessing your ultimate concern is for this bug to make progress, which is dependent more on the suggested blocker or someone taking interest (anyone touched base with John Sullivan?), and less so on its status per drivers. relevant bit also in bug 106590 comment 37.
I've been reading C++0x recently, and as that adds support for regex natively, it might be worthwhile to ask NSPR to add support for the libraries or put it elsewhere. That's another can of worms entirely, though...
I recently posted a patch in bug 495519 that implements the ability to add custom search terms to Thunderbird filters, and as a demonstration shows an extension that does a regular expression comparison to the message subject. I'm reasonably confident that a finished patch will be implemented in TB3, so the use of regular expression filter terms in extensions should be possible then. That patch, and its cousin that adds custom filter actions, relies on calling javascript-implemented features from the C++ filter code. If we are willing to take that step, then it is not a big leap to do the same trick in the normal filter code to implement regular expressions. We could implement a javascript XPCOM object to execute a regular expression, and call it from the C++ search code when needed to do the regular expressions. This is not a big project. It also could be done in extensions instead of in the core code though. I'm curious what people think of this approach, with its obvious possible performance issues. I'd probably be willing to do this work if drivers could add wanted+ to this bug. Otherwise I'll just leave it to the imagination of extension writers.
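To make the idea of a JavaScript-implemented search term concrete, here is a standalone sketch. It is not the interface from the bug 495519 patch: the object shape (`id`, `name`, `match`), the plain `subject` property on the message header, and the id string are all hypothetical stand-ins. Because the comment raises performance as a concern, the sketch caches compiled patterns so that each one is compiled once rather than once per message.

```javascript
// Cache of compiled patterns keyed by source text; invalid patterns are
// remembered as null so they are not recompiled for every message.
const compiledPatterns = new Map();

function getPattern(source) {
  if (!compiledPatterns.has(source)) {
    let compiled = null;
    try {
      compiled = new RegExp(source, "i");
    } catch (e) {
      // Invalid pattern: leave compiled as null.
    }
    compiledPatterns.set(source, compiled);
  }
  return compiledPatterns.get(source);
}

// Hypothetical custom term object; the shape is illustrative only.
const subjectRegexTerm = {
  id: "example@example.invalid#subjectRegex",
  name: "Subject matches regex",
  match(messageHeader, searchValue) {
    const pattern = getPattern(searchValue);
    return pattern !== null && pattern.test(messageHeader.subject);
  },
};

console.log(
  subjectRegexTerm.match({ subject: "[Bug 19442] Regular expressions" }, "^\\[bug \\d+\\]")
);
```

Caching only addresses compilation cost. The cost of crossing from C++ into JavaScript for every message, which the comment also mentions, would still need to be measured.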
Component: Backend → Filters
QA Contact: backend → filters
In general, it seems like an entirely reasonable approach. Because of the potential perf issues, I suspect that the thing that makes the most sense is to code this up as an extension first and do some benchmarking before committing to accept this in the core. As such, I don't think we can say for sure whether it's wanted+ at this point in time.
Whiteboard: [penelope_wants]
Also it would be nice if messages could be searched for using regular expression syntax.
This is a bug that could be easily implemented using a javascript XPCOM component to do the regex processing, and adding an additional choice for text searching as suggested in https://bugzilla.mozilla.org/attachment.cgi?id=719922
ukrainianconsular@gmail.com, Personal attacks are not appreciated *at all* around here. Clearly you have nothing better to do with your time than to insult extremely helpful and knowledgeable contributors such as Kent. Thunderbird is run ONLY by contributors now and Mozilla has very little to do with its development. If you care so badly about this being fixed, by all means go ahead and fix it yourself, but don't expect the rest of us to do whatever you want. We are all working in our spare time to improve TB (we actually have lives and jobs outside of this), but we can't get to every bug in a timely fashion. There are thousands of filed bugs. Therefore, unless you have something useful to contribute to a development-only site, please just stop. Continuing to attack people may result in a ban on your account. Thanks, Josiah
Also, I am pressing the moz devs on this because rkent will not add regex filtering for the body section in his addon; he thinks his rejection, and my complaint about his abuse of trust, affects me and only me. He will not do it, and everyone here who has been expecting that function has been waiting forever.
(In reply to ukrainianconsular from comment #74) You cover your valid points with so much hyperbole, with curses, with YELLING, et cetera, that it's really difficult for people with a view similar to yours to sympathize with your message, or even to read through it.
(In reply to Josiah Bruner [:JosiahOne] from comment #73) Josiah,
- "Very little" is still some. And I'm sure people connected to the Mozilla Foundation/Corporation have a lot of say w.r.t. Thunderbird development.
- Thunderbird development work over the years sure seems to have had very strange priorities.
- Many users get extremely frustrated
- For some people, implementing a certain feature might mean 2 days of work, while for others, it might mean many months of agonizing to get to the point where they can add anything significant to the code. I really suggest people remove "so fix it yourself" from the lexicon.
- A 20-year-old mail client should have gotten regex support in message filters many years ago.
Having said that, ukrainianconsular's personal attacks are quite beyond the acceptable.
It's just that the 15 years of this thread, and the fact that the author of an addon ignores a main basic function of his add-on, drive me nuts, and that's factual, not hyperbole. If my complaints are beyond the acceptable, I apologize for having bothered you and will make sure I never post any requests at mozilla.org again. Certainly, after having started many threads and obtained no fix for any of them, even without complaining about that form of negligence, it's beyond acceptable to insist. You are right, Eyal, I shouldn't keep expecting anything after having seen the date this thread was started .. . when email clients are designed and posted online with regex in their first release. To all of the people here who have been deeply offended, I send my most humble apologies. Keep up the good work.
(missing bit in my comment) - Many users get extremely frustrated after waiting many years to see some attention to significant issues they have reported or commented on, that they end up losing their temper, like our friend ukrainianconsular. I'm sure he has enough on his plate in life these days in the Ukraine than to write comments here... he does it because he cares about making Thunderbird better. Everyone here wants that, nothing else.
ur right ((
If you check bug 868233, you can see that Kent already has a patch that can help all addon authors work on filters that use the body part. The patch itself actually demos how to use regular expressions for body matching. Also, without the above patch, it's not possible to just write an addon to do a regular expression search against the body, because getting the body is currently an async call, and the filter system needs a sync call. Current Thunderbird development is a bit slow, that's true. I do see a few bugs that have patches which might need months before they get reviewed. That frustrates both contributors and end users. There might be ways to improve this, but I agree with Josiah Bruner: personal attacks don't help.
could you move this fruitless discussion to the forum please? I really dont need this via email...
Thank you so much opera wang, hoping this patch addresses regex & not javascript since i don't know the javascript language & have no computer knowledge. could you please tell me how to install it by the way since i have no coding knowledge & don't know were to put that content & how to name the file: https://bugzilla.mozilla.org/attachment.cgi?id=744907 Regex is already a bit complicated for me but i use a magic tool called regulazy that helped me a lot, when i click on regex edit, select parts & right click, then i have many choices including "exact-letters-numbers-anything, etc". This matter has frustrated me a lot because i've been fighting against spam using a personal email that's been posted by inbreds online & collected by all types of extractors-companies or hired spammers to send me a whole imaginary world of spam content, & i took the decision to abandon the ordinary baysian filters since they appeared on the web & that are useless for my part, since the baysian system sends false positives to a spam folder, moreover, there is a spam folder "to deal with/deal with spam", & i can't use that system to delete false positives. so i started to create complex filters over 20 years of my life, dealing with those spams almost every day, i now reached almost 2500 complex filters, and many of those filters are very "cerebral" & well thought, including the fact: Re: i "or" you (depending on the linguistic expression & content) are not present when i receive marketing related words & http is present & that personal technique is like 30% of my spam filter's database. regex for the body was my only accurate possibility because: if people use doesn't contain "i " & there's a word finishing with the letter "i" in the body, a false positive will be considered & i cannot allow myself such false positives caus emy filters are radical & delete the messages without sound without any attention, without visibility of the matter.. 
This is a stupid limitation of all email clients that has never been fixed by an update. I could use " i ", but if my letter "i" is at the very start of the body, it will be ignored by TB's filters, because there's no space before the "i". So we need either a regex for the body, or an "exclusively contains" option in TB's filters (this would be an evolution in email filtering), which would exclude all letters of the alphabet directly before or after a string/word/letter. Using "exclusively contains" for "i", this should be recognized: "hihi,i " ... but not this: "oui". Spammers can even amuse themselves reading this message; I will defeat their mind-polluting unsolicited spam campaigns. Even if they try to obfuscate "i" or "you" or http links by playing with tags or colors, I have a solution for them too. And a link that arrives without any text should also be handled by a complex regex for the body. I have already excluded messages with more than a certain number of addresses in the To, Bcc or Cc fields. This regex option for the body was very important to me because I have engaged myself in 20 years of personal fight, and my system is now almost perfect for an address that receives 100 spam emails PER DAY. If I log in to my email server account through the web panel, I see them all in the trash can, and they're purged automatically by the server. I now get about 1 spam email every 7 to 14 days. I've built very complex filters, including ones for words obfuscated by being broken up among tags, for example. I've reviewed my spam box many times and have no false positives, but that's a hard and long fight against spam. I also lost a dear member of my family, forgot the vital human aspect of my life and experienced affective misery for being so obsessed with it.
Today I'm seeing the end of the tunnel, folks. This body regex filtering option adds perfection to my filtering system and allows me to treat tags too, without ever having to deal with it again, when some online filtering engines don't even allow it. I became schizophrenic and it became a psychosis, and I ended up regretting having ignored that family member, who passed away before I could spend more time with them by having more freedom in my hands. And you can't just sign up for a new email address when you've shared that address with thousands of people over decades. Spam can be very devastating in a life: it starts by slowly irritating you until it invades your life. Some spammers, for example from Argentina, are protected by their laws and spam with total impunity; that's why full spam reports will never stop some spammers. All I'm missing now is a "My email" not present in the To field nor the Cc field condition, where the filter checks more than one identity without my having to add them to filters and remove them if I ever get rid of an account. I hope aceman takes care of it. Again, my sincere apologies to everyone who has felt harassed by my interventions here, but if Eyal hadn't tried to understand me, I probably wouldn't have taken the time to explain my situation and why I am so obsessed with the matter. Before you guys feel harassed, trust me, I've been through a lot on this matter. Let's move on from this. Thanks.
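For readers wondering what the "exclusively contains" condition described above would mean in regex terms, here is a minimal sketch in Python. It is not Thunderbird code and not anything FiltaQuilla or Expression Search is known to ship; the helper name `contains_standalone` is hypothetical, and it assumes "letters of the alphabet" means ASCII letters and that matching should ignore case.

```python
import re

def contains_standalone(token: str, body: str) -> bool:
    """Return True if `token` occurs in `body` with no ASCII letter
    directly before or after it (start/end of text also counts)."""
    # Negative lookbehind/lookahead reject a neighbouring letter but,
    # unlike a literal space, also succeed at the start or end of the body.
    pattern = r"(?<![A-Za-z])" + re.escape(token) + r"(?![A-Za-z])"
    return re.search(pattern, body, flags=re.IGNORECASE) is not None

print(contains_standalone("i", "hihi,i "))  # True: the last "i" follows a comma
print(contains_standalone("i", "oui"))      # False: "i" is preceded by "u"
print(contains_standalone("i", "i agree"))  # True: "i" at the very start of the body
```

The lookarounds are what a plain contains " i " filter cannot express: they test the neighbouring characters without requiring that a space actually be there.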
News: my private conversation with Opera Wang:
Opera: The patch is C++ code; it requires you to download the source code of TB, patch it and then rebuild the whole of TB. It would be hard if you are not in IT. Also, the patch is about half a year old, which means it is bit rotted (it can't be applied directly and needs some merge effort).
ukrainianconsular: That means that if I use the patch, I should never update Thunderbird and must stay on the current stable build forever, since I can't expect the devs at Mozilla to incorporate it for good. Could you please add it, since I have no coding knowledge?
Opera: As I said, without the patch it's (almost) impossible to do a regex search in the body. I actually tried it once, with a lot of dirty code, and still couldn't make it work without the possibility of crashing TB.
ukrainianconsular to the devs on Bugzilla: So now I'm hopeless, and I believe rkent is blocked because TB needs a code modification before this regex search becomes possible. If you guys want to consider the issue, here's rkent's patch: https://bugzilla.mozilla.org/attachment.cgi?id=744907 Apparently, only you can do something about it.
I'm getting rid of FiltaQuilla, which is now obsolete for regex filtering on the To, From and Subject headers and on the missing one: the body. Many thanks to Opera Wang for introducing all those options in the add-on he maintains: Expression Search / GMailUI 0.8.8 beta http://www.sendspace.com/file/dsfhu4
(In reply to Alastair Gordon from comment #89) > It cannot be difficult, can it? Sure it isn't ... just attach a patch to fix it!
See my comment 64. That approach is implemented in the addon FiltaQuilla, which has been available for years. The usage of that addon has never been that high (I believe around 10,000 users) and it includes many features in addition to RegExp, so the total actual demand for this feature is not as high as you would think. Because it is easy to do in an addon now, it may be appropriate to just leave it there.
I have installed FiltaQuilla, and to the best of my knowledge, it will not apply filters to the BODY of the message, making it essentially useless for filtering spam. If FiltaQuilla does, in fact, allow filtering on the body and is usable by anyone other than a hard-core nerd, then you are right that we already have a solution. Simply supporting a wildcard character in the existing Thunderbird Message Filters would meet 90% of filtering needs. How about someone just adding wildcard capability to existing Message Filters?
Alastair: Right, body filters are much more difficult, since filters are sync by nature, and body access is async. There is now a technical solution to this that did not exist when FiltaQuilla was written, though it is still a bit of a kludge, and may lead to unstable behavior (filters are crashy enough by themselves). It would probably be possible to add RegEx body filters to an addon such as FiltaQuilla, but frankly I've put almost no effort into that in years, and am unlikely to do so in the foreseeable future. There are much more urgent issues to deal with in Thunderbird at the moment than filter improvements. If someone wanted to attempt that I would be happy to point them in the correct direction.
Thank you for the explanation, Kent. I hope someone takes up your kind offer. In fact, simply allowing wild cards ("*") in the existing Message Filters, which can filter on the contents of the BODY, would achieve most of what is needed. How tough would that be?
An asterisk "*" is a simple workaround, but it isn't enough. Do you plan to support Unix shell wildcards or regexps? The shell accepts *, ? and [chars]; that is the simplest pattern syntax to write. Another question: does the asterisk work when it is placed in the subject, sender, recipient or anywhere else?
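To illustrate how small the gap is between the shell wildcards mentioned above (*, ? and [chars]) and regular expressions, here is a minimal Python sketch. It uses the standard library's `fnmatch.translate`, which converts a shell-style pattern into a regex; the helper name `glob_matches` and the choice of case-insensitive matching are assumptions for illustration only and say nothing about how Thunderbird's filters work or would implement this.

```python
import fnmatch
import re

def glob_matches(pattern: str, text: str) -> bool:
    """Match `text` against a shell-style pattern (*, ?, [chars]),
    ignoring case. The pattern must cover the whole text."""
    # fnmatch.translate turns the wildcard pattern into an anchored regex.
    regex = re.compile(fnmatch.translate(pattern), re.IGNORECASE)
    return regex.match(text) is not None

print(glob_matches("*viagra*", "Cheap Viagra now"))  # True: "contains" via * on both sides
print(glob_matches("bug ?", "bug 7"))                # True: ? is exactly one character
print(glob_matches("bug ?", "bug 42"))               # False: two characters after "bug "
print(glob_matches("*[0-9]", "blah blah 12451252"))  # True: text ends with a digit
```

Because the translated pattern is anchored at both ends, a "contains" condition has to be written with an asterisk on each side, as in the first call.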
Sorry if I'm late to the party, here. Kent, FiltaQuilla doesn't come up when I search for Thunderbird add-ons that do searching - and its description only includes a laundry list of features I don't actually care about (no offense). I care far more about the ability to search e-mail bodies than the ability to use full regular expressions to search only subjects. That said, either regex or boolean search would seem to me to be much more than just a nice-to-have. For me, this has nothing to do with spam - there are other means of filtering spam, and for now TB's "junk" designation is working OK for me. I keep my e-mail until I run out of disk space, and I search old e-mails for specific content *in the body of the e-mail* on a regular basis. In fact, I only within the last month switched to using TB at home rather than my ancient Forte Agent 1.8, because it's easy to use and I haven't wanted to pay to upgrade Forte Agent to a more modern version that will handle the now-ubiquitous HTML e-mails. But I was dismayed to see how e-mail search is implemented in TB. I may go back to Forte Agent after all... I will try some add-ons first, but everything I'm reading seems to indicate that even those that will search other headers may not search bodies (and I understand about the technical difficulty). And recent reviews of FiltaQuilla and of Opera Wang's Expression Search / GMailUI seem to indicate regex searching is currently broken in both of them. Rebeccah
See Also: → 363298
This is supported by FiltaQuilla, super documentation at http://www.digiblog.de/2010/11/regular-expression-mail-filters-for-thunderbird/ Maybe this issue can be reported as RESOLVED?

Many popular applications let you search and filter by regular expression (think of LibreOffice, but also most common editors), as this is a very useful and common feature.

Severity: normal → S3
See Also: → 16750