19442 - Regular expressions in mail and news filters

Reporter

Description

•

25 years ago

The ability to filter email and news posts by regular expression matching in headers would be nice. It could be used to filter out messages in ALL CAPS or ending with a number, among other things.
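
As a rough illustration of the two cases named in the description, here is a minimal sketch in JavaScript. The patterns and the function name `looksSuspicious` are assumptions made for this example; nothing here is existing filter code.

```javascript
// Hypothetical header checks; patterns are illustrative only.
// At least one uppercase letter and no lowercase letters at all.
const allCaps = /^[^a-z]*[A-Z][^a-z]*$/;
// One or more digits at the very end (trailing spaces allowed).
const endsWithNumber = /\d+\s*$/;

function looksSuspicious(subject) {
  return allCaps.test(subject) || endsWithNumber.test(subject);
}
```

A subject such as "BUY NOW!!!" or "Great offer 12451252" would be flagged, while "Re: bug 19442 status" would not, because the digits are not at the end.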

lchiang

Updated

•

25 years ago

QA Contact: lchiang → laurel

Phil Peterson

Updated

•

25 years ago

Assignee: phil → nobody

Summary: [FEATURE] regular expressions in mail and news filters → [HELP WANTED] regular expressions in mail and news filters

Whiteboard: [HELP WANTED]

Phil Peterson

Comment 1

•

25 years ago

Adding to [help wanted] list

Michael Lowe

Updated

•

25 years ago

Keywords: helpwanted

Phil Peterson

Updated

•

25 years ago

Summary: [HELP WANTED] regular expressions in mail and news filters → Regular expressions in mail and news filters

Whiteboard: [HELP WANTED]

Brian 'netdragon' Bober

Updated

•

24 years ago

Blocks: 66423

Brian 'netdragon' Bober

Updated

•

24 years ago

Blocks: 66425

Daniel Brooks [:db48x]

Comment 2

•

23 years ago

That and it'd be nice to have a filter action that would move messages to folders like "bug $1", where the $1 gets substituted from the regex, creating a new folder if one doesn't already exist.
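
A sketch of what such a substitution might look like, assuming a subject format like "[Bug 19442] ..." and a hypothetical helper `targetFolder`; creating the folder itself is out of scope here.

```javascript
// Hypothetical: derive a folder name from the first capture group.
function targetFolder(subject) {
  const m = /\[Bug (\d+)\]/.exec(subject);
  // m[1] is what "$1" would stand for in a "bug $1" folder template.
  return m ? `bug ${m[1]}` : null;
}
```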

Garth Wallace

Reporter

Comment 3

•

23 years ago

Daniel: one issue = one bug. Let's keep this one nice & simple. Using backreferences for folder creation would be a new RFE depending on this one.

dmitry

Comment 4

•

23 years ago

At the very least, there must be a case sensitivity flag for the matching strings. Now I have to enter a spam-filtering substring several times in all case combinations if I want to filter by it.
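
With a regex engine, the case-sensitivity flag asked for here comes almost for free. A minimal sketch, assuming a hypothetical helper `containsIgnoringCase` that treats the search string literally:

```javascript
// Hypothetical helper: case-insensitive substring test via the "i" flag.
function containsIgnoringCase(header, needle) {
  // Escape regex metacharacters so the needle is matched literally.
  const escaped = needle.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(escaped, "i").test(header);
}
```

One filter entry then covers "free money", "FREE MONEY" and "FrEe MoNeY" alike.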

André Langhorst

Comment 5

•

23 years ago

I have an extensive set of pr0n and spam filters, and there are still two types of mail slipping through:
1) Subject: bla blah blah blah [many spaces here] 12451252 [AFXLKFJK whatever]
2) Received: someone@anyhost.cn [or other strange domains]
Type 1 could be handled by an additional [ends with] condition; type 2 and more sophisticated stuff could not. I also think it's faster to do a regex than to browse my 20+ filters per account. I know discussion does not belong here, but I needed to say that it's impossible to filter out *much* spam without regex (I still get 2-10 per account a day, and I have a very good set of message filters so far AND I do not subscribe with my email on sex sites :) ), and that regexes might actually be faster in some cases. Nominating mozilla1.1.
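
The two kinds of mail described in this comment could be sketched as patterns like the following. The thresholds (five or more spaces, four or more digits) and both constant names are assumptions chosen for illustration.

```javascript
// Type 1: a long run of spaces, a serial number, and an optional
// random uppercase tail at the end of the subject.
const paddedSerial = /\s{5,}\d{4,}(\s+[A-Z]{4,})?\s*$/;

// Type 2: an address whose host ends in a given top-level domain.
const cnAddress = /@[\w.-]+\.cn$/i;
```

A plain "ends with" condition could only approximate the first pattern, since the number itself changes from message to message.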

Keywords: mozilla1.1

Nicolás Lichtmaier

Comment 6

•

23 years ago

*** Bug 148430 has been marked as a duplicate of this bug. ***

Carsten Menke

Comment 7

•

22 years ago

Yes, as the number of spam mails I get has increased dramatically (around 50 per day), I also came to the conclusion that regex filtering is urgently needed, for example to filter out mails with the subject "Gambling get over $100 and 100% guranteed" using gambling*$*. I think that Perl regexes (though I would like them) would be too overwhelming for the average user. Maybe bug 151226 is also of interest in this context.

André Langhorst

Comment 8

•

22 years ago

I just want to second that the amount of spam is increasing dramatically, and because of that this should work in the body too, but that is another bug I guess...

André Langhorst

Comment 9

•

22 years ago

updated keyword -> mozilla1.2

Keywords: mozilla1.1 → mozilla1.2

Anthony DeRobertis

Updated

•

22 years ago

Depends on: killfile

Garth Wallace

Reporter

Updated

•

22 years ago

No longer depends on: killfile

Valerio Messina

Comment 10

•

22 years ago

RegExp filter ability would be great!!! I have a file of 440 filters in WinEudora that look at the source IP of a spam mail, and with it I can survive 25 spams a day. I hope this filter can be applied to "all headers", "all Received: lines", "last Received: line" or "Body" (not so useful for anti-spam effectiveness). Anti-spam steps:
1 - Look in the headers for the "real" source IP. It is not always the last Received: line.
2 - Whois this IP to get the admin's email and the provider's IP range.
3 - Write to the admin and attach the mail with all headers.
4 - Create RegExp filters based on the IP range that put the mail in a folder.
I have written a C program that creates a RegExp from an IP range. I'm writing a second C program that parses headers to find the real source IP... and that doesn't trust forged headers... :-)) How can I participate in developing MozillaMail? Bye, efa
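
The C program mentioned above is not shown in this bug. As a much simplified sketch of the same idea in JavaScript, the hypothetical function below builds a pattern for a single /24 block only (arbitrary ranges would need more work). The prefix 192.0.2 is a documentation range used purely as an example.

```javascript
// Hypothetical: build a RegExp matching any IPv4 address in a /24 block.
// `prefix` is the first three octets, e.g. "192.0.2".
function regexForSlash24(prefix) {
  const escaped = prefix.replace(/\./g, "\\.");
  const octet = "(?:25[0-5]|2[0-4]\\d|1?\\d?\\d)"; // 0 to 255
  return new RegExp("\\b" + escaped + "\\." + octet + "\\b");
}

const blockFilter = regexForSlash24("192.0.2");
```

Applied to a Received: line, `blockFilter.test(line)` would be true for any address from 192.0.2.0 to 192.0.2.255.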

Mike Fedyk

Comment 11

•

22 years ago

While regular expressions are a great tool for identifying patterns, if each person takes the time to keep their regexes up to date with the latest spamming tactics, there will be a very large collective waste of time. Projects like spamassassin are probably a better use of your time. I know: I spent a lot of time with procmail and its weighted scoring, and even with that many spams got through and I had to keep updating it as time went along. You might consider supporting the spamassassin (spamd) protocol and having mozilla filter a message through that, but if you already have a spamd server then you can probably filter at the server level...

Ryan Grove

Comment 12

•

22 years ago

Last time I checked, mail filters weren't just for filtering spam.

Valerio Messina

Comment 13

•

22 years ago

I do not entirely agree with Mike, and I will explain why. I did not understand SpamAssassin before, so I have studied it in the meantime. I see that it is a weighted scoring system with a very long list of checks on the body and headers of a mail. I also tried Spamnix (a SpamAssassin plugin for Eudora on Win32). It works well. But these are programs that automatically identify a mail as spam. It's a great tool, but limiting ourselves to receiving, identifying and moving mail to a spam folder is really dangerous. If you do so, the amount of spam becomes really enormous. In the end you receive 99% spam. Yes, you don't see it, but internet traffic will be all spam. Don't think only of yourself. Spam must be killed before it damages us! My idea is to get spammers kicked off good providers (which is really most of them). Imagine having a SpamAssassin-like tool to identify spam, and then an automatic tool that does my previous points 1 to 3. The spam mail would be forwarded to the admin of the source server, who kicks the spammer out immediately. I do that manually about once a day. Every day I receive responses from admins that say: "We investigate. If we find that a customer is in violation of our policies, we will take the necessary action to stop the activity in question" or, better, "Thanks you for report. I have just now terminated the account responsible for the abuse." This method works; it is only slow because for now it is mostly manual, and maybe I'm the only one using it. Point 4 is only for admins who don't reply or don't kick out the spammer, or for providers that are themselves spammers (really few). Regex filtering requires some time to keep the list of filters updated, but does not require keeping up with the latest spamming tactics, because those stay the same in the headers. Most mail originates from the last (bottom) Received IP address. Mail with forged headers originates from the last IP in the headers that has real DNS. My C code for point 1 is a real alpha, but in the future I think it can correctly identify the real source IP. My C code for point 4 is at CandidateRelease2, really stable and bug free. Point 2 is Unix Whois (it needs a port for Win32 systems). Not much more is needed: some parsing of the Whois report to extract abuse@domain.tld and the IP range registered with IANA, ... It seems to me that the trick to extinguishing spam is an automatic tool to recognise spam (SpamAssassin or the new MozillaMail1.3 filter), and then a generalized automatic tool to stop the spammer. And if such a tool becomes really widespread (with point 4), admins will pay more attention to spammers, because most users can easily filter all the mail coming from a provider. In any case, RegEx filter ability is a great tool for text matching in filters and searches in mailboxes. I want it. :-))

Garth Wallace

Reporter

Comment 14

•

22 years ago

Valerio: that was entirely off-topic, except for the last line, which was a simple "me too". Please don't comment unless you have something to contribute to the bug under consideration.

R.K.Aa.

Comment 15

•

22 years ago

*** Bug 191261 has been marked as a duplicate of this bug. ***

Sander

Comment 16

•

22 years ago

*** Bug 184690 has been marked as a duplicate of this bug. ***

Dennis Daniels

Comment 17

•

22 years ago

I'd posted a dup of this bug, apparently. I was informed:
> > I want to sift out all emails with the subject line containing "jhotdraw". So
> > I create a filter for subject = *jhotdraw*
> Why not simply do "subject" "contains" "jhotdraw" ?
I've been trying to do exactly this ("subject" "contains" "jhotdraw") for as long as I've been running mozilla. As it stands in moz1.3b, no newly created filters are running at all (not connected to this bug, I know).

Sander

Comment 18

•

22 years ago

*** Bug 198273 has been marked as a duplicate of this bug. ***

Aaron Kaluszka

Updated

•

22 years ago

Blocks: eudora

Boris Zbarsky [:bzbarsky]

Comment 19

•

21 years ago

*** Bug 218298 has been marked as a duplicate of this bug. ***

John H. Miller

Comment 20

•

21 years ago

There has been an increase of SPAM recently due to some very prolific viruses. I think that if wildcards (* and ?) were added to "Message Filtering", many of the SPAMs that I am receiving could be filtered out. Usually they are shotgun spams that are emailed to a dozen people with email addresses similar to mine. Wildcards would allow me to easily identify these shotgun spammers.

Sander Goudswaard

Comment 21

•

21 years ago

I need regex for matching SpamAssassin headers. Voting for this bug.

Ashley Bischoff (blog at handcoding.com)

Comment 22

•

21 years ago

Sander: See also bug 224318 - "Bayes filtering should be aware of X-Spam Headers".

Werner Warweg

Comment 23

•

21 years ago

I would like to make my Mozilla even more effective. I am looking for a solution to the following problems: with obvious (and surely deliberate) misspellings, the filter is as leaky as a sieve. For example, I have blocked "Viagra", but the filter lets through V;agra V i a g r a Via.gra It would be helpful if all whitespace and special characters were eliminated *before* the pattern comparison. HTML emails are even more problematic: a text like V<big>i</big>agra Viagra is not recognized, although a human can read it perfectly well! Even funnier: Vi<acd>agra where <acd> can be any combination of characters that spam senders sprinkle in arbitrarily.
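
The pre-processing suggested in this comment could be sketched as below. The function names are hypothetical and the tag stripping is deliberately naive (it is not an HTML parser). Note that a substituted character, as in "V;agra", is not recovered by stripping alone.

```javascript
// Hypothetical normalization applied before the pattern comparison.
function normalize(text) {
  return text
    .replace(/<[^>]*>/g, "")     // drop anything that looks like an HTML tag
    .replace(/[^a-z0-9]/gi, "")  // drop spaces, punctuation, other symbols
    .toLowerCase();
}

function containsObfuscated(text, word) {
  return normalize(text).includes(normalize(word));
}
```

Because spaces are removed, this can also match across word boundaries, so a real filter would likely need a way to limit false positives.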

Jo Hermans

Comment 24

•

20 years ago

*** Bug 213567 has been marked as a duplicate of this bug. ***

Adam Hauner

Updated

•

20 years ago

Keywords: mozilla1.2

Robert Guico

Comment 25

•

20 years ago

In all honesty, the concept of Perl regular expressions doesn't even have to be implemented in its entirety... if I would like to send entire domains to a certain folder, it'd be nice to do *@somedomain.com and have that be my one filter. So a simple implementation of ? (single character match) or * (multiple character match) would do the job for me, and would also be better known than full-blown regex.
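
Simple wildcards of this kind can be layered on top of a regex engine rather than implemented separately. A minimal sketch, assuming a hypothetical `wildcardToRegExp` helper and case-insensitive, whole-string matching:

```javascript
// Hypothetical: translate "*" and "?" wildcards into a RegExp.
function wildcardToRegExp(pattern) {
  const body = Array.from(pattern)
    .map((ch) => {
      if (ch === "*") return ".*"; // any run of characters
      if (ch === "?") return ".";  // exactly one character
      // Everything else is matched literally.
      return ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    })
    .join("");
  return new RegExp("^" + body + "$", "i");
}
```

For example, `wildcardToRegExp("*@somedomain.com").test(sender)` would accept any address at that domain.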

Eyal Rozenberg

Comment 26

•

20 years ago

The fact is that it already _is_ implemented in its entirety in JavaScript (AFAIK), so it's just a matter of using it. No sense in re-implementing a subset of regular expressions, I think.

Valerio Messina

Comment 27

•

20 years ago

(In reply to comment #25) > if I would like to send entire domains to a > certain folder, it'd be nice to do *@somedomain.com The problem is not with real sender addresses, but with fake ones, as in spam. A spammer can easily forge the sender address and domain, but cannot forge the source IP address in the header's Received lines. With regex you can match the whole IANA-registered IP range of a known spam provider like chi....... or hana......

Robert Guico

Comment 28

•

20 years ago

(In reply to comment #27) While I agree with the general concepts, my comments don't relate specifically to spam. I get plenty of mail that is legitimate AND needs to be filed away into a single folder, BUT comes from sources that are just a little bit different in different ways (for example, 61source_dev@domain.com, 61source_stg@domain.com, 81prod_srv2@domain.com). Filtering on subject lines would not be helpful as there have been unpleasant side effects. :) The ability to write a single expression for this would be helpful. I tested a handful of regular expressions that might've worked... they didn't. Eyal... if the JS filtering is there, it's undocumented. :-) On a related note, the reason this comes up is because I don't have the option to create a filter from a message in my inbox (much faster way to create filters). That is a separate feature, however.
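
For the three example addresses in this comment, a single expression might look like the sketch below. The exact shape (two digits, letters, an underscore, then letters or digits) is an assumption inferred from those three samples only.

```javascript
// Hypothetical pattern covering 61source_dev, 61source_stg, 81prod_srv2, ...
const buildHosts = /^\d{2}[a-z]+_[a-z0-9]+@domain\.com$/i;

const samples = [
  "61source_dev@domain.com",
  "61source_stg@domain.com",
  "81prod_srv2@domain.com",
];
const allMatch = samples.every((addr) => buildHosts.test(addr));
```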

Valerio Messina

Comment 29

•

20 years ago

(In reply to comment #25) > if I would like to send entire domains to a > certain folder, it'd be nice to do *@somedomain.com and have that be my one > filter. So a simple implementation of ? (single character match) or * (multiple > character matches) would do the job for me I tried this now with Mozilla Suite 1.7.3 and it works well. Just create a filter with "Sender", "contains" and "@domain.com".

Eyal Rozenberg

Comment 30

•

20 years ago

> Eyal... if the JS filtering is there, it's undocumented. :-) Here's how we use regexps in BiDi Mail UI: http://www.mozdev.org/source/browse/~checkout~/bidiui/source/suite/chrome/content/bidimailpack/bidimailpack-common.js?rev=1.3 Just define it between /..../ delimiters and then call myregexp.test(mystring). Pretty straightforward.

Garth Wallace

Reporter

Comment 31

•

20 years ago

(In reply to comment #28) > > The ability to write a single expression for this would be helpful. I tested a > handful of regular expressions that might've worked... they didn't. > > Eyal... if the JS filtering is there, it's undocumented. :-) It's not that regexp filtering has been implemented (if it was, this would be RESOLVED FIXED), but that JavaScript already has support for regular expressions so it's just a matter of modifying the filter code to use it. No need to implement a new parser.

Myk Melez [:myk] [@mykmelez]

Updated

•

20 years ago

Product: MailNews → Core

robert wegner

Comment 32

•

20 years ago

It's a pity nobody is working on this. Would be a killerfeature for Thunderbird.

Frankie

Comment 33

•

20 years ago

*** Bug 261854 has been marked as a duplicate of this bug. ***

Alex

Comment 34

•

20 years ago

Free Eudora had regular expressions in filters a long time ago. Couldn't believe it when I discovered that Mozilla mail didn't.

Jo Hermans

Comment 35

•

20 years ago

*** Bug 275988 has been marked as a duplicate of this bug. ***

Jo Hermans

Comment 36

•

19 years ago

*** Bug 304428 has been marked as a duplicate of this bug. ***

Mike Cowperthwaite

Updated

•

19 years ago

Depends on: 213567

Jo Hermans

Comment 37

•

19 years ago

*** Bug 337229 has been marked as a duplicate of this bug. ***

Sergio

Comment 38

•

19 years ago

If Thunderbird had basic "wild card intelligence" (which has been around since the DOS age, when we wrote dir file.* and the program gave us the matching files...), it could (with not many more lines) be more efficient and less frustrating at finding INTELLIGENT MATCHES for the filter expression. I know a little about C++ (it's not my best language) and I analyzed the Thunderbird source, so I can give the following CODE SUGGESTION. The diagram just needs to be transformed into equivalent code. VERY STRAIGHTFORWARD and simple. Please take a look, and I hope you give us a Thunderbird UPDATE SOON with "wild card intelligence" in the CUSTOM FILTERS... That would be basic, but it would help us A LOT. The ideal features would include: 1- The possibility of mixing OR and AND statements in the same RULE WINDOW. Nowadays you can use ONLY one of the options... 2- For advanced users, filters with Perl regexps. Ex: re:.+v.*i?a?.*g?.*r?.*a to match: Viagra Visagra Viagbbbrgra and so on... Here is the algorithm I suggest to add "wild card intelligence" to Thunderbird filters. In the image I even indicate the cpp file and the function that needs to be improved. I hope it helps somehow... http://img509.imageshack.us/img509/2923/thunderbirdsuggestion6pa.gif Best Regards Sergio Abreu

Magnus Melin [:mkmelin]

Comment 39

•

18 years ago

*** Bug 358683 has been marked as a duplicate of this bug. ***

John Sullivan

Comment 40

•

18 years ago

I've been taking a look at this and have hacked up some code. It'll need some beating into shape before it is suitable for inclusion, for which I will need some advice from someone more familiar with the mozilla codebase. (My current MUA is showing signs of age, and I'd *really* like to switch to Thunderbird. However, the lack of regex support in the filters means I can't bring over my existing sorting/blacklisting filters, which is an absolute showstopper for me.) There are actually two existing regex implementations within the mozilla codebase:
1) directory/c-sdk/ldap/libraries/regex.c - this is a very limited implementation, far less capable than most people would expect in this age of ubiquitous PCRE support. It's also apparently extended from an old grep implementation in a slightly odd (non-standard) way.
2) js/src/jsregexp.c - this is the one mentioned above. It supports a much more powerful Perl-ish syntax which, I believe from the comments, is specified for Javascript by ECMA. Unfortunately it is heavily tied in with both the JS engine memory allocation routines and the JS engine's internal typedef range. Its symbols are not, without a lot of grotty hacking, even visible from the rest of the mozilla codebase. Even if it were visible, it requires a JSRuntime and JSContext object to operate, which are way too much overhead for a generic facility. It cannot be used directly.
3) (I know I said 2!) There's something in security/nss/lib/util/portreg.c *calling* itself a regexp, but it is clearly just a shell glob with maybe a little regexy-like extension. There is almost identical code in both modules/libjar/nsWildCard.cpp and xpfe/components/filepicker/src/nsWildCard.cpp, which is more accurately named.
God knows whether (1) or (3) are still live code. I've been playing around and am now at the stage where I have working POP3 filtering based on regexes.
I've taken a copy of jsregexp.c, converted it to use PR_ memory allocation and PR typedefs, and it seems to function fairly well. As a proof of concept it shows this bug (after 8 years!) could be addressed without too much pain. At the moment the regex engine is namespaced out of the way and #included directly into mailnews/base/search/src/nsMsgSearchTerm.cpp. Quick-n-dirty with minimal impact, but clearly not a great solution. I suggest that after suitable cleanups it ought to be made a core facility available across mozilla/. The problem is I lack the familiarity (and authority!) to know where it is appropriate to put it so that it is available anywhere it might be needed. (I don't think it could be abstracted out of the JS engine, which means a parallel implementation. The JS version has to rigidly conform to ECMA, whereas I can see people wanting to extend a common internal facility in various ways. The different ways of handling memory management are also a block here. Not pretty, but necessary I think.) The changes to filtering/UI to integrate it are relatively trivial by comparison (even if the mozilla codebase is a huge bloated monstrosity with neither rhyme nor reason to the location of any particular file); however, there are a number of places where it may be useful, and it might be nice to hit all of them in one go. Again, I lack the familiarity to enumerate all such places; my focus has so far been on filtering of POP3 mail. I could do with someone who knows the codebase better to give me a few pointers here... Cheers, John

Comment hidden (advocacy)

Peva

Comment 44

•

17 years ago

OK - thanks, Eval. I wonder if John Sullivan is still around (see comment #40 above).

Matt Dudziak

Updated

•

17 years ago

Priority: P3 → P1

Matt Dudziak

Comment 45

•

17 years ago

(In reply to comment #43) > Peva: All TB development is very low-priority for MoFo/MoCo at the moment (and > in general); IIRC there are only 2 full-time TB/mailnews developers at the > moment, and Mitchell Baker wrote a blog entry basically rationalizing how TB > development won't be getting any of the googlebucks. So changing settings in a > bug page probably won't help much with getting feature added anytime soon... > what you (us) need is to find someone to work on this. > Mozilla may only have 2 people working on Thunderbird, but Qualcomm has 4 additional people working on the same basic code. Granted those 4 are not _full-time_ on Thunderbird / Penelope / Eudora, but they are submitting changes.... The more people we can get involved the better, IMHO. Matt

John Sullivan

Comment 46

•

17 years ago

I am still around. I'd really like to get this done: it's what stops me from upgrading from my current MUA, which a spammer has unwittingly discovered how to crash with stupidly long subject lines. I have a workaround, but a local build of Thunderbird sorts it out completely. The situation is as above. If someone with commit rights wants to say where I ought to put generic library code in a way that won't interfere with other projects, I can make time. I could, I guess, just make something up, request review and sort it out from there if any reviewer is listening and objects to a crude insertion. I should point out that I can make time before the end of the year: I have leave I need to take and could use part of it for this, but I just don't feel up to using spare time during my normal working schedule, when I could be unwinding from existing machine stuff. I'm sure you understand.

Magnus Melin [:mkmelin]

Comment 47

•

17 years ago

John: some related regexp interface design was done in bug 106590. (If you need that it might be worth arguing your case there to get it un-wontfixed.)

Wayne Mery (:wsmwk)

Comment 48

•

17 years ago

(In reply to comment #47) > ... (If you need that it might be worth arguing your case there to get it un-wontfixed.) That seems highly unlikely without core (pun intended) support from those formerly involved in that bug and their strategic (and probably smart) direction of ECMA-262 regexp's / JS_*RegExp API noted in bug 348642. John, Brian says in there "It's pretty easy". Perhaps they'd be glad to have your help. (plus the bugs dependent on bug 106593, bug 32641 and bug 80337, are quite dead so no help gonna come from there)

Comment hidden (metoo)

Comment hidden (advocacy)

Comment hidden (metoo)

Joshua Cranmer [:jcranmer]

Comment 53

•

17 years ago

I want to interject one point into the debate: should we do full JS regexp or wildcard matching. Regexp is much more powerful matching, and I have one use case. The most recent infusion of MI5 spam in Usenet can be filtered out with the matching regexp for sender: ^[vief]+@.*$ Laying out other requirements: * Support both wildcard and regexp? If not, which one? * Matches, contains, or both? * Strings? I like the idea `Sender' `matches regex' [==========] * Needs to be implemented for all of IMAP, POP, NNTP To John: If you need help with the filter code, I can easily provide it.
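
To make the use case concrete, here is how the sender pattern quoted above behaves when evaluated as a JavaScript regular expression. The sample addresses are made up; the sketch assumes the filter sees the bare address rather than a "Name <address>" form.

```javascript
// The pattern from the comment: local part built only from v, i, e, f.
const mi5Sender = /^[vief]+@.*$/;

function matchesSender(address) {
  return mi5Sender.test(address);
}
```

A wildcard such as *@* could not express the restriction on which letters may appear before the @, which is the point of the regex in this case.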

Karsten Düsterloh

Comment 54

•

17 years ago

(In reply to comment #53) > full JS regexp or wildcard matching If we can grab some low hanging wildcard fruit, fine. But as soon as "heavy" coding is involved, we'd probably shoot for RegExp - although not necessarily JS based (having a scriptable interface for the JS RegExp would be truly cool, but I doubt we'll see that). (I've tinkered a bit with how to implement user-defined JS filter actions, but I'm not sure how slow such extensive XPCOM boundary crossing would get.) > * Matches, contains, or both? Not much of a difference with RegExp. > * Needs to be implemented for all of IMAP, POP, NNTP And "movemail" and "none" (local folders).

Matěj Cepl

Updated

•

17 years ago

Flags: wanted-thunderbird3?

Flags: blocking-thunderbird3?

Serge Gautherie (:sgautherie)

Comment 55

•

17 years ago

Filter on "Nobody_NScomTLD_20080620"

QA Contact: laurel → backend

Nobody; OK to take it and work on it

Assignee

Updated

•

16 years ago

Product: Core → MailNews Core

David Ascher (:davida)

Updated

•

16 years ago

Flags: wanted-thunderbird3?

Flags: wanted-thunderbird3+

Flags: blocking-thunderbird3?

Flags: blocking-thunderbird3-

Dan Mosedale (:dmosedale, :dmose)

Comment 56

•

16 years ago

Marking as wanted-, as per the revised driving rules <https://wiki.mozilla.org/Thunderbird:Release_Driving>.

Flags: wanted-thunderbird3+ → wanted-thunderbird3-

Comment hidden (advocacy)

Rob Siklos [:robzilla]

Comment 58

•

16 years ago

I am *not* a thunderbird driver, but I think the way they're doing things is fair. If users, QAers, extension devs, and non-driver devs "want" a bug, they should vote for it. Obviously by virtue of the feature bug existing in the first place, somebody wants it. So unless the "wanted" flag is only for "special" people, it's pretty redundant.

Comment hidden (off-topic)

Dan Mosedale (:dmosedale, :dmose)

Comment 60

•

16 years ago

blocking and wanted have always been part of a mechanism for thunderbird-drivers to help shepherd the highest-impact bugs/features into the tree; nothing has changed there. The wiki page changes did not happen at random; I personally made them on behalf of thunderbird-drivers, to clarify policy changes we've made regarding use of that flag: we didn't think that the way it was being used was really sufficiently helpful w.r.t. helping get the highest impact bugs into the tree. It sounds like you think there should be some separate mechanism for use by the broader community, but it's not clear to me how you envision that working. That bug isn't really the best place for that discussion, I think, but you're welcome to post a proposal to m.d.a.thunderbird or m.d.planning.

Dan Mosedale (:dmosedale, :dmose)

Comment 61

•

16 years ago

Er, "This bug isn't really the best place...."

Wayne Mery (:wsmwk)

Comment 62

•

16 years ago

(In reply to comment #59) > Votes are not version-related. Plus, by now voting is pretty deprecated AFAICT > since votes have been generally ignored. To be fair, I don't think anyone *currently* associated with Thunderbird has discouraged voting by users (tho there are certainly detractors in the mozilla community). And some of us do use votes, for example when attempting to differentiate within an overwhelming number of bugs looking for worthy nominations. But voting's usefulness is limited, and it has its problems - for example it does not equate directly or well to severity nor need. And you can't productively rank bugs against each other, for example a 9 year old bug with 90 votes (like this bug) against a 1 year old bug with 20 votes. Getting back to this bug... > Eh, what the hell, let'em do whatever they want(ed). Nobody seems to listen to > what I/people like me say anyway. I'm guessing your ultimate concern is for this bug to make progress, which is dependent more on the suggested blocker or someone taking interest (anyone touched base with John Sullivan?), and less so on its status per drivers. A relevant bit is also in bug 106590 comment 37.

Joshua Cranmer [:jcranmer]

Comment 63

•

16 years ago

(In reply to comment #62) > But voting's usefulness is limited, and it has its problems - for example it > does not equate directly or well to severity nor need. And you can't > productively rank bugs against each other, for example a 9 year old bug with 90 > votes (like this bug) against a 1 year old bug with 20 votes. I counted the number of TB bugs with > 50 votes filed in the last:
1 year: 0
2 years: 1
3 years: 2
4 years: 4
all time: 44
No bugs in the past 4 years have > 100 votes, the newest being just under 5 years old. Votes have a strong bias towards older bugs, which means that they are a poor approximation of wanted features. > I'm guessing your ultimate concern is for this bug to make progress, which is > dependent more on the suggested blocker or someone taking interest (anyone > touched base with John Sullivan?), and less so on its status per drivers. > relevant bit also in bug 106590 comment 37. I've been reading about C++0x recently, and as that adds native support for regex, it might be worthwhile to ask NSPR to add support for the libraries or put it elsewhere. That's another can of worms entirely, though...

Kent James (:rkent)

Comment 64

•

16 years ago

I recently posted a patch in bug 495519 that implements the ability to add custom search terms to Thunderbird filters, and as a demonstration shows an extension that does a regular expression comparison to the message subject. I'm reasonably confident that a finished patch will be implemented in TB3, so the use of regular expression filter terms in extensions should be possible then. That patch, and its cousin that adds custom filter actions, relies on calling javascript-implemented features from the C++ filter code. If we are willing to take that step, then it is not a big leap to do the same trick in the normal filter code to implement regular expressions. We could implement a javascript XPCOM object to execute a regular expression, and call it from the C++ search code when needed to do the regular expressions. This is not a big project. It also could be done in extensions instead of in the core code though. I'm curious what people think of this approach, with its obvious possible performance issues. I'd probably be willing to do this work if drivers could add wanted+ to this bug. Otherwise I'll just leave it to the imagination of extension writers.
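
The JavaScript side of such an approach might look roughly like the sketch below. This is a plain object, not an actual XPCOM component, and `regexTerm`, `compile` and `match` are hypothetical names. It caches compiled patterns, since the same filter term would be evaluated against many messages, and treats an invalid pattern as "no match" instead of throwing.

```javascript
// Hypothetical stand-in for a regex search term; not a real XPCOM interface.
const regexTerm = {
  cache: new Map(),

  // Compile once per (pattern, case-sensitivity) pair; null if invalid.
  compile(source, ignoreCase) {
    const key = (ignoreCase ? "i" : "") + "/" + source;
    if (!this.cache.has(key)) {
      let re = null;
      try {
        re = new RegExp(source, ignoreCase ? "i" : "");
      } catch (e) {
        re = null; // invalid user pattern
      }
      this.cache.set(key, re);
    }
    return this.cache.get(key);
  },

  // True if `value` (e.g. a subject) matches the user's pattern.
  match(value, source, ignoreCase = false) {
    const re = this.compile(source, ignoreCase);
    return re !== null && re.test(value);
  },
};
```

The cost of crossing from the C++ filter code into JavaScript for each message is not modeled here; that is the performance question raised in the comment.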

Component: Backend → Filters

QA Contact: backend → filters

Dan Mosedale (:dmosedale, :dmose)

Comment 65

•

16 years ago

In general, it seems like an entirely reasonable approach. Because of the potential perf issues, I suspect that the thing that makes the most sense is to code this up as an extension first and do some benchmarking before committing to accept this in the core. As such, I don't think we can say for sure whether it's wanted+ at this point in time.

Matt Dudziak

Updated

•

16 years ago

Whiteboard: [penelope_wants]

sergo

Comment 66

•

15 years ago

Also it would be nice if messages could be searched for using regular expression syntax.

Wayne Mery (:wsmwk)

Comment 69

•

12 years ago

kent's comment 64 is related to https://addons.mozilla.org/en-US/thunderbird/addon/filtaquilla/

URL: https://addons.mozilla.org/en-US/thun...

Priority: P1 → --

Kent James (:rkent)

Comment 70

•

12 years ago

This is a bug that could be easily implemented using a javascript XPCOM component to do the regex processing, and adding an additional choice for text searching as suggested in https://bugzilla.mozilla.org/attachment.cgi?id=719922

Josiah Bruner [:jsbruner]

Comment 73

•

11 years ago

ukrainianconsular@gmail.com, Personal attacks are not appreciated *at all* around here. Clearly you have nothing better to do with your time than to insult extremely helpful and knowledgeable contributors such as Kent. Thunderbird is run ONLY by contributors now and Mozilla has very little to do with its development. If you care so badly about this being fixed, by all means go ahead and fix it yourself, but don't expect the rest of us to do whatever you want. We are all working in our spare time to improve TB (we actually have lives and jobs outside of this), but we can't get to every bug in a timely fashion. There are thousands of filed bugs. Therefore, unless you have something useful to contribute to a development-only site, please just stop. Continuing to attack people may result in a ban on your account. Thanks, Josiah

ukrainianconsular

Comment 76

•

11 years ago

also, i am insisting towards moz devs, because rkent will not add regex filtering for the body section in his addon, because he thinks it affects me & only me through his rejection & my complaint about his abuse of trust. he will not do it & everyone who's expected that function here has been waiting forever.

Eyal Rozenberg

Comment 77

•

11 years ago

(In reply to ukrainianconsular from comment #74)
You cover your valid points with so much hyperbole, with curses, with YELLING, et cetera, that it's really difficult for people with a view similar to yours to sympathize with your message, or even to read through it.
(In reply to Josiah Bruner [:JosiahOne] from comment #73)
Josiah,
- "Very little" is still some. And I'm sure people connected to the Mozilla Foundation/Corporation have a lot of say w.r.t. Thunderbird development.
- Thunderbird development work over the years sure seems to have very strange priorities.
- Many users get extremely frustrated
- For some people, implementing a certain feature might mean 2 days of work, while for others, it might mean many months of agonizing to get to the point where they can add anything significant to the code. I really suggest people remove "so fix it yourself" from the lexicon.
- A 20-year-old mail client should have gotten regex support in message filters many years ago.
Having said that, ukrainianconsular's personal attacks are quite beyond the acceptable.

ukrainianconsular

Comment 78

•

11 years ago

it's just that the 15 years of this thread & the fact that another contributor of an addon ignores a main basic function of his add-on drive me nuts, and that's factual, not hyperbole. if my complaints are beyond the acceptable, i will apologize for having bothered you & will make sure i never post any requests at mozilla.org. it is certain that after having started many threads & obtained no fix for any of them, even without complaining about that form of negligence, it's beyond acceptable to insist. you are right eyal, i shouldn't keep expecting anything after having seen the date this thread was started .. . when email clients r conceived & posted online with regex on their first release, i send to all of the ppl here who have been deeply offended my most humble apologies. keep up the good work.

Eyal Rozenberg

Comment 79

•

11 years ago

(missing bit in my comment) - Many users get so frustrated after waiting many years to see some attention to significant issues they have reported or commented on that they end up losing their temper, like our friend ukrainianconsular. I'm sure he has more than enough on his plate these days in Ukraine without writing comments here... he does it because he cares about making Thunderbird better. Everyone here wants that, nothing else.

ukrainianconsular

Comment 80

•

11 years ago

ur right ((

Comment hidden (off-topic)

opera wang

Comment 82

•

11 years ago

If you check bug 868233, you can see that Kent already has a patch that can help all addon authors work on filters that use the message body. The patch itself actually demos how to use regular expressions for body matching. Also, without that patch, it's not possible to just write an addon that does a regular expression search against the body, because getting the body is currently an async call, and the filter system needs a sync call. Current Thunderbird development is a bit slow, that's true. I do see a few bugs that have patches and may wait months before getting reviewed. That frustrates both contributors and end users. There might be ways to improve this, but I agree with Josiah Bruner: personal attacks don't help.

robert wegner

Comment 83

•

11 years ago

could you move this fruitless discussion to the forum please? I really don't need this via email...

ukrainianconsular

Comment 84

•

11 years ago

Thank you so much opera wang, hoping this patch addresses regex & not javascript since i don't know the javascript language & have no computer knowledge. could you please tell me how to install it by the way, since i have no coding knowledge & don't know where to put that content & how to name the file: https://bugzilla.mozilla.org/attachment.cgi?id=744907
Regex is already a bit complicated for me but i use a magic tool called regulazy that helped me a lot: when i click on regex edit, select parts & right click, i get many choices including "exact-letters-numbers-anything, etc".
This matter has frustrated me a lot because i've been fighting against spam using a personal email that's been posted by inbreds online & collected by all types of extractors-companies or hired spammers to send me a whole imaginary world of spam content. i took the decision to abandon the ordinary Bayesian filters when they appeared on the web; they are useless for my part, since the Bayesian system sends false positives to a spam folder. moreover, there is a spam folder "to deal with/deal with spam", & i can't use that system to delete false positives.
so i started to create complex filters over 20 years of my life, dealing with those spams almost every day. i have now reached almost 2500 complex filters, and many of those filters are very "cerebral" & well thought out, including the fact: Re: i "or" you (depending on the linguistic expression & content) are not present when i receive marketing related words & http is present. that personal technique is like 30% of my spam filter database.
regex for the body was my only accurate possibility because: if people use doesn't contain "i " & there's a word finishing with the letter "i" in the body, a false positive will be triggered, & i cannot allow myself such false positives because my filters are radical & delete the messages without sound, without any attention, without visibility of the matter..
& this is a stupid limitation of all email clients that's never been fixed by an update. i could use " i " but if my letter "i" is at the very start of the body, it will be ignored by tb's filters, because there's no space before it. so we either needed a regex for the body, or an "exclusively contains" option in tb's filters (this would be an evolution in email filtering), which would exclude all letters of the alphabet directly before or after a string/word/letter. using exclusively contains for "i", this should be recognized: "hihi,i " .. . but not this: "oui".
spammers can even amuse themselves reading this message, i will defeat their mind polluting unsolicited spam campaigns. even if they try to obfuscate i or you or http links by playing with tags or colors, i have a solution for them too. & a link that arrives without any text should also be treated by a complex regex for the body .. . i already excluded messages with more than a certain number of emails in the To or Bcc or Cc fields.
this regex option for the body was very important to me because i engaged myself in 20 years of personal fight & my system is actually almost perfect, for an email that receives 100 spam emails PER DAY. if i log in to my email server's account through the web panel, i'll see them all in the trash can & they're purged automatically by the server. & i get like 1 spam email every 7 to 14 days actually.. i've built very complex filters, including for words obfuscated by decomposition among tags, for example.. i've reviewed my spam box many times & have no false positives, but that's hard & long work & a fight against spam. i also lost a dear member of my family & forgot the human vital aspect of my life & experienced affective misery for being so obsessed with it.
today, i'm seeing the end of the tunnel folks & this body regex filtering option adds perfection to my filtering system & allows me to treat tags too without ever having to deal with it again, when some online filtering engines don't even allow it either. I became schizophrenic & it became a psychosis, so that i ended up regretting having ignored that family member, who passed away before i could spend more time with them by having more freedom in my hands. & you can't just sign up for a new email when you've shared that email with thousands of people over decades .. . spam can be very devastating in a life: it starts to slowly irritate you until it invades your life. even spammers from argentina are protected by their laws & spam in total impunity, that's why full spam reports will never stop some spammers.
All i'm missing now is the "My email" not present in the To field nor Cc field, where that filter should check more than one identity, without having to add them to filters & remove them when necessary if i ever get rid of an account. i hope aceman takes care of it.
Again, my sincere apologies to everyone who's felt harassed by my interventions here, but if Eyal hadn't tried to understand me, i probably wouldn't have taken the time to explain my situation & why i am so obsessed with the matter. Because, before you guys feel harassed, trust me, i've been through a lot on this matter .. . Let's move on from this. Thanks.
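The "exclusively contains" idea described in the comment above (match a word only when no letter sits directly before or after it, including at the very start of the body) corresponds to regex lookarounds. A minimal sketch in Python, not Thunderbird code; the helper name `exclusively_contains` is hypothetical, and only ASCII letters are treated as "letters of the alphabet" here, which is an assumption.

```python
import re


def exclusively_contains(text: str, word: str) -> bool:
    """Return True if `word` occurs in `text` with no ASCII letter directly
    before or after it. The start and end of the text count as boundaries."""
    # (?<![A-Za-z]) : the previous character, if any, is not a letter
    # (?![A-Za-z])  : the next character, if any, is not a letter
    pattern = rf"(?<![A-Za-z]){re.escape(word)}(?![A-Za-z])"
    # re.ASCII keeps the case-insensitive matching limited to ASCII letters.
    return re.search(pattern, text, flags=re.IGNORECASE | re.ASCII) is not None


print(exclusively_contains("hihi,i ", "i"))          # the last "i" stands alone
print(exclusively_contains("oui", "i"))              # "i" is glued to "u"
print(exclusively_contains("i will be there", "i"))  # "i" at the start of the body
```

Lookarounds are used instead of `\b` because `\b` also treats digits and underscores as word characters, which is stricter than the "letters of the alphabet" rule in the comment.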

ukrainianconsular

Comment 85

•

11 years ago

News: My private conversation with Opera Wang:
Opera: The patch is C++ code, which requires you to download the source code of TB, patch it and then rebuild the whole of TB; it would be hard if you were not in IT. Also, the patch is about half a year old, which means it is bit-rotted (it can't be applied directly and needs some merge effort).
ukrainianconsular: It means if i use the patch, i should never update thunderbird & condemn thunderbird to the latest actual stable build & never update it again, since i can't expect devs at mozilla to incorporate it for good .. . could you please add it, since i have no coding knowledge ((
Opera: As I said, without the patch, it's (almost) impossible to do regex search in the body. I actually tried it once, with a lot of dirty code, and still couldn't make it work without the possibility of crashing TB.
ukrainianconsular to the devs of bugzilla: So now, i'm hopeless .. . and i believe rkent is blocked because tb needs a code modification before this regex search becomes possible .. . If you guys want to consider that issue, here's rkent's patch: https://bugzilla.mozilla.org/attachment.cgi?id=744907 Apparently, only you can do something about it.

ukrainianconsular

Comment 86

•

11 years ago

Getting rid of FiltaQuilla, which now becomes obsolete for (regex) filtering by headers (to, from, subject) & the missing one: body. Many thanks to opera wang for introducing all those options in the addon he is maintaining: Expression Search / GMailUI 0.8.8 beta http://www.sendspace.com/file/dsfhu4

Comment hidden (advocacy)

Matěj Cepl

Comment 90

•

9 years ago

(In reply to Alastair Gordon from comment #89)
> It cannot be difficult, can it?
Sure it isn't ... just attach a patch to fix it!

Comment hidden (advocacy)

Kent James (:rkent)

Comment 92

•

9 years ago

See my comment 64. That approach is implemented in the addon FiltaQuilla, which has been available for years. The usage of that addon has never been that high (I believe around 10,000 users) and it includes many features in addition to RegExp, so the total actual demand for this feature is not as high as you would think. Because it is easy to do in an addon now, it may be appropriate to just leave it there.

Alastair Gordon

Comment 93

•

9 years ago

I have installed FiltaQuilla, and to the best of my knowledge, it will not apply filters to the BODY of the message, making it essentially useless for filtering spam. If FiltaQuilla does, in fact, allow filtering on the body and is usable by anyone other than a hard-core nerd, then you are right that we already have a solution. Simply supporting a wildcard character in the existing Thunderbird Message Filters would meet 90% of filtering needs. How about someone just adding wildcard capability to existing Message Filters?

Kent James (:rkent)

Comment 94

•

9 years ago

Alastair: Right, body filters are much more difficult, since filters are sync by nature, and body access is async. There is now a technical solution to this that did not exist when FiltaQuilla was written, though it is still a bit of a kludge, and may lead to unstable behavior (filters are crashy enough by themselves). It would probably be possible to add RegEx body filters to an addon such as FiltaQuilla, but frankly I've put almost no effort into that in years, and am unlikely to do so in the foreseeable future. There are much more urgent issues to deal with in Thunderbird at the moment than filter improvements. If someone wanted to attempt that I would be happy to point them in the correct direction.

Alastair Gordon

Comment 95

•

9 years ago

Thank you for the explanation, Kent. I hope someone takes up your kind offer. In fact, simply allowing wild cards ("*") in the existing Message Filters, which can filter on the contents of the BODY, would achieve most of what is needed. How tough would that be?

grzegorz.szyszlo

Comment 96

•

9 years ago

an asterisk "*" is a simple workaround but it isn't enough. do you plan support for unix shell or regexp wildcards? the shell accepts * , ? and [chars] . this is the simplest regexp-like syntax to write. another question: does the asterisk work when it is placed in the subject, sender, receiver or anywhere else?
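The shell-style wildcards mentioned above (`*`, `?` and `[chars]`) map directly onto regular expressions, so a wildcard option could in principle sit on top of a regex engine. A minimal sketch using Python's standard `fnmatch` module, not Thunderbird code; the helper name `wildcard_search` and the sample strings are hypothetical.

```python
import fnmatch
import re


def wildcard_search(text: str, pattern: str) -> bool:
    """Case-insensitive shell-style match of `pattern` anywhere in `text`."""
    # fnmatch.translate converts a shell pattern into a regex string that
    # must match the whole input, so the pattern is wrapped in '*' to allow
    # a match anywhere in the text.
    regex = re.compile(fnmatch.translate(f"*{pattern}*"), re.IGNORECASE)
    return regex.match(text) is not None


print(wildcard_search("Subject: V1AGRA for you", "v?agra"))
print(wildcard_search("invoice 2024", "invoice [0-9][0-9][0-9][0-9]"))
print(wildcard_search("hello", "h[xyz]llo"))
```

The first two calls match and the third does not, since `[xyz]` allows only those three characters in that position.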

Rebeccah

Comment 97

•

9 years ago

Sorry if I'm late to the party, here. Kent, FiltaQuilla doesn't come up when I search for Thunderbird add-ons that do searching - and its description only includes a laundry list of features I don't actually care about (no offense). I care far more about the ability to search e-mail bodies than the ability to use full regular expressions to search only subjects. That said, either regex or boolean search would seem to me to be much more than just a nice-to-have. For me, this has nothing to do with spam - there are other means of filtering spam, and for now TB's "junk" designation is working OK for me. I keep my e-mail until I run out of disk space, and I search old e-mails for specific content *in the body of the e-mail* on a regular basis. In fact, I only within the last month switched to using TB at home rather than my ancient Forte Agent 1.8, because it's easy to use and I haven't wanted to pay to upgrade Forte Agent to a more modern version that will handle the now-ubiquitous HTML e-mails. But I was dismayed to see how e-mail search is implemented in TB. I may go back to Forte Agent after all... I will try some add-ons first, but everything I'm reading seems to indicate that even those that will search other headers may not search bodies (and I understand about the technical difficulty). And recent reviews of FiltaQuilla and of Wang Opera's Expression Search / GMailUI seem to indicate regex searching is currently broken in both of them. Rebeccah

Wayne Mery (:wsmwk)

Updated

•

8 years ago

Comment 98

•

7 years ago

This is supported by FiltaQuilla, super documentation at http://www.digiblog.de/2010/11/regular-expression-mail-filters-for-thunderbird/ Maybe this issue can be reported as RESOLVED?

Valerio Messina

Comment 100

•

2 years ago

Many popular applications let you search and filter by regular expression (think of LibreOffice, but also most common editors), as this is a very useful and common feature.

BMO Automation

Updated

•

2 years ago

Severity: normal → S3

BB

Comment 101

•

2 years ago

https://github.com/Betterbird/thunderbird-patches/blob/main/102/features/14-feature-regexp-searchterm.patch

Matt

Updated

•

6 months ago