187044 - [RFE] Add Challenge/Response system to counter spam

Reporter

Description

•

22 years ago

User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020830
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020830

I'd like to see another anti-spam technique also implemented in Mozilla: a "challenge-response" system. Here's how challenge-response works:

1. If the sender is on the whitelist, accept the email. (Spammers can forge their addresses, but they then have to figure out who to forge as... and anti-fraud measures make this dangerous).
2. If the subject line includes a "password" set by the receiver, accept the message.
3. Otherwise, reply to the sender with a message that's configurable by the receiver-to-be, saying that they need to include the password in the subject line & here's how to figure it out. Spammers won't get the message, or won't read the responses. Real users will include the password.
4. Include various measures to prevent email loops: detect null senders, vacation messages, and remember who you sent replies to (and after a few tries, start dropping them).

Information about this approach is at: http://www.uwasa.fi/~ts/info/spamfoil.html
Another implementation (more recent) is at: http://sourceforge.net/projects/whitelight

Reproducible: Always
Steps to Reproduce:
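The four steps in the description can be sketched as a single triage function. This is only an illustration in Python, not Mozilla code: the names `Guard`, `Verdict`, and `triage`, the limit of three challenges, and the choice to hold (rather than delete) null-sender mail are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    ACCEPT = "accept"        # deliver to the inbox
    CHALLENGE = "challenge"  # send the "here's how to find the password" reply
    HOLD = "hold"            # keep aside, but never reply (assumed behaviour)
    DROP = "drop"            # stop replying and discard


@dataclass
class Guard:
    """Hypothetical per-account state for a challenge-response filter."""
    password: str
    whitelist: set = field(default_factory=set)
    max_challenges: int = 3  # "after a few tries"; the value 3 is an assumption
    challenges_sent: dict = field(default_factory=dict)

    def triage(self, sender: str, subject: str) -> Verdict:
        sender = sender.strip().lower()
        # Step 4: a null sender marks a bounce; replying could start a loop.
        if sender in ("", "<>"):
            return Verdict.HOLD
        # Step 1: known correspondents pass without a password.
        if sender in self.whitelist:
            return Verdict.ACCEPT
        # Step 2: the password anywhere in the subject line is enough.
        if self.password.lower() in subject.lower():
            return Verdict.ACCEPT
        # Step 4: remember who was challenged and give up after a few tries.
        count = self.challenges_sent.get(sender, 0)
        if count >= self.max_challenges:
            return Verdict.DROP
        # Step 3: otherwise, challenge the sender.
        self.challenges_sent[sender] = count + 1
        return Verdict.CHALLENGE
```

A client would call `triage` with the envelope sender and the Subject header of each incoming message, then act on the returned verdict.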

David A. Wheeler

Reporter

Comment 1

•

22 years ago

By the way - if you SEND or REPLY to an address, or even save that email as a non-junk email, you should add the sender to the whitelist for purposes of this approach. That way, if you talk to someone, they no longer need to include the password.
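The auto-whitelisting suggested here could look roughly like the sketch below, which adds every To and Cc recipient of an outgoing message to the whitelist. The function name `whitelist_recipients` and the use of a plain Python set are assumptions for illustration; `getaddresses` is the standard-library address parser.

```python
from email.message import EmailMessage
from email.utils import getaddresses


def whitelist_recipients(outgoing: EmailMessage, whitelist: set) -> set:
    """Add everyone the user writes to onto the whitelist (hypothetical helper)."""
    headers = outgoing.get_all("To", []) + outgoing.get_all("Cc", [])
    for _name, addr in getaddresses(headers):
        if addr:  # skip entries that could not be parsed into an address
            whitelist.add(addr.lower())
    return whitelist
```

Calling this from the send and reply paths means that anyone the user has written to no longer needs the password.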

NorthMan

Comment 2

•

22 years ago

This sounds very similar to TMDA. Is it bug 156744?

David A. Wheeler

Reporter

Comment 3

•

22 years ago

No, this isn't the same as TMDA (bug 156744), though there are some similarities. TMDA also sends a challenge back, but it encodes a password in the email address to reply to. That makes it simpler to reply to initially, but it also has a fundamental weakness: it presumes that spammers never have a valid return address. A spammer with a valid return address can automatically slip the email through TMDA-like systems. ASK has the same problem.

In contrast, if you ask for a password, and give instructions that require human processing, you foil automatic bulk systems. The challenge could be "The password is 'c' followed by 'hess'". Handling that in general requires human processing, and that raises the costs sufficiently that spammers are unlikely to do so.
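As a small illustration of the 'c' followed by 'hess' idea, a client could build the challenge sentence from the receiver's password so the password never appears as one token. The function name `make_challenge` and the `split_at` parameter are assumptions for this sketch; a real challenge would presumably be worded by the receiver.

```python
def make_challenge(password: str, split_at: int = 1) -> str:
    """Describe the password in two pieces so it never appears verbatim."""
    if not 0 < split_at < len(password):
        raise ValueError("split_at must leave a non-empty piece on both sides")
    head, tail = password[:split_at], password[split_at:]
    return f"The password is '{head}' followed by '{tail}'."
```

With the password "chess" and the default split this yields the challenge quoted in the comment.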

David A. Wheeler

Reporter

Comment 4

•

22 years ago

Also, TMDA as they've currently implemented it requires significant cooperation from the mail server. That's because TMDA embeds codes in the "To" address itself. Many users don't have that kind of control over their mail server. It also causes complications when two TMDA users talk to each other.

However, requiring senders to include a password in the subject line is really easy to do - everybody can do that, without fancy reconfigurations. And while it requires senders to actually figure out the password, the benefit is that even spammers who have valid return addresses can't create automated tools to bash through password-protected systems. If sending or replying automatically adds them to the whitelist - or even sending the password adds them to the whitelist - then this becomes very easy to do.

Vidar Haarr (not reading bugmail)

Comment 5

•

22 years ago

Confirmed enhancement. Hw/OS -> All/All.

Status: UNCONFIRMED → NEW

Ever confirmed: true

OS: Linux → All

Hardware: PC → All

Summary: Add Challenge/Response system to counter spam → [RFE] Add Challenge/Response system to counter spam

mozilla.gv6r

Comment 6

•

22 years ago

Which parts of this are not possible with Mozilla now?

OK, part 1 and 2: Use normal message filters, using the available "sender isn't in address book X" option, and "Subject doesn't contain password Y" to move email to the "probably junk" folder.
Part 3: Cannot auto-reply to email using Mozilla right now.
Part 4: No email loop problem, since there are no auto-reply functions in Mozilla.

So, this bug is probably best simplified to a request that auto-reply functions be added to Mozilla. For an example of how NOT to do auto-reply, see valimail.com.

Personally, I think the challenge-response method will not help the junk email problem, since at least some spammers use the "I got a response" method to validate email addresses, and this won't stop spam from taking network resources. If anything, it might increase the overall bandwidth needed to handle spam (plus challenge, plus response, plus argument over whether or not a challenge-response system is needed).

I'm against adding auto-reply or challenge-response to Mozilla.

David A. Wheeler

Reporter

Comment 7

•

22 years ago

> Part 3: Cannot auto-reply to email using Mozilla right now.
> Part 4: No email loop problem, since there are no auto-reply functions in Mozilla.
> So, this bug is probably best simplified to a request that auto-reply functions be added to Mozilla.

No, not quite. Imagine Spammer Spud, who forges an email to Bob claiming to be from Alice. Alice is on vacation, and sends out an "I'm on vacation" message to anyone who sends her a message:

* Bob replies to Alice with a password message
* Alice replies with the vacation message
* Bob replies to Alice with a password message
* Alice replies with the vacation message

and so on. This is easily dealt with using some simple email loop checks. See Timo's notes for more. Many loops can be eliminated simply by setting the SMTP FROM envelope value to <>, noted below.

> Personally, I think the challenge-response method will not help the junk email problem, since at least some spammers use the "I got a response" method to validate email addresses, and this won't stop spam from taking network resources. If anything, it might increase the overall bandwidth needed to handle spam (plus challenge, plus response, plus argument over whether or not a challenge-response system is needed)

I don't agree, for several reasons.

First of all, the address spammers use for "I got a response" validation is rarely the "From" address. Today's spammers usually throw away any replies to the From address, because so many are error messages. This is PARTICULARLY true if the "here's how to make the password" message has an SMTP envelope with the FROM value <> (null), which looks just like other error messages and eliminates many loops. Today's "sucker" lists are often created from the "please unsubscribe me" email address, which is part of the message and NOT the From address.

Now, spammers could modify their code to also detect challenges and add those users to their list. But there's no incentive for spammers to do so. Since spammers have to READ the messages individually to get the password, it won't be worth it to them.

Besides, if a spammer already has your email address, they'll sell it to others, whether or not they "confirm". You're going to get MORE spam, and clog more network bandwidth, whether you challenge or not.

Besides, think long-term. If EVERYONE had good spam protection, it would eventually be not worth it to spam. Then the network bandwidth wastage would be nearly zero. And THAT would be worth it for everyone. Challenge-response would help.
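The "simple email loop checks" for the Bob/Alice vacation scenario might be sketched as follows. The null envelope sender check comes from the thread; the `Auto-Submitted` header (RFC 3834) and the `Precedence` header are not mentioned in this bug and are assumed here as common ways for vacation responders and list software to mark automatic mail. The function name `is_safe_to_challenge` is hypothetical.

```python
from email.message import EmailMessage


def is_safe_to_challenge(msg: EmailMessage, envelope_from: str) -> bool:
    """Return False when replying with a challenge could start a mail loop."""
    # Bounces and other error messages carry a null envelope sender.
    if envelope_from.strip() in ("", "<>"):
        return False
    # Assumption: automatic responders label themselves per RFC 3834;
    # any value other than "no" means the message was machine-generated.
    auto = (msg.get("Auto-Submitted") or "no").strip().lower()
    if auto != "no":
        return False
    # Assumption: the de facto Precedence header marks bulk and list mail.
    precedence = (msg.get("Precedence") or "").strip().lower()
    if precedence in {"bulk", "junk", "list"}:
        return False
    return True
```

A vacation reply that sets `Auto-Submitted: auto-replied` would then never be challenged, which breaks the loop on Bob's side even if Alice's responder keeps replying.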

David A. Wheeler

Reporter

Comment 8

•

22 years ago

I've written a paper to discuss the idea in more detail, analyzing alternatives, etc. Please take a look at: http://www.dwheeler.com/guarded-email

mozilla.gv6r

Comment 9

•

22 years ago

Challenge response is not as useful for business addresses, since every new customer would have to jump through a second hoop to send email to the business. The first hoop is figuring out the email address.

If everyone had "good spam protection", then spammers would simply figure out a way around the most popular spam protection, and use that. People sometimes forget that there are some smart spammers out there, and all it takes is one to write a good "get around spam protection X" system and sell it to the rest.

The other problem is that this system increases the network load (at least temporarily) while 10000 users (or however many turn on the c-r system) send back a challenge to a non-existent email address. On most open relays, SMTP-FROM doesn't have to be a real address, or may be set to the receiver's email address to get around other filters...

If the sending address is a real address, the spammer could easily (and perhaps stupidly) use the responses as verification of reading, and just keep right on sending junk email to that address.

I think the most common spammer trick is to send junk email to every short word at each well-known ISP, and claim that they just sent "X thousand emails to verified addresses". I have one address that has not been used for anything other than receiving spam. After I set it up, it remained spam free for about a month. It's up to one spam a day. The user name is only three letters long, at a well-known ISP. I have another that gets about 33 a day, which is a dictionary word 8 letters long, at a not particularly well known ISP. The 8 letter address can be found on google on two web sites (neither of them my own), and has been in use for 5 years.

If, as you suggest, the challenge comes with SMTP-FROM: <>, most mail servers these days won't deliver it. If they do, they could easily be classed as open relays by some of the relay testers.

The way email works ("accept email to local users" for the majority of email servers today) makes it difficult to block all spam, and still let through those good messages from new addresses (such as people who've read something you wrote). Challenge-response makes it less likely that you'll get the good messages.

David A. Wheeler

Reporter

Comment 10

•

22 years ago

I envision challenge-response as an option, off by default. If you don't think it will help you, by all means don't enable it. If you think it WILL help you, then enable it (by setting the password).

A few counters to your comments:

* "If everyone had "good spam protection", then spammers would simply figure out a way around the most popular spam protection, and use that."

I agree that this is true for filters - which is why I think filters in the long run won't work. But I haven't seen any evidence that challenge-response systems have this problem. If someone broadcasts your password, you can change it faster than spammers can afford to keep re-finding it.

* "The other problem is that this system increases the network load (at least temporarily) while 10000 users (or however many turn on the c-r system) send back a challenge to a non-existent email address..."

Spammers are already using a torrent of network load. If most people used a challenge-response system, there's hope it would dry up. But while network load is important, it's READING TIME that's more important. Besides, if a spammer gets that many challenges, it'll hurt the spammer's source more than anyone else - aiding in finding them.

* "If the sending address is a real address, the spammer could easily (and perhaps stupidly) use the responses as verification of reading, and just keep right on sending junk email to that address."

A simple alternative would be to challenge email sent to a non-address, I suppose.

* "If, as you suggest, the challenge comes with SMTP-FROM: <>, most mail servers these days won't deliver it. If they do, they could easily be classed as open relays by some of the relay testers."

Hmm, that's a problem. Is that really true? The simple solution would be to change the SMTP-FROM address to be the guarded user's email address. Thanks for the note.

* "The way email works ("accept email to local users" for the majority of email servers today) makes it difficult to block all spam, and still let through those good messages from new addresses (such as people who've read something you wrote)."

Yes. We agree on the problem.

* "Challenge-response makes it less likely that you'll get the good messages."

Increasingly, I can't get the good messages BECAUSE of spam. I've already had two separate email outages, where DAYS of email were lost, because of spam. And I have to classify on-the-fly what's left, and sometimes I'll make a mistake & delete legitimate email. I have to run filters, and sometimes those filters will misclassify my email & I'll never see it. Remember, filters ALSO make it less likely that I'll get the good messages.

There _IS_ a danger that a user won't bother to respond to a challenge. Please, let _ME_ (the receiver!) make that determination.

mozilla.gv6r

Comment 11

•

22 years ago

I'm not sure of the truth about blank MAIL-FROM: in the SMTP protocol. I believe it is left to the implementation to decide what to do. I may be wrong about the "most servers discard", it might be only "many servers discard". I know there are some out there that don't seem to care much about what they get, as long as it has a "To:" address, they'll send it.

I've seen c-r systems fail more often than email servers. Many of them leave you guessing as to whether the receiver will ever see the email, or if your response was correct enough to pass the challenge.

'c' plus 'hess' appears to be a reasonable challenge. Anything requiring a user to come up with a good challenge may be doomed. I've been a sysadmin, and I know that there are many people who just want their email to work, and not require such difficult thought. I suppose that if you wanted to avoid receiving email from mathematically challenged people, you could say "Respond with an 11 digit prime number to send email to this address". I'm not actually in favor of that sort of thing, since I've got some friends who have difficulty figuring out what the Subject: line does, and that they could type something into that field. In other matters than computers, they do very well. Don't forget the computer-challenged in the race to stop spam. (A sysadmin saying "be nice"? I must be tired.)

The idea of a client-based c-r is a little better than the web-based ones I've seen. The client idea requires fewer third-party systems, which is a good thing. Some people, myself included, will never trust a third-party challenge.

The other difficulty is, where do you store the challenged email until the response comes back, and how long do you wait in case someone missed the challenge (or it got lost in the net). Two weeks of spam, for me recently, is about 1 MB. Any of those could, in theory, be a human trying to contact me. In practice, it's mostly wasted space, or "historical documents". If you go by the dates on the spam, it could be 1.3 MB per year, but that's only because spammers can't set clocks either.

I don't like the idea of having to send a message twice (the original email, and the response to challenge), and still having no idea whether my response to the challenge got back before the other client deleted the message I sent. Perhaps you need to have a check for whether the message still exists on Bob's machine, and send a "please resend your message" notice to Alice in response to late responses. Then a spammer could have more time to work out the proper response, AND have something to point to that he could claim was a request for more email. Shoot. That doesn't work, does it. Perhaps requiring another c-r cycle for late responses would handle that, but if the response was late in the first place, it'll probably be late again unless you increase the amount of challenged email you keep temporarily.

By the way, if you wanted to cut down on the spam that you see, you could try popfile (popfile.sf.net). After a few months of occasionally retraining it on errors, I have a 97% classification accuracy with about 17 classes. It can usually distinguish between useful messages from friends, and chain letters from the same friends. Very few of the classification errors are "ham mismarked as spam". Since it's basically a trainable probability based filter, the only way a spammer can get through it is by not sending email that looks like spam, or occasionally, by changing to a different product from anything seen in a long time.

Too long, I know, but I ran out of time for editing.
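The storage question raised above (where challenged mail waits, and for how long) could be handled with a small holding queue that releases a sender's messages when the password arrives and expires the rest. This is a sketch only: the class name `PendingQueue`, its methods, and the two-week retention period are assumptions chosen for illustration, not anything proposed in the thread.

```python
from dataclasses import dataclass, field

TWO_WEEKS = 14 * 24 * 3600  # assumed retention period, in seconds


@dataclass
class PendingQueue:
    """Hypothetical store for messages awaiting a challenge response."""
    max_age: float = TWO_WEEKS
    held: dict = field(default_factory=dict)  # sender -> [(arrival_time, message)]

    def hold(self, sender: str, message: str, now: float) -> None:
        self.held.setdefault(sender, []).append((now, message))

    def release(self, sender: str) -> list:
        """Called when the sender answers correctly; returns their held mail."""
        return [message for _time, message in self.held.pop(sender, [])]

    def expire(self, now: float) -> int:
        """Discard messages older than max_age and return how many were dropped."""
        dropped = 0
        for sender in list(self.held):
            fresh = [(t, m) for t, m in self.held[sender] if now - t <= self.max_age]
            dropped += len(self.held[sender]) - len(fresh)
            if fresh:
                self.held[sender] = fresh
            else:
                del self.held[sender]
        return dropped
```

A late response finds nothing to release, which is exactly the case the comment worries about; a longer `max_age` trades disk space for fewer lost messages.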

(not reading, please use seth@sspitzer.org instead)

Comment 12

•

22 years ago

mass re-assign.

Assignee: naving → sspitzer

Myk Melez [:myk] [@mykmelez]

Updated

•

20 years ago

Product: MailNews → Core

Mark Banner (:standard8)

Comment 13

•

20 years ago

*** Bug 194792 has been marked as a duplicate of this bug. ***

(not reading, please use seth@sspitzer.org instead)

Comment 14

•

18 years ago

sorry for the spam. making bugzilla reflect reality as I'm not working on these bugs. filter on FOOBARCHEESE to remove these in bulk.

Assignee: sspitzer → nobody

Serge Gautherie (:sgautherie)

Comment 16

•

17 years ago

Filter on "Nobody_NScomTLD_20080620"

QA Contact: laurel → filters

Nobody; OK to take it and work on it

Assignee

Updated

•

17 years ago

Product: Core → MailNews Core

BMO Automation

Updated

•

3 years ago

Severity: normal → S3

Categories

(MailNews Core :: Filters, enhancement)

Tracking

(Not tracked)

People

(Reporter: dwheeler, Unassigned)

Security

(public)
