Open Bug 265226 Opened 20 years ago Updated 1 month ago

Implement DomainKeys (DKIM/RFC 4871)

Categories

(MailNews Core :: Security, enhancement)

enhancement

Tracking

(Not tracked)

People

(Reporter: fareedrizkalla, Unassigned)

References

(Blocks 1 open bug, )

Details

(Whiteboard: [patchlove][has draft patch])

Attachments

(1 file, 9 obsolete files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.3) Gecko/20041019 Firefox/1.0
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.3) Gecko/20041019 Firefox/1.0

Spam is a growing concern today for any user. Spam filters only succeed by 20%.
I'm noticing developers are mimicing GMail, so why not mimic this new feature.
DomainKeys http://antispam.yahoo.com/domainkeys

Reproducible: Always
Steps to Reproduce:
This is a job for the servers, not the clients : SMTP-servers should add the
DomainKeys-header to your mail, and incoming MTA's can verify the header, which
also proves that the message hasn't been tampered with.

It's not a job for the client like Thunderbird to generate this header, although
it's still possible that a client might do the verification on its own.
Clients could indeed either generate the DK signature or verify one for inbound
mail... Generating one is for the extreme geeks -- they need to create a key and
put it into their DNS.  Verification, however, is much more reasonable, though
there will certainly be mail servers that completely break the signature during
the delivery process.
This is an automated message, with ID "auto-resolve01".

This bug has had no comments for a long time. Statistically, we have found that
bug reports that have not been confirmed by a second user after three months are
highly unlikely to be the source of a fix to the code.

While your input is very important to us, our resources are limited and so we
are asking for your help in focussing our efforts. If you can still reproduce
this problem in the latest version of the product (see below for how to obtain a
copy) or, for feature requests, if it's not present in the latest version and
you still believe we should implement it, please visit the URL of this bug
(given at the top of this mail) and add a comment to that effect, giving more
reproduction information if you have it.

If it is not a problem any longer, you need take no action. If this bug is not
changed in any way in the next two weeks, it will be automatically resolved.
Thank you for your help in this matter.

The latest beta releases can be obtained from:
Firefox:     http://www.mozilla.org/projects/firefox/
Thunderbird: http://www.mozilla.org/products/thunderbird/releases/1.5beta1.html
Seamonkey:   http://www.mozilla.org/projects/seamonkey/
This bug has been automatically resolved after a period of inactivity (see above
comment). If anyone thinks this is incorrect, they should feel free to reopen it.
Status: UNCONFIRMED → RESOLVED
Closed: 19 years ago
Resolution: --- → EXPIRED
Status: RESOLVED → UNCONFIRMED
OS: Windows XP → All
Hardware: PC → All
Resolution: EXPIRED → ---
Summary: Implement DomainKeys → Implement DomainKeys (DKIM/RFC 4871)
Version: unspecified → Trunk
Assignee: mscott → nobody
Status: UNCONFIRMED → NEW
Component: General → MailNews: Security
Ever confirmed: true
Product: Thunderbird → Core
QA Contact: security
I'm interested in helping make this happen!
Product: Core → MailNews Core
I strongly support this feature - which to be useful really needs to be 'out of the box' for users.

May I suggest 2 modes - and the first part should be straightforward to implement. I think it should support both Domainkeys and DKIM. Yahoo has remained with DK while gmail checks both but favors DKIM.

   (i) Use existing Authentication-Results: header information.

       In this mode we rely on the ISP mail server to verify. We can allow user to (optionally) supply the mail server which identifies itself in the same header (e.g. mx.google.com) - all we need to check for is dkim=pass/fail or domainkeys=pass/fail in the header.

     
     I'd suggest for each mail account that there is a whitelist of what should be in the Authenticated-Results header (mx.google.com etc) - which must be activated by the user. We'd like to avoid spoofed headers caused by whitelisting it say for mx.google.com when the current account is not a gmail account and has a mail server which does not check dk/dkim. To avoid this, the user must turn the whitelist on - we can offer help in this (say mx.google.com) for gmail hosted accounts for example. 

    (ii) Once (i) is working - we could with some substantial effort link in the library to allow the client to (re-)verify the hashes - 

    I'd propose releaseing (i) as soon as possible.

  (i) is easy to simulate using filters and tags for those somewhat technically inclined but I'd much prefer a simple our of the box solution and by working out of the box (almost) for yahoo and gmail we'd cover a pretty large part of the mail community.

   Thank you for suggesting this .. 

gene/
Continuing on from comment #8 RFC5451 (http://tools.ietf.org/html/rfc5451) defines a standard for Authenticated-Results headers to parse in a validation mechanism agnostic manner.

The authserv-id of the standard should be a parameter defining the ADMD for the user in a similar what to what is mentioned.

I'm not sure (ii) above should be a goal. DNS keys don't lend themselves to archive integrity. This is mentioned in on of the DKIM RFCs.
I agree with sticking to (i) - thanks for pointing out the ietf.org link.
No longer blocks: 381879
Depends on: 545866
This is a WIP patch for testing DKIM/SPF via the Authentication-Results header in messages. What it does:

 - Check for "dkim=fail" and triggers the phishing warning if so
 - Check for "spf=fail" and triggers the phishing warning if so

What it does not do:

 - Function at all without View->Headers->All enabled. currentHeaderData doesn't seem to contain authentication-results otherwise; if anyone knows an alternate way at getting at the data, I'd appreciate it.
 - Check that the authentication-results header is valid. I'm not sure the best way to do this. On the other hand, it only checks for "fail" warnings, so a spammer who spoofs "pass" in his authentication-results header doesn't achieve anything special.

The patch is not yet fully tested. Also, I don't have a local hg copy of comm-central, so it's based on an unpacked omni.jar. You get the idea, though.

Feedback would be greatly appreciated, both on the ideas behind the patch (what should pass, what shouldn't) and the technical execution. I'll work more on this soon.
Attached patch diff using hg (obsolete) — Splinter Review
Proper hg diff. Furthermore, I've just tested this against a test message using dkim=fail and spf=permfail, and can confirm it triggers the scam warning.

If anyone knows how I can get access to the Authentication-Results header without View->Headers->All being set, that'd be great, since that's my next step.
Attachment #499472 - Attachment is obsolete: true
Attached patch patch v0.1 (obsolete) — Splinter Review
I think I've found the solution to the headers problem, although I don't have a build environment set up, so I can't test the change. On the JS side, however, I expanded this to cover dkim, spf, and dkim-adsp in the Authentication-Results field. If any of those are marked failed, the email is flagged as a scam.

So far, it tests correctly. Some other things I need to do:

 - Decide what to do in the case of multiple Authentication-Results headers
 - Decide if we should really flag a message as a scam if, say, SPF checks 
   but DKIM fails
 - Decide how to treat other Authentication-Results failures, like "policy"
 - Construct tests

Would it be appropriate for me to find whoever owns this code and flag them feedback?
Attachment #499473 - Attachment is obsolete: true
Well I think its MTA to deal with multiply headers and leave only one. We should use recent one(highest), which is usually stamped by trusted MTA.
DKIM should have more weight than SPF.

Ludo, can you suggest someone who could give feedback about this?
Okay, I'm looking at the RFC to see what it says about multiple headers. Section 1.6:

   Hence, use of the header field depends upon
   ensuring that mail entering the ADMD has instances of the header
   field claiming to be valid within its boundaries removed, so that
   occurrences of such header fields can be used safely by consumers.

However, section 5 says:

   For simplicity and maximum security, a border MTA MAY remove all
   instances of this header field on mail crossing into its trust
   boundary.  However, this may conflict with the desire to access
   authentication results performed by trusted external service
   providers.  It may also invalidate signed messages whose signatures
   cover external instances of this header field.  A more robust border
   MTA could allow a specific list of authenticating MTAs whose
   information should be let in, removing all others.

http://tools.ietf.org/html/rfc5451

Removing the previous instances of the header is therefore not a requirement but a suggestion. 

However, it's my understanding that headers are parsed into currentHeaderData such that duplicate headers are merely appended onto the end of the header value. In such a case, the authHeader.match() line in the patch would return multiple results, with the most trusted header coming first, and the method only counts the first match. Hence, a message marked spf=pass by the original sender but marked spf=fail by the user's MTA would be considered failed. This seems like the right behavior.

Also, if someone spoofs dkim=fail in their message, and Thunderbird detects that header and trusts it falsely, I think it's safe to mark the message as questionable anyway...

I hope someone can confirm that my understanding of Thunderbird's header parsing is correct.

Now, as for the other question -- if SPF checks out and DKIM fails, and so on -- how would we approach this? Consider a table:

   DKIM | SPF  | Scam?
  --------------------
   Pass | Pass | No
   Pass | Fail | No
   Fail | Pass | Yes?
   Fail | Fail | Yes

Does that seem reasonable? If a dkim-adsp= tag exists, I'll consider that as a replacement for the dkim= tag, so they're equivalent in the table.

I'll implement this logic, give it a bit of testing, and then flag someone for feedback.
Attached patch patch v0.2 (obsolete) — Splinter Review
This implements the logic in comment 16, and I've done some testing to make sure it works. Still needs exhaustive testing for the weird cases, so I'll try to dig up the automated tests for the phishing detection and make something based on that... (Also, the change in nsMimeHtmlEmitter.cpp hasn't been tested.)

Flagging philor for feedback, on the basis that he's reviewed the most recent substantive changes to phishingDetector.js, although that was still over two years ago. I apologize if I'm flagging the wrong person or shouldn't be flagging anyone at all; this is my first patch and I'm not entirely sure of the rules.

If this is something you actually want in Thunderbird, I'll see if I can devise some tests for this. DKIM and SPF are important anti-spam and anti-phishing tools, used by the big players like eBay, Facebook, PayPal, and so on, and taking advantage of them would help protect Thunderbird users.
Attachment #499559 - Attachment is obsolete: true
Attachment #499624 - Flags: feedback?(philringnalda)
Thx for working on this. Can't the authentication results header be easily spoofed? I.e., is there a lot of value to doing i without ii?
Here's my reasoning: Suppose we have a MTA that does no verification whatsoever, so it leaves any previous Authentication-Results headers in the message. Suppose the sender, a phisher, adds Authentication-Results: dkim=pass, spf=pass. We now trust the message no more or less than we would have before, and nothing is gained or lost. The results do not prevent us from doing the other URL-based phishing checks.

Suppose we have a MTA that does verification and adds Authentication-Results headers. The sender adds their own header, but the MTA may strip it or add its own before it. The patch trusts the earlier headers more (implicitly), so we aren't fooled by the spoofing.

The only spoofable case I can think of atm is if the MTA only checks SPF, and leaves old headers intact. It might add Authentication-Results: spf=fail, but the sender's spoofed dkim=pass would override that, leaving us fooled by a phish attempt.

If there's a way to only get the first Authentication-Results header, I could use that.

(ii) is not feasible anyway. DKIM makes no guarantee that the key will stick around in DNS for very long; you're free to remove it after a period of a day or two, since it should have been checked by the MTA immediately after delivery.
Indeed - thank you!!


The header has the server name - so a person using gmail should expect a real
header with 'mx.google.com' .. if there is a spoofed header with google in it
if it survived or gets deleted - either way if there is at least one fail
header it should indeed be considered failed - I think this is what is being
implemented.

 As for the table of pass/fails if both DKIM and SPF are present - DKIM is the
only relevant marker - if there is no DKIM then one could look at SPF - but
this is weaker ... if colors were involved i'd have DKIM be green/red .. and
SPF be blue/yellow or somthing .. 

 Is it also possible to show the count of fails (in case of multiple headers)
for each of DKIM/DOMAINKEYS/SPF - even if its a popup or a click for details or
something .. 

 Thank you so much for doing this ... I believe this will set TB apart from all
other mailers ... and I am sure others will follow suit ...

Alex comment about (ii) - seems reasonable -  I am fine with letting the MTA's do the checking - however, as an observation 

 in practice the certs (of the majors anyway) do not change very often ... and when they do they should have a different tag - so the old ones should still be available for a reasonable period of time by the older tag - at least thats what I believe - tho I could be wrong.
(In reply to comment #20)
> The header has the server name - so a person using gmail should expect a real
> header with 'mx.google.com' .. if there is a spoofed header with google in it
> if it survived or gets deleted - either way if there is at least one fail
> header it should indeed be considered failed - I think this is what is being
> implemented.

Unfortunately RFC 5451 section 2.3 specifies that the header field contain a unique identifier, but does not require that it be the server name. It could be anything.

>  As for the table of pass/fails if both DKIM and SPF are present - DKIM is the
> only relevant marker - if there is no DKIM then one could look at SPF - but
> this is weaker ... if colors were involved i'd have DKIM be green/red .. and
> SPF be blue/yellow or somthing .. 

What I do is look at SPF only if DKIM is not present or gave an error; for example, the MTA might have found a DKIM-Signature header, but couldn't find the key in DNS to see if it's valid. In those cases, I trust SPF. When DKIM gives a firm "pass" or "fail", I trust it instead.

>  Is it also possible to show the count of fails (in case of multiple headers)
> for each of DKIM/DOMAINKEYS/SPF - even if its a popup or a click for details or
> something .. 
I'm sure it's possible, but I haven't implemented this. For the moment I think it's best to be simple, and just trigger the scam warning if the message is suspect.

>  in practice the certs (of the majors anyway) do not change very often ... and
> when they do they should have a different tag - so the old ones should still be
> available for a reasonable period of time by the older tag - at least thats
> what I believe - tho I could be wrong.

That's true, but there's no guarantee that that old one will be around forever. If Thunderbird were to do its own DKIM checking, it'd have to happen as soon as it receives the message, to maximize the chances of finding the key. I'd honestly rather take the MTA's word for it, though, since it'll be configured to handle interesting scenarios. 

For example, mailing lists can break DKIM signatures by adding headers or appending content to the message, and the MTA may be configured to whitelist those messages. Or the MTA may be told to trust a certain external server's Authentication-Results headers. We should place the configuration burden on the MTA's administrator, not the user.
Alex, Thanks for starting an implementation.

> What it does not do:
> - Check that the authentication-results header is valid. I'm not sure the best
way to do this

The authserv-id tags  be 'trusted' need to be managed by the user unfortunately. My thoughts were to present a greyed out ticks and crosses corresponding to pass/fail from various auth-ids. If 'trusted' they can go green and if not the user can mark them untrusted to hide them.

'Scam' based on your DKIM/SPF logic is good though without more informed user control it may not mean much.

Regular expressions need to be fairly exact or someone will report it as a security vulnerability.

[a-z] kind of ranges don't work with all locales so best use regex classes if really needed.

Still recommended dropping (ii). You can use opendkim (BSD) licensed to validate the email message if you fetch the dkim key at reception and place it in some temporary store.

"We should place the configuration burden on the MTA's administrator, not the user."
When the MTA is a ISP or other big organisation the responsibility generally falls the other way.
(In reply to comment #22)
> The authserv-id tags  be 'trusted' need to be managed by the user
> unfortunately. My thoughts were to present a greyed out ticks and crosses
> corresponding to pass/fail from various auth-ids. If 'trusted' they can go
> green and if not the user can mark them untrusted to hide them.
> 
> 'Scam' based on your DKIM/SPF logic is good though without more informed user
> control it may not mean much.

I really can't see a user having any idea what this means. If there were a way of determining that the Authentication-Results header was added by the final MTA -- by its location among the Received headers, for example -- I'd go for that. 

Trusting the first header I see seems valid to me, though, since we don't stamp verified messages with a special "Signed!" message (so spoofing doesn't gain you anything) and spoofing a dkim=fail means you deserve a scam warning anyway.

> Regular expressions need to be fairly exact or someone will report it as a
> security vulnerability.
> 
> [a-z] kind of ranges don't work with all locales so best use regex classes if
> really needed.

I agree, the regex will need scrutiny. Since I'm unfamiliar with the code base, there might be some premade tokenizer I could use to make this better.

Unfortunately my regex skills are weak, so if someone wants to look at section 2.2 of RFC 5451 and concoct a stronger regex, please do. The header format is defined strictly, so it shouldn't be too hard to parse if you know more about regex than I do.

> "We should place the configuration burden on the MTA's administrator, not the
> user."
> When the MTA is a ISP or other big organisation the responsibility generally
> falls the other way.

Having heard stories about users in large organizations, I'm not sure I'd want to trust them any more than the MTA administrator...
(In reply to comment #19)

> We now trust the message no more or less than we would have before,
> and nothing is gained or lost. The results do not prevent us from doing the
> other URL-based phishing checks.
> 

Ah, thx, makes sense.
I agree - (i) is where the focus should be ... thanks for reading the RFC too
... again in practice the servers use a "server-like" name when checking DKIM -
certainly thats seems to  happen naturally when setting up DKIM milter.

Again thank you for pushing this - this will set thunderbird well above other
mailers - once this is in the wild it should be marketed a lot as a valuable
new feature.
I agree. I hope that pushing DKIM leads to wider use and cuts down on phishing.

So, would Thunderbird implement this as I've written it? If so, I'll try to devise testcases and further improve the code; feedback on the patch itself would be appreciated. Otherwise, maybe I'll package it as an extension...

(Also, merry Christmas, and a spam-free New Year...)
Comment on attachment 499624 [details] [diff] [review]
patch v0.2

Sounds like bienvenu's a better bet for feedback - I'm willing to believe I reviewed touching of phishingDetector, but I certainly can't remember what touching it was.
Attachment #499624 - Flags: feedback?(philringnalda) → feedback?(bienvenu)
(In reply to comment #15)
> Ludo, can you suggest someone who could give feedback about this?

Bienvenu is probably the right person to ask feedback.

Would be nice to add some unit tests too.
Flags: in-testsuite?
Once someone's verified that I'm taking the right approach (that we want to throw the scam warning based on the criteria in comment 16) I'll look at how the other phishing tests work and then write some more.

I've also learned that there can be "domainkeys=pass" and so on on the Authentication-Results header; I'll have to add support for that, with DKIM (the newer standard that replaces DomainKeys) taking higher priority.
For the regex, here's the relevant portion of the spec:

        tag-list  =  tag-spec 0*( ";" tag-spec ) [ ";" ]
        tag-spec  =  [FWS] tag-name [FWS] "=" [FWS] tag-value [FWS]
        tag-name  =  ALPHA 0*ALNUMPUNC
        tag-value =  [ tval 0*( 1*(WSP / FWS) tval ) ]
                          ; WSP and FWS prohibited at beginning and end
        tval      =  1*VALCHAR
        VALCHAR   =  %x21-3A / %x3C-7E
                          ; EXCLAMATION to TILDE except SEMICOLON
        ALNUMPUNC =  ALPHA / DIGIT / "_"

I think the proper regex is this:
  /(?:^|;)\s*dkim\s*=\s*([!-:<-~\s]*?)\s*(?:;|$)/

I'm not sure if header values that span multiple lines have folding whitespace removed by the time you read it. If so (or if you want to do this manually before you parse), the regex simplifies to this, but you have to check that the tag-value has no leading or trailing spaces:

  /(?:^|;)dkim=([!-:<-~\s]*)(?:;|$)/


For people who want to check my work (regexes are pretty much write-only code), here we go:

1) From tag-list, we get this to describe where a tag-spec is:
   /(?:^|;)<<tag-spec>>(?:;|$)/
2) A tag-spec looks like:
   /\s*dkim\s*=\s*<<tag-value>>\s*/
3) A tag-value looks like (non-greedy star so we don't match trailing spaces):
   /([<<VALCHAR>>\s]*?)/
4) The characters listed in VALCHAR are ! : < ~, so tag-value is now:
   /([!-:<-~\s]*?)/
5) Put this mess together and you've got:
   /(?:^|;)\s*dkim\s*=\s*([!-:<-~\s]*?)\s*(?:;|$)/
Hmm, actually it might not be possible to remove all those "\s*" from the regex if folding whitespace is removed. I'm not 100% sure how the spec handles that stuff.
Awesome, thanks! I'm not sure what Thunderbird does to the headers when it parses them, but I can easily test to see how newlines are handled; if they're not removed, I can strip them manually easily enough. I can also check on folding whitespace.
Hm. The regex doesn't quite work on this test header:

  Authentication-Results: mail.xyloid.org; dkim = pass
	(1024-bit key; insecure key) header.i=@gmail.com; dkim-adsp = fail

I run this:

  /(?:^|;)\s*dkim\s*=\s*([!-:<-~\s]*?)\s*(?:;|$)/im

and it matches:

  pass (1024-bit key

Is there a change you can make to not parse up to the semicolon?
Oops, that regex was for DKIM-Signature, not Authentication-Results. It's going to be tough to parse Authentication-Results by regex, since it has CFWS (comment/folding white space), but this should succeed on non-pathological headers:

  /;\s*dkim\s*=\s*(\S*)/

I don't think you want case-insensitive regexes for this, by the way.

I could probably figure out how to strip out comments to make this regex actually correct, but I'm not sure it's worth it.
Hm. I had the impression that the spec was case-insensitive, but it appears that's only specific fields.

If anyone here regularly receives DKIM-signed mail and has Authentication-Results headers, could you post those headers so I can use them as testcases? If you have failed messages (perhaps spam messages coming from a domain with ADSP, or with SPF failures), could you post the header from those too?

I'll upload a revised patch soon, and then start working on testcases. (If anyone can point me to the existing phishing tests, that'd be awesome.)
Whoops, no, the regex in comment 34 matches this:

  fail;

for this header:

  Authentication-Results: mail.xyloid.org; dkim=pass
	(1024-bit key; insecure key) header.i=@gmail.com; dkim-adsp=fail; spf=pass

Without trailing semicolon would be nice for string comparison purposes. Stripping it with str.replace would be easy enough too, if it turns out to be impractical to create a regex otherwise.
Hmph. Some mxr and hg digging reveals no pre-existing phishing tests. What's the approach to take when making these tests, then? I'd like to provide test messages and see if they trip the phishing warning, but I have no idea how to set up a test when there's no pre-existing tests.
Are you thinking about the difference between

  (1) known good domain (gmail, yahoo) having signed it

  (ii) unknown domain having signed it

  (iii) faked auth line - not really signed at all.

If there is a list of 'good domains' which the user can add to this white list - then we know that mail from those domains if passing auth test is actually good - or bad if it fails.

 for (ii) and (iii) we should be a little more cautious since we're not actually checking the signature ..

gene
Since the patch performs no actions when the DKIM tests pass, it doesn't matter to me if the Authentication-Header is fraudulent. We don't stamp the message with a "This Message Is Genuine!" label.

The case that worries me is if your ISP adds the Authentication-Results header without stripping earlier headers. If they don't do ADSP checks, you could add dkim-adsp=pass in the fraudulent header, and the ISP could add dkim=fail in the (trustworthy) header. But ADSP has priority, so the failure would be ignored.

If there is a way to only read the first header in the message, I will use that and these issues will be avoided. But some ISPs add multiple Authentication-Results headers -- one for each test.
I guess I was thinking more of something like this:

  Extract most recent Received: header - determine if the 'server' is in the whitelist 

  Received: from .* by <host>
 
  e.g. host ~ *.yahoo.com might match whitelist .yhaoo.com
RFC 5451 section 7.1:

   An MUA or filter that accesses a mailbox whose mail is handled by a
   non-conformant MTA, and understands Authentication-Results header
   fields, could potentially make false conclusions based on forged
   header fields.  A malicious user or agent could forge a header field
   using the DNS domain of a receiving ADMD as the authserv-id token in
   the value of the header field, and with the rest of the value claim
   that the message was properly authenticated.  The non-conformant MTA
   would fail to strip the forged header field, and the MUA could
   inappropriately trust it.

This is why I do not use Authentication-Results to trust a message -- I use it only to cast doubts on messages. Forged headers are not an issue if I only look for failure, not success -- if someone forges dkim=fail, well, they deserve a scam warning.

A conforming MTA that leaves prior Authentication-Results headers in the message is asking for trouble, and there is little I can do to determine which header is real and which isn't. A MTA may add multiple Authentication-Results headers, so I cannot just read the first one; a malicious MTA can use whatever authserv-id it wants to, so I can't just use that. Conforming MTAs should strip untrusted headers (though they are not required to).

The RFC also notes that I could register "faceb00k.com" and send fake email from there with valid DKIM signatures. Hence dkim=pass should not be taken to mean "this message is trustworthy." I only trust dkim=fail.

tl;dr: fraudulent headers don't matter, because all they can do is make me trip the phishing warning.
Ah yes - you said this earlier - good point ... :-) I'm with you now.
(In reply to comment #36)
> Whoops, no, the regex in comment 34 matches this:
> 
>   fail;
> 
> for this header:
> 
>   Authentication-Results: mail.xyloid.org; dkim=pass
>     (1024-bit key; insecure key) header.i=@gmail.com; dkim-adsp=fail; spf=pass
> 

Whoops, that's what I get for not testing. Fixed: /;\s*dkim\s*=\s*([^\s;]*)/
To test, you can probably look at these tests: http://mxr.mozilla.org/comm-central/source/mail/test/mozmill/message-header/test-message-header.js as a basis. Note the use of "clobberHeaders" to add custom headers to the message.
Attached patch patch v0.3 (obsolete) — Splinter Review
Updated patch. Uses new regex that handles whitespace correctly (thanks Jim), and now checks for DomainKeys as well as DKIM and ADSP.

Rather than trying to prioritize ADSP over DKIM over DomainKeys (which means a message with invalid DomainKeys signature could have a fraudulent dkim=pass and get through), I just say that if any of them fails, the message is suspicious. Saves me the trouble of deciding which header to believe.

Tests are forthcoming, once I figure out how that whole system works.
Attachment #499624 - Attachment is obsolete: true
Attachment #500557 - Flags: feedback?(bienvenu)
Attachment #499624 - Flags: feedback?(bienvenu)
This sounds interesting, so I've made a TB build with your patch. But how can I now test this feature, do I need a special email with a DKIM digital signature? Or is there a way to test this with "regular" emails?
You'll need DKIM-signed email and an email account with a host that checks it. Easy way:

 - Get a Gmail account and a Yahoo Mail account. Add the Gmail account in Thunderbird with IMAP.
 - Send email from Yahoo Mail to your Gmail address.
 - View the message in Thunderbird once it's received in your Gmail account.

Yahoo adds DKIM and DomainKeys signatures, and Gmail will add the Authentication-Results header when you receive the message. DKIM has also been implemented by sites such as Facebook and eBay, so if you already get Facebook mail at your Gmail account, you can test that.

My email server is set up for DKIM signing as well, so I can send a test message to any address you want...
(In reply to comment #45)

> Tests are forthcoming, once I figure out how that whole system works.

The documentation is at https://developer.mozilla.org/en/Thunderbird/Thunderbird_Automated_Testing if you have questions feel free to ask them in mozilla.dev.apps.thunderbird.
Assignee: nobody → alex+list
Status: NEW → ASSIGNED
Attached patch patch v0.4, now with moar tests (obsolete) — Splinter Review
New patch, now with MozMill test. The test runs, but without recompiling to include my change in nsMimeHtmlEmitter.cpp I can't be sure it works correctly.

As far as I know, the patch is complete. Since I don't have a build system set up I can't test the .cpp changes, but otherwise it's ready for feedback/review.
Attachment #500557 - Attachment is obsolete: true
Attachment #500958 - Flags: feedback?(bienvenu)
Attachment #500557 - Flags: feedback?(bienvenu)
Attachment #500958 - Flags: feedback?(bienvenu) → review?(bienvenu)
No longer depends on: 545866
the test fails, even with the .cpp change :-(

TEST-PASS | setupModule
TEST-UNEXPECTED-FAIL | c:\builds\tbirdhq\mail\test\mozmill\message-header\test-p
hishing-bar.js | test_auth_header_scam_warning
  EXCEPTION: This Authentication-Results header should trigger the scam warning:
 mx.example.com; spf=hardfail (example.com: domain of spam@example.com does not
designate 192.0.2.4 as permitted sender)...true
    at: test-phishing-bar.js line 106
       checkScamBar("mx.example.com; spf=hardfail (example.com: domain of spam t
est-phishing-bar.js 106
       test_auth_header_scam_warning() test-phishing-bar.js 92
            frame.js 474
            frame.js 530
            frame.js 573
            frame.js 426
            frame.js 585
            server.js 164
            server.js 168
TEST-PASS | teardownModule
make: *** [mozmill-one] Error 1
My apologies; the test is detecting that somewhere along the line I lost my spf=hardfail detection in the patch. (The RFC says it should be spf=hardfail, but errata indicate it's actually spf=fail; gmail implements hardfail, so we should detect both.)

(Is there a way of making a test failure not end the test immediately? I'd like to test each header in the array without dying immediately upon one header failing. Do I have to make each header a separate test?)

Anyway, new patch. If the tests continue to fail, perhaps there really is something wrong with the .cpp change, and I'll have to set up a build environment...
Attachment #500958 - Attachment is obsolete: true
Attachment #501221 - Flags: review?(bienvenu)
Attachment #500958 - Flags: review?(bienvenu)
(In reply to comment #51)

> 
> (Is there a way of making a test failure not end the test immediately? I'd like
> to test each header in the array without dying immediately upon one header
> failing. Do I have to make each header a separate test?)

yes, you could make each header a separate test_ function in the test file.

I'll try the new patch tomorrow.
Updated to use one test per header, and tested to be sure the tests at least run. This way, at least if tests fail with the patch applied and compiled, I can tell what's wrong.
Attachment #501221 - Attachment is obsolete: true
Attachment #501255 - Flags: review?(bienvenu)
Attachment #501221 - Flags: review?(bienvenu)
Is there any way you could split the regexp part as a separate library function that I could use from Thunderbird Conversations? (Anywhere in mail/base/content would be fine.) That function would take the authentication-results header and return a boolean telling whether the caller should set the phishing msg. That way, I could easily integrate the phishing warning in Thunderbird Conversations.

Thanks!
Would a helper function in phishingDetector.js, inside of gPhishingDetector, work fine? I can just split out the guts of analyzeAuthenticationHeader to another function which returns a boolean.
That sounds good to me, thanks :)
Now with checkAuthenticationMethods helper that can be passed the Authentication-Results header of any message and returns a verdict. Jonathan, will this work?
Attachment #501255 - Attachment is obsolete: true
Attachment #501336 - Flags: review?(bienvenu)
Attachment #501255 - Flags: review?(bienvenu)
Yes, that's awesome, I can easily poke at such a global from Thunderbird Conversations, and it prevents terrible code duplication. Thanks!
Hmph. A complication: I just went through some mailing list messages I got from tb-planning to see what mailman does to DKIM signatures. There's a bunch of messages from gmail addresses, and it seems that they only escape the scam warning because gmail.com's DKIM record is set in test mode -- since mailman alters the mailing list messages, all their signatures fail.

Clearly this is a problem for anyone who uses mailing lists, since DKIM-signed mail will end up being marked as a scam. Mailman used to strip DKIM signature headers so this would not occur, but later reverted the change:

https://bugs.launchpad.net/mailman/+bug/557493

The only way a mailing list can avoid this problem is by re-signing messages that leave the mailing list with its own key, so that the signature will validate. 

Hmph. I'll see if there's a way of noticing mailing-list munging.
Comment on attachment 501336 [details] [diff] [review]
patch v0.7, now with moar helperfunctionage

clearing review request, then, though I will try running the patch.
Attachment #501336 - Flags: review?(bienvenu)
some tests succeed, others fail:
TEST-PASS | test_dkim_spf_pass
TEST-PASS | test_spf_pass
TEST-PASS | test_dkim_pass_whitespace
TEST-PASS | test_spf_neutral
TEST-PASS | test_domainkeys_pass
TEST-UNEXPECTED-FAIL | c:\builds\tbirdhq\mail\test\mozmill\message-header\test-p
hishing-bar.js | test_dkim_adsp_fail
  EXCEPTION: This Authentication-Results header should trigger the scam warning:
 mail.example.com; dkim = pass (1024-bit key; insecure key) header.i=@example.co
m; dkim-adsp = fail...true
    at: test-phishing-bar.js line 120
       checkScamBar("mail.example.com; dkim = pass (1024-bit key; insecure key)
header.i= test-phishing-bar.js 120
       test_dkim_adsp_fail() test-phishing-bar.js 89
            frame.js 474
            frame.js 530
            frame.js 573
            frame.js 426
            frame.js 585
            server.js 164
            server.js 168
TEST-UNEXPECTED-FAIL | c:\builds\tbirdhq\mail\test\mozmill\message-header\test-p
hishing-bar.js | test_spf_hardfail
  EXCEPTION: This Authentication-Results header should trigger the scam warning:
 mx.example.com; spf=hardfail (example.com: domain of spam@example.com does not
designate 192.0.2.4 as permitted sender)...true
    at: test-phishing-bar.js line 120
       checkScamBar("mx.example.com; spf=hardfail (example.com: domain of spam t
est-phishing-bar.js 120
       test_spf_hardfail() test-phishing-bar.js 95
            frame.js 474
            frame.js 530
            frame.js 573
            frame.js 426
            frame.js 585
            server.js 164
            server.js 168
TEST-UNEXPECTED-FAIL | c:\builds\tbirdhq\mail\test\mozmill\message-header\test-p
hishing-bar.js | test_spf_fail
  EXCEPTION: This Authentication-Results header should trigger the scam warning:
 mx.example.com; spf=fail (example.com: domain of spam@example.com does not desi
gnate 192.0.2.4 as permitted sender)...true
    at: test-phishing-bar.js line 120
       checkScamBar("mx.example.com; spf=fail (example.com: domain of spam test-
phishing-bar.js 120
       test_spf_fail() test-phishing-bar.js 101
            frame.js 474
            frame.js 530
            frame.js 573
            frame.js 426
            frame.js 585
            server.js 164
            server.js 168
TEST-UNEXPECTED-FAIL | c:\builds\tbirdhq\mail\test\mozmill\message-header\test-p
hishing-bar.js | test_dkim_adsp_fail_override
  EXCEPTION: This Authentication-Results header should trigger the scam warning:
 mail.example.com; spf=pass; dkim=pass; dkim-adsp=fail; domainkeys=pass;...true
    at: test-phishing-bar.js line 120
       checkScamBar("mail.example.com; spf=pass; dkim=pass; dkim-adsp=fail; doma
inkeys=pass;") test-phishing-bar.js 120
       test_dkim_adsp_fail_override() test-phishing-bar.js 106
            frame.js 474
            frame.js 530
            frame.js 573
            frame.js 426
            frame.js 585
            server.js 164
            server.js 168
Hmph. This means that the nsMimeHtmlEmitter.cpp change is wrong, so I'll have to dig through to find the right place to edit. The tests and the patch will only function if the header view mode is set to All, since the Authentication-Results header isn't passed along to JS otherwise.

Now that I've noticed the mailing list issue, though, I'll have to plan out the best course of action. I'm currently looking through RFC 5863 (DKIM Development, Deployment and Operations) to see what is recommended.

It's kind of difficult to use a message signing standard when it's acceptable for legitimate users to invalidate signatures.
Now that I think of it, I realize there is still some hope for DKIM. ADSP allows domains to publish a DNS record that specifies whether the mail they send is signed; one option is to mark the domain "discardable", which tells recipients that if the signature is invalid or missing, the message may be discarded.

Potential use-cases include things like Bugzilla mail, or my bank's notification emails. Since they do not participate in mailing lists and their contents should never be modified in transit, they can be DKIM-signed and verifiers told to discard modified messages as scams.

We can use this, since Authentication-Results includes dkim-adsp=discard as one of its results. We should use that, along with spf=fail, to mark a message as a scam.

However, I may need help finding out where to edit to get Authentication-Results to be passed into the JavaScript.
(In reply to comment #63)
> 
> However, I may need help finding out where to edit to get
> Authentication-Results to be passed into the JavaScript.

I'd suggest having a look at http://mxr.mozilla.org/comm-central/source/mailnews/mime/emitters/src/nsMimeHtmlEmitter.cpp#193 (it's where the headers get collected from libmime and passed to messageHeaderSink, in JS), and then figuring out how to add Authentication-Results to the list. These headers are gathered on the other end by the JS code at http://mxr.mozilla.org/comm-central/source/mail/base/content/msgHdrViewOverlay.js#459 . You could add a special case there for the header you're interested in: instead of adding it to the message header view, it could trigger your function.
Yes, nsMimeHtmlEmitter.cpp is exactly what I tried editing in my patch, but apparently my method of adding it did not work. If I understand msgHdrViewOverlay.js, new headers should be passed through and available, so I don't know why it's not working.
I'll spend a few minutes with the debugger and see if/why you're not getting this header.
the header is getting passed into js, so I think the problem must be on the js end. I'll try faking it by hand and see what happens.
I took a message with a dkim=pass authentication result header and changed it to dkim=fail, and I saw the scam bar. So the code works at least some of the time. I'll go see if I can figure out why some of the mozmill tests failed.
OK, trying these tests by editing messages by hand works fine, so I think there's a problem with the testing framework.
I was just trying to watch what was happening and found that putting a call to mc.sleep(5000) before checking if the phishing bar is displayed makes all the tests pass. So there must be some delay before the phishing bar is displayed, and I need to figure out if there's a notification I can listen for that indicates we've finished the notification bar stuff.
Hmm. I edited by hand and was only able to get the phishing warning to appear when I enabled View->Headers->All. I'll test again and experiment a bit when I get time.

Thanks for looking into it. Here's my plan:

 - Mailing lists can munge messages and cause verification to fail. We shouldn't throw the phishing warning in this case.
 - Mailing lists also confuse SPF.
 - Hence, we should look for DKIM-ADSP set to "discardable". If a domain sets its policy to discardable, it is telling us that the message can safely be discarded if DKIM fails. 
 - We should flag discardable messages. A bank, for example, set its DKIM-ADSP to discardable, since it should never pass through a mailing list or munger. An individual user sending mail won't set discardable, and the phishing risk is much lower anyway.

The question is: will any DKIM-verifying MTA simply discard a discardable message, so that we never have to handle it? I'll research this, but I think the answer is that OpenDKIM, at least, doesn't discard anything by default.
this makes the tests pass - the mc.sleep(0) lets us return to the event loop, which seems to fix the problem.

I also fixed a bunch of whitespace issues (blank lines with spaces on them, primarily), so if you could start with this patch, that would be much appreciated!
Attachment #501336 - Attachment is obsolete: true
Alex are you still working on this ?
No, I'm not; I'm back at school again and don't have the time. I'm also no longer sure this is the correct approach, considering the difficulties in determining who added Authentication-Results, handling mailing lists, and so on. The major benefit comes from messages with DKIM-ADSP set to "discardable" (like said in comment 71) and we know the message should be flagged if authentication failed.

Perhaps Thunderbird could do only that, and publicize DKIM with ADSP enabled as an anti-phishing practice that important services should adopt and ISPs should support.
There is one simple case in DKIM, that is when authentication succeeded. All the other cases are tricky, because failures can occur for various reasons, and flagging lots of good mail as failed is not a good idea.

For this reason, I'd like to argue for showing something like a standard HTTPS (non-EV) indicator when DKIM succeeds, while showing nothing if it fails or is absent.

You could then thing about when it's worth flagging an email as failing DKIM once this basic feature is in place.
As I said in comment 41, it's impractical to try to trust Authentication-Results headers when authentication succeeds, and I believe the RFC recommends against it. Many mailservers do not support DKIM or strip Authentication-Results headers, so it would be trivial for a phisher to insert their own "pass" header and get a special "trusted" logo.
Maybe I'm misunderstanding. I think case (i) (from comment 8) requires that the user's email service implement checking. It seems then that all that is needed is a user configuration for each account that tells this feature what the given service supports. 

With that, you could implement a "Pass" indicator without fear of forgery if the configuration says the server checks and strips. If the configuration says the server does neither this feature could be disabled altogether, since none of the Results info is trustworthy. (Or are there noteworthy use cases for trusting a server ahead of the user's mail service?)

Is there some reason to try to make this feature work absent any knowledge of the user's service's behavior? That approach seems to have been a major stumbling block. The defaults for the account config could be tied to an ad-hoc list of the major email services (gmail, Yahoo, etc.), so that the most users get the most benefit.
As an aside, I'm still rooting for case (ii) checking, as many of my own mail services do not check any form of authentication. (Earthlink, Roadrunner, ...)
As I said in comment 19, I don't think (ii) is feasible either; there's no guarantee that we can reverify a message some days after it was sent. The RFC also recommends against showing a "Pass" indicator merely on the basis of the Authentication-Results header:

   An MUA SHOULD NOT reveal these results to end users unless the
   results are accompanied by, at a minimum, some associated reputation
   data about the authenticated origin identifiers within the message.
   For example, an attacker could register examp1e.com (note the digit
   "one") and send signed mail to intended victims; a verifier would
   detect that the signature was valid and report a "pass" even though
   it's clear the DNS domain name was intended to mislead.  See
   Section 7.2 for further discussion.

https://tools.ietf.org/html/rfc5451#section-4.1

The user would also have to know whether the server provides this service, and there's no way to query a server to check.

See comment 41 for my reasoning on only giving failure messages, rather than granting messages a stamp of approval based on DKIM.
I had read comment 41 (indeed skimmed the entire thread before posting) and my answer to that was to provide explicit configuration for the user's service, rather than rely on a non-existent query. 

That does put a burden on the user to figure it out. My suggestion was that the burden could be eased substantially by providing a default list for the few large services, and let the FAQ pages and user forums of other services provide the rest. 

I understand the deceit intended by examp1e.com versus example.com, but always thought that was understood to be the user's peril. That is, I understood DomainKeys to guard against literal spoofing only, not as an indicator of the sender's trustworthiness nor as protection against visual spoofs. I may be losing the battle here, especially in light of the prospect of Unicode domain names, but I think the RFC authors are allowing the perfect to be the enemy of the useful. 

For (ii), if the key has been removed from the dns can that be treated as a neutral case (neither pass nor fail)? If so that would seem acceptable; I normally check my email in a timely fashion, and I would accept as limitation that if I hadn't retrieved mail in some while that the DomainKeys checks wouldn't happen for the stale messages. Even if that case is indistinguishable from a fail I might still find that to be a limitation worth living with. And yes, I'm hoping the check is done on retrieval, as per your comment 21, and not done or redone when I happen to open the message list.
I understand your reasoning in comment 41, but I disagree with it. It suffers the perfect solution fallacy. Knowing that an email *really* originates from faceb00k.com is immensely more useful than the current situation, where we have no idea whatsoever.

I appreciate that the RFC reasons along the same lines, but it asks that we don't solve the problem *at all* until we have solved it *perfectly*. That reputation database that is being asked for won't appear until there is demand for it, and there is no demand for it until you start telling people "this comes from faceb00k.com, but we don't know if that website is malicious or not".
In fairness to Alex and the other implementers, our comments really belong in the RFC process -- the purpose of the published RFCs is to give guidance to implementers.

That said, in lieu of a public reputation database I'd be more than satisfied with a user-configured whitelist where I can designate senders known to me (such as facebook.com and example.com) and only messages for those senders would be eligible for the "Pass" mark. I'd rather, in fact, limit the Pass mark to organizations I already know.

[Note, this is not the same as gene's whitelist, in comment 8.]

This workaround I think satisfies the criteria in the RFC that associated reputation data be used. We're merely requiring that the user provide the reputation data, as an option to relying on some external source. 

I think this also addresses Roman's concern that we start somewhere. If this mechanism proves popular enough it will drive demand for better automation, and perhaps attract the attention of additional developers. We can hope to leverage the techniques (and maybe even the involvement) of the clever folks who worked out how to automate the account creation process (POP/IMAP setup), for example.
A fairly lengthy comment, apologies for verbosity.

A method to check Authentication-Results (A-R) fields is to stop reading at some Received field.  It is described at bullet 5 of RFC 5451, Section 7.1.  The single difficulty is to learn the internal handling, from boundary MTAs to mailbox server, possibly including alias-like internal forwarding.  Received fields from/by "localhost" should just be disregarded.  Otherwise, if we find the first Received before any A-R, and there is an A-R right after it having an authserv-id equivalent to the names in the "by" clause of both surrounding Received fields --the one we found and the next downward-- then we can trust this A-R.  "Equivalent" here means equal, but we could tolerate an added prefix if we knew that the remaining part is specific enough.

There are three interesting kinds of result, IMHO.  One, with dkim=pass and/or spf=pass we have authenticated a domain name.  That's the only one thing it makes sense to display.  Possibly, clicking on it a user could bring up a dialog asking "Always trust DOMAIN? (check spelling carefully)"  The bare domain name is the only "associated reputation" currently available.  If the user sets such checkbox we can then set a prominent "OK" mark on messages of that stream.  See also
http://tools.ietf.org/html/rfc6376#appendix-D

Two, a negative value of dkim-adsp (discard, fail, perhaps even nxdomain) calls for some anti-phishing flag to be set.  This is the single point where anti-phishing is interfaced.

Three, vbr=pass is similar to dkim=pass for a trusted domain name.  The domain name is header.md, while the decision to trust senders vouched by header.mv was made by the authentication server.  I would mark these messages with a slightly different "OK" icon.  I don't think users need to be able to override that setting, since they have to trust their mailbox provider(s) anyway.

One further functionality that needs to be considered is for users to mark messages as spam and report them to the MTA.  It is especially important for vouched domains.  One way to report a message is to wrap it in an ARF message and send it as suggested in
http://wiki.asrg.sp.am/wiki/Adding_a_junk_button_to_MUAs#Per_message_approaches
that is, sending it to abuse@<authserv-id> if so configured.

When I discovered this bug, I had already tried to develop an extension for doing some of the above.  I didn't complete it, but I had a (JSON encoded) configuration file with one entry for each authserv-id.  Entries collected in the way described above can get an abuseAddress value when the user sends an abuse report for the first time.

jm2c
Resetting assignee to default per comment #75. Feel free to re-take if you decide to pick this up again, otherwise someone else may be able to continue where you left off.
Assignee: alex+list → nobody
Status: ASSIGNED → NEW
Whiteboard: [patchlove][has draft patch]
Flags: in-testsuite?
I'm using Philippe Lieser's DKIM Verifier:
https://addons.mozilla.org/en-US/thunderbird/addon/dkim-verifier/

The addon works well, and that should be enough to close this bug.
Would someone close it, please?
We only close bugs when landed in core. Good to know there's an add-on for it though.

(In reply to Alessandro Vesely from comment #87)

I'm using Philippe Lieser's DKIM Verifier:
https://addons.mozilla.org/en-US/thunderbird/addon/dkim-verifier/

This addon does not evolve quick enough. And fails every update now.
Making this function a part of TB would be a great enhancement for all users.

See Also: → 1756335
See Also: → 1675449
Severity: normal → S3
Blocks: 1885323
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: