Closed Bug 1504526 (idn-phishing) Opened 6 years ago Closed 5 years ago

Should show sender/from domain as punycode

Categories

(Thunderbird :: Untriaged, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: volker.mische, Unassigned)

Attachments

(4 files)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:65.0) Gecko/20100101 Firefox/65.0

Steps to reproduce:

Received a scam email.


Actual results:

The sender looks legit.

The sender looks like this:

Аmazon.dе <noreply@аmаzоn.dе>

When looking at the source of the email, the sender is:

From: =?UTF-8?B?0JBtYXpvbi5k0LU=?= <noreply@аmаzоn.dе>

When I paste the domain name from the top (аmаzоn.dе) into Firefox I get:

http://www.xn--mzn-5cdb9f.xn--d-jtb/



Expected results:

The sender doesn't look legit.

I would expect that the domain name is either displayed as "аmаzоn.dе" (that's what my email provider does) or as "xn--mzn-5cdb9f.xn--d-jtb".
Hmm, =?UTF-8?B?0JBtYXpvbi5k0LU=?= is the RFC 2047 encoding for Amazon.de; you can check it by pasting 0JBtYXpvbi5k0LU= into https://www.base64decode.org/.

Raw UTF-8 is legitimate in e-mail headers according to RFC 6532.

TB does show the address in full with <noreply@аmаzоn.dе> in the header pane, but not in the thread pane. You can also see the full address when you reply.

Magnus, do you see a need for action here?
Flags: needinfo?(mkmelin+mozilla)
I agree that there's not much you can do about the sender "Аmazon.dе" (D0 90 6D 61 7A 6F 6E 2E 64 D0 B5), which looks like "Amazon.de" (41 6D 61 7A 6F 6E 2E 64 65).
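
The byte comparison above can be reproduced locally without an online decoder. The following is a minimal sketch using Python's standard `email.header` module; the helper name `decode_display_name` is made up for illustration and is not anything Thunderbird uses.

```python
from email.header import decode_header


def decode_display_name(encoded_word: str) -> str:
    """Decode RFC 2047 encoded-words into a Unicode string (illustrative helper)."""
    parts = []
    for payload, charset in decode_header(encoded_word):
        if isinstance(payload, bytes):
            # Encoded-words come back as bytes plus their declared charset.
            parts.append(payload.decode(charset or "ascii"))
        else:
            parts.append(payload)
    return "".join(parts)


name = decode_display_name("=?UTF-8?B?0JBtYXpvbi5k0LU=?=")
spoofed_hex = name.encode("utf-8").hex(" ").upper()
ascii_hex = "Amazon.de".encode("utf-8").hex(" ").upper()

print(spoofed_hex)  # bytes of the display name from the scam mail
print(ascii_hex)    # bytes of the plain ASCII spelling
```

The first line printed starts with D0 90 and ends with D0 B5, the second with 41 and 65, which matches the two byte sequences quoted above.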

Is the header pane the thing that shows the sender, subject and receivers? If yes, in TB 60.3.0 the sender renders for me as "Аmazon.dе <noreply@аmаzоn.dе>" (I copied it here with "Copy Name and Email Address"). I only see "<noreply@аmаzоn.dе>" when I view the source.
Attached image header-pane.png
This is what I see. The copy gives this: Аmazon.dе <noreply@аmаzоn.dе>

So something is at work that tricks you, which would make for a good phishing attack. What happens when you reply?
If I reply I also get "Аmazon.dе <noreply@аmаzоn.dе>". I'll attach a screenshot.
Then please attach the full message here, since I've only crafted a sample e-mail with the From: header you provided. Drag the message to the desktop or use "save as file" and save it as .eml. Then attach that here. You can obfuscate personal details, like your e-mail address, but then your Gmail address is already visible here.
The From header can't be trusted, unless the message is signed and the signature checks out.

What's the point of having the fake amazon.de address? Might as well have used the right one. Or perhaps this fools some DKIM verification or such?

Registrars wouldn't allow you to register the fake amazon.de so in that sense it can't be used to scam people. 

I don't see much to do about this, except that it would be good to consistently show noreply@аmаzоn.dе instead of noreply@аmаzоn.dе at some places.
Flags: needinfo?(mkmelin+mozilla)
That's the original email (with my email username removed) when I do a "save as". The sender shows up as "noreply@аmаzоn.dе".
Attached file copyandpasted.eml
That's the email (with my email username removed) when I copy and paste it from the "view-source" window. The sender shows up as "noreply@аmаzоn.dе".
I'd rather prefer if it would always show "noreply@аmаzоn.dе". I know that the sender can't be trusted, but if the sender already does some weird things, for me it's a good indication that something might be wrong. So if there's a way to indicate that and display to the user, that would be nice.
Well that would be terrible for perfectly valid IDN domains.
Attachment #9023196 - Attachment mime type: message/rfc822 → text/plain
Could it be displayed as "noreply@xn--mzn-5cdb9f.xn--d-jtb"?
OK, so the sender in the original message is this:
From: =?UTF-8?B?0JBtYXpvbi5k0LU=?= <noreply@аmаzоn.dе>

You can see it if you switch the encoding to unicode in the message preview.

However, the characters in amazon.de aren't all the usual ASCII ones you'd expect. The "a", for example, is D0 B0 when viewed in a hex editor. That's a small Cyrillic "a", see https://www.utf8-chartable.de/unicode-utf8-table.pl?start=1024.
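
A quick way to see which characters are not what they seem is to print each code point with its Unicode name. This is only a sketch with Python's standard `unicodedata` module; the domain is written with escape sequences, assuming the Cyrillic letters are U+0430, U+043E and U+0435 as the hex dumps in this bug suggest.

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str]]:
    """Return (code point, Unicode name) for every character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")) for ch in text]


# The sender domain from this bug, spelled out with escapes so the
# Cyrillic letters cannot be confused with their Latin look-alikes.
domain = "\u0430m\u0430z\u043en.d\u0435"

for code_point, name in describe(domain):
    print(code_point, name)
```

The Latin letters show up as LATIN SMALL LETTER ..., while the look-alikes show up as CYRILLIC SMALL LETTER A, O and IE.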

So what happens is that the sender used UTF-8 in the From: field and that's legal according to RFC 6532. We display it correctly and you can't see the difference between a Latin "a" and a Cyrillic "a".

In the source view shown as Western/windows-1252 you can see that it's something else.

I don't know why the attacker has done this since the phishing attack is clearly to click the link hiding behind the "Weiter (über den Sicherheitsserver)" - https://t.co/BC6MAVX3qJ.

Pasting the аmаzоn.dе into FF indeed gives http://www.xn--mzn-5cdb9f.xn--d-jtb/.
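
The same conversion can be reproduced outside the browser. A minimal sketch with Python's built-in `idna` codec follows; note that this codec implements the older IDNA 2003 rules, so it is an approximation of what Firefox does, and the function name `to_ascii` is just illustrative.

```python
def to_ascii(domain: str) -> str:
    """Convert a domain to its ASCII (punycode) form, label by label."""
    return domain.encode("idna").decode("ascii")


# Sender domain from this bug, with the Cyrillic letters as escapes.
spoofed = "\u0430m\u0430z\u043en.d\u0435"

print(to_ascii(spoofed))      # the look-alike domain in punycode
print(to_ascii("amazon.de"))  # plain ASCII passes through unchanged
```

For the spoofed domain this yields the xn--mzn-5cdb9f.xn--d-jtb form quoted above, while the real amazon.de is left as it is.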

So maybe there is some room for improvement to display the domain in the e-mail address differently.

This kind of thing is known as a homograph attack (2002). In ASCII, it would be played as, say, Arnazon.de, relying on victims' short-sightedness.

I noted the email sample didn't have a DKIM signature. Probably it was some kind of test. With the coming of EAI, a DKIM signature can be produced with U-Labels in the d= tag, see RFC 8616. That way the scam can become perfect, with no illegal content and no email client vulnerability whatsoever. IOW, this is not a TB bug. The defence has to come from tools like RPZ. An "intelligent" plugin, able to decode UTF-8 and compare homographs against a list of heavily phished domains, would also be possible, but it's no business for the core of an email client.

Please close this bug.

Hmm, I still think we could display the domain differently, see comment #13.

(In reply to Jorg K (GMT+2) from comment #15)

Hmm, I still think we could display the domain differently, see comment #13.

Like some shade of brown? If the domain is IDN change color/font as per given option. Just the domain part. I'm not sure if To and Cc deserve the same treatment. The Subject certainly not... Yes, that can be a good hint to savvy users.

Actually, no, as punycode as requested. Would that be wrong? We convert non-ASCII domains to punycode when sending, so why not display it?

(In reply to Jorg K (GMT+2) from comment #17)

Actually, no, as punycode as requested. Would that be wrong? We convert non-ASCII domains to punycode when sending, so why not display it?

That would be a disservice. The whole point of EAI is to allow email addresses in the users' native languages. It's much like RFC 2047 encoding: you'd rather see Hi Jörgs than =?UTF-8?B?SGkgSsO2cmdz?= in a displayed subject, no?

Punycode was a gimmick to quickly introduce IDNs, mainly concerned with registration issues. Horrible as it is, its only justification is that it is not visible to end users. Clean UTF-8 seems to be heading to global acceptance, so let's stick to that. Here the point is that I can fool you by registering homograph domains. I can put that in a link in the body, or in a From:/To:/Cc: field in the header. In the latter case, a different color can highlight fraud. We can still leave that job to the DKIM Verifier, which would be more meaningful, as unverified domains can be anything they like.

See Also: → 1563891

OK, penny has dropped. I agree that it would be pretty terrible to show domain foà.it as xn--fo-kia.it. And Magnus already said this in comment #11: "Well that would be terrible for perfectly valid IDN domains".

As per comment #16, we could add a visual hint for IDN domains, but that's for another bug.

Status: UNCONFIRMED → RESOLVED
Closed: 5 years ago
Resolution: --- → WONTFIX

(In reply to Jorg K (GMT+2) from comment #19)

As per comment #16, we could add a visual hint for IDN domains, but that's for another bug.

Having a visual hint sounds good to me, that would solve the original issue. I could tell people "if you get an email from a known online shop and it suddenly has this different colour, (or whatever the visual hint is) just delete the email".

Jorg K, Alessandro Vesely: Would one of you be OK with opening a new bug? You both seem to know way more about this topic and I'd expect one of you would be better at creating a usable/actionable bug report.

(In reply to volker.mische from comment #20)

Jorg K, Alessandro Vesely: Would one of you be OK with opening a new bug? You both seem to know way more about this topic and I'd expect one of you would be better at creating a usable/actionable bug report.

As I mentioned, I will report this issue to Philippe Lieser's DKIM verifier. Before I do that, I need to check some details about DKIM signing with IDN domains. Knowing that a verified domain name is non-ASCII is meaningful, and it lowers trust when the domain name was expected to be ASCII.

It is true that a non-ASCII, non-verified name, like the one reported in this bug, can make its way to a user's mailbox, whereas it would have been quarantined if the spoofed domain so required (as Amazon does) and if the mailbox provider fully adhered to the DMARC specification. So there is a kind of user who could be saved if he knew why the domain name was colored. But then, if he's so cunning, why didn't he install the DKIM verifier?

Yes, I know that the From: is not trusted, but our users don't know that. Phishing is a real thing.

This discussion happened for Firefox shortly after IDNs were introduced. Phishers did the same in browsers, pretending to be Amazon and Google and eBay and whatnot, offered login dialogs, and let people send them their passwords.
Usually, people get there by phishing email. So, the risk for phishing in email is much higher than on the web. Yet, even in the browser, it was considered a large enough problem to add specific protections against homoglyphs. The solution was that if the IDN contains characters that look very similar to ASCII characters, it will not show the confusing character, but the more technical encoding.
We can probably leverage the very same code that Firefox uses here.
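
As a rough illustration of that idea (not Firefox's actual code, whose rules are more elaborate), a display routine could fall back to punycode whenever a single label mixes scripts. In this sketch the script of a letter is guessed from the first word of its Unicode name, which is a crude assumption, and both function names are hypothetical.

```python
import unicodedata


def scripts_in(label: str) -> set[str]:
    """Crude script detection: first word of each letter's Unicode name."""
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in label
        if ch.isalpha()
    }


def display_domain(domain: str) -> str:
    """Show punycode if any label mixes scripts, else the readable form."""
    labels = domain.split(".")
    if any(len(scripts_in(label)) > 1 for label in labels):
        # Python's idna codec follows IDNA 2003; good enough for a sketch.
        return domain.encode("idna").decode("ascii")
    return domain


print(display_domain("\u0430m\u0430z\u043en.d\u0435"))  # mixed Latin/Cyrillic labels
print(display_domain("fo\u00e0.it"))                    # single-script IDN
print(display_domain("amazon.de"))                      # plain ASCII
```

With this rule the spoofed sender domain is shown in its xn-- form, while a single-script IDN such as foà.it stays readable. A label written entirely in look-alike Cyrillic letters would still pass, so a real implementation needs more than this one check.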

This would be important to fix. IDN support is in fact dangerous for all users (even those who do not care about IDN) unless this is fixed.

Alias: idn-phishing
Summary: Show sender/from domain as punycode → [IDN Phishing] Protect against homoglyphs used for phishing when reading mail: Show sender/from domain as punycode when homoglyphs are present

I added bug #1617385

Summary: [IDN Phishing] Protect against homoglyphs used for phishing when reading mail: Show sender/from domain as punycode when homoglyphs are present → Should show sender/from domain as punycode