Closed Bug 1645958 Opened 4 years ago Closed 4 years ago

IDN homograph spoofing by mixing latin with accented latin extended glyphs

Categories

(Firefox :: Address Bar, defect)

77 Branch
defect

RESOLVED DUPLICATE of bug 1507582

People

(Reporter: wanggang1107, Unassigned)

Attachments

(1 file)

Attached file 44-firefox.txt

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36

Steps to reproduce:

We are researchers from the University of Illinois. We found some issues with Chrome’s IDN policies. Chrome currently displays Punycode for internationalized domain names (IDNs) that are created to impersonate popular domain names or brands (i.e., IDN homographs). However, we found that certain IDN homographs are consistently displayed in Unicode (e.g., those using “Latin Extended-A” characters, those impersonating .gov or .mil domain names, etc.).

Examples:
tumbır.com
paypaī.com
militarỹonesource.mil

Actual results:

These domains are displayed in Unicode, which may deceive users.
I attached a list of examples of such homograph IDNs that have this issue.

Expected results:

These domains should be displayed as Punycode.
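
For reference, the Punycode (ACE) form that the report asks for can be approximated with Python's built-in punycode codec. This is only a minimal sketch, not what Firefox or Chrome actually run: the helper names to_ace and to_unicode are made up for illustration, and the code skips the IDNA mapping and validation steps a real implementation would apply.

```python
import unicodedata

ACE_PREFIX = "xn--"

def to_ace(domain):
    """Convert each non-ASCII label to its xn-- form (illustrative helper)."""
    labels = []
    for label in unicodedata.normalize("NFC", domain).split("."):
        if label.isascii():
            labels.append(label)  # plain ASCII labels are left untouched
        else:
            labels.append(ACE_PREFIX + label.encode("punycode").decode("ascii"))
    return ".".join(labels)

def to_unicode(domain):
    """Reverse of to_ace: decode every xn-- label back to Unicode."""
    labels = []
    for label in domain.split("."):
        if label.startswith(ACE_PREFIX):
            labels.append(label[len(ACE_PREFIX):].encode("ascii").decode("punycode"))
        else:
            labels.append(label)
    return ".".join(labels)

# The three example domains from this report.
for name in ["tumbır.com", "paypaī.com", "militarỹonesource.mil"]:
    print(to_ace(name))
```

Each printed name is the all-ASCII string a browser would show if it decided not to render the label in Unicode.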

This looks like a copy of a Chrome report. Can you link to the Chrome report?

Type: enhancement → defect
Component: Untriaged → Address Bar
Flags: needinfo?(wanggang1107)

This looks very much like existing reports of "spoofs" using accented letters and similar. See for example bug 1637901; also https://bugzilla.mozilla.org/show_bug.cgi?id=1623648#c7. In general we do not block legitimate accented letters; it's the World Wide Web, not the English Web.

E.g. tumbır.com uses the dotless ı, which is a distinct letter of the alphabet in Turkish and it would be wrong to prevent its use in IDN names; that would discriminate against many valid Turkish words and names.

One potential mitigation here would be something along the lines of bug 1507582.

Sorry for the typo --- we tested both Chrome and Firefox. Neither browser shows Punycode for these domains, so we were wondering why.

Jonathan's answer helps. I also don't think it is a good idea to block a domain name for using a certain language or character. But I thought it was the "script mixing" that caused the problem here.

Flags: needinfo?(wanggang1107)

(In reply to GW from comment #3)

Sorry for the typo --- we tested both Chrome and Firefox. Neither browser shows Punycode for these domains, so we were wondering why.

So presumably you reported to Chrome, too? Please can you link to the Chrome issue you filed?

Flags: needinfo?(wanggang1107)
Flags: needinfo?(wanggang1107)
Summary: IDN policies cannot block IDN homograph → IDN homograph spoofing by mixing latin with accented latin extended glyphs

Complete list of extended latin characters used in the attachment:

ä
è
î
ú
ć
ċ
ĥ
ħ
ī
ı
ķ
ļ
ľ
ņ
ŝ
ǡ
ǩ
ȅ
ȑ
ș
ȯ
ậ
ặ
ẻ
ị
ọ
ụ
ỹ

(In reply to GW from comment #3)

Sorry for the typo --- we tested both Chrome and Firefox. Neither browser shows Punycode for these domains, so we were wondering why.

Jonathan's answer helps. I also don't think it is a good idea to block a domain name for using a certain language or character. But I thought it was the "script mixing" that caused the problem here.

There's no script mixing here. Characters like ä, è, î, ú and so on are most definitely part of the Latin script, which is used for much more than just English and contains more than 26 letters. No individual language's writing system uses anywhere near all of them, but in general they're all there because there's some Latin-script writing system somewhere that uses them.
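
A quick way to check this for the characters listed above is to look at their Unicode names. The sketch below is a rough heuristic only: Python's standard library does not expose the Unicode Script property, so the "LATIN " name prefix is used as a stand-in for it, and the variable and function names are made up for illustration.

```python
import unicodedata

# The extended Latin characters listed from the attachment.
ATTACHMENT_CHARS = "äèîúćċĥħīıķļľņŝǡǩȅȑșȯậặẻịọụỹ"

def unicode_names(text):
    """Map each character (after NFC normalization) to its Unicode name."""
    return {ch: unicodedata.name(ch) for ch in unicodedata.normalize("NFC", text)}

names = unicode_names(ATTACHMENT_CHARS)

# Heuristic: treat a character as Latin-script if its name starts with "LATIN ".
non_latin = [ch for ch, name in names.items() if not name.startswith("LATIN ")]

print(len(names), "characters checked,", len(non_latin), "outside the Latin script")
```

Under this heuristic every character in the list is reported as Latin, including the dotless ı, whose Unicode name is LATIN SMALL LETTER DOTLESS I.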

Flags: needinfo?(dveditz)

Your example paypaī.com is on par with the plain ASCII phishing domains paypai.com or paypa1.com. Your attachment has bankofamerļca.com, which has even more suspicious decoration than plain bankofamerlca.com: if the former might fool someone, then the latter certainly will, and it doesn't require any IDN characters. Meanwhile, Punycode domains tend to all look the same and are completely meaningless to people who would otherwise understand the script, making spoofing even more likely for the legitimate non-ASCII domains out there.

I guess we can keep an eye on what the Chrome team does if they ever unhide their bug, but we can't blanket-ban accented Latin characters. The best practical solution is the "skeleton of popular domains" approach Chrome uses (mentioned above). People will still be able to come up with examples that aren't caught, but as long as those aren't banks or other broadly used domains, the practical value of any phishing attack would be close to nil.
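
To make the "skeleton of popular domains" idea concrete, here is a rough sketch of how such a check could work. It is not Chrome's or Firefox's actual algorithm. The function names, the tiny POPULAR set, and the FOLD table (which lumps i, l, 1 and dotless ı into one lookalike class) are all assumptions chosen for illustration; a real implementation would use a much larger confusables table and a real top-domains list.

```python
import unicodedata

# Hypothetical lookalike folding: i, 1 and dotless i all collapse to "l",
# and h-with-stroke collapses to "h". Not a real confusables table.
FOLD = str.maketrans({"\u0131": "l", "i": "l", "1": "l", "\u0127": "h"})

# Hypothetical stand-in for a list of popular domains.
POPULAR = {"paypal.com", "tumblr.com", "bankofamerica.com"}

def skeleton(domain):
    """Strip combining marks, then fold lookalike letters together."""
    decomposed = unicodedata.normalize("NFD", domain.lower())
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.translate(FOLD)

POPULAR_SKELETONS = {skeleton(d): d for d in POPULAR}

def lookalike_of(domain):
    """Return the popular domain this one resembles, or None."""
    target = POPULAR_SKELETONS.get(skeleton(domain))
    return target if target is not None and target != domain else None

print(lookalike_of("paypa\u012b.com"))    # paypaī.com -> paypal.com
print(lookalike_of("bankofamerlca.com"))  # plain ASCII -> bankofamerica.com
print(lookalike_of("paypal.com"))         # the real domain -> None
```

With this design an accented name is only flagged when its skeleton collides with a popular domain, so an unrelated Turkish name that uses ı is left alone, while the plain ASCII lookalike bankofamerlca.com is flagged just like the IDN variants.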

Status: UNCONFIRMED → RESOLVED
Closed: 4 years ago
Flags: needinfo?(dveditz)
Resolution: --- → DUPLICATE

I agree with your arguments, Daniel.

Group: firefox-core-security