1332714 - IDN Phishing using whole-script confusables on Windows and Linux

Reporter

•

8 years ago

Component: Untriaged → Location Bar

Valentin Gosu [:valentin] (he/him)

Comment 2

•

8 years ago

(In reply to :Gijs from comment #1)
> Valentin, any idea why the IDN URL stuff we do doesn't detect this as a
> homograph attack?

To be honest I don't know the IDN code well enough to be sure. It seems that in the past our approach has been to blacklist the characters. Patch incoming.

Assignee: nobody → valentin.gosu

Flags: needinfo?(valentin.gosu)

Valentin Gosu [:valentin] (he/him)

Comment 3

•

8 years ago

Attached patch: Update blacklist pref

MozReview-Commit-ID: re2Gs83qLT

Attachment #8829419 - Flags: review?(smontagu)

Simon Montagu :smontagu

Comment 4

•

8 years ago

I'm not happy about just blacklisting the "ӏ". Couldn't it be used in a legitimate Cyrillic domain? https://www.аррӏе.com/ is an example of a whole-script homograph, which our IDN display code is not designed to protect against -- for example https://www.асе.com/ spoofs https://www.ace.com/

Flags: needinfo?(gerv)

Gervase Markham [:gerv]

Comment 5

•

8 years ago

(In reply to Simon Montagu :smontagu from comment #4)
> https://www.аррӏе.com/ is an example of a whole-script homograph, which our
> IDN display code is not designed to protect against -- for example
> https://www.асе.com/ spoofs https://www.ace.com/

Indeed. Our IDN threat model specifically excludes whole-script homographs, because they can't be detected programmatically and our "TLD whitelist" approach didn't scale in the face of a large number of new TLDs. If you are buying a domain in a registry which does not have proper anti-spoofing protections (like .com), it is sadly the responsibility of domain owners to check for whole-script homographs and register them.

We can't go blacklisting standard Cyrillic letters. If you think there is a problem here, complain to the .com registry who let you register https://www.xn--80ak6aa92e.com/ .

Gerv

Status: NEW → RESOLVED

Closed: 8 years ago

Flags: needinfo?(gerv)

Resolution: --- → WONTFIX

wvsrk1lx

Reporter

Comment 6

•

8 years ago

https://bugs.chromium.org/p/chromium/issues/detail?id=683314

As mentioned by :Gijs, that bug is currently confidential.

Simon Montagu :smontagu

Comment 7

•

8 years ago

Comment on attachment 8829419
Update blacklist pref

Review of attachment 8829419:

r-, since Gerv and I agree that this approach is wrong.

Attachment #8829419 - Flags: review?(smontagu) → review-

Jungshik Shin

Comment 8

•

8 years ago

FYI, Chromium has this CL: https://codereview.chromium.org/2683793010 . It affects about 2,800 more domains out of ~ 1 million IDN domains in .com TLD.

:Gijs (he/him)

Comment 9

•

8 years ago

(In reply to Jungshik Shin from comment #8)
> FYI, Chromium has this CL: https://codereview.chromium.org/2683793010 . It
> affects about 2,800 more domains out of ~ 1 million IDN domains in .com TLD.

Gerv/Valentin, is this something we can/should align with Chromium on?

Flags: needinfo?(valentin.gosu)

Flags: needinfo?(gerv)

Anne (:annevk)

Comment 10

•

8 years ago

FWIW, I think this is something that the security team should decide.

Marco Bonardo [:mak]

Comment 11

•

8 years ago

We can WONTFIX this later, once a final decision about comment 9 has been made.

Status: RESOLVED → REOPENED

Priority: -- → P3

Resolution: WONTFIX → ---

Gervase Markham [:gerv]

Comment 12

•

8 years ago

(In reply to :Gijs from comment #9)
> Gerv/Valentin, is this something we can/should align with Chromium on?

I would say no, and here's why. Some (many?) responsible registries implement bundling/blocking of homographic domains. In those registries, the owner of www.<something>.tld will always be the same as the owner of www.<cyrillic-lookalike>.tld. People may well have bought these domains in good faith and be using them for their businesses. Why should they be penalised because other registries are not doing the sensible thing?

It's a known and accepted issue that our current system does not suppress whole-script homographs in TLDs where the registry refuses to implement proper anti-spoofing controls. That was considered an acceptable tradeoff in order to hit the following goals:

* Not have first and second-class scripts in IDN, but have everything that is supported at all be first class
* Make it so that if an IDN works in one Firefox, it works in them all (certainty for site operators)

If we start putting restrictions on scripts which happen to look like Latin, such as Cyrillic, we are making that script a second-class citizen because not as much can be represented using it. If the Internet had started as a Russian invention, and Latin was late to the party, I think we'd be pretty annoyed at being treated that way. So I don't think we should treat Cyrillic that way.

Gerv

Flags: needinfo?(gerv)

Marco Bonardo [:mak]

Comment 13

•

8 years ago

On the other hand, it is also a matter of balancing breaking some rare domains, which are already broken in the browser with the largest market share, against providing added phishing protection to most users.

Fwiw, could we also evaluate a UI fix where the domain keeps working but for these specific chars we don't actually decode them?

btw, ni? dveditz to evaluate the global discussion from a security point of view.

Flags: needinfo?(dveditz)

Valentin Gosu [:valentin] (he/him)

Comment 14

•

8 years ago

I think maybe we should do something about it. There's almost no way of differentiating between the ascii and IDN domains in comment 0, and unless the registrars do something about it immediately, users will be at risk.

Flags: needinfo?(valentin.gosu)

Anne (:annevk)

Comment 15

•

8 years ago

It's not clear to me why we trust the registrars here, since they've clearly not shown competency thus far and indeed other browsers are not trusting them for those reasons. Trusting registrars also does not help us with any kind of subdomain situation.

Gervase Markham [:gerv]

Comment 16

•

8 years ago

"Trusting the registrars" would better be put as "making it clear the registries are responsible for their own actions". We went through a period where we tried to solve this problem completely, by having a TLD whitelist of sensible TLDs. Then, the number of TLDs exploded and that didn't scale any more. We were faced with a choice of breaking some of the conditions I outlined above, which would be bad for users of non-Latin languages and domain owners in general, or resolving to solve the problem as far as we could and, for the remaining edge cases, make it clear if it ever came up that it's the registry which is causing the problem by their own practices, and telling anyone upset where the real blame lies. We went for the second option. I freely admit that our current mechanism doesn't solve whole-script spoofing. This is not a surprise - it was a known and accepted fact when we implemented it. See https://wiki.mozilla.org/IDN_Display_Algorithm#Downsides . Gerv

Daniel Veditz [:dveditz]

Comment 17

•

8 years ago

(In reply to Marco Bonardo [:mak] from comment #13)
> Fwiw, could we also evaluate a UI fix where the domain keeps working but for
> these specific chars we don't actually decode them?

What do you mean "keeps working"? No one is suggesting refusing to connect to a web site, we're talking about whether we display the IDN form as our current algorithms say (the example has no mixed scripts) or whether we display the uglified punycode form.

Chrome's fix is to collect all the Cyrillic letters in a label and then see if they are all in the set of 22 confusables. If they are and the TLD is ASCII then they show punycode. If they find a Cyrillic letter outside that set then they let the normal IDN algorithm make the decision about allowed script mixing.

аррӏе.com would be punycode
аррӏе.ru would be punycode
аррӏе.рф would be IDN

Advantages: Protects users from registries not doing their job; protects against sub-domain label spoofing where the registry has no say in any case.

Disadvantages: will uglify at least 2800 .com domains. Do we know how many are legit vs spoofing demonstrations like аррӏе.com? More concerning are the unknown number in other ASCII TLDs like .ru, .ua, etc. Given 22 letters to play with I would imagine a large number of legit Russian words fit in that set. It looks like some of those registries may only allow ASCII domains on the ASCII TLD and restrict the use of Cyrillic to their Cyrillic TLD (don't hold me to it--was skimming). On the other hand the .eu registry definitely accepts Cyrillic (Bulgaria is a member) so that could be a problem.

(In reply to Gervase Markham [:gerv] from comment #12)
> It's a known and accepted issue that our current system does not suppress
> whole-script homographs in TLDs

Should we unhide this bug, then?

> That was considered an acceptable tradeoff in order to hit the following goals:
>
> * Not have first and second-class scripts in IDN, but have everything that
> is supported at all be first class
> * Make it so that if an IDN works in one Firefox, it works in them all
> (certainty for site operators)

Adopting Chrome's fix nails the second but fails on the first. We'd have to take the extra step of disallowing ASCII confusables if the TLD is Cyrillic, but that's only legalistically fair and less so if you take the history of .com dominance into account.
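For readers who want to see the heuristic from comment 17 concretely, here is a minimal Python sketch of the rule as described there: gather the Cyrillic letters in each label, check whether all of them are Latin lookalikes, and force punycode only when the TLD is ASCII. The names `LATIN_LOOKALIKE_CYRILLIC`, `is_cyrillic`, `label_is_all_lookalikes` and `show_punycode` are hypothetical, and the lookalike set is a small illustrative subset, not Chromium's actual list of 22 letters.

```python
# Illustrative subset of Cyrillic letters that resemble Latin ones.
# ASSUMPTION: this is NOT Chromium's actual list of 22 confusables.
LATIN_LOOKALIKE_CYRILLIC = frozenset(
    "\u0430\u0441\u0435\u043e\u0440\u0445\u0443"  # a, c, e, o, p, x, y lookalikes
    "\u0455\u0456\u0458\u04cf"                    # s, i, j, l lookalikes
)


def is_cyrillic(ch: str) -> bool:
    """Simplification: only the basic Cyrillic block U+0400..U+04FF."""
    return "\u0400" <= ch <= "\u04ff"


def label_is_all_lookalikes(label: str) -> bool:
    """True if the label has Cyrillic letters and every one is a Latin lookalike."""
    cyrillic = [ch for ch in label if is_cyrillic(ch)]
    return bool(cyrillic) and all(ch in LATIN_LOOKALIKE_CYRILLIC for ch in cyrillic)


def show_punycode(hostname: str) -> bool:
    """Return True when this sketch of the heuristic would force punycode display."""
    *labels, tld = hostname.split(".")
    if not tld.isascii():
        # Non-ASCII TLD: leave the decision to the normal IDN algorithm.
        return False
    return any(label_is_all_lookalikes(label) for label in labels)
```

With the three hostnames listed in the comment, this sketch returns True for the .com and .ru cases and False for the .рф case. A label containing any Cyrillic letter outside the set (for example a word with я or б) is not flagged and would fall through to the normal script-mixing rules.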

Flags: needinfo?(dveditz)

•

8 years ago

Group: firefox-core-security

Keywords: sec-moderate → sec-low

u534134

Comment 26

•

8 years ago

As the Wordfence article says: why not fix this by setting the parameter network.IDN_show_punycode to true?

u534134

Comment 27

•

8 years ago

•

Comment 67

•

8 years ago

(In reply to Marco from comment #63)
> I AM unable to understand how the domain https://xn--e1awd7f.com/ can be
> showed as the same as https://www.apple.com
>
> IF I paste into the browser the chatacter https://xn--e1awd7f.com/ this
> should be not converted to apple.com who is completly different address...
>
> maybe it's me who I AM ignorant on the subject but... cannot understand...
> but I understand that if this is allowed that all bank and other important
> website can be copied and with a different address look the same address as
> unicredit or BARCLAYS, mozzilla, etc.

Marco, I'm starting to get the feeling that perhaps you are confused about what the intended purpose of this feature actually is. You may wish to read up on Internationalized Domain Names, and why they were implemented in the first place.

https://en.wikipedia.org/wiki/Internationalized_domain_name

Essentially, without this feature many characters from non-English languages cannot be used in domain names at all. (E.g. A person whose native language is Mandarin would not be able to create a domain name which uses Chinese characters.)

In your example, the characters you see on screen, а, р, р, ӏ, е are not the same as the letters which make up the real apple.com's domain. They are instead characters from a different script which happen to resemble the Latin characters which make up Apple's name. That's why the solution to this issue is a bit more complicated than "let's just make xn--e1awd7f display as xn--e1awd7f".

That's why, while I completely agree with you that IDNs should not be implemented in a way which allows two different domains to be displayed exactly the same way in the URL bar, others are reacting very negatively to your suggestion to turn off the feature entirely.
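To make the point about "characters from a different script" tangible, the following Python sketch decodes an `xn--` label with the standard library's `punycode` codec and names each code point. The helper names `decode_ace_label` and `describe_ace_label` are hypothetical. It handles a single label only and skips the full IDNA processing a browser performs.

```python
import unicodedata


def decode_ace_label(ace_label: str) -> str:
    """Decode a single 'xn--' label to its Unicode form (no IDNA validation)."""
    if not ace_label.lower().startswith("xn--"):
        raise ValueError("not an ACE (punycode) label")
    return ace_label[4:].encode("ascii").decode("punycode")


def describe_ace_label(ace_label: str) -> list:
    """List each decoded character as 'U+XXXX UNICODE NAME'."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch)}"
        for ch in decode_ace_label(ace_label)
    ]


# The label cited in comment 5 decodes to five Cyrillic letters:
# U+0430, U+0440, U+0440, U+04CF, U+0435 (the apple lookalike).
apple_lookalike = decode_ace_label("xn--80ak6aa92e")

# The label quoted in the comment above decodes to four Cyrillic letters:
# U+0435, U+0440, U+0456, U+0441.
other_lookalike = decode_ace_label("xn--e1awd7f")
```

Calling `describe_ace_label("xn--80ak6aa92e")` returns only names beginning with `CYRILLIC`, which is why the decoded string compares unequal to the ASCII string "apple" even though the two render almost identically.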

Jonathan Kingston [:jkt] he/him

Comment 68

•

8 years ago

> That's why, while I completely agree with you that IDNs should not be implemented in a way which allows two different domains to be displayed exactly the same way in the URL bar, others are reacting very negatively to your suggestion to turn off the feature entirely.

Exactly.

> Actually, a comment from the Mozilla Foundation and/or some Firefox developers could be very useful.

dveditz, Gijs, annevk and myself are all current developers for Firefox to name a few. Gerv represents policy and was a developer for Firefox and as I understand it was the decider of our IDN policy in the first place.

> I disagree. While I do agree that domain registrars can and should be involved in addressing this, that cannot and should not be the main solution to this problem for several reasons.

As mentioned elsewhere, suggestions on how to fix this without becoming a global registrar would be welcome. So far I don't think there are any perfect solutions.

As Firefox represents one of the most translated browsers, I think it would do our users and contributors a disservice to treat any language as better. To me this significantly limits solutions to this down to:

1. Lobbying registrars/ICANN/similar to do the right thing
2. Working on tightening restrictions when it's clear it doesn't impact real domains
3. Using tools like safebrowsing to create a blocklist of entries that clearly shouldn't be accessed

The reason 2. is hard is because there are valid instances of brand names or dialects that could confuse users. These probably should be valid in the language of origin; however, in another they wouldn't be.

The reason registrars are the right solution here is exactly that they have the systems and processes in place to prevent multiple registrations of similar domains. For example, .uk domains are restricted for sale at present so that .co.uk owners can purchase their own domain. They also have the ability to rapidly check for all confusable matches at time of registration; there isn't a chance a user agent could do that before presenting a page to a user. Registrars could also reserve all confusable variants of a domain to the domain owner automatically.

> Are you going to maintain a whitelist of TLDs allowed to create IDN domains, and remove TLDs from that list if they consistently mess up? In previous comments it was suggested that you'd already tried that and determined it to not be a good idea.

It's unmaintainable. I worked for a domain registration company that was struggling to manage the pace of even deploying new extensions to search. At the last count I did there were 1500 extensions permitted by ICANN, each having differing restrictions of policy for IDN blocking.

u580221

Comment 69

•

8 years ago

What's wrong with the previous suggestion already made, which is that the raw punycode should be shown if:

1. The domain name uses punycode and
2. All characters are either ASCII or ASCII look-alike punycode and
3. The domain extension has no punycode in it

I think in those cases it should be pretty obvious something fishy might be going on, no? Even if someone finds a legitimate use case for something caught by those rules, everything that applies to those three things is at high risk of being a phishing attempt, isn't it?
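As a rough illustration of the three rules proposed in comment 69, here is a hedged Python sketch that works on the ACE (`xn--`) form of a hostname. The function name `should_show_raw_punycode` and the `ASCII_LOOKALIKES` set are hypothetical, and the set is a tiny illustrative sample rather than a real confusables table such as the one in Unicode TR-39.

```python
# ASSUMPTION: a tiny sample of non-ASCII characters that resemble ASCII letters.
# A real implementation would need a full confusables table.
ASCII_LOOKALIKES = frozenset(
    "\u0430\u0441\u0435\u043e\u0440\u0445\u0443\u0456\u04cf"
)


def should_show_raw_punycode(ace_hostname: str) -> bool:
    """Apply the three proposed rules to a hostname given in ACE form."""
    *labels, tld = ace_hostname.lower().split(".")

    # Rule 3: the domain extension itself must not be punycode.
    if tld.startswith("xn--"):
        return False

    # Rule 1: at least one label must use punycode.
    puny_labels = [label for label in labels if label.startswith("xn--")]
    if not puny_labels:
        return False

    # Rule 2: every decoded character is ASCII or an ASCII lookalike.
    for label in puny_labels:
        decoded = label[4:].encode("ascii").decode("punycode")
        if not all(ch.isascii() or ch in ASCII_LOOKALIKES for ch in decoded):
            return False
    return True
```

Under these assumptions `should_show_raw_punycode("www.xn--80ak6aa92e.com")` is True, while a plain ASCII hostname or one under a punycode TLD such as `xn--p1ai` is left alone. The sketch does not address the objection raised earlier in the thread that legitimate words can consist entirely of lookalike letters.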

Robert Kaiser

Comment 70

•

8 years ago

Do we have the capability to detect on the browser side if a non-Latin-script domain name is a complete homograph of a Latin-script one? If so, would it be feasible to just not allow IDN for those? I know, my thinking here may be naive and we may not have the tools or it could get hairy (do we know the edge cases of this?) in some situations, but I believe it may be worth a thought.

Arthur Edelstein [:arthur]

Comment 71

•

•

8 years ago

We now have an FAQ which makes our position clear: https://wiki.mozilla.org/IDN_Display_Algorithm_FAQ You may not agree with it, but it's our considered position, so please do not comment further here unless you have new information to add which you genuinely believe has not been considered. Gerv

Flags: needinfo?(gerv)

Vittorio Bertola

Comment 79

•

8 years ago

Sorry, you caught me while writing, I just want to point this out to you:

(In reply to Jonathan Kingston [:jkt] from comment #68)
> > Actually, a comment from the Mozilla Foundation and/or some Firefox developers could be very useful.
>
> dveditz, Gijs, annevk and myself are all current developers for Firefox to
> name a few. Gerv represents policy and was a developer for Firefox and as I
> understand it was the decider of our IDN policy in the first place.

Yes I know, what I meant is that it would be useful if you submitted a comment to ICANN's IDN guidelines consultation stating your problems with them allowing registries to sell whole-script confusables (the new guidelines draft now says that registries "may" apply Unicode TR-39 and block these registrations, but it's still at their discretion - it should say "must").

u534134

Comment 81

•

8 years ago

Sad to see that the final decision seems to be WONTFIX. It's incredible to think that a website can be shown in the browser under the same name as another, with a valid certificate and no warning. So business seems to come first and security second.

I understand this is a complex case, but I do not agree when I read on the Internet that it will remain possible in the future to use this kind of hole and browser vulnerability to work out how to create new fake domains that are also shown with a valid certificate (as far as the browser is concerned). You said that no big phishing attack or scam has happened so far. Well, it seems we need to wait for a big scam or hack on the web before issues get fixed.

However, I am looking at this security issue as a web user who knows that the golden rule for staying safe is to check the browser address. Today, articles on the web are starting to discuss and demonstrate that you can no longer trust the browser bar, because it is possible to have two completely different websites whose addresses look the same. No warning, valid certificate, same web address: it is unbelievable to me.

I love the Firefox browser but today I am very disappointed. Security in this case seems to be unimportant, or at least not the first priority. I think web addresses should not be repeatable. To me it is like saying that, for usability, one home address should be allowed to be the same as another, so that you can have different addresses that look the same. When I discovered this security issue I was surprised to see the browser show a completely unrelated address as identical to another, with the difference that you are redirected to another server.

OK, I can stop following this bug and consider that this will not be fixed in Firefox. Now I will monitor Chrome and other browsers. As far as I am concerned, and probably not only me, Firefox now has a vulnerability. I believe it is right to wait for a fix, and the articles I read say to wait for one, but the fix never seems to come.

It's incredible that the certificate is still shown as valid in Chrome and in Firefox, so even an https website may not be secure: it can be cloned under a different address. This is a very rare case now, but it can grow in the future, since the news of this vulnerability, left as it is, will end up in the wrong hands.

I am very disappointed today to read about the WONTFIX or just blacklisting. This is not a fix, but it is your decision. I will keep searching and looking into security on the web and secure browsers. Sad to see Firefox has decided not to touch anything here. End of my participation here.

My1

•

8 years ago

(In reply to Gervase Markham [:gerv] from comment #78)
> We now have an FAQ which makes our position clear:
> https://wiki.mozilla.org/IDN_Display_Algorithm_FAQ
>
> You may not agree with it, but it's our considered position, so please do
> not comment further here unless you have new information to add which you
> genuinely believe has not been considered.
>
> Gerv

There are a few things which don't appear to be addressed by that page. In particular, it seems to dismiss a lot of partial solutions by arguing "this won't solve the problem for everybody in all cases, so we're just going to ignore it", whereas I think the correct attitude should be "this _will_ solve the problem for a large number of users (even if not everyone), therefore we should implement it". For example:

> # OK. Why doesn't Firefox decide based on the script associated with the browser's UI language?
>
> Because many people use browsers with a UI language different to the ones they speak,

Yes, but there are also many users who use a UI language which _exactly_ matches the language they speak. This would solve the problem for those users.

> or that is only one of the ones they speak.

So let users set multiple languages like Edge does.

> And that's before you've accounted for shared computers and internet cafes, with multiple people of differing capabilities using the same computer.

Again, just because this doesn't solve the problem for _everyone_ doesn't mean it's not a useful solution. Solving the problem for _some_ users is better than not solving the problem at all.

> Also, this would make using IDN domain names a dodgy proposition for any organization, because they can never know which of their customers will see them correctly and which won't.

Essentially all customers who speak the language the domain is written in (i.e. >99% of the site's target demographic) would see the domain rendered correctly. (And those who don't could permanently solve that issue on their computer with just a few clicks.) Those who don't speak the language the domain is written in likely wouldn't be able to enter the domain name correctly on their keyboard layout in the first place (which is a far bigger usability issue than them seeing punycode is).

> Lastly, this fix wouldn't actually solve the problem for everyone. http://apple.com and http://аррІе.com/ look the same even to people who read Cyrillic.

Again, that's not an argument against implementing this.
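The proposal in this comment, showing the Unicode form only when a label's scripts are among those the user has configured, could look roughly like the Python sketch below. The helpers `scripts_in` and `display_unicode` are hypothetical. Taking the first word of the Unicode character name is a crude stand-in for the real Script property, which Python's standard library does not expose. How a browser would map a user's languages to scripts is left out entirely.

```python
import unicodedata


def scripts_in(label: str) -> set:
    """Approximate the set of scripts used by the letters of a label.

    ASSUMPTION: the first word of the Unicode character name ('LATIN',
    'CYRILLIC', ...) is used as a rough proxy for the Script property.
    """
    scripts = set()
    for ch in label:
        if not ch.isalpha():
            continue  # digits and hyphens are treated as script-neutral
        scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def display_unicode(label: str, user_scripts) -> bool:
    """Show the Unicode form only if every script in the label is one the user reads."""
    return scripts_in(label) <= set(user_scripts)
```

With `user_scripts={"LATIN"}` the all-Cyrillic lookalike label would be shown as punycode, while with `{"LATIN", "CYRILLIC"}` it would be shown in Unicode. The second case is the one the quoted FAQ objects to, since the lookalike remains indistinguishable for readers of Cyrillic.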

Marco Bonardo [:mak]

Comment 87

•

•

7 years ago

It's possible someone was working on an official position and Gerv's FAQs have been commented out and then removed (still available through wiki history). Chris did the first commenting out, he may have more information about it.

Flags: needinfo?(criley)

Chris Riley [:mchris, :criley]

Comment 109

•

•

5 years ago

Updated

•

4 years ago

Updated

•

2 years ago

Severity: normal → S3

BugBot [:suhaib / :marco/ :calixte]

Comment 119

•

2 years ago

The severity field for this bug is relatively low, S3. However, the bug has 23 duplicates, 20 votes, 86 CCs and 7 See Also bugs.
:adw, could you consider increasing the bug severity?

For more information, please visit auto_nag documentation.

Flags: needinfo?(adw)

BugBot (nomail) [:suhaib / :marco/ :calixte]

Comment 120

•

•

6 months ago

Depends on: 1913158

Mathew Hodson

Updated

•

6 months ago

No longer blocks: 1519691

Mathew Hodson

Updated

•

6 months ago

status-firefox129: unaffected → affected

status-firefox-esr115: unaffected → affected

status-firefox-esr128: unaffected → affected

Keywords: regression

Sean Kim

Updated

•

6 months ago

Depends on: 1916799

Daniel Veditz [:dveditz]

Updated

•

4 months ago