Closed Bug 1694025 Opened 4 years ago Closed 4 years ago

Unicode IDNA normalization of URLs can lead to bypasses of naive string comparisons for blocked domains in website software

Categories

(Firefox :: Untriaged, defect)

Firefox 85
defect

Tracking

()

RESOLVED INVALID

People

(Reporter: z3r0.php, Unassigned)

Details

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:85.0) Gecko/20100101 Firefox/85.0

Steps to reproduce:

  1. Load the HTML code provided below in the browser.
  2. Observe that the hyperlink resolves to github.com and that the iframe also fetches its resource from github.com.
  3. Check the page source: the URL written there is not literally github.com.

What is wrong here:
Suppose I have a website on which I've blacklisted some domains, e.g. github.com.
Now suppose an attacker submits the URL 'https://Gℹthub.com'. My website's backend/frontend checks the domain name and rejects the URL if it is github.com. Let's assume the logic looks like this:
if $submitted_domain == "https://Github.com" {
    throw();
}
else {
    iframe_it();
}
The submitted URL bypasses this check because, as a string, it is not github.com, but when it is iframed on the site the browser actually fetches the resource from github.com. I don't think the developer who wrote the blacklist check is logically wrong; rather, the browser parses this kind of character differently, which could trigger security issues.

Code:
<a href="https://Gℹthub.com"> Github</a>
<iframe src="https://Gℹthub.com"></iframe>

Actual results:

The browser fetches the resource from the wrong domain.

Expected results:

The browser should fetch the resource from the right domain.

This is per spec, I think, so I don't think this is a valid bug report. Specifically, the latest IDNA tables have:

2139          ; mapped                 ; 0069          # 3.0  INFORMATION SOURCE

where 2139 is the hex codepoint for ℹ and 0069 is the hex codepoint for i, as per https://url.spec.whatwg.org/#idna . Anne, can you confirm I'm reading this right, and if so, resolve this bug as invalid?

If the hypothetical website developer in comment #0 is writing code that blocks certain domains and has to handle input that hasn't been IDNA-normalized, they would also need to apply that normalization before comparing against the list of blocked domains.
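As a rough illustration of the mapping row quoted above, the following JavaScript sketch builds the reported URL from the code point U+2139 and contrasts a raw string comparison with the host produced by the WHATWG URL parser. The variable names are made up for this example, and it assumes an environment with a spec-compliant `URL` implementation (a current browser or Node.js).

```javascript
// U+2139 INFORMATION SOURCE, the character used in comment #0.
const infoSource = String.fromCodePoint(0x2139);
const submitted = `https://G${infoSource}thub.com`;

// Naive check on the raw string: U+2139 is not "i", so this does not match.
const naiveMatch = submitted.toLowerCase() === "https://github.com";

// The URL parser applies the IDNA mapping (2139 -> 0069) and lowercases the host.
const parsedHost = new URL(submitted).hostname;

console.log(naiveMatch); // false
console.log(parsedHost); // github.com
```

The raw string and the parsed host disagree, which is exactly the gap a block list based on string comparison falls into.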

Flags: needinfo?(annevk)
Summary: Mishandling of unicode → Unicode IDNA normalization of URLs can lead to bypasses of naive string comparisons for blocked domains in website software

Yeah, all a web developer would have to do is pass the domain to new URL() and compare .host or some such.
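A minimal sketch of that suggestion, assuming a hypothetical block list and helper name (`BLOCKED_HOSTS` and `isBlockedUrl` are invented for this example and are not part of any library):

```javascript
// Hypothetical block list; entries are stored lowercase and in ASCII form.
const BLOCKED_HOSTS = new Set(["github.com"]);

// Returns true if the submitted URL should be rejected.
function isBlockedUrl(submittedUrl) {
  let parsed;
  try {
    // Parsing applies the same host normalization the browser uses when loading the URL.
    parsed = new URL(submittedUrl);
  } catch {
    // Input that does not parse as an absolute URL is rejected outright.
    return true;
  }
  // hostname excludes the port, so "github.com:443" is still caught.
  return BLOCKED_HOSTS.has(parsed.hostname);
}

console.log(isBlockedUrl("https://Gℹthub.com"));   // true
console.log(isBlockedUrl("https://example.org/")); // false
```

This is an exact-match check only. Subdomains or a trailing dot in the host (for example `github.com.`) would need extra handling if they should also be blocked.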

Status: UNCONFIRMED → RESOLVED
Closed: 4 years ago
Flags: needinfo?(annevk)
Resolution: --- → INVALID
Group: firefox-core-security

So you're saying the developer has to pass the domain to new URL(). You're probably talking about JavaScript. But what about other languages? Let's look at examples in some popular backend languages.
PHP:
<?php
// Compare the raw host returned by parse_url() with the blocked name.
var_dump(parse_url("https://Gℹthub.com")["host"] == "Github.com");
?>

It returns false, and if you check the host name after parsing the URL you will see that the malicious URL is still there...

Golang:
package main

import (
	"fmt"
	"net/url"
)

func main() {
	s := "https://𝖎cl𝖔ud.com"
	u, err := url.Parse(s)
	if err != nil {
		panic(err)
	}
	// Compare the parsed host; net/url does not apply IDNA normalization.
	if u.Host == "icloud.com" {
		fmt.Println("Matched: reject this domain")
	} else {
		fmt.Println("Did not match: keep this domain")
	}
	fmt.Println(u.Host)
}

The parsers in these backend languages keep the malicious domain. And I think it is the browser's responsibility to only use the domain name what is submitted, not another domain name!

Flags: needinfo?(gijskruitbosch+bugs)
Flags: needinfo?(annevk)

From a very quick and superficial web search, PHP (e.g. https://www.php.net/manual/en/function.idn-to-utf8.php ) and Go ( https://pkg.go.dev/golang.org/x/net/idna ) both have means to normalize URLs as required by the specs.

Even if a particular language didn't, that is not a problem Firefox can solve - both it and all other modern web browsers apply this normalization. "[T]he domain name what is submitted" as you call it, doesn't exist / make sense -- a decent domain registry will not let you register it, and no browser will load it, and it is unlikely specs that are more than a decade old will be changed because some web applications do not handle them well.
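For a backend that receives a bare domain name rather than a full URL, the same normalization can be applied before the comparison. The sketch below assumes a Node.js backend and uses `domainToASCII` from Node's built-in `url` module; the `normalizeDomain` helper is a name invented for this example.

```javascript
const { domainToASCII } = require("url");

// Returns the normalized ASCII form of a domain, or null if it is invalid.
function normalizeDomain(domain) {
  // domainToASCII returns an empty string for domains it cannot process.
  const ascii = domainToASCII(domain);
  return ascii === "" ? null : ascii;
}

// Both examples from this bug, normalized before any block-list comparison.
console.log(normalizeDomain("Gℹthub.com"));
console.log(normalizeDomain("𝖎cl𝖔ud.com"));
```

Comparing the normalized value against a block list of lowercase ASCII names avoids the mismatch between what the application checks and what the browser loads.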

Flags: needinfo?(gijskruitbosch+bugs)
Flags: needinfo?(annevk)