Closed Bug 354592 Opened 18 years ago Closed 9 years ago

Handling of U+2571 and U+FF1A in IDNs allows URL spoofing

Status:

RESOLVED WORKSFORME

People

(Reporter: bjoern, Assigned: usenet)

Details

(Keywords: sec-low, Whiteboard: [sg:low spoof])

Attachments

(3 files, 2 obsolete files)

Screenshot FF 1.5 on Debian 18 years ago Bjoern Hoehrmann 11.52 KB, image/png
Test cases for the above 18 years ago Neil Harris 244 bytes, text/html
EXPERIMENTAL patch: work-in-progress for script-block whitelisting 18 years ago Neil Harris 15.43 KB, patch
More polished patch to nsIDNService.cpp; not yet smoketested, but works 18 years ago Neil Harris 12.96 KB, patch
Extended set of testcases for the above... 18 years ago Neil Harris 472 bytes, text/html

Bjoern Hoehrmann

Reporter

Description

•

18 years ago

User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2)
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7

In case of

<a href="http://www.bank.com╱.example.com/">...</a>
<a href="http://www.bank.com：8888╱.example.com/">...</a>

Firefox allows users to click the link and displays essentially the following strings in the status and address bar (if it was able to make the request):

http://www.bank.com/.example.com/
http://www.bank.com:8888/.example.com/

On Windows and MacOSX the requests both go to *.example.com, and in case of the latter the request will be malformed as it includes

Host: [www.bank.xn--com:8888-br3e.example.com]

While Apache rejects such a request, it's not difficult to work around that. I was unable to reproduce the problem on Linux. There are other characters that look like "/" and ":" though only some of them are displayed literally. Opera9 and IE7 consider both resource identifiers malformed and do not attempt to traverse them.

Reproducible: Always
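For reference, the two characters in the report can be inspected with a short Python sketch. It only uses the standard `unicodedata` module; the helper name `describe` is made up for this illustration. It shows that NFKC normalization (the normalization form Nameprep applies) folds U+FF1A into an ASCII colon, while U+2571 is left unchanged.

```python
import unicodedata

SLASH_LOOKALIKE = "\u2571"  # looks like "/" in many fonts
COLON_LOOKALIKE = "\uff1a"  # looks like ":" in many fonts


def describe(ch):
    """Return (code point, Unicode name, NFKC-normalized form) for one character."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.normalize("NFKC", ch),
    )


slash_info = describe(SLASH_LOOKALIKE)
colon_info = describe(COLON_LOOKALIKE)

# The fullwidth colon collapses to ASCII ":"; the box-drawing diagonal does not change.
print(slash_info)
print(colon_info)
```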

Bjoern Hoehrmann

Reporter

Comment 1

•

18 years ago

Reading http://www.mozilla.org/projects/security/tld-idn-policy-list.html I should probably add that I used a .de domain for testing, not example.com.

Bjoern Hoehrmann

Reporter

Comment 2

•

18 years ago

Attached image Screenshot FF 1.5 on Debian

Using only U+2571 in the URL seems to work fine now on Linux.

Daniel Veditz [:dveditz]

Comment 3

•

18 years ago

Sounds like we need to add these two to our blacklist:
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/modules/libpref/src/init/all.js&rev=FIREFOX_1_5_0_7_RELEASE#621

I'm a bit confused about the fullwidth colon though (\uFF1A). RFC 3490 section 3.1.1 says we must treat both fullwidth and halfwidth ideographic full stops as label separators, yet both \uFF0E and \uFF61 are in our blacklist. nsIDNService::normalizeFullStops() converts them before the blacklist is applied so I'm not sure why they're needed in the blacklist.

The IDNA spec doesn't mention accepting fullwidth colon as a port delimiter, but it would be somewhat consistent to do so.

\u2571 is a no-brainer to add. \u2573 sorta looks like an 'X', and there are various "plus"-looking things.
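The ordering point about the full stops can be sketched in Python. This is an illustration only, not the nsIDNService code: `normalize_full_stops`, `blacklisted_chars` and the two-entry `BLACKLIST` are hypothetical stand-ins, and the only fact taken from RFC 3490 is the set of code points that must be recognized as dots. If the dots are converted first, blacklist entries for \uFF0E and \uFF61 can never match.

```python
# Code points that RFC 3490 requires to be recognized as label separators,
# in addition to U+002E itself.
IDNA_FULL_STOPS = "\u3002\uff0e\uff61"
_DOT_TABLE = {ord(ch): "." for ch in IDNA_FULL_STOPS}

# Hypothetical two-entry blacklist, standing in for the real preference value.
BLACKLIST = {"\uff0e", "\uff61"}


def normalize_full_stops(host):
    """Map every IDNA full stop variant to an ASCII dot."""
    return host.translate(_DOT_TABLE)


def blacklisted_chars(host):
    """Return the blacklisted characters found in host, in order."""
    return [ch for ch in host if ch in BLACKLIST]


raw = "www\uff0eexample\uff61com"
normalized = normalize_full_stops(raw)

print(blacklisted_chars(raw))         # both dot variants are flagged
print(blacklisted_chars(normalized))  # nothing left to flag after normalization
```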

Status: UNCONFIRMED → NEW

Ever confirmed: true

Flags: blocking1.8.1.1?

Flags: blocking1.8.0.8?

Whiteboard: [sg:low spoof]

Daniel Veditz [:dveditz]

Comment 4

•

18 years ago

Does the colon get converted in nameprep? I would have expected colon to be banned by net_isValidHostName since I think we only call that after we've parsed and removed the port part, but we allow colon. It looks like the whole box-drawing section isn't supposed to be allowed as output (http://www.unicode.org/reports/tr39/#IDN_Security_Profiles). As long as we do allow them, though, \u2571 would be good to have in the blacklist as an interim band-aid.

Assignee: nobody → usenet

Bjoern Hoehrmann

Reporter

Comment 5

•

18 years ago

(In reply to comment #3)
> The IDNA spec doesn't mention accepting fullwidth colon as a port delimiter,
> but it would be somewhat consistent to do so.

Port delimiters are not part of the domain name, they could only be part of the resource identifier; since the URL is parsed for the domain name first, any colon or colon-lookalike character cannot delimit the domain name from the port. I initially included this character precisely to test that Mozilla does not handle it this way.

Gervase Markham [:gerv]

Comment 6

•

18 years ago

CCing Neil Harris, who works on our IDN implementation. Gerv

Daniel Veditz [:dveditz]

Updated

•

18 years ago

Flags: blocking1.8.0.8? → blocking1.8.0.9?

Neil Harris

Assignee

Comment 7

•

18 years ago

That's interesting: as the poster says, this doesn't appear to work on Linux; something peculiar is happening here, since this should work in exactly the same way on all operating systems. I've got some code lying in an earlier IDN bug which never got merged, which might be useful for stopping this.

Neil Harris

Assignee

Comment 8

•

18 years ago

I think the best way to handle this in the short term is to add an extra check to isOnlySafeChars() that blacklists all characters that do not belong either to a script system, or to a very limited set of non-script characters. This will also have the effect of enforcing part of the ICANN rules for labels. I've got some code lying around that might just do the trick.
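A rough Python sketch of the whitelisting idea follows. It is not the patch attached to this bug: it approximates "belongs to a script system" with Unicode general categories (letters, marks, decimal digits), and the names `ALLOWED_CATEGORIES`, `ALLOWED_NON_SCRIPT` and `label_is_safe` are assumptions made for this example. A real implementation would more likely work from script or block data.

```python
import unicodedata

# Assumption: letters, combining marks and decimal digits count as "script" characters.
ALLOWED_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Mn", "Mc", "Nd"}
# Assumption: the "very limited set of non-script characters" is just the hyphen.
ALLOWED_NON_SCRIPT = {"-"}


def char_is_allowed(ch):
    """True if ch is a script character or an explicitly allowed non-script one."""
    return ch in ALLOWED_NON_SCRIPT or unicodedata.category(ch) in ALLOWED_CATEGORIES


def label_is_safe(label):
    """True if the label is non-empty and every character passes the whitelist."""
    return bool(label) and all(char_is_allowed(ch) for ch in label)


print(label_is_safe("com"))              # plain ASCII label
print(label_is_safe("b\u00fccher"))      # ordinary IDN label
print(label_is_safe("com\u2571"))        # U+2571 is a symbol (category So)
print(label_is_safe("com\uff1a8888"))    # U+FF1A is punctuation (category Po)
```

Unlike a blacklist, this rejects the whole box-drawing range and other symbol characters without naming any of them individually.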

Bjoern Hoehrmann

Reporter

Comment 9

•

18 years ago

(In reply to comment #7)
> That's interesting: as the poster says, this doesn't appear to work on Linux;
> something peculiar is happening here, since this should work in exactly the
> same way on all operating systems.

Since ':' cannot occur in a domain name, it is likely that the DNS client code on Linux simply rejects any hostname containing it; on Windows this is not the case (compare, for example, `ping a:b.example.org` on both systems, where example.org needs to have a wildcard record). The first example should work on all systems.
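The kind of strict check a resolver might apply can be sketched as a letters-digits-hyphen (LDH) test in Python. This is only an illustration of the rule, under the assumption that a rejecting resolver behaves roughly like this; it is not taken from any actual DNS client, and `is_ldh_hostname` is a made-up name.

```python
import re

# One label: 1 to 63 letters, digits or hyphens, not starting or ending with a hyphen.
_LDH_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_ldh_hostname(host):
    """True if every dot-separated label of host follows the LDH rule."""
    if host.endswith("."):
        host = host[:-1]  # tolerate a single trailing root dot
    labels = host.split(".")
    return all(_LDH_LABEL.fullmatch(label) is not None for label in labels)


print(is_ldh_hostname("www.example.org"))
print(is_ldh_hostname("a:b.example.org"))                         # colon is not LDH
print(is_ldh_hostname("www.bank.xn--com:8888-br3e.example.com"))  # second test case
```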

Neil Harris

Assignee

Comment 10

•

18 years ago

I think it's time for me to push forward the code I wrote to address bug 316727, which should fix this, as well as many other issues. I've got a patch already made, but it's untested: I'm currently generating sets of test cases for it, to try on 2.0rc1+patch. More soon.

*** This bug has been marked as a duplicate of 316727 ***

Status: NEW → RESOLVED

Closed: 18 years ago

Resolution: --- → DUPLICATE

Neil Harris

Assignee

Comment 11

•

18 years ago

Reopening: the fix for bug 316727 has more issues to be tested, and I have an experimental patch almost ready for this simpler bug now: this will also test the waters for the full fix of bug 316727.

Status: RESOLVED → REOPENED

Resolution: DUPLICATE → ---

Neil Harris

Assignee

Comment 12

•

18 years ago

Attached file Test cases for the above (obsolete)

These links test the two test cases given by the submitter.

Neil Harris

Assignee

Comment 13

•

18 years ago

OK, test case 1 is now caught by my experimental patch, which also blocks a huge number of other characters by adopting a whitelisting-by-Unicode-blocks approach, in addition to the existing very specific blacklist. However, the behaviour in test case 2 is more involved: I think there's a possibility of an interaction between the Unicode normalization of the fullwidth colon and the IPv6 code... I'll try to take a look at the on-the-wire behaviour across multiple operating systems tomorrow.

Neil Harris

Assignee

Comment 14

•

18 years ago

Attached patch EXPERIMENTAL patch: work-in-progress for script-block whitelisting (obsolete)

This is the experimental code so far, just for reference. Note: this is completely untested, and subject to rapid change.

Neil Harris

Assignee

Comment 15

•

18 years ago

Attached patch More polished patch to nsIDNService.cpp; not yet smoketested, but works

This patch defangs both of the examples given in this bug on my Linux build, without specifically needing to reference any particular character. The first becomes

http://www.bank.xn--com-544a.example.com/

and the second becomes

http://www.bank.xn--com:8888-br3e.example.com/

ASCII domain names and normal mixed-script IDNs still appear to work OK: my set of broken IDNs with bad characters is consistently caught by this, too, and it doesn't crash with any of the tests. NB: this patch has not been fully smoketested yet.
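The two defanged labels can be reproduced with Python's built-in `punycode` codec. This is a sketch of the encoding step only: `to_ace_label` is a hypothetical helper, NFKC normalization stands in for the full Nameprep profile, and nothing here reflects how the patch decides that a label must be shown in its encoded form.

```python
import unicodedata


def to_ace_label(label):
    """Normalize a label with NFKC and return its ASCII-compatible encoding."""
    folded = unicodedata.normalize("NFKC", label)
    if folded.isascii():
        return folded
    return "xn--" + folded.encode("punycode").decode("ascii")


first = to_ace_label("com\u2571")             # "com" followed by U+2571
second = to_ace_label("com\uff1a8888\u2571")  # fullwidth colon folds to ":" first

print(first)
print(second)
```

The ASCII colon ends up in the literal part of the second label because NFKC has already turned U+FF1A into ":" before the Punycode step runs.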

Attachment #241008 - Attachment is obsolete: true

Neil Harris

Assignee

Comment 16

•

18 years ago

The second testcase is an example of "ASCII-smuggling" through Unicode normalization in the IDN Nameprep processing (see bug 316444). However, the comment at the end of 316444 seems to be contradicted by the second test case here: see the issues regarding URL round-tripping at the end of this comment.

I'm working on some code in bug 355181 that should shut off the possibility of using the ':' character in IDNs, by discriminating between the allowed character sets for RFC 1035 DNS names and dotted quads, and that for RFC 2732 IPv6 literals.

However, this example also raises an interesting round-tripping issue, which will probably become a new bug: the ASCII-smuggling behaviour of the second example allows an address with a colon in it to appear in the location bar, but it does not get looked up, so is OK. However, reparsing the very same text that is displayed in the location bar will truncate the new hostname at the colon, and thus end up looking up a quite different domain name.

This is a problem of

1. relying on Punycoding for obfuscation, when it was not originally intended for that purpose
2. URL display and URL parsing sometimes not being round-trippable
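The round-tripping problem can be illustrated with Python's `urllib.parse.urlsplit`, used here purely as a stand-in for a URL parser; it is an assumption that the browser's own parser splits the authority in a comparable way. Reparsing the displayed text yields a host that stops at the colon.

```python
from urllib.parse import urlsplit

# The host the browser holds internally after Nameprep and Punycode (second test case).
internal_host = "www.bank.xn--com:8888-br3e.example.com"

# The text shown in the location bar, as it would be if copied and reparsed.
displayed_url = "http://" + internal_host + "/"

# A generic parser treats the first ":" in the authority as the port separator,
# so the reparsed host is cut off at the colon.
reparsed_host = urlsplit(displayed_url).hostname

print(internal_host)
print(reparsed_host)
```

The text after the colon is not a valid port number, so a strict parser might instead reject the URL outright; either way the displayed string does not round-trip to the original host.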

Neil Harris

Assignee

Comment 17

•

18 years ago

Attached file Extended set of testcases for the above...

Now with duplicate examples that use the fake ".idntest" TLD, which I flag as IDN-compatible in my local installation.

Attachment #241003 - Attachment is obsolete: true

Daniel Veditz [:dveditz]

Comment 18

•

18 years ago

Time is a little tight and this hasn't been tested on the trunk yet so I'm a little worried about it... moving nomination request to next release for now, can always request approval on trunk-landed patches if that happens in time. Is this patch ready for reviews now?

Flags: blocking1.8.1.2?

Flags: blocking1.8.1.1?

Flags: blocking1.8.0.9?

Flags: blocking1.8.0.10?

Daniel Veditz [:dveditz]

Updated

•

18 years ago

Flags: wanted1.8.1.x+

Flags: wanted1.8.0.x+

Flags: blocking1.8.1.2?

Flags: blocking1.8.0.10?

Daniel Veditz [:dveditz]

Updated

•

18 years ago

Flags: wanted1.8.1.x+

Flags: wanted1.8.0.x+

Flags: blocking1.8.1.2+

Flags: blocking1.8.0.10+

Jay Patel [:jay]

Updated

•

18 years ago

Flags: wanted1.8.1.x+

Flags: wanted1.8.0.x+

Flags: blocking1.8.1.2+

Flags: blocking1.8.0.10+

Robert Sayre

Updated

•

16 years ago

Flags: wanted1.9.1+

Gervase Markham [:gerv]

Comment 19

•

16 years ago

Neil: are you still working on this? Gerv

Neil Harris

Assignee

Comment 20

•

16 years ago

This bug probably needs to be updated to "critical" or "blocker"; given the recent very public reports of experimental exploitation of this spoofing technique, we are almost certain to see it in the wild very soon. http://www.theregister.co.uk/2009/02/19/ssl_busting_demo/ (see page 2 of the report for the use of homographs in the attack) I'm impressed by the use of a wildcard something.cn certificate: that's clever. Unfortunately, I haven't got the time or resources to test my patch properly at the moment, but I believe the patch is reasonably OK, if someone else wants to QA it.

Neil Harris

Assignee

Comment 21

•

16 years ago

I've also got an experimental patch filed for bug 316727 that enforces even more paranoid checking, preventing not only the use of unassigned characters, but also the mixing of scripts except in certain explicitly allowed combinations, as per ICANN guidance on IDNs. In the long run, that patch is probably superior to this one, but in the short run, it would need more QA, and would be riskier to apply.

Gervase Markham [:gerv]

Comment 22

•

16 years ago

We definitely need a lot more discussion before banning script mixing. Let's just make sure the current character blacklist is solid. Gerv

Daniel Veditz [:dveditz]

Comment 23

•

16 years ago

See also bug 479336 for a quick-n-dirty blacklist update, and bug 479520 about looking into the new proposed IDNA2008 standards.

Daniel Veditz [:dveditz]

Updated

•

16 years ago

Group: core-security

Curtis Koenig [:curtisk-use curtis.koenig+bzATgmail.com]]

Updated

•

13 years ago

Keywords: sec-low

Patrick McManus [:mcmanus]

Comment 24

•

9 years ago

c22/23

Status: REOPENED → RESOLVED

Closed: 18 years ago → 9 years ago

Resolution: --- → WORKSFORME
