Bug 354592 - Handling of U+2571 and U+FF1A in IDNs allows URL spoofing
Status: RESOLVED WORKSFORME
Whiteboard: [sg:low spoof]
Keywords: sec-low
Product: Core
Classification: Components
Component: Networking
Version: Trunk
Platform: x86 Windows Server 2003
Importance: -- normal with 1 vote
Target Milestone: ---
Assigned To: Neil Harris
Mentors:
Depends on:
Blocks:
Reported: 2006-09-27 17:22 PDT by Bjoern Hoehrmann
Modified: 2016-01-28 11:38 PST
CC List: 14 users
sayrer: wanted1.9.1+
jaymoz: wanted1.8.1.x+
jaymoz: wanted1.8.0.x+


Attachments
Screenshot FF 1.5 on Debian (11.52 KB, image/png)
2006-09-27 20:26 PDT, Bjoern Hoehrmann
Test cases for the above (244 bytes, text/html)
2006-10-02 17:56 PDT, Neil Harris
EXPERIMENTAL patch: work-in-progress for script-block whitelisting (15.43 KB, patch)
2006-10-02 18:45 PDT, Neil Harris
More polished patch to nsIDNService.cpp; not yet smoketested, but works (12.96 KB, patch)
2006-10-03 16:12 PDT, Neil Harris
Extended set of testcases for the above... (472 bytes, text/html)
2006-10-05 12:03 PDT, Neil Harris

Description Bjoern Hoehrmann 2006-09-27 17:22:01 PDT
User-Agent:       Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2)
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7

In case of

  <a href="http://www.bank.com&#x2571;.example.com/">...</a>
  <a href="http://www.bank.com&#xFF1A;8888&#x2571;.example.com/">...</a>

Firefox allows users to click the link and displays essentially the following strings in the status and address bar (if it was able to make the request):

  http://www.bank.com/.example.com/
  http://www.bank.com:8888/.example.com/

On Windows and MacOSX the requests both go to *.example.com, and in case of the latter the request will be malformed as it includes

  Host: [www.bank.xn--com:8888-br3e.example.com]

While Apache rejects such a request, it's not difficult to work around that. I was unable to reproduce the problem on Linux. There are other characters that look like "/" and ":" though only some of them are displayed literally. Opera9 and IE7 consider both resource identifiers malformed and do not attempt to traverse them.
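
The lookalike effect described above can be checked outside the browser. The following is a minimal Python sketch, not Firefox code; it assumes that NFKC compatibility normalization (the normalization form Nameprep uses) is a fair stand-in for what happens to these two characters. The helper name `describe` is made up for the illustration.

```python
import unicodedata

# The two code points used in the report.
SLASH_LOOKALIKE = "\u2571"  # a box-drawing diagonal that resembles "/"
COLON_LOOKALIKE = "\uFF1A"  # the fullwidth form of ":"

def describe(ch):
    """Return (Unicode name, NFKC-normalized form) for one character."""
    return unicodedata.name(ch), unicodedata.normalize("NFKC", ch)

for ch in (SLASH_LOOKALIKE, COLON_LOOKALIKE):
    name, folded = describe(ch)
    print(f"U+{ord(ch):04X} {name}: NFKC -> {folded!r}")
```

Under that assumption, U+FF1A folds to a plain ASCII colon, while U+2571 has no compatibility mapping and stays a non-ASCII character.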

Reproducible: Always
Comment 1 Bjoern Hoehrmann 2006-09-27 17:30:11 PDT
Reading http://www.mozilla.org/projects/security/tld-idn-policy-list.html I should probably add that I used a .de domain for testing, not example.com.
Comment 2 Bjoern Hoehrmann 2006-09-27 20:26:11 PDT
Created attachment 240404 [details]
Screenshot FF 1.5 on Debian

Using only U+2571 in the URL seems to work fine now on Linux.
Comment 3 Daniel Veditz [:dveditz] 2006-09-28 14:20:21 PDT
Sounds like we need to add these two to our blacklist

http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/modules/libpref/src/init/all.js&rev=FIREFOX_1_5_0_7_RELEASE#621

I'm a bit confused about the fullwidth colon though (xff1a). RFC 3490 section 3.1.1 says we must treat both fullwidth and halfwidth ideographic full stops as label separators, yet both \uFF0E and \uFF61 are in our blacklist. nsIDNService::normalizeFullStops() converts them before the blacklist is applied so I'm not sure why they're needed in the blacklist.
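
To illustrate the ordering question raised above, here is a minimal Python sketch of dot normalization followed by a blacklist check. It is not the Gecko implementation: `normalize_full_stops` and `contains_blacklisted` are hypothetical names, and the only grounded part is the set of separators from RFC 3490 section 3.1.1.

```python
# Label separators that RFC 3490 section 3.1.1 says must be treated as dots.
IDNA_DOTS = {
    "\u3002": ".",  # IDEOGRAPHIC FULL STOP
    "\uFF0E": ".",  # FULLWIDTH FULL STOP
    "\uFF61": ".",  # HALFWIDTH IDEOGRAPHIC FULL STOP
}
_DOT_TABLE = str.maketrans(IDNA_DOTS)

def normalize_full_stops(host):
    """Replace every IDNA dot variant with U+002E."""
    return host.translate(_DOT_TABLE)

def contains_blacklisted(host, blacklist):
    """True if any character of host appears in the blacklist string."""
    return any(ch in blacklist for ch in host)

host = "www\uFF0Eexample\uFF61com"
normalized = normalize_full_stops(host)
print(normalized)                                        # www.example.com
print(contains_blacklisted(normalized, "\uFF0E\uFF61"))  # False
```

If normalization always runs first, as in this sketch, the two blacklist entries can never match, which is the redundancy the comment points out.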

The IDNA spec doesn't mention accepting fullwidth colon as a port delimiter, but it would be somewhat consistent to do so.

\u2571 is a no-brainer to add. \u2573 sorta looks like an 'X', there are various "plus"-looking things.

Comment 4 Daniel Veditz [:dveditz] 2006-09-28 18:13:28 PDT
Does the colon get converted in nameprep? I would have expected colon to be banned by net_isValidHostName since I think we only call that after we've parsed and removed the port part, but we allow colon.

It looks like the whole box-drawing section isn't supposed to be allowed as output (http://www.unicode.org/reports/tr39/#IDN_Security_Profiles). As long as we do allow them, though, \u2571 would be good to have in the blacklist as an interim band-aid.
Comment 5 Bjoern Hoehrmann 2006-09-28 18:24:32 PDT
(In reply to comment #3)
> The IDNA spec doesn't mention accepting fullwidth colon as a port delimiter,
> but it would be somewhat consistent to do so.

Port delimiters are not part of the domain name, they could only be part of the resource identifier; since the URL is parsed for the domain name first, any colon or colon-lookalike character cannot delimit the domain name from the port. I initially included this character precisely to test that Mozilla does not handle it this way.
Comment 6 Gervase Markham [:gerv] 2006-09-29 08:39:26 PDT
CCing Neil Harris, who works on our IDN implementation.

Gerv
Comment 7 Neil Harris 2006-09-30 12:01:12 PDT
That's interesting: as the poster says, this doesn't appear to work on Linux; something peculiar is happening here, since this should work in exactly the same way on all operating systems. 

I've got some code lying in an earlier IDN bug which never got merged, which might be useful for stopping this.
Comment 8 Neil Harris 2006-09-30 16:42:09 PDT
I think the best way to handle this in the short term is to add an extra check to isOnlySafeChars() that blacklists all characters that do not belong either to a script system, or to a very limited set of non-script characters.  This will also have the effect of enforcing part of the ICANN rules for labels.
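
A rough Python sketch of the kind of check proposed above follows. It is an approximation under stated assumptions: Python's `unicodedata` does not expose the Script property, so the Unicode general category (letters, combining marks, decimal digits) stands in for "belongs to a script system", and `ALLOWED_NON_SCRIPT` is a hypothetical set. The function name mirrors isOnlySafeChars() but this is not its real logic.

```python
import unicodedata

# Hypothetical "very limited set" of non-script characters allowed in a label.
ALLOWED_NON_SCRIPT = {"-"}

def is_only_safe_chars(label):
    """Accept a label only if every character is a letter, a combining mark,
    a decimal digit, or one of the explicitly allowed non-script characters."""
    for ch in label:
        if ch in ALLOWED_NON_SCRIPT:
            continue
        category = unicodedata.category(ch)
        if category[0] in ("L", "M") or category == "Nd":
            continue
        return False  # symbols, punctuation, unassigned, etc.
    return True

print(is_only_safe_chars("com"))            # True
print(is_only_safe_chars("com\u2571"))      # False: U+2571 is a symbol
print(is_only_safe_chars("com\uFF1A8888"))  # False: U+FF1A is punctuation
```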

I've got some code lying around that might just do the trick.


Comment 9 Bjoern Hoehrmann 2006-09-30 23:48:59 PDT
(In reply to comment #7)
> That's interesting: as the poster says, this doesn't appear to work on Linux;
> something peculiar is happening here, since this should work in exactly the
> same way on all operating systems.

Since ':' cannot occur in a domain name, it is likely that the DNS client code on Linux simply rejects any hostname containing it; on Windows this is not the case (compare, for example, `ping a:b.example.org` on both systems, where example.org needs to have a wildcard record). The first example should work on all systems.
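
As an illustration of the kind of rejection suggested above, here is a small Python sketch of a letters-digits-hyphen host name check. It is only a guess at what a strict resolver might enforce, not the actual Linux or Windows DNS client logic; `is_ldh_hostname` is a hypothetical name.

```python
import re

# One DNS label: letters, digits, hyphen; 1-63 characters;
# no leading or trailing hyphen.
_LDH_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")

def is_ldh_hostname(hostname):
    """True if every dot-separated label uses only letters, digits, hyphens."""
    if not hostname or len(hostname) > 253:
        return False
    labels = hostname.rstrip(".").split(".")
    return all(_LDH_LABEL.fullmatch(label) is not None for label in labels)

print(is_ldh_hostname("a:b.example.org"))                         # False
print(is_ldh_hostname("www.bank.xn--com-544a.example.com"))       # True
print(is_ldh_hostname("www.bank.xn--com:8888-br3e.example.com"))  # False
```

A client applying such a check would refuse the second test case before sending any query, while the first test case would still resolve.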
Comment 10 Neil Harris 2006-10-02 12:46:33 PDT
I think it's time for me to push forward the code I wrote to address bug 316727, which should fix this, as well as many other issues.

I've got a patch already made, but it's untested: I'm currently generating sets of test cases for it, to try on 2.0rc1+patch. More soon.

*** This bug has been marked as a duplicate of 316727 ***
Comment 11 Neil Harris 2006-10-02 17:48:39 PDT
Reopening: the fix for bug 316727 has more issues to be tested, and I have an experimental patch almost ready for this simpler bug now: this will also test the waters for the full fix of bug 316727.
Comment 12 Neil Harris 2006-10-02 17:56:32 PDT
Created attachment 241003 [details]
Test cases for the above

These links test the two test cases given by the submitter.
Comment 13 Neil Harris 2006-10-02 18:17:59 PDT
OK, test case 1 is now caught by my experimental patch, which also blocks a huge number of other characters by adopting a whitelisting-by-Unicode-blocks approach, in addition to the existing very specific blacklist.
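
For readers unfamiliar with the approach, whitelisting by Unicode blocks combined with a character blacklist can be sketched in Python as below. The block list is hypothetical and deliberately coarse; the ranges in the actual patch are not reproduced here, and the names `in_allowed_block` and `is_label_allowed` are invented for the example.

```python
# Hypothetical whitelist of code point ranges (first, last), loosely
# following Unicode block boundaries. Not the list used in the patch.
ALLOWED_BLOCKS = [
    (0x0030, 0x0039),  # ASCII digits
    (0x0061, 0x007A),  # ASCII lowercase letters
    (0x00C0, 0x024F),  # Latin-1 letters through Latin Extended-B (coarse)
    (0x0370, 0x03FF),  # Greek and Coptic
    (0x0400, 0x04FF),  # Cyrillic
    (0x3040, 0x30FF),  # Hiragana and Katakana
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
]

def in_allowed_block(ch):
    """True if the character falls inside one of the whitelisted ranges."""
    cp = ord(ch)
    return any(first <= cp <= last for first, last in ALLOWED_BLOCKS)

def is_label_allowed(label, blacklist=""):
    """Whitelist check first, then the specific character blacklist."""
    if not all(ch == "-" or in_allowed_block(ch) for ch in label):
        return False
    return not any(ch in blacklist for ch in label)

print(is_label_allowed("com"))        # True
print(is_label_allowed("com\u2571"))  # False: Box Drawing is not whitelisted
```

The point of the design is that U+2571 is rejected without ever being named: anything outside the listed ranges fails by default.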

However, the behaviour in test case 2 is more involved: I think there's a possibility of an interaction between the Unicode normalization of the fullwidth colon and the IPv6 code... 

I'll try to take a look at the on-the-wire behaviour across multiple operating systems tomorrow.
Comment 14 Neil Harris 2006-10-02 18:45:19 PDT
Created attachment 241008 [details] [diff] [review]
EXPERIMENTAL patch: work-in-progress for script-block whitelisting 

This is the experimental code so far, just for reference. Note: this is completely untested, and subject to rapid change.
Comment 15 Neil Harris 2006-10-03 16:12:02 PDT
Created attachment 241123 [details] [diff] [review]
More polished patch to nsIDNService.cpp; not yet smoketested, but works

This patch defangs both of the examples given in this bug on my Linux build, without specifically needing to reference any particular character. 

The first becomes

http://www.bank.xn--com-544a.example.com/

and the second becomes

http://www.bank.xn--com:8888-br3e.example.com/
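
The two encoded labels above can be reproduced with a short Python sketch. It assumes NFKC normalization plus lowercasing as a simplified stand-in for Nameprep and applies no character prohibition; `to_ace_label` is a hypothetical helper, not part of nsIDNService.cpp.

```python
import unicodedata

def to_ace_label(label):
    """Encode one label as an ACE ("xn--") label if it contains non-ASCII.

    Assumption: NFKC + lowercase approximates Nameprep for these inputs.
    """
    normalized = unicodedata.normalize("NFKC", label).lower()
    if normalized.isascii():
        return normalized
    return "xn--" + normalized.encode("punycode").decode("ascii")

print(to_ace_label("com\u2571"))            # xn--com-544a
print(to_ace_label("com\uFF1A8888\u2571"))  # xn--com:8888-br3e
```

In the second label the fullwidth colon has already been folded to an ASCII ":" before Punycode encoding, which is why a literal colon ends up inside the encoded label.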

ASCII domain names and normal mixed-script IDNs still appear to work OK: my set of broken IDNs with bad characters is consistently caught by this, too, and it doesn't crash with any of the tests.

NB this patch has not been fully smoketested yet.
Comment 16 Neil Harris 2006-10-03 17:57:53 PDT
The second testcase is an example of "ASCII-smuggling" through Unicode normalization in the IDN Nameprep processing (see bug 316444). However the comment at the end of 316444 seems to be contradicted by the second test case here: see the issues regarding URL roundtripping at the end of this comment.

I'm working on some code in bug 355181 that should shut off the possibility of using the ':' character in IDNs, by discriminating between the allowed character sets for RFC 1035 DNS names and dotted quads, and that for RFC 2732 IPv6 literals.

However, this example also raises an interesting round-tripping issue, which will probably become a new bug: the ASCII-smuggling behaviour of the second example allows an address with a colon in it to appear in the location bar, but does not get looked up, so is OK. However, reparsing the very same text that is displayed in the location bar will truncate the new hostname at the colon, and thus end up looking up a quite different domain name. 

This is a problem of:
1. relying on Punycoding for obfuscation, when it was not originally intended for that purpose
2. URL display and URL parsing sometimes not being round-trippable
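
The round-tripping problem can be shown with a small Python sketch. It uses the standard `urllib.parse.urlsplit` as a generic URL parser, which is an assumption for illustration and not the browser's parser; `host_as_written` is a hypothetical helper that cuts the authority out of the original link with plain string operations.

```python
from urllib.parse import urlsplit

def host_as_written(url):
    """Return the raw authority text between '//' and the next ASCII '/'."""
    rest = url.split("//", 1)[1]
    return rest.split("/", 1)[0]

# The link as authored (second test case) and the text shown to the user.
original = "http://www.bank.com\uFF1A8888\u2571.example.com/"
displayed = "http://www.bank.com:8888/.example.com/"

# The authored host is one long name ending in .example.com ...
print(host_as_written(original).endswith(".example.com"))  # True

# ... but reparsing the displayed text stops the host at the colon.
reparsed = urlsplit(displayed)
print(reparsed.hostname, reparsed.port, reparsed.path)
# www.bank.com 8888 /.example.com/
```

The same visible string therefore maps to two different hosts depending on whether it is the authored link or a copy of the location bar text.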
Comment 17 Neil Harris 2006-10-05 12:03:28 PDT
Created attachment 241351 [details]
Extended set of testcases for the above...

Now with duplicate examples added that use the fake ".idntest" TLD, which I flag as IDN-compatible in my local installation.
Comment 18 Daniel Veditz [:dveditz] 2006-11-03 13:36:05 PST
Time is a little tight and this hasn't been tested on the trunk yet so I'm a little worried about it... moving nomination request to next release for now, can always request approval on trunk-landed patches if that happens in time.

Is this patch ready for reviews now?
Comment 19 Gervase Markham [:gerv] 2009-01-05 06:23:56 PST
Neil: are you still working on this?

Gerv
Comment 20 Neil Harris 2009-02-19 12:23:26 PST
This bug probably needs to be updated to "critical" or "blocker"; given the recent very public reports of experimental exploitation of this spoofing technique, we are almost certain to see it in the wild very soon.

http://www.theregister.co.uk/2009/02/19/ssl_busting_demo/

(see page 2 of the report for the use of homographs in the attack)

I'm impressed by the use of a wildcard something.cn certificate: that's clever.

Unfortunately, I haven't got the time or resources to test my patch properly at the moment, but I believe the patch is reasonably OK, if someone else wants to QA it.
Comment 21 Neil Harris 2009-02-19 12:29:26 PST
I've also got an experimental patch filed for bug 316727 that enforces even more paranoid checking, preventing not only the use of unassigned characters, but also the mixing of scripts except in certain explicitly allowed combinations, as per ICANN guidance on IDNs -- in the long run, that patch is probably superior to this one, but in the short run, it would need more QA, and would be riskier to apply.
Comment 22 Gervase Markham [:gerv] 2009-02-20 05:44:39 PST
We definitely need a lot more discussion before banning script mixing. Let's just make sure the current character blacklist is solid.

Gerv
Comment 23 Daniel Veditz [:dveditz] 2009-02-21 09:18:52 PST
See also bug 479336 for a quick-n-dirty blacklist update, and bug 479520 about looking into the new proposed IDNA2008 standards.
Comment 24 Patrick McManus [:mcmanus] PTO until Sep 6 2016-01-28 11:38:26 PST
c22/23
