IDNA does not conform to RFC and is interpreted as a different hostname
Categories
(Core :: Networking, defect, P1)
People
(Reporter: kageshiron, Assigned: valentin)
Details
(Keywords: csectype-spoof, reporter-external, sec-low, Whiteboard: [reporter-external] [client-bounty-form] [verif?][necko-triaged][post-critsmash-triage][adv-main94+])
Attachments
(4 files)
Firefox 90.0.2 (latest, macOS/Windows)
Firefox's URL parser does not parse IDNA (Internationalizing Domain Names in Applications) hostnames correctly.
You can check the behavior with the following HTML and JavaScript.
[HTML Sample] <a href="http://xn--あああ.example.com">URL</a>
[JavaScript Sample] new URL("http://xn--あああ.example.com").href
("あ" is U+3042 character)
Actual behavior:
"http://xn--bbb.example.com"
Expected behavior:
"xn--あああ" is an invalid label for a domain name. Only alphanumeric characters are allowed after "xn--". Browsers should treat it as an error.
<Security Risk>
- Bypassing WAF and malware scanning
- Phishing
- Inconsistencies with applications that expect to be RFC compliant
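An application or scanner that wants to reject such hostnames itself, rather than rely on the browser's parser, could apply a check along these lines. This is only a minimal sketch: `findInvalidAceLabels` is a hypothetical helper name, and it tests only for non-ASCII characters in labels that carry the "xn--" prefix. It is not a full IDNA validation.

```javascript
// Hypothetical helper, not part of any browser or library API.
// Returns the labels that start with the ACE prefix "xn--" (case-insensitive)
// but contain at least one character outside the ASCII range.
function findInvalidAceLabels(hostname) {
  return hostname
    .split(".")
    .filter((label) => /^xn--/i.test(label) && /[^\x00-\x7F]/.test(label));
}

// The label from this report is flagged.
console.log(findInvalidAceLabels("xn--あああ.example.com"));

// A label with only ASCII after the prefix is not flagged by this check.
console.log(findInvalidAceLabels("xn--bbb.example.com"));
```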
<Cause of the problem>
The code point for "あ" is U+3042, and the code point for "B" is U+0042.
I think Firefox's URL parser ignores the upper byte of each character following "xn--" and tries to interpret the result.
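The hypothesis above can be illustrated with a few lines of JavaScript. This is a sketch of the suspected effect only, not Firefox's actual code; `truncateToLowByte` is a made-up name, and the final lowercasing is an assumption added to match the lowercase hostname in the observed result.

```javascript
// Illustration of the reporter's hypothesis, not real parser code.
// Keep only the low byte of each UTF-16 code unit, then lowercase the result.
function truncateToLowByte(text) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    out += String.fromCharCode(text.charCodeAt(i) & 0xff);
  }
  return out.toLowerCase();
}

// U+3042 & 0xFF is 0x42 ("B"), so the three-character label collapses to "bbb".
console.log("xn--" + truncateToLowByte("あああ")); // xn--bbb
```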
<Other URLs that can be interpreted incorrectly>
http://👨🦰.tk
(This is the domain name I got.)
The code point sequence for the 👨🦰 emoji is U+1F468 U+200D U+1F9B0.
Actual Parsed Result [Firefox]:
http://xn--1ugz855p6kd.tk
Expected Parsed Result [ Chrome, ICU (International Components for Unicode) ]:
http://xn--qq8hq8f.tk/
(The Zero Width Joiner (U+200D) is removed.)
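To confirm which code points a given label actually contains, something like the following can be run in a browser console. It is a small sketch; `describeCodePoints` is a hypothetical helper, and the label is written with escape sequences so that the invisible joiner cannot be lost when copying.

```javascript
// Hypothetical helper: list the code points of a string in U+XXXX notation.
function describeCodePoints(text) {
  return Array.from(text, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

// The emoji label from this report, spelled out explicitly.
const emojiLabel = "\u{1F468}\u200D\u{1F9B0}";

console.log(describeCodePoints(emojiLabel).join(" ")); // U+1F468 U+200D U+1F9B0
console.log(emojiLabel.includes("\u200D"));            // true
```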
<Note>
Apple Safari has the same problem. I have already reported the issue to Apple.
Comment 1•3 years ago (Assignee)
(In reply to kageshiron from comment #0)
> Firefox 90.0.2 (latest, macOS/Windows)
> Firefox's URL parser does not parse IDNA (Internationalizing Domain Names in Applications) hostnames correctly.
> [JavaScript Sample] new URL("http://xn--あああ.example.com").href
> Expected behavior:
> "xn--あああ" is an invalid label for a domain name. Only alphanumeric characters are allowed after "xn--". Browsers should treat it as an error.
I think this stems from the fact that when normalizing the IDN we first convert to the display IDN form and then to ASCII, instead of the other way around as the URL Standard says.
This is a remnant from the time when we kept URLs internally as Unicode.
> <Other URLs that can be interpreted incorrectly>
> http://👨🦰.tk
> (This is the domain name I got.)
> The code point sequence for the 👨🦰 emoji is U+1F468 U+200D U+1F9B0.
> Actual Parsed Result [Firefox]:
> http://xn--1ugz855p6kd.tk
> Expected Parsed Result [ Chrome, ICU (International Components for Unicode) ]:
> http://xn--qq8hq8f.tk/
> (The Zero Width Joiner (U+200D) is removed.)
> <Note>
> Apple Safari has the same problem. I have already reported the issue to Apple.
This URL ( http://👨🦰.tk ) also fails in the reference URL parser, so I'm not sure whether ZWJ handling is something we need to add to the spec. Anne?
https://jsdom.github.io/whatwg-url/#url=aHR0cDovL3d3dy7wn5Go4oCN8J+msC50aw==&base=YWJvdXQ6Ymxhbms=
Comment 2•3 years ago
I think the emoji difference is a result of Chrome not using Nontransitional_Processing: https://bugs.chromium.org/p/chromium/issues/detail?id=694157.
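For context, UTS #46 defines four "deviation" characters whose handling differs between the two modes: Transitional processing maps ß to "ss" and ς to σ and removes ZWNJ (U+200C) and ZWJ (U+200D), while Nontransitional processing keeps them. The sketch below models only that one step. `mapDeviations` is a hypothetical name, and the rest of the UTS #46 mapping, normalization and validation is deliberately left out.

```javascript
// The four UTS #46 deviation characters and their Transitional mappings.
const DEVIATION_MAPPINGS = new Map([
  ["\u00DF", "ss"],     // ß maps to "ss"
  ["\u03C2", "\u03C3"], // final sigma maps to sigma
  ["\u200C", ""],       // ZWNJ is removed
  ["\u200D", ""],       // ZWJ is removed
]);

// Hypothetical helper covering only the deviation step of the mapping.
function mapDeviations(label, transitional) {
  if (!transitional) return label; // Nontransitional: deviations are kept
  return Array.from(label, (ch) =>
    DEVIATION_MAPPINGS.has(ch) ? DEVIATION_MAPPINGS.get(ch) : ch
  ).join("");
}

const emoji = "\u{1F468}\u200D\u{1F9B0}";
console.log(Array.from(mapDeviations(emoji, true)).length);  // 2 (ZWJ dropped)
console.log(Array.from(mapDeviations(emoji, false)).length); // 3 (ZWJ kept)
```

With the joiner dropped, the label that gets Punycode-encoded is different, which would explain why the two browsers end up with different "xn--" hostnames for the same input.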
Comment 3•3 years ago (Assignee)
Comment 4•3 years ago (Assignee)
Depends on D122097
Comment 5•3 years ago
Make sure to run ConvertUTF8toACE before ConvertToDisplayIDN r=necko-reviewers,dragana
https://hg.mozilla.org/integration/autoland/rev/7617df50b420a09e9fba0080b2a3d6bf49287566
https://hg.mozilla.org/mozilla-central/rev/7617df50b420
Comment 6•3 years ago
Seems like it might be best to let this bake for another cycle rather than uplifting to Beta the week prior to RC. Feel free to NI me if you strongly disagree.
Comment 7•3 years ago
We fixed the first part of this (non-ASCII in an xn-- prefixed label) but now treat http://www.👨🦰.tk as an invalid URL. I guess that's better than opening the wrong site, but Chrome does open it. Opened bug 1732963 for this remaining piece/regression.
I reproduced the issue using Nightly 91.0a1 on Windows 10, macOS 11 and Ubuntu 20. I can confirm the issue is fixed with the provided HTML sample; I verified using Nightly 94.0a1 (2021-09-28) (64-bit) on Windows 10, macOS 11 and Ubuntu 20.
Could you please provide steps to verify the JavaScript sample?
Comment 10•3 years ago (Assignee)
(In reply to Jerónimo Torti from comment #9)
> Could you please provide steps to verify the JavaScript sample?
- Open the devtools console
- Check that executing new URL("http://xn--あああ.example.com").href throws an error in recent builds
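The check from the steps above can also be wrapped in a small function so the result is printed rather than raised. This is just a convenience sketch; `parsesAsUrl` is a hypothetical name, and the value printed for the sample depends on the build being tested.

```javascript
// Hypothetical helper: report whether the URL constructor accepts the input.
function parsesAsUrl(input) {
  try {
    new URL(input);
    return true;
  } catch (e) {
    // The URL constructor signals a parse failure with a TypeError.
    if (e instanceof TypeError) return false;
    throw e;
  }
}

// Per this thread, builds with the fix are expected to reject this input.
console.log(parsesAsUrl("http://xn--あああ.example.com"));
```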
Thanks for providing the steps.
I executed that line, and it throws an error on Nightly 94.0a1 (2021-09-28) (64-bit) on Windows 10, macOS 11 and Ubuntu 20 as well.
Comment 12•3 years ago
Is this something we should consider taking on ESR91 also?
Comment 13•3 years ago (Assignee)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #12)
> Is this something we should consider taking on ESR91 also?
Yes. The WPT tests have changed a little since 91 - I'll request uplift as soon as the try run is complete:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=03d0baaa5650eec8899aa93c5b4cbfdf7d652107
Comment 14•3 years ago (Assignee)
Comment 15•3 years ago (Assignee)
Comment on attachment 9246624 [details]
Bug 1724233 - [esr91] Make sure to run ConvertUTF8toACE before ConvertToDisplayIDN r=#necko
ESR Uplift Approval Request
- If this is not a sec:{high,crit} bug, please state case for ESR consideration: See comment 0:
  - Bypassing WAF and malware scanning
  - Phishing
  - Inconsistencies with applications that expect to be RFC compliant
- User impact if declined: Potentially broken IDNA parsing for some domains.
- Fix Landed on Version: 94
- Risk to taking this patch: Medium
- Why is the change risky/not risky? (and alternatives if risky): There's a potential for regressions stemming from the fact that this makes URL parsing more strict. Also, since Chrome is using Transitional processing for IDNA and we're using NonTransitional, there's a small chance of webcompat issues for rare domains containing ZWJ especially.
- String or UUID changes made by this patch:
Comment 16•3 years ago
Comment on attachment 9246624 [details]
Bug 1724233 - [esr91] Make sure to run ConvertUTF8toACE before ConvertToDisplayIDN r=#necko
Approved for 91.3esr.
Comment 17•3 years ago
uplift
Comment 18•3 years ago
I verified using ESR 91.3 on Windows 10, macOS 11 and Ubuntu 20, and I can confirm this issue is fixed using the provided HTML sample.
I updated the flag accordingly.
Thanks.