2.72 KB, patch
|Details | Diff | Splinter Review|
3.01 KB, patch
|Details | Diff | Splinter Review|
Any domain name that includes a letter "i" with an accent above (acute, grave, circumflex, dieresis, etc) can be perfectly spoofed using a sequence <dotless i, combining accent>, which is not canonically equivalent yet will display identically given reasonable font support. For example, http://xn--nave-6pa.com (http://naïve.com, with precomposed i-dieresis) could be spoofed by http://xn--nave-mza04z.com (http://naı̈ve.com, with dotless-i + dieresis) on any system where the font used by the URL bar supports such combining marks (true for both macOS and Windows, AFAIK, and probably most modern systems). IMO, this is a more important fix than some of the other potential spoofs that have been reported lately, as it does not depend on users overlooking a minor distinction, or on poor font support; rather, in the presence of fonts with good Unicode and diacritic support, the resulting spoof will (by definition) be visually perfect. Such URLs (any that include U+0131 DOTLESS I followed by a diacritic above) should be forced to display as punycode. (FTR, I have also filed an equivalent bug against Chromium, as their "dangerous" URL check doesn't currently catch such examples either.)
Attachment #8918673 - Flags: review?(valentin.gosu)
Comment on attachment 8918673 [details] [diff] [review] patch, don't allow IDNs with dotless-i + accent above Nice catch!
Attachment #8918673 - Flags: review?(valentin.gosu) → review+
I'm going to call this sec-moderate, like most other Unicode-based spoofing bugs. Fortunately, although this provides a "perfect" spoof for domain names with accented "i" in them, such domain names are themselves pretty rare. (I only found a handful in the top million domains as listed on https://domainpunch.com/tlds/topm.php -- and FWIW, several of those appeared to be using accented "i" just to work around court-ordered blocks on their "original" unaccented forms!)
I think that, like all such heuristics, we should get cross-browser agreement before implementing in a release build. I can see the point here - dotless i is a bit unusual in the doing normalization doesn't canonicalize it in the way you expect. jfkthame: are there any languages which expect to, er, use a dotless i with some sort of dot, without it being actually a dotted i? I assume not... Gerv
Comment on attachment 8918673 [details] [diff] [review] patch, don't allow IDNs with dotless-i + accent above Approval Request Comment [Feature/Bug causing the regression]: not a regression [User impact if declined]: domains having an accented letter "i" can be spoofed [Is this code covered by automated tests?]: yes [Has the fix been verified in Nightly?]: not yet [Needs manual test from QE? If yes, steps to reproduce]: no, has unit test [List of other uplifts needed for the feature/fix]: either we also uplift bug 1404349, or the patch here will need minor (simple) rebasing to apply on beta [Is the change risky?]: no [Why is the change risky/not risky?]: straightforward patch, just adds one more condition that will trigger punycode instead of IDN display [String changes made/needed]: none
Attachment #8918673 - Flags: approval-mozilla-beta?
This also affects earlier releases, but I'm assuming we wouldn't consider uplifting anywhere earlier than beta-57 for an issue like this.
(In reply to Gervase Markham [:gerv] from comment #4) > I think that, like all such heuristics, we should get cross-browser > agreement before implementing in a release build. FWIW, Chrome recently landed a change that blocks IDNs with U+0307 (combining dot above) over dotless i, which I realize I should have specifically highlighted, as it can be used to spoof even pure-ASCII domains like mozilla.com or microsoft.com (try xn--mozlla-r9a478a.com and xn--mcrosoft-tkb418b.com respectively). So the issue here is not ONLY about accented i, it applies to basic Latin i as well. And Chrom[e,ium] is on course to ship a fix for the basic "i" case, at least; I filed a bug report (and patch) in the chromium bug tracker to extend that to cover the accented "i"s as well. > I can see the point here - > dotless i is a bit unusual in the doing normalization doesn't canonicalize > it in the way you expect. Exactly; the "soft-dotted" property of i is a special case where normalization actually introduces confusion. (There would be a similar problem with U+0237 DOTLESS J, but it is already excluded from the acceptable IDN character set.) > jfkthame: are there any languages which expect to, er, use a dotless i with > some sort of dot, without it being actually a dotted i? I assume not... I'm not aware of any. I suppose a language that used both dotted and dotless i (like Turkish), and also used accents that could be applied to either, would probably have to retain the dot on accented i (and put the acute or whatever above it) in order to maintain the distinction. But AFAIK nobody actually does such a thing. And if they did, the (immutable and non-locale-specific) rules of Unicode normalization would be a bit awkward for them... (I've seen statements to the effect that Lithuanian, IIRC, has a typographic convention that retains the dot on i when adding an accent, but I don't believe it also uses dotless ı; and it's unclear to me whether that's really a requirement of the language, or just an artifact of some older technology -- such as a typewriter -- that didn't support automatically removing the dot when adding an accent to i.)
Summary: Potential "visually perfect" spoofing of accented letter 'i' in domain names → Potential "visually perfect" spoofing of the letter 'i' (with or without accents) in domain names
Comment on attachment 8918673 [details] [diff] [review] patch, don't allow IDNs with dotless-i + accent above Sec rated, taking it, beta57+
Attachment #8918673 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Patch rebased for mozilla-beta.
(In reply to Jonathan Kew (:jfkthame) from comment #5) > [Is this code covered by automated tests?]: yes > > [Has the fix been verified in Nightly?]: not yet > > [Needs manual test from QE? If yes, steps to reproduce]: no, has unit test Setting qe-verify- based on Jonathan's assessment on manual testing needs and the fact that this fix has automated coverage.
You need to log in before you can comment on or make changes to this bug.