Open Bug 1257108 Opened 10 years ago Updated 3 years ago

network.IDN.blacklist_chars can almost be removed

Categories

(Core :: Networking, defect, P3)

People

(Reporter: jshin1987, Unassigned)

Details

(Whiteboard: [necko-backlog])

Most characters in network.IDN.blacklist_chars would be blocked anyway because they do not belong to [:Identifier_Status=Allowed:] [1]. Mozilla blocks characters outside this set plus Aspirational Scripts [2], so maintaining network.IDN.blacklist_chars is not that useful.

The last time I checked, only three characters in network.IDN.blacklist_chars belonged to the above set (i.e. they're allowed):

  U+0338 : Combining Long Solidus Overlay
  U+05F4 : Hebrew Punctuation Gershayim
  U+2027 : Hyphenation Point

Of the three, I propose that U+0338 and U+2027 be kept in the blacklist and U+05F4 be dropped.

U+0338 and U+2027 would fall through the cracks (i.e. they pass the other checks in place), so they are worth keeping in the blacklist.

However, U+05F4 (blacklisted for its resemblance to a double quotation mark, I guess) would fail other tests in place when it's used with a non-Hebrew script. It would also fail the BiDi check (enabled in Mozilla when calling ICU's IDN API) when it's mixed with LTR text. So it would be allowed only when used with other Hebrew characters. One potential case that is arguably marginally risky is U+05F4 being used by itself as a label. But is it really risky?

BTW, the IDN table for .il [3] does not include it, while the Hebrew IDN table for .com [4] does. My impression (which does not carry much weight, if any) is that the former seems too tight while the latter is too liberal.

In any case, network.IDN.blacklist_chars can be trimmed down significantly because most of its characters are outside [:Identifier_Status=Allowed:] [1].

[1] [:Identifier_Status=Allowed:] == [[:Identifier_Type=Recommended:][:Identifier_Type=Inclusion:]] (use http://unicode.org/cldr/utility/unicodeset.jsp to verify)

[2] http://lxr.mozilla.org/mozilla-central/source/netwerk/dns/nsIDNService.cpp#821 is equivalent to blocking characters outside [[:Identifier_Status=Allowed:] [Aspirational Scripts]]:

    // Check for restricted characters; aspirational scripts are permitted
    XidmodType xm = GetIdentifierModification(ch);
    if (xm != XIDMOD_RECOMMENDED &&
        xm != XIDMOD_INCLUSION &&
        xm != XIDMOD_ASPIRATIONAL) {
      return false;
    }

[3] http://www.iana.org/domains/idn-tables/tables/il_he_1.0.html

[4] http://www.iana.org/domains/idn-tables/tables/com_hebr_1.2.txt
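
To make the proposal concrete, here is a minimal standalone sketch of how the identifier-type check and a trimmed two-character blacklist could combine. It is not the nsIDNService code: the IdType enum, the function names, and the idea of passing the identifier type in as a parameter are all hypothetical and only mirror the three XIDMOD_* values quoted in [2]. A real implementation would look the type up from Unicode data rather than take it from the caller, and the script-mixing and BiDi checks are not modeled here.

```cpp
// Hypothetical stand-in for XidmodType: only the three permitted values from
// the quoted snippet are modeled; every other type is lumped into Restricted.
enum class IdType { Recommended, Inclusion, Aspirational, Restricted };

// Same shape as the check quoted in [2]: only these three types pass.
constexpr bool passesIdentifierCheck(IdType t) {
  return t == IdType::Recommended ||
         t == IdType::Inclusion ||
         t == IdType::Aspirational;
}

// Proposed trimmed blacklist: the two characters that pass the identifier
// check and would otherwise slip through. U+05F4 is deliberately absent.
constexpr char32_t kTrimmedBlacklist[] = {0x0338, 0x2027};

constexpr bool inTrimmedBlacklist(char32_t ch) {
  for (char32_t blocked : kTrimmedBlacklist) {
    if (blocked == ch) {
      return true;
    }
  }
  return false;
}

// A character is accepted only if it passes the identifier check and is not
// on the trimmed blacklist. The caller supplies the identifier type here,
// which is an assumption made to keep the sketch free of Unicode data tables.
constexpr bool isCharAllowed(char32_t ch, IdType type) {
  return passesIdentifierCheck(type) && !inTrimmedBlacklist(ch);
}
```

Under these assumptions, isCharAllowed(0x0338, type) and isCharAllowed(0x2027, type) return false for any type, while isCharAllowed(0x05F4, type) returns true for any of the three permitted types, leaving U+05F4 to the script-mixing and BiDi checks described above. Any character whose type is Restricted is rejected without needing a blacklist entry, which is why most of the current list looks redundant.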
Whiteboard: [necko-backlog]
See Also: → 997914
Priority: -- → P1
Priority: P1 → P3
Severity: normal → S3