
IDN script mixing policy: Consider switching to highly-restrictive

Status: NEW, Unassigned
Product: Core, Component: Networking
Opened 5 days ago, updated 10 hours ago
People: Reporter: Jungshik Shin, Unassigned, NeedInfo
Tracking: Firefox Tracking Flags (Not tracked)

(Reporter)

Description

5 days ago
Coming from Chromium bug: https://bugs.chromium.org/p/chromium/issues/detail?id=726950

Currently, Chromium uses the 'moderately restrictive' level for the mixed-script check in domain name components. As a result, certain scripts that contain characters confusable with Latin are allowed to mix with Latin. To protect against such confusables, Chromium has added some ad-hoc rules.

When I reviewed the Verisign and IDN ccTLD rules, I found that, for scripts other than Han, Hangul, Hiragana/Katakana, and Bopomofo, they don't allow mixing with Latin. [1]

So, using the 'moderately restrictive' policy for mixed-script detection does not gain much for (major) TLDs, whose registries already disallow such mixing.

Of course, third-level and lower domain components can still mix Latin with those scripts (Hebrew, Devanagari, Arabic, Armenian, Georgian), but if you look at web pages in languages written in those scripts, script mixing within a single word is very rare, if it occurs at all.

I propose that Firefox and Chromium sync up on this policy and switch to the 'highly restrictive' script-mixing check.
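
To make the difference between the two levels concrete, here is a minimal Python sketch of the UTS #39 definitions as I understand them: 'highly restrictive' allows a single script, or Latin combined with one of the Han-based sets; 'moderately restrictive' additionally allows Latin plus one other recommended script, except Cyrillic and Greek. The function names and the script table are made up for illustration, and script detection is approximated from Unicode character names; neither Firefox nor Chromium works this way (real implementations use the Script_Extensions property, e.g. via ICU).

```python
import unicodedata

# Assumption: approximate a character's script by the first word of its
# Unicode name. This is a rough stand-in for the Script_Extensions property
# and only covers the scripts mentioned in this bug.
NAME_PREFIX_TO_SCRIPT = {
    "LATIN": "Latin", "CYRILLIC": "Cyrillic", "GREEK": "Greek",
    "HEBREW": "Hebrew", "ARABIC": "Arabic", "ARMENIAN": "Armenian",
    "GEORGIAN": "Georgian", "DEVANAGARI": "Devanagari",
    "CJK": "Han", "HIRAGANA": "Hiragana", "KATAKANA": "Katakana",
    "HANGUL": "Hangul", "BOPOMOFO": "Bopomofo",
}

# Multi-script combinations that 'highly restrictive' still permits.
HIGHLY_RESTRICTIVE_SETS = [
    {"Latin", "Han", "Hiragana", "Katakana"},
    {"Latin", "Han", "Bopomofo"},
    {"Latin", "Han", "Hangul"},
]


def scripts_in(label):
    """Return the set of (approximate) scripts used in a Unicode label."""
    scripts = set()
    for ch in label:
        if ch.isascii() and (ch.isdigit() or ch == "-"):
            continue  # ASCII digits and hyphen are script-neutral
        prefix = unicodedata.name(ch, "UNKNOWN").split()[0]
        scripts.add(NAME_PREFIX_TO_SCRIPT.get(prefix, "Unknown"))
    return scripts


def strictest_level(label):
    """Return 'highly', 'moderately', or 'neither' for a label."""
    scripts = scripts_in(label)
    if "Unknown" in scripts:
        return "neither"  # outside this sketch's script table
    if len(scripts) <= 1 or any(scripts <= s for s in HIGHLY_RESTRICTIVE_SETS):
        return "highly"
    others = scripts - {"Latin"}
    if "Latin" in scripts and len(others) == 1 and not others & {"Cyrillic", "Greek"}:
        return "moderately"
    return "neither"
```

Labels for which `strictest_level` returns `'moderately'`, such as a Latin-Hebrew or Latin-Devanagari mix, are the ones that pass today and would be rejected after the proposed switch.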


[1] https://www.verisign.com/en_US/channel-resources/domain-registry-products/idn/idn-policy/registration-rules/index.xhtml  ; section 3

Hebrew domain name policy (Israel): does not allow mixing Hebrew characters and Latin. 

http://www.isoc.org.il/files/docs/ISOC-IL_Registration_Rules_v1.5_ENGLISH_-_26.6.2016.pdf

https://www.icann.org/sites/default/files/packages/lgr/lgr-second-level-hebrew-30aug16-en.html


Indian IDN policy (not sure if it's the latest; it's from 2009):

http://meity.gov.in/writereaddata/files/India-IDN-Policy.pdf

3.B has this:
B. NOT PERMISSIBLE
1. CODE-PAGE MIXING
No mixing of scripts at a given level will NOT be allowed

As an example, a mixed Latin-Devanagari label is given.

Comment 1

Hi Jungshik,

Did you get the emails I sent to you and Mark Davis on this topic recently?

Do you have any measures of the level of impact this change will have? E.g. how many domain names are affected, and whether any are in the Alexa top 1M or any other list of popular sites?

Gerv
(Reporter)

Comment 2

16 hours ago
Hi Gerv, I must have missed them. I'll look for them.

As for the number of domains affected by this change, I only checked the .com domain (as of a few months ago). IIRC, it's 0, apparently because Verisign does what its policy page says regarding mixed-script names. I'll go back and check again after disabling some additional checks I put in for Chromium on top of the current moderately restrictive policy.

I'll also check some of the IDN ccTLDs as well as some TLDs not controlled by Verisign.
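
A check of this kind could be sketched roughly as below. This is only an illustration, not the script actually used: the function names are hypothetical, the input is assumed to be a plain list of domain names, script detection is approximated from Unicode character names, and the count is a coarse over-estimate (for instance, it also includes Latin-Cyrillic labels, which the current policy already rejects).

```python
import unicodedata
from collections import Counter

# Scripts that 'highly restrictive' still allows alongside Latin
# (identified here by Unicode name prefix, which is an approximation).
CJK_PREFIXES = ("CJK", "HIRAGANA", "KATAKANA", "HANGUL", "BOPOMOFO")


def to_unicode_label(label):
    """Decode an A-label ("xn--...") to Unicode; pass other labels through."""
    if label.lower().startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label


def mixes_latin_with_non_cjk(label):
    """True if the label has Latin letters plus letters of a non-CJK script."""
    has_latin = has_other = False
    for ch in label:
        if not ch.isalpha():
            continue  # skip digits, hyphens and combining marks
        name = unicodedata.name(ch, "")
        if name.startswith("LATIN"):
            has_latin = True
        elif not name.startswith(CJK_PREFIXES):
            has_other = True
    return has_latin and has_other


def count_mixed_labels(domains):
    """Count Latin-plus-other-script labels, grouped by TLD."""
    counts = Counter()
    for domain in domains:
        labels = [to_unicode_label(part) for part in domain.rstrip(".").split(".")]
        counts[labels[-1]] += sum(mixes_latin_with_non_cjk(l) for l in labels)
    return counts
```

Running `count_mixed_labels` over a zone-file dump or a popular-sites list gives a per-TLD tally to compare against the registries' stated policies.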
(Reporter)

Comment 3

14 hours ago
Gerv, I couldn't find any recent email from you (in either my personal or my company account) regarding IDN display policy. Can you send it again to my personal Gmail address (the one associated with my Bugzilla account)? Thank you.
Christoph, should this be in security or psm?
Flags: needinfo?(ckerschb)
Jungshik: I've resent the email using the address you use for Bugzilla. But basically, the questions are: do we have to switch to Highly Restrictive, or is there another way? What impact would that switch have? And even if we do switch, does that solve all the issues which have been raised recently? (Spoiler: no.)

Gerv