Closed Bug 551413 Opened 16 years ago Closed 16 years ago

[IDNA] NFKC normalization of Japanese Hiragana/katakana + COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK is not correct

Categories

(Core :: Networking, defect)

Not set
normal

Tracking

VERIFIED INVALID

People

(Reporter: masa141421356, Unassigned)

Details

(Keywords: intl)

According to the Stringprep and Unicode specifications, the hostname "カ゛.mozilla.org"
# "カ゛" is [KATAKANA LETTER KA (U+30AB)] +
# [COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK (U+309B)].
should be normalized by NFKC to "ガ.mozilla.org"
# "ガ" is [KATAKANA LETTER GA (U+30AC)]
and converted to the Punycode form "xn--mck.mozilla.org", but Firefox converts it to "xn-- -wdu6b.mozilla.org".
See also: http://unicode.org/reports/tr15/
Reproduced with other browsers:
IE 7.0.5730.13
Opera 10.50
Safari 4.0.4
Google Chrome 5.0.342.2 dev
Opera and Safari opened the same URL:
http://xn--%20-wdu7a7j1erex96vkchttb745g16cdy1m.jp/
I think, it is not mozilla's bug.
> I think, it is not mozilla's bug.

I think this is a common IDNA implementation bug. NAMEPREP requires normalization with NFKC, which requires converting [KATAKANA LETTER KA] + [COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK] to [KATAKANA LETTER GA].
# I also reported this problem to the ICU project.
# http://bugs.icu-project.org/trac/ticket/7526
(In reply to comment #3)
> > I think, it is not mozilla's bug.
> I think this is common IDNA implementation bug.

Sorry, Yamada-san. It is Mozilla's bug.
FYI: NAMEPREP requires the use of NFKC.
http://www.rfc-editor.org/rfc/rfc3491.txt
> 4. Normalization
>
> This profile specifies using Unicode normalization form KC, as
> described in [STRINGPREP].

NFKC example for Japanese katakana/hiragana:
http://unicode.org/reports/tr15/#NFKD_And_NFKC_Applied_Table
Unfortunately, the example Java applet on unicode.org does not normalize [KATAKANA LETTER KA] + [COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK] correctly.
http://www.unicode.org/reports/tr15/Normalizer.html
U+309B is NOT [COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK]. It is [KATAKANA-HIRAGANA VOICED SOUND MARK] (note that "COMBINING" is absent).
[COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK] is U+3099.
U+3099 was normalized correctly (click the following link).
http://日本語ドメイン名協会.jp/
Why do you think U+309B is a combining mark?
-> INVA
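
The distinction between the two marks can be checked against the Unicode Character Database. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `describe` and the constant names are illustrative only and do not come from this bug or from Mozilla code. The exact data depends on the Unicode version bundled with the Python build, although these two code points have long been stable.

```python
import unicodedata

KA = "\u30ab"              # KATAKANA LETTER KA
COMBINING_MARK = "\u3099"  # the mark that actually combines
SPACING_MARK = "\u309b"    # the spacing mark used in the report


def describe(ch):
    """Return (code point, character name, canonical combining class)."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))


# NFKC of KA followed by the real combining mark U+3099.
composed = unicodedata.normalize("NFKC", KA + COMBINING_MARK)

if __name__ == "__main__":
    for ch in (COMBINING_MARK, SPACING_MARK):
        print(describe(ch))
    print([f"U+{ord(c):04X}" for c in composed])
```

A non-zero combining class marks a combining character. U+3099 has one and U+309B does not, which is why only KA + U+3099 composes to KATAKANA LETTER GA.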
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → INVALID
U+309B is NOT "COMBINING"; under NFKC it should be normalized to U+0020 + U+3099, not to a single U+3099. Thank you, Kimura-san.
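
As an illustration of this point, here is a small sketch that applies NFKC to U+309B alone and to the reported label KA + U+309B, again assuming Python's standard `unicodedata` module. The helper names `nfkc` and `code_points` are hypothetical, and this is not the code path Firefox uses for IDNA.

```python
import unicodedata


def nfkc(text):
    """Apply Unicode normalization form KC to a string."""
    return unicodedata.normalize("NFKC", text)


def code_points(text):
    """List the code points of a string in U+XXXX notation."""
    return [f"U+{ord(c):04X}" for c in text]


# The spacing mark on its own, and the label from the report (KA + U+309B).
mark_alone = code_points(nfkc("\u309b"))
label = code_points(nfkc("\u30ab\u309b"))

if __name__ == "__main__":
    print(mark_alone)
    print(label)
```

Because the compatibility decomposition of U+309B begins with a space, the normalized label keeps KA, a space, and U+3099 as separate code points instead of composing to GA. That is consistent with the space inside the "xn-- -wdu6b" label Firefox produced.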
Status: RESOLVED → VERIFIED
->V.