Closed
Bug 551413
Opened 16 years ago
Closed 16 years ago
[IDNA] NFKC normalization of Japanese Hiragana/katakana + COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK is not correct
Categories
(Core :: Networking, defect)
VERIFIED
INVALID
People
(Reporter: masa141421356, Unassigned)
Details
(Keywords: intl)
According to the Stringprep and Unicode specifications,
the hostname "カ゛.mozilla.org"
# "カ゛" is [KATAKANA LETTER KA (U+30AB)] +
# [COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK (U+309B)].
should be normalized to "ガ.mozilla.org" by NFKC,
# "ガ" is [KATAKANA LETTER GA (U+30AC)]
and converted to the Punycode form "xn--mck.mozilla.org",
but Firefox converts it to "xn-- -wdu6b.mozilla.org".
See also:
http://unicode.org/reports/tr15/
| Reporter | ||
Comment 1•16 years ago
Example:
http://日本語ドメイン名協会.jp/
http://日本語ドメイン名協会.jp/
http://日本語ト゛メイン名協会.jp/
All of them should be converted to
http://xn--eckwd4c7cz44rqkf8kb898fdocu19k.jp/
but the third case is converted to
http://xn-- -wdu7a7j1erex96vkchttb745g16cdy1m.jp/
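To illustrate the expectation above, here is a minimal sketch using Python's built-in `idna` codec (an IDNA 2003 / Nameprep implementation, used here only as a stand-in and not the code Firefox runs). The helper name `to_ace` and the two constants are hypothetical. The first hostname uses the precomposed KATAKANA LETTER DO (U+30C9); the second uses KATAKANA LETTER TO (U+30C8) followed by U+3099. The third case from the report (with U+309B) is deliberately left out.

```python
def to_ace(hostname: str) -> str:
    """Convert a hostname to its ASCII-compatible encoding (IDNA 2003)."""
    # The "idna" codec applies Nameprep (including NFKC) to each label
    # and then Punycode-encodes any label that is not pure ASCII.
    return hostname.encode("idna").decode("ascii")


# Precomposed form: KATAKANA LETTER DO (U+30C9).
PRECOMPOSED = "日本語\u30c9メイン名協会.jp"
# Decomposed form: KATAKANA LETTER TO (U+30C8) + combining mark U+3099.
DECOMPOSED = "日本語\u30c8\u3099メイン名協会.jp"

if __name__ == "__main__":
    print(to_ace(PRECOMPOSED))
    print(to_ace(DECOMPOSED))
```

Both calls should print the same ACE hostname, because NFKC composes U+30C8 + U+3099 into U+30C9 before Punycode encoding.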
Reproduced with other browsers:
IE 7.0.5730.13
Opera 10.50
Safari 4.0.4
Google Chrome 5.0.342.2 dev
Opera and Safari opened the same URL:
http://xn--%20-wdu7a7j1erex96vkchttb745g16cdy1m.jp/
I think, it is not mozilla's bug.
Keywords: intl
| Reporter | ||
Comment 3•16 years ago
> I think, it is not mozilla's bug.
I think this is a common IDNA implementation bug.
NAMEPREP requires normalization using NFKC, which requires converting
[KATAKANA LETTER KA] + [COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK]
to
[KATAKANA LETTER GA].
# I also reported this problem to ICU project.
# http://bugs.icu-project.org/trac/ticket/7526
(In reply to comment #3)
> > I think, it is not mozilla's bug.
> I think this is common IDNA implementation bug.
Sorry, Yamada-san.
It is Mozilla's bug.
| Reporter | ||
Comment 5•16 years ago
FYI:
NAMEPREP requires the use of NFKC:
http://www.rfc-editor.org/rfc/rfc3491.txt
> 4. Normalization
>
> This profile specifies using Unicode normalization form KC, as
> described in [STRINGPREP].
NFKC example for Japanese KATAKANA/HIRAGANA:
http://unicode.org/reports/tr15/#NFKD_And_NFKC_Applied_Table
| Reporter | ||
Comment 6•16 years ago
Unfortunately, the example Java applet at unicode.org does not normalize [KATAKANA LETTER KA] + [COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK] correctly.
http://www.unicode.org/reports/tr15/Normalizer.html
Comment 7•16 years ago
U+309B is NOT [COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK]. It's a [KATAKANA-HIRAGANA VOICED SOUND MARK] (note that "COMBINING" is absent).
[COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK] is U+3099.
U+3099 was normalized correctly (click the following link).
http://日本語ドメイン名協会.jp/
Why do you think U+309B is a combining mark?
-> INVA
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → INVALID
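The distinction drawn in comment 7 can be checked against the Unicode Character Database. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `describe` is hypothetical and only serves this illustration.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Collect the UCD properties relevant to this bug for one character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        # Canonical combining class: non-zero only for combining marks.
        "combining_class": unicodedata.combining(ch),
        # Decomposition mapping, empty if the character has none.
        "decomposition": unicodedata.decomposition(ch),
    }


if __name__ == "__main__":
    for ch in ("\u3099", "\u309b"):
        print(describe(ch))
```

U+3099 is reported with a non-zero combining class and no decomposition, while U+309B has combining class 0 and a compatibility decomposition to U+0020 + U+3099.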
| Reporter | ||
Comment 8•16 years ago
U+309B is NOT "COMBINING";
under NFKC it should be normalized to U+0020 + U+3099, not to a single U+3099.
Thank you, Kimura-san.
Status: RESOLVED → VERIFIED
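The conclusion in comment 8 also explains the two labels quoted in the description. The sketch below applies NFKC and then Punycode with Python's standard library. The helper names `nfkc_label` and `ace_label` are hypothetical, and `ace_label` is a simplification of ToASCII (no Nameprep mapping or prohibited-character checks), not a full IDNA implementation.

```python
import unicodedata

KA = "\u30ab"              # KATAKANA LETTER KA
GA = "\u30ac"              # KATAKANA LETTER GA
COMBINING_MARK = "\u3099"  # COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK
SPACING_MARK = "\u309b"    # KATAKANA-HIRAGANA VOICED SOUND MARK


def nfkc_label(label: str) -> str:
    """Apply the NFKC step that Nameprep requires."""
    return unicodedata.normalize("NFKC", label)


def ace_label(label: str) -> str:
    """Simplified ToASCII: NFKC, then Punycode with the 'xn--' prefix."""
    return "xn--" + nfkc_label(label).encode("punycode").decode("ascii")


if __name__ == "__main__":
    print(repr(ace_label(KA + COMBINING_MARK)))  # 'xn--mck'
    print(repr(ace_label(KA + SPACING_MARK)))    # 'xn-- -wdu6b'
```

With U+3099 the pair composes to KATAKANA LETTER GA and yields `xn--mck`. With U+309B the NFKC result is KA + SPACE + U+3099, and the space is carried into the label as a literal character, which is where the `xn-- -wdu6b` form in the report comes from.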
| Reporter | ||
Comment 9•16 years ago
->V.