Closed Bug 191388 Opened 22 years ago Closed 18 years ago

IDN link does not work. if hostnames are url-escaped

Categories

(Core :: Internationalization, enhancement)

enhancement
Not set
normal

RESOLVED DUPLICATE of bug 309671

People

(Reporter: teruko, Assigned: nhottanscp)

Details

(Keywords: intl, qawanted)

Attachments

(2 files)

From the browser, IDN links do not work.

Steps to reproduce
1. Put the following lines in the prefs.js file.
user_pref("network.IDN_prefix", "bq--");
user_pref("network.IDN_testbed", true);
user_pref("network.enableIDN", true);

2. Launch Composer
3. Create links using the following IDNs.

Active Chinese Domain Name
http://南极星.com/

Active Japanese Domain Name
http://www.トナー.com/

Active Korean Domain Name
http://부동산lg.com/

Active German Domain Name
http://internetdomänen.com/

Active Polish Domain Name
http://pasaż.com/

4. Open the page in the browser and click on the links.

Actual result
IDN link does not work.

Expected result
IDN link works.

Tested with the 1-29 trunk builds on Win32, Mac, and Linux.
QA Contact: ylong → teruko
Summary: IDN link does not work. → IDN link does not work. if the link is not specified encoded
Attached file test case file
The international domain names in this bug report have already been changed.  I attached
the test case.
If you use the original characters in the domain name (without escaped
characters) for a link, the link will be converted to the escaped characters.
I am sorry my original explanation was very confusing.

test case (id=113151) works.
test case (id=113156) does not work.
i18n triage team: need info.  Frank to investigate if the non-working test case
is valid.
Whiteboard: [need info]
per i18n triage meeting: nsbeta1-
Keywords: nsbeta1 → nsbeta1-
Whiteboard: [need info]
The link would not work in the following cases either,
from the mail body, when using:
- open link in new window;
- open link in new tab.
My last comment applies only to Edit as New/Edit Draft mail; this problem is
filed as bug 201072.
Is this still a problem now that bug 166996 has been fixed?
Keywords: qawanted
Still a problem with Mozilla 1.7b 2004030208.

Links with escaped UTF-8 or ISO-8859-1 characters are not unescaped and recoded to ACE
before being accessed.

Example: The link for http://internetdomänen.com/ with escaped UTF-8 is
http://internetdom%c3%a4nen.com/ or, with escaped ISO-8859-1,
http://internetdom%e4nen.com/, and the status bar shows it correctly unescaped
with the umlaut. When Mozilla tries to access the escaped link, it does not connect to the
ACE-encoded http://xn--internetdomnen-gib.com/. The same happens if you type
the escaped URL directly into the location bar.
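
The behaviour asked for here can be sketched in a few lines of Python. This is only an illustration of the idea, not Mozilla's code: the function name escaped_host_to_ace is made up, the ISO-8859-1 fallback is an assumption about how a non-UTF-8 escape might be handled, and Python's built-in "idna" codec (IDNA 2003) stands in for the browser's ACE conversion.

```python
from urllib.parse import unquote_to_bytes


def escaped_host_to_ace(host: str) -> str:
    """Unescape a percent-escaped hostname and convert it to ACE (punycode).

    Hypothetical helper for illustration only.
    """
    # Turn %XX sequences back into raw bytes.
    raw = unquote_to_bytes(host)
    try:
        # Preferred interpretation: the escaped bytes are UTF-8.
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumed fallback: treat the bytes as ISO-8859-1 (e.g. %e4 for "ä").
        text = raw.decode("iso-8859-1")
    # Python's built-in codec applies nameprep and punycode per label.
    return text.encode("idna").decode("ascii")


if __name__ == "__main__":
    for h in ("internetdom%c3%a4nen.com", "internetdom%e4nen.com"):
        print(h, "->", escaped_host_to_ace(h))
```

Both escaped forms from the example end up at the same ACE hostname, which is what the status bar display already suggests to the user.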

I read bug 170241 comment 4 about not unescaping hostnames. Hmm ... is this
still correct even for characters allowed in IDN domains?

So is this then a dupe of Bug 170241?
http://www.w3.org/International/iri-edit/draft-duerst-iri-05.txt
(see how ihostname and idomainlabel are defined).

In an IRI, the hostname part should not be escaped. When converting an IRI to a URI,
the hostname part should be converted to punycode (ACE encoding).

So, this bug is either invalid or should be treated as an 'enhancement'(?)
request for condoning (common) mistakes of escaping hostnames in IDN.
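
For reference, a rough Python sketch of that IRI-to-URI conversion follows. The function name iri_to_uri is hypothetical, Python's "idna" codec is used as a stand-in for the ACE step, and the sketch deliberately ignores userinfo and leaves the query and fragment untouched, so it is not a complete implementation of the draft.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def iri_to_uri(iri: str) -> str:
    """Convert an IRI to a URI: punycode for the host, %-escapes for the path.

    Hypothetical helper; userinfo is dropped and query/fragment are passed through.
    """
    parts = urlsplit(iri)
    host = parts.hostname or ""
    # The hostname is NOT percent-escaped; it is converted to ACE instead.
    ace = host.encode("idna").decode("ascii")
    netloc = ace if parts.port is None else f"{ace}:{parts.port}"
    # Non-ASCII characters in the path are escaped as UTF-8 bytes;
    # "%" is kept safe so existing escapes are not escaped twice.
    path = quote(parts.path, safe="/%")
    return urlunsplit((parts.scheme, netloc, path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(iri_to_uri("http://internetdomänen.com/"))
```

The point of the split is that the two parts of the IRI are treated differently: the host goes through ACE, while the rest goes through percent-escaping.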

Severity: normal → enhancement
(In reply to comment #11)
> http://www.w3.org/International/iri-edit/draft-duerst-iri-05.txt
> (see how ihostname and idomainlabel are defined).

Hmm ... see p. 10/11:

Infrastructure accepting IRIs MAY also deal with 'ihostname' parts escaped
according to Step 3) rather than Step 2).  For example, Step 2) converts the IRI
http://résumé.example.org to
http://xn--rsum-bpad.example.org.  For backward compatibility,
http://r%C3%A9sum%C3%A9.example.org would also be converted to
http://xn--rsum-bpad.example.org.
Thanks for bringing that part to my attention. 'May' is used in the draft so
that we're not obliged to. Nonetheless, I think it's a good idea to implement
that (but this bug is still an 'enhancement')

The bug summary and the original report are rather confusing. The original
report talked about URLs with i18n hostnames in UTF-8, but subsequent comments and
testcases morphed this bug into one about url-escaped hostnames.
 
Summary: IDN link does not work. if the link is not specified encoded → IDN link does not work. if hostnames are url-escaped
Blocks: IDN
*** Bug 209541 has been marked as a duplicate of this bug. ***
(In reply to comment #13)
> 'May' is used in the draft so
> that we're not obliged to. Nonetheless, I think it's a good idea to implement
> that (but this bug is still an 'enhancement')

According to RFC-3986 (the new URI spec that obsoletes RFC-2396) you are now
obliged.  Notice that RFC-3986 adds formerly-invalid syntax to the host field
(the percent-escaped UTF-8 stuff) so that all existing URI parsers are
retroactively declared to be buggy.  How nice.  :)
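
To make the syntax change concrete, here is a small Python sketch comparing the two host grammars. The regular expressions are my own transcription of the "hostname" rule in RFC 2396 and the "reg-name" rule in RFC 3986, and the names are made up for this example; it checks syntax only and says nothing about how the escaped bytes should be interpreted.

```python
import re

# RFC 2396: hostname = *( domainlabel "." ) toplabel [ "." ]
# Labels are alphanumerics with inner hyphens; no percent-escapes allowed.
RFC2396_HOSTNAME = re.compile(
    r"(?:[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?\.)*"
    r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?\.?"
)

# RFC 3986: reg-name = *( unreserved / pct-encoded / sub-delims )
RFC3986_REG_NAME = re.compile(
    r"(?:[A-Za-z0-9\-._~]|%[0-9A-Fa-f]{2}|[!$&'()*+,;=])*"
)


def valid_2396_hostname(host: str) -> bool:
    return RFC2396_HOSTNAME.fullmatch(host) is not None


def valid_3986_reg_name(host: str) -> bool:
    return RFC3986_REG_NAME.fullmatch(host) is not None


if __name__ == "__main__":
    for h in ("xn--internetdomnen-gib.com", "internetdom%c3%a4nen.com"):
        print(h, valid_2396_hostname(h), valid_3986_reg_name(h))
```

The ACE form passes both grammars, while the percent-escaped form is only accepted by the newer one.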

*** This bug has been marked as a duplicate of 309671 ***
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → DUPLICATE