Closed Bug 31225 Opened 25 years ago Closed 24 years ago

Cannot resolve URL with non-ascii ISO-8859-1 characters

Categories

(Core :: Networking, defect, P3)

x86
Linux
defect

Tracking

()

RESOLVED FIXED
Future

People

(Reporter: thomas, Assigned: gordon)

References

()

Details

(Whiteboard: [nsbeta3-])

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; N; Linux 2.2.14 i686; en-US; m14)
BuildID: 2000030909

In the URL mentioned, http://www.nunames.nu/Local-Language.htm, there is a link to another URL, http://www.nunämes.eu.nu, which results in the error message: "www.nun%c3%c2%a4mes.eu.nu could not be found.". I have tested the same URL with both Netscape 4.7 and MSIE5, and they both resolve it fine.

Reproducible: Always

Steps to Reproduce:
1. Visit http://www.nunames.nu/Local-Language.htm
2. Locate the link to "http://www.nunämes.eu.nu" in the middle of the page
3. Click the link

Actual Results: I got the error message quoted above.

Expected Results: The URL should resolve.
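For comparison, here is a minimal Python sketch of how the reported hostname percent-encodes under UTF-8 and under ISO-8859-1. The variable names are illustrative only, and the "double-encoded" form is just one possible way a mis-encoding could arise; nothing in this report confirms that Mozilla took that path. The string in the error message, "%c3%c2%a4", matches neither of the two clean encodings.

```python
from urllib.parse import quote

# Hostname from the report; U+00E4 is "ä".
host = "www.nun\u00e4mes.eu.nu"

# Percent-encoding of the UTF-8 bytes of the hostname.
utf8_form = quote(host, encoding="utf-8")

# Percent-encoding of the ISO-8859-1 bytes of the hostname.
latin1_form = quote(host, encoding="iso-8859-1")

# Hypothetical mis-encoding: UTF-8 bytes misread as ISO-8859-1,
# then encoded as UTF-8 a second time (an assumption, not a diagnosis).
misread = host.encode("utf-8").decode("iso-8859-1")
double_encoded_form = quote(misread, encoding="utf-8")

print(utf8_form)            # www.nun%C3%A4mes.eu.nu
print(latin1_form)          # www.nun%E4mes.eu.nu
print(double_encoded_form)  # www.nun%C3%83%C2%A4mes.eu.nu
```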
Build 2000042113 handles this even more strangely than the reporter says: *very* odd characters are reported back. An older build, 2000032305, converted it to proper %-type encoding but still wasn't able to resolve it (as the reporter said).
Status: UNCONFIRMED → NEW
Ever confirmed: true
NC 4.72 on linux also does not resolve the URL.
what does the spec recommend for charset on hostnames? ->gordon
Assignee: gagan → gordon
Any progress here? The current spec (RFC 2396) doesn't allow non-ASCII characters in domain names, but there is some discussion about changing that. I haven't found anything really useful on the topic, though.
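As a rough illustration of that restriction, the hostname grammar in RFC 2396 (domain labels made of ASCII letters, digits, and hyphens, with a top label that starts with a letter) can be sketched as a regular expression. The function name `is_rfc2396_hostname` is hypothetical, and this is only an approximation for discussion, not what Mozilla's URL parser does.

```python
import re

# Character classes from the RFC 2396 grammar (ASCII only).
ALPHA = "A-Za-z"
ALNUM = "A-Za-z0-9"

# domainlabel = alphanum | alphanum *( alphanum | "-" ) alphanum
DOMAINLABEL = rf"[{ALNUM}](?:[{ALNUM}-]*[{ALNUM}])?"
# toplabel    = alpha | alpha *( alphanum | "-" ) alphanum
TOPLABEL = rf"[{ALPHA}](?:[{ALNUM}-]*[{ALNUM}])?"
# hostname    = *( domainlabel "." ) toplabel [ "." ]
HOSTNAME_RE = re.compile(rf"(?:{DOMAINLABEL}\.)*{TOPLABEL}\.?")


def is_rfc2396_hostname(host: str) -> bool:
    """Return True if host matches the RFC 2396 hostname grammar."""
    return HOSTNAME_RE.fullmatch(host) is not None


print(is_rfc2396_hostname("www.nunames.nu"))          # ASCII-only name
print(is_rfc2396_hostname("www.nun\u00e4mes.eu.nu"))  # contains "ä"
```

Under this grammar the ASCII name is accepted and the name containing "ä" is rejected, which is the restriction described in the comment above.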
This page might have better testcases: http://www.nunames.nu/eu-lang-test.htm
Blocks: 19313
This is from a response by the webmaster of the domain in question:

It looks like your problem is not with Mozilla but with the Linux resolver's "gethostbyname". Linux will not allow a host name outside the ASCII character set. Our system uses the IETF-standard RFC 2277 UTF-8 encoding for non-ASCII names, which works fine with BIND versions 4.x and 8.x, and with all Windows local resolvers (Windows 95, 98, NT and 2000). But until the Linux project or one of the resellers for Linux implements RFC 2277 in the local resolver for Linux, UNIX users will never be able to use Internationalized Domain Names (IDN).

There is an IETF Working Group which is developing new IETF standards for IDN which, once it is done in 2001, may provide the pressure needed to convince the Linux group to update the local resolver - or maybe user pressure will do it.

[...]

I have much to tell you and others at Mozilla about bugs in M15 and NN 6 that need to be fixed to support RFC 2277 properly. For example, if you cut and paste those sample names into the Mozilla M14 or M15 Browser's address form window, it will work. But it does not work if you just click on the link on the page.
On Windows, the links in the page now work with the latest builds, but the window title is messed up.
For ease of reference, RFC 2277, "IETF Policy on Character Sets and Languages", is at http://www.ietf.org/rfc/rfc2277.txt

Andreas, there is already a bug with DUPs about the titlebar messing up for charsets that the platform does not natively support (Win32 can handle only one at a time), but if you are using a localized Windows that does use ISO-8859-1 and it is messing up, that's a new bug.

FWIW, testing the links on http://www.nunames.nu/eu-lang-test.htm with the 2000-07-15-09-M17 nightly binary on WinNT, all links worked, all titles displayed accented characters properly, and those characters are URL-encoded (%xx) in the location bar.
can someone comment on what the spec says about this? nominating for nsbeta3
Keywords: nsbeta3
There will be no help for Linux with this bug, since Linux does not allow hostname characters outside the ASCII range. It is not compliant with RFC 2277, which in my opinion is not very elaborate about the whole problem. If this bug is specific to Linux, it can be marked WONTFIX, since there is nothing that can be fixed within Mozilla. The last time I looked, it worked in principle on Windows, but there were some interesting encoding differences between clicking on a link and typing the URL in the location bar. That difference is what blocks bug 19313.
based on andreas' comments above I am setting this as a [nsbeta3-]
Whiteboard: [nsbeta3-]
Target Milestone: --- → Future
cc nhotta for comments
This is working in 6.01 (and probably in 6.0 as well). I can no longer find the URL http://www.nunämes.eu.nu in http://www.nunames.nu/Local-Language.htm, but other URLs with non-ASCII domains are working. Mozilla is now sending UTF-8.
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
QA Contact: tever → benc