Closed Bug 31225 Opened 25 years ago Closed 23 years ago

Cannot resolve URL with non-ascii ISO-8859-1 characters

Categories

(Core :: Networking, defect, P3)

x86
Linux
defect

Tracking

()

RESOLVED FIXED
Future

People

(Reporter: thomas, Assigned: gordon)

References

()

Details

(Whiteboard: [nsbeta3-])

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; N; Linux 2.2.14 i686; en-US; m14)
BuildID:    2000030909

The page http://www.nunames.nu/Local-Language.htm contains a link to another
URL, http://www.nunämes.eu.nu. Following that link results in the error
message: "www.nun%c3%c2%a4mes.eu.nu could not be found.".
I have tested the same URL with both Netscape 4.7 and MSIE5 and they both
resolve it fine.


Reproducible: Always
Steps to Reproduce:
1. Visit http://www.nunames.nu/Local-Language.htm
2. Locate the link to "http://www.nunämes.eu.nu" in the middle of the page
3. Click the link


Actual Results:  I got an error message as indicated above.

Expected Results:  The URL should resolve
Build 2000042113 handles this even more strangely than the reporter says. *Very*
odd characters are reported back.

An older build, 2000032305, converted it to proper %-type encoding but still
wasn't able to resolve it (as the reporter said).
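
For reference when comparing the escaped forms mentioned above, here is a minimal Python sketch of what percent-encoding the reported host name yields under UTF-8 and under ISO-8859-1. The helper name `escape_host` is hypothetical, and the third case (UTF-8 bytes misread as ISO-8859-1 and then encoded again) is only an assumed failure mode shown for comparison, not a diagnosis of what any Mozilla build actually did.

```python
from urllib.parse import quote

HOST = "www.nunämes.eu.nu"  # host name from the report


def escape_host(host: str, charset: str) -> str:
    """Percent-encode the bytes of `host` after encoding it in `charset`."""
    return quote(host.encode(charset))


utf8_form = escape_host(HOST, "utf-8")
latin1_form = escape_host(HOST, "iso-8859-1")

# Assumed failure mode, for comparison only: the UTF-8 bytes are misread
# as ISO-8859-1 text and then encoded as UTF-8 a second time.
misread = HOST.encode("utf-8").decode("iso-8859-1")
double_form = escape_host(misread, "utf-8")

print(utf8_form)    # www.nun%C3%A4mes.eu.nu
print(latin1_form)  # www.nun%E4mes.eu.nu
print(double_form)  # www.nun%C3%83%C2%A4mes.eu.nu
```

None of the three forms matches the string in the reported error message exactly, so the sketch is only a baseline for comparison.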

Status: UNCONFIRMED → NEW
Ever confirmed: true
NC 4.72 on linux also does not resolve the URL.
What does the spec recommend for the charset of hostnames? -> gordon
Assignee: gagan → gordon
Any progress here? The current spec (RFC 2396) doesn't allow non-ASCII
characters in domain names, but there is some discussion about changing that.
I haven't found anything really useful on the topic, though.
This page might have better testcases: http://www.nunames.nu/eu-lang-test.htm
Blocks: 19313
This is from a response of the webmaster of the domain in question:

It looks like your problem is not with Mozilla but with the Linux
resolver's "gethostbyname". Linux will not allow a host name outside the
ASCII character set. Our system uses the IETF-standard RFC 2277 UTF-8
encoding for non-ASCII names, which works fine with BIND versions 4.x and
8.x, and with all Windows local resolvers (Windows 95, 98, NT and 2000).
But until the Linux project or one of the resellers for Linux implements
RFC 2277 in the local resolver for Linux, UNIX users will never be able to
use Internationalized Domain Names (IDN). There is an IETF Working Group
which is developing new IETF standards for IDN which, once it is done in
2001, may provide the pressure needed to convince the Linux group to update
the local resolver - or maybe user pressure will do it.

[...]

I have much to tell you and others at Mozilla about bugs in M15
and NN 6 that need to be fixed to support RFC 2277 properly. For example,
if you cut and paste those sample names into the Mozilla M14 or M15
Browser's address form window, it will work. But it does not work if you
just click on the link on the page.

On Windows, the links on the page now work with the latest builds, but the
window title is messed up.
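
To make the webmaster's point concrete, here is a small Python sketch of what a host name looks like when its labels are sent as raw UTF-8 octets, and of the kind of ASCII-only check a strict resolver might apply. It assumes the scheme described in the quoted response; the function names are hypothetical, and nothing here calls a real resolver or reproduces the actual `gethostbyname` logic.

```python
MAX_LABEL_OCTETS = 63  # DNS limit on the length of a single label


def is_ascii_host(host: str) -> bool:
    """Return True if every character of the host name is plain ASCII."""
    try:
        host.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


def utf8_labels(host: str) -> list[bytes]:
    """Split the host name on dots and encode each label as UTF-8 octets."""
    return [label.encode("utf-8") for label in host.split(".")]


def labels_fit(host: str) -> bool:
    """Check each UTF-8 encoded label against the 63-octet label limit."""
    return all(1 <= len(label) <= MAX_LABEL_OCTETS for label in utf8_labels(host))


host = "www.nunämes.eu.nu"
print(is_ascii_host(host))  # False: a strict ASCII-only check rejects it
print(utf8_labels(host))    # [b'www', b'nun\xc3\xa4mes', b'eu', b'nu']
print(labels_fit(host))     # True: "ä" takes two octets, well under the limit
```

The length check counts octets rather than characters, because a non-ASCII character occupies more than one octet in UTF-8.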
For ease of reference, RFC 2277, "IETF Policy on Character Sets and Languages",
is at http://www.ietf.org/rfc/rfc2277.txt

Andreas, there is already a bug with DUPs about the titlebar messing up
for charsets that the platform does not natively support (Win32 can handle
only one at a time), but if you are using a localized Windows that does use
ISO-8859-1 and it is messing up, that's a new bug. 

FWIW, testing the links on http://www.nunames.nu/eu-lang-test.htm with
the 2000-07-15-09-M17 nightly binary on WinNT, all links worked, all titles
displayed accented characters properly, and those characters are URL-encoded
(%xx) in the location bar.
Can someone comment on what the spec says about this? Nominating for nsbeta3.
Keywords: nsbeta3
There will be no help for Linux with this bug, since Linux does not allow
hostname characters outside the ASCII range. It is not compliant with RFC 2277,
which in my opinion is not very elaborate about the whole problem. If this bug
is specific to Linux then it can be marked WONTFIX, since there is nothing that
can be fixed within Mozilla. The last time I looked it worked in principle on
Windows, but there were some interesting encoding differences between clicking
on a link and typing the URL in the location bar. That difference is what blocks
bug 19313.
Based on Andreas' comments above, I am setting this to [nsbeta3-].
Whiteboard: [nsbeta3-]
Target Milestone: --- → Future
cc nhotta for comments
This is working in 6.01 (and probably in 6.0 as well).
I can no longer find the URL http://www.nunämes.eu.nu on
http://www.nunames.nu/Local-Language.htm, but other URLs with non-ASCII
domains are working. Mozilla is now sending UTF-8.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
QA Contact: tever → benc