Closed
Bug 205471
Opened 22 years ago
Closed 21 years ago
non-ASCII URL should be escaped from UTF-8
Categories
(Core :: Internationalization, defect)
RESOLVED DUPLICATE of bug 129726
People
(Reporter: kazhik, Assigned: smontagu)
Details
(Keywords: intl)
A non-ASCII string in a manually typed URL is escaped from the
OS-locale encoding (e.g. Shift_JIS), but it should be escaped
from UTF-8.
http://www.w3.org/TR/REC-html40/appendix/notes.html#h-B.2.1
Original report in Bugzilla-jp:
http://bugzilla.mozilla.gr.jp/show_bug.cgi?id=3162
Related: bug 105909
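The difference between the two escapings can be illustrated with a short Python sketch. It is only an illustration of the byte-level result, not Mozilla's code; the file name `日本語.html` is a made-up example, and Python's `urllib.parse.quote` stands in for the browser's escaping routine.

```python
from urllib.parse import quote

# Hypothetical non-ASCII path segment typed into the URL bar.
path = "日本語.html"

# Escaped from a legacy OS-locale encoding (Shift_JIS in this example).
legacy = quote(path, encoding="shift_jis")

# Escaped from UTF-8, as the report asks for.
utf8 = quote(path, encoding="utf-8")

print(legacy)  # %93%FA%96%7B%8C%EA.html
print(utf8)    # %E6%97%A5%E6%9C%AC%E8%AA%9E.html
```

The same typed characters produce two unrelated byte sequences, so a server cannot tell which one it received unless both sides agree on the encoding.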
Comment 1•22 years ago
Yeah, that's what the standard says, and I believe we have to do it. As I
suggested a long time ago :-), we may need an option similar to the one offered by
MS IE ('send URLs in UTF-8'). Why do we need that?
Because a lot of web servers don't abide by the standard [1]: when a
document is requested with a url-escaped UTF-8 URL, they don't convert the URL to
their local file system charset. IIRC, Mozilla has a (partial) guard against
this: on receiving a 'not found' response, it tries again after converting the
url-escaped UTF-8 to the url-escaped locale charset (or is it the origin charset?).
This doesn't always work (the referring web page and the referred page can be on
different servers with different filesystem charsets), but Mozilla does its best [2],
and the rest should be done by web servers.
[1] There's an Apache module that converts incoming URLs in url-escaped UTF-8 to
the filesystem charset.
[2] There's still some room for improvement here, though. If you right-click on
a loaded image with a non-ASCII (non-UTF-8) file name in a non-UTF-8 page and select
'view image', Mozilla fails to fetch the image because it tries only once (in
this case), with url-escaped UTF-8. A second attempt could be made with the origin
charset in this case, too.
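The retry described in this comment can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Mozilla's actual implementation: `reescape`, `fetch_with_fallback`, and the `fetch` callback (which returns a `(status, body)` pair) are hypothetical names introduced here, and a 404 status stands in for the 'not found' response.

```python
from urllib.parse import quote, unquote


def reescape(url_path: str, charset: str) -> str:
    """Convert a url-escaped UTF-8 path to the same path url-escaped in `charset`."""
    text = unquote(url_path, encoding="utf-8", errors="strict")
    return quote(text, encoding=charset, errors="strict")


def fetch_with_fallback(path, origin_charset, fetch):
    """Try the UTF-8-escaped path first; on 'not found', retry in the origin charset.

    `fetch` is a hypothetical callable: fetch(path) -> (status, body).
    """
    status, body = fetch(path)
    if status == 404 and origin_charset.lower() not in ("utf-8", "utf8"):
        try:
            retry_path = reescape(path, origin_charset)
        except (UnicodeDecodeError, UnicodeEncodeError):
            # The path is not valid UTF-8, or cannot be represented in the charset.
            return status, body
        if retry_path != path:
            return fetch(retry_path)
    return status, body
```

As the comment notes, this kind of fallback is only a partial guard: the origin charset of the referring page need not match the filesystem charset of the server that hosts the target.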
Keywords: intl
Comment 2•22 years ago
I think we should deal with this particular bug as a duplicate of
Bug 129726. By the way, to argue for this we should cite an RFC or a group of
RFCs dealing with internationalized URIs/IRIs rather than the HTML 4
document.
*** This bug has been marked as a duplicate of 129726 ***
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → DUPLICATE