Open
Bug 715801
Opened 13 years ago
Updated 3 years ago
http-index-format directory listings should send 301: encoding line
Categories
(Core :: Networking, defect, P5)
UNCONFIRMED
People
(Reporter: info, Unassigned)
References
(Blocks 1 open bug)
Details
(Whiteboard: [necko-would-take])
Numerous bugs have been filed about character encodings in directory listings (ftp, jar, etc.). Many of these seem to have been "fixed" by letting the user change View > Character Encoding, change intl.charset.default, or rely on auto-detect.
But in those cases where Mozilla knows the encoding of the directory listing, it should declare it and avoid these problems. Mozilla produces directory listings using a textual http-index-format (https://developer.mozilla.org/En/Application%2F%2Fhttp-index-format_specification), and then transforms this into HTML. This format includes a
301: <encoding>
line (see http://mxr.mozilla.org/mozilla-central/source/netwerk/streamconv/converters/nsDirIndexParser.cpp#477), but it is undocumented and seems unused.
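To make the role of the 301 line concrete, here is a minimal Python sketch of a consumer that honors it. It is not Mozilla's parser: the function name parse_listing, the sample listing, the ftp.example.org host, and the simplification that the file name is always the first field of a 201 line are all assumptions for illustration.

```python
import codecs
from urllib.parse import unquote_to_bytes


def parse_listing(raw: bytes, fallback: str = "iso-8859-1"):
    """Return (charset, [file names]) for an http-index-format listing.

    Hypothetical helper: assumes the file name is the first field of
    every 201 line and that a 301 line precedes the entries it covers.
    """
    charset = fallback
    names = []
    for line in raw.splitlines():
        if line.startswith(b"301:"):
            declared = line[4:].strip().decode("ascii", "replace")
            try:
                codecs.lookup(declared)  # ignore unknown charset labels
                charset = declared
            except LookupError:
                pass
        elif line.startswith(b"201:"):
            fields = line[4:].split()
            if fields:
                # File names are URL-escaped and may be quoted.
                name_bytes = unquote_to_bytes(fields[0].strip(b'"'))
                names.append(name_bytes.decode(charset, "replace"))
    return charset, names


# Made-up listing for illustration only.
SAMPLE = (
    b"300: ftp://ftp.example.org/pub/\n"
    b"301: UTF-8\n"
    b"200: filename content-length last-modified file-type\n"
    b'201: "d%C3%A9j%C3%A0.txt" 12 Sun,%2001%20Jan%202012%2000:00:00%20GMT FILE\n'
)

charset, names = parse_listing(SAMPLE)
```

With the 301 line present the name decodes as intended; removing that line from SAMPLE makes the sketch fall back to ISO-8859-1 and garble the same bytes.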
One way to test this is to connect to an FTP server and browse a directory full of files with non-ASCII characters in their names. If you leave intl.charset.default at its default of ISO-8859-1, the names appear wrong: FTP servers are supposed to issue UTF-8 directory listings, but Mozilla's http-index-format representation of the FTP listing does not declare UTF-8, so the names get garbled.
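The garbling can be reproduced outside the browser. This small Python sketch (the sample name is an assumption, not taken from any server) encodes a name as UTF-8 and then decodes the bytes as ISO-8859-1, which is what happens when no charset is declared and the default applies.

```python
# A file name as an FTP server would send it: UTF-8 bytes.
name = "d\u00e9j\u00e0"          # "déjà"
wire_bytes = name.encode("utf-8")

# Decoding those bytes with the wrong (default) charset garbles them:
# each accented letter becomes two ISO-8859-1 characters.
garbled = wire_bytes.decode("iso-8859-1")

# Decoding with the declared charset round-trips cleanly.
correct = wire_bytes.decode("utf-8")
```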
You can see garbled characters at ftp://mozilla:mozilla@annexia.org/ (bug 26767).
I captured some http-index-format output, added the 301 charset line, and configured my server to serve this. Compare
http://www.skierpage.com/moz_bugs/ftp_listing.diri?x
with the added 301 line:
http://www.skierpage.com/moz_bugs/ftp_listing_extra.diri?x
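The edit to the captured output can be sketched as follows. This is an illustration only: add_charset_line is a hypothetical helper, the captured bytes are made up, and placing the 301 line directly after the 300 line is an assumption (the sketch only relies on it coming before the 201 entries).

```python
def add_charset_line(raw: bytes, charset: str = "UTF-8") -> bytes:
    """Insert a '301: <charset>' line into an http-index-format listing.

    Hypothetical helper: puts the line after the 300 line if there is
    one, otherwise at the top; leaves listings that already have a 301
    line untouched.
    """
    lines = raw.splitlines(keepends=True)
    if any(line.startswith(b"301:") for line in lines):
        return raw
    new_line = b"301: " + charset.encode("ascii") + b"\n"
    for i, line in enumerate(lines):
        if line.startswith(b"300:"):
            if not line.endswith(b"\n"):
                lines[i] = line + b"\n"
            lines.insert(i + 1, new_line)
            break
    else:
        lines.insert(0, new_line)
    return b"".join(lines)


# Made-up capture for illustration only.
captured = (
    b"300: ftp://ftp.example.org/pub/\n"
    b"200: filename content-length last-modified file-type\n"
    b'201: "d%C3%A9j%C3%A0.txt" 12 Sun,%2001%20Jan%202012%2000:00:00%20GMT FILE\n'
)
patched = add_charset_line(captured)
```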
Similarly, compare browsing the contents of the jar file
jar:http://www.skierpage.com/moz_bugs/d%C3%A9j%C3%A0%E6%97%A5%E6%9C%AC%E5%9B%BD.jar!/
with the added 301 line:
http://www.skierpage.com/moz_bugs/dejajar_contents_extra.diri?x
(The ?x query string on these URLs avoids bug 367076 where http-index-format adds a / and becomes a 404 when you view source, reload, or change character encoding.)
Adding a "301: UTF-8" line fixes accented characters in listings, but Asian glyphs seem to remain problematic. Maybe this can be part of the solution to bug 26767, bug 502540 and others.
Updated•9 years ago
Whiteboard: [necko-would-take]
Comment 1•8 years ago
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: -- → P5
Updated•3 years ago
Severity: normal → S3