User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; cs-CZ; rv:1.9.1b4) Gecko/20090427 Fedora/3.5-0.20.beta4.fc11 Firefox/3.5b4
Build Identifier:

(originally filed as https://bugzilla.redhat.com/show_bug.cgi?id=507831)

Changing the encoding while browsing FTP doesn't work. Whichever encoding I select, it resets to the browser's default non-UTF-8 encoding (1251 for my downstream reporter, ISO-8859-2 for me), and there is no way to switch to the encoding that matches the filenames on the server.

Reproducible: Always

Steps to Reproduce:
1. Go to ftp://ftp.asu.ru/incoming/ or ftp://22.214.171.124:8021/pub/test
2. Observe unreadable names
3. Try to change the encoding in View/Encoding so that the names are readable (note that I do read Russian, so I can say rather authoritatively that whatever I see there is NOT Russian)

Actual Results:
Whatever I do, there are no readable filenames.

Expected Results:
If this cannot work by default (see bug 251892 comment 10 for some discussion, which seems to suggest that it could be made to work at least for some servers), then it should at least be possible to change the encoding of the generated page.
Actually, according to the downstream reporter, the filenames on the server are encoded in UTF-8.
Just a note: this is a regression; previous versions of Firefox handled this fine.
Indeed, I previously had problems with _some_ Cyrillic chars (actually, only with 'И', IIRC), and now I have them with all of them. :) Working around the previous problems was rather easy for me. I think the encoding selection was a nice feature and should be re-enabled.
I see this in Firefox 3.5 and 3.6apre1 for FTP and other index listings, such as jar: files. I can choose other encodings in the View > Character Encoding menu, but the listing doesn't change, and when I bring up the menu again it's back to the default. Try e.g. ftp://janych.selfip.com/

A workaround is to change intl.charset.default in about:config, but this can make FTP index entries silently vanish if the filename has a code point that is invalid in the current encoding.
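To illustrate why the workaround can garble or lose entries, here is a minimal Python sketch. It is not Firefox code: `try_decode` is a hypothetical helper, the sample names are made up, and it assumes the server sends UTF-8 bytes, as the downstream reporter states. It relies on Python's cp1251 codec, in which byte 0x98 is unassigned; whether Gecko's decoder treats that byte the same way is an assumption.

```python
# Hypothetical helper (not Mozilla code): decode raw filename bytes with a
# candidate charset and return None when the bytes are invalid in it.
def try_decode(raw: bytes, charset: str):
    try:
        return raw.decode(charset)
    except UnicodeDecodeError:
        return None

# Bytes as a UTF-8 server would send them (sample name is made up).
raw_name = "Документы".encode("utf-8")

# Decoding with the wrong default succeeds but yields unreadable text.
mojibake = try_decode(raw_name, "cp1251")

# Decoding with the encoding the server actually uses recovers the name.
readable = try_decode(raw_name, "utf-8")

# 'И' is 0xD0 0x98 in UTF-8; 0x98 is unassigned in Python's cp1251 codec,
# so a strict decoder has nothing to map it to and the name is lost.
vanishing = try_decode("И".encode("utf-8"), "cp1251")
```

In this sketch the first name survives only as mojibake, while the second cannot be decoded at all, which is one plausible way a listing entry could disappear.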
This problem was introduced by bug #348233 (changeset http://hg.mozilla.org/mozilla-central/rev/35d3b66853a8). The FTP listing was changed from HTML to XHTML, and changing the charset of an "application/xhtml+xml" document doesn't work (caused by bug #240321).

When the document is of type "text/html", the charset selected by the user is correctly set in nsHTMLDocument::StartDocumentLoad() with priority kCharsetFromUserForced. But when the document is of type "application/xhtml+xml", the charset is set to UTF-8 with priority kCharsetFromDocTypeDefault (http://hg.mozilla.org/mozilla-central/diff/9b2a99adc05e/content/html/document/src/nsHTMLDocument.cpp). The charset is later changed to the encoding found in the XHTML (http://hg.mozilla.org/mozilla-central/diff/9b2a99adc05e/parser/htmlparser/src/nsParser.cpp), which in the case of an FTP listing is the value of intl.charset.default.

Is the right solution for XHTML to check for a user-forced charset, but not for the other sources (channel, bookmark, ...)?
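As a rough illustration of the two code paths described in this comment, here is a toy Python model. It is not Gecko code: `effective_charset` and its parameters are hypothetical, and it deliberately ignores every other charset source (channel, bookmark, and so on).

```python
from typing import Optional

def effective_charset(content_type: str,
                      user_choice: Optional[str],
                      document_charset: Optional[str]) -> Optional[str]:
    """Toy model of the behaviour described in this comment (hypothetical)."""
    if content_type == "text/html":
        # HTML path: a charset picked from the menu is applied as a forced
        # override, so it wins over what the document declares.
        return user_choice if user_choice is not None else document_charset
    if content_type == "application/xhtml+xml":
        # XHTML path: starts out as UTF-8, then the encoding found in the
        # document replaces it. The user's choice is never consulted.
        return document_charset if document_charset is not None else "UTF-8"
    return document_charset

# The generated FTP listing declares the intl.charset.default value.
as_html = effective_charset("text/html", "UTF-8", "windows-1251")
as_xhtml = effective_charset("application/xhtml+xml", "UTF-8", "windows-1251")
```

With the listing served as XHTML, the model returns the declared charset regardless of the menu selection, which matches the symptom reported here.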
For XML, encoding errors are fatal, so there is no provision for encoding overrides there, nor should there be, I would think. We shouldn't be using XHTML here if we don't know the correct encodings, imo... If we know the encoding in the FTP code, of course, we should be setting it.
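The point about fatal encoding errors can be seen with any conforming XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` rather than Gecko's parser, and the listing markup and file name are made up for illustration. The lenient decode at the end is only a stand-in for how an HTML-style pipeline can substitute replacement characters instead of aborting.

```python
import xml.etree.ElementTree as ET

# A file name in windows-1251 bytes placed in a document that claims UTF-8.
name_bytes = "Имя".encode("cp1251")
listing = b'<?xml version="1.0" encoding="UTF-8"?><li>' + name_bytes + b"</li>"

def xml_parse_ok(data: bytes) -> bool:
    """Return True if the bytes parse as XML, False on a fatal parse error."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False

# The XML parser rejects the whole document because of the bad bytes.
xml_ok = xml_parse_ok(listing)

# A lenient decode keeps going and marks the bad bytes with U+FFFD.
lenient_text = listing.decode("utf-8", errors="replace")
```

Here the strict parse fails for the whole document, while the lenient decode still produces text with replacement characters in place of the invalid bytes.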
I could maybe be convinced to look at charset overrides in nsHTMLDocument::StartDocumentLoad in the XHTML case, but would they actually have any effect on what the XML parser is doing?
> overrides there, nor should there be, I would think. We shouldn't be using
> XHTML here if we don't know the correct encodings, imo... If we know the
> encoding in the FTP code, of course, we should be setting it.

Unfortunately, we don't know the correct encoding. And as far as I know, it is valid to have multiple encodings in one FTP listing.

> I could maybe be convinced to look at charset overrides in
> nsHTMLDocument::StartDocumentLoad in the XHTML case, but would they actually
> have any effect on what the XML parser is doing?

Hmm, you're right. This doesn't work.
Sounds like we need to stop using XHTML for FTP listings, then.
As Dao set the dependencies, once bug 478416 lands, we can switch back to using HTML instead of XHTML for FTP listings. However, bug 478416 won't land on 1.9.1, so I'm not sure what a correct fix for the 1.9.1 branch would be. Are pages served as XHTML affected by the same problem as well? Is there any other fix imaginable?
> Are pages served as XHTML affected by the same problem as well?

If by "problem" you mean user not being able to override the encoding, yes.

> Is there any other fix imaginable?

Well, we could do surgery on the XML parser, adding codepaths that violate the XML spec, for this... That doesn't seem like an acceptable 1.9.1 change, honestly.
Is there any other solution conceivable? For example, can kCharsetFromUserForced be used with XHTML documents?
I already answered that in comment 13.
(In reply to comment #11)
> Sounds like we need to stop using XHTML for FTP listings, then.

Or for directory listings in general. Reportedly, a local file name containing a character that XML forbids triggers the YSoD (yellow screen of death), too: http://krijnhoetmer.nl/irc-logs/whatwg/20091029#l-301
The work I have started in bug 525222 will resolve this issue as well. Marking it as a dependency.
Bug 525222 landed.