Closed
Bug 162377
Opened 22 years ago
Closed 22 years ago
Non-ASCII filename is not shown when saving and downloading from a ftp listing
Categories
(Core :: Internationalization, defect, P1)
Tracking
()
VERIFIED
FIXED
mozilla1.0.1
People
(Reporter: ji, Assigned: nhottanscp)
References
Details
(Keywords: intl, topembed, Whiteboard: [adt2 RTM] [ETA 08/21] [Land on 1.0 branch post MachV])
Attachments
(1 file, 5 obsolete files)
7.94 KB,
patch
|
shanjian
:
review+
shanjian
:
superreview+
chofmann
:
approval+
|
Details | Diff | Splinter Review |
This bug is seperated from bug 155569. Fix for bug 155569 is reading the page encoding. This works for most cases, but on a ftp listing page, to display non-ascci filename correctly on the page, charset preference setting has to match the sever encoding. But ftp listing page is encoded in UTF-8, this caused the fix for bug 155569 not working for ftp listing page. On the latest branch build, the filename is not shown on File Save dialog except the file extension part, the user can still correct the filename on the file save dialog. But on some embedding applications, because there is no file save dialog as in Netscape client, the file is downloaded directly from the browser to local drive, so the user has no chance to correct the filename, non-ascii filename is plainly shown as .exe or .doc in local drive after it's downloaded. Steps to reproduce on Netscape client: 1. go to ftp://10.169.114.243 2. Go to Edit | Preference | Language, select Simplified Chinese (GB2312) as Character Encoding, so the Chinese filenames can be viewed correctly on the listing page. 2. Click on a doc file, Helper dialog comes up. Select "Save the File", then File Save dialog comes up. You can see the filename is not displayed on the dialog.
Comment 3•22 years ago
|
||
The problem is that we have no way of knowing what the server's charset actually is. See bug 145101 and bug 26767, but mainly bug 26767. Since almost no server implements the RFC in that bug, that wouldn't help you, though. The basic problem is that the ftp spec is ASCII only...
Assignee | ||
Comment 4•22 years ago
|
||
I think this bug is for a specific case when the server's charset and the browser's default charset matches (so ftp listing works for non ASCII).
Status: NEW → ASSIGNED
Assignee | ||
Comment 5•22 years ago
|
||
I talked to ftang about this. Here is the current plan. In webshell, we get a document charset and set it as originCharset. For ftp listing, howerver, the charset is UTF-8 and the filename is encoded in browser's default (this assumes the list is already shown correctly). So, what we do is to detect ftp listing case in webshell and set browser's default to originCharset as a special case.
Assignee | ||
Comment 6•22 years ago
|
||
Assignee | ||
Comment 7•22 years ago
|
||
Attachment #95331 -
Attachment is obsolete: true
Comment 8•22 years ago
|
||
Comment on attachment 95336 [details] [diff] [review] Moved mPref check down to ftp case. r=ftang
Attachment #95336 -
Flags: review+
Comment 9•22 years ago
|
||
Comment on attachment 95336 [details] [diff] [review] Moved mPref check down to ftp case. This looks basically identical to the code at http://lxr.mozilla.org/mozilla/source/netwerk/streamconv/converters/nsDirIndexP arser.cpp#66 Why isn't that code triggering? Oh. Oops In http://lxr.mozilla.org/mozilla/source/netwerk/streamconv/converters/nsIndexedTo HTML.cpp#206, that code should be getting the encoding from mParser::GetEncoding, instead of using utf8. Does that change fix this bug? If so, then that would be the preferred way to do this. I think I even had that fixed locally as part of my patches to i18n-ify the html file directory listings which I never got to finish. That fix is too early to pick up the hints from the protocol for the encoding, but ftp can't set those anyway ATM. I'll try to generate a patch this evening which you could test
Comment 10•22 years ago
|
||
This compiles, but I'm not set up to test it - can someone please do so? If it does work, then this patch should be applied instead of the other one. If it doesn't, then (to help me debug when I have to fix this later), can you please tell me if the filename looks correct on the page, with and without this patch? Thanks! (As a side note, the reason that this code uses UTF-8 always is because it uses nsITextToSubURI internally to do conversions. I think that I'm mishandling that somewhere, but I'm not sure where...)
Assignee | ||
Comment 11•22 years ago
|
||
> please tell me if the filename looks correct on the page, with and without this > patch? Thanks! With the patch, the ftp list does not show the file names correctly (ftp://10.169.114.243). The document charset is now changed to GB2312 (from the browser's default). Without the patch, the filename were shown correctly when the browser's default matches with the server (e.g. "GB2312" for ftp://10.169.114.243). > (As a side note, the reason that this code uses UTF-8 always is because it uses > nsITextToSubURI internally to do conversions. I think that I'm mishandling that > somewhere, but I'm not sure where...) > If you want to go with your current approach (i.e. the last patch), then how about just apply unescape without converting to Unicode? Then the charset and the actual content could match and the filename would be shown correctly. >This compiles, but I'm not set up to test it - can someone please do so? What does that mean? I think I can help you about this if that means you don't have fonts (are you using MS Widnows?).
Assignee | ||
Comment 12•22 years ago
|
||
318 nsIndexedToHTML::FormatInputStream(nsIRequest* aRequest, nsISupports *aContext, const nsAString &aBuffer) 319 { 320 nsresult rv = NS_OK; 321 NS_ConvertUCS2toUTF8 buffer(aBuffer); 322 nsCOMPtr<nsIInputStream> inputData; 323 rv = NS_NewCStringInputStream(getter_AddRefs(inputData), buffer); This part (line 321) generates HTML in UTF-8. This has to change also if we label the page as parser's charset instead of UTF-8. We can hook up charset converter here.
Assignee | ||
Comment 13•22 years ago
|
||
Attachment #95336 -
Attachment is obsolete: true
Attachment #95391 -
Attachment is obsolete: true
Comment 14•22 years ago
|
||
Bradley Baetz , not sure your approach will work the reason now the generated xml is in utf8 is because we are mixing data from both the string bundle (which contains the localized message from the cleint) and the data from the ftp protocol. The only way we can make both work is to generate encoding in utf-8, we get the string bundle (which is utf8) and generated it AS IS to the output, and get the data from the ftp, convert it from the default charset to the utf-8 and generate it. IF you change the xml encoding to something else, then the data from string bundle may not convert into it correctly. For example, if the build is localized to japanese build and the default charset is set to korean, then the string bundle is japanese in utf-8, we then convert the ftp data from korean euc-kr to utf-8 and generate it. IF you change the the xml encoding to euc-kr, then the utf-8 string bundle wiill show up as garbage if you generate without convert, if you generate with convert, the japanese text will be convert to ? mark sicne some of the text cannot be encoded in euc-kr. Only the utf-8 or other unicode can convert everything.
Comment 15•22 years ago
|
||
ok, I take my comment back. after talk to nhott, he said we can generate ncr instead for the string bundle data. I agree with it. so he will work out a patch which include that part.
Comment 16•22 years ago
|
||
No, I'm running linux, and I don't know what ite meant to look like anyway... In that case, use your patch, and I'll tidy it up when I have more time to make the file stuff work (in a few months...) My approach doesn't work, but I thought I was converting it to utf8. I've just noticed that the texttosuburi thing uses ucs2, so maybe that was the problem.
Assignee | ||
Comment 17•22 years ago
|
||
Attachment #95452 -
Attachment is obsolete: true
Comment 18•22 years ago
|
||
Comment on attachment 95622 [details] [diff] [review] Added code to convert string bundle to NCR. r=ftang
Attachment #95622 -
Flags: review+
Comment 19•22 years ago
|
||
cc to rick
Updated•22 years ago
|
Priority: -- → P1
Whiteboard: [adt2 RTM] [ETA 08/21] [Land on 1.0 branch post MachV]
Target Milestone: --- → mozilla1.0.1
Comment 20•22 years ago
|
||
Comment on attachment 95622 [details] [diff] [review] Added code to convert string bundle to NCR. - In nsIndexedToHTML.cpp: +static void ConvertNonAsciiToNCR(const nsAString& in, nsAFlatString& out) +{ + nsAString::const_iterator start, end; Any reason for |in| to not be a nsAFlatString as well? If you can make that change, then change start and end to be nsAFlatString::const_iterator's too. ... + while (start != end) { + if (*start < 128) { + out.Append(*start++); + continue; + } + out.Append(NS_LITERAL_STRING("&#x")); + nsAutoString hex; + hex.AppendInt(*start++, 16); + out.Append(hex); + out.Append((PRUnichar)';'); + } Please loose the |continue| and move the code below the if statement into an else statement. - In nsIndexedToHTML::FormatInputStream(): + if (NS_SUCCEEDED(rv)) + rv = mUnicodeEncoder->SetOutputErrorBehavior(nsIUnicodeEncoder::kOnError_Replace, + nsnull, (PRUnichar)'?'); Indent the rv = ... and please put braces around the if, even if it's just a one-line statement. sr=jst
Attachment #95622 -
Flags: superreview+
Comment 21•22 years ago
|
||
Attachment #95622 -
Attachment is obsolete: true
Comment 22•22 years ago
|
||
> Any reason for |in| to not be a nsAFlatString as well? If you can make that > change, then change start and end to be nsAFlatString::const_iterator's too. I thought nsAString is more universal than nsAFlatString, and it should not make any difference here, right? > Please loose the |continue| and move the code below the if statement into an > else statement. Done. > Indent the rv = ... and please put braces around the if, even if it's just a > one-line statement. Done.
Comment 23•22 years ago
|
||
Comment on attachment 95954 [details] [diff] [review] new patch carry over r/sr
Attachment #95954 -
Flags: superreview+
Attachment #95954 -
Flags: review+
Comment 24•22 years ago
|
||
fix checked into trunk.
Comment 25•22 years ago
|
||
Resolving as fixed per Comment #24 From Shanjian Li. ji: pls verify this as fixed on the branch. thanks!
Comment 26•22 years ago
|
||
mark this bug as fixed since shanjian land into trunk
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Comment 27•22 years ago
|
||
ji: can you pls verify this as fixed on the trunk? thanks!
Reporter | ||
Comment 28•22 years ago
|
||
I have some questions, with 08/20 trunk, I go to ftp://10.169.114.243 then set global default to GB2312 via Edit | Preference | Language. On the ftp listing, clicking on .exe and .doc file brings up helper dialog, Chinese filename can be shown correctly if i choose to save the file. But for .txt and .html file, if I Shift click the file to save the file instead of opening the file, the Chinese html/txt filename name is always shown as "File" on Save as dialog. Shanjian/Frank, is it by design or a limitation? If that's the way for html/txt file, I'll close this bug as fixed. Thanks.
Reporter | ||
Comment 29•22 years ago
|
||
Right click on the file and select Save Link Target As menu has the same problem. It looks like we still have problems with html/text files which don't need to go through helper.
Reporter | ||
Comment 30•22 years ago
|
||
Filed bug 163682 for the problem with html/text file download. Verified this one as fixed.
Status: RESOLVED → VERIFIED
Reporter | ||
Comment 31•22 years ago
|
||
Did more testing, actually download via Save Link Target As menu has problems with all different file types: .exe/.doc/.html/.txt/.jpg/.bmp/.gif
Reporter | ||
Comment 32•22 years ago
|
||
QA contact to Rui for the verification on the branch build. Thanks.
QA Contact: ji → ruixu
Comment 33•22 years ago
|
||
jaimejr, when should we land this into the m1.0 branch? how many approval we need to got post machv?
Comment 34•22 years ago
|
||
adt1.0.1+ (on ADT's behalf) approval for checkin to the 1.0 branch, pending Drivers' approval. pls check this in asap, the replace "Mozilla1.0.1+" with "fixed1.0.1". [Using adt1.0.1 keywords as a proxy, since no edt1.0.1 keywords were created for the 1.0.1 branch. This is needed on the 1.0 branch for a major embedor]. thanks!
Comment 35•22 years ago
|
||
Naoki, please make sure to include patch for bug 164197 when you check in this patch to branch.
Assignee | ||
Comment 36•22 years ago
|
||
shanjian, do you mean that the patch of this bug has to merge with the patch of bug 164197?
Comment 37•22 years ago
|
||
Kind of. In fact, bug 164197 fixed 3 problems, 2 caused by this patch, the one caused by another one of the 4 ftp patch. I think you will checkin all the 4 ftp patches, so be sure to apply patch from 164197 before you do so.
Comment 38•22 years ago
|
||
You need to use Substring instead of nsDependentCString to wrap a chat* which doesn't end in NULL. Just grab the netwerek/ftp/srv part of that patch to fix that, although taking the other part won't hurt; it fixes what could be a significant leak if you do lots of ftp browsing.
Comment 39•22 years ago
|
||
Comment on attachment 95954 [details] [diff] [review] new patch a=chofmann for 1.0.2
Attachment #95954 -
Flags: approval+
Updated•22 years ago
|
Keywords: mozilla1.0.1 → mozilla1.0.1+
Reporter | ||
Comment 42•22 years ago
|
||
verified as fixed with 2002-08-28-08-1.0 build.
Keywords: fixed1.0.2 → verified1.0.2
You need to log in
before you can comment on or make changes to this bug.
Description
•