Closed
Bug 28787
Opened 25 years ago
Closed 24 years ago
directory listing display Non ASCII filename as garbage
Categories
(Core Graveyard :: RDF, defect, P1)
Tracking
(Not tracked)
VERIFIED
FIXED
M17
People
(Reporter: ftang, Assigned: nhottanscp)
References
Details
(Whiteboard: [nsbeta2+] ETA 7/24)
Attachments
(3 files)
1.91 KB, patch
40.17 KB, text/plain
77.30 KB, image/jpeg
This was caught when I used the directory viewer to view a directory name on my
Japanese NT. The same thing happens for any non-ASCII name.
I used the technique in bug 28424 to catch this problem. It asserts at the line
values[i] = value;
of nsHTTPIndexParser::ParseData.
cvsblame shows me this is waterson's code.
I believe this is one of the reasons users see garbage for non-ASCII
file/folder names in the directory viewer.
Here is the stack trace:
nsAutoString::operator=(const nsStr & {...}) line 822 + 21 bytes
nsHTTPIndexParser::ParseData(const char * 0x033faa1d) line 665
nsHTTPIndexParser::ProcessData() line 508 + 15 bytes
nsHTTPIndexParser::OnDataAvailable(nsHTTPIndexParser * const 0x033e1550, nsIChannel * 0x033dad00, nsISupports * 0x00000000, nsIInputStream * 0x033e5f58, unsigned int 0x00000000, unsigned int 0x00000238) line 454
nsDocumentOpenInfo::OnDataAvailable(nsDocumentOpenInfo * const 0x033dfbd0, nsIChannel * 0x033dad00, nsISupports * 0x00000000, nsIInputStream * 0x033e5f58, unsigned int 0x00000000, unsigned int 0x00000238) line 262 + 46 bytes
nsFileChannel::OnDataAvailable(nsFileChannel * const 0x033dad04, nsIChannel * 0x033e1950, nsISupports * 0x00000000, nsIInputStream * 0x033e5f58, unsigned int 0x00000000, unsigned int 0x00000238) line 468 + 49 bytes
nsOnDataAvailableEvent::HandleEvent(nsOnDataAvailableEvent * const 0x033e2c80) line 370
nsStreamListenerEvent::HandlePLEvent(PLEvent * 0x033e2c30) line 93 + 12 bytes
PL_HandleEvent(PLEvent * 0x033e2c30) line 526 + 10 bytes
PL_ProcessPendingEvents(PLEventQueue * 0x01074b50) line 487 + 9 bytes
where values is an array of nsString and value is an nsCString.
Notice that value may contain non-ASCII text. On US Windows it contains cp1252
(which is about 88% the same as ISO-8859-1, with the roughly 12% of differences
falling in [0x80-0x9F]); on a Japanese system, the charset is totally different
outside the ASCII part.
Using the = operator for this kind of nsCString-to-nsString assignment causes bad data
conversion.
I am not sure how you should fix it. If nsHTTPIndexParser is only used for file
system directories (and not for anything else, for example FTP), then you can assume the
nsCString is in the charset given by nsIPlatformCharset (with the file name selector) and use an
nsIUnicodeDecoder to convert it into PRUnichar*. If nsHTTPIndexParser is used
by anything other than the file: protocol, you probably need to pass the charset in
some way from the file protocol.
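To make the difference concrete, here is a minimal Python sketch (an illustration, not Mozilla code). It assumes that the = assignment widens each byte to one 16-bit unit, which is equivalent to decoding the bytes as ISO-8859-1. The sample folder name, the choice of Shift_JIS as the Japanese platform charset, and the helper names are hypothetical.

```python
# Illustration only: byte-by-byte widening versus a charset-aware decode.

def widen_bytes(raw: bytes) -> str:
    """Map each byte to the code point of the same value (same result as an ISO-8859-1 decode)."""
    return "".join(chr(b) for b in raw)

def decode_with_charset(raw: bytes, charset: str) -> str:
    """Decode using the charset the bytes were actually written in."""
    return raw.decode(charset)

# Hypothetical Japanese folder name as stored under Shift_JIS.
name = "日本語"
raw_sjis = name.encode("shift_jis")

garbled = widen_bytes(raw_sjis)                       # one character per byte, unreadable
correct = decode_with_charset(raw_sjis, "shift_jis")  # the original name

# cp1252 and ISO-8859-1 disagree only in 0x80-0x9F; 0x80 is the euro sign in cp1252.
euro_cp1252 = decode_with_charset(b"\x80", "cp1252")
euro_widened = widen_bytes(b"\x80")
```

Pure ASCII names survive the widening unchanged, which is why the problem only shows up for non-ASCII names.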
Comment 1•25 years ago (Reporter)
Changed the QA contact to teruko and added the 'beta1' keyword.
Need info on what exactly the bad problem is here.
Whiteboard: [NEED INFO]
Comment 3•25 years ago (Reporter)
Sorry, I forgot to mention the user-visible problem. Users on non-English systems,
including Japanese, Chinese, and Korean on all platforms and European languages on
Macintosh, will see their folder names and file names as garbage in the
file:/// directory view. This is also true for Macintosh users, who commonly use
non-ASCII characters such as the bullet, the "folder f", or "Mu".
Updated•25 years ago (Reporter)
Whiteboard: [NEED INFO]
Updated•25 years ago
Target Milestone: M15
Putting on PDT- radar for beta1. Assuming this is only for ftp directory
listings. bobj was in PDT and approved.
Whiteboard: [PDT-]
Comment 5•25 years ago
Yeah, this'll probably involve making sure that the back-end code generates
UTF-8 as well. Jud, you'll need to do this for the FTP stream converter (if it's even
possible for FTP directories to be sent in non-ASCII). I'll do the filesystem
stuff and kick it over to you once that's working.
Status: NEW → ASSIGNED
Comment 6•25 years ago (Reporter)
I tried to debug 20292 and found that it asserts in the same place as this one, so
this bug is also the cause of 20292. Please be careful here: FTP and the file system
do NOT operate in UTF-8 these days.
Blocks: 20292
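The warning above can be illustrated with a small Python sketch (not Mozilla code). It assumes a listing that carries a file name in EUC-JP; the sample name and the helper name try_utf8 are hypothetical. A strict UTF-8 decode rejects such bytes instead of producing the intended text.

```python
def try_utf8(raw: bytes):
    """Return the decoded text if raw is valid UTF-8, otherwise None."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None

raw_eucjp = "日本語".encode("euc_jp")  # hypothetical file name bytes in a legacy charset
ascii_only = b"README.txt"

utf8_result = try_utf8(raw_eucjp)    # legacy bytes are not valid UTF-8
ascii_result = try_utf8(ascii_only)  # ASCII is valid in both, so it passes
```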
Comment 7•25 years ago (Reporter)
waterson, what is the ETA for fixing this?
Updated•25 years ago
Target Milestone: M15 → M16
Comment 8•25 years ago (Reporter)
Changed the summary from "illegal use nsCString to nsString -
nsHTTPIndexParser::ParseData" to "directory listing display Non ASCII filename
as garbage". This bug also shows garbage on US MacOS 9 as well as in non-Western
Windows/Linux locales.
put in nsbeta2
Updated•25 years ago (Reporter)
Summary: illegal use nsCString to nsString - nsHTTPIndexParser::ParseData → directory listing display Non ASCII filename as garbage
Updated•25 years ago
Priority: P3 → P1
Comment 10•25 years ago
This is going to involve a fairly significant re-write of the nsHTTPIndexParser
code, which is probably best done after NEW_STRING_APIS are turned on (so I
don't have to do it twice).
Updated•24 years ago
Target Milestone: M16 → M17
Comment 11•24 years ago
*** Bug 40661 has been marked as a duplicate of this bug. ***
Comment 13•24 years ago
Frank (Tang), please provide an FTP URL (preferably outside of our firewall)
which demonstrates this bug. That would be very helpful. :^)
Status: NEW → ASSIGNED
Updated•24 years ago
Whiteboard: [nsbeta2+] → [nsbeta2+] ETA: 7/14
Comment 14•24 years ago
nhotta, I haven't heard from Frank (Tang) in a while; perhaps you can help with
this bug a bit. First, can you reproduce it? (Frank didn't give a URL in this
bug to test, so I've been going to "ftp://kaze/pub/", which was a URL from one of
the other dependency bugs.) From what I'm seeing, what we display in Mozilla is
the same as what we display in 4.x.
I was looking at this bug a week or so ago, and made some small changes for
testing which might actually be more appropriate [basically, instead of using the
IO service, use textToSubURI->UnEscapeAndConvert()]. If you are still seeing the
original problem that Frank reported, please try applying the diff (I'll attach
it as well as the entire file to this bug) and see if that helps.
Comment 15•24 years ago
Comment 16•24 years ago
Comment 17•24 years ago
nhotta, can you test this for me? (Please see comment above and try out the diff
if need be.)
Assignee: rjc → nhotta
Status: ASSIGNED → NEW
Comment 18•24 years ago (Assignee)
After I used the attached file and rebuilt, I still see the problem on my
Japanese WinNT 4.
There is a slight difference after the change, so I will attach a screenshot.
It is a separate problem, but I also see something wrong in a date string (the year is
bogus).
Assignee: nhotta → rjc
Comment 19•24 years ago (Assignee)
Comment 20•24 years ago (Reporter)
Does rjc's fix work for Japanese directory viewing in file:///?
I think we should separate the file:/// issue from the ftp:// issue. We have a way to know
the charset for file:/// (from nsPlatformCharset), but we don't know the
charset of ftp://.
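A minimal Python sketch of the split proposed above (illustrative only, not Mozilla code): for file:/// the charset can come from the platform, while for ftp:// it has to be supplied from outside. The function name charset_for_listing and the user_charset parameter are hypothetical, and sys.getfilesystemencoding() merely stands in for nsPlatformCharset.

```python
import sys
from typing import Optional
from urllib.parse import urlsplit

def charset_for_listing(url: str, user_charset: Optional[str] = None) -> Optional[str]:
    """Pick the charset used to decode file names in a directory listing."""
    scheme = urlsplit(url).scheme.lower()
    if scheme == "file":
        # The local platform knows how it encodes file names.
        return sys.getfilesystemencoding()
    if scheme == "ftp":
        # The server does not say; only an external hint (if any) can help.
        return user_charset
    raise ValueError(f"unsupported scheme: {scheme!r}")
```

Returning None for ftp:// without a hint keeps the "we don't know" case explicit instead of silently guessing.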
Comment 21•24 years ago
file:/// handling is separate from ftp:// handling... and this bug will deal
only with ftp :)
Comment 22•24 years ago
Well, this bug was a huge pain. Here's the current state of the world:
o FTP (the spec/protocol) doesn't appear to be strong enough to provide locale
info to the client
o because of that, FTP (in Necko) just sends up bytes (escaped) for filenames
o it's rather hard to reliably convert random bytes (which could be encoded in ANY
fashion) into Unicode
so...
o I've whacked the directory/ftp datasource to use:
nsITextToSubURI::UnEscapeAndConvert(oldCharset, byteRun, &ucs2Result)
instead of the bare-bones nsEscape() for FTP filename conversion
o I've added an "encoding" attribute to the directory/ftp datasource's IDL, so
that JavaScript can pass in the "current" encoding as chosen by the user
o In mozilla/xpfe/components/directory/directory.js around line #69, it's
currently hard-coded to this:
httpDS.encoding = "ISO-8859-1";
If I then ftp to "ftp://kaze/pub/" I see crap. If I change line #69 to:
httpDS.encoding = "EUC-JP";
I then see Japanese. <whee!>
I've just checked all this in.
All that's left: someone who is familiar with document charset encoding needs to
remove that hard-coded line and instead get the current document's charset and
pass that in instead.
Should be really trivial JavaScript... as fate would have it, I haven't had any
luck with it.
nhotta, can you (or cata, or someone who's been involved with the charset menus)
hook this up?
Assignee: rjc → nhotta
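The approach described above, unescaping the bytes and then converting them with a caller-supplied charset, can be sketched in Python as follows. This is an illustration rather than the nsITextToSubURI implementation; the function name unescape_and_convert and the sample file name are hypothetical.

```python
from urllib.parse import quote, unquote_to_bytes

def unescape_and_convert(charset: str, escaped: str) -> str:
    """Undo %XX escaping to get raw bytes, then decode them with the given charset."""
    raw = unquote_to_bytes(escaped)
    return raw.decode(charset, errors="replace")

# Hypothetical file name as an FTP listing might carry it: EUC-JP bytes, %-escaped.
escaped_name = quote("日本語.txt".encode("euc_jp"))

as_latin1 = unescape_and_convert("ISO-8859-1", escaped_name)  # wrong charset: garbage
as_eucjp = unescape_and_convert("EUC-JP", escaped_name)       # right charset: Japanese
```

Note that the ISO-8859-1 decode does not fail; it just yields the wrong characters. That is why the charset has to be passed in rather than detected from a decoding error.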
Comment 23•24 years ago (Reporter)
rjc-
Thanks for fixing this up to this stage. I think your change for ftp is great. However,
the original bug is about file directories, not ftp, if you read carefully. Maybe we
should file a separate bug for that.
Comment 24•24 years ago
Frank, as has been indicated at various places in this bug by various people
(including myself), this bug will only deal with ftp problems. If you have a
problem with file encoding as well, that will have to be a different bug.
Comment 25•24 years ago
If it helps to explain why this is the case, realize that nsHTTPIndexParser
(which is where the problem was reported as being) deals with ftp, not file.
Comment 26•24 years ago (Assignee)
Clearing ETA.
Somehow, the character set seen from JS is always UTF-8. No override or autodetection
affects the charset (so the menu check mark is always at UTF-8).
The value I am looking at for this is
window._content.document.characterSet
Whiteboard: [nsbeta2+] ETA: 7/14 → [nsbeta2+]
Comment 27•24 years ago (Assignee)
Since the charset is always UTF-8, I cannot use it to set "httpDS.encoding".
I will talk to ftang tomorrow about this.
Updated•24 years ago (Assignee)
Whiteboard: [nsbeta2+] → [nsbeta2+] No ETA
Comment 28•24 years ago (Assignee)
So the document charset is always "UTF-8".
If I can get the appCore, then I may be able to use the default charset as below.
window._content.appCore.GetDocumentCharset()
But that returns null; I cannot get the appCore in directory.js.
Reassigning to rjc. Do you know how to get the appCore in directory.js?
Assignee: nhotta → rjc
Comment 29•24 years ago
nhotta, it might be the case that the appCore can't be obtained in directory.js
because the entire page is sand-boxed inside of the browser's content area.
Can you guys come up with another way of getting the charset?
Assignee: rjc → nhotta
Comment 30•24 years ago (Assignee)
No, we only have the document charset or the default charset.
The menu check mark uses the document charset. That's why any ftp page gets a
charset check mark at "UTF-8". So getting the charset from the menu does not work.
Comment 31•24 years ago (Assignee)
For every ftp load, directory.xul is loaded from C++ code. That may be related to why
I cannot get appCore in directory.js. I tried to use window.parent.appCore but
that's also null.
BTW, local file listing is working fine for Japanese (I think this bug was
originally filed for local file listing).
Comment 32•24 years ago (Assignee)
Asked Cata for help with the appCore problem.
Comment 33•24 years ago (Reporter)
nhotta- ETA ? Next Tuesday ???
Comment 34•24 years ago (Assignee)
I got help from Bill Law on the appCore issue; ETA is 7/24.
Status: NEW → ASSIGNED
Whiteboard: [nsbeta2+] No ETA → [nsbeta2+] No 7/24
Updated•24 years ago (Assignee)
Whiteboard: [nsbeta2+] No 7/24 → [nsbeta2+] ETA 7/24
Comment 35•24 years ago (Assignee)
Fix checked in. Japanese names now display when the correct charset is set from the menu.
The menu check mark sticks to UTF-8, but that should be a separate bug.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Comment 36•24 years ago
I verified this in the 2000-07-24-08 Win32, Mac, and Linux builds.
Status: RESOLVED → VERIFIED
Updated•6 years ago
Product: Core → Core Graveyard