** Observed with the 1/11/00 Win32 build and also with the M12-rtm build **
Apparently this problem has been around for a while. I hope it is being addressed somewhere and that this is a duplicate.
1. Use 4.71 to display the Netscape home page.
2. Save this page in .txt format via the "File | Save as..." menu. This produces the home page content in .txt format.
3. View this file locally in Mozilla using "File | Open File" under Shift_JIS encoding.
4. The display looks all wrong except for the ASCII characters. The results also look bad with Japanese Auto-Detect ON.
5. If you save the above URL in .html format and view it locally, there is no problem.
We don't seem to be sending Unicode data to the layout when the file is in .txt format.
If this is not a duplicate, we should mark this [Beta].
I pointed out this problem in bug 17022, and Marina pointed out the other problem in bug 16878. Those bugs are dups of bug 16868. Now 16868 is fixed, but this problem still happens.
Thanks. I've looked at the other bugs, and whatever the diagnosis was for those original bugs does not seem to apply to this one and Bug 17022. Rather than going back to the old bugs and retracing the discussions, which did not lead to a real solution, let's start the bug again here. Let me summarize the currently known facts.
1. A .txt file in Shift_JIS Japanese cannot be displayed correctly -- presumably we are not sending correct Unicode data to the layout for some reason.
2. An .html file in Shift_JIS Japanese displays OK under Shift_JIS encoding.
3. An .html file in Shift_JIS that encloses the Japanese data in a <pre> ... </pre> tag, making it preformatted text, displays OK under Shift_JIS encoding.
ftang should analyze the problem again and assign it to the right people or deal with it himself. I have marked this bug [Beta].
Since the problem applies generally to non-ASCII data in .txt format, I corrected the summary line.
email@example.com has been added to the CC line since he was consulted earlier in a supposedly related bug.
In the 4.x days, we handled plain text with the HTML code path. Somehow we now use an XML document to handle plain text in nsLayoutDLF.cpp, and this causes a problem: the charset policy for HTML and plain text is different from the one for XML. In XML, if no charset information is available, the charset defaults to "UTF-8". In HTML and plain text, if no charset information is available, the charset should default to whatever the user has selected. RickG: Is there any reason that you use nsXMLDocument to handle plain text instead of nsHTMLDocument? Can we switch back to nsHTMLDocument?
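To make the policy difference concrete, here is a minimal sketch in Python, not the actual Mozilla code. The function name pick_charset, the doc_kind strings, and the parameters are all hypothetical and only illustrate the fallback rule described above.

```python
from typing import Optional

# Fallback used for XML when nothing declares a charset (per the comment above).
XML_DEFAULT_CHARSET = "UTF-8"


def pick_charset(doc_kind: str, declared: Optional[str], user_default: str) -> str:
    """Illustrative only: choose the charset used to decode a document.

    doc_kind     -- hypothetical label: "xml", "html", or "text"
    declared     -- charset found in the data or headers, if any
    user_default -- the encoding the user selected in the browser menu
    """
    if declared:
        # Explicit charset information wins for every document kind.
        return declared
    if doc_kind == "xml":
        # XML policy: no charset information means UTF-8.
        return XML_DEFAULT_CHARSET
    # HTML and plain-text policy: fall back to the user's selection.
    return user_default
```

Under this sketch, a plain text file with no charset information that is routed through the XML policy gets "UTF-8" even when the user has selected Shift_JIS, which would match the symptom reported here.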
I can probably work around the problem by copying some charset policy code from nsHTMLDocument.cpp into nsXMLDocument.cpp.
big diff. Postpone to M14.
Change OS and Platform to ALL
Reassigning this to rickg since he said in email that he will make plain text use nsHTMLDocument instead of nsXMLDocument, which will fix this problem. This is not only a problem with Japanese but also with ISO-8859-1 plain text files. If you use nsXMLDocument to display an ISO-8859-1 plain text file, it will display incorrectly.
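The effect can be reproduced outside the browser. The following is a small Python sketch, assuming the failure amounts to decoding the file's bytes as UTF-8 instead of the intended charset; the helper decode_as and the sample strings are made up for illustration.

```python
def decode_as(data: bytes, charset: str) -> str:
    """Decode bytes with the given charset, marking undecodable bytes with U+FFFD."""
    return data.decode(charset, errors="replace")


# Sample plain-text content saved in two legacy encodings.
latin1_text = "café"
sjis_text = "abc 日本語"
latin1_bytes = latin1_text.encode("iso-8859-1")
sjis_bytes = sjis_text.encode("shift_jis")

# Decoding with the intended charset round-trips cleanly.
latin1_ok = decode_as(latin1_bytes, "iso-8859-1")
sjis_ok = decode_as(sjis_bytes, "shift_jis")

# Decoding the same bytes as UTF-8 keeps the ASCII part intact
# but damages the non-ASCII characters.
latin1_bad = decode_as(latin1_bytes, "utf-8")
sjis_bad = decode_as(sjis_bytes, "utf-8")
```

In both cases the ASCII prefix survives while the rest does not, which is consistent with the report that everything except the ASCII characters looks wrong.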
It's fixed in my tree -- awaiting opportunity to checkin.
Fixed with checkin last night.
I verified this in the 2000021408 Win32 and 2000021409 Linux builds. I cannot verify this on Mac because of bug 27773.
Bug 27773 is a dup of bug 22244. Now 22244 is fixed, but the browser on Mac does not open a local non-ASCII file. So I cannot verify this bug until bug 31054 is fixed.
OK, we found that if I add the file name extension, it works fine on Mac. I verified this in the 2000030809 Mac build.