Closed Bug 56630 Opened 25 years ago Closed 25 years ago

Wrong character encodings set by Mozilla for this URL.

Categories

(Core :: Internationalization, defect, P3)

VERIFIED WONTFIX

People

(Reporter: tarahim, Assigned: shanjian)

Details

To reproduce: Load the URL above.
Result: No background image, no frames, and no tables are rendered; only a run of garbage characters is displayed. Mozilla sets the encoding to UTF-16. If you change the encoding manually to Western or Japanese, the page is redrawn correctly. However, hitting Reload returns the page to the garbled state, even though the character coding does not change back to UTF-16.
Reproducibility: Always.
Builds: 2000101320 Trunk and 2000101014 M18.
View Source and Page Info did not help me work out the cause of this problem.
The HTML at the URL has a meta tag: <meta http-equiv="Content-Type" content="text/html; charset=x-sjis">. Opening the URL in Composer gives the same result. Interestingly, you can manually switch the character coding to Western (ISO-8859-1) in the editor, and the file then becomes editable. See also bug 56626. Switching component to Internationalization.
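For reference, here is a minimal Python sketch of how a declared charset could be read from a meta tag like the one quoted above. This is not Mozilla's parser: the function name, the regular expression, the 1024-byte scan limit, and the alias table are all assumptions made for illustration. The one fact it relies on is that x-sjis is a label for Shift_JIS.

```python
import re
from typing import Optional

# Assumption: a loose pattern for "charset=<label>" is good enough for a sketch;
# real HTML parsers apply much stricter prescan rules.
META_CHARSET = re.compile(rb"charset\s*=\s*[\"']?([A-Za-z0-9_.:-]+)", re.IGNORECASE)

# Assumption: a tiny alias table; x-sjis is a label for Shift_JIS.
LABEL_ALIASES = {"x-sjis": "shift_jis"}


def charset_from_meta(html: bytes, scan_limit: int = 1024) -> Optional[str]:
    """Return the charset label declared near the top of the document, if any."""
    match = META_CHARSET.search(html[:scan_limit])
    if match is None:
        return None
    label = match.group(1).decode("ascii").lower()
    return LABEL_ALIASES.get(label, label)


tag = b'<meta http-equiv="Content-Type" content="text/html; charset=x-sjis">'
declared = charset_from_meta(tag)
print(declared)
```

A declaration like this only helps if it is consulted before, or given priority over, any byte-level guess about the encoding.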
Component: HTML Element → Internationalization
Fixing typo in summary; it used to say "charcter." Also updating QA Contact and Assignee to reflect the change in Component.
Assignee: clayton → nhotta
QA Contact: lorca → teruko
Summary: Wrong charcter encodings set by Mozilla for this URL. → Wrong character encodings set by Mozilla for this URL.
I was able to see the page correctly with charset auto-detection turned off. But after I selected Japanese auto-detection, I got the garbage characters.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Changing platform to All since I was able to reproduce this on WinNT. I think this problem is related to charset auto-detection. Reassigning to shanjian, cc ftang.
Assignee: nhotta → shanjian
OS: Mac System 8.6 → All
Hardware: Macintosh → All
I spent some time on this problem. I found that the first byte of the message body is '\0'. When the charset detector tries to detect the charset, the only possible explanation it finds is le_utf16. That happens before the meta tag is even read. I am not sure whether this problem is caused by our network code or is a mistake on the website's side.
Do we know why the first byte is zero? Can someone contact the webmaster?
The home page of this website (www.webstar-s.com) does not have this zero in the text stream from the HTTP server, so it does not seem to be the HTTP server's problem. I guess that the index.html file itself starts with a zero. I will close this bug. If anybody believes either (1) that this is not the website's problem and the extra zero was added by the browser, or (2) that this is common practice and many websites behave this way, then please reopen the bug.
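To illustrate the effect described in the comment above, here is a minimal Python sketch of a byte-level heuristic that reads an early NUL byte as a sign of UTF-16. It is not Mozilla's detector: the function name and the two-byte window are hypothetical, and the sample page is a stand-in built from the meta tag quoted in this bug, not the real site's content.

```python
from typing import Optional


def guess_from_leading_bytes(data: bytes) -> Optional[str]:
    """Hypothetical rule: a NUL among the first two bytes suggests UTF-16."""
    if b"\x00" in data[:2]:
        return "utf-16"
    return None


# Stand-in page: one stray NUL byte followed by ordinary ASCII markup.
markup = b'<html><head><meta http-equiv="Content-Type" content="text/html; charset=x-sjis"></head></html>'
page = b"\x00" + markup

guess = guess_from_leading_bytes(page)

# Decoding as UTF-16 pairs up the bytes, so the markup (meta tag included)
# no longer exists as readable text.
garbled = page.decode("utf-16-le", errors="replace")

print(guess)
print("<meta" in garbled)
```

Under these assumptions the guess is made from the first bytes alone, which matches the observation that the wrong encoding is chosen before the meta tag is read.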
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → WONTFIX
I have emailed the webmaster about the URL above, but have not received any response. BTW, I have found another site with a single byte in its HTML that causes auto-detect to fall into UTF-16: http://www.dontaku.com/table/fuzokuindex.html This one does not have the offending byte at the beginning, but in a tag in the header of the HTML (yes, it has the entire HTML in the header, though). The HTML does not have a meta tag to specify any charset, but I think the browser should cope with an HTML error like this one rather than display totally unrelated information. I am just submitting this information and leaving the status as WONTFIX. If you think this is relevant, please reopen. 2000102008 Mac Trunk.
One more point. View Source for the URL below is also affected by auto-detect switching to UTF-16. Is this correct behavior? It seems strange that the browser and View Source produce the same display. http://www.dontaku.com/table/fuzokuindex.html
Verified as wontfix.
Status: RESOLVED → VERIFIED