Closed Bug 182764 Opened 23 years ago Closed 21 years ago

Incorrect auto-selection of Character Coding: UTF-8 instead of ISO-8859-1; some characters are displayed as Diamond (or "bare" on W95) Question Marks

Categories

(Core :: Internationalization, defect)

x86
All
defect
Not set
minor

Tracking

()

RESOLVED DUPLICATE of bug 158285

People

(Reporter: SkewerMZ, Assigned: smontagu)

Details

Attachments

(3 files)

Frequently, I visit sites like http://www.buy.com/ that exhibit a strange behavior where copyright symbols, long dashes, and non-breaking spaces are rendered as a diamond question mark character. This seems to be because the character coding is switching to UTF-8 even though the page is encoded in ISO-8859-1. The authors of buy.com have not specified the character coding. Expected: When the character coding is not specified, Mozilla should try to auto-detect the character coding from the active character codings under the customize menu, starting with the top. Actual: Somehow Mozilla is getting stuck on UTF-8. I cannot get Mozilla to do this by just visiting a UTF-8 page, but eventually it switches. Mozilla seems to be ignoring the setting under "customize." Build: 2002112808 WXP
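For illustration, the symptom described above can be reproduced outside the browser. The following is a minimal Python sketch, not Mozilla code. It assumes a page sends a copyright sign and a non-breaking space as single ISO-8859-1 bytes, and shows that decoding those same bytes as UTF-8 yields U+FFFD, the replacement character that is typically drawn as a question mark in a diamond. The sample text and variable names are made up for the example.

```python
# Assumed sample: copyright sign + non-breaking space, as an ISO-8859-1 page would send them.
original_text = "\u00a9\u00a02002 Example"
raw = original_text.encode("iso-8859-1")  # two single bytes: 0xA9 and 0xA0, then ASCII

# Decoding with the encoding the page was written in round-trips cleanly.
as_latin1 = raw.decode("iso-8859-1")

# Decoding as UTF-8: 0xA9 and 0xA0 are not valid UTF-8 on their own,
# so each one is replaced by U+FFFD (the replacement character).
as_utf8 = raw.decode("utf-8", errors="replace")

# ascii() is used so the output does not depend on the console encoding.
print(ascii(as_latin1))
print(ascii(as_utf8))
```

Plain ASCII text is byte-identical in both encodings, which may be part of why a wrong UTF-8 selection only shows up on the few non-ASCII characters in an otherwise English page.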
Could this bug be a duplicate of bug 158285?
Here's another example: http://www.pbs.org/cringely/pulpit/pulpit20030306.html Question marks appear at the end of every sentence, and the character coding shown by Mozilla is UTF-8; there is no declared character coding in the document. Build: 2003030805 Win2k
Flags: blocking1.3?
The behavior exhibited by that link is the behavior I was talking about. The first time I loaded it I got those diamond question marks. If I change the character coding to ISO-8859-1 in the menu, the question marks go away, and they don't come back for a good while. But they always eventually come back.
Unable to reproduce with today's branch build on WinXP.
Flags: blocking1.3? → blocking1.3-
This issue appears to have been fixed. If anyone sees the diamond question marks appear again using the latest nightlies, please reopen the bug and provide the URL and build ID.
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → WORKSFORME
<http://www.babelloyd.com/problems.html> is still displayed with diamond question marks for me in the latest build.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
<http://www.buy.com/retail/product.asp?sku=20353281&loc=105> will display the trademark signs as diamond question marks. If I change the character coding to ISO-8859-1 the diamond question marks will go away and not come back for a while.
[Mozilla/5.0 (Windows; U; Win95; en-US; rv:1.4a) Gecko/20030401] Re comment 7: WorksForMe: got 'ISO-8859-1' Char. Coding. Which build are you using ? Something must differ between our profiles...
Addition to comment 8: Comment 7: If I manually select UTF-8, I get simple (not in a diamond) question marks. Comment 6: WFM too; In manual UTF-8, some '"' (double quote) characters become simple (not in a diamond) question marks '?'. Comment 2: WFM too; In manual UTF-8, "every" '. ' (and more) become simple (not in a diamond) question marks '.?'.
I suspect the problem does not occur in Win95 because that OS doesn't fully support Unicode.
[Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4b) Gecko/20030507] Bug still there (somewhere), at least very soon after moving from v1.4a to v1.4b, keeping the same profile. New case (don't know if reproducible): Opened <http://bugzilla.mozilla.org/show_bug.cgi?id=168800> and got UTF-8 displaying 2 diamond question marks (see attachment). Addition to comment 8 and comment 9: WFM with v1.4b/W2K (sp2) too, for the 3 URLs. With _manual_ UTF-8, diamond question marks are displayed. Re comment 10: From my reported tests, it seems that (my) W95 is indeed unable to display the diamond question marks: it displays them as bare question marks. Does it mean that Win95 doesn't have the current bug about auto-selecting the wrong character coding? Does anyone have an opinion on my comment 1 (on possible duplicate bugs)?
Workaround: Manually select 'View > Character Coding > Western (ISO-8859-1)'. Changing Severity from Normal to Minor.
Severity: normal → minor
Summary: Diamond question marks appear on pages (incorrect character coding) → Incorrect auto-selection of Character Coding: UTF-8 instead of ISO-8859-1; some characters are displayed as Diamond (or "bare" on W95) Question Marks
Correction to comment 11: (I have W2K sp _3_.)
[Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4b) Gecko/20030507] Addition to comment 11: <http://bugzilla.mozilla.org/show_bug.cgi?id=168800> The UTF-8 remains after reloads and restarts ! Once changed to ISO-8859-1 (or UTF-8 again), the change is remembered after restart. (This must be history/bookmark standard behaviour !?) The only question is why is UTF-8 selected "at first" ? See also bug 158285 comment 19 for a possible clue.
When a link in Mail & News opens a page in the browser (with the default set to ISO-8859-1 or another encoding), the character coding always changes to UTF-8. This does not happen when I open the site from the browser itself.
This bug has existed at least since Mozilla 1.2 on the Unix platform. Every now and then Mozilla (1.4.1, 1.5) switches to showing some ISO 8859-1 pages using UTF-8, resulting in a lot of ? characters in place of non-ASCII characters. Looking in the menu View->Character Coding, Mozilla has changed to UTF-8. Changing back to ISO 8859-1 fixes that page (for a while, at least). It looks like Mozilla ignores the preferences saying that the default is ISO 8859-1, and might have a defective auto-detect. Or it might have to do with caching (or non-cached pages). And when a page gets the wrong character set, links with non-ASCII characters stop working too. I suggest that this bug be set to blocking Mozilla 1.6! It is high time Mozilla worked correctly for those of us not using ASCII. Now neither URLs containing non-ASCII characters nor pages containing non-ASCII characters work correctly.
I have requested blocking on this bug. It is a major problem. Both character display and links stop working until I manually change the character set back. It is not just Windows. I get the same under Solaris. And I get it several times a week. It is also very difficult to give a reproducible test case. The only thing I can see is that now and then pages get the wrong character set because Mozilla changes to UTF-8. It may be the cache that is wrong, or it may be something else.
Flags: blocking1.6b?
Flags: blocking1.6?
Bugs generally don't get fixed in Browser-General.
Assignee: asa → font
Status: REOPENED → NEW
Component: Browser-General → Layout: Fonts and Text
QA Contact: asa → ian
Would the people who see this bug say what item is selected under View | Character Coding | Auto-Detect?
Assignee: font → smontagu
Component: Layout: Fonts and Text → Internationalization
QA Contact: ian → amyy
Changing: (OS) Windows XP -> All, per comment 16 (Unix) and comment 17 (Solaris).
OS: Windows XP → All
I guess this bug is invalid (or tech-evangelism: none of those sites set 'charset'). Anyway, as dbaron asked, we need to know which 'Auto-Detect' mode (in the View | Character Coding menu) is selected. What about the default character coding setting in Edit | Preferences | Languages?
Flags: blocking1.6b? → blocking1.6b-
Flags: blocking1.6? → blocking1.6-
There is still the same problem with the diamond question mark character in 1.6. <a href="http://members.aon.at/gsiberger/printscreen.jpg">Printscreen</a> Link: <a href="http://www.diepresse.at/services/diashow/diashow.asp?aktArtID=400138">http://www.diepresse.at/services/diashow/diashow.asp?aktArtID=400138</a> These characters appear every time you view a slide show on www.diepresse.at. Mozilla changes the charset to UTF-8 although no charset is specified in the page source. Reproducible: every time until the character coding is changed; after that the umlauts are displayed correctly (in most cases!). Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6) Gecko/20040113 Character Coding settings: Auto-Detect: Off; Active Character Codings: Western (ISO-8859-1). The same occurred in the drop-down field in the eBay advert on this page: <a href="http://members.aon.at/gsiberger/printscreen2.jpg">Printscreen</a> <a href="http://communicator.aon.at">Link to the page</a> Reproducible: Does not always occur. Mostly when reloading the page once.
Re: comment #22. What's the default character coding in Edit | Preferences | Languages? If most web pages of interest to you are in ISO-8859-1, you'd better set it to ISO-8859-1. BTW, don't include any HTML snippets here.
The default character coding was already set to ISO-8859-1. This does not solve the problem.
I also have this identical problem for several versions including my current 1.6 on XP. I have the default set to ISO-8859-1, yet at random times I see the diamond question marks and see it has switched to UTF-8.
Same here with 1.6 on W2K SP4. This problem is already very old; it started at 1.3 or perhaps earlier, I do not remember.
[Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.1) Gecko/20040707] (Release) I had not seen this bug for a long time; but it happened today (on a site which I visit daily). Unicode (UTF-8) instead of usual Western (ISO-8859-1).
Removing broken URL from bug. I agree with comment 1 -- this is a dupe. Anyone still experiencing this bug should be sure they have View | Character Encoding | Auto Detect set to OFF. With this setting, a page without a charset specified (in the page, or in the HTTP headers from the server) will generally drop back to the Default Encoding specified in preferences. *** This bug has been marked as a duplicate of 158285 ***
Status: NEW → RESOLVED
Closed: 22 years ago → 21 years ago
Resolution: --- → DUPLICATE
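The fallback described in the closing comment can be sketched roughly as follows. This is a hypothetical Python illustration and not Mozilla's implementation: the function name `resolve_charset`, its parameters, and the choice to consult the HTTP header before the in-page declaration are assumptions made for the example. The comment itself only says that a page with no charset in either place generally falls back to the default encoding when Auto-Detect is off.

```python
from typing import Optional


def resolve_charset(
    http_charset: Optional[str],
    page_charset: Optional[str],
    default_charset: str = "ISO-8859-1",
) -> str:
    """Hypothetical helper: choose an encoding when Auto-Detect is off.

    http_charset    -- charset from the server's HTTP headers, if any
    page_charset    -- charset declared inside the page, if any
    default_charset -- the user's Default Encoding preference (assumed value)
    """
    # Assumed order: an explicit declaration wins, header first, then the page.
    for declared in (http_charset, page_charset):
        if declared:
            return declared
    # Nothing declared anywhere: fall back to the user's default.
    return default_charset


# A page like the ones reported here: no charset in the headers or in the page.
print(resolve_charset(None, None))
```

Under this model the reported pages should always resolve to the default, so a page that still comes up as UTF-8 with Auto-Detect off points at some other input, such as a remembered per-page setting as suggested in the earlier comments.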