Closed Bug 122455 Opened 23 years ago Closed 23 years ago

incorrect characters with ISO-8859-15 character coding

Categories

(Core :: Internationalization, defect)

defect
Not set
normal

Tracking

()

VERIFIED INVALID

People

(Reporter: ollittm, Assigned: ftang)

References

()

Details

(Keywords: intl)

Attachments

(1 file)

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.7+) Gecko/20020116
BuildID: 2002011604

I started using 8859-15 codepage when we went over to Euro. I noticed that sometimes single quotes are wrong. Changing codepage to -1 corrects problem, but then the euro doesn't display properly. In the slashdot article, "we didn't violate russian law" demonstrates the problem with codepage -15.

Reproducible: Always
Steps to Reproduce:
1. View->Character coding->Western (ISO-8859-15)
2. Load URL
3. Enjoy

Actual Results: boxes and question marks replace single quotes
Expected Results: single quotes
QA Contact: ruixu → ylong
I do see the single quotes displayed as "?" on both windows and linux when the charset is set to iso-8859-15. But I don't see the euro sign in this page though. So it looks like the page will display fine when the charset is iso-8859-1.
Well... the slashdot url is using quotes that are encoded in iso-8859-1, no?
As far as I understand how things work, codepage -15 *is* codepage -1 plus the euro character. At least that's what the linux installer said. Anyways, check this European Central Bank URL: http://www.euro.ecb.int/en/section.html The paragraph starting "The new coins.." displays "minus" characters incorrectly as well as the euro sign before 664 billion when you use codepage -15. If you use -1, all's well.
URL: http://www.euro.ecb.int/en/section.html with iso-8859-1 displays the euro sign fine and "The new coins -" (I cannot tell whether it should be ".." instead of "-") on all platforms. Does charset iso-8859-15 replace some iso-8859-1 characters, or does it just add some more special characters on top of iso-8859-1? I'm confirming it to get engineers' input.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: intl
OS: Windows 2000 → All
Hardware: PC → All
Comparing the two character sets, which you can find at http://www.kostis.net/charsets/iso8859.1.htm and http://www.kostis.net/charsets/iso8859.15.htm, it turns out -15 actually substitutes accented S, Z and OE characters into the latin-1 set, as well as the euro sign. For whatever reason, mozilla cannot display the accented chars properly with the -15 encoding, although it doesn't have a problem with kostis' page. However, I should not see the single quotes even in the ideal situation; it looks like the Redhat installation docs were less than complete about the differences between latin-9 and latin-1 ... I guess I'll go on using the Microsoft-hacked latin-1 character set.
I cannot see any problem on the 2002012806 build. Can someone attach a screenshot and tell me where the problem is?
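For reference, the comparison described above can be reproduced with a short script. This is only a sketch using Python's built-in codecs (not anything from Mozilla); the function name differing_bytes is made up for illustration.

```python
def differing_bytes(enc_a="iso-8859-1", enc_b="iso-8859-15"):
    """Return {byte: (char_in_a, char_in_b)} for bytes that decode differently."""
    diffs = {}
    for value in range(256):
        raw = bytes([value])
        char_a = raw.decode(enc_a)
        char_b = raw.decode(enc_b)
        if char_a != char_b:
            diffs[value] = (char_a, char_b)
    return diffs


if __name__ == "__main__":
    # Print each differing position as "byte: latin-1 code point -> latin-9 code point".
    for value, (latin1, latin9) in sorted(differing_bytes().items()):
        print(f"0x{value:02X}: U+{ord(latin1):04X} -> U+{ord(latin9):04X}")
```

With Python's tables, the differences are limited to eight positions in the 0xA0-0xBF range (the euro sign plus the S, Z, OE and Y-diaeresis letters); nothing in 0x80-0x9F differs, which is why switching to -15 cannot be what supplies curly quotes.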
give to ftang
Assignee: yokoyama → ftang
For seeing currency signs everywhere with codepage -15, it's a not-a-bug. For seeing question marks instead of extra currency chars, there's either a problem in the font or maybe with mozilla font rendering? I did try a few w2k fonts, both serif and sans serif, but I couldn't see the yen-char etc. that codepage -15 says I should see. Just extra question marks. See the slashdot url sentence "we didn't break russian law".
OK, the problem is not that we support ISO-8859-15 incorrectly. The problem is how we handle code points in the range 0x80-0x9f, which ISO-8859-15 does NOT define. The "-" characters are encoded as either 0x96 or 0x97, which are defined in cp1252 but in neither ISO-8859-1 nor ISO-8859-15. For ISO-8859-1, we use the cp1252 definition. But for ISO-8859-15, we treat them as undefined characters.
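To illustrate the 0x80-0x9f point, here is a minimal sketch using Python's codecs and unicodedata modules. It shows how Python's tables treat these bytes, not how Mozilla's converters are implemented; the helper name describe is hypothetical.

```python
import unicodedata


def describe(byte_value, encoding):
    """Decode one byte and return (code point label, Unicode general category)."""
    ch = bytes([byte_value]).decode(encoding)
    return f"U+{ord(ch):04X}", unicodedata.category(ch)


if __name__ == "__main__":
    # 0x92 is the curly apostrophe, 0x96 and 0x97 are the dashes in cp1252.
    for value in (0x92, 0x96, 0x97):
        print(f"0x{value:02X}",
              "cp1252:", describe(value, "cp1252"),
              "iso-8859-15:", describe(value, "iso-8859-15"))
```

In cp1252 these bytes are punctuation (categories Pf and Pd); in Python's ISO-8859-15 table they map to C1 control characters (category Cc), which have no visible glyph.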
Status: NEW → ASSIGNED
I've also found that when I use ISO-8859-15 I can't write accented letters (such as "á"); I have to write them somewhere else and then copy&paste. It doesn't happen when I use any other charset. However, I can write "ñ", so it's not my keyboard settings, I suppose.
> rinzewind@wanadoo.es which platform are you running?
I'm using Linux (Debian SID)
Forget about what I said... it was a misconfiguration in my font server... nothing to worry. Sorry :-(
Maybe the codepoints 0x80 to 0x9f should be handled as UNDEFINED for all ISO-8859 character sets. I simply don't like using "poisoned variants" of those charsets. If people want to use those Windows charsets, they should declare so accordingly. Just as HTML character entity references &#128; to &#159; are control characters, not curly quotes or euro signs.
I'd like to second Marc's request. Please leave the ISO character sets as they are. If you're inventing a mix of Windows-specific and ISO-characters users might get confused by unexpected characters (as seen here), and you're supporting the wrong setting of the Content-Type header (iso-.. instead of windows-..). After all, that's what the Content-Type header is meant for, defining which characters to display. You're trying to fix wrong content types at the wrong end - a bit like Microsoft does. In addition to that, you're getting a problem with forms: Although windows characters are displayed in ISO pages, you cannot enter them in a text field, because at least some of them get converted to Unicode things. A &#92 (a single quote in Windows) for example suddenly turns up as ’ in the posted data, although the form was in ISO format and thus should never have contained such a character (see http://bugzilla.mozilla.org/show_bug.cgi?id=139328 ). Even if I'm wrong about this internal conversion and its relation to the mixed display of Windows/ISO-characters discussed here, this at least shows that the way you've gone only causes confusion, displaying things where they shouldn't be. Better do it the right way from the beginning on :)
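The behaviour requested here could be sketched roughly as follows. This is a hypothetical decoder written against Python's codecs, assuming undefined bytes are shown as U+FFFD; the name decode_strict_iso is invented for the example and does not correspond to any Mozilla function.

```python
REPLACEMENT = "\ufffd"  # assumption: undefined bytes are shown as U+FFFD


def decode_strict_iso(data, encoding="iso-8859-15"):
    """Decode an ISO-8859 byte string, treating 0x80-0x9F as undefined."""
    out = []
    for value in data:
        if 0x80 <= value <= 0x9F:
            # Not assigned a graphic character in the ISO-8859 charsets.
            out.append(REPLACEMENT)
        else:
            out.append(bytes([value]).decode(encoding))
    return "".join(out)


if __name__ == "__main__":
    # A cp1252 apostrophe (0x92) in a page labelled ISO-8859-15.
    print(decode_strict_iso(b"we didn\x92t violate russian law"))
    # 0xA4 is the euro sign in ISO-8859-15.
    print(decode_strict_iso(b"\xa4 664 billion"))
```

Under this approach a page that sends cp1252 punctuation but declares an ISO charset shows replacement characters, which matches the boxes and question marks in the original report.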
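As an illustration of the form problem, the sketch below shows one way a character outside the form's charset can turn into something else in posted data: it gets written as a numeric character reference. This uses Python's standard "xmlcharrefreplace" error handler and is not a claim about what Mozilla does internally; encode_form_value is a made-up name.

```python
def encode_form_value(text, charset="iso-8859-1"):
    """Encode text for submission, escaping characters the charset cannot hold."""
    return text.encode(charset, errors="xmlcharrefreplace")


if __name__ == "__main__":
    # U+2019 (the curly apostrophe) has no ISO-8859-1 byte, so it is escaped.
    print(encode_form_value("didn\u2019t"))
    # U+00E9 fits in ISO-8859-1 and is sent as the single byte 0xE9.
    print(encode_form_value("caf\u00e9"))
```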
>------- Additional Comment #16 From Sönke Tesch 2002-04-23 04:48 -------
>
>I'd like to second Marc's request. Please leave the ISO character sets as they
>are. If you're inventing a mix of Windows-specific and ISO-characters users
>might get confused by unexpected characters (as seen here),

Sorry, too late to say that.

>Better do it the right way from the beginning on :)

Yeah, but this is not a "beginning". The "beginning" was in 1994. We are 8 years past the "beginning". Marking this bug as invalid.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → INVALID
Mark as verified according to above comments.
Status: RESOLVED → VERIFIED