Closed
Bug 122455
Opened 23 years ago
Closed 22 years ago
incorrect characters with ISO-8859-15 character coding
Categories
(Core :: Internationalization, defect)
VERIFIED
INVALID
People
(Reporter: ollittm, Assigned: ftang)
Details
(Keywords: intl)
Attachments
(1 file)
25.44 KB,
text/html
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.7+) Gecko/20020116
BuildID: 2002011604

I started using 8859-15 codepage when we went over to Euro. I noticed that sometimes single quotes are wrong. Changing codepage to -1 corrects problem, but then the euro doesn't display properly. In the slashdot article, "we didn't violate russian law" demonstrates the problem with codepage -15.

Reproducible: Always
Steps to Reproduce:
1. View->Character coding->Western (ISO-8859-15)
2. Load URL
3. Enjoy

Actual Results: boxes and question marks replace single quotes
Expected Results: single quotes
Comment 1•23 years ago
I do see the single quotes displayed as "?" on both Windows and Linux when the charset is set to iso-8859-15. But I don't see the euro sign on this page. So it looks like the page displays fine when the charset is iso-8859-1.
Comment 2•23 years ago
Well... the slashdot url is using quotes that are encoded in iso-8859-1, no?
Reporter
Comment 3•23 years ago
As far as I understand how things work, codepage -15 *is* codepage -1 plus the euro character. At least that's what the Linux installer said. Anyway, check this European Central Bank URL: http://www.euro.ecb.int/en/section.html The paragraph starting "The new coins.." displays "minus" characters incorrectly, as well as the euro sign before 664 billion, when you use codepage -15. If you use -1, all's well.
Comment 4•23 years ago
URL: http://www.euro.ecb.int/en/section.html with iso-8859-1 displays the euro sign fine, and "The new coins -" (I cannot tell whether it should be ".." instead of "-") on all platforms. Does charset iso-8859-15 replace some iso-8859-1 characters, or does it just add some more special characters on top of iso-8859-1? I'm confirming it to get engineers' input.
Comment 5•23 years ago
Reporter
Comment 6•23 years ago
Comparing the two character sets, which you can find at: http://www.kostis.net/charsets/iso8859.1.htm http://www.kostis.net/charsets/iso8859.15.htm it turns out -15 actually substitutes accented S, Z and OE characters into the latin-1 set, as well as the euro sign. For whatever reason, Mozilla cannot display the accented chars properly with the -15 encoding, although it doesn't have a problem with kostis' page. However, I should not see the single quotes even in an ideal situation; it looks like the Red Hat installation docs were less than complete about the differences between latin-9 and latin-1 ... I guess I'll go on using the Microsoft-hacked latin-1 character set.
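The difference between the two charsets described above can be checked byte by byte. This is a minimal sketch that assumes Python's built-in `iso8859-1` and `iso8859-15` codecs as a stand-in for the linked tables; the helper name `differing_bytes` is made up for illustration and has nothing to do with Mozilla's own converters.

```python
def differing_bytes(a="iso8859-1", b="iso8859-15"):
    """Return {byte value: (char under a, char under b)} for bytes that decode differently."""
    diffs = {}
    for value in range(0x100):
        raw = bytes([value])
        char_a, char_b = raw.decode(a), raw.decode(b)
        if char_a != char_b:
            diffs[value] = (char_a, char_b)
    return diffs


diffs = differing_bytes()
for value, (latin1, latin9) in diffs.items():
    # One line per changed position, e.g. the currency sign that becomes the euro sign
    print(f"0x{value:02X}: {latin1!r} -> {latin9!r}")
```

Only eight positions differ, all in the range 0xA4-0xBE. None of them is in 0x80-0x9F, so neither charset assigns a printable single quote there.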
Assignee
Comment 7•23 years ago
I cannot see any problem on the 2002012806 build. Can someone attach a screenshot and tell me where the problem is?
Reporter
Comment 9•23 years ago
As for seeing currency signs everywhere with codepage -15, that's not a bug. As for seeing question marks instead of the extra currency chars, there's either a problem in the font or maybe with Mozilla font rendering? I did try a few w2k fonts, both serif and sans serif, but I couldn't see the yen char etc. that codepage -15 says I should see. Just extra question marks. See the sentence "we didn't break russian law" at the slashdot URL.
Assignee
Comment 10•23 years ago
OK, the problem is not that we support ISO-8859-15 wrong; the problem is how we handle some code points in the range 0x80-0x9f that ISO-8859-15 does NOT define. The - are encoded as either 0x96 or 0x97, which are defined in cp1252 but in neither ISO-8859-1 nor ISO-8859-15. For ISO-8859-1, we use the cp1252 definition. But in ISO-8859-15, we treat them as undefined characters.
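The two behaviours described in this comment can be illustrated with a short sketch. It assumes Python's built-in codecs, where `iso8859-15` maps 0x80-0x9F to C1 control characters, and a hypothetical helper `decode_lenient` that borrows the cp1252 meaning for those bytes, which is how the comment says ISO-8859-1 pages are treated. The sample bytes are an assumption about what such a page might contain, not data taken from the slashdot article.

```python
def decode_lenient(data: bytes, charset: str = "iso8859-15") -> str:
    """Decode with `charset`, but read 0x80-0x9F as cp1252 where cp1252 defines them."""
    out = []
    for value in data:
        raw = bytes([value])
        if 0x80 <= value <= 0x9F:
            try:
                out.append(raw.decode("cp1252"))
                continue
            except UnicodeDecodeError:
                pass  # cp1252 leaves a few bytes (such as 0x81) unassigned
        out.append(raw.decode(charset))
    return "".join(out)


# Hypothetical page bytes: 0x92 and 0x96 come from cp1252, 0xA4 is the euro in ISO-8859-15
sample = b"we didn\x92t \x96 \xa4"

strict = sample.decode("iso8859-15")  # 0x92 and 0x96 stay as invisible control characters
lenient = decode_lenient(sample)      # 0x92 and 0x96 become a curly quote and an en dash

print(ascii(strict))
print(ascii(lenient))
```

With the strict decoding a font has no glyph for the control characters, which would match the boxes and question marks in the report; the lenient variant shows the quote and dash while still reading 0xA4 as the euro sign.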
Status: NEW → ASSIGNED
Comment 11•23 years ago
I've also found that when I use ISO-8859-15 I can't type accented letters (such as "á"); I have to write them somewhere else and then copy & paste. It doesn't happen when I use any other charset. However, I can type "ñ", so it's not my keyboard settings, I suppose.
Assignee
Comment 12•23 years ago
> rinzewind@wanadoo.es
Which platform are you running?
Comment 13•23 years ago
I'm using Linux (Debian SID)
Comment 14•22 years ago
Forget about what I said... it was a misconfiguration in my font server... nothing to worry about. Sorry :-(
Comment 15•22 years ago
Maybe the codepoints 0x80 to 0x9f should be handled as UNDEFINED for all ISO-8859 character sets. I simply don't like using "poisoned variants" of those charsets. If people want to use those Windows charsets, they should declare so accordingly. Just as the HTML character entity references &#128; to &#159; are control characters, not curly quotes or euro signs.
Comment 16•22 years ago
I'd like to second Marc's request. Please leave the ISO character sets as they are. If you're inventing a mix of Windows-specific and ISO characters, users might get confused by unexpected characters (as seen here), and you're supporting the wrong setting of the Content-Type header (iso-.. instead of windows-..). After all, that's what the Content-Type header is meant for: defining which characters to display. You're trying to fix wrong content types at the wrong end, a bit like Microsoft does. In addition to that, you're getting a problem with forms: although Windows characters are displayed in ISO pages, you cannot enter them in a text field, because at least some of them get converted to Unicode things. A \ (a single quote in Windows) for example suddenly turns up as ’ in the posted data, although the form was in ISO format and thus should never have contained such a character (see http://bugzilla.mozilla.org/show_bug.cgi?id=139328 ). Even if I'm wrong about this internal conversion and its relation to the mixed display of Windows/ISO characters discussed here, this at least shows that the way you've gone only causes confusion, displaying things where they shouldn't be. Better do it the right way from the beginning :)
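The form problem described here can be reproduced in miniature. The sketch below assumes Python's `str.encode` with the `xmlcharrefreplace` error handler as an analogy for a submitter that must fit text into the form's charset; `encode_for_post` is a hypothetical name, and nothing here is taken from Mozilla's form submission code.

```python
def encode_for_post(text: str, charset: str) -> bytes:
    """Encode form text, replacing characters the charset lacks with numeric references."""
    return text.encode(charset, errors="xmlcharrefreplace")


typed = "didn\u2019t"  # U+2019 is the curly single quote, byte 0x92 in cp1252

as_latin9 = encode_for_post(typed, "iso8859-15")  # the quote has no byte in ISO-8859-15
as_cp1252 = encode_for_post(typed, "cp1252")      # the quote fits as the single byte 0x92

print(as_latin9)
print(as_cp1252)
```

Once the browser has turned byte 0x92 into U+2019 for display, that character has no slot in a true ISO-8859-15 form, so an encoder of this kind has to emit something else, such as a numeric reference.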
Assignee
Comment 17•22 years ago
>------- Additional Comment #16 From Sönke Tesch 2002-04-23 04:48 -------
>
>I'd like to second Marc's request. Please leave the ISO character sets as they
>are. If you're inventing a mix of Windows-specific and ISO-characters users
>might get confused by unexpected characters (as seen here),
Sorry, too late to say that.
>Better do it the right way from the beginning on :)
Yeah, but this is not a "beginning". The "beginning" was in 1994. We are 8 years away from the "beginning".
Marking this bug as invalid.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → INVALID