Closed
Bug 99426
Opened 23 years ago
Closed 23 years ago
Shouldn't translate Windows-1252 characters when document is ISO-8859-1
Categories
(Core :: Internationalization, defect)
RESOLVED
WONTFIX
People
(Reporter: jmd, Assigned: shanjian)
References
Details
(Keywords: intl, Whiteboard: jmd-remind)
Attachments
(1 file)
239 bytes,
text/html; charset=iso-8859-1
I believe ISO-8859 specifically reserves 0x80 through 0x9f for controls; no
printable characters should be there. Translating characters in that range when
the document specifically requests ISO-8859-1 rendering just furthers
Microsoft's embrace and extend. Using the characters in that range is considered
bad netiquette, which Mozilla should not help proliferate.
Reporter | ||
Comment 1•23 years ago
|
||
Comment 2•23 years ago
|
||
Switching component to "Internationalization".
Assignee: rchen → yokoyama
Component: Localization → Internationalization
QA Contact: ylong → andreasb
Updated•23 years ago
|
QA Contact: andreasb → ylong
Assignee | ||
Comment 4•23 years ago
|
||
ISO-8859 left 0x80 to 0x9f unassigned. The range is not reserved for anything. In
8859-1 encoded text, we shouldn't see anything in this range. When it does
happen, it is very likely that the user's intention was to use win1252. Many
programmers do not know the difference between 8859-1 and win1252, let alone average
users. So if we don't handle those code points, they will think it is a bug in
Mozilla. Considering the real situation Mozilla-based browsers are in, we can
do nothing to stop this kind of practice from proliferating. I would absolutely agree
with you if we could make some difference. The fact is, we can't blame users for
such practice, nor can we stop it. There are much more evil things in this world
that we need to fight, so let's make the compromise here.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → WONTFIX
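For readers following along, the compromise described in comment 4 can be sketched roughly as below. This is only an illustration in Python, not Mozilla's actual code; the function name decode_lenient_latin1 is hypothetical. It assumes Python's built-in "cp1252" and "iso-8859-1" codecs, and that the five bytes cp1252 leaves undefined in Python (0x81, 0x8D, 0x8F, 0x90, 0x9D) should fall back to the plain 8859-1 mapping.

```python
def decode_lenient_latin1(data: bytes) -> str:
    """Decode bytes labelled ISO-8859-1, reading 0x80-0x9f as windows-1252.

    Hypothetical helper for illustration only. Outside 0x80-0x9f the two
    encodings agree, so decoding byte by byte with cp1252 first gives the
    same result as 8859-1 for every other byte.
    """
    chars = []
    for b in data:
        single = bytes([b])
        try:
            chars.append(single.decode("cp1252"))
        except UnicodeDecodeError:
            # Byte has no windows-1252 assignment in Python's codec:
            # keep the strict 8859-1 reading (a C1 control code point).
            chars.append(single.decode("iso-8859-1"))
    return "".join(chars)


if __name__ == "__main__":
    sample = b"\x93quoted\x94"
    print(repr(sample.decode("iso-8859-1")))      # strict reading
    print(repr(decode_lenient_latin1(sample)))    # lenient reading
```

With the strict reading, the two bytes 0x93 and 0x94 come out as invisible control code points; with the lenient reading they become curly quotation marks, which is what the page author most likely intended.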
Reporter | ||
Comment 5•23 years ago
|
||
To my understanding, ISO 8859 references the C1 control set from ISO 6429 as the
controls for 80-9f. Of course, ISO standards aren't exactly available, so this
is all hearsay.
=80 U+0080 PADDING CHARACTER
=81 U+0081 HIGH OCTET PRESET
=82 U+0082 BREAK PERMITTED HERE
=83 U+0083 NO BREAK HERE
=84 U+0084 INDEX
=85 U+0085 NEXT LINE
=86 U+0086 START OF SELECTED AREA
=87 U+0087 END OF SELECTED AREA
=88 U+0088 CHARACTER TABULATION SET
=89 U+0089 CHARACTER TABULATION WITH JUSTIFICATION
=8A U+008A LINE TABULATION SET
=8B U+008B PARTIAL LINE FORWARD
=8C U+008C PARTIAL LINE BACKWARD
=8D U+008D REVERSE LINE FEED
=8E U+008E SINGLE-SHIFT TWO
=8F U+008F SINGLE-SHIFT THREE
=90 U+0090 DEVICE CONTROL STRING
=91 U+0091 PRIVATE USE ONE
=92 U+0092 PRIVATE USE TWO
=93 U+0093 SET TRANSMIT STATE
=94 U+0094 CANCEL CHARACTER
=95 U+0095 MESSAGE WAITING
=96 U+0096 START OF GUARDED AREA
=97 U+0097 END OF GUARDED AREA
=98 U+0098 START OF STRING
=99 U+0099 SINGLE GRAPHIC CHARACTER INTRODUCER
=9A U+009A SINGLE CHARACTER INTRODUCER
=9B U+009B CONTROL SEQUENCE INTRODUCER
=9C U+009C STRING TERMINATOR
=9D U+009D OPERATING SYSTEM COMMAND
=9E U+009E PRIVACY MESSAGE
=9F U+009F APPLICATION PROGRAM COMMAND
I don't think those are going to be implemented; however, I'd eventually like to
take a look at the wording of 8859 regarding the range. Marking status.
Whiteboard: jmd-remind
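The mapping listed in comment 5 can be checked without access to the ISO text, at least as far as Unicode is concerned. The sketch below is an illustration only; the function name c1_report is hypothetical. It assumes Python's "iso-8859-1" codec, which maps each byte to the code point of the same value, and uses the standard unicodedata module to show that the resulting code points are in the Cc (control) general category.

```python
import unicodedata


def c1_report(data: bytes):
    """List every byte in 0x80-0x9f with its strict 8859-1 reading.

    Hypothetical helper for illustration only. Returns tuples of
    (byte value, code point label, Unicode general category).
    """
    rows = []
    for b in data:
        if 0x80 <= b <= 0x9F:
            ch = bytes([b]).decode("iso-8859-1")
            rows.append((b, f"U+{ord(ch):04X}", unicodedata.category(ch)))
    return rows


if __name__ == "__main__":
    for byte, label, category in c1_report(b"a\x85b\x9b"):
        print(f"={byte:02X} {label} {category}")
```

This says nothing about the wording of ISO 8859 itself; it only shows how a strict byte-to-code-point reading lands on the control range given in the table above.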
Assignee | ||
Comment 6•23 years ago
|
||
I don't have the ISO 8859 standard document either, so my understanding of this is
also based on various indirect sources. In my understanding, the C1 control set is
application-level stuff. ISO 8859 intentionally leaves those code points unassigned
to make applications that utilize the ISO 6429 C1 control set possible. That is to say,
C1 control set code points (0x80 to 0x9f) should not live beyond their application
scope. They are meaningless in the context of general information exchange,
especially in HTML documents.
Updated•16 years ago
|
Attachment #49177 -
Attachment mime type: text/html → text/html; charset=iso-8859-1