Closed Bug 99426 Opened 23 years ago Closed 23 years ago

Shouldn't translate Windows-1252 characters when document is ISO-8859-1

Status:

RESOLVED WONTFIX

People

(Reporter: jmd, Assigned: shanjian)

Details

(Keywords: intl, Whiteboard: jmd-remind)

Attachments

(1 file)

testcase, none of these should render (23 years ago, Jeremy M. Dolan, 239 bytes, text/html; charset=iso-8859-1)

Jeremy M. Dolan

Reporter

Description

•

23 years ago

I believe ISO-8859 specifically reserves 0x80 through 0x9f for controls; no printable characters should be there. Translating characters in that range when the document specifically requests ISO-8859-1 rendering just furthers Microsoft's embrace and extend. Using the characters in that range is considered bad netiquette, which Mozilla should not help proliferate.

Jeremy M. Dolan

Reporter

Comment 1

•

23 years ago

Attached file: testcase, none of these should render

Andreas Becker

Updated

•

23 years ago

Keywords: intl

QA Contact: andreasb → ylong

Andreas Becker

Comment 2

•

23 years ago

Switching component to "Internationalization".

Assignee: rchen → yokoyama

Component: Localization → Internationalization

QA Contact: ylong → andreasb

Andreas Becker

Updated

•

23 years ago

QA Contact: andreasb → ylong

Roy Yokoyama

Comment 3

•

23 years ago

Assigning to shanjian.

Assignee: yokoyama → shanjian

Shanjian Li

Assignee

Comment 4

•

23 years ago

ISO-8859 left 0x80 to 0x9f unassigned; the range is not reserved for anything. In 8859-1 encoded text, we shouldn't see anything in this range. When it does happen, it is very likely that the user's intention was to use win1252. Many programmers do not know the difference between 8859-1 and win1252, let alone average users. So if we don't handle those code points, they will think it is a bug in Mozilla. Considering the real situation Mozilla-based browsers are in, we can do nothing to stop this kind of practice from proliferating. I would absolutely agree with you if we could make a difference. The fact is, we can't blame users for such practice, nor can we stop it. There are much more evil things in this world we need to fight, so let's make the compromise here.
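The difference being argued over can be sketched in a few lines. This is only an illustration, not Mozilla's code: it assumes Python 3 and its built-in `iso-8859-1` and `cp1252` codecs, and the sample bytes are made up for the example.

```python
# Bytes 0x93 and 0x94 are curly quotation marks in Windows-1252, but they
# fall inside the 0x80-0x9F range that this bug is about.
raw = b"\x93quoted\x94"  # hypothetical page content labelled iso-8859-1

# Strict reading: every byte maps to the code point with the same value,
# so 0x93 and 0x94 become the C1 code points U+0093 and U+0094.
strict = raw.decode("iso-8859-1")

# Lenient reading (the behaviour kept by this WONTFIX): treat the text
# as Windows-1252, so the same bytes become U+201C and U+201D.
lenient = raw.decode("cp1252")

print([hex(ord(ch)) for ch in strict])
print([hex(ord(ch)) for ch in lenient])
```

Python's `cp1252` codec leaves a few bytes in the range undefined (0x81, for example) and raises `UnicodeDecodeError` on them, so the sketch sticks to bytes that Windows-1252 does assign.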

Status: NEW → RESOLVED

Closed: 23 years ago

Resolution: --- → WONTFIX

Jeremy M. Dolan

Reporter

Comment 5

•

23 years ago

To my understanding, ISO 8859 references the C1 control set from ISO 6429 as the controls for 80-9f. Of course, ISO standards aren't exactly available, so this is all hearsay.

=80 U+0080 PADDING CHARACTER
=81 U+0081 HIGH OCTET PRESET
=82 U+0082 BREAK PERMITTED HERE
=83 U+0083 NO BREAK HERE
=84 U+0084 INDEX
=85 U+0085 NEXT LINE
=86 U+0086 START OF SELECTED AREA
=87 U+0087 END OF SELECTED AREA
=88 U+0088 CHARACTER TABULATION SET
=89 U+0089 CHARACTER TABULATION WITH JUSTIFICATION
=8A U+008A LINE TABULATION SET
=8B U+008B PARTIAL LINE FORWARD
=8C U+008C PARTIAL LINE BACKWARD
=8D U+008D REVERSE LINE FEED
=8E U+008E SINGLE-SHIFT TWO
=8F U+008F SINGLE-SHIFT THREE
=90 U+0090 DEVICE CONTROL STRING
=91 U+0091 PRIVATE USE ONE
=92 U+0092 PRIVATE USE TWO
=93 U+0093 SET TRANSMIT STATE
=94 U+0094 CANCEL CHARACTER
=95 U+0095 MESSAGE WAITING
=96 U+0096 START OF GUARDED AREA
=97 U+0097 END OF GUARDED AREA
=98 U+0098 START OF STRING
=99 U+0099 SINGLE GRAPHIC CHARACTER INTRODUCER
=9A U+009A SINGLE CHARACTER INTRODUCER
=9B U+009B CONTROL SEQUENCE INTRODUCER
=9C U+009C STRING TERMINATOR
=9D U+009D OPERATING SYSTEM COMMAND
=9E U+009E PRIVACY MESSAGE
=9F U+009F APPLICATION PROGRAM COMMAND

I don't think those are going to be implemented; however, I'd eventually like to take a look at the wording of 8859 regarding the range. Marking status.
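As a rough cross-check of the 0x80-0x9F range discussed above, the sketch below asks Python's Unicode database how it classifies those code points. It assumes Python 3 and the standard `unicodedata` module, and it says nothing about the wording of ISO 8859 itself. It only shows how Unicode treats U+0080 through U+009F.

```python
import unicodedata

# Decode all 32 bytes of the 0x80-0x9F range as strict ISO-8859-1.
c1 = bytes(range(0x80, 0xA0)).decode("iso-8859-1")

# Collect the Unicode general category of each resulting code point.
categories = {unicodedata.category(ch) for ch in c1}

# Keep any code point that Python considers printable.
printable = [ch for ch in c1 if ch.isprintable()]

print(len(c1), categories, printable)
```

Every code point in the range falls in the general category `Cc` (control), and none is printable, which matches the reporter's reading that a strict ISO-8859-1 decode yields no visible glyphs there.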

Whiteboard: jmd-remind

Shanjian Li

Assignee

Comment 6

•

23 years ago

I don't have the ISO8859 standard document either, so my understanding of this is also based on various indirect sources. In my understanding, the C1 control set is application-level stuff. ISO8859 intentionally leaves those code points unassigned to make applications which utilize the ISO6429 C1 control set possible. That is to say, C1 control set code points (0x80 to 0x9f) should not live beyond their application scope. They are meaningless in the context of general information exchange, especially in HTML documents.

Simon Montagu :smontagu

Updated

•

16 years ago

Attachment #49177 - Attachment mime type: text/html → text/html; charset=iso-8859-1

Categories

(Core :: Internationalization, defect)