195420 - [ComposerSourceView] Composer converts non-Unicode characters (trademark &trade, copyright &copy) and accented characters to Unicode symbols, despite charset given

Reporter

Description

22 years ago

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.3b) Gecko/20030220
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.3b) Gecko/20030220

In a document containing the codes for the trademark symbol (the numeric character reference or the named entity &trade;), these codes are converted to a single 'TM' character when the document is saved. This occurs even after changing the setting in Preferences to 'Retain Original Source Formatting'.

I need to retain the code because I paste the source from Composer's '<HTML> Source' page into a memo field in an Oracle (7.3) database, which converts the 'TM' character into a question mark.

Also, a tool for inserting such codes (or symbols, with the code being inserted into the source) into a document in Composer would be useful.

Reproducible: Always

Steps to Reproduce:
1. Open a new document in Composer.
2. Type 'Mozilla' (no quotes).
3. Go to the '<HTML> Source' page.
4. Append the numeric character reference for the trademark symbol, or &trade;, to the text (Mozilla) you typed.
5. Go to the 'Normal' page.
6. Return to the '<HTML> Source' page.
7. Repeat after changing the setting in Preferences to 'Retain Original Source Formatting'.

Actual Results: The numeric reference or &trade; is converted to a literal ™ character.

Expected Results: The codes should be retained exactly as I typed them (as is the case with other codes: &gt;, &lt;, &amp;, &nbsp;, etc.).
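For readers following the report, here is a minimal sketch of the round trip described above. It is not Composer's code. It assumes that switching from source view to Normal view decodes the entity into a plain character, and that the downstream database field uses a single-byte charset such as Latin-1 (the report does not say which charset the Oracle 7.3 field uses). The function names are hypothetical.

```python
import html


def source_to_normal_view(source: str) -> str:
    """Hypothetical stand-in for parsing source view: entities become characters."""
    return html.unescape(source)


def store_in_legacy_field(text: str, charset: str = "latin-1") -> bytes:
    """Hypothetical stand-in for a single-byte database field (charset assumed).

    Characters the charset cannot represent are replaced with '?'.
    """
    return text.encode(charset, errors="replace")


dom_text = source_to_normal_view("Mozilla&trade;")
print(ascii(dom_text))                  # the entity is now U+2122
print(store_in_legacy_field(dom_text))  # U+2122 does not fit in Latin-1
```

If the serializer wrote the entity back out instead of the raw character, the pasted source would stay pure ASCII and survive such a field unchanged.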

Kathleen :Brade

Comment 1

22 years ago

I see this in Netscape7 on MacOSX.

Sounds like a bug for --> DOM to Text Conversion.

akkana: is this a dupe?

Assignee: composer → harishd

Status: UNCONFIRMED → NEW

Component: Editor: Composer → DOM to Text Conversion

Ever confirmed: true

Keywords: nsbeta1

OS: Windows 2000 → All

QA Contact: petersen → sujay

Hardware: PC → All

Whiteboard: editorbase

Akkana Peck

Comment 2

22 years ago

Looks like we have another character not being output as an entity when it should be.

nhottanscp

Comment 3

22 years ago

When it is saved, it is saved as "™". I think the editor maintains entities for the Latin1 set only, and not for others like &trade; and &euro; (in source view), because &aacute; and &copy; are kept as entities in the source view.

rbs

Comment 4

22 years ago

Should OutputEncodeHTMLEntities (rather than OutputEncodeLatin1Entities) be the default for ISO-8859-1? It seems that wouldn't regress bug 65324.

Reporter: in the meantime, you can try setting this pref in your prefs.js (or user.js or editor.js):

pref("editor.encode_entity", "html"); // which includes &sup2; &alpha; &trade; etc.

But note that this will cause Composer to entity-ize 8-bit accented letters, Greek letters, and other special markup symbols as defined in HTML4. So it will only work if your Oracle 7.3 product understands the set of HTML4 entities.
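To make the difference between the two output flags concrete, here is a rough sketch, not the serializer's actual implementation. It assumes "Latin1 entities" means named entities for code points U+00A0 through U+00FF and "HTML entities" means the full HTML4 named set. The function name and the "latin1" mode label are illustrative only (only "html" appears as a pref value above).

```python
from html.entities import codepoint2name  # HTML 4 named entities


def encode_entities(text: str, mode: str = "latin1") -> str:
    """Replace characters with named entities according to an assumed mode.

    mode="latin1": only U+00A0..U+00FF get named entities (assumption).
    mode="html":   every non-ASCII character with an HTML4 name gets one.
    The markup characters &, <, > are always escaped.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        name = codepoint2name.get(cp)
        if ch in "&<>":
            out.append(f"&{name};")
        elif name and cp >= 0xA0 and (mode == "html" or cp <= 0xFF):
            out.append(f"&{name};")
        else:
            out.append(ch)  # left as a raw character
    return "".join(out)


sample = "Mozilla\u2122 \u00a9 \u00e1"
print(ascii(encode_entities(sample, "latin1")))  # trademark stays raw
print(ascii(encode_entities(sample, "html")))    # trademark becomes &trade;
```

Under these assumptions the Latin1 mode matches what comment 3 observed: &copy; and &aacute; survive as entities while the trademark sign is written as a raw character.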

Kevin McCluskey (gone)

Comment 5

22 years ago

editorbase-

Whiteboard: editorbase → editorbase-

Samir Gehani

Comment 6

22 years ago

adt: nsbeta1-

Keywords: nsbeta1 → nsbeta1-

Wayne Mery (:wsmwk)

Updated

19 years ago

Assignee: harishd → dom-to-text

Severity: normal → minor

QA Contact: sujay

Summary: Composer converts Unicode to character for trademark symbol. → Composer converts Unicode to character for trademark, copyright &copy and probably other special characters

Wayne Mery (:wsmwk)

Comment 7

19 years ago

*** Bug 288866 has been marked as a duplicate of this bug. ***

Wayne Mery (:wsmwk)

Comment 8

19 years ago

*** Bug 288384 has been marked as a duplicate of this bug. ***

u63580

Comment 9

19 years ago

*** Bug 354943 has been marked as a duplicate of this bug. ***

u63580

Comment 10

19 years ago

Based on the dupes, I'm changing the severity back to "normal" and (hopefully) improving the summary.

Severity: minor → normal

Summary: Composer converts Unicode to character for trademark, copyright &copy and probably other special characters → Composer converts non-Unicode characters (trademark &trade, copyright &copy) and accented characters to Unicode symbols, despite charset given

victor.gattegno

Comment 11

19 years ago

Solution: instead of selecting "File > Save", select "File > Save and change Character encoding" and choose "Western (ISO-8859-1)".
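One possible reason this workaround helps, offered as a guess and not as a description of Composer's save path: when the target charset cannot represent a character, an encoder has to fall back to a character reference. A minimal sketch of that general principle, with a hypothetical function name:

```python
def save_as_charset(text: str, charset: str = "iso-8859-1") -> bytes:
    """Encode text for saving; unrepresentable characters become numeric
    character references instead of being lost (general principle only)."""
    return text.encode(charset, errors="xmlcharrefreplace")


# The trademark sign is not in ISO-8859-1, so it is written as a reference;
# the accented letter is in ISO-8859-1 and is written as a single byte.
print(save_as_charset("Mozilla\u2122 caf\u00e9"))
```

Note that this addresses the saved file only; the source view behaviour from the original report is a separate matter.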

Phil Ringnalda (:philor)

Updated

16 years ago

Assignee: dom-to-text → nobody

QA Contact: dom-to-text

Stephen Donner [:stephend] Not actively reading bugmail

Comment 12

5 years ago

Mass-removing myself from cc; search for 12b9dfe4-ece3-40dc-8d23-60e179f64ac1 or any reasonable part thereof, to mass-delete these notifications (and sorry!)

BMO Automation

Updated

3 years ago

Severity: normal → S3

Masayuki Nakano [:masayuki] (he/him)(JST, +0900)

Updated

5 months ago

Blocks: 1975139

Masayuki Nakano [:masayuki] (he/him)(JST, +0900)

Updated

5 months ago

Summary: Composer converts non-Unicode characters (trademark &trade, copyright &copy) and accented characters to Unicode symbols, despite charset given → [ComposerSourceView] Composer converts non-Unicode characters (trademark &trade, copyright &copy) and accented characters to Unicode symbols, despite charset given

Categories

(Core :: DOM: Serializers, defect)

People

(Reporter: sbrown3, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: editorbase-)

Security

(public)
