Open
Bug 195420
Opened 22 years ago
Updated 1 month ago
[ComposerSourceView] Composer converts non-Unicode characters (trademark &trade, copyright ©) and accented characters to Unicode symbols, despite charset given
Categories
(Core :: DOM: Serializers, defect)
Tracking
()
NEW
People
(Reporter: sbrown3, Unassigned)
References
(Blocks 1 open bug)
Details
(Whiteboard: editorbase-)
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.3b) Gecko/20030220
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.3b) Gecko/20030220
In a document containing the numeric character reference for ™ or the named
entity &trade;, these codes are converted to a single 'TM' character when the
document is saved. This occurs even after changing the setting in Preferences to
'Retain Original Source Formatting'. I need to retain the code because I paste
the source from Composer's '<HTML> Source' page into a memo field in an Oracle
(7.3) database, which converts the 'TM' character into a question mark. Also, a
tool for inserting such codes (or symbols, with the code being inserted into the
source) into a document in Composer would be useful.
Reproducible: Always
Steps to Reproduce:
1. Open a new document in Composer.
2. Type 'Mozilla' (no quotes).
3. Go to '<HTML> Source' page.
4. Append the numeric character reference for ™, or &trade;, to the text (Mozilla) you typed.
5. Go to the 'Normal' page.
6. Return to the '<HTML> Source' page.
7. Repeat after changing setting in Preferences to 'Retain Original Source
Formatting'.
Actual Results:
The code (numeric reference or &trade;) is converted to the literal ™ character.
Expected Results:
The codes should be retained exactly as I typed them (as is the case with other
codes: &gt;, &lt;, &amp;, &nbsp;, etc.).
Comment 1•22 years ago
I see this in Netscape 7 on Mac OS X.
Sounds like a bug for DOM to Text Conversion.
akkana, is this a dupe?
Assignee: composer → harishd
Status: UNCONFIRMED → NEW
Component: Editor: Composer → DOM to Text Conversion
Ever confirmed: true
Keywords: nsbeta1
OS: Windows 2000 → All
QA Contact: petersen → sujay
Hardware: PC → All
Whiteboard: editorbase
Comment 2•22 years ago
Looks like we have another character not being output as an entity when it
should be.
Comment 3•22 years ago
When it is saved, it is saved as "™".
I think the editor maintains entities only for the Latin-1 set, not for others
like &trade; and &euro; (in source view), because &aacute; and &copy; are kept
as entities in the source view.
Should OutputEncodeHTMLEntities (rather than OutputEncodeLatin1Entities) be the
default for ISO-8859-1? It seems that wouldn't regress bug 65324.
Reporter: in the meantime, you can try setting this pref in your prefs.js (or
user.js or editor.js):
pref("editor.encode_entity", "html"); // which includes &sup2; &alpha; &trade; etc
But note that this will cause Composer to entitize 8-bit accented letters, Greek
letters, and other special markup symbols as defined in HTML4. So it will only
work if your Oracle 7.3 product understands the set of HTML4 entities.
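The difference between the two encoding modes discussed in this comment can be illustrated with a minimal Python sketch. This is not Gecko's serializer code: the function name `encode_entities` and the mode strings are hypothetical, and it assumes that "Latin-1" mode only names characters in U+00A0..U+00FF (plus the markup characters), while "html" mode names every character that has an HTML4 entity.

```python
from html.entities import codepoint2name  # HTML 4 entity names keyed by code point

# Markup-significant characters are escaped in both (hypothetical) modes.
ALWAYS_ESCAPED = {ord("&"), ord("<"), ord(">")}

def encode_entities(text: str, mode: str = "latin1") -> str:
    """Replace characters with named entities; `mode` is "latin1" or "html"."""
    out = []
    for ch in text:
        cp = ord(ch)
        name = codepoint2name.get(cp)
        # Assumption: Latin-1 mode covers only U+00A0..U+00FF.
        in_scope = mode == "html" or cp in ALWAYS_ESCAPED or 0xA0 <= cp <= 0xFF
        out.append(f"&{name};" if name and in_scope else ch)
    return "".join(out)

latin1_result = encode_entities("Mozilla™ © á")
html_result = encode_entities("Mozilla™ © á €", mode="html")
```

Under these assumptions the Latin-1 mode leaves ™ as a raw character while still naming © and á, which matches the behaviour described in this report; the "html" mode names all of them.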
Updated•19 years ago
Assignee: harishd → dom-to-text
Severity: normal → minor
QA Contact: sujay
Summary: Composer converts Unicode to character for trademark symbol. → Composer converts Unicode to character for trademark, copyright © and probably other special characters
Comment 7•19 years ago
*** Bug 288866 has been marked as a duplicate of this bug. ***
Comment 8•19 years ago
*** Bug 288384 has been marked as a duplicate of this bug. ***
*** Bug 354943 has been marked as a duplicate of this bug. ***
Comment 10•19 years ago
Based on the dupes, I'm altering the severity back to "normal" and (hopefully) improving the summary.
Severity: minor → normal
Summary: Composer converts Unicode to character for trademark, copyright © and probably other special characters → Composer converts non-Unicode characters (trademark &trade, copyright ©) and accented characters to Unicode symbols, despite charset given
Comment 11•19 years ago
Workaround: instead of selecting "File > Save", select "File > Save and change Character encoding" and choose "Western (ISO-8859-1)".
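Why the target charset matters can be checked outside Composer. The sketch below uses only Python's standard codecs and is not how Composer saves files; the helper name `to_charset` is hypothetical. ™ (U+2122) has no byte in ISO-8859-1, so a writer targeting that charset has to fall back to a character reference, whereas © fits in a single byte.

```python
def to_charset(text: str, charset: str = "iso-8859-1") -> bytes:
    """Encode text, replacing unencodable characters with &#NNNN; references."""
    # "xmlcharrefreplace" is a standard Python codec error handler.
    return text.encode(charset, errors="xmlcharrefreplace")

latin1_bytes = to_charset("Mozilla™ ©")          # ™ becomes a reference, © stays a byte
ascii_bytes = to_charset("Mozilla™ ©", "ascii")  # both become references
```

The ASCII-only variant may also be useful for the reporter's case, since text made up purely of ASCII character references should not be altered when pasted into a database field that cannot store ™.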
Updated•16 years ago
Assignee: dom-to-text → nobody
QA Contact: dom-to-text
Mass-removing myself from cc; search for 12b9dfe4-ece3-40dc-8d23-60e179f64ac1 or any reasonable part thereof, to mass-delete these notifications (and sorry!)
Updated•3 years ago
Severity: normal → S3
Updated•1 month ago
Summary: Composer converts non-Unicode characters (trademark &trade, copyright ©) and accented characters to Unicode symbols, despite charset given → [ComposerSourceView] Composer converts non-Unicode characters (trademark &trade, copyright ©) and accented characters to Unicode symbols, despite charset given