Closed Bug 288384 Opened 20 years ago Closed 18 years ago

Composer automatically reformats ampersand-whatever-semicolon character entities -- incorrectly? -- under UTF-8

Categories

(SeaMonkey :: Composer, defect)

Platform: PowerPC, macOS
Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 195420

People

(Reporter: nospamforjim521, Unassigned)

Details

(Keywords: testcase)

Attachments

(1 file)

User-Agent:       Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.7.6) Gecko/20050319
Build Identifier: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.7.6) Gecko/20050319

When using Composer on a page which specifies UTF-8 encoding,
ampersand-whatever-semicolon entities such as &copy; are shown correctly in the
"Normal" view. However, upon saving, the entity in the raw HTML is replaced by
the actual character (the copyright symbol in this case) in the saved .html
file.
  This causes a problem with certain other browsers and third-party applications
that expect the ampersand-whatever-semicolon entities, even if (as in my case)
the application in question specifies that UTF-8 encoding must be used.
  It is possible that this is not a bug. If so, please accept my apologies. I
will take up the issue with the third-party application programmers. However,
the feature is extremely easy to reproduce (see below).

Reproducible: Always

Steps to Reproduce:
1. Create a text file with .html extension with the following content:

<html><head>
<meta content="text/html; charset=UTF-8" http-equiv="content-type">
<title>UTF-8 copyright test</title>
</head>
<body>
Not really copyright &copy; 2005 by jim the tester
</body>
</html>

2. Open this file in Mozilla, Safari, and IE, and you should get the expected
result: a copyright symbol in the text where the .html has the
ampersand-copy-semicolon sequence.

3. Open it in Mozilla, then edit in Composer.

Actual Results:  
It displays properly in Normal mode. But in HTML Source mode, the
ampersand-copy-semicolon sequence has been changed to a copyright symbol. Making
any change (there or elsewhere) in the document and saving causes the new file
to have the actual copyright symbol instead of the original, thus causing
problems in certain browsers.
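
A minimal sketch of the difference between the two saved forms, assuming (this is not confirmed anywhere in this report) that the affected browsers or applications decode the file as Latin-1 instead of UTF-8. The variable names are illustrative only. The entity form is pure ASCII and survives any ASCII-compatible decoding, while the literal character becomes a two-byte UTF-8 sequence that a Latin-1 reader shows as two characters:

```python
import html

# The line from the testcase, before and after Composer rewrites it.
entity_source = "Not really copyright &copy; 2005 by jim the tester"
literal_source = "Not really copyright \u00a9 2005 by jim the tester"

# Both forms denote the same text once the entity is resolved.
assert html.unescape(entity_source) == literal_source

# On disk (UTF-8), the entity is six ASCII bytes; the literal is 0xC2 0xA9.
entity_bytes = entity_source.encode("utf-8")
literal_bytes = literal_source.encode("utf-8")

# ASSUMPTION: a consumer that ignores the charset and reads Latin-1.
misread_entity = entity_bytes.decode("latin-1")
misread_literal = literal_bytes.decode("latin-1")

print(misread_entity)   # entity form is unchanged
print(misread_literal)  # literal form gains a stray character before the symbol
```

The entity form is encoding-independent, which is likely why the reporter's third-party application copes with it but not with the literal character.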

Expected Results:  
I don't know what the actual specification for UTF-8 implies, but I *wish* that
the software would have "respected" the ampersand-copy-semicolon sequence and
left it as is. I have tried a variety of workarounds to get Composer to
"respect" the formatting of the source file and leave it alone. None work (of
course, further input on workarounds would be welcome!).

Apologies, again, if the UTF-8 specification implies that such entities MUST be
handled this way. My request is then irrelevant and the solution will have to be
sought from the vendor who insisted that his pages be written in UTF-8 but whose
application cannot handle true UTF-8 input.

However, given that several older and/or noncompliant browsers do choke on such
things as the copyright symbol, it would be nice if Composer could respect the
formatting chosen by the html author.
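
One possible workaround outside Composer, sketched here under the assumption that post-processing the saved file is acceptable: convert every non-ASCII character back to an entity after saving. The function name `reencode_entities` is hypothetical and not part of Composer or Mozilla; the lookup table is Python's standard `html.entities.codepoint2name`. Characters without a named HTML 4 entity fall back to a numeric reference.

```python
from html.entities import codepoint2name


def reencode_entities(text: str) -> str:
    """Replace each non-ASCII character with an HTML entity.

    Hypothetical helper: uses a named entity (e.g. &copy;) when the
    HTML 4 table has one, otherwise a decimal numeric reference.
    """
    pieces = []
    for ch in text:
        codepoint = ord(ch)
        if codepoint < 128:
            pieces.append(ch)  # plain ASCII is left untouched
        elif codepoint in codepoint2name:
            pieces.append(f"&{codepoint2name[codepoint]};")
        else:
            pieces.append(f"&#{codepoint};")
    return "".join(pieces)


saved = "Not really copyright \u00a9 2005 by jim the tester"
restored = reencode_entities(saved)
print(restored)
```

This cannot tell which characters were entities in the original source, so it converts all non-ASCII text, not only what Composer changed.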
Attached file reporter's testcase
wfm Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.8b2) Gecko/20050330

but I had View -> Character Encoding set to Unicode (UTF-8)
Reproducible with Mozilla 1.8b1 and 1.8b2/20050322. View -> Character Encoding
is set to Unicode (UTF-8) during the Normal, HTML Tags and Preview modes, but
when I switch to the HTML Source mode View -> Character Encoding automatically
becomes Western ISO-8859-1. When I try to change it back to Unicode (UTF-8) and
switch to the HTML Source mode again, I get the original source twice in a row. 

Related to/duplicate of bug 195420? The workaround mentioned in bug 195420
comment 4 seems to work when added to my user.js, although I started the new
line with user_pref not pref (see
http://www.mozilla.org/unix/customizing.html#prefs).
Keywords: testcase

*** This bug has been marked as a duplicate of 195420 ***
Status: UNCONFIRMED → RESOLVED
Closed: 18 years ago
Resolution: --- → DUPLICATE