Closed Bug 271454 Opened 20 years ago Closed 9 years ago

Save as text: all non-ascii encodings handled as MacRoman

Categories

(Core :: DOM: Serializers, enhancement)

Hardware: All
OS: macOS
Type: enhancement
Priority: Not set
Severity: normal

RESOLVED WORKSFORME

People

(Reporter: mlneelsen, Unassigned)

Details

(Keywords: intl)

User-Agent:       Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/125.5.5 (KHTML, like Gecko) Safari/125.11
Build Identifier: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0

When you save a document as text, no Latin 1-characters are correct; there are unfortunately also hard 
line breaks instead of soft ones.

Reproducible: Always
(In reply to comment #0)
> no Latin 1-characters are correct;

Can you please explain this more clearly, preferably with an example, and
specifying expected results and actual results?

Firefox 1.0, OS X 10.3.7

1: Find a Latin-1 page with non-ASCII characters (such as Aftenposten).
2: Save Page As... Text File. (Ignore the .html suffix; that is an unrelated bug.)
3: Open the file in your text editor and see weird character substitutions.

For example: "L¯r: Aftenpostens NyttÂrskonsert"

Explanation: Latin-1 F8 is ø (oslash), but MacRoman F8 is ¯ (macron).
Likewise, Latin-1 E5 is å (aring), but MacRoman E5 is Â (Acircumflex).
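
The substitution can be reproduced outside the browser. The following is a minimal sketch, assuming Python's built-in "latin-1" and "mac_roman" codecs behave like the encodings discussed here; the variable names are illustrative and nothing in it is Mozilla code.

```python
# Illustrative only: the headline as it appears on the original page.
original = "Lør: Aftenpostens Nyttårskonsert"

# What ends up in the saved file: raw ISO-8859-1 (Latin-1) bytes.
saved_bytes = original.encode("latin-1")

# What an editor shows when it assumes the file is MacRoman.
misread = saved_bytes.decode("mac_roman")

print(hex(saved_bytes[1]))  # the byte that stores ø
print(misread)
```

The bytes on disk are never altered; only the interpretation differs, which is why the same file looks fine in an editor that is told it is Latin-1.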

Mozilla should either convert text to MacRoman, or let OS X know that the text
uses another encoding. No, I don't know how to do that, but I know that BBEdit
can do it.

I also do not know which Component to set. Probably Core -> DOM Text, but a more
experienced code wrangler should make that call.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: Save as text doesn't handle Latin 1 correctly, and line breaks are hard → Save as text: Latin 1 handled incorrectly as MacRoman
When 'saving as text', Mozilla just preserves the original document encoding
(which is ISO-8859-1 in this case). If it needs to be converted to anything,
it's not MacRoman but UTF-8 on Mac OS X. Perhaps this should be marked as a
duplicate (there's already a bug on this).
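
For illustration, the conversion suggested here amounts to decoding the bytes with the document's encoding and re-encoding them as UTF-8. This is a rough sketch under that assumption; `transcode_to_utf8` is a hypothetical helper, not a function in the Mozilla code base.

```python
def transcode_to_utf8(raw: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Decode bytes from the document's encoding and re-encode them as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


# Hypothetical input: text as it would be stored in an ISO-8859-1 document.
latin1_bytes = "Nyttårskonsert".encode("iso-8859-1")
utf8_bytes = transcode_to_utf8(latin1_bytes)

# å is one byte in ISO-8859-1 but two bytes in UTF-8, so the output is longer.
print(len(latin1_bytes), len(utf8_bytes))
```

Every ISO-8859-1 character has a Unicode equivalent, so this direction of conversion loses nothing; converting to MacRoman instead could fail for characters that MacRoman lacks.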

  
Severity: normal → enhancement
Component: General → DOM to Text Conversion
Keywords: intl
Product: Firefox → Core
Version: unspecified → Trunk
Assignee: firefox → dom-to-text
QA Contact: general
I wasn't able to find the bug that Jungshik mentions.

The text facilities in OS X (as used by TextEdit et al.) wrongly assume that any 8-bit text file without a BOM is MacRoman.

Saving text raw is arguably the correct behavior.
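
To make the described assumption concrete, here is a hedged sketch of a BOM-or-MacRoman guess. `guess_encoding` is a hypothetical stand-in for the behavior described above, not the actual OS X implementation.

```python
import codecs


def guess_encoding(raw: bytes) -> str:
    """Hypothetical heuristic: trust a BOM if present, otherwise assume MacRoman."""
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # this codec strips the BOM while decoding
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    return "mac_roman"


text = "Lør"
raw_latin1 = text.encode("latin-1")   # saved raw, no BOM
with_bom = text.encode("utf-8-sig")   # saved as UTF-8 with a leading BOM

print(raw_latin1.decode(guess_encoding(raw_latin1)))  # misread as MacRoman
print(with_bom.decode(guess_encoding(with_bom)))      # decoded correctly
```

Under such a heuristic, a file saved raw in any 8-bit encoding other than MacRoman is misread, while a file that carries a BOM is identified correctly.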
Hardware: PowerPC → All
Summary: Save as text: Latin 1 handled incorrectly as MacRoman → Save as text: all non-ascii encodings handled as MacRoman
No longer blocks: 247483
Assignee: dom-to-text → nobody
QA Contact: dom-to-text
MacRoman is dead; OS X uses UTF-8, and every Latin-1 character converts to it without loss (the first 256 Unicode code points are the same as Latin-1).
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME