[SeaMonkey 2.1] Wrong text encoding for html mail in non-utf encoding (with html5.enable=false, charset in <meta> tag is always used in HTML rendering, although decoder returns in UTF-8)

RESOLVED FIXED

Status

--
major
RESOLVED FIXED
9 years ago
8 years ago

People

(Reporter: mozdiav, Unassigned)

Tracking

Trunk
x86
Windows XP

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [fixed by bug 505072])

Attachments

(1 attachment)

(Reporter)

Description

9 years ago
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.9.3a1pre) Gecko/20090828 SeaMonkey/2.1a1pre
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.9.3a1pre) Gecko/20090828 SeaMonkey/2.1a1pre

Mail in HTML format with a non-UTF charset (at least Russian koi8-r and windows-1251) is displayed incorrectly.
Plain text messages, and the plain text part of combined plain+HTML messages, are shown correctly.
utf-8 encoded mails are displayed correctly.



Reproducible: Always

Steps to Reproduce:
1. Compose a mail in Russian with koi8-r encoding.
The message body will be:

Subject: =?KOI8-R?Q?=D4=C5=D3=D4_=C2=CC=D1_2?=
Content-Type: text/html; charset=KOI8-R
Content-Transfer-Encoding: 8bit

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
  <meta content="text/html;charset=KOI8-R" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
тест тест тест<br>
</body>
</html>

Actual Results:  
SeaMonkey displays the text as
я┌п╣я│я┌ я┌п╣я│я┌ я┌п╣я│я┌

Expected Results:  
should be
тест тест тест


The text is displayed correctly if "view->message body as->plain text" is chosen.

This bug occurs only with the combination of a non-UTF encoding and an HTML body.
The following is displayed by View/Source for a mail written in utf-8 (the byte code of the character data is utf-8), with View/Character Encoding=utf-8 shown as selected.
> Content-Type: text/plain; charset=utf-8; format=flowed
> Content-Transfer-Encoding: 8bit
> 
> тест тест тест

When View/Character Encoding of the View/Message Source window is changed to koi8-r, the source display changes to the following.
> Content-Type: text/plain; charset=utf-8; format=flowed
> Content-Transfer-Encoding: 8bit
> 
> я┌п╣я│я┌ я┌п╣я│я┌ я┌п╣я│я┌

=> Your Actual Result is what utf-8 data looks like when displayed as koi8-r.
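
The mismatch can be reproduced outside the mail client. The following is a minimal Python sketch, not SeaMonkey code, and the variable names are illustrative assumptions. It encodes the sample text as utf-8 and then decodes those same bytes as koi8-r, which should give the garbled text reported under Actual Results.

```python
# Illustrative sketch only: reproduces the garbling with Python's built-in codecs.
text = "тест тест тест"

utf8_bytes = text.encode("utf-8")       # bytes as a utf-8 mail would store them
misread = utf8_bytes.decode("koi8_r")   # the same bytes interpreted as koi8-r

print(misread)
```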

Is your mail data really written in koi8-r byte code?
In which of the following does the phenomenon occur?
  (a) SeaMonkey Mailnews, mail display of a mail in mail folder
  (b) SeaMonkey Browser, display of Web content of Mime-Type=message/rfc822
      (http://, https://, file:// etc.)
  (c) SeaMonkey Browser, local ".eml" file display

If the phenomenon occurs on mail data really written in koi8-r, and the case is (b) or (c), see Bug 227360 (see also Bug 206421).

> Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.9.3a1pre)
> Gecko/20090828 SeaMonkey/2.1a1pre
> <meta content="text/html;charset=KOI8-R" http-equiv="Content-Type">
> this bug actual only for combination non-utf encoding + html body.

If (a), and if the real byte code of the mail data is utf-8, see Bug 508946.
The phenomenon of Bug 508946 is not observed with SeaMonkey 1.1.17, the SeaMonkey 1.9.1 branch, or any Thunderbird.
(Reporter)

Comment 2

9 years ago
Yes, it's really koi8-r inside. I checked it by viewing the local inbox file in FAR manager with character tables installed. The text looks as it should with the koi8-r table selected.
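
A similar check can be scripted instead of using a file manager. The sketch below is only an illustration in Python under stated assumptions: the message is a shortened copy of the sample from comment #0, built in memory as a hypothetical stand-in for the bytes that would normally be read from the local inbox file, and it is parsed with the standard `email` package.

```python
import email

# Hypothetical stand-in for the stored message: the body bytes are koi8-r,
# as the Content-Type header declares.
raw = (
    "Content-Type: text/html; charset=KOI8-R\r\n"
    "Content-Transfer-Encoding: 8bit\r\n"
    "\r\n"
    "<html><body>тест тест тест<br></body></html>\r\n"
).encode("koi8_r")

msg = email.message_from_bytes(raw)
declared = msg.get_content_charset()   # charset taken from the Content-Type header
body = msg.get_payload(decode=True)    # raw body bytes, transfer encoding undone

print(declared)
print(body.decode(declared))           # readable only if the bytes match the header

# If the bytes were really utf-8, this decode would succeed.
try:
    body.decode("utf-8")
    is_utf8 = True
except UnicodeDecodeError:
    is_utf8 = False
print("valid utf-8:", is_utf8)
```

If the body decodes cleanly with the declared charset and fails as utf-8, the stored data is most likely what the header says it is.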
(In reply to comment #2)
> Yes, it's really koi8-r inside.

If so, it is (b) or (c) instead of (a), isn't it? Or does SeaMonkey Mailnews (1.9.3) incorrectly display the koi8-r HTML mail in a mail folder?
(Reporter)

Comment 4

9 years ago
(a)
Confirmed with 8/29 build.
> Build identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.9.3a1pre) Gecko/20090829 SeaMonkey/2.1a1pre

If charset=utf-8 is set in the <meta> tag of HTML written in koi8-r,
>  <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
the text "тест тест тест" was displayed as expected.

Bug 508946 and Bug 227360 look to occur on a real mail at the same time.
Is it due to a halfway implementation of HTML 5 support?
Status: UNCONFIRMED → NEW
Ever confirmed: true
Created attachment 397490 [details]
mail folder file

2 HTML mails written in koi8-r.
  mail-1 : charset=KOI8-R in meta tag
  mail-2 : charset=UTF-8 in meta tag
Attachment #397490 - Attachment mime type: text/plain → text/plain; charset=koi8-r
Note: The Sm 2.0 series doesn't have this bug's problem. It is a problem of the 2.1 series only.
> Build identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.9.1.4pre) Gecko/20090830 SeaMonkey/2.0b2pre
Version: unspecified → Trunk
The following looks to be the first Sm 2.1 (Gecko 1.9.2) build for Win at comm-central-trunk.
> http://ftp.mozilla.org/pub/mozilla.org/seamonkey/nightly/2009/07/2009-07-12-06-comm-central-trunk/
> seamonkey-2.1a1pre.en-US.win32.zip
> Build identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.9.2a1pre) Gecko/20090712 SeaMonkey/2.1a1pre
This bug's problem occurs with this build. It looks to have been a problem since Gecko 1.9.2 was first used.
Summary: Wrong text encoding for html mail in non-utf encoding → Wrong text encoding for html mail in non-utf encoding (Sm 2.1 only problem)
When the <meta> tag is removed, the mail was correctly displayed by Sm 2.1, regardless of whether auto-detect is on or off and regardless of the character encoding chosen for auto-detect. "koi8-r" in View/Character Encoding was displayed as "Checked", as expected.

I think this bug (non utf-8, charset in meta) and Bug 508946 (utf-8, wrong charset in meta) are the same problem.
  - The charset in the <meta> tag is always applied upon HTML mail rendering.
Summary: Wrong text encoding for html mail in non-utf encoding (Sm 2.1 only problem) → Wrong text encoding for html mail in non-utf encoding (Sm 2.1 only problem. charset in <meta> tag is always used in HTML rendering)
The phenomenon looks like this:
  (a) Sm2 Mailnews correctly processes the charset in the Content-Type: header.
  (b) The HTML converter of Gecko 1.9.2/1.9.3 returns data in utf-8.
  (c) Sm2 Mailnews doesn't remove the meta tag upon HTML rendering
      even when the charset is properly specified in the Content-Type: header,
      so the HTML renderer uses the charset in the <meta> tag.
  (d) Thus, a charset mismatch between (b) and (c) happens.
It's similar to Bug 227360 of SeaMonkey Browser, which occurs on all of Sm1.x, Sm 2.0, Sm 2.1.
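
Steps (a) to (d) can be imitated in a few lines. This is a hedged Python simulation, not Gecko or Mailnews code: the function names `mime_decode_to_utf8`, `render_with_meta_charset` and `make_mail` are hypothetical, and the <meta> tag lookup is a simplified regular expression rather than a real HTML parser.

```python
import re

def mime_decode_to_utf8(body: bytes, header_charset: str) -> bytes:
    """Steps (a)+(b): decode with the Content-Type charset, pass on utf-8."""
    return body.decode(header_charset).encode("utf-8")

def render_with_meta_charset(html: bytes, fallback: str = "utf-8") -> str:
    """Step (c): a renderer that trusts the charset named in the <meta> tag."""
    m = re.search(rb"charset=([A-Za-z0-9_-]+)", html)
    charset = m.group(1).decode("ascii") if m else fallback
    return html.decode(charset, errors="replace")

def make_mail(meta_charset: str) -> bytes:
    """Build a tiny HTML body whose stored bytes are koi8-r in every case."""
    html = ('<meta content="text/html;charset=%s" http-equiv="Content-Type">'
            "тест тест тест" % meta_charset)
    return html.encode("koi8_r")

# Step (d): compare a <meta> tag naming KOI8-R with one naming UTF-8.
for meta in ("KOI8-R", "UTF-8"):
    converted = mime_decode_to_utf8(make_mail(meta), "koi8_r")
    print(meta, "->", render_with_meta_charset(converted))
```

In this simulation the text is garbled when the <meta> tag names KOI8-R, and it is readable when the tag names UTF-8, because only then does the tag agree with the converter output. That is consistent with the two mails in the attachment.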
Severity: normal → major

Updated

9 years ago
Summary: Wrong text encoding for html mail in non-utf encoding (Sm 2.1 only problem. charset in <meta> tag is always used in HTML rendering) → [SeaMonkey 2.1] Wrong text encoding for html mail in non-utf encoding (charset in <meta> tag is always used in HTML rendering)
Summary: [SeaMonkey 2.1] Wrong text encoding for html mail in non-utf encoding (charset in <meta> tag is always used in HTML rendering) → [SeaMonkey 2.1] Wrong text encoding for html mail in non-utf encoding (charset in <meta> tag is always used in HTML rendering, although decoder returns in UTF-8)
(Reporter)

Updated

9 years ago
Flags: blocking-seamonkey2.1a1?

Updated

9 years ago
Component: MailNews: Message Display → MIME
Product: SeaMonkey → MailNews Core
QA Contact: message-display → mime
The patch for Bug 505072 has landed. Checked with the following builds.
> Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.3a1pre) Gecko/20091118 Shredder/3.1a1pre
> Build identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.9.3a1pre) Gecko/20091118 SeaMonkey/2.1a1pre
The problem couldn't be reproduced with these builds.
Fixed/Verified.
Status: NEW → RESOLVED
Last Resolved: 9 years ago
Resolution: --- → FIXED
Whiteboard: [fixed by bug 505072]
Status: RESOLVED → VERIFIED

Updated

9 years ago
Flags: blocking-seamonkey2.1a1?
(Reporter)

Comment 12

8 years ago
The problem is back.
Mozilla/5.0 (Windows NT 5.1; rv:2.0b8pre) Gecko/20101109 Firefox/4.0b8pre SeaMonkey/2.1b2pre - Build ID: 20101109020708
Status: VERIFIED → REOPENED
Resolution: FIXED → ---
(In reply to comment #12)
> SeaMonkey/2.1b2pre - Build ID: 20101109020708
> The problem is back.

It's not "the problem is back", it's "a new problem was exposed".
 
The default of html5.enable was changed from false to true on 2010/5/08. The problem in builds after 2010/5/08 occurs only with html5.enable=true. It's a different problem/cause from your original comment #0 (with html5.enable=false), although the external symptom on the same mail is the same.
Bug 594646 exists for the problem which occurs only with html5.enable=true.

Closing again.
Status: REOPENED → RESOLVED
Last Resolved: 9 years ago → 8 years ago
Resolution: --- → FIXED
Summary: [SeaMonkey 2.1] Wrong text encoding for html mail in non-utf encoding (charset in <meta> tag is always used in HTML rendering, although decoder returns in UTF-8) → [SeaMonkey 2.1] Wrong text encoding for html mail in non-utf encoding (with html5.enable=false, charset in <meta> tag is always used in HTML rendering, although decoder returns in UTF-8)
(Reporter)

Comment 14

8 years ago
The Magic!
Thank you, wada, and sorry.