Closed Bug 571704 Opened 14 years ago Closed 14 years ago

Quoted printable decoding fails on certain characters (with html5.enable=true, iso-8859-1 data is converted to utf-8, then Content-Type: charset=iso-8859-1 is applied to the converted utf-8 data)

Categories

(MailNews Core :: MIME, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 594646

People

(Reporter: JoeS1, Unassigned)

References

(Depends on 1 open bug)

Details

(Keywords: regression)

Attachments

(4 files, 1 obsolete file)

Attached file testcase eml
Double quotes and apostrophe (maybe others) fail to decode in message body.

STR:
Open the eml attachment in TB3.1 and in TB3.2 and compare the results.
I have no idea if the QP encoding is correct in my example, but it works in 3.1
Tested using:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.3a6pre) Gecko/20100611 Shredder/3.2a1pre ID:20100611121121
Attached image TB 3.2 rendering
Attached image TB 3.1 screenshot
This looks like a character-set mixup to me, the 3.2 rendering corresponds to what you see when displaying UTF-8 encoding as ISO-8859-1. Moving to MIME, though this may have been caused by some Core-code change.
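The mixup described here can be reproduced outside Thunderbird. The following is a minimal sketch, not taken from the attached test case: the sample string and the helper name `misread_utf8_as_latin1` are assumptions chosen for illustration. It shows what a reader sees when UTF-8 bytes are decoded as ISO-8859-1.

```python
def misread_utf8_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes with the wrong charset."""
    return text.encode("utf-8").decode("iso-8859-1")


# Hypothetical sample; any non-ASCII character shows the same effect.
sample = "café"
garbled = misread_utf8_as_latin1(sample)
print(garbled)  # cafÃ©

# The damage is reversible as long as the bytes were not altered further.
restored = garbled.encode("iso-8859-1").decode("utf-8")
print(restored)  # café
```

Each non-ASCII character becomes two or more Latin-1 characters, which matches the kind of rendering described for the 3.2 screenshot.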
Component: Mail Window Front End → MIME
Product: Thunderbird → MailNews Core
QA Contact: front-end → mime
Attachment #450875 - Attachment mime type: application/x-mimearchive → text/plain
OS: Windows XP → All
Hardware: x86 → All
Hi,
I have only seen this bug in mail, not in newsgroups.

Thanks for your work!
> Shredder/3.2a1pre ID:20100611121121
> Content-Type: text/html; charset="iso-8859-1"
> <META content=3D"text/html; charset=3Diso-8859-1" =
> http-equiv=3DContent-Type>

Bug 594646?
I also see the bug with 8bit:

Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 8bit
Sorry, forget my last message!

The same mail in text and HTML:

Text part:
Content-Type: text/plain;
    charset="iso-8859-1"
Content-Transfer-Encoding: 8bit
Display as text: correct

HTML part:
Content-Type: text/html;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Display as full HTML: correct
Display as simple HTML: wrong accents
(In reply to comment #9)

Oh, sorry for the misleading comment.
My comment #7 is about comment #0 and the test case the bug opener attached to comment #0.
Stéphane Grégoire, I can't say anything about your case, because there is no information about your Tb build, no mail data that reproduces the problem, and no evidence such as a screenshot showing what is correct and what is wrong. What is your basis for saying that your problem is the same as the bug opener's case in comment #0?
(In reply to comment #0)

Bug 594646 occurs only when all of the following conditions are met.
1. Tb 3.2 or later (Tb 3.1 doesn't have html5.enable)
2. html5.enable=true (IIRC, the default was changed to true around 6/08).
3. The correct charset is specified in the Content-Type: header
   (If the charset is incorrect, it is a different problem == malformed mail)
4. A <meta> tag with http-equiv for Content-Type exists in the HTML
   (excluding the case of multiple <meta> tags with http-equiv=content-type, content=...;charset=...)
5. A valid charset other than utf-8/utf-16 is specified in content="text/html;
   charset=xxxx" of <meta>.
6. Order of attributes in <meta>: content -> http-equiv
Your comment #0 and test case satisfy all conditions except possibly 2, since I can't know your actual html5.enable setting.
Joe Sabash, please check the above conditions.
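Conditions 4 to 6 concern only the HTML body, so they can be checked mechanically. The sketch below is one possible way to do that with Python's standard `html.parser`. The class `MetaCharsetProbe` and the function `matches_meta_conditions` are hypothetical names, and the check does not verify that the charset name is actually valid, nor does it cover conditions 1 to 3.

```python
import re
from html.parser import HTMLParser


class MetaCharsetProbe(HTMLParser):
    """Collect (charset, content_before_http_equiv) for Content-Type <meta> tags."""

    def __init__(self):
        super().__init__()
        self.hits = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        names = [name for name, _ in attrs]  # names arrive lowercased
        values = dict(attrs)
        if (values.get("http-equiv") or "").lower() != "content-type":
            return
        if "content" not in names:
            return
        match = re.search(r"charset\s*=\s*[\"']?([^\s\"';]+)",
                          values["content"] or "", re.I)
        charset = match.group(1).lower() if match else None
        content_first = names.index("content") < names.index("http-equiv")
        self.hits.append((charset, content_first))


def matches_meta_conditions(html: str) -> bool:
    """True if the HTML meets conditions 4-6 (exactly one such <meta>)."""
    probe = MetaCharsetProbe()
    probe.feed(html)
    if len(probe.hits) != 1:
        return False
    charset, content_first = probe.hits[0]
    return bool(charset) and charset not in ("utf-8", "utf-16") and content_first


affected = '<meta content="text/html; charset=iso-8859-1" http-equiv="Content-Type">'
reordered = '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">'
print(matches_meta_conditions(affected))   # True
print(matches_meta_conditions(reordered))  # False
```

A message whose HTML part passes this check would still need html5.enable=true and a correct Content-Type: header before it counts as a candidate for Bug 594646.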
Attachment #479737 - Attachment mime type: message/rfc822 → text/plain; charst="iso-8859-1"
Attachment #479737 - Attachment mime type: text/plain; charst="iso-8859-1" → text/plain; charset="iso-8859-1"
(In reply to comment #12)
> Created attachment 479737 [details]

Oh, you were the bug opener of Bug 598740, which was duped to this bug...

I changed the MIME type of the attachment to "text/plain; charset=iso-8859-1", which is what is set in the Content-Type: header of the mail.
Stéphane Grégoire, view it in a browser with View/Character Encoding set to iso-8859-1 and then to utf-8. The mail data is written in utf-8, but "Content-Type: charset=iso-8859-1" was set by the mail sender or the mail server, or the mail server converted the data from iso-8859-1 to utf-8 without changing iso-8859-1 in Content-Type: to utf-8.
It looks like "Simple HTML" correctly handles "Content-Type: charset=iso-8859-1", but "Original HTML" seems to fail to use iso-8859-1 and to interpret the data as UTF-8.
This phenomenon is common to your bug, this bug, and Bug 594646, which I pointed to.
However, your case is a phenomenon with malformed mail. It is better handled separately.
I am re-opening your Bug 598740 and setting a dependency on this bug for ease of search.
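The malformed-mail case described above (body bytes in utf-8, header declaring another charset) can be approximated with a simple heuristic. This is only a sketch: the function name `looks_like_mislabeled_utf8` is hypothetical, and the test is a guess based on byte validity, not a reliable detector, since short non-UTF-8 texts can occasionally happen to be valid UTF-8.

```python
def looks_like_mislabeled_utf8(body: bytes, declared: str) -> bool:
    """Guess whether body is UTF-8 even though another charset is declared."""
    if declared.lower().replace("_", "-") in ("utf-8", "utf8"):
        return False  # the label already says UTF-8
    if body.isascii():
        return False  # pure ASCII reads the same under either charset
    try:
        body.decode("utf-8")
    except UnicodeDecodeError:
        return False  # not valid UTF-8, so the label may well be right
    return True


# Hypothetical bodies: the same name stored as UTF-8 and as ISO-8859-1.
as_utf8 = "Stéphane".encode("utf-8")
as_latin1 = "Stéphane".encode("iso-8859-1")

print(looks_like_mislabeled_utf8(as_utf8, "iso-8859-1"))    # True
print(looks_like_mislabeled_utf8(as_latin1, "iso-8859-1"))  # False
```

A True result points toward the malformed-mail situation tracked in Bug 598740 and away from the valid-mail problem tracked here.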
Hello, I submitted a bug but it was marked as a duplicate of this one, so I am describing my experience here:

a screenshot of the Greek mail encoding problem:

https://bug600178.bugzilla.mozilla.org/attachment.cgi?id=479024

and here the bug report:

https://bugzilla.mozilla.org/show_bug.cgi?id=600178

thank you for your time!
(In reply to comment #14)
> a screenshot of the problem of greek mail encoding: 
> https://bug600178.bugzilla.mozilla.org/attachment.cgi?id=479024
> https://bugzilla.mozilla.org/show_bug.cgi?id=600178

To nas (bug opener of bug 600178):
View your bug 600178 in a browser with View/Character Encoding set to utf-8 and then to iso-8859-7.
"Ελληνικά!" in your bug, which reads correctly as utf-8, is displayed like your screenshot when iso-8859-7 is used, and iso-8859-7 is what is set in the Content-Type: header of your mail data. Your bug is the iso-8859-7 variant of Bug 598740, which is a phenomenon with malformed mail.
Please close your bug as a dup of Bug 598740 instead of this bug, which is about valid mail, for ease of diagnosis, tracking, and search.
Difference between this bug and Bug 594646:
(A) Content-Type: multipart/alternative;
(B) text/html part in multipart/alternative.
    Content-Transfer-Encoding: quoted-printable
mail-1: same HTML as original test case
        <meta content=text/html; charset="iso-8859-1" http-equiv=Content-Type>
mail-2: <meta content=text/html; charset="iso-8859-1" ...> is removed
mail-3: <meta content=text/html; charset="utf-8" http-equiv=Content-Type>
mail-4: <meta http-equiv=Content-Type content=text/html; charset="iso-8859-1">

The phenomenon of this bug is observed with mail-1 only, and only with html5.enable=true.
This is the same problem as Bug 594646.
Whether the part is quoted-printable or not is irrelevant. Whether the mail is multipart/xxx or not is irrelevant.
Attachment #479737 - Attachment is obsolete: true
The phenomenon/problem depends on View/Message Body As, and Bug 594646 is for the "Original HTML" case.
Joe Sabash, is yours the "Original HTML" case or the "Simple HTML" case?

Note: The regression window is clear from the other bug: it starts when the patch that changed the default of html5.enable from false to true landed.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: Quoted printable decoding fails on certain characters → Quoted printable decoding fails on certain characters (with html5.enable=true, iso-8859-1 data is converted to utf-8, then Content-Type: charset=iso-8859-1 is applied to the converted utf-8 data)
As I wrote in bug 598740 comment #23, with "Simple HTML", a garbled display is shown even for the following simple quoted-printable HTML in a charset other than utf-8.
> Content-Type: text/html; charset=Shift_JIS
> Content-Transfer-Encoding: quoted-printable
> 
> <html><head></head>
> <body>
> <p>=93=FA=96{=8C=EA</p>
> </body></html>
Please keep the "Original HTML" case and the "Simple HTML" case separate.
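For reference, the quoted Shift_JIS sample decodes cleanly when the two steps are applied in the right order: first undo quoted-printable, then decode the resulting bytes with the declared charset. The sketch below uses Python's standard `quopri` module; it only illustrates the expected decoding and is not a description of how Thunderbird's MIME code is implemented.

```python
import quopri

# The <p> line from the sample above, as it appears on the wire.
encoded = b"<p>=93=FA=96{=8C=EA</p>"

# Step 1: quoted-printable -> raw bytes ("{" is a literal 0x7B byte).
raw = quopri.decodestring(encoded)

# Step 2: raw bytes -> text, using the charset from the Content-Type: header.
text = raw.decode("shift_jis")
print(text)  # <p>日本語</p>

# Decoding the same bytes as UTF-8 fails, which is one way to get garbage.
wrong = raw.decode("utf-8", errors="replace")
print("\ufffd" in wrong)  # True
```

Note that the second byte of a Shift_JIS character can be a printable ASCII character such as "{", so it is left unencoded by quoted-printable and must not be treated as markup or text before charset decoding.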
No longer blocks: 598740
Depends on: 598740
No longer blocks: 600178
Whether the body is encoded or not (quoted-printable/base64 or not) was relevant only to the "View/Message Body As/Simple HTML" case.
Closing as a dup of Bug 594646 (relevant to <meta>, irrelevant to quoted-printable/base64, Original HTML case).
Joe Sabash (bug opener), please reopen if the duping is wrong.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → DUPLICATE