utf-8 decoding botched somehow

RESOLVED WORKSFORME

Status

--
major
RESOLVED WORKSFORME
9 years ago
3 months ago

People

(Reporter: eyalroz, Assigned: smontagu)

Tracking

({regression, testcase})

1.9.1 Branch
x86
Windows XP
regression, testcase

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 1 obsolete attachment)

903 bytes, text/plain
Details
(Reporter)

Description

9 years ago
For the last several months, certain UTF-8 messages, which are reported as windows-1255 or iso-8859-8-i and have both plain text and HTML parts (these may be sufficient rather than necessary criteria), are no longer properly decoded when forcing the charset to UTF-8. 
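
For readers unfamiliar with the scenario, here is a minimal Python sketch of a message whose header declares windows-1255 while the body is really UTF-8, decoded once as declared and once with a forced UTF-8 override. The sample message and the variable names are made up for illustration; this uses Python's standard `email` package and is not the mail client's own decoding path.

```python
from email import message_from_bytes

# Hypothetical message: the header claims windows-1255, the body is really UTF-8.
body = "<html><body>שלום</body></html>"
raw = (
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/html; charset=windows-1255\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
) + body.encode("utf-8")

msg = message_from_bytes(raw)
payload = msg.get_payload(decode=True)   # raw body bytes, no charset applied yet
declared = msg.get_content_charset()     # charset named in the header

# Decoding as declared produces mojibake (undecodable bytes become U+FFFD).
as_declared = payload.decode(declared, errors="replace")

# Forcing UTF-8, as the user does from the charset menu, recovers the text.
forced_utf8 = payload.decode("utf-8")

print(repr(as_declared))
print(repr(forced_utf8))
```

The behaviour reported here is that the second, forced decoding stopped producing the readable result.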

Moreover, sometimes one even gets parts of the HTML itself, e.g. the doctype string, in the decoded message.

This happens, of course, with and without my BiDi Mail UI extension.

This started happening between 2009-07-09 and 2009-07-12.
(Reporter)

Comment 1

9 years ago
There are no Windows builds between 2009-07-09 and 2009-07-12 so I can't pinpoint the exact day.

Another note: the windows-1255 decoding result also differs before and after the regression; it's not just the post-coercion decoding.
(Reporter)

Comment 2

9 years ago
Created attachment 407246 [details]
triggering message
(Reporter)

Comment 3

9 years ago
Created attachment 407249 [details]
triggering message

Simplified testcase.

It turns out you don't need the text MIME part. But if you only have text instead of HTML with the same issue (UTF-8 reported as windows-1255 or 8859-8-i), the problem does not manifest.
Attachment #407246 - Attachment is obsolete: true
(Reporter)

Updated

9 years ago
OS: All → Windows XP
Hardware: All → x86
(Reporter)

Comment 4

9 years ago
Bug does _not_ manifest with a clean profile using the same mail folders as the dirty one. But it _does_ manifest with a clean profile into which I've copied my prefs.js, addressbook, bookmarks and cookies, and using the same mail folders.
Sounds bad.

This is the only open intl bug with the regression keyword that was created since fall 2006.
Keywords: testcase
Version: unspecified → 1.9.1 Branch
(Reporter)

Comment 6

9 years ago
Manifestation seems to also be affected by the choice of View Message Body as Simple HTML/Original HTML. With some messages I see manifestation with Original and Simple HTML but not with text; with other messages, it's just Simple HTML.
Removing myself from all the bugs I'm CCed on. Please NI me if you need something from me on MailNews Core bugs.

Comment 8

3 months ago
Jorg, can you please check this?
Flags: needinfo?(jorgk)

Updated

3 months ago
Attachment #407249 - Attachment mime type: message/rfc822 → text/plain
Flags: needinfo?(jorgk)

Comment 9

3 months ago
The message shows as mojibake because it is decoded as Hebrew, as per
  Content-Type: text/html; charset=iso-8859-8-i
although the HTML is actually in UTF-8.
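
At the byte level, the mismatch can be sketched as follows in Python. The sample word is made up, and Python's `iso-8859-8` codec is used as a stand-in for iso-8859-8-i on the assumption that the two share the same byte-to-character mapping and differ only in how text direction is handled.

```python
# Hypothetical sample text: four Hebrew letters.
text = "שלום"

# In UTF-8 every Hebrew letter takes two bytes (0xD7 followed by 0x90..0xAA).
raw = text.encode("utf-8")
print(raw.hex(" "))

# A single-byte Hebrew codec reads each of those bytes as its own character,
# so the result is longer than the original and unreadable. Bytes that the
# codec does not define are replaced with U+FFFD instead of raising an error.
wrong = raw.decode("iso-8859-8", errors="replace")
print(repr(wrong))

# Decoding the very same bytes as UTF-8 gives the original text back.
right = raw.decode("utf-8")
print(right)
```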

Comment 10

3 months ago
We have to decode based on the mail header, and that's wrong, so the bug is invalid.
Status: NEW → RESOLVED
Last Resolved: 3 months ago
Resolution: --- → INVALID
(Reporter)

Comment 11

3 months ago
@JorgK: 
With due respect - I was talking about what happens when you _force_ UTF-8 decoding, not with the default we have to apply based on the header.

However, I don't see the bug right now, so changing the resolution to WFM. I will reopen if I encounter it again though.
Resolution: INVALID → WORKSFORME

Comment 12

3 months ago
Sorry, I read too many bugs every day and some details escape me.