Closed Bug 523307 Opened 15 years ago Closed 6 years ago

utf-8 decoding botched somehow

Categories

(MailNews Core :: Internationalization, defect)

1.9.1 Branch
x86
Windows XP
defect
Not set
major

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: eyalroz1, Assigned: smontagu)

Details

(Keywords: regression, testcase)

Attachments

(1 file, 1 obsolete file)

For the last several months, certain UTF-8 messages, which are reported as windows-1255 or iso-8859-8-i and have both plain text and HTML parts (these may be sufficient rather than necessary criteria), are no longer properly decoded when forcing the charset to UTF-8. 

Moreover, sometimes one even gets parts of the HTML itself, e.g. the doctype string, in the decoded message.

This happens, of course, with and without my BiDi Mail UI extension.

This started happening between 2009-07-09 and 2009-07-12.
There are no Windows builds between 2009-07-09 and 2009-07-12 so I can't pinpoint the exact day.

Another note: the windows-1255 decoding result also differs before and after the regression; it's not just the post-coercion decoding.
Attached file triggering message (obsolete)
Attached file triggering message
Simplified testcase.

It turns out you don't need the text MIME part. But if you only have text instead of HTML with the same issue (UTF-8 reported as windows-1255 or 8859-8-i), the problem does not manifest.
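For anyone wanting to regenerate a similar message rather than use the attachment, here is a minimal sketch in Python using the standard `email` package. It is not the attached testcase: the helper name `build_mislabelled_html`, the subject line and the sample Hebrew text are all made up for illustration. It builds a single text/html part whose header claims iso-8859-8-i while the body bytes are really UTF-8.

```python
import base64
from email.message import Message


def build_mislabelled_html(body_text: str, declared_charset: str) -> Message:
    """Build a text/html message whose header names one charset
    while the body bytes are actually UTF-8 (hypothetical helper)."""
    msg = Message()
    msg["Subject"] = "charset mismatch testcase"  # illustrative only
    msg["MIME-Version"] = "1.0"
    msg["Content-Type"] = f'text/html; charset="{declared_charset}"'
    msg["Content-Transfer-Encoding"] = "base64"
    html = f"<!DOCTYPE html><html><body><p>{body_text}</p></body></html>"
    # Encode as UTF-8 regardless of what the header claims.
    msg.set_payload(base64.b64encode(html.encode("utf-8")).decode("ascii"))
    return msg


msg = build_mislabelled_html("שלום עולם", "iso-8859-8-i")
raw = msg.get_payload(decode=True)     # body bytes, before any charset decoding
declared = msg.get_content_charset()   # what the header claims
print(declared)
print(raw.decode("utf-8"))             # readable only when decoded as UTF-8
print(msg.as_string())                 # the message source, e.g. to save as .eml
```

Swapping `"iso-8859-8-i"` for `"windows-1255"` gives the other mislabelling mentioned above.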
Attachment #407246 - Attachment is obsolete: true
OS: All → Windows XP
Hardware: All → x86
Bug does _not_ manifest with a clean profile using the same mail folders as the dirty one. But it _does_ manifest with a clean profile into which I've copied my prefs.js, addressbook, bookmarks and cookies, and using the same mail folders.
bad, sounds like.

This is the only open intl bug with the regression keyword created since fall 2006.
Keywords: testcase
Version: unspecified → 1.9.1 Branch
Manifestation seems to also be affected by the choice of View Message Body as Simple HTML/Original HTML. With some messages I see manifestation with Original and Simple HTML but not with text; with other messages, it's just Simple HTML.
Removing myself from all the bugs I'm CCed on. Please NI me if you need something on MailNews Core bugs from me.
Jorg, can you please check this?
Flags: needinfo?(jorgk)
Attachment #407249 - Attachment mime type: message/rfc822 → text/plain
Flags: needinfo?(jorgk)
The message shows as mojibake since it's shown as per
  Content-Type: text/html; charset=iso-8859-8-i
in Hebrew although the HTML is in UTF-8.
We have to decode based on the mail header, and that's wrong, so the bug is invalid.
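To illustrate the difference between decoding by the declared header and decoding with a forced charset, here is a rough Python sketch. It does not reflect how Thunderbird implements this; `decode_body` and the `PY_CODEC` table are hypothetical, and mapping iso-8859-8-i to Python's `iso8859_8` codec is an assumption (the "-i" suffix concerns bidi ordering, not the byte mapping).

```python
from typing import Optional

# Assumed mapping from MIME charset labels to Python codec names.
PY_CODEC = {
    "utf-8": "utf-8",
    "windows-1255": "cp1255",
    "iso-8859-8": "iso8859_8",
    "iso-8859-8-i": "iso8859_8",
}


def decode_body(raw: bytes, declared: str, forced: Optional[str] = None) -> str:
    """Decode with the forced charset if one is given, else the declared one.
    Bytes that are undefined in the chosen codec become U+FFFD."""
    label = (forced or declared).lower()
    return raw.decode(PY_CODEC[label], errors="replace")


original = "שלום"
raw = original.encode("utf-8")  # what the sender actually put in the body

by_header = decode_body(raw, "iso-8859-8-i")                    # default behaviour
by_override = decode_body(raw, "iso-8859-8-i", forced="utf-8")  # user forces UTF-8

print(repr(by_header))    # mojibake
print(repr(by_override))  # the original text
```

In this sketch the override recovers the text exactly, which is the behaviour expected when forcing UTF-8 on such a message.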
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → INVALID
@JorgK: 
With due respect - I was talking about what happens when you _force_ UTF-8 decoding, not with the default we have to apply based on the header.

However, I don't see the bug right now, so changing the resolution to WFM. I will reopen if I encounter it again though.
Resolution: INVALID → WORKSFORME
Sorry, I read too many bugs every day and some details escape me.