Open Bug 425945 Opened 16 years ago Updated 2 years ago

Charset not applied sometimes to subjects in message list, but is OK in message display. (was BiDi Hebrew & Arabic)

Categories

(Thunderbird :: Folder and Message Lists, defect)

x86
All
defect

Tracking

(Not tracked)

People

(Reporter: eyalroz1, Unassigned)

References

Details

(Keywords: intl, regression, Whiteboard: [dupeme?])

Attachments

(4 files)

For the last few months I've noticed that for some messages, the subject line shows up OK in the message header display, but not in the list of messages in a folder. See the screenshots comparing SeaMonkey trunk 2008-03-27 to Thunderbird release 2.0.0.12.
Bug began to manifest between 2008-02-01 and 2008-02-07.
Keywords: regression
Bug began to manifest between 2008-02-05 and 2008-02-06 nightlies.
Blocks: 90584
(In reply to comment #0)
Hi Eyal. Can you also attach a message header, please?
The message header contains 'charset=iso-8859-I', but I don't see such a charset in the Hebrew encodings list in the Thunderbird menu (View->Character Encoding->More Encodings->Middle Eastern).
If I change the charset in the header of the attached message to 'iso-8859-8-I', I can see the message subject and text correctly.
Well... that's for the body, not for the subject. Although I can't say with certainty that the new behavior is actually a bug if you believe the content-type's charset should also apply to the subject.
That was the idea of the patch in https://bugzilla.mozilla.org/show_bug.cgi?id=90584: apply the message charset (if specified) to a non-MIME subject, sender and recipient; otherwise the default folder charset is used.
So how is the message body getting displayed with the right charset if there is a typo in the header? Autodetection?
Quite probably the default character encoding in the folder properties is set to the right charset. That would work even without the #90584 changes.
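For illustration, the fallback order discussed above (the charset declared in the message first, the folder default second) can be sketched in Python. This is not Thunderbird's code: the function names `pick_header_charset` and `decode_raw_subject` are hypothetical, the final UTF-8 fallback is an assumption, and Python's codec registry merely stands in for the mail client's list of known charsets.

```python
import codecs


def pick_header_charset(declared, folder_default):
    """Return the first charset label that this Python runtime recognizes.

    Hypothetical helper: tries the message's declared charset, then the
    folder default, and finally falls back to UTF-8 (an assumption).
    """
    for label in (declared, folder_default):
        if not label:
            continue
        try:
            # codecs.lookup raises LookupError for unknown labels
            # such as the mistyped "iso-8859-I".
            return codecs.lookup(label).name
        except LookupError:
            continue
    return "utf-8"


def decode_raw_subject(raw, declared, folder_default):
    """Decode a raw (non-MIME-encoded) 8-bit Subject header."""
    charset = pick_header_charset(declared, folder_default)
    # errors="replace" turns undecodable bytes into U+FFFD.
    return raw.decode(charset, errors="replace")
```

With the label from this bug, `pick_header_charset("iso-8859-I", "windows-1255")` skips the unrecognized label and falls back to the folder default.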
(In reply to comment #11)
> So how is the message body getting displayed with the right charset if there is
> a typo in the header? Autodetection?

It's not, actually, unless you use my extension (BiDi Mail UI) which does its own autodetection.
Component: MailNews: BiDi Hebrew & Arabic → Layout: Text
QA Contact: giladehven → layout.fonts-and-text
Also occurs on MacOS: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1a2pre) Gecko/20080820030902 Shredder/3.0b1pre.

Attachment of screenshot provided.  Note that the body in the preview pane displays characters correctly.  Font is set to Verdana, and encoding in/out as UTF-8.
Attachment #312616 - Attachment mime type: application/octet-stream → text/plain
Attachment #312616 - Attachment mime type: text/plain → text/plain;charset=windows-1255
Attachment #312616 - Attachment mime type: text/plain;charset=windows-1255 → text/plain;charset=iso-8859-8-I
(In reply to comment #7)
> folder with triggering message

I changed the Content-Type of the attachment to text/plain;charset=ISO-8859-8-I, because the first (bottom-most) Received: header is as follows.
> Received: from 127.0.0.1 (unknown [127.0.0.1])
>	by asat.org.il (Postfix) with SMTP id 1960B42EB0
>	for <eyalroz@technion.ac.il>; Tue, 25 Mar 2008 19:57:00 +0000 (UTC)

The Subject: and mail body seem to be written in one of the Hebrew encodings: ISO-8859-8-I, ISO-8859-8, or windows-1255.
> http://en.wikipedia.org/wiki/Windows-1255
> http://en.wikipedia.org/wiki/ISO_8859-8
> http://en.wikipedia.org/wiki/ISO-8859-8-I

(1) Subject: is not properly encoded.
(2) iso-8859-I is specified as charset. 
> Content-type: text/plain; charset=iso-8859-I
"I" is the 9th letter of the alphabet (0->"A", 1->"B", ..., 7->"H", 8->"I"), so was iso-8859-8 turned into iso-8859-I?
Or was iso-8859-8-I wrongly specified as iso-8859-I?
It may be the same problem as bug 468351 (fixed in Tb 3.0b4).
  Because of the wrong charset=iso-8859-I, auto-detect is used. However, due to bug 468351,
  the quirks handling for raw non-ASCII bytes in headers is broken.
Assignee: mozilla → nobody
This bug no longer manifests using 3.3a3 on MacOS 10.6.6.  Verified with a Usenet article that had accented characters in the title.
richard, thanks for checking.


(In reply to comment #17)
> It may be same problem as bug 468351(fixed by Tb 3.0b4).
>   As wrong charset=iso-8859-I, auto-detect works. However, due to bug 468351,
>   quirks for raw binary of non-ascii in header is broken.

I can still reproduce using current trunk and attachment 312616.

Duplicates/related bugs (this list includes at least two types of bugs):
Bug 537869
Bug 459288
Bug 602556
Bug 610110
Bug 317263
plus some of these:
https://bugzilla.mozilla.org/buglist.cgi?query_format=advanced&short_desc=subject%20encod&field0-0-0=short_desc&short_desc_type=allwordssubstr&type0-0-0=nowordssubstr&value0-0-0=char&resolution=---&product=MailNews%20Core&product=Thunderbird
Keywords: intl
note: Bug 468351 - display of header values with unencoded special characters broken - which some of the above bugs may match, was fixed for 3.0
> I can still reproduce using current trunk and attachment 312616.

Me too. That is to say: my default charset for the folder is windows-1255, and that isn't applied to the message, nor is iso-8859-I. I get U+FFFD replacement characters like in the screenshot, which are the result of seeing iso-8859-I, not understanding what that means, and trying to 'apply' it to the subject.

(Perhaps the bug name should be changed.)
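A small Python sketch of how replacement characters like those in the screenshot can arise: raw 8-bit Hebrew bytes are decoded with a charset that does not match them. Using UTF-8 as the mismatched decoder is an assumption for illustration only; the thread does not establish which decoder Thunderbird actually ends up applying to the subject. The helper name `decode_lossy` is hypothetical.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def decode_lossy(raw: bytes, charset: str) -> str:
    """Decode bytes, substituting U+FFFD for anything undecodable."""
    return raw.decode(charset, errors="replace")


# A Hebrew subject as raw windows-1255 bytes (no MIME encoded-word).
raw_subject = "שלום".encode("cp1255")

# Mismatched decoder: the Hebrew bytes are not valid UTF-8.
garbled = decode_lossy(raw_subject, "utf-8")

# Matching decoder: the text round-trips cleanly.
correct = decode_lossy(raw_subject, "cp1255")
```

Here `garbled` contains U+FFFD characters while `correct` is the original Hebrew text, which matches the difference between the message list and the message display described in this bug.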
Does this really belong in a Thunderbird component rather than in core:layout?


(In reply to comment #21)
> (Perhaps the bug name should be changed.)

Can you adjust it to your thinking?
OS: Windows XP → All
Component: Layout: Text and Fonts → Folder and Message Lists
Product: Core → Thunderbird
Summary: Charset no longer applied sometimes to subjects in message list → Charset not applied sometimes to subjects in message list, but is OK in message display. (was BiDi Hebrew & Arabic)
Whiteboard: [dupeme?]
Severity: normal → S3