Use the charset sniffed by auto-detector from mail body to the non-MIME header



MailNews Core
16 years ago
9 years ago


(Reporter: Jack Angel, Assigned: nhottanscp)








16 years ago
Well, I have quite a lot of messages: some are in Latin, some in Cyrillic
KOI8-R, some in Cyrillic Windows-1251. It seems that auto-detection of the
charset for the message itself works (not absolutely sure, though). But for the
message list I'm able to select only one charset, so only messages whose subject
is in this charset have their subject displayed properly in the message list.
I suppose the charset should be auto-detected for each message, and the message
should be displayed in the auto-detected charset both in the message list and
when I view the message itself.

Comment 1

16 years ago
The message list (thread pane) display is implemented differently from the
message view pane. For mails without a MIME charset, using the auto-detector to
sniff out the charset of the subject would be risky: the subject is usually
short, so the auto-detector is not reliable in this case.
You can put the mails in the same charset into one folder and specify the
appropriate folder charset by selecting the menu View | Folder Character Coding...

Comment 2

16 years ago

*** This bug has been marked as a duplicate of 77903 ***
Last Resolved: 16 years ago
Resolution: --- → DUPLICATE

Comment 3

16 years ago
It doesn't seem to be a dup of bug 77903, since the reporter referred to the
case in which the mail body doesn't have charset info (the auto-detector works
on the mail body). In this case, I think we can use the charset sniffed by the
auto-detector from the mail body and apply it to the message header. But I'm
not sure whether it will affect performance. Kat, what do you think? Thanks.

Comment 4

16 years ago
But I think that the reporter is referring to the case when the message has no
charset info in the body, so it cannot be sniffed. In that case we apply the
global default (folder charset).

Comment 5

16 years ago
Added shanjian to cc list.
In most cases, when it's turned on, the auto-detector can sniff out the charset
from a mail body that has no charset info.

Comment 6

16 years ago
There is a bug on Russian auto-detection (it's not working), let me find a

Comment 7

16 years ago
this is bug # 90581

Comment 8

16 years ago
this is bug # 90581

Comment 9

16 years ago
Yes, shanjian's fix for the Russian auto-detector has not been checked in yet.
This bug is about whether we need to apply the charset sniffed out by the
auto-detector from the mail body to the message header in the thread pane. We
have the folder charset feature already, but this is for the case where the
user wants to read multilingual mails in the same folder.

Comment 10

16 years ago
Actually, Shanjian's fix is for auto-detecting the body, not headers. This is
the case where our auto-detector cannot detect the charset because the message
body is short (not enough info for sniffing). That's why Frank's proposal is to
obsolete nsIStringCharsetDetector and use nsIStringCharsetDetector, since the
data we provide to it is too small. If I applied the folder's charset, it would
work.

Comment 11

16 years ago
I didn't mean we should use the auto-detector for message headers in the thread
pane. I'm thinking that maybe we should pass the charset info to the header
display in the thread pane after the auto-detector finds out the charset from the mail

Comment 12

16 years ago
I guess this could be a duplicate of Bug 77903. It would depend on
how that bug is fixed. If the solution there is to apply the body charset
to non-MIME headers before using the folder charset, no matter how
that body charset is obtained, then it would fix this bug also.
I don't know how that fix will be implemented, so it might be best to
leave this bug open so that whoever works on Bug 77903 takes this
aspect of the problem into consideration. Also, this bug should
provide additional test cases to look at in case Bug 77903 takes care
of this bug.
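
The fallback order described above can be sketched roughly as follows. This is only an illustration of the idea, not the MailNews implementation: the function names `header_charset` and `decode_raw_header` are hypothetical, and the body charset is passed in as a plain value regardless of whether it came from a MIME label or from the auto-detector.

```python
from typing import Optional


def header_charset(body_charset: Optional[str], folder_charset: str) -> str:
    """Pick the charset for a non-MIME header (hypothetical helper).

    Prefer the body charset, however it was obtained; fall back to the
    folder charset when no body charset is available.
    """
    return body_charset or folder_charset


def decode_raw_header(raw: bytes, body_charset: Optional[str],
                      folder_charset: str) -> str:
    """Decode raw 8-bit header bytes using the fallback order above."""
    charset = header_charset(body_charset, folder_charset)
    # errors="replace" keeps the list displayable even if the guess is wrong.
    return raw.decode(charset, errors="replace")


# Made-up sample subject, stored as raw KOI8-R bytes with no MIME encoding.
raw_subject = "Тест".encode("koi8_r")

# Body charset known: the subject decodes correctly despite the folder default.
print(decode_raw_header(raw_subject, "koi8_r", "cp1251"))

# No body charset: the folder charset is used and the subject comes out wrong.
print(decode_raw_header(raw_subject, None, "cp1251"))
```

With this ordering the folder charset only matters when nothing is known about the body, which is the multi-charset folder case from the original report.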

Comment 13

16 years ago
Reopened the bug, modified summary.
Keywords: intl
Resolution: DUPLICATE → ---
Summary: As I have messages both in Koi8-R and Cp-1251 Cyrillic charsets, I'm not able to list them all simultaneously → Use the charset sniffed by auto-detector from mail body to the non-MIME header


16 years ago
Ever confirmed: true


16 years ago
Hardware: PC → All

Comment 14

16 years ago
Assigning to jbetak. Please have a look. Thanks.
Assignee: yokoyama → jbetak

Comment 15

16 years ago
Reassign to nhotta.

Currently, charset detection is not applied to message headers.
There are a couple of issues: performance and accuracy.
Usually, the number of characters in the headers is relatively small, so I
assume the accuracy of the detection would be low.
Assignee: jbetak → nhotta
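
As a small illustration of the accuracy concern, the sketch below shows that the same header bytes decode without error in both Cyrillic charsets mentioned in this bug, so byte validity alone cannot tell them apart and a detector has to rely on letter statistics, which a short subject barely provides. The sample subject is made up for the example; only Python's standard codecs are used.

```python
# Made-up sample subject; not taken from any message in this bug.
subject = "Привет"

# Raw 8-bit header bytes as they would appear without MIME encoding.
koi8_bytes = subject.encode("koi8_r")

# Both decodes succeed: every byte is a valid letter in either charset.
as_koi8 = koi8_bytes.decode("koi8_r")
as_cp1251 = koi8_bytes.decode("cp1251")

print(as_koi8)    # the intended subject
print(as_cp1251)  # still Cyrillic letters, but not the intended text
```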

Comment 16

16 years ago
nhotta, what we're talking about is the same as bug 77903, except that in bug
77903 the charset for the Body is known, while in this bug no charset is
specified for Headers or Body. So basically this is about doing the same trick
as for bug 77903, but auto-detecting the Body charset before passing it on to
the Header.

Comment 17

16 years ago
Getting the body charset for the header does not always work. For IMAP, no body
data is available when we display the message list in the thread pane.
I think the original report is requesting charset auto-detection for headers.


16 years ago
OS: Linux → All
Priority: -- → P4

Comment 18

16 years ago
set to future
Target Milestone: --- → Future

Comment 19

16 years ago
*** Bug 115631 has been marked as a duplicate of this bug. ***

Comment 20

15 years ago

*** This bug has been marked as a duplicate of 90584 ***
Last Resolved: 16 years ago → 15 years ago
Resolution: --- → DUPLICATE
Product: MailNews → Core
Product: Core → MailNews Core