Closed Bug 97314 Opened 24 years ago Closed 22 years ago

Use the charset sniffed by auto-detector from mail body to the non-MIME header

Categories

(MailNews Core :: Internationalization, defect, P4)

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 90584
Future

People

(Reporter: eternal, Assigned: nhottanscp)

References

Details

(Keywords: intl)

I have quite a lot of messages: some are in Latin, some in Cyrillic koi8-r, some in Cyrillic win-1251. Auto-detection of the charset seems to work for the message itself (though I'm not absolutely sure). But for the message list I can select only one charset, so only messages whose subject is in that charset have their subject displayed properly in the message list. Perhaps the charset should be auto-detected for each message, and the message should be displayed in the auto-detected charset both in the message list and when I view the message itself.
The message list (thread pane) display is implemented differently from the message view pane. For mail without a MIME charset, using the auto-detector to sniff out the charset of the subject would be risky: the subject is usually short, and the auto-detector is not powerful enough on so little text. You can put the mails that share a charset into one folder and specify the appropriate folder charset by selecting the menu View | Folder Character Coding...
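To illustrate why a short subject is hard to sniff, here is a minimal Python sketch. It is not the Mozilla auto-detector; it only checks which candidate charsets decode the bytes without error. The helper name `charsets_that_decode` and the sample subject are made up for the example. Because koi8-r and win-1251 are both single-byte Cyrillic encodings, the same bytes are valid in both, so a detector has to fall back on letter statistics, and a few characters give it very little evidence.

```python
from typing import List

# Hypothetical candidate list for this illustration only.
CANDIDATES = ["koi8_r", "cp1251"]


def charsets_that_decode(raw: bytes) -> List[str]:
    """Return the candidate charsets that decode `raw` without an error."""
    ok = []
    for name in CANDIDATES:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        ok.append(name)
    return ok


# A short subject, stored as raw koi8-r bytes with no MIME charset label.
subject = "Re: да"
raw = subject.encode("koi8_r")

# Both charsets accept the bytes, so validity alone cannot pick one.
plausible = charsets_that_decode(raw)

# Only the koi8-r reading matches what the sender typed.
readings = {name: raw.decode(name) for name in plausible}
```

Here `plausible` contains both charsets, and only `readings["koi8_r"]` equals the original subject; the win-1251 reading is different Cyrillic text that is just as "valid" as far as decoding goes.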
*** This bug has been marked as a duplicate of 77903 ***
Status: UNCONFIRMED → RESOLVED
Closed: 24 years ago
Resolution: --- → DUPLICATE
It doesn't seem to be a dup of bug 77903, since the reporter referred to the case in which the mail body doesn't have charset info (the auto-detector works on the mail body). In this case, I think we can use the charset sniffed by the auto-detector from the mail body and apply it to the message header. But I'm not sure whether it will affect performance. Kat, what do you think? Thanks.
But I think the reporter is referring to the case where the message has no charset info in the body, so it cannot be sniffed. In that case we apply the global default (folder charset).
Added shanjian to cc list. In most cases, auto-detector can sniff out the charset from the mail body with no charset info when it's turned on.
There is a bug on Russian auto-detection (it's not working); let me find the number.
this is bug # 90581
Yes, shanjian's fix for the Russian auto-detector has not been checked in yet. This bug is about whether we need to apply the charset sniffed out by the auto-detector from the mail body to the message header in the thread pane. We already have the folder charset feature, but this would cover the case where the user wants to read multilingual mails in a single folder.
Actually, Shanjian's fix is for auto-detecting the body, not the headers. This is the case where our auto-detector cannot detect the charset because the message body is short (not enough info for sniffing). That's why Frank's proposal is to obsolete nsIStringCharsetDetector and use nsIStringCharsetDetector, since the data we provide to it is too small. If I applied the folder's charset, it would work.
I didn't mean we should use auto-detector for message headers in the thread pane. I'm thinking that maybe we should pass the charset info to the header display in thread pane after auto-detector finds out the charset from the mail body.
I guess this could be a duplicate of Bug 77903. It would depend on how that bug is fixed. If the solution there is to apply the body charset to non-MIME headers before using the folder charset, no matter how that body charset is obtained, then it would fix this bug also. I don't know how that fix will be implemented, so it might be best to leave this bug open so that whoever works on Bug 77903 takes this aspect of the problem into consideration. This bug should also provide additional test cases to look at in case Bug 77903 takes care of it.
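The fallback order described above (body charset first, then folder charset, regardless of whether the body charset came from a MIME label or from the auto-detector) could be sketched roughly as follows. This is Python for illustration only, not the mailnews code; `pick_header_charset` and `decode_raw_header` are hypothetical names, and the body charset is passed in as `None` when it is unknown.

```python
from typing import Optional


def pick_header_charset(body_charset: Optional[str], folder_charset: str) -> str:
    """Prefer the body charset (labelled or sniffed); else the folder charset."""
    return body_charset if body_charset else folder_charset


def decode_raw_header(raw: bytes,
                      body_charset: Optional[str],
                      folder_charset: str) -> str:
    """Decode a non-MIME header using the fallback order above."""
    charset = pick_header_charset(body_charset, folder_charset)
    try:
        return raw.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown charset name or undecodable bytes: fall back to the
        # folder charset and replace anything that still cannot be decoded.
        return raw.decode(folder_charset, errors="replace")


# A win-1251 subject sitting in a folder whose charset is set to koi8-r.
raw_subject = "Тема".encode("cp1251")

# Body charset known (for example sniffed from the body): subject is readable.
with_body = decode_raw_header(raw_subject, "cp1251", "koi8_r")

# Body charset unknown: only the folder charset is left, and it is wrong here.
without_body = decode_raw_header(raw_subject, None, "koi8_r")
```

With the body charset available, `with_body` is the original subject; without it, the folder charset is applied and the subject comes out as the wrong Cyrillic letters, which is the situation the reporter describes.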
Reopened the bug, modified summary.
Status: RESOLVED → UNCONFIRMED
Keywords: intl
Resolution: DUPLICATE → ---
Summary: As I have messages both in Koi8-R and Cp-1251 Cyrillic charsets, I'm not able to list them all simultaneously → Use the charset sniffed by auto-detector from mail body to the non-MIME header
Status: UNCONFIRMED → NEW
Ever confirmed: true
Hardware: PC → All
Assigning to jbetak. Please have a look. Thanks.
Assignee: yokoyama → jbetak
Reassigning to nhotta. Currently, charset detection is not applied to message headers. There are a couple of issues: performance and accuracy. The number of characters in the headers is usually relatively small, so I assume the accuracy of the detection would be low.
Assignee: jbetak → nhotta
nhotta, what we're talking about is the same as bug 77903, except that in bug 77903 the charset for the body is known, while in this bug no charset is specified for either the headers or the body. So basically this is about doing the same trick as for bug 77903, but auto-detecting the body charset before passing it on to the header.
Getting the body charset for the header does not always work. For IMAP, no body data is available when we display the message list in the thread pane. I think the original report is requesting charset auto-detection for headers.
Status: NEW → ASSIGNED
OS: Linux → All
Priority: -- → P4
set to future
Target Milestone: --- → Future
*** Bug 115631 has been marked as a duplicate of this bug. ***
*** This bug has been marked as a duplicate of 90584 ***
Status: ASSIGNED → RESOLVED
Closed: 24 years ago → 22 years ago
Resolution: --- → DUPLICATE
Product: MailNews → Core
Product: Core → MailNews Core