Closed
Bug 90584
Opened 24 years ago
Closed 17 years ago
charset=... must be applied to non-MIME Subject:/From:/To:/etc. fields
Categories
(MailNews Core :: Internationalization, enhancement)
MailNews Core
Internationalization
Tracking
(Not tracked)
RESOLVED
FIXED
mozilla1.9beta4
People
(Reporter: ezh, Assigned: lelik52)
References
(Depends on 1 open bug, Blocks 1 open bug, )
Details
(Keywords: fixed1.8.1.15, intl)
Attachments
(6 files, 6 obsolete files)
123.38 KB,
image/jpeg
|
Details | |
13.02 KB,
image/png
|
Details | |
655 bytes,
text/plain
|
Details | |
95.32 KB,
image/jpeg
|
Details | |
7.22 KB,
patch
|
Bienvenu
:
review+
Bienvenu
:
superreview+
|
Details | Diff | Splinter Review |
7.76 KB,
patch
|
Bienvenu
:
review+
Bienvenu
:
superreview+
samuel.sidler+old
:
approval1.8.1.13-
dveditz
:
approval1.8.1.15+
|
Details | Diff | Splinter Review |
1. Set character coding to Autodetect->Russian.
2. Open this news group.
3. List thrue messages
Some subjects are not in cyrillic than in smth else.
If you see message with (for expl: from Mikhail Lyubkin, 02.06.01 16:39) with ?????????????? in the body (see bug 90581) and the subject is also corrupted so make it manually KOI8-R and the subject will appear correctly but only in the message head.
Comment 1•24 years ago
|
||
Could you attach the news message?
Also please specify name of the newsgroup, thanks.
Keywords: intl
Reporter | ||
Comment 2•24 years ago
|
||
Just look at the URL I typed-in this bug:
news://news.regetsoft.com/regetsoft.public.regetjr.support.ru
There are just some messages, so wouldn't miss it.
Comment 3•24 years ago
|
||
In that newsgroup, I see some messages which does not have charset specified.
On my machine they are shown as garbage (I will attach a screen shot) but that
is expected. The user has to change charset manually per message or set the
default charset by either pref UI or folder property UI.
Comment 4•24 years ago
|
||
Comment 5•24 years ago
|
||
I did not realize that this is about auto detection.
Auto detection is not applied to message headers because usually not enough data
available also the cost of setting up auto detection is high.
Status: NEW → ASSIGNED
Summary: Subject codepage is incorrect → Subject codepage is incorrect with charset auto detect is on.
Target Milestone: --- → Future
Comment 6•23 years ago
|
||
This does not only affect the subject, but all headers. Also it doesnt work when
you are actually viewing a message.
Comment 7•23 years ago
|
||
Comment 8•23 years ago
|
||
see also the same message with koi8-r encoding
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=78892
from
http://bugzilla.mozilla.org/show_bug.cgi?id=137098
Comment 9•23 years ago
|
||
Mozilla always uses codepage specified in Message Display properties instead of
specified in the message header.
This is not related to Autodetect feature.
Comment 10•23 years ago
|
||
For 'Subject' displaying Mozilla always uses codepage specified in 'Message
Display' properties instead of specified in the message header.
This is not related to 'Autodetect' feature.
Comment 11•23 years ago
|
||
I see the same problem on OS/2 build 2002071108, so please switch OS->All.
The problem is not only for newsgroups, but also for regular e-mails. A typical
e-mail message is attached. It has
"Content-type: text/plain; charset=windows-1251"
in the header but Mozilla does not apply this for the Subject.
I think summary like
"Codepage used for message displaying must be applied for Subject also"
will be much better.
Comment 12•23 years ago
|
||
Status changes:
OS: Windows98 -> All
Summary: 'Subject codepage is incorrect with charset auto detect is on.' ->
'Codepage used for message displaying must be applied for Subject also'
Severity: normal -> major
Severity: normal → major
OS: Windows 98 → All
Hardware: Other → All
Summary: Subject codepage is incorrect with charset auto detect is on. → Codepage used for message displaying must be applied for Subject also
Target Milestone: Future → ---
Comment 13•23 years ago
|
||
I've experienced the same problem here. Agree with comment #11 and #12
Comment 14•22 years ago
|
||
I agree with comment #11 and #12.
Message subject should be displayed according to the setting in the header, eg.
"Content-type: text/plain; charset=iso-8859-2"
in message preview panel and window and also in message folder.
Comment 15•22 years ago
|
||
Some additional comment:
when replying to message, mozilla displays the CORRECT subject in both cases:
when subject is not correctly displayed and when subject is correctly displayed.
this problem is actual for 'sender' field also.
Updated•22 years ago
|
Blocks: advocacybugs
Comment 16•22 years ago
|
||
Changing Summary according to recent comments.
Summary: Codepage used for message displaying must be applied for Subject also → charset=... must be applied for Subject:/From:/To:/etc. fields
Comment 17•22 years ago
|
||
*** Bug 97314 has been marked as a duplicate of this bug. ***
Updated•22 years ago
|
Summary: charset=... must be applied for Subject:/From:/To:/etc. fields → charset=... must be applied to non-MIME Subject:/From:/To:/etc. fields
Comment 18•22 years ago
|
||
IMHO the following sequence should be used:
1) if charset of subject is specified (like 'Subject:
=?KOI8-R?Q?=F0=D2=CF=C2=CC=C5=CD=D9_=D3_=D0=CF=DE=D4=CF=CA?='), this charset
should be used to show subject;
2) if charset of subject is not specified, charset of the message (specified in
'Content-Type' header) should be used to show subject;
3) if none is specified, defaults/autodetectiion should be used.
Reporter | ||
Comment 19•22 years ago
|
||
Is this actually the same as bug 196734? Here is about news, but symptomps are
the same...
Comment 20•22 years ago
|
||
*** Bug 196734 has been marked as a duplicate of this bug. ***
Comment 21•22 years ago
|
||
I'm pretty sure that there is no distinction between mail and news in respect to
this bug.
Reporter | ||
Comment 22•22 years ago
|
||
CC'ing from duped bug.
Updated•22 years ago
|
Flags: blocking1.4b?
Updated•22 years ago
|
Flags: blocking1.4b? → blocking1.4b-
Comment 23•22 years ago
|
||
*** Bug 154948 has been marked as a duplicate of this bug. ***
Comment 24•21 years ago
|
||
re comment #18,
1) and 3) work in Mozilla. The issue is #2. I also listed in the international
known issues part (of the release notes)
Comment 25•21 years ago
|
||
Jungshik, I didn't find any references to this bug in Mozilla 1.5 release notes
http://www.mozilla.org/releases/mozilla1.5/known-issues-int.html
Am I looking at a wrong place?
Comment 26•21 years ago
|
||
IT's in 'Mail' section as quoted below:
> The Character Coding menu cannot correct the thread pane display of non-MIME
encoded headers. (See > RFC 2047.) If the message displayed lacks MIME charset
information, the thread pane display obeys > the default message display charset
set via the folder properties dialog (set via the menu "Edit | > Folder
Properties ..."). Correctly setting this option to the charset most often used
in the
> user's mail viewing helps avoid display problems of this type.
It's not as direct as you may like (my draft explicitly mentioned it if my
memory serves me right). While word-smithing, we left that detail out. For most
people, this wouldn't be a serious problem. However, I realize that for
Russians, it's rather important because multiple (legacy) character encodings
(Windows-1251, KOI8-R and ISO-8859-5are used with more or less the same
frequency.
Anyway, I'll refer to this bug in the document.
Comment 27•21 years ago
|
||
adding some more mailnews gurus to cc. This bug should be easier to fix than
making the header listing pane respond 'dynamically' to changes made via View |
Character Coding menu.
|Mime_DecodeMimeHeader| (that's a wrapper over
nsIMIMEHeaderParam::decodeRFC2047Header [1]) has a parameter for the default
charset.
[1]
http://lxr.mozilla.org/seamonkey/source/netwerk/mime/src/nsMIMEHeaderParamImpl.cpp#360
Comment 28•21 years ago
|
||
*** Bug 228133 has been marked as a duplicate of this bug. ***
Comment 29•21 years ago
|
||
*** Bug 201037 has been marked as a duplicate of this bug. ***
Comment 30•21 years ago
|
||
From http://www.mozilla.org/releases/mozilla1.6b/known-issues-
int.html#mail: "If the message displayed lacks MIME charset information, the
thread pane display obeys the default message display charset set via the
folder properties dialog (set via the menu "Edit | Folder Properties ...").
Correctly setting this option to the charset most often used in the user's
mail viewing helps avoid display problems of this type."
But the message NOT lacks MIME charset information! As I posted at the
http://bugzilla.mozilla.org/show_bug.cgi?id=228133, there is charset in the
header: "Content-type: text/plain; charset=windows-1251". And the message body
displayed right. Thunderbird should use the _same_ charset for the headers.
But it didn't :(
Setting the default charset of the folder to "most often used" not helps,
because 55% of the headers have "koi8-r" charset, and 45% - "windows-1251". So
we'll unable to read a half of the 8-bit headers. If Thunderbird will apply
the rule "for 8bit headers use the same charset as for message body; and use
default charset only if there no explicit charset in the message body or
headers" - we will able to read 99.5% of the headers and messages w/o touching
menu.
So, I agree with Jungshik Shin 2003-10-23 16:09 :
"I realize that for Russians, it's rather important because multiple (legacy)
character encodings (Windows-1251, KOI8-R and ISO-8859-5 are used with more
or less the same frequency." ISO-8859-5 is not used in the Russian/Cyrillic
mail now, but Windows-1251 and KOI8-R used with the same frequency, and this
is bad idea to select one of them as default. Most often charset is mentioned
for a body, but just ignored by Mozilla/Thunderbird for a headers.
Any chance this problem will be solved in Thunderbird? This bug is the only
reason why we still using Outlook.
Comment 31•21 years ago
|
||
Please, read RFC 2047 and RFC 2822. Content-Type header is NOT for specifying
the charset of messgeage header fields BUT ONLY for specifying the charset of
the message __body__.
To specify the charset of the messgae header, you should encode your message
header according to RFC 2047 as below:
Subject: =?KOI8-R?Q?........?=
8bit characters in RFC 822/2822 (STD 11) emails are __forbidden__ forever !!!
Comment 32•21 years ago
|
||
As it stands, Mozilla/Thunderbird ARE standard-compliant, but those sending
their emails with raw 8bit characters violate RFC 2822/RFC 2047. [1]
We want to 'fix' this bug not because those messages with 8bit characters are
standard-compliant but because we believe in a network maxim 'be lenient in what
you accept and be strict in what you send out'.
Please, ask your correspondents to configure their email clients NOT to send out
8bit characters in email headers. Major email clients can be configured that way
if RFC 2047-compliance is not ON by default. With two widely used encodings,
Russians have all the more incensitve to abide by RFC 2047/2822. We'll "fix"
this bug, but in the meantime, why don't you spread the words to your fellow
Russians?
[1] http://www.faqs.org/rfcs/rfc2047.html
http://www.faqs.org/rfcs/rfc2822.html
Comment 33•21 years ago
|
||
> 8bit characters in RFC 822/2822 (STD 11) emails are __forbidden__ forever !!!
Ooops. I meant '8bit characters in the __header__ of RFC 822/2822(STD 11)
email messages are forbidden'. In the message _body_, needless to say, 8bit
characters can be used with Content-Transfer-Encoding: 8bit when ESMTP 8BITMIME
is supported.
As I'm now sitting in front of Windows 2k box, let me tell you how to configure
MS OE to send RFC 2047-compliant emails:
1. Go to Tools | Options.
2. Click on 'Send' tab.
3. Click on 'HTML settings' and make sure 'allow 8-bit characters in headers' is
NOT checked
4. Repeat the above for 'Plain text setting' for both mail and news and 'html
setting' for news
I can't find the equivalent option in Outlook, but all the messages I received
from Ms Outlook users are compliant to RFC 2047 so that there must be a way to
do this (or it's not user-configurable, but it always sends RFC 2047-compliant
emails, which is a bit atypical of MS given its usual ignorance/disrespect
of/for the standard)
In case of Mozilla, RFC 2047-compliance is ON by default. Perhaps because it's
regarded as important to observe RFC 2047 (or because it's regarded as a 'bad'
UI), the checkbox that used to be available in Edit|preference|Mail|Send for
turning that off is now gone. You have to go to 'about:config' and set to
'false' 'mail.strictly_mime-headers'. Of course, I recommend you keep the
default (ON).
Comment 34•21 years ago
|
||
Here is my penny.
Personally, I'm not going to teach everybody in the net not send 8-bit headers.
But there a lot of such messages (including newsgroups), and I want Mozilla to
support them. Note that such support does NOT violate any RFC. We're not asking
about sending 8-bit headers that would be a violation. We're asking about
ability to read them.
Comment 35•21 years ago
|
||
Please, read my comment #32 (the first paragraph). You're free not to teach
everybody to abide by the standard, but that kind of disrespect for the standard
hurts the internet as a whole.
Comment 36•21 years ago
|
||
This is not a bug per se, but just an enhancement request.
Severity: major → enhancement
Comment 37•21 years ago
|
||
Sorry for spamming...
> Please, read my comment #32 (the first paragraph)
I meant the second paragraph of comment #32 (we believe in a network maxim 'be
lenient in what
you accept and be strict in what you send out)
Comment 38•21 years ago
|
||
Dear Jungshik Shin,
Mozilla browser is also standard compliant, BUT you've spend a lot of time to
support most of non-startdart legacy html tricks, bugs, old encodings
(including even Cyrillic DOS-encoding, not used in the internet at all ;) -
just because you WANT TO READ (be able to read at least) all these old sites.
And you don't want to teach each site author - "hey, you have wrong formatted
html - there and there...". And we also unable re-teach 10 millions of russian
internet users and 100 000 old mail scripts around the world. These scripts
know nothing about RFC2047 and "default Cyrillic encoding". For example, now I
will write something in Russian ('Ìàêñ, ãëÿäè-êà, îíè òóò ïðîñòî ðóññêèõ íå
ëþáÿò :)') - and I'm sure, your modern Bugzilla will send me this mail with
wrong encoding! :-)) Hey guys, please fix your Bugzilla scrpits now! :) They
are not compliant with MIME RFCs!
May be you should better just fix this small issue in the Thunderbird, and it
will become really "The e-mail and newsgroup client for 2004 and beyond" - for
Russians too.
Thanks,
Regards,
Comment 39•21 years ago
|
||
Hey, how many times did I have to repeat? Don't you understand what I
meant by that we believe in a maxim that 'be lenient in what we accept
and be strict in what you send out'(comment #32) [1] or did you decide
to ignore that? Please, press 'show votes for this bugs' button and
see who voted for this bug. I had voted for this bug before most of
you did. See also who wrote 'international known issues' document(press
'Document history' button at the bottom of the page). I co-authored it
with Kats and it's I who insisted that this issue had to be listed.
[1] This doesn't mean that you can keep violating the standard. Some
people believe that we must be strict in what you accept as well because
otherwise non-standard products would never be fixed. I'm not one of
them (at least in this case). I AM willing to fix this bug. If not,
I would have resolved this bug as 'INVALID or WONT FIX'.
You have to/must 'enlighten' people around you about the standard and
the importance of adhering to the standard. Your time would be much
better spent teaching one more friend/correspondent of yours and/or
authors/administrators of web mail scripts/web mail services about the
standard than making me keep saying the same thing over and over again.
A couple of years ago, I wrote a couple of times to the largest Korean
web mail service provider and was successful in persuading them to fix
their web mail service to be compliant to RFC 2047. Please, do something
instead of just saying that there are too many non-standard email messages
going around.
Incidentally, as opposed to what you wrote, Mozilla rejects quite a lot of
non-standard constructs (in html/css/dom) as invalid/illegal/non-standard.
That's why we have 'tech-evangelism' components to teach people (web
page authors and web site administrators) what's wrong with their web
pages/sites and how to fix them.
Comment 40•21 years ago
|
||
have yet to build a mozilla with this (my Win2k SP4 became terribly slow in
accessing my Samba server where Mozilla source is after installing SP4...)
Comment 41•21 years ago
|
||
It still doesn't work. In the message display pane, it works, but in the
message list(thread) pane, it doesn't. I thgouth 'charset' stored in msgDB is
the body charset (for simple 'single part' messages), but it's just empty.
Attachment #137356 -
Attachment is obsolete: true
Comment 42•21 years ago
|
||
*** Bug 73202 has been marked as a duplicate of this bug. ***
Comment 43•21 years ago
|
||
It doesn't work (msgDB doesn't have the 'body charset' information stored)
because when we get to nsParseMailMessageState::ParseHeaders, |m_headers| has
only a part of the message header[1]. Especially, it doesn't have 'Content-Type'
header so that down in the method, the attempt to set |m_content_type| always
fails (actually, Mozilla never gets there [2]).
857 case 'C': case 'c':
858 if (!nsCRT::strncasecmp ("CC", buf, end - buf))
859 header = GetNextHeaderInAggregate(m_ccList);
860 else if (!nsCRT::strncasecmp ("Content-Type", buf, end - buf))
861 header = &m_content_type;
862 break;
I have to track down why m_headers.GetBuffeerPos() doesn't point to the end of
the header (excluding Content-Type and Content-Transfer-Encoding header fields)
837 int nsParseMailMessageState::ParseHeaders ()
838 {
839 char *buf = m_headers.GetBuffer();
840 char *buf_end = buf + m_headers.GetBufferPos();
841 while (buf < buf_end)
842 {
[1] http://lxr.mozilla.org/mozilla/source/mailnews/local/src/nsParseMailbox.cpp#839
[2] http://lxr.mozilla.org/mozilla/source/mailnews/local/src/nsParseMailbox.cpp#860
Assiging to myself for a better tracking (for me)
Assignee: nhottanscp → jshin
Status: ASSIGNED → NEW
Comment 44•21 years ago
|
||
btw, if charset= is not specified but autodetected for the body, it should be
applied to non-MIMEd header fields as well.
Jungshik, does your patch work this way?
Comment 45•21 years ago
|
||
Thank you very much for patch! How to test it? (I'm unable to recompile
Mozilla Thunderbird under Windows).
Comment 46•21 years ago
|
||
bienvenu and mscott,
can you help me with the problem described in comment #43?
re: comment #44. No, my patch doesn't address that. It's a different issue. And,
please 'enlighten' your friends to use standard-compliant mail clients. Not even
specifying the body charset (in C-T) is a gross violation of the standard. If
some Russian web mail services do that, file a tech-evangelism bug, put me in
the CC list and I'll write to them. If you're talking about
'multipart/alternative' and other C-T with text/* part(s), we have to deal with
it in a separate bug. Again, this shows why RFC 2047 has to be abided by and why
RFC 2047 stipulates that the character encoding be explicitly specified in each
header field. If everybody observed the standard, we'd dispense with so much
'defensive' stuffs like this.
re: comment #45: sorry I don't have time to put up a binary build with the patch
for you to try.
Comment 47•21 years ago
|
||
The folder default charset is KOI8-R. Both messages in the folder have raw 8bit
characters in Subject header (the first one in KOI8-R and the second one in
Windows-1251). As you can see, the message list pane 'interprets' the subject
of the second as in KOI8-R while the subject in the message display area is
interpreted as in Windows-1251 (that of the body charset).
I manually edited my mailbox to put 'Content-Type' header field before
'Subject' header field, but that didn't change anything.
Comment 48•21 years ago
|
||
any chance to get this fixed in 1.7b ?
Comment 49•21 years ago
|
||
This is just a comment about the difficulty factor in adding this new behavior.
This is a nice-to-have addition but it may not be easy since the info you need
at the time you need it in the code may not be easy to get. The body thread and
header thread go in 2 different threads within the Mozilla code. See this
diagram for better understanding of the data flow:
http://wp.netscape.com/eng/intl/docs/iuc17/mail/slide05.html
and hope that Jungshik and others can figure out how to get this done
eventually. Thanks for trying it, Jungshik.
Those who have any influence at all in the Mail and News fields of Russian,
please evangelize the use of MIME-compiant news posting software -- whether or
not this feature request gets done.
Comment 50•21 years ago
|
||
> The body thread and header thread go in 2 different threads within the
Mozilla code.
But the required charset (for Subject) is present in the header! Not need to
hack a body thread to fetch something.
This rule is easy: if the Subject contains 8-bit characters with unknown
charset - get and use the charset from the 'charset=' parameter in
the 'Content-Type' field of the message _header_. That's all !
If 'Content-Type:...charset=' in the header is missimg too - then use the
Default charset.
> Those who have any influence at all in the Mail and News fields of Russian,
please evangelize the use of MIME-compiant news posting software -- whether or
not this feature request gets done.
You mean, we need to ask you to disable existing feature of posting messages
with 8-bit headers in Mozilla?
Comment 51•21 years ago
|
||
(In reply to comment #50)
> This rule is easy: if the Subject contains 8-bit characters with unknown
> charset - get and use the charset from the 'charset=' parameter in
> the 'Content-Type' field of the message _header_. That's all !
Everybody here knows what this bug is about. You don't have to explain it again.
However, it's not that simple to implement that. Header fields in RFC (2)822 /
STD11 messages come in any order so that Content-Type heeader field come at the
very end while Subject, From come well before that. That's yet another rationale
for RFC 2047-encoding.I'm pretty sure that the authors of RFC 2047 had this in
mind because making each header field self-contained makes it the life of
implementors (e.g. Mozilla developers) so much easier.
Anyway, what's needed here is a sort of 'callback' we can invoke to change the
header character encoding when necessary.
> > Those who have any influence at all in the Mail and News fields of Russian,
> please evangelize the use of MIME-compiant news posting software -- whether or
> not this feature request gets done.
>
> You mean, we need to ask you to disable existing feature of posting messages
> with 8-bit headers in Mozilla?
It's disabled by default, isn't it? As I wrote a couple of times before, you
Russians have a more compelling reasons to abide by RFC 2047 than others because
you have two widely used character encodings. So, why don't you evagelize that
as Kat suggested?
Comment 52•20 years ago
|
||
Can anyone give an update on the status of this bug?
Updated•20 years ago
|
Product: MailNews → Core
Comment 53•20 years ago
|
||
*** Bug 273196 has been marked as a duplicate of this bug. ***
Comment 54•19 years ago
|
||
I am using Thunderbird 1.5/20051111, and it seems like I have found a workaround for this bug. I am not sure if it will work with later versions, however I think it may be useful.
See demo image:
http://kitoy.ru/ak/thunderbird/thunderbird-broken-header-encoding-for-folders-workaround.gif
The workaround:
Set windows-1251 as a default encoding for Inbox.
Create a new subfolder named koi and create a filter to move all koi-encoded messages into it. Set koi8-r as a default encoding for the koi folder.
Create a view (Inbox-test Mail in the picture) which shows all messages from Inbox and koi folders.
Done.
However, there are two another bugs:
1. “Sort by” do not work right for views.
2. “Get flagged messages” do not work right for views.
Sorry for my English.
Comment 55•18 years ago
|
||
bug 351541 may be a dupe of this
Comment 56•18 years ago
|
||
*** Bug 214510 has been marked as a duplicate of this bug. ***
Reporter | ||
Comment 57•18 years ago
|
||
I do still experience this bug in 3.0 nightlies. Any change to fix this for 3.0?
There is no way for nominating bugs for 3.0 yet, but this affects many languages and is defenetly worth be nominated for 3.0.
Comment 58•18 years ago
|
||
Why waiting for Tb 3.0? We didn't release Tb2.0b1 yet, I see no reason to wait for 3.0.
Flags: blocking-thunderbird2?
Comment 59•18 years ago
|
||
I like to fix this bug for TB 2.0, but I'm entirely tied up until the end of December. If someone can come forward and fix this before TB 2.0, it'd be great. If it's not fixed by then, I'll take another look early next year.
Assignee: jshin1987 → smontagu
QA Contact: ji → mailnews.i18n
Comment 60•18 years ago
|
||
*** Bug 351541 has been marked as a duplicate of this bug. ***
Comment 62•18 years ago
|
||
Looks like we didn't get any traction on this and time is starting to run out to make Thunderbird 2.
Flags: blocking-thunderbird2? → blocking-thunderbird2-
Assignee | ||
Comment 64•17 years ago
|
||
Assignee | ||
Comment 65•17 years ago
|
||
Message charset should be used to display subject, sender and recipient correctly if specified. Otherwise default folder charset will be used.
Comment 66•17 years ago
|
||
(In reply to comment #64)
> Created an attachment (id=290131) [details]
> Charset patch
Alex, you should request review for patch from specific person. See
http://developer.mozilla.org/en/docs/Getting_your_patch_in_the_tree#Getting_the_patch_reviewed
Assignee | ||
Updated•17 years ago
|
Attachment #290131 -
Flags: review?
Assignee | ||
Updated•17 years ago
|
Attachment #290131 -
Flags: review? → review?(smontagu)
Assignee | ||
Comment 67•17 years ago
|
||
Forgot to take into account "Charset Override", if the user changes charset manually.
Attachment #290131 -
Attachment is obsolete: true
Attachment #290131 -
Flags: review?(smontagu)
Attachment #290213 -
Flags: review?
Assignee | ||
Updated•17 years ago
|
Attachment #290213 -
Flags: review? → review?(bienvenu)
Updated•17 years ago
|
Assignee: smontagu → lelik52
Comment 68•17 years ago
|
||
Jungshik, are you still tracking this, and does this look like what you had in mind?
Assignee | ||
Comment 70•17 years ago
|
||
Synced with the latest cvs tree. Added new message alert charset fix.
base/util/nsMsgDBFolder.cpp - New message alert fix. Detect charset correctly if Content-Type header is multiline.
db/msgdb/src/nsMsgDatabase.cpp - Message list pane fix.
- mime/src/mimemsg.cpp - Message pane fix.
Use message header charset to display header fields in message panes.
Attachment #290213 -
Attachment is obsolete: true
Attachment #290213 -
Flags: review?(bienvenu)
Attachment #300087 -
Flags: review+
Assignee | ||
Updated•17 years ago
|
Attachment #300087 -
Flags: review+ → review?
Assignee | ||
Updated•17 years ago
|
Attachment #300087 -
Flags: review? → review?(bienvenu)
Comment 71•17 years ago
|
||
Comment on attachment 300087 [details] [diff] [review]
Charset patch #2
thx for the patch, Alex
Attachment #300087 -
Flags: superreview+
Attachment #300087 -
Flags: review?(bienvenu)
Attachment #300087 -
Flags: review+
Updated•17 years ago
|
Keywords: checkin-needed
Comment 72•17 years ago
|
||
Checking in mailnews/base/util/nsMsgDBFolder.cpp;
/cvsroot/mozilla/mailnews/base/util/nsMsgDBFolder.cpp,v <-- nsMsgDBFolder.cpp
new revision: 1.328; previous revision: 1.327
done
Checking in mailnews/db/msgdb/src/nsMsgDatabase.cpp;
/cvsroot/mozilla/mailnews/db/msgdb/src/nsMsgDatabase.cpp,v <-- nsMsgDatabase.cpp
new revision: 1.390; previous revision: 1.389
done
Checking in mailnews/mime/src/mimemsg.cpp;
/cvsroot/mozilla/mailnews/mime/src/mimemsg.cpp,v <-- mimemsg.cpp
new revision: 1.60; previous revision: 1.59
done
Status: NEW → RESOLVED
Closed: 17 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9beta4
Updated•17 years ago
|
Attachment #137375 -
Attachment is obsolete: true
Assignee | ||
Comment 73•17 years ago
|
||
Would be good to apply this patch to Thunderbird 2.0 (MOZILLA_1_8_BRANCH) too.
Assignee | ||
Updated•17 years ago
|
Attachment #301853 -
Flags: review?(bienvenu)
Comment 74•17 years ago
|
||
Comment on attachment 301853 [details] [diff] [review]
Charset patch backport for MOZILLA_1_8_BRANCH
I think you're leaking strings here:
+ err = RowCellColumnToCharPtr(row, m_messageCharSetColumnToken, &charSet);
+ if (NS_FAILED(err) || !*charSet || !PL_strcasecmp(charSet, "us-ascii") ||
+ characterSetOverride)
+ {
+ m_dbFolderInfo->GetCharPtrCharacterSet(&charSet);
+ }
+
In RowCellColumntoCharPtr allocates memory for the out string, which you're not freeing, and you're overwriting. And GetCharPtrCharacterSet also allocates memory.
Attachment #301853 -
Flags: review?(bienvenu) → review-
Assignee | ||
Comment 75•17 years ago
|
||
Sorry, messed up things with svn HEAD in previous patch, added freeing code.
But actually there were 'charSet' out string leaks in nsMsgDatabase::RowCellColumnToMime2DecodedString and nsMsgDatabase::RowCellColumnToCollationKey which didn't free allocated string. Fixed too.
Attachment #301853 -
Attachment is obsolete: true
Attachment #302087 -
Flags: superreview?
Attachment #302087 -
Flags: review?
Assignee | ||
Updated•17 years ago
|
Attachment #302087 -
Flags: superreview?(bienvenu)
Attachment #302087 -
Flags: superreview?
Attachment #302087 -
Flags: review?(bienvenu)
Attachment #302087 -
Flags: review?
Comment 76•17 years ago
|
||
Comment on attachment 302087 [details] [diff] [review]
Charset patch backport for MOZILLA_1_8_BRANCH #2
Can you use PR_Free instead of PR_FREEIF? The difference is that PR_FREEIF nulls out charset, which you don't need (they both check for null args, ultimately).
Thx!
Assignee | ||
Comment 77•17 years ago
|
||
Changed patch to use PR_Free instead of PR_FREEIF. Thank for the hints.
Attachment #302087 -
Attachment is obsolete: true
Attachment #302087 -
Flags: superreview?(bienvenu)
Attachment #302087 -
Flags: review?(bienvenu)
Attachment #302128 -
Flags: superreview?
Attachment #302128 -
Flags: review?
Assignee | ||
Updated•17 years ago
|
Attachment #302128 -
Flags: superreview?(bienvenu)
Attachment #302128 -
Flags: superreview?
Attachment #302128 -
Flags: review?(bienvenu)
Attachment #302128 -
Flags: review?
Comment 78•17 years ago
|
||
Comment on attachment 302128 [details] [diff] [review]
Charset patch backport for MOZILLA_1_8_BRANCH #3
thx for the patch!
Nominating for 1.8.1.13, though it should get some trunk baking.
Attachment #302128 -
Flags: superreview?(bienvenu)
Attachment #302128 -
Flags: superreview+
Attachment #302128 -
Flags: review?(bienvenu)
Attachment #302128 -
Flags: review+
Attachment #302128 -
Flags: approval1.8.1.13?
Comment 79•17 years ago
|
||
Comment on attachment 302128 [details] [diff] [review]
Charset patch backport for MOZILLA_1_8_BRANCH #3
clearing branch approval request for now (not denying) -- please re-request after a week or two of trunk testing.
Attachment #302128 -
Flags: approval1.8.1.13?
Comment 80•17 years ago
|
||
Comment on attachment 302128 [details] [diff] [review]
Charset patch backport for MOZILLA_1_8_BRANCH #3
That patch was baked on trunk for month with no visible regressions.
It fixes very old bug.
Attachment #302128 -
Flags: approval1.8.1.13?
Comment 81•17 years ago
|
||
Comment on attachment 302128 [details] [diff] [review]
Charset patch backport for MOZILLA_1_8_BRANCH #3
It's too late in the cycle to take this now. We'll look at it for 1.8.1.14.
Attachment #302128 -
Flags: approval1.8.1.14?
Attachment #302128 -
Flags: approval1.8.1.13?
Attachment #302128 -
Flags: approval1.8.1.13-
Comment 82•17 years ago
|
||
FWIW, I've tested this on my batch of pathologically-encoded emails and it seems to work sanely in all cases. I do have one bad message which looks worse with this patch than without, but that's because the header was hardcoded with ISO-8859-1 while the charset was ISO-2022-JP. Nice work!
Comment 83•17 years ago
|
||
Comment on attachment 302128 [details] [diff] [review]
Charset patch backport for MOZILLA_1_8_BRANCH #3
approved for 1.8.1.14, a=dveditz for release-drivers
Attachment #302128 -
Flags: approval1.8.1.14? → approval1.8.1.14+
Updated•17 years ago
|
Keywords: checkin-needed
Updated•17 years ago
|
Whiteboard: [checkin needed (1.8 branch)]
Comment 84•17 years ago
|
||
MOZILLA_1_8_BRANCH:
Checking in mailnews/base/util/nsMsgDBFolder.cpp;
/cvsroot/mozilla/mailnews/base/util/nsMsgDBFolder.cpp,v <-- nsMsgDBFolder.cpp
new revision: 1.257.2.27; previous revision: 1.257.2.26
done
Checking in mailnews/db/msgdb/src/nsMsgDatabase.cpp;
/cvsroot/mozilla/mailnews/db/msgdb/src/nsMsgDatabase.cpp,v <-- nsMsgDatabase.cpp
new revision: 1.355.4.15; previous revision: 1.355.4.14
done
Checking in mailnews/mime/src/mimemsg.cpp;
/cvsroot/mozilla/mailnews/mime/src/mimemsg.cpp,v <-- mimemsg.cpp
new revision: 1.55.28.1; previous revision: 1.55
done
Keywords: checkin-needed → fixed1.8.1.14
Whiteboard: [checkin needed (1.8 branch)]
Comment 85•17 years ago
|
||
This appears to have caused a regression - bug 425945
Comment 86•17 years ago
|
||
This patch has corrected the display of some message subjects and broken the display of others:
https://bugzilla.mozilla.org/show_bug.cgi?id=425945
I don't see why the message body's charset 'must' be applied to the headers (and I couldn't find an explanation skimming through the comments here) - please explain the reason.
Assignee | ||
Comment 87•17 years ago
|
||
In a few words message charset must be applied to the header fields because that's the only way to get messages with different charsets be displayed correctly in the one folder.
Comment 88•17 years ago
|
||
(In reply to comment #87)
But that's not true, the header can specify it's own encoding, e.g.
Subject: Message 4 =?windows-1255?Q?=E3=E1_=F9=F1_=E1=EE=F8=F7=E7?=
=?windows-1255?Q?=E4=5D?=
shows up as
Message 4 דב שס במרקחה]
with or without this patch, regardless of the charset of any of the mime parts.
Also, applying the body charset to the header doesn't make messages with different charset display correctly, it makes _some_ such messages display correctly while breaking the display of others...
I therefore ask that this bug be reopened pending further discussion.
Comment 89•17 years ago
|
||
Right. The message body's charset should be applied to headers ONLY if encoding isn't specified in the header encoding itself. I.e. only for "raw" headers, not encoded with base64 or QP.
Comment 90•17 years ago
|
||
I think that not even then. Since it's perfectly possible to specify a different charset for a header and for MIME parts, I see no good reason to assume raw non-ASCII headers should be interpreted using some MIME part's charset rather than the folder's default. If you were to make it a _preference_, that's something else, but that's not what you've done.
Alex: If you have some unruly non-standards-compliant messages you want to manipulate, write an extension or something, don't change the default behavior. Or how about this - why not create a folder with a different default charset and use a filter to put your messages there?
Comment 91•17 years ago
|
||
(In reply to comment #90)
> I think that not even then. Since it's perfectly possible to specify a
> different charset for a header and for MIME parts, I see no good reason to
> assume raw non-ASCII headers should be interpreted using some MIME part's
> charset rather than the folder's default.
This doesn't make sense either. This means the same email would be displayed differently depending on *your* default folder rather than making the assumption it's encoded using the same charset as the body of the email.
I get emails with different encodings quite often, and that's probably the best assumption the email client can do, IMHO. Filtering is a bad workaround, really.
Comment 92•17 years ago
|
||
(In reply to comment #91)
> This doesn't make sense either. This means the same email would be displayed
> differently depending on *your* default folder rather than making the
> assumption it's encoded using the same charset as the body of the email.
But that's exactly what per-folder default charsets are for.
> I get emails with different encodings quite often, and that's probably the
> best assumption the email client can do, IMHO.
But _I_ only get emails which this change screws up, so this change seems to me like a much worse assumption for the email client make. And the reason I said it's _much_ worse is that at least you have the 'bad workaround' of filtering;
I don't don't _any_ workaround now. A workaround isn't that bad if people can use it without causing other people trouble.
If you need this change this bad, make it a preffed-off alternative.
Comment 93•17 years ago
|
||
I'm sorry, I spoke too soon. I do have an alternative, I can use the charset 'forcing' mechanism.
So I'm changing my request: Make the behavior change preffed-on.
Comment 94•17 years ago
|
||
Eyal, default charset don't work for about half of messages.
This patch will solve all encoding problems, if it applied only for raw headers with unknown encoding.
Eyal> "Since it's perfectly possible to specify a different charset for a header and for MIME parts,..."
It is possible, but if different charsets are actually used in different MIME-parts - its encodings always specified in these parts (in other case nobody will be able to read it), and in this case Thunderbird should use them. Our talks are about some old programs - most often web scripts - which correctly encode body, but forget to encode Subject (because Outlook Express applied body's encoding to headers and all non-Mozilla readers can read it!!). With this patch Thunderbird will 'emulate' Outlook Express / Windows Mail behavior, and all will be happy.
Assignee | ||
Comment 95•17 years ago
|
||
(In reply to comment #88)
> (In reply to comment #87)
> But that's not true, the header can specify it's own encoding, e.g.
>
> Subject: Message 4 =?windows-1255?Q?=E3=E1_=F9=F1_=E1=EE=F8=F7=E7?=
> =?windows-1255?Q?=E4=5D?=
>
> shows up as
>
> Message 4 דב שס במרקחה]
Eyal, please read carefully. ...'must be applied to non-MIME fields'...
You're talking about MIME encoded subject. Fields like in your example have priority and will be displayed correctly anyway.
Comment 96•17 years ago
|
||
(In reply to comment #88)
> Also, applying the body charset to the header doesn't make messages with
> different charset display correctly, it makes _some_ such messages display
> correctly while breaking the display of others...
It improves display of messages with _known_ charset (koi8-r, windows-1251, etc
etc). 'iso-8859-I' isn't known, perhaps it should be made an alias to
'iso-8859-8-I'.
(In reply to comment #90)
> Alex: If you have some unruly non-standards-compliant messages you want to
> manipulate, write an extension or something, don't change the default behavior.
First, there are just too many of these 'unruly' messages out there. Yeah,
they do not comply to RFC, but it's not MUA's job to enforce these -- its job
is to let users read their mail in their language. The user won't go seeking
the magic extension, she'll just won't use the software -- especially when the
well-known alternative (MS Outlook) handles this just fine.
> Or how about this - why not create a folder with a different default charset
> and use a filter to put your messages there?
For each charset that you may receive mail in? Ridiculous.
Comment 97•17 years ago
|
||
(In reply to comment #89)
> Right. The message body's charset should be applied to headers ONLY if encoding
> isn't specified in the header encoding itself. I.e. only for "raw" headers, not
> encoded with base64 or QP.
Wait a second - you are applying some random MIME part charset to perfectly valid ASCII header? This surely will fail terribly if the applied charset is not "ASCII-synced" like eg. Unicode is, for instance if the MIME part encoding is EBCDIC...
Comment 98•17 years ago
|
||
(In reply to comment #96)
> 'iso-8859-I' isn't known, perhaps it should be made an alias to
> 'iso-8859-8-I'.
I did wonder about that when reading earlier comments. Do we have any idea how common it is and whether it only occurs when iso-8859-8-I is intended? In theory it might also be an error for iso-8859-6-I.
If it only occurs in mails from a particular forum, it's presumably a configuration error, and would be better handled by reporting to the forum admins.
Comment 99•17 years ago
|
||
Karsten> Wait a second - you are applying some random MIME part charset to perfectly valid ASCII header?
No. ASCII is really perfectly valid and shouldn't be touched by this patch. Raw headers (Cyrillic, Unicode, etc) has non-ASCII 8-bit chars.
if(Is8bit(header))apply_patch();
Comment 100•17 years ago
|
||
I'd like to add a piece of info for those who aren't aware: Many MIME part body
(as opposed to header) charset issues, and practically all such issues relevant
to email in Hebrew and Arabic, are handled by an extension I co-wrote, which
does further auto-detection work over Mozilla's, and also takes care of
mixed-charset cases:
https://addons.mozilla.org/en-US/thunderbird/addon/310
so for users like me, the charset defaults for folders effectively 'strongly'
apply only to the subjects and not anything else...
Sergey Svishchev: If this is so much of a problem that you simply _must_ change
the default behavior, then you have to change it so as to accommodate what
currently works well and not break it. How? Do some auto-detection on the
subject line before deciding which charset to use for it; at least don't give
me subject lines which are all \uFFFD's are or all àéùé's (accented latin
letters) etc. Or think of something else. You can't just break some
non-RFC-compliant messages in favor of others (which, by the way, Outlook also
handles fine, and after your change you lose Outlook parity on).
Comment 101•17 years ago
|
||
Simon> If it only occurs in mails from a particular forum, it's presumably a
configuration error, and would be better handled by reporting to the forum
admins.
No. There are millions of such forums, CMS, etc-etc (including used on your site). For example, ALL PhpBB forums. It is impossible to ask all existend admins to fix their scripts.
Thunderbird users just wants to read their mail, not to fix entire Internet bugs.
Comment 102•17 years ago
|
||
(In reply to comment #101)
> For example, ALL PhpBB forums.
Hey, that's my example too :-) ... your change is messing up my messages from phpBB fora. Except mine have both the body charset wrong and the header charset unspecified.
Comment 103•17 years ago
|
||
(In reply to comment #100)
> I'd like to add a piece of info for those who aren't aware: Many MIME part body
> (as opposed to header) charset issues, and practically all such issues relevant
> to email in Hebrew and Arabic, are handled by an extension I co-wrote, which
> does further auto-detection work over Mozilla's, and also takes care of
> mixed-charset cases:
>
> https://addons.mozilla.org/en-US/thunderbird/addon/310
>
> so for users like me, the charset defaults for folders effectively 'strongly'
> apply only to the subjects and not anything else...
>
Your extension managed to fix subject lines in the message header display, but doesn't managed to fix them in list of messages in a folder, as you've said in Bug 425945 comment #0
So my guess, that either your extension should be fixed to take care of subject lines in the message list, or additional patch needed to display subject lines in message header and message list consistenly.
Comment 104•17 years ago
|
||
(In reply to comment #96)
> 'iso-8859-I' isn't known, perhaps it should be made an alias to
> 'iso-8859-8-I'.
I just tested it. I've added attachment from Bug 425945 comment #7 to Thunderbird Mail folder. Then I've opened %thunderbirdprogramfolder%/res/charsetalias.properties, found section:
# Aliases for ISO-8859-8-I
#
csiso88598i=ISO-8859-8-I
iso-8859-8i=ISO-8859-8-I
and added string:
iso-8859-i=ISO-8859-8-I
Then I've restarted Thunderbird and attachment from Bug 425945 comment #7 was shown properly.
Mozilla/5.0 (X11; U; Linux i686; ru; rv:1.8.1.14pre) Gecko/20080329 Thunderbird/2.0.0.14pre ID:2008032903
Updated•17 years ago
|
Product: Core → MailNews Core
You need to log in
before you can comment on or make changes to this bug.
Description
•