Closed Bug 68394 Opened 24 years ago Closed 23 years ago

Non-ASCII headers sent as unicode when quoted-printable encoding turned off and mail.strictly_mime_headers is false.

Categories

(MailNews Core :: Internationalization, defect)

defect
Not set
critical

Tracking

(Not tracked)

VERIFIED FIXED
mozilla0.9.3

People

(Reporter: graf, Assigned: nhottanscp)

References

Details

(Keywords: imap-interop, intl, Whiteboard: PDT+)

Attachments

(2 files)

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux 2.2.16 i686; en-US; 0.8) Gecko/20010209
BuildID:    2001020908

If you turn off encoding 8-bit as quoted-printable and opt for the "as-is"
format, the headers of the message in KOI8-R get sent as unicode, thus garbling
the subject line.

Reproducible: Always
Steps to Reproduce:
1. Set your settings to "send as-is".
2. Open new message.
3. Type in a subject in koi8-r Russian subset.
4. Type in some koi8-r Russian body.
5. Send.

Actual Results:  When message arrives, the subject is in Unicode instead of
KOI8-R (well, I think it is unicode, since it LOOKS like unicode). The body is
preserved correctly.

Expected Results:  Both subject and body should be in KOI8-R 8-bit.

Some Russian newsgroups prohibit posting in quoted-printable format, therefore
this is a show-stopper for many of us.
It is, in fact, unicode. Quick test with iconv confirmed that KOI8-R in the
subject arrives as UTF-8.
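The iconv check described above can be reproduced with a short Python sketch. The sample subject text is made up for illustration; only the two charsets (KOI8-R expected, UTF-8 actually sent) come from this report.

```python
# Illustrative sketch, not taken from the bug: what a reader configured for
# KOI8-R sees when the Subject bytes were actually sent as UTF-8.
subject = "Привет"  # hypothetical sample subject

sent_bytes = subject.encode("utf-8")       # what arrives on the wire per the report
expected_bytes = subject.encode("koi8-r")  # what the reporter expected to arrive

# A KOI8-R reader treats every byte as one character, so the two-byte
# UTF-8 sequences turn into unrelated symbols.
seen_by_reader = sent_bytes.decode("koi8-r")

# The iconv-style confirmation: the wire bytes decode cleanly as UTF-8.
recovered = sent_bytes.decode("utf-8")

print(len(expected_bytes), len(sent_bytes))
print(seen_by_reader != subject, recovered == subject)
```

The body survives because it is converted to KOI8-R before sending; only the header bytes stay in UTF-8.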
Can someone PLEASE take a look at this? Changing to "Internationalization" in
hopes this will attract more attention.
Component: MIME → Internationalization
Can someone PLEASE take a look at this? This is a really nasty bug. I can't use
Mail and News because of it, and neither can pretty much all other Cyrillic
users. Also, I am not certain this problem is limited to Cyrillic -- it's
possible that it affects other charsets as well.
Severity: major → critical
Konstantin, I am not able to confirm the problem with 2/28/2001 Win32 build.
I have one question about the report.

Q: Turning off "Quoted Printable" only applies to the body of a message. Not to the headers.
    There is no UI for turning off MIME-encoding of headers in Mozilla. (You should be able to
    do this in prefs.js, however.)
    So it seems that there is some factor missing in your report. 
   
Request: Can you send a problem test message in KOI8-R to people in the CC line of 
             this bug report? Thank you.
Changed QA contact to marina.
QA Contact: esther → marina
OK, I sent two messages with KOI8-R in both of them to people in CC. One message
is with Quoted-Printable turned off, another with it turned on. You should see
that in the message with it turned off the headers are Unicode.

You are correct about your other suggestion -- I in fact do have
"strictly_mime_headers" set as false in prefs.js. This is something that Mozilla
acquired when it was converting my Netscape profile. Changing the summary to
reflect this.
Summary: cyrillic headers sent as unicode when quoted-printable encoding turned off → cyrillic headers sent as unicode when quoted-printable encoding turned off and mail.strictly_mime_headers is false.
Is this a Linux-specific problem, or does it happen on all platforms?
Confirm.
Looks like we are sending UTF-8 which we use for internal data processing.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: intl
OS: Linux → All
Hardware: PC → All
Summary: cyrillic headers sent as unicode when quoted-printable encoding turned off and mail.strictly_mime_headers is false. → Non-ASCII headers sent as unicode when quoted-printable encoding turned off and mail.strictly_mime_headers is false.
The problem is probably cross-platform.
The problem should show up for all non-ASCII
headers, not just for Cyrillic. Changed the summary line.

The line to use in prefs.js should be:

user_pref("mail.strictly_mime_headers", false);
I second Kat's comments: with the user_pref("mail.strictly_mime_headers", false)
line in the prefs.js file and "send message as is" left unchecked, all non-ASCII
characters in the headers have this display problem. It is cross-platform as well.
  

*** Bug 69791 has been marked as a duplicate of this bug. ***
OK, I think I know where the problem is:

mailnews/compose/src/nsMsgCompUtil.cpp, function mime_generate_headers() calls
the nsMsgI18NEncodeMimePartIIStr() function and adds the result of the function
to the headers. However, if strictly_mime_headers is set to false,
nsMsgI18NEncodeMimePartIIStr() function does this:

(mailnews/base/util/nsMsgI18N.cpp:386)

  // No MIME, just duplicate the string.
  if (PR_FALSE == bUseMime) {
    return PL_strdup(header);
  }

The returned string is Unicode, while we want it to be the client's preferred
outgoing charset instead. I think that the easiest solution would be to call one
of the ConvertFromUnicode functions before returning the header back. This fixes
the bug, but I don't know whether this breaks anything else (should it?).

I've tried to do this myself, but I got lost in all the unicode converting
functions. :) 
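The fix proposed above can be sketched in Python. This is an illustration only, not the Mozilla patch: the function name `encode_header_value` and its parameters are hypothetical, and Python's standard `email.header.Header` stands in for the MIME-encoding path.

```python
from email.header import Header


def encode_header_value(header_utf8: bytes, charset: str, use_mime: bool) -> bytes:
    """Hypothetical stand-in for the header-encoding step (not Mozilla's code)."""
    text = header_utf8.decode("utf-8")  # internal data is UTF-8, per the thread
    if use_mime:
        # MIME on: emit RFC 2047 encoded-words, which are pure ASCII.
        return Header(text, charset).encode().encode("ascii")
    # MIME off: rather than duplicating the UTF-8 bytes unchanged (the bug),
    # convert them to the outgoing charset so the header matches the body.
    return text.encode(charset)


internal = "Тест".encode("utf-8")  # hypothetical sample subject
raw = encode_header_value(internal, "koi8-r", use_mime=False)
mime = encode_header_value(internal, "koi8-r", use_mime=True)
print(raw)
print(mime.decode("ascii"))
```

In this sketch, a character that the outgoing charset cannot represent makes `text.encode(charset)` raise `UnicodeEncodeError`; how a mail client should handle that case is outside what this bug discusses.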
USEFOR is recommending UTF8 for all news headers, so if it's 8 bits it *should* 
be unicode.

In the future expect to have UTF8 newsgroup names, and if the Newsgroups header 
is "converted" (either QP or local charset), posting will fail.
> USEFOR is recommending UTF8 for all news headers, so 
> if it's 8 bits it *should* be unicode.

What is the status of this idea? Is this an RFC, a WWC recommendation, or some
other document? Please provide
the info so that we can evaluate the recommendation.

> In the future expect to have UTF8 newsgroup names, 
> and if the Newsgroups header is "converted" 
> (either QP or local charset), posting will fail.

Once again, we need specifications for this. Please
provide the URL for documentation.

By the way, what we do for this particular bug
is not directly related to the issues you're raising.
We should take up this UTF-8 recommendation in
another bug.
This is a usability issue above all, as many oddball gated newsgroups do not
allow QP in the headers (fido, for example). Besides, QP is currently not
understood correctly by many web-interface gates, which makes subjects
completely unreadable. Lastly, if there is an option to turn off mime-encoding
of headers, then the header should be in the same charset as the body, not
Unicode in the header and some local charset elsewhere. I am unaware of any
clients that will distinguish between the header charset and the body charset --
if you specify that the body is "content-type: text/plain; charset=koi8-r", then
every client I know of will show the 8-bit information in the header as being
koi8-r. It is just logical.
Konstantin, as I said above, let's not deal with
UTF-8 issues in this bug. We should fix this bug
so that when the strictly MIME option is OFF, the header
goes out in 8-bit characters of the user's chosen
encoding, which will match the body encoding.
That is the basic case we should take care of.
USEFOR is the working group writing the replacement for RFC 1036.
The latest draft is at:
<http://www.landfield.com/usefor/drafts/draft-ietf-usefor-article-04.txt>
I have the same problem as Konstantin because of this bug - I have trouble
posting to gated FIDO groups that want the header to be in koi8-r.
Keywords: interop, mozilla1.0
CC'ing andrei.
I have filed a separate bug, 91112, for having the Newsgroups header be
UTF8 regardless of local charset or the MIME encoding preference.
Let me take this, I have a patch (changed the code mentioned by Konstantin
Riabitsev 2001-06-23 16:09).
R=ducarroz
Reassign to nhotta.
Assignee: ducarroz → nhotta
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla0.9.3
*** Bug 90929 has been marked as a duplicate of this bug. ***
Let's not use the word 'header' when we actually mean the 'subject line'.
It is easily confused with the 'message header', of which Subject: is one
of the fields.

Now, this bug is about... what?
Andrei, we usually use "header" in several senses: header in the thread,
header in the message body. This bug is about making sure that when the strictly
MIME option is OFF, the header goes out in 8-bit characters in the encoding the
user has chosen, which will match the body encoding.

You mean the whole message header?
Subject is not the only header which contains 8-bit information on many
users' computers. The other headers would be From, To, Keywords, Reply-To, etc.
They all need to go out in the same charset as the Content-Type, not Unicode.
sr=bienvenu
Checked in to the trunk.
Yes! It's fixed! Yowzat!

/me does a hula dance.

Thanks to all!
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
You mean that I can now post again messages in french newsgroups with a subject
line containing a "é" or a "à" and not get flames in return ? :-)

YAY !!!! Thanks a lot !
Too bad; this should have been checked into
the nsbranch in time for the next major NS release.
Does anyone want to agitate for that?
There are still a couple of days left. If it doesn't get to the branch it will
not be in 6.1. I strongly believe this is a must for the whole 'non-Western'
community. Kat, I think we should do everything possible to convince the PDT team
to accept it. Points to mention: relatively low risk; what it did without the
patch was just wrong and non-compliant with the docs; the impact of not fixing it
is repelling non-English newsgroup readers/writers from the product (and this
basically means the whole world outside North America, the UK and Australia/NZ).
Ok. After talking with andrei over email, I have
decided to lobby for this bug in the branch.
Here are some reasons:

1. We got caught in the changing standards. We had hoped
   that the Mail and News header standards would be the
   same, i.e. MIME-encoded headers.
   But alas, both the trend and the News article
   standard are moving toward non-MIME 8-bit headers
   while the Mail header remains MIME-encoding.

2. In Europe and much of Asia except Japan, many many news
   articles use raw 8-bit characters. This will 
   continue to be the case into the future.

3. Because of this bug, Mozilla users were not able to
   send **readable** news articles headers in their own 
   languages. They get complaints from other news readers
   that Mozilla sends garbage in the header line.

4. People use a variety of means to access news articles, i.e.
   special news reading programs, web-based News reading
   services, and traditional Mail programs such as
   Mozilla/Netscape Mail/News, Communicator Mail/News,
   and Outlook Express Mail/News.

   The users of special news programs and web-based
   news reading services often don't have access to
   MIME-compliant programs.

   This is why most users of news in many languages
   prefer 8-bit headers. And not having this bug
   fix in the Branch means that users will not
   want to switch to Mozilla for news because 
   we mangle such 8-bit headers without the fix.

5. Users of Communicator 4.x are used to being able
   to send 8-bit news headers. Outlook Express offers
   a UI option to turn off MIME encoding for news
   and/or mail. International users are used to
   having this functionality in other programs.

6. This bug has been fixed on trunk and it is working now.
   This seems to be low-risk and provides maximum benefits
   to non-ASCII users in Europe, Asia, and Latin America.

   
PDT+.  Please get extra testing going.  If there are any problems this may get
backed out.
Whiteboard: PDT+
Re-opening it so that the fix can get checked into
the branch. Please resolve it again when the checkin
is done.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Naoki, please take care of this.
Keywords: nsBranch
glazman reported running into a problem with this patch.  Please get details
from him before proceeding.
It was my mistake : the only problem I see with this bug is that setting
"mail.strictly_mime_headers" to false manually in prefs is not something that
an average user can be expected to do... The bare minimum for checking this into
the branch seems to me to be the addition of a Release Note.
Keywords: nsBranch
Confirming : tested on windows and linux; proposed patch works very well
+--------------+
|IF AND ONLY IF| the following line is added to prefs.js
+--------------+

user_pref("mail.strictly_mime_headers", false);
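For testers who would rather script the edit than do it by hand, a minimal sketch follows. The helper `ensure_pref` is hypothetical and operates on the text of prefs.js only; finding the profile directory and writing the file back are left out. It assumes the application is closed while the file is edited, since a running Mozilla may overwrite prefs.js on exit.

```python
PREF_NAME = "mail.strictly_mime_headers"


def ensure_pref(prefs_text: str, name: str, literal: str) -> str:
    """Return prefs_text with user_pref(name, literal) present exactly once."""
    wanted = f'user_pref("{name}", {literal});'
    prefix = f'user_pref("{name}",'
    lines = prefs_text.splitlines()
    found = False
    for i, existing in enumerate(lines):
        if existing.strip().startswith(prefix):
            if found:
                lines[i] = None      # drop duplicate settings of the same pref
            else:
                lines[i] = wanted    # overwrite the first existing setting
                found = True
    lines = [line for line in lines if line is not None]
    if not found:
        lines.append(wanted)
    return "\n".join(lines) + "\n"


# "example.other_pref" is a made-up pref used only as filler content.
sample = 'user_pref("example.other_pref", 1);\n'
updated = ensure_pref(sample, PREF_NAME, "false")
print(updated, end="")
```

Running the helper a second time on its own output leaves the text unchanged, so it is safe to apply repeatedly.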

Daniel, there will be an RFE and proposal coming soon to
create UI for the MIME on/off options.

Many Communicator users are already aware of the
prefs.js line and have it in their profiles.

When they migrate 4.x profiles, that line gets migrated
and then they see this bug. There will be a release
note item and I will write one with concise instructions.
Once the info is available about this option, then
people who need it can use it. 
With this bug not fixed, the user simply does not
have the option to send 8-bit headers except in UTF-8.
The need for this bug fix in the branch is not diminished
for that reason.
Please help test this fix!

Here are the things to look for when testing this fix:

0. Locate your profile directory and edit the file,
   prefs.js. Insert the following line:

   user_pref("mail.strictly_mime_headers", false);

1. With the above prefs.js line inserted, we can send 8-bit headers for all
major encodings correctly. Make sure that when this line is in the prefs.js, the
Pref item, "Edit | Prefs | Mail & Newsgroup | Msg Composition | Composing Msgs |
(turn on Quoted Printable header)" is checked OFF.

2. Then start a new Compose window, try different encodings and then send out
Subject line containing 8-bit characters of that lang script/encoding.
(Copy/paste the same text into the body for comparison.)

3. Receive the test msg and see if you can view the Subject line and the Body
line correctly -- they should display the same. 

Please note that Mozilla/NS will not display headers correctly unless the raw
8-bit headers match the current encoding. You need to change "View | Character
Coding" values for testing various encodings.

Note: If you have set the default encoding to "X" and the subject header
is in that encoding, it should display OK.
The same is true if you set the "View | Folder Character
Coding" to the one that matches the encoding of raw 8-bit
headers.
  
4. Additional things to check:

a. You may write your names in 8-bit characters, e.g. Andrè <andre@netscape.net>
in the TO and CC fields or via the permanent setting in "Edit | Mail & News
Account Settings | Identity | Your Name".

b. You may choose to write Organization names using 8-bit characters. --> See
"Edit | Mail & News Account Settings | Identity | Organization". 

c. All these headers need to come out OK.
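
The comparison in step 3 can also be made on the raw message source. The sketch below is a rough aid under stated assumptions: the helper names are hypothetical, the parsing is deliberately naive (CRLF line endings, no folded headers), and the two sample messages are constructed for illustration rather than captured from a real build.

```python
import re


def split_message(raw: bytes):
    """Split a raw message into (header_bytes, body_bytes) at the first blank line."""
    head, _sep, body = raw.partition(b"\r\n\r\n")
    return head, body


def header_value(head: bytes, name: str) -> bytes:
    """Return the raw bytes of the first header called `name` (no unfolding)."""
    prefix = name.lower().encode("ascii") + b":"
    for line in head.split(b"\r\n"):
        if line.lower().startswith(prefix):
            return line[len(prefix):].strip()
    return b""


def declared_charset(head: bytes) -> str:
    """Read the charset parameter of Content-Type, defaulting to us-ascii."""
    content_type = header_value(head, "Content-Type")
    match = re.search(rb'charset="?([A-Za-z0-9_.:-]+)"?', content_type, re.IGNORECASE)
    return match.group(1).decode("ascii").lower() if match else "us-ascii"


def subject_matches_body(raw: bytes, expected_text: str) -> bool:
    """True if Subject and first body line both decode to expected_text."""
    head, body = split_message(raw)
    charset = declared_charset(head)
    subject = header_value(head, "Subject").decode(charset, errors="replace")
    first_line = body.split(b"\r\n", 1)[0].decode(charset, errors="replace")
    return subject == expected_text == first_line


text = "Тест"  # hypothetical test string pasted into both Subject and body
tail = (b"\r\nContent-Type: text/plain; charset=koi8-r"
        b"\r\nContent-Transfer-Encoding: 8bit\r\n\r\n" + text.encode("koi8-r") + b"\r\n")
fixed_msg = b"Subject: " + text.encode("koi8-r") + tail   # header in body charset
buggy_msg = b"Subject: " + text.encode("utf-8") + tail    # header left in UTF-8

print(subject_matches_body(fixed_msg, text), subject_matches_body(buggy_msg, text))
```

The same check applies to From, To, Organization and the other headers listed in step 4 by passing a different header name to `header_value`.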



The question is : are we trying to touch new users or not ? if the answer is yes,
then a release note is needed because these people may not have a communicator
profile or may have deleted it (yes, some people *have* removed 4.x from HD...).
> The question is : are we trying to touch new users or not ? 

Yes. We need to have this relnote item be spread among news
posting community until we can land a UI option.
> Yes. We need to have this relnote item be spread among news
> posting community until we can land a UI option.


Absolutely !!! As soon as the version is released, I will post a (if not "the")
french translation of the release note into fr.usenet.* and 
fr.comp.infosystemes.navigateurs to let people know.
Checked in on the branch at selmer's request.
marking RESOLVED FIXED for jag
Status: REOPENED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Do we really need the UI option for "mail.strictly_mime_headers"? I thought this
was important because migrated profiles might have the background pref set to
non MIME.
NS 6.0 and later use a charset label in the MIME header for message view. If
non-MIME headers are used, the user may have to set the charset manually in
order to view the headers correctly. This is especially a problem when more than
one charset is used in one newsgroup (e.g. koi8-r and windows-1251).
Are there any examples of newsgroups for which non-MIME headers are recommended,
or of popular news readers which do not support MIME headers?
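
The ambiguity between koi8-r and windows-1251 can be illustrated with a short Python sketch (the sample word is made up). Raw 8-bit header bytes carry no charset label, and the same bytes decode without error under both charsets, so a reader has to guess or be told which one to use.

```python
# Raw 8-bit header bytes as they would appear with MIME encoding turned off.
raw_subject = "Привет".encode("koi8-r")  # hypothetical sample text

as_koi8r = raw_subject.decode("koi8-r")   # the sender's intended reading
as_cp1251 = raw_subject.decode("cp1251")  # Python's codec name for windows-1251

# Both decodes succeed and both yield Cyrillic letters, but only one is right.
print(as_koi8r == "Привет", as_cp1251 == "Привет")
```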

See comments from Konstantin Riabitsev 2001-06-25 13:10 - some gated FIDO groups
require the header to be in koi8-r without any QP or other MIME coding.

See also http://www.fido7.ru/roadmap.html (Russian) or
http://www.fido7.ru/roadmap.html.en 
Thanks for the info. I think more information has to be collected before
deciding whether the UI is needed or not. Anyway, if anyone thinks the UI is
needed, please file a separate bug.
It should also be noted that most web-interfaces do not parse QP-headers
correctly at all. Google has only recently learned how to do it, and many others
are simply unaware of this issue. Also, some may parse the headers for
display, but will not do so internally, so searching on QP-headers will fail.

Overall, QP in the headers is generally frowned upon in the non-ascii world.
Addressing the problem of correct or incorrect display of the Subject in the
tree view -- if someone is posting in the wrong charset (e.g. not the de-facto
koi8-r), that's their problem entirely, and as long as mozilla displays the
subject correctly in the message view (by looking at Content-type charset), then
it's no big deal. It happens all the time with other newsreaders,
and people who post in hostile charsets suffer the consequences of their actions
anyway. :)
marina, pls verify.
verifying with 2001-07-24-0.9.2 build. Inserting
user_pref("mail.strictly_mime_headers", false); with QP in the Prefs turned off,
the headers are displayed correctly!

Status: RESOLVED → VERIFIED
Product: MailNews → Core
Product: Core → MailNews Core