Non-MIME-encoded headers do not display correctly with the view default charset

VERIFIED FIXED in M18

Status

MailNews Core
MIME
P3
major
VERIFIED FIXED
18 years ago
10 years ago

People

(Reporter: Katsuhiko Momoi, Assigned: Scott MacGregor)

Tracking

Trunk
x86
Windows NT

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [nsbeta2-][7/13] [nsbeta3+])

Attachments

(3 attachments)

(Reporter)

Description

18 years ago
** Observed with 5/31/2000 Win32 build **

I don't know if there is a bug filed on this yet. I think we talked
about this before, however.
When you set the View Default Charset, you should be able to 
display a non-MIME-encoded header correctly if the encoding
of the header matches that of the default viewing.
Currently, this is not the case. You need to re-set the 
menu to ISO-2022-JP to display the header correctly if you
have a non=MIME-encoded Japanese header.
(Reporter)

Comment 1

18 years ago
Created attachment 9396 [details]
A mailbox containing a msg with non-MIME-encoded JPN header
(Reporter)

Comment 2

18 years ago
Put the attachment file in the Local Folder.
Set the Default Viewing Charset to Japanese (ISO-2022-JP)
and then try to display the message, 

1. Body will show correctly. (There is no MIME-charset info in the msg,)
2. Header pane will not show correctly until the charset is engaged.
3. The header will not show correctly in the msg envelope 
    even when the charset menu is manually corrected.

Rich, is this your bug ? 

Comment 3

18 years ago
Actually, I think this is mscott's. This is the emitter/broadcaster interaction 
with non-mime encoded headers. 

- rhp
(Reporter)

Comment 4

18 years ago
QA contact to myself.
QA Contact: lchiang → momoi
(Assignee)

Comment 5

18 years ago
Kat, do you think this is critical enough that it should be nominated for beta2?
(Reporter)

Comment 6

18 years ago
I would like to for the following reasons.

1. We support display of non-labeled body now and it would look 
   strange that we cannot display unlabeled headers. Normal users
   would wonder why this is so and it will certainly incovenience
   him/her.
2. In newsgroups of various languages, we encounter this situation
   quite often. This is not addressing a far-fetched scenario. It is
   every day mail/news reading that is hampered by this bug. View Default
   charset lends such powerful friendly help to resolve this
   situation for users.

3. Applicability of this fix is much much wider than just a few 
   languages. All Asian languages, for sure, and then all these
   European language news articles.

Nominating for nsbeta2.
Keywords: nsbeta2

Comment 7

18 years ago
Checking with msanz and bobj for PDT call.
Whiteboard: [NEED INFO]

Comment 8

18 years ago
Putting on [nsbeta2-] radar.  If critical to beta, let us know.
Whiteboard: [NEED INFO] → [nsbeta2-]
(Reporter)

Comment 9

18 years ago
This is a bug where for some reason, the default setting is
not used if the folder charset is not set. 
So we would run into this problem in 2 types of cases:

1. On a new profile, no folder property has been set to Inbox.
2. When you create a new folder, there is no folder
   charset set.

You see the body correctly and the problem looks very strange
to most users that we don't do this right and not show Japanese
or other non-Latin 1 characters even when the View default is set.

From a product point of view, this is a very important feature 
in all non-Latin 1 markets. It should work without the user
doing anything in localized products. This bug prevents that
from happening. 
If I were running the show, I would not let PR2 go out the door
without fixing this one. You want the maximum impact of
introducing great mail features in ANY market in the world --
you tell users to set the Viewing Default and voila, it works
for a large majority of users without any more steps whether or not 
the messages contain MIME charset info. 1 setting and no more tears.
That's what I call very good User Experience.

I strongly urge the PDT committee committee to rethink this. 
I bet the fix isn't tha hard to find, either.
(Reporter)

Comment 10

18 years ago
I'm going to re-nominate this for nsbeta2.
I've stated the detailed reasons above. 
As an additional thought, let me say this.

We have this new wonderful pref UI called "Message View Default Character Coding"
and this pref does not function at all. Its intended use to help novice/average users
is thwarted by this bug. I'm saying that the effect is not 40% working, or 60% working.
The pref effectiveness at this point with this bug is 0%.
Whiteboard: [nsbeta2-]

Comment 11

18 years ago
Putting on [NEED INFO] radar.  Sending to I18l folks for PDT call.
Whiteboard: [NEED INFO]

Comment 12

18 years ago
Putting on [nsbeta2+] radar for beta2 fix. 
Whiteboard: [NEED INFO] → [nsbeta2+]

Comment 13

18 years ago
moving to M17.
Target Milestone: --- → M17

Comment 14

18 years ago
If this isn't fixed by 7/13 we are going to cut from nsbeta2.
Whiteboard: [nsbeta2+] → [nsbeta2+][7/13]
(Assignee)

Comment 15

18 years ago
from the pdt meeting yesterday I've been told that I18N is willing to cut this.
Also given our new nsbeta2 criteria we don't believe we would pull the beta off
the wire for this. Marking nsbeta2-.

nominating for nsbeta3.
Keywords: nsbeta3
Whiteboard: [nsbeta2+][7/13] → [nsbeta2-][7/13]
Target Milestone: M17 → M18

Comment 16

18 years ago
*** Bug 32901 has been marked as a duplicate of this bug. ***

Updated

18 years ago
Whiteboard: [nsbeta2-][7/13] → [nsbeta2-][7/13] [nsbeta3-]
(Reporter)

Comment 17

18 years ago
Per i18n/mail triage meeting, this bug is marked as
[nsbeta3+]. 

mscott may want to look into the possibility of combining
this bug with Bug 43389 or fixing it in the process of 
fixing that one.

This bug is to make it so that the default view charset
selection in the pref will be applied to message envelope
headers in case the MIME charset info is lacking in the
them.
(This is the remaining problem in this bug.)
Whiteboard: [nsbeta2-][7/13] [nsbeta3-] → [nsbeta2-][7/13] [nsbeta3+]
(Assignee)

Comment 18

18 years ago
I'm not quite sure how these bugs get assigned to me =). I'm probably going to
need Naoki's help here.

Looks like the reason headers don't render properly is because  of the following
routine in libmime:
mime_convert_rfc1522

This function is called with a header value. It currently just calls:
MIME_DecodeMimePartIIStr to decode the string and gets a charset for that string.

Then we turn around and call MIME_ConvertString from said charset to UTF-8. And
then mime continues on it's merry way.

According to Kat, this method is implemented incorrectly. I *think* 2 things
need to be changed.

1) If we are using an override charset to display the message and
MIME_DecodeMimePartIIStr doesn't return a charset then we should use the over
ride charset as the input charset before converting to UTF-8. Right now it looks
like we'll use NULL (or whatever MIME_DecodeMimePartIIStr gave us).

2) If MIME_DecodeMimePartIIStr didn't give us a charset and we aren't using an
over ride charset then we should use the default viewing charset pref.

Does this sound like the correct behavior to everyone?

Note: MIME_DecodeMimePartIIStr returns NULL (I think) if the header isn't a mime
encoded header.

Comment 19

18 years ago
>According to Kat, this method is implemented incorrectly. 
Momoi san, could you explain what's implemented incorrectly?
(Assignee)

Comment 20

18 years ago
Assuming my assumptions above are correct, I have a new version of
MimeHeaders_convert_rfc1522 which works much better for me for these test cases.
Except for one problem!

Naoki: MIME_DecodeMimePartIIStr ALWAYS returns a charset even if the header
isn't mime encoded. In the Ja example Kat gave me in 43389, there's a non-mime
encoded shift_jis header. This routine always tells me the charset is ISO-8859-1.

There's no way for me to cleanly step in with another charset because I think I
got a valid charset from MIME_DecodeMimePartIIStr. Should this routine be
returning null if the header is not mime encoded?

If I force my new code to be executed then this bug and 43389 now work.
(Assignee)

Comment 21

18 years ago
To answer Naoki's question. this is the routine that does the mime header
conversion. According to Kat, we aren't doing the right thing in the case where
we are using the over ride charset or in the case where the header is not mime
encoded. Ergo, this routine is not doing the right thing.

See my comments above for the proposed changes to fix this for this bug and the
other one. 

Comment 22

18 years ago
>According to Kat, we aren't doing the right thing in the case where
>we are using the over ride charset or in the case where the header is not mime
>encoded. 
That's because no override (and a default) charset was passed to libmime until 
Scott enabled the override.
(Assignee)

Comment 23

18 years ago
Right and now I have all this information. It works. Except
MIME_DecodeMimePartIIStr always says it extracted a charset from the header so I
never invoke this code.

I think we are dancing around the issue. I'll attach my patch which shows the
fix. This fix works for both these bugs except for the fact that
MIME_DecodeMimePartIIStr always returns a charset. So we never call the new code
to use one of these charsets when we convert to UTF-8. 
(Assignee)

Comment 24

18 years ago
Created attachment 12497 [details] [diff] [review]
proposed fix (has a bug that prevents it from working).

Comment 25

18 years ago
>There's no way for me to cleanly step in with another charset because I think I
>got a valid charset from MIME_DecodeMimePartIIStr. Should this routine be
>returning null if the header is not mime encoded?
Yes, you can check NULL return value to do the decision (inherited convention 
from the old code).
See, http://lxr.mozilla.org/seamonkey/source/mailnews/mime/src/comi18n.h#80
(Assignee)

Comment 26

18 years ago
*sigh* I feel like I've been having a hard time communicating with folks today
(not just this bug...just in general). Maybe I didn't eat my wheaties this
morning...

Noaki wrote:
Yes, you can check NULL return value to do the decision (inherited convention
from the old code).
See, http://lxr.mozilla.org/seamonkey/source/mailnews/mime/src/comi18n.h#80

Naoki, read my comments above. my problem is that I can't do this!
MIME_DecodeMimePartIIStr does not work this way. It ALWAYS ALWAYS returns a
valid header string and a charset. Even if the header is not mime encoded. If
the header is not mime encoded it always says the charset is ISO-8859-1 and
returns a copy of the input string.

I hope this gets us on the same page. If this worked then my patch would then work.

Comment 27

18 years ago
I looked at the code and there is inconsistency between the header file comment 
and the actual implementation.
A charset parameter is in and out. You can specify a default charset there. If 
the default is specified and the header string is not MIME encoded then NULL is 
returned. Sorry, about the confusion.

(Assignee)

Comment 28

18 years ago
Great!  That did the trick. Thanks Naoki. I'm going to attach a new patch that
has the complete fix now. Can you code review it for me so I can check it in
tonight? 
(Assignee)

Comment 29

18 years ago
Created attachment 12500 [details] [diff] [review]
final fix ready for code review

Comment 30

18 years ago
I have just started to review the patch but I want to confirm one thing.
With the patch, charset override is only used when the header does not have a 
MIME charset. This is a different behavior from body override where the user can 
override incorrect charset label specified in Content-Type header.
Do we agree on this behavior?

(Assignee)

Comment 31

18 years ago
Yup, that's how it works and I agree with you on that behavior. Override doesn't
really over ride a header that is mime type encoded.
(Reporter)

Comment 32

18 years ago
What would be wrong if we did apply an override to a
header which is MIME-encoded? Headers can be mis encoded
sometimes.
(Assignee)

Comment 33

18 years ago
If you do that, and the header is correctly encoded then you break viewing
messages using the over ride charset 'cause the headers would then be corrupted.
(we'd force the overrride charset on the headers instead of honoring the
encoding) I'm more apt to guess that using the over ride charset to view a
message is more common than having headers that are incorrectly mime encoded. 

Comment 34

18 years ago
When a MIME header is correctly encoded then the header can be viewed correctly 
so the user does not have to override.
I think the user does override when a header cannot be seen correctly.
There are two possible cases, either the header does not contain a MIME charset 
or the header is MIME encoded with a incorrect charset.
So, I think we also want to override incorrect MIME charset.
(Assignee)

Comment 35

18 years ago
I would think the user does the over ride because the body doesn't have a
charset on it (or the charset is set incorrectly).  That seems like the common
case to me. (But you guys know best not me). And if you implement over ride
charset such that it forces us to ignore mime encoding in the headers than I
think we'll have more complaints than success stories. Because then the body
will look right and the headers won't.

What's more common:
1) message bodies with incorrect or no charset information but with encoded mime
headers.
2) mime encoded headers with the wrong encoding information.

Comment 36

18 years ago
Well, I am not sure which is more common 1 or 2. Momoi san, any idea?
But usually headers and body charset match even a wrong charset are specified.
For example, I saw a Chinese newgroup message both the subject and the body were 
incorrectly labeled as "ISO-8859-1".
(Assignee)

Comment 37

18 years ago
Okay I'll let you guys figure out which behavior you want. I'd still like to
check this patch in when you are done reviewing....
(Reporter)

Comment 38

18 years ago
Is this something we can try out for a period?

1. Try out: application to both header and body
2. If we don't like it -- because most msgs have only one of
   them wrong, for example -- then we revert to appllying to
   only body.

My suspicion is that usually both are wrong at the same time. 
If not, that would be an unsual message -- for some reason,
the mail in question applied encoding X to headers but 
encoding Z to body. 
All we want is to cover the most common cases and 
settle for not being able to satisfy all possiblities.

Comment 39

18 years ago
Okay, the review is done. So, currently override is applied only if no MIME 
charset in a header. We may revisit this as Momoi san mentioned.

(Assignee)

Comment 40

18 years ago
Great!! Fixed checked in. We now honor the view default charset for headers in
the message pane if the header is non mime encoded.

And we can open a new bug if we think the presence of an overrride charset
should relace the charset in the encoded mime header if we run into that
scenario a lot. 
Status: NEW → RESOLVED
Last Resolved: 18 years ago
Resolution: --- → FIXED
(Reporter)

Comment 41

18 years ago
I'm going to pass this on to ji.
This seems to be working but I cannot spend
sufficient time to verify it well at this point in time.
QA Contact: momoi → ji

Comment 42

18 years ago
Verified with limux 2000092008 and win32 2000092005 build It's fixed.
Status: RESOLVED → VERIFIED
Product: MailNews → Core
Product: Core → MailNews Core
You need to log in before you can comment on or make changes to this bug.