924 bytes, text/plain
2.75 KB, patch
|Details | Diff | Splinter Review|
5.22 KB, patch
|Details | Diff | Splinter Review|
** Observed with 5/31/2000 Win32 build ** I don't know if there is a bug filed on this yet. I think we talked about this before, however. When you set the View Default Charset, you should be able to display a non-MIME-encoded header correctly if the encoding of the header matches that of the default viewing. Currently, this is not the case. You need to re-set the menu to ISO-2022-JP to display the header correctly if you have a non=MIME-encoded Japanese header.
Put the attachment file in the Local Folder. Set the Default Viewing Charset to Japanese (ISO-2022-JP) and then try to display the message, 1. Body will show correctly. (There is no MIME-charset info in the msg,) 2. Header pane will not show correctly until the charset is engaged. 3. The header will not show correctly in the msg envelope even when the charset menu is manually corrected. Rich, is this your bug ?
Actually, I think this is mscott's. This is the emitter/broadcaster interaction with non-mime encoded headers. - rhp
QA contact to myself.
QA Contact: lchiang → momoi
Kat, do you think this is critical enough that it should be nominated for beta2?
I would like to for the following reasons. 1. We support display of non-labeled body now and it would look strange that we cannot display unlabeled headers. Normal users would wonder why this is so and it will certainly incovenience him/her. 2. In newsgroups of various languages, we encounter this situation quite often. This is not addressing a far-fetched scenario. It is every day mail/news reading that is hampered by this bug. View Default charset lends such powerful friendly help to resolve this situation for users. 3. Applicability of this fix is much much wider than just a few languages. All Asian languages, for sure, and then all these European language news articles. Nominating for nsbeta2.
Checking with msanz and bobj for PDT call.
Whiteboard: [NEED INFO]
Putting on [nsbeta2-] radar. If critical to beta, let us know.
Whiteboard: [NEED INFO] → [nsbeta2-]
This is a bug where for some reason, the default setting is not used if the folder charset is not set. So we would run into this problem in 2 types of cases: 1. On a new profile, no folder property has been set to Inbox. 2. When you create a new folder, there is no folder charset set. You see the body correctly and the problem looks very strange to most users that we don't do this right and not show Japanese or other non-Latin 1 characters even when the View default is set. From a product point of view, this is a very important feature in all non-Latin 1 markets. It should work without the user doing anything in localized products. This bug prevents that from happening. If I were running the show, I would not let PR2 go out the door without fixing this one. You want the maximum impact of introducing great mail features in ANY market in the world -- you tell users to set the Viewing Default and voila, it works for a large majority of users without any more steps whether or not the messages contain MIME charset info. 1 setting and no more tears. That's what I call very good User Experience. I strongly urge the PDT committee committee to rethink this. I bet the fix isn't tha hard to find, either.
I'm going to re-nominate this for nsbeta2. I've stated the detailed reasons above. As an additional thought, let me say this. We have this new wonderful pref UI called "Message View Default Character Coding" and this pref does not function at all. Its intended use to help novice/average users is thwarted by this bug. I'm saying that the effect is not 40% working, or 60% working. The pref effectiveness at this point with this bug is 0%.
Putting on [NEED INFO] radar. Sending to I18l folks for PDT call.
Whiteboard: [NEED INFO]
Putting on [nsbeta2+] radar for beta2 fix.
Whiteboard: [NEED INFO] → [nsbeta2+]
moving to M17.
Target Milestone: --- → M17
If this isn't fixed by 7/13 we are going to cut from nsbeta2.
Whiteboard: [nsbeta2+] → [nsbeta2+][7/13]
from the pdt meeting yesterday I've been told that I18N is willing to cut this. Also given our new nsbeta2 criteria we don't believe we would pull the beta off the wire for this. Marking nsbeta2-. nominating for nsbeta3.
Whiteboard: [nsbeta2+][7/13] → [nsbeta2-][7/13]
Target Milestone: M17 → M18
*** Bug 32901 has been marked as a duplicate of this bug. ***
Per i18n/mail triage meeting, this bug is marked as [nsbeta3+]. mscott may want to look into the possibility of combining this bug with Bug 43389 or fixing it in the process of fixing that one. This bug is to make it so that the default view charset selection in the pref will be applied to message envelope headers in case the MIME charset info is lacking in the them. (This is the remaining problem in this bug.)
Whiteboard: [nsbeta2-][7/13] [nsbeta3-] → [nsbeta2-][7/13] [nsbeta3+]
I'm not quite sure how these bugs get assigned to me =). I'm probably going to need Naoki's help here. Looks like the reason headers don't render properly is because of the following routine in libmime: mime_convert_rfc1522 This function is called with a header value. It currently just calls: MIME_DecodeMimePartIIStr to decode the string and gets a charset for that string. Then we turn around and call MIME_ConvertString from said charset to UTF-8. And then mime continues on it's merry way. According to Kat, this method is implemented incorrectly. I *think* 2 things need to be changed. 1) If we are using an override charset to display the message and MIME_DecodeMimePartIIStr doesn't return a charset then we should use the over ride charset as the input charset before converting to UTF-8. Right now it looks like we'll use NULL (or whatever MIME_DecodeMimePartIIStr gave us). 2) If MIME_DecodeMimePartIIStr didn't give us a charset and we aren't using an over ride charset then we should use the default viewing charset pref. Does this sound like the correct behavior to everyone? Note: MIME_DecodeMimePartIIStr returns NULL (I think) if the header isn't a mime encoded header.
>According to Kat, this method is implemented incorrectly. Momoi san, could you explain what's implemented incorrectly?
Assuming my assumptions above are correct, I have a new version of MimeHeaders_convert_rfc1522 which works much better for me for these test cases. Except for one problem! Naoki: MIME_DecodeMimePartIIStr ALWAYS returns a charset even if the header isn't mime encoded. In the Ja example Kat gave me in 43389, there's a non-mime encoded shift_jis header. This routine always tells me the charset is ISO-8859-1. There's no way for me to cleanly step in with another charset because I think I got a valid charset from MIME_DecodeMimePartIIStr. Should this routine be returning null if the header is not mime encoded? If I force my new code to be executed then this bug and 43389 now work.
To answer Naoki's question. this is the routine that does the mime header conversion. According to Kat, we aren't doing the right thing in the case where we are using the over ride charset or in the case where the header is not mime encoded. Ergo, this routine is not doing the right thing. See my comments above for the proposed changes to fix this for this bug and the other one.
>According to Kat, we aren't doing the right thing in the case where >we are using the over ride charset or in the case where the header is not mime >encoded. That's because no override (and a default) charset was passed to libmime until Scott enabled the override.
Right and now I have all this information. It works. Except MIME_DecodeMimePartIIStr always says it extracted a charset from the header so I never invoke this code. I think we are dancing around the issue. I'll attach my patch which shows the fix. This fix works for both these bugs except for the fact that MIME_DecodeMimePartIIStr always returns a charset. So we never call the new code to use one of these charsets when we convert to UTF-8.
Created attachment 12497 [details] [diff] [review] proposed fix (has a bug that prevents it from working).
>There's no way for me to cleanly step in with another charset because I think I >got a valid charset from MIME_DecodeMimePartIIStr. Should this routine be >returning null if the header is not mime encoded? Yes, you can check NULL return value to do the decision (inherited convention from the old code). See, http://lxr.mozilla.org/seamonkey/source/mailnews/mime/src/comi18n.h#80
*sigh* I feel like I've been having a hard time communicating with folks today (not just this bug...just in general). Maybe I didn't eat my wheaties this morning... Noaki wrote: Yes, you can check NULL return value to do the decision (inherited convention from the old code). See, http://lxr.mozilla.org/seamonkey/source/mailnews/mime/src/comi18n.h#80 Naoki, read my comments above. my problem is that I can't do this! MIME_DecodeMimePartIIStr does not work this way. It ALWAYS ALWAYS returns a valid header string and a charset. Even if the header is not mime encoded. If the header is not mime encoded it always says the charset is ISO-8859-1 and returns a copy of the input string. I hope this gets us on the same page. If this worked then my patch would then work.
I looked at the code and there is inconsistency between the header file comment and the actual implementation. A charset parameter is in and out. You can specify a default charset there. If the default is specified and the header string is not MIME encoded then NULL is returned. Sorry, about the confusion.
Great! That did the trick. Thanks Naoki. I'm going to attach a new patch that has the complete fix now. Can you code review it for me so I can check it in tonight?
I have just started to review the patch but I want to confirm one thing. With the patch, charset override is only used when the header does not have a MIME charset. This is a different behavior from body override where the user can override incorrect charset label specified in Content-Type header. Do we agree on this behavior?
Yup, that's how it works and I agree with you on that behavior. Override doesn't really over ride a header that is mime type encoded.
What would be wrong if we did apply an override to a header which is MIME-encoded? Headers can be mis encoded sometimes.
If you do that, and the header is correctly encoded then you break viewing messages using the over ride charset 'cause the headers would then be corrupted. (we'd force the overrride charset on the headers instead of honoring the encoding) I'm more apt to guess that using the over ride charset to view a message is more common than having headers that are incorrectly mime encoded.
When a MIME header is correctly encoded then the header can be viewed correctly so the user does not have to override. I think the user does override when a header cannot be seen correctly. There are two possible cases, either the header does not contain a MIME charset or the header is MIME encoded with a incorrect charset. So, I think we also want to override incorrect MIME charset.
I would think the user does the over ride because the body doesn't have a charset on it (or the charset is set incorrectly). That seems like the common case to me. (But you guys know best not me). And if you implement over ride charset such that it forces us to ignore mime encoding in the headers than I think we'll have more complaints than success stories. Because then the body will look right and the headers won't. What's more common: 1) message bodies with incorrect or no charset information but with encoded mime headers. 2) mime encoded headers with the wrong encoding information.
Well, I am not sure which is more common 1 or 2. Momoi san, any idea? But usually headers and body charset match even a wrong charset are specified. For example, I saw a Chinese newgroup message both the subject and the body were incorrectly labeled as "ISO-8859-1".
Okay I'll let you guys figure out which behavior you want. I'd still like to check this patch in when you are done reviewing....
Is this something we can try out for a period? 1. Try out: application to both header and body 2. If we don't like it -- because most msgs have only one of them wrong, for example -- then we revert to appllying to only body. My suspicion is that usually both are wrong at the same time. If not, that would be an unsual message -- for some reason, the mail in question applied encoding X to headers but encoding Z to body. All we want is to cover the most common cases and settle for not being able to satisfy all possiblities.
Okay, the review is done. So, currently override is applied only if no MIME charset in a header. We may revisit this as Momoi san mentioned.
Great!! Fixed checked in. We now honor the view default charset for headers in the message pane if the header is non mime encoded. And we can open a new bug if we think the presence of an overrride charset should relace the charset in the encoded mime header if we run into that scenario a lot.
Status: NEW → RESOLVED
Last Resolved: 18 years ago
Resolution: --- → FIXED
I'm going to pass this on to ji. This seems to be working but I cannot spend sufficient time to verify it well at this point in time.
QA Contact: momoi → ji
Verified with limux 2000092008 and win32 2000092005 build It's fixed.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.