** Observed with 4/9/99 Win32 5.0 build **
Please reassign this to someone appropriate in Mail/News.
Here's how the problem occurs:
1. Prepare a mailbox with a number of msgs with Japanese headers. The headers should vary in length, anywhere from 5-6 characters to over 80 characters.
2. Erase the existing summary file, i.e. the ".msf" file.
3. Now start Messenger. This will re-create the summary of headers.
4. At this point examine the thread pane headers and see that they are displayed correctly.
5. Now quit Messenger and then re-start Messenger.
6. Examine the headers again. This time, some of the headers (particularly ones longer than a certain length) include extraneous characters, i.e. $0D$0A$09
7. There is actually no truncation of headers. It's just that these extra strings are inserted.
8. After this, you will see these extra strings until you delete the summary (".msf") file.
TM set to M5.
Question: Did you see $0D$0A$09 in .msf file or display?
The extraneous characters show up in the thread pane display.
Question: I think .msf is a plain text file. So, $0D$0A$09 were not in the .msf file, just confirming. $0D$0A$09 are CR, LF, and a tab. Where did you see those characters inserted? Did you see a line break after the first line and a tab in the following line? I think a header is displayed in one line and truncated if longer.
Now that we can examine the source for messages, I looked at it in the View pane and found that these extraneous characters are there in the MIME-encoded format. It seems that this problem occurs only with longer headers, and where there should be a line break, we are actually MIME-encoding these characters in the following way: =?iso-2022-jp?B?GyRCISMbKEI=?= This string represents the Japanese period plus the extraneous characters.
So, this is an encoding problem. 1) The header was generated by Mozilla, correct? 2) What do you actually see in the thread pane of Mozilla? 3) How is that displayed in the 4.5 thread pane?
Here's one example display in Japanese:
HTML: 今度はうまく行くでしょう。これは日本語のメールです。
This is what I see in the Communicator 4.5x thread pane, but under the current Mozilla, I see:
HTML: 今度はうまく行くでしょう。これ$0D$0A$09は日本語のメールです。
Note the insertion of the extraneous characters. In the source, this is represented as:
=?iso-2022-jp?B?GyRCOiNFWSRPJCYkXiQvOVQkLyRHJDckZyQmISMkMyRsGyhC?=
=?iso-2022-jp?B?GyRCJE9GfEtcOGwkTiVhITwlayRHJDkhIxsoQg==?=
Note that header folding is actually taking place.
One more question: by "the source", do you mean the .msf file or the local mail file in Berkeley format?
Well, I had thought that the problem was in the MIME-encoded part, but actually it is not. What I quoted above is from my Inbox file. When I looked in the Inbox.msf file, I saw:
HTML: =?iso-2022-jp?B?GyRCOiNFWSRPJCYkXiQvOVQkLyRHJDckZyQmISMkMyRsGyhC?=$0D$0A$09=?iso-2022-jp?B?GyRCJE9GfEtcOGwkTiVhITwlayRHJDkhIxsoQg==?=
So maybe this is a summary file creation problem? Are these characters supposed to be there, but we should not display them?
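For reference, here is a minimal Python sketch (not Mozilla code) suggesting that the CR, LF, TAB sequence between the two encoded-words is only header folding: a standard MIME header decoder drops it and returns the subject as 4.5x displays it. The header value is the one quoted above; using Python's email.header module and the name decode_subject are assumptions made for illustration.

```python
from email.header import decode_header, make_header

# Folded header value as it appears in the Inbox file: two encoded-words
# separated by CR LF TAB (the bytes that show up as $0D$0A$09 in the .msf).
FOLDED = (
    "=?iso-2022-jp?B?GyRCOiNFWSRPJCYkXiQvOVQkLyRHJDckZyQmISMkMyRsGyhC?="
    "\r\n\t"
    "=?iso-2022-jp?B?GyRCJE9GfEtcOGwkTiVhITwlayRHJDkhIxsoQg==?="
)


def decode_subject(raw: str) -> str:
    """Decode RFC 2047 encoded-words; whitespace between them is dropped."""
    return str(make_header(decode_header(raw)))


if __name__ == "__main__":
    print(decode_subject(FOLDED))
```

If the decoded text contains no CR, LF, or TAB, the extra characters in the thread pane are not part of the encoded subject itself.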
Reassigning to bienvenu, adding putterman to cc. I think this is a summary file issue. This is not Japanese specific. We saw it happen with a folded Latin1 header too. BTW, what's the spec here? We used to truncate the header for display. Do we support line breaks in the thread pane in 5.0 instead?
No, Mork is escaping the strings to include $0D$0A, etc, as it should. I'm not sure how I should handle this. I wouldn't think we should be giving the DB lines with CRLF's in them...
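One possible way to avoid handing the DB lines with CRLFs in them would be to unfold the header first. The Python sketch below only illustrates that idea under the usual RFC 822 unfolding rule (drop a line break that is followed by a space or tab); the name unfold_header is hypothetical and this is not what the Mozilla code does.

```python
import re

# A CRLF (or bare LF) followed by a space or tab marks a folded header line.
_FOLD = re.compile(r"\r?\n(?=[ \t])")


def unfold_header(value: str) -> str:
    """Remove the line break of each fold, keeping the whitespace after it."""
    return _FOLD.sub("", value)


if __name__ == "__main__":
    print(repr(unfold_header("first part\r\n\tsecond part")))
```

An unfolded value would then contain no CR or LF for Mork to escape, although the leading tab or space of each continuation line remains.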
David, are you unescaping yarns when you give them back?
I should be unescaping yarns, but it's possible I messed it up after it was working earlier. I'll check after I finish pulling the tree.
It looks like I was never unescaping bytes encoded as hex in Mork, so once they were written as $xx, they stayed that way when read later. I wrote the code just now to unescape them correctly, but I still don't have an operative build yet. If this can't wait until I build and, hopefully, run for the first time today, someone else can check it in before me. If I'm unable to run today, I'll note this here later in case someone needs the fix faster.
Fix checked in yesterday by changing morkParser::ReadValue() to convert the next two hex digits after '$' into the next source byte. Resolving FIXED pending independent verification.
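The change described above can be sketched roughly as follows. This is a Python illustration of the idea only, not the actual morkParser::ReadValue() code: the function name and the treatment of a '$' that is not followed by two hex digits are assumptions, and other Mork escape forms are ignored.

```python
HEX_DIGITS = b"0123456789abcdefABCDEF"


def unescape_mork_hex(raw: bytes) -> bytes:
    """Turn each $xx (two hex digits) back into the single byte it encodes."""
    out = bytearray()
    i = 0
    n = len(raw)
    while i < n:
        # 0x24 is '$'; require two hex digits right after it.
        if (
            raw[i] == 0x24
            and i + 2 < n
            and raw[i + 1] in HEX_DIGITS
            and raw[i + 2] in HEX_DIGITS
        ):
            out.append(int(raw[i + 1 : i + 3].decode("ascii"), 16))
            i += 3
        else:
            # Assumption: anything else is copied through unchanged.
            out.append(raw[i])
            i += 1
    return bytes(out)


if __name__ == "__main__":
    print(unescape_mork_hex(b"?=$0D$0A$09=?iso-2022-jp"))
```

With this in place, a value written as ...?=$0D$0A$09=?... reads back with a real CR, LF, and TAB, which is what the header code expects when it handles a folded line.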
** Checked with 4/20/99 Win32 build ** With this new build, under the same conditions which prompted me to file this bug (see steps 1-6 above), extraneous strings are no longer inserted. This problem used to occur with Latin 1 headers as well as Japanese ones. All the problems I observed with the non-ASCII headers in my mailbox are gone now. Marking the fix verified.