Closed Bug 4925 Opened 25 years ago Closed 25 years ago

Japanese thread pane headers get extraneous materials on re-start

Categories

(MailNews Core :: Backend, defect, P3)

x86
Windows NT
defect

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: momoi, Assigned: davidmc)

Details

** Observed with 4/9/99 Win32 5.0 build **

Please reassign this to someone appropriate in Mail/News.

Here's how the problem occurs:

1. Prepare a mailbox with a number of msgs with Japanese
   headers. The headers should be of varying length -- anywhere
   from 5-6 characters to over 80 characters.

2. Erase the existing summary file, i.e. the ".msf" file.
3. Now start Messenger. This will re-create the summary of headers.
4. At this point examine thread pane headers and see that they are
   displayed correctly.
5. Now quit Messenger and then re-start Messenger.
6. Examine the headers again. This time, some of the headers
   (particularly ones longer than a certain length) now include
   extraneous materials, i.e. $0D$0A$09

7. There is actually no truncation of headers. It's just that these
   extra strings are inserted.

8. After this, you will see these extra strings until you delete
   the summary (".msf") file.
QA Contact: 4080 → 1308
Summary: Japanese thread pane headers get extraaneous materials on re-start → Japanese thread pane headers get extraneous materials on re-start
Target Milestone: M5
TM set to M5.
Status: NEW → ASSIGNED
Question: Did you see $0D$0A$09 in .msf file or display?
The extraneous characters show up in the thread pane display.
Question: I think .msf is a plain text file. So, $0D$0A$09 were not in the .msf
file, just confirming.
$0D$0A$09 are CR, LF, and a tab. Where did you see those characters inserted?
Did you see a line break after the first line and a tab in the following line? I
think a header is displayed in one line and truncated if longer.
Now that we can examine the source for messages, I looked at it in the
View pane part and found that these extraneous characters are
there in the MIME-encoded format. It seems that this problem occurs only with
longer headers and where there should be line breaking, we are
actually MIME-encoding these characters in the following way:

=?iso-2022-jp?GyRCISMbKEI=?=

This string represents the Japanese period plus the extraneous characters.
So, this is an encoding problem.
1) The header was generated by Mozilla, correct?
2) What do you actually see in the thread pane of Mozilla?
3) How is that displayed in 4.5 thread pane?
Here's one example display in Japanese:

HTML: 今度はうまく行くでしょう。これは日本語のメールです。

This is what I see in the Communicator 4.5x thread pane, but under the current
Mozilla, I see:

HTML: 今度はうまく行くでしょう。これ$0D$0A$09は日本語のメールです。

Note the insertion of the extraneous characters. In the source, this is
represented as:

=?iso-2022-jp?B?GyRCOiNFWSRPJCYkXiQvOVQkLyRHJDckZyQmISMkMyRsGyhC?=
=?iso-2022-jp?B?GyRCJE9GfEtcOGwkTiVhITwlayRHJDkhIxsoQg==?=

Note that header folding is actually taking place.
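To check what the two encoded-words quoted above contain, they can be decoded with Python's standard email.header module. This is only a sketch for inspecting the header, not how Messenger decodes it; the helper name decode_subject is made up for illustration. The fold between the two encoded-words is written here as CR, LF, tab, on the assumption that this is the sequence that later surfaces as $0D$0A$09.

```python
from email.header import decode_header

# Folded header value as quoted in this report: two RFC 2047 encoded-words
# separated by a fold (assumed here to be CR, LF, then a tab).
folded = (
    "=?iso-2022-jp?B?GyRCOiNFWSRPJCYkXiQvOVQkLyRHJDckZyQmISMkMyRsGyhC?="
    "\r\n\t"
    "=?iso-2022-jp?B?GyRCJE9GfEtcOGwkTiVhITwlayRHJDkhIxsoQg==?="
)


def decode_subject(value: str) -> str:
    """Decode every encoded-word in `value` and join the results.

    Hypothetical helper for illustration only.
    """
    parts = []
    for raw, charset in decode_header(value):
        if isinstance(raw, bytes):
            parts.append(raw.decode(charset or "ascii"))
        else:
            parts.append(raw)
    return "".join(parts)


subject = decode_subject(folded)
print(subject)
```

The whitespace between two adjacent encoded-words is dropped by the decoder, so the decoded text carries no CR, LF, or tab.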
One more question: by "the source", do you mean the .msf file or the local file in
Berkeley format?
Well, I had thought that the problem was in the MIME-encoded part,
but actually it is not. What I quoted above is from my Inbox file. When
I looked in the Inbox.msf file, I saw:

HTML:
=?iso-2022-jp?B?GyRCOiNFWSRPJCYkXiQvOVQkLyRHJDckZyQmISMkMyRsGyhC?=$0D$0A$09=?iso-2022-jp?B?GyRCJE9GfEtcOGwkTiVhITwlayRHJDkhIxsoQg==?=

So maybe this is a summary file creation problem? Are these characters
supposed to be there, but we should not display them?
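For reference, header unfolding as described in RFC 2822 removes each CRLF that is immediately followed by a space or tab. A minimal sketch of that rule, assuming the value is available as a Python string; the function name unfold and the sample header are hypothetical, and this is not a claim about where Messenger should apply it.

```python
import re

# A CRLF counts as a fold only when it is directly followed by a space or tab.
_FOLD = re.compile(r"\r\n(?=[ \t])")


def unfold(value: str) -> str:
    """Remove folding CRLFs and keep the whitespace that follows them."""
    return _FOLD.sub("", value)


# Hypothetical folded header value.
unfolded = unfold("A fairly long subject\r\n\tcontinued on a second line")
print(repr(unfolded))
```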
Assignee: nhotta → bienvenu
Status: ASSIGNED → NEW
Reassigning to bienvenu, adding putterman to cc.
I think this is a summary file issue.
This is not Japanese specific. We saw it happen with a folded Latin1 header too.
BTW, what's the spec here? We used to truncate the header for display. Do we support
line breaks in the thread pane in 5.0 instead?
No, Mork is escaping the strings to include $0D$0A, etc, as it should. I'm not
sure how I should handle this. I wouldn't think we should be giving the DB lines
with CRLF's in them...
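The escaping described here can be sketched roughly as follows. This is a simplified model and an assumption, not the actual Mork writer: it assumes control and non-ASCII bytes are written as '$' plus two hex digits, and that '$', '\' and ')' are protected with a backslash. The name mork_escape is made up for illustration.

```python
def mork_escape(value: bytes) -> bytes:
    """Simplified, assumed model of writing a value into a Mork text file."""
    out = bytearray()
    for b in value:
        if b in b"$\\)":
            # Assumed: characters with special meaning get a backslash.
            out += b"\\" + bytes([b])
        elif 0x20 <= b < 0x7F:
            # Printable ASCII is written as-is.
            out.append(b)
        else:
            # Everything else becomes '$' followed by two hex digits.
            out += b"$%02X" % b
    return bytes(out)


# A fold (CR, LF, tab) between two encoded-words, as in the report.
escaped = mork_escape(b"?=\r\n\t=?")
print(escaped)
```

Under these assumptions, a folded header handed to the DB unchanged ends up with $0D$0A$09 in the summary file, which matches what was seen in Inbox.msf.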
David, are you unescaping yarns when you give them back?
Assignee: bienvenu → davidmc
I should be unescaping yarns, but it's possible I messed it up after
it was working earlier.  I'll check after I finish pulling the tree.
It looks like I was never unescaping bytes encoded as hex in Mork,
so once they were written as $xx, it stayed that way when read later.

I wrote the code just now to unescape it correctly, but I still don't
have an operative build yet.  If this can't wait until I build and
(hopefully) run for the first time today, someone else can check it in
before me.  If I'm unable to run today, I'll note this here later in
case someone needs the fix faster.
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
Fix checked in yesterday by changing morkParser::ReadValue() to convert
the next two hex digits after '$' into the next source byte.  Resolving FIXED
pending independent verification.
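The idea of the fix, turning '$' plus two hex digits back into one byte on read, can be illustrated with a short sketch. This is not the morkParser::ReadValue() code; it is a simplified model written for this note, the name mork_unescape is hypothetical, and the backslash handling is an assumption about the format.

```python
HEX_DIGITS = b"0123456789abcdefABCDEF"
DOLLAR = ord("$")
BACKSLASH = ord("\\")


def mork_unescape(raw: bytes) -> bytes:
    """Simplified, assumed model of reading an escaped Mork value."""
    out = bytearray()
    i, n = 0, len(raw)
    while i < n:
        pair = raw[i + 1:i + 3]
        if raw[i] == DOLLAR and len(pair) == 2 and all(c in HEX_DIGITS for c in pair):
            # '$' plus two hex digits stands for a single byte.
            out.append(int(pair, 16))
            i += 3
        elif raw[i] == BACKSLASH and i + 1 < n:
            # Assumed: a backslash makes the next byte literal.
            out.append(raw[i + 1])
            i += 2
        else:
            out.append(raw[i])
            i += 1
    return bytes(out)


restored = mork_unescape(b"?=$0D$0A$09=?")
print(restored)
```

Without this step the three-character sequences stay in the string as literal text, which is why they showed up in the thread pane only after the summary file was read back on restart.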
Status: RESOLVED → VERIFIED
** Checked with 4/20/99 Win32 build **

With this new build, under the same conditions which prompted me
to file this bug (see steps 1-6 above), there no longer are any
extraneous strings inserted.
This problem used to occur with Latin 1 headers as well as Japanese
ones. All the problems I observed with the non-ASCII headers in
my mailbox are gone now.

Marking the fix verified.
Product: MailNews → Core
Product: Core → MailNews Core