Closed Bug 46239 Opened 25 years ago Closed 25 years ago

ASCII CR/LF's ignored in CJK plain text mail fusing all ASCII lines

Categories

(MailNews Core :: Backend, defect, P1)

x86
Windows 2000
defect

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: wu.tommy, Assigned: shanjian)

Details

(Keywords: regression, Whiteboard: nsbeta3+)

Attachments

(2 files)

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; m17) Gecko/20000721
BuildID: 2000072108

When the BIG5 charset is used, the CR/LFs of some ASCII lines are dropped and the lines merge into one line (plain text mail format).

Reproducible: Always

Steps to Reproduce:
1. Read any mail that uses the big5 charset and contains ASCII strings.

Actual Results:
first line second line three line

Expected Results:
first line
second line
three line

The CR/LFs of BIG5 strings are handled correctly, but those of ASCII strings are not.
ftang, can you take care of this?
Assignee: rchen → ftang
I'm going to confirm this bug. This occurs in Big5 plain text mail. Ben, this sounds like a regression of a problem we had before. We had a problem of line breaks being turned into spaces before -- Bug 41997. This one is similar but does not have to involve a signature line at all. Sending it to Ben for initial analysis.
Assignee: ftang → mozilla
Status: UNCONFIRMED → NEW
Ever confirmed: true
Component: Localization → Mail Back End
momoi,
- I have no idea what a big5 charset is.
- Could you please check whether the bug appears with the workaround for bug 41637 (adding the rules to user.css)? If yes, it's a bug. If no, I need a test msg, and even then I'm not sure I can debug this, as I don't know much about I18N issues.
> If yes, it's a bug. s/bug/duplicate
qawanted
Keywords: qawanted
Ben, I can probably get to this tomorrow. I think this is a wider problem than just Big5. I'll send you some msgs within the next day.
It wasn't that hard to figure out what is going on with this one. So here it is. As we know, format=fixed is the default for plain text messages in Chinese, Japanese and (possibly) Korean for reasons discussed in another long bug. We are all familiar with that one. When Mozilla tries to display such CJK plain text msgs with mixed text in CJK and ASCII (e.g. English), it ignores all the CRLFs in the ASCII portion of the message while honoring those in the CJK portion. This seems to be a display-only bug. If you send any msgs as format=flowed, such messages should not be affected. I'll attach test messages which show this clearly. Too many users will be affected very badly if this is not fixed. Nominating for nsbeta3. We should definitely fix this during M18. Could this be Daniel's bug?
Keywords: qawanted → nsbeta3, regression
Big5, Japanese & Thai messages are in the file. Some were sent by Comm 4.72 and others by Mozilla. It does not matter. This does not seem to be a sending-side problem. The data are correct, but Mozilla ignores CR LF for format=fixed msgs if it sees ASCII lines.
Note that the Thai message sent by Mozilla uses format=flowed because Thai is not one of the CJK charsets. It does not have a display problem for that reason.
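The fixed/flowed distinction described above is carried in the Content-Type header of each message, so it can be checked without a mail client. The following is a minimal sketch using Python's standard email package; the two sample messages and the TIS-620 charset label are illustrative assumptions, not the attached test messages.

```python
from email import message_from_string


def text_format(raw_message):
    """Return (charset, format) for a plain text message.

    A missing format parameter is treated as "fixed", which is the
    case the comments above describe for CJK plain text mail.
    """
    msg = message_from_string(raw_message)
    charset = msg.get_content_charset() or "us-ascii"
    fmt = msg.get_param("format", "fixed")
    return charset, str(fmt).lower()


# Hypothetical sample messages (assumptions for illustration only).
BIG5_FIXED = (
    "Content-Type: text/plain; charset=big5\r\n"
    "\r\n"
    "This is line 1\r\nThis is line 2\r\n"
)
THAI_FLOWED = (
    "Content-Type: text/plain; charset=TIS-620; format=flowed\r\n"
    "\r\n"
    "This is line 1\r\nThis is line 2\r\n"
)

if __name__ == "__main__":
    for name, raw in (("big5", BIG5_FIXED), ("thai", THAI_FLOWED)):
        print(name, text_format(raw))
```

Under these assumptions, only messages reported as "fixed" would go through the code path that shows the fused lines.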
Modified the summary line.
Summary: CR/LF handle not correctly in BIG5 charset → ASCII CR/LF's ignoered in CJK plain text mail fusing all ASCII lines
I can't say that I feel responsible for the format=fixed (plain text mail) decoder. It's quite simple and I thought it would be hard for it to break anything. If it's in the decoder, Ben might know something since he did some work there a couple of months ago. Since other decoders (format=flowed) work, it seems likely that the problem is in (or is caused by something in) mimetpla.cpp.
Summary: ASCII CR/LF's ignoered in CJK plain text mail fusing all ASCII lines → CR/LF handle not correctly in BIG5 charset
I tried to look at the code the decoder outputs and think this might be another dupe of the style sheet problem. I think (Ben, correct me if I'm wrong) that we depend on the styles to be able to display messages properly. The code was (mangled by cut'n'paste through a console window):
<div class=text-plain wrap=true graphical-quote=true style="font-family: ўІІЕџФк Ні; font-size: 16px;"><pre wrap><div class=text-plain wrap=true graphical-quote= true style="font-family: ўІІЕџФкНі; font-size: 16px;"><pre wrap>ЕћІезяЕряўмфўТІо РЛЕЃьеяжД+юЕХЩѕЅјоќшеёЎвЧќ оПФоНћўЃяЕи+вЧќкј+еЁЩктСѕ+ТвЧќўТІксќўЎІкћєвЧќўРмѕЉшец+еХНежП CICQвЧќўТІоРЛўёЅеЃџўЁыешотЂвЧщ This is line 1 This is line 2 This is line 3 This is line 4 <div class=txt-sig>-- Katsuhiko Momoi Netscape International Client Products Group <a class="txt-link txt-link-abbreviated" href="mailto:momoi@netscape.com">momoi@ netscape.com</a> What is expressed here is my personal opinion and does not reflect official Netscape views. </div></pre></div>Enabling Quirk StyleSheet
I tried the above code in the ordinary browser. Viewed with a western charset, the line breaks were OK. Viewed with Big5, the lines were joined together again. I think this is a problem somewhere other than mailnews, and Ben is most certainly not the right owner (but I won't reassign it since I don't know who is). It looks like a bug somewhere in the layout or i18n modules.
For some reason, the summary line was changed again. The problem is not limited to Chinese Big5 alone. So it is corrected again.
Summary: CR/LF handle not correctly in BIG5 charset → ASCII CR/LF's ignoered in CJK plain text mail fusing all ASCII lines
Attached file Small testcase
Summary: ASCII CR/LF's ignoered in CJK plain text mail fusing all ASCII lines → ASCII CR/LF's ignored in CJK plain text mail fusing all ASCII lines
Probably my fault. There was a collision when I tried to add my previous comment. Anyway, I've attached a small testcase. View it in ISO-8859-1 and you get the line breaks. View it in BIG5 and you don't. I admit I probably broke the Chinese text, but I hope it shows the problem anyway.
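A testcase of this shape can be sketched as follows: one CJK line followed by ASCII lines inside a <pre>, encoded as Big5. This is not the attached file; the Chinese sample text, the HTML wrapper, and the helper names are assumptions made for illustration. The sketch also checks that the CR/LF bytes and the ASCII lines come out identical whether the bytes are decoded as Big5 or as ISO-8859-1, which matches the observation that the data are correct and only the display differs.

```python
# Illustrative content only; the attached testcase uses different text.
CJK_LINE = "中文測試"
ASCII_LINES = ["This is line 1", "This is line 2", "This is line 3"]


def build_testcase():
    """Return Big5-encoded HTML with CRLF-separated lines inside <pre>."""
    body = "\r\n".join([CJK_LINE] + ASCII_LINES) + "\r\n"
    html = "<html><body><pre>" + body + "</pre></body></html>"
    return html.encode("big5")


def ascii_lines(raw, charset):
    """Decode raw bytes with the given charset and return the ASCII test lines."""
    text = raw.decode(charset, errors="replace")
    return [line for line in text.split("\r\n") if line.startswith("This is line")]


if __name__ == "__main__":
    raw = build_testcase()
    # The ASCII lines survive decoding under either charset label.
    print(ascii_lines(raw, "big5") == ascii_lines(raw, "latin-1"))
    with open("testcase_big5.html", "wb") as handle:
        handle.write(raw)
```

Loading the written file in a browser and switching the character coding between Big5 and ISO-8859-1 would then reproduce the comparison described in the comment above, assuming an affected build.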
Thanks, Daniel. In that case, let's re-assign this to Naoki and see if he can help us get further.
Assignee: mozilla → nhotta
(So much traffic here, I have no chance to post :) .) Daniel, tnx for helping. Yes, the format=fixed->HTML converter is mine and rhp's area. (Note that the first doubled <div> and <pre> tags are probably just an error while copy&paste, so don't worry about that.) I didn't check the msgs myself yet, but looking at what Daniel posted here, I cannot see the problem. Should work fine. Actually, apart from
- the font,
- the Japanese(?) text in the beginning and
- probably the charset (not pasted here),
this is the same as what we output for normal English msgs. I don't think the Japanese text is the problem, so this is either nhotta's bug (the font) or NGLayout's.
I checked the msgs now (using |mimetest|), and looked at Daniel's testcase, but I can't find a charset. Could that be the reason?
Ben, I don't think the double <pre> and <div> were caused by my copying. I see them very clearly in the console. Unfortunately mimetest no longer works, so I can't check what the real output is. Anyway, it has nothing to do with this bug.
I don't think charset selection is any problem. I used the menu in the browser to force different charsets and saw the bug.
Understood. But it's still odd. IIRC, we are supposed to hand the charset in the msg header over to NGLayout. |mimetest| works fine for me (Linux), and I see only one <div> and <pre>. I do see the bug as described by you and momoi.
It does not look like a font selection problem. When I override the Big5 message by ISO-8859-1 (by view->character coding), the line breaks were shown correctly. The font selection is not affected by charset override so the font selection was "BitStream Cyberbit" on my machine regardless of the override states.
Looks like a problem in nsTextTransformer for the Unicode case. Reassigning to shanjian. shanjian, remove the non-ASCII character in the "small testcase" and reload to see the correct behavior in pure English. Marking it nsbeta3+ per the i18n bug meeting. Marking it P1.
Assignee: nhotta → shanjian
Priority: P3 → P1
Whiteboard: nsbeta3+
Is this a dup of 47154?
nhotta: I just checked in a fix for 47154 (update intl/lwbrk/src). I think that should fix this problem...
It did fix the bug. Tested on Windows 2000. Marking bug FIXED.
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
** Checked with 8/21/2000 Win32 build ** Yes. This problem is now fixed for plain text messages with ASCII text. The only remaining problem is the signature block. I believe there is another bug for that already, so I'll verify this fix. Marking it verified as fixed.
Status: RESOLVED → VERIFIED
> The only remaining problem is the signature block.
> I believe there is another bug for that already
You're right: bug 41637.
Product: MailNews → Core
Product: Core → MailNews Core