Bug 7430 - Latin 1 characters under 8859-1 and UTF-8 are sent incorrectly: header & body
Status: VERIFIED FIXED
Product: MailNews Core
Classification: Components
Component: Internationalization
Version: Trunk
Platform: x86 Windows NT
Importance: P3 normal
Target Milestone: M7
Assigned To: rickg
QA Contact: Katsuhiko Momoi
Depends on: 7479
Blocks: 7228
Reported: 1999-06-01 15:05 PDT by Katsuhiko Momoi
Modified: 2008-07-31 01:22 PDT

Description Katsuhiko Momoi 1999-06-01 15:05:24 PDT
** Observed with 6/1/99 Win32 build **

We seem to have widespread failures in send:

Latin 1:

  1. Header is not going out in MIME-encoded format though the
     prefs50.js setting is for strict MIME header.
  2. Body seems to lose 8-bit input and replaces each such
     character with a dot.

Japanese:

  1. Here you cannot see the input correctly because of the font
     problem mentioned in Bug 7424, but body goes out correctly in
     iso-2022-jp.

UTF-8:

  1. Header is encoded in UTF-8/B but display is wrong. The Body
     uses QP but it seems that a wrong conversion might be taking
     place. See these headers and body:

--------------------------------------------------------
     Subject: UTF8: =?UTF-8?B?RnLtl5R0cmU=?=
     Content-Type: text/plain; charset=UTF-8
     Content-Transfer-Encoding: quoted-printable

     Fr=ED=97=94tre
--------------------------------------------------------

Note that the word is supposed to be "Frêtre" in the header and
body.
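
For comparison, here is a minimal sketch of what the sample above would be expected to contain. It is not the mail code from the tree: the helper names (Latin1ToUtf8, QuotedPrintable, CheckExpectedEncodings) are hypothetical, and the quoted-printable encoder ignores line folding and trailing-whitespace rules. The reported bytes ED 97 94 decode to U+D5D4 rather than U+00EA, whereas "ê" in UTF-8 is C3 AA.

```cpp
#include <cassert>
#include <string>

// Convert ISO-8859-1 bytes to UTF-8. Each Latin-1 byte is the code point
// U+0000..U+00FF, so bytes >= 0x80 become a two-byte UTF-8 sequence.
std::string Latin1ToUtf8(const std::string& latin1) {
  std::string out;
  for (char ch : latin1) {
    unsigned char c = static_cast<unsigned char>(ch);  // avoid sign extension
    if (c < 0x80) {
      out += static_cast<char>(c);
    } else {
      out += static_cast<char>(0xC0 | (c >> 6));
      out += static_cast<char>(0x80 | (c & 0x3F));
    }
  }
  return out;
}

// Simplified quoted-printable encoder: escapes '=' and every byte outside
// printable ASCII. Line folding and trailing whitespace are not handled.
std::string QuotedPrintable(const std::string& bytes) {
  static const char kHex[] = "0123456789ABCDEF";
  std::string out;
  for (char ch : bytes) {
    unsigned char c = static_cast<unsigned char>(ch);
    if (c >= 0x20 && c <= 0x7E && c != '=') {
      out += ch;
    } else {
      out += '=';
      out += kHex[c >> 4];
      out += kHex[c & 0x0F];
    }
  }
  return out;
}

// Usage sketch: the body the sample message should have carried.
void CheckExpectedEncodings() {
  // UTF-8 body: Latin-1 0xEA becomes C3 AA before QP encoding.
  assert(QuotedPrintable(Latin1ToUtf8("Fr\xEAtre")) == "Fr=C3=AAtre");
  // ISO-8859-1 body: the single byte 0xEA is escaped directly.
  assert(QuotedPrintable("Fr\xEAtre") == "Fr=EAtre");
}
```

Base64-encoding the same UTF-8 bytes (46 72 C3 AA 74 72 65) gives RnLDqnRyZQ==, so a correct B-encoded subject word would read =?UTF-8?B?RnLDqnRyZQ==?= instead of the value shown in the report.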
Comment 1 Katsuhiko Momoi 1999-06-02 16:09:59 PDT
I'm updating the summary line based on Naoki's suggestions.

In this bug, Latin 1 accented characters are mishandled for sending mail under ISO-8859-1 or UTF-8 charset.
To update the UTF-8 condition,

UTF-8:

  1. Header is encoded in UTF-8/B but display is wrong. The Body
     uses QP but it seems that a wrong conversion might be taking
     place.  This problem applies to Latin 1 accented character
     input. Japanese input seems to be handled correctly. See these
     headers and body for Latin 1 input: .....
Comment 2 nhottanscp 1999-06-03 11:42:59 PDT
Reassigning to rickg@netscape.com.
It's still happening in today's build.
But my local build with a change below does not have the problem.
It is likely that the char vs unsigned char causes the problem.

void CopyChars1To2(char* aDest,PRInt32 anDestOffset,const char* aSource,PRUint32 anOffset,PRUint32 aCount) {

  PRUnichar* theDest=(PRUnichar*)aDest;
  PRUnichar* to   = theDest+anDestOffset;
  const char* first= aSource+anOffset;
  const char* last = first+aCount;

  //now loop over characters, widening each one into a PRUnichar...
  while(first<last) {
#if 0
    *to=kIsoLatin1ToUCS2[*first];
#else
    *to=kIsoLatin1ToUCS2[(unsigned char)*first];
#endif
    to++;
    first++;
  }
}
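
The char vs unsigned char point can be illustrated in isolation. The sketch below assumes a platform where plain char is signed and uses signed char explicitly to make that visible; the function names are hypothetical and nothing here comes from the actual tree. Converting 0xEA to signed char yields -22 on common two's-complement compilers, so used as a subscript it would index before the start of a 256-entry table.

```cpp
#include <cassert>

// Subscript produced when a signed char is used directly as a table index:
// integer promotion sign-extends values with the high bit set.
int IndexWithoutCast(signed char c) { return c; }

// Subscript produced after casting to unsigned char first: always 0..255.
int IndexWithCast(signed char c) { return static_cast<unsigned char>(c); }

void CheckIndexes() {
  // Latin-1 'e' with circumflex, stored in a signed char.
  const signed char eCircumflex = static_cast<signed char>(0xEA);
  assert(IndexWithoutCast(eCircumflex) == -22);   // outside the table
  assert(IndexWithCast(eCircumflex) == 0xEA);     // 234, inside the table
  assert(IndexWithoutCast('A') == IndexWithCast('A'));  // ASCII is unaffected
}
```

This would explain why only high-bit characters are affected while 7-bit ASCII input goes through unchanged.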
Comment 3 rickg 1999-06-03 16:22:59 PDT
This bug should now be fixed because I eliminated the kIsoLatin1ToUCS2 table
lookup and did a direct 1-to-2 byte conversion.
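
The checkin itself is not shown in this report, so the following is only a guess at what a direct 1-to-2 byte conversion might look like. WidenLatin1 is a hypothetical name and char16_t stands in for PRUnichar. It relies on the fact that ISO-8859-1 byte values coincide with code points U+0000..U+00FF, so no table is needed as long as each byte is treated as unsigned.

```cpp
#include <cassert>
#include <cstddef>

// Widen `count` ISO-8859-1 bytes from `src` into 16-bit units in `dest`.
// The cast to unsigned char keeps high-bit bytes in the range 0x80..0xFF
// instead of letting them sign-extend to 0xFF80..0xFFFF.
void WidenLatin1(char16_t* dest, const char* src, std::size_t count) {
  assert(dest != nullptr && src != nullptr);
  for (std::size_t i = 0; i < count; ++i) {
    dest[i] = static_cast<unsigned char>(src[i]);
  }
}
```

Called on the six bytes of "Fr\xEAtre", this stores 0x00EA in the third output unit.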
Comment 4 Katsuhiko Momoi 1999-06-04 16:20:59 PDT
** Checked with 6/4/99 Win32 - the 2nd build **

Unfortunately the Latin 1 high-bit character problem
for sending mail under ISO-8859-1 and UTF-8 encodings is
still there with this new build.

Re-opening...
Comment 5 Katsuhiko Momoi 1999-06-04 16:22:59 PDT
We need to get 7479 resolved to see this bug fixed. I'm marking the dependency here.
Comment 6 Katsuhiko Momoi 1999-06-10 23:52:59 PDT
This seems to be working now.
I'll mark it resolved (due to 7479) and get it on my
plate to verify.
Comment 7 Katsuhiko Momoi 1999-06-11 19:00:59 PDT
** Checked with 6/11/99 Win32 build **

I looked at both Latin1 and UTF-8 mail send with
Latin 1 high-bit range data in the header and body.
None of these headers and bodies show the problems mentioned
in the original and subsequent reports.

MIME encodings are also correctly done.
Marking the fix verified.
