Closed Bug 29789 Opened 20 years ago Closed 20 years ago

Import utility can't import addresses with "å","ä", or "ö" in it.

Categories

(SeaMonkey :: MailNews: Address Book & Contacts, defect, P3)

x86
Windows 98
defect

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: ek96fksg, Assigned: tonyr)

Details

(Whiteboard: PDT+, patch submitted for review)

Attachments

(6 files)

BuildID: 2000022820
Platform: Win32

The Import Utility (Tasks->Tools->Import Utility) can't correctly import addresses
(from Outlook or Outlook Express) that contain "å", "ä", or "ö".

Example:
An address book entry in the Outlook address book:

"Johannes Sjöström"

gets cut off when imported to Mozilla:

"Johannes Sj"
kat - fyi
QA Contact: lchiang → esther
Sending to Tony.
Assignee: hangas → tonyr
Yes. We are not handling 8-bit characters, including
JPN, in the import utility. So basically they disappear in the
importing process. Are we going to expose the Import function
in Beta 1? If so, we need to fix this problem. Otherwise,
there needs to be a way to indicate that it may not work
for 8-bit data for Beta 1.
I'm going to confirm this bug.
This feature is planned for Beta 2. I would rather not
see people try to import JPN address book data for Beta 1 
if this is not going to work for 8-bit data.
Status: UNCONFIRMED → NEW
Ever confirmed: true
This is a problem with the nsIAddrDatabase interface.  The import utility is 
passing the correct string to nsIAddrDatabase->AddFirstName, AddLastName, etc. 
calls.  The address book database is chopping off the characters.
Assignee: tonyr → chuang
Nominating for Beta1 because if users try to import address books
from Outlook or Outlook Express, all non-ASCII data will show
up as blank. It's data loss that affects anything other than
ASCII, and we should not subject users around the world to this type
of aggravation.
Keywords: beta1
nsIAddrDatabase->AddFirstName, AddLastName, etc. are expecting a UTF8 string.  
Before calling nsIAddrDatabase->AddFirstName, the import utility needs to convert the 
Unicode string into a UTF8 string using INTL_ConvertFromUnicode(pUnicodeStr, 
unicharLength, (char**)&pUTF8Str);  

You can look in nsAddrDatabase::AddAttributeColumnsToRow() to see the sample 
code.
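
As an illustration of the conversion step described above, here is a minimal, self-contained sketch of UTF-16 to UTF-8 encoding. It is not the Mozilla implementation: Utf16ToUtf8 is a hypothetical helper name, and the sketch uses standard C++ strings rather than nsString or INTL_ConvertFromUnicode.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not a Mozilla API): encode UTF-16 text as UTF-8.
std::string Utf16ToUtf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Combine a high/low surrogate pair into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // unpaired surrogate: emit the replacement character
        }
        if (cp < 0x80) {
            // ASCII passes through unchanged.
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            // Two bytes: covers Latin-1 letters such as U+00F6.
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            // Three bytes: covers most Japanese characters.
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            // Four bytes: code points above U+FFFF.
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

With this sketch, the "ö" in "Sjöström" becomes the two bytes 0xC3 0xB6, so no character has to be dropped to fit into an 8-bit string.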
Assignee: chuang → tonyr
Assigning myself as QA contact for now.
QA Contact: esther → momoi
Can't I just call nsString::ToNewUTF8String?  In any case this is a trivial fix.  
How do I get this nominated/approved for beta1 checkin?
Tony, please describe the nature of your fix and safeness
so that the PDT team can evaluate it for PDT+ status which is
required for Beta 1.
The benefit of this fix is enormous for all international
users. If it's a safe and easy fix, then we have a better argument
for the PDT+ status.
Sure.  The import utility currently calls nsString::ToNewCString to obtain a 
string to pass to nsIAddrDatabase::Add[XYZ] calls.  nsIAddrDatabase is expecting 
a UTF8 string so the fix is simply changing ToNewCString to ToNewUTF8String.  
Simple fix already tested and working in my tree.  The changes are all in the 
import tools themselves and have no impact on other parts of mozilla. 
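
The thread does not say how the database ends up truncating the name. One plausible mechanism, stated here purely as an assumption, is that a consumer expecting UTF8 stops at the first byte that is not well-formed UTF8. The sketch below measures the well-formed prefix of a byte string; ValidUtf8PrefixLength is a hypothetical helper, not part of nsIAddrDatabase.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: number of leading bytes of `s` that are well-formed UTF-8.
std::size_t ValidUtf8PrefixLength(const std::string& s) {
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;               // continuation bytes expected
        unsigned char lo = 0x80, hi = 0xBF;  // allowed range of the second byte
        if (lead < 0x80) {
            extra = 0;
        } else if (lead >= 0xC2 && lead <= 0xDF) {
            extra = 1;
        } else if (lead >= 0xE0 && lead <= 0xEF) {
            extra = 2;
            if (lead == 0xE0) lo = 0xA0;  // reject overlong encodings
            if (lead == 0xED) hi = 0x9F;  // reject encoded surrogates
        } else if (lead >= 0xF0 && lead <= 0xF4) {
            extra = 3;
            if (lead == 0xF0) lo = 0x90;  // reject overlong encodings
            if (lead == 0xF4) hi = 0x8F;  // reject values above U+10FFFF
        } else {
            break;  // 0x80..0xC1 and 0xF5..0xFF cannot start a sequence
        }
        if (s.size() - i <= extra) break;  // sequence is cut off by end of input
        bool ok = true;
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            const unsigned char lowest = (k == 1) ? lo : 0x80;
            const unsigned char highest = (k == 1) ? hi : 0xBF;
            if (c < lowest || c > highest) { ok = false; break; }
        }
        if (!ok) break;
        i += extra + 1;
    }
    return i;
}
```

Under that assumption, the Latin-1 bytes for "Sjöström" (where "ö" is the single byte 0xF6) are well-formed only up to "Sj", which matches the reported symptom, while the UTF8 encoding of the same name is well-formed in full.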
[PDT+] w/b minus on 3/3
Whiteboard: [PDT+] w/b minus on 3/3
Fix all checked in - hopefully [PDT+] w/b minus 3/3 meant I could do that!
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
This is not fixed yet.
See the attached image -- it imports something but not
Japanese on JPN Windows. Are we converting from system file charset
to Unicode?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
I suspect the problem is in getting the data from MAPI.  I have absolutely no way 
to test this.  Kat, can you apply the attached patch and see if it works?
Tony, I think you can debug this yourself.
I'm providing necessary files as attachments.

1. My OE5 Adbook files in .wab format. (Actually I see 2 files in the
   directory for OE5 Adbook, momoi.wa~ and momoi.wab. I attached both
   in a .zip file, just in case.) Place these in your OE5 directory.

2. You need a font which contains Japanese glyphs on your
   Windows system. If you don't have one, get this font:

   ftp://ftp.netscape.com/pub/communicator/extras/fonts/windows/

   the name: Cyberbit.ZIP   (this contains many language fonts including JPN)

   unzip this file, and install the resulting file via Control Panel |
   Font utility.

Mozilla picks the font automatically and so you don't have to do any 
setting. 
To see that the font is being used, just go to this page,

http://home.netscape.com/ja

You should see Japanese displayed correctly on the browser.

Below, I'm also attaching an image of the AdBook entries in
my OE5 Address Book so that you can compare the results.
Wait, this probably will not work straightforwardly. 

The data in these wab files seem to be in Shift_JIS (which is the system
charset for Japanese Windows). If you are debugging on
US Windows, can you somehow set the charset of the original
data to "Shift_JIS" rather than your system charset when
debugging?

If you can't do this easily, it would be best to get nhotta to
look at this with his debug build. I'm not building and so can't
help you with your patch.
Tony, you can also input Latin 1 accented data, as reported by
the original filer of this bug. You can do this directly in OE5 under
US Windows if you use the ALT+NumPad method to input 8-bit
accented characters.

1. Press ALT key (usually the left one)
2. While holding it down, start using the NumPad without the NumLock.
   First type 0, then the decimal codepoint of the character
   on Windows-1252 character set table. e.g.
 
   0 2 3 4

3. Then release the ALT key. This will input "ê". 

Use the decimal number rather than the hex number on the following
page to input various accented characters.

http://msdn.microsoft.com/library/books/devintl/S2573.HTM

This should at least get you the latin debugging going.
Latin characters were fixed by the previous checkin.  I tested them and they 
worked fine.  The current problem appears to only be double-byte characters.
Tony, you've made changes so that importing from Eudora, Outlook
and OE are covered by that fix, right? If so, I'd better look at
Eudora and Outlook also.
Yes & No.  The changes should work for latin1 on all import formats.  My guess 
is that there will be problems with Japanese on Eudora.  Outlook currently is 
the same as OE and hopefully the patch will fix Japanese (and any other double-
byte character sets) for Outlook and OE but not for Eudora.
3/3 is gone... and it sounds like this bug is resolved enough for beta1
PDT- for beta1
PDT+ for beta2
Whiteboard: [PDT+] w/b minus on 3/3 → [PDT-] plus for beta2
Depending on how effective the last fix is, L10n might
consider leaving out the import modules that corrupt imported data.
Removing PDT- note.  Folks indicated that they have a change in hand (maybe) and
would like to test it on Monday. I'd like to hear BobJ's input on this at the
PDT meeting Monday.
Whiteboard: [PDT-] plus for beta2
I applied the patch (posted on 3/4) and imported the Japanese data (posted on 
3/4).
Japanese strings are still not imported correctly with the patch.
In address book, the imported Japanese are displayed as dots (before the patch 
it was garbage instead of dots).
Plus, there are still the issues of Eudora address import 
as well as message importing. I don't think we have sufficient 
time to check these out for Beta 1. 
I suggest that L10n disable the import menu for Beta 1.
Putting on PDT+ radar for ja beta1.
Keywords: beta1 → jab1
Whiteboard: [PDT+]
Whiteboard: [PDT+] → [PDT+] (JA beta only now!)
Removing PDT+ to get it off the beta1 radar.
This is still needed for the Japanese beta, so it has the jab1 keyword.
Whiteboard: [PDT+] (JA beta only now!) → (JA beta only now!)
"jab1" is for bugs in the localization itself, not for enabling bugs.
I sent email to mozilla-seamonkey@mozilla.org to clarify.
Restoring this back to PDT+ and keyword beta1.
Keywords: jab1 → beta1
Whiteboard: (JA beta only now!) → PDT+
nhotta, Please see if you can help tonyr in any way.  Code review, etc.
If we remove the import utility, will it automajically disappear from the menu?
If so, one possibility is to remove this feature from the localized Beta1s.
msanz?
For Japanese Beta 1, the following is probably the only thing
that is working:

1. Msg importing from Outlook and Outlook Express.

I talked to tonyr and we seem to have the following options:

1. Remove the menu item from the .xul file
2. Remove 4 import module services in .dll format. This will 
   still leave the menu in but there will be no services available 
   to choose from.

I debugged again with today's pull which includes the patch of 3/4.
I used the same japanese data of 3/4 (ran on WinNT japanese).

In CWAB::GetValueString,
I put break points for cases PT_UNICODE and PT_MV_UNICODE but they were not hit.
I saw it hits a break point at PT_STRING8.

I don't know why the MAPI property is not Unicode when using Japanese
data. Which Win API is used to get the property? There may be a special
API or flag that makes MAPI return Unicode.
I called the Win API MultiByteToWideChar with CP_ACP to convert Shift_JIS to Unicode
in the PT_STRING8 case. With that change, the Japanese data is imported correctly.

MultiByteToWideChar(CP_ACP, 0, pVal->Value.lpszA, -1, temp, 128);

The issue of not getting Unicode data from MAPI needs investigation. But I think 
we can put this ACP conversion to enable japanese address book import.
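
The one-line call above caps the output at 128 wide characters, and MultiByteToWideChar fails if the converted string does not fit. As a hedged sketch only, the same conversion can size its buffer from the input by calling the function twice. SystemCharsetToWide is a hypothetical helper name, not code from the patch; the non-Windows branch uses std::mbsrtowcs with the current C locale purely as a stand-in so the sketch also builds off Windows.

```cpp
#include <cstddef>
#include <cwchar>
#include <string>
#include <vector>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: convert a NUL-terminated string in the system's
// default multi-byte charset (CP_ACP on Windows) to a wide string.
// Returns an empty string on null input, empty input, or conversion failure.
std::wstring SystemCharsetToWide(const char* src) {
    if (src == nullptr || *src == '\0') return std::wstring();
#ifdef _WIN32
    // First call: output size 0 asks for the required length in wide
    // characters, including the terminating NUL (because the length is -1).
    const int needed = MultiByteToWideChar(CP_ACP, 0, src, -1, nullptr, 0);
    if (needed <= 0) return std::wstring();
    std::vector<wchar_t> buf(static_cast<std::size_t>(needed));
    // Second call: perform the conversion into the correctly sized buffer.
    if (MultiByteToWideChar(CP_ACP, 0, src, -1, buf.data(), needed) <= 0)
        return std::wstring();
    return std::wstring(buf.data());
#else
    // Stand-in for non-Windows builds: decode using the current C locale.
    std::mbstate_t state = std::mbstate_t();
    const char* p = src;
    // A null destination only counts the wide characters needed.
    const std::size_t needed = std::mbsrtowcs(nullptr, &p, 0, &state);
    if (needed == static_cast<std::size_t>(-1)) return std::wstring();
    std::vector<wchar_t> buf(needed + 1);  // +1 for the terminating NUL
    state = std::mbstate_t();
    p = src;
    if (std::mbsrtowcs(buf.data(), &p, buf.size(), &state) ==
        static_cast<std::size_t>(-1))
        return std::wstring();
    return std::wstring(buf.data());
#endif
}
```

In the PT_STRING8 case, a caller would pass pVal->Value.lpszA to such a helper and then convert the wide result to UTF8 before handing it to the address book database.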
Whiteboard: PDT+ → PDT+, patch submitted for review
tonyr,
Please apply nhotta's patch as well as comment out the nickname substitution
that you, momoi and nhotta discussed.  Then please test these changes with
Latin1 data to make sure that is still working as expected.  Assuming this
is all good, please run the pre-checkin tests and then check in the changes.
If we can get this in tonight it will be in tomorrow's Beta1 builds.  Thx.
tonyr, When you check-in, use r=nhotta and a=bobj.  Thx.
To summarize the current status of this bug:

1. The proposed checkins will make Outlook/Outlook Express
   Address Book import work for Japanese & other multi-byte data.
2. Outlook/Outlook Express message import seems to be
   working at a basic level for Japanese and other
   multi-byte encoded messages.
3. TEXT import and Eudora import are not working for 
   multi-byte data and will be disabled for JPN beta.

   Separate bugs will be filed for the Text and Eudora
   import issues after Beta 1.
Fix checked in morning of 3/15/00
Status: REOPENED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
** Checked with 3/16/2000 Win32 build **

OK. This looks good. 

1. Outlook Express Address Entries in JPN get imported correctly
   in all fields now.
   (** I have not tested Outlook Adbook but these two share the
       same Address Book data under the shared directory.)
2. Message import works OK at the basic level. I tested for
   single-part msgs and simple multi-part mixed msgs with HTML attachments.
   More complex cases have not been included in the tests but so far
   msgs are imported OK as Japanese and MIME structures are
   also good.

With these results, I'm going to mark this fix verified.
Will look at more complex cases later and report any
problems in separate bugs.

k96fksg@el.haninge.kth.se, if you have access to Beta 1 builds 
later than the fix date, please check it out. It should be 
working now.

Marina, please check out this fix for Latin 1 data. If the original
poster or Marina finds a problem with Latin 1 data, please
re-open this bug.

   
Status: RESOLVED → VERIFIED
Product: Browser → Seamonkey