Open Bug 237624 Opened 16 years ago Updated 11 years ago

Palm addressbook synchronization issues with non-ASCII characters (international language / unicode with PalmSync)

Categories

(MailNews Core Graveyard :: Palm Sync, defect, critical)

x86
Windows XP
defect
Not set
critical

Tracking

(Not tracked)

People

(Reporter: jshin1987, Unassigned)

References

Details

(Keywords: dataloss, intl)

Attachments

(2 files)

As far as I know, Palm doesn't use Unicode; it uses one of the legacy
(locale-dependent) encodings. It seems that the Palm Sync code in Mozilla assumes
ISO-8859-1. There should be a way to change the default character encoding for
Palm Sync.

I don't have a Palm, but the problem was reported at http://www.mozilla.or.kr by
a Korean Mozilla/TB user.
I've never seen a Palm Conduit with an option to select character encoding.  

I think we should perhaps be looking to autodetect this by following
whatever the address book uses?
(In reply to comment #1)
> I've never seen a Palm Conduit with an option to select character encoding.

 Please excuse me if I say something stupid. I've never owned a Palm and I'm not
sure how this synchronization works. I'm assuming that there are two parties, Palm
and Mozilla/TB.  Because Palm OS has no notion of character encoding, Palm will
just 'emit' a stream of bytes (and I guess the Palm conduit will just pass through
the stream of bytes/octets as they are). This bug is about how to interpret the
stream of bytes sent from Palm (via the Palm conduit?) on Mozilla's side. Because
Mozilla uses Unicode, we can't just store the stream of bytes as they are; we
have to 'interpret' the bytes and convert them to Unicode. Currently, we interpret
the stream of bytes/octets as ISO-8859-1, which doesn't work for
non-Western-European users. It may not even work for Western European users
because what's used on Palm OS (in the English/Western-European version) could well
be Windows-1252 instead of ISO-8859-1.

The other way around (when sending the content of Mozilla/TB's address book to
Palm),  we have to convert Unicode strings to what's used on Palm's side (a
sequence of bytes whose interpretation will depend on the language/locale of
Palm OS in use, which must be user-dependent.) 
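
To make the two directions described above concrete, here is a minimal Python sketch (not Mozilla code). The sample name and the choice of EUC-KR as the handheld's encoding are assumptions for illustration only; the report does not say which encoding the Korean user's device actually uses.

```python
# Bytes as a Korean-locale handheld might store a name
# (assumption: EUC-KR; the bug report does not name the encoding).
palm_bytes = "한글".encode("euc_kr")

# Palm -> Mozilla, as the bug describes it today: assume ISO-8859-1.
wrong = palm_bytes.decode("iso-8859-1")

# Palm -> Mozilla with a user-selected encoding instead.
right = palm_bytes.decode("euc_kr")

# Mozilla -> Palm: Unicode back to the handheld's byte encoding.
round_trip = right.encode("euc_kr")

# ISO-8859-1 and Windows-1252 only disagree on bytes 0x80-0x9F;
# 0x80 is the euro sign in Windows-1252 and a control code in ISO-8859-1.
euro_1252 = b"\x80".decode("cp1252")
euro_latin1 = b"\x80".decode("iso-8859-1")

print(repr(wrong), repr(right), round_trip == palm_bytes)
print(repr(euro_1252), repr(euro_latin1))
```

Decoding with ISO-8859-1 never fails, which is why the wrong assumption goes unnoticed until someone reads the resulting text.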
  
> I think we should perhaps be looking to autodetect this by following
> whatever the address book uses?

 I don't think auto-detecting can work very reliably especially considering that
we're dealing with relatively short runs of text in the addressbook. Most Palm
users would be 'mono-lingual' (as far as their addressbooks are concerned) so
that it'd be nice to add a UI to Mozilla/TB to set the default character
encoding to assume when exchanging/synchronizing addressbook with Palm. 
I guess we need to do two things: 1) add a backend pref. entry (palm sync
character encoding) and use the uconv converter corresponding to the value of the
pref. entry (instead of the ASCII <-> Unicode converter); 2) add a UI for this.
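
Step 1 could look roughly like the following Python sketch. It is an illustration of the idea, not the uconv API: the pref name `mail.palmsync.charset`, the dict standing in for the pref service, and the helper names are all hypothetical.

```python
import codecs

# Fallback matches the behavior this bug describes (ISO-8859-1 assumed).
DEFAULT_ENCODING = "iso-8859-1"
PREF_KEY = "mail.palmsync.charset"  # hypothetical pref name

def resolve_encoding(prefs):
    """Pick the converter named by the pref, or the default if unset/unknown."""
    name = prefs.get(PREF_KEY, DEFAULT_ENCODING)
    try:
        codecs.lookup(name)  # raises LookupError for an unknown encoding name
    except LookupError:
        return DEFAULT_ENCODING
    return name

def from_palm(raw, prefs):
    """Bytes from the handheld -> Unicode text for the address book."""
    return raw.decode(resolve_encoding(prefs), errors="replace")

def to_palm(text, prefs):
    """Unicode text from the address book -> bytes for the handheld."""
    return text.encode(resolve_encoding(prefs), errors="replace")

print(from_palm(b"\xc7\xd1\xb1\xdb", {PREF_KEY: "euc_kr"}))
print(to_palm("ä", {}))
```

Using one pref for both directions keeps the round trip symmetric; characters that the chosen encoding cannot represent are replaced rather than raising, which is a design choice of this sketch.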

We can implement the first rather easily, I guess. Not having a Palm, I need
someone's help to test this, though. (Ahah, sometime soon I'll visit my brother
who's got a Palm  and a Mozilla build environment on Win2k   so that I can test
my yet-to-be made patch at his place)
Assignee: cavin → jshin
Jshin:

If you become a Palm Developer on the PalmSource web site (filling out
the form is all it takes), you can download a Palm emulator that runs
on the PC and can be synced.  You could then use that to test with instead of
using an actual handheld.  There are also ROMs for other languages so that you
could test what is happening in something other than English.

Kevin
Thanks. I realized that several hours ago and downloaded the necessary files. BTW,
there's an evangelism issue at the site (bug 237871).
 
I've read a bit more about PalmSync (bug 214407 was helpful) and realized that
there are two ways to handle this. One is to fix the problem on the Mozilla/TB
side only, while the other is to fix both the 'TB/Mozilla conduit' and Mozilla/TB. The
former means that we just let the conduit pass through byte streams and have
TB/Mozilla do the conversion, assuming a user-specified character encoding. In the
latter, we have to add an option to the Mozilla/TB conduit so that users can specify
the character encoding. With that, Mozilla/TB just has to deal with Unicode because
the conversion will be taken care of by the conduit.

For me, it's easier to take the former path, and I'm not sure if the latter is
feasible because I have yet to get Palm Desktop to work properly with a Palm
Emulator. It's failing at start-up. I think the latter was planned because I
found that many methods in the Mozilla/TB-side code have an 'IsUnicode' argument.

David, what do you think? 
jshin:
Here are two pages on setting up the emulator.
http://www.ianywhere.com/developer/technotes/palmos_hotsync.html
http://www.palmos.com/dev/support/docs/conduits/win/Util_Emulator.html

If you are using the simulator, here are the instructions.  Note that if you
haven't gotten the OS6 stuff, use the google cached version of the page to see
the instructions.
http://www.palmos.com/dev/support/docs/conduits/win/Util_Simulator.html
http://216.239.51.104/search?q=cache:Xxrdd4Qe238J:www.palmos.com/dev/support/docs/conduits/win/Util_Simulator.html+%2Bsimulator+%2Bhotsync+%2Bdesktop+%2Bpalm&hl=en&ie=UTF-8

Hope that helps.

Kevin
Jungshik, I'm inclined to think this should be handled in the conduit - I'm not
sure how you can handle it just in Seamonkey/Tbird, since the app doesn't know
whether data is coming from the conduit or not. But I don't know exactly what you
have in mind.

Also, it's tricky to have UI for the palm sync conduit prefs since we don't have
a place to put any UI at the moment. For tbird, we could add a UI to the
extension so that the extension settings have a UI.
David, in Seamonkey this would typically be achieved by having the extension
overlay append a preference item to mail's preference overlay.
Right, I'm just saying that's an extra chunk of work that would need to be done,
and it can be a bit painful.
Attached image extend conduit dialog
David:
Since this is part of the conduit, wouldn't it be best to include the UI as
part of the conduit configuration, like they do in the Doc2Go conduit? Since
the conduit is a Windows-only feature, I would think that there shouldn't be a
problem storing the setting in the registry.  It seems that this would be a
much simpler way to have UI for the setting than trying to integrate it into
Seamonkey/TB.

Kevin
Kevin, that's a possibility, as long as the code that needs to access the
setting is only in the conduit. I'd prefer not to use the windows registry if
possible, however, just because then you have to clean up after yourself when
you uninstall, etc. I suppose we could use the condmgr dll api like the
installer does to set some special settings, and let the conduit manager handle
the cleanup.
Product: MailNews → Core
*** Bug 294136 has been marked as a duplicate of this bug. ***
*** Bug 184297 has been marked as a duplicate of this bug. ***
*** Bug 285984 has been marked as a duplicate of this bug. ***
*** Bug 310290 has been marked as a duplicate of this bug. ***
*** Bug 202380 has been marked as a duplicate of this bug. ***
*** Bug 205584 has been marked as a duplicate of this bug. ***
Blocks: 206207
*** Bug 311371 has been marked as a duplicate of this bug. ***
I thought characters in Latin-1 would be translated correctly, but even those
characters seem to have been mistranslated (e.g. bug 311371). Thanks, Wayne, for
informing me of all those dupes.

Status: NEW → ASSIGNED
Summary: when synchronizing Palm addressbook, ISO_8859-1 is assumed → Palm addressbook synchronization issues with non-ASCII characters
Severity: normal → major
Keywords: dataloss
jshin, Jean-Francois Ducarroz <jf@ducarroz.org> wrote to me "I will be happy to review any patch" (wrt bug 184297, which is duped to this one)
Blocks: 205587
changing to critical because of dataloss issue
Severity: major → critical
Summary: Palm addressbook synchronization issues with non-ASCII characters → Palm addressbook synchronization issues with non-ASCII characters (international language)
(In reply to jshin's comment #20)
> I thought characters in Latin-1 would be translated correctly, but even those
> characters seem to have been mistranslated (e.g. bug 311371). Thanks, Wayne, for
> informing me of all those dupes.

see also bug 182643 for prior history and hints

Summary: Palm addressbook synchronization issues with non-ASCII characters (international language) → Palm addressbook synchronization issues with non-ASCII characters (international language / unicode with PalmSync)
Not sure if this is very relevant information or not, but I have successfully used Thunderbird and the Palm Sync extension for years. Now I upgraded from TB 1.0.7 to 1.5 and also updated the Palm Sync extension. I noticed that address book entries with ä's and ö's in them (that's a and o with umlauts) break when they're coming from my Palm (Treo 600) to TB, i.e. those characters are converted to question marks. Entries syncing from TB to Palm do not break.

The point is that I never had (or noticed) this problem before today, even though I have used this palm sync extension for years (and I have plenty of names with those characters as I'm a Finn), so maybe this particular bug has been introduced into the extension after the one that accompanied TB release 1.0.7?
(In reply to comment #24)
> The point is that I never had (or noticed) this problem before today, even
> though I have used this palm sync extension for years (and I have plenty of
> names with those characters as I'm a Finn), so maybe this particular bug has
> been introduced into the extension after the one that accompanied TB release
> 1.0.7?

Jouni,

Which Windows OS are you running, and what is the regional language setting on your PC?

(also, in the future add yourself to the bug, especially if you comment and wish to see future replies)
QA Contact: nbaca → vseerror
> Which Windows OS are you running, and what is the regional language setting on your PC?
> (also, in the future add yourself to the bug, especially if you comment and
> wish to see future replies)

Oh, ok, thanks. I haven't really used Bugzilla that much.

I'm running English Windows XP SP2 with "standards and formats" and "location" set to Finnish in the WinXP language settings. "Language for non-unicode programs" is set to English.
> I'm running English Windows XP SP2 with "standards and formats" and "location"
> set to Finnish in the WinXP language settings. "Language for non-unicode
> programs" is set to English.
I have the same problem since the upgrade to TB 1.5!
I'm running English Windows XP SP2 with "standards and formats" and "location" set to German in the WinXP language settings. "Language for non-unicode programs" is set to German too.

I have the same problem. To repair the ä,ö,ü, etc in the addressbook, I changed the string $EF$BF to $C3 in the file abook.mab. This can be done with a text editor or a script.
Hello Konrad,

I had the same problem with Seamonkey and my Tungsten E2, and changing these characters solved it. But can you explain what exactly this change does?

And another question: if this change solves the problem in all encodings, it should be doable to integrate it into Seamonkey/Thunderbird. Correct me if not, because I'm only a user and don't know anything about programming Seamonkey.
I assume the Palm sync should encode umlauts with $C3 but writes $EF$BF instead. I don't know how this could be fixed. The Palm synchronisation works quite well in Mozilla, but unfortunately it's broken in Thunderbird and Seamonkey. I'm using a different email software now.
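
On what the $EF$BF to $C3 edit does, here is a small Python sketch. The UTF-8 byte arithmetic in it is standard, but the cause it models (a Latin-1 byte being sign-extended to a 16-bit value before conversion to UTF-8) is only a guess that happens to fit the byte pattern; nothing in this bug confirms it. Under that guess, ä (0xE4) becomes U+FFE4, which UTF-8 encodes as EF BF A4, while the correct U+00E4 encodes as C3 A4, so the two differ only in the leading bytes. The helper names are hypothetical.

```python
def utf8_hex(code_point):
    """UTF-8 bytes of one code point, written as $XX escapes as seen in abook.mab."""
    return "".join("$%02X" % b for b in chr(code_point).encode("utf-8"))

def sign_extended(latin1_byte):
    """Hypothetical failure mode: a signed 8-bit char widened to 16 bits."""
    return latin1_byte | 0xFF00 if latin1_byte >= 0x80 else latin1_byte

def repair(text):
    """The manual edit from this bug: replace $EF$BF with $C3."""
    return text.replace("$EF$BF", "$C3")

# ä, ö, ü fall in 0xC0-0xFF; the copyright sign (0xA9) falls in 0x80-0xBF.
for byte in (0xE4, 0xF6, 0xFC, 0xA9):
    broken = utf8_hex(sign_extended(byte))
    print("%02X correct=%s broken=%s repaired=%s"
          % (byte, utf8_hex(byte), broken, repair(broken)))
```

If this reading is right, the edit cannot cover all encodings: it restores Latin-1 characters in the range 0xC0-0xFF, but characters in 0x80-0xBF would be stored with a $EF$BE prefix and need $C2 instead, and text from non-Latin-1 handhelds is not helped at all.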


(In reply to comment #29)
> Hello Konrad,
> 
> I had the same problem with Seamonkey and my Tungsten E2, and changing
> these characters solved it. But can you explain what exactly this change does?
> 
> And another question: if this change solves the problem in all encodings, it
> should be doable to integrate it into Seamonkey/Thunderbird. Correct me if not,
> because I'm only a user and don't know anything about programming Seamonkey.
> 
bug 207156 looks like it deals with similar issue(s)

The screenshot shows the results of a simple test. I created a couple of records in the Palm emulator and synced with the setting "handheld overwrites desktop".  Note, however, that I managed to crash Thunderbird in the address book after the sync - new bug 375491.

For anyone able to test or work on this, I have a web page of instructions for setting up a test environment and the Palm simulator on a PC, such that you won't mess up your own handheld, or even need one.  The emulator is all you need.
jshin in comment #20
> I thought characters in Latin-1 would be translated correctly, but even those
> characters seem to have been mistranslated (e.g. bug 311371). Thanks, Wayne, for
> informing me of all those dupes.

jshin,

You are correct. In late 2002 this issue was fixed - Bug 182643 comment 4, bug 182643 comment 5, ... provide some details. But something broke between thunderbird 1.4 and 1.5 to cause regressions documented in Bug 310290, Bug 311371 and others (which I now believe I have incorrectly duped to this bug). 

Is perhaps bug 209699 (convert some consumers over to CopyUTF8toUTF16/CopyUTF16toUTF8) the source of the regression that caused those bugs?  (In working toward their initial fix in bug 182643 dayalrajiv and JF say "so we need to not use the NS_ConvertUTF8toUCS2 function".)
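
For anyone wondering why swapping in a UTF-8 converter could matter here, a Python illustration follows. It only shows what a UTF-8 decoder does with Latin-1 bytes in general; whether bug 209699 actually caused this regression, and how the Mozilla converters behave on such input, is not established in this bug. The sample name is made up.

```python
# A name with an umlaut, stored as single-byte Latin-1 (ä is the byte 0xE4).
latin1_name = "Mäkinen".encode("iso-8859-1")

# Decoded with the matching single-byte converter, the name survives.
as_latin1 = latin1_name.decode("iso-8859-1")

# 0xE4 followed by an ASCII letter is not a valid UTF-8 sequence, so a
# lenient UTF-8 decoder substitutes U+FFFD (the replacement character).
as_utf8_lenient = latin1_name.decode("utf-8", errors="replace")

# A strict UTF-8 decoder rejects the input outright.
try:
    latin1_name.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

print(repr(as_latin1), repr(as_utf8_lenient), strict_ok)
```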
tutorial/simplified instructions at http://wsm.wetpaint.com/ for getting the palm emulator and palmsync extension up and running 
Assignee: jshin1987 → bienvenu
Status: ASSIGNED → NEW
Product: Core → MailNews Core
QA Contact: vseerror → palm-sync
Product: MailNews Core → MailNews Core Graveyard
Assignee: bienvenu → nobody