Closed Bug 104027 Opened 23 years ago Closed 14 years ago

Linux: Cut & paste of accents fails from/to external application in C locale

Categories

(Core :: Internationalization, defect)

x86
Linux
defect
Not set
major

RESOLVED WONTFIX

People

(Reporter: deragon, Assigned: masaki.katakai)

Details

(Keywords: intl)

From Bugzilla Helper:
User-Agent: Mozilla/4.7 [en] (X11; U; Linux 2.4.2-2 i686)
BuildID:    Mozilla 0.9.3 (provided by Ximian)

Under Linux Red Hat 7.1 running the latest Ximian Gnome environment (1.4
at the time) and XFree86 4.0.3, the following occurs.

I have French text with accents stored in an ascii file.  I edit this
file with vim through an xterm.  I select the text and paste it into
an input field on a website or on a new message window under Mozilla.
All accented characters are converted to "?".  

Example:  

  Je suis ouvert à toute position.  Cependant, si plusieurs
  offres sont disponibles, notez mes préférences suivantes:

...becomes:

  Je suis ouvert ? toute position.  Cependant, si plusieurs
  offres sont disponibles, notez mes pr?f?rences suivantes:

This does not occur under Netscape 4.7, which I use to write this bug
report.  The cut&paste is performed using the standard Unix way:
select the text with the mouse and paste with the middle mouse button.

It is imperative that this feature work.  Imagine if, in the English
language, pasting under Linux transformed every "a" into "?".
Currently, Mozilla is not usable in any language other than English
because of this.


Reproducible: Always
Steps to Reproduce:
1. Type French accents in a text editor.  Example "àéèêî".
2. Copy the text the Unix way, i.e. select the text with the mouse and
paste it into Mozilla with the middle mouse button.


Actual Results:    Je suis ouvert ? toute position.  Cependant, si plusieurs
  offres sont disponibles, notez mes pr?f?rences suivantes:


Expected Results:    Je suis ouvert à toute position.  Cependant, si plusieurs
  offres sont disponibles, notez mes préférences suivantes:
assigning to bstell
Assignee: yokoyama → bstell
Works for me (suse 7.2, gnome, buildID 2001100921).

I pasted "àéèêî" into 

+ a new mail message text area + address field
+ browser address field
+ composer window

using the middle mouse button.

Marcus Bauer -> where did you make your copy from?  Copying from
Mozilla to Mozilla works.  It is copying from another window, such as
an xterm to Mozilla which fails.

If you copied the string from the web page you were looking at using
Mozilla, your test is void.  Type the characters on an xterm and try
again.

If you did this, then I cannot tell what the problem is.  All I can
tell is that it is only Mozilla which has the problem.  Netscape,
Gnucash, xterm, Gnumeric, etc... all work well together.

Now when I type "éééé" in Mozilla (direct typing works) and copy it
to an xterm, I get this:  "e'e'e'e'".

Note that I use dead keys via an XFree86 feature, with the following
option:

Section "InputDevice"
  Identifier  "Keyboard0"
  Driver      "keyboard"
  Option      "XkbLayout" "us_intl"
EndSection

Maybe this clue can help.
ylong, can you please confirm?
Keywords: intl
QA Contact: andreasb → ylong
I tried it with the following steps on my RH7.1-Ja:
1. Open an xterm and set locale to fr_FR.
2. Type some accented characters.
3. Open another terminal, change the locale to fr_FR, and launch the browser (I
was using the 10-09 0.9.4 branch build).
4. Copy the accented characters from step 2 into netscape (browser and composer).

Then you will see that those accented characters get corrupted.

Note: 
1. This does not happen inside the netscape application, e.g. from browser to
composer...
2. If I copy/paste a Japanese string from a terminal window to netscape, the
Japanese displays fine.

Confirming the bug and changing the summary from:
Cut & paste of accents under Linux fails.
to:
Linux: Cut & paste of accents fails from/to external application. 
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: Cut & paste of accents under Linux fails. → Linux: Cut & paste of accents fails from/to external application
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla0.9.8
Katakai- can you take a look at this one?
Assignee: bstell → katakai
Status: ASSIGNED → NEW
I'm sorry I could not reproduce this problem. I've tested on
RH7.1-ja but I logged into the system after selecting the language
on the login window.

Yuying, why didn't you choose French on the login dialog and then log in?
Can you try again?
This problem still occurs for me with Mozilla 0.9.5.  Could it be
an interaction problem with XFree86/Ximian Gnome?

Note that I do not suffer from this problem when using Netscape 6.1.  I have not
tried Netscape 6.2, as I am afraid that it might not work;  I need to be able to
make those cut&paste.
> Yuying, why didn't you choose French on the login dialog and then log in?

There is no difference for me between French login and Japanese login.

But I found that if I copy/paste those accented characters one character at a
time, it works fine; any combination of characters gets garbled.

But I'm thinking maybe we cannot do much about that: in my case, I did
copy/paste from xterm to netscape which is running on kterm.  But I even failed
to copy/paste those accented characters from xterm to kterm, and in that case it
has nothing to do with mozilla or netscape.

Hans Deragon: could you please let me know how to get the accented characters in
a text editor? I couldn't get them in gedit on RH7.1-Ja. 
As mentioned earlier, I set up XFree86 Version 4.0.3 with the
following in my /etc/X11/XF86Config-4

Section "InputDevice"
  Identifier  "Keyboard0"
  Driver      "keyboard"
  Option      "XkbLayout" "us_intl"
EndSection

I do not use any tool that comes with my window manager to
set up accents.

I successfully created accents in gedit and copied them
to an xterm and vice-versa.  I also successfully copied
accents between gnome-terminal and xterm.  In fact,
this works with 99% of the applications I use.

However, there are only 2 applications with this problem:
kterm and Mozilla.  I suspect that this is some application
problem, and that kterm and Mozilla are doing the same "bad"
thing.  Did some code regarding this issue change since
Netscape 6.1?  As I said previously, Netscape 6.1 works fine.
By checking the changes in the source code since then, you
might be able to get a clue.

Note though that kterm does not behave the same way as
Mozilla.  When I copy ééé to kterm, nothing shows up, but
on Mozilla, it shows up as ???.  Also, from Mozilla to
xterm, I get e'e'e' on the xterm, but I cannot even
generate the accents on kterm.
I just tried Netscape 6.2 and the accents work fine also with this version.
*** Bug 113445 has been marked as a duplicate of this bug. ***
Concerning 113445: Copy/Paste works fine with ru_RU.KOI8-R, but has the same
problems with ru_RU.CP1251
John,

Which locale are you using for Mozilla and other
test apps?
This problem also occurs with Mozilla 0.9.7 but Netscape 6.2.1 works fine.  Can
someone explain why Netscape works?  Netscape is built from Mozilla...  should
be the same code for something as basic as accents.
I tried copy&paste on 0.9.2-0.9.7 and NS6.2.1 on
RH7.1-JA.
Status: NEW → ASSIGNED
Please note that even when you log into
the desktop after selecting the French locale,
the LANG environment variable seems to be ja_JP.euc.JP
on RH7.1-JA, so we have to set LANG manually
in the terminal before starting Mozilla.

1. Log into desktop as French language
2. start terminal
3. setenv LANG fr_FR
4. mozilla & or xterm & or gedit &

Actually copy&paste works fine when I started
all apps in fr_FR locale.

However, when I started Mozilla in C locale,
I'm seeing this problem that ??? is pasted.

1. Log into desktop as French language
2. start terminal
3. setenv LANG fr_FR
4. gedit &
5. setenv LANG C
6. mozilla &

Pasted characters become ??? in Mozilla; this is the
case on 0.9.6 and 0.9.7. There is no problem
on 0.9.2 - 0.9.5. NS6.2.1 also works fine because
it's from 0.9.4.

So, I'll check diffs between 0.9.5 and 0.9.6.

Masaki,
previously I used ru_RU.KOI8-R for the whole system (X and console), and all was
good (I mean copy/paste and window titlebars). Some time ago I switched to
ru_RU.CP1251 for the whole system and got these troubles.
But for now: even if I start Mozilla and some text editor in the ru_RU.KOI8-R locale
I can't get it working.
When I played with the sources and made a debug build, it told me that there is an
'Unknown locale' (as far as I remember). Later I found the file
'unixcharset.properties' and tried adding this to it:
locale.all.ru_RU.CP1251=CP-1251
and played with it. But mozilla successfully crashed with this line in the file just
after start.
Masaki,
I just performed some tests with Mozilla 0.9.6 & (abiword, xterm & bluefish) in
different locales. The results follow:

Test on input text field:
Xterm (CP1251) -> Mozilla (KOI8-R) = all good (tested specifically in mail composer:
all good)
AbiWord (CP1251) -> Mozilla (KOI8-R) = text is pasted, but in CP1251 codepage
BlueFish (CP1251) -> Mozilla (KOI8-R) = same as AbiWord but there is also
'%/1 microsoft-1251' at the beginning of the pasted text.

Test on input text field:
Xterm (KOI8-R) -> Mozilla (KOI8-R) = all good
AbiWord (KOI8-R) -> Mozilla (KOI8-R) = all good
BlueFish (KOI8-R) -> Mozilla (KOI8-R) = all good

Test on input text field and mail composer:
Xterm (KOI8-R) -> Mozilla (CP1251) = all good
AbiWord (KOI8-R) -> Mozilla (CP1251) = text is pasted, but in ISO8859-1 codepage
BlueFish (KOI8-R) -> Mozilla (CP1251) = same as AbiWord but there is also
'%/1 koi8-r' at the beginning of the pasted text.

Test on input text field and mail composer:
Xterm (CP1251) -> Mozilla (CP1251) = text is pasted but it is missing from the
message in the Sent folder
AbiWord (CP1251) -> Mozilla (CP1251) = text is pasted, but in ISO8859-1
codepage, text is corrupted in the Sent folder
BlueFish (CP1251) -> Mozilla (CP1251) = same as AbiWord

Mozilla (KOI8-R) -> Xterm (KOI8-R) = all good
Mozilla (KOI8-R) -> AbiWord (KOI8-R) = all good
Mozilla (KOI8-R) -> BlueFish (KOI8-R) = all good

Mozilla (CP1251) -> Xterm (KOI8-R) = all good
Mozilla (CP1251) -> AbiWord (KOI8-R) = Russian letters replaced with '?'
Mozilla (CP1251) -> BlueFish (KOI8-R) = same as AbiWord

Mozilla (KOI8-R) -> Xterm (CP1251) = all good
Mozilla (KOI8-R) -> AbiWord (CP1251) = text pasted, but in KOI8-R codepage
Mozilla (KOI8-R) -> BlueFish (CP1251) = Russian text is omitted

Mozilla (CP1251) -> Xterm (CP1251) = all good
Mozilla (CP1251) -> AbiWord (CP1251) = Russian letters replaced with '?'
Mozilla (CP1251) -> BlueFish (CP1251) = same as AbiWord

Mozilla says about itself: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.6)
Gecko/20011120
Target Milestone: mozilla0.9.8 → ---
I am using Mandrake 8.0 and mozilla 0.9.7.


All copy and paste seems ok (xterm or konsole or XRTV_BR to mozilla and mozilla
to xterm or konsole or XRTV_BR)
My layout is pt_BR (Brazilian Portuguese) and all other layout configurations
are set to pt_BR.
But in some text boxes (like this one, when you add a comment), sometimes mozilla
stops joining the accent with the vowel (a' instead of á, etc.).
Mozilla 0.9.8: Repeatable on my system too.

Pasting Scandinavian characters (å, ä, ö) from mozilla to 
-Eterm/xterm results in the characters (a*, a", o")
-gnotepad+ (?, ?, ?)
-opera (?, ?, ?)

But: when pasting to Konqueror or kword, characters are displayed just as typed.
To mozilla from 
-Eterm (?, ?, ?)
-gnotepad+ (?, ?, ?)
-kword (?, ?, ?)
-konqueror (?, ?, ?)


I'm running Mozilla 0.9.8 under Debian 2.2r5 (testing uptodate) with KDE. My
keyboard layout is set to Finnish (it didn't matter when I changed it to
default). I'm using iso-8859-1 and iso-8859-10 character encodings.
Does anyone have this problem with Redhat + KDE ? This happens on my 2
workstations, both Debian with KDE, but not with this RH72 laptop which also
runs KDE..

It seems that pasting from Mozilla to any KDE application works as well on
Debian (as with konqueror and kword, which I mentioned earlier).

I'll try to find out if there is some configuration difference between these
workstations
-> nsbeta1
Keywords: nsbeta1
> BlueFish (KOI8-R) -> Mozilla (CP1251) = same as AbiWord but where is also '%/1
koi8-r' in begining of pasted text.

When we get such results, the app only supports CT (COMPOUND_TEXT), and
the CT encodings "koi8-r" and "windows-1251" are not supported
in the target locale in your environment.

I believe we can see the same results on non-Mozilla apps.



Let me explain how copy&paste works in Mozilla, and then I'd like to ask you to
test again in your environment.

Mozilla can display any character because it is a Unicode-based
application and tries to look up proper fonts to display the
glyphs. Even when Mozilla starts in an English locale, it can display
Japanese glyphs.

However, copy&paste depends on the locales of Mozilla and the other
application.

Mozilla supports "UTF8_STRING". When the other application also supports
"UTF8_STRING", copy&paste should work.

When the other application does not support "UTF8_STRING", it most likely
uses "COMPOUND_TEXT". This should also work when Mozilla and the other app
are running in the same locale.
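
The preference order described above can be sketched in Python. This is only an illustration of the idea, not Mozilla's code: the function name `choose_target` and the tuple `PREFERRED_TARGETS` are hypothetical, and real X11 selection negotiation involves more targets and more steps than shown here.

```python
from typing import Iterable, Optional

# Preference order as described in the comment above: UTF8_STRING first,
# COMPOUND_TEXT as the locale-dependent fallback. (Hypothetical names.)
PREFERRED_TARGETS = ("UTF8_STRING", "COMPOUND_TEXT")


def choose_target(offered: Iterable[str]) -> Optional[str]:
    """Pick the first preferred target that the other application offers."""
    offered_set = set(offered)
    for target in PREFERRED_TARGETS:
        if target in offered_set:
            return target
    return None  # neither target is offered


print(choose_target(["COMPOUND_TEXT", "UTF8_STRING"]))  # UTF8_STRING
print(choose_target(["COMPOUND_TEXT"]))                 # COMPOUND_TEXT
```

Only the second case depends on the locale, which is why the locale mismatch matters for applications that do not offer "UTF8_STRING".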

What happens when Mozilla and the other apps are running in different locales,
e.g. the apps running in a French locale and Mozilla running in the POSIX locale?
Yes, Mozilla can display French accented characters. But when pasting
the characters from other apps into Mozilla, Mozilla tries to convert them
to "us-ascii" because it is running in the C locale. So ? is displayed.
When copying to other apps, Mozilla first converts UCS2 to "us-ascii" and then
builds the CT from that. So "e'" is pasted.
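
The effect of that lossy "us-ascii" step can be reproduced with a small Python sketch. It only imitates the symptoms reported in this bug and is not Mozilla's implementation: both function names are hypothetical, and the accent fallback table is an assumption chosen to match the "e'" and (a*, a", o") outputs reported here.

```python
import unicodedata

# Hypothetical ASCII stand-ins for combining accents (an assumption made for
# this sketch; the real fallback table may differ).
ACCENT_FALLBACK = {
    "\u0300": "`",   # combining grave accent
    "\u0301": "'",   # combining acute accent
    "\u0302": "^",   # combining circumflex accent
    "\u0308": '"',   # combining diaeresis
    "\u030a": "*",   # combining ring above
}


def paste_into_c_locale(text: str) -> str:
    """Paste direction: squeeze incoming text through us-ascii.

    Every character that us-ascii cannot represent becomes "?".
    """
    return text.encode("ascii", errors="replace").decode("ascii")


def copy_from_c_locale(text: str) -> str:
    """Copy direction: approximate accented letters with plain ASCII.

    The text is decomposed into base letters plus combining marks (NFD),
    and each known mark is replaced with an ASCII stand-in.
    """
    out = []
    for ch in unicodedata.normalize("NFD", text):
        if ch in ACCENT_FALLBACK:
            out.append(ACCENT_FALLBACK[ch])
        elif ord(ch) < 128:
            out.append(ch)
        else:
            out.append("?")
    return "".join(out)


print(paste_into_c_locale("pr\u00e9f\u00e9rences"))     # pr?f?rences
print(copy_from_c_locale("\u00e9\u00e9\u00e9\u00e9"))   # e'e'e'e'
```

Neither conversion can be undone afterwards, so the accents are already lost by the time the receiving side sees the text.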

As for the "%/1 koi8-r" problem, it is a system (X11) limitation of your
environment and not a Mozilla problem.


Sorry for asking you to test so many times, but please check again
in your environment and tell me the exact test cases when
"mozilla and other apps are running in the same locale".
nsbeta1- since everything already works when both run in the same locale
Keywords: nsbeta1 → nsbeta1-
Mmm...  I have not much experience in setting locale...  I have X11 setup for
us_intl keyboard and that's how I get my accented characters.  I have the
following setup:

GDM_LANG=en.ISO-8859-1
LC_CTYPE=

Can you help me correct my locale by pointing me to some URLs explaining all of
this?  I searched the web a little, but could not find something helpful on the
spot.

My config, RH7.1 + Ximian Gnome.  Essentially, I want to get all my apps to work
on the same locale.  And since I will not be the last to suffer from this problem,
I suggest that you type the solution in a FAQ so others would be able to correct
this problem.
Blocks: 144260
Trying to solve bug http://bugzilla.mozilla.org/show_bug.cgi?id=150958
I also solved this one!
Thanks to Masaki Katakai for the explanation on locales!

My config:

User-Agent: Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.0.0) Gecko/20020529
BuildID:    2002052918
OS:         Debian GNU/Linux 2.2 (stable, up to date)

I solved the problem by setting LC_CTYPE=fr_CH (french-speaking part
of Switzerland, we use ISO-8859-1) before running Mozilla. 
BTW it also solves the problem for the 'less' application. Interestingly,
'more', 'cat' and Navigator 4 always work well with accents.
LC_CTYPE does not need to be a global setting, I can 'export LC_CTYPE=fr_CH'
just before 'mozilla&', so it is probably not related to the window manager
(fvwm 2.2.4) or X (XFree86 3.3.6).

So, is it a Mozilla (and 'less') bug? I don't know, Unix locale is one of the
nearest things to black magic I know of. A release note would be welcome.

I never played with locale before. Here is the configuration I used until now,
given by the 'locale' program:
LANG=POSIX
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_ALL=

And here is my new configuration, correcting this bug on Mozilla:
LANG=POSIX
LC_CTYPE=fr_CH
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_ALL=

Note that I use a much older system than most of you (Debian 2.2),
so this could be a working solution for everybody using ISO-8859-1.
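
Why the single LC_CTYPE line is enough can be sketched in Python. The helper below is hypothetical and only models the POSIX lookup order for the character-type category (LC_ALL first, then LC_CTYPE, then LANG, with empty values treated as unset); it does not call the system locale machinery.

```python
def effective_lc_ctype(env: dict) -> str:
    """Return the locale name that governs LC_CTYPE for a given environment.

    POSIX precedence: LC_ALL, then LC_CTYPE, then LANG. Unset or empty
    variables are skipped; if nothing is set, the "POSIX" (C) locale applies.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name, "")
        if value:
            return value
    return "POSIX"


# The two configurations listed above, reduced to the variables that matter.
before = {"LANG": "POSIX", "LC_ALL": ""}
after = {"LANG": "POSIX", "LC_CTYPE": "fr_CH", "LC_ALL": ""}

print(effective_lc_ctype(before))  # POSIX
print(effective_lc_ctype(after))   # fr_CH
```

Because LC_ALL is empty, the explicit LC_CTYPE takes priority over LANG for character handling, while the other categories keep following LANG=POSIX.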
How strange, Mozilla thinks that this page uses 'Windows-1251' charset
and I see all French accented characters 'àéè' in cyrillic!
Bug in the charset auto-detection?
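
That symptom is what one would expect if ISO-8859-1 bytes are decoded as Windows-1251. A minimal Python sketch, assuming the page bytes really are ISO-8859-1 (the thread does not confirm this):

```python
# Bytes for "àéè" as an ISO-8859-1 page would store them.
latin1_bytes = "\u00e0\u00e9\u00e8".encode("iso-8859-1")

# The same bytes read with the wrong and with the right charset.
print(latin1_bytes.decode("cp1251"))      # айи
print(latin1_bytes.decode("iso-8859-1"))  # àéè
```

The bytes are unchanged in both cases; only the charset used to interpret them differs.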
Running Red Hat 9 and using UTF-8 as encoding, everything works fine now.  I can
copy accented characters between Mozilla and Linux terminals (xterm and
gnome-terminal) without a problem.

Does anybody still suffer from this problem?  Or should this bug be closed?
I understand that this problem happens when Mozilla is running in C locale and
still exists. No problem happens in non-C locale. I've updated the Summary.
Summary: Linux: Cut & paste of accents fails from/to external application → Linux: Cut & paste of accents fails from/to external application in C locale
Who uses locale C these days?  With UTF-8, this bug has become irrelevant.  Should we close it?
QA Contact: amyy → i18n
> Who uses locale C these days?  With UTF-8, this bug has become irrelevant. 

> Should we close it??
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → WONTFIX