137659 - wide-char (kanji) cut-n-paste gives wrong value (non-functional)

Reporter

Description

23 years ago

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; SunOS i86pc; en-US; rv:0.9.9) Gecko/20020312
BuildID: 2002031212

Instead of loading the proper wide-char byte value into the clipboard, mozilla puts a meaningless '?' char in there for each wide char (e.g. a kanji char). This type of action works fine in netscape 4.x.

Reproducible: Always

Steps to Reproduce:
Under netscape 4.x, if you highlight the first kanji char on the page, paste it into
cat >/tmp/pasteoutput
and examine the results, you will see the appropriate result: a 2-byte char, 0x36c0.
If you do the same thing under mozilla, you get a SINGLE char, 0x3f, which is nothing more than '?'.

I don't think this is a problem with the underlying gtk libraries, since "kanjipad", built on my system and using the same shared libs, handles wide-char cut-n-paste just fine. So I'd guess it's a problem with the way mozilla is using cut-n-paste.

Note: My locale is C. No, I shouldn't have to change my locale to use cut-n-paste properly.

Actual Results: '?' gets put in the clipboard.

Expected Results: 0x36c0 should be in the clipboard.

I rate this as a "major" feature lack, because without it, you cannot use cut-n-paste to translate pages.

Miles Bader

Comment 1

23 years ago

I have the same problem on a Debian GNU/Linux system. My main wish is to cut/paste Japanese text between mozilla and emacs, but all the text from mozilla has a `?' in place of any non-ASCII character. Surprisingly, the same thing is true of Latin-1 text too!

Emacs expects text pasted from X to be encoded using the X `compound-text-with-extensions' or `compound-text' encodings.

Going in the opposite direction results in what appears to be the raw encoding being inserted into the mozilla input box (i.e., line-noise).

[My locale is `C']

phil

Reporter

Comment 2

22 years ago

It turns out that if you set LANG=en_US.UTF-8 or something like that before starting mozilla, then cut-n-paste works.

However, this is not acceptable. I don't WANT my LANG set that way. Nor should I have to set it that way. If mozilla can *display* something, it should be able to put that same thing in the clipboard, regardless of what LANG is set to.

Even plain old xterm can handle cat >/tmp/pastefile to accept non-C-locale cut-n-paste, so mozilla should be able to generate appropriate cut-n-paste, regardless of locale settings.

Miles Bader

Comment 3

22 years ago

> If you set LANG=en_US.UTF-8 or something like that before
> starting mozilla, then cut-n-paste works.
> However, this is not acceptable.

Yes, I think this is clearly true -- X has the mechanisms for handling multi-language cut-n-paste, so there's no reason for mozilla to be dependent on the locale. Users shouldn't have to fiddle with arcane settings to get this stuff to work (especially since changing the locale affects other things too, which might be undesirable).

Kenneth Herron

Comment 4

22 years ago

Phil, can you reproduce this with a current version of mozilla? Exactly what program are you pasting into? How are you performing the copy/paste? Are you using the PRIMARY or the CLIPBOARD (see <http://www.jwz.org/doc/x-cut-and-paste.html>)?

I loaded the sample URL into mozilla, selected some kanji text, and tried pasting it into several other things with the following results. In each case I was copying & pasting through the PRIMARY.

1) Pasting into a mozilla textarea (this "additional comments" field) produced kanji.
2) Pasting into a mulberry (email reader) message-composer window produced a string of \x{8749}\x{6642} sequences.
3) Pasting into a form in konqueror produced kanji.
4) Pasting into konsole (KDE's terminal app) produced question marks.
5) Pasting into gnome-terminal or xterm produced nothing.
6) Pasting into other terminals (eterm, kterm, and rxvt) also produced \x{8749}\x{6642} sequences.

Loading the page into konqueror and pasting from there to another program produced even worse results. I got question marks pasting into any terminal program, and nothing at all pasting into mozilla.

Loading the page into galeon seemed to work about as well as loading into mozilla, though I didn't test it very far.

phil

Reporter

Comment 5

22 years ago

Current build still has issues. Here are even more explicit details:

Mozilla 1.5a, Copyright (c) 2003 mozilla.org, build 2003062101
LANG=C ./mozilla

/opt/sfw/bin/xterm -version
XFree86 4.0.1c(146)

LANG=en_US.UTF-8 /opt/sfw/bin/xterm
[select kanji with mouse, middlebutton into xterm] gets ???
[select kanji from same page, netscape 4.7] gets raw kanji codes successfully

LANG=en_US.UTF-8 /usr/openwin/bin/xterm
[select kanji with mouse, middlebutton into xterm] gets ???
[select kanji from same page, netscape 4.7] gets raw kanji codes successfully

Technically, the program I "really" care about is kdrill :-)
http://www.bolthole.com/kdrill/
But that's my own program, so I thought I'd use a more standard prog as an official example. I can start kdrill in 'C', and have the same behaviour as above.

LANG=C kdrill
[works from netscape4.7, not from moz1.5a]

Code in kdrill is just

clipboardatom=XA_PRIMARY;
XtGetSelectionValue(widget, clipboardatom, XA_TEXT(display), copybuffer, NULL, timestamp);

Andrew Schultz

Comment 6

22 years ago

Same problem still occurs with Linux as described in comment 4. Changing OS to Linux to get more attention. Marking NEW ==> intl.

Assignee: blaker → smontagu

Status: UNCONFIRMED → NEW

Component: XP Apps: Drag and Drop → Internationalization

Ever confirmed: true

OS: Solaris → Linux

QA Contact: tpreston → ylong

Hardware: Sun → PC

Phil Ringnalda (:philor)

Updated

15 years ago

QA Contact: amyy → i18n

Andrei Purice

Comment 7

3 years ago

Marking this as Resolved > Incomplete since the last real activity on this issue was 19 years ago and it might not be relevant anymore.
Feel free to re-open it if it's not the case and the issue is still relevant.

Status: NEW → RESOLVED

Closed: 3 years ago

Resolution: --- → INCOMPLETE

Categories

(Core :: Internationalization, defect)

People

(Reporter: phil, Assigned: smontagu)

Security

(public)
