Closed Bug 163908 Opened 22 years ago Closed 22 years ago

After copied Non-ascii characters from websites, Non-ascii chars are displayed as garbage

Categories

(Core :: Internationalization, defect)

PowerPC
macOS
defect
Not set
normal

VERIFIED FIXED

People

(Reporter: teruko, Assigned: nhottanscp)

Details

(Keywords: intl, topembed)

Attachments

(1 file, 1 obsolete file)

This is related to bug 108029.

After I copied Japanese characters from Japanese websites, Japanese characters
in clipboard are displayed as garbage.

Steps to reproduce
1. Go to http://www.yahoo.co.jp
2. Copy some Japanese characters
3. Open the Clipboard

Actual result 
Japanese characters are displayed as garbage.

Expected result
Japanese characters are displayed correctly in clipboard.

This does not happen with IE.

Tested with the 8-19 1.01 branch and 8-17 1.1 branch MacOS X builds.
Keywords: intl, topembed
QA Contact: ruixu → teruko
dan, this looks more like your exact issue, no?
This happens with Chinese and other non-Latin pages as well. Some MacRoman
garbage is shown in the Clipboard, although whatever is copied is rendered fine
in TextEdit or Pepper. Perhaps Clipboard needs to have data served in a special
fashion.
Changed the summary.
Summary: After copied Japanese characters from websites, Japanese chars are displayed as garbage → After copied Non-ascii characters from websites, Non-ascii chars are displayed as garbage
teruko:
how about the trunk? can you reproduce this on the trunk?

TextEdit may support utxt and clipboard may not support utxt

reassign to nhotta

is this a bug needed by an embedding customer based on the m1.0 branch?
if the trunk works fine, then we should probably land the patch for bug 108029
on the branch to solve it.
how can we know this is a bug in Mozilla instead of the clipboard?
the other possibility is that we do not put the style for unicode out.
so, let's be specific. The major issue is how the "clipboard" displays it in this
bug, right?
frank, correct. the issue is how the finder displays the data in text clippings
and the clipboard. from all reports, pasting the data to other applications
yields the correct behavior (but that's just hearsay, i don't know it for a fact).

this does not work on either the branch or the trunk. 108029 does not fix this
issue which is why it was spun out into another bug.
Maybe the Finder clipboard window doesn't display Unicode? Does this behaviour
change in 10.2?
reassign to nhotta
nhotta, please first try to produce a fix for trunk and later move it to branch
according to pinkerton's request

here is what I found.
I did the following:
1. go to www.yahoo.co.jp
2. select all with command-a
3. copy with command-c
4. go to the Finder
5. open Edit: Show Clipboard

You will see the text displayed as garbage. Notice that at the bottom of the
clipboard window, the status line says
"Clipboard contents: text"

Now, if I paste the text into TextEdit, all the text shows correctly.
Then, in TextEdit, I select all, copy, and go to the clipboard.
Now the clipboard displays the Japanese correctly, and the status line says
"Clipboard contents: styled text"

Now, if I use IE, the following happens:
1. in the clipboard, the status line says "Clipboard contents: styled text"
2. in the clipboard, the content shows up as styled text, which means the font
color, style, underline, etc. show up in the clipboard
3. if I paste into TextEdit, those styles are pasted as well

So... I think the issue is that we do not generate 'styl' data now, but that
will be a much harder problem to solve. I want to clearly understand what kind
of support is really needed in the short term:
1. just make the clipboard show Japanese, without font color, underline, etc.
2. font color, font face, and underline should also show up in the clipboard

For 1, I think we can produce a patch in one week: what we can do is generate
the 'styl' using the default (per script) font family, font size, font color,
and font style. For 2, I think it is at least a one-month job, because that
requires retrieving resolved style information from layout.
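
A minimal sketch of option 1, under stated assumptions: the text has already been split into script runs, and each run gets one style entry filled with a default font for that script. Every name here (`Script`, `ScriptRun`, `StyleEntry`, `defaultStyleFor`, `buildDefaultStyles`) and every font ID is hypothetical; the entry is only loosely modelled on a 'styl' element and this is not the code from the attached patch.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical script tags; these are not the Script Manager's ScriptCode values.
enum class Script { Roman, Japanese };

// A run of text in a single script, starting at an offset into the 'TEXT' data.
struct ScriptRun {
  std::int32_t startOffset;
  Script script;
};

// Simplified style entry, loosely modelled on one element of a 'styl' record.
struct StyleEntry {
  std::int32_t startChar;
  std::int16_t fontID;
  std::int16_t fontSize;
};

// Placeholder defaults; a real implementation would ask the system for the
// default font of each script instead of hard-coding IDs.
StyleEntry defaultStyleFor(Script script) {
  if (script == Script::Japanese) {
    return {0, 100, 12};
  }
  return {0, 1, 12};
}

// Emit one style entry per script run, using only per-script defaults.
// No resolved style information from layout is consulted (that is option 2).
std::vector<StyleEntry> buildDefaultStyles(const std::vector<ScriptRun>& runs) {
  std::vector<StyleEntry> styles;
  styles.reserve(runs.size());
  for (const ScriptRun& run : runs) {
    StyleEntry entry = defaultStyleFor(run.script);
    entry.startChar = run.startOffset;
    styles.push_back(entry);
  }
  return styles;
}
```

The point of the sketch is that option 1 needs only the script of each run, which is why it is so much cheaper than option 2.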
Assignee: ftang → nhotta
So it sounds like the Finder's Clipboard window is showing old-style text + styl
data. I think what really matters is what happens when you paste into a
non-unicode destination (since we know that pasting unicode works).
I tested this in the 822 trunk MacOS X build. I can still reproduce this.
The fix for bug 108029 applies only to the Mac OS 9.x build, not the MacOS X build.
Are you sure? I don't see any TARGET_CARBON #ifdefs there.
I am not sure, but I can see the non-ASCII characters in the clipboard after I
copied from the 8.x build only (1.1 and trunk builds).
If you are using a double-byte based OS 9.x to copy a double-byte string into
the clipboard, you won't see the problem in bug 108029.
Using a US OS 9.x with a language pack, you will see the problem in the branch
build, but it is fixed in the trunk build.

To answer the question in comment #8:
Mac 10.2 still shows "text", not "styled text", in the clipboard.
I changed the code to create 'styl' and the Finder clipboard shows Japanese. So
we can use this approach.

But I think the Finder also has a problem: it does not show 'utxt', and it does
not show 'TEXT' using the system's locale. This can be reproduced using TextEdit
in plain text mode. In the Japanese locale, copying Japanese plain text from
TextEdit shows garbage in the Finder clipboard.
Mike, please review the patch, thanks.
Status: NEW → ASSIGNED
Xianglan, please test if this is landed in trunk build while I am on vacation.
QA Contact: teruko → ji
+  ScriptCodeRun *scriptCodeRuns = (ScriptCodeRun *)
nsMemory::Alloc(sizeof(ScriptCodeRun) * aUnicodeStrLen);

nit. other parts of the patch use NS_REINTERPRET_CAST, this probably should too.

-                                            kUnicodeUseFallbacksBit | 
+                                            kUnicodeUseFallbacksMask | 
                                             kUnicodeLooseMappingsMask |
-                                            kUnicodeKeepSameEncodingBit,
+                                            kUnicodeTextRunMask,

are you sure this is what you want? usually one doesn't 'or' masks together
like this.

+  PRInt32 scrpRecLen = sizeof(short) + sizeof(ScrpSTElement) * scriptRunOutLen;

nit. this should be const to indicate to the reader it won't change.

+  GrafPtr savePort;
+  ::GetPort(&savePort);
...
+  ::SetPort(savePort);

Why are you saving and restoring the port? I don't see anything between these
calls to change the port. You should probably also ensure that you have a valid
port. There are utils in gfx to do this.

looks good besides this. does this patch fix the problem on both os9 and osx?
thanks for jumping on this so quickly!
+  GrafPtr savePort;
+  ::GetPort(&savePort);
...
+  ::SetPort(savePort);

I am changing font in order to get fontInfo. So I restore the port after the
loop. Is that necessary?

+    ::TextFont(fontFamilyID);
+    ::TextSize(textSize);
+    ::TextFace(textStyle);
+    ::GetFontInfo(&fontInfo);
saving/restoring the port only resets what the current port is, not any of its
settings. so if you want the port to be in the state it was before you made
changes, you need to save those changes individually and restore them.
That's right, the code is actually modifying font setting of the current port. I
will change to save/restore the font setting.
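
A rough sketch of the save/restore idea, assuming a mock port type rather than QuickDraw itself: `MockPort`, `FontStateSaver` and `measureWith` are hypothetical names, and the measurement is a placeholder for the `GetFontInfo` call in the patch. It illustrates the point above that the font, size and face must each be saved and restored, since restoring the current port alone would leave them changed.

```cpp
// Mock drawing port for illustration only; this is not QuickDraw's GrafPort.
struct MockPort {
  short font = 0;
  short size = 12;
  unsigned char face = 0;
};

// Saves the three font settings on construction and restores them on
// destruction, so every exit path puts the port back the way it was.
class FontStateSaver {
 public:
  explicit FontStateSaver(MockPort& port)
      : port_(port), font_(port.font), size_(port.size), face_(port.face) {}

  ~FontStateSaver() {
    port_.font = font_;
    port_.size = size_;
    port_.face = face_;
  }

  FontStateSaver(const FontStateSaver&) = delete;
  FontStateSaver& operator=(const FontStateSaver&) = delete;

 private:
  MockPort& port_;
  short font_;
  short size_;
  unsigned char face_;
};

// Temporarily changes the font settings to take a measurement.
// Returning the size is a stand-in for querying real font metrics.
short measureWith(MockPort& port, short font, short size, unsigned char face) {
  FontStateSaver saver(port);
  port.font = font;
  port.size = size;
  port.face = face;
  return port.size;
}
```

After `measureWith` returns, the port's font, size and face hold their original values again, whatever was set inside the function.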
-                                            kUnicodeUseFallbacksBit | 
+                                            kUnicodeUseFallbacksMask | 
                                             kUnicodeLooseMappingsMask |
-                                            kUnicodeKeepSameEncodingBit,
+                                            kUnicodeTextRunMask,

I think this is right according to Apple's documentation.
http://developer.apple.com/techpubs/macos8/TextIntlSvcs/TextEncodingConversionManager/TEC1.5/TEC.6d.html
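
One possible reading of the Bit to Mask change, sketched with made-up constants: a "Bit" name is a bit position, a "Mask" name is the value with that bit set, and only masks are meant to be OR-ed into a flags word. The `kExample...` names and their values are assumptions for illustration and are not copied from Apple's headers.

```cpp
#include <cstdint>

// Hypothetical bit positions (assumed values, for illustration only).
constexpr std::uint32_t kExampleUseFallbacksBit = 0;
constexpr std::uint32_t kExampleLooseMappingsBit = 5;
constexpr std::uint32_t kExampleTextRunBit = 7;

// The matching masks: one set bit at each position.
constexpr std::uint32_t kExampleUseFallbacksMask = 1u << kExampleUseFallbacksBit;
constexpr std::uint32_t kExampleLooseMappingsMask = 1u << kExampleLooseMappingsBit;
constexpr std::uint32_t kExampleTextRunMask = 1u << kExampleTextRunBit;

// True when every bit of 'mask' is set in 'flags'.
constexpr bool hasFlag(std::uint32_t flags, std::uint32_t mask) {
  return (flags & mask) == mask;
}

// Mixing a bit position with a mask: position 0 contributes nothing to the
// OR, so the fallbacks option is silently left off.
constexpr std::uint32_t mixedFlags() {
  return kExampleUseFallbacksBit | kExampleLooseMappingsMask;
}

// OR-ing masks only: each requested option ends up set.
constexpr std::uint32_t maskFlags() {
  return kExampleUseFallbacksMask | kExampleLooseMappingsMask | kExampleTextRunMask;
}
```

With these assumed values, `hasFlag(mixedFlags(), kExampleUseFallbacksMask)` is false while `hasFlag(maskFlags(), kExampleUseFallbacksMask)` is true.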


I have only tested on MacOSX; I am going to build for OS9 and find an
environment where I can test it.
It works on MacOS9 (after some modifications).
The Finder clipboard shows Japanese on OS9 with the language kit.
Attachment #96915 - Attachment is obsolete: true
Mike, please review the patch, thanks.
Comment on attachment 97102
Changed to address review's comment.

r=pink

+#endif // nsStylClipboardUtils_h___
\ No newline at end of file

just fix the lack of newline here at the end of the file.
Attachment #97102 - Flags: review+
Simon, could you 'sr'?
Comment on attachment 97102
Changed to address review's comment.

sr=sfraser
Attachment #97102 - Flags: superreview+
checked in to the trunk
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
verified as fixed with 08/30 and 09/03 trunk build on Mac OS 9.1, Mac OS 10.1
and Mac OS 10.2
Status: RESOLVED → VERIFIED
Comment on attachment 97102
Changed to address review's comment.

a=chofmann for 1.0.2
Attachment #97102 - Flags: approval+
checked in to 1.0 branch, adding fixed1.0.2
Keywords: fixed1.0.2
Verified this in 09-09 1.0 branch build on MacOS 10.1.5 and 10.2.
Keywords: verified1.0.2
posthumous adt1.0.1+.