17169 - Japanese string display applets show only dots not real characters

Reporter

Description

•

25 years ago

** Observed with 10/24/99 Win32 build w/ JDK 1.3 Beta Java Plugins **

The applet at the above URL displays both Japanese characters and the English word "Applet", but with the setup indicated above, Japanese strings show up only as dots while the English string "Applet" is displayed.

To see this bug, follow the steps below:
1. Install the newest Java Plugins as explained in Bug 14202.
2. Start Mozilla and set the Character Set to Japanese (Shift_JIS). (It will probably freeze Mozilla if you set the auto-detect to Japanese, so for now don't try that. --> this should be another bug.)
3. Now visit the above URL and wait for the applet to load.
4. You see that strings begin to scroll, but Japanese characters appear simply as dots while the English word "Applet" is visible.

You can try other string display applets indicated by links at the above URL. Most of them should run and will display English strings, but none will display Japanese characters. Conversion failure? Font information lacking? Widget not supporting multi-byte characters?

By the way, most applets at the above URL actually run ** without the newest Java Plugins mentioned in Bug 14202 **, but the results are still the same -- Japanese characters display only as dots.

Who should get this bug? Initially assigned to edburns.

Katsuhiko Momoi

Reporter

Updated

•

25 years ago

QA Contact: leila.garin → amasri

edburns

Comment 1

•

25 years ago

How do I set the character set to Japanese?

Katsuhiko Momoi

Reporter

Comment 2

•

25 years ago

You need to use the "View | Character Set" menu to choose Japanese (Shift_JIS). Right now the menu is not scrollable and some monitors will have a problem displaying all the menu items. I have a workaround posted here for the unscrollable menu: http://www.mozilla.org/quality/intl/charmenufix.html Under Method 2, you can use the Navigator modifications suggested under the section "For Navigator window (M10 only):" and change the 2 files mentioned in it. After this modification, you will find the Japanese character sets under the "Character Set: Multibyte" menu.

Katsuhiko Momoi

Reporter

Comment 3

•

25 years ago

It says "M10 only" in the document but it works with M11 builds also (e.g. today's build).

edburns

Updated

•

25 years ago

Assignee: edburns → ryang

edburns

Comment 4

•

25 years ago

You need to follow the steps in the Method 2 section of http://www.mozilla.org/quality/intl/charmenufix.html then run mozilla and go to View->Character Set Multibyte and choose Japanese, Shift_JIS. Then visit the URL http://ss4.inet-osaka.or.jp/~athushin/ScrollString1.html

Katsuhiko Momoi

Reporter

Comment 5

•

25 years ago

Hopefully within a few days, a temporary charset workaround will be checked into the source so that you don't have to do these modifications manually.

Robert H. Yang

Updated

•

25 years ago

Assignee: ryang → edburns

Robert H. Yang

Comment 6

•

25 years ago

I've got a Nov-9 build of Mozilla and I did the following:
1. Started up with the Moz. home page.
2. Used menu options View->Character Set Multibyte->Japanese (Shift_jis)
3. Went to smoothscroll page.

The text scrolled by the applet looks like the following: ".... Applet ..." (My rendering of the number of periods is not accurate 8)

I did some debugging in the Java Plugin code, and apparently nsIPluginTagInfo2::GetParameters() is used to retrieve the parameters for the applet. The text of the parameters, which should have contained the UTF-8 encoding of the Japanese characters, actually contains periods ('.'). In order for this kind of page to work, the parameters retrieved by nsIPluginTagInfo2::GetParameters() will need to be in UTF-8 encoding. So the implementation of this method will have to convert the parameters from whatever encoding is present in the HTML (shift_jis in this case) to UTF-8, because the Java Plugin creates Unicode strings for display from UTF-8 chars.

Re-assigning to Ed. My apologies if I am incorrect in doing so!

edburns

Updated

•

25 years ago

Assignee: edburns → ryang

edburns

Comment 7

•

25 years ago

I have verified that this bug does indeed occur with the latest J2SDK install. Robert, is there some codepage problem?

Robert H. Yang

Comment 8

•

25 years ago

I doubt it's a codepage problem. Mozilla clearly has the ability to translate between encodings, if I can trust the menu options. OJI has to convert from an MBCS string in, say, shift_jis encoding to one in UTF8 encoding. The chars have to end up as Unicode for display in Java, right ?

Katsuhiko Momoi

Reporter

Comment 9

•

25 years ago

Are you really using UTF-8 in Java display? Not UCS2/Unicode?

Robert H. Yang

Comment 10

•

25 years ago

The native code in the Java Plugin uses NewStringUTF() in JNI to convert from a char string to a Java string (which is always Unicode). This is because the OJI method mentioned above returns char strings (instead of wchar_t, or whatever XPCOM defines for Unicode). Recall that UTF8 is simply an 8-bit encoding of Unicode, whereas UCS2 is a 16-bit encoding. In UTF8 a 16-bit Unicode char may take up one to three bytes, depending on its value in Unicode. You can't assume that 1 byte == 1 character, but the data type used for passing these things around is still char*. Makes life interesting. Any way you do this, the text will have to be converted from whatever it is on the html page to some form of Unicode for use by Java, whether that be UTF8 or UCS2.
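To make the byte counts in the comment above concrete, here is a minimal C++ sketch of how many bytes standard UTF-8 needs per code point. The function name `utf8Length` is an illustrative assumption and is not part of JNI, OJI, or Mozilla. JNI's NewStringUTF actually expects "modified UTF-8", which differs in a few cases (for example U+0000); that variant is not modeled here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of bytes standard UTF-8 uses to encode
// one Unicode code point. Not an API of JNI, OJI, or Mozilla.
std::size_t utf8Length(char32_t cp) {
    assert(cp <= 0x10FFFF);      // outside the Unicode range
    if (cp < 0x80)    return 1;  // ASCII
    if (cp < 0x800)   return 2;  // e.g. Latin-1 supplement, Greek
    if (cp < 0x10000) return 3;  // rest of the 16-bit range, incl. kana and kanji
    return 4;                    // supplementary planes
}
```

Under this sketch an ASCII letter takes one byte while a kanji such as U+65E5 takes three, which is why a char* buffer cannot be indexed as one byte per character.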

leila.garin

Updated

•

25 years ago

Component: Java APIs to WebShell → OJI

Robert H. Yang

Updated

•

25 years ago

Assignee: ryang → edburns

edburns

Updated

•

25 years ago

Status: NEW → ASSIGNED

edburns

Comment 11

•

25 years ago

Robert, are you suggesting an API change for communication between mozilla and the Java Plugin?

bobj

Comment 12

•

25 years ago

I'm ignorant of OJI, but assuming that the OJI methods expect all strings to be char* encoded in UTF-8, then whatever code calls these methods should convert from the encoding of the HTML page to UTF-8 before invoking the OJI method. Converters from most common charsets to UTF-8 already exist in the browser. If this is correct, I don't think you need any API changes.

Katsuhiko Momoi

Reporter

Comment 13

•

25 years ago

Hi, Ed, Could you possibly set the Target Milestone on this? It would be nice if we can have Java plugins working for Beta1 and then this JPN string data to also show correctly. I have a feeling that this could be solved quickly once we know what conversion to perform and where.

edburns

Comment 14

•

25 years ago

momoi: done.

Target Milestone: M15

bobj

Comment 15

•

25 years ago

Ed, M15 means post-Beta1 (in current definition). As momoi-san states, we really want Java plug-ins to work for Japanese in Beta1.

edburns

Comment 16

•

25 years ago

Ok, but the delivery of this fix depends on Sun and Netscape coming up with a mutually acceptable release vehicle for the Java Plugin that includes the necessary changes to run in mozilla.

Depends on: 16296

Target Milestone: M15 → M14

Katsuhiko Momoi

Reporter

Comment 17

•

25 years ago

The Hitsuji Applet Farm -- the JPN test case page -- has moved, so I updated the link above. It is better and bigger than before. The above URL is for "Horizontal Scroll". In addition, he has some "Vertical Scroll" applets at: http://www.hitsuji.to/VerticalScroll1.html (Watch out: Mozilla seems to hang on this page a lot.) Also, there are special effects character display applets at: http://www.hitsuji.to/SpecialString1.html Java is ON today (2/24/2000 build). Seems to change from day to day, though.

URL: http://ss4.inet-osaka.or.jp/~athushin... → http://www.hitsuji.to/ScrollString1.html

Frank Tang

Comment 18

•

25 years ago

This looks like an illegal call of nsString::ToNewCString (because ToNewCString will convert U+0100 - U+FFFF to '.'). When you try to locate the problem code, first apply the patch I listed in bug 28424 (remember to change the DEBUG_ftang to your name), rebuild xpcom/ and visit that page. You should get an assert at the point where the data gets damaged.
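The kind of data loss described above can be sketched in a few lines of C++. This is only an imitation of the behavior the comment describes, under the assumption that every 16-bit unit above U+00FF is replaced by '.'. The function `narrowLossy` is a hypothetical name and is not Mozilla's actual ToNewCString code.

```cpp
#include <string>

// Hypothetical imitation of a lossy 16-bit -> 8-bit narrowing.
// Assumption taken from the comment above: units above U+00FF become '.'.
std::string narrowLossy(const std::u16string& in) {
    std::string out;
    out.reserve(in.size());
    for (char16_t unit : in) {
        if (unit <= 0xFF) {
            // Fits in one byte: copy the low 8 bits unchanged.
            out.push_back(static_cast<char>(static_cast<unsigned char>(unit)));
        } else {
            // No single-byte representation: the character is lost.
            out.push_back('.');
        }
    }
    return out;
}
```

With this sketch, a parameter holding two kanji followed by " Applet" comes out as ".. Applet", which matches the dots seen in the scrolling applet.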

edburns

Comment 19

•

25 years ago

Raju, can you also test this in addition to testing 21305.

Whiteboard: ETA of ETA 03/01/00

edburns

Comment 20

•

25 years ago

Raju tested this on a Japanese Windows NT system and he also observed seeing the dots instead of the Kanji. I'm assigning this to Stanley for investigation. Stanley, you can assign it back to me if you don't have time for it.

Assignee: edburns → stanley.ho

Status: ASSIGNED → NEW

Whiteboard: ETA of ETA 03/01/00 → ETA after Beta1

stanley.ho

Assignee

Updated

•

25 years ago

Assignee: stanley.ho → edburns

stanley.ho

Assignee

Comment 21

•

25 years ago

Ed, sorry, I don't have time to investigate this, so I will reassign it to you. By the way, nsIPluginTagInfo::GetParameter will only take UTF string, so if MBCS strings are passed, it won't be recognized by the plug-in.

stanley.ho

Assignee

Comment 22

•

25 years ago

I tried out the latest JRE on a Japanese NT machine. If character encoding was not enabled, I saw the applet showing garbage English characters. I saw the same set of garbage English characters in the PARAM tag when I clicked "View->Source", so the browser passed the characters from the PARAM tags into the JRE correctly. However, if Japanese encoding is enabled, I saw "...." displayed in the applet. If I clicked "View->Source", I saw the Japanese characters instead of "...".

Java Plug-in obtains the parameters from nsIPluginTagInfo2::GetParameters as UTF-8 strings. On the Japanese machine with Japanese character encoding enabled, the browser tried to pass in multi-byte characters instead, and this causes a problem for Java Plug-in because it tries to interpret the string as UTF-8. This confirms the theory made by bobj@netscape.com. To solve this problem, the browser should convert all strings from its encoding into UTF-8 before passing them into the JRE.

This problem occurs not only in the Java Plug-in but also in plug-ins in general, so I will reassign this bug to the plug-in category. Basically, this problem
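The mismatch described above, multi-byte Shift_JIS data being read as if it were UTF-8, can be illustrated with a small structural check. This is a sketch only: `isValidUtf8` is a hypothetical helper, it checks lead and continuation bytes but does not reject every overlong or surrogate form, and it is not what the Java Plug-in actually runs.

```cpp
#include <cstddef>
#include <string>

// Hypothetical structural UTF-8 check: verifies lead bytes and the
// number of continuation bytes. Not a full validator.
bool isValidUtf8(const std::string& bytes) {
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;  // continuation bytes expected after the lead
        if (lead < 0x80)                      extra = 0;
        else if (lead >= 0xC2 && lead <= 0xDF) extra = 1;
        else if (lead >= 0xE0 && lead <= 0xEF) extra = 2;
        else if (lead >= 0xF0 && lead <= 0xF4) extra = 3;
        else return false;  // stray continuation byte or invalid lead byte

        if (i + extra >= bytes.size()) return false;  // sequence is truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) return false;  // must look like 10xxxxxx
        }
        i += extra + 1;
    }
    return true;
}
```

For example, the two kanji U+65E5 U+672C are the bytes 93 FA 96 7B in Shift_JIS, which this check rejects at the first byte, while the UTF-8 form E6 97 A5 E6 9C AC passes.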

Assignee: edburns → av

Component: OJI → Plug-ins

QA Contact: amasri → shrir

bobj

Comment 23

•

25 years ago

stanley.ho@eng.sun.com, Your last sentence in your last comment was cut off. Was there more that you intended to write? Thx.

stanley.ho

Assignee

Comment 24

•

25 years ago

No, that's all I wanted to write. The last line was there by mistake. By the way, would anyone be able to tell me what exactly the behavior of the JVM should be when the character encoding changes in Navigator 3/4? We would like to provide similar behavior in Java 2 through OJI, but we need more information on this issue because there are lots of differences between the NS JVM and Sun's Java 2.

Katsuhiko Momoi

Reporter

Comment 25

•

25 years ago

In 4.x, when Java outputs to the Browser window (w/ AWT widgets), it converts from Unicode to what the window says its charset is. The charset info could be in the form of "document meta charset tag", "Http server charset", or "user set menu choice". I think they amount to the same thing, i.e. the current default charset of that window.

stanley.ho

Assignee

Comment 26

•

25 years ago

So, does it mean that the JVM will always do the conversion according to the default character encoding of the current page, and doesn't support changing the character encoding on the fly? If this is the case, it will make our life much easier. We will still need the bug fix for 16296 on the browser side, so the OJI plug-in will always take a UTF-8 string and pass it directly to the JVM, without worrying about character encoding.

Katsuhiko Momoi

Reporter

Comment 27

•

25 years ago

Yes, that was true for 4.x. The real question is, does that same consideration now apply in Mozilla? Mozilla uses Unicode for layout, so why/where would it be necessary to output to a native encoding? For input from web documents, we need to be aware of the document charset, but for output? I'm not sure. The other question I have is what advantage do we have in being able to change output encoding on the fly? Is that something we should look at before deciding what to do?

Katsuhiko Momoi

Reporter

Comment 28

•

25 years ago

Sorry, I think there is a mistake in the way I described the interaction with JVM. I think this is ftang's area of knowledge and I'll leave it up to him to supply correct info. In the meantime, you can look at Frank Tang's presentation for Netscape Navigator and Java on this page. This should make things clearer: http://people.netscape.com/ftang/paper/unicode9/page0006.htm

stanley.ho

Assignee

Comment 29

•

25 years ago

For OJI, the consideration applies when nsIPluginTagInfo2::GetParameter is called. This method is for the OJI plug-in to obtain the strings that are specified in the PARAM tag inside the APPLET tag. Currently, we accept the string as UTF-8, not Unicode. Thus, as long as Mozilla converts the string from Unicode to UTF-8 properly using the default encoding before passing it to OJI, we will be okay. We need to support UTF-8 because the same plug-in will also work with Navigator 3/4 through the backward adapter, and in this case, the string is always char* in old browsers.

IMO, I don't think changing output encoding on the fly gives us any advantage, because it only happens when you try to change the encoding of the page to a non-default one. Actually, this is difficult to implement as well, because we can't force the applet to dump any string that it obtained from Applet.getParameter() and reload the string when the encoding changes. The cost of supporting it will definitely be much higher than the benefit. If I could choose, I would not want to support encoding changes on the fly.

Frank Tang

Comment 30

•

25 years ago

Are you talking about nsObjectFrame.cpp? Replace your ToNewCString with ToNewUTF8String. You should also do that for ToCString, but be aware that ToNewUTF8String mallocs for you while ToCString doesn't. I think you should change lines 1614-1637 (not tested, please try it yourself):

1585 NS_IMETHODIMP nsPluginInstanceOwner::GetAttributes(PRUint16& n,
1586                                                    const char*const*& names,
1587                                                    const char*const*& values)
...
1614       mAttrNames[mNumAttrs] = (char *)PR_Malloc(name.Length() + 1);
1615       mAttrVals[mNumAttrs] = (char *)PR_Malloc(value.Length() + 1);
1616
1617       if ((nsnull != mAttrNames[mNumAttrs]) &&
1618           (nsnull != mAttrVals[mNumAttrs]))
1619       {
1620         name.ToCString(mAttrNames[mNumAttrs], name.Length() + 1);
1621         value.ToCString(mAttrVals[mNumAttrs], value.Length() + 1);
1622
1623         mNumAttrs++;
1624       }
1625       else
1626       {
1627         if (nsnull != mAttrNames[mNumAttrs])
1628         {
1629           PR_Free(mAttrNames[mNumAttrs]);
1630           mAttrNames[mNumAttrs] = nsnull;
1631         }
1632         if (nsnull != mAttrVals[mNumAttrs])
1633         {
1634           PR_Free(mAttrVals[mNumAttrs]);
1635           mAttrVals[mNumAttrs] = nsnull;
1636         }
1637       }

change it to

      mAttrNames[mNumAttrs] = name.ToNewUTF8String();
      mAttrVals[mNumAttrs] = value.ToNewUTF8String();

Same thing in

1883 NS_IMETHODIMP nsPluginInstanceOwner::GetParameters(PRUint16& n,
                                                        const char*const*& names,
                                                        const char*const*& values)

change

1964       mParamNames[mNumParams] = (char *)PR_Malloc(name.Length() + 1);
1965       mParamVals[mNumParams] = (char *)PR_Malloc(val.Length() + 1);
1966
1967       if ((nsnull != mParamNames[mNumParams]) &&
1968           (nsnull != mParamVals[mNumParams]))
1969       {
1970         name.ToCString(mParamNames[mNumParams], name.Length() + 1);
1971         val.ToCString(mParamVals[mNumParams], val.Length() + 1);
1972
1973         mNumParams++;
1974       }
1975       else
1976       {
1977         if (nsnull != mParamNames[mNumParams])
1978         {
1979           PR_Free(mParamNames[mNumParams]);
1980           mParamNames[mNumParams] = nsnull;
1981         }
1982
1983         if (nsnull != mParamVals[mNumParams])
1984         {
1985           PR_Free(mParamVals[mNumParams]);
1986           mParamVals[mNumParams] = nsnull;
1987         }
1988       }
1989 }

to

      mParamNames[mNumParams] = name.ToNewUTF8String();
      mParamVals[mNumParams] = val.ToNewUTF8String();

Please understand that I didn't test these changes and you have to try them yourself. I assume stanley.ho's comment about UTF-8 is true; in that case, ToNewUTF8String will do the right job, while ToCString/ToNewCString will convert non-ASCII into '.'.
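For readers unfamiliar with the conversion being proposed, the following is a self-contained C++ sketch of encoding a 16-bit Unicode string as UTF-8. It is a stand-in for the idea only: `toUtf8` is a hypothetical function, not the source of nsString::ToNewUTF8String, and the handling of unpaired surrogates (replacing them with U+FFFD) is an assumption made for the example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical UTF-16 -> UTF-8 encoder illustrating the proposed conversion.
std::string toUtf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Valid surrogate pair: combine into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // assumption: unpaired surrogate becomes U+FFFD
        }

        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

Unlike a narrowing copy, the output can be longer than the input's character count, so a buffer sized as Length() + 1 would no longer be large enough; that is consistent with the note above that the UTF-8 variant allocates the result itself.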

Blocks: 28424

Frank Tang

Comment 31

•

25 years ago

I have already put down the details of how to fix this problem. Please provide an ETA for the check-in. (Year 2010 could be considered "ETA after Beta1", right? Please be more specific.) And also please mark it ASSIGNED if you agree to work on it.

av (gone)

Updated

•

25 years ago

Status: NEW → ASSIGNED

Target Milestone: M14 → M16

av (gone)

Comment 32

•

24 years ago

Is JRE 1.3 the latest version? I have trouble seeing Java applets with recent build.

shrirang khanzode

Comment 33

•

24 years ago

1.3 is the latest version. Java is not working on windows (bug 36405).

Frank Tang

Updated

•

24 years ago

Keywords: nsbeta2

av (gone)

Comment 34

•

24 years ago

Shrirang, do you see this applet? For me it says Loading... forever.

Katsuhiko Momoi

Reporter

Comment 35

•

24 years ago

This test applet mentioned in the above URL is working with 5/15/2000 Win32 build. (It runs but still does not display Japanese.) If you need a demo of how it should work, please give me a call.

shrirang khanzode

Comment 36

•

24 years ago

Talked with Kat and confirmed that this applet does not load on English Windows builds. I also see a "Loading" message and the applet does not load. This is working OK on a Japanese Windows system only.

shrirang khanzode

Comment 37

•

24 years ago

Strangely enough, after leaving this page open for hours on my Windows NT, the applet loads and I see it as Kat had mentioned in his initial comments. (build used: 2000051720)

leger

Comment 38

•

24 years ago

Putting on [nsbeta2+] radar for beta2 fix.

Whiteboard: ETA after Beta1 → [nsbeta2+]

av (gone)

Comment 39

•

24 years ago

Stanley, I tried ftang's change and got the following (I don't understand this 'Illegal UTF8 string' after what you explained):

java.lang.ClassFormatError: ScrollString (Illegal UTF8 string in constant pool)
    at java.lang.ClassLoader.defineClass0(Native Method)
    at java.lang.ClassLoader.defineClass(Unknown Source)
    at java.security.SecureClassLoader.defineClass(Unknown Source)
    at sun.applet.AppletClassLoader.findClass(Unknown Source)
    at sun.plugin.security.PluginClassLoader.findClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at sun.applet.AppletClassLoader.loadClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at sun.applet.AppletClassLoader.loadCode(Unknown Source)
    at sun.applet.AppletPanel.createApplet(Unknown Source)
    at sun.plugin.AppletViewer.createApplet(Unknown Source)
    at sun.applet.AppletPanel.runLoader(Unknown Source)
    at sun.applet.AppletPanel.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

av (gone)

Comment 40

•

24 years ago

I checked in the new string conversion. Reassigning to stanley.ho@eng.sun.com

Assignee: av → stanley.ho

Status: ASSIGNED → NEW

Mike

Comment 41

•

24 years ago

M16 has been out for a while now; these bugs' target milestones need to be updated.

shrirang khanzode

Comment 42

•

24 years ago

component:oji

Component: Plug-ins → OJI

leger

Comment 43

•

24 years ago

per I18n QA review today, moving from [nsbeta2+] to [nsbeta2-]. Can no longer wait for a fix. Missed the beta2 train.

Whiteboard: [nsbeta2+] → [nsbeta2-]

Frank Tang

Comment 44

•

24 years ago

Removing [nsbeta2-]. Without this, Japanese users cannot use Java on Japanese sites. We have to have this fix for nsbeta2.

Whiteboard: [nsbeta2-]

Jay Patel [:jay]

Comment 45

•

24 years ago

Putting on [nsbeta2-] radar. Not critical to beta2. Sorry, you missed the train (as Jan said).

Whiteboard: [nsbeta2-]

Frank Tang

Comment 46

•

24 years ago

Removing [nsbeta2-]. I didn't miss the train. The one who owns the bug missed the train. I am the CUSTOMER of this work.

Whiteboard: [nsbeta2-]

Frank Tang

Comment 47

•

24 years ago

We have done more testing now. The data is being passed to the JVM correctly. The rendering has some problems, but after we uncomment some alias lines in the font.properties file, we see the Japanese displayed correctly. Therefore, it is working now and there is nothing Mozilla needs to change from here. We do need to release note the remaining issue; momoi will take care of that part. I still see some hangs or crashes when we hit Java. I think we should do more testing in that area and make sure it won't crash/hang.

Katsuhiko Momoi

Reporter

Comment 48

•

24 years ago

To test Java display of non-ASCII string data, e.g. a scroll applet designed to show Japanese characters scrolling by, do the following.

1. Modify the font.properties file appropriate for your Windows. If you're using Japanese Windows, then modify: JavaSoft/JRE/1.3/lib/font.properties.ja If you're using US Windows, then modify: JavaSoft/JRE/1.3/lib/font.properties
2. In the appropriate font.properties file, uncomment the following 3 lines in the alias section:
# alias.timesroman=serif
# alias.helvetica=sansserif
# alias.courier=monospaced

Not all applets which display non-ASCII characters depend on this modification, but many applets use these legacy font names that predate JDK 1.1x. Removing the "#"s will ensure that compatibility with these legacy font names is taken care of.

We should mark this bug resolved/fixed and deal with the additional issues ftang talks about above in other bugs.

George, will Sun be willing to "uncomment" these lines in the JRE distributions? It would not cost Java anything but would make handling of these legacy font name cases much, much easier, because the user would not have to do anything to see the characters. Otherwise, the user will have to uncomment the alias lines. Can you help to put the workaround into the JRE distribution for us to use?

Keywords: relnote

Katsuhiko Momoi

Reporter

Comment 49

•

24 years ago

The basic display problem has been fixed. Now it is time to test a variety of applets which show non-ASCII characters to see if there are additional problems like "crashing if not set to the proper character encoding ahead of time". This requires i18n testing conditions -- teruko, can the Browser team take this bug for PR2 testing? I will write a release note item for this.

Teruko Kobayashi

Comment 50

•

24 years ago

I tested this in the 2000-07-24-08 and 2000-07-24-20 Win32 builds. After I uncommented the 3 lines in the font.properties.ja file as Kat mentioned before, Japanese characters are displayed correctly in Java applets that include Japanese characters. For example, http://babel/java/ If you click under JDK1.1.5 i18n features, Japanese characters are displayed correctly and you can type Japanese characters. These are the same test cases as in the JDK\demo\i18n directory when you install the JDK. However, under I18n features, Japanese Text, Scroll Text and Rainbow Text do not load the applet. This Scroll Text applet is similar to the one at the above URL, so the applet at the above URL does not display Japanese characters either. I do not know whether this has anything to do with I18n.

Katsuhiko Momoi

Reporter

Comment 51

•

24 years ago

Changed the QA contact to teruko. A suggestion about the older test applets ftang/we created originally in the pre-JDK1.1 days: it is quite possible that these applets use deprecated classes. For that reason, they probably should not be used as a benchmark for testing with Java2. If supporting old applets is important, some other people will file bugs on that. Let's mark this resolved/fixed and deal with extra issues in other bugs.

Status: NEW → RESOLVED

Closed: 24 years ago

QA Contact: shrir → teruko

Resolution: --- → FIXED

Jay Patel [:jay]

Comment 52

•

24 years ago

Putting on [nsbeta2-] radar. Not critical to beta2.

Whiteboard: [nsbeta2-]

Teruko Kobayashi

Comment 53

•

24 years ago

I verified this in 2000-08-17-08 Win32 build.

Status: RESOLVED → VERIFIED

timeless

Updated

•

14 years ago

Product: Core → Core Graveyard