Closed Bug 17169 Opened 25 years ago Closed 24 years ago

Japanese string display applets show only dots not real characters

Categories

(Core Graveyard :: Java: OJI, defect, P3)

x86
Windows NT

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: momoi, Assigned: stanley.ho)

References

()

Details

(Keywords: relnote, Whiteboard: [nsbeta2-])

** Observed with 10/24/99 Win32 build w/ JDK 1.3 Beta Java Plugins **

The applet at the above URL displays both Japanese characters and the English word "Applet", but with the setup indicated above the Japanese strings show up only as dots while the English string "Applet" is displayed.

To see this bug, follow the steps below:

1. Install the newest Java Plugins as explained in Bug 14202.
2. Start Mozilla and set the Character Set to Japanese (Shift_JIS). (It will probably freeze Mozilla if you set the auto-detect to Japanese, so for now don't try that. --> This should be another bug.)
3. Now visit the above URL and wait for the applet to load.
4. You will see that strings begin to scroll, but Japanese characters appear simply as dots while the English word "Applet" is visible.

You can try other string display applets indicated by links at the above URL. Most of them should run and will display English strings, but none will display Japanese characters. Conversion failure? Font information lacking? Widget not supporting multi-byte characters?

By the way, most applets at the above URL actually run ** without the newest Java Plugins mentioned in Bug 14202 **, but the results are still the same -- Japanese characters display only as dots.

Who should get this bug? Initially assigned to edburns.
QA Contact: leila.garin → amasri
How do I set the character set to Japanese?
You need to use the "View | Character Set" menu to choose Japanese (Shift_JIS). Right now the menu is not scrollable and some monitors will have a problem displaying all the menu items. I have a workaround posted here for the unscrollable menu: http://www.mozilla.org/quality/intl/charmenufix.html Under Method 2, you can use the Navigator modifications suggested under the section "For Navigator window (M10 only):" and change the 2 files mentioned in it. After this modification, you will find the Japanese character sets under the "Character Set: Multibyte" menu.
It says "M10 only" in the document but it works with M11 builds also (e.g. today's build).
Assignee: edburns → ryang
You need to follow the steps in the Method 2 section of http://www.mozilla.org/quality/intl/charmenufix.html then run mozilla and go to View->Character Set Multibyte and choose Japanese, Shift_JIS. Then visit the URL http://ss4.inet-osaka.or.jp/~athushin/ScrollString1.html
Hopefully within a few days, a temporary charset workaround will be checked into the source so that you don't have to do these modifications manually.
Assignee: ryang → edburns
I've got a Nov-9 build of Mozilla and I did the following:

1. Started up with the Moz. home page.
2. Used menu options View->Character Set Multibyte->Japanese (Shift_jis).
3. Went to the smoothscroll page.

The text scrolled by the applet looks like the following: ".... Applet ..." (My rendering of the number of periods is not accurate 8)

I did some debugging in the Java Plugin code, and apparently nsIPluginTagInfo2::GetParameters() is used to retrieve the parameters for the applet. The text of the parameters, which should have contained the UTF-8 encoding of the Japanese characters, actually contains periods ('.').

In order for this kind of page to work, the parameters retrieved by nsIPluginTagInfo2::GetParameters() will need to be in UTF-8 encoding. So the implementation of this method will have to convert the parameters from whatever encoding is present in the HTML (shift_jis in this case) to UTF-8, because the Java Plugin creates Unicode strings for display from UTF-8 chars.

Re-assigning to Ed. My apologies if I am incorrect in doing so!
Assignee: edburns → ryang
I have verified that this bug does indeed occur with the latest J2SDK install. Robert, is there some codepage problem?
I doubt it's a codepage problem. Mozilla clearly has the ability to translate between encodings, if I can trust the menu options. OJI has to convert from an MBCS string in, say, shift_jis encoding to one in UTF8 encoding. The chars have to end up as Unicode for display in Java, right ?
Are you really using UTF-8 in Java display? Not UCS2/Unicode?
The native code in the Java Plugin uses NewStringUTF() in JNI to convert from a char string to a Java string (which is always Unicode). This is because the OJI method mentioned above returns char strings (instead of wchar_t, or whatever XPCOM defines for Unicode).

Recall that UTF8 is simply an 8-bit encoding of Unicode, whereas UCS2 is a 16-bit encoding. In UTF8 a Unicode char may take up one or more bytes, depending on its value in Unicode (most Japanese characters take three). You can't assume that 1 byte == 1 character, but the data type used for passing these things around is still char*. Makes life interesting.

Any way you do this, the text will have to be converted from whatever it is on the html page to some form of Unicode for use by Java, whether that be UTF8 or UCS2.
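The point that a byte count is not a character count can be illustrated with a small C++ sketch. It follows standard UTF-8 rules (the "modified" UTF-8 that JNI's NewStringUTF accepts differs for U+0000 and for characters outside the BMP). The helper names utf8Length and countCodePoints are made up for this illustration and are not OJI or JNI functions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Number of bytes standard UTF-8 needs to encode one Unicode code point.
std::size_t utf8Length(char32_t cp) {
    assert(cp <= 0x10FFFF);      // Unicode stops at U+10FFFF
    if (cp < 0x80) return 1;     // ASCII
    if (cp < 0x800) return 2;    // e.g. Latin-1 supplement, Greek, Cyrillic
    if (cp < 0x10000) return 3;  // rest of the BMP, including kana and kanji
    return 4;                    // supplementary planes
}

// Count code points in a UTF-8 byte string by skipping continuation
// bytes, which always have the bit pattern 10xxxxxx.
std::size_t countCodePoints(const std::string& utf8) {
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For example, the string made of hiragana U+3042 followed by "A" occupies four bytes (E3 81 82 41) in a char* buffer but holds only two characters.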
Component: Java APIs to WebShell → OJI
Assignee: ryang → edburns
Status: NEW → ASSIGNED
Robert, are you suggesting an API change for communication between mozilla and the Java Plugin?
I'm ignorant of OJI, but assuming that the OJI methods expect all strings to be char* encoded in UTF-8, then whatever code calls these methods should convert from the encoding of the HTML page to UTF-8 before invoking the OJI method. Converters from most common charsets to UTF-8 already exist in the browser. If this is correct, I don't think you need any API changes.
Hi, Ed, Could you possibly set the Target Milestone on this? It would be nice if we can have Java plugins working for Beta1 and then this JPN string data to also show correctly. I have a feeling that this could be solved quickly once we know what conversion to perform and where.
momoi: done.
Target Milestone: M15
Ed, M15 means post-Beta1 (in current definition). As momoi-san states, we really want Java plug-ins to work for Japanese in Beta1.
Ok, but the delivery of this fix depends on Sun and Netscape coming up with a mutually acceptable release vehicle for the Java Plugin that includes the necessary changes to run in mozilla.
Depends on: 16296
Target Milestone: M15 → M14
The Hitsuji Applet Farm -- the JPN test case page -- has moved, so I updated the link above. It is better and bigger than before. The above URL is for "Horizontal Scroll". In addition, he has some "Vertical Scroll" applets at: http://www.hitsuji.to/VerticalScroll1.html (Watch out: Mozilla seems to hang on this page a lot.) There are also special-effects character display applets at: http://www.hitsuji.to/SpecialString1.html

Java is ON today (2/24/2000 build). Seems to change from day to day, though.
This looks like an illegal call to nsString::ToNewCString (because ToNewCString converts U+0100 - U+FFFF to '.'). To locate the problem code, first apply the patch I listed in bug 28424 (remember to change DEBUG_ftang to your name), rebuild xpcom/ and visit that page. You should get an assert at the point where the data gets damaged.
Raju, can you also test this in addition to testing 21305.
Whiteboard: ETA of ETA 03/01/00
Raju tested this on a Japanese Windows NT system and he also observed seeing the dots instead of the Kanji. I'm assigning this to Stanley for investigation. Stanley, you can assign it back to me if you don't have time for it.
Assignee: edburns → stanley.ho
Status: ASSIGNED → NEW
Whiteboard: ETA of ETA 03/01/00 → ETA after Beta1
Assignee: stanley.ho → edburns
Ed, sorry, I don't have time to investigate this, so I will reassign it to you. By the way, nsIPluginTagInfo::GetParameter will only take UTF string, so if MBCS strings are passed, it won't be recognized by the plug-in.
I tried out the latest JRE on a Japanese NT machine. If character encoding was not enabled, I saw the applet showing garbage English characters. I saw the same set of garbage English characters in the PARAM tag when I clicked "View->Source", so the browser passed the characters from the PARAM tags into the JRE correctly. However, if Japanese encoding is enabled, I saw "...." displayed in the applet. If I clicked "View->Source", I saw the Japanese characters instead of "...".

Java Plug-in obtains the parameters from nsIPluginTagInfo2::GetParameters as UTF-8 strings. On the Japanese machine with Japanese character encoding enabled, the browser tried to pass in multi-byte characters instead, and this causes a problem for Java Plug-in because it tries to interpret the string as UTF-8. This confirms the theory made by bobj@netscape.com.

To solve this problem, the browser should convert all strings from its encoding into UTF-8 before passing them into the JRE. This problem occurs not only in Java Plug-in but in plug-ins in general, so I will reassign this bug to the plug-in category. Basically, this problem
Assignee: edburns → av
Component: OJI → Plug-ins
QA Contact: amasri → shrir
stanley.ho@eng.sun.com, Your last sentence in your last comment was cut off. Was there more that you intended to write? Thx.
No, that's all I wanted to write. The last line was there by mistake. By the way, would anyone be able to tell me what exactly the behavior of the JVM should be when the character encoding changes in Navigator 3/4?? We would like to provide similar behavior in Java 2 through OJI, but we need more information on this issue because there are lots of differences between the NS JVM and Sun's Java 2.
In 4.x, when Java outputs to the Browser window (w/ AWT widgets), it converts from Unicode to what the window says its charset is. The charset info could be in the form of "document meta charset tag", "Http server charset", or "user set menu choice". I think they amount to the same thing, i.e. the current default charset of that window.
So, does it mean that the JVM will always do the conversion according to the default character encoding of the current page, and doesn't support character encoding changing on the fly?? If this is the case, it will make our life much easier. We will still need the bug fix for 16296 on the browser side, so the OJI plug-in will always take a UTF-8 string and pass it directly to the JVM, without worrying about character encoding.
Yes, that was true for 4.x. The real question is, does that same consideration now apply in Mozilla? Mozilla uses Unicode for layout, so why/where would it be necessary to output to a native encoding? For input from web documents, we need to be aware of the document charset, but for output? I'm not sure. The other question I have is what advantage do we have in being able to change output encoding on the fly? Is that something we should look at before deciding what to do?
Sorry, I think there is a mistake in the way I described the interaction with JVM. I think this is ftang's area of knowledge and I'll leave it up to him to supply correct info. In the meantime, you can look at Frank Tang's presentation for Netscape Navigator and Java on this page. This should make things clearer: http://people.netscape.com/ftang/paper/unicode9/page0006.htm
For OJI, the consideration applies when nsIPluginTagInfo2::GetParameter is called. This method is for the OJI plug-in to obtain the strings that are specified in the PARAM tag inside the APPLET tag. Currently, we accept the string as UTF-8, not unicode. Thus, as long as Mozilla converts the string from Unicode to UTF-8 properly using the default encoding before passing it to OJI, we will be okay. We need to support UTF-8 because the same plug-in will also work with Navigator 3/4 through the backward adapter, and in this case, the string is always char* in old browsers.

IMO, I don't think changing output encoding on the fly gives us any advantage, because it only happens when you try to change the encoding of the page to a non-default one. Actually, it is difficult to implement as well, because we can't force the applet to dump any string that it obtained from Applet.getParameter() and reload the string when the encoding changes. The cost of supporting it will definitely be much higher than the benefit. If I could choose, I would not want to support encoding changing on the fly.
Are you talking about nsObjectFrame.cpp? Replace your ToNewCString with ToNewUTF8String. You should also do that for ToCString, but be aware that ToNewUTF8String allocates the buffer for you while ToCString does not. I think you should change lines 1614-1637 (not tested; please try it yourself):

1585 NS_IMETHODIMP nsPluginInstanceOwner::GetAttributes(PRUint16& n,
1586                                                    const char*const*& names,
1587                                                    const char*const*& values)
...
1614   mAttrNames[mNumAttrs] = (char *)PR_Malloc(name.Length() + 1);
1615   mAttrVals[mNumAttrs] = (char *)PR_Malloc(value.Length() + 1);
1616
1617   if ((nsnull != mAttrNames[mNumAttrs]) &&
1618       (nsnull != mAttrVals[mNumAttrs]))
1619   {
1620     name.ToCString(mAttrNames[mNumAttrs], name.Length() + 1);
1621     value.ToCString(mAttrVals[mNumAttrs], value.Length() + 1);
1622
1623     mNumAttrs++;
1624   }
1625   else
1626   {
1627     if (nsnull != mAttrNames[mNumAttrs])
1628     {
1629       PR_Free(mAttrNames[mNumAttrs]);
1630       mAttrNames[mNumAttrs] = nsnull;
1631     }
1632     if (nsnull != mAttrVals[mNumAttrs])
1633     {
1634       PR_Free(mAttrVals[mNumAttrs]);
1635       mAttrVals[mNumAttrs] = nsnull;
1636     }
1637   }

Change it to:

  mAttrNames[mNumAttrs] = name.ToNewUTF8String();
  mAttrVals[mNumAttrs] = value.ToNewUTF8String();

Same thing in:

1883 NS_IMETHODIMP nsPluginInstanceOwner::GetParameters(PRUint16& n,
                                                   const char*const*& names,
                                                   const char*const*& values)

Change:

1964   mParamNames[mNumParams] = (char *)PR_Malloc(name.Length() + 1);
1965   mParamVals[mNumParams] = (char *)PR_Malloc(val.Length() + 1);
1966
1967   if ((nsnull != mParamNames[mNumParams]) &&
1968       (nsnull != mParamVals[mNumParams]))
1969   {
1970     name.ToCString(mParamNames[mNumParams], name.Length() + 1);
1971     val.ToCString(mParamVals[mNumParams], val.Length() + 1);
1972
1973     mNumParams++;
1974   }
1975   else
1976   {
1977     if (nsnull != mParamNames[mNumParams])
1978     {
1979       PR_Free(mParamNames[mNumParams]);
1980       mParamNames[mNumParams] = nsnull;
1981     }
1982
1983     if (nsnull != mParamVals[mNumParams])
1984     {
1985       PR_Free(mParamVals[mNumParams]);
1986       mParamVals[mNumParams] = nsnull;
1987     }
1988   }
1989 }

to:

  mParamNames[mNumParams] = name.ToNewUTF8String();
  mParamVals[mNumParams] = val.ToNewUTF8String();

Please understand that I didn't test these changes and you have to try them yourself. I assume stanley.ho's comment about UTF-8 is true; in that case, ToNewUTF8String will do the right job while ToCString/ToNewCString will convert non-ASCII into '.'.
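The difference the suggested change is meant to make can be sketched in self-contained C++. Here lossyNarrow is a simplified stand-in for the ToCString/ToNewCString behaviour described in this bug (non-ASCII becomes '.'), and utf16ToUtf8 is a stand-in for what ToNewUTF8String is expected to do. Both are hypothetical helpers written against std::u16string for illustration; they are not Mozilla's nsString code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Lossy narrowing: one output byte per UTF-16 code unit, with every
// non-ASCII unit replaced by '.' (assumed model of the old behaviour).
std::string lossyNarrow(const std::u16string& in) {
    std::string out;
    out.reserve(in.size());
    for (char16_t unit : in) {
        out.push_back(unit < 0x80 ? static_cast<char>(unit) : '.');
    }
    assert(out.size() == in.size());  // same length as the input, data lost
    return out;
}

// UTF-16 to UTF-8. Surrogate pairs are combined into one code point;
// an unpaired surrogate is replaced by U+FFFD.
std::string utf16ToUtf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;  // consumed the low surrogate as well
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;
        }
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

With the input u"\u65E5\u672C Applet", lossyNarrow yields ".. Applet", which matches the symptom in this bug, while utf16ToUtf8 yields the bytes E6 97 A5 E6 9C AC followed by " Applet". The UTF-8 result is 13 bytes for a 9-unit string, so a buffer sized as Length() + 1 would be too small for it; a conversion routine that allocates its own buffer avoids that problem.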
Blocks: 28424
I have already put down the details of how to fix this problem. Please provide an ETA for the check-in. (The year 2010 could count as "ETA after Beta1", right? Please be more specific.) Also, please mark it ASSIGNED if you agree to work on it.
Status: NEW → ASSIGNED
Target Milestone: M14 → M16
Is JRE 1.3 the latest version? I have trouble seeing Java applets with recent build.
1.3 is the latest version. Java is not working on windows (bug 36405).
Keywords: nsbeta2
Shrirang, do you see this applet? For me it says "Loading..." forever.
This test applet mentioned in the above URL is working with 5/15/2000 Win32 build. (It runs but still does not display Japanese.) If you need a demo of how it should work, please give me a call.
Talked with Kat and confirmed that this applet does not load on english windows builds. I also see a "Loading" message and the applet does not load. This is working ok on japanese windows system only.
Strangely enough, after leaving this page open for hours on my Windows NT machine, the applet loads and I see it as kat had mentioned in his initial comments. (build used: 2000051720)
Putting on [nsbeta2+] radar for beta2 fix.
Whiteboard: ETA after Beta1 → [nsbeta2+]
Stanley, I tried ftang's change and got the following (I don't understand this 'Illegal UTF8 string' after what you explained):

java.lang.ClassFormatError: ScrollString (Illegal UTF8 string in constant pool)
	at java.lang.ClassLoader.defineClass0(Native Method)
	at java.lang.ClassLoader.defineClass(Unknown Source)
	at java.security.SecureClassLoader.defineClass(Unknown Source)
	at sun.applet.AppletClassLoader.findClass(Unknown Source)
	at sun.plugin.security.PluginClassLoader.findClass(Unknown Source)
	at java.lang.ClassLoader.loadClass(Unknown Source)
	at sun.applet.AppletClassLoader.loadClass(Unknown Source)
	at java.lang.ClassLoader.loadClass(Unknown Source)
	at sun.applet.AppletClassLoader.loadCode(Unknown Source)
	at sun.applet.AppletPanel.createApplet(Unknown Source)
	at sun.plugin.AppletViewer.createApplet(Unknown Source)
	at sun.applet.AppletPanel.runLoader(Unknown Source)
	at sun.applet.AppletPanel.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)
I checked in the new string conversion. Reassigning to stanley.ho@eng.sun.com
Assignee: av → stanley.ho
Status: ASSIGNED → NEW
M16 has been out for a while now; these bugs' target milestones need to be updated.
component:oji
Component: Plug-ins → OJI
per I18n QA review today, moving from [nsbeta2+] to [nsbeta2-]. Can no longer wait for a fix. Missed the beta2 train.
Whiteboard: [nsbeta2+] → [nsbeta2-]
Remove [nsbeta2-]. Without this, Japanese users cannot use Java for Japanese sites. We have to have this fix for nsbeta2.
Whiteboard: [nsbeta2-]
Putting on [nsbeta2-] radar. Not critical to beta2. Sorry, you missed the train (as Jan said).
Whiteboard: [nsbeta2-]
Remove [nsbeta2-]. I didn't miss the train. The one who owns the bug missed the train. I am the CUSTOMER of this work.
Whiteboard: [nsbeta2-]
We did more testing. The data is being passed to the JVM correctly. The rendering had some problems, but after we uncommented some alias lines in the font.properties file, we saw the Japanese displayed correctly. Therefore, it is working now and there is nothing Mozilla needs to change from here. We do need to release-note the remaining issue; momoi will take care of that part. I still see some hangs or crashes when we hit Java. I think we should do more testing in that area and make sure it won't crash/hang.
To test Java display of non-ASCII string data, e.g. a scroll applet designed to show Japanese characters scrolling by, do the following.

1. Modify the font.properties file appropriate for your Windows.
   If you're using Japanese Windows, then modify: JavaSoft/JRE/1.3/lib/font.properties.ja
   If you're using US Windows, then modify: JavaSoft/JRE/1.3/lib/font.properties
2. In the appropriate font.properties file, uncomment the following 3 lines in the alias section:
   # alias.timesroman=serif
   # alias.helvetica=sansserif
   # alias.courier=monospaced

Not all applets which display non-ASCII characters depend on this modification, but many applets use these legacy font names from before JDK 1.1x. Removing the "#"s will ensure that legacy font name compatibility is taken care of.

We should mark this bug resolved/fixed and deal with the additional issues ftang talks about above in other bugs.

George, will Sun be willing to "uncomment" these lines in the JRE distributions? It would not cost Java anything but would make handling of these legacy font name cases much, much easier, because the user would not have to do anything to see the characters. Otherwise, the user will have to uncomment the alias lines. Can you help to put the workaround into the JRE distribution for us to use?
Keywords: relnote
The basic display problem has been fixed. Now it is time to test a variety of applets which show non-ASCII characters to see if there are additional problems like "crashing if not set to the proper character encoding ahead of time". This requires i18n testing conditions -- teruko, can the Browser team take this bug for PR2 testing? I will write a release note item for this.
I tested this in the 2000-07-24-08 and 2000-07-24-20 Win32 builds. After I uncommented the 3 lines in the font.properties.ja file as Kat mentioned before, Japanese characters are displayed correctly in Java applets that include Japanese characters. For example, at http://babel/java/, if you click under "JDK1.1.5 i18n features", Japanese characters are displayed correctly and you can type Japanese characters. These are the same test cases as in the JDK\demo\i18n directory when you install the JDK.

However, under "I18n features", Japanese Text, Scroll Text and Rainbow Text do not load the applet. This Scroll Text applet is similar to the one at the above URL, so the applet at the above URL does not display Japanese characters. I do not know whether this has anything to do with I18n.
Changed the QA contact to teruko. A suggestion about the older test applets ftang/we created originally in pre-JDK1.1 days: it is quite possible that these applets use deprecated classes. For that reason, they probably should not be used as a benchmark for testing with Java2. If supporting old applets is important, other people can file bugs on that. Let's mark this resolved/fixed and deal with extra issues in other bugs.
Status: NEW → RESOLVED
Closed: 24 years ago
QA Contact: shrir → teruko
Resolution: --- → FIXED
Putting on [nsbeta2-] radar. Not critical to beta2.
Whiteboard: [nsbeta2-]
I verified this in 2000-08-17-08 Win32 build.
Status: RESOLVED → VERIFIED
Product: Core → Core Graveyard