Closed Bug 110220 Opened 23 years ago Closed 22 years ago

unable to use iso10646-1 for GB18030 if using charcell spacing

Categories

(Core :: Internationalization, defect, P4)

x86
Linux
defect

Tracking

()

RESOLVED WONTFIX

People

(Reporter: llch, Assigned: ftang)

Details

Attachments

(3 files)

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.2.1) Gecko/20010901
BuildID: 2001090111
If I set the font with proportional spacing:
sy_gb18030.ttf -misc-SongYi_Z13-medium-r-normal--0-0-0-0-p-0-iso10646-1
sy_gb18030.ttf -misc-SongYi_Z13-bold-r-normal--0-0-0-0-p-0-iso10646-1
Mozilla displays all of the GB18030 4-byte characters correctly. However, if I set the font with charcell spacing:
sy_gb18030.ttf -misc-SongYi_Z13-medium-r-normal--0-0-0-0-c-0-iso10646-1
sy_gb18030.ttf -misc-SongYi_Z13-bold-r-normal--0-0-0-0-c-0-iso10646-1
all of the characters change to "?".
Reproducible: Always
Steps to Reproduce:
1. Install a GB18030 font
2. Set charcell spacing for the font
3. Run Mozilla with the 4-byte characters test page
Actual Results: All characters change to "?"
Expected Results: All GB18030 characters are displayed
Leon, 0.9.2.1 is pretty old... do you see the same problem with a recent nightly?
Yes, I see the same problem by using Build ID: 2001111512
Status: UNCONFIRMED → NEW
Ever confirmed: true
Component: Layout → Internationalization
Ok, so I added the GB18030 font to my font path, and it displays (in proportional) the doublebyte.txt file above without any problem after selecting the correct charset. The four.txt file displays nothing though, which contradicts the bug report (which claims it displays fine with the same font in proportional spacing). What is the encoding of the four.txt file?
Attached file double2.txt
double2.txt (binary file, don't save as text)
Attached file four.txt
four.txt (don't save as text, save as binary)
OK, I have two questions here. It was mentioned to me that GB18030 has four byte wide characters but we are using the two byte wide converter. Is this actually the case? It's strange that the fonts in question work for one instead of the other. Are the code points different in the fonts?
You will need to ask ftang about this, but he is out of the country right now on family business until the 2nd week of December.
GB18030 encodes characters in 1, 2, or 4 bytes.
The 1-byte range is 0x00 to 0x7f. Those characters are almost the same as ASCII.
The 2-byte range is (0x81 ~ 0xfe, 0x40 ~ 0x7e | 0x80 ~ 0xfe). Within this range, the subrange (0xa1 ~ 0xfe, 0xa1 ~ 0xfe) is exactly the same as GB2312 (and GBK as well). double2.txt contains characters in this range. If you choose GB2312 or GBK, you should see the same characters as with GB18030. That is the backward compatibility.
GB18030 also encodes characters in a 4-byte range, (0x81 ~ 0xfe, 0x30 ~ 0x39, 0x81 ~ 0xfe, 0x30 ~ 0x39). Characters in the Unicode BMP that are not covered by the 1-byte and 2-byte ranges are enumerated starting from GB+81308130. four.txt contains some characters in this range. When those characters are converted to Unicode, what we get is a 2-byte wide char (PRUnichar). If the conversion from GB18030 to Unicode is done correctly, we are supposed to display those characters correctly.
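The byte ranges listed in this comment can be sketched as a small classifier. This is only an illustration built from the ranges quoted above; the function name gb18030_seq_len is hypothetical and is not part of Mozilla's converter.

```python
def gb18030_seq_len(data: bytes, pos: int = 0) -> int:
    """Return 1, 2, or 4: the length of the GB18030 sequence starting at pos.

    Raises ValueError when the bytes match none of the ranges quoted
    in the comment above, or when the sequence is truncated.
    """
    b1 = data[pos]
    if b1 <= 0x7F:                      # 1-byte range, close to ASCII
        return 1
    if not 0x81 <= b1 <= 0xFE:
        raise ValueError(f"invalid lead byte 0x{b1:02x}")
    if pos + 1 >= len(data):
        raise ValueError("truncated sequence")
    b2 = data[pos + 1]
    if 0x40 <= b2 <= 0x7E or 0x80 <= b2 <= 0xFE:
        return 2                        # 2-byte range (includes GB2312/GBK)
    if 0x30 <= b2 <= 0x39:              # digit in second byte marks 4-byte form
        if pos + 3 >= len(data):
            raise ValueError("truncated sequence")
        b3, b4 = data[pos + 2], data[pos + 3]
        if 0x81 <= b3 <= 0xFE and 0x30 <= b4 <= 0x39:
            return 4
    raise ValueError("bytes do not form a GB18030 sequence")
```

For example, gb18030_seq_len(b"\x81\x30\x81\x30") classifies GB+81308130, the start of the 4-byte enumeration mentioned above, as a 4-byte sequence.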
This belongs to someone who works with fonts; reassigning to ftang to assign as appropriate.
Assignee: attinasi → ftang
>The character will changed to all "?".
This could happen for one of the following reasons:
1. The nsFontMetricsWinGTK.cpp code does not know this font.
2. It knows this font and loads it, but decides that the code points for those characters are empty in the font.
erik/bstell use a specific algorithm in nsFontMetricsWin.cpp to deal with ISO-10646 fonts. The reason is that other encodings basically tell us which glyphs are available in the font, whereas most ISO-10646 fonts do NOT cover the whole Unicode range, so we need to depend on an additional algorithm to decide whether a particular glyph is available. That algorithm depends heavily on per-glyph metrics. Does the font server return per-glyph metrics if you set the spacing to "c"?
The algorithm is in GetMapFor10646Font(XFontStruct* aFont); see http://lxr.mozilla.org/seamonkey/source/gfx/src/gtk/nsFontMetricsGTK.cpp#1950. In the Mozilla source code it is gfx/src/gtk/nsFontMetricsGTK.cpp. If you want to debug, also check nsFontGTK::IsEmptyFont(nsXFont* aXFont). First thing: check whether XFontStruct.per_char is null.
According to my X Window reference book, "Desktop Quick Reference: The X Window System in a Nutshell for X version 11 Release 4" (O'Reilly & Associates, Inc.), page 320:
Spacing: All standard R3 fonts are either m (monospace, i.e. fixed-width) or p (proportional, i.e., variable-width). In R4, fonts may also have the spacing characteristic c (character cell, a fixed-width font based on the traditional typewriter model).
I don't understand the difference between m and c. It is likely that the XFontStruct does NOT return per_char if you set the spacing to c.
ok, I put in some debug code to see how we deal with ISO-10646 fonts, and the results are quite interesting. Basically, my testing shows that the per_char of the XFontStruct for the SAME font may be null or not null. It could be an X problem, a GTK problem, or a memory issue.
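To illustrate why a null per_char matters for glyph detection, here is a simplified model, not the actual logic of GetMapFor10646Font. It assumes the Xlib convention that a nonexistent character has all of its XCharStruct metrics set to zero, and that a null per_char means every glyph shares the font-wide bounds, so there is nothing per-glyph to inspect. CharMetrics and build_glyph_map are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Set


@dataclass(frozen=True)
class CharMetrics:
    """Stand-in for the fields of an Xlib XCharStruct (illustrative only)."""
    lbearing: int = 0
    rbearing: int = 0
    width: int = 0
    ascent: int = 0
    descent: int = 0

    def is_empty(self) -> bool:
        # All-zero metrics are treated as "no glyph at this code point".
        return not any((self.lbearing, self.rbearing, self.width,
                        self.ascent, self.descent))


def build_glyph_map(first_char: int,
                    per_char: Optional[Sequence[CharMetrics]]) -> Optional[Set[int]]:
    """Return the set of code points that have glyphs, or None if unknown.

    With per_char missing there is no per-glyph data, so coverage cannot
    be derived this way (the "can NOT create map" case in the debug output).
    """
    if per_char is None:
        return None
    return {first_char + i for i, m in enumerate(per_char) if not m.is_empty()}
```

Under this model a charcell font that arrives without per_char yields None, which is consistent with every character falling back to "?".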
Run the above debugging patch against http://warp/u/ftang/utf8test/ncr.cgi and we get the following printout:
Call GetMapFor10646Font for -arabic-newspaper-medium-r-normal--32-246-100-100-p-137-iso10646-1 per_char = 41567000 can create map
Call GetMapFor10646Font for -arabic-newspaper-medium-r-normal--32-246-100-100-p-137-iso10646-1 per_char = 0 can NOT create map
Call GetMapFor10646Font for -misc-console-medium-r-normal--16-160-72-72-c-160-iso10646-1 per_char = 41622000 can create map
Call GetMapFor10646Font for -misc-fixed-medium-r-normal--15-140-75-75-c-90-iso10646-1 per_char = 41314000 can create map
Call GetMapFor10646Font for -mutt-clearlyu alternate glyphs-medium-r-normal--17-120-100-100-p-91-iso10646-1 per_char = 417ff000 can create map
Call GetMapFor10646Font for -mutt-clearlyu pua-medium-r-normal--17-120-100-100-p-110-iso10646-1 per_char = 41339000 can create map
Call GetMapFor10646Font for -mutt-clearlyu-medium-r-normal--17-120-100-100-p-128-iso10646-1 per_char = 418ba000 can create map
Call GetMapFor10646Font for -arabic-newspaper-medium-r-normal--32-246-100-100-p-137-iso10646-1 per_char = 0 can NOT create map
Call GetMapFor10646Font for -misc-console-medium-r-normal--16-160-72-72-c-160-iso10646-1 per_char = 0 can NOT create map
Call GetMapFor10646Font for -misc-fixed-medium-r-normal--13-120-75-75-c-70-iso10646-1 per_char = 4137e000 can create map
The strange thing is that the first time we call the font it returns successfully. But somehow we hit that same code again for the same font, and on the second hit per_char is null. This happens with the following two fonts:
-arabic-newspaper-medium-r-normal--32-246-100-100-p-137-iso10646-1
-misc-console-medium-r-normal--16-160-72-72-c-160-iso10646-1
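Spotting the fonts that report both a null and a non-null per_char can be automated. The sketch below assumes the debug output keeps the "Call GetMapFor10646Font for NAME per_char = HEX" shape shown above; fonts_with_mixed_per_char is a hypothetical helper name.

```python
import re
from collections import defaultdict
from typing import List

# One entry per call: font name, then the per_char pointer printed in hex.
_ENTRY = re.compile(
    r"Call GetMapFor10646Font for\s+(.+?)\s+per_char\s*=\s*([0-9a-fA-F]+)"
)


def fonts_with_mixed_per_char(log: str) -> List[str]:
    """Return font names seen with both a null and a non-null per_char."""
    states = defaultdict(set)
    for name, value in _ENTRY.findall(log):
        states[name].add(int(value, 16) != 0)   # True = non-null pointer
    return sorted(name for name, seen in states.items() if len(seen) == 2)
```

Fed the printout above, this reports the same two fonts that were picked out by hand.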
ok, I know why it got hit twice. Somehow my X server reports them twice:
[ftang@ftang bin]$ xlsfonts | egrep "arabic-newspaper-medium-r-normal--32-246-100-100-p-137-iso10646-1|misc-console-medium-r-normal--16-160-72-72-c-160-iso10646-1"
-arabic-newspaper-medium-r-normal--32-246-100-100-p-137-iso10646-1
-arabic-newspaper-medium-r-normal--32-246-100-100-p-137-iso10646-1
-misc-console-medium-r-normal--16-160-72-72-c-160-iso10646-1
-misc-console-medium-r-normal--16-160-72-72-c-160-iso10646-1
ok, see this:
[ftang@ftang bin]$ xlsfonts | sort > t
[ftang@ftang bin]$ wc -l t
5008 t
[ftang@ftang bin]$ xlsfonts | sort -u >tt
[ftang@ftang bin]$ wc -l tt
4068 tt
xlsfonts reports 5008 fonts to me, but only 4068 of them are unique. That leaves 940 duplicates, which is about 19% of the total.
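The same count can be reproduced without temporary files. A minimal sketch, assuming the font names have already been read into a list (for example, one line of xlsfonts output per entry); duplicate_stats is a hypothetical helper name.

```python
def duplicate_stats(names):
    """Return (total, unique, duplicates, duplicate_share) for a list of names."""
    total = len(names)
    unique = len(set(names))          # same effect as `sort -u | wc -l`
    duplicates = total - unique
    share = duplicates / total if total else 0.0
    return total, unique, duplicates, share
```

With the numbers above, 5008 - 4068 = 940 and 940 / 5008 is roughly 0.188, which matches the "about 19%" figure.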
llch, could you please apply my debugging patch to your local tree, rebuild it, run it, and show me what you get? Also, can you run xlsfonts "misc-SongYi_Z13-medium-r-normal--0-0-0-0-c-0-iso10646-1" and tell me how many times it shows up?
Filed separate bug 116136 for part of the issue.
ok, I will look at it.
Status: NEW → ASSIGNED
Leon: are you still working on this issue? We cannot move forward without your information.
I am currently on vacation. Sorry for not replying promptly. Here are the results:
-c- spacing output:
Call GetMapFor10646Font for -arabic-newspaper-medium-r-normal--32-246-100-100-p-137-iso10646-1 per_char = 41f35000 can create map
Call GetMapFor10646Font for -arabic-newspaper-medium-r-normal--32-246-100-100-p-137-iso10646-1 per_char = 0 can NOT create map
Call GetMapFor10646Font for -misc-console-medium-r-normal--16-160-72-72-c-160-iso10646-1 per_char = 41ff0000 can create map
Call GetMapFor10646Font for -misc-fixed-medium-r-normal--15-140-75-75-c-90-iso10646-1 per_char = 41dd8000 can create map
Call GetMapFor10646Font for -mutt-clearlyu alternate glyphs-medium-r-normal--17-120-100-100-p-91-iso10646-1 per_char = 420b1000 can create map
Call GetMapFor10646Font for -mutt-clearlyu pua-medium-r-normal--17-120-100-100-p-110-iso10646-1 per_char = 4216c000 can create map
Call GetMapFor10646Font for -mutt-clearlyu-medium-r-normal--17-120-100-100-p-128-iso10646-1 per_char = 4217f000 can create map
Call GetMapFor10646Font for -adobe-courier-medium-r-normal--17-120-100-100-m-100-iso10646-1 per_char = 42240000 can create map
Call GetMapFor10646Font for -adobe-helvetica-medium-r-normal--17-120-100-100-p-88-iso10646-1 per_char = 4225d000 can create map
Call GetMapFor10646Font for -adobe-new century schoolbook-medium-r-normal--17-120-100-100-p-91-iso10646-1 per_char = 42278000 can create map
Call GetMapFor10646Font for -adobe-times-medium-r-normal--17-120-100-100-p-84-iso10646-1 per_char = 42293000 can create map
Call GetMapFor10646Font for -adobe-utopia-regular-r-normal--15-140-75-75-p-79-iso10646-1 per_char = 422b1000 can create map
Call GetMapFor10646Font for -b&h-lucida-medium-r-normal-sans-17-120-100-100-p-96-iso10646-1 per_char = 422cc000 can create map
Call GetMapFor10646Font for -b&h-lucidabright-medium-r-normal--17-120-100-100-p-96-iso10646-1 per_char = 422e7000 can create map
Call GetMapFor10646Font for -b&h-lucidatypewriter-medium-r-normal-sans-17-120-100-100-m-100-iso10646-1 per_char = 42302000 can create map
Call GetMapFor10646Font for -misc-fzsongyi_z13-medium-r-normal--16-*-0-0-c-*-iso10646-1 per_char = 0 can NOT create map
Document http://junk.brisbane.redhat.com/gb18030/four.txt loaded successfully
-p- spacing output:
Call GetMapFor10646Font for -arabic-newspaper-medium-r-normal--32-246-100-100-p-137-iso10646-1 per_char = 41f21000 can create map
Call GetMapFor10646Font for -arabic-newspaper-medium-r-normal--32-246-100-100-p-137-iso10646-1 per_char = 0 can NOT create map
Call GetMapFor10646Font for -misc-console-medium-r-normal--16-160-72-72-c-160-iso10646-1 per_char = 41fdc000 can create map
Call GetMapFor10646Font for -misc-fixed-medium-r-normal--15-140-75-75-c-90-iso10646-1 per_char = 4209d000 can create map
Call GetMapFor10646Font for -mutt-clearlyu alternate glyphs-medium-r-normal--17-120-100-100-p-91-iso10646-1 per_char = 420c2000 can create map
Call GetMapFor10646Font for -mutt-clearlyu pua-medium-r-normal--17-120-100-100-p-110-iso10646-1 per_char = 41ddd000 can create map
Call GetMapFor10646Font for -mutt-clearlyu-medium-r-normal--17-120-100-100-p-128-iso10646-1 per_char = 4217d000 can create map
Call GetMapFor10646Font for -adobe-courier-medium-r-normal--17-120-100-100-m-100-iso10646-1 per_char = 4223e000 can create map
Call GetMapFor10646Font for -adobe-helvetica-medium-r-normal--17-120-100-100-p-88-iso10646-1 per_char = 4225b000 can create map
Call GetMapFor10646Font for -adobe-new century schoolbook-medium-r-normal--17-120-100-100-p-91-iso10646-1 per_char = 42276000 can create map
Call GetMapFor10646Font for -adobe-times-medium-r-normal--17-120-100-100-p-84-iso10646-1 per_char = 42291000 can create map
Call GetMapFor10646Font for -adobe-utopia-regular-r-normal--15-140-75-75-p-79-iso10646-1 per_char = 422af000 can create map
Call GetMapFor10646Font for -b&h-lucida-medium-r-normal-sans-17-120-100-100-p-96-iso10646-1 per_char = 422ca000 can create map
Call GetMapFor10646Font for -b&h-lucidabright-medium-r-normal--17-120-100-100-p-96-iso10646-1 per_char = 422e5000 can create map
Call GetMapFor10646Font for -b&h-lucidatypewriter-medium-r-normal-sans-17-120-100-100-m-100-iso10646-1 per_char = 42300000 can create map
Call GetMapFor10646Font for -misc-fzsongyi_z13-medium-r-normal--16-*-0-0-p-*-iso10646-1 per_char = 4231d000
Document http://junk.brisbane.redhat.com/gb18030/four.txt loaded successfully
Here is the xlsfonts output for "songyi":
-misc-fzsongyi_z13-bold-r-normal--0-0-0-0-c-0-iso10646-1
-misc-fzsongyi_z13-medium-r-normal--0-0-0-0-c-0-iso10646-1
It seems XFontStruct does not return per_char for fzsongyi's -c-, but -misc-console-medium-r-normal--16-160-72-72-c-160-iso10646-1 does have per_char = 41ff0000
Leon: can you try the patch in bug 116136 and see whether it fixes your problem?
Can you tell me where to get this font and how to install it, step by step? Which font server are you using?
The patch from Bug 116136 does not fix the problem. For charcell spacing, it still returns:
Call GetMapFor10646Font for -misc-fzsongyi_z13-medium-r-normal--16-*-0-0-c-*-iso10646-1 per_char = 0 can NOT create map
Leon: ftang was assuming that the AASB (anti-aliased scaled bitmap) font code was causing this. Would you disable the AASB code by setting this pref to false and see if it affects this bug?
pref("font.scale.aa_bitmap.enable", true);
I modified /usr/lib/mozilla/defaults/pref/unix.js to read pref("font.scale.aa_bitmap.enable", false); but it does not affect the bug.
We need to find out why xFont->per_char is null. It would be nice if you could debug that and figure out whether it is returned that way by GTK / Xlib or whether some code in Mozilla nulls it out.
the xFont->per_char null is addressed in bug 116136
>the xFont->per_char null is addressed in bug 116136
It should be "the xFont->per_char null caused by AASB is addressed in bug 116136". There may be other reasons that per_char is null here.
Leon, is the per_char null ?
The results were sent by email. Here they are:
Here is for c spacing:
font [-misc-fzsongyi_z13-medium-r-normal--16-*-0-0-c-*-iso10646-1] have null in per_char XXXXXXXXXXXXxxxxx
Call GetMapFor10646Font for -misc-fzsongyi_z13-medium-r-normal--16-*-0-0-c-*-iso10646-1 per_char = 0 can NOT create map
Here is for p spacing:
font [-misc-fzsongyi_z13-medium-r-normal--16-*-0-0-p-*-iso10646-1] have per_char
Call GetMapFor10646Font for -misc-fzsongyi_z13-medium-r-normal--16-*-0-0-p-*-iso10646-1 per_char = 42213000 can create map
ok, it seems that we depend on Xlib to give us per_char, and it doesn't. Is it really important to make sure it works with c?
Priority: -- → P4
Frank, without the per-char info how would moz know which chars have glyphs?
>is that really important to make sure it work with c ?
I am really asking Leon whether it is important that this GB font works with c. If his answer is that it is not that important (because we can now also access it through the FreeType path), then we can simply mark this bug as wontfix.
A GB18030 font has around 20,000 glyphs. With proportional spacing, it takes over 2 minutes to initialize and load the first page on a PIII 866 Xeon. I would say it is important for Mozilla. It may not be that important if Brian's FreeType implementation is in the main source. Brian, does your implementation need to read all the glyphs in the font file first?
The very first time the code sees a new font, it opens it, determines the valid chars/glyphs in the font, and then stores that data in a file. Unless the file timestamp changes, this is never done again. The code only renders the glyphs it is asked to display (and they are cached), so startup is not dependent on the number of glyphs in a font.
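The timestamp-keyed caching described here can be sketched as follows. This is an illustration of the general idea only, not the FreeType code path itself; load_coverage, the JSON cache layout, and the scan_font callback are all assumptions made for the example.

```python
import json
import os


def load_coverage(font_path, cache_path, scan_font):
    """Return the set of code points a font covers, rescanning only when
    the font file's modification time differs from the cached one.

    scan_font is a caller-supplied (hypothetical) function that opens the
    font and returns an iterable of covered code points.
    """
    mtime = os.path.getmtime(font_path)
    try:
        with open(cache_path, "r", encoding="utf-8") as fh:
            cached = json.load(fh)
        if cached.get("mtime") == mtime:
            return set(cached["chars"])      # cache hit: no font scan
    except (OSError, ValueError, KeyError, TypeError, AttributeError):
        pass                                  # missing or unusable cache
    chars = set(scan_font(font_path))         # expensive full scan
    with open(cache_path, "w", encoding="utf-8") as fh:
        json.dump({"mtime": mtime, "chars": sorted(chars)}, fh)
    return chars
```

The expensive scan runs once per font file version, so a font with around 20,000 glyphs costs nothing extra on later startups under this scheme.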
ok, we know the issue. simply won't fix it.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → WONTFIX