Closed Bug 127713 Opened 23 years ago Closed 22 years ago

support Surrogate display on Linux by using FreeType

Categories

(Core :: Internationalization, defect)

x86
Linux
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla1.4alpha

People

(Reporter: ftang, Assigned: masaki.katakai)

References

Details

(Keywords: intl, topembed-)

Attachments

(4 files, 6 obsolete files)

Now bstell's FreeType work is in the tree. we should think about using it to display Surrogate characters Shanjian did the surrogate enablement in Windows already. We probably need to use similar approach to address it. katakai- can you take charge of this issue? I don't think we have enough time to complete it before m1.0, but we definitely sould start the design now. We can use the "Code 2001" font to test it and later use Microsoft's surrogate fonts or tne new OfficeXP simp chinese font. Maybe other vendors like sun will license some surrogate TTF fonts from foundary. Once this work is completed and FreeType is turn on, then we can support Unicode Idograph Extension B on Linux
shanjian- can you put the links about your window surrogates support in here. I think katakai need to know how you did that in windows also where he can find the public surrogate fonts. bstell- can you put down where he can find FreeType API documentation here too?
Keywords: intl
QA Contact: ruixu → ylong
here is the infor for code 2001 font. >Frank, Shanjian, > >James Kass' font works well and covers the characters in the example >page at: http://www.geocities.com/i18nguy/unicode-example-plane1.html > >The page has some information on setting up Windows for surrogates. > >Kass' Code 2001 font is at: >http://home.att.net/~jameskass/CODE2001.ZIP > >tex
Font for CJK extension A can be found in MS website, link is available in bug 102254. Extension A has nothing to do with surrogate. So this font probably is not needed for this bug, but you might want to verify to see if it works. If not, we should file a bug and fix it. CCMAP support for surrogates can be found in 110843. This work is available for all platforms. But you might want to take a look to see how to use it. It is very simple though. Handling new OpenType format12 table is implemented in bug 118606. A unix implemetation is needed before TTF (or more accurately, OpenType) font can be use to support surrogates. See next part to get a font for testing. The final patch that make surrogates work on windows is bug 118000. It also contains links to download Code2001, a TTF font support surrogate. Several links to testcases were also provided.
this is a working draft
status: - installed code 2001 fonts to Win2K and verified glyphs can be displayed on Win2K on IE and Mozilla - made simple FT2 example that displays surrogate code range of code 2001
Status: NEW → ASSIGNED
Great work! FreeType2 2.0.9 has a new features that you may eventually want to look at to make scanning the 1 Million surrogate code points faster: - FT_Get_First_Char and FT_Get_Next_Char. These can be used to iterate efficiently over the currently selected charmap of a given face.
Yes, I tried FT_Get_First_Char() and FT_Get_Next_Char(), and verified they work propery for code 2001 font. > 1.3) Save surrogate ranges > Modify NextNonEmptyCCMapPage to support 32 bit > Modify NextNonEmptyCCMapPage to report surrogate ranges Do you think we need to modify whole nsCompressedCharMap? because SetChar() supports now 16 bit. http://lxr.mozilla.org/seamonkey/source/gfx/src/nsCompressedCharMap.cpp#245 244 void 245 nsCompressedCharMap::SetChar(PRUint16 aChar) 246 { 247 unsigned int i; 248 unsigned int upper_index = CCMAP_UPPER_INDEX(aChar); 249 unsigned int mid_index = CCMAP_MID_INDEX(aChar);
> Yes, I tried FT_Get_First_Char() and FT_Get_Next_Char(), and > verified they work propery for code 2001 font. Of course we will need a fallback for systems with older versions of FreeType2. > Do you think we need to modify whole nsCompressedCharMap? > because SetChar() supports now 16 bit. Shanjian added surrogate support to the CCmap code. Shanjian: could you kindly explain how to use the surrogate support you added to the CCMap? Thanks
It seems that codes around MapToCCMapExt() in windows makes ccmap. I'll refer them. http://lxr.mozilla.org/seamonkey/source/gfx/src/windows/nsFontMetricsWin.cpp#1479
Attached patch first drop (obsolete) — Splinter Review
I have put the first version of patch. With the patch and with CODE2001 TTF font, I can see the surrogate characters on Linux/Solaris Mozilla now. What I have done in patch, - added FCE_FLAGS_UCS4 flag for SURROGATE TrueType - still assume FCE_FLAGS_UCS4 has the same code points in FCE_FLAGS_UNICODE (need to verify) - enhanced nsFT2FontCatalog::NewFceFromFontFile() can handle SURROGATE for creating ccmap make map then make ccmap by MapToCCMapExt() for SURROGATE FCE_FLAGS_UCS4 make map then make ccmap by MapToCCMap() for non-SURROGATE FCE_FLAGS_UNICODE These codes from Windows - enhanced nsCompressedCharMap.cpp - NextNonEmptyCCMapPage() supports SURROGATE area now can store SURROGATE area to .db file - added nsCompressedCharMap.cpp - NewCCMapExt() makes a copy of mCCMap that contains SURROGATE area this is used in nsFT2FontCatalog::GetCCMap() - enhaced nsFT2FontCatalog::NewFceFromSummary() can handle SURROGATE for creating ccmap make map then make ccmap by MapToCCMapExt() for SURROGATE FCE_FLAGS_UCS4 make map then make ccmap by MapToCCMap() for non-SURROGATE FCE_FLAGS_UNICODE now can load SURROGATE area from .db file at next start up These codes from Windows - In nsRenderingContextGTK.cpp, added SURROGATE check into the following methods nsRenderingContextGTK::GetWidth() nsRenderingContextGTK::GetTextDimensions() nsRenderingContextGTK::DrawString() - In nsFreeType.cpp, added SURROGATE check into the following methods nsFreeTypeFont::doGetBoundingMetrics() nsFreeTypeFont::GetWidth() nsFreeTypeXImage::DrawString() - In nsFontMetricsGTK.cpp, changed PRUnichar to PRUint32 for support SURROGATE - CCMAP_HAS_CHAR() has bee changed to CCMAP_HAS_CHAR_EXT() for support SURROGATE
What I need to do more: - Use FT_Get_First_Char() and FT_Get_Next_Char() if usable but I don't think the current version (can all code points) is so slow - invalid code check, put NS_ASSERTION() - performance improvement - more testing
Brian/Frank, Please read the comments of http://bugzilla.mozilla.org/show_bug.cgi?id=127713#c15 and give your comments if I'm wrong. Also if you have time, please look at codes in the patch.
Great work! Your description (http://bugzilla.mozilla.org/show_bug.cgi?id=127713#c15) and code (attachment 75913 [details] [diff] [review]) look good. nit: > - added FCE_FLAGS_UCS4 flag for SURROGATE TrueType Could USC4 mean more than just surrogate? Perhaps FCE_FLAGS_SURROGATE would be more self-explanatory? > - added nsCompressedCharMap.cpp - NewCCMapExt() makes > a copy of mCCMap that contains SURROGATE area Is this new? I thought Shanjian made surrogates work on windows but I have not looked into the details. It might be nice if one function could build the proper map (surrogate/non-surrogate) depending on what was in the data. > extern PRBool NextNonEmptyCCMapPage(PRUint16 *, PRUint32 *); Do both these need to be PRUint32 ? > CCMAP_HAS_CHAR_EXT Would CCMAP_HAS_CHAR32 be more self-explanatory? if (charSetInfo->mCharSet) { + // if SURROGATE char, ignore charSetInfo->mCCMap checking + // because the exact ccmap is never created before loading + // NEED TO FIX: need better way + if (IS_SURROGATE(aChar) ) { + goto check_done; + } I'm unclear on how the charset map would be non-exact. Could you talk about this? + if(is_surrogate) { + i++; + } Could you add a comment here since I suspect non-i18n engineers will not see the (oblivious to us) connection. + fce->mFlags = FCE_FLAGS_ISVALID | FCE_FLAGS_UCS4; We probably want to also set the FCE_FLAGS_UNICODE flag here since this is used to distinguish Unicode encoded fonts from non-Unicode encoded fonts. For example symbol encoded fonts such as MathML fonts would not set FCE_FLAGS_UNICODE but would set FCE_FLAGS_SYMBOL. I'm not sure if in the future we will have to add support for other encoding. 298 #define CCMAP_BEGIN_AT_START_OF_MAP 0xFFFF We probably need to make this work for a 32 bit variable.
> if (charSetInfo->mCharSet) { > + // if SURROGATE char, ignore charSetInfo->mCCMap checking > + // because the exact ccmap is never created before loading > + // NEED TO FIX: need better way > + if (IS_SURROGATE(aChar) ) { > + goto check_done; > + } > > I'm unclear on how the charset map would be non-exact. > Could you talk about this? Yes, this is one outstanding issue. We currently use charset encoding for each font such as -vendor-family-charset encoding-charset registry. I understand before we load ttf font, charset map is created by using "charset encoding-charset registry". What "charset encoding-registry" should be used for surrogate fonts? code2001.ttf font is handled as "jjk-code2001-iso8859-1" and charsetmap is created as "iso8859-1". If previous maching is done for "iso8859-1", this will just return, before the exact ttf font is loaded.
>> extern PRBool NextNonEmptyCCMapPage(PRUint16 *, PRUint32 *); > >Do both these need to be PRUint32 ? I undersntand ccmap is still PRUint16* even if this contains surrogate. >> CCMAP_HAS_CHAR_EXT > >Would CCMAP_HAS_CHAR32 be more self-explanatory? I just use _EXT macro defined in nsCompressedCharMap.h. It is used in gfx/src/windows. Shanjian, what do you think? >> - added nsCompressedCharMap.cpp - NewCCMapExt() makes >> a copy of mCCMap that contains SURROGATE area > >Is this new? I thought Shanjian made surrogates work on >windows but I have not looked into the details. > >It might be nice if one function could build the proper >map (surrogate/non-surrogate) depending on what was in the data. Yes, that's true. I'll try to use the existing method.
It is great to see surrogate being added to Unix/Linux! > I understand before we load ttf font, charset map is created > by using "charset encoding-charset registry". What "charset > encoding-registry" should be used for surrogate fonts? Nice to see that you are looking so closely! The code actually maps it to all encoding-charsets indicated by the OS/2 table ulCodePageRange1/2: http://lxr.mozilla.org/seamonkey/source/gfx/src/x11shared/nsFT2FontCatalog.cpp#8 40 807 nsFT2FontCatalog::doGetFontNames(...) 808 { ... 832 for (i=0; i<mFontCatalog->numFonts; i++) { 833 nsFontCatalogEntry *fce = mFontCatalog->fonts[i]; ... 843 for (j=0; j<32; j++) { 844 unsigned long bit = 1 << j; 845 if (bit & fce->mCodePageRange1) { 846 charSetName = GetRange1CharSetName(bit); ... 851 } 852 if (bit & fce->mCodePageRange2) { 853 charSetName = GetRange2CharSetName(bit); > I undersntand ccmap is still PRUint16* even if this contains > surrogate. Will this work when the next page is above 0xFFFF ?
Thanks for the info. I understood. The problem is that the code2001 font reports fce->mCodePageRange1=1 fce->mCodePageRange2=0 which are the same as iso-8859-1... So, if iso-8859-1 font is loaded before code2001, the checking will return.
Attached patch 2nd patch (obsolete) — Splinter Review
I have made changes for Brian's comments. Also I've put First_Char() and Next_Char() methods for performance improvement.
Nice work! > I undersntand ccmap is still PRUint16* even if this contains > surrogate. Yes you are right. I see now. +#define IS_HIGH_SURROGATE(u) ... +#define IS_LOW_SURROGATE(u) ... +#define SURROGATE_TO_UCS4(h, l) ... +#define IS_SURROGATE(u) ... nit: Should these be in a place where they could be shared with other GUIs? + if(i < aLength-1 && IS_HIGH_SURROGATE(c) && IS_LOW_SURROGATE(aString[i+1])) { + // if surrogate, make UCS4 code point from high aString[i] and + // low surrogate aString[i+1] + c = SURROGATE_TO_UCS4(c, aString[i+1]); + is_surrogate = true; => i++; nit: Would it make sense to move the "i++" to the place where we detect the surrogate as you do in other parts of the code? (Please forgive me, I should have caughtthis last time) nsFT2FontCatalog::GetCCMap(nsFontCatalogEntry *aFce) { ... + extMap[plane_num] = new PRUint32[UCS2_MAP_LEN]; ... + delete [] extMap[i]; ... + extMap[plane_num] = new PRUint32[UCS2_MAP_LEN]; ... + delete [] extMap[i]; ... + delete [] extMap[i]; ... + aExtMap[plane_num] = new PRUint32[UCS2_MAP_LEN]; I'd like to ask if we could we move the CCMap allocation/deallocation code into nsCompressedCharMap. This is not a strict request so if it is very difficult you can skip this (sorry I missed this last time). + // check fast methods are defined + PRBool has_fast_methods = PR_FALSE; + if (nsFreeTypeFont::nsFT_Get_First_Char && + nsFreeTypeFont::nsFT_Get_Next_Char) { + has_fast_methods = PR_TRUE; + } Very nice! minor nit: could you change this from FONT_SCAN_PRINTF(("\b\b\b\b\b\b\b\b\b\b\b\b%5d glyphs", num_checked)); to FONT_SCAN_PRINTF(("\b\b\b\b\b\b\b\b\b\b\b\b\b\b%7d glyphs", num_checked)); - still assume FCE_FLAGS_UCS4 has the same code points in FCE_FLAGS_UNICODE (need to verify) Did you have a chance to verify this? r=bstell@ix.netcom.com from here on.
> +#define IS_HIGH_SURROGATE(u) ... > +#define IS_LOW_SURROGATE(u) ... > +#define SURROGATE_TO_UCS4(h, l) ... > +#define IS_SURROGATE(u) ... > > nit: Should these be in a place where they could be shared with > other GUIs? Do you have any idea for the place? How about gfx/public/nsFont.h ?
> How about gfx/public/nsFont.h ? Frank or Shanjian: where would you recommend?
Attached patch 3rd patch (obsolete) — Splinter Review
Attachment #75913 - Attachment is obsolete: true
Attachment #76391 - Attachment is obsolete: true
Now nsUnicharUtils.h has the definitions for > +#define IS_HIGH_SURROGATE(u) ... > +#define IS_LOW_SURROGATE(u) ... > +#define SURROGATE_TO_UCS4(h, l) ... I'm using the header file. It's the same with windows version.
3rd patch is attached. I have made all changes except allocation/deallocation CCMap in nsCompressedCharMap. I'm sorry it's difficult and I want to use the same scenario with windows. > I'd like to ask if we could we move the CCMap allocation/deallocation code > into nsCompressedCharMap. This is not a strict request so if it is very > difficult you can skip this (sorry I missed this last time). Brian, could you r= again please?
+// New version - support 32 bit aPageStart Nit: Would you drop this line since this will only be "new" for a short time? - // walk thru the upper pointers - PRUint16 *upper = &aCCMap[0]; - for (i=upper_index; i<CCMAP_NUM_UPPER_POINTERS; i++, mid_index=0) { - if (upper[i] == CCMAP_EMPTY_MID) { - continue; + for(l=planestart; l<=planeend; l++, *aPageStart = ... Nit: Could you replace the comment with something appropiate? > Now nsUnicharUtils.h has the definitions for > > > +#define IS_HIGH_SURROGATE(u) ... > > +#define IS_LOW_SURROGATE(u) ... > > +#define SURROGATE_TO_UCS4(h, l) ... > > I'm using the header file. It's the same with windows version. Any particular reason not to move IS_SURROGATE there as well? + // if SURROGATE char, ignore charSetInfo->mCCMap checking + // because the exact ccmap is never created before loading + // NEED TO FIX: need better way Is there a bug open for this? Is correct version of CCMAP_HAS_CHAR vs CCMAP_HAS_CHAR_EXT used in all the appropiate places? As a saftey check: Have you tested to see what will happen if CCMAP_HAS_CHAR is called with a 32 bit value? Or could you add an assert to warn if this is happening? + PRUint32 i, ii; for (i = 0; i < aLength; i++) { - PRUnichar c = aString[i]; + PRUint32 c = aString[i]; + ii = i; + if(i < aLength-1 && IS_HIGH_SURROGATE(c) && IS_LOW_SURROGATE(...) { + // if surrogate, make UCS4 code point from high aString[i] and + // low surrogate aString[i+1] + c = SURROGATE_TO_UCS4(c, aString[i+1]); + + // aString[i+1] is already used as low surrogate + i++; To make the logic here more obvious perhaps you could use a variable PRUint32 extraSurrogateLength. Initialize extraSurrogateLength to 0. Change the for loop to: for (i = 0; i < aLength; i+=1+extraSurrogateLength). Set extraSurrogateLength to 0 where you initialize ii. Set extraSurrogateLength to 1 when you detect a surrogate. + else if (face->charmaps[i]->encoding_id == TT_MS_ID_UCS_4) { Nit: Is the indentation here correct? Nit: would you change if (!glyph_index) to if (glyph_index == 0) + // alloc extMap[plane_num] if needed + plane_num = CCMAP_PLANE(i); + NS_ASSERTION(plane_num <= EXTENDED_UNICODE_PLANES,"invalid plane"); + if (plane_num <= EXTENDED_UNICODE_PLANES) { + if (extMap[plane_num] == 0) { + extMap[plane_num] = new PRUint32[UCS2_MAP_LEN]; + } + // set map + SET_REPRESENTABLE(extMap[plane_num], i & 0xffff); + + if (fce->mFlags & FCE_FLAGS_SURROGATE) { + // make ext ccmap from map + fce->mCCMap = MapToCCMapExt(map, extMap+1, EXTENDED_UNICODE_PLANES); + } else if (fce->mFlags & FCE_FLAGS_UNICODE) { + fce->mCCMap = MapToCCMap(map); + } + // free map + for (i = 1; i <= EXTENDED_UNICODE_PLANES; ++i) { + if (extMap[i]) { + delete [] extMap[i]; + } + } + // create map then create ccmap + // codes from Windows + PRUint32* extMap[EXTENDED_UNICODE_PLANES+1]; + memset(extMap+1, 0, sizeof(PRUint32*)*EXTENDED_UNICODE_PLANES); + PRUint32 map[UCS2_MAP_LEN]; + memset(map, 0, sizeof(map)); + extMap[0] = map; - fce->mCCMap = ccmapObj.NewCCMap(); + if (fce->mFlags & FCE_FLAGS_SURROGATE) { + // make ext ccmap from map + fce->mCCMap = MapToCCMapExt(map, extMap+1, EXTENDED_UNICODE_PLANES); + } else if (fce->mFlags & FCE_FLAGS_UNICODE) { + fce->mCCMap = MapToCCMap(map); + } + // free map + for (i = 1; i <= EXTENDED_UNICODE_PLANES; ++i) { + if (extMap[i]) { + delete [] extMap[i]; + } + } - if (byte & (1<<j)) - aCCMapObj->SetChar(pagechar); + if (byte & (1<<j)) { + // alloc aExtMap[plane_num] if needed + PRUint32 plane_num = CCMAP_PLANE(pagechar); + NS_ASSERTION(plane_num <= EXTENDED_UNICODE_PLANES,"invalid plane"); + if (plane_num > EXTENDED_UNICODE_PLANES) { + continue; + } + if (aExtMap[plane_num] == 0) { + aExtMap[plane_num] = new PRUint32[UCS2_MAP_LEN]; + } + // set map + SET_REPRESENTABLE(aExtMap[plane_num], pagechar & 0xffff); + } These manipulate the ccmap and should probably not be in the font code. Can they be moved to the ccmap code? - num = sscanf(value, "%lu", &uLongVal); + num = sscanf(value, "%lx", &uLongVal); Nit: What is the data type here? +static FtFuncList FtFuncsExt [] = { + {"FT_Get_First_Char", (PRFuncPtr*)&nsFreeTypeFont::nsFT_Get_First_Char}, + {"FT_Get_Next_Char", (PRFuncPtr*)&nsFreeTypeFont::nsFT_Get_Next_Char}, + {nsnull, (PRFuncPtr*)nsnull}, Nit: indentation here + // new functions for 2.0.9 + for (p=FtFuncsExt; p->FuncName; p++) { + *p->FuncPtr = PR_FindFunctionSymbol(sSharedLib, p->FuncName); + } Do we need to check for errors here? What happens if the lib is missing these functions? + // break not to overwrite by UNICODE_CS if UCS_4 found Nit: could you perhaps change this to something like + // UCS_4 is the most prefered cmap since it supports surrogates + // so stop here to avoid the possibly of getting UNICODE_CS which + // is the 2nd prefered choice.
Attached patch new patch (obsolete) — Splinter Review
Attachment #87487 - Attachment is obsolete: true
Brian, Thanks for useful comments, >Any particular reason not to move IS_SURROGATE there as well? No, should I need to define? >+ // if SURROGATE char, ignore charSetInfo->mCCMap checking >+ // because the exact ccmap is never created before loading >+ // NEED TO FIX: need better way > >Is there a bug open for this? Sure, I'll file the bug. >Is correct version of CCMAP_HAS_CHAR vs CCMAP_HAS_CHAR_EXT used in all the >appropiate places? I have changed all CCMAP_HAS_CHAR to CCMAP_HAS_CHAR_EXT because CCMAP_HAS_CHAR_EXT can be used for 16 bit. Any objection? >These manipulate the ccmap and should probably not be in the font code. >Can they be moved to the ccmap code? Yes. I've made changes in nsCompressedCharMap class to handle such extend area for surrogate. The same methods now can be used. Only the difference is to call ccmapObj.Expand() to enable surrogate suppport, like nsCompressedCharMap ccmapObj; if (aFce->mFlags & FCE_FLAGS_SURROGATE) { ccmapObj.Extend(); } ccmapObj.SetChars(aFce->mCCMap); return ccmapObj.NewCCMap(); We can now use the same method such as SetChars(), SetChar() and NewCCMap(). The codes can be smaller than before patch. >- num = sscanf(value, "%lu", &uLongVal); >+ num = sscanf(value, "%lx", &uLongVal); > >Nit: What is the data type here? By addition of FCE_FLAGS_SURROGATE 0x08 Now the flag can have the value such as Flags=0000000b. So I need to change from lu -> lx. >+ // new functions for 2.0.9 >+ for (p=FtFuncsExt; p->FuncName; p++) { >+ *p->FuncPtr = PR_FindFunctionSymbol(sSharedLib, p->FuncName); >+ } > >Do we need to check for errors here? >What happens if the lib is missing these functions? I understand these are optional functions. If the functions are not defined, FuncPrt is set to NULL. I checked whether the functions are defined or not like below in nsFT2FontCatalog.cpp, + if (nsFreeTypeFont::nsFT_Get_First_Char && + nsFreeTypeFont::nsFT_Get_Next_Char) { + has_fast_methods = PR_TRUE; + }
Attached patch new patch (obsolete) — Splinter Review
NS_ASSERTION(aChar > 0xffff) should be NS_ASSERTION(aChar <= 0xffff)
Attachment #92874 - Attachment is obsolete: true
Comment on attachment 93115 [details] [diff] [review] new patch > >Is there a bug open for this? > > Sure, I'll file the bug. I could not find the bug. Could you let me know what the bug # is? > I have changed all CCMAP_HAS_CHAR to CCMAP_HAS_CHAR_EXT because > CCMAP_HAS_CHAR_EXT can be used for 16 bit. Any objection? This seems fine. Nit: Should we just call it CCMAP_HAS_CHAR? Great work! r=bstell@ix.netcom.com
Attachment #93115 - Flags: review+
Comment on attachment 93115 [details] [diff] [review] new patch katakai: Can you port the gfx/src/gtk/-changes to gfx/src/xlib/ or should I do the work ?
Attachment #93115 - Flags: needs-work+
I'll work on it, let me try...
katakai- let's focus and land the gtk one first. one by one, step by step I really want to see the gtk part finished and land into trunk by 8/28. katakai, can you make it ?
Blocks: 157673
Keywords: nsbeta1+
If there is deadline then having the gtk version get in under the deadline and not the xlib version will really penalize the xlib version.
Roland: this means you need to hustle.
Roland?
Keywords: topembed
Katakai: lets get this sr='d and checked in
Keywords: topembedtopembed-
Reassigning to Roland. Please set a target milestone.
Assignee: katakai → Roland.Mainz
Status: ASSIGNED → NEW
A lot of work got done in bug 182877 before I saw it, so instead of marking it a dupe of this one, for the time being I am mutually cc-ing the respective owners of the two bugs so you can work together and decide which one to close.
The code path here looks to me to be independent of the code path for Xft. Thus this bug is not a dup of a Xft bug.
Thanks for the clarification, Brian. So Roland should work here on FreeType and ignore Xft, and Jungshik should work on Xft in bug 182877 and ignore Freetype?
yes
what is the progress now?
gfx codes have been changed in many place while requesting sr=. I'll restart the work for gfx part and will post new patch soon.
Assignee: Roland.Mainz → katakai
Attached patch updated patch for current tree (obsolete) — Splinter Review
Attachment #93115 - Attachment is obsolete: true
Most codes of the previous could be re-used for the current tree. But one thing, I had to modify the codes for FT_Get_First_Char() and FT_Get_Next_Char(). FreeType functions are now being accessed via nsIFreeType2.idl. I added the following to idl, + FT_ULong getFirstChar(in FT_Face face, out FT_UInt gindex); + FT_ULong getNextChar(in FT_Face face, in FT_ULong charcode, out FT_UInt gindex); + void supportsExtFunc(out PRBool res); SupportsExtFunc() will check the FreeType2 library in the system can support FT_Get_First_Char() and FT_Get_Next_Char() functions for fast looking up of glyph index. Brian, sorry for dup work, but could you review please? I have a question. There are some dup files for nsCompressedCharMap nsFT2FontCatalog nsFreeType undef gfx/. I modified only the files for gtk build but should I update both? ./public/nsCompressedCharMap.h ./src/nsCompressedCharMap.cpp ./src/shared/nsCompressedCharMap.cpp ./src/shared/nsCompressedCharMap.h ./src/freetype/nsFT2FontCatalog.cpp ./src/freetype/nsFT2FontCatalog.h ./src/x11shared/nsFT2FontCatalog.cpp ./src/x11shared/nsFT2FontCatalog.h ./src/freetype/nsFreeType.cpp ./src/freetype/nsFreeType.h ./src/x11shared/nsFreeType.cpp ./src/x11shared/nsFreeType.h
Status: NEW → ASSIGNED
Attachment #111684 - Flags: review?(bstell)
Overall this looks great. There are a few items to discuss. 1) I am little concerned about overwriting the input this way + // checking each plane + for(l=planestart; l<=planeend; l++, *aPageStart = CCMAP_BEGIN_AT_START_OF_MAP) { 2) need to check if malloc failed + mExtMap[plane_num] = (PRUint32*)PR_Malloc(sizeof(PRUint32)*UCS2_MAP_LEN); + memset(mExtMap[plane_num], 0, sizeof(PRUint32)*UCS2_MAP_LEN); 3) I'm curious if it would be easier to call "SetChars((PRUint16)base, page);" instead of scanning the bits. + if(mExtended){ + PRUint32 page = CCMAP_BEGIN_AT_START_OF_MAP; + while (NextNonEmptyCCMapPage(aCCMap, &page)) { + PRUint32 pagechar = page; + for (i=0; i<(CCMAP_BITS_PER_PAGE/8); i++) { + for (j=0; j<8; j++) { + if (CCMAP_HAS_CHAR_EXT(aCCMap, pagechar)) { + SetChar(pagechar); + } + pagechar++; + } + } + } 4) please convert the tabs to spaces + {"FT_Done_Face", NS_FT2_OFFSET(nsFT_Done_Face), PR_TRUE}, + {"FT_Done_FreeType", NS_FT2_OFFSET(nsFT_Done_FreeType), PR_TRUE}, 5) Nit: a slightly more positive way to say this: +#define IS_SURROGATE(u) (u > 0xFFFF) would be: +#define IS_SURROGATE(u) (u >= 0x10000) 6) Is there a bug open for this? + // if SURROGATE char, ignore charSetInfo->mCCMap checking + // because the exact ccmap is never created before loading + // NEED TO FIX: need better way + if (IS_SURROGATE(aChar) ) { + goto check_done; 7) It might be a bit more clear if: + PRBool need_scan = PR_TRUE; was changed to scan_done (and the logic was changed as appropiate) 8) How about using extraSurrogateLength here as in the earlier part of the patch? + // aString[i+1] is already used as low surrogate, skip i+1 + i++;
I do not know how these dups happened. ./public/nsCompressedCharMap.h ./src/nsCompressedCharMap.cpp ./src/shared/nsCompressedCharMap.cpp ./src/shared/nsCompressedCharMap.h these should be cleaned up when bug 180668 is checked in ./src/freetype/nsFT2FontCatalog.cpp ./src/freetype/nsFT2FontCatalog.h ./src/x11shared/nsFT2FontCatalog.cpp ./src/x11shared/nsFT2FontCatalog.h ./src/freetype/nsFreeType.cpp ./src/freetype/nsFreeType.h ./src/x11shared/nsFreeType.cpp ./src/x11shared/nsFreeType.h
Attached patch updated patchSplinter Review
Attachment #111684 - Attachment is obsolete: true
Thank you very much for review, > 1) I am little concerned about overwriting the input this way > > + // checking each plane > + for(l=planestart; l<=planeend; l++, *aPageStart = CCMAP_BEGIN_AT_START_OF_MAP) { OK, I provided pagestart for aPageStart, PRUint32 pagestart = *aPageStart; so that aPageStart will be changed only when return TRUE. *aPageStart = (((PRUint32)l)<<16)+base; return PR_TRUE; > 2) need to check if malloc failed > > + mExtMap[plane_num] = (PRUint32*)PR_Malloc(sizeof(PRUint32)*UCS2_MAP_LEN); > + memset(mExtMap[plane_num], 0, sizeof(PRUint32)*UCS2_MAP_LEN); fixed. > 3) I'm curious if it would be easier to call "SetChars((PRUint16)base, page);" > instead of scanning the bits. no changes in the patch... > + if(mExtended){ > + PRUint32 page = CCMAP_BEGIN_AT_START_OF_MAP; > + while (NextNonEmptyCCMapPage(aCCMap, &page)) { > + PRUint32 pagechar = page; > + for (i=0; i<(CCMAP_BITS_PER_PAGE/8); i++) { > + for (j=0; j<8; j++) { > + if (CCMAP_HAS_CHAR_EXT(aCCMap, pagechar)) { > + SetChar(pagechar); > + } > + pagechar++; > + } > + } > + } > > 4) please convert the tabs to spaces > > + {"FT_Done_Face", NS_FT2_OFFSET(nsFT_Done_Face), PR_TRUE}, > + {"FT_Done_FreeType", NS_FT2_OFFSET(nsFT_Done_FreeType), PR_TRUE}, fixed. > 5) Nit: a slightly more positive way to say this: > > +#define IS_SURROGATE(u) (u > 0xFFFF) > > would be: > > +#define IS_SURROGATE(u) (u >= 0x10000) fixed. > 6) Is there a bug open for this? > > + // if SURROGATE char, ignore charSetInfo->mCCMap checking > + // because the exact ccmap is never created before loading > + // NEED TO FIX: need better way > + if (IS_SURROGATE(aChar) ) { > + goto check_done; filed as bug 189425. > 7) It might be a bit more clear if: > > + PRBool need_scan = PR_TRUE; > > was changed to scan_done (and the logic was changed as appropiate) I removed "break;" and changed to the below in while() while (!scan_done && i < len) { please let me know when you find better logic and codes. > 8) How about using extraSurrogateLength here as in the earlier part of the patch? > > + // aString[i+1] is already used as low surrogate, skip i+1 > + i++; Agreed. fixed.
Attachment #112041 - Flags: review?(bstell)
Attachment #112041 - Flags: review?(bstell) → review+
Attachment #112041 - Flags: superreview?(bryner)
Attachment #112041 - Flags: superreview?(bryner) → superreview+
Flags: blocking1.3b?
Attachment #112041 - Flags: approval1.3b?
Flags: blocking1.3b?
Comment on attachment 112041 [details] [diff] [review] updated patch Please hold this until we open for 1.4alpha development. Thanks.
Attachment #112041 - Flags: approval1.3b? → approval1.3b-
Target Milestone: --- → mozilla1.4alpha
Comment on attachment 111684 [details] [diff] [review] updated patch for current tree clearing review request on obsolete patch
Attachment #111684 - Flags: review?(bstell)
patch checked into 1.4a.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
so I hate to say it, but this bug increased footprint by about 3.4k. Is there any optimization that can be done to make up for (or fix!) the code size increase? It looks like a lot of functions changed type signature and grew by a few bytes in the process..*changing that one paramter from char to PRUnichar I think) there isn't any glaringly obvious growth here, just lots of functions growing from 4 to 50 bytes each, and some new functions.. libgfx_gtk.so Total: +3420 (+19088/-15668) Code: +3292 (+15476/-12184) Data: +128 (+3612/-3484) +3292 (+15476/-12184) T (CODE) +3292 (+15476/-12184) UNDEF:libgfx_gtk.so:T +2356 nsFontMetricsGTK::FindStyleSheetGenericFont(unsigned int) +2144 nsFontMetricsGTK::FindLangGroupPrefFont(nsIAtom *, unsigned int) +2072 nsFontMetricsGTK::PickASizeAndLoad(nsFontStretch *, nsFontCharSetInfo *, unsigned int, char const *) +1052 nsFontMetricsGTK::TryNode(nsCString *, unsigned int) +1020 nsFontMetricsGTK::TryNodes(nsACString &, unsigned int) +804 nsFontMetricsGTK::TryFamily(nsCString *, unsigned int) +596 nsFontMetricsGTK::SearchNode(nsFontNode *, unsigned int) +580 nsFontMetricsGTK::FindLangGroupFont(nsIAtom *, unsigned int, nsCString *) +464 nsFontMetricsGTK::FindAnyFont(unsigned int) +456 nsFontMetricsGTK::FindFont(unsigned int) +448 nsFontMetricsGTK::FindStyleSheetSpecificFont(unsigned int) +376 nsFT2FontCatalog::NewFceFromFontFile(FT_LibraryRec_ *, char const *, int, int *) +300 SetFontCharsetInfo(nsFontGTK *, nsFontCharSetInfo *, unsigned int) +220 nsFontMetricsGTK::DrawString(unsigned short const *, unsigned int, int, int, int, int const *, nsRenderingContextGTK *, nsDrawingSurfaceGTK *) +216 nsFontMetricsGTK::GetBoundingMetrics(unsigned short const *, unsigned int, nsBoundingMetrics &, int *, nsRenderingContextGTK *) +212 nsFontGTK::SupportsChar(unsigned int) +212 nsFontMetricsGTK::GetWidth(unsigned short const *, unsigned int, int &, int *, nsRenderingContextGTK *) +204 nsFontMetricsGTK::GetTextDimensions(unsigned short const *, unsigned int, nsTextDimensions &, int *, nsRenderingContextGTK *) +172 nsFontMetricsGTK::FindSubstituteFont(unsigned int) +168 nsFontMetricsGTK::TryAliases(nsCString *, unsigned int) +156 nsFontMetricsGTK::ResolveForwards(unsigned short const *, unsigned int, int (*)(nsFontSwitchGTK const *, unsigned short const *, unsigned int, void *), void *) +156 nsFontMetricsGTK::TryLangGroup(nsIAtom *, nsCString *, unsigned int) +144 nsFT2FontCatalog::PrintPageBits(nsNameValuePairDB *, unsigned short *, unsigned int) +128 InitGlobals(nsIDeviceContext *) +124 nsFontMetricsGTK::FindUserDefinedFont(unsigned int) +116 nsFontMetricsGTK::LocateFont(unsigned int, int &) +96 nsFreeTypeFont::doGetBoundingMetrics(unsigned short const *, unsigned int, int *, int *, int *, int *, int *) +92 nsFreeTypeFont::GetWidth(unsigned short const *, unsigned int) +92 nsFreeTypeXImage::DrawString(nsRenderingContextGTK *, nsDrawingSurfaceGTK *, int, int, unsigned short const *, unsigned int) +68 nsFT2FontCatalog::NewFceFromSummary(nsNameValuePairDB *) +44 nsFreeType2::GetCCMap(nsFontCatalogEntry *) +40 nsFreeType2::GetNextChar(FT_FaceRec_ *, unsigned long, unsigned int *, unsigned long *) +40 nsFreeType2::SupportsExtFunc(int *) +36 nsFreeType2::GetFirstChar(FT_FaceRec_ *, unsigned int *, unsigned long *) +36 nsFreeType2::LoadSharedLib(void) +20 SetUpFontCharSetInfo(nsFontCharSetInfo *) +16 GetMapFor10646Font(XFontStruct *) -4 nsFT2FontCatalog::PrintCCMap(nsNameValuePairDB *, unsigned short *) -4 PrefEnumCallback(char const *, void *) -104 nsFontGTK::SupportsChar(unsigned short) -132 nsFontMetricsGTK::FindUserDefinedFont(unsigned short) -156 nsFontMetricsGTK::TryLangGroup(nsIAtom *, nsCString *, unsigned short) -172 nsFontMetricsGTK::TryAliases(nsCString *, unsigned short) -172 nsFontMetricsGTK::FindSubstituteFont(unsigned short) -188 SetFontCharsetInfo(nsFontGTK *, nsFontCharSetInfo *, unsigned short) -364 nsFontMetricsGTK::FindAnyFont(unsigned short) -460 nsFontMetricsGTK::FindStyleSheetSpecificFont(unsigned short) -472 nsFontMetricsGTK::FindFont(unsigned short) -600 nsFontMetricsGTK::SearchNode(nsFontNode *, unsigned short) -600 nsFontMetricsGTK::FindLangGroupFont(nsIAtom *, unsigned short, nsCString *) -740 nsFontMetricsGTK::TryFamily(nsCString *, unsigned short) -908 nsFontMetricsGTK::TryNodes(nsACString &, unsigned short) -960 nsFontMetricsGTK::TryNode(nsCString *, unsigned short) -1900 nsFontMetricsGTK::FindStyleSheetGenericFont(unsigned short) -2064 nsFontMetricsGTK::PickASizeAndLoad(nsFontStretch *, nsFontCharSetInfo *, unsigned short, char const *) -2184 nsFontMetricsGTK::FindLangGroupPrefFont(nsIAtom *, unsigned short) +96 (+112/-16) D (DATA) +96 (+112/-16) UNDEF:libgfx_gtk.so:D +96 nsFreeType2::FtFuncs +8 getenv_done.2846 +4 already_complained.2826 +4 nsFreeType2::gHasExtFunc -4 gFreeTypeFaces -4 already_complained.2823 -8 getenv_done.2843 +32 (+3500/-3468) R (DATA) +32 (+3500/-3468) UNDEF:libgfx_gtk.so:R +2520 __PRETTY_FUNCTION__.2758 +128 __PRETTY_FUNCTION__.2254 +128 __PRETTY_FUNCTION__.2272 +128 __PRETTY_FUNCTION__.2329 +128 __PRETTY_FUNCTION__.2332 +116 __PRETTY_FUNCTION__.2356 +96 __PRETTY_FUNCTION__.2263 +96 __PRETTY_FUNCTION__.2287 +96 __PRETTY_FUNCTION__.2314 +32 __PRETTY_FUNCTION__.10 +32 __PRETTY_FUNCTION__.2260 -32 __PRETTY_FUNCTION__.2266 -96 __PRETTY_FUNCTION__.2284 -96 __PRETTY_FUNCTION__.2281 -96 __PRETTY_FUNCTION__.2251 -116 __PRETTY_FUNCTION__.2350 -128 __PRETTY_FUNCTION__.2326 -128 __PRETTY_FUNCTION__.2323 -256 __PRETTY_FUNCTION__.2752 -256 __PRETTY_FUNCTION__.2248 -2264 __PRETTY_FUNCTION__.2755
Hi Alec, Yes, I needed to change PRUnichar to PRUint32 to handle surrogate in each function. The type change of PRUnichar to PRUint32 also has been needed in other platform e.g. gfx/src/windows when adding surrogate support. http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&root=/cvsroot&subdir=mozilla/gfx/src/windows&command=DIFF_FRAMESET&file=nsFontMetricsWin.h&rev2=3.41&rev1=3.40 but yes, I need to check the codes again.
On 02-25 trunk build / linux RH8.0, the GB18030 plane 2 surrogate characters are still displayed as "?" using truetype font SURSONG.TTF. While GB18030 extension A characters are displayed fine.
None of the testcases linked from bug 118000 work for me in current Linux trunk, and some hang. I have user_pref("font.FreeType2.enable", true); user_pref("font.directory.truetype.1", "~/fonts"); in my prefs.js, and FreeType works for me other than this.
I don't have SURSONG.TTF but was using simsun18030.ttc and code2001.ttf, it works for me on Trunk. When you already have fonts/.mozilla_font_summary.ndb file in your ttf font directory, please remove the file, then start Mozilla again.
Thanks for the tip, Katakai. After deleting ~/fonts/.mozilla_font_summary.ndb all the testcases now work for me.
Status: RESOLVED → VERIFIED
Thank you Simon for verification. Yuying, are you still seeing problem? Actually I've never tried SURSONG.TTF font. I always used simsun18030 and code2001. I understand SURSONG.TTF is not free, right? I'll try to get the font. Btw, do you have any good test page for plane 2? If you have please send to me. If the same page can be displayed on Windows, it seems bug for me. Please open new bug report and assign to me.
Yes, I still have problem with display plane 2 extension B characters; I think simsun.ttc and code2001.ttf are plane 1 not CJK extension B. Bug 195154 filed for remaining problem.
> Btw, do you have any good test page for plane 2? My test pages for various UTF's at http://jshin.net/i18n/utftest have a _few_ random plane 2 characters thrown in. I can make a plane 2 character table, but you can do that as easily as I can....
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: