support Surrogate display on Linux by using FreeType

VERIFIED FIXED in mozilla1.4alpha

Status

()

VERIFIED FIXED
17 years ago
16 years ago

People

(Reporter: ftang, Assigned: masaki.katakai)

Tracking

({intl, topembed-})

Trunk
mozilla1.4alpha
x86
Linux
intl, topembed-
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(4 attachments, 6 obsolete attachments)

(Reporter)

Description

17 years ago
Now bstell's FreeType work is in the tree. we should think about using it to
display Surrogate characters 

Shanjian did the surrogate enablement in Windows already. We probably need to
use similar approach to address it. 

katakai- can you take charge of this issue? I don't think we have enough time to
complete it before m1.0, but we definitely sould start the design now. 

We can use the "Code 2001" font to test it and later use Microsoft's surrogate
fonts or tne new OfficeXP simp chinese font. Maybe other vendors like sun will
license some surrogate TTF fonts from foundary.
 
Once this work is completed and FreeType is turn on, then we can support Unicode
Idograph Extension B on Linux
(Reporter)

Comment 1

17 years ago
shanjian- can you put the links about your window surrogates support in here.
I think katakai need to know how you did that in windows also where he can find
the public surrogate fonts. 

bstell- can you put down where he can find FreeType API documentation here too?

Updated

17 years ago
Keywords: intl
QA Contact: ruixu → ylong
(Reporter)

Comment 2

17 years ago
here is the infor for code 2001 font.
>Frank, Shanjian,
>
>James Kass' font works well and covers the characters in the example
>page at: http://www.geocities.com/i18nguy/unicode-example-plane1.html
>
>The page has some information on setting up Windows for surrogates.
>
>Kass' Code 2001 font is at:
>http://home.att.net/~jameskass/CODE2001.ZIP
>
>tex

(Reporter)

Comment 3

17 years ago
>also, look at http://www.microsoft.com/downloads/release.asp?ReleaseID=31326
for MS surrogate font
(Reporter)

Comment 4

17 years ago
the font could be found at
http://www.microsoft.com/downloads/release.asp?ReleaseID=32385

Comment 5

17 years ago
Font for CJK extension A can be found in MS website, link is available in bug 
102254. Extension A has nothing to do with surrogate. So this font probably is 
not needed for this bug, but you might want to verify to see if it works. If 
not, we should file a bug and fix it. 

CCMAP support for surrogates can be found in 110843. This work is available for 
all platforms. But you might want to take a look to see how to use it. It is 
very simple though. 

Handling new OpenType format12 table is implemented in bug 118606. A unix 
implemetation is needed before TTF (or more accurately, OpenType) font can be 
use to support surrogates. See next part to get a font for testing.

The final patch that make surrogates work on windows is bug 118000. It also 
contains links to download Code2001, a TTF font support surrogate. Several links 
to testcases were also provided. 

Comment 6

17 years ago
Created attachment 72932 [details]
prelim notes on adding surrogate support to linux moz

this is a working draft
(Assignee)

Comment 7

17 years ago
status:
- installed code 2001 fonts to Win2K and
  verified glyphs can be displayed on Win2K on
  IE and Mozilla
- made simple FT2 example that displays
  surrogate code range of code 2001
Status: NEW → ASSIGNED

Comment 8

17 years ago
Great work!

FreeType2 2.0.9 has a new features that you may eventually want to look at to
make scanning the 1 Million surrogate code points faster:

    - FT_Get_First_Char and FT_Get_Next_Char.
      These can be used to iterate efficiently over the currently
      selected charmap of a given face.  
(Assignee)

Comment 9

17 years ago
Yes, I tried FT_Get_First_Char() and FT_Get_Next_Char(), and
verified they work propery for code 2001 font.

> 1.3) Save surrogate ranges
> Modify NextNonEmptyCCMapPage to support 32 bit
> Modify NextNonEmptyCCMapPage to report surrogate ranges

Do you think we need to modify whole nsCompressedCharMap?
because SetChar() supports now 16 bit.

http://lxr.mozilla.org/seamonkey/source/gfx/src/nsCompressedCharMap.cpp#245

244 void
245 nsCompressedCharMap::SetChar(PRUint16 aChar)
246 {
247   unsigned int i;
248   unsigned int upper_index      = CCMAP_UPPER_INDEX(aChar);
249   unsigned int mid_index        = CCMAP_MID_INDEX(aChar);

Comment 10

17 years ago
> Yes, I tried FT_Get_First_Char() and FT_Get_Next_Char(), and
> verified they work propery for code 2001 font.

Of course we will need a fallback for systems with older versions of
FreeType2.

> Do you think we need to modify whole nsCompressedCharMap?
> because SetChar() supports now 16 bit.

Shanjian added surrogate support to the CCmap code.

Shanjian: could you kindly explain how to use the surrogate support you added 
to the CCMap? Thanks
(Assignee)

Comment 11

17 years ago
It seems that codes around MapToCCMapExt() in windows
makes ccmap. I'll refer them.

http://lxr.mozilla.org/seamonkey/source/gfx/src/windows/nsFontMetricsWin.cpp#1479
(Assignee)

Comment 12

17 years ago
Created attachment 75913 [details] [diff] [review]
first drop
(Assignee)

Comment 15

17 years ago
I have put the first version of patch. With the patch
and with CODE2001 TTF font, I can see the surrogate
characters on Linux/Solaris Mozilla now.

What I have done in patch,

- added FCE_FLAGS_UCS4 flag for SURROGATE TrueType

- still assume FCE_FLAGS_UCS4 has the same code points
  in FCE_FLAGS_UNICODE (need to verify)

- enhanced nsFT2FontCatalog::NewFceFromFontFile() can
  handle SURROGATE for creating ccmap

  make map then make ccmap by MapToCCMapExt()
  for SURROGATE FCE_FLAGS_UCS4
  
  make map then make ccmap by MapToCCMap() for
  non-SURROGATE FCE_FLAGS_UNICODE
  
  These codes from Windows

- enhanced nsCompressedCharMap.cpp - 
  NextNonEmptyCCMapPage() supports SURROGATE area
  now can store SURROGATE area to .db file

- added nsCompressedCharMap.cpp - NewCCMapExt() makes
  a copy of mCCMap that contains SURROGATE area
  
  this is used in nsFT2FontCatalog::GetCCMap()

- enhaced nsFT2FontCatalog::NewFceFromSummary() can 
  handle SURROGATE for creating ccmap
  
  make map then make ccmap by MapToCCMapExt() for 
  SURROGATE FCE_FLAGS_UCS4
  
  make map then make ccmap by MapToCCMap() for 
  non-SURROGATE FCE_FLAGS_UNICODE
  
  now can load SURROGATE area from .db file at next
  start up
  
  These codes from Windows

- In nsRenderingContextGTK.cpp, added SURROGATE
  check into the following methods

        nsRenderingContextGTK::GetWidth()
        nsRenderingContextGTK::GetTextDimensions()
        nsRenderingContextGTK::DrawString()

- In nsFreeType.cpp, added SURROGATE check into the 
  following methods

        nsFreeTypeFont::doGetBoundingMetrics()
        nsFreeTypeFont::GetWidth()
        nsFreeTypeXImage::DrawString()

- In nsFontMetricsGTK.cpp, changed PRUnichar to PRUint32 
  for support SURROGATE

- CCMAP_HAS_CHAR() has bee changed to   
  CCMAP_HAS_CHAR_EXT() for support SURROGATE
(Assignee)

Comment 16

17 years ago
What I need to do more:

- Use FT_Get_First_Char() and FT_Get_Next_Char() if usable
  but I don't think the current version (can all code points)
  is so slow
- invalid code check, put NS_ASSERTION()
- performance improvement
- more testing
(Assignee)

Comment 17

17 years ago
Brian/Frank,

Please read the comments of
 http://bugzilla.mozilla.org/show_bug.cgi?id=127713#c15

and give your comments if I'm wrong.

Also if you have time, please look at codes in the patch.

Comment 18

17 years ago
Great work!

Your description (http://bugzilla.mozilla.org/show_bug.cgi?id=127713#c15)
and code (attachment 75913 [details] [diff] [review]) look good.

nit:
> - added FCE_FLAGS_UCS4 flag for SURROGATE TrueType

Could USC4 mean more than just surrogate?
Perhaps FCE_FLAGS_SURROGATE would be more self-explanatory?

> - added nsCompressedCharMap.cpp - NewCCMapExt() makes
>   a copy of mCCMap that contains SURROGATE area

Is this new? I thought Shanjian made surrogates work on
windows but I have not looked into the details.

It might be nice if one function could build the proper
map (surrogate/non-surrogate) depending on what was in the data.

> extern PRBool NextNonEmptyCCMapPage(PRUint16 *, PRUint32 *);

Do both these need to be PRUint32 ?

> CCMAP_HAS_CHAR_EXT

Would CCMAP_HAS_CHAR32 be more self-explanatory?

   if (charSetInfo->mCharSet) {
+    // if SURROGATE char, ignore charSetInfo->mCCMap checking
+    // because the exact ccmap is never created before loading
+    // NEED TO FIX: need better way
+    if (IS_SURROGATE(aChar) ) {
+      goto check_done;
+    }

I'm unclear on how the charset map would be non-exact.
Could you talk about this?

+      if(is_surrogate) {
+        i++;
+      }

Could you add a comment here since I suspect non-i18n
engineers will not see the (oblivious to us) connection.

+          fce->mFlags = FCE_FLAGS_ISVALID | FCE_FLAGS_UCS4;

We probably want to also set the FCE_FLAGS_UNICODE flag here
since this is used to distinguish Unicode encoded fonts from 
non-Unicode encoded fonts. For example symbol encoded fonts
such as MathML fonts would not set FCE_FLAGS_UNICODE but would
set FCE_FLAGS_SYMBOL. I'm not sure if in the future we will
have to add support for other encoding.

298 #define CCMAP_BEGIN_AT_START_OF_MAP 0xFFFF

We probably need to make this work for a 32 bit variable.
(Assignee)

Comment 19

17 years ago
>   if (charSetInfo->mCharSet) {
> +    // if SURROGATE char, ignore charSetInfo->mCCMap checking
> +    // because the exact ccmap is never created before loading
> +    // NEED TO FIX: need better way
> +    if (IS_SURROGATE(aChar) ) {
> +      goto check_done;
> +    }
> 
> I'm unclear on how the charset map would be non-exact.
> Could you talk about this?

Yes, this is one outstanding issue. We currently use
charset encoding for each font such as -vendor-family-charset
encoding-charset registry.

I understand before we load ttf font, charset map is created
by using "charset encoding-charset registry". What "charset
encoding-registry" should be used for surrogate fonts?
code2001.ttf font is handled as "jjk-code2001-iso8859-1"
and charsetmap is created as "iso8859-1". If previous
maching is done for "iso8859-1", this will just return,
before the exact ttf font is loaded.
(Assignee)

Comment 20

17 years ago
>> extern PRBool NextNonEmptyCCMapPage(PRUint16 *, PRUint32 *);
>
>Do both these need to be PRUint32 ?

I undersntand ccmap is still PRUint16* even if this contains
surrogate.

>> CCMAP_HAS_CHAR_EXT
>
>Would CCMAP_HAS_CHAR32 be more self-explanatory?

I just use _EXT macro defined in nsCompressedCharMap.h. It
is used in gfx/src/windows. Shanjian, what do you think?

>> - added nsCompressedCharMap.cpp - NewCCMapExt() makes
>>   a copy of mCCMap that contains SURROGATE area
>
>Is this new? I thought Shanjian made surrogates work on
>windows but I have not looked into the details.
>
>It might be nice if one function could build the proper
>map (surrogate/non-surrogate) depending on what was in the data.

Yes, that's true. I'll try to use the existing method.




Comment 21

17 years ago
It is great to see surrogate being added to Unix/Linux!

> I understand before we load ttf font, charset map is created
> by using "charset encoding-charset registry". What "charset
> encoding-registry" should be used for surrogate fonts?

Nice to see that you are looking so closely!

The code actually maps it to all encoding-charsets indicated by the
OS/2 table ulCodePageRange1/2:

http://lxr.mozilla.org/seamonkey/source/gfx/src/x11shared/nsFT2FontCatalog.cpp#8
40

807 nsFT2FontCatalog::doGetFontNames(...)
808 {
...
832   for (i=0; i<mFontCatalog->numFonts; i++) {
833     nsFontCatalogEntry *fce = mFontCatalog->fonts[i];
...
843       for (j=0; j<32; j++) {
844         unsigned long bit = 1 << j;
845         if (bit & fce->mCodePageRange1) {
846           charSetName = GetRange1CharSetName(bit);
...
851         }
852         if (bit & fce->mCodePageRange2) {
853           charSetName = GetRange2CharSetName(bit);

> I undersntand ccmap is still PRUint16* even if this contains
> surrogate.

Will this work when the next page is above 0xFFFF ?
(Assignee)

Comment 22

17 years ago
Thanks for the info. I understood.

The problem is that the code2001 font reports

fce->mCodePageRange1=1
fce->mCodePageRange2=0

which are the same as iso-8859-1... So, if
iso-8859-1 font is loaded before code2001,
the checking will return.
(Assignee)

Comment 23

17 years ago
Created attachment 76391 [details] [diff] [review]
2nd patch
(Assignee)

Comment 24

17 years ago
I have made changes for Brian's comments.

Also I've put First_Char() and Next_Char() methods for
performance improvement.

Comment 25

17 years ago
Nice work!

> I undersntand ccmap is still PRUint16* even if this contains
> surrogate.

Yes you are right. I see now.

+#define IS_HIGH_SURROGATE(u)  ...
+#define IS_LOW_SURROGATE(u)  ...
+#define SURROGATE_TO_UCS4(h, l)  ...
+#define IS_SURROGATE(u)      ...

nit: Should these be in a place where they could be shared with
other GUIs?

+      if(i < aLength-1 && IS_HIGH_SURROGATE(c) && 
IS_LOW_SURROGATE(aString[i+1])) {
+        // if surrogate, make UCS4 code point from high aString[i] and
+        // low surrogate aString[i+1]
+        c = SURROGATE_TO_UCS4(c, aString[i+1]);
+        is_surrogate = true;
=>       i++;

nit: Would it make sense to move the "i++" to the place where we detect the
surrogate as you do in other parts of the code? (Please forgive me, I should 
have caughtthis last time)

nsFT2FontCatalog::GetCCMap(nsFontCatalogEntry *aFce)
 {
...
+                extMap[plane_num] = new PRUint32[UCS2_MAP_LEN];
...
+        delete [] extMap[i];
...
+          extMap[plane_num] = new PRUint32[UCS2_MAP_LEN];
...
+        delete [] extMap[i];
...
+      delete [] extMap[i];
...
+          aExtMap[plane_num] = new PRUint32[UCS2_MAP_LEN];

I'd like to ask if we could we move the CCMap allocation/deallocation code 
into nsCompressedCharMap. This is not a strict request so if it is very
difficult you can skip this (sorry I missed this last time).

+    // check fast methods are defined
+    PRBool has_fast_methods = PR_FALSE;
+    if (nsFreeTypeFont::nsFT_Get_First_Char &&
+                 nsFreeTypeFont::nsFT_Get_Next_Char) {
+      has_fast_methods = PR_TRUE;
+    }

Very nice!

minor nit:
could you change this from

 FONT_SCAN_PRINTF(("\b\b\b\b\b\b\b\b\b\b\b\b%5d glyphs", num_checked));

to

 FONT_SCAN_PRINTF(("\b\b\b\b\b\b\b\b\b\b\b\b\b\b%7d glyphs", num_checked));


- still assume FCE_FLAGS_UCS4 has the same code points
  in FCE_FLAGS_UNICODE (need to verify)

Did you have a chance to verify this?

r=bstell@ix.netcom.com from here on.
(Assignee)

Comment 26

17 years ago
> +#define IS_HIGH_SURROGATE(u)  ...
> +#define IS_LOW_SURROGATE(u)  ...
> +#define SURROGATE_TO_UCS4(h, l)  ...
> +#define IS_SURROGATE(u)      ...
>
> nit: Should these be in a place where they could be shared with
> other GUIs?

Do you have any idea for the place? 

How about gfx/public/nsFont.h ?

Comment 27

17 years ago
> How about gfx/public/nsFont.h ?

Frank or Shanjian: where would you recommend?
(Assignee)

Comment 28

17 years ago
Created attachment 87487 [details] [diff] [review]
3rd patch
Attachment #75913 - Attachment is obsolete: true
Attachment #76391 - Attachment is obsolete: true
(Assignee)

Comment 29

17 years ago
Now nsUnicharUtils.h has the definitions for

> +#define IS_HIGH_SURROGATE(u)  ...
> +#define IS_LOW_SURROGATE(u)  ...
> +#define SURROGATE_TO_UCS4(h, l)  ...

I'm using the header file. It's the same with windows version.
(Assignee)

Comment 30

17 years ago
3rd patch is attached.

I have made all changes except allocation/deallocation CCMap in
nsCompressedCharMap. I'm sorry it's difficult and I want to use
the same scenario with windows.

> I'd like to ask if we could we move the CCMap allocation/deallocation code 
> into nsCompressedCharMap. This is not a strict request so if it is  very
> difficult you can skip this (sorry I missed this last time).

Brian, could you r= again please?

Comment 31

17 years ago
+// New version - support 32 bit aPageStart

Nit: Would you drop this line since this will only be "new" for a short time?

-  // walk thru the upper pointers
-  PRUint16 *upper = &aCCMap[0];
-  for (i=upper_index; i<CCMAP_NUM_UPPER_POINTERS; i++, mid_index=0) {
-    if (upper[i] == CCMAP_EMPTY_MID) {
-      continue;
+  for(l=planestart; l<=planeend; l++, *aPageStart = ...

Nit: Could you replace the comment with something appropiate?

> Now nsUnicharUtils.h has the definitions for
>
> > +#define IS_HIGH_SURROGATE(u)  ...
> > +#define IS_LOW_SURROGATE(u)  ...
> > +#define SURROGATE_TO_UCS4(h, l)  ...
>
> I'm using the header file. It's the same with windows version.

Any particular reason not to move IS_SURROGATE there as well?

+    // if SURROGATE char, ignore charSetInfo->mCCMap checking
+    // because the exact ccmap is never created before loading
+    // NEED TO FIX: need better way

Is there a bug open for this?

Is correct version of CCMAP_HAS_CHAR vs CCMAP_HAS_CHAR_EXT used in all the
appropiate places?

As a saftey check: Have you tested to see what will happen if CCMAP_HAS_CHAR is
called with a 32 bit value? Or could you add an assert to warn if this is
happening?

+    PRUint32 i, ii;
     for (i = 0; i < aLength; i++) {
-      PRUnichar c = aString[i];
+      PRUint32 c = aString[i];
+      ii = i;
+      if(i < aLength-1 && IS_HIGH_SURROGATE(c) && IS_LOW_SURROGATE(...) {
+        // if surrogate, make UCS4 code point from high aString[i] and
+        // low surrogate aString[i+1]
+        c = SURROGATE_TO_UCS4(c, aString[i+1]);
+
+        // aString[i+1] is already used as low surrogate
+        i++;

To make the logic here more obvious perhaps you could use a variable
PRUint32 extraSurrogateLength.

   Initialize extraSurrogateLength to 0.
   Change the for loop to:
       for (i = 0; i < aLength; i+=1+extraSurrogateLength).
   Set extraSurrogateLength to 0 where you initialize ii.
   Set extraSurrogateLength to 1 when you detect a surrogate.

+          else if (face->charmaps[i]->encoding_id == TT_MS_ID_UCS_4) {

Nit: Is the indentation here correct?

Nit: would you change

  if (!glyph_index)

to

  if (glyph_index == 0)

+      // alloc extMap[plane_num] if needed
+      plane_num = CCMAP_PLANE(i);
+      NS_ASSERTION(plane_num <= EXTENDED_UNICODE_PLANES,"invalid plane");
+      if (plane_num <= EXTENDED_UNICODE_PLANES) {
+        if (extMap[plane_num] == 0) {
+          extMap[plane_num] = new PRUint32[UCS2_MAP_LEN];
+        }
+        // set map
+        SET_REPRESENTABLE(extMap[plane_num], i & 0xffff);
+

+    if (fce->mFlags & FCE_FLAGS_SURROGATE) {
+      // make ext ccmap from map
+      fce->mCCMap = MapToCCMapExt(map, extMap+1, EXTENDED_UNICODE_PLANES);
+    } else if (fce->mFlags & FCE_FLAGS_UNICODE) {
+      fce->mCCMap = MapToCCMap(map);
+    }
+    // free map
+    for (i = 1; i <= EXTENDED_UNICODE_PLANES; ++i) {
+      if (extMap[i]) {
+        delete [] extMap[i];
+      }
+    }

+  // create map then create ccmap
+  // codes from Windows
+  PRUint32* extMap[EXTENDED_UNICODE_PLANES+1];
+  memset(extMap+1, 0, sizeof(PRUint32*)*EXTENDED_UNICODE_PLANES);
+  PRUint32 map[UCS2_MAP_LEN];
+  memset(map, 0, sizeof(map));
+  extMap[0] = map;

-  fce->mCCMap = ccmapObj.NewCCMap();
+  if (fce->mFlags & FCE_FLAGS_SURROGATE) {
+    // make ext ccmap from map
+    fce->mCCMap = MapToCCMapExt(map, extMap+1, EXTENDED_UNICODE_PLANES);
+  } else if (fce->mFlags & FCE_FLAGS_UNICODE) {
+    fce->mCCMap = MapToCCMap(map);
+  }
+  // free map
+  for (i = 1; i <= EXTENDED_UNICODE_PLANES; ++i) {
+    if (extMap[i]) {
+      delete [] extMap[i];
+    }
+  }

-      if (byte & (1<<j))
-        aCCMapObj->SetChar(pagechar);
+      if (byte & (1<<j)) {
+        // alloc aExtMap[plane_num] if needed
+        PRUint32 plane_num = CCMAP_PLANE(pagechar);
+        NS_ASSERTION(plane_num <= EXTENDED_UNICODE_PLANES,"invalid plane");
+        if (plane_num > EXTENDED_UNICODE_PLANES) {
+          continue;
+        }
+        if (aExtMap[plane_num] == 0) {
+          aExtMap[plane_num] = new PRUint32[UCS2_MAP_LEN];
+        }
+        // set map
+        SET_REPRESENTABLE(aExtMap[plane_num], pagechar & 0xffff);
+      }

These manipulate the ccmap and should probably not be in the font code.
Can they be moved to the ccmap code?

-      num = sscanf(value, "%lu", &uLongVal);
+      num = sscanf(value, "%lx", &uLongVal);

Nit: What is the data type here?

+static FtFuncList FtFuncsExt [] = {
+  {"FT_Get_First_Char",     (PRFuncPtr*)&nsFreeTypeFont::nsFT_Get_First_Char},
+  {"FT_Get_Next_Char",     (PRFuncPtr*)&nsFreeTypeFont::nsFT_Get_Next_Char},
+  {nsnull,                    (PRFuncPtr*)nsnull},

Nit: indentation here

+  // new functions for 2.0.9
+  for (p=FtFuncsExt; p->FuncName; p++) {
+    *p->FuncPtr = PR_FindFunctionSymbol(sSharedLib, p->FuncName);
+  }

Do we need to check for errors here?
What happens if the lib is missing these functions?

+        // break not to overwrite by UNICODE_CS if UCS_4 found

Nit: could you perhaps change this to something like

+        // UCS_4 is the most prefered cmap since it supports surrogates
+        // so stop here to avoid the possibly of getting UNICODE_CS which
+        // is the 2nd prefered choice.
(Assignee)

Comment 32

17 years ago
Created attachment 92874 [details] [diff] [review]
new patch
(Assignee)

Updated

17 years ago
Attachment #87487 - Attachment is obsolete: true
(Assignee)

Comment 33

17 years ago
Brian,

Thanks for useful comments,

>Any particular reason not to move IS_SURROGATE there as well?

No, should I need to define?

>+    // if SURROGATE char, ignore charSetInfo->mCCMap checking
>+    // because the exact ccmap is never created before loading
>+    // NEED TO FIX: need better way
>
>Is there a bug open for this?

Sure, I'll file the bug.

>Is correct version of CCMAP_HAS_CHAR vs CCMAP_HAS_CHAR_EXT used in all the
>appropiate places?

I have changed all CCMAP_HAS_CHAR to CCMAP_HAS_CHAR_EXT because
CCMAP_HAS_CHAR_EXT can be used for 16 bit. Any objection?

>These manipulate the ccmap and should probably not be in the font code.
>Can they be moved to the ccmap code?

Yes. I've made changes in nsCompressedCharMap class to handle
such extend area for surrogate. The same methods now
can be used. Only the difference is to call ccmapObj.Expand()
to enable surrogate suppport, like

   nsCompressedCharMap ccmapObj;
   if (aFce->mFlags & FCE_FLAGS_SURROGATE) {
     ccmapObj.Extend();
   }
   ccmapObj.SetChars(aFce->mCCMap);
   return ccmapObj.NewCCMap();

We can now use the same method such as SetChars(),
SetChar() and NewCCMap(). The codes can be smaller
than before patch.


>-      num = sscanf(value, "%lu", &uLongVal);
>+      num = sscanf(value, "%lx", &uLongVal);
>
>Nit: What is the data type here?

By addition of

	FCE_FLAGS_SURROGATE    0x08

Now the flag can have the value such as Flags=0000000b.
So I need to change from lu -> lx.

>+  // new functions for 2.0.9
>+  for (p=FtFuncsExt; p->FuncName; p++) {
>+    *p->FuncPtr = PR_FindFunctionSymbol(sSharedLib, p->FuncName);
>+  }
>
>Do we need to check for errors here?
>What happens if the lib is missing these functions?

I understand these are optional functions.
If the functions are not defined, FuncPrt is set to NULL.
I checked whether the functions are defined or not
like below in nsFT2FontCatalog.cpp,

+    if (nsFreeTypeFont::nsFT_Get_First_Char &&
+                 nsFreeTypeFont::nsFT_Get_Next_Char) {
+      has_fast_methods = PR_TRUE;
+    }
(Assignee)

Comment 34

17 years ago
Created attachment 93115 [details] [diff] [review]
new patch

NS_ASSERTION(aChar > 0xffff) should be NS_ASSERTION(aChar <= 0xffff)
Attachment #92874 - Attachment is obsolete: true

Comment 35

17 years ago
Comment on attachment 93115 [details] [diff] [review]
new patch

> >Is there a bug open for this?
> 
> Sure, I'll file the bug.

I could not find the bug. Could you let me know what the bug # is?


> I have changed all CCMAP_HAS_CHAR to CCMAP_HAS_CHAR_EXT because
> CCMAP_HAS_CHAR_EXT can be used for 16 bit. Any objection?

This seems fine. Nit: Should we just call it CCMAP_HAS_CHAR?

Great work!

r=bstell@ix.netcom.com
Attachment #93115 - Flags: review+

Comment 36

17 years ago
Comment on attachment 93115 [details] [diff] [review]
new patch

katakai:
Can you port the gfx/src/gtk/-changes to gfx/src/xlib/ or should I do the work
?
Attachment #93115 - Flags: needs-work+
(Assignee)

Comment 37

17 years ago
I'll work on it, let me try...
(Reporter)

Comment 38

16 years ago
katakai- let's focus and land the gtk one first. 
one by one, step by step
I really want to see the gtk part finished and land into trunk by 8/28. katakai,
can you make it ?
Blocks: 157673
Keywords: nsbeta1+

Comment 39

16 years ago
If there is deadline then having the gtk version get in under the deadline and 
not the xlib version will really penalize the xlib version.

Comment 40

16 years ago
Roland: this means you need to hustle.

Comment 41

16 years ago
Roland?

Comment 42

16 years ago
batch: adding topembed per Gecko2 document
http://rocknroll.mcom.com/users/marek/publish/Gecko/Gecko2Tasks.html
Keywords: topembed

Comment 43

16 years ago
Katakai: lets get this sr='d and checked in

Updated

16 years ago
Keywords: topembed → topembed-
Reassigning to Roland. Please set a target milestone.
Assignee: katakai → Roland.Mainz
Status: ASSIGNED → NEW
A lot of work got done in bug 182877 before I saw it, so instead of marking it a
dupe of this one, for the time being I am mutually cc-ing the respective owners
of the two bugs so you can work together and decide which one to close.

Comment 46

16 years ago
The code path here looks to me to be independent of the code path for Xft.

Thus this bug is not a dup of a Xft bug.

Thanks for the clarification, Brian. So Roland should work here on FreeType and
ignore Xft, and Jungshik should work on Xft in bug 182877 and ignore Freetype?

Comment 48

16 years ago
yes
(Reporter)

Comment 49

16 years ago
what is the progress now?
(Assignee)

Comment 50

16 years ago
gfx codes have been changed in many place while requesting sr=.
I'll restart the work for gfx part and will post new patch soon.
Assignee: Roland.Mainz → katakai
(Assignee)

Comment 51

16 years ago
Created attachment 111684 [details] [diff] [review]
updated patch for current tree
Attachment #93115 - Attachment is obsolete: true
(Assignee)

Comment 52

16 years ago
Most codes of the previous could be re-used for the current tree.

But one thing, I had to modify the codes for FT_Get_First_Char() and
FT_Get_Next_Char(). FreeType functions are now being accessed via
nsIFreeType2.idl. I added the following to idl,

+    FT_ULong getFirstChar(in FT_Face face, out FT_UInt gindex);
+    FT_ULong getNextChar(in FT_Face face, in FT_ULong charcode, out FT_UInt
gindex);
+    void supportsExtFunc(out PRBool res);

SupportsExtFunc() will check the FreeType2 library in the system
can support FT_Get_First_Char() and FT_Get_Next_Char() functions
for fast looking up of glyph index.

Brian, sorry for dup work, but could you review please?

I have a question. There are some dup files for
        nsCompressedCharMap
        nsFT2FontCatalog
        nsFreeType
undef gfx/. I modified only the files for gtk build but
should I update both?

./public/nsCompressedCharMap.h
./src/nsCompressedCharMap.cpp
./src/shared/nsCompressedCharMap.cpp
./src/shared/nsCompressedCharMap.h

./src/freetype/nsFT2FontCatalog.cpp
./src/freetype/nsFT2FontCatalog.h
./src/x11shared/nsFT2FontCatalog.cpp
./src/x11shared/nsFT2FontCatalog.h

./src/freetype/nsFreeType.cpp
./src/freetype/nsFreeType.h
./src/x11shared/nsFreeType.cpp
./src/x11shared/nsFreeType.h
Status: NEW → ASSIGNED
(Assignee)

Updated

16 years ago
Attachment #111684 - Flags: review?(bstell)

Comment 53

16 years ago
Overall this looks great. There are a few items to discuss.

1) I am little concerned about overwriting the input this way

+  // checking each plane
+  for(l=planestart; l<=planeend; l++, *aPageStart = CCMAP_BEGIN_AT_START_OF_MAP) {

2) need to check if malloc failed

+        mExtMap[plane_num] = (PRUint32*)PR_Malloc(sizeof(PRUint32)*UCS2_MAP_LEN);
+        memset(mExtMap[plane_num], 0, sizeof(PRUint32)*UCS2_MAP_LEN);

3) I'm curious if it would be easier to call "SetChars((PRUint16)base, page);"
instead of scanning the bits.

+  if(mExtended){
+    PRUint32 page = CCMAP_BEGIN_AT_START_OF_MAP;
+    while (NextNonEmptyCCMapPage(aCCMap, &page)) {
+      PRUint32 pagechar = page;
+      for (i=0; i<(CCMAP_BITS_PER_PAGE/8); i++) {
+        for (j=0; j<8; j++) {
+          if (CCMAP_HAS_CHAR_EXT(aCCMap, pagechar)) {
+            SetChar(pagechar);
+          }
+          pagechar++;
+        }
+      }
+    }

4) please convert the tabs to spaces

+  {"FT_Done_Face",            NS_FT2_OFFSET(nsFT_Done_Face),     PR_TRUE}, 
+  {"FT_Done_FreeType",        NS_FT2_OFFSET(nsFT_Done_FreeType),   PR_TRUE},


5) Nit: a slightly more positive way to say this:

+#define IS_SURROGATE(u)      (u > 0xFFFF)

would be:

+#define IS_SURROGATE(u)      (u >= 0x10000)

6) Is there a bug open for this?

+    // if SURROGATE char, ignore charSetInfo->mCCMap checking
+    // because the exact ccmap is never created before loading
+    // NEED TO FIX: need better way
+    if (IS_SURROGATE(aChar) ) {
+      goto check_done;

7) It might be a bit more clear if:

+    PRBool need_scan = PR_TRUE;

was changed to scan_done (and the logic was changed as appropiate)

8) How about using extraSurrogateLength here as in the earlier part of the patch?

+      // aString[i+1] is already used as low surrogate, skip i+1 
+      i++;

Comment 54

16 years ago
I do not know how these dups happened.

./public/nsCompressedCharMap.h
./src/nsCompressedCharMap.cpp
./src/shared/nsCompressedCharMap.cpp
./src/shared/nsCompressedCharMap.h

these should be cleaned up when bug 180668 is checked in

./src/freetype/nsFT2FontCatalog.cpp
./src/freetype/nsFT2FontCatalog.h
./src/x11shared/nsFT2FontCatalog.cpp
./src/x11shared/nsFT2FontCatalog.h

./src/freetype/nsFreeType.cpp
./src/freetype/nsFreeType.h
./src/x11shared/nsFreeType.cpp
./src/x11shared/nsFreeType.h

(Assignee)

Comment 55

16 years ago
Created attachment 112041 [details] [diff] [review]
updated patch
Attachment #111684 - Attachment is obsolete: true
(Assignee)

Comment 56

16 years ago
Thank you very much for review,

> 1) I am little concerned about overwriting the input this way
> 
> +  // checking each plane
> +  for(l=planestart; l<=planeend; l++, *aPageStart =
CCMAP_BEGIN_AT_START_OF_MAP) {

OK, I provided pagestart for aPageStart,

	PRUint32 pagestart = *aPageStart;

so that aPageStart will be changed only when return TRUE.

	*aPageStart = (((PRUint32)l)<<16)+base;
	return PR_TRUE;
 
> 2) need to check if malloc failed
> 
> +        mExtMap[plane_num] = (PRUint32*)PR_Malloc(sizeof(PRUint32)*UCS2_MAP_LEN);
> +        memset(mExtMap[plane_num], 0, sizeof(PRUint32)*UCS2_MAP_LEN);

fixed.

> 3) I'm curious if it would be easier to call "SetChars((PRUint16)base, page);"
> instead of scanning the bits.

no changes in the patch...
 
> +  if(mExtended){
> +    PRUint32 page = CCMAP_BEGIN_AT_START_OF_MAP;
> +    while (NextNonEmptyCCMapPage(aCCMap, &page)) {
> +      PRUint32 pagechar = page;
> +      for (i=0; i<(CCMAP_BITS_PER_PAGE/8); i++) {
> +        for (j=0; j<8; j++) {
> +          if (CCMAP_HAS_CHAR_EXT(aCCMap, pagechar)) {
> +            SetChar(pagechar);
> +          }
> +          pagechar++;
> +        }
> +      }
> +    }
> 
> 4) please convert the tabs to spaces
> 
> +  {"FT_Done_Face",            NS_FT2_OFFSET(nsFT_Done_Face),     PR_TRUE}, 
> +  {"FT_Done_FreeType",        NS_FT2_OFFSET(nsFT_Done_FreeType),   PR_TRUE},

fixed.
 
> 5) Nit: a slightly more positive way to say this:
> 
> +#define IS_SURROGATE(u)      (u > 0xFFFF)
> 
> would be:
> 
> +#define IS_SURROGATE(u)      (u >= 0x10000)

fixed.
 
> 6) Is there a bug open for this?
> 
> +    // if SURROGATE char, ignore charSetInfo->mCCMap checking
> +    // because the exact ccmap is never created before loading
> +    // NEED TO FIX: need better way
> +    if (IS_SURROGATE(aChar) ) {
> +      goto check_done;

filed as bug 189425.

> 7) It might be a bit more clear if:
> 
> +    PRBool need_scan = PR_TRUE;
> 
> was changed to scan_done (and the logic was changed as appropiate)

I removed "break;" and changed to the below in while() 

    while (!scan_done && i < len) {

please let me know when you find better logic and codes.

> 8) How about using extraSurrogateLength here as in the earlier part of the patch?
> 
> +      // aString[i+1] is already used as low surrogate, skip i+1 
> +      i++;

Agreed. fixed.
(Assignee)

Updated

16 years ago
Attachment #112041 - Flags: review?(bstell)

Updated

16 years ago
Attachment #112041 - Flags: review?(bstell) → review+
(Assignee)

Updated

16 years ago
Attachment #112041 - Flags: superreview?(bryner)
Attachment #112041 - Flags: superreview?(bryner) → superreview+
(Assignee)

Updated

16 years ago
Flags: blocking1.3b?
(Assignee)

Updated

16 years ago
Attachment #112041 - Flags: approval1.3b?
(Assignee)

Updated

16 years ago
Flags: blocking1.3b?

Comment 57

16 years ago
Comment on attachment 112041 [details] [diff] [review]
updated patch

Please hold this until we open for 1.4alpha development. Thanks.
Attachment #112041 - Flags: approval1.3b? → approval1.3b-
(Assignee)

Updated

16 years ago
Target Milestone: --- → mozilla1.4alpha
Comment on attachment 111684 [details] [diff] [review]
updated patch for current tree

clearing review request on obsolete patch
Attachment #111684 - Flags: review?(bstell)
(Assignee)

Comment 59

16 years ago
patch checked into 1.4a.
Status: ASSIGNED → RESOLVED
Last Resolved: 16 years ago
Resolution: --- → FIXED

Comment 60

16 years ago
so I hate to say it, but this bug increased footprint by about 3.4k. Is there
any optimization that can be done to make up for (or fix!) the code size
increase? It looks like a lot of functions changed type signature and grew by a
few bytes in the process..*changing that one paramter from char to PRUnichar I
think)

there isn't any glaringly obvious growth here, just lots of functions growing
from 4 to 50 bytes each, and some new functions.. 

  libgfx_gtk.so
  	Total:	      +3420 (+19088/-15668)
  	Code:	      +3292 (+15476/-12184)
  	Data:	       +128 (+3612/-3484)
  	      +3292 (+15476/-12184)	T (CODE)
  		      +3292 (+15476/-12184)	UNDEF:libgfx_gtk.so:T
  			      +2356	nsFontMetricsGTK::FindStyleSheetGenericFont(unsigned int)
  			      +2144	nsFontMetricsGTK::FindLangGroupPrefFont(nsIAtom *, unsigned int)
  			      +2072	nsFontMetricsGTK::PickASizeAndLoad(nsFontStretch *,
nsFontCharSetInfo *, unsigned int, char const *)
  			      +1052	nsFontMetricsGTK::TryNode(nsCString *, unsigned int)
  			      +1020	nsFontMetricsGTK::TryNodes(nsACString &, unsigned int)
  			       +804	nsFontMetricsGTK::TryFamily(nsCString *, unsigned int)
  			       +596	nsFontMetricsGTK::SearchNode(nsFontNode *, unsigned int)
  			       +580	nsFontMetricsGTK::FindLangGroupFont(nsIAtom *, unsigned int,
nsCString *)
  			       +464	nsFontMetricsGTK::FindAnyFont(unsigned int)
  			       +456	nsFontMetricsGTK::FindFont(unsigned int)
  			       +448	nsFontMetricsGTK::FindStyleSheetSpecificFont(unsigned int)
  			       +376	nsFT2FontCatalog::NewFceFromFontFile(FT_LibraryRec_ *, char
const *, int, int *)
  			       +300	SetFontCharsetInfo(nsFontGTK *, nsFontCharSetInfo *, unsigned int)
  			       +220	nsFontMetricsGTK::DrawString(unsigned short const *, unsigned
int, int, int, int, int const *, nsRenderingContextGTK *, nsDrawingSurfaceGTK *)
  			       +216	nsFontMetricsGTK::GetBoundingMetrics(unsigned short const *,
unsigned int, nsBoundingMetrics &, int *, nsRenderingContextGTK *)
  			       +212	nsFontGTK::SupportsChar(unsigned int)
  			       +212	nsFontMetricsGTK::GetWidth(unsigned short const *, unsigned
int, int &, int *, nsRenderingContextGTK *)
  			       +204	nsFontMetricsGTK::GetTextDimensions(unsigned short const *,
unsigned int, nsTextDimensions &, int *, nsRenderingContextGTK *)
  			       +172	nsFontMetricsGTK::FindSubstituteFont(unsigned int)
  			       +168	nsFontMetricsGTK::TryAliases(nsCString *, unsigned int)
  			       +156	nsFontMetricsGTK::ResolveForwards(unsigned short const *,
unsigned int, int (*)(nsFontSwitchGTK const *, unsigned short const *, unsigned
int, void *), void *)
  			       +156	nsFontMetricsGTK::TryLangGroup(nsIAtom *, nsCString *, unsigned
int)
  			       +144	nsFT2FontCatalog::PrintPageBits(nsNameValuePairDB *, unsigned
short *, unsigned int)
  			       +128	InitGlobals(nsIDeviceContext *)
  			       +124	nsFontMetricsGTK::FindUserDefinedFont(unsigned int)
  			       +116	nsFontMetricsGTK::LocateFont(unsigned int, int &)
  			        +96	nsFreeTypeFont::doGetBoundingMetrics(unsigned short const *,
unsigned int, int *, int *, int *, int *, int *)
  			        +92	nsFreeTypeFont::GetWidth(unsigned short const *, unsigned int)
  			        +92	nsFreeTypeXImage::DrawString(nsRenderingContextGTK *,
nsDrawingSurfaceGTK *, int, int, unsigned short const *, unsigned int)
  			        +68	nsFT2FontCatalog::NewFceFromSummary(nsNameValuePairDB *)
  			        +44	nsFreeType2::GetCCMap(nsFontCatalogEntry *)
  			        +40	nsFreeType2::GetNextChar(FT_FaceRec_ *, unsigned long, unsigned
int *, unsigned long *)
  			        +40	nsFreeType2::SupportsExtFunc(int *)
  			        +36	nsFreeType2::GetFirstChar(FT_FaceRec_ *, unsigned int *,
unsigned long *)
  			        +36	nsFreeType2::LoadSharedLib(void)
  			        +20	SetUpFontCharSetInfo(nsFontCharSetInfo *)
  			        +16	GetMapFor10646Font(XFontStruct *)
  			         -4	nsFT2FontCatalog::PrintCCMap(nsNameValuePairDB *, unsigned short *)
  			         -4	PrefEnumCallback(char const *, void *)
  			       -104	nsFontGTK::SupportsChar(unsigned short)
  			       -132	nsFontMetricsGTK::FindUserDefinedFont(unsigned short)
  			       -156	nsFontMetricsGTK::TryLangGroup(nsIAtom *, nsCString *, unsigned
short)
  			       -172	nsFontMetricsGTK::TryAliases(nsCString *, unsigned short)
  			       -172	nsFontMetricsGTK::FindSubstituteFont(unsigned short)
  			       -188	SetFontCharsetInfo(nsFontGTK *, nsFontCharSetInfo *, unsigned
short)
  			       -364	nsFontMetricsGTK::FindAnyFont(unsigned short)
  			       -460	nsFontMetricsGTK::FindStyleSheetSpecificFont(unsigned short)
  			       -472	nsFontMetricsGTK::FindFont(unsigned short)
  			       -600	nsFontMetricsGTK::SearchNode(nsFontNode *, unsigned short)
  			       -600	nsFontMetricsGTK::FindLangGroupFont(nsIAtom *, unsigned short,
nsCString *)
  			       -740	nsFontMetricsGTK::TryFamily(nsCString *, unsigned short)
  			       -908	nsFontMetricsGTK::TryNodes(nsACString &, unsigned short)
  			       -960	nsFontMetricsGTK::TryNode(nsCString *, unsigned short)
  			      -1900	nsFontMetricsGTK::FindStyleSheetGenericFont(unsigned short)
  			      -2064	nsFontMetricsGTK::PickASizeAndLoad(nsFontStretch *,
nsFontCharSetInfo *, unsigned short, char const *)
  			      -2184	nsFontMetricsGTK::FindLangGroupPrefFont(nsIAtom *, unsigned short)
  	        +96 (+112/-16)	D (DATA)
  		        +96 (+112/-16)	UNDEF:libgfx_gtk.so:D
  			        +96	nsFreeType2::FtFuncs
  			         +8	getenv_done.2846
  			         +4	already_complained.2826
  			         +4	nsFreeType2::gHasExtFunc
  			         -4	gFreeTypeFaces
  			         -4	already_complained.2823
  			         -8	getenv_done.2843
  	        +32 (+3500/-3468)	R (DATA)
  		        +32 (+3500/-3468)	UNDEF:libgfx_gtk.so:R
  			      +2520	__PRETTY_FUNCTION__.2758
  			       +128	__PRETTY_FUNCTION__.2254
  			       +128	__PRETTY_FUNCTION__.2272
  			       +128	__PRETTY_FUNCTION__.2329
  			       +128	__PRETTY_FUNCTION__.2332
  			       +116	__PRETTY_FUNCTION__.2356
  			        +96	__PRETTY_FUNCTION__.2263
  			        +96	__PRETTY_FUNCTION__.2287
  			        +96	__PRETTY_FUNCTION__.2314
  			        +32	__PRETTY_FUNCTION__.10
  			        +32	__PRETTY_FUNCTION__.2260
  			        -32	__PRETTY_FUNCTION__.2266
  			        -96	__PRETTY_FUNCTION__.2284
  			        -96	__PRETTY_FUNCTION__.2281
  			        -96	__PRETTY_FUNCTION__.2251
  			       -116	__PRETTY_FUNCTION__.2350
  			       -128	__PRETTY_FUNCTION__.2326
  			       -128	__PRETTY_FUNCTION__.2323
  			       -256	__PRETTY_FUNCTION__.2752
  			       -256	__PRETTY_FUNCTION__.2248
  			      -2264	__PRETTY_FUNCTION__.2755
(Assignee)

Comment 61

16 years ago
Hi Alec,

Yes, I needed to change PRUnichar to PRUint32 to handle surrogate
in each function. The type change of PRUnichar to PRUint32 also
has been needed in other platform e.g. gfx/src/windows when adding
surrogate support.

http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&root=/cvsroot&subdir=mozilla/gfx/src/windows&command=DIFF_FRAMESET&file=nsFontMetricsWin.h&rev2=3.41&rev1=3.40

but yes, I need to check the codes again.

Comment 62

16 years ago
On 02-25 trunk build / linux RH8.0, the GB18030 plane 2 surrogate characters are
still displayed as "?" using truetype font SURSONG.TTF.  While GB18030 extension
A characters are displayed fine.
None of the testcases linked from bug 118000 work for me in current Linux trunk,
and some hang. I have

user_pref("font.FreeType2.enable", true);
user_pref("font.directory.truetype.1", "~/fonts");

in my prefs.js, and FreeType works for me other than this.
(Assignee)

Comment 64

16 years ago
I don't have SURSONG.TTF but was using simsun18030.ttc
and code2001.ttf, it works for me on Trunk.

When you already have

  fonts/.mozilla_font_summary.ndb

file in your ttf font directory, please remove the file,
then start Mozilla again.



Thanks for the tip, Katakai. After deleting ~/fonts/.mozilla_font_summary.ndb
all the testcases now work for me.
Status: RESOLVED → VERIFIED
(Assignee)

Comment 66

16 years ago
Thank you Simon for verification.

Yuying, are you still seeing problem?
Actually I've never tried SURSONG.TTF font.
I always used simsun18030 and code2001.
I understand SURSONG.TTF is not free, right?
I'll try to get the font.
Btw, do you have any good test page for plane 2?
If you have please send to me.
If the same page can be displayed on Windows,
it seems bug for me.
Please open new bug report and assign to me.

Comment 67

16 years ago
Yes, I still have problem with display plane 2 extension B characters; I think
simsun.ttc and code2001.ttf are plane 1 not CJK extension B.

Bug 195154 filed for remaining problem.

Comment 68

16 years ago
> Btw, do you have any good test page for plane 2?

  My test pages for various UTF's at http://jshin.net/i18n/utftest
have a _few_ random plane 2 characters thrown in. I can make a
plane 2 character table, but you can do that as easily as I can....
You need to log in before you can comment on or make changes to this bug.