Closed Bug 78201 Opened 23 years ago Closed 22 years ago

[BiDi]:Arabic 2 byte fonts don't seem to render / Add support for iso8859-6.8x fonts

Categories

(Core :: Layout: Text and Fonts, defect)

Sun
Solaris
defect
Not set
major

Tracking

RESOLVED FIXED

People

(Reporter: prabhat.hegde, Assigned: smontagu)

Details

(Keywords: relnote)

Attachments

(11 files, 6 obsolete files)

24.83 KB, application/octet-stream
144.45 KB, application/octet-stream
73.18 KB, text/plain
9.57 KB, application/octet-stream
71.75 KB, image/jpeg
71.96 KB, image/jpeg
92.79 KB, image/jpeg
76.59 KB, image/jpeg
59.53 KB, image/gif
275.94 KB, image/jpeg
19.29 KB, patch (smontagu: review+)
On my bidi builds on both Solaris and Linux [Trunk nightly (04/29) + 
sources from erik@netscape], I am not able to view any Arabic output. 
All I get is the glyph for an invalid char (?). The font I am using is a 2 byte 
font based on iso-8859-6. [Please email me for the font as I am not sure 
if I can attach it here]. The problem may be with 1 byte input-2 byte 
output (just guessing at this point).

Test URLs: www.ayna.com, www.al-jazirah.com
Reassign to mkaply@us.ibm.com.
Assignee: nhotta → mkaply
What is the encoding of the two-byte font?
Changing QA contact to mahar@eg.ibm.com
QA Contact: andreasb → mahar
move to Bidi Hebrew/Arabic component
prabhat- Please attach your font again. Please specify the MIME type correctly
when you attach. Is it a zip file?

Is it true that the font is encoded in Unicode? What is the XLFD?
Component: Internationalization → BiDi Hebrew & Arabic
reassign to katakai@japan.sun.com
katakai- this is a linux font issue. I think you know the font code well enough
to make this happen. Please work with bstell for details if you need help.
The first questions we want answered: is this font a TrueType font? An ISO-10646 font?
What is the XLFD?
If it is an ISO-10646 font, then we will use the ISO-10646 font path in the gtk.
Assignee: mkaply → katakai
Mass-move all BiDi Hebrew and Arabic qa to me, zach@zachlipton.com. 
Thank you Gilad for your service to this component, and best of luck to you 
in the future.

Sholom. 
QA Contact: mahar → zach
QA to mahar.
QA Contact: zach → mahar
Blocks: 115714
What is the status of this bug ? Will this be fixed for 1.0 ?
Comment on attachment 32644 [details]
The two byte arabic fonts that i tested with.

Uhm... this tar.Z is missing the matching fonts.dir file... ;-(
*** Bug 136101 has been marked as a duplicate of this bug. ***
It looks as if we have to add some special voodoo to use the font encoding from
attachment 36498 [details].
hi simon,

I believe this is because the font calls itself "iso8859-6" while actually
being unicode encoded "iso10646". 

prabhat.
prabhat:
Do you know which encoding the following fonts (distributed with Solaris) have:
1. /usr/openwin/lib/locale/ar/X11/fonts/TrueType
2. /usr/openwin/lib/locale/ar/X11/fonts/Type1
?
hi roland,

as i mentioned it is iso10646. As mentioned earlier it calls 
itself "iso8859-6". I am attaching arabic_font_info.tar.Z which
has fonts.dir and also ttmap file for your info.

prabhat
> Do you know which encoding the following fonts (distributed with Solaris) 
> have:
> 1. /usr/openwin/lib/locale/ar/X11/fonts/TrueType
> 2. /usr/openwin/lib/locale/ar/X11/fonts/Type1

Are these fonts or are these directories?
Brian Stell wrote:
> > 1. /usr/openwin/lib/locale/ar/X11/fonts/TrueType
> > 2. /usr/openwin/lib/locale/ar/X11/fonts/Type1
>
> Are these fonts or are these directories?

These are dirs which contain fonts for the X11 system (Solaris puts the
X11-related code into /usr/openwin - but that's another story).
Solaris separates its fonts per locale and then per font type ("ar" == arabic
locale, "TrueType"/"Type1" are the font types).
prabhat:
I still have no luck, the '?' glyphs do not disappear...

.. and hacking the fonts.dir and renaming the fonts to
-- snip --
NASKHMT.ttf -monotype-naskh-medium-r-normal--0-0-0-0-p-0-iso10646-1
NASKHBD.ttf -monotype-naskh-bold-r-normal--0-0-0-0-p-0-iso10646-1
-- snip --
causes Xsun to ask for a matching encoding file ("Cannot find encoding file for
iso10646-1" in /var/dt/Xerrors).

Looks like I have to look a little bit harder to find a solution...
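
For reference, the charset that X associates with each fonts.dir entry is carried by the last two hyphen-separated fields of the XLFD (registry and encoding), which is what the renaming above changes from iso8859-6 to iso10646-1. A minimal C++ sketch of pulling that suffix out of an XLFD string follows. The helper name `XlfdCharset` is made up for illustration and is not part of Mozilla or X11.

```cpp
#include <string>

// Hypothetical helper: returns the "registry-encoding" suffix of an XLFD
// name, i.e. its last two hyphen-separated fields. Returns an empty string
// if the name does not contain at least two usable hyphens.
std::string XlfdCharset(const std::string& xlfd) {
  const std::string::size_type last = xlfd.rfind('-');
  if (last == std::string::npos || last == 0) {
    return std::string();
  }
  const std::string::size_type prev = xlfd.rfind('-', last - 1);
  if (prev == std::string::npos) {
    return std::string();
  }
  return xlfd.substr(prev + 1);
}
```

Applied to the first renamed entry above this yields "iso10646-1". Note that for an encoding such as iso8859-6.8x the registry is "iso8859" and the encoding field is "6.8x", so the same split still applies.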
hi roland,

I don't think hacking font-encoding will help. As you mentioned, you also
need to add ttmap for the encoding, add entry in ttmaps.dir and so on.

One idea is to change the converter currently used in nsFontMetrics for this
font for Solaris only. ie iso8859-6 'SingleByteconverter' to be enhanced to
handle 8859-6 + unicode encoded presentation forms A & B.

prabhat. 
Prabhat Hegde wrote:
> I don't think hacking font-encoding will help. As you mentioned, you also
> need to add ttmap for the encoding, add entry in ttmaps.dir and so on.
>
> One idea is to change the converter currently used in nsFontMetrics for this
> font for Solaris only. ie iso8859-6 'SingleByteconverter' to be enhanced to
> handle 8859-6 + unicode encoded presentation forms A & B.

I would prefer a solution which is not Solaris-specific since many people use
SPARCs and use Xterminals or copy these fonts around...
Adding special encoder support and always treating the ISO-8859-6 fonts as
doublebyte fonts may be a solution.
Looks like we simply need someone to hack the ISO-8859-6 converter code a little
bit... :)
prabhat:
Do you know whether the PostScript Type1 fonts in
/usr/openwin/lib/locale/ar/X11/fonts/Type1/ contain the presentation forms a&b,
too ? If yes - how can I use them from the X11 API ?
> One idea is to change the converter currently used in nsFontMetrics for this
> font for Solaris only. ie iso8859-6 'SingleByteconverter' to be enhanced to
> handle 8859-6 + unicode encoded presentation forms A & B.

How would this work?

Would it be applied to all iso8859-6 fonts?

Would there be a different encoding; eg: "iso8859-6x" ?
Prabhat: please please please please do not attach files as zip or 
application/octet-stream. Attach a separate attachment for each thing you want
to attach, one for each file. Please. I won't have the right tool to look at your
attachment from my Mac. 
Hmm, I still could not find out how to browse Arabic with iso-8859-6 fonts.
Has anyone succeeded?

- Sun's ar fonts - failed as in this bug report
- Tried langbox arabic fonts from     
  http://www.langbox.com/AraZilla/linux/arafontfull-1.2-4.i386.rpm 

Does anyone know where to find correct iso-8859-6 fonts?

There are two ways I have succeeded in displaying Arabic characters
without iso-8859-6 fonts:

1. Use FreeType, with the fontpath pointing to /usr/openwin/lib/locale/ar/TrueType
   Great!!

   However, there is no printing solution for now if we use TrueType.
   We are planning to support Xprint for Arabic. The Xserver should
   be able to load and display Arabic glyphs.

2. Use iso10646-1 fonts 

   I tried arabic bdf fonts from 

   http://crl.nmsu.edu/~mleisher/download.html

   It works well.
Comment on attachment 83639 [details]
not working - langbox 8859-6x fonts from http://www.langbox.com/AraZilla/linux/arafontfull-1.2-4.i386.rpm 

AFAIK we do not have any mapping for "iso-8859-6x" fonts in our fontmetrics
code
What about treating ISO-8859-6 fonts like iso10646-1 in our fontmetrics system,
e.g. use the X11 API to query which glyphs are present in the font and not try
to "guess" it based on the encoding ?
That seems to work for the Solaris 2.8 fonts from
/usr/openwin/lib/locale/ar/X11/fonts/TrueType (at least "xfd" treats them as
16bit font then...).
Does anyone know which encoding standard the fonts "SHA1____.PFA" and
"SHA2____.PFA" from /usr/openwin/lib/locale/ar/X11/fonts/Type1/ implement ?
These encodings are new to me and certainly not any of the common standard or
proprietary encodings from ISO, ASMO, Microsoft, IBM or Apple, but each one is a
folding of part of the Unicode set into 8 bits (assuming that they both start
from 0x00 at the top left).

SHA1____.PFA is like a variant of ISO-8859-6: 

0x00-0x7F - equivalent to ASCII
0x80-0xEF - equivalent to Unicode 0x0600-0x066F

SHA2____.PFA maps the presentation forms from Unicode 0xFF70-0xFEFF to 0x70-0xFF
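
The two foldings described above can be sketched as a pair of byte-to-Unicode functions. This only illustrates the description in this comment and is not code from any patch: the function names and the `kUnmapped` sentinel are invented here, and the second range is assumed to mean U+FE70 to U+FEFF (Arabic Presentation Forms-B), since a later comment in this thread acknowledges that 0xFF70 is a typo.

```cpp
#include <cstdint>

// Hypothetical sentinel for byte values the description assigns no code point.
const char32_t kUnmapped = 0xFFFD;

// SHA1____.PFA as described: ASCII in 0x00-0x7F, then U+0600..U+066F in
// 0x80-0xEF. Bytes 0xF0-0xFF are not covered by the description.
char32_t Sha1FontByteToUnicode(std::uint8_t b) {
  if (b <= 0x7F) {
    return b;
  }
  if (b <= 0xEF) {
    return 0x0600 + (b - 0x80);
  }
  return kUnmapped;
}

// SHA2____.PFA as described, ASSUMING the intended range is U+FE70..U+FEFF
// at byte positions 0x70-0xFF. Bytes below 0x70 are not covered.
char32_t Sha2FontByteToUnicode(std::uint8_t b) {
  if (b >= 0x70) {
    return 0xFE00 + b;
  }
  return kUnmapped;
}
```

Each function uses a fixed offset within its block, but the two fonts need different offsets and the first one also embeds ASCII, so a single "plane plus offset" label would not describe both.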
Simon Montagu wrote:
> These encodings are new to me and certainly not any of the common standard or
> proprietary encodings from ISO, ASMO, Microsoft, IBM or Apple, but each one is 
> a folding of part of the Unicode set into 8 bits 
> (assuming that they both start from 0x00 at the top left).

Mhhh...

... what about creating a new X11 font encoding (we may call it
"sun.unicode.plane" :) which lists the unicode plane and offset being used in
the font (this would be sufficient for 8bit fonts; 16bit fonts would need
multiple entries in fonts.dir) ?
Clarification:
We are talking about two things here:
Solaris ships with two kinds of Arabic fonts:
1. TrueType fonts in /usr/openwin/lib/locale/ar/X11/fonts/TrueType/ - they are
listed as *-iso-8859-6 fonts in
/usr/openwin/lib/locale/ar/X11/fonts/TrueType/fonts.dir - but the Solaris
TrueType font engine identifies them correctly as 16bit fonts.
2. PS Type 1 fonts in /usr/openwin/lib/locale/ar/X11/fonts/Type1/ - which seem
to represent the unicode blocks for arabic (see comment #34) squished into 8bit
fonts

For [1] I propose to check whether ISO-8859-6 fonts are 16bit X11 fonts. If they
are 16bit fonts we should assume that they have the arabic glyphs in the
expected places
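
A rough sketch of the check proposed for [1], assuming the test is made on the byte1 range that Xlib reports for a loaded font (both byte1 bounds are zero for a single-byte font). `FontRange` is a hypothetical stand-in that copies four field names from Xlib's `XFontStruct` so the example compiles without X11 headers, and the helper names are invented as well.

```cpp
// Hypothetical stand-in for the range fields of Xlib's XFontStruct.
struct FontRange {
  unsigned min_char_or_byte2;
  unsigned max_char_or_byte2;
  unsigned min_byte1;
  unsigned max_byte1;
};

// A font is addressed with two bytes when either byte1 bound is nonzero.
bool IsTwoByteFont(const FontRange& f) {
  return f.min_byte1 != 0 || f.max_byte1 != 0;
}

// True if ch falls inside the font's declared index range. This says nothing
// about whether a glyph is actually drawn there; that would need the
// per-character metrics as well.
bool RangeCovers(const FontRange& f, unsigned ch) {
  if (!IsTwoByteFont(f)) {
    return ch >= f.min_char_or_byte2 && ch <= f.max_char_or_byte2;
  }
  const unsigned row = ch >> 8;
  const unsigned col = ch & 0xFF;
  return row >= f.min_byte1 && row <= f.max_byte1 &&
         col >= f.min_char_or_byte2 && col <= f.max_char_or_byte2;
}
```

With such a check, a font named *-iso8859-6 whose range reaches row 0xFE could be treated as carrying the presentation forms, while a genuine 8-bit iso8859-6 font would keep the existing single-byte path.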
For [2] I propose the idea listed in comment #35 (new X11 encoding scheme
"sun.unicode.plane")
I filed a separate bug (bug 159430 ("[RFE] Add support for X11 fonts which
represent single unicode blocks")) for the discussion around [2] from comment
#36
BTW: I filed bug 158894 ("RFE: Document how to treat TrueType fonts as
iso10646-1 with Solaris/Xsun") to document how users can use TrueType fonts with
iso10646-1 encoding on Solaris/Xsun to view/print Arabic pages.
> 2. PS Type 1 fonts in /usr/openwin/lib/locale/ar/X11/fonts/Type1/ - which seem
> to represent the unicode blocks for arabic (see comment #34) squished into 
> 8bit fonts

> For [1] I propose to check whether ISO-8859-6 fonts are 16bit X11 fonts. 

Is it valid to use iso8859-x for a 16 bit font?

> If they are 16bit fonts we should assume that they have the arabic glyphs in 
> the expected places

Is there any standard that describes this? Doing this if no standard exists
makes me very nervous.

> For [2] I propose the idea listed in comment #35 (new X11 encoding scheme
> "sun.unicode.plane")

  0x00-0x7F - equivalent to ASCII
  0x80-0xEF - equivalent to Unicode 0x0600-0x066F

  ...Unicode 0xFF70-0xFEFF to 0x70-0xFF
  (ignore the typo)

These do not have a simple mapping which the suggestion in comment #35 assumes.


Just to be clear: both ftang and I are happy to support these fonts and if
there is no standard we can just start the encoding-registry name with 
something like "x_" to indicate this is a non standard encoding.
from email, Prabhat Hegde <prabhath@mpkmail.eng.Sun.COM> writes:

The arabic language folks at Sun finally fixed their font encoding which 
they say is Stds based (LangBox). This font is called iso8859-6.8x and is
also supported by Gnome/Pango. How do i get this font to appear in the
arabic font selection on my Solaris box?
  Edit->Preferences->Font->Languages->Arabic
Currently only iso8859-6 based font shows up in the selection box.
blizzard: here is a small task where you could get your feet wet working on
fonts.
Thanks to quick education from ftang, simon and myself found the following -

A> A UnicodeToLangBox converter already exists in ucvlatin written long-long ago
by ftang which can be re-used after synching with latest code-base.
B> This converter needs to be modified to handle Arabic Pres form B.

Finally, the converter is not as complicated as when frank wrote it since arabic
presentation forms are already generated by layout. Hence only mapping but no
shaping logic is needed. I am on it right now and should have a patch by tomorrow.


Attached patch Patch (obsolete) — Splinter Review
This is mostly Prabhat's work with some contributions from me. intl/uconv
changed under me while I was working on it, so there may be some oddities.
Attachment #101334 - Attachment is obsolete: true
Over to smontagu...
Assignee: katakai → smontagu
Comment on attachment 101339 [details] [diff] [review]
Patch with all the files in it this time

r=Roland.Mainz@informatik.med.uni-giessen.de

Patch builds&works, I can see the arabic fonts properly and print them (with
Xprint), too (assuming that the matching fonts&*.enc&*.ttmap files are
available).
Attachment #101339 - Flags: review+
I forgot one minor nit (no need to file a new patch for that):
-- snip --
-  NS_IMETHOD FillInfo(PRUint32* aInfo);
-};
-
+static PRUnichar uni2lbox [] = 
+ {
+  0xC1,   /* FE80 */
+  0xC2 , 
+  0xC2 ,
-- snip --

Can you make that array |const|, please ?
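
To illustrate why a read-only table is enough here: since layout already produces the presentation forms, the converter only needs an index lookup. The sketch below uses just the three entries visible in the excerpt above (U+FE80 to U+FE82). The names `kFirstFormB`, `kUniToLangBoxPrefix` and `LookupLangBoxByte` are hypothetical, and the element type is `std::uint8_t` rather than the patch's `PRUnichar` only to keep the example free of NSPR headers.

```cpp
#include <cstddef>
#include <cstdint>

// First code point covered by the table, per the /* FE80 */ comment above.
const char32_t kFirstFormB = 0xFE80;

// Only the three entries shown in the patch excerpt; the real table is longer.
static const std::uint8_t kUniToLangBoxPrefix[] = {
  0xC1,  // U+FE80
  0xC2,  // U+FE81
  0xC2,  // U+FE82
};

// Hypothetical lookup: writes the font byte for ch to *out and returns true,
// or returns false if ch lies outside this truncated table.
bool LookupLangBoxByte(char32_t ch, std::uint8_t* out) {
  const std::size_t count =
      sizeof(kUniToLangBoxPrefix) / sizeof(kUniToLangBoxPrefix[0]);
  if (ch < kFirstFormB || ch - kFirstFormB >= count) {
    return false;
  }
  *out = kUniToLangBoxPrefix[ch - kFirstFormB];
  return true;
}
```

Declaring the array `const` lets the compiler place it in read-only data and catches accidental writes at compile time.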
would you add a font-lang group for arabic?

 nsFontLangGroup FLG_JA =      { "ja", nsnull };
 nsFontLangGroup FLG_KO =      { "ko", nsnull };
+nsFontLangGroup FLG_AR =      { "ar", nsnull };
 nsFontLangGroup FLG_NONE =    { nsnull , nsnull };

Brian Stell wrote:
> would you add a font-lang group for arabic?

... which reminds me that we didn't add one for indic ("hi-IN") either... ;-(

Comment on attachment 101339 [details] [diff] [review]
Patch with all the files in it this time

Thanks for the r=, but the patch needs polish at the very least. Also, we have
a problem with lam-alef ligatures (as attachment 101369 [details] shows) which must be
fixed.
Attachment #101339 - Flags: review+ → needs-work+
Yup - I think it still needs some bug-fixing. It's not just the combo (LAM+ALEF) case.
I am not a native user or expert so the only way I can tell is by comparing with
the Windows version and also IE. I tried the following sites:

assafir.com (MAC-ARABIC) this was the best.
aljazeera.com (Windows)
bbc.co.uk -> Arabic (Unicode encoded)

However, it is worthwhile to integrate it so that arabic testers on Solaris can
start testing. I can tell that selection is badly broken.
Simon Montagu wrote:
> Thanks for the r=, but the patch needs polish at the very least. Also, we have
> a problem with lam-alef ligatures (as attachment 101369 [details] shows) which must be
> fixed.

I thought this is a problem with the Solaris fonts - or did I understand you
wrong here ?
Prabhat Hegde wrote:
> However, it is worthwhile to integrate it so that arabic testers on Solaris 
> can start testing. I can tell that selection is badly broken.

That's why I gave my r= for it.
The code works IMHO "good enough" for trunk and we can't really kill all issues
in one step.
Without it we're completely screwed without the evil iso10646-1 fonts (there's
still the problem that the ISO8859-6.8x encoding files for Solaris+Xfree86
aren't available in the public... ;-( ).
Langbox encodings are publicly available:
http://www.langbox.com/arabic/FontSet_ISO8859-6-8X.html
Prabhat Hegde wrote:
> Langbox encodings are publicly available:
> http://www.langbox.com/arabic/FontSet_ISO8859-6-8X.html

... I was thinking about the encoding files (*.enc and *.ttmap) in this case.
Attached patch Patch v.2 with lam-alef handling (obsolete) — Splinter Review
I made the changes that Roland and Brian asked for, did some general clean-up,
and added handling for lam-alef.

This exposed a bug in nsRenderingContextGTK::GetTextDimensions: text with
lam-alef was laid out incorrectly and couldn't be selected properly, unless I
set MOZILLA_GFX_DISABLE_FAST_MEASURE in my environment. (Thanks are due to
Roland for help in identifying this problem).
Attachment #101339 - Attachment is obsolete: true
Attached patch Patch v.3 (obsolete) — Splinter Review
Oops! removed a printf
Attachment #101630 - Attachment is obsolete: true
Roland Mainz wrote:
> Brian Stell wrote:
> > would you add a font-lang group for arabic?
>
> ... which reminds me that we didn't add one for indic ("hi-IN") either... ;-(

Filed bug 172515 ("Some nsFontLangGroup entries missing in X11 font code") for
that issue...
The font changes look good and I am qualified to review them.
r=bstell@ix.netcom.com for the font changes.

I don't work in the converter code enough to be qualified to review those changes. 
Perhaps ftang can do that.
Adding dependency on bug 172683 ("Problem with layout of Arabic lam-alef
ligatures and nsRenderingContext{GTK|Xlib}::GetTextDimensions()") since lam-alef
using the LangBox iso8859-6.8x is screwed up due to that bug...
Depends on: 172683
ftang, please review the converter and Mac build changes in attachment 102093 [details] [diff] [review].
Comment on attachment 102093 [details] [diff] [review]
Patch v.4: merged to tip (with fix for bug 172515) and added Mac build changes

r=ftang
only one minor issue. Please  change the return to NS_OK in the following
function:
+NS_IMETHODIMP nsUnicodeToLangBoxArabic8::GetMaxLength(
+const PRUnichar * aSrc, PRInt32 aSrcLength, 
+			    PRInt32 * aDestLength) 
+{
+  *aDestLength = 2*aSrcLength;
+  return NS_OK_UENC_EXACTLENGTH;

(that is, return NS_OK instead of NS_OK_UENC_EXACTLENGTH).
Then just update the patch and mark it as has-review.
The rest of the code looks good.
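
The thread does not spell out the reason for this request. One plausible reading is that 2*aSrcLength is an upper bound rather than an exact length. The sketch below illustrates that difference under a stated assumption: that a lam-alef ligature (U+FEF5 to U+FEFC) is emitted as two font bytes while every other character takes one. All names are hypothetical and none of this is taken from the patch.

```cpp
#include <cstddef>
#include <string>

// ASSUMPTION for illustration: only lam-alef ligatures expand to two bytes.
bool IsLamAlefLigature(char16_t ch) {
  return ch >= 0xFEF5 && ch <= 0xFEFC;
}

// Upper bound in the style of GetMaxLength: safe for sizing a buffer, but
// usually larger than what the encoder will actually write.
std::size_t MaxEncodedLength(const std::u16string& src) {
  return 2 * src.size();
}

// Exact output length under the assumption above.
std::size_t ExactEncodedLength(const std::u16string& src) {
  std::size_t bytes = 0;
  for (char16_t ch : src) {
    bytes += IsLamAlefLigature(ch) ? 2 : 1;
  }
  return bytes;
}
```

For a string with one lam-alef ligature and one other letter, the bound is 4 while the exact length is 3, so reporting the bound as an exact length would mislead callers.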
Attachment #102093 - Flags: review+
Attachment #102093 - Attachment is obsolete: true
Comment on attachment 102096 [details] [diff] [review]
Patch addressing ftang's review comments

r=ftang per comment 66
Attachment #102096 - Flags: review+
Comment on attachment 102096 [details] [diff] [review]
Patch addressing ftang's review comments

sr=roc+moz
Attachment #102096 - Flags: superreview+
Fix checked in.
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
There is one final thing ToDo:
We need an entry in the >=1.2b release notes with the following information:
- iso8859-6.8x support added for Unix/Linux incl. Xprint
- Information where to get the iso8859-6.8x-encoded fonts from
Keywords: relnote
prabhat:
Is there a way to contribute the iso8859-6.8x *.enc dir file to Xfree86.org and
create a patch for Solaris's "ar"-locale with the fixed arabic fonts and
*.enc/*.ttmap files ?
Filed release note item under bug 174672 comment #1 ...
Hi roland, OK - i'll look at *.enc file (as you know even minor contribution
needs to go via legal). As to Solaris, i believe Ar locale owners will create a
patch.
Prabhat Hegde wrote:
> i'll look at *.enc file
> (as you know even minor contribution needs to go via legal).

;-((

> As to Solaris, i believe Ar locale owners will create a patch.

Well, AFAIK you'll have to patch the Solaris "ar"-TrueType fonts, too - per
smontagu some glyphs are missing in these fonts (however, it should be easy to
add them via "pfaedit" (more details on demand) ... :))
Summary: [BiDi]:Arabic 2 byte fonts don't seem to render → [BiDi]:Arabic 2 byte fonts don't seem to render / Add support for iso8859-6.8x fonts
Blocks: 199741
No longer blocks: 199741
Blocks: 199741
Filed http://bugs.xfree86.org//cgi-bin/bugzilla/show_bug.cgi?id=420 ("RFE: Add
encodings files for Arabic LangBox encodings iso8859-6.8, iso8859-6.8x and
iso8859-6.16") to get support for these encodings in Xfree86...
Component: Layout: BiDi Hebrew & Arabic → Layout: Text
QA Contact: mahar → layout.fonts-and-text