Closed Bug 157967 Opened 22 years ago Closed 5 years ago

Make Gecko interoperate better with advanced typography systems such as ATSUI, Uniscribe, Pango & STSF

Categories

(Core :: Layout: Text and Fonts, defect, P3)

Tracking

()

RESOLVED WORKSFORME
Future

People

(Reporter: sfraser_bugs, Unassigned)

References

Details

(Keywords: intl, Whiteboard: [line-breaking])

Gecko's text rendering code currently makes some very basic assumptions about
the way that text is flowed, and individual lines of text are rendered. For
example, it has some very basic linebreaking rules that seem to be hard-coded
for Western languages, and assumes that every Unicode character can be
represented by a single rendered glyph.

These assumptions make it impossible to make use of advanced typography features
available in the text rendering engines on some platforms (e.g. Mac OS X -- see
bug 105800). For instance, it's currently impossible to have text rendered with
ligatures, because this breaks selection. This is a serious issue, because
displaying text with ligatures is essential for correctly rendering text in some
language systems, like Arabic.
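
To make the selection problem concrete, here is a minimal Python sketch of a character-to-glyph "cluster" mapping of the kind shaping engines commonly expose. The `Glyph` type, the glyph names, and the advances are all invented for illustration; this is not Gecko's or ATSUI's data structure, and it assumes simple left-to-right text.

```python
from dataclasses import dataclass

@dataclass
class Glyph:
    name: str        # hypothetical glyph name
    first_char: int  # index of the first character this glyph covers
    num_chars: int   # how many characters it covers (>1 for a ligature)
    advance: float   # horizontal advance in arbitrary units

def caret_x(glyphs, char_index):
    """Return the x offset of a caret placed before char_index.

    Inside a ligature the caret is interpolated across the glyph's advance,
    one common fallback when the font supplies no caret positions.
    """
    x = 0.0
    for g in glyphs:
        end = g.first_char + g.num_chars
        if char_index >= end:
            x += g.advance  # caret is past this whole glyph
        elif char_index > g.first_char:
            # caret falls inside a multi-character glyph
            x += g.advance * (char_index - g.first_char) / g.num_chars
            break
        else:
            break
    return x

# "office" shaped with an "ffi" ligature: 6 characters, but only 4 glyphs.
glyphs = [
    Glyph("o", 0, 1, 10.0),
    Glyph("f_f_i", 1, 3, 18.0),
    Glyph("c", 4, 1, 9.0),
    Glyph("e", 5, 1, 9.0),
]
```

Code that assumes one glyph per character has no answer for a caret between the two "f"s; with an explicit mapping, selection can still be resolved inside the ligature.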

In addition, the way layout currently does its own linebreaking, and then calls
the gfx APIs to draw text in small chunks, makes for a very inefficient use of
the lower-level platform text drawing APIs. For example, ATSUI on Mac expects
you to use the ATSUI calls to lay out entire paragraphs at once, which can then
be drawn in one go. This allows ATSUI to do all the text layout within that
paragraph, enabling it to do language-sensitive word and linebreaking,
right-to-left layout (potentially), and even to handle selection. Right now,
when rendering with ATSUI, we're driving a paragraph-layout-API with word-sized
chunks of text, which is very inefficient.
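
As a rough illustration of the difference in how the API gets driven, the sketch below compares caller-side line breaking (one engine call per word) with handing over the whole paragraph in one call. `CountingEngine` is a hypothetical stand-in that only counts calls and uses a fixed character width; it does not model ATSUI itself.

```python
class CountingEngine:
    """Stand-in for a paragraph layout API; it only counts how often it is driven."""

    def __init__(self, char_width=1.0):
        self.calls = 0
        self.char_width = char_width

    def measure(self, text):
        """Measure one chunk of text (one call per chunk)."""
        self.calls += 1
        return len(text) * self.char_width

    def break_paragraph(self, text, max_width):
        """Lay out a whole paragraph in a single call and return its lines."""
        self.calls += 1
        lines, current = [], ""
        for word in text.split():
            candidate = word if not current else current + " " + word
            if current and len(candidate) * self.char_width > max_width:
                lines.append(current)
                current = word
            else:
                current = candidate
        if current:
            lines.append(current)
        return lines

def break_by_words(engine, text, max_width):
    """Caller-side greedy line breaking: one engine call per word."""
    space = engine.char_width
    lines, current, width = [], [], 0.0
    for word in text.split():
        w = engine.measure(word)
        extra = w if not current else width + space + w
        if current and extra > max_width:
            lines.append(" ".join(current))
            current, width = [word], w
        else:
            current.append(word)
            width = extra
    if current:
        lines.append(" ".join(current))
    return lines
```

Both paths produce the same lines for plain text here, but the word-by-word path costs one call per word, and the engine never sees enough context to apply its own language-sensitive breaking.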

These problems indicate that the core text layout code needs overhaul. It would
be nice to see this happen for Gecko 2.0.
How would this affect bugs like bug 56652 and bug 7969? Also, IIRC, Unicode has
a line-breaking algorithm (which I'm sure we don't follow)--how would putting
this type of layout in the hands of ATSUI and the like affect compliance, if we
did pull our line-breaking up to spec?
QA Contact: petersen → moied
Priority: -- → P4
Target Milestone: --- → Future
Is there a Gecko 2 bug this could block?
Depends on: 168884
Keywords: intl
Maybe a better summary for this bug would be, "Make Gecko interoperate better
with advanced typography systems such as ATSUI".
Would any other platforms benefit from such refinement? Do any of them have
anything like ATSUI?
Not yet, but they likely will at some point.  In any case, the code that would
need changing is XP.
Anyone at Netscape agree with my suggestion in comment 3?
greg: sounds good to me
Changing summary.
Summary: Gecko needs more advanced typography features → Make Gecko interoperate better with advanced typography systems such as Apple's ATSUI
This probably blocks bug 121540. Marking as such.

It seems to me as though bug 121540 blocks two basic things for Mac Mozilla:
good Unicode text display, and extended typography functions such as automatic
ligatures. The first is obviously more important than the latter.

Simon, is this assigned to the right person? Perhaps, in the context of
improving Unicode support on Mac, this deserves a better target than Future.
(Sure, things like shipping Chimera and transitioning Fizzilla to Mach are
higher priority, but this deserves some consideration following those two things.)
Blocks: atsui
This is not merely a cosmetic problem, and it is not limited to the Mac
world. There are already many HTML sites using ligatures in Arabic
script, where ligatures are indispensable for correct pronunciation, so such
sites are rendered properly only by MS Explorer. This also affects
writing emails in Arabic script with ligatures. If this problem is not
solved soon, many users of Arabic-based scripts will prefer to use MS
products.
Assignee: attinasi → font
Component: Layout → Layout: Fonts and Text
Priority: P4 → --
QA Contact: moied → ian
Target Milestone: Future → ---
Priority: -- → P3
Target Milestone: --- → Future
I agree that we need to move in the direction suggested here.

> very basic linebreaking rules that seem to be hard-coded for Western languages,

  Mozilla's line-breaking code is based on JIS X 4501 and works more or less for
Western scripts, CJ, K (I'm on purpose separating CJ and K here because they
behave differently when it comes to line breaking) and Thai. However, JIS X 4501
is not as extensive as the Unicode line breaking algorithm (UTR #14). We have to
move on to Unicode Linebreaking. (see bug 56652 comment #18 and bug 206152)
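
For readers unfamiliar with class-based line breaking, here is a drastically simplified Python sketch of the general idea: classify each character, then decide per adjacent pair whether a break is allowed. The five classes and the rules below are a toy subset chosen for illustration; this is neither the UTR #14 algorithm nor Mozilla's JIS X 4501 code.

```python
CLOSING = "\uff09\u300d\u3002\u3001"  # fullwidth ')', right corner bracket, ideographic full stop and comma
OPENING = "\uff08\u300c"              # fullwidth '(', left corner bracket

def char_class(ch):
    """Assign a toy line-breaking class to a single character."""
    if ch == " ":
        return "SP"
    if "\u4e00" <= ch <= "\u9fff":    # CJK Unified Ideographs block
        return "ID"
    if ch in CLOSING:
        return "CL"
    if ch in OPENING:
        return "OP"
    return "AL"                       # everything else is treated as alphabetic

def break_opportunities(text):
    """Return indices i where a line may break between text[i-1] and text[i]."""
    result = []
    for i in range(1, len(text)):
        before, after = char_class(text[i - 1]), char_class(text[i])
        if after in ("SP", "CL"):     # never break before a space or closing mark
            continue
        if before == "OP":            # never break after an opening mark
            continue
        if before == "SP" or "ID" in (before, after):
            result.append(i)          # break after spaces and around ideographs
    return result
```

Even this toy version shows why hard-coded Western rules are not enough: ideographic text has break opportunities with no spaces at all, while punctuation forbids some of them.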

Anyway, this bug is kinda 'meta-meta' bug in a sense.

 On Win32, we have a similar issue with Uniscribe and opentype fonts for complex
script support (rendering, text selection, and caret movement). On Linux, Pango
is similar to ATSUI, and so is Sun's STSF. Currently, on Win32 and Linux, we use
our own font-specific glyph-based solution to render complex scripts (bug
176290, bug 177877, bug 203052, bug 204286, etc), but using the system APIs is
certainly desirable. 
 
As for Arabic and Hebrew, they're dealt with differently from other complex
scripts. They're handled in nsTextFrame by IBM_BIDI code that maps Unicode
Arabic strings to strings of Arabic presentation forms. (Aside from BIDI,
_modern_ Hebrew - as opposed to Biblical Hebrew- doesn't seem to require
ligatures). This is not the best possible solution, but it works (doesn't it on
Mac OS X?)
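
To show what "presentation forms" means here, the following Python sketch uses the standard `unicodedata` module to relate the four positional forms of one Arabic letter (BEH) back to the base letter. It only illustrates the Unicode data; it is not the IBM_BIDI code, which goes in the other direction and has to choose a form from the neighbouring letters.

```python
import unicodedata

BEH = "\u0628"  # ARABIC LETTER BEH

# Positional presentation forms of BEH (Arabic Presentation Forms-B block).
BEH_FORMS = {
    "isolated": "\ufe8f",
    "final": "\ufe90",
    "initial": "\ufe91",
    "medial": "\ufe92",
}

def base_letter(presentation_form):
    """Map a presentation form back to its base letter(s) via NFKC."""
    return unicodedata.normalize("NFKC", presentation_form)

def form_tag(presentation_form):
    """Return the compatibility tag (e.g. 'initial') from the Unicode database."""
    decomposition = unicodedata.decomposition(presentation_form)
    return decomposition.split()[0].strip("<>")

for name, form in BEH_FORMS.items():
    print(name, form_tag(form), base_letter(form) == BEH)
```

The same block also encodes a few ligatures as single code points (for example U+FEFB, lam with alef), which is why a string-to-string mapping can get basic Arabic shaping without glyph-level font support.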
Using Opentype fonts on Win32/Unix and ??? - ooops.. the name of a very advanced
font format for MacOS X is escaping me at the moment- on Mac OS X would be better.  

SILA (http://sila.mozdev.org) uses a third advanced font format (SIL's
Graphite), but it's only for Win32 at the moment. 
Your comments on Bidi are not quite correct. On Win32 with Arabic and/or Hebrew
support enabled, we bypass most of our own routines for reordering and shaping
and hand the raw text to the Windows APIs (but not OpenType yet). I'm not sure
if we ended up doing the same thing on Mac or not.
I didn't mean to misrepresent what Mozilla does with 'BIDI scripts' on
'BIDI-enabled platform(s)'. I knew Mozilla bypasses most of its own BIDI-related
processing on platform(s) with 'native BIDI support'(Win2k/XP and Middle East
version of  Win9x/ME [1]), but my description was too coarse-grained..

[1] I remember seeing a block of code by which 'native BIDI capability' is
'detected', but can't find it at the moment. So, I'm not sure whether Mozilla
actually makes a distinction between ME version of Win9x/ME and non-ME version
of Win9x/ME. I'm even less sure of Mac OS.  
Whiteboard: [linebreak] → [line-breaking]
> the name of a very advanced font format for MacOS X is escaping me at the moment

OS X uses OpenType and the old TrueType GX advanced font formats*. ATSUI
probably abstracts both of them via its API.

* Skia, Apple Chancery, and Hoefler Text are TTGX from the System 7+QDGX days
and still ship with OS X; Zapfino is another advanced bundled font, possibly
OpenType.
Well, it's not OpenType but AAT (Apple Advanced Typography). AAT uses the 'mort'
TrueType table, which is more advanced than the OpenType GSUB/GPOS tables in a sense
because the 'mort' table is kind of 'self-contained', while GSUB/GPOS requires more
intervention from the upper layer. ATSUI doesn't take advantage of OpenType
GSUB/GPOS tables.
Summary: Make Gecko interoperate better with advanced typography systems such as Apple's ATSUI → Make Gecko interoperate better with advanced typography systems such as ATSUI, Uniscribe, Pango & STSF
See also bug 218887 about Uniscribe.
Blocks: 188294
Blocks: uniscribe
Blizzard is apparently working on Pango support:
http://www.0xdeadbeef.com/html/2004/11/

I wonder if he's run into any of the problems discussed here or on bug 121540.
Is this bug on the Gecko 2.0 radar?
It seems not. Pango work is being done in bug 214715. See also bug 260663. 
I'd be interested in contributing towards a bug bounty for this so bug 121540
can be fixed. Is there any point in my doing so?
Is this dependent on, duplicate of, or superseded by the Cairo initiative?
(In reply to comment #22)
> Is this dependent on, duplicate of, or superseded by the Cairo initiative?

I'm not sure what those who work on Cairo initiative have in mind. roc's blog
has some interesting information about Cairo and ATSUI and Pango. I'm not sure
what Cairo has to offer on Windows. (e.g. whether it uses/relies on Uniscribe)

Here's an excerpt from roc's blog at
http://weblogs.mozillazine.org/roc/archives/2005/05/cairo_progress.html
---------------------------
What I read about your work of porting to Cairo really makes me think that the
i18n aspect of it is a really delicate part, that you must not get wrong. The
MoFo product are used in many part of the world, in many languages, and have a
great support for that. Cairo must not mean a regression.
Make sure you synchronize with the i18n team as much as needed, it must not be
an afterthought.
-----------------
Do you know about Sila? Even if the project stopped, it included interesting changes
to mozilla GFX to enable powerful mechanism to support really complex scripts :
http://sila.mozdev.org/

Posted by: jmdesp at May 10, 2005 02:34 AM

Owen Taylor is working on cairo, and he wrote Pango which is a really strong
i18n library, so i18n needs are being taken care of. My current cairo code uses
Pango so i18n is probably already at least as good as what's on the trunk.
---------------
cairo has a really nice approach to fonts. Basically, it stays out of the way.
It gives you a way to draw a string of glyphs from a platform font; you have to
specify the glyphs and their positions. (There is a "toy" API that takes a UTF8
string and tries to draw it, but we won't be using that.) So the whole
UTF8->glyphstring and glyph positioning process remains under our control; we
should probably move to Pango for this on Linux/Unix, but on Mac and Windows we
should probably use ATSUI and Uniscribe. We can port over our existing code for
use when those libraries aren't available.
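
A minimal Python sketch of the "glyphs plus positions" contract described above: the caller turns shaped glyph IDs and advances into absolute positions before handing them to the drawing layer. `PositionedGlyph` is a hypothetical type that mirrors the general shape of a glyph-index-and-coordinates record; it is not cairo's API, and the single horizontal baseline is an assumption.

```python
from dataclasses import dataclass

@dataclass
class PositionedGlyph:
    index: int  # glyph ID in the font
    x: float    # absolute pen position
    y: float    # baseline

def position_glyphs(glyph_ids, advances, origin_x=0.0, baseline_y=0.0):
    """Turn per-glyph advances into absolute pen positions on one baseline."""
    if len(glyph_ids) != len(advances):
        raise ValueError("need exactly one advance per glyph")
    positioned, pen_x = [], origin_x
    for glyph_id, advance in zip(glyph_ids, advances):
        positioned.append(PositionedGlyph(glyph_id, pen_x, baseline_y))
        pen_x += advance  # the next glyph starts where this one ends
    return positioned
```

The point of the design is that everything before this step (text to glyph IDs, advances, reordering) stays with the platform shaping library or Gecko, and the drawing layer only receives finished glyph runs.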
> you have to specify the glyphs and their positions

This makes it sound like Cairo internally will draw the glyphs one by one. If
it's using ATSUI to do that, it's going to be very very slow.
(In reply to comment #25)
> > you have to specify the glyphs and their positions
> 
> This makes it sound like Cairo internally will draw the glyphs one by one. If
> it's using ATSUI to do that, it's going to be very very slow.

Indeed, it does. Avoiding that is the whole point of this bug. The same is more
or less true of Uniscribe and Pango. 

(In reply to comment #25)
> > you have to specify the glyphs and their positions
> 
> This makes it sound like Cairo internally will draw the glyphs one by one. If
> it's using ATSUI to do that, it's going to be very very slow.

From a quick look at the Apple docs, it looks like we should be using ATSUI to
convert text to a glyph string and a list of glyph positions (via
ATSUDirectGetLayoutDataArrayPtrFromTextLayout). Then we can pass that into
cairo. The cairo Quartz backend can call CGContextShowGlyphsWithAdvances to show
the glyphs. (Well, this assumes CGGlyph and ATSGlyph are the same thing...)
(In reply to comment #28)

> From a quick look at the Apple docs, it looks like we should be using ATSUI to
> convert text to a glyph string and a list of glyph positions (via
> ATSUDirectGetLayoutDataArrayPtrFromTextLayout).

The description of ATSUDirectGetLayoutDataArrayPtrFromTextLayout warns of some
performance implications, but my guess is that we can get arrays of glyph
positions from ATSUI and have them rendered by Core Graphics. It does seem
somewhat counter intuitive, since the fastest code path would just be to have
ATSUI render those glyph arrays itself, rather than going back through cairo.

> Then we can pass that into
> cairo. The cairo Quartz backend can call CGContextShowGlyphsWithAdvances to 
> show the glyphs.

CGContextShowGlyphsWithAdvances is only available in 10.3 or later, so using
this would change our base OS requirements.

> Well, this assumes CGGlyph and ATSGlyph are the same thing...

Both are unsigned shorts, but I can't find any docs that specify whether they
are the same.

Glyph rendering is one of the two high-cost areas of text rendering. The other
major performance area with ATSUI is text layout (i.e. glyph positioning). It
sounds like Cairo will know even less about text layout than gfx does, placing
more of the burden on Gecko, and making this bug all the more applicable. To get
good performance from ATSUI, we should be delegating more of the text layout
logic to it, and we need to be able to cache ATSUTextLayout and/or ATSUStyles.
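
One way the caching idea could look, sketched in Python: a small least-recently-used cache keyed on the text and a style key, so that an expensive layout object is built once and reused for measuring and drawing. `LayoutCache`, `build_layout`, and the key shape are hypothetical names for illustration; nothing here is an ATSUI call.

```python
from collections import OrderedDict

class LayoutCache:
    """LRU cache for expensive layout objects, keyed on (text, style_key)."""

    def __init__(self, build_layout, capacity=256):
        self._build = build_layout     # callable that creates the layout object
        self._capacity = capacity
        self._entries = OrderedDict()  # oldest entry first
        self.misses = 0

    def get(self, text, style_key):
        key = (text, style_key)
        if key in self._entries:
            self._entries.move_to_end(key)     # mark as most recently used
            return self._entries[key]
        self.misses += 1
        layout = self._build(text, style_key)
        self._entries[key] = layout
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict the least recently used
        return layout
```

A real implementation would also need to invalidate entries when fonts or style data change, which this sketch leaves out.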
(In reply to comment #29)
> Glyph rendering is one of the two high-cost areas of text rendering. The other
> major performance area with ATSUI is text layout (i.e. glyph positioning). It
> sounds like Cairo will know even less about text layout than gfx does, placing
> more of the burden on Gecko, and making this bug all the more applicable. To
> get good performance from ATSUI, we should be delegating more of the text
> layout logic to it, and we need to be able to cache ATSUTextLayout and/or
> ATSUStyles.

I totally agree we need to do this, but we still need to figure out the shape of
the interfaces.

See https://bugzilla.mozilla.org/show_bug.cgi?id=288439#c17

Let's go to the wiki for more discussion.
http://wiki.mozilla.org/index.php?title=Gecko2:NewTextAPI
I think you're probably right that we don't want to separate measurement from
drawing at our cross-platform-API level. Then your question can be reduced to
"how are we going to do high-performance text rendering with ATSUI and cairo on
the Mac?" and some Mac-specific hacks can be considered.
Depends on: 297074
No longer depends on: 297074
I don't follow the reasoning above well enough to understand whether or not my bug 316249 is a manifestation of this one, or if it is something else. Can someone please check? Thanks.
related: Bug 336959 - Line Breaking with Pango/Uniscribe
Blocks: 378271
Assignee: layout.fonts-and-text → nobody
QA Contact: ian → layout.fonts-and-text

We did integrate Gecko with ATSUI, Uniscribe, and Pango, and later with Core Text and DirectWrite; nowadays we use HarfBuzz and Graphite to handle advanced text shaping. So what this bug was calling for has happened, in various pieces and at various times for different platforms/back-ends.

Closing as WFM; any current issues in this area should be filed as fresh bugs.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME