Gecko's text rendering code currently makes some very basic assumptions about the way that text is flowed and individual lines of text are rendered. For example, it has some very basic linebreaking rules that seem to be hard-coded for Western languages, and assumes that every Unicode character can be represented by a single rendered glyph. These assumptions make it impossible to make use of advanced typography features available in the text rendering engines on some platforms (e.g. Mac OS X -- see bug 105800). For instance, it's currently impossible to have text rendered with ligatures, because this breaks selection. This is a serious issue, because displaying text with ligatures is essential for correctly rendering text in some language systems, like Arabic. In addition, the way layout currently does its own linebreaking, and then calls the gfx APIs to draw text in small chunks, makes for a very inefficient use of the lower-level platform text drawing APIs. For example, ATSUI on Mac expects you to use the ATSUI calls to lay out entire paragraphs at once, which can then be drawn in one go. This allows ATSUI to do all the text layout within that paragraph, enabling it to do language-sensitive word and linebreaking, right-to-left layout (potentially), and even to handle selection. Right now, when rendering with ATSUI, we're driving a paragraph-layout API with word-sized chunks of text, which is very inefficient. These problems indicate that the core text layout code needs an overhaul. It would be nice to see this happen for Gecko 2.0.
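To illustrate why the one-character-one-glyph assumption breaks selection once ligatures appear, here is a minimal Python sketch. The `Glyph` record, the hand-written glyph run, and `caret_x` are all hypothetical and are not Gecko or ATSUI API. A real shaping engine would supply the glyph run, and might place carets inside a ligature using font data rather than dividing the advance evenly as assumed here.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    name: str        # illustrative glyph name, not a real font glyph ID
    first_char: int  # index of the first character this glyph covers
    num_chars: int   # how many characters were merged into this glyph
    advance: float   # horizontal advance in arbitrary units


def caret_x(glyphs, char_index):
    """Return the x offset of a caret placed just before char_index."""
    x = 0.0
    for g in glyphs:
        end = g.first_char + g.num_chars
        if char_index >= end:
            # Caret is at or past the end of this glyph: add its full advance.
            x += g.advance
        elif char_index > g.first_char:
            # Caret falls inside a ligature: assume an even split of the advance.
            x += g.advance * (char_index - g.first_char) / g.num_chars
            break
        else:
            break
    return x


# "office" shaped with an "ffi" ligature: 6 characters but only 4 glyphs.
run = [
    Glyph("o", 0, 1, 6.0),
    Glyph("f_f_i", 1, 3, 9.0),
    Glyph("c", 4, 1, 5.0),
    Glyph("e", 5, 1, 5.0),
]
```

Code that indexes glyphs by character offset would go wrong on this run as soon as the caret moves past the first "f". Keeping an explicit character-to-glyph mapping, as above, is one way selection and ligatures can coexist.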
Is there a Gecko 2 bug this could block?
Maybe a better summary for this bug would be, "Make Gecko interoperate better with advanced typography systems such as ATSUI".
Would any other platforms benefit from such refinement? Do any of them have anything like ATSUI?
Not yet, but they likely will at some point. In any case, the code that would need changing is XP.
Anyone at Netscape agree with my suggestion in comment 3?
greg: sounds good to me
This probably blocks bug 121540. Marking as such. It seems to me as though bug 121540 blocks two basic things for Mac Mozilla: good Unicode text display, and extended typography functions such as automatic ligatures. The first is obviously more important than the latter. Simon, is this assigned to the right person? Perhaps, in the context of improving Unicode support on Mac, this deserves a better target than Future. (Sure, things like shipping Chimera and transitioning Fizzilla to Mach are higher priority, but this deserves some consideration following those two things.)
This is not merely a decorative problem, and it is not only related to the Mac world. At this stage there are a lot of HTML sites using ligatures in Arabic script, where ligatures are indispensable to correct pronunciation, so such sites are rendered properly only by MS Explorer. Additionally, this also has an impact on writing emails in Arabic script with ligatures. If this problem is not solved soon, a lot of users of Arabic-based scripts will prefer to use MS products.
I agree that we need to move in the direction suggested here. > very basic linebreaking rules that seem to be hard-coded for Western languages, Mozilla's line-breaking code is based on JIS X 4501 and works more or less for Western scripts, CJ, K (I'm on purpose separating CJ and K here because they behave differently when it comes to line breaking) and Thai. However, JIS X 4501 is not as extensive as the Unicode line breaking algorithm (UTR #14). We have to move on to Unicode line breaking. (see bug 56652 comment #18 and bug 206152) Anyway, this bug is kind of a 'meta-meta' bug in a sense. On Win32, we have a similar issue with Uniscribe and OpenType fonts for complex script support (rendering, text selection, and caret movement). On Linux, Pango is similar to ATSUI. So is Sun's STSF. Currently, on Win32 and Linux, we use our own font-specific glyph-based solution to render complex scripts (bug 176290, bug 177877, bug 203052, bug 204286, etc), but using the system APIs is certainly desirable. As for Arabic and Hebrew, they're dealt with differently from other complex scripts. They're handled in nsTextFrame by IBM_BIDI code that maps Unicode Arabic strings to strings of Arabic presentation forms. (Aside from BIDI, _modern_ Hebrew - as opposed to Biblical Hebrew - doesn't seem to require ligatures.) This is not the best possible solution, but it works (doesn't it on Mac OS X?). Using OpenType fonts on Win32/Unix and ??? - ooops.. the name of a very advanced font format for Mac OS X is escaping me at the moment - on Mac OS X would be better. SILA (http://sila.mozdev.org) uses a third advanced font format (SIL's Graphite), but it's only for Win32 at the moment.
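As a small illustration of the relationship between logical Arabic text and presentation forms mentioned above, the following Python sketch uses the Unicode compatibility decompositions that ship with the standard `unicodedata` module. It only shows the reverse direction (presentation form back to base letters). It is not the IBM_BIDI code, and forward shaping, which depends on each letter's joining context, is not attempted. The helper names `to_logical` and `is_presentation_form` are made up for this example.

```python
import unicodedata

LAM = "\u0644"                # ARABIC LETTER LAM
ALEF = "\u0627"               # ARABIC LETTER ALEF
LAM_ALEF_ISOLATED = "\uFEFB"  # ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM


def to_logical(text):
    """Fold presentation forms back to the base letters they stand for."""
    return unicodedata.normalize("NFKC", text)


def is_presentation_form(ch):
    """True for code points in the Arabic Presentation Forms-A or -B block ranges."""
    cp = ord(ch)
    return 0xFB50 <= cp <= 0xFDFF or 0xFE70 <= cp <= 0xFEFF


# One presentation-form code point stands for two logical characters,
# so a string of presentation forms no longer maps 1:1 onto the stored text.
logical = to_logical(LAM_ALEF_ISOLATED)
```

The point for this bug is that a single displayed unit (the lam-alef ligature) corresponds to two characters of the underlying text, which is exactly the case a one-character-one-glyph text frame cannot represent.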
Your comments on Bidi are not quite correct. On Win32 with Arabic and/or Hebrew support enabled, we bypass most of our own routines for reordering and shaping and hand the raw text to the Windows APIs (but not OpenType yet). I'm not sure if we ended up doing the same thing on Mac or not.
I didn't mean to misrepresent what Mozilla does with 'BIDI scripts' on 'BIDI-enabled platform(s)'. I knew Mozilla bypasses most of its own BIDI-related processing on platform(s) with 'native BIDI support' (Win2k/XP and the Middle East version of Win9x/ME), but my description was too coarse-grained. I remember seeing a block of code by which 'native BIDI capability' is 'detected', but can't find it at the moment. So, I'm not sure whether Mozilla actually makes a distinction between the ME version of Win9x/ME and the non-ME version of Win9x/ME. I'm even less sure of Mac OS.
> the name of a very advanced font format for MacOS X is escaping me at the moment OS X uses OpenType and the old TrueType GX advanced font formats*. ATSUI probably abstracts both of them via its' API. * Skia, Apple Chancery, and Hoefler Text are TTGX from the System 7+QDGX days and still ship with OS X; Zapfino is another advanced bundled font, possibly OpenType.
Well, it's not OpenType but AAT (Apple Advanced Typography). AAT uses the 'mort' TrueType table, and it's more advanced than the OpenType GSUB/GPOS tables in a sense, because the 'mort' table is kind of 'self-contained' while GSUB/GPOS requires more intervention from the upper layer. ATSUI doesn't take advantage of OpenType GSUB/GPOS tables.
See also bug 218887 about Uniscribe.
Blizzard is apparently working on Pango support: http://www.0xdeadbeef.com/html/2004/11/ I wonder if he's run into any of the problems discussed here or on bug 121540.
Is this bug on the Gecko 2.0 radar?
I'd be interested in contributing towards a bug bounty for this so bug 121540 can be fixed. Is there any point in my doing so?
Is this dependent on, duplicate of, or superseded by the Cairo initiative?
(In reply to comment #22) > Is this dependent on, duplicate of, or superseded by the Cairo initiative? I'm not sure what those who work on Cairo initiative have in mind. roc's blog has some interesting information about Cairo and ATSUI and Pango. I'm not sure what Cairo has to offer on Windows. (e.g. whether it uses/relies on Uniscribe) Here's an excerpt from roc's blog at http://weblogs.mozillazine.org/roc/archives/2005/05/cairo_progress.html --------------------------- What I read about your work of porting to Cairo really makes me think that the i18n aspect of it is a really delicate part, that you must not get wrong. The MoFo product are used in many part of the world, in many languages, and have a great support for that. Cairo must not mean a regression. Make sure you synchronize with the i18n team as much as needed, it must not be an afterthought. ----------------- Do you about Sila ? Even if the project stopped, it included interesting change to mozilla GFX to enable powerful mechanism to support really complex scripts : http://sila.mozdev.org/ Posted by: jmdesp at May 10, 2005 02:34 AM Owen Taylor is working on cairo, and he wrote Pango which is a really strong i18n library, so i18n needs are being taken care of. My current cairo code uses Pango so i18n is probably already at least as good as what's on the trunk. ---------------
cairo has a really nice approach to fonts. Basically, it stays out of the way. It gives you a way to draw a string of glyphs from a platform font; you have to specify the glyphs and their positions. (There is a "toy" API that takes a UTF8 string and tries to draw it, but we won't be using that.) So the whole UTF8->glyphstring and glyph positioning process remains under our control; we should probably move to Pango for this on Linux/Unix, but on Mac and Windows we should probably use ATSUI and Uniscribe. We can port over our existing code for use when those libraries aren't available.
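A rough sketch of the division of labour described above, in Python for brevity: the caller turns shaped glyphs and their advances into absolute positions, and only then hands the whole array to the drawing layer. The function `position_glyphs` is hypothetical. In the cairo C API the corresponding draw call is `cairo_show_glyphs`, which takes an array of `cairo_glyph_t` (index, x, y); where the glyph IDs and advances come from (Pango, ATSUI, Uniscribe, or our own code) is left open here.

```python
def position_glyphs(glyph_ids, advances, origin=(0.0, 0.0)):
    """Turn per-glyph advances into absolute (glyph_id, x, y) triples."""
    if len(glyph_ids) != len(advances):
        raise ValueError("need exactly one advance per glyph")
    x, y = origin
    positioned = []
    for gid, adv in zip(glyph_ids, advances):
        # Each glyph is placed at the current pen position,
        # then the pen moves right by that glyph's advance.
        positioned.append((gid, x, y))
        x += adv
    return positioned


# Made-up glyph IDs and advances, standing in for a shaper's output.
glyphs = position_glyphs([10, 11, 12], [5.0, 6.0, 7.0], origin=(1.0, 2.0))
```

The resulting list can be passed to the graphics layer in a single call, so specifying glyphs and positions does not by itself force one draw call per glyph.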
> you have to specify the glyphs and their positions This makes it sound like Cairo internally will draw the glyphs one by one. If it's using ATSUI to do that, it's going to be very very slow.
Ask tor, he's doing it.
(In reply to comment #25) > > you have to specify the glyphs and their positions > > This makes it sound like Cairo internally will draw the glyphs one by one. If > it's using ATSUI to do that, it's going to be very very slow. Indeed, it does. Avoiding that is the whole point of this bug. The same is more or less true of Uniscribe and Pango.
(In reply to comment #25) > > you have to specify the glyphs and their positions > > This makes it sound like Cairo internally will draw the glyphs one by one. If > it's using ATSUI to do that, it's going to be very very slow. From a quick look at the Apple docs, it looks like we should be using ATSUI to convert text to a glyph string and a list of glyph positions (via ATSUDirectGetLayoutDataArrayPtrFromTextLayout). Then we can pass that into cairo. The cairo Quartz backend can call CGContextShowGlyphsWithAdvances to show the glyphs. (Well, this assumes CGGlyph and ATSGlyph are the same thing...)
(In reply to comment #28) > From a quick look at the Apple docs, it looks like we should be using ATSUI to > convert text to a glyph string and a list of glyph positions (via > ATSUDirectGetLayoutDataArrayPtrFromTextLayout). The description of ATSUDirectGetLayoutDataArrayPtrFromTextLayout warns of some performance implications, but my guess is that we can get arrays of glyph positions from ATSUI and have them rendered by Core Graphics. It does seem somewhat counterintuitive, since the fastest code path would just be to have ATSUI render those glyph arrays itself, rather than going back through cairo. > Then we can pass that into > cairo. The cairo Quartz backend can call CGContextShowGlyphsWithAdvances to > show the glyphs. CGContextShowGlyphsWithAdvances is only available in 10.3 or later, so using this would change our base OS requirements. > Well, this assumes CGGlyph and ATSGlyph are the same thing... Both are unsigned shorts, but I can't find any docs that specify whether they are the same. Glyph rendering is one of the two high-cost areas of text rendering. The other major performance area with ATSUI is text layout (i.e. glyph positioning). It sounds like Cairo will know even less about text layout than gfx does, placing more of the burden on Gecko, and making this bug all the more applicable. To get good performance from ATSUI, we should be delegating more of the text layout logic to it, and we need to be able to cache ATSUTextLayout and/or ATSUStyles.
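The caching idea at the end of the comment above could look roughly like the following Python sketch. Everything in it is an assumption for illustration: `LayoutCache` is a made-up class, `fake_shape` stands in for an expensive platform layout call (such as building an ATSUTextLayout), and keying on a `(text, style)` pair is just one plausible choice.

```python
from collections import OrderedDict


class LayoutCache:
    """Small least-recently-used cache of layout results keyed by (text, style)."""

    def __init__(self, shape, capacity=64):
        self._shape = shape            # hypothetical expensive layout function
        self._capacity = capacity
        self._entries = OrderedDict()
        self.misses = 0                # how many times the layout function ran

    def get(self, text, style):
        key = (text, style)
        if key in self._entries:
            self._entries.move_to_end(key)      # mark as most recently used
            return self._entries[key]
        self.misses += 1
        layout = self._shape(text, style)
        self._entries[key] = layout
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)   # evict the least recently used entry
        return layout


def fake_shape(text, style):
    # Stand-in for a platform layout call; returns one fake "glyph" per character.
    return tuple((ch, style) for ch in text)


cache = LayoutCache(fake_shape, capacity=2)
first = cache.get("hello", "serif")
second = cache.get("hello", "serif")   # served from the cache, no second layout
```

Measuring and then drawing the same paragraph would hit the cache on the second call, which is the kind of reuse that word-sized chunking currently rules out.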
(In reply to comment #29) > Glyph rendering is one of the two high-cost areas of text rendering. The other > major performance area with ATSUI is text layout (i.e. glyph positioning). It > sounds like Cairo will know even less about text layout than gfx does, placing > more of the burden on Gecko, and making this bug all the more applicable. To > get good performance from ATSUI, we should be delegating more of the text > layout logic to it, and we need to be able to cache ATSUTextLayout and/or > ATSUStyles. I totally agree we need to do this, but we still need to figure out the shape of the interfaces. See https://bugzilla.mozilla.org/show_bug.cgi?id=288439#c17 Let's go to the wiki for more discussion. http://wiki.mozilla.org/index.php?title=Gecko2:NewTextAPI I think you're probably right that we don't want to separate measurement from drawing at our cross-platform-API level. Then your question can be reduced to "how are we going to do high-performance text rendering with ATSUI and cairo on the Mac?" and some Mac-specific hacks can be considered.
I don't follow the reasoning above well enough to understand whether my bug 316249 is a manifestation of this one or something else. Can someone please check? Thanks.
related: Bug 336959 - Line Breaking with Pango/Uniscribe