Closed Bug 614476 Opened 14 years ago Closed 12 years ago

gfxPangoFonts sets up grapheme cluster boundaries twice

Categories

(Core :: Layout: Text and Fonts, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: karlt, Unassigned)

References

Details

(Whiteboard: [fixed in bug 703100])

Since bug 597147, gfxPlatform::SetupClusterBoundaries first does a rough
cluster setup, which is not yet really used.  It could be useful in the future
for:

  * font selection for the entire cluster
    (instead of for characters within the cluster)

  * choosing a point to split up long text runs

  * gfxHarfBuzzShaper (bug 569770), which would not perform the more thorough
    setup below.

Second, if Pango is used for shaping, pango_break is called.
pango_break first calls pango_default_break and then sometimes uses a
script-specific PangoEngineLang to improve the breaking of grapheme clusters.

Currently the second SetupClusterBoundaries in gfxPangoFonts.cpp does not
clear the first cluster setup, but merely removes breaks.  That's probably OK
because gfxPlatform::SetupClusterBoundaries is conservative about forming
clusters.

pango_default_break has a more complete cluster breaking algorithm than
gfxPlatform::SetupClusterBoundaries.

This is most complex for Hangul Jamo, Thai, and Lao scripts.

It also correctly handles U+FF9E HALFWIDTH KATAKANA VOICED SOUND MARK and
U+FF9F HALFWIDTH KATAKANA SEMI-VOICED SOUND MARK, which could easily be added
to gfxPlatform::SetupClusterBoundaries in bug 614468.

Pango accidentally treats ZWNJ and ZWJ like GB_ControlCRLF which means that they
always have grapheme boundaries on both sides (and
gfxPlatform::SetupClusterBoundaries has the same bug).
http://git.gnome.org/browse/pango/tree/pango/break.c?id=1.28.3#n659

Pango has indic and arabic PangoEngineLangs.  The arabic lang engine only
alters PangoLogAttr::backspace_deletes_character, which we do not use.  The
indic lang engine does alter PangoLogAttr::is_cursor_position, which we do
use.

For some scripts, we do not need to bother with the more complete pango_break.
(This was originally pointed out by jfkthame in bug 569770 comment 32.)

However, given we are moving to gfxHarfBuzzShaper for simple scripts (and hopefully more scripts in the future), I doubt adding code to decide when to pango_break will be worthwhile as we often won't be calling pango_break anyway.
With current default preferences, the HarfBuzz shaper would not be used for
Hangul, Thai, Lao, or indic scripts, so we need only use pango_break when using
Pango for shaping.

This is helpful because setting up another PangoAnalysis for pango_break would
be inconvenient from gfxPlatform::SetupClusterBoundaries.
Bug 474068 may be one good reason to skip pango_break when unnecessary.
Blocks: 474068
Depends on: 553981
Depends on: 703100
Fixed by not calling pango_break in any situations.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Whiteboard: [fixed in bug 703100]
You need to log in before you can comment on or make changes to this bug.