Closed Bug 1637405 Opened 5 years ago Closed 5 years ago

PUA script itemization

Categories: Core :: Layout: Text and Fonts, defect
Version: 77 Branch
Status: RESOLVED FIXED
Target Milestone: mozilla78
Tracking Status: firefox78 fixed
People: Reporter: corbett.dav, Assigned: jfkthame
Attachments: 2 files

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:77.0) Gecko/20100101 Firefox/77.0

Steps to reproduce:

  1. Install the font JuniusX from https://github.com/psb1558/Junicode-New/blob/ff2c8b05c6b77d2c298c7d4bc64778d9179f7952/fonts/JuniusX-Regular.ttf.
  2. Go to data:text/html;charset=utf-8,<div style="font-family: JuniusX">m%EF%80%BE
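
For readers who want to check what the test string in step 2 contains, here is a small Python sketch (not part of the original report) that decodes the percent-escapes and prints each code point with its Unicode general category. It uses only the standard library; the variable name `text` is just for illustration.

```python
import unicodedata
from urllib.parse import unquote

# unquote() decodes percent-escapes as UTF-8 by default.
text = unquote("m%EF%80%BE")

for ch in text:
    # "Co" is the general category for private-use code points.
    print(f"U+{ord(ch):04X}", unicodedata.category(ch))
# prints:
# U+006D Ll
# U+F03E Co
```

So the page consists of an ASCII "m" followed by U+F03E, a code point in the Private Use Area.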

Actual results:

The second character (a combining mark in the Private Use Area) is not centered above the first character (an ASCII m).

Expected results:

The PUA mark should be attached to its base using the anchor defined in the font.

The reason it doesn’t work is that the script itemizer keeps script=Unknown characters in their own separate runs; see https://github.com/harfbuzz/harfbuzz/issues/2401 for more information.

Bugbug thinks this bug should belong to this component, but please revert this change in case of error.

Component: Untriaged → Layout: Text and Fonts
Product: Firefox → Core

Currently, gfxScriptItemizer checks for characters with Script=Inherited and Script=Common, and "adopts" them into the surrounding script run so that they are shaped together. I think we should do the same for Script=Unknown as well, which is the property value for PUA codepoints; while there's no way to know what script they are "supposed" to be, shaping them together with their context rather than as separate runs means that the font has at least a chance to apply any features it may have.

If PUA characters are used in the context of a complex script like Arabic or Indic, the shaper still won't have any basis for knowing how to apply features like 'init', 'medi', etc., that depend on per-character shaping classes; but "global" features like 'liga', 'kern', and mark attachment can be used by the font developer and should work as expected.
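
The adoption rule proposed in this comment can be illustrated with a toy itemizer. This is a minimal Python sketch under stated assumptions, not the gfxScriptItemizer code: `toy_script` is a hypothetical and very rough stand-in for the Unicode Script property (Python's standard library does not expose that property), and the real itemizer handles more cases than this loop does. The only point is to show how adding Unknown to the set of adopted scripts changes the runs for the test string "m" + U+F03E.

```python
import unicodedata

# Scripts whose characters are folded into the neighbouring run.
ADOPTED_BEFORE = {"Common", "Inherited"}
ADOPTED_AFTER = ADOPTED_BEFORE | {"Unknown"}


def toy_script(ch):
    """Hypothetical, very rough approximation of the Unicode Script property."""
    category = unicodedata.category(ch)
    if category in ("Co", "Cn", "Cs"):  # private use, unassigned, surrogate
        return "Unknown"
    if category in ("Mn", "Me"):  # approximation: treat these marks as Inherited
        return "Inherited"
    name = unicodedata.name(ch, "")
    for prefix, script in (("LATIN", "Latin"), ("GREEK", "Greek"), ("ARABIC", "Arabic")):
        if name.startswith(prefix):
            return script
    return "Common"


def itemize(text, adopted):
    """Split text into (substring, script) runs, folding `adopted` scripts into context."""
    runs = []  # each entry is [chars, script]
    for ch in text:
        script = toy_script(ch)
        if runs:
            run = runs[-1]
            if script in adopted or script == run[1]:
                run[0] += ch  # character joins the current run unchanged
                continue
            if run[1] in adopted:
                run[0] += ch  # run had no real script yet; it takes this one
                run[1] = script
                continue
        runs.append([ch, script])
    return [(chars, script) for chars, script in runs]


sample = "m\uf03e"
print(itemize(sample, ADOPTED_BEFORE))  # two runs: Latin, then Unknown
print(itemize(sample, ADOPTED_AFTER))   # one Latin run containing both characters
```

With the first set, the PUA mark lands in its own run and would be shaped without its base; with the second, both characters reach the shaper together, which is what mark attachment requires.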

If the script itemizer interrupts runs on encountering Script=Unknown, the PUA character will
be shaped separately from its context, and therefore OpenType features that span the run
boundary cannot take effect. Allowing characters with Script=Unknown to be "adopted" into the
surrounding run lets the shaper apply any OpenType features the font provides, such as
ligation, kerning, or diacritic positioning, between a PUA codepoint and the adjacent non-PUA
characters.

Assignee: nobody → jfkthame
Pushed by jkew@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8e1cc786b9d3
Merge PUA characters with Script=Unknown with the surrounding script run during itemization (just like Script=Common). r=jrmuizel
https://hg.mozilla.org/integration/autoland/rev/6e8e4b967372
Add reftest for shaping with a PUA-encoded diacritic. r=jrmuizel
Status: UNCONFIRMED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla78