Unify Intl APIs in gfx/thebes/gfxHarfBuzzShaper.cpp
Categories
(Core :: Internationalization, task, P3)
Tracking
()
| | Tracking | Status |
|---|---|---|
| firefox94 | --- | fixed |
People
(Reporter: gregtatum, Assigned: jfkthame)
References
Details
(Whiteboard: [i18n-unification], [i18n-unification-help-wanted] )
Attachments
(2 files)
Work: Medium
What it is: UNormalizer2 and UText
| Assignee | ||
Comment 1•4 years ago
| Assignee | ||
Comment 2•4 years ago
Depends on D126259
| Assignee | ||
Comment 3•4 years ago
gfxHarfBuzzShaper needs two low-level normalization data accessors that it currently gets from UNormalizer2:
composePair - given two Unicode characters, return their single-character composed equivalent, if any
decomposeRaw - given a Unicode character, return its one- or two-character single-level decomposition, if any (not the recursive decomposition that full NFD normalization would perform)
Because these are used only in relation to canonical normalization, they do not have to deal with arbitrary-length decompositions. All canonical decompositions in Unicode are either singletons (like U+212A KELVIN SIGN -> U+004B LATIN CAPITAL LETTER K) or decompose to a pair of characters (which may themselves have further decompositions, like U+01D8 LATIN SMALL LETTER U WITH DIAERESIS AND ACUTE which decomposes to <U+00FC LATIN SMALL LETTER U WITH DIAERESIS, U+0301 COMBINING ACUTE ACCENT>, where U+00FC in turn also has a pairwise decomposition). But these low-level APIs are only concerned with a single decomposition step.
Although arguably these are not really "string" operations -- they're more like queries about individual characters -- I think it probably makes most sense to include them in mozilla::intl::String alongside the higher-level normalization APIs.
The low-level APIs I'm suggesting we add here are very specific, single-purpose methods, deliberately not taking a parameter to indicate whether to use Canonical or Compatibility decompositions; this simplifies the APIs as multi-char (> 2 components) decompositions need not be handled, and avoids a test and branch at runtime for flexibility that we don't need. The use of these methods by harfbuzz is quite perf-sensitive, so I want to keep them as simple and lightweight as possible.
Once these are provided by mozilla::intl, gfxHarfBuzzShaper will no longer need either UNormalizer2 or UText.
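To make the distinction between a single decomposition step and full decomposition concrete, here is a minimal, self-contained sketch. It is not the mozilla::intl implementation: the names ComposePair, DecomposeRaw, DecomposeFully, kSampleData, and SelfCheck are hypothetical, the signatures are assumptions, and the three-entry table stands in for the real normalization data, which would come from ICU. Canonical reordering, which full NFD also performs, is left out.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the canonical normalization data. A real
// implementation would query ICU's data, not a hand-written table.
struct RawDecomposition {
  char32_t composed;
  char32_t first;
  char32_t second;  // 0 marks a singleton decomposition
};

static const RawDecomposition kSampleData[] = {
    {0x212A, 0x004B, 0},       // KELVIN SIGN -> LATIN CAPITAL LETTER K
    {0x00FC, 0x0075, 0x0308},  // u with diaeresis -> u + COMBINING DIAERESIS
    {0x01D8, 0x00FC, 0x0301},  // u with diaeresis and acute ->
                               //   u with diaeresis + COMBINING ACUTE ACCENT
};

// Returns the single character that the pair (a, b) composes to,
// or 0 if the pair has no composed equivalent in the sample table.
// Singleton entries are skipped because they never take part in
// pairwise composition.
char32_t ComposePair(char32_t a, char32_t b) {
  for (const auto& entry : kSampleData) {
    if (entry.second != 0 && entry.first == a && entry.second == b) {
      return entry.composed;
    }
  }
  return 0;
}

// Writes the single-level decomposition of ch into out and returns its
// length: 0 (no decomposition), 1 (singleton), or 2 (pair).
int DecomposeRaw(char32_t ch, char32_t out[2]) {
  for (const auto& entry : kSampleData) {
    if (entry.composed == ch) {
      out[0] = entry.first;
      if (entry.second == 0) {
        return 1;
      }
      out[1] = entry.second;
      return 2;
    }
  }
  return 0;
}

// For contrast only: applies DecomposeRaw recursively until no part
// decomposes further. Canonical reordering is deliberately omitted.
void DecomposeFully(char32_t ch, std::vector<char32_t>& out) {
  char32_t parts[2];
  int length = DecomposeRaw(ch, parts);
  if (length == 0) {
    out.push_back(ch);
    return;
  }
  for (int i = 0; i < length; ++i) {
    DecomposeFully(parts[i], out);
  }
}

// Small usage example: one raw step on U+01D8 stops at U+00FC.
void SelfCheck() {
  char32_t parts[2];
  assert(DecomposeRaw(0x01D8, parts) == 2);
  assert(parts[0] == 0x00FC && parts[1] == 0x0301);
}
```

In this sketch, DecomposeRaw on U+01D8 yields the pair <U+00FC, U+0301> and stops there, while DecomposeFully continues through U+00FC to reach U+0075, U+0308, U+0301. Because neither accessor takes a Canonical/Compatibility parameter, a fixed two-element output buffer is always large enough.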
Comment 6•4 years ago
Backed out for causing bustages on gtest.h:1445:11. CLOSED TREE
Backout link : https://hg.mozilla.org/integration/autoland/rev/9786ceb8a9abea6fff8418614d9a03884d66ed9c
Push with failures : https://treeherder.mozilla.org/jobs?repo=autoland&resultStatus=testfailed%2Cbusted%2Cexception%2Crunnable&revision=4ab330369412d3c00793154c007d248916923f6d&selectedTaskRun=MX3Ry8QASJC99KFAoV84vA.0
Link to failure log : https://treeherder.mozilla.org/logviewer?job_id=352413403&repo=autoland&lineNumber=14235
Comment 8•4 years ago
| bugherder | ||
https://hg.mozilla.org/mozilla-central/rev/83baefac5b38
https://hg.mozilla.org/mozilla-central/rev/984ec26e4217