Unify ClusterIterator with Grapheme segmenter
Categories
(Core :: Internationalization, task, P3)
Tracking
()
Tracking | Status | |
---|---|---|
firefox98 | --- | fixed |
People
(Reporter: m_kato, Assigned: TYLin)
References
Details
Attachments
(5 files)
I guess that ClusterIterator is Grapheme break segmenter, so It may/can be unified like line/work breaker?
Assignee | ||
Comment 1•3 years ago
|
||
Sounds reasonable to me.
We also have ClusterReverseIterator()
, but ClusterReverseIterator::Next()
looks basic comparing to ClusterIterator::Next()
. I wonder if we can reimplement it using std::reverse_iterator
or ReverseIterator
on ClusterIterator
.
Comment 2•3 years ago
|
||
Agreed, ClusterIterator
is basically a grapheme break segmenter, and could be unified.
If the segmenter is bidirectional (supports Prev
as well as Next
), then it should be able to replace ClusterReverseIterator
as well. Currently the reverse iterator is less complete/correct than the forward one, as you noticed. The reverse version is only used in a couple of places and the limitations of the implementation haven't particularly mattered to us afaik, but I don't think there would be any problem with making them more symmetrical if it's convenient to share the underlying implementation.
(I don't think we can directly use std::reverse_iterator
or ReverseIterator
as things stand, because they'd require the underlying iterator to be bidirectional or at least to support a decrement operator (like a Prev
method), which ClusterIterator
doesn't currently have.)
Assignee | ||
Comment 3•3 years ago
|
||
I'm planning to take a look starting this week.
For the forward iterator, I think we can use a similar approach like bug 1722484. That is, we can use existing ClusterIterator
to implement class GraphemeClustersBreakIteratorUtf16
in Segmenter.h and rewrite the callers.
For the reverse iterator, maybe we can replace the caller with forward ones like we did for line breaker in bug 1733009 because we currently don't have the ability to iterate the grapheme clusters backwards in ICU4X anyway. If the reverse iterator is critical, then like Jonathan suggested in comment 2, we should probably keep it, and investigate sharing the underlying implementation with ClusterIterator
.
Assignee | ||
Updated•3 years ago
|
Assignee | ||
Comment 4•3 years ago
|
||
This patch doesn't change the behavior. Just move the code around.
Assignee | ||
Comment 5•3 years ago
|
||
Include "nsLayoutUtils.h" in nsFileControlFrame to get rid of warnings in my
editor because it uses utilities such as nsLayoutUtils::AppUnitWidthOfString
.
We compile it without issues because of unified build.
Depends on D135639
Assignee | ||
Comment 6•3 years ago
|
||
Depends on D135640
Assignee | ||
Comment 7•3 years ago
|
||
Span
is more versatile than const String&
, and existing callers passing a
String into this function do not need any change.
This patch makes the next part simpler.
Depends on D135641
Assignee | ||
Comment 8•3 years ago
|
||
This is the main patch for the bug. It aims to change the grapheme cluster
break's Next()
API by implementing SegmentIteratorUtf16 interface, and adapt
the callers. It shouldn't change the behavior.
While rewriting the caller, one caveat worth mentioning is the loop termination
condition. If the old code relies on !AtEnd()
as the loop termination
condition, and it advances the iterator at the end of the loop, it meant
to skip its logic when the break position is at the end of the string. For
example, see the mozTXTToHTMLConv::NumberOfMatches
.
This patch also hooks grapheme cluster break iterator into
Segmenter::TryCreate() interface.
Existing test coverage for the file changed:
- netwerk/test/unit/test_mozTXTToHTMLConv.js
- layout/reftests/forms/input/file/dynamic-max-width.html
Depends on D135642
Comment 10•3 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/8031a294bc7c
https://hg.mozilla.org/mozilla-central/rev/a35cd10a12ec
https://hg.mozilla.org/mozilla-central/rev/15139f534811
https://hg.mozilla.org/mozilla-central/rev/195a3e5a485d
https://hg.mozilla.org/mozilla-central/rev/1df57205c907
Description
•