Remove WordBreaker::BreakInBetween()
Categories
(Core :: Internationalization, task)
Tracking
()
Tracking | Status | |
---|---|---|
firefox94 | --- | fixed |
People
(Reporter: TYLin, Assigned: TYLin)
References
Details
Attachments
(6 files)
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review |
This is part of bug 1722484 comment 1 to simplify WordBreaker's API.
Assignee | ||
Comment 1•3 years ago
|
||
nextChar gets its value from DecodeChar() and PeekNextChar(), and both
functions return char32_t. It also passes as an char32_t argument to
BreakInBetween(). Therefore it really should have 32 bits to avoid
potential truncation.
Depends on D125148
Updated•3 years ago
|
Assignee | ||
Comment 2•3 years ago
|
||
nsFindContentIterator was removed in Bug 1455891.
https://hg.mozilla.org/mozilla-central/rev/2eca66a467ee
Depends on D125149
Assignee | ||
Comment 3•3 years ago
|
||
The motivation of this patch is to remove rarely used API in
WordBreaker. WordBreaker::BreakInBetween() is used only in
nsFind::BreakInBetween() in production, and it can be replaced by
Next().
If the user wants to know whether there is a word break between two
strings such as the use cases in gtest, joining the two strings and
passing the result to Next() is the preferred way.
Depends on D125150
Comment 4•3 years ago
|
||
Comment 5•3 years ago
|
||
The extra patch here adds some empty fragments; I confirmed locally that this passes with the current WordBreaker::BreakInBetween(), but fails when Part 4 above is used:
Expected equality of these values:
"This^ ^is^ ^a^ ^internationalization^ ^work^."
NS_ConvertUTF16toUTF8(result).get()
Which is: "Th^is^ ^is^ ^a^ ^int^ernationalization^ ^work^."
This shows that the modified implementation is now finding (spurious) breaks at the empty fragments.
Assignee | ||
Comment 6•3 years ago
|
||
Nice catch. Thanks for constructing a testcase that expose the problem in D125151. It also exposes an existing bug in TestFindWordBreakFromPosition
that it doesn't consider fragN
can be an empty string. As a result, it crashes with your testcase because WordBreaker::FindWord()
doesn't accept an empty string as the input.
I don't see why WordBreaker::FindWord()
cannot take an empty string because it returns a range with mBegin
and mEnd
set to aTextLen + 1
. I'll post a separate patch to address this problem by either making the tests consider empty string or improving the error story of WordBreaker::FindWord()
.
Assignee | ||
Comment 7•3 years ago
|
||
The function is added in Bug 1728708 Part 4, but it never get called, so
this patch transforms it into a test in WordBreak test suite to make it
run.
While I'm here, other individual functions are also transformed into
tests so that we can have more granular results if some of them failed.
Assignee | ||
Comment 8•3 years ago
|
||
- Rename arguments so that their names are consistent with Next().
- Make the function not assert on an empty string, i.e. aLen == 0, like
Next(). - Fix an undefined behavior when the user passes aTextLen == aOffset.
The methods used to accessaText[aOffset]
that is clearly out of range
because the string may not be null-terminated. After this patch, it
returns a sentinel WordRange when aLen == aPos. - Add document and gtest TestFindWordWithEmptyString().
Depends on D125150
Updated•3 years ago
|
Updated•3 years ago
|
Pushed by aethanyc@gmail.com: https://hg.mozilla.org/integration/autoland/rev/315c1c0f7946 Part 1 - Run TestNextWordBreakWithEmptyString as a test. r=jfkthame https://hg.mozilla.org/integration/autoland/rev/564e130b51c5 Part 2 - Change nextChar from char16_t to char32_t in nsFind::Find(). r=jfkthame https://hg.mozilla.org/integration/autoland/rev/c5607a57f3d7 Part 3 - Remove unused declaration in nsFind.h. r=jfkthame https://hg.mozilla.org/integration/autoland/rev/2e3ddecd9328 Part 4 - Clean up and fix an edge case of FindWord(). r=jfkthame https://hg.mozilla.org/integration/autoland/rev/08ee0dfa5b83 Part 5 - Remove WordBreaker::BreakInBetween(). r=jfkthame https://hg.mozilla.org/integration/autoland/rev/d74edfc1d3ed Part 6 - Add some empty fragments to the word-breaker test data. r=TYLin
Comment 10•3 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/315c1c0f7946
https://hg.mozilla.org/mozilla-central/rev/564e130b51c5
https://hg.mozilla.org/mozilla-central/rev/c5607a57f3d7
https://hg.mozilla.org/mozilla-central/rev/2e3ddecd9328
https://hg.mozilla.org/mozilla-central/rev/08ee0dfa5b83
https://hg.mozilla.org/mozilla-central/rev/d74edfc1d3ed
Comment 11•2 years ago
|
||
Could this bug have changed end-of-line selection behaviour in the editor which in turn causes Thunderbird Bug 1779846 where spellcheck fails on such selections?
Assignee | ||
Comment 12•2 years ago
•
|
||
Yes, I think Part 4 regresses Bug 1779846.
Description
•