Closed Bug 1728708 Opened 3 years ago Closed 3 years ago

Clean up for lwbrk WordBreaker and its gtest

Status:

RESOLVED FIXED

Milestone:

94 Branch

Tracking Flags:

firefox94: tracking ---, status fixed

People

(Reporter: TYLin, Assigned: TYLin)

Attachments

(4 files)

Bug 1728708 Part 1 - Move WordBreakClass and GetClass into WordBreaker's private section. 3 years ago Ting-Yu Lin [:TYLin] (PST, UTC-8) 48 bytes, text/x-phabricator-request
Bug 1728708 Part 2 - Rename WordBreaker::NextWord() to WordBreaker::Next(). 3 years ago Ting-Yu Lin [:TYLin] (PST, UTC-8) 48 bytes, text/x-phabricator-request
Bug 1728708 Part 3 - Clean up the gtest for line and word breaker. 3 years ago Ting-Yu Lin [:TYLin] (PST, UTC-8) 48 bytes, text/x-phabricator-request
Bug 1728708 Part 4 - Simplify WordBreaker::Next() and make it recognize the end of text a word break opportunity. 3 years ago Ting-Yu Lin [:TYLin] (PST, UTC-8) 48 bytes, text/x-phabricator-request

Ting-Yu Lin [:TYLin] (PST, UTC-8)

Assignee

Description

3 years ago

This implements part of my proposal in bug 1722484 comment 1.

Ting-Yu Lin [:TYLin] (PST, UTC-8)

Assignee

Comment 1

3 years ago

Attached file Bug 1728708 Part 1 - Move WordBreakClass and GetClass into WordBreaker's private section.

Ting-Yu Lin [:TYLin] (PST, UTC-8)

Assignee

Comment 2

3 years ago

Attached file Bug 1728708 Part 2 - Rename WordBreaker::NextWord() to WordBreaker::Next().

Depends on D124301

Ting-Yu Lin [:TYLin] (PST, UTC-8)

Assignee

Comment 3

3 years ago

Attached file Bug 1728708 Part 3 - Clean up the gtest for line and word breaker.

Here are the changes in this patch. They shouldn't change the behavior.

- Rename the gtest to TestBreak.cpp because it also contains word break tests.
- Align ruler comments to the test strings.
- Rename lb to wb in TestASCIIWB.
- Remove unused variable j in TestPrintWordWithBreak().
- Use ArrayLength instead of the sizeof trick to get the array length.
- #include ArrayUtils.h, and sort the #include statements.
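
To illustrate the ArrayLength change listed above, here is a minimal sketch. It assumes a stand-in ArrayLength template written in standard C++ so that it compiles outside the Gecko tree (the patch itself uses the helper from ArrayUtils.h), and kTestStrings is a hypothetical table, not the real gtest data.

```cpp
#include <cstddef>

// Stand-in for the ArrayLength helper from mozilla/ArrayUtils.h, written
// here only so the snippet is self-contained. It accepts real arrays only;
// passing a pointer is a compile error.
template <typename T, std::size_t N>
constexpr std::size_t ArrayLength(T (&)[N]) {
  return N;
}

// Hypothetical test table (an assumption for illustration).
static const char* const kTestStrings[] = {
    "This is a test.",
    "Hello world",
    "a b c",
};

// The sizeof trick: also compiles when the operand has decayed to a
// pointer, in which case it silently yields the wrong count.
constexpr std::size_t kCountOld =
    sizeof(kTestStrings) / sizeof(kTestStrings[0]);

// The ArrayLength form: same value, but type-checked.
constexpr std::size_t kCountNew = ArrayLength(kTestStrings);

static_assert(kCountOld == kCountNew, "both idioms agree on a real array");
```

Both forms give the same count for a true array; the difference is that the template version rejects a pointer argument at compile time instead of computing a meaningless value.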

Depends on D124302

Ting-Yu Lin [:TYLin] (PST, UTC-8)

Assignee

Comment 4

3 years ago

Attached file Bug 1728708 Part 4 - Simplify WordBreaker::Next() and make it recognize the end of text a word break opportunity.

A UAX29-compatible word breaker (like ICU4C) treats the end of text as a
word break opportunity (rule WB2 [1]), but the lwbrk word breaker
currently doesn't.

The motivation of this patch is to make WordBreaker::Next() closer to
a UAX29-compatible one (at least for English text), and to see whether
the callers need to change. This should make the future integration of
the ICU4X segmenter easier.

The only caller of WordBreaker::Next() is ClusterIterator's constructor.
This patch shouldn't change its behavior because we've already manually
assigned a word break point at the end of the line when aContext is
empty and aDirection is -1. This patch generalizes it to all
conditions.

Also, update TestPrintWordWithBreak() so that the result string makes
more sense.

[1] https://www.unicode.org/reports/tr29/#WB2
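
The WB2 behavior described above can be sketched with a toy breaker. This is not the lwbrk implementation: ToyClass, GetToyClass, ToyNext, and kNoBreak are hypothetical names, the character classes are reduced to three ASCII-only categories, and the "no further break" sentinel is an assumption made for the example.

```cpp
#include <cstddef>
#include <string_view>

// Toy character classes (assumption); the real WordBreakClass differs.
enum class ToyClass { Space, AlphaNum, Other };

constexpr ToyClass GetToyClass(char c) {
  if (c == ' ' || c == '\t' || c == '\n') {
    return ToyClass::Space;
  }
  if ((c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
      (c >= 'a' && c <= 'z')) {
    return ToyClass::AlphaNum;
  }
  return ToyClass::Other;
}

// Hypothetical sentinel meaning "no break after this position".
constexpr std::size_t kNoBreak = static_cast<std::size_t>(-1);

// Returns the offset of the next break strictly after aPos, or kNoBreak
// when aPos is already at or past the end of the text.
constexpr std::size_t ToyNext(std::string_view aText, std::size_t aPos) {
  if (aPos >= aText.size()) {
    return kNoBreak;
  }
  const ToyClass cls = GetToyClass(aText[aPos]);
  std::size_t i = aPos + 1;
  // Skip the run of characters sharing the class of the one at aPos.
  while (i < aText.size() && GetToyClass(aText[i]) == cls) {
    ++i;
  }
  // When the run reaches the end, i equals aText.size(): the end of text
  // is reported as a break opportunity, in the spirit of rule WB2.
  return i;
}
```

For the text "Hello world", calls from offsets 0, 5, and 6 return 5, 6, and 11. The last value is the text length, so a caller no longer needs a special case to place a break at the end; only a call at offset 11 returns kNoBreak.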

Depends on D124303

Pulsebot

Comment 5

3 years ago

Pushed by aethanyc@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/b1e7fd7b879a
Part 1 - Move WordBreakClass and GetClass into WordBreaker's private section. r=jfkthame
https://hg.mozilla.org/integration/autoland/rev/3a38d2e56bf8
Part 2 - Rename WordBreaker::NextWord() to WordBreaker::Next(). r=jfkthame
https://hg.mozilla.org/integration/autoland/rev/ccfb78f756bf
Part 3 - Clean up the gtest for line and word breaker. r=jfkthame
https://hg.mozilla.org/integration/autoland/rev/55efff2d5628
Part 4 - Simplify WordBreaker::Next() and make it recognize the end of text a word break opportunity. r=jfkthame

Cristian Tuns

Comment 6

3 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/b1e7fd7b879a
https://hg.mozilla.org/mozilla-central/rev/3a38d2e56bf8
https://hg.mozilla.org/mozilla-central/rev/ccfb78f756bf
https://hg.mozilla.org/mozilla-central/rev/55efff2d5628

Status: ASSIGNED → RESOLVED

Closed: 3 years ago

status-firefox94: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → 94 Branch

Magnus Melin [:mkmelin]

Updated

3 years ago

Blocks: 1729682

Categories

(Core :: Internationalization, task)

Security

(public)
