Closed Bug 1729407 Opened 3 years ago Closed 3 years ago

Cache friendly accessible text implementation

Tracking

()

Status:

RESOLVED FIXED

Milestone:

94 Branch

Tracking Flags:

Tracking

Status

firefox94

---

fixed

People

(Reporter: Jamie, Assigned: Jamie)

References

(Blocks 2 open bugs)

Details

Attachments

(6 files, 2 obsolete files)

Bug 1729407 part 1: Add new TextLeafPoint and TextLeafRange classes. 3 years ago James Teh [:Jamie] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1729407 part 2: Add TextLeafPoint support for finding line start boundaries for LocalAccessibles. 3 years ago James Teh [:Jamie] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1729407 part 3: Add TextLeafPoint support for finding word boundaries. 3 years ago James Teh [:Jamie] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1729407 part 4: Add TextLeafPoint support for finding character boundaries. 3 years ago James Teh [:Jamie] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1729407 part 5: Add TextLeafRange::Expand to move the range points to the closest boundaries of the given type. 3 years ago James Teh [:Jamie] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1729407 part 5: Use TextLeafRange for word/line starts in HyperTextAccessible::TextAtOffset if the cache is enabled. 3 years ago James Teh [:Jamie] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1729407 part 6: Fix an existing line boundary test. 3 years ago James Teh [:Jamie] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1729407: PrevWordBreakClassWalker::IsStartOfGroup: Don't advance mOffset if PrevChar fails. 3 years ago James Teh [:Jamie] 48 bytes, text/x-phabricator-request		Details \| Review

James Teh [:Jamie]

Assignee

Description

•

3 years ago

To support text in the parent process a11y cache, we need to cache any information we need from layout, but also cache as little information as possible. Our current HyperTextAccessible implementation is not really suitable:

Because of embedded objects, each ancestor/descendant HyperTextAccessible has its own "view" of a given boundary. That means we'd potentially end up querying layout multiple times for the same boundary.
For the same reason, this would probably be less memory efficient in the cache, since we're storing multiple representations of the same boundary.
We'd have to cache word offsets as well as line offsets.
Because of the extreme complexity of calculating boundaries using HyperTextAccessible (walking in and out of embedded objects to retrieve a single word or line, etc.), Mac doesn't use this. If the cache was based on HyperTextAccessible, we'd have to completely rethink the Mac implementation.
The same complexity would cause problems when we want to support UIA Automation.
Said complexity has caused a lot of obscure bugs and quirks which are extremely difficult to debug and fix.
HyperTextAccessible relies on nsIFrame::PeekOffset. This has a lot of obscure quirks which have caused us problems in the past.

In this bug, we will implement new TextLeafPoint and TextLeafRange classes for text navigation. These will eventually replace TextRange and TextPoint. Rather than using HyperTextAccessible, these will use leaf Accessibles and offsets therein. HyperTextAccessible will eventually be a thin wrapper around these new classes, translating from/to HyperTextAccessible offsets only when absolutely necessary.

We need information from layout to calculate line boundaries, so we'll provide cache friendly methods to retrieve those, using layout only where absolutely necessary. We can calculate words without information from layout, so we don't need to cache those and will instead provide unified word navigation for local and remote Accessibles. As much as possible, the code to find boundaries should be unified for local and remote.

As well as being cache friendly, my hope is that this work will resolve a lot of obscure text bugs we've collected over the years (or at least make them easier to fix).

James Teh [:Jamie]

Assignee

Updated

•

3 years ago

Depends on: 1729412

James Teh [:Jamie]

Assignee

Comment 1

•

3 years ago

Attached file Bug 1729407 part 1: Add new TextLeafPoint and TextLeafRange classes. — Details

James Teh [:Jamie]

Assignee

Comment 2

•

3 years ago

Attached file Bug 1729407 part 2: Add TextLeafPoint support for finding line start boundaries for LocalAccessibles. — Details

This adds TextLeafPoint::FindBoundary, which will be the main entry point for finding single boundaries.
FindBoundary searches individual Accessibles for a boundary, walking the a11y tree and continuing the search if a boundary isn't found.

Support for line start boundaries uses layout to determine line edges, but otherwise traverses the accessibility tree.
This avoids depending on PeekOffset, which has a lot of quirks that cause problems for a11y.
More importantly, it's possible to find line boundaries within a single LocalAccessible using the lower level FindPrev/NextLineStartSameLocalAcc methods.
This will be useful for line boundary caching, since we want to cache this info on individual Accessibles.
In contrast, PeekOffset might walk outside the current Accessible, which makes caching more difficult.

James Teh [:Jamie]

Assignee

Comment 3

•

3 years ago

Attached file Bug 1729407 part 3: Add TextLeafPoint support for finding word boundaries. — Details

This implementation uses WordBreaker and a bunch of code to emulate layout's handling of punctuation, space, words spanning across nodes, words broken due to lines, etc.

While this does mean we're effectively duplicating layout, there are a couple of advantages.
First, PeekOffset has a lot of quirks which have caused us problems in the past and are difficult to debug.
They don't matter so much for other consumers, but a11y is very sensitive to these kinds of bugs.
Having our own logic means we can make changes as needed without potentially affecting other PeekOffset consumers.
Second, while the implementation currently only works on LocalAccessibles (because we don't have text in RemoteAccessible yet), it's been designed so that it should be very easy to adapt for RemoteAccessible.
It already walks the unified Accessible tree, so it just has to be taught how to get text from RemoteAccessible once that's possible.
This means we don't have to cache word boundaries; they can be calculated on demand in the parent process.

James Teh [:Jamie]

Assignee

Comment 4

•

3 years ago

Attached file Bug 1729407 part 4: Add TextLeafPoint support for finding character boundaries. — Details

Again, this should be very easy to adapt to RemoteAccessible once we cache text there.

James Teh [:Jamie]

Assignee

Comment 5

•

3 years ago

Attached file Bug 1729407 part 5: Add TextLeafRange::Expand to move the range points to the closest boundaries of the given type. (obsolete) — Details

James Teh [:Jamie]

Assignee

Comment 6

•

3 years ago

Attached file Bug 1729407 part 5: Use TextLeafRange for word/line starts in HyperTextAccessible::TextAtOffset if the cache is enabled. — Details

We won't use HyperTextAccessible for actual caching.
However, we have a lot of existing single process mochitests for HyperTextAccessible.
Putting it behind this pref makes it easy for us to run those tests against this new text implementation.
Eventually, we want to use this implementation for both local and remote Accessibles.

James Teh [:Jamie]

Assignee

Comment 7

•

3 years ago

•

Edited

https://treeherder.mozilla.org/#/jobs?repo=try&revision=ef743f0c4e58b3396f35fa4229ab9f0fc559acfd

James Teh [:Jamie]

Assignee

Updated

•

3 years ago

Blocks: 1730085

James Teh [:Jamie]

Assignee

Updated

•

3 years ago

Blocks: 1730086

James Teh [:Jamie]

Assignee

Updated

•

3 years ago

Blocks: 1730090

James Teh [:Jamie]

Assignee

Updated

•

3 years ago

Blocks: 1730093

Phabricator Automation

Updated

•

3 years ago

Attachment #9239809 - Attachment is obsolete: true

James Teh [:Jamie]

Assignee

Updated

•

3 years ago

Depends on: 1730809

James Teh [:Jamie]

Assignee

Updated

•

3 years ago

Blocks: 1730096

Phabricator Automation

Updated

•

3 years ago

Attachment #9239810 - Attachment description: Bug 1729407 part 6: Use TextLeafRange for word/line starts in HyperTextAccessible::TextAtOffset if the cache is enabled. → Bug 1729407 part 5: Use TextLeafRange for word/line starts in HyperTextAccessible::TextAtOffset if the cache is enabled.

James Teh [:Jamie]

Assignee

Comment 8

•

3 years ago

Attached file Bug 1729407 part 6: Fix an existing line boundary test. — Details

Previously, the test asserted that the line offsets for an embedded object should be an empty range.
This is actually incorrect, since there is content on the line (an empty text box and a button).
Update this with the correct result, but mark it as an expected failure.
The new implementation fixes this, so once we enable it by default, we can assert success.

James Teh [:Jamie]

Assignee

Comment 9

•

3 years ago

•

Edited

https://treeherder.mozilla.org/#/jobs?repo=try&revision=55d6fed6cbc938f5625910bef7de8a078a8dcadf

James Teh [:Jamie]

Assignee

Updated

•

3 years ago

Blocks: 1730862

James Teh [:Jamie]

Assignee

Updated

•

3 years ago

Blocks: 1730868

Pulsebot

Comment 10

•

3 years ago

Pushed by jteh@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/f08b66f1bfc2 part 1: Add new TextLeafPoint and TextLeafRange classes. r=eeejay https://hg.mozilla.org/integration/autoland/rev/25c73785043e part 2: Add TextLeafPoint support for finding line start boundaries for LocalAccessibles. r=eeejay https://hg.mozilla.org/integration/autoland/rev/1bdee10f6384 part 3: Add TextLeafPoint support for finding word boundaries. r=eeejay https://hg.mozilla.org/integration/autoland/rev/58281d3cac4c part 4: Add TextLeafPoint support for finding character boundaries. r=eeejay https://hg.mozilla.org/integration/autoland/rev/2c4dc811beb9 part 5: Use TextLeafRange for word/line starts in HyperTextAccessible::TextAtOffset if the cache is enabled. r=eeejay https://hg.mozilla.org/integration/autoland/rev/9fb2520417b2 part 6: Fix an existing line boundary test. r=eeejay

Sandor Molnar[:smolnar]

Comment 11

•

3 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/f08b66f1bfc2
https://hg.mozilla.org/mozilla-central/rev/25c73785043e
https://hg.mozilla.org/mozilla-central/rev/1bdee10f6384
https://hg.mozilla.org/mozilla-central/rev/58281d3cac4c
https://hg.mozilla.org/mozilla-central/rev/2c4dc811beb9
https://hg.mozilla.org/mozilla-central/rev/9fb2520417b2

Status: NEW → RESOLVED

Closed: 3 years ago

status-firefox94: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → 94 Branch

James Teh [:Jamie]

Assignee

Updated

•

3 years ago

Blocks: a11y-ctw-text
No longer blocks: texta11y, a11y-ctw

James Teh [:Jamie]

Assignee

Updated

•

3 years ago

Blocks: 1734582

James Teh [:Jamie]

Assignee

Updated

•

3 years ago

Blocks: 1735464

James Teh [:Jamie]

Assignee

Comment 12

•

3 years ago

Attached file Bug 1729407: PrevWordBreakClassWalker::IsStartOfGroup: Don't advance mOffset if PrevChar fails. (obsolete) — Details

IsStartOfGroup wants to peek at the previous character.
It calls PrevChar, but it doesn't check for failure.
It then unconditionally increments mOffset to "undo" PrevChar.
If PrevChar failed, that means we're now 1 character after where we started.
This wasn't causing any test failures and I can't think of a test case that shows breakage here.
Nevertheless, it definitely doesn't make sense and could cause difficult-to-diagnose bugs.

Phabricator Automation

Comment 13

•

3 years ago

Comment on attachment 9245660 [details]
Bug 1729407: PrevWordBreakClassWalker::IsStartOfGroup: Don't advance mOffset if PrevChar fails.

Revision D128332 was moved to bug 1735464. Setting attachment 9245660 [details] to obsolete.

Attachment #9245660 - Attachment is obsolete: true

James Teh [:Jamie]

Assignee

Updated

•

8 months ago

Blocks: uiatext

You need to log in before you can comment on or make changes to this bug.