[CTW] HyperTextAccessibleBase::OffsetAtPoint is very slow with large text containers
Categories
(Core :: Disability Access APIs, defect)
Tracking
()
People
(Reporter: Jamie, Assigned: Jamie)
References
(Blocks 1 open bug)
Details
(Whiteboard: [ctw-m6])
Attachments
(2 files)
STR:
- Open this test case (a paragraph with 20000 characters):
data:text/html,<div id="div"><p id="p"></p><script>for (let i = 0; i < 10000; ++i) p.textContent += 'a ';</script>
- Ensure the document is focused.
- Open the Browser Console (control+shift+j).
- Paste this code:
serv = Cc["@mozilla.org/accessibilityService;1"].getService(Ci.nsIAccessibilityService); doc = serv.getAccessibleFor(document.activeElement).firstChild; p = doc.firstChild.firstChild.QueryInterface(Ci.nsIAccessibleText); p.getOffsetAtPoint(0, 0, Ci.nsIAccessibleCoordinateType.COORDTYPE_PARENT_RELATIVE);
- Expected: The response should be nearly instantaneous.
- Actual: The browser hangs for more than 5 seconds.
This happens because OffsetAtPoint indirectly calls CharBounds on every character, which calls BoundsWithOffset for each character. BoundsWithOffset walks the Accessible ancestry for every call. This gives us O(n^2).
In theory, we should be able to avoid most of these BoundsWithOffset calls because many of them will occur on the same Accessible. If a text leaf has 20000 characters, we only need to fetch that leaf's bounds once. In practice, this is a bit tricky because there might be more than one text leaf in a HyperText container.
I can think of two possible solutions.
Solution 1:
- Give TextLeafPoint::CharBounds an option to return coordinates relative to the leaf. Note that this is subtly different to COORDTYPE_PARENT_RELATIVE. Parent relative coordinates would be relative to the parent Accessible. We want them relative to the leaf itself so we don't have to do any acc bounds calculation at all.
- In OffsetAtPoint, we'd get the bounds for the TextLeafPoint's mAcc, but only if the mAcc hasn't changed since the last character. Then we'd add the self relative CharBounds.
- But we probably can't just add the CharBounds if there's a transform on mAcc. I'm not sure how to deal with that. Applying the transform in OffsetAtPoint seems awful.
- This is kinda ugly. :(
Solution 2:
- Lazily cache the calculated bounds for each Accessible in a hash map on the document. This way, BoundsWithOffset can just early return those if they're already calculated.
- Whenever a bounds cache update occurs (on any Accessible), clear the entire hash map. This is a naive caching strategy, but it should work well enough.
- This avoids OffsetAtPoint or CharBounds needing to manage any of this.
- We still have the transform problem, though.
Assignee | ||
Comment 1•2 years ago
|
||
Morgan, I'd love your thoughts on this when you get a chance.
Assignee | ||
Comment 2•2 years ago
|
||
For the second solution (caching calculated bounds), I guess we could cache the bounds up to this Accessible. BoundsWithOffset could still do the bit where it applies the offset and then the transform. Then it'd just add the cached calculated bounds after that.
But... arrrgh! Actually, none of this will work because transforms from ancestors also affect the character rects. Once we cache any calculated coordinates, we've lost the granularity needed to transform things correctly.
Assignee | ||
Comment 3•2 years ago
|
||
Can we somehow convert the requested point into a point relative to the leaf's mAcc? We'd then compare with leaf relative character rects directly. This is in contrast to the current approach where we have to get a screen rect for every character.
Assignee | ||
Comment 4•2 years ago
|
||
Note to self: profile before making assumptions about where the real problem lies.
O(n^2) is not ideal, but I don't know if we're going to be able to fix that due to transforms. However, this profile suggests that memory allocation might be the real bottleneck here. When we get the character bounds array (which we do for every single character), we have to convert it from an array of numbers to an array of rects. There are two problems with this:
- We don't specify the capacity of the array (which we can calculate), so each append causes reallocation.
- Ideally, we wouldn't build the rect array at all, as we're doing a lot of processing we don't need to there.
Assignee | ||
Updated•2 years ago
|
Assignee | ||
Updated•2 years ago
|
Assignee | ||
Comment 5•2 years ago
|
||
When hit testing, we retrieve the bounds for every character in a text leaf in a separate call.
Because we need to convert from an array of flat ints to rects, this previously meant we built the entire array for every character.
For a large text leaf, building this rect array is relatively expensive.
We only need a single rect at a time.
Therefore, RemoteAccessibleBase now returns the rect for a single requested character instead of building an array of rects.
Assignee | ||
Comment 6•2 years ago
|
||
For the viewport cache and character rects, we have at least a reasonable estimate of the number of elements that will be required.
Pre-allocating these saves on potentially costly re-allocation as we append elements.
Comment 8•2 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/8b31039c5cab
https://hg.mozilla.org/mozilla-central/rev/57f7ab1f9f72
Updated•2 years ago
|
Comment 9•2 years ago
|
||
Reproduced the issue on Firefox 112.0a1 under macOS 13.3 by following the STR from Comment 0.
The issue is fixed on Firefox 112.0b9 and Firefox 113.0a1 (2023-04-02). Tests were performed on macOS 13.3, Windows 11 and Ubuntu 22.04.
Description
•