Open
Bug 1498473
Opened 7 years ago
Updated 3 years ago
ATKStringConverterHelper adds BOMs needlessly inefficiently
Categories
(Core :: Disability Access APIs, enhancement)
Core
Disability Access APIs
Tracking
()
NEW
People
(Reporter: hsivonen, Unassigned)
References
Details
https://searchfox.org/mozilla-central/source/accessible/atk/DOMtoATK.cpp
- It uses g_strdup. However, the length of the string to be duplicated is known, so g_malloc + memcpy would be more efficient.
- The loop at https://searchfox.org/mozilla-central/source/accessible/atk/DOMtoATK.cpp#36 writes to an XPCOM string that's later copied via g_strdup instead of writing directly to a g_malloc-allocated buffer (saving one copy).
- The code searches for astral characters twice, which is inefficient when there are no astral characters or only a couple emoji. We should probably assume those are the common cases and the case where the whole page is in an astral-plane script is rare.
I suggest searching for the astral characters once, storing the BOM insertion points in nsAutoTArray, using g_malloc to allocate a target of the right size and then memcpying stuff over based on the offsets in the nsAutoTArray.
Reporter | ||
Comment 1•7 years ago
|
||
AFAICT, if a substring of an accessible is requested, AdjustOffsets expands the substring by one in both directions and then after the conversion one character is dropped from the ends of the converted string. From bug 1346535, I gather this has to do with selection (set by script?) falling between a surrogate pair.
Is there a reason why the string is expanded and then contracted without looking at the characters at the start and the end of the range to see if they are unpaired surrogates?
Flags: needinfo?(samuel.thibault)
Comment 2•7 years ago
|
||
> I suggest searching for the astral characters once, storing the BOM insertion points in nsAutoTArray
Just to comment on this: I agree that this is the most common case, DOMtoATK is however only used while a screen reader is operating, according to user's text requests. These are not *so* performance critical, so I would not advise to try to optimize the implementation too much if it makes it significantly less maintainable.
> From bug 1346535, I gather this has to do with selection (set by script?) falling between a surrogate pair.
Yes, this is because the screen reader may be requesting a selection whose boundary falls within a surrogate pair.
> Is there a reason why the string is expanded and then contracted without looking at the characters at the start and the end of the range to see if they are unpaired surrogates?
I'd say: keeping the implementation less complex than it could be :)
As a first change, AdjustOffsets could probably be passed aAccessible so it can peek at the text to determine whether it has to shift the boundaries or not (provided that we are sure that the text will not change between this peek and getting an actual string), which will avoid the head/tail skips in FinishUTF16toUTF8.
Flags: needinfo?(samuel.thibault)
Updated•3 years ago
|
Severity: normal → S3
You need to log in
before you can comment on or make changes to this bug.
Description
•