Open Bug 1498473 Opened 7 years ago Updated 3 years ago

ATKStringConverterHelper adds BOMs needlessly inefficiently

Tracking

()

Status:

NEW

People

(Reporter: hsivonen, Unassigned)

References

Details

Henri Sivonen (:hsivonen)

Reporter

Description

•

7 years ago

https://searchfox.org/mozilla-central/source/accessible/atk/DOMtoATK.cpp - It uses g_strdup. However, the length of the string to be duplicated is known, so g_malloc + memcpy would be more efficient. - The loop at https://searchfox.org/mozilla-central/source/accessible/atk/DOMtoATK.cpp#36 writes to an XPCOM string that's later copied via g_strdup instead of writing directly to a g_malloc-allocated buffer (saving one copy). - The code searches for astral characters twice, which is inefficient when there are no astral characters or only a couple emoji. We should probably assume those are the common cases and the case where the whole page is in an astral-plane script is rare. I suggest searching for the astral characters once, storing the BOM insertion points in nsAutoTArray, using g_malloc to allocate a target of the right size and then memcpying stuff over based on the offsets in the nsAutoTArray.

Henri Sivonen (:hsivonen)

Reporter

Comment 1

•

7 years ago

AFAICT, if a substring of an accessible is requested, AdjustOffsets expands the substring by one in both directions and then after the conversion one character is dropped from the ends of the converted string. From bug 1346535, I gather this has to do with selection (set by script?) falling between a surrogate pair. Is there a reason why the string is expanded and then contracted without looking at the characters at the start and the end of the range to see if they are unpaired surrogates?

Flags: needinfo?(samuel.thibault)

Samuel Thibault

Comment 2

•

7 years ago

> I suggest searching for the astral characters once, storing the BOM insertion points in nsAutoTArray Just to comment on this: I agree that this is the most common case, DOMtoATK is however only used while a screen reader is operating, according to user's text requests. These are not *so* performance critical, so I would not advise to try to optimize the implementation too much if it makes it significantly less maintainable. > From bug 1346535, I gather this has to do with selection (set by script?) falling between a surrogate pair. Yes, this is because the screen reader may be requesting a selection whose boundary falls within a surrogate pair. > Is there a reason why the string is expanded and then contracted without looking at the characters at the start and the end of the range to see if they are unpaired surrogates? I'd say: keeping the implementation less complex than it could be :) As a first change, AdjustOffsets could probably be passed aAccessible so it can peek at the text to determine whether it has to shift the boundaries or not (provided that we are sure that the text will not change between this peek and getting an actual string), which will avoid the head/tail skips in FinishUTF16toUTF8.

Flags: needinfo?(samuel.thibault)

BMO Automation

Updated

•

3 years ago

Severity: normal → S3

You need to log in before you can comment on or make changes to this bug.

Bugzilla

ATKStringConverterHelper adds BOMs needlessly inefficiently

Categories

(Core :: Disability Access APIs, enhancement)

Tracking

()

People

(Reporter: hsivonen, Unassigned)

References

Details

Crash Data

Security

(public)

User Story

Description

Comment 1

Comment 2

Updated