Originally I chose not to filter out empty strings from utterances, on the thought that ATs could use some understanding of how the utterance is constructed to modify their presentation. For example, on a focus event, strings at odd indexes are names or values, and strings at even indexes are roles or descriptions. The problem is that TalkBack seems to delimit each element of the utterance with a space, so if you land on plain text, the utterance (in Android, AccessibilityEvent.getText()) would be ['', 'some text']. This would be sent to the TTS as ' some text'. The leading whitespace delays the speech, so users get a laggy experience (well, a more laggy experience :)
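To make the leading-whitespace problem concrete, here is a minimal Python sketch. It assumes TalkBack effectively joins the utterance elements with a single space, which is only what the observed behavior suggests, not something confirmed from its source. The name join_for_tts is hypothetical and exists only for this illustration.

```python
def join_for_tts(utterance):
    """Join utterance elements with single spaces.

    Assumption: this mimics how TalkBack appears to build the string
    it hands to the TTS engine. It is not TalkBack's actual code.
    """
    return " ".join(utterance)


# Landing on plain text: the even-index slot (role/description) is empty.
spoken = join_for_tts(["", "some text"])
print(repr(spoken))  # ' some text' (note the leading space)
```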
Created attachment 621234: Remove empty strings from utterances. This should do it. Nothing too fancy. It feels liberating not to have to return two elements each time. Later on I might suggest not clumping roles and states into the preceding "description" element, and keeping them separate instead. That raises interesting questions about grammar boundaries and how things should be localized.
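The idea behind the change can be sketched in a few lines of Python. This is an illustration of the filtering step only, not the attached patch itself; remove_empty_strings is a hypothetical name, and the space join at the end is an assumption about how the consumer assembles the spoken string.

```python
def remove_empty_strings(utterance):
    """Return the utterance with empty-string elements dropped."""
    return [part for part in utterance if part]


filtered = remove_empty_strings(["", "some text"])
print(filtered)                  # ['some text']
print(repr(" ".join(filtered)))  # 'some text' (no leading space)
```

The trade-off is that positions in the list no longer carry meaning: once empty slots are dropped, a consumer cannot tell a name from a role by index alone.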
I want to go over some high-level stuff with you before review (probably Monday).