Closed
Bug 384101
Opened 17 years ago
Closed 8 years ago
text.getTextAtOffset broken for TEXT_BOUNDARY_LINE_START
Categories
(Core :: Disability Access APIs, defect)
RESOLVED
INCOMPLETE
People
(Reporter: wwalker, Assigned: ginnchen+exoracle)
References
(Depends on 1 open bug, Blocks 2 open bugs)
Details
(Keywords: access, Whiteboard: [auto-closed:inactivity])
In GNOME bugzilla bug http://bugzilla.gnome.org/show_bug.cgi?id=355525, I was tracking down why a certain feature of Orca wasn't working correctly. It turns out that the URL I was using as a test case (http://bugzilla.gnome.org/attachment.cgi?id=83911) contained a document whose text included embedded object and new line characters.
To help debug this in Orca, I added the following code to examine the text of the document frame. It simply walks the text of the document frame character by character, calling getTextAtOffset for each character position:
# Walk the document frame one character at a time and print the line
# that Gecko reports for each offset.
if accessible.role == rolenames.ROLE_DOCUMENT_FRAME:
    for i in range(0, length):
        character = self.script.getText(accessible, i, i + 1)
        if character == self.script.EMBEDDED_OBJECT_CHARACTER:
            character = "EMBEDDED_OBJECT_CHARACTER"
        elif character == "\n":
            character = "\\n"
        print "%d. '%s'" % (i, character)
        [string, startOffset, endOffset] = text.getTextAtOffset(
            i,
            atspi.Accessibility.TEXT_BOUNDARY_LINE_START)
        print " line(%d, %d) = '%s'" \
              % (startOffset, endOffset, string)
For each character in the text of the document frame, the output shows the index of the character, the character itself, and what Gecko thinks the line is for that character, including the line's start and end offsets. Things start failing around character 19, which is the 'T' that begins the line "This sentence is bold." Instead of the results shown here, I would expect getTextAtOffset with TEXT_BOUNDARY_LINE_START to return the entire line. Here's the sample output:
0. 'EMBEDDED_OBJECT_CHARACTER'
line(0, 2) = '
'
1. '\n'
line(0, 2) = '
'
2. '\n'
line(2, 3) = '
'
3. 'EMBEDDED_OBJECT_CHARACTER'
line(3, 4) = ''
4. 'EMBEDDED_OBJECT_CHARACTER'
line(5, 18) = 'Text Formats
'
5. 'T'
line(5, 18) = 'Text Formats
'
6. 'e'
line(5, 18) = 'Text Formats
'
7. 'x'
line(5, 18) = 'Text Formats
'
8. 't'
line(5, 18) = 'Text Formats
'
9. ' '
line(5, 18) = 'Text Formats
'
10. 'F'
line(5, 18) = 'Text Formats
'
11. 'o'
line(5, 18) = 'Text Formats
'
12. 'r'
line(5, 18) = 'Text Formats
'
13. 'm'
line(5, 18) = 'Text Formats
'
14. 'a'
line(5, 18) = 'Text Formats
'
15. 't'
line(5, 18) = 'Text Formats
'
16. 's'
line(5, 18) = 'Text Formats
'
17. '\n'
line(5, 18) = 'Text Formats
'
18. '\n'
line(18, 19) = '
'
19. 'T'
line(18, 20) = '
T'
20. 'h'
line(18, 19) = '
'
21. 'i'
line(18, 19) = '
'
22. 's'
line(18, 19) = '
'
23. ' '
line(18, 19) = '
'
24. 's'
line(18, 19) = '
'
25. 'e'
line(18, 19) = '
'
26. 'n'
line(18, 19) = '
'
27. 't'
line(18, 19) = '
'
28. 'e'
line(18, 19) = '
'
29. 'n'
line(18, 19) = '
'
30. 'c'
line(18, 19) = '
'
31. 'e'
line(18, 19) = '
'
32. ' '
line(18, 19) = '
'
33. 'i'
line(18, 19) = '
'
34. 's'
line(18, 19) = '
'
35. ' '
line(18, 19) = '
'
36. 'b'
line(18, 19) = '
'
37. 'o'
line(18, 19) = '
'
38. 'l'
line(18, 19) = '
'
39. 'd'
line(18, 19) = '
'
40. '.'
line(18, 19) = '
'
41. 'EMBEDDED_OBJECT_CHARACTER'
line(41, 42) = ''
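One way to spot the failing offsets mechanically is to check that each reported line range actually contains the offset that was queried. This is only a minimal sketch: it assumes the (offset, startOffset, endOffset) triples have already been collected from the loop above, and the helper name find_suspect_offsets is hypothetical, not part of Orca or AT-SPI.

```python
def find_suspect_offsets(results):
    """Return offsets whose reported line range does not contain them.

    `results` is an iterable of (offset, start_offset, end_offset)
    triples, as printed by the debugging loop.
    """
    suspects = []
    for offset, start, end in results:
        # A line containing `offset` should satisfy start <= offset < end.
        if not (start <= offset < end):
            suspects.append(offset)
    return suspects


# A few triples copied from the sample output above.
sample = [
    (5, 5, 18),
    (17, 5, 18),
    (18, 18, 19),
    (19, 18, 20),
    (20, 18, 19),
    (40, 18, 19),
]
print(find_suspect_offsets(sample))
```

On these triples the check flags offsets 20 and 40. It does not flag offset 19, whose range contains the offset but is truncated, so containment is a necessary condition rather than a complete test.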
Reporter
Updated•17 years ago
Assignee: nobody → aaronleventhal
Component: Disability Access → Disability Access APIs
Product: Firefox → Core
QA Contact: disability.access → accessibility-apis
Comment 1•17 years ago
An oddity can be seen using Accerciser on the test page (2nd link in opening comment). The accessible at 0 4 8 0 0 2 is a ghost accessible (not a link) with no role or text. In addition, the second and third lines of text are not shown in the accessible tree.
Comment 2•17 years ago
After examining the markup, I suspect the nasty <br> bug is to blame.
Comment 3•17 years ago
Here's a JS version of the problem with some of the unnecessary stuff stripped out:
http://www.mozilla.org/access/qa/linestest/jstest
Comment 4•17 years ago
Correctly spelled:
http://www.mozilla.org/access/qa/linetest/jstest
I'm working on this bug now, but if you want to look at the JS testcase, wait until the mozilla.org website updates (about 45 minutes).
Comment 5•17 years ago
Interesting, I didn't think newlines in the source affect anything unless it's a <pre> or styled with white-space: pre.
This has the problem:
<b><br>
</b><b>123</b>
This doesn't:
<b><br></b><b>123</b>
The top one has an extra newline.
Comment 6•17 years ago
In fact the <b> around 123 doesn't matter, so this has the problem too:
<b><br>
</b>123
That's the smallest testcase I can find which still has the problem.
Comment 7•17 years ago
The problem also exists if there is a space instead of a newline, after the <br>.
<b><br> </b>123
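For side by side comparison, the markup variants from comments 5 through 7 could be gathered into one small set of test pages. The following is only a sketch: the variant names and the build_page helper are made up for illustration, the minimal html/body wrapper is an assumption, and the broken/ok flags simply repeat what those comments report.

```python
# Markup variants quoted in comments 5 through 7. Whether each one shows
# the problem is taken from those comments, not re-verified here.
VARIANTS = [
    ("br-newline-bold", "<b><br>\n</b><b>123</b>", True),
    ("br-no-newline", "<b><br></b><b>123</b>", False),
    ("br-newline-plain", "<b><br>\n</b>123", True),
    ("br-space-plain", "<b><br> </b>123", True),
]


def build_page(markup):
    """Wrap a markup fragment in a minimal HTML document."""
    return "<html><body>" + markup + "</body></html>"


for name, markup, reported_broken in VARIANTS:
    status = "broken" if reported_broken else "ok"
    print("%s (%s): %r" % (name, status, build_page(markup)))
```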
Comment 8•17 years ago
I think the issue is in DOMPointToOffset().
We're giving it 0 for the content offset.
Broken case: it returns 0 for hyperTextOffset
Working case: it returns 1 for hyperTextOffset
Comment 9•17 years ago
In other words, we're not getting past the <br> in the hypertext.
Comment 10•17 years ago
Yikes! Gecko thinks the start of the line is before the <br>.
That causes another problem -- if you go into the middle of the line and hit Home, it jumps up a line!
Comment 11•17 years ago
Filed bug 384452 on the core Gecko problem causing this.
Depends on: 384452
Updated•17 years ago
Assignee: aaronleventhal → ginn.chen
Comment 12•17 years ago
If we're going to fix this in Firefox 3 we might need to find a way around bug 384452 -- e.g. look for the incorrect results and fix them in our a11y module.
Updated•17 years ago
Whiteboard: orca:normal
Reporter
Comment 13•17 years ago
Aaron - as part of our performance improvement work for Orca, we were considering using getTextAtOffset with TEXT_BOUNDARY_LINE_START. I think I recall you saying at the Boston 2007 summit that getTextAtOffset was now working.
Has this bug been fixed?
Comment 14•17 years ago
This is the last known case that is broken, and it's because caret navigation is also broken for this. It's not easy to fix this apparently. But, I still think it would be far better for Orca to use getTextAtOffset() and add error checking, because the code is stable now, and much faster than what you're currently doing with character bounds.
If you want you can make it an experimental pref at first, and keep your current code around.
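The error checking suggested here might look roughly like the sketch below. It assumes getTextAtOffset() returns a (string, startOffset, endOffset) triple as in the opening comment. The names get_line_with_check, FakeText, slow_path and LINE_START are hypothetical stand-ins for illustration, not Orca or AT-SPI API.

```python
def get_line_with_check(text, offset, boundary, fallback):
    """Return (string, start, end) for the line at `offset`.

    Calls text.getTextAtOffset() and accepts the result only if the
    returned range contains `offset`; otherwise defers to `fallback`,
    a callable standing in for the existing extents-based code.
    """
    string, start, end = text.getTextAtOffset(offset, boundary)
    if start <= offset < end:
        return string, start, end
    return fallback(offset)


class FakeText(object):
    """Stand-in for an accessible text object, for illustration only."""

    def __init__(self, answers):
        self.answers = answers

    def getTextAtOffset(self, offset, boundary):
        return self.answers[offset]


def slow_path(offset):
    """Placeholder for the slower character-bounds approach."""
    return ("<from extents>", offset, offset + 1)


LINE_START = 0  # placeholder for the real TEXT_BOUNDARY_LINE_START constant
fake = FakeText({5: ("Text Formats\n", 5, 18), 20: ("\n", 18, 19)})

print(get_line_with_check(fake, 5, LINE_START, slow_path))
print(get_line_with_check(fake, 20, LINE_START, slow_path))
```

With the two ranges taken from the sample output, offset 5 is served by the fast path and offset 20 falls back, which keeps the existing code reachable for the one known broken case.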
Comment 15•17 years ago
I'm changing the whiteboard status from orca:normal to orca:urgent. I'm in the midst of converting Orca away from using extents to obtain line contents, and it would be nice to minimize (eliminate?) the need for special-casing things.
Thanks in advance!
Whiteboard: orca:normal → orca:urgent
Assignee
Comment 16•17 years ago
Aaron, are you working on this bug?
Comment 17•17 years ago
Ginn, actually no. I think I told Will that I could, but I will not deal with core selection internals right now.
Comment 18•17 years ago
I'm not sure if the following is another instance of the same bug or something different, so I'm starting here. :-)
If you go to: https://buy.garmin.com/shop/shop.do?cID=134 and examine any of the table cells that contain checkboxes using Accerciser, you should find the following to be true for TEXT_BOUNDARY_LINE_START:
* getTextAtOffset() works as expected given:
  * an offset of 0 (link that's the image)
  * an offset of 1 (link that contains a section)
* getTextAtOffset() with an offset of 2 (the checkbox) causes the full cell contents to be returned.
Comment 19•16 years ago
Friendly ping. :-)
Updated•16 years ago
OS: Linux → All
Hardware: PC → All
Updated•14 years ago
Flags: in-testsuite?
Comment 20•8 years ago
AUTO-CLOSED. This bug untouched for over 2000 days. Please reopen if you can confirm the bug and help it progress.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → INCOMPLETE
Whiteboard: orca:urgent → [auto-closed:inactivity]