Closed Bug 961737 Opened 10 years ago Closed 10 years ago

Regression: atk_text_get_text() is broken w.r.t. embedded object characters

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla30

People

(Reporter: jdiggs, Assigned: surkov)

References

(Blocks 1 open bug)

Details

Attachments

(4 files, 2 obsolete files)

accessible-event listener listtext.py 10 years ago Joanmarie Diggs 1.51 KB, text/x-python		Details
part1: patch1 10 years ago alexander :surkov (:asurkov) 5.83 KB, patch		Details \| Diff \| Splinter Review
part1:patch2 10 years ago alexander :surkov (:asurkov) 3.40 KB, patch	tbsaunde : review+	Details \| Diff \| Splinter Review
part2_cleanup:patch1 10 years ago alexander :surkov (:asurkov) 15.79 KB, patch		Details \| Diff \| Splinter Review
part2_cleanup:patch2 10 years ago alexander :surkov (:asurkov) 17.86 KB, patch	tbsaunde : review+	Details \| Diff \| Splinter Review
part3 (list items):patch1 10 years ago alexander :surkov (:asurkov) 14.15 KB, patch	tbsaunde : review+	Details \| Diff \| Splinter Review

Joanmarie Diggs

Reporter

Description

•

10 years ago

Attached file accessible-event listener listtext.py — Details

Steps to reproduce:
1. Load the attached accessible-event listener in a terminal
2. (Re)load the test case (attachment 827653 [details])
3. Compare the results from Firefox 24 and nightly.

Things that implement AtkText and claim to have text (string, character count) used to report that text when using atk_get_text_at_offset(). That seems to no longer be the case when the text consists of one or more embedded object characters. Only getting a character works. See results below.

I am admittedly not a fan of digging down via the accessible text interface trying to piece together lines and implementing an Orca-controlled caret for Gecko content. ;) BUT, given that this seems necessary, whatever change is responsible for this new behavior, it breaks stuff for Orca. And even if, as part of the Great Orca Rewrite I'm doing, I solve this in Orca, that's not going to solve things for users of previous versions of Orca. So I'm going to assume this is an accidental regression, but if it is by design I still think it needs to be reversed for those "long term stable release" folks.

==================
Firefox 24 results
==================
-> [document frame | List item bug] (0, 137, 1080, 1756)
   textinfo: offset: 0 length: 1 string: [OBJ]
   lineinfo: <[OBJ]> (0, 1) (8, 145, 1064, 52)
   wordinfo: <[OBJ]> (0, 1) (8, 145, 1064, 52)
   charinfo: <[OBJ]> (0, 1) (8, 145, 1064, 52)
    -> [list | ] (8, 145, 1064, 52)
       textinfo: offset: -1 length: 2 string: [OBJ]       
       lineinfo: <[OBJ]> (0, 1) (48, 145, 1024, 26)
       wordinfo: <[OBJ]> (0, 1) (48, 145, 1024, 26)
       charinfo: <[OBJ]> (0, 1) (48, 145, 1024, 26)
        -> [list item | • Item 1] (32, 145, 1040, 26)
           textinfo: offset: -1 length: 7 string: •Item 1
           lineinfo: <•Item 1> (0, 7) (32, 145, 65, 26)
           wordinfo: <1> (6, 7) (88, 145, 9, 26)
           charinfo: <•> (0, 1) (32, 155, 8, 10)
        -> [list item | • Item 2] (32, 171, 1040, 26)
           textinfo: offset: -1 length: 7 string: •Item 2
           lineinfo: <•Item 2> (0, 7) (32, 171, 71, 26)
           wordinfo: <2> (6, 7) (93, 171, 10, 26)
           charinfo: <•> (0, 1) (32, 181, 8, 10)

===============
Nightly results
===============
-> [document frame | List item bug] (0, 110, 1080, 1783)
   textinfo: offset: 0 length: 1 string: [OBJ]
   lineinfo: <> (-1, -1) (0, 0, 0, 0)
   wordinfo: <> (0, 0) (0, 0, 0, 0)
   charinfo: <[OBJ]> (0, 1) (8, 118, 1064, 52)
    -> [list | ] (8, 118, 1064, 52)
       textinfo: offset: -1 length: 2 string: [OBJ]
       lineinfo: <> (-1, -1) (0, 0, 0, 0)
       wordinfo: <> (1, 1) (0, 0, 0, 0)
       charinfo: <[OBJ]> (0, 1) (48, 118, 1024, 26)
        -> [list item | • Item 1] (32, 118, 1040, 26)
           textinfo: offset: -1 length: 7 string: •Item 1
           lineinfo: <•Item 1> (0, 7) (32, 118, 65, 26)
           wordinfo: <> (1, 1) (0, 0, 0, 0)
           charinfo: <•> (0, 1) (32, 128, 16, 10)
        -> [list item | • Item 2] (32, 144, 1040, 26)
           textinfo: offset: -1 length: 7 string: •Item 2
           lineinfo: <•Item 2> (0, 7) (32, 144, 71, 26)
           wordinfo: <•> (0, 1) (32, 154, 16, 10)
           charinfo: <•> (0, 1) (32, 154, 16, 10)

alexander :surkov (:asurkov)

Assignee

Comment 1

•

10 years ago

I think it's reasonable if text consisting of embedded char returns the char when line is requested

alexander :surkov (:asurkov)

Assignee

Comment 2

•

10 years ago

Not sure what wanted behavior is though. Say if we have

<body>
  <p>first paragraph</p>
  <p>second paragrahp</p>
</body>

so the document text is two embed characters. Say if the first paragraph takes multiple lines then what is getTextAtOffset(0, LINE_BOUNDARY) supposed to return? Should it be (0, 1) or (0, 0)?

Flags: needinfo?(jamie)

James Teh [:Jamie]

Comment 3

•

10 years ago

(In reply to alexander :surkov from comment #2)
> so the document text is two embed characters. Say if the first paragraph
> takes multiple lines then what is getTextAtOffset(0, LINE_BOUNDARY) supposed
> to return? Should it be (0, 1) or (0, 0)?
It should be (0, 1), which indicates that the embedded object consumes at least one entire line. (0, 0) is a collapsed range suggesting that the line contains nothing, which is incorrect.

Flags: needinfo?(jamie)

alexander :surkov (:asurkov)

Assignee

Comment 4

•

10 years ago

Attached patch part1: patch1 (obsolete) — Details — Splinter Review

part1: fix a generic (non list) case

Assignee: nobody → surkov.alexander

Attachment #8363303 - Flags: review?(trev.saunders)

alexander :surkov (:asurkov)

Assignee

Comment 5

•

10 years ago

Attached patch part1:patch2 — Details — Splinter Review

Attachment #8363303 - Attachment is obsolete: true

Attachment #8363303 - Flags: review?(trev.saunders)

Attachment #8363305 - Flags: review?(trev.saunders)

Trevor Saunders (:tbsaunde)

Comment 6

•

10 years ago

Comment on attachment 8363305 [details] [diff] [review]
part1:patch2

>+    addTextOffset = static_cast<uint32_t>(
>+      aIsEndOffset && (addTextOffset > 0 || descendantAcc->IndexInParent() > 0));

imho that's a really hard way to read that.

if (aIsEndOffset)
  addOffset = addOffset > 0 || child->IndexInParent() > 0 ? 1 : 0;
else
  addOffset = 0

is a lot clearer even though its longer.  It also seems like it would be nice if addOffset could be a bool if we should add one or not, not a number to add.

Also that's going to make that loop a good bit slower,  maybe check if its the first content child? I guess that's not a great idea.  Maybe we should just finally devirtualize GetChildAt() and then use FirstChild()

>+      testTextAtOffset(["ht_5" ], BOUNDARY_LINE_START,
>+                       [ [ 0, 0, kEmbedChar, 0, 1 ] ]);

why not test BOUNDARY_LINE_END isn't that what you fix?

Attachment #8363305 - Flags: review?(trev.saunders) → review+

alexander :surkov (:asurkov)

Assignee

Comment 7

•

10 years ago

(In reply to Trevor Saunders (:tbsaunde) from comment #6)
> Comment on attachment 8363305 [details] [diff] [review]
> part1:patch2
> 
> >+    addTextOffset = static_cast<uint32_t>(
> >+      aIsEndOffset && (addTextOffset > 0 || descendantAcc->IndexInParent() > 0));
> 
> imho that's a really hard way to read that.
> 
> if (aIsEndOffset)
>   addOffset = addOffset > 0 || child->IndexInParent() > 0 ? 1 : 0;
> else
>   addOffset = 0
> 
> is a lot clearer even though its longer.

ok if you want

>  It also seems like it would be
> nice if addOffset could be a bool if we should add one or not, not a number
> to add.

addTextOffset may also be used to keep an offset in text node

> Also that's going to make that loop a good bit slower,  maybe check if its
> the first content child? I guess that's not a great idea.  Maybe we should
> just finally devirtualize GetChildAt() and then use FirstChild()

instead IndexInParent? not sure

> >+      testTextAtOffset(["ht_5" ], BOUNDARY_LINE_START,
> >+                       [ [ 0, 0, kEmbedChar, 0, 1 ] ]);
> 
> why not test BOUNDARY_LINE_END isn't that what you fix?

original test is LINE_START, iirc _END consts were finally deprecated so I don't longer worry about it

Trevor Saunders (:tbsaunde)

Comment 8

•

10 years ago

(In reply to alexander :surkov from comment #7)
> (In reply to Trevor Saunders (:tbsaunde) from comment #6)
> > Comment on attachment 8363305 [details] [diff] [review]
> > part1:patch2
> > 
> > >+    addTextOffset = static_cast<uint32_t>(
> > >+      aIsEndOffset && (addTextOffset > 0 || descendantAcc->IndexInParent() > 0));
> > 
> > imho that's a really hard way to read that.
> > 
> > if (aIsEndOffset)
> >   addOffset = addOffset > 0 || child->IndexInParent() > 0 ? 1 : 0;
> > else
> >   addOffset = 0
> > 
> > is a lot clearer even though its longer.
> 
> ok if you want

thx, my eyes bleed on the other form ;)

> >  It also seems like it would be
> > nice if addOffset could be a bool if we should add one or not, not a number
> > to add.
> 
> addTextOffset may also be used to keep an offset in text node

yeah, I guess this is text stuff its doomed to being hard to read.

> > Also that's going to make that loop a good bit slower,  maybe check if its
> > the first content child? I guess that's not a great idea.  Maybe we should
> > just finally devirtualize GetChildAt() and then use FirstChild()
> 
> instead IndexInParent? not sure

yeah, just trying to think of way to make this less slow.

> > >+      testTextAtOffset(["ht_5" ], BOUNDARY_LINE_START,
> > >+                       [ [ 0, 0, kEmbedChar, 0, 1 ] ]);
> > 
> > why not test BOUNDARY_LINE_END isn't that what you fix?
> 
> original test is LINE_START, iirc _END consts were finally deprecated so I
> don't longer worry about it

hm, and windows / jsat don't use them? I guess that's fine, just looks kind of unbalanced.

alexander :surkov (:asurkov)

Assignee

Comment 9

•

10 years ago

(In reply to Trevor Saunders (:tbsaunde) from comment #8)

> > ok if you want
> 
> thx, my eyes bleed on the other form ;)

I cannot say I disliked the previous form :)

> > >  It also seems like it would be
> > > nice if addOffset could be a bool if we should add one or not, not a number
> > > to add.
> > 
> > addTextOffset may also be used to keep an offset in text node
> 
> yeah, I guess this is text stuff its doomed to being hard to read.

heritage :)

> > instead IndexInParent? not sure
> 
> yeah, just trying to think of way to make this less slow.

yep, shouldn't be bottleneck though

> > original test is LINE_START, iirc _END consts were finally deprecated so I
> > don't longer worry about it
> 
> hm, and windows / jsat don't use them? I guess that's fine, just looks kind
> of unbalanced.

agree. windows yes, jsat is quite likely too, just patching saving some time for urgent stuff

alexander :surkov (:asurkov)

Assignee

Comment 10

•

10 years ago

Comment on attachment 8363305 [details] [diff] [review]
part1:patch2

landed http://hg.mozilla.org/integration/mozilla-inbound/rev/700b3557403d

alexander :surkov (:asurkov)

Assignee

Updated

•

10 years ago

Whiteboard: [leave open]

Carsten Book [:Tomcat]

Comment 11

•

10 years ago

https://hg.mozilla.org/mozilla-central/rev/700b3557403d

alexander :surkov (:asurkov)

Assignee

Comment 12

•

10 years ago

Attached patch part2_cleanup:patch1 (obsolete) — Details — Splinter Review

make cleaning up of DOMPointToOffset, the result code is not really equivalent to old one but mochitest suite passes. CharacterCount() as fallback value in DOMPointToOffset is primarily needed for "stranges" of PeekOffset work.

Attachment #8365545 - Flags: review?(trev.saunders)

alexander :surkov (:asurkov)

Assignee

Comment 13

•

10 years ago

Attached patch part2_cleanup:patch2 — Details — Splinter Review

Attachment #8365545 - Attachment is obsolete: true

Attachment #8365545 - Flags: review?(trev.saunders)

Attachment #8366114 - Flags: review?(trev.saunders)

alexander :surkov (:asurkov)

Assignee

Comment 14

•

10 years ago

Attached patch part3 (list items):patch1 — Details — Splinter Review

Attachment #8368950 - Flags: review?(trev.saunders)

Trevor Saunders (:tbsaunde)

Comment 15

•

10 years ago

Comment on attachment 8366114 [details] [diff] [review]
part2_cleanup:patch2

>+HyperTextAccessible::DOMPointToOffset(nsINode* aNode, int32_t aNodeOffset,
>+                                      bool aIsEndOffset) const
> {
>-  if (!aHyperTextOffset)
>-    return nullptr;
>-  *aHyperTextOffset = 0;
>-
>   if (!aNode)
>-    return nullptr;
>+    return 0;

when can it happen? can we just ban it?

A>+  // If the given DOM point cannot be mapped into offset relative this hypertext
>+  // offset then return length as fallback value.

when exactly is this needed?

Its really hard to say if this is exactly correct, but it doesn't seem particularly unreasonable.

Attachment #8366114 - Flags: review?(trev.saunders) → review+

alexander :surkov (:asurkov)

Assignee

Comment 16

•

10 years ago

(In reply to Trevor Saunders (:tbsaunde) from comment #15)
> >+HyperTextAccessible::DOMPointToOffset(nsINode* aNode, int32_t aNodeOffset,
> >+                                      bool aIsEndOffset) const
> >   if (!aNode)
> >-    return nullptr;
> >+    return 0;
> 
> when can it happen? can we just ban it?

somebody relies on it, I saw crashes

> 
> A>+  // If the given DOM point cannot be mapped into offset relative this
> hypertext
> >+  // offset then return length as fallback value.
> 
> when exactly is this needed?

when you do PeekOffset on input element then you can get outside input element, that was a case iirc

> Its really hard to say if this is exactly correct, but it doesn't seem
> particularly unreasonable.

the thesis is common for text interface stuff, we increase test coverage and one day it will guarantee the correct work

Trevor Saunders (:tbsaunde)

Comment 17

•

10 years ago

Comment on attachment 8368950 [details] [diff] [review]
part3 (list items):patch1

>+  int32_t offset = aOffset;
>+  Accessible* descendant = aDescendant;
>+  while (descendant && !descendant->IsDoc()) {

shouldn't we always find this before we get to a document? why can't we just assert that?

>+    // HTML list items needs special processing because PeekOffset doesn't work
>+    // with list bullets.

please file a bug to fix it

>+            (aAmount == eSelectBeginLine || aAmount == eSelectLine)) {
>+          if (text != this)
>+            return TransformOffset(text, 0, false);
>+
>+          // If we are in list item text then fall down into original procedure
>+          // since text may be multiline.

ending the block with a comment seems weird on top of this already being ubber ugly

>+        }
>+

extra blank line

>+      } else {
>+        // We assume that bullet is one single char.

can we assert that?

>+          // Ask a text leaf next to the bullet for an offset since list item
>+          // may be multiline.
>+          return aOffset + 1 < CharacterCount() ?

when is this not true?

>-  if (hyperTextOffset == CharacterCount() && aDirection == eDirPrevious)
>-    return 0;
>+  if (aDirection == eDirPrevious) {
>+    // If we reached the end during search, this means we didn't find the DOM point
>+    // and we're actually at the start of the paragraph

when is this ok? if there's text there shouldn't we have accessibles?


can you make sure we test list items with no bullet and test stuff with :before and :after?

Attachment #8368950 - Flags: review?(trev.saunders) → review+

alexander :surkov (:asurkov)

Assignee

Comment 18

•

10 years ago

Comment on attachment 8366114 [details] [diff] [review]
part2_cleanup:patch2

https://hg.mozilla.org/integration/mozilla-inbound/rev/a70d1fe8d510

alexander :surkov (:asurkov)

Assignee

Comment 19

•

10 years ago

(In reply to Trevor Saunders (:tbsaunde) from comment #17)

> >+  int32_t offset = aOffset;
> >+  Accessible* descendant = aDescendant;
> >+  while (descendant && !descendant->IsDoc()) {
> 
> shouldn't we always find this before we get to a document? why can't we just
> assert that?

it's sounds good

> >+    // HTML list items needs special processing because PeekOffset doesn't work
> >+    // with list bullets.
> 
> please file a bug to fix it

I'm not sure this a bug since keyboard skips list bullets from navigation

> >+            (aAmount == eSelectBeginLine || aAmount == eSelectLine)) {
> >+          if (text != this)
> >+            return TransformOffset(text, 0, false);
> >+
> >+          // If we are in list item text then fall down into original procedure
> >+          // since text may be multiline.
> 
> ending the block with a comment seems weird on top of this already being
> ubber ugly

ok, I'll remove the comment since it doesn't really seem useful


> >+      } else {
> >+        // We assume that bullet is one single char.
> 
> can we assert that?

assert is not really needed since this is obviously wrong assumption (for example in case of list-style-type:upper-roman style). I changed it to comment: "It works only when the bullet is one single char." I will file a bug.

> >+          // Ask a text leaf next to the bullet for an offset since list item
> >+          // may be multiline.
> >+          return aOffset + 1 < CharacterCount() ?
> 
> when is this not true?

no text list items, I added "(if not empty)" after "Ask a text leaf next"

> >-  if (hyperTextOffset == CharacterCount() && aDirection == eDirPrevious)
> >-    return 0;
> >+  if (aDirection == eDirPrevious) {
> >+    // If we reached the end during search, this means we didn't find the DOM point
> >+    // and we're actually at the start of the paragraph
> 
> when is this ok? if there's text there shouldn't we have accessibles?

I guess it's something related with PeekOffset work, it's full of surprises

> can you make sure we test list items with no bullet and test stuff with
> :before and :after?

:before and :after will be broken since PeekOffset presumably don't work with them. I could have a test for no bullet case but it rather belongs to list bullets bug

Ryan VanderMeulen [:RyanVM]

Comment 20

•

10 years ago

https://hg.mozilla.org/mozilla-central/rev/a70d1fe8d510

Flags: in-testsuite+

alexander :surkov (:asurkov)

Assignee

Comment 21

•

10 years ago

Comment on attachment 8368950 [details] [diff] [review]
part3 (list items):patch1

https://hg.mozilla.org/integration/mozilla-inbound/rev/826d951bd6fa

alexander :surkov (:asurkov)

Assignee

Updated

•

10 years ago

Whiteboard: [leave open]

Ryan VanderMeulen [:RyanVM]

Comment 22

•

10 years ago

https://hg.mozilla.org/mozilla-central/rev/826d951bd6fa

Status: NEW → RESOLVED

Closed: 10 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla30

You need to log in before you can comment on or make changes to this bug.