Closed Bug 613959 Opened 14 years ago Closed 14 years ago

mochitest word boundaries nsIAccessibleText

Categories

(Core :: Disability Access APIs, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla2.0b8

People

(Reporter: fherrera, Assigned: fherrera)

Details

Attachments

(1 file, 3 obsolete files)

Following:
http://www.unicode.org/reports/tr29/tr29-17.html#Word_Boundaries

we may want to test:
"one two"
"one\ntwo"
"one.two"
"one,two"
"345"
"3a A4"
"3.1416"
"4,261.01"
"カタカナ"
"Peter's car"
"N.A.T.O."
"3+4*5=23"

I'm note sure about different languages considerations like:

Break between apostrophe and vowels (French, Italian)
(In reply to comment #0)

> "3.1416"
> "4,261.01"

is it a word?

> "カタカナ"

especially I like this one :)

> "Peter's car"

good

> "N.A.T.O."

amazing, are we going to expose it as one word?

> Break between apostrophe and vowels (French, Italian)

would be nice, but up to you, I'm not a polyglot :)

it would be nice to see a sentence like "Hello. Friend, are you here?!" where we have ., and spaces, just a real case.
(In reply to comment #1)
> (In reply to comment #0)
> 
> > "3.1416"
> > "4,261.01"
> 
> is it a word?

It should be according to Unicode A#29:

Do not break within sequences, such as “3.2” or “3,456.789”.

WB11.	Numeric (MidNum | MidNumLet)	×	Numeric
WB12.	Numeric	×	(MidNum | MidNumLet) Numeric


> > "N.A.T.O."
> 
> amazing, are we going to expose it as one word?

That is let as an implementation decision:
To allow acronyms like “U.S.A.”, a tailoring may include U+002E full stop in ExtendNumLet.

so I guess it is optional.


> it would be nice to see a sentence like "Hello. Friend, are you here?!" where
> we have ., and spaces, just a real case.

Agree. What do you think it is the cleanest way of writing test?

For example something like this:

setText("div", "Peter's car");
testWordCount("div", 2, kOk);

and implementing testWordCount using get_text_at_offset internally (getting word at 0, then getting next word at endoffset,...)

and something like this for the sentence cases:
setText("div", "Hello. Friend, are you here?!");
testWordNum("div", 0, "Hello", kOk);
testWordNum("div", 1, "Friend", kOk);

using get_text_at_offset (n,WORD_END) - overlapping chars from et_text_at_offset (n,WORD_START)
(In reply to comment #2)
> (In reply to comment #1)
> > (In reply to comment #0)
> > 
> > > "3.1416"
> > > "4,261.01"
> > 
> > is it a word?
> 
> It should be according to Unicode A#29:

> > > "N.A.T.O."
> > 
> > amazing, are we going to expose it as one word?
> 
> That is let as an implementation decision:
> To allow acronyms like “U.S.A.”, a tailoring may include U+002E full stop in
> ExtendNumLet.

if we could expose them as words then it's really amazing.

> Agree. What do you think it is the cleanest way of writing test?
> 
> For example something like this:
> 
> setText("div", "Peter's car");

perhaps it should be static to keep tests separately

> testWordCount("div", 2, kOk);
> 
> and implementing testWordCount using get_text_at_offset internally (getting
> word at 0, then getting next word at endoffset,...)
> 
> and something like this for the sentence cases:
> setText("div", "Hello. Friend, are you here?!");
> testWordNum("div", 0, "Hello", kOk);
> testWordNum("div", 1, "Friend", kOk);
> 
> using get_text_at_offset (n,WORD_END) - overlapping chars from
> et_text_at_offset (n,WORD_START)

sounds great (nit: testWordAt(Index))
(In reply to comment #3)

> 
> > Agree. What do you think it is the cleanest way of writing test?
> > 
> > For example something like this:
> > 
> > setText("div", "Peter's car");
> 
> perhaps it should be static to keep tests separately

That was my initial thought, but as in the tests we are not writing the text as an expected result, just the number we'd end up with something like this:

testWordCount("div1", 2, kOk);
testWordCount("div2", 1, kOk);
testWordCount("div3", 2, kOk);
testWordCount("div4", 1, kOk);
testWordCount("div5", 3, kOk);
testWordCount("div6", 1, kOk);
testWordCount("div7", 1, kOk);
...

and some lines after:
<div id="div1">"one.two"</div>
<div id="div2">"one,two"</div>
...

and it would be harder to read/check than with the setText/textX pairs.
If you give a comment containing a test string before each test set then perhaps it'll be ok.

setText is makes tests more complicated and basically it may not work correct because layout happens async in general.
Attached patch patch1 (obsolete) — Splinter Review
Assignee: nobody → fherrera
Status: NEW → ASSIGNED
Attachment #492433 - Flags: review?(surkov.alexander)
Comment on attachment 492433 [details] [diff] [review]
patch1


>+/**
>+ * Test word count for an element

point in the end.

>+    if (offset >= length)
>+      break;
>+    wordCount++;

new line between them.

>+ * Test word at a position for an element

point in the end of sentence.

>+ *
>+ * @param aElement        [in] Element identifier

Element -> element

>+    if (offset >= length)
>+      break;
>+    wordIndex++;

new line between them please

>+      // "one two"
>+      testWordCount("div1", 2, kOk);

I would prefer to see testWordAt following each testWordCount or you could combine them like

testWords("div1", 2, kOk,
          "one", kOk,
          "two", kOk);

or 

testWords("div1", ["one", "two"]) without todo. Or
testWords("div1", { "one": kOk, "two": kTodo })
If JS allows testWords("div1", { "one", "two" }) then we could use this as form of all kOk.
Attachment #492433 - Flags: review?(surkov.alexander) → review+
Attached patch patch2 (obsolete) — Splinter Review
Updated patch
Attachment #492433 - Attachment is obsolete: true
Attachment #493068 - Flags: review?(surkov.alexander)
* @param aCount          [in] Expected word count

nit: Expected -> expected

for kOk tests like
testWordCount("div1", 2, kOk);
testWordAt("div1", 0, "one", kOk);
testWordAt("div1", 1, "two", kOk);

I would really prefer short version like testWords("div", ["one", "two"]);

what could encapsulate all these three methods. In the end we're going to get rid all of kTodo stuffs so it's nice to have easier to read tests.
Attachment #493068 - Flags: review?(surkov.alexander) → review+
Comment on attachment 493068 [details] [diff] [review]
patch2


>+  if (wordFountAtOffset >= 0) {
>+    var text = acc.getTextAtOffset(wordFountAtOffset, BOUNDARY_WORD_END,
>+                                   startOffsetObj, endOffsetObj);
>+
>+    if (endOffsetObj.value < wordFountAtOffset) {
>+      isFunc("", aText,  "wrong start and end offset for word '" + aWordIndex+ "': " +
>+             " of text '" + acc.getText(0, -1) + "'");

that should be always is_todo(), right?
(In reply to comment #10)
> Comment on attachment 493068 [details] [diff] [review]
> patch2
> 
> 
> >+  if (wordFountAtOffset >= 0) {
> >+    var text = acc.getTextAtOffset(wordFountAtOffset, BOUNDARY_WORD_END,
> >+                                   startOffsetObj, endOffsetObj);
> >+
> >+    if (endOffsetObj.value < wordFountAtOffset) {
> >+      isFunc("", aText,  "wrong start and end offset for word '" + aWordIndex+ "': " +
> >+             " of text '" + acc.getText(0, -1) + "'");
> 
> that should be always is_todo(), right?

Well, that is going to be always a fail. I'm using (aToDoFlag == kTodo) ? todo_is : is for it because it can be an unexpected fail (if the test is marked as kOk) or an expected fail for some issues that we know are failing in the code (marked as kTodo)
(In reply to comment #11)
> (In reply to comment #10)
> > Comment on attachment 493068 [details] [diff] [review] [details]
> > patch2
> > 
> > 
> > >+  if (wordFountAtOffset >= 0) {
> > >+    var text = acc.getTextAtOffset(wordFountAtOffset, BOUNDARY_WORD_END,
> > >+                                   startOffsetObj, endOffsetObj);
> > >+
> > >+    if (endOffsetObj.value < wordFountAtOffset) {
> > >+      isFunc("", aText,  "wrong start and end offset for word '" + aWordIndex+ "': " +
> > >+             " of text '" + acc.getText(0, -1) + "'");
> > 
> > that should be always is_todo(), right?
> 
> Well, that is going to be always a fail.

that I meant actually, should we do:
todo(false,  "wrong start and end offset for word '" + 
at least it should be move evident that endOffsetObj.value < wordFountAtOffset means there's something wrong
Attached patch patch3 (obsolete) — Splinter Review
Attachment #493356 - Flags: review?
Attachment #493356 - Flags: review? → review?(surkov.alexander)
Comment on attachment 493356 [details] [diff] [review]
patch3


+/**
+ * Test word count for an element.
+ *
+ * @param aElement        [in] element identifier
+ * @param aCount          [in] Expected word count
+ * @param aToDoFlag       [in] kTodo or kOk for returned text

btw, why do use so many spaces betwen param name and its description? usually
we use two spaces after longest param name.

+    wordCount++;
+    offset = endOffsetObj.value;
+  } 

extra space in the end

+  isFunc(wordCount, aCount,  "wrong words count for '" + acc.getText(0, -1) + "': " +

extra space between arguments

+    wordIndex++;
+    offset = endOffsetObj.value;
+    if (wordIndex == aWordIndex) {
+       wordFountAtOffset = startOffsetObj.value;
+       break;
+    }
+  } 

extra spance in the end of line

+    if (endOffsetObj.value < wordFountAtOffset) {
+      todo(false,  "wrong start and end offset for word '" + aWordIndex+ "': " +

extra space between args

+           " of text '" + acc.getText(0, -1) + "'");
+      return;
+    }
+    

spaces on empty line

+    text = acc.getText(wordFountAtOffset, endOffsetObj.value);
+    isFunc(text, aText,  "wrong text for word at pos '" + aWordIndex+ "': " +

extra spaces between args

+/**
+ * Test words in a element.
+ *
+ * @param aElement        [in] element identifier
+ * @param aWords          [in] array of expected words
+ * @param aToDoFlag       [in] kTodo or kOk for returned text (optional,
+ *                             defailt to kOk)

please put optional keyword into [in] like [in, optional]

 _TEST_FILES = \
 		doc.html \
 		test_doc.html \
 		test_multiline.html \
 		test_singleline.html \
 		test_whitespaces.html \
+		test_words.html \

you might want to update it to trunk if test_multiline delayed so I'm able to
check it in early.
Attachment #493356 - Flags: review?(surkov.alexander) → review+
Attachment #493068 - Attachment is obsolete: true
Attached patch patch4Splinter Review
Updated patch with comments and rebased on top of trunk
Attachment #493356 - Attachment is obsolete: true
landed on 2.0 - http://hg.mozilla.org/mozilla-central/rev/21762517cdea
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla2.0b8
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: