Closed Bug 613959 Opened 15 years ago Closed 15 years ago

mochitest word boundaries nsIAccessibleText

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla2.0b8

People

(Reporter: fherrera, Assigned: fherrera)

Details

Attachments

(1 file, 3 obsolete files)

patch1 15 years ago Fernando Herrera 6.83 KB, patch	surkov : review+	Details \| Diff \| Splinter Review
patch2 15 years ago Fernando Herrera 8.30 KB, patch	surkov : review+	Details \| Diff \| Splinter Review
patch3 15 years ago Fernando Herrera 8.29 KB, patch	surkov : review+	Details \| Diff \| Splinter Review
patch4 15 years ago Fernando Herrera 8.42 KB, patch		Details \| Diff \| Splinter Review

Fernando Herrera

Assignee

Description

•

15 years ago

Following: http://www.unicode.org/reports/tr29/tr29-17.html#Word_Boundaries we may want to test: "one two" "one\ntwo" "one.two" "one,two" "345" "3a A4" "3.1416" "4,261.01" "カタカナ" "Peter's car" "N.A.T.O." "3+4*5=23" I'm note sure about different languages considerations like: Break between apostrophe and vowels (French, Italian)

alexander :surkov (:asurkov)

Comment 1

•

15 years ago

(In reply to comment #0) > "3.1416" > "4,261.01" is it a word? > "カタカナ" especially I like this one :) > "Peter's car" good > "N.A.T.O." amazing, are we going to expose it as one word? > Break between apostrophe and vowels (French, Italian) would be nice, but up to you, I'm not a polyglot :) it would be nice to see a sentence like "Hello. Friend, are you here?!" where we have ., and spaces, just a real case.

Fernando Herrera

Assignee

Comment 2

•

15 years ago

(In reply to comment #1) > (In reply to comment #0) > > > "3.1416" > > "4,261.01" > > is it a word? It should be according to Unicode A#29: Do not break within sequences, such as “3.2” or “3,456.789”. WB11. Numeric (MidNum | MidNumLet) × Numeric WB12. Numeric × (MidNum | MidNumLet) Numeric > > "N.A.T.O." > > amazing, are we going to expose it as one word? That is let as an implementation decision: To allow acronyms like “U.S.A.”, a tailoring may include U+002E full stop in ExtendNumLet. so I guess it is optional. > it would be nice to see a sentence like "Hello. Friend, are you here?!" where > we have ., and spaces, just a real case. Agree. What do you think it is the cleanest way of writing test? For example something like this: setText("div", "Peter's car"); testWordCount("div", 2, kOk); and implementing testWordCount using get_text_at_offset internally (getting word at 0, then getting next word at endoffset,...) and something like this for the sentence cases: setText("div", "Hello. Friend, are you here?!"); testWordNum("div", 0, "Hello", kOk); testWordNum("div", 1, "Friend", kOk); using get_text_at_offset (n,WORD_END) - overlapping chars from et_text_at_offset (n,WORD_START)

alexander :surkov (:asurkov)

Comment 3

•

15 years ago

(In reply to comment #2) > (In reply to comment #1) > > (In reply to comment #0) > > > > > "3.1416" > > > "4,261.01" > > > > is it a word? > > It should be according to Unicode A#29: > > > "N.A.T.O." > > > > amazing, are we going to expose it as one word? > > That is let as an implementation decision: > To allow acronyms like “U.S.A.”, a tailoring may include U+002E full stop in > ExtendNumLet. if we could expose them as words then it's really amazing. > Agree. What do you think it is the cleanest way of writing test? > > For example something like this: > > setText("div", "Peter's car"); perhaps it should be static to keep tests separately > testWordCount("div", 2, kOk); > > and implementing testWordCount using get_text_at_offset internally (getting > word at 0, then getting next word at endoffset,...) > > and something like this for the sentence cases: > setText("div", "Hello. Friend, are you here?!"); > testWordNum("div", 0, "Hello", kOk); > testWordNum("div", 1, "Friend", kOk); > > using get_text_at_offset (n,WORD_END) - overlapping chars from > et_text_at_offset (n,WORD_START) sounds great (nit: testWordAt(Index))

Fernando Herrera

Assignee

Comment 4

•

15 years ago

(In reply to comment #3) > > > Agree. What do you think it is the cleanest way of writing test? > > > > For example something like this: > > > > setText("div", "Peter's car"); > > perhaps it should be static to keep tests separately That was my initial thought, but as in the tests we are not writing the text as an expected result, just the number we'd end up with something like this: testWordCount("div1", 2, kOk); testWordCount("div2", 1, kOk); testWordCount("div3", 2, kOk); testWordCount("div4", 1, kOk); testWordCount("div5", 3, kOk); testWordCount("div6", 1, kOk); testWordCount("div7", 1, kOk); ... and some lines after: <div id="div1">"one.two"</div> <div id="div2">"one,two"</div> ... and it would be harder to read/check than with the setText/textX pairs.

alexander :surkov (:asurkov)

Comment 5

•

15 years ago

If you give a comment containing a test string before each test set then perhaps it'll be ok. setText is makes tests more complicated and basically it may not work correct because layout happens async in general.

Fernando Herrera

Assignee

Comment 6

•

15 years ago

Attached patch patch1 (obsolete) — Details — Splinter Review

Assignee: nobody → fherrera

Status: NEW → ASSIGNED

Attachment #492433 - Flags: review?(surkov.alexander)

alexander :surkov (:asurkov)

Comment 7

•

15 years ago

Comment on attachment 492433 [details] [diff] [review] patch1 >+/** >+ * Test word count for an element point in the end. >+ if (offset >= length) >+ break; >+ wordCount++; new line between them. >+ * Test word at a position for an element point in the end of sentence. >+ * >+ * @param aElement [in] Element identifier Element -> element >+ if (offset >= length) >+ break; >+ wordIndex++; new line between them please >+ // "one two" >+ testWordCount("div1", 2, kOk); I would prefer to see testWordAt following each testWordCount or you could combine them like testWords("div1", 2, kOk, "one", kOk, "two", kOk); or testWords("div1", ["one", "two"]) without todo. Or testWords("div1", { "one": kOk, "two": kTodo }) If JS allows testWords("div1", { "one", "two" }) then we could use this as form of all kOk.

Attachment #492433 - Flags: review?(surkov.alexander) → review+

Fernando Herrera

Assignee

Comment 8

•

15 years ago

Attached patch patch2 (obsolete) — Details — Splinter Review

Updated patch

Attachment #492433 - Attachment is obsolete: true

Attachment #493068 - Flags: review?(surkov.alexander)

alexander :surkov (:asurkov)

Comment 9

•

15 years ago

* @param aCount [in] Expected word count nit: Expected -> expected for kOk tests like testWordCount("div1", 2, kOk); testWordAt("div1", 0, "one", kOk); testWordAt("div1", 1, "two", kOk); I would really prefer short version like testWords("div", ["one", "two"]); what could encapsulate all these three methods. In the end we're going to get rid all of kTodo stuffs so it's nice to have easier to read tests.

alexander :surkov (:asurkov)

Updated

•

15 years ago

Attachment #493068 - Flags: review?(surkov.alexander) → review+

alexander :surkov (:asurkov)

Comment 10

•

15 years ago

Comment on attachment 493068 [details] [diff] [review] patch2 >+ if (wordFountAtOffset >= 0) { >+ var text = acc.getTextAtOffset(wordFountAtOffset, BOUNDARY_WORD_END, >+ startOffsetObj, endOffsetObj); >+ >+ if (endOffsetObj.value < wordFountAtOffset) { >+ isFunc("", aText, "wrong start and end offset for word '" + aWordIndex+ "': " + >+ " of text '" + acc.getText(0, -1) + "'"); that should be always is_todo(), right?

Fernando Herrera

Assignee

Comment 11

•

15 years ago

(In reply to comment #10) > Comment on attachment 493068 [details] [diff] [review] > patch2 > > > >+ if (wordFountAtOffset >= 0) { > >+ var text = acc.getTextAtOffset(wordFountAtOffset, BOUNDARY_WORD_END, > >+ startOffsetObj, endOffsetObj); > >+ > >+ if (endOffsetObj.value < wordFountAtOffset) { > >+ isFunc("", aText, "wrong start and end offset for word '" + aWordIndex+ "': " + > >+ " of text '" + acc.getText(0, -1) + "'"); > > that should be always is_todo(), right? Well, that is going to be always a fail. I'm using (aToDoFlag == kTodo) ? todo_is : is for it because it can be an unexpected fail (if the test is marked as kOk) or an expected fail for some issues that we know are failing in the code (marked as kTodo)

alexander :surkov (:asurkov)

Comment 12

•

15 years ago

(In reply to comment #11) > (In reply to comment #10) > > Comment on attachment 493068 [details] [diff] [review] [details] > > patch2 > > > > > > >+ if (wordFountAtOffset >= 0) { > > >+ var text = acc.getTextAtOffset(wordFountAtOffset, BOUNDARY_WORD_END, > > >+ startOffsetObj, endOffsetObj); > > >+ > > >+ if (endOffsetObj.value < wordFountAtOffset) { > > >+ isFunc("", aText, "wrong start and end offset for word '" + aWordIndex+ "': " + > > >+ " of text '" + acc.getText(0, -1) + "'"); > > > > that should be always is_todo(), right? > > Well, that is going to be always a fail. that I meant actually, should we do: todo(false, "wrong start and end offset for word '" + at least it should be move evident that endOffsetObj.value < wordFountAtOffset means there's something wrong

Fernando Herrera

Assignee

Comment 13

•

15 years ago

Attached patch patch3 (obsolete) — Details — Splinter Review

Attachment #493356 - Flags: review?

Fernando Herrera

Assignee

Updated

•

15 years ago

Attachment #493356 - Flags: review? → review?(surkov.alexander)

alexander :surkov (:asurkov)

Comment 14

•

15 years ago

Comment on attachment 493356 [details] [diff] [review] patch3 +/** + * Test word count for an element. + * + * @param aElement [in] element identifier + * @param aCount [in] Expected word count + * @param aToDoFlag [in] kTodo or kOk for returned text btw, why do use so many spaces betwen param name and its description? usually we use two spaces after longest param name. + wordCount++; + offset = endOffsetObj.value; + } extra space in the end + isFunc(wordCount, aCount, "wrong words count for '" + acc.getText(0, -1) + "': " + extra space between arguments + wordIndex++; + offset = endOffsetObj.value; + if (wordIndex == aWordIndex) { + wordFountAtOffset = startOffsetObj.value; + break; + } + } extra spance in the end of line + if (endOffsetObj.value < wordFountAtOffset) { + todo(false, "wrong start and end offset for word '" + aWordIndex+ "': " + extra space between args + " of text '" + acc.getText(0, -1) + "'"); + return; + } + spaces on empty line + text = acc.getText(wordFountAtOffset, endOffsetObj.value); + isFunc(text, aText, "wrong text for word at pos '" + aWordIndex+ "': " + extra spaces between args +/** + * Test words in a element. + * + * @param aElement [in] element identifier + * @param aWords [in] array of expected words + * @param aToDoFlag [in] kTodo or kOk for returned text (optional, + * defailt to kOk) please put optional keyword into [in] like [in, optional] _TEST_FILES = \ doc.html \ test_doc.html \ test_multiline.html \ test_singleline.html \ test_whitespaces.html \ + test_words.html \ you might want to update it to trunk if test_multiline delayed so I'm able to check it in early.

Attachment #493356 - Flags: review?(surkov.alexander) → review+

alexander :surkov (:asurkov)

Updated

•

15 years ago

Attachment #493068 - Attachment is obsolete: true

Fernando Herrera

Assignee

Comment 15

•

15 years ago

Attached patch patch4 — Details — Splinter Review

Updated patch with comments and rebased on top of trunk

Attachment #493356 - Attachment is obsolete: true

alexander :surkov (:asurkov)

Comment 16

•

15 years ago

landed on 2.0 - http://hg.mozilla.org/mozilla-central/rev/21762517cdea

Status: ASSIGNED → RESOLVED

Closed: 15 years ago

Flags: in-testsuite+

Resolution: --- → FIXED

Target Milestone: --- → mozilla2.0b8

You need to log in before you can comment on or make changes to this bug.