Incorrect results when get_text_at_offset() is used to get the last word on the line in wrapped text
Categories
(Core :: Disability Access APIs, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox67 | --- | fixed |
People
(Reporter: jdiggs, Assigned: samuel.thibault)
References
(Blocks 1 open bug)
Details
Attachments
(6 files, 10 obsolete files)
857 bytes,
text/plain
|
Details | |
283 bytes,
text/html
|
Details | |
2.30 KB,
patch
|
Details | Diff | Splinter Review | |
15.67 KB,
patch
|
samuel.thibault
:
review+
|
Details | Diff | Splinter Review |
17.40 KB,
patch
|
samuel.thibault
:
review+
|
Details | Diff | Splinter Review |
4.82 KB,
patch
|
samuel.thibault
:
review+
|
Details | Diff | Splinter Review |
Reporter | ||
Comment 1•12 years ago
|
||
Comment 2•12 years ago
|
||
Comment 3•12 years ago
|
||
Comment 4•12 years ago
|
||
Comment 5•12 years ago
|
||
Assignee | ||
Comment 6•7 years ago
|
||
I have dived into this one, here is what happens, taking the case when we have the text wrapped this way:
Having done so, use get_text_at_offset() to get the word at
the offset at the beginning of a line. Then repeat the test
and we request the word at offset of the space just after "at", with the start-of-word boundary:
In nsIFrame::PeekOffset's eSelectWord case, a first call is made to current->PeekOffsetWord (i.e. nsTextFrame::PeekOffsetWord) with the content of the first line. This terminates immediately because we are already at the end of line. nsIFrame::PeekOffset then calls it a second time with the content of the second line, after setting state.SetSawBeforeType(); to account for the line jump equivalent to a whitespace. In nsTextFrame::PeekOffsetWord, aState->mAtStart is still true, and thus it believes it should continue within word "the". This is because state.Update() was in the end never called.
In the attached patch, I make nsIFrame::PeekOffset call state.Update() along state.SetSawBeforeType() to account for the fact that it is like we had processed a whitespace, and we should thus now stop at the next start of word.
It is being tested on https://treeherder.mozilla.org/#/jobs?repo=try&revision=cafad117118bba9e432f3b221fc4490ee1cec266
Assignee | ||
Comment 7•7 years ago
|
||
So it unbreaks some cases, but breaks others, notably words which get split because they just can't fit on a single line. In that case we indeed do not want to have such a linebreak to be taken as a whitespace. But do we have a way to know, from nsIFrame::GetFrameFromDirection, to know whether the line jump is from a split word (thus not whitespace), or from an expected line break (thus whitespace)? Will we just always have a frame with IsBrFrame()?
Actually, looking more into details in the "at" word case with the requested offset on the 'a', in nsTextFrame::PeekOffsetWord it looks at 'a', then 'n', then gets out of the loop without looking at the space after "at". when requesting from offset of the space character after "at", the loop exits immediately (like mentioned in previous comment) because of the same reason, and the next call directly starts with the next line, and thus the space was missed. In the case of "from" the space character is looked at. I'd say to have proper behavior the space character should also be looked at, because the linebreak only comes from lack of space, and in the split word case we don't actually want to consider it as a whitespace. So the ClusterIterator in that case shouldn't actually trim the whitespaces when used from nsIFrame::PeekOffset through PeekOffsetWord, perhaps we can just add a parameter to nsPeekOffsetStruct for HyperTextAccessible::FindOffset to specify it?
Assignee | ||
Updated•7 years ago
|
Assignee | ||
Updated•7 years ago
|
Assignee | ||
Comment 8•7 years ago
|
||
Jamie, do you have any opinion on my latest comment?
Comment 9•7 years ago
|
||
(In reply to Samuel Thibault from comment #7)
Actually, looking more into details in the "at" word case with the requested offset on the 'a', in nsTextFrame::PeekOffsetWord it looks at 'a', then 'n', then gets out of the loop without looking at the space after "at". when requesting from offset of the space character after "at", the loop exits immediately (like mentioned in previous comment) because of the same reason, and the next call directly starts with the next line, and thus the space was missed. In the case of "from" the space character is looked at. I'd say to have proper behavior the space character should also be looked at, because the linebreak only comes from lack of space, and in the split word case we don't actually want to consider it as a whitespace. So the ClusterIterator in that case shouldn't actually trim the whitespaces when used from nsIFrame::PeekOffset through PeekOffsetWord, perhaps we can just add a parameter to nsPeekOffsetStruct for HyperTextAccessible::FindOffset to specify it?
Daniel, how does the suggestion look ok from layout perspective? It seems it has no effect on keyboard navigation but unblocks a11y on text extraction issue.
Comment 10•7 years ago
|
||
This is the first I've seen this PeekOffset code, so I don't really know what its invariants are & can't comment on the correctness. (Looking at the hg blame surrounding this line, it looks pretty old, though it seems jfkthame has reviewed at least one patch that touched it in the past couple years (Bug 1375825) -- so I'm going to punt the needinfo to him in the hopes that he's got a bit more orientation for this code than I do.)
Assignee | ||
Comment 11•7 years ago
|
||
Mmm, Jamie, Joanmarie, in the case where a word gets split at end of line because it cannot fit in one line anyway, do we want to make the accessibility layer think that this is a word break, or not? I'd tend to say not since that's just a visual rendering, but the tests in accessible/tests/mochitest/text/test_atcaretoffset.html say yes (but we can change that).
Reporter | ||
Comment 12•7 years ago
|
||
(In reply to Samuel Thibault from comment #11)
Mmm, Jamie, Joanmarie, in the case where a word gets split at end of line because it cannot fit in one line anyway, do we want to make the accessibility layer think that this is a word break, or not? I'd tend to say not since that's just a visual rendering, but the tests in accessible/tests/mochitest/text/test_atcaretoffset.html say yes (but we can change that).
I would tend to lean toward yes BECAUSE it's a visual rendering. That said, ask your colleagues who are blind, and perhaps on the Orca list too, what they want and expect. Then, if you make Gecko do that, and Orca just trusts Gecko, everyone will be happy. :)
Assignee | ||
Comment 13•7 years ago
|
||
Here is what it looks like, https://treeherder.mozilla.org/#/jobs?repo=try&revision=8c64b47fd864c8ee90aa614ee174642a192b6292 shows no regression, only unrelated intermitent-marked failures.
Comment 14•7 years ago
|
||
We really should have test coverage for this, so that we can verify the behavior (and detect any future regressions). Could you add appropriate tests (that fail currently, and pass with the fix) in a second patch?
Assignee | ||
Comment 15•7 years ago
|
||
Here is a test case, being tested with the fix on https://treeherder.mozilla.org/#/jobs?repo=try&revision=ffa51d121c32acf66c71221e52307359ed866054
Assignee | ||
Comment 16•7 years ago
|
||
The try is positive.
Comment 17•6 years ago
|
||
Assignee | ||
Comment 18•6 years ago
|
||
I'd be much happier if we revise this to take strongly-typed flags instead of multiple "interchangeable" bools.
Right, I was wondering about it too. I have made such a rework in the attached patch, successfully tested on https://treeherder.mozilla.org/#/jobs?repo=try&revision=0b86dad68ba187bf56fecd4ec2f1436cd5907a3f&selectedJob=229258587
Assignee | ||
Comment 19•6 years ago
|
||
AFAICT this patch will change behavior in cases that don't even involve any line-breaks. If you add a test element to test_wordboundary.html that has extra spaces that will be collapsed by normal white-space processing, e.g. as a variation on id="d6":
<div id="d6a"> hello all </div>
My patch does not actually change the behavior here. With the current firefox, a trailing space in d6 would already show up in the gettext results. The heading space is eaten both by the current firefox and with my change. And intermediate spaces are collapsed in both cases.
I am currently testing by changes with the attached test changes.
::: layout/generic/nsFrameSelection.h
- // mJumpLinesBreakWord: Whether jumping across line is to be considered as a
- // word break
- bool mJumpLinesBreakWord;
What does "be considered as a word break" mean here?
[...]
Would it make sense to give this a name more closely linked to the space-trimming behavior that it affects?bool mTrimSpaces
or something like that?
Mmm, this sounds like a remnant of my previous thinking. Like Joanmarie said, my change lets the linebreak act as a word break, it just doesn't eat spaces. I'll fix that once we have the flag rework in.
Comment 20•6 years ago
|
||
Updated•6 years ago
|
Comment 21•6 years ago
|
||
(In reply to Samuel Thibault from comment #19)
AFAICT this patch will change behavior in cases that don't even involve any line-breaks. If you add a test element to test_wordboundary.html that has extra spaces that will be collapsed by normal white-space processing, e.g. as a variation on id="d6":
<div id="d6a"> hello all </div>
My patch does not actually change the behavior here. With the current firefox, a trailing space in d6 would already show up in the gettext results. The heading space is eaten both by the current firefox and with my change. And intermediate spaces are collapsed in both cases.
Are you sure about this? I tried adding this version of test d6 with extra (collapsible) spaces. It passes as written with current code, but if I apply your patch, it starts failing, which indicates that behavior does change. I didn't look into it deeply but I think the handling of the trailing space must have changed.
Assignee | ||
Comment 22•6 years ago
|
||
Here is an updated version. The try server seems to have a hard time finding a host to run a few of the tests, but the result seems positive:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=10ac0f920f34be53f6298ecabe1c563d8082f1bc
Updated•6 years ago
|
Assignee | ||
Comment 23•6 years ago
|
||
My patch does not actually change the behavior here. With the current firefox, a trailing space in d6 would already show up in the gettext results. The heading space is eaten both by the current firefox and with my change. And intermediate spaces are collapsed in both cases.
Are you sure about this? I tried adding this version of test d6 with extra (collapsible) spaces. It passes as written with current code, but if I apply your patch, it starts failing, which indicates that behavior does change. I didn't look into it deeply but I think the handling of the trailing space must have changed.
Ah, indeed.
So the change is when the passed offset is beyond the end of the last word, i.e. after the space after the last word. My change makes getTextAtOffset return the space in that case, while it was returning the last word previously. Actually it wasn't just the last word, but " all ", which looks very odd for an end-of-word boundary. I'd say the new behavior is more coherent, actually. I have here attached the resulting testcases.
Comment 24•6 years ago
|
||
Yes, this seems reasonable to me, I just wanted to be sure we're aware of what's changing (and add test coverage so that any future changes in this area are more likely to be noticed, too).
Comment 25•6 years ago
|
||
Assignee | ||
Comment 26•6 years ago
|
||
Assignee | ||
Updated•6 years ago
|
Assignee | ||
Comment 27•6 years ago
|
||
Thanks!
Ah, I can't add the checkin-needed keyword myself, because the bug is not assigned to me.
Comment 28•6 years ago
|
||
That we can fix!
However, while we could safely check in the first patch by itself, I'd actually suggest waiting until the full set is ready (which looks like it shouldn't take much now), and landing all together. It tends to complicate things if we land patches one by one within a single bug, especially if any uplifts or backouts start to get involved.
Assignee | ||
Comment 29•6 years ago
|
||
Here is the reworked patch on top of the flags-patch
Assignee | ||
Comment 30•6 years ago
|
||
Here are the reworked testcases.
The patch is tested (without the additional testcases) on
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1950b6ca60d02a114052cb257d7ec7a46521634c&selectedJob=229876841
With the additional testcases:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=3232620dfc3de9767cc9763a2d8bd21a3b69f944
Assignee | ||
Updated•6 years ago
|
Updated•6 years ago
|
Updated•6 years ago
|
Assignee | ||
Updated•6 years ago
|
Comment 31•6 years ago
|
||
While trying to apply any of the two patches in my console to land in on the integration trees i get this error:
applying patch1
patching file layout/generic/nsTextFrame.cpp
Hunk #1 FAILED at 2960
Hunk #3 FAILED at 7950
2 out of 4 hunks FAILED -- saving rejects to file layout/generic/nsTextFrame.cpp.rej
patching file layout/generic/nsTextFrame.h
Hunk #2 FAILED at 571
1 out of 2 hunks FAILED -- saving rejects to file layout/generic/nsTextFrame.h.rej
patch failed, unable to continue (try -v)
patch failed, rejects left in working directory
errors during apply, please fix and qrefresh patch1
Can you take a look into it?
Assignee | ||
Comment 32•6 years ago
|
||
Sorry, I forgot to mention that flags-patch (patch0) needs to be applied first, then patch (patch1), then testcases (patch2)
Comment 33•6 years ago
|
||
Hi Samuel. I've tried every combination to land these patches, always get hunks failed because they are not applied in their order. Please take a look and try to fix it and then re-add chekin-needed.
Examples: https://irccloud.mozilla.com/file/uZNuG6OG/image.png
https://irccloud.mozilla.com/file/CKVeQIl0/image.png
https://irccloud.mozilla.com/file/w1a7kEUY/image.png
Also, please update the commit messages for each patch so we know which one is which.
Assignee | ||
Comment 34•6 years ago
|
||
Assignee | ||
Comment 35•6 years ago
|
||
Assignee | ||
Updated•6 years ago
|
Assignee | ||
Comment 36•6 years ago
|
||
Assignee | ||
Comment 37•6 years ago
|
||
Ok, here it is respinned:
- first flags-patch
- then patch
- then testcases
Comment 38•6 years ago
|
||
Pushed by apavel@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a2d905ec0a76
nsTextFrame::GetTrimmedOffsets: Rework flag parameters r=jfkthame
https://hg.mozilla.org/integration/mozilla-inbound/rev/6c9191097079
layout: Do not trim spaces when inspected from accessibility layer r=jfkthame
https://hg.mozilla.org/integration/mozilla-inbound/rev/4dee747dc404
layout: add more tests for text inspection from accessibility layer r=jfkthame
Comment 39•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/a2d905ec0a76
https://hg.mozilla.org/mozilla-central/rev/6c9191097079
https://hg.mozilla.org/mozilla-central/rev/4dee747dc404
Updated•6 years ago
|
Description
•