Closed Bug 304656 Opened 19 years ago Closed 19 years ago

Fix PeekOffset once and for all

Categories

(Core :: Layout, defect)

defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 300131

People

(Reporter: roc, Assigned: uriber)

References

Details

PeekOffset is a big trouble spot in our code. We've patched it a lot and there
seems to be still more to do. The API looks dirty to me*. I think now is a good
time to stop fixing little bugs and do a major redesign of PeekOffset so that it
has a well defined specification that's as simple as possible and that our I18N
posse (CCed) agrees satisfies all needs.

* e.g., too many parameters, many with overlapping or unclear meaning,
implementation details like mEatingWS exposed to users of the API.

I have taken the liberty of assigning this to Uri but I don't care who does it :-).

Here's an outline of some needed features, to start discussion:
-- define supported content units: character, (cluster?), word, line, paragraph
(page?) (see
http://developer.apple.com/documentation/Carbon/Reference/ATSUI_Reference/index.html)
-- define supported directions: visual right (down), visual left (up), logical
next, logical prev (but do we ever actually need the logical directions?)
-- define content positions
-- provide a way to limit the search to a particular scope (e.g., a particular
frame and its descendants)
-- provide a function to find the boundary of the content unit that contain a
given content position, in a given direction (or fail if the position is not
inside the unit)
-- provide a function to find the boundary of the next content unit from a given
content position, in a given direction (i.e., never returns the content unit the
given position is in)

What format should positions be? PeekOffset currently works with a mixture of
frames and content. I suggest we use just a (frame, offset) pair where for text
frames, the offset is relative to the frame's content offset, and for non-text
frames 0 means before the frame and 1 means after the frame.
Some initial comments:

Previous bugs on the same subject:
- bug 202306 (has a patch, probably bitrotten)
- bug 300131 (similar to this one, Eyal's ideas of how to do it right)
- we should also take care of bug 263305 while we're at it.

Regarding the API dirtyness, I totally agree. Specifically, I think
PeekOffsetStruct is the root of all evil. The first step IMO should be getting
rid of it, and passing exactly the required arguments (making it clear what's in
and what's out) to functions which currently receive a POS argument.

Regarding assigning to me: This is a big task, and I'm not sure I understand the
current code well enough to take it upon myself as a whole. Also, Eyal already
did some work in bug 300131, so he might want to take this. Anyway, I hope I'll
be able to help with this, especially if we somehow break this into sub-tasks.

Directions: Assuming we are removing the "find unit boundaries" to a separate
function (which will use logical notation), I think the PeekOffset (i.e. "move")
function can use visual notation exclusively. Simon and I had a little
discussion on whether "Home" and "End" (on a PC), should move the caret to the
logical or to the visual edge of a line (to me, at least, the answer is not
obvious). Assuming we're staying with "logical", this could be handled by the
proposed (logical) "find boundaries" function.

"Define content positions": Rob, can you elaborate on this a bit? I'm not sure I
understand.

Everything else sounds good to me.
*** Bug 202306 has been marked as a duplicate of this bug. ***
I should have remembered Eyal's bug. Silly me.

I'm going to be a bit rude and duplicate his bug to this one. The reason is that
I want to start with a well-defined API specification and we don't have that
yet. Also I think we should have just one or two functions rather than a large
number covering various combinations.

> Directions: Assuming we are removing the "find unit boundaries" to a separate
> function (which will use logical notation), I think the PeekOffset (i.e.
> "move") function can use visual notation exclusively.

I don't quite understand the difference between these two functions. Can you
describe the specification for the two different functions, and who would call
each one?

I can help with this work but frankly I know very little about bidi layout and I
haven't worked with this code as much as you have recently.
> I don't quite understand the difference between these two functions. Can you
> describe the specification for the two different functions, and who would call
> each one?

Err... I thought you envisioned this "boundary" function: "provide a function to
find the boundary of the content unit that contain a given content position".
Perhaps I misunderstood. The distinction (in my mind) is between something like
"go one line up" (visual) and "go (select) to the beginning of the current line"
(logical). Usually, only one of those (conceptual) functions is usefull for a
given content unit (although both might be required in order to implement
word-movement).
If this is not what you meant (as appears to be the case), I'm afraid I'll have
to ask you to explain what you did mean.
Ah, OK. That's fine then. I guess you're right that "find current unit boundary"
can use logical directions only but I haven't thought it all through. We need to
write down all the operations that we need and how they'll use the API.

To prevent confusion, let's give them names. How about
"FindContainingTextUnitEnd" and "FindNextTextUnitStart"?

One thing that's not clear yet is how this works with characters. With lines and
paragraphs, and most languages, it's always clear which line or paragraph a
caret position is in (if any). But with words in some languages and with
characters, it's not clear which unit the caret is in. In fact when the position
is between adjacent units, it's really in both units. If the direction is visual
then at most one character/word extends visually to the left of the position,
and one visually to the right. But if the direction is logical, then a caret
position could be logically at the start or end of two words (one LTR and one
RTL). Any idea how we should disambiguate these?
(In reply to comment #3)
> I should have remembered Eyal's bug. Silly me.
> 
> I'm going to be a bit rude and duplicate his bug to this one. The reason is that
> I want to start with a well-defined API specification and we don't have that
> yet.

You can do that in the bug I'd opened as well... plus there's the patch code
there which is interesting to look at. You could even call it an API spec
suggestion (or rather not the patch code itself but the function headers in the
latest partial refactoring of nsFrame.cpp I'd made).

> Also I think we should have just one or two functions rather than a large
> number covering various combinations.

No, I don't agree. The functions do quite different things, they don't just do
the same thing for different content units, and they don't just cover different
combinations of parameters. Trying to force it all into a single function would
just mean a big switch. Also, some functions require only some parameters and
one of the big problems with nsPeekOffsetStruct is the uncertainty of which
fields are in actual use, when. Finally, it seemed to me that using more
meaningful, less generic and less cryptic function names helps to understand
what exactly is being done. PeekOffset() is a bit like having a DoAnyFunction()
function which takes an "ePossibleFunction aWhichFunction" parameter and an
"struct nsLotsOfParamsForEverythingStruct" parameter.

Like I mentioned above, I had advanced little further with 'sketching' the code
refactoring before breaking off than what appears in the patch
- but now I no longer have the exact mental picture of where I was at. I'm also
caught up in a lot of issues these days so I'm not continuing this work. So
whoever wants this can take it.


> Directions: Assuming we are removing the "find unit boundaries" to a 
> separate function (which will use logical notation)

Not sure what you mean.

>, I think the PeekOffset (i.e.
> > "move") function can use visual notation exclusively.

They are exclusively visual AFAIK, and I'd mentioned that in the function names.
I don't think this bug in itself should be about adding position location
functionality or switching some operations from logical to visual or vice-versa.
(In reply to comment #5)

We currently support only visual caret movement (for characters/words). There's
bug 167288 for supporting logical movement on platforms where that's the system
behaviour (Windows, and also Linux, IIRC).

The visual/logical ambiguity is *the* problem at the base of bidirectional caret
movement. It exists (in different forms) for both the logical and visual
approaches. Note that we are internally keeping track only of the logical
position of the caret. Since one visual position can map to two different
logical positions, we are potentially in the "wrong" logical place half the time
(that's assuming we manage to do the visual movement correctly!)

I once had the radical idea that since we're doing visual caret movement we
should actually keep track of the caret's visual position, and decide on the
logical position only when we must (i.e. when the user types a letter, or hits
"delete" - but not when they're just moving around using arrow keys). I can
elaborate on that if anybody's interested (I'm already digressing).

If we were to switch to logical caret movement (at least for some platforms),
things would actually be a bit easier. You can usually know where the caret is
logically if you know how you got there (the left/right arrows traverse the
content in logical order, so as long as you're just using them you always know
where you are, logically). The only case when you have to make a decision is if
you get to an ambiguous location using a mouse-click. In that case, I suppose we
can come up with some huristic (such as preferring the paragraph direction). You
also have the "reverse" problem of where to display the caret when it is
logically in a place that can map to two visual places. This can probably be
"guessed" pretty well based on where we came from.
Okay, let's continue the discussion in bug 300131.

*** This bug has been marked as a duplicate of 300131 ***
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.