Closed
Bug 389421
Opened 18 years ago
Closed 17 years ago
"layout.word_select.stop_at_punctuation" preference no longer works
Categories
(Core :: Layout: Text and Fonts, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: bzbarsky, Assigned: roc)
References
(Blocks 1 open bug)
Details
(Keywords: regression)
Attachments
(1 file, 2 obsolete files)
21.85 KB,
patch
|
smontagu
:
review+
roc
:
superreview+
|
Details | Diff | Splinter Review |
BUILD: Current Linux build
STEPS TO REPRODUCE:
1) Load http://www.mozilla.org/
2) Click after the URL in the URL bar
3) Hit the right arrow to make sure the URL is not selected
4) Hit Ctrl-backspace
EXPECTED RESULTS: Entire URL is deleted
ACTUAL RESULTS: Only the '/' at the end is deleted.
NOTE: You can also double-click the URL in the URL bar -- that should select the whole URL but only selects parts between '/' and '.'.
Basically, the behavior is as if layout.word_select.stop_at_punctuation is set to true... but it's not.
Flags: blocking1.9?
There's a front-end patch to fix the URL-bar issue a different way, for what it's worth, to special-case double-clicking in the URL bar on X11-based platforms. I remember the patch getting review+ in the past few days, but I can't find the bug right now.
If that's fixed, I suspect the pref will no longer be needed.
Comment 2•18 years ago
|
||
That's bug 190615.
Updated•18 years ago
|
Keywords: regression
Reporter | ||
Comment 3•18 years ago
|
||
> If that's fixed, I suspect the pref will no longer be needed.
I do have to say I like being able to select things like URIs and variable names in code by double-clicking, though... Oh, well.
Assignee | ||
Comment 4•18 years ago
|
||
This reworks PeekOffsetWord a bit. The basic idea is to track punctuation state across frames and take a uniform approach to checking whether we're between punctuation and non-punctuation, and if we are, deciding whether to break or not. When we're between punctuation we always obey the pref and ignore the word-breaker results.
This also fixes bug 385565 and bug 385562. It also fixes some bugs related to word-selection across text nodes; not sure if those are filed.
Kinda guessing at the reviewers here... Uri, if you want me to ask Simon instead, I'll do that. Thanks!
Assignee: nobody → roc
Status: NEW → ASSIGNED
Attachment #274122 -
Flags: superreview?(mats.palmgren)
Attachment #274122 -
Flags: review?(uriber)
Assignee | ||
Comment 5•18 years ago
|
||
Comment on attachment 274122 [details] [diff] [review]
fix
actually Simon should be able to do this
Attachment #274122 -
Flags: review?(uriber) → review?(smontagu)
Comment 6•17 years ago
|
||
Comment on attachment 274122 [details] [diff] [review]
fix
>- // sawBeforeType means "we already saw characters of the type
>+ // mSawBeforeType means "we already saw characters of the type
> // before the boundary we're looking for". Examples:
> // 1. If we're moving forward, looking for a word beginning (i.e. a boundary
> // between whitespace and non-whitespace), then eatingWS==PR_TRUE means
> // "we already saw some whitespace".
> // 2. If we're moving backward, looking for a word beginning (i.e. a boundary
> // between non-whitespace and whitespace), then eatingWS==PR_TRUE means
> // "we already saw some non-whitespace".
If you want to keep this comment (which might not be necessary, since you added your own explanation in the declaration of PeekWordState), you can use this opportunity to change "eatingWS" to "state.mSawBeforeType" (I forgot to do this last time I renamed the variable).
Assignee | ||
Comment 7•17 years ago
|
||
Sure. Does the rest of the code look alright to you?
Comment 8•17 years ago
|
||
Looks good, I have a question though.
Given layout.word_select.stop_at_punctuation=true and the text
"http://www.mozilla.org/projects/minefield/" with the caret to the right
of the 'j'.
When selecting words forward (Shift+Right) in a LTR input we stop after
the punctuation (so it's included in the selection) and when "deselecting"
words (Shift+Left) we stop before the punctuation (so the punctuation remains
selected). I assume this is an intended change (for bug 385565).
But what do we want to do in the RTL case?
With the patch, selecting forward (Shift+Right) in a RTL input we stop *before*
the punctuation (since it selects visually to the left) and for Shift+Left
we stop to the right of the punctuation (so it's not part of the selection).
This is the opposite behaviour from the LTR case.
In other words, what the patch seems to be implementing is stopping
before/after punctuation visually - should we instead implement stopping
before/after punctuation logically?
Comment 9•17 years ago
|
||
Comment on attachment 274122 [details] [diff] [review]
fix
On the Fārsī wikipedia page:
http://fa.wikipedia.org/wiki/%D8%B5%D9%81%D8%AD%D9%87%D9%94_%D8%A7%D8%B5%D9%84%DB%8C
I get a new assertion when selecting text (Ctrl+Shift+Right)
###!!! ASSERTION: invalid array index: 'i < Length()', file ../../dist/include/xpcom/nsTArray.h, line 318
Attachment #274122 -
Flags: superreview?(mats.palmgren) → superreview-
Comment 10•17 years ago
|
||
(In reply to comment #8)
> But what do we want to do in the RTL case?
> With the patch, selecting forward (Shift+Right) in a RTL input
Actually, in RTL input, Shift+Right selects *backward* (logically!)
> we stop *before*
> the punctuation (since it selects visually to the left)
If the text itself is LTR, Shift+Right stops after (i.e., to the right of) the punctuation. If the text is RTL, it still stops after (this time, to the left of) the punctuation - which is consistent with the behavior of Shift-Left in an LTR field.
> and for Shift+Left
> we stop to the right of the punctuation (so it's not part of the selection).
Only if your text is LTR, in which case, "to the right of" means "after", which is consistent with the behavior of Shift+Right in LTR context.
> This is the opposite behaviour from the LTR case.
> In other words, what the patch seems to be implementing is stopping
> before/after punctuation visually - should we instead implement stopping
> before/after punctuation logically?
>
So, no, it does behave logically in RTL as well as LTR. You just have to remember that when the control direction is RTL, "right arrow" means backward and "left arrow" means forward (regardless of the directionality of the characters themselves).
Comment 11•17 years ago
|
||
(In reply to comment #7)
> Sure. Does the rest of the code look alright to you?
>
I looked over it and didn't spot anything wrong, but that doesn't really say much, as this code causes my brain to spin (I've been too far away from it for too long). So I think Simon should review this as well.
Flags: blocking1.9? → blocking1.9+
Assignee | ||
Comment 12•17 years ago
|
||
Stop that assertion firing.
When moving backwards through text, the "before" offset could be at the end of the text frame. We actually have no break data for that offset, because we don't actually know if a break is allowed there. Make the word-break array big enough to hold a "false" value for that offset. (There's a similar situation for moving forwards: mWordBreaks[0] is always false; we're assuming that there's never a word break at an end of the frame.)
Attachment #274122 -
Attachment is obsolete: true
Attachment #275728 -
Flags: superreview?(mats.palmgren)
Attachment #275728 -
Flags: review?(smontagu)
Attachment #274122 -
Flags: review?(smontagu)
Comment 13•17 years ago
|
||
Comment on attachment 275728 [details] [diff] [review]
fix v2
Please add XXX to the new comment, and file a bug on it?
Looks good! sr=mats
Attachment #275728 -
Flags: superreview?(mats.palmgren) → superreview+
Assignee | ||
Comment 14•17 years ago
|
||
Yep sure.
Comment 15•17 years ago
|
||
I noticed that with the patch, word selection does stop (even when stop_at_punctuation=true) on some characters (such as -,_,+,=, of Unicode types Pd, Pc, Sm), on which the old implementation didn't stop.
This means that unlike in 1.8, URLs are not necessarily selectable with a double-click when stop_at_punctuation=true. I'm not sure whether this is considered a problem or not.
Assignee | ||
Comment 17•17 years ago
|
||
(In reply to comment #15)
> I noticed that with the patch, word selection does stop (even when
> stop_at_punctuation=true) on some characters (such as -,_,+,=, of Unicode types
> Pd, Pc, Sm), on which the old implementation didn't stop.
> This means that unlike in 1.8, URLs are not necessarily selectable with a
> double-click when stop_at_punctuation=true. I'm not sure whether this is
> considered a problem or not.
I'm slightly confused. You're saying that when stop_at_punctuation is true, in 1.8 we *didn't* stop at -,_,+,=, but now we do?
My reading of nsTextTransformer.cpp suggests we *were* stopping at those characters in 1.8. Even if we weren't, I think it's probably a good thing that we are stopping at those characters now.
Comment 18•17 years ago
|
||
(In reply to comment #17)
> (In reply to comment #15)
> > I noticed that with the patch, word selection does stop (even when
> > stop_at_punctuation=true) on some characters (such as -,_,+,=, of Unicode types
> > Pd, Pc, Sm), on which the old implementation didn't stop.
> > This means that unlike in 1.8, URLs are not necessarily selectable with a
> > double-click when stop_at_punctuation=true. I'm not sure whether this is
> > considered a problem or not.
>
> I'm slightly confused. You're saying that when stop_at_punctuation is true, in
> 1.8 we *didn't* stop at -,_,+,=, but now we do?
>
> My reading of nsTextTransformer.cpp suggests we *were* stopping at those
> characters in 1.8. Even if we weren't, I think it's probably a good thing that
> we are stopping at those characters now.
>
Errr... I guess s/stop_at_punctuation=true/stop_at_punctuation=false/g in comment #15.
I'm traveling now, so I can't test with the patch, but I'm pretty sure that's what I meant. Sorry.
Comment 19•17 years ago
|
||
I'm also seeing what Uri described in comment 15.
It occurs for both stop_at_punctuation=true/false, but only when
eat_space_to_next_word=false (default on Linux but not Win32).
Testcase: place caret at the left edge, (kbd) word-select to the right:
aa++bb//cc
Fx2 "stops at +"?
eat_space=0 eat_space=1
stop_at_p=0 no no
stop_at_p=1 after after
Fx2 "stops at /"?
eat_space=0 eat_space=1
stop_at_p=0 no no
stop_at_p=1 after after
Trunk with patch2 "stops at +"?
eat_space=0 eat_space=1
stop_at_p=0 before and after no
stop_at_p=1 before and after no
Trunk with patch2 "stops at /"?
eat_space=0 eat_space=1
stop_at_p=0 no no
stop_at_p=1 after after
It seems the patch makes any word breaking character behave like a space
when eat_space_to_next_word=false. Was that intended?
FWIW, I tested IE7 a bit and it seems they have different word select
behavior in the URL bar compared to a <textarea>. In the URL bar they
treat plus, dot and SPACE the same, they always stop after it, never
before it (so Firefox 2/Win32 is compatible with that, given the default
eat_space_to_next_word=true). In a <textarea> however, IE7 stops both
before and after plus and dot (but not SPACE), and they treat a sequence
of those as a single unit, eg aa++..++bb stops before
the first plus and then before the first "b".
AFAICT, IE7 treats any sequence of -_#?,;:.+=(){}[]^& as a single unit,
exception for / and \ which seems to get special treatment in <textarea>s
but not the URL bar.
IMO, the current prefs doesn't map very well to native platform behavior.
For example, the native behavior on both Linux and MacOSX when extending
the selection stops before punctuation, but when narrowing the selection
it stops after it (in the direction relative to the anchor point).
For example,
with the caret at the left edge and then typing Ctrl+Shift+Right twice,
it selects "aa" and then "aa++bb", then typing Ctrl+Shift+Left twice
selects "aa++" and then "". (same as for words separated by space)
I'm still ok with the patch, but it would be nice if we could make the
"before and after" instead be "before or after depending on direction
relative to the anchor point" to better match the platform behavior.
FWIW, Opera/Win32 seems to be compatible with the Fx2/IE7-urlbar behavior,
they appear to have the same behavior in the URL bar as in <textarea>.
Opera on Linux appears to match the Win32 version.
Safari have consistent behavior in the URL bar and <textarea>, AFAICT.
Assignee | ||
Comment 20•17 years ago
|
||
Oh crud. The reason is that those characters are deliberately excluded from nsTextFrameUtils::IsPunctuationMark because that's what we wanted from first-letter.
http://mxr.mozilla.org/seamonkey/source/layout/generic/nsTextFrameUtils.cpp#56
So should I add another CCmap for this? or should we have an API that lets me get the Unicode char type of a character? (That would cover all cases like this where performance is not an issue.) Or what? Simon, any suggestions?
Comment 21•17 years ago
|
||
(In reply to comment #20)
> should we have an API that lets me
> get the Unicode char type of a character?
http://lxr.mozilla.org/seamonkey/source/intl/unicharutil/public/nsIUGenCategory.h
Assignee | ||
Comment 22•17 years ago
|
||
That component currently isn't built and I'm reluctant to take its footprint hit just for this.
How bad would it be if I just hardcoded the ASCII punctuation characters and treated the rest as not punctuation for the purposes of this pref?
Assignee | ||
Comment 23•17 years ago
|
||
It wouldn't be so bad if we could use that component for the first-letter stuff, i.e., ripping out the ccmap that we're currently using, but we can't because it only gives us a general classification instead of the precise classification.
Assignee | ||
Comment 24•17 years ago
|
||
It looks like the Unicode paths in nsTextTransformer never honoured this pref. I'll revise the patch to hardcode ASCII punctuation knowledge and treat non-ASCII punctuation as non-punctuation.
Assignee | ||
Comment 25•17 years ago
|
||
Ok, I found that spellchecker has its own private (!) copy of the category tables, which it actually instantiates twice (!!). So I filed bug 393356 to move that back to intl/ where it belongs and where I can use that data for this bug.
This patch makes all P and S type Unicode characters be subject to the stop_at_punctuation pref. I hope this satisfies everyone. Of course the patch in bug 393356 has to land before this can land.
Attachment #275728 -
Attachment is obsolete: true
Attachment #277850 -
Flags: superreview+
Attachment #277850 -
Flags: review?(smontagu)
Attachment #275728 -
Flags: review?(smontagu)
Comment 26•17 years ago
|
||
(In reply to comment #24)
> It looks like the Unicode paths in nsTextTransformer never honoured this pref.
Yeah, that's bug 188213, which I guess could be marked FIXED now.
Comment 27•17 years ago
|
||
Comment on attachment 277850 [details] [diff] [review]
updated patch
r=me, sorry to have taken so long to get to it.
A few thoughts arising from this:
Do you want to make the same change to the only other caller of nsTextFrameUtils::IsPunctuationMark()? Then we could get rid of it, and punct_marks.ccmap too.
It would be cool if ClusterIterator() became surrogate-aware, but that's probably not 1.9 material.
Attachment #277850 -
Flags: review?(smontagu) → review+
Assignee | ||
Comment 28•17 years ago
|
||
Surrogates already cluster, so it kinda is surrogate-aware. It's not aware of non-BMP punctuation, that's all.
We have to leave IsPunctuationMark the way it is, because first-letter is very clear about the specific punctuation that "doesn't count" when looking for the end of a first-letter.
Assignee | ||
Comment 29•17 years ago
|
||
checked in.
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•