389421 - "layout.word_select.stop_at_punctuation" preference no longer works

Reporter

Description

•

18 years ago

BUILD: Current Linux build STEPS TO REPRODUCE: 1) Load http://www.mozilla.org/ 2) Click after the URL in the URL bar 3) Hit the right arrow to make sure the URL is not selected 4) Hit Ctrl-backspace EXPECTED RESULTS: Entire URL is deleted ACTUAL RESULTS: Only the '/' at the end is deleted. NOTE: You can also double-click the URL in the URL bar -- that should select the whole URL but only selects parts between '/' and '.'. Basically, the behavior is as if layout.word_select.stop_at_punctuation is set to true... but it's not.

Flags: blocking1.9?

David Baron :dbaron: (⌚️UTC-5, no longer working on Mozilla)

Comment 1

•

18 years ago

There's a front-end patch to fix the URL-bar issue a different way, for what it's worth, to special-case double-clicking in the URL bar on X11-based platforms. I remember the patch getting review+ in the past few days, but I can't find the bug right now. If that's fixed, I suspect the pref will no longer be needed.

Adam Guthrie

Comment 2

•

18 years ago

That's bug 190615.

Samuel Sidler (old account; do not CC)

Updated

•

18 years ago

Keywords: regression

Boris Zbarsky [:bzbarsky]

Reporter

Comment 3

•

18 years ago

> If that's fixed, I suspect the pref will no longer be needed. I do have to say I like being able to select things like URIs and variable names in code by double-clicking, though... Oh, well.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 4

•

18 years ago

Attached patch fix (obsolete) — Details — Splinter Review

This reworks PeekOffsetWord a bit. The basic idea is to track punctuation state across frames and take a uniform approach to checking whether we're between punctuation and non-punctuation, and if we are, deciding whether to break or not. When we're between punctuation we always obey the pref and ignore the word-breaker results. This also fixes bug 385565 and bug 385562. It also fixes some bugs related to word-selection across text nodes; not sure if those are filed. Kinda guessing at the reviewers here... Uri, if you want me to ask Simon instead, I'll do that. Thanks!

Assignee: nobody → roc

Status: NEW → ASSIGNED

Attachment #274122 - Flags: superreview?(mats.palmgren)

Attachment #274122 - Flags: review?(uriber)

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 5

•

18 years ago

Comment on attachment 274122 [details] [diff] [review] fix actually Simon should be able to do this

Attachment #274122 - Flags: review?(uriber) → review?(smontagu)

Uri Bernstein

Comment 6

•

18 years ago

Comment on attachment 274122 [details] [diff] [review] fix >- // sawBeforeType means "we already saw characters of the type >+ // mSawBeforeType means "we already saw characters of the type > // before the boundary we're looking for". Examples: > // 1. If we're moving forward, looking for a word beginning (i.e. a boundary > // between whitespace and non-whitespace), then eatingWS==PR_TRUE means > // "we already saw some whitespace". > // 2. If we're moving backward, looking for a word beginning (i.e. a boundary > // between non-whitespace and whitespace), then eatingWS==PR_TRUE means > // "we already saw some non-whitespace". If you want to keep this comment (which might not be necessary, since you added your own explanation in the declaration of PeekWordState), you can use this opportunity to change "eatingWS" to "state.mSawBeforeType" (I forgot to do this last time I renamed the variable).

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 7

•

18 years ago

Sure. Does the rest of the code look alright to you?

Mats Palmgren (inactive)

Comment 8

•

18 years ago

Looks good, I have a question though. Given layout.word_select.stop_at_punctuation=true and the text "http://www.mozilla.org/projects/minefield/" with the caret to the right of the 'j'. When selecting words forward (Shift+Right) in a LTR input we stop after the punctuation (so it's included in the selection) and when "deselecting" words (Shift+Left) we stop before the punctuation (so the punctuation remains selected). I assume this is an intended change (for bug 385565). But what do we want to do in the RTL case? With the patch, selecting forward (Shift+Right) in a RTL input we stop *before* the punctuation (since it selects visually to the left) and for Shift+Left we stop to the right of the punctuation (so it's not part of the selection). This is the opposite behaviour from the LTR case. In other words, what the patch seems to be implementing is stopping before/after punctuation visually - should we instead implement stopping before/after punctuation logically?

Mats Palmgren (inactive)

Comment 9

•

18 years ago

Comment on attachment 274122 [details] [diff] [review] fix On the Fārsī wikipedia page: http://fa.wikipedia.org/wiki/%D8%B5%D9%81%D8%AD%D9%87%D9%94_%D8%A7%D8%B5%D9%84%DB%8C I get a new assertion when selecting text (Ctrl+Shift+Right) ###!!! ASSERTION: invalid array index: 'i < Length()', file ../../dist/include/xpcom/nsTArray.h, line 318

Attachment #274122 - Flags: superreview?(mats.palmgren) → superreview-

Uri Bernstein

Comment 10

•

18 years ago

(In reply to comment #8) > But what do we want to do in the RTL case? > With the patch, selecting forward (Shift+Right) in a RTL input Actually, in RTL input, Shift+Right selects *backward* (logically!) > we stop *before* > the punctuation (since it selects visually to the left) If the text itself is LTR, Shift+Right stops after (i.e., to the right of) the punctuation. If the text is RTL, it still stops after (this time, to the left of) the punctuation - which is consistent with the behavior of Shift-Left in an LTR field. > and for Shift+Left > we stop to the right of the punctuation (so it's not part of the selection). Only if your text is LTR, in which case, "to the right of" means "after", which is consistent with the behavior of Shift+Right in LTR context. > This is the opposite behaviour from the LTR case. > In other words, what the patch seems to be implementing is stopping > before/after punctuation visually - should we instead implement stopping > before/after punctuation logically? > So, no, it does behave logically in RTL as well as LTR. You just have to remember that when the control direction is RTL, "right arrow" means backward and "left arrow" means forward (regardless of the directionality of the characters themselves).

Uri Bernstein

Comment 11

•

18 years ago

(In reply to comment #7) > Sure. Does the rest of the code look alright to you? > I looked over it and didn't spot anything wrong, but that doesn't really say much, as this code causes my brain to spin (I've been too far away from it for too long). So I think Simon should review this as well.

David Baron :dbaron: (⌚️UTC-5, no longer working on Mozilla)

Updated

•

18 years ago

Flags: blocking1.9? → blocking1.9+

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 12

•

18 years ago

Attached patch fix v2 (obsolete) — Details — Splinter Review

Stop that assertion firing. When moving backwards through text, the "before" offset could be at the end of the text frame. We actually have no break data for that offset, because we don't actually know if a break is allowed there. Make the word-break array big enough to hold a "false" value for that offset. (There's a similar situation for moving forwards: mWordBreaks[0] is always false; we're assuming that there's never a word break at an end of the frame.)

Attachment #274122 - Attachment is obsolete: true

Attachment #275728 - Flags: superreview?(mats.palmgren)

Attachment #275728 - Flags: review?(smontagu)

Attachment #274122 - Flags: review?(smontagu)

Mats Palmgren (inactive)

Comment 13

•

18 years ago

Comment on attachment 275728 [details] [diff] [review] fix v2 Please add XXX to the new comment, and file a bug on it? Looks good! sr=mats

Attachment #275728 - Flags: superreview?(mats.palmgren) → superreview+

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 14

•

18 years ago

Yep sure.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Updated

•

18 years ago

Blocks: 391464

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Updated

•

18 years ago

Blocks: 391584

Uri Bernstein

Comment 15

•

18 years ago

I noticed that with the patch, word selection does stop (even when stop_at_punctuation=true) on some characters (such as -,_,+,=, of Unicode types Pd, Pc, Sm), on which the old implementation didn't stop. This means that unlike in 1.8, URLs are not necessarily selectable with a double-click when stop_at_punctuation=true. I'm not sure whether this is considered a problem or not.

Uri Bernstein

Updated

•

18 years ago

Blocks: 206141

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 17

•

18 years ago

(In reply to comment #15) > I noticed that with the patch, word selection does stop (even when > stop_at_punctuation=true) on some characters (such as -,_,+,=, of Unicode types > Pd, Pc, Sm), on which the old implementation didn't stop. > This means that unlike in 1.8, URLs are not necessarily selectable with a > double-click when stop_at_punctuation=true. I'm not sure whether this is > considered a problem or not. I'm slightly confused. You're saying that when stop_at_punctuation is true, in 1.8 we *didn't* stop at -,_,+,=, but now we do? My reading of nsTextTransformer.cpp suggests we *were* stopping at those characters in 1.8. Even if we weren't, I think it's probably a good thing that we are stopping at those characters now.

Uri Bernstein

Comment 18

•

18 years ago

(In reply to comment #17) > (In reply to comment #15) > > I noticed that with the patch, word selection does stop (even when > > stop_at_punctuation=true) on some characters (such as -,_,+,=, of Unicode types > > Pd, Pc, Sm), on which the old implementation didn't stop. > > This means that unlike in 1.8, URLs are not necessarily selectable with a > > double-click when stop_at_punctuation=true. I'm not sure whether this is > > considered a problem or not. > > I'm slightly confused. You're saying that when stop_at_punctuation is true, in > 1.8 we *didn't* stop at -,_,+,=, but now we do? > > My reading of nsTextTransformer.cpp suggests we *were* stopping at those > characters in 1.8. Even if we weren't, I think it's probably a good thing that > we are stopping at those characters now. > Errr... I guess s/stop_at_punctuation=true/stop_at_punctuation=false/g in comment #15. I'm traveling now, so I can't test with the patch, but I'm pretty sure that's what I meant. Sorry.

Mats Palmgren (inactive)

Comment 19

•

18 years ago

I'm also seeing what Uri described in comment 15. It occurs for both stop_at_punctuation=true/false, but only when eat_space_to_next_word=false (default on Linux but not Win32). Testcase: place caret at the left edge, (kbd) word-select to the right: aa++bb//cc Fx2 "stops at +"? eat_space=0 eat_space=1 stop_at_p=0 no no stop_at_p=1 after after Fx2 "stops at /"? eat_space=0 eat_space=1 stop_at_p=0 no no stop_at_p=1 after after Trunk with patch2 "stops at +"? eat_space=0 eat_space=1 stop_at_p=0 before and after no stop_at_p=1 before and after no Trunk with patch2 "stops at /"? eat_space=0 eat_space=1 stop_at_p=0 no no stop_at_p=1 after after It seems the patch makes any word breaking character behave like a space when eat_space_to_next_word=false. Was that intended? FWIW, I tested IE7 a bit and it seems they have different word select behavior in the URL bar compared to a <textarea>. In the URL bar they treat plus, dot and SPACE the same, they always stop after it, never before it (so Firefox 2/Win32 is compatible with that, given the default eat_space_to_next_word=true). In a <textarea> however, IE7 stops both before and after plus and dot (but not SPACE), and they treat a sequence of those as a single unit, eg aa++..++bb stops before the first plus and then before the first "b". AFAICT, IE7 treats any sequence of -_#?,;:.+=(){}[]^& as a single unit, exception for / and \ which seems to get special treatment in <textarea>s but not the URL bar. IMO, the current prefs doesn't map very well to native platform behavior. For example, the native behavior on both Linux and MacOSX when extending the selection stops before punctuation, but when narrowing the selection it stops after it (in the direction relative to the anchor point). For example, with the caret at the left edge and then typing Ctrl+Shift+Right twice, it selects "aa" and then "aa++bb", then typing Ctrl+Shift+Left twice selects "aa++" and then "". (same as for words separated by space) I'm still ok with the patch, but it would be nice if we could make the "before and after" instead be "before or after depending on direction relative to the anchor point" to better match the platform behavior. FWIW, Opera/Win32 seems to be compatible with the Fx2/IE7-urlbar behavior, they appear to have the same behavior in the URL bar as in <textarea>. Opera on Linux appears to match the Win32 version. Safari have consistent behavior in the URL bar and <textarea>, AFAICT.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Updated

•

18 years ago

Blocks: 393227

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 20

•

18 years ago

Oh crud. The reason is that those characters are deliberately excluded from nsTextFrameUtils::IsPunctuationMark because that's what we wanted from first-letter. http://mxr.mozilla.org/seamonkey/source/layout/generic/nsTextFrameUtils.cpp#56 So should I add another CCmap for this? or should we have an API that lets me get the Unicode char type of a character? (That would cover all cases like this where performance is not an issue.) Or what? Simon, any suggestions?

Simon Montagu :smontagu

Comment 21

•

18 years ago

(In reply to comment #20) > should we have an API that lets me > get the Unicode char type of a character? http://lxr.mozilla.org/seamonkey/source/intl/unicharutil/public/nsIUGenCategory.h

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 22

•

18 years ago

That component currently isn't built and I'm reluctant to take its footprint hit just for this. How bad would it be if I just hardcoded the ASCII punctuation characters and treated the rest as not punctuation for the purposes of this pref?

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 23

•

18 years ago

It wouldn't be so bad if we could use that component for the first-letter stuff, i.e., ripping out the ccmap that we're currently using, but we can't because it only gives us a general classification instead of the precise classification.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 24

•

18 years ago

It looks like the Unicode paths in nsTextTransformer never honoured this pref. I'll revise the patch to hardcode ASCII punctuation knowledge and treat non-ASCII punctuation as non-punctuation.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Updated

•

18 years ago

Depends on: 393356

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 25

•

18 years ago

Attached patch updated patch — Details — Splinter Review

Ok, I found that spellchecker has its own private (!) copy of the category tables, which it actually instantiates twice (!!). So I filed bug 393356 to move that back to intl/ where it belongs and where I can use that data for this bug. This patch makes all P and S type Unicode characters be subject to the stop_at_punctuation pref. I hope this satisfies everyone. Of course the patch in bug 393356 has to land before this can land.

Attachment #275728 - Attachment is obsolete: true

Attachment #277850 - Flags: superreview+

Attachment #277850 - Flags: review?(smontagu)

Attachment #275728 - Flags: review?(smontagu)

Uri Bernstein

Comment 26

•

18 years ago

(In reply to comment #24) > It looks like the Unicode paths in nsTextTransformer never honoured this pref. Yeah, that's bug 188213, which I guess could be marked FIXED now.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Updated

•

18 years ago

Blocks: 188213

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Updated

•

18 years ago

Blocks: 385562

Simon Montagu :smontagu

Comment 27

•

18 years ago

Comment on attachment 277850 [details] [diff] [review] updated patch r=me, sorry to have taken so long to get to it. A few thoughts arising from this: Do you want to make the same change to the only other caller of nsTextFrameUtils::IsPunctuationMark()? Then we could get rid of it, and punct_marks.ccmap too. It would be cool if ClusterIterator() became surrogate-aware, but that's probably not 1.9 material.

Attachment #277850 - Flags: review?(smontagu) → review+

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 28

•

18 years ago

Surrogates already cluster, so it kinda is surrogate-aware. It's not aware of non-BMP punctuation, that's all. We have to leave IsPunctuationMark the way it is, because first-letter is very clear about the specific punctuation that "doesn't count" when looking for the end of a first-letter.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 29

•

18 years ago

checked in.

Status: ASSIGNED → RESOLVED

Closed: 18 years ago

Resolution: --- → FIXED

Jesse Ruderman

Updated

•

18 years ago

Blocks: 385565

fix 18 years ago Robert O'Callahan (:roc) (email my personal email if necessary) 15.58 KB, patch	MatsPalmgren_bugz : superreview-	Details \| Diff \| Splinter Review
fix v2 18 years ago Robert O'Callahan (:roc) (email my personal email if necessary) 19.31 KB, patch	MatsPalmgren_bugz : superreview+	Details \| Diff \| Splinter Review
updated patch 18 years ago Robert O'Callahan (:roc) (email my personal email if necessary) 21.85 KB, patch	smontagu : review+ roc : superreview+	Details \| Diff \| Splinter Review