[Find] Find whole word only

RESOLVED DUPLICATE of bug 269442

Status

()

Firefox
General
--
minor
RESOLVED DUPLICATE of bug 269442
18 years ago
5 months ago

People

(Reporter: Michael Lowe, Unassigned)

Tracking

(Blocks: 1 bug, {helpwanted})

Trunk
helpwanted
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 6 obsolete attachments)

(Reporter)

Description

18 years ago
Please add "Match Whole Word Only" option to browser's "Find on this page"
dialog.

Comment 1

18 years ago
*** Bug 15055 has been marked as a duplicate of this bug. ***
(Reporter)

Updated

18 years ago
Assignee: don → sfraser

Updated

18 years ago
Status: NEW → ASSIGNED
Target Milestone: M15

Comment 2

18 years ago
I've wanted this too.

Updated

18 years ago
Component: Browser-General → XPApps

Comment 3

18 years ago
Updating QA Contact.

Updated

18 years ago
QA Contact: paulmac → sairuh

Updated

18 years ago
Summary: Find whole word only → [Find][RFE]Find whole word only
Target Milestone: M15 → M16

Updated

18 years ago
Target Milestone: M16 → M20

Comment 4

17 years ago
moving to future milestone
Assignee: sfraser → beppe
Status: ASSIGNED → NEW

Updated

17 years ago
Target Milestone: M20 → Future

Comment 5

17 years ago
moving back to previous owner
Assignee: beppe → sfraser

Comment 6

17 years ago
adding help wanted keyword
Keywords: helpwanted

Comment 7

16 years ago
BTW, this is supported by IE5, but wasn't in Netscape 4. It's helpful to have.
Keywords: mozilla1.0

Comment 8

16 years ago
if i select the "match whole word only" option
and search for a word "file",
it will not shows me "profile"

BUT .. will it shows me "files" ?

Updated

16 years ago
Blocks: 106961

Comment 9

16 years ago
*** Bug 104046 has been marked as a duplicate of this bug. ***

Comment 10

16 years ago
A find RFE for you!
Assignee: sfraser → akkana

Comment 11

16 years ago
*** Bug 118118 has been marked as a duplicate of this bug. ***
mass moving open bugs pertaining to find in page/frame to pmac@netscape.com as
qa contact.

to find all bugspam pertaining to this, set your search string to
"AppleSpongeCakeWithCaramelFrosting".
QA Contact: sairuh → pmac

Comment 13

15 years ago
*** Bug 143180 has been marked as a duplicate of this bug. ***

Comment 14

15 years ago
*** Bug 156517 has been marked as a duplicate of this bug. ***

Comment 15

15 years ago
Reply to #8:

No, searching for whole words only is like the following regular expression search:

[\n\r\t ]word[\n\r\t ]

'\n' is a linefeed, '\r' is a carriage return, '\t' is a tab sign and the last
sign is a space (not sure if '\s' for space is correct regex syntax, some regex
parsers support it, but not all; however I think a space is a space, isn't it?

So running a whole word search for "file" will only come up "file" (or "File" if
match upper/lower case isn't enabled). To make complicated searches, the search
would have to get an own syntax and that is too complicated for most users that
just want a simple search. But being able to limit searches to whole words only
can often be a big advantage and is one of the things that makes IE more
comfortable at times than Mozilla.

Updated

15 years ago
Summary: [Find][RFE]Find whole word only → [Find] Find whole word only
QA Contact: pmac → sairuh

Comment 16

15 years ago
*** Bug 192934 has been marked as a duplicate of this bug. ***
OS: Windows NT → All
Hardware: PC → All
Blocks: 195485

Comment 17

15 years ago
Moving from "help wanted 1.0", which obviously is not going to happen now. This
feature would be nice for 1.4a, I think. It'd be nice if we could match
functionality with Office, and Office (Word at least) allows you to search for
whole words only.
Severity: enhancement → minor
Keywords: mozilla1.0
Target Milestone: Future → mozilla1.4alpha

Comment 18

15 years ago
*** Bug 198922 has been marked as a duplicate of this bug. ***

Comment 19

14 years ago
I would need that functionality available in nsFind.Find() method.

Comment 20

14 years ago
Okay, who do we have to talk to to get this into nsFind.Find()? Do we have to
code it ourselves? Will it be difficulty to code for a non expert C programmer?
Maybe I will take a look, after all I'm currently spending most time on C
programming anyway (but let's face it: I'm a newbie C programmer; pretty
experienced in Java, PHP and some other languages, but only basic C skills).

Comment 21

14 years ago
The difficulty will be in the algorithm, not in knowing tricky C language
features.  The changes should all happen within one routine, after the fragment
has already been fetched from the DOM; if you grok "char*" and "char" the rest
probably won't be a problem.  A good programmer experienced with Java and PHP
should be able to pick up the C++ that's needed.  

Basically, I think it will come down to this: When looking to start a new word
match, don't start matching unless we've just started a word.  (It really should
use the nsIWordBreaker interface so it will work with non-Latin charsets, but
I'm not sure how to do that on the fragments of text that come out of the DOM. 
I suppose just flagging when the last char seen was whitespace is a reasonable
first start.)

It may be necessary to pay attention to how word-mode search interacts with
spaces in the search string (I'm not sure whether this will be a problem or not).

Alternately, another option is if we're in word mode, don't use the current code
at all, and instead, jump forward in the DOM itself using the nsIWordBreaker
interfaces, then compare (possibly using code factored out of the existing
interface).  That would work in all languages, but would be a lot harder
(especially for someone not comfortable with C++ or XPCOM calling conventions).

Comment 22

14 years ago
kovu401: Don't change target milestones on other people's bugs, unless you're
willing to take them and work on the fix yourself.
Target Milestone: mozilla1.4alpha → Future

Comment 23

14 years ago
Is anybody currently working on this bug?  After looking at some of the code, it
looks like a lot of the framework is already there.  I have added the rest of
the code necessary (thought it is not perfect I'm sure).  Should I work on
perfecting the code and create a patch, or would I be stepping on anybody's toes?

Comment 24

14 years ago
I'm not aware of anyone else working on this right now; you wouldn't be stepping
on anyone's toes (unless someone else speaks up quickly).  Go for it!  I'll be
happy to review a patch if you write one.

Comment 25

14 years ago
Created attachment 126665 [details] [diff] [review]
First proposed patch

This patch does the job for me, but I don't really know how it will work with
non-Latin charsets.  I am unfamiliar with nsIWordBreaker and how to effectively
use it, so my patch basically works by requiring the letter before the match
and the letter after the match to be NON-alphanumeric.	This way, you can do a
'whole word search' for 2 words, like "aWord bWord".  Check it out and report
back any problems or suggestions.

I'm a complete newbie to Mozilla development, so go easy on me. :)

Comment 26

14 years ago
Created attachment 126727 [details] [diff] [review]
Removes interface changes required in 1st patch

OK, so I figured out enough about XPCOM to create and use a word breaker for
flagging purposes, and I removed the additional member variables from
nsIFind.idl and nsFind.h, but I still don't use the word breaker functionality
to determine where word breaks occur.  This patch still works for english, but
correctly fixing this bug (in an international sense) would require 1 of 2
things...

1. As mentioned, write a separate find routine in nsFind.cpp that would try to
match word by word, or
2. Make an enhancement to nsIWordBreaker: add a function to determine if a
character is a "breaking" character.

Option #2 is my preferred path, but I don't know difficult it is to convince
people to change interfaces.
Attachment #126665 - Attachment is obsolete: true

Comment 27

14 years ago
Created attachment 126858 [details] [diff] [review]
Uses nsIWordBreaker (with hack for  )

This patch actually uses nsIWordBreaker::BreakInBetween to detect word breaks. 
I used a hack to convert nbsp's to ' ' before detecting breaks.  The basic idea
was to not start matching until some sort of word break occurred, and don't
finish a match unless there was a word break char after the last character.

On Google's page, it catches all the words except "Groups" (not real sure why).
 Any other testing would be appreciated.
Attachment #126727 - Attachment is obsolete: true

Comment 28

14 years ago
Great stuff.  Clean and readable, works nicely, and doesn't break any of the
find regression tests.  (Which, incidentally, live at
http://www.mozilla.org/quality/browser/front-end/testcases/xpapps-gui/find-on-page/find-in-page.html
if anyone's interested.  Eventually we should update them to add a test for
whole-word.)

Some minor comments on the code (the third patch):

#ifdef XP_MAC
#define IS_NBSP_CHAR(c) [ ... ]

I'm unhappy we have to do this, and have split off bug 211343 on the issue (it
shouldn't block this bug).

+    PRUnichar nextc = (t2b ? t2b[findex+incr] : CHAR_TO_UNICHAR(t1b[findex+incr]));
+    PRUnichar prevc = (t2b ? t2b[findex-incr] : CHAR_TO_UNICHAR(t1b[findex-incr]));

Someone (maybe me) should probably do timing tests on a big page to make sure
this doesn't noticably slow things down (since speed was a big consideration in
the find design).  I don't think it will be noticable, and this is nice and
clean and readable, so I don't want to suggest any premature optimization if it
isn't needed.

+  , mIterOffset(0)
+  , mWordBreaker(0)

Please swap these two lines.  Picky gcc gives a warning if the order of
initialization doesn't match the order of declaration.

+ PRBool wordBreakRight;

It's only "right" when we're searching forward, correct?  (Assuming L2r text.) 
It's really wordBreakNext?  And likewise for wordBreakLeft?  Or am I confused there?

+             if (IS_NBSP_CHAR(nextc))
+               nextc = CHAR_TO_UNICHAR(' ');

Does this do any work in the conversion?  If so, maybe we should keep a local
variable for a unicode space.

I wonder if it's worth checking for word break right before making the range? 
If we're in whole-word mode we fail that test, we're going to throw away the
range, so maybe we should do the test a little earlier, right after "found a
match!" and right before we make the range, so we don't have to make it then
throw it away later.  Or is there a reason we need the range first?

Comment 29

14 years ago
+#ifdef XP_MAC
+#define IS_NBSP_CHAR(c) (((unsigned char)0xca)==(c))
+#else
+#define IS_NBSP_CHAR(c) (((unsigned char)0xa0)==(c))
+#endif

This looks like it was copied from the spellcheck code, but, as far as I know,
NBSPs are not different on Mac. See, for example,
http://lxr.mozilla.org/seamonkey/source/editor/libeditor/html/nsHTMLEditRules.cpp#86

Comment 30

14 years ago
I agree with Simon; that #ifdef seems bogus for use within gecko.  I'm not 100%
sure on the usage within extensions/spellcheck but it isn't right for Find.

Comment 31

14 years ago
> +  , mIterOffset(0)
> +  , mWordBreaker(0)
>
> Please swap these two lines.  Picky gcc gives a warning if the order of
> initialization doesn't match the order of declaration.

OK - done.  mWordBreaker wasn't initialized before, and maybe it doesn't really
need to be since it is an nsCOMPtr.  I assume that nsCOMPtr's constructor sets
the pointer to null, but I figured it couldn't hurt to be more clear.

> + PRBool wordBreakRight;
> 
> It's only "right" when we're searching forward, correct?  (Assuming L2r text.) 
> It's really wordBreakNext?  And likewise for wordBreakLeft?  Or am I confused
> there?

Full ACK.  Changed in my copy.

> +             if (IS_NBSP_CHAR(nextc))
> +               nextc = CHAR_TO_UNICHAR(' ');
>
> Does this do any work in the conversion?  If so, maybe we should keep a local
> variable for a unicode space.

CHAR_TO_UNICHAR is just a type cast.  Also, I am not sure if the XP_MAC check is
necessary.  I just copied it out of the spellcheck code, but I did just notice
the #define NBSP_CHARCODE in nsFind.cpp, line 582 - it just uses char code 160
(0xA0).

This issue would not be an issue if nsSampleWordBreaker::GetClass correctly
handled the nbsp (160).  It is identified as kWbClassAlphaLetter in line 105 of
nsSampleWordBreaker.cpp in a catch-all else branch.

As for the placement of the last word break check, I see no reason that it
couldn't be moved up before the range check.  I will try changing it and
incorporate it into my next patch if it works.

Comment 32

14 years ago
Created attachment 126888 [details] [diff] [review]
Updated patch (still slightly buggy)

Incorporates changes in comment #28, but when doing whole word searches for
"mozilla" in the regression testing page produces odd results.	It catches most
of the mozillas, but not all.  Here's the weird thing: it misses different ones
each time I close and re-open Mozilla.	Feedback from others?
Attachment #126858 - Attachment is obsolete: true

Comment 33

14 years ago
>This issue would not be an issue if nsSampleWordBreaker::GetClass correctly
>handled the nbsp (160).  It is identified as kWbClassAlphaLetter in line 105 of
>nsSampleWordBreaker.cpp in a catch-all else branch.

I disagree if you are suggesting that the nbsp be treated like a space or word
breaker.  The whole point of a non-breaking space is for it to *not* break
words; it should be treated as a letter.

If I have this text:  This is&nsbp;a test
It will look like:    This is a test
But "is a" should be treated as one word.

Comment 34

14 years ago
The non-breaking part of nbsp refers to not breaking a line between words. 
http://www.unicode.org/reports/tr14/ supports this definition.  

Comment 35

14 years ago
Yeah, I don't think the user expects   to behave like a non-space letter
for finding. As far at the user is concerned, it should behave like a space.

Comment 36

14 years ago
When this moves forward, we can experiment with whether it would be a good
default for typeaheadfind/ find as you type.

In usability studies it was found that people expected find as you type it to
match only at the beginnings of words.

Comment 37

14 years ago
Created attachment 127220 [details] [diff] [review]
Fully functional (I think)

The problems with the previous patch lie in setting nextc and prevc from the
t1b and t2b arrays.  If c is at the end of the node, the nextc and/or prevc
will be outside the array boundaries.

It was easy enough to fix this for prevc: I just set prevc = c before setting c
in each iteration.  But for nextc, it is a bit more tricky.  If c is the last
character in a node, then nextc must be retrieved from the next visible node -
I used NextNode for this.

NextNode changes the member variables, so I wrote a new function that saves the
member variables, gets the next char from the next node, and then restores the
member variables.  I don't know if this the best way to implement this, so I am
open for comments.  I think the code is fully functional - I have tested it
quite a bit, but more testing is, of course, appreciated.
Attachment #126888 - Attachment is obsolete: true

Comment 38

14 years ago
Nice.  I don't have any problem with the new function -- it seems like a
reasonable approach, and makes the code easy to read.  I'm not as happy, though,
about it using the member variable mIterNode to iterate, knowing that it's going
to be restored to its previous value.  I wonder if we might be better off
rewriting NextNode to return the new node and offset, and its other customers to
do NextNode(mIterNode, &mOffset) if that's what they want; then
GetCharFromNextNode can do its work without having to save/restore mIterNode and
friends.  That would also make GetCharFromNextNode shorter and simpler.

Can the rest of GetCharFromNextNode be factored?  I guess it would take a fair
amount of rewriting of the main Find loop to group all that into one place, so
that probably isn't worth the effort or risk.  Oh, well.

Very minor question: I found myself wondering about the edge cases: what happens
if the word happens to match at the end of the document, and there's no next
node?  It seems to do the right thing in the few cases I tested (probably it's
returning additional nodes which have no text in them, so it's satisfying the
word break condition that way).

Comment 39

14 years ago
Yes, GetCharFromNextNode will return 0 if there isn't any more nodes, and 0 will
satisfy word break in all reasonable conditions.

> I wonder if we might be better off
> rewriting NextNode to return the new node and offset, and its other customers to
> do NextNode(mIterNode, &mOffset) if that's what they want; then
> GetCharFromNextNode can do its work without having to save/restore mIterNode and
> friends.  That would also make GetCharFromNextNode shorter and simpler.

I agree.  Maybe we should create a new function called AdvanceNode and use it in
all the places that NextNode is used.  Change NextNode so that it returns
iterNode and offset and then AdvanceNode would just call NextNode and set the
member variables with NextNode's returns.  This would basically turn AdvanceNode
into a wrapper that updates member variables, and "Advance" clearly indicates in
its name that it is changing the object's state.  GetCharFromNextNode would
still call NextNode, but their wouldn't be any need to save the iterator and offset.

Comment 40

14 years ago
That sounds like a good plan ... should cause minimal disruption to the rest of
the code.

Comment 41

14 years ago
Created attachment 127767 [details] [diff] [review]
Code uses AdvanceNode now.

Check my usage of nsCOMPtr's in NextNode, AdvanceNode, and GetCharFromNextNode.
 If there are no incorrect usages, then maybe this is the final patch.
Attachment #127220 - Attachment is obsolete: true

Updated

14 years ago
Attachment #127767 - Flags: review?(akkana)

Comment 42

14 years ago
Sorry I've been delaying on this.  I've looked at it, and I keep getting partway
through it and thinking that having the extra routine ends up being more
complicated than the last patch since we end up having to save and restore
mIterNode sometimes anyway, and maybe we were better off with the last one.

Re the use of nsCOMPtr: generally we try to avoid using nsCOMPtr as function
arguments, at least in editor code; it adds a level of complexity, and IIRC it
sometimes adds extra addrefs that don't need to be there.  Generally it's better
for the called method to just take a pointer (e.g. nsIContentIterator**
aIterator), and be explicit (in a comment in the .h file, preferably) about
whether or not it addrefs its argument before returning, and then the caller
uses an nsCOMPtr<nsIContentIterator> and calls it with getter_addRefs() or
getter_doesntAddRef() as appropriate.  (Usually, doing the addref is best.)

So anyway, I'm halfway through looking at the new code and modifying it to do
this, but I keep thinking we might be better off just going back to the previous
simpler patch since this doesn't really avoid the pushing/popping anyway.

Comment 43

14 years ago
Ben Howe did some really good work writing this feature, but it hasn't been
added to the tree.  The last comment from Akkana says that we should probably go
back to the previous patch (the one without AdvanceNow).

Could someone review the 7/07/03 patch so we can add this feature?  The work has
already been done, it's a shame that we haven't been able to get this in.

Comment 44

13 years ago
*** Bug 243500 has been marked as a duplicate of this bug. ***
*** Bug 252262 has been marked as a duplicate of this bug. ***

Comment 46

13 years ago
What would it take to get one of these patches reviewed/checked in?  Even
though, it's not a frequenty requested feature, I'd hate to see this go to waste.

Comment 47

13 years ago
Yes, you're right.  I'm sorry.  I was sort of waiting for a comment expressing a
preference for one patch or the other, and then it fell off my radar, and I
should have just picked one and reviewed it.  I'm going to pick the previous one
("Fully functional (I think)") unless someone thinks there's something in the
newer one which is better.

I may change function arguments that use nsCOMPtr.

Unfortunately, the CVS change from pserver to ssh has made it impossible for me
to cvs update or do a cvs diff, and bugzilla's diff doesn't show the last patch.
 I've sent in my key for cvs, and will meanwhile get started looking at the
diffs to refresh my memory.  They looked pretty good the first time, so I expect
we should be able to get them in (if I want it done some other way, then I
should dang well write that part. :-)
Status: NEW → ASSIGNED

Comment 48

13 years ago
akk: you can use cvs -d :pserver:anonymous@cvs-mirror.mozilla.org:/cvsroot in
the interim.

echo ':pserver:anonymous@cvs-mirror.mozilla.org:/cvsroot' > /tmp/MyRoot
find . -type d -name CVS -exec cp /tmp/MyRoot {}/Root;

you can also forcibly commit w/ cvs -d
akkzilla%shallowsky.com@cvs.mozilla.org:/cvsroot commit filelist

later if you decide you don't mind using cvs-mirror for your tree :).

Comment 49

13 years ago
Created attachment 156045 [details] [diff] [review]
Fully functional (I think) - 1 year later

I just pulled the trunk today and tried to reapply the patch.  Some interfaces
have changed over time, so I have updated to patch to reflect the changes that
happened since I wrote the original patch.

Updated

13 years ago
Attachment #127767 - Attachment is obsolete: true

Updated

13 years ago
Attachment #127767 - Flags: review?(akkzilla)

Updated

13 years ago
Attachment #156045 - Flags: review?(akkzilla)

Comment 50

13 years ago
Does your patch have any effect on bug 240932?

Comment 51

13 years ago
After doing a cursory test on the attached test case for bug 240932, this patch
does not seem to affect bug 240932.  I get the same results as described when
searching for x1, and when I search for the whole word 'x123', I find 4 matches
(all possible matches except for the grayed-out text box).

Here is something even more disconcerning, though.  Whey I type a string into
the 1 line textbox, such as 'abc', searching for the whole word 'x123' yeilds
only first result - 'x<p></p>123'.  It does not find the others.

Changing the form input at all from its default values actually seems to break
the "find whole word only" option, but this does not happen on the Regression
Testing Find In Page.  I don't know what would be causing this.  Maybe the fact
that in the test case for bug 240932, the input items are not in <form> tags
might have something to do with this.

Comment 52

13 years ago
Comment on attachment 156045 [details] [diff] [review]
Fully functional (I think) - 1 year later

r=akkana

I agree that bug 240932 has some strange behavior, and also that finding in
text controls occasionally does odd things; but I couldn't make it generate any
behavior with this patch that was worse than the behavior before the patch
(even in a text field), so I don't see any reason to block this much-delayed
patch any further.

I have some very minor changes in the version that's in my tree: changed
"varaibles" to "variables" in two places, shortened a few lines >80 columns
(including some which were there before your patch), and added the following
comment at the beginning of GetCharFromNextNode because I know I'll forget why
it needs to be there (let alone other people):
//
// Save the iteration variables, get the next character,
// then restore the variables.
// XXX It would be nice if NextNode/PrevNode could do a "peek"
//     without changing the pointers.
//

Does that comment seem to describe it well enough, or can you suggest something
better to put there?

No functional changes, so I won't bother to attach a new patch unless someone
wants it.

Simon, are you still doing sr's?  You and Kin are the only SRs familiar with
the find code, but I can pass it off to someone else for sanity/overflow
checking if you don't have time.
Attachment #156045 - Flags: superreview?(smfreeman)
Attachment #156045 - Flags: review?(akkzilla)
Attachment #156045 - Flags: review+

Comment 53

13 years ago
I have developed some knowledge of the find code (when I implemented find in
text controls) and can help with sr if sfraser or kin are unavailable.

Comment 54

13 years ago
Akkana, your comment seems to describe it well.

Comment 55

13 years ago
Comment on attachment 156045 [details] [diff] [review]
Fully functional (I think) - 1 year later

Good point, rbs!  Can you sr this patch?
Attachment #156045 - Flags: superreview?(smfreeman) → superreview?(rbs)

Comment 56

13 years ago
Comment on attachment 156045 [details] [diff] [review]
Fully functional (I think) - 1 year later

sr=rbs, with the following comments:

+    PRBool wordBreakPrev;
+    if (mWordBreaker)
+    {
+      if (prevc == NBSP_CHARCODE)
+	 prevc = CHAR_TO_UNICHAR(' ');
+      mWordBreaker->BreakInBetween(&prevc, 1, &c, 1, &wordBreakPrev);
+    }
+
+    // Compare.  Match if we're in whitespace and c is whitespace, or if
+    // the characters match and at least one of the following is true...
+    //   a) we're not matching the entire word
+    //   b) a match has already been stored
+    //   c) the previous character is a different "class" than the
+    //      current character
+    if ( (c == patc && (!mWordBreaker || matchAnchorNode || wordBreakPrev)) ||

If you init wordBreakPrev, you can simplify the above into the
following (to avoid referring to a potentially uninitialized variable
in the test):

    PRBool wordBreakPrev = PR_FALSE;
    if (mWordBreaker)
    ...
    if ( (c == patc && (matchAnchorNode || wordBreakPrev)) ||

==========================

+	 // Check for word break (if necessary)
+	 PRBool wordBreakNext;
+	 if (mWordBreaker)
	 {
[...]
+	 }
+
+	 // If a word break isn't there when it needs to be, flag and reset
search
+	 if (mWordBreaker && !wordBreakNext)
+	   startParent = nsnull;
+	 else


Similar as above. The |if (mWordBreaker && !wordBreakNext)| makes me itchy. I
don't like 
to see a potentially uninitialized variable being used as a RHS value. Init
wordBreakNext to false.

==========================
+    // Make and set a word breaker here if searching for whole word
+    nsCOMPtr<nsIWordBreaker> wb;
+    if (mEntireWord) {
+      wb = theDoc->GetWordBreaker();
+      NS_ENSURE_SUCCESS(rv, rv);
+      mFind->SetWordBreaker(wb);
+    } else {
+      mFind->SetWordBreaker(0);
+    }

What a mouthful code. And rv is unset in NS_ENSURE_SUCCESS. 
Just simplify the whole thing into this one-liner (with a
resilient code even if the WB is null): 

     // Set a word breaker if searching for whole word
     mFind->SetWordBreaker(mEntireWord ? theDoc->GetWordBreaker() : 0);

=========================
     GetCharFromNextNode()

I am not a big fan of this function, but oh well, I can live with that. 
[I would think that one could recast the find code around a struct with 
{iterNode, offset, blockparent}, but that is another story, as your experience
with AdvanceNode has shown.]

Why do you need to save mIterator in GetCharFromNextNode?

====
Care also to add a comment to the section "Here begins the find code.", saying,
e.g.,

In entire word mode, the basic idea is to not start matching until some sort of
word break occurred, and don't
finish a match unless there was a word break char after the last character.
Attachment #156045 - Flags: superreview?(rbs) → superreview+

Comment 57

13 years ago
Comment on attachment 156045 [details] [diff] [review]
Fully functional (I think) - 1 year later

Oops... I am retracting my sr.

It just occurred to me that there is a hole in your patch (and this may explain
why it had troubles with text controls).

When you save mIterNode, it is not enough to restore the old value. Why?
Because the underlying nsFindContentIterator has moved... and so its internal
state is not in sync with the old node. Hence things go out-of-sync if you
overwrite the new node with the old node.

What you should do with mIterator is to re-PositionAt() the (old) node where
you want.

BTW, for correctness / bullet proofing, you may have to add:

nsresult
nsFindContentIterator::PositionAt(nsIContent* aCurNode)
{
  nsIContent* oldNode = mOuterIterator->GetCurrentNode();
+ Reset();
  nsresult rv = mOuterIterator->PositionAt(aCurNode);

It is something that I have been thinking of adding, but I have been wandering
elsewhere and hadn't come back to that yet.
Attachment #156045 - Flags: superreview+ → superreview-

Comment 58

13 years ago
You also need to sync view-selection-source (and others such as Find-As-You-Type
that use the find service -- it is a singleton and needs special caution)
http://lxr.mozilla.org/mozilla/source/xpfe/browser/resources/content/viewPartialSource.js#245
http://lxr.mozilla.org/mozilla/source/toolkit/components/viewsource/content/viewPartialSource.js#247

i.e., set .entireWord to false and cache/restore it. Otherwise, a user may set
something on the find dialog, another code overwrite it (via the singleton
service), and next time the user looks at the dialog, they will be surprised to
see something different. Same with FAYT which needs to set .entireWord to false. 

Updated

13 years ago
Blocks: 269442
Product: Core → Mozilla Application Suite

Comment 59

12 years ago
 . . . and a full year passes in complete silence.  Not to mention we're only 15
days from the bug's six-year anniversary.  Perhaps today's bug-a-thon would be a
good day to finish this off?  I've got UI changes for the find toolbar (as
opposed to the find dialog patched above).  But I haven't written JS to make the
new checkbox function, for two reasons: 1) nsIFind doesn't have that
functionality, and 2) I don't know enough C++ to fix it myself.  But I'd really,
really be happy to have "Whole Word Only" functionality back on Firefox 1.5.

Also, I have a question - why are we modifying nsIFind to add support for word
matching when nsIFindService[1] and nsIWebBrowserFind[2] both appear to have it
already?  findUtils.js uses nsIFindService,[3] and finddialog.js uses
nsIWebBrowserFind.[4]  Do we really need three different find APIs?  Is there
some difference between them that I don't understand?  Do they operate in
different contexts/projects?  Does one perform better or something?  My poor
liddle brain can't figgur it out. Help, speling scils failing. Cortex
overheting, breach immanent. Evacuate! Evac@#%$@TR$# . . . ATH0 NO CARRIER

[1] nsIFindService:
http://www.xulplanet.com/references/xpcomref/xpcomref1.0/nsIFindService.html
(sorry, couldn't find any native mozilla.org docs for this)

[2] nsIWebBrowserFind:
http://www.mozilla.org/projects/embedding/embedapiref/embedapi17.html

[3] findUtils.js
http://lxr.mozilla.org/mozilla1.8/source/toolkit/content/findUtils.js#106

[4] finddialog.js
http://lxr.mozilla.org/mozilla1.8/source/toolkit/content/finddialog.js#48

Comment 60

12 years ago
(In reply to comment #59)

> [1] nsIFindService:

nsIFindService is an internal service that simply stores global find state
between finds (i.e. the search string and settings).

> [2] nsIWebBrowserFind:
> http://www.mozilla.org/projects/embedding/embedapiref/embedapi17.html

This is the "real" find interface that JS and embedders should be dealing with.
Under the hood, this is an nsWebBrowserFind which uses an nsFind to do the work.

> [3] findUtils.js
> http://lxr.mozilla.org/mozilla1.8/source/toolkit/content/findUtils.js

This JS manages the find dialog, passing its settings into the nsIWebBrowserFind. 

> [4] finddialog.js
> http://lxr.mozilla.org/mozilla1.8/source/toolkit/content/finddialog.

The dialog.

The "whole word" setting is exposed all the way through the APIs, but simply not
implemented at the lowest level (nsWebBrowserFind). The patch here implements
that, but failed superreview. So it needs more work.

Comment 61

12 years ago
Ah.  That makes sense - they're not duplicates, but stacked APIs that inherit
functionality from one another.

I sure wish I had the C++ skills to fix the problems with the nsIFind patch,
then.  I've really got to pick that up at some point.  Thanks for the cogent
response.
*** Bug 322897 has been marked as a duplicate of this bug. ***

Updated

11 years ago
Blocks: 354618

Comment 63

11 years ago
*cough*

Another year passes in silence broken only by the lone dupe report in August.  Happy seventh birthday, #14871.  Soon you'll be a fully grown bug, capable of smashing poorly defended Japanese cities before dying in terrible combat with a giant robot.

We came so close to fixing this back in '04.  Is Ben Howe still around?  Does he care any more?  Is the most recent patch still a workable base, or have the interfaces changed enough to require a rewrite?

Sigh.
Assignee: akkzilla → jag
Status: ASSIGNED → NEW
Priority: P3 → --
QA Contact: bugzilla
Target Milestone: Future → ---

Updated

8 years ago
Blocks: 481513

Comment 64

8 years ago
This *still* hasn't been fixed?  It's 2009 already!  10 years apparently since this bug was first introduced.  IE has had the "whole word only" function for how long now.  It really is a pain to search for a word, only to have to continously click "Find next" because it finds words that CONTAIN the real word you're looking for.
Moving to product: Firefox, for triage.
Assignee: jag → nobody
Component: UI Design → General
Product: SeaMonkey → Firefox
QA Contact: general
Duplicate of this bug: 515027

Updated

8 years ago
Duplicate of this bug: 518082

Comment 68

8 years ago
Could someone please give more info on the current status of this bug.

Updated

8 years ago
Duplicate of this bug: 528678

Comment 70

8 years ago
is this not a duplicate of Bug 269442 which was wontfix?

Comment 71

8 years ago
This is Ben here....I am still around hoping that somebody would roll this into the baseline, but I gave up long ago after many failed attempts at getting it rolled in.

Comment 72

8 years ago
This feature is rather important to implement. For example, Google searches for whole words by default. Everybody is used to it as the search results are exactly what the user wants. IE has this feature too.

The priority of this request should be changed to important. This is a long-time deficiency of Firefox which needs to be fixed. I know other people complaining about this.

Comment 73

7 years ago
Honestly, this should have been done long ago.  Firefox search bar feels like it's stuck in 1990 without this basic feature.  When I search for "IT" I don't want all the "italian", "itheration", etc words poping up before bringing me to the real word I am looking for.

Comment 74

7 years ago
Please someone mark as "blocks bug 565552" (unfortunately can't do that myself) - looks like there's hope!

Updated

7 years ago
Blocks: 565552
No longer blocks: 195485, 354618, 481513, 106961, 269442

Comment 75

7 years ago
Mike Connor was strongly opposed to this idea in bug 269442, seemingly based mostly on the number of people who want this vs the cost of the extra "clutter".

This is so backwards!...

Updated

7 years ago
Status: NEW → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 269442

Comment 77

7 years ago
Wow.  After being subscribed to this bug for so many years it's almost sad to see it die such an unceremonious death.

Updated

7 years ago
No longer blocks: 195485

Updated

6 years ago
No longer blocks: 565552
**********************************************************************
If you still want this bug to be fixed, please vote for bug 269442:

https://bugzilla.mozilla.org/page.cgi?id=voting/user.html&bug_id=269442#vote_269442
("vote" link in the Importance field at the top of bug 269442).
**********************************************************************


Moved 12 duplicates to the currently active version of this bug (bug 269442), to get the record correct there:

Bug 15055, bug 104046, bug 118118, bug 143180, bug 156517, bug 192934, bug 198922, bug 252262, bug 322897, bug 515027, bug 518082, bug 528678.

Bug 269442 itself already had 6 duplicates (including this bug):
Big 14871, bug 243500, bug 348310, bug 351430, bug 410103, bug 505684

So the total number of dupes is now 18.
At the time of this comment (2014-05-10), this bug 14871 has 40 votes, and bug 269442 has 39 votes (intersections possible).

Maybe one day it'll dawn on them that this ridiculous lack of basic functionality in the primary UI should be fixed, 15 years after this was first filed. Sad.

Comment 79

2 years ago
I spent about half an hour trying to figure out where the 'whole word only' option was in Firefox's search. I eventually ended up here, where I was shocked to find that not only is there no such option, but this was - and still is - intentional. WTF.

I created an account just to vote for this issue. This is ridiculous beyond belief. I've switched to other browsers just to get this before, and I will again today.

I feel insulted.
You need to log in before you can comment on or make changes to this bug.