Open Bug 32641 Opened 24 years ago Updated 2 years ago

Wildcard (or regexp/regular expression) searching

Categories

(SeaMonkey :: Find In Page, enhancement)

enhancement
Not set
normal

Tracking

(Not tracked)

People

(Reporter: netdragon, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: helpwanted)

When you search, it should make a list of all the matching words in a little 
window that you can individually click on. The Norton SystemWorks 2000 registry 
editor does this if you need to see an example.

When you press Enter in the search box, it doesn't do anything. Also, I don't 
know if you have done this already, but you should create a concordance of the 
page at some point using a tree: the root points to a number of children, each 
representing a different letter, and each of those points to children for the 
next letter. When you add a word, you create a child for each letter and then, 
at the last letter, add a linked list containing the word and where in the page 
it occurs. I did that in one of my programs, and it could create a concordance 
of a 20 MB file in less than a second (on my computer) and find all occurrences 
in less than a second. (It used a lot of memory, though.) If the person doesn't 
have enough memory for a certain page, the concordance could be dropped.
Also, a concordance is great for wildcard characters: *lo* would return 
Halo, Hello, Melon, etc., and Hell* would return Hello and Hell. For the first, 
all you have to do is return all children that have an l followed by an o. I 
think you should include wildcards in your browser search.
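
A minimal Python sketch of the letter tree described above, for illustration only. The names `TrieNode`, `Concordance`, `find` and `find_prefix` are hypothetical and come from neither the reporter's program nor the Mozilla code; a Python list stands in for the linked list of positions. It covers exact words and trailing wildcards such as Hell*. A leading wildcard such as *lo* would need a walk over the whole tree and is not shown.

```python
import re


class TrieNode:
    def __init__(self):
        self.children = {}   # letter -> TrieNode
        self.positions = []  # start offsets of words that end at this node


class Concordance:
    """Hypothetical letter-tree index of every word in a text."""

    def __init__(self, text):
        self.root = TrieNode()
        for match in re.finditer(r"\w+", text):
            self._add(match.group().lower(), match.start())

    def _add(self, word, offset):
        # Walk down the tree, creating one child per letter as needed.
        node = self.root
        for letter in word:
            node = node.children.setdefault(letter, TrieNode())
        node.positions.append(offset)

    def _descend(self, prefix):
        node = self.root
        for letter in prefix:
            node = node.children.get(letter)
            if node is None:
                return None
        return node

    def find(self, word):
        """Return the offsets of every occurrence of an exact word."""
        node = self._descend(word.lower())
        return list(node.positions) if node else []

    def find_prefix(self, prefix):
        """Answer a trailing-wildcard query such as Hell*."""
        prefix = prefix.lower()
        start = self._descend(prefix)
        results = {}
        stack = [(prefix, start)] if start else []
        while stack:
            word, node = stack.pop()
            if node.positions:
                results[word] = list(node.positions)
            for letter, child in node.children.items():
                stack.append((word + letter, child))
        return results
```

Words are lower-cased before indexing, so lookups in this sketch are case-insensitive; that is a simplification, not something the comment specifies.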
Rewriting bug to reflect last sentence of previous comment, which I think is 
the point boberb is trying to make :-). 

Gerv

Assignee: cbegle → matt
Component: Browser-General → Search
QA Contact: asadotzler → claudius
Summary: Find → RFE: Wildcard searching
Find on Page = XPApps. reassigning.
Assignee: matt → law
Component: Search → XPApps
QA Contact: claudius → sairuh
Oh, gee, you'll want regular expressions next!
Summary: RFE: Wildcard searching → [Find][RFE] Wildcard searching
For clarification, I meant the Search document. This is very easy to implement 
(for text files), believe me. I did it for a school project. I did it without 
using STL. Using STL would make it even simpler. :)
I do want regular expressions :-)
Setting target to M17.  This is really a thing for the "document text services" 
which is owned by somebody else but I'll take it for now.
Status: NEW → ASSIGNED
Target Milestone: --- → M17
Move to M20 target milestone.
Target Milestone: M17 → M21
nav triage team:

Not something we'll get to for beta1, marking nsbeta1-. Bill please reassign to 
the appropriate party.
Keywords: nsbeta1-
Marking nsbeta1- bugs as future to get off the radar
Target Milestone: --- → Future
Blocks: 106961
akkana, should this be yours...?
Assignee: law → akkana
Status: ASSIGNED → NEW
Keywords: helpwanted
A request for wildcard searching, as specified in the summary, would be mine
(and sounds like a fun project).  But the original request here seems to be for
a redesign of the find UI, which should belong to some front-end person, not for
adding wildcard searching in the backend.  Could the submitter (or anyone else
interested in this bug) please clarify?
mass moving open bugs pertaining to find in page/frame to pmac@netscape.com as
qa contact.

to find all bugspam pertaining to this, set your search string to
"AppleSpongeCakeWithCaramelFrosting".
QA Contact: sairuh → pmac
Akkana: This bug got a little overrun with requests :-) Please implement the
wildcard searching. I'm moving the list request to another bug. I would also
like it if ? in the search could mean exactly one character. \* and \? would
stand for the literal * and ? characters. Please don't do regexp, as that would
baffle the average user. Thanks.
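
One way to read the wildcard rules requested above, sketched in Python. The function name `wildcard_to_regex` is hypothetical, and treating `\\` as an escaped backslash is an added assumption that the comment does not mention. The non-greedy `.*?` is also a choice made here, so that a match stays as short as possible.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a *, ? wildcard pattern into a regular expression string."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        # \* and \? mean the literal characters (\\ is an assumed extension).
        if ch == "\\" and i + 1 < len(pattern) and pattern[i + 1] in "*?\\":
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "*":
            parts.append(".*?")   # any run of characters, as short as possible
        elif ch == "?":
            parts.append(".")     # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal text
        i += 1
    return "".join(parts)
```

With this translation, `re.search(wildcard_to_regex("all*run"), text)` would search for "all", any text, then "run". Because `.` does not match a newline by default, a match cannot span lines unless `re.DOTALL` is passed.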
New bug is bug 122061 for list of search returns.
*** Bug 37941 has been marked as a duplicate of this bug. ***
[RFE] is deprecated in favor of severity: enhancement.  They have the same meaning.
Severity: normal → enhancement
Summary: [Find][RFE] Wildcard searching → [Find] Wildcard searching
*** Bug 177034 has been marked as a duplicate of this bug. ***
Summary: [Find] Wildcard searching → Wildcard (or regexp?) searching
Depends on: 106590
*** Bug 197470 has been marked as a duplicate of this bug. ***
*** Bug 206404 has been marked as a duplicate of this bug. ***
Akkana: see Bug #206404.
I do not think it is a duplicate of this bug.
This one is mostly about "regexp" and at most a "list of found words".
Mine is about "grep"-ing (a grep result contains _all strings_ with the keyword).
*** Bug 244835 has been marked as a duplicate of this bug. ***
OS: other → All
Hardware: Other → All
Summary: Wildcard (or regexp?) searching → Wildcard (or regexp/regular expression) searching
Some comments on the wildcard vs. regular expression discussion.

Most of the duplicates of this bug wanted regular expressions. I am voting for
regular expressions instead of wildcards too 8-)

Look at searching in OpenOffice.org Writer (or even MS Word). They
support regular expressions too. The user can search the page in the simple way
(as it is now), and when he wants more, he can use a regular expression.

I think these two possibilities, "simple search" and "regular expression search",
are sufficient, and there is no need to add a third way of searching,
wildcards, which is, from the performance point of view, something
between simple search and regexp.

So I think we should change the topic to "Regular expression searching".
Regexp are evil for the average user. How about a checkbox:

[ ] Regular Expression search
User wants to do all*run to find "all ducks run", "all cats run". That as a
regexp would be all.*?run or something like that. An average Windoze user will
not know how to do that stuff. Regexp can be an option in the search box, but
shouldn't be a default.
(In reply to comment #24, #25)
>An average Windoze user will not know how to do that stuff

Yes. For the average user there is simple search, for the experienced user regular
expressions. See OpenOffice.org, see MS Word, they have it the same way.

> Regexp are evil for the average user. How about a checkbox:
> [ ] Regular Expression search

Yes, I think so.
I have described it in detail in bug #244835.
I will paste it here:

----
It would be nice to have a regular expression option when searching in
the page (this could concern not only Browser, but Editor and Mail too).

The "Find on the page" dialog should have a 4th checkbox, "Regular expression", and
when it is checked the search will use regular expressions.

MS Word and OpenOffice.org Writer have it this way! Most users use
classic search (simple text), but the search dialogs of MS Word and
OpenOffice.org Writer have a checkbox for regular expression search, so
experienced users can use it 8-)
Agreed - regexp should be an option in the dialog or in the Mozilla preferences.
And in addition to "list all found words" I would suggest "highlight all found
words", the same as "less" does. You just see all of them on the page.

I would propose "zoom to found" but it requires more complex programming. That
is, if you press "zoom", all blocks except those with found words are set to hidden.
The problem is - what to call a block :) For a table it would be the table row. For
lists - the LI. For text - the paragraph. Etc.
(In reply to comment #27)

> The problem is - what to call a block :) For a table it would be the table row. For
> lists - the LI. For text - the paragraph. Etc.

Not the table row, because tables are still used for page layout instead of as tables
of data, so that is a problem. For inspiration, see how OpenOffice.org does it below,
copied from another part of bug #244835 ;-)

/////////
There are some questions about how to handle the more problematic
cases, such as:

a) Moz<b>illa</b> <- should this be handled as Mozilla (one word) or as two words,
Moz and illa?

b) How to handle ^ and $ - probably as the beginning and the end of the paragraph
(OpenOffice.org does it this way) - but what exactly is a paragraph? Only the text
between <p> and </p> tags? Or everything that has display:block (which is
probably better)?
/////////
Re comment #28

a) Two words

b) Hmmmm... What do Openoffice and MS Word do? We'd probably want to cut it off
in each block or <br>
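
The two questions can be made concrete with a small Python sketch built on the standard `html.parser` module. Everything here is an assumption for illustration: the `BLOCK_TAGS` set stands in for "everything that has display:block" (a browser would consult the computed style instead), and the names `BlockTextExtractor`, `search_blocks` and `inline_separator` are hypothetical. Text is cut at each block boundary and at `<br>`, so ^ and $ apply per block. `inline_separator=""` treats Moz<b>illa</b> as one word, and `inline_separator=" "` treats it as two.

```python
import re
from html.parser import HTMLParser

# Assumed stand-in for "display:block"; not taken from any browser source.
BLOCK_TAGS = {"p", "div", "li", "td", "th", "h1", "h2", "h3", "pre", "blockquote"}


class BlockTextExtractor(HTMLParser):
    def __init__(self, inline_separator=""):
        super().__init__()
        self.inline_separator = inline_separator
        self.blocks = []     # finished blocks of text
        self._current = []   # text pieces of the block being built

    def _flush(self):
        text = "".join(self._current).strip()
        if text:
            self.blocks.append(text)
        self._current = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS or tag == "br":
            self._flush()    # a block boundary or <br> ends the current block
        else:
            self._current.append(self.inline_separator)

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self._flush()
        else:
            self._current.append(self.inline_separator)

    def handle_data(self, data):
        self._current.append(data)

    def close(self):
        super().close()      # lets the parser emit any buffered text first
        self._flush()


def search_blocks(html, pattern, inline_separator=""):
    """Run a regex against each block separately; ^ and $ anchor to the block."""
    parser = BlockTextExtractor(inline_separator)
    parser.feed(html)
    parser.close()
    regex = re.compile(pattern)
    return [m.group() for block in parser.blocks for m in regex.finditer(block)]
```

Searching block by block means a match can never span two blocks, which is one possible answer to the "cut it off in each block or <br>" suggestion.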


Implementation:
Perhaps the best way to implement simple search would be to convert it
transparently into a regexp, then we only need one search back-end. This would
also allow the regexp code to be thoroughly tested.

Simple search possibilities:

The * wildcard would be replaced with a regexp that matches a string of any length.
? would be replaced with a regexp that matches a single character.
Quoting would be implicit: "the dog" would return only "the dog", not "dog the".
AND and OR would be allowed as modifiers; they would also be converted into their
regexp equivalents.
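
A rough Python sketch of how the implicit quoting and the AND/OR modifiers might be turned into a single regexp. The function name `simple_query_to_regex` is hypothetical, and the meaning of AND (every term occurs somewhere in the searched text) is an assumption, since the comment does not define it. Mixing AND with OR and the wildcard translation are left out to keep it short.

```python
import re


def simple_query_to_regex(query):
    """Translate 'a AND b' or 'a OR b' into one regex; other text is literal."""
    if " AND " in query:
        terms = [term.strip() for term in query.split(" AND ")]
        # One lookahead per term: each must occur somewhere after the start.
        return "".join("(?=.*" + re.escape(term) + ")" for term in terms)
    if " OR " in query:
        terms = [term.strip() for term in query.split(" OR ")]
        return "|".join(re.escape(term) for term in terms)
    # Implicit quoting: the whole query is matched as literal text.
    return re.escape(query)
```

The AND form only tests whether the text qualifies and produces an empty match at the point where the lookaheads succeed, so a find-in-page back-end would still need a second pass to locate and highlight each term. Pass `re.DOTALL` if the terms may be on different lines.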
Blocks: 213567
Product: Core → Mozilla Application Suite
Blocks: 298127
Assignee: akkzilla → jag
Priority: P3 → --
QA Contact: pmac
Target Milestone: Future → ---
This bug report is registered in the SeaMonkey product, but has been without a comment since the inception of the SeaMonkey project. This means that it was logged against the old Mozilla suite and we cannot determine that it's still valid for the current SeaMonkey suite. Because of this, we are setting it to an UNCONFIRMED state.

If you can confirm that this report still applies to current SeaMonkey 2.x nightly builds, please set it back to the NEW state along with a comment on how you reproduced it on what Build ID, or if it's an enhancement request, why it's still worth implementing and in what way.
If you can confirm that the report doesn't apply to current SeaMonkey 2.x nightly builds, please set it to the appropriate RESOLVED state (WORKSFORME, INVALID, WONTFIX, or similar).
If no action happens within the next few months, we move this bug report to an EXPIRED state.

Query tag for this change: mass-UNCONFIRM-20090614
Status: NEW → UNCONFIRMED
Status: UNCONFIRMED → NEW
Ever confirmed: true
Assignee: jag-mozilla → nobody
Component: UI Design → Find In Page
QA Contact: keyboard.fayt
No longer blocks: 213567
IMO this bug was not handled properly because of the very unclear initial comment. Maybe it should be closed and refiled in a clearer, more modern form.

What people need is the ability to extend the search options:

1. Use alternatives: (regex|regular)
2. Use masks: 19[7-9][0-9]
3. Check context, including nearby words: (?<!sea)monkey [a-z]{0,20} banana

Regexp specials such as the different whitespace classes, modifiers, replacing, grouping, recursion, etc. are too exotic for common use. But regexps are the industry standard for advanced search and give almost "all one can want" in a single well-maintained library. So all of these options could be implemented at once via regexp search.
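
For illustration, the three patterns listed above can be tried with Python's `re` module. The sample sentence and the function name `run_examples` are made up for this sketch; a browser would use its own regexp engine, whose syntax may differ in details.

```python
import re


def run_examples(text):
    """Apply the three example patterns from the comment to a piece of text."""
    # 1. Alternatives: either word matches.
    alternatives = re.findall(r"(regex|regular)", text)
    # 2. Masks: any year from 1970 to 1999.
    masks = re.findall(r"19[7-9][0-9]", text)
    # 3. Context: "monkey" not preceded by "sea", then one short word, then "banana".
    context = re.search(r"(?<!sea)monkey [a-z]{0,20} banana", text)
    return alternatives, masks, context.group() if context else None


sample = ("A seamonkey and a monkey eating banana split; "
          "regex or regular, in 1969 and 1984.")
alternatives, masks, context = run_examples(sample)
print(alternatives)  # ['regex', 'regular']
print(masks)         # ['1984']
print(context)       # monkey eating banana
```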
It is incredible that this request was filed in 2000, 16 years ago, and yet the only change to this feature has been the recent addition of whole-words-only matching (a partial regex, i.e. "\bstring\b").

There are 3rd party plugins to do this:

https://addons.mozilla.org/en-US/firefox/addon/regex-find/ (bugged?)
https://addons.mozilla.org/en-US/firefox/addon/fastest-search/

But this is a basic feature and should be part of the browser. It doesn't take much RAM or space, so it's not a bloatware issue.

I suggest a two pronged solution:

1. Two interdependent check boxes. Checking one unchecks the other. Both can be unchecked, which is the default.
2. First box allows for unix style globs, second box allows for regexes.

https://en.wikipedia.org/wiki/Glob_(programming)
https://en.wikipedia.org/wiki/Regular_expression

For many purposes, globs will be sufficient and they are more user friendly than regexes. For some cases, only regexes will be sufficient. The glob feature can easily be implemented through the regex implementation.

One can add little nice things like small text "(what is this?)" with links to Wikipedia (the two above links), so users unfamiliar with the systems can learn to use them easily.

I hereby set a 100 USD bounty in Bitcoin for whoever implements this feature. To collect it, simply email me when the feature has been added to beta/nightly.

I raise @emil's bounty by 100 USD for whoever implements this feature.

IMHO it should be implemented similarly to https://addons.mozilla.org/en-US/firefox/addon/regex-find/
