37941 - [RFE] Regular Expression Searches

The JS RegExp stuff can be exposed. Some of it is. You'll need a JSContext*, which you can get from an nsIScriptContext, which you can get from a DOM global object. Where would the call(s) to compile and test a regexp come from? Here's a question: does the search have to convert the entire document into a string to find a match? Or is there an iterator that can be used character by character? /be

R.K.Aa.

Comment 10

•

23 years ago

*** Bug 118507 has been marked as a duplicate of this bug. ***

sairuh (rarely reading bugmail)

Comment 11

•

23 years ago

mass moving open bugs pertaining to find in page/frame to pmac@netscape.com as qa contact. to find all bugspam pertaining to this, set your search string to "AppleSpongeCakeWithCaramelFrosting".

QA Contact: sairuh → pmac

Akkana Peck

Comment 12

•

23 years ago

I have bug 32641, to implement simple wildcard searching. The request in that bug asks specifically that full regexps NOT be implemented. We should choose one of these two options, and resolve one of these bugs as a dup of the other. I should probably own the resulting bug, unless Simon specifically wants this one. Simon (or anyone on the cc list), what do you think? Wildcard or regexp? And please feel free to reassign this one to me, or dup it to 32641. Unfortunately, we can't just plug in a regexp library, as our searches have to be able to span multiple DOM nodes while iterating backward and forward in the DOM and skipping over invisible nodes.

Boris Zbarsky [:bzbarsky]

Comment 13

•

23 years ago

I have to say that I find regexp searching infinitely more useful than wildcard searching. I also agree (unfortunately) that Brian has a point in bug 32641 -- wildcard searching may be a lot more likely to be understandable to the average "power user" who is used to dealing with ? and * in shells. I was wondering whether it would be possible to make the code flexible enough that the matching engine could be swapped out (so that someone could write a "regexp search" xpi) while keeping it fast (and whether we really care, I guess).

Boris Zbarsky [:bzbarsky]

Comment 14

•

23 years ago

I've been thinking some more.... For a typical document with text (which is what one mostly views with a web browser) the only wildcard that's really useful is "?"... "*" would have a strong tendency to match something like half the document. (And it sounds like bug 32641 is asking for "*foo*" to act as the "\b\w*foo\w*\b" regexp, which is, imo, not at all intuitive even for someone used to wildcards.) Also, on Unix, regular expression searches are the standard for tools that manipulate text and allow searching -- wildcards are only really used by shells.
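To make the "*foo*" reading concrete, here is a minimal sketch of a wildcard-to-regexp translation. Only the mapping of "*foo*" to "\b\w*foo\w*\b" comes from the comment above; treating "?" as a single word character, and the function names `wildcard_to_regex` and `find_wildcard`, are assumptions made for illustration, not what bug 32641 specifies.

```python
import re


def wildcard_to_regex(pattern: str) -> str:
    """Translate a shell-style wildcard into a word-bounded regexp.

    Assumption: "*" means "any run of word characters" and "?" means
    "exactly one word character", so "*foo*" becomes \\b\\w*foo\\w*\\b.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(r"\w*")
        elif ch == "?":
            parts.append(r"\w")
        else:
            # Escape everything else so regexp metacharacters stay literal.
            parts.append(re.escape(ch))
    return r"\b" + "".join(parts) + r"\b"


def find_wildcard(pattern: str, text: str) -> list[str]:
    """Return every substring of text matched by the wildcard pattern."""
    regex = re.compile(wildcard_to_regex(pattern))
    return [m.group(0) for m in regex.finditer(text)]


if __name__ == "__main__":
    print(wildcard_to_regex("*foo*"))
    print(find_wildcard("*foo*", "a footnote about seafood"))
```

Under this reading "*foo*" selects whole words containing "foo", which is narrower than what a shell user might expect "*" to mean.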

m_mozilla

Comment 15

•

23 years ago

> Unfortunately, we can't just plug in a regexp library,
> as our searches have to be able to span multiple DOM
> nodes while iterating backward and forward in the
> dom and skipping over invisible nodes.

How does the literal text matching engine do it?

This might be Too Much Bloat, but what if a search command caused a plain text version to be assembled on the fly, with an associated table of relations between positions in the plain text version and locations in the document? The table would not need to be complete, only enough to be able to construct information to highlight "this much off the end of that node, these nodes, and that much off the start of the next node". The text equivalent and table would be built only when the search was initiated (and cached until some DHTML whatzit rendered the table obsolete). The plain text could be searched by any regex library which could return start/end of match indices, which would then be translated into a useful result relative to the actual page.

If the literal text search doesn't already do something like this, it *could* be doing something like this, and the whole thing could be very pluggable. One could have different options for search: literal text, wildcards, regular expressions, soundex, whatever.

-matt
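A rough sketch of the text-plus-table idea, under heavy assumptions: the document is stood in for by a plain list of strings (one per visible text node, in document order), and the names `build_index`, `locate`, and `search` are hypothetical. It is not how the find code works; it only shows how a match in the flattened text could be mapped back to (node, offset) pairs.

```python
import bisect
import re


def build_index(nodes):
    """Concatenate node texts and record where each node starts in the result."""
    starts = []
    pos = 0
    for text in nodes:
        starts.append(pos)
        pos += len(text)
    return "".join(nodes), starts


def locate(starts, pos):
    """Map a position in the flattened text to (node index, offset in node)."""
    i = bisect.bisect_right(starts, pos) - 1
    return i, pos - starts[i]


def search(nodes, pattern):
    """Return ((start_node, start_off), (end_node, end_off)) or None.

    The end offset is exclusive. Assumes nodes is a non-empty list.
    """
    text, starts = build_index(nodes)
    m = re.search(pattern, text)
    if m is None:
        return None
    start = locate(starts, m.start())
    if m.end() == m.start():
        return start, start  # empty match: start and end coincide
    # Locate the last matched character, then step one past it.
    end_node, end_off = locate(starts, m.end() - 1)
    return start, (end_node, end_off + 1)


if __name__ == "__main__":
    print(search(["Hello ", "wor", "ld!"], r"o\s*w\w+d"))
```

The table here is just the list of start positions, so it stays small; the cost that matters is rebuilding the flattened text whenever the document may have changed.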

Simon Fraser [no longer active]

Assignee

Comment 16

•

23 years ago

akkana: feel free to take this. I'd also strongly recommend that you future it :)

Akkana Peck

Comment 17

•

23 years ago

I'm going to dup this to bug 32641. Those arguing for regexps rather than wildcards, discuss it there where the pro-wildcard folks are.

Boris:
> I was wondering whether it would be possible to make the code flexible enough
> that the matching engine could be swapped out

Unfortunately not, it isn't really possible:

m_mozilla:
> what if a search command caused a plain text
> version to be assembled on the fly

That's what the previous version of find did, and that's why it was up to an order of magnitude slower on big documents. It's not reasonable on big documents. Part of the problem was that it had to be redone for every search, because we have no way of knowing whether the document changed since the last search.

In various attempts at rewriting this code, I tried several different approaches involving combining text from several text nodes together and then calling the built-in searches in our string classes (which would also have allowed for calling regexp comparisons), but the result wasn't fast enough, and I never came up with a satisfactory answer to the question of "How do you determine how many nodes you have to convert to plaintext before you have enough to call the pattern search on it?" I suppose you could just keep building the string as you iterate through the document, re-doing the regexp search each time.

*** This bug has been marked as a duplicate of 32641 ***
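The "keep building the string, re-doing the regexp search each time" fallback could look roughly like the sketch below. The list-of-strings stand-in for text nodes and the name `incremental_search` are assumptions for illustration only.

```python
import re


def incremental_search(nodes, pattern):
    """Append one node's text at a time and re-run the search on the buffer.

    Returns ((start, end), nodes_consumed) for the first match found,
    or None if the pattern never matches.
    """
    regex = re.compile(pattern)
    buffer = ""
    for count, text in enumerate(nodes, start=1):
        buffer += text
        # The whole buffer is rescanned on every step, so the total work
        # grows roughly quadratically with the amount of text consumed.
        m = regex.search(buffer)
        if m:
            return m.span(), count
    return None


if __name__ == "__main__":
    print(incremental_search(["abc", "def", "ghi"], r"cd"))
```

Besides the repeated rescanning, stopping at the first hit has a second problem: with an open-ended pattern such as `\w+`, the match reported after a few nodes may be shorter than the one a search over the full text would return.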

Status: ASSIGNED → RESOLVED

Closed: 23 years ago

Resolution: --- → DUPLICATE

Myk Melez [:myk] [@mykmelez]

Updated

•

20 years ago

Product: Core → Mozilla Application Suite

Categories: SeaMonkey :: UI Design, enhancement, P3

Tracking: Not tracked

Reporter: danielpeng

Assigned: sfraser_bugs

Keywords: helpwanted

Security: public
