Closed Bug 171073 Opened 17 years ago Closed 17 years ago

Need notion of grouping for search terms

Categories

(SeaMonkey :: MailNews: Message Display, defect)

x86
Windows 2000
defect
Not set

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: mscott, Assigned: mscott)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 1 obsolete file)

We need an internal way to group search terms together to allow mixed AND/OR
searches. This may result in a format change for our filters (if we expose the
internal change to filters) so it may make sense to get this on the radar before
we  fork rules.dat.

I need this behavior in order to land mail views. If I have a mail view which
consists of (Sender is Seth OR Sender is David) and I type some text into the
quick search bar like: (Subject is RedSox), I really need to represent this
internally as:

( ((Sender is Seth) OR (Sender is David)) AND (Subject is RedSox) )

when I go to search. 

That allows quick search to work with mail views which consist of multiple OR'ed
search terms. 

For expediency sake, I don't care if the filter and search code has UI for
exposing this type of behavior. I just need a internal way to represent it. The
UI can come later. 

I'm open to suggestions for adding grouping terms to the search code.
In a former life for the 4.x search code, I added a boolean expression tree to
the search code which seems to still be in use inside of mozilla. However it is
built up by parsing the operators left to right. 

I'm thinking of trying to create new dummy search terms which act as grouping
operators. i.e. a search term for 'Begin grouping' and a term for 'End Group'.
The disk representation could be a ')' but we use parentheses already, we may
want a different string to represent this on disk. 

Are there better alternatives than making these group delimiters search terms? I
guess our search queries are arrays of nsISupport objects so we could create a
new interface for a begin group block / end group block and just QI for those
when we perform the search....
Here's my current idea. 

If I add a dummy search term to start and end a group, I risk breaking a lot of
UI which iterates directly over a list of search terms and expects each search
term to map into a search term widget. 

I'm thinking about adding a field to a search term which says it denotes the
start of a block, and another field saying it denotes the end of a block. 

Then as we iterate over the search terms array in the evaluation code, we can
check for the start of a block, recursively build an expression tree for the
terms inside of the block and add them to our expression tree. 

This approach seems like it has the least drastic impact to the code. 

This field would also have to be written out when we save a search term to disk
as part of a filter or a mail view query.
Attached patch initial rough cut (obsolete) — Splinter Review
Here's an initial cut that works for some basic cases (like the one I run into
with mail views combined with quick searches). It's still a bit rough around
the edges and it re-arranges some fundamental local searching code (MatchTerm)
so we'll have to be careful if a patch along this lines goes in.
Attached patch updated patchSplinter Review
Attachment #100963 - Attachment is obsolete: true
Caveats:
1) Local searches. We would have to hook it up to the imap / search code
2) You can't use it on filters since this patch does not include code for
writing out the grouping attribute of a search term to disk. I'm not sure how we
would represent that
3) There is no UI exposure for this code. I think it would be extremely hard to
write a good search UI that allows you to group terms together for complicated
searches without destroying the usability of the dialog. However, if someone
wanted to try they could build off this patch.

This code does work for:
1) My case, mail views with multiple search criteria which need to be AND'ed
against quick search criteria. 

Comment on attachment 101150 [details] [diff] [review]
updated patch

r=naving

+    if (!pResult)
+	  return NS_ERROR_NULL_POINTER;
+	if (!pResult)
+		return NS_ERROR_NULL_POINTER;
+

can change it to 

NS_ENSURE_ARG_POINTER(pResult);
Attachment #101150 - Flags: review+
looks fine, I have a few nitty nits:

could you fix the function args to use the aFoo convention?

+	static nsMsgSearchBoolExpression * AddSearchTerm (nsMsgSearchBoolExpression *
aOrigExpr, nsIMsgSearchTerm * newTerm, PRBool EvaluationValue);

+nsMsgSearchBoolExpression *
nsMsgSearchBoolExpression::AddSearchTermWithEnc(nsMsgSearchBoolExpression *
aOrigExpr, nsIMsgSearchTerm * newTerm, char * encodingStr)
+nsMsgSearchBoolExpression *
nsMsgSearchBoolExpression::AddSearchTerm(nsMsgSearchBoolExpression * aOrigExpr,
nsIMsgSearchTerm * newTerm, PRBool evalValue)
+nsMsgSearchBoolExpression *
nsMsgSearchBoolExpression::AddExpressionTree(nsMsgSearchBoolExpression *
aOrigExpr, nsMsgSearchBoolExpression * expression, PRBool boolOp)

use ? operator here:
return (newExpr) ? newExpr : aOrigExpr;
+  if (newExpr)
+    return newExpr;
+   return aOrigExpr;   // in case we failed to create a new expression, return self

AddSearchTermWithEncoding is more readable than AddSearchTermWithEnc 

Comment on attachment 101150 [details] [diff] [review]
updated patch

sr=bienvenu
Attachment #101150 - Flags: superreview+
I made David's suggestions.

For the future, if someone decides they need to write grouped terms to disk,
here's an email I sent david:

 I guess I just meant that we already use paraentheses to group individual
search terms and to use them again to also denote grouping might confuse the
search term parser.

If a typical search term looks like this:
condition="OR (from in ab,is in ab,moz-abmdbdirectory://abook.mab) OR (date,is
before,11-Sep-2002)"
then we could insert our notation just before the boolean operator for the begin
group and just after the search term data for the end group:

condition="< OR (from in ab,is in ab,moz-abmdbdirectory://abook.mab) OR (date,is
before,11-Sep-2002) >"

Then modify the parser to look for these characters in the right place and set
the boolean flags on the appropriate search term.

-Scott

David Bienvenu wrote:

>
>
> Scott MacGregor wrote:
>
>> Doesn't mean we can't. I just didn't implement it since I didn't need that
functionality and we don't have a UI that would let a user create a persistant
filter rule that would use it either.
>>
>> If you have ideas on how we should write it to disk, I would like to here
them. We already use parantheses for organizing search terms so we may not be
able to use that character. Maybe just a BG, EG notation inside the search term.
>
> That comment about parentheses confused me - I thought we did use parentheses
for grouping, so that adding more parenthesis would just do more grouping. But
yeah, we could use something like < > or { }
this has been checked in. 
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
marking verified.
Status: RESOLVED → VERIFIED
Product: Browser → Seamonkey
*** Bug 59821 has been marked as a duplicate of this bug. ***
(In reply to comment #9)
> I made David's suggestions.
> 
> For the future, if someone decides they need to write grouped terms to disk,
> here's an email I sent david:
> 
>  I guess I just meant that we already use paraentheses to group individual
> search terms and to use them again to also denote grouping might confuse the
> search term parser.
> 
> If a typical search term looks like this:
> condition="OR (from in ab,is in ab,moz-abmdbdirectory://abook.mab) OR (date,is
> before,11-Sep-2002)"
> then we could insert our notation just before the boolean operator for the begin
> group and just after the search term data for the end group:
> 
> condition="< OR (from in ab,is in ab,moz-abmdbdirectory://abook.mab) OR (date,is
> before,11-Sep-2002) >"
> 
> Then modify the parser to look for these characters in the right place and set
> the boolean flags on the appropriate search term.

Would it be acceptable to completely revamp the existing msgFilterRules.dat
syntax? The current layout with the OR or AND operator between each term is
rather wasteful since the operator is always the same for existing filters.
(I.e., the UI only lets you choose All / Any once, and all the terms in the
filter get the same operator.) Instead of
  OR (foo,is,bar) OR (x,is,y) OR (a,is,b)
it should just be something like
  OR (foo,is,bar)(x,is,y)(a,is,b)

It might be a good idea to adopt something like the LDAP filter string notation:
  (|(foo,is,bar)(x,is,y)(a,is,b))

because then grouping/nesting is also explicitly provided for.

Then the example from comment #0 is
  (&(|(Sender,is,Seth)(Sender,is,David))(Subject,is,RedSox))
 
Also instead of having to explicitly define complementary operators (e.g.
contains, doesn't contain) you can just use a NOT operator:
   (!(Subject,is,RedSox))

Getting rid of the various is/is not complements would streamline the UI a
little, but it may be too drastic a change given the way things work today.
Howard, this bug is fixed; other RFEs should be reported separately. (If you 
report an RFE for a graphical interface, <http://groups-beta.google.com/
group/netscape.public.mozilla.mail-news/msg/3720f15f2d8d80a1?hl=en&fwc=1> 
may be useful.)
(In reply to comment #14)
> Howard, this bug is fixed; other RFEs should be reported separately. (If you 
> report an RFE for a graphical interface, <http://groups-beta.google.com/
> group/netscape.public.mozilla.mail-news/msg/3720f15f2d8d80a1?hl=en&fwc=1> 
> may be useful.)

OK. See bug 297582. And thanks for the pointer to that post, it looks good to me.
Blocks: 297852
Component: MailNews: Search → MailNews: Message Display
QA Contact: laurel → search
You need to log in before you can comment on or make changes to this bug.