Closed Bug 499806 Opened 15 years ago Closed 15 years ago

add simple query/enumerate w/ search term criteria filter to nsIMsgDatabase

Categories

(MailNews Core :: Database, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: Bienvenu, Assigned: Bienvenu)

References

Details

(Keywords: perf)

Attachments

(1 file, 4 obsolete files)

For js code that enumerates over a msg db looking for headers with certain properties, like gloda, the simple approach of getting an enumerator and looking at each hdr in turn can generate a lot of js garbage, and produce a lot of calls across the xpcom boundary. For example, when gloda is looking for unindexed messages in a folder, it can instantiate every message header in the folder just to find a couple headers. If we can push the search down into the db code, we can reduce the amount of js garbage and the number of calls across the xpcom boundary substantially.

I've gotten into situations where, if I leave the system idle for a bit, and try to use Thunderbird, Thunderbird won't be responsive, and it turns out js is garbage collecting msghdrs like crazy.
Keywords: perf
If it's not too feature creepy, gloda could actually benefit from being able to use the predicate to get a count of the number of messages it (probably) needs to index.  Right now, the folder indexing logic does one pass to count the messages, and then a second pass to index them.  We still would need the mechanism to not lock up the UI, of course...
Status: NEW → ASSIGNED
OS: Windows Vista → All
Hardware: x86 → All
I was thinking of adding methods like this to nsIMsgDatabase:

nsISimpleEnumerator GetFilterEnumerator(in nsIArray searchTerms);
boolean NextMatchingHdrs(in nsISimpleEnumerator enumerator, in long numHdrsToLookAt, in long maxResults, in nsIMutableArray matchingHdrs, out long numMatches);

the numHdrsToLookAt arg allows you to control the granularity of the search (I think it's safe to use at least 100 for this value). Maybe we don't need maxResults...if you passed in a null pointer for matchingHdrs, that would allow you to do the counting, using the out numMatches.

I need to extend the backend a teeny bit to allow you to specify a search term that only look at msg hdr properties, similar to the way you can search on an arbitrary message header.
Sounds great.
Attached patch wip (obsolete) — Splinter Review
this is limping along - I added a unit test to exercise the basic functionality. I need to clean it up, and extend the test to make sure that the chunking works. And I've introduced some line ending issues, which I need to find and fix...
Attached patch more wip (obsolete) — Splinter Review
this hg adds the test case, fixes an issue with chunking filtering, and adds reverse filtered enumeration (untested), because someone's probably going to ask for it :-)
Attachment #385950 - Attachment is obsolete: true
Attached patch proposed fix (obsolete) — Splinter Review
I'd love to get this into b3 so that gloda could use it (or so I could tweak my local version of gloda to use it :-) )
Attachment #386130 - Attachment is obsolete: true
Attachment #386165 - Flags: superreview?(bugzilla)
Attachment #386165 - Flags: review?(bugmail)
Attachment #386165 - Flags: review?(bugmail) → review+
Comment on attachment 386165 [details] [diff] [review]
proposed fix

Rich review at:
http://reviews.visophyte.org/r/386165/

Export of review follows:

on file: mailnews/base/search/public/nsIMsgSearchTerm.idl line 57
>     attribute ACString hdrProperty;

Since you are not reusing arbitraryHeader (why aren't you? these won't persist
correctly either way...) you need to update all the locations that try and
clone search terms by copying the attributs of nsIMsgSearchTerm across.

I think these two might be it:
http://mxr.mozilla.org/comm-central/source/mailnews/base/src/searchSpec.js#196
http://mxr.mozilla.org/comm-central/source/suite/mailnews/commandglue.js#1117


on file: mailnews/base/search/public/nsIMsgSearchTerm.idl line 110
>      * @param msg - msg to match db hdr property of.

You should not be using dashes to separate the parameter name from its
description in @param tags.  You do this a number of other places as well.

The syntax can be found here:
http://www.doxygen.org/commands.html#cmdparam


on file: mailnews/base/search/public/nsIMsgSearchTerm.idl line 116
>     /**
>      * Compares value.status with the folder flags of the msg's folder.
>      * @param msg   msgHdr whose folder's flag we want to compare.
>      *
>      * @returns     true if folder's flags match value.status, false otherwise.
>      */
>     boolean matchFolderFlag(in nsIMsgDBHdr msg);

Why are you adding folderFlag?  Are you correcting a lack of full coverage of
attributes a message can possess by the search mechanism?  Because I don't
think the gloda use-case needs this.

The only cases I can dream up where this would be reasonable would be in a
complicated search query where you want to use different sub-classes based on
the type of folder a message belongs to.  This would seem appropriate for
virtual folders or some form of filter that sees all messages going through
the system and does not scope things itself.

Gloda should not need to use FolderFlags since it can specialize things based
on the folder.


on file: mailnews/base/search/public/nsMsgSearchAdapter.h line 67
>     a == nsMsgSearchAttrib::JunkPercent || a == nsMsgSearchAttrib::FolderFlag);

I think IsStringAttribute wants to check HdrProperty instead; FolderFlag isn't
a string property, HdrProperty is.


on file: mailnews/base/search/public/nsMsgSearchCore.idl line 122
>     const nsMsgSearchAttribValue FolderFlag = 50;

mention nsIMsgSearchTerm::status?


on file: mailnews/base/search/public/nsMsgSearchCore.idl line 123
>     //50 is for showing customize... in ui headers start from 52 onwards up until 99.

should be "51 is for showing customize..."


on file: mailnews/base/search/public/nsMsgSearchCore.idl line 124
>     const nsMsgSearchAttribValue OtherHeader = 51;  /* for mail and news. MUST ALWAYS BE LAST attribute since we can have an arbitrary # of these... */

Please add a comment that it is okay to change this constant since the indices
for custom headers are never persisted.  This isn't obvious otherwise.


on file: mailnews/base/search/public/nsMsgSearchCore.idl line 231
>    _a == nsMsgSearchAttrib::FolderFlag || \

IS_STRING_ATTRIBUTE probably wants HdrProperty too.


on file: mailnews/base/search/src/nsMsgSearchTerm.cpp line 817
> NS_IMETHODIMP nsMsgSearchTerm::MatchHdrProperty(nsIMsgDBHdr *aHdr, PRBool *aResult)
> {
>   NS_ENSURE_ARG_POINTER(aResult);

You don't NS_ENSURE_ARG_POINTER(aHdr) here, although you do have one in
MatchFolderFlag's case...


on file: mailnews/base/search/src/nsMsgSearchTerm.cpp line 821
>   PRBool result;
> 
>   GetMatchAllBeforeDeciding(&result);
> 
>   nsCString dbHdrValue;
>   aHdr->GetStringProperty(m_hdrProperty.get(), getter_Copies(dbHdrValue));
>   PRBool result2;
>   nsresult rv = MatchString(dbHdrValue.get(), nsnull, &result2);
>   if (result != result2) // if we found a match
>     result = result2;
>   *aResult = result;

The names result and result2 and their usage is somewhat confusing which makes
it hard to see the bug in here.  Suggest renaming "result" to
"noMatchExpected" and "result2" to "stringsMatched".  (Even better for
comprehension would be if result was inverted so it could be "matchExpected".)

Truth Table using my proposed names...
noMatchExpected | stringsMatched | correct result | patch result
1 | 1 | 0 | *1* error
1 | 0 | 1 | *0* error
0 | 1 | 1 | 1
0 | 0 | 0 | 0

The truth table suggests XOR is desired, but XOR is not what is implemented...
I suggest:
*aResult = (noMatchExpected != stringsMatched);
(and lose the conditional)

It looks like this logic was cribbed from MatchArbitraryHeader, so I think
that logic is wrong in the DoesntContain/Isnt case as well.


on file: mailnews/db/msgdb/src/nsMsgDatabase.cpp line 2768
>   /**
>    * @param     aEnumerator - most likely, a filter enumerator
>    * @param     aNumHdrsToLookAt - if non 0, the number of hdrs to advance the
>    *                               enumerator before returning.
>    * @param     aMaxResults      - if non 0, the max results to return.
>    * @param     aMatchingHdrs    - if non null, array of matching hdrs.
>    * @param     aNumMatches        if non null, the number of matching hdrs.
>    *
>    * @returns   false, if done, true if more hdrs to look at.
>    */
> NS_IMETHODIMP
> nsMsgDatabase::NextMatchingHdrs(nsISimpleEnumerator *aEnumerator,

You already documented this in the IDL, there is no need to duplicate it here,
especially without the summary that says what the method does.


on file: mailnews/db/msgdb/test/unit/test_filter_enumerator.js line 49
>   // (folderFlag & Mail && folderFlag != ImapBox) &&

Why are you including the folderFlag terms here?  While this does test them
half-way, it doesn't verify they can fail.
Comment on attachment 386165 [details] [diff] [review]
proposed fix

Can you update the patch before I look at this? The general idea seems reasonable though.
Attachment #386165 - Flags: superreview?(bugzilla)
yes, I'm working on it - I've addressed all the comments - I'm just trying to get the unit tests passing...
Attached patch fix addressing asuth's comments (obsolete) — Splinter Review
re FolderFlag, yes, this was for completeness sake. I wanted the searchTerms to be as expressive as possible since some users may not have the flexibility of differing terms based on folder type.

Re IsStringAttribute/IS_STRING_ATTRIBUTE, the '!' changes everything :-) I added a comment, because I got it wrong the first time as well, and got rid of the duplicate method.

I added a comment as to why I used a new member for the hdrProperty - basically, to avoid confusion.

The match functions didn't need to use matchAllBeforeDeciding at all, from what I could tell, so I just removed that code. matchAllBeforeDeciding is for address fields, to know if we need to compare cc, bcc headers, as well as to, for example.
Attachment #386165 - Attachment is obsolete: true
Attachment #386730 - Flags: superreview?(bugzilla)
Attachment #386730 - Flags: review+
pinging for sr...I'd like to land this relatively soon after the tree re-opens.
Attachment #386730 - Attachment is obsolete: true
Attachment #389479 - Flags: superreview?(bugzilla)
Attachment #389479 - Flags: review+
Attachment #386730 - Flags: superreview?(bugzilla)
Attachment #389479 - Flags: superreview?(bugzilla) → superreview+
Comment on attachment 389479 [details] [diff] [review]
de-bitrotted patch

>+     // see nsMsgMessageFlags.idl and nsMsgFolderFlags.idl
>+    attribute unsigned long status;

nit: incorrect indent.

>+class nsMsgFilteredDBEnumerator : public nsMsgDBEnumerator
>+{
...
>+  nsCOMPtr <nsISupportsArray>    m_searchTerms;

I was going to complain about adding another nsISupportsArray usage, but seeing as m_searchTerms isn't used, then I think removing it would be a good idea ;-)

>+function setupGlobals()
>+{
>+  loadLocalMailAccount();
>+// Create a message generator
>+  gMessageGenerator = new MessageGenerator();

nit: incorrect indent

sr=Standard8
fix checked in, with nits addressed. I'll file follow-up bugs on making gloda and desktop search integration use the filtered enumerators.
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
 Bug 505307 and  Bug 505309 filed for making gloda and desktop search integration, respectively, use filtered enumerators.
Flags: in-testsuite+
Depends on: 505487
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: