Bug 550648: Gloda fulltext message search may have inefficient query structure when result set significantly exceeds LIMIT, resulting in slow faceted search response

Status: RESOLVED FIXED (opened 11 years ago, closed 11 years ago)
Categories: (MailNews Core :: Database, defect)
Tracking: (Not tracked)
People: (Reporter: asuth, Assigned: asuth)
Details: (Keywords: perf, Whiteboard: [gloda key][gs])
Attachments: (1 file, 4 obsolete files)
Searches that match a lot of things take a long time and seem to chew up a lot of CPU. The fundamental actions that need to take place are:

1) Get the document id of each match as well as the list of OFFSETS matched (to use as a score boost), ostensibly from the messagesText table but likely/hopefully purely from the doclists.
2) Retrieve the statically computed per-message interest factor.
3) Figure out the top LIMIT scoring messages.
4) Retrieve the message rows for those messages from the messages table.
5) Retrieve the message bodies for those messages from the messagesText table.

As the query currently stands, I suspect we are incurring #5 for messages that are discarded by the LIMIT. I intentionally had us do that on the assumption that we were already hitting the messagesText row in #1 and that, for the extra memory cost of holding onto the text, we could leverage the locality. Time constraints precluded verification, and my revised assumption is now that we aren't getting a locality boost, but I'll need to check the source to be sure (unless it gets surfaced in the EXPLAIN, but I don't think it does). If #5 is wasted, we should be able to gain it back. If we actually do have locality on our side, we can use #5 to provide us with #2; right now #4 provides us with #2.

As an aside: if we were willing to modify FTS3 to store static precedences in the doclists in the termlists, we could get a ridiculous performance win (at the cost of some disk bloat) and completely avoid wasted #4 and #5, because #1 would already provide #2, providing all we need for #3. The modifications would effectively create an FTS4, so I'm not sure upstream would be crazy about it, and I rather doubt we want a fork, but it is mad tantalizing in terms of the number of seeks and wasted processing we could avoid.
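A minimal sketch of the restructuring implied by steps 1-5, using Python's sqlite3 and an invented two-table schema (the table and column names here are stand-ins, not gloda's real schema): the fulltext match, scoring, and LIMIT happen in an inner query, so the body retrieval of #5 is only ever paid for rows that survive the LIMIT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Hypothetical schema loosely modeled on the description: a messages table
# with a precomputed per-message interest score ("notability" is an invented
# column name), plus a messagesText fulltext table keyed by the same id.
cur.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, notability INTEGER)")
cur.execute("CREATE VIRTUAL TABLE messagesText USING fts4(body)")
data = [(1, 5, "mozilla all-hands meeting"),
        (2, 1, "mozilla project news"),
        (3, 9, "all hands on deck at mozilla")]
for mid, notability, body in data:
    cur.execute("INSERT INTO messages VALUES (?, ?)", (mid, notability))
    cur.execute("INSERT INTO messagesText (docid, body) VALUES (?, ?)",
                (mid, body))

# Steps 1-3 run in the inner query (match, score, LIMIT); steps 4-5 (row
# and body retrieval) only ever touch the LIMIT survivors.
results = cur.execute("""
    SELECT m.id, m.notability, t.body
      FROM (SELECT mt.docid AS did
              FROM messagesText mt, messages msg
             WHERE mt.body MATCH 'mozilla'
               AND msg.id = mt.docid
             ORDER BY msg.notability DESC
             LIMIT 2) AS hits
      JOIN messages m ON m.id = hits.did
      JOIN messagesText t ON t.docid = hits.did
""").fetchall()
print(results)  # the two highest-notability matches: ids 1 and 3
```

Whether this actually wins depends on the real query plan, which is exactly what the EXPLAIN support added later in this bug is for.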
Comment 1 (Assignee) • 11 years ago
I think as part of addressing this we may also want to think about changing the phrase/individual words logic as exposed by the UI. I think I originally posed our choices of query as ("foo bar") versus ("foo" AND "bar"). Well, there's also a third choice; there's a NEAR predicate. It allows us to specify a required token proximity for intersection.
I actually exposed this but I probably forgot to tell people, here's a good example set of queries. I am assuming people have the phrase "mozilla all-hands" in their repositories. Do NOT type the single quotes in your search box, type what is in them.
- '"mozilla hands"': This should return basically nothing.
- '"mozilla NEAR/2 hands"': This should return exactly what we're looking for, although the choice of NEAR is obviously specifically chosen.
- 'mozilla hands': This should take a while, return more results, and have more of the results be scatter-shot.
Note that there will be serious caching at play; you will want to close the result tab and do a search for something else really common (that does not overlap in results) to try and force a GC and harsh the caches a little bit.
Based on some brief play and a better understanding of what is going on in FTS3, I think we definitely want to think about adopting use of an iterative deepening strategy and the impact on UI. (We have some refinement stuff already and more planned, but this would increase the universe of possibilities a bit more.) I don't think we should move on that before I am able to cut any gratuitous fat off, but it needs to be considered.
Also note that our JS scoring heuristic has the offsets available to it so it could perform NEAR-style boosting of its own once we get results back, but we have not done that.
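The three behaviors described above are easy to reproduce outside Thunderbird. A small sketch against SQLite's FTS4 via Python's sqlite3 (assuming an FTS-enabled build, which stock builds nearly always are; the table name and sample bodies are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE VIRTUAL TABLE msgs USING fts4(body)")
for body in ("mozilla all-hands next week",
             "mozilla ships early; many helping hands make light work",
             "hands off of my own mozilla build"):
    cur.execute("INSERT INTO msgs (body) VALUES (?)", (body,))

# Phrase query: tokens must be adjacent and in order, so nothing matches.
phrase = cur.execute(
    """SELECT body FROM msgs WHERE body MATCH '"mozilla hands"'""").fetchall()

# NEAR/2: at most two tokens may separate the terms (in either order), so
# only 'mozilla all-hands ...' matches ('all-hands' tokenizes to two tokens).
near = cur.execute(
    "SELECT body FROM msgs WHERE body MATCH 'mozilla NEAR/2 hands'").fetchall()

# Implicit AND: both terms anywhere in the document; all three rows match.
loose = cur.execute(
    "SELECT body FROM msgs WHERE body MATCH 'mozilla hands'").fetchall()
print(len(phrase), len(near), len(loose))  # 0 1 3
```

This mirrors the '"mozilla hands"' / 'mozilla NEAR/2 hands' / 'mozilla hands' progression from the search-box examples above.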
A brief straw-man implementation for n terms might be:
- Evaluate all terms in a NEAR/# configuration so they are all close together. [This primes the cache; all doclists should ideally no longer have disk seek costs after this point.]
- If we got any results whatsoever, dispatch the gloda ORM queries so we can get the data to display. Bail if we've hit our limit.
- If n > 2, create NEAR/# queries for each permutation of (n-1) terms intersected with the unpaired terms. For 'a b c' this means '(a NEAR b) AND c', '(a NEAR c) AND b', '(b NEAR c) AND a'. The idea is to try and puzzle out what groupings are actually meaningful without opening the floodgates.
- Dispatch results for gloda lookup, bail if we've hit our limit.
- Maybe relax the # we used in the previous cases, if n > 3 maybe try (n-2) tuple permutations.
- Consider automatically relaxing the NEAR requirement to just an AND requirement if the above did not do anything useful.
- Consider relaxing further still, from AND down to an OR requirement.
From a UI perspective, it might even make sense to indicate any groupings we used in their own facet. For example, if I search for 'david all hands', this might produce the following facets where parens are indicating bubbles expressing grouping, because I love me some bubbles:
- (david all hands) - query of 'david NEAR all NEAR hands'
- (david) (all hands) - query of 'david AND all NEAR hands'
- (david all) (hands) - etc
- (david hands) (all) - etc
In reality hopefully some of the weirder intersections would turn out to be proper subsets of the other ones and we could just throw them away. One nice thing about this style of presentation is that we could still be adding results (with statistics) even if we aren't updating the UI to avoid upsetting them. So the user could see what relaxing their query further would accomplish and get their results instantly if they click.
In terms of determining when to 'lock down' the UI presentation, I would initially suggest a time-based heuristic. We 'ship' with what we've got after 800ms (or if we've hit our limit already).
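The query-generation half of the straw man above is mostly string manipulation. A hypothetical generator for the successively relaxed query strings might look like this (the function name is invented, and whether the parenthesized grouping syntax is accepted depends on whether the FTS build uses the enhanced query syntax):

```python
from itertools import combinations

def deepening_queries(terms, near=2):
    """Yield successively looser FTS query strings: everything NEAR'd
    together first, then each (n-1)-term NEAR group AND'ed with the
    leftover term, then a plain AND of all terms. (Assumes distinct
    terms; a real implementation would work with term indices.)"""
    n = len(terms)
    glue = " NEAR/%d " % near
    yield glue.join(terms)
    if n > 2:
        for group in combinations(terms, n - 1):
            rest = [t for t in terms if t not in group]
            yield "(%s) AND %s" % (glue.join(group), " AND ".join(rest))
    yield " AND ".join(terms)

queries = list(deepening_queries(["david", "all", "hands"]))
for q in queries:
    print(q)
```

For 'david all hands' this yields the tightest query first, then the three (n-1) groupings, then the plain AND, matching the facet bubbles sketched above.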
Comment 2 • 11 years ago
> - '"mozilla hands"': This should return basically nothing.

Got me nothing back.

> - '"mozilla NEAR/2 hands"': This should return exactly what we're looking for, although the choice of NEAR is obviously specifically chosen.

This only gave me one answer: the bugmail from this bug. Is there anything I need to set up, Andrew, to have NEAR working? The last query gave me plenty of results.
Comment 3 (Assignee) • 11 years ago
(In reply to comment #2)
> > - '"mozilla NEAR/2 hands"': This should return exactly what we're looking
> > for, although the choice of NEAR is obviously specifically chosen.

I got overzealous with the quoting. 'mozilla NEAR/2 hands' is what I meant, so you end up typing no quote characters whatsoever.
Comment 4 • 11 years ago
http://gsfn.us/t/nsxs sounds like this bug. If I'm wrong, or other bugs are equally likely to be at fault or to help, please ping me. Nominating for wanted v3.1. Note that the response times posted there range from 35 seconds to minutes (although part of the initial posting was about bug 533865).
Updated • 11 years ago
blocking-thunderbird3.1: ? → ---
Flags: wanted-thunderbird+
Updated • 11 years ago
Summary: Gloda fulltext message search may have inefficient query structure when result set significantly exceeds LIMIT → Gloda fulltext message search may have inefficient query structure when result set significantly exceeds LIMIT, resulting in slow faceted search response
Whiteboard: [gloda key][gs] → [gloda key]
Comment 5 (Assignee) • 11 years ago
The major explanation of the logic change is in the code. I also have a blog post with control flow going up shortly; I'll reference it when that happens. This should result in major performance improvements. The change in the ranking logic feeding the SQL LIMIT may result in qualitative changes in results that could require some tweaking of various constants. The logic is now generally more consistent with the ranking heuristics used by the JS code, but it's not a perfect match due to limitations of the matchinfo() data exposed to us. It is within our capabilities to have the new ranking mechanism work more like the previous OFFSETS mechanism, in the name of consistency over intent.
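The actual ranker in the patch is a C++ mozIStorageFunction; as a rough illustration of the idea only (not the patch's real weighting), here is a toy rank function in Python that parses the default 'pcx' matchinfo() blob of 32-bit integers and weights first-column (e.g. subject) hits more heavily than body hits:

```python
import sqlite3
import struct

def rank(matchinfo_blob):
    """Toy ranker: decode the default 'pcx' matchinfo() blob and sum
    per-phrase, per-column hit counts, doubling hits in column 0."""
    ints = struct.unpack("%dI" % (len(matchinfo_blob) // 4), matchinfo_blob)
    n_phrase, n_col = ints[0], ints[1]
    score = 0.0
    for p in range(n_phrase):
        for c in range(n_col):
            # 'x' data: 3 ints per phrase/column; the first is the number
            # of hits in this row and column.
            hits = ints[2 + 3 * (p * n_col + c)]
            score += hits * (2.0 if c == 0 else 1.0)
    return score

conn = sqlite3.connect(":memory:")
conn.create_function("rank", 1, rank)
cur = conn.cursor()
cur.execute("CREATE VIRTUAL TABLE msgs USING fts4(subject, body)")
cur.execute("INSERT INTO msgs VALUES ('mozilla news', 'nothing here')")
cur.execute("INSERT INTO msgs VALUES ('weekly notes', 'mozilla mozilla mozilla')")
ranked = cur.execute(
    "SELECT subject, rank(matchinfo(msgs)) AS score FROM msgs "
    "WHERE msgs MATCH 'mozilla' ORDER BY score DESC").fetchall()
print(ranked)
```

Unlike OFFSETS, matchinfo() only reports hit counts per phrase and column, not positions, which is the limitation alluded to above.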
Comment 6 (Assignee) • 11 years ago
This adds the ability for us to get EXPLAIN output from our queries so we can debug them and find out what makes them sad and slow. Some explanation coming in the promised blog post and then I will transfer that or something better to devmo or wikimo once it lands.
Attachment #437261 -
Flags: review?(bienvenu)
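In the patch this goes through mozIStorage, but the underlying SQLite facility is easy to poke at directly, e.g. from Python's sqlite3 (the schema here is an invented stand-in): EXPLAIN QUERY PLAN gives the high-level summary, while bare EXPLAIN dumps the full VDBE program.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Invented stand-in schema; the point is the EXPLAIN machinery, not
# gloda's real tables.
cur.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, folderID INTEGER)")
cur.execute("CREATE INDEX messages_folder ON messages (folderID)")

# EXPLAIN QUERY PLAN: a high-level summary, e.g. index seek vs. full scan.
plan = cur.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM messages WHERE folderID = 7").fetchall()
for row in plan:
    print(row)  # the detail column should mention messages_folder

# Bare EXPLAIN: the full VDBE opcode listing, for when a query is
# mysteriously slow despite a plausible-looking plan.
opcodes = cur.execute(
    "EXPLAIN SELECT * FROM messages WHERE folderID = 7").fetchall()
print(len(opcodes), "opcodes")
```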
Comment 7 (Assignee) • 11 years ago
Note: I haven't done anything with the NEAR stuff mentioned above. Assuming this bug gets resolved by the provided patch, I will kick that stuff out into its own enhancement bug.
Comment 8 • 11 years ago
Comment on attachment 437260 [details] [diff] [review]
v1 rejigger the fulltext query to use matchinfo() and a ranking function and general optimization
http://hg.mozilla.org/comm-central/rev/d9b3acd186ab

Haven't run with this yet, but a few nits:

Typo ("happed"):
+ * - fulltext weighting by where the match happed works.

All in one line is nicer for grep-like searches here:
+NS_IMPL_ISUPPORTS1(
+  nsGlodaRankerFunction
+, mozIStorageFunction
+)

Maybe NS_ERROR_INVALID_ARG instead of NS_ERROR_FAILURE here and below?
+ if (nVal < 1)
+   return NS_ERROR_FAILURE;

Spaces around '=' (4 times):
+ for(PRUint32 iPhrase=0; iPhrase < nPhrase; iPhrase++){
+ // in SQ
+ for(PRUint32 iCol=0; iCol < nCol; iCol++){

Space after "for" and after ')', occurs a couple of times:
+ for(PRUint32 iPhrase=0; iPhrase < nPhrase; iPhrase++){
Comment 9 • 11 years ago
Comment on attachment 437260 [details] [diff] [review]
v1 rejigger the fulltext query to use matchinfo() and a ranking function and general optimization
http://hg.mozilla.org/comm-central/rev/d9b3acd186ab

r=me, modulo previous nits.
Attachment #437260 -
Flags: review?(bienvenu) → review+
Comment 10 • 11 years ago
I don't recall why I removed the reference to http://gsfn.us/t/nsxs, and stupid me didn't make a bug comment. Reinstating.
Whiteboard: [gloda key] → [gloda key][gs]
Comment 11 • 11 years ago
Comment on attachment 437261 [details] [diff] [review]
v1 debugging support that informed the fixit patch

This doesn't seem to apply, with or without the previous patch.
Attachment #437261 -
Flags: review?(bienvenu) → review-
Comment 12 (Assignee) • 11 years ago
This must have been one of those cases where the 3-line-context mq patch was fine but the 8-line context I provide in the patch snagged on something else in my queue; sorry about that. This is just on top of the tip.
Attachment #437261 -
Attachment is obsolete: true
Attachment #437657 -
Flags: review?(bienvenu)
Comment 13 (Assignee) • 11 years ago
Comment on attachment 437260 [details] [diff] [review]
v1 rejigger the fulltext query to use matchinfo() and a ranking function and general optimization

Pushed the fix proper to comm-central: http://hg.mozilla.org/comm-central/rev/d9b3acd186ab
Attachment #437260 -
Attachment description: v1 rejigger the fulltext query to use matchinfo() and a ranking function and general optimization → v1 rejigger the fulltext query to use matchinfo() and a ranking function and general optimization http://hg.mozilla.org/comm-central/rev/d9b3acd186ab
Comment 14 (Assignee) • 11 years ago
bustage-fix for comm-central against mozilla-central pushed: http://hg.mozilla.org/comm-central/rev/b4b2efe0a487
Comment 15 • 11 years ago
Comment 16 • 11 years ago
logHelper.js on the trunk seems pretty different from what you have in your tree/ patch queue...
Comment 17 • 11 years ago
Comment on attachment 437657 [details] [diff] [review]
v1b debugging support that informed the fixit patch de-rotted

Clearing review request since it didn't apply...
Attachment #437657 -
Flags: review?(bienvenu)
Comment 18 (Assignee) • 11 years ago
I'm a bit confused as to what's up, but this patch applies to my other comm-central tree that is free of haunting, probably just because the context is ratcheted down.
Attachment #437260 -
Attachment is obsolete: true
Attachment #437657 -
Attachment is obsolete: true
Attachment #437997 -
Attachment is obsolete: true
Attachment #438995 -
Flags: review?(bienvenu)
Updated • 11 years ago
Attachment #438995 -
Flags: review?(bienvenu) → review+
Comment 19 (Assignee) • 11 years ago
Comment on attachment 438995 [details] [diff] [review]
v1c just provide the straight patch for explain-support rather than using qdiff

Pushed the debugging support patch to comm-central: http://hg.mozilla.org/comm-central/rev/b507c951ebb8
Attachment #438995 -
Attachment description: v1c just provide the straight patch rather than using qdiff → v1c just provide the straight patch for explain-support rather than using qdiff http://hg.mozilla.org/comm-central/rev/b507c951ebb8
Updated (Assignee) • 11 years ago
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Comment 20 (Assignee) • 11 years ago
I added a gloda debugging page that calls out the new option, although the font is weird because of copy and paste: https://developer.mozilla.org/en/Thunderbird/Gloda_debugging
Comment 21 • 11 years ago
(In reply to comment #20)
> I added a gloda debugging page that calls out the new option, although the
> font is weird because of copy and paste:
> https://developer.mozilla.org/en/Thunderbird/Gloda_debugging

"mailnews.database.global.logging.net - Should we look for a /tmp/chainsaw.ptr file that tells us a hostname and port to connect to to send XML formatted log records to. The file should be of the form "hostname:port" like "localhost:1234"."

Is it looking for %tmp% on Windows?
Comment 22 (Assignee) • 11 years ago
Yes, but who uses Windows?