Closed Bug 815630 Opened 13 years ago Closed 12 years ago

Need option to disable query expansion

Categories

(Thunderbird :: Search, defect)

15 Branch
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 752844

People

(Reporter: firstpeterfourten, Unassigned)

Details

(Whiteboard: [global search tokenizer issue?])

Thunderbird needs an option to disable query expansion. Google used to have the plus operator (put + before a word to match that word exactly) and has since changed to using quotation marks for an exact match. Example use case: I am searching for e-mail from a person with last name "Williams" and have a lot of e-mail that mentions people with first name "William" that are totally irrelevant, but get included through query expansion that I can't figure out how to disable.
Better example: I'm searching for e-mail discussing "Howe" as in the Crowdsourcing author Jeff Howe, and it's lost in the noise of many e-mails that include the word "how." Lack of something to look for the requested search term as opposed to an unrelated common word means "search is broken."
Severity: normal → major
Status: UNCONFIRMED → NEW
Ever confirmed: true
WBT, thanks for filing your issue. Unfortunately, your issue is hard to understand because you do not follow the prescribed format:

Steps to reproduce
1 start here...
2 type this...
3 click there...

Actual Result
- this is what happens
- and also this

Expected Result
- this is what should happen
- plus this

From experience, bugs that do not follow that pattern will not get any attention, because they are too hard or even impossible to understand to begin with: they lack important information and crucial detail. The same applies to this bug. We have at least 3 major different types of search, so it's vital to state which type of search you're using (probably global search). With thousands of reports, please don't expect everyone to guess and work out all the details themselves before they can actually understand your issue. A more detailed description often leads half the way towards better analysis and understanding of the problem, so it's in your own interest to provide such detail.

E.g., for the case of global search:
- only "howe" returns all instances of "how"
- whereas "how@", where @ is any other vowel except e (e.g. howi, howo, etc.), will NOT return any instances of "how"

Perhaps the tokenizer considers "howe" a common spelling mistake for "how" (where the w and e keys are right next to each other), so howe will not even get listed as a separate token?
Severity: major → normal
OS: Windows 7 → All
Hardware: x86_64 → All
Whiteboard: [global search tokenizer issue?]
As a side note, like the reporter, I'd also like to know how I can search for an exact string match like "you put" that excludes things like "you're putting" or "you. Put", which global search currently returns regardless of whether the search term is enclosed in quotes.
OK, here's the other format. Others dislike when I use that format because it's not concise enough.

1. Open Thunderbird, pre-loaded with accounts that have a lot of mail.
2. In the upper left corner, click in the (global) Search box.
3. Type a query, e.g. a single word, which is similar to but not the same as another word common in your e-mail, e.g. "Howe."
4. Search results include a large number of irrelevant results not containing the search term, but rather something else that has a similar spelling.

Actual results:
5. Putting the query in quotes does not change the results.
6. Putting a plus sign in front of the word does not change the results.
7. It is not possible to find the target message/content among all the noise; the purpose of search cannot be achieved; search is broken.

Expected results:
4. Search results containing the search term, and no other results, are shown.
OR
5. Putting the query in quotes disables query expansion and returns only results containing the search term.

The second option in "Expected Results" is preferred because there is some value in having query expansion by default.
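To make the preferred expected result concrete, here is a minimal sketch of a search in which quotes switch off expansion. It is not Thunderbird code: `crude_stem`, `tokenize`, and `search` are hypothetical names, and the stemmer is a deliberately crude stand-in that drops one trailing "s" or "e". Matching is per term, so word order inside the quotes is not checked.

```python
import re


def crude_stem(token):
    """Hypothetical stand-in for a real stemmer: drop one trailing 's' or 'e'."""
    if len(token) > 3 and token[-1] in "se":
        return token[:-1]
    return token


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(query, messages, stem=crude_stem):
    """Return the messages that contain every query term.

    Unquoted queries compare stems (query expansion).
    A query wrapped in double quotes compares the literal tokens instead.
    """
    query = query.strip()
    exact = len(query) >= 2 and query[0] == '"' and query[-1] == '"'
    terms = tokenize(query)
    if not terms:
        return []
    normalize = (lambda t: t) if exact else stem
    wanted = {normalize(t) for t in terms}
    hits = []
    for msg in messages:
        present = {normalize(t) for t in tokenize(msg)}
        if wanted <= present:  # every wanted term occurs in this message
            hits.append(msg)
    return hits


messages = [
    "Jeff Howe wrote a book on crowdsourcing",
    "Here is how to reset it",
]
print(search("Howe", messages))    # expansion on: stems are compared
print(search('"Howe"', messages))  # expansion off: literal tokens are compared
```

With expansion on, both sample messages match because "howe" and "how" share a stem; with the quoted query only the message that literally contains "Howe" is returned. Keeping expansion as the default and treating quotes as the override mirrors the second option above.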
Here's another use case - search for "Julie" in your e-mail; get a lot of messages about "July."
Another use case: Setup as in comment 4.

Action:
1. Global search for "adapters"

Expected results:
2. Result list clearly identifies a recent e-mail about AC adapters for a computer.

Actual results:
2. Results highlighting "adaptation" [e.g. of music], [intelligent software] "adapted," "adapt" [e.g. an existing system to a new use], "adaptations" [of organisms], [how the brain] "adapts" [to injury], "adaptable" [procedure to open meetings], "adaptive" [sport], etc.
3. No indication of being able to find the desired e-mail with the queried keyword.
4. Putting the query in quotes or with a plus sign in front doesn't help.
5. Pressing "more" results in Bug 731612.
6. After much time and frustration, target e-mail is not found.

SEARCH IS BROKEN.
I agree. The problem appears to be rooted in a function called "stemming", which is useful in many cases but lacks an option to override it (like + or ""). Things are even more confusing for non-English languages. For example: trying to find a message from someone named "Sachs", Thunderbird displays a lot of search results containing "Sache" (meaning "thing" in German, a word that obviously appears in a lot of messages). On the other hand, if I enter "Sachen" (plural form of "Sache"; "things"), only results for "Sachen" appear, although one would expect results for "Sache" to be displayed as well.
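To illustrate how stemming could produce the collisions reported in this bug, here is a toy stemmer that applies only three Porter-style rules (plural "s" removal, final "y" to "i", and final "e" removal). Whether Thunderbird's global search applies exactly these rules is an assumption; `toy_stem` and its helpers are hypothetical names, not Thunderbird functions, and a full Porter stemmer has many more steps.

```python
VOWELS = set("aeiou")


def is_consonant(word, i):
    """Porter's rule: a, e, i, o, u are vowels; 'y' is a vowel after a consonant."""
    ch = word[i]
    if ch in VOWELS:
        return False
    if ch == "y":
        return i == 0 or not is_consonant(word, i - 1)
    return True


def measure(stem):
    """Count vowel-then-consonant transitions (the 'm' of the Porter algorithm)."""
    m = 0
    prev_vowel = False
    for i in range(len(stem)):
        cons = is_consonant(stem, i)
        if cons and prev_vowel:
            m += 1
        prev_vowel = not cons
    return m


def has_vowel(stem):
    return any(not is_consonant(stem, i) for i in range(len(stem)))


def ends_cvc(stem):
    """True if stem ends consonant-vowel-consonant and the last letter is not w, x or y."""
    n = len(stem)
    if n < 3:
        return False
    return (is_consonant(stem, n - 3) and not is_consonant(stem, n - 2)
            and is_consonant(stem, n - 1) and stem[-1] not in "wxy")


def toy_stem(word):
    """Apply a small subset of Porter-style rules (assumed, not Thunderbird's code)."""
    w = word.lower()
    # Plural handling: "sses" -> "ss", "ies" -> "i", "ss" kept, other "s" dropped.
    if w.endswith("sses") or w.endswith("ies"):
        w = w[:-2]
    elif w.endswith("s") and not w.endswith("ss"):
        w = w[:-1]
    # Final "y" becomes "i" when the rest of the word contains a vowel.
    if w.endswith("y") and has_vowel(w[:-1]):
        w = w[:-1] + "i"
    # Final "e" is dropped when m > 1, or m == 1 and the stem does not end in cvc.
    if w.endswith("e"):
        stem = w[:-1]
        m = measure(stem)
        if m > 1 or (m == 1 and not ends_cvc(stem)):
            w = stem
    return w


for pair in [("Howe", "how"), ("Julie", "July"), ("Sachs", "Sache"),
             ("Williams", "William"), ("Sachen", "Sache"), ("howi", "how")]:
    print(pair, [toy_stem(word) for word in pair])
```

Under these rules "Howe" and "how", "Julie" and "July", "Sachs" and "Sache", and "Williams" and "William" each collapse to the same stem, while "Sachen" and "howi" keep stems of their own. That is consistent with the behavior described in the comments above, though it does not prove which stemmer is actually in use.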
By the way this is probably a duplicate of bug 752844
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → DUPLICATE