Closed
Bug 443944
Opened 16 years ago
Closed 15 years ago
Forum search seems to search terms as OR not AND
Categories
(support.mozilla.org :: General, defect, P1)
Tracking
(Not tracked)
RESOLVED
WONTFIX
0.9
People
(Reporter: Kensie, Unassigned)
References
Details
It seems to me that when I search the forums, I get a lot of results that don't match both of my search terms. Searching for "options" turns up over 452 pages of results; searching for "flash options" turns up over 3012 pages.
Comment 2•16 years ago
Setting P1 on bugs that are "must haves" for 0.7. This bug should be targeted to 0.6.3 or 0.6.4, or kept in 0.7 which will be the last milestone for Q3.
Priority: -- → P1
Comment 3•16 years ago
It's neither OR nor AND. The search is a full-text search, which means results are returned in order of relevance; hence you generally receive results that contain the whole phrase first, followed by results that contain only part of it. The alternative is to switch to BOOLEAN MODE searches, which would allow returning only results that contain all of the words. However, Boolean mode does *not* allow sorting of results by relevance, so you would not receive results in such a useful order. I believe this is a worse outcome, so I think we should call this WONTFIX. a=David?
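A rough sketch of the two behaviours described in comment 3 may help. It is a toy model in Python, not the site's code and not MySQL's algorithm: the post texts and the function names `search_any` and `search_all` are invented for illustration, and "relevance" is approximated by the number of query terms a post contains.

```python
# Toy corpus; the post texts are invented for illustration.
POSTS = [
    "flash options are missing after update",
    "where is the options dialog",
    "flash plugin crashes on youtube",
    "bookmarks toolbar disappeared",
]


def tokens(text):
    """Split text into a set of lowercase words."""
    return set(text.lower().split())


def search_any(query, posts):
    """Relevance-style search: keep posts matching at least one term,
    ordered by how many query terms they contain (a crude stand-in
    for a real relevance score)."""
    terms = tokens(query)
    scored = [(len(terms & tokens(p)), p) for p in posts]
    hits = [(score, p) for score, p in scored if score > 0]
    # The sort is stable, so equal scores keep their corpus order.
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in hits]


def search_all(query, posts):
    """Boolean-style search: keep only posts containing every term."""
    terms = tokens(query)
    return [p for p in posts if terms <= tokens(p)]
```

Under this model, adding a term can only grow the relevance-style result set (`search_any("flash options", POSTS)` returns more posts than `search_any("options", POSTS)`), which matches the page counts in the original report, while `search_all` shrinks it.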
Reporter
Comment 4•16 years ago
This is assuming the current results are actually more useful. What's the math that goes into relevance? I think there's a huge loss in usability in returning too many results, especially if maybe only 1 or 2 have both terms and the rest match only one or the other. Would Boolean mode sort results by date? I think that, combined with the fact that the results would contain all the terms searched for, would make the results much easier for people to use. In my experience the big factors in relevance are that all terms are present and recency, so I don't think this would be a huge loss in those terms.
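The alternative proposed in comment 4 (require every term, then order by date) could be sketched as follows. This is only an illustration in Python: the `Post` type, the sample posts, and the function name `search_all_newest_first` are hypothetical, and nothing here reflects how the forum actually stores or queries posts.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Post:
    posted: date  # hypothetical field: when the post was written
    text: str


def search_all_newest_first(query, posts):
    """Keep only posts that contain every query term, newest first."""
    terms = set(query.lower().split())
    hits = [p for p in posts if terms <= set(p.text.lower().split())]
    return sorted(hits, key=lambda p: p.posted, reverse=True)


# Invented sample data.
POSTS = [
    Post(date(2008, 5, 1), "firefox launch slow on xp"),
    Post(date(2008, 7, 2), "launch slow on vista with one user account"),
    Post(date(2008, 6, 15), "slow launch after vista update"),
    Post(date(2008, 7, 9), "vista theme looks wrong"),
]
```

With this data, `search_all_newest_first("launch slow vista", POSTS)` drops the XP-only post and the unrelated Vista post, and lists the two remaining posts with the most recent one first.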
Comment 5•16 years ago
Yeah, it might be interesting to try the Boolean search. The current search is really broken, with 35k+ search results on simple terms. :( How is search in the forum different from the KB? Or are we having similar problems there?
Comment 6•16 years ago
The relevance calculation is internal to MySQL but is based on "number of words in the index, the number of unique words in that row, the total number of words in both the index and the result, as well as the weight of the word". Weight is based on the frequency of the word across the document collection. In addition, all fulltext searches exclude words with <=3 characters and exclude all stopwords. It's the same search in both the forum and the KB. I'm not sure why returning a lot of results in a useful order is bad - it works for Google. Perhaps we can put the alternative up somewhere for testing so you can try it out. I'm loath to change it in production until you're happier with the alternative, because I suspect you won't be.
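To make the weighting idea in comment 6 more concrete, here is a minimal Python sketch. It is explicitly not MySQL's formula, which the comment says is internal: it assumes a simple inverse-document-frequency weight, a made-up four-word stopword list, and hypothetical function names (`index_terms`, `term_weights`, `score`). The only details taken from the comment are that short words and stopwords are dropped and that weight depends on how common a word is across the collection.

```python
import math

MIN_WORD_LEN = 4  # words of 3 characters or fewer are not indexed
STOPWORDS = {"with", "from", "this", "that"}  # tiny illustrative list, not MySQL's


def index_terms(text):
    """Return the words that would be indexed: long enough, not stopwords."""
    return [
        w for w in text.lower().split()
        if len(w) >= MIN_WORD_LEN and w not in STOPWORDS
    ]


def term_weights(docs):
    """Assumed weighting: log(N / number of documents containing the term),
    so words that appear everywhere get a weight of zero."""
    n = len(docs)
    doc_freq = {}
    for doc in docs:
        for term in set(index_terms(doc)):
            doc_freq[term] = doc_freq.get(term, 0) + 1
    return {term: math.log(n / df) for term, df in doc_freq.items()}


def score(query, doc, weights):
    """Sum the weights of the query terms, counted once per occurrence in doc."""
    doc_terms = index_terms(doc)
    return sum(
        weights.get(term, 0.0) * doc_terms.count(term)
        for term in set(index_terms(query))
    )


# Invented sample data.
DOCS = [
    "firefox launch slow on vista",
    "firefox options dialog",
    "firefox flash options",
]
WEIGHTS = term_weights(DOCS)
```

In this sketch a word such as "firefox", which occurs in every document, contributes nothing to the score, while rarer words such as "vista" dominate it. A document can therefore still be returned when it contains only one of the query terms.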
Reporter
Comment 7•16 years ago
(In reply to comment #6)
> The relevance calculation is internal to MySQL but is based on "number of words in the index, the number of unique words in that row, the total number of words in both the index and the result, as well as the weight of the word". Weight is based on frequency of the word across the document collection.
Is this confined to each post (is that why we get multiple results per thread?) or is it per thread?
> It's the same search in both. I'm not sure why returning a lot of results in a useful order is bad - it works for Google.
Except it also fails for Google when you start getting into search results from pages that cover multiple topics, which forum threads can do. The thing is, I don't think the order is useful. I'll have to come back with some examples next time I try searching the forum.
Comment 8•16 years ago
Here's a specific example: I just had a user on IRC who had problems with Firefox launching slowly on Vista, but only when logged in with a specific Vista user account (when logged into another account it worked fine). Searching the forums for "launch slow" returned 2600 results. Not seeing anything that sounded remotely like this user's problem on the first page of search results, I put in "launch slow vista" and got 6047 results, with the same set of posts on the first page, which still didn't sound like her problem (many of them specifically mentioned XP and didn't mention Vista).
Comment 9•16 years ago
Would it be possible to switch to Boolean search on the staging server and see if this actually produces a better result? That way we can at least rule that option out and focus solely on replacing the search engine altogether. Or, if it does produce a better result, we can use the Boolean search while transitioning. Dave's example seems to show that the relevance calculation isn't very effective.
Updated•16 years ago
Target Milestone: 0.7 → 0.8
Updated•16 years ago
Target Milestone: 0.8 → 0.9
Comment 10•15 years ago
Should we consider maybe using Google custom search?
Comment 11•15 years ago
New search should fix this.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → WONTFIX