Closed
Bug 616569
Opened 14 years ago
Closed 14 years ago
Evaluate different search rankers
Categories
(support.mozilla.org :: Search, defect, P1)
Tracking
(Not tracked)
VERIFIED
FIXED
2.4.2
People
(Reporter: jsocol, Assigned: jsocol)
References
Details
(Whiteboard: [qa-])
Once we've switched to using the EXTENDED2 matching mode, we have options for search ranking modes.[1] My guess is that PROXIMITY_BM25 (the default with EXTENDED2 matching) is going to be the best, but it can't hurt to play with it.

Sooner or later I'd really like to upgrade to Sphinx 1.10 beta (need to evaluate it for stability with IT), and then we can try SPH04,[2] which is PROXIMITY_BM25 plus extra weight for matches at the beginning or end of strings. That means that searching for "Cookies" would rank the article "Cookies" higher than "How to enable or disable cookies" (assuming title has enough weight to overpower anything in the content).

[1] http://www.sphinxsearch.com/docs/manual-0.9.9.html#api-func-setrankingmode
[2] http://www.sphinxsearch.com/docs/manual-1.10.html#api-func-setrankingmode
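To make the "extra weight at the beginning or end" idea concrete, here is a toy sketch in plain Python. It is not Sphinx's actual SPH04 formula: the function name `toy_title_score`, the `base` score, and the `edge_bonus` value are all made up for illustration, and it only looks at the title field.

```python
def toy_title_score(query, title, base=1.0, edge_bonus=0.5):
    """Toy score for a title: a base score for any phrase match, plus a
    bonus for each edge (start or end of the title) the match touches.

    Hypothetical illustration only; not the real SPH04 ranker.
    """
    q = query.lower().split()
    words = title.lower().split()
    n = len(q)
    if n == 0 or n > len(words):
        return 0.0

    # Every word offset at which the whole query phrase occurs in the title.
    positions = [i for i in range(len(words) - n + 1) if words[i:i + n] == q]
    if not positions:
        return 0.0

    score = base
    if 0 in positions:               # match starts the title
        score += edge_bonus
    if len(words) - n in positions:  # match ends the title
        score += edge_bonus
    return score


exact = toy_title_score("cookies", "Cookies")
longer = toy_title_score("cookies", "How to enable or disable cookies")
```

With these made-up numbers, the title "Cookies" touches both edges and "How to enable or disable cookies" touches only the end, so `exact` comes out above `longer`. Whether that carries over to real results still depends on how much the title outweighs the content.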
Comment 1 • 14 years ago (Assignee)
So I've been playing around a lot. My goal was to make "Cookies" the first result for "cookies". So far, no combination of ranker or weights is getting me there. I think those other three articles ("Deleting cookies", "Blocking cookies", and "Disabling third party cookies") really are just better matches, based on content. (I've been pushing up title weight relative to everything else, up to 50x, even going so far as ignoring everything else, and still no luck.)

Anyway, my general feeling from the other results (variations on "flash" and "sync" again) is that the ranking modes, from best to worst, are:

1. SPH04 (not available until we upgrade Sphinx)
2. PROXIMITY_BM25 (default + current)
3. BM25
4. PROXIMITY

Basically, I think we should switch to SPH04 when we can, but in the meantime, I think PROXIMITY_BM25 is the best we can do. Again, I'll implement a change making this explicit and possible to set on a per-index basis.

It may make more sense to order questions, for example, with an age factor in addition to weight (something like weight*(1+exp(-age)) would work).
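A quick sketch of the age factor mentioned above, weight*(1+exp(-age)), in plain Python. The comment doesn't say what units age is in, so the units here, the function name `age_adjusted_weight`, and the sample questions are all assumptions for illustration.

```python
import math


def age_adjusted_weight(weight, age):
    """Return weight * (1 + exp(-age)).

    At age 0 the weight is doubled; as age grows, the factor decays
    toward 1 and the original weight is left essentially unchanged.
    The unit of `age` is an assumption; it sets how fast the boost fades.
    """
    return weight * (1.0 + math.exp(-age))


# Hypothetical (title, weight, age) tuples, not real search results.
questions = [
    ("old question", 100.0, 10.0),
    ("new question", 60.0, 0.0),
]

# Order by adjusted weight, highest first.
ranked = sorted(
    questions,
    key=lambda q: age_adjusted_weight(q[1], q[2]),
    reverse=True,
)
```

Here the brand-new question gets its weight doubled and lands ahead of the older one despite a lower raw weight. Since exp(-age) falls off quickly, age would probably need to be scaled (days or weeks, say, rather than seconds) for the boost to last long enough to matter.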
Comment 2 • 14 years ago (Assignee)
https://github.com/jsocol/kitsune/commit/aefc252d

Nothing to verify as we didn't change the mode, but it's easier to change in the future if we want.
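For anyone reading later, a per-index ranking mode setting could look roughly like the sketch below. This is not the code from the commit above: the names `DEFAULT_RANKER`, `INDEX_RANKERS`, `ranker_for`, and the index name "questions" are hypothetical, and mapping the string onto the Sphinx client's ranking constant is left out.

```python
# Ranking modes named in this bug; SPH04 needs the Sphinx upgrade.
KNOWN_RANKERS = {"PROXIMITY_BM25", "BM25", "PROXIMITY", "SPH04"}

# Current behaviour: the default ranker under EXTENDED2 matching.
DEFAULT_RANKER = "PROXIMITY_BM25"

# Hypothetical per-index overrides; empty means every index uses the default.
INDEX_RANKERS = {}


def ranker_for(index_name, overrides=None):
    """Look up the ranking mode for an index, falling back to the default."""
    overrides = INDEX_RANKERS if overrides is None else overrides
    ranker = overrides.get(index_name, DEFAULT_RANKER)
    if ranker not in KNOWN_RANKERS:
        raise ValueError("unknown ranking mode: %r" % (ranker,))
    return ranker
```

With no overrides set, every index resolves to PROXIMITY_BM25, which matches the "didn't change the mode" outcome here. Trying SPH04 on a single index after the upgrade would then be a one-line change to the overrides.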
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Whiteboard: [qa-]