Closed Bug 457952 Opened 16 years ago Closed 16 years ago

Search for "Tab Mix Plus" isn't finding it, but "Tab Mix" does

Categories

(addons.mozilla.org Graveyard :: Administration, defect)

Type: defect
Priority: Not set
Severity: critical

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 447933

People

(Reporter: stephend, Unassigned)

Details

(Keywords: regression)

If you search for "Tab Mix Plus", https://addons.mozilla.org/en-US/firefox/search?q=tab+mix+plus&cat=all, the add-on does not show up in the results, but a search for "Tab Mix" does find it.

Additionally, a search for "IE Tab" isn't yielding any results for that add-on, either.

Both of these are in my Selenium test suite, which I just ran again today (sorry!), so I'm not sure why or when they regressed.
Since "plus" is a stop word, only "tab" and "mix" are left to match, and both are three letters long. So this may have regressed if ft_min_word_len is now set to something bigger than 3 (or is undefined, in which case it defaults to 4). (cf. http://dev.mysql.com/doc/refman/5.0/en/fulltext-fine-tuning.html)

Over to IT: Can somebody please check what this variable is set to?

Note that if we change it, we need to restart the server, then rebuild the indexes: "FULLTEXT indexes must be rebuilt after changing this variable. Use REPAIR TABLE tbl_name QUICK." (from http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html#option_mysqld_ft_min_word_len)

Thanks.
Assignee: nobody → server-ops
Component: Public Pages → Server Operations
Product: addons.mozilla.org → mozilla.org
QA Contact: web-ui → mrz
Version: unspecified → other
ft_min_word_len is 2 on the AMO master and both slaves.
Thanks, Mark. Since this didn't regress on the server side, we'll need to find out what happens to the query in the code. Pulling the bug back into AMO.
Assignee: server-ops → nobody
Component: Server Operations → Administration
Product: mozilla.org → addons.mozilla.org
QA Contact: mrz → administration
Version: other → unspecified
On another bug (which I can't find now), I suggested adding a help page that recommends phrase searches as a stopgap:
https://addons.mozilla.org/en-US/firefox/search?q=%22tab+mix+plus%22&cat=all
Tab Mix Plus is the first result.

Ditto for IE Tab:
https://addons.mozilla.org/en-US/firefox/search?q=%22ie+tab%22&cat=all
As best I can tell, this is the search query used, so I ran it on all of the databases.  They all concur:

mysql> SELECT DISTINCT a.id, a.name,
           MATCH(a.name, a.summary, a.description) AGAINST ('tab mix plus') AS text_score,
           v.created AS created
       FROM text_search_summary AS a
       INNER JOIN `versions_summary` AS v ON (v.addon_id = a.id)
       INNER JOIN files ON (files.platform_id IN (1, 5) AND v.version_id = files.version_id)
       WHERE (a.locale = 'en-US' OR a.locale = 'en-US' )
         AND MATCH(a.name, a.summary, a.description) AGAINST ('tab*,mix*,plus*' IN BOOLEAN MODE)
         AND a.addontype IN (1,2)
         AND a.status IN(4)
         AND a.inactive = 0
         AND ( v.application_id = 1 )
       ORDER BY (a.status=4) DESC, (a.name LIKE '%tab mix plus%') DESC, text_score DESC
       LIMIT 5;

+------+--------------+------------------+---------------------+
| id   | name         | text_score       | created             |
+------+--------------+------------------+---------------------+
| 1122 | Tab Mix Plus | 8.05539798736572 | 2007-06-11 13:31:46 | 
| 3266 | Tab Minus    | 9.50013256072998 | 2008-06-21 23:24:31 | 
| 5756 | Taboo        | 9.28575325012207 | 2008-09-21 00:52:38 | 
| 2224 | superT       | 8.50549697875977 | 2006-07-03 11:12:40 | 
| 1950 | Close Button | 6.72902536392212 | 2008-05-23 16:42:07 | 
+------+--------------+------------------+---------------------+
5 rows in set (0.07 sec)

I removed some columns from the SELECT to make it readable. It seems to be doing the Right Thing, although interestingly, the scores of Tab Minus and Taboo are higher than that of Tab Mix Plus...
Actually, the query is this:

SELECT DISTINCT a.id, a.name, a.summary, a.description,
    MATCH(a.name, a.summary, a.description) AGAINST ('tab mix plus') AS text_score,
    v.created AS created
FROM text_search_summary AS a
INNER JOIN `versions_summary` AS v ON (v.addon_id = a.id)
WHERE (a.locale = 'en-US' OR a.locale = 'en-US' )
  AND MATCH(a.name, a.summary, a.description) AGAINST ('+tab* +mix* +plus*' IN BOOLEAN MODE)
  AND a.addontype IN (1,2,4,3,5,6)
  AND a.status IN(1,2,3,4)
  AND a.inactive = 0
  AND ((a.addontype = 4 OR v.application_id = 1 ))
ORDER BY (a.status=4) DESC, (a.name LIKE '%tab mix plus%') DESC, text_score DESC
mysql> SELECT DISTINCT a.id, a.name, a.summary, a.description,
           MATCH(a.name, a.summary, a.description) AGAINST ('tab mix plus') AS text_score,
           v.created AS created
       FROM text_search_summary AS a
       INNER JOIN `versions_summary` AS v ON (v.addon_id = a.id)
       WHERE (a.locale = 'en-US' OR a.locale = 'en-US' )
         AND MATCH(a.name, a.summary, a.description) AGAINST ('+tab* +mix* +plus*' IN BOOLEAN MODE)
         AND a.addontype IN (1,2,4,3,5,6)
         AND a.status IN(1,2,3,4)
         AND a.inactive = 0
         AND ((a.addontype = 4 OR v.application_id = 1 ))
       ORDER BY (a.status=4) DESC, (a.name LIKE '%tab mix plus%') DESC, text_score DESC ;

Empty set (0.05 sec)

Same on all three databases - nothing returned.
Is it possible that if we *require a stop word* to be part of the results, we get nothing back? (Because of bug 444596, we force all entered words to be part of the result by adding a + sign in front of them.)

That would also explain why a search for better gmail (no quotes) does not return anything: "better" is a stop word, and we force it to be part of the query.
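
To make the hypothesis concrete, here is a minimal Python sketch, assuming the search code turns every entered word into a required prefix term (a leading + and a trailing *), as the '+tab* +mix* +plus*' expression suggests. The helper names and the two-word stop list are made up for illustration; MySQL's built-in stop word list is much longer, and the real site code may well differ.

```python
# Illustrative subset of MySQL's built-in full-text stop words.
# Assumption: only the two words discussed in this bug are listed here.
SAMPLE_STOP_WORDS = {"plus", "better"}


def build_boolean_query(user_input):
    """Hypothetical helper: require every term (+) and prefix-match it (*)."""
    terms = user_input.lower().split()
    return " ".join("+%s*" % term for term in terms)


def required_stop_words(user_input):
    """Return the entered terms that are stop words but would still be required."""
    return [t for t in user_input.lower().split() if t in SAMPLE_STOP_WORDS]


if __name__ == "__main__":
    for query in ("Tab Mix Plus", "Tab Mix", "better gmail"):
        print(query, "->", build_boolean_query(query), required_stop_words(query))
```

Under these assumptions, the two failing searches are exactly the ones that end up with a required stop word in the boolean expression, while "Tab Mix" does not.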

Does that make sense?
Aha! Even better, straight from the MySQL docs:

"If a stopword or too-short word is specified with the truncation operator, it will not be stripped from a boolean query. For example, a search for '+word +stopword*' will likely return fewer rows than a search for '+word +stopword' because the former query remains as is and requires stopword* to be present in a document. The latter query is transformed to +word."

We force plus* to be part of the result set. Since "plus" is a stop word (and thus is not present in the index), that means we require the results to contain a longer word starting with "plus", such as "plussomething".

Riddle solved.

Short-term remedy: strip stop words from the queries before executing them.
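
A rough Python sketch of what that remedy could look like, not the actual fix. The function names and the tiny stop list are hypothetical, and the fallback for a query made up entirely of stop words is my own assumption. The sketch also does not escape boolean-mode operators that a user might type.

```python
# Illustrative subset only; a real implementation would need the full
# stop word list that the MySQL server is configured with.
SAMPLE_STOP_WORDS = frozenset({"plus", "better"})


def strip_stop_words(terms, stop_words=SAMPLE_STOP_WORDS):
    """Drop stop words; if nothing is left, keep the original terms."""
    kept = [t for t in terms if t.lower() not in stop_words]
    # Assumption: an all-stop-word query is passed through unchanged
    # instead of producing an empty search expression.
    return kept or list(terms)


def build_stripped_boolean_query(user_input, stop_words=SAMPLE_STOP_WORDS):
    """Build the '+term*' expression from the terms that survive stripping."""
    terms = strip_stop_words(user_input.lower().split(), stop_words)
    return " ".join("+%s*" % t for t in terms)


if __name__ == "__main__":
    print(build_stripped_boolean_query("Tab Mix Plus"))
    print(build_stripped_boolean_query("better gmail"))
```

With the stop word removed up front, "Tab Mix Plus" produces the same boolean expression as "Tab Mix", which is the search that already works.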
By the way, this realization makes this a dupe of bug 447933.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → DUPLICATE
Product: addons.mozilla.org → addons.mozilla.org Graveyard