Closed Bug 438370 Opened 16 years ago Closed 16 years ago

Need to test new search query on test cluster

Categories

(mozilla.org Graveyard :: Server Operations, task)

task
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: cpollett, Assigned: oremj)

References

Details

Howdy,

I have a proposed patch to Bug400986 and I would like to test how it performs compared to the current search query being used in amo production. There are a couple of quirks about the new search which are described in 400098#c12. In particular, the patch creates two new tables: text_search_view and
versions_summary_view. The code for these tables can be found in remora.sql.
The script that keeps these tables up-to-date can be found in /bin/update-search-views.php it should be set up as a cron job going say every twenty minutes.

I'd like to know how my search compares with the current production search on one, two, three word queries when the databases are under a typical amo load.
For example,
firefox
firefox thunderbird
firefox thunderbird seamonkey.

I would also like to know how long the update script takes when the database is under load.

Finally, I am interested in tweaking the my.conf parameter ft_min_word_len to short word lengths than 4 (say 2 or 3).

Thanks for your help,
Chris
Blocks: 400986
What's the process for testing the search performance?  Are we throwing a query at the database or hitting certain URLs?  Which URLs?

Mark, what is your opinion on changing ft_min_word_len?
hitting urls you could try:

/en-US/firefox/search?q=firefox&cat=all
/en-US/firefox/search?q=firefox+thunderbird&cat=all
/en-US/firefox/search?q=firefox+thunderbird+seamonkey&cat=all
I'm not really against changing it, if it's needed.  Guess it depends on the size of the table we're talking about.  Thousands or millions of rows?  The former I don't care, the latter it could lead to a rather large increase in the size of the index.
let's test when q is something common like 'e' and something unusual like 'ç' also.  They are both under 4 chars also - not sure how that will affect things.
(In reply to comment #3)
> I'm not really against changing it, if it's needed.  Guess it depends on the
> size of the table we're talking about.  Thousands or millions of rows?  The
> former I don't care, the latter it could lead to a rather large increase in the
> size of the index.
> 

Both tables are just over 7000 rows after running the script this morning.  Chris - that seems small; is that correct?
that's correct. most add-ons don't seem to be translated to more than one locale.
Hey, any progress on this?  Anything we can do to help test?
I think Comment #2 is everything I needed.  I should be able to run the test tomorrow.  Did anyone do the simple perf test of looking at all the queries a search page load generates on each revision then comparing?
I did this when on a single machine running mysql 5 (new stuff faster). Need to know for mysql 4.
This is blocking 400986 which we want to commit today.  If we can help let us know.
Severity: normal → critical
I just realized that we may not have a MySQL 4 test server anymore.  Mark?
Looks like oremj is already working on this, so assigning it to him so it'll stop paging me. :)

I think that's correct though, all the staging/testing mysql servers got upgraded to mysql5.
Assignee: server-ops → oremj
Correct, the only MySQL 4 machines are production.  Why do you need one of those old and decrepit things anyway?
They need to test the query performance on MySQL 4 to see how it will hold up in production.
I don't see the point when AMO is going to be on MySQL 5 in a week or so anyway.  This is a lot of work for, IMO, no real gain.

Especially given the timeframe of this request, I can't just make a machine appear.  We have no spare machines with which to build out a new MySQL 4 cluster.

Don't think it's going to happen, not without a week or two to get a new machine or blade or shuffle something around.  (I don't think putting it on a VM is a good idea, since you can't effectively load test on a VM.  You can test in development just as well as you can test on a VM.)
I am pretty sure I tuned my stuff for mysql 4 okay. Load tests on mysql 5 would still be useful before I put this one trunk and then production.
I would be comfortable with testing on MySQL 5 and comparing it to current traffic.  Let's do that.
Let's get it up on MySQL 5 and then push it to preview to do final testing when that's completed.  Preview should allow us to test on 4.1, correct?
So are we basically testing:

SELECT DISTINCT a.id, a.name, a.summary, a.description,  MATCH(a.name, a.summary, a.description) AGAINST ('test') AS text_score,  v.created AS created  FROM text_search_summary AS a  INNER JOIN `versions_summary` AS v ON (v.addon_id = a.id)  WHERE (a.locale = 'en-US' OR a.locale = 'en-US' ) AND  MATCH(a.name, a.summary, a.description) AGAINST ('test') AND a.addontype IN (1,2,4,3,5,6) AND a.status IN(1,2,3,4) AND a.inactive = 0 AND ((a.addontype = 4 OR  v.application_id = 1 )) ORDER BY (a.status=4) DESC, text_score DESC;

vs

SELECT DISTINCT a.id, (IFNULL(`tr_name`.localized_string, `fb_name`.localized_string) LIKE '%test%') AS name_match, (IFNULL(`tr_summary`.localized_string, `fb_summary`.localized_string) LIKE '%test%') AS summary_match, (IFNULL(`tr_description`.localized_string, `fb_description`.localized_string) LIKE '%test%') AS description_match FROM addons AS a LEFT JOIN translations AS `tr_name` ON (`tr_name`.id = a.`name` AND `tr_name`.locale = 'en-US') LEFT JOIN translations AS `fb_name` ON (`fb_name`.id = a.`name` AND `fb_name`.locale = a.defaultlocale) LEFT JOIN translations AS `tr_summary` ON (`tr_summary`.id = a.`summary` AND `tr_summary`.locale = 'en-US') LEFT JOIN translations AS `fb_summary` ON (`fb_summary`.id = a.`summary` AND `fb_summary`.locale = a.defaultlocale) LEFT JOIN translations AS `tr_description` ON (`tr_description`.id = a.`description` AND `tr_description`.locale = 'en-US') LEFT JOIN translations AS `fb_description` ON (`fb_description`.id = a.`description` AND `fb_description`.locale = a.defaultlocale) WHERE (IFNULL(`tr_name`.localized_string, `fb_name`.localized_string) LIKE '%test%' OR IFNULL(`tr_summary`.localized_string, `fb_summary`.localized_string) LIKE '%test%' OR IFNULL(`tr_description`.localized_string, `fb_description`.localized_string) LIKE '%test%') AND a.addontype_id IN (1,2,4,3,5,6) AND a.status IN(1,2,3,4) AND a.inactive = 0 AND ((a.addontype_id = 4 OR ((SELECT v.created FROM versions AS v LEFT JOIN applications_versions AS av ON (av.version_id = v.id AND av.application_id = 1) WHERE v.addon_id = a.id ORDER BY v.created DESC LIMIT 1 ) IS NOT NULL))) ORDER BY (a.status=4) DESC, name_match DESC, summary_match DESC, description_match DESC;

?
Affirmative.
New search performance looks on par with old search performance.  The new query performs much better than the old.  In my tests the run time for the script under load was 3 to 5 seconds.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard
You need to log in before you can comment on or make changes to this bug.