As discussed in bug 447933, we want to switch off stop words for MySQL's full text index on AMO altogether. This is a server-wide setting.
- Set the MySQL system variable "ft_stopword_file" to an empty string. That will disable stop words (cf. http://dev.mysql.com/doc/refman/5.0/en/fulltext-fine-tuning.html).
- restart the server(s)
- execute "REPAIR TABLE text_search_summary QUICK;" on the AMO DB in order to rebuild the full text index according to the new settings.
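For reference, the steps above might look roughly like this on the server. This is only a sketch: it assumes the variable is set in the server's option file (it cannot be changed at runtime), and the exact file location and quoting are assumptions about the setup, not something confirmed here.

```sql
-- Assumption: before the restart, the option file contains something like
--   [mysqld]
--   ft_stopword_file = ''
-- (an empty value disables the built-in stop word list).

-- After the restart, check that the server picked up the empty value:
SHOW VARIABLES LIKE 'ft_stopword_file';

-- Rebuild the full text index so that it reflects the new setting:
REPAIR TABLE text_search_summary QUICK;
```

If the SHOW VARIABLES output still lists the built-in list or a file path, the option was not read and the repair should wait until that is sorted out.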
We need to do this in a maintenance window, as the server restart will mean a short downtime. Also: after the index has been rebuilt, we need to watch the server load for a little while and make sure that nothing is adversely affected (as the unfiltered index will grow larger than before).
Mark, do you want to suggest a window for that to happen? On the webdev side, I will be involved, and probably Steven in order to make sure search behaves as expected.
If you don't mind doing it this Thursday (tomorrow), then I'm already going to be doing some work so it'd work great to put the two tasks together.
Does that work for you?
Works for me -- I can be online around that time if nobody else can.
What time? Since I am three hours ahead of you, I can't do it too late in the evening, sorry (not tomorrow anyway; Tuesday would work for me). However, I put everything needed in comment 0, so Mike could pick it up too.
I'll be around. I've also done baseline perf tests using my Selenium test suite (https://wiki.mozilla.org/QA/Tools/Selenium/AMO_Automation; the Search test case is at http://svn.mozilla.org/addons/trunk/site/app/tests/search.html). Selenium reports that the script takes about 24 seconds to execute (I don't know how accurate Selenium's internal timing is, but it's a good start).
Is there a better (i.e. definitely more accurate) measure we can apply to search query times?
This did not happen last night, did it?
Apologies. I didn't get the downtime window scheduled and announced so was unable to take AMO down for this.
Tuesday (tomorrow) is the new scheduled evening. I've pinged mrz to let him know that this will be happening, so I should be able to get this announced and out this time.
Again, sorry about the dropped ball.
No problem. Let us know when you have a schedule.
This is going tonight - which db host is this? Which databases/sites will be affected?
Should be AMO-specific, I believe that is mrdb03. IT has more info on which specific servers, but it will require a restart for the AMO master, which puts a hold on:
For versioncheck, facebook and services, we should be able to effectively pause replication so that the master restart doesn't require downtime? Mark?
Actually, facebook has writes going to the master, so the only true read-only subdomains are services and versioncheck at this point.
We also need to make sure that the slaves realize their indexes are out of date and rebuild them. Alternatively, we first restart and execute "REPAIR TABLE" on the master, and later do the same on the slaves to be sure.
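A rough sketch of the second option, in case it helps. It assumes every slave has also been restarted with the empty "ft_stopword_file" before its repair runs. By default MySQL writes REPAIR TABLE to the binary log, so a slave that replays the master's statement while still running with the old setting would rebuild its index with stop words still filtered; repairing each slave explicitly avoids relying on that.

```sql
-- On the master, after it has been restarted with the new setting:
REPAIR TABLE text_search_summary QUICK;

-- On each slave, after that slave has also been restarted with the new setting.
-- NO_WRITE_TO_BINLOG keeps the statement out of the slave's own binary log.
REPAIR NO_WRITE_TO_BINLOG TABLE text_search_summary QUICK;

-- On each slave, confirm replication is running again after the restarts:
SHOW SLAVE STATUS;
```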
Yeah, I'll make sure all the databases get the indexes rebuilt.
After some fail on my part ('the' will never be indexed since it's in >50% of the data), this is done.
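For anyone wanting to double-check the 50% behaviour, something like the following could be used. The column name `name` is hypothetical; substitute whichever columns the full text index on text_search_summary actually covers. It also assumes "ft_min_word_len" is low enough to admit three-letter words (the default is 4, which would exclude 'the' regardless of stop words).

```sql
-- Natural language mode: a word that occurs in at least half of the rows
-- is treated like a stop word, so 'the' is expected to match nothing.
SELECT COUNT(*) FROM text_search_summary
WHERE MATCH (name) AGAINST ('the');

-- Boolean mode does not apply the 50% threshold, so this can return rows
-- even when the query above does not.
SELECT COUNT(*) FROM text_search_summary
WHERE MATCH (name) AGAINST ('the' IN BOOLEAN MODE);
```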
(Note: no slave01 exists right now.)
Nice work guys! :)