Work with webdev (groovecoder and/or lorchard) to set up a sphinx server for MDN staging.
We are migrating to ElasticSearch. Do you really want to use sphinx?
We had a discussion about this in #webdev, which included cshields. He had concerns over the transition to ElasticSearch, and it sounded like it was early days yet. We have code that already uses Sphinx. So, rather than jump into something unknown, I think the decision at the time was to go ahead with Sphinx for MDN for now and revisit the decision once others had spent time with ElasticSearch. If that's wrong, do we need a meeting or to CC more people on this bug?
Jason, we have a general-use set of sphinx nodes in sjc1, managed by puppet. You'll need to copy an instance and configure it for kuma-stage, then let Luke/Les know what port it is running on (so they can request the appropriate config file change). The files you will need to copy and commit are in /modules/sphinx/files/searchd, and you will need to make some entries in the puppet manifest for the new instance too: /modules/sphinx/manifests/generic.pp. This will then end up live on pm-app-sphinx01 and pm-app-sphinx02 (and I believe there is a zeus VIP for those: pm-app-sphinx.mozilla.org).
I have this in progress but need to know where to find the sphinx.conf file that you would like me to use. I think that we generally grab this file from git just prior to re-indexing. Also, if you have any other sphinx configs (wordforms.txt, etc.) I will need to know about those as well.
The sphinx.conf file is in our kuma repo at configs/sphinx/sphinx.conf. Do you make a symlink for it? I'm still modifying our sphinx implementation from what we inherited from kitsune, but that is where the file will live.
I think we generally are pulling the file from git just prior to re-indexing. I will set this up the same way unless you have a strong objection.
(In reply to comment #6)
> I think we generally are pulling the file from git just prior to re-indexing.
> I will set this up the same way unless you have a strong objection.
Sounds good. If something comes up we'll let you know. ;)
Okay, I have this basically working on port 3325. I am getting the following files from github just prior to running the re-index (each time): sphinx.conf, stopwords.txt, and wordforms.txt.
When running the index job I get the following errors:
    indexing index 'questions'...
    ERROR: index 'questions': sql_query: Table 'kuma_stage_mozilla_org_django.questions_question' doesn't exist (DSN=mysql://kuma_sphinx:***@tm-stage01-slave01.mozilla.org:3306/kuma_stage_mozilla_org_django).
    indexing index 'discussion_forums'...
    ERROR: index 'discussion_forums': sql_query: Table 'kuma_stage_mozilla_org_django.forums_post' doesn't exist (DSN=mysql://kuma_sphinx:***@tm-stage01-slave01.mozilla.org:3306/kuma_stage_mozilla_org_django).
You may note that the wiki pages one seems to be working:
    indexing index 'wiki_pages'...
    collected 3 docs, 0.0 MB
    collected 0 attr values
If you update the sphinx.conf file then you should see changes shortly after minutes 5, 20, 35, and 50 past the hour, as the re-index cron job is scheduled at those times. If you would like me to manually run the index job for error output just let me know.
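For anyone waiting on a sphinx.conf change to show up, the cron schedule mentioned above (minutes 5, 20, 35, and 50) can be turned into a "when is the next re-index?" calculation. This is only an illustrative sketch: the function name `next_reindex` and the constant `REINDEX_MINUTES` are made up for this example and are not part of the cron setup itself, and it assumes the job runs every hour at those minutes.

```python
from datetime import datetime, timedelta

# Minutes past the hour at which the re-index cron job is scheduled
# (taken from the comment above; assumed to apply to every hour).
REINDEX_MINUTES = (5, 20, 35, 50)


def next_reindex(now: datetime) -> datetime:
    """Return the first scheduled re-index time strictly after `now`."""
    base = now.replace(second=0, microsecond=0)
    for minute in REINDEX_MINUTES:
        candidate = base.replace(minute=minute)
        if candidate > now:
            return candidate
    # Past the last slot of this hour: roll over to the first slot of the next hour.
    return base.replace(minute=REINDEX_MINUTES[0]) + timedelta(hours=1)
```

A config change pushed at 10:07 would therefore be picked up by the 10:20 run at the earliest.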
Cool. I'm modifying the sphinx.conf file now to remove the indices for the resources we removed so it should fix that when I push this to stage9 - should be today or tomorrow. I'll let you know. Thanks!
stage9 says search unavailable: https://kuma-stage.mozilla.org/en-US/search?q=calls
Jason, can you confirm the sphinx settings we have?
    SPHINX_HOST = '127.0.0.1'
    SPHINX_PORT = 3381
    SPHINXQL_PORT = 3382
    SPHINX_INDEXER = '/usr/bin/indexer'
    SPHINX_SEARCHD = '/usr/bin/searchd'
Luke, I believe you need to use the load balancer VIP hostname, which is 'pm-app-sphinx.mozilla.org'. Also, this app is running on:
    LISTEN_PORT = 3325
    MYSQL_LISTEN_PORT = 3326
Please let me know if this works for you. If not I will have to dig more into the network topology.
Jason, we don't have access to settings_local.py. Can you update the kuma settings_local.py with those correct values?
    SPHINX_HOST = 'pm-app-sphinx.mozilla.org'
    SPHINX_PORT = 3325
    SPHINXQL_PORT = 3326
And verify the SPHINX_INDEXER and SPHINX_SEARCHD binary values are correct?
Luke, there was a bit of confusion on my part yesterday. The host for the sphinx configs is localhost. The host used by any remote servers attempting to query this sphinx instance is 'pm-app-sphinx.mozilla.org'. Those port numbers and the 'localhost' hostname were already set up correctly in the localsettings.py file. I made some minor changes to the way we are dealing with the sphinx.conf file but these should not affect the operation.
When I run a test on the command line of the sphinx server it seems to work correctly. Additionally, when I manually run the resync job it appears to work correctly. Can you please confirm for me that the application is attempting to connect to 'pm-app-sphinx.mozilla.org' on port 3325?
In the meantime, there is a firewall issue blocking access to the Sphinx server from the staging servers. I filed bug 691920 and this should be working as soon as that is resolved.
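One way to check from a staging web server whether the path to the Sphinx VIP is open is a plain TCP connection attempt. The following is a minimal sketch, assuming Python is available on the web server; the helper name `can_connect` is hypothetical, and the host and port in the usage comment are the ones quoted in this bug. A successful connection only shows that the port is reachable through the firewall, not that searchd is answering queries correctly.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and connects; the socket is
        # closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures alike.
        return False


# Example, to be run from a staging web server:
#   can_connect("pm-app-sphinx.mozilla.org", 3325)
```

While the firewall is still blocking access this would be expected to return False, most likely after the timeout expires.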
I assume you meant settings_local.py? So do we need localhost/127.0.0.1 or 'pm-app-sphinx.mozilla.org'? In either case, we don't have access to modify the settings_local.py file at /data/www/django/developer-stage9.mozilla.org/kuma - IT will need to make it match the right values.
Luke, once again I am talking about something completely different :). I was referring to the configuration files on the Sphinx server. I now understand that you are talking about files in the web application itself. Now that we are on the same page (I hope), I wish to verify a few things.
I am looking at the settings.py file from github and see the settings that I now think you were talking about previously:
    SPHINX_HOST = '127.0.0.1'
    SPHINX_PORT = 3381
    SPHINXQL_PORT = 3382
    SPHINX_INDEXER = '/usr/bin/indexer'
    SPHINX_SEARCHD = '/usr/bin/searchd'
Now I am curious how this will work. The Sphinx server is remote from the web servers, so I do not think that the indexer and searchd paths are relevant. At the very least I can tell you that neither binary is installed on the staging servers, so if you need them this will be a larger issue. I thought you could just query the server on this port using an API or whatever. If I am incorrect please let me know and I will see what I can do.
As to the host and port numbers in the settings_local.py file, I will work to get those updated (inserted?). However, I would like to verify that you want them in 'developer-stage9.mozilla.org/kuma/settings_local.py' and not in 'kuma-stage.mozilla.org/kuma/settings_local.py'. I ask because I see the document root set to /data/www/django/kuma-stage.mozilla.org/kuma/webroot for 'ServerName kuma-stage.mozilla.org' on the staging web servers, so I must be missing something (which would not be a big surprise since I have missed every step to this point).
Cheers
No worries - MDN is a very messy staging situation. kuma-stage is the stage server for our master branch with all the new wiki stuff. developer-stage9 is the stage server for our mdn branch with all the old MindTouch wiki stuff. We're moving to sphinx only for the kuma stuff, so we need to add new SPHINX_HOST, SPHINX_PORT, and SPHINXQL_PORT values into kuma-stage.mozilla.org/kuma/settings_local.py.
As I understand it, pm-app-sphinx.mozilla.org should run a new searchd process with kuma/config/sphinx/sphinx.conf. When the new process is up and running, we will add a new reindex cron job that also uses kuma/config/sphinx/sphinx.conf. http://support.allizom.org/ already has all of this, so we should be able to copy/paste lots of their config and setup.
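For reference, the override being requested would look roughly like the fragment below. This is a sketch rather than the actual file contents: it assumes settings_local.py simply redefines names from settings.py, and it uses the host and ports quoted earlier in this bug. SPHINX_INDEXER and SPHINX_SEARCHD are left out here because the thread does not settle whether the web servers need them.

```python
# kuma-stage.mozilla.org/kuma/settings_local.py (sketch of the Sphinx overrides only)

# Remote web servers reach Sphinx through the load balancer VIP,
# not through 127.0.0.1 as in the default settings.py.
SPHINX_HOST = 'pm-app-sphinx.mozilla.org'

# Ports of the kuma-stage searchd instance
# (LISTEN_PORT and MYSQL_LISTEN_PORT on the Sphinx side).
SPHINX_PORT = 3325
SPHINXQL_PORT = 3326
```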
I updated the settings_local.py in kuma-stage.mozilla.org/kuma. I need to run the update script to push this out to the web servers. I just want to make sure that I will not break anything if I do.
push it out to the staging web servers, right? won't break anything.
I pushed the changes to the staging web servers. I also checked and there is still no connection, so this is still waiting on bug 691920. Once that is done this should (hopefully) be working.
Luke, The firewall configuration has been completed and the test URL you posted previously appears to be working. I will close this out now but please let me know if anything is not working as expected.