kuma-stage: set up a sphinx server

Status

Product: Infrastructure & Operations
Component: WebOps: Other
Status: RESOLVED FIXED
Opened: 6 years ago
Updated: 4 years ago

People

(Reporter: groovecoder, Assigned: jd)

(Reporter)

Description

6 years ago
Work with webdev (groovecoder and/or lorchard) to set up a sphinx server for MDN staging.

Comment 1

6 years ago
We are migrating to ElasticSearch. Do you really want to use sphinx?

Comment 2

We had a discussion about this in #webdev, which included cshields.

He had concerns over the transition to ElasticSearch, and it sounded like it was early days yet. We have code that already uses Sphinx. So, rather than jump into something unknown, I think the decision at the time was to go ahead with Sphinx for MDN for now and revisit the decision once others had spent time with ElasticSearch.

If that's wrong, do we need a meeting or to CC more people on this bug?

Comment 3

Jason,

We have a general-use set of sphinx nodes in sjc1, managed by puppet. You'll need to copy an instance and configure it for kuma-stage, then let Luke/Les know what port it is running on (so they can request the appropriate config file change).

The files you will need to copy and commit are in /modules/sphinx/files/searchd, and you will also need to add entries for the new instance to the puppet manifest, /modules/sphinx/manifests/generic.pp.

This will then end up live on pm-app-sphinx01 and pm-app-sphinx02 (and I believe there is a zeus VIP for those: pm-app-sphinx.mozilla.org).
Assignee: server-ops → jcrowe
(Assignee)

Comment 4

6 years ago
I have this in progress, but I need to know where to find the sphinx.conf file that you would like me to use. I think that we generally grab this file from git just prior to re-indexing.

Also, if you have any other sphinx configs (wordforms.txt, etc.), I will need to know about those as well.
(Reporter)

Comment 5

6 years ago
The sphinx.conf file is in our kuma repo at configs/sphinx/sphinx.conf. Do you make a symlink for it?

I'm still modifying our sphinx implementation from what we inherited from kitsune, but that is where the file will live.
(Assignee)

Comment 6

6 years ago
I think we generally are pulling the file from git just prior to re-indexing.  I will set this up the same way unless you have a strong objection.
(Reporter)

Comment 7

6 years ago
(In reply to comment #6)
> I think we generally are pulling the file from git just prior to re-indexing. 
> I will set this up the same way unless you have a strong objection.

Sounds good. If something comes up we'll let you know. ;)
(Assignee)

Comment 8

6 years ago
Okay, I have this basically working on port 3325. I am fetching the following files from github just prior to each re-index run: sphinx.conf, stopwords.txt, and wordforms.txt.

When running the index job I get the following errors:
indexing index 'questions'...
ERROR: index 'questions': sql_query: Table 'kuma_stage_mozilla_org_django.questions_question' doesn't exist (DSN=mysql://kuma_sphinx:***@tm-stage01-slave01.mozilla.org:3306/kuma_stage_mozilla_org_django).
indexing index 'discussion_forums'...
ERROR: index 'discussion_forums': sql_query: Table 'kuma_stage_mozilla_org_django.forums_post' doesn't exist (DSN=mysql://kuma_sphinx:***@tm-stage01-slave01.mozilla.org:3306/kuma_stage_mozilla_org_django).

You may note that the wiki pages one seems to be working:
indexing index 'wiki_pages'...
collected 3 docs, 0.0 MB
collected 0 attr values

If you update the sphinx.conf file, you should see the changes after the next re-index; the cron job is scheduled at minutes 5, 20, 35, and 50 of each hour.

If you would like me to manually run the index job for error output just let me know.
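
To estimate how long a sphinx.conf change will take to show up, the schedule described above (minutes 5, 20, 35, and 50 of each hour) can be sketched as a small helper. This is only an illustration: the names `REINDEX_MINUTES` and `next_reindex` are hypothetical and not part of the actual cron setup, and the sketch assumes the job runs every hour of every day.

```python
from datetime import datetime, timedelta

# Minutes past the hour at which the re-index cron job runs (from the comment above).
REINDEX_MINUTES = (5, 20, 35, 50)


def next_reindex(now, minutes=REINDEX_MINUTES):
    """Return the first scheduled run strictly after `now` (hypothetical helper)."""
    base = now.replace(second=0, microsecond=0)
    for minute in sorted(minutes):
        candidate = base.replace(minute=minute)
        if candidate > now:
            return candidate
    # Past the last slot in this hour: roll over to the first slot of the next hour.
    next_hour = base.replace(minute=0) + timedelta(hours=1)
    return next_hour.replace(minute=min(minutes))
```

For example, a change pushed at 14:07 would be picked up by the 14:20 run, so the wait is at most 15 minutes plus the indexing time.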
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
(Reporter)

Comment 9

6 years ago
Cool. I'm modifying the sphinx.conf file now to remove the indices for the resources we removed, which should fix those errors once I push it to stage9. That should be today or tomorrow. I'll let you know.

Thanks!
(Reporter)

Comment 10

6 years ago
stage9 says search unavailable.

https://kuma-stage.mozilla.org/en-US/search?q=calls

Jason, can you confirm the sphinx settings we have?

SPHINX_HOST = '127.0.0.1'
SPHINX_PORT = 3381
SPHINXQL_PORT = 3382

SPHINX_INDEXER = '/usr/bin/indexer'
SPHINX_SEARCHD = '/usr/bin/searchd'
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Assignee)

Comment 11

6 years ago
Luke,

I believe you need to use the load balancer VIP hostname, which is 'pm-app-sphinx.mozilla.org'. Also, this app is running on:

LISTEN_PORT       = 3325
MYSQL_LISTEN_PORT = 3326

Please let me know if this works for you.  If not I will have to dig more into the network topology.
Status: REOPENED → ASSIGNED
(Reporter)

Comment 12

6 years ago
Jason,

We don't have access to settings_local.py. Can you update the kuma settings_local.py with those correct values?

SPHINX_HOST = 'pm-app-sphinx.mozilla.org'
SPHINX_PORT = 3325
SPHINXQL_PORT = 3326

And verify the SPHINX_INDEXER and SPHINX_SEARCHD binary values are correct?
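
The requested change amounts to overriding three defaults from settings.py with stage-specific values. A minimal sketch of that relationship follows; it assumes local values simply replace same-named defaults, and the names `DEFAULTS`, `STAGE_OVERRIDES`, and `effective_settings` are hypothetical (how kuma actually loads settings_local.py is not shown in this bug).

```python
# Defaults as quoted from settings.py earlier in this bug.
DEFAULTS = {
    "SPHINX_HOST": "127.0.0.1",
    "SPHINX_PORT": 3381,
    "SPHINXQL_PORT": 3382,
    "SPHINX_INDEXER": "/usr/bin/indexer",
    "SPHINX_SEARCHD": "/usr/bin/searchd",
}

# Values requested for settings_local.py in this comment.
STAGE_OVERRIDES = {
    "SPHINX_HOST": "pm-app-sphinx.mozilla.org",
    "SPHINX_PORT": 3325,
    "SPHINXQL_PORT": 3326,
}


def effective_settings(defaults, overrides):
    """Return a new dict in which local overrides win over the defaults."""
    merged = dict(defaults)
    merged.update(overrides)
    return merged
```

With these inputs, the host and both ports change while the indexer and searchd paths keep their defaults, which is why those paths still need to be verified separately.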
(Assignee)

Updated

6 years ago
Depends on: 691920
(Assignee)

Comment 13

6 years ago
Luke,

So a bit of a confusion on my part yesterday.  The host for the sphinx configs is localhost.  The host used by any remote servers attempting to query this sphinx instance is 'pm-app-sphinx.mozilla.org'.

Those port numbers and the 'localhost' hostname were already set up correctly in the localsettings.py file.

I made some minor changes to the way we are dealing with the sphinx.conf file but these should not affect the operation.

When I run a test on the command line of the sphinx server it seems to work correctly.  Additionally when I manually run the resync job it appears to work correctly.

Can you please confirm for me that the application is attempting to connect to 'pm-app-sphinx.mozilla.org' on port 3325?

In the meantime, there is a firewall issue blocking access to the Sphinx server from the staging servers. I filed bug 691920, and this should be working as soon as that is resolved.
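
A quick way to tell a firewall block from an application problem is to attempt a plain TCP connection from a staging web server to the VIP and port. The sketch below uses only the Python standard library; `can_connect` is a hypothetical helper name, and a successful connection only shows that the port is reachable, not that searchd is answering queries correctly.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example, to be run from a staging web server:
#   can_connect("pm-app-sphinx.mozilla.org", 3325)
```

If this returns False from the staging servers while a local test on the sphinx host succeeds, that points at the network path rather than at the sphinx configuration.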
(Reporter)

Comment 14

6 years ago
I assume you meant settings_local.py?

So do we need localhost/127.0.0.1 or 'pm-app-sphinx.mozilla.org'? In either case, we don't have access to modify the settings_local.py file at /data/www/django/developer-stage9.mozilla.org/kuma - IT will need to make it match the right values.
(Assignee)

Comment 15

6 years ago
Luke,

Once again I am talking about something completely different :).  I was referring to the configuration files on the Sphinx server.  I now understand that you are talking about files in the web application itself.

Now that we are on the same page (I hope) I wish to verify a few things.  I am looking at the settings.py file from github and see the settings that I now think you were talking about previously.  Those being:

SPHINX_HOST = '127.0.0.1'
SPHINX_PORT = 3381
SPHINXQL_PORT = 3382

SPHINX_INDEXER = '/usr/bin/indexer'
SPHINX_SEARCHD = '/usr/bin/searchd'

Now I am curious how this will work. The Sphinx server is remote from the web servers, so I do not think that the indexer and searchd paths are relevant. At the very least, I can tell you that neither binary is installed on the staging servers, so if you need them this will be a larger issue. I thought you could just query the server on this port using an API or similar. If I am incorrect, please let me know and I will see what I can do.

As to the host and port numbers in the settings_local.py file, I will work to get those updated (inserted?). However, I would like to verify that you want them in 'developer-stage9.mozilla.org/kuma/settings_local.py' and not in 'kuma-stage.mozilla.org/kuma/settings_local.py'. I ask because I see the document root set to /data/www/django/kuma-stage.mozilla.org/kuma/webroot for 'ServerName kuma-stage.mozilla.org' on the staging web servers, so I must be missing something (which would not be a big surprise, since I have missed every step to this point).

Cheers
(Reporter)

Comment 16

6 years ago
No worries - MDN is a very messy staging situation. kuma-stage is the stage server for our master branch with all the new wiki stuff. developer-stage9 is the stage server for our mdn branch with all the old MindTouch wiki stuff.

We're moving to sphinx only for the kuma stuff, so we need to add new SPHINX_HOST, SPHINX_PORT, and SPHINXQL_PORT values into kuma-stage.mozilla.org/kuma/settings_local.py

As I understand, pm-app-sphinx.mozilla.org should run a new searchd process with  kuma/config/sphinx/sphinx.conf. When the new process is up and running, we will add a new reindex cron job that also uses kuma/config/sphinx/sphinx.conf.

http://support.allizom.org/ already has all of this, so we should be able to copy/paste lots of their config and setup.
(Assignee)

Comment 17

6 years ago
I updated the settings_local.py in kuma-stage.mozilla.org/kuma.

I need to run the update script to push this out to the web servers. Can you confirm that I will not break anything if I do?
(Reporter)

Comment 18

6 years ago
Push it out to the staging web servers, right? That won't break anything.
(Assignee)

Comment 19

6 years ago
I pushed the changes to the staging web servers. I also checked, and there is still no connection, so this is still waiting on bug 691920. Once that is done, this should (hopefully) be working.
(Assignee)

Comment 20

6 years ago
Luke,

The firewall configuration has been completed and the test URL you posted previously appears to be working.

I will close this out now but please let me know if anything is not working as expected.
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
Component: Server Operations: Web Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations