Closed Bug 1090818 Opened 11 years ago Closed 11 years ago

Update settings/local.py in mozillians-dev.allizom.org

Categories

(Infrastructure & Operations Graveyard :: WebOps: Engagement, task)

task
Not set
major

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: tasos, Assigned: cliang)

Details

(Whiteboard: [kanban:webops:https://kanbanize.com/ctrl_board/4/1770] )

Hi, in settings/local.py, can you please replace ES_HOSTS = ['<elastic_node_ip:port>'] with ES_URLS = ['http://<elastic_node_ip:port>']? Thanks!
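For illustration only, the requested change amounts to prefixing each bare host:port entry with a scheme. The sketch below is not part of the mozillians code base; the helper name `hosts_to_urls` is hypothetical, and the host value is the one that appears in the deployed diff for this bug.

```python
def hosts_to_urls(hosts, scheme="http"):
    """Turn bare 'host:port' entries into URLs; leave full URLs untouched.

    Hypothetical helper for illustration; not part of mozillians or elasticutils.
    """
    urls = []
    for host in hosts:
        if "://" in host:
            # Already a URL (for example an https:// entry), keep as-is.
            urls.append(host)
        else:
            urls.append("%s://%s" % (scheme, host))
    return urls


# Old-style setting (host:port only).
ES_HOSTS = ['elasticsearch-zlb.dev.vlan81.phx1.mozilla.com:9200']

# New-style setting (full URL, including the scheme).
ES_URLS = hosts_to_urls(ES_HOSTS)
```

In practice the setting was simply edited by hand in settings/local.py; the helper only makes the difference between the two formats explicit.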
Assignee: server-ops-webops → cliang
settings/local.py was updated and the changes have been deployed. I don't know if other changes need to be made: when I executed a search, it failed (and I got the jitterbeast =) ).

    112,113c112,113
    < # ES Bug 712860 & 879336 & 810960
    < ES_HOSTS = ['elasticsearch-zlb.dev.vlan81.phx1.mozilla.com:9200']
    ---
    > # ES Bug 712860 & 879336 & 810960 & 1090818
    > ES_URLS = ['http://elasticsearch-zlb.dev.vlan81.phx1.mozilla.com:9200']
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
We are getting this [1] error in our tracebacks. Can you please check if the net flow is open from our dev deployment (mozillians-dev.allizom.org) to the elastic node that is added in ES_URLS? It seems kind of odd since this used to be working fine in the previous setup, but the connection error indicates that something is going wrong while trying to connect to the elastic server. [1] http://pastebin.mozilla.org/6987095
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
I'm able to reach the ES from the generic cluster development webhead:

    $ curl -XGET 'http://elasticsearch-zlb.dev.vlan81.phx1.mozilla.com:9200/_cluster/health?pretty=true'
    {
      "cluster_name" : "it_es_dev",
      "status" : "green",
      "timed_out" : false,
      "number_of_nodes" : 3,
      "number_of_data_nodes" : 3,
      "active_primary_shards" : 180,
      "active_shards" : 360,
      "relocating_shards" : 0,
      "initializing_shards" : 0,
      "unassigned_shards" : 0
    }
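As a rough sketch, the same health response could be checked programmatically rather than by eye. The function name `cluster_is_healthy` and the rule used here (status "green", not timed out, no unassigned shards) are assumptions made for illustration; the sample body is copied from the curl output in this comment.

```python
import json


def cluster_is_healthy(body, acceptable=("green",)):
    """Return True if a _cluster/health JSON body looks healthy.

    Assumed rule (illustrative only): status is acceptable, the request
    did not time out, and no shards are unassigned.
    """
    health = json.loads(body)
    return (
        health.get("status") in acceptable
        and not health.get("timed_out", True)
        and health.get("unassigned_shards", 0) == 0
    )


# Response body as reported from the development webhead.
SAMPLE_HEALTH = """
{
  "cluster_name" : "it_es_dev",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 180,
  "active_shards" : 360,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0
}
"""

print(cluster_is_healthy(SAMPLE_HEALTH))
```

A healthy response from the webhead only shows that the cluster is reachable from that host; it says nothing about which client library or setting the application itself is using.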
Not sure if this is related, but the app isn't returning any results. I can file a separate bug if you all think it's prudent.

Steps to reproduce:
0. Go to https://mozillians-dev.allizom.org/search/?q=sallaberry141950&limit=&include_non_vouched=on

Expected: Search returns a result set that includes https://mozillians-dev.allizom.org/u/sallaberry141950/
Actual: No results are returned.
I discovered this late yesterday after we tried to re-index ES: https://rpm.newrelic.com/accounts/263620/applications/2728415/traced_errors/2457797733. Is it related?

elasticsearch.exceptions:ConnectionError
mozillians.users.tasks.unindex_objects

Error message
elasticsearch.exceptions:ConnectionError: ConnectionError(('Connection aborted.', error(111, 'Connection refused'))) caused by: ProtocolError(('Connection aborted.', error(111, 'Connection refused')))

Stack trace
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/newrelic-2.18.1.15/newrelic/hooks/application_celery.py", line 66, in wrapper
    return wrapped(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/newrelic-2.18.1.15/newrelic/hooks/application_celery.py", line 61, in wrapper
    return wrapped(*args, **kwargs)
  File "/data/www/mozillians-dev.allizom.org/mozillians/vendor-local/lib/python/celery/app/task/__init__.py", line 262, in __call__
    return self.run(*args, **kwargs)
  File "/data/www/mozillians-dev.allizom.org/mozillians/mozillians/users/tasks.py", line 250, in unindex_objects
    mapping_type.unindex(id_, es=es, public_index=public_index)
  File "/data/www/mozillians-dev.allizom.org/mozillians/mozillians/users/es.py", line 116, in unindex
    raise e
ConnectionError: ConnectionError(('Connection aborted.', error(111, 'Connection refused'))) caused by: ProtocolError(('Connection aborted.', error(111, 'Connection refused')))
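Error 111 on Linux is ECONNREFUSED: something answered at the target address, but nothing was listening on that port. A minimal sketch for telling "refused" apart from other failures (timeouts, DNS errors) at the TCP level is shown below. The function name `probe_tcp` is hypothetical and uses only the standard library; it does not reproduce what elasticsearch-py does internally.

```python
import errno
import socket


def probe_tcp(host, port, timeout=3.0):
    """Try a plain TCP connection and classify the outcome.

    Returns 'open', 'refused', or 'unreachable: <reason>'.
    Hypothetical diagnostic helper, for illustration only.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.error as exc:
        if getattr(exc, "errno", None) == errno.ECONNREFUSED:
            # Same condition as error(111, 'Connection refused') on Linux.
            return "refused"
        return "unreachable: %s" % exc
    sock.close()
    return "open"
```

For example, `probe_tcp('elasticsearch-zlb.dev.vlan81.phx1.mozilla.com', 9200)` run from the celery host would be expected to report "open" if the flow is in place. A "refused" result against a different address (such as localhost) would instead suggest the client is not using the configured ES_URLS value.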
Severity: normal → major
Flags: needinfo?(cliang)
The error above looks like it comes from the development generic celery server. It, too, is able to reach the ES cluster if I issue the curl command listed in comment #3. I have a "dumb" question for you: Is the mozillians dev code supposed to be using pyes or pyelasticsearch? Bug 855806 shows a similar change from an ES_HOSTS to an ES_URLS setting, due to a change in elasticutils ("pyelasticsearch instead of pyes"). Looking at the vendor directory for SUMO, I see pyelasticsearch (git://github.com/rhec/pyelasticsearch.git); for mozillians-dev, I see pyes (git://github.com/aparo/pyes.git).
Flags: needinfo?(cliang)
Hi :cyliang, We are using elasticsearch-py [0], which is the replacement for pyelasticsearch after version 0.9 of elasticutils. The library is installed in the vendor-local folder. It's really strange that you are seeing pyes for mozillians-dev. The update script should have taken care of that.

[0] https://github.com/mozilla/mozillians/tree/master/vendor-local/src
"It's really strange that you are seeing pyes for mozillians-dev. The update script should have taken care of that."

Just to be clear: I only changed the ES_* related settings. I did not run the update script. For giggles, I added back the ES_HOSTS:

    ES_HOSTS = ['elasticsearch-zlb.dev.vlan81.phx1.mozilla.com:9200']
    ES_URLS = ['http://elasticsearch-zlb.dev.vlan81.phx1.mozilla.com:9200']

... and pushed that out to the development web server. The search now seems to work. Is it possible that this is a precedence issue (pyes is being found first rather than py-elasticsearch)?
Hi :cyliang, By update script I meant the chief script. Sorry for the confusion. The problem was probably that we didn't restart the celery server after the change to local.py. We pushed a new commit, which triggered a restart of the celery server, and everything seems to be working fine now. Can you please remove the ES_HOSTS to check that everything is OK? Thanks!
I've updated the settings/local.py file to remove ES_HOSTS. [1] The URL in comment #4 doesn't produce the expected output (for me) but it also doesn't produce the jitterbeast. =)

[1]
    $ grep ES_ settings/local.py
    ES_URLS = ['http://elasticsearch-zlb.dev.vlan81.phx1.mozilla.com:9200']
    ES_DISABLED = False
    ES_INDEXES = dict(default='mozillians_dev', public='mozillians_public_dev')
    ES_TIMEOUT = 60
    ES_INDEXING_TIMEOUT = 60
Whiteboard: [kanban:webops:https://kanbanize.com/ctrl_board/4/1770] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/2200] [kanban:webops:https://kanbanize.com/ctrl_board/4/1770]
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/2200] [kanban:webops:https://kanbanize.com/ctrl_board/4/1770] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/2211] [kanban:webops:https://kanbanize.com/ctrl_board/4/1770]
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/2211] [kanban:webops:https://kanbanize.com/ctrl_board/4/1770] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/2215] [kanban:webops:https://kanbanize.com/ctrl_board/4/1770]
Everything works as expected on the dev server, including the URL in comment #4. Marking this as resolved. Thanks!
Status: REOPENED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/2215] [kanban:webops:https://kanbanize.com/ctrl_board/4/1770] → [kanban:webops:https://kanbanize.com/ctrl_board/4/1770]
QA verified - dev looks right as rain. Test automation runs and passes.
Status: RESOLVED → VERIFIED
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard