Closed Bug 762167 Opened 12 years ago Closed 12 years ago

ElasticSearch Timeouts for mozillians.allizom.org

Categories

(Infrastructure & Operations Graveyard :: WebOps: Other, task)

x86
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bburton, Assigned: bburton)

Investigating.
An initial check of the dev/stage ES nodes looks OK:

bburton@andesite ~$ curl -v elasticsearch-dev1.webapp.phx1.mozilla.com:9200
* About to connect() to elasticsearch-dev1.webapp.phx1.mozilla.com port 9200 (#0)
*   Trying 10.8.81.71... connected
* Connected to elasticsearch-dev1.webapp.phx1.mozilla.com (10.8.81.71) port 9200 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.21.4 (universal-apple-darwin11.0) libcurl/7.21.4 OpenSSL/0.9.8r zlib/1.2.5
> Host: elasticsearch-dev1.webapp.phx1.mozilla.com:9200
> Accept: */*
> 
< HTTP/1.1 200 OK
< Content-Type: application/json; charset=UTF-8
< Content-Length: 621
< 
{
  "ok" : true,
  "name" : "Venom",
  "version" : {
    "number" : "0.17.4",
    "date" : "2011-08-04T20:57:57",
    "snapshot_build" : false
  },
  "tagline" : "You Know, for Search",
  "cover" : "DON'T PANIC",
  "quote" : {
    "book" : "So Long And Thanks for All the Fish",
    "chapter" : "Chapter 23",
    "text1" : "Ford: \"Life,\" he said, \"is like a grapefruit.\"",
    "text2" : "Creature:\"Er, how so?\"",
    "text3" : "Ford: \"Well, it's sort of orangey-yellow and dimpled on the outside, wet and squidgy in the middle. It's got pips inside, too. Oh, and some people have half a one for breakfast.\""
  }
}
* Connection #0 to host elasticsearch-dev1.webapp.phx1.mozilla.com left intact
* Closing connection #0

bburton@andesite ~$ curl -v elasticsearch-dev1.webapp.phx1.mozilla.com:9200
* About to connect() to elasticsearch-dev1.webapp.phx1.mozilla.com port 9200 (#0)
*   Trying 10.8.81.71... connected
* Connected to elasticsearch-dev1.webapp.phx1.mozilla.com (10.8.81.71) port 9200 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.21.4 (universal-apple-darwin11.0) libcurl/7.21.4 OpenSSL/0.9.8r zlib/1.2.5
> Host: elasticsearch-dev1.webapp.phx1.mozilla.com:9200
> Accept: */*
> 
< HTTP/1.1 200 OK
< Content-Type: application/json; charset=UTF-8
< Content-Length: 862
< 
{
  "ok" : true,
  "name" : "Venom",
  "version" : {
    "number" : "0.17.4",
    "date" : "2011-08-04T20:57:57",
    "snapshot_build" : false
  },
  "tagline" : "You Know, for Search",
  "cover" : "DON'T PANIC",
  "quote" : {
    "book" : "The Restaurant at the End of the Universe",
    "chapter" : "Chapter 23",
    "text1" : "The designer of the gun had clearly not been instructed to beat about the bush. \"Make it evil,\" he'd been told. \"Make it totally clear that this gun has a right end and a wrong end. Make it totally clear to anyone standing at the wrong end that things are going badly for them. If that means sticking all sort of spikes and prongs and blackened bits all over it then so be it. This is not a gun for hanging over the fireplace or sticking in the umbrella stand, it is a gun for going out and making people miserable with.\""
  }
}
* Connection #0 to host elasticsearch-dev1.webapp.phx1.mozilla.com left intact
* Closing connection #0
Status: NEW → ASSIGNED
Summary: ElasticSearch Timeouts for mozllians.allizom.org → ElasticSearch Timeouts for mozillians.allizom.org
I'd like to try bumping the timeout to 5s, per http://elasticutils.readthedocs.org/en/latest/installation.html#django.conf.settings.ES_TIMEOUT

Do you see any issues with that?
Additional confirmation of cluster health:

bburton@andesite ~$ curl -v "http://elasticsearch-dev2.webapp.phx1.mozilla.com:9200/_cluster/health?pretty=true"
* About to connect() to elasticsearch-dev2.webapp.phx1.mozilla.com port 9200 (#0)
*   Trying 10.8.81.72... connected
* Connected to elasticsearch-dev2.webapp.phx1.mozilla.com (10.8.81.72) port 9200 (#0)
> GET /_cluster/health?pretty=true HTTP/1.1
> User-Agent: curl/7.21.4 (universal-apple-darwin11.0) libcurl/7.21.4 OpenSSL/0.9.8r zlib/1.2.5
> Host: elasticsearch-dev2.webapp.phx1.mozilla.com:9200
> Accept: */*
> 
< HTTP/1.1 200 OK
< Content-Type: application/json; charset=UTF-8
< Content-Length: 271
< 
{
  "cluster_name" : "phxdev",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 60,
  "active_shards" : 120,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0
}
* Connection #0 to host elasticsearch-dev2.webapp.phx1.mozilla.com left intact
* Closing connection #0
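The same health check can be scripted instead of read by eye. This is a minimal sketch, assuming the `_cluster/health` response carries the fields shown in the curl output above; the function name `health_problems` is hypothetical and not part of any ES client library.

```python
import json


def health_problems(payload):
    """Return a list of reasons the cluster looks unhealthy (empty if none)."""
    problems = []
    status = payload.get("status")
    if status != "green":
        problems.append("status is %r, expected 'green'" % status)
    if payload.get("timed_out"):
        problems.append("health request reported timed_out")
    # Shards that are still initializing or unassigned mean the cluster
    # has not fully recovered.
    for key in ("initializing_shards", "unassigned_shards"):
        count = payload.get(key, 0)
        if count != 0:
            problems.append("%s = %d" % (key, count))
    return problems


# Response body copied from the curl output in this bug.
SAMPLE = json.loads("""
{
  "cluster_name" : "phxdev",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 60,
  "active_shards" : 120,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0
}
""")

if __name__ == "__main__":
    print(health_problems(SAMPLE) or "cluster looks healthy")
```

A green cluster with no pending shards, combined with failing searches from the app, suggests the problem sits between the app host and ES rather than in ES itself.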
Can we go ahead and bump the timeout to 2 or 3 seconds before jumping to 5? This can be done in settings/local, correct?
Yup, 2s is fine, and it's an addition to local.py. I'll push out a 2s setting now.
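For reference, the change discussed here amounts to a one-line override. This is a sketch, assuming the file is the site's `settings/local.py` as mentioned above and that ElasticUtils reads `ES_TIMEOUT` as a number of seconds, per the documentation linked earlier in this bug:

```python
# settings/local.py (site-specific overrides)

# Seconds ElasticUtils waits on ElasticSearch before giving up.
# 2 is the value agreed on in this bug; 5 was the original proposal.
ES_TIMEOUT = 2
```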
This was actually a VLAN issue: after the switchover to the new Xeon SeaMicros, the IP that -stage was using to reach ES was wrong.

I've updated /etc/hosts and confirmed search works now.
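A quick way to catch this kind of stale address is to compare what a host name resolves to against the address it is expected to have, then try a TCP connection with a short timeout. This is a minimal sketch: the helper names are hypothetical, and the host/IP pair in the usage section is the dev1 pair from the curl output above, used only as an illustration (the addresses involved on -stage are not recorded in this bug).

```python
import socket


def check_resolution(hostname, expected_ip, resolve=socket.gethostbyname):
    """Compare the address `hostname` resolves to against `expected_ip`.

    Returns (ok, actual_ip). `resolve` is injectable so the check can be
    exercised without touching the real resolver.
    """
    actual_ip = resolve(hostname)
    return actual_ip == expected_ip, actual_ip


def can_connect(ip, port=9200, timeout=2.0):
    """Return True if a TCP connection to ip:port succeeds within `timeout`."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks, and timeouts.
        return False


if __name__ == "__main__":
    # Illustrative values taken from the dev1 curl output in this bug.
    host = "elasticsearch-dev1.webapp.phx1.mozilla.com"
    expected = "10.8.81.71"
    ok, actual = check_resolution(host, expected)
    print("resolves to %s (%s)" % (actual, "expected" if ok else "UNEXPECTED"))
    print("port 9200 reachable:", can_connect(actual))
```

The system resolver normally consults /etc/hosts first, so running this on the affected host would show whichever address the app itself ends up using.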
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Component: Server Operations: Web Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard