824065 - (ops-es-mdn) set up ES for MDN

Sounds good to me, the SCL3 ES cluster should be ready to use. All we should need is some netflows from your nodes to the ES cluster IPs, and then probably some settings_local.py entries.

The source IPs for this will be the web nodes, at least... possibly the celery node(s) at some point, but that's still not production at the moment.

developer1.webapp.scl3.mozilla.com - 10.22.81.18
developer2.webapp.scl3.mozilla.com - 10.22.81.19
developer3.webapp.scl3.mozilla.com - 10.22.81.20

@phrawzty: let us know what destination IP(s) and port(s) need to be allowed. When we have that we'll open a dependent ACL bug.

Flags: needinfo?(dmaher)

Priority: P2 → P3

Whiteboard: [triaged 20130104]

Jake Maul [:jakem]

Comment 2

•

13 years ago

I missed a note from triage (and comment 0)... the info above is for prod only. Obviously we'll want dev and stage also. I'm okay with doing a cross-DC ACL for dev and stage, if NetOps is willing to do it. I'm also okay with those living on the same ES cluster, just using different indexes. If anyone is concerned about this, please speak up. :)

Those source nodes are:

developer1.dev.webapp.scl3.mozilla.com - 10.22.81.16
developer1.stage.webapp.scl3.mozilla.com - 10.22.81.17

Daniel Maher [:phrawzty]

Comment 3

•

13 years ago

Hi :jakem (et al.),

As a general rule, nodes in the webapp VLAN can all talk to each other without requiring additional flows to be opened, though this is true only within a given data centre. Thus, the MDN (prod) machines can already communicate with the (prod) ES cluster at SCL3 :

$ for i in developer1.dev developer1.stage developer1; do ssh $i.webapp.scl3.mozilla.com 'nc -vz elasticsearch-zlb.webapp.scl3.mozilla.com 9200'; done
Connection to elasticsearch-zlb.webapp.scl3.mozilla.com 9200 port [tcp/wap-wsp] succeeded!
Connection to elasticsearch-zlb.webapp.scl3.mozilla.com 9200 port [tcp/wap-wsp] succeeded!
Connection to elasticsearch-zlb.webapp.scl3.mozilla.com 9200 port [tcp/wap-wsp] succeeded!

However, as :jakem noted, for the dev and stage machines at SCL3 to communicate with the dev & stage cluster at PHX1, a flow will need to be opened :

$ for i in developer1.dev developer1.stage; do ssh $i.webapp.scl3.mozilla.com 'nc -w 1 -vz elasticsearch-zlb.dev.vlan81.phx.mozilla.com 9200'; done
nc: connect to elasticsearch-zlb.dev.vlan81.phx.mozilla.com port 9200 (tcp) timed out: Operation now in progress
nc: connect to elasticsearch-zlb.dev.vlan81.phx.mozilla.com port 9200 (tcp) timed out: Operation now in progress

In addition, it can be useful (though not always necessary) for the admin node to be able to communicate with the ES cluster(s) :

$ ssh developeradm.private.scl3.mozilla.com 'nc -w 1 -vz elasticsearch-zlb.dev.vlan81.phx.mozilla.com 9200'
nc: connect to elasticsearch-zlb.dev.vlan81.phx.mozilla.com port 9200 (tcp) timed out: Operation now in progress
$ ssh developeradm.private.scl3.mozilla.com 'nc -w 1 -vz elasticsearch-zlb.webapp.scl3.mozilla.com 9200'
nc: connect to elasticsearch-zlb.webapp.scl3.mozilla.com port 9200 (tcp) timed out: Operation now in progress

In summary, these are the flows from these machines :
* developer1.dev.webapp.scl3.m.c
* developer1.stage.webapp.scl3.m.c
* (optional) developeradm.private.scl3.m.c

will need to be opened to :
* elasticsearch-zlb.dev.vlan81.phx.m.c : 9200/tcp
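The nc checks in comment 3 can also be scripted. What follows is only a minimal sketch, not something taken from this bug: it assumes Python is available on the node being tested, and the helper names can_connect and check_flows are made up for illustration. It attempts a TCP connection with a short timeout, roughly what `nc -w 1 -vz host 9200` does.

```python
import socket


def can_connect(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper; roughly mirrors `nc -w 1 -vz host port`.
    Timeouts, refused connections and DNS failures all count as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_flows(targets, timeout=1.0):
    """Map each (host, port) pair to whether it is reachable from this node."""
    return {(host, port): can_connect(host, port, timeout) for host, port in targets}


# Destinations named in comment 3; the port is the ES HTTP port, 9200/tcp.
ES_TARGETS = [
    ("elasticsearch-zlb.webapp.scl3.mozilla.com", 9200),
    ("elasticsearch-zlb.dev.vlan81.phx.mozilla.com", 9200),
]
```

Calling check_flows(ES_TARGETS) on a source node returns a dict keyed by (host, port). A False value only shows that the connection did not succeed; like the nc timeout, it does not say whether the cause is a missing flow, DNS, or the service itself.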

Flags: needinfo?(dmaher)

John Karahalis [:openjck]

Updated

•

13 years ago

Alias: ops-es-mdn

Luke Crouch [:groovecoder]

Reporter

Updated

•

13 years ago

Blocks: 839214

Luke Crouch [:groovecoder]

Reporter

Comment 4

•

13 years ago

Bump. We need to set this up to unblock ES site search work.

Severity: normal → major

James Socol [:jsocol, :james]

Comment 5

•

13 years ago

(In reply to Daniel Maher [:phrawzty] (AFK through 4 March 2013) from comment #3)
> In summary, these are the flows from these machines :
> * developer1.dev.webapp.scl3.m.c
> * developer1.stage.webapp.scl3.m.c
> * (optional) developeradm.private.scl3.m.c
>
> will need to be opened to :
> * elasticsearch-zlb.dev.vlan81.phx.m.c : 9200/tcp

Is there a net-ops bug for this? It's the only IT blocker besides adding IP addresses to local configs.

Rick Bryce [:rbryce]

Comment 6

•

13 years ago

I'm knocking this down to Normal so it doesn't continue to page me. I have alerted WebOps to its escalated Severity.

Severity: major → normal

Brandon Burton [:solarce]

Comment 7

•

13 years ago

(In reply to James Socol [:jsocol, :james] from comment #5)
> (In reply to Daniel Maher [:phrawzty] (AFK through 4 March 2013) from comment #3)
> > In summary, these are the flows from these machines :
> > * developer1.dev.webapp.scl3.m.c
> > * developer1.stage.webapp.scl3.m.c
> > * (optional) developeradm.private.scl3.m.c
> >
> > will need to be opened to :
> > * elasticsearch-zlb.dev.vlan81.phx.m.c : 9200/tcp
>
> Is there a net-ops bug for this? It's the only IT blocker besides adding IP
> addresses to local configs.

I'll find or file an ACL bug today and we can work together next week to roll out the config changes.

Brandon Burton [:solarce]

Updated

•

13 years ago

Assignee: server-ops-webops → bburton

Priority: P3 → P1

James Socol [:jsocol, :james]

Comment 8

•

13 years ago

Wait, I just noticed this, but do we intend to open a route from scl3 to phx? Don't we have dev ES infra up in scl3?

Daniel Maher [:phrawzty]

Comment 9

•

13 years ago

(In reply to James Socol [:jsocol, :james] from comment #8)
> Wait, I just noticed this, but do we intend to open a route from scl3 to
> phx? Don't we have dev ES infra up in scl3?

At this time the only ES Dev cluster is in PHX1. If an SCL3 cluster is required, we may be able to get one set up down the line. For now, PHX1 is the only option.

Luke Crouch [:groovecoder]

Reporter

Comment 10

•

13 years ago

We'll need this for MDN in the next week or so. Which will be available?

* Open flows from SCL3 dev, stage, adm nodes to PHX ES cluster

or

* Set up dev ES in SCL3

Daniel Maher [:phrawzty]

Comment 11

•

13 years ago

(In reply to Luke Crouch [:groovecoder] from comment #10)
> We'll need this for MDN in the next week or so. Which will be available?
>
> * Open flows from SCL3 dev, stage, adm nodes to PHX ES cluster

Of the two options presented, opening the network flows is the only realistic one.

Luke Crouch [:groovecoder]

Reporter

Comment 12

•

13 years ago

What product/component do we use to file the netops bug? And mark it blocking this one.

Brandon Burton [:solarce]

Comment 13

•

13 years ago

I filed https://bugzilla.mozilla.org/show_bug.cgi?id=846934 to get the rest of the flows in place; it should be looked at Monday.

In the meantime, do you want to try and dark launch it for prod? We have flows for prod already, so maybe I can push the ES config and we can try some manage.py commands?

Flags: needinfo?(lcrouch)

Luke Crouch [:groovecoder]

Reporter

Comment 14

•

13 years ago

Thanks. Need to merge some more ES code to test on stage. Will ping back here.

Flags: needinfo?(lcrouch)

Brandon Burton [:solarce]

Comment 15

•

13 years ago

(In reply to Luke Crouch [:groovecoder] from comment #14)
> Thanks. Need to merge some more ES code to test on stage. Will ping back
> here.

Sounds good. I'm idling in #mdndev, so please ping me if I can help to test anything with this or celery.

Brandon Burton [:solarce]

Updated

•

13 years ago

Depends on: 848870

Luke Crouch [:groovecoder]

Reporter

Updated

•

13 years ago

Blocks: 853185

Brandon Burton [:solarce]

Updated

•

13 years ago

Assignee: bburton → server-ops-webops

Whiteboard: [triaged 20130104] → [triaged 20130104][waiting][853185]

Michael Cooper [:mythmon]

Updated

•

13 years ago

Blocks: 868506

Luke Crouch [:groovecoder]

Reporter

Comment 16

•

13 years ago

Been iterating on dev for a while and now we're ready to put stage and prod onto ES. I tried to index on stage and got:

AttributeError: 'Settings' object has no attribute 'ES_INDEXING_TIMEOUT'

So I think we need to copy the same ES_* values from dev's settings_local.py to stage's settings_local.py.

Daniel Maher [:phrawzty]

Comment 17

•

13 years ago

As per bug 853185c12 the ES config for stage has been set on the admin server. You may trigger a deployment via Chief at your leisure.

Luke Crouch [:groovecoder]

Reporter

Comment 18

•

13 years ago

Got this on stage when I tried to search:

AttributeError: 'Settings' object has no attribute 'ES_URLS'

Daniel Maher [:phrawzty]

Comment 19

•

13 years ago

On the admin server :

$ grep ^ES_ settings_local.py
ES_DISABLED = False
ES_INDEXES = {'default': 'main_index'}
ES_INDEX_PREFIX = 'mdnstage'
ES_LIVE_INDEX = True
ES_INDEXING_TIMEOUT = 30
ES_URLS = ['http://elasticsearch-zlb.dev.vlan81.phx1.mozilla.com:9200']

Did you push stage before attempting your search ?
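The AttributeError reports in comments 16 and 18 each surfaced one missing setting at a time. The sketch below shows one way all missing ES_* names could be listed in a single pass after a push. It rests on assumptions: the required list is simply the set of names shown in the grep output in comment 19, missing_es_settings is a hypothetical helper, and a plain SimpleNamespace stands in for the Django settings object so the example runs without Django.

```python
from types import SimpleNamespace

# Assumption: the names from the grep output in comment 19 are the ones the app reads.
REQUIRED_ES_SETTINGS = (
    "ES_DISABLED",
    "ES_INDEXES",
    "ES_INDEX_PREFIX",
    "ES_LIVE_INDEX",
    "ES_INDEXING_TIMEOUT",
    "ES_URLS",
)


def missing_es_settings(settings, required=REQUIRED_ES_SETTINGS):
    """Return the required setting names that `settings` does not define."""
    return [name for name in required if not hasattr(settings, name)]


# Stand-in for a deployed settings object that lacks two of the values.
stage_settings = SimpleNamespace(
    ES_DISABLED=False,
    ES_INDEXES={"default": "main_index"},
    ES_INDEX_PREFIX="mdnstage",
    ES_LIVE_INDEX=True,
)

print(missing_es_settings(stage_settings))
# prints ['ES_INDEXING_TIMEOUT', 'ES_URLS']
```

In a real deployment the same function could be handed the project's settings object from a manage.py shell. An empty list would suggest that the pushed settings_local.py is the one actually loaded.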

Luke Crouch [:groovecoder]

Reporter

Comment 20

•

13 years ago

Prod push, reindex, and search looks good. Thanks!

Status: NEW → RESOLVED

Closed: 13 years ago

Resolution: --- → FIXED

Nobody; OK to take it and work on it

Updated

•

12 years ago

Component: Server Operations: Web Operations → WebOps: Other

Product: mozilla.org → Infrastructure & Operations

BMO Automation

Updated

•

7 years ago

Product: Infrastructure & Operations → Infrastructure & Operations Graveyard