Closed Bug 751171 Opened 12 years ago Closed 12 years ago

build distributed ES cluster

Categories

(Cloud Services :: Operations: Marketplace, task)

x86
macOS
task
Not set
minor

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: cshields, Assigned: dmaher)

References

Details

I'm not sure if ES has the concept of "local vs remote", but if it does, we should make sure that it understands the difference between the 4 nodes that will be in SCL3 and the 4 nodes in PHX1.

We have the 4 nodes in SCL3 already kickstarted, elasticsearch[1-4].amopod1.addons.scl3

The 4 matching nodes in PHX1 have hit a racking delay that would hold up this proof of concept for too long, so we are going to test with some lower-RAM nodes for now and swap those out for the prod nodes in a few weeks.  Filed bug 751169 to have those done.
Severity: normal → major
Status: NEW → ASSIGNED
WIP.
Preliminary tests of site-awareness are positive; a test index spread across 2 sites of 4 nodes each was balanced correctly.  Further tests of shard/replica combinations are forthcoming, and following that: performance testing.
The cluster is up with four nodes each at SCL3 and PHX1.  ES will automatically spread the shards across the two sites, and given an appropriate combination of shards and replicas, it is possible to create indices with different levels of redundancy with respect to data centre and node loss.
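For reference, a minimal sketch of how the site-awareness can be expressed in the node configuration, assuming shard allocation awareness is the mechanism in play; the attribute name "datacenter", the file path, and the values are illustrative rather than what is actually deployed on these hosts:

# /etc/elasticsearch/elasticsearch.yml (path and attribute name are assumptions)
# on the SCL3 nodes:
node.datacenter: scl3
# on the PHX1 nodes:
node.datacenter: phx1
# on all nodes, tell the allocator to balance across that attribute:
cluster.routing.allocation.awareness.attributes: datacenter

With that in place, ES tries to keep a shard and its replicas on nodes with different values of the attribute, which is what produces the balanced spread between the two sites.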
06:22:10 <@cshields> can you sync up with jason and oremj - get the amo stage environment to use this new cluster for testing
I think we need to raise the number of open file limits for the elastic search instances on these hosts:

java.io.IOException: Too many open files
	at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
	at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:163)
	at org.elasticsearch.common.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.run(NioServerSocketPipelineSink.java:227)
	at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
	at java.lang.Thread.run(Thread.java:679)

[root@elasticsearch-tmp2.addons.phx1 elasticsearch]# cat /proc/8322/limits | grep file
Max file size             unlimited            unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max open files            1024                 1024                 files     
Max file locks            unlimited            unlimited            locks 

http://www.elasticsearch.org/tutorials/2011/04/06/too-many-open-files.html
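In case it's useful, a minimal sketch of the usual fix, assuming ES runs as a dedicated "elasticsearch" user (user name, limit value, and file paths are assumptions):

# /etc/security/limits.conf (or a drop-in under /etc/security/limits.d/)
elasticsearch  soft  nofile  65535
elasticsearch  hard  nofile  65535

# alternatively, raise the limit in the init script just before launching the JVM:
ulimit -n 65535

# verify against the running process:
cat /proc/$(pgrep -f elasticsearch | head -n 1)/limits | grep 'open files'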
Indeed - I've not actually adjusted any of the system settings yet, as I was strictly interested in the mechanics of a multi-homed ES cluster last week.  This week I'm going to tune things properly, then set up a testing programme to put a few different shard / replica combinations through some automated testing.  Once that's done, the same tests will be repeated with the cluster in various states of degradation.  I'm not going to reinvent the wheel, as we've already done a fair amount of testing[1], but there are some scenarios that require additional study.

Once the multi-homed model has been validated we can extend AMO staging and see how it fares against that use case.  I'd also be highly interested in extending some nodes into the Amazon cloud, just to see if and how that element can come into play - but this latter task isn't a top priority (afaik).

[1] https://mana.mozilla.org/wiki/display/websites/Elastic+Search#ElasticSearch-Testing
Okay, great. I was trying to hook up AMO staging for testing earlier today; however, I was blocked by the "too many open files" error shown in comment 5 when trying to create indexes for BAMO and AMO.
Automated tests are currently underway.  I'm going to let them run through the night - these are just the baseline tests, so the results won't be particularly interesting.  Tomorrow I'll initiate the degradation testing, which should produce much more enlightening results.  When (if?) we're satisfied with that, we'll go ahead and roll AMO staging over to the new cluster; target early next week (11 June 2012).
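To give an idea of what a degradation step looks like in practice, it's essentially "remove a node, then watch the shards re-allocate".  The host name below comes from the SCL3 pod and the service name is an assumption, so treat this as a sketch rather than the actual test harness:

# take one node out of the cluster (init script name is an assumption)
ssh elasticsearch1.amopod1.addons.scl3 'service elasticsearch stop'

# from any surviving node, watch the cluster status and shard re-allocation
curl 'http://localhost:9200/_cluster/health?pretty=true'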
After roughly 24 hours of automated testing, the averages are as follows :

default :
_all.primaries.docs.count : 2444
_all.primaries.indexing.index_time_in_millis : 8793
_all.primaries.store.size_in_bytes : 142516776
_all.total.docs.count : 25933
_all.total.indexing.index_time_in_millis : 24295
_all.total.store.size_in_bytes : 218378509

critical :
_all.primaries.docs.count : 4266
_all.primaries.indexing.index_time_in_millis : 8623
_all.primaries.store.size_in_bytes : 135713329
_all.total.docs.count : 5550
_all.total.indexing.index_time_in_millis : 80195
_all.total.store.size_in_bytes : 175593968

bigtime :
_all.primaries.docs.count : 2745
_all.primaries.indexing.index_time_in_millis : 15879
_all.primaries.store.size_in_bytes : 100333238
_all.total.docs.count : 4480
_all.total.indexing.index_time_in_millis : 148960
_all.total.store.size_in_bytes : 143752669
Please ignore the statistics in comment #9; I made an error in the portion of my testing script that creates the indexes, and as a result, they were not reliably being created with the desired shard / replica combinations.  I fixed the error and re-ran the tests this week-end, and the results are actually quite interesting :

default :
_all.primaries.docs.count : 2025
_all.primaries.indexing.index_time_in_millis : 58847
_all.primaries.store.size_in_bytes : 114995479
_all.total.docs.count : 9894
_all.total.indexing.index_time_in_millis : 28532
_all.total.store.size_in_bytes : 99228557

critical :
_all.primaries.docs.count : 2631
_all.primaries.indexing.index_time_in_millis : 30927
_all.primaries.store.size_in_bytes : 141211684
_all.total.docs.count : 41183
_all.total.indexing.index_time_in_millis : 36273
_all.total.store.size_in_bytes : 757185744

bigtime :
_all.primaries.docs.count : 11716
_all.primaries.indexing.index_time_in_millis : 18094
_all.primaries.store.size_in_bytes : 75169101
_all.total.docs.count : 17566
_all.total.indexing.index_time_in_millis : 94147
_all.total.store.size_in_bytes : 155106212

Interestingly, the "critical" combination was able to index the largest number of documents during each test run.
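For anyone who wants to poke at one of these combinations by hand: the shard / replica combination is set when the index is created, and the figures above come out of the index stats API.  The index name and numbers below are illustrative, not the exact "default" / "critical" / "bigtime" profiles:

# create an index with an explicit shard/replica combination
curl -XPUT 'http://localhost:9200/test_index' -d '{
  "settings" : {
    "index.number_of_shards" : 4,
    "index.number_of_replicas" : 2
  }
}'

# pull the indexing/store statistics (includes _all.primaries.* and _all.total.*)
curl 'http://localhost:9200/_stats?pretty=true'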
Reducing priority since we need to deal with getting new hardware up for the existing single-homed clusters.
Related :
* bug 754547
* bug 763975
* bug 771532
* bug 763923
Severity: major → minor
Status: ASSIGNED → NEW
At this point do we know what it will take to build a distributed ES cluster? If we do, I'd like to close this bug out and open bugs for individual clusters, as needed.
Yes, we know what it will take, and we can build them out as necessary.  Closing this bug.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Component: Server Operations: AMO Operations → Operations: Marketplace
Product: mozilla.org → Mozilla Services