Closed
Bug 656297
Opened 13 years ago
Closed 12 years ago
Order boxes for Socorro Elastic search cluster
Categories
(Infrastructure & Operations Graveyard :: WebOps: Other, task)
Infrastructure & Operations Graveyard
WebOps: Other
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: laura, Assigned: cshields)
References
Details
(Whiteboard: [allhands])
According to dre/anurag, these boxes should be basically the same as the ones in the Metrics ES prod cluster. Minimum spec:
- 6 boxes
- Intel(R) Xeon(R) CPU L5520 @ 2.27GHz
- Memory: 24GB
- Minimum disk of 1TB each
Comment 1 (Assignee) • 13 years ago
The metrics boxes were repurposed IX nodes. I don't see any more IX spares in inventory. mrz, should we order more IX nodes or ask Rich for a quote from HP? With the rest of socorro on HP I would prefer the latter.
Comment 2•13 years ago
These need to go to SJC? Getting really tight on power.
Comment 3 (Assignee) • 13 years ago
I guess I just made the assumption of phx since that is where socorro lives. Laura?
Comment 4 (Reporter) • 13 years ago
Should be in PHX.
Comment 5•13 years ago
Boy, that makes it easier. Different problem - we don't have rack space for this platform :) Also, we need to figure out the hardware (we're more optimized for blades).
Comment 7•13 years ago
Rich - what sort of SL options do I have here? I'm limited on rack space.
Comment 8 (Assignee) • 13 years ago
Blades + storage blades are probably an option as well. Likely a more expensive option, and I don't want to eat through all of the chassis space we are just about to get - I already have a lot of requests coming up.
Comment 9•13 years ago
I'm happy to explore how much storage we could squeeze into a standard blade to avoid having to go the route of storage blades.
Comment 10•13 years ago
A standard blade, assuming 600GB drives, gives 600GB. Likely no more than that.
Comment 11•13 years ago
I believe we can work with that. We can add more nodes in the future if space becomes an issue.
Comment 12•13 years ago
For the standard blade there is a newly announced 900GB option (not shipping yet) or a 600GB option. Each blade supports 2 drives. For the SL series, how many nodes would you be looking for?
Comment 13•13 years ago
Rich - in the SL series, I'd need 6 nodes. What options do I have there?
Assignee: ahill → mrz
Comment 14•13 years ago
Suggest taking a look at the SL6500 (4U), which can hold 8 nodes, either the SL335 (AMD) (http://h18004.www1.hp.com/products/quickspecs/13951_na/13951_na.html) or the SL390 (Intel) (http://h18004.www1.hp.com/products/quickspecs/13713_na/13713_na.html).
Comment 15•13 years ago
I don't see anything wrong with these recommendations on my end. I'm happy to choose the cheaper / more energy-efficient CPU. As for disk, absolute fastest is not a significant requirement here; I think we'll get better bang out of more density. We don't have a strong requirement for hot-swap or RAID and such because the cluster itself can be made redundant, so let's go with size+price/speed in the ol' "choose any two" equation.
Comment 16 (Reporter) • 13 years ago
If we order the SL series now-ish, what would be the ETA?
Comment 17•13 years ago
I am working on the SL configuration and once complete I will send it for review. You can expect approximately a two-week lead time from the time of order, depending on the final configuration.
Comment 18•13 years ago
Talked to dre offline, going to explore a different route first.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → WONTFIX
Comment 19 (Reporter) • 13 years ago
(In reply to comment #18)
> Talked to dre offline, going to explore a different route first.

Can somebody update me on the plan please? Cheers.
Comment 20•13 years ago
dre mentioned using some of the excess hardware in phx for this. He'll be able to tell you more.
Comment 21•13 years ago
We have a small number of "spare" machines in phx that were originally earmarked for this. Ideally, we can move our zookeeper cluster to other socorro hardware (5 machines with extremely light resource requirements), but even if that isn't possible, we can start with at least 3 machines.
Comment 22 (Reporter) • 13 years ago
We're using existing, non-ideal hardware for this, but we still need to order some as it turns out.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Whiteboard: [2011q3]
Comment 23•13 years ago
@cshields - This is an old bug but it ties directly into the conversation we had a few days ago about needing to order suitable HP nodes for the purposes of Socorro Search.
Comment 24 (Assignee) • 13 years ago
(In reply to comment #23)
> @cshields - This is an old bug but it ties directly into the conversation we
> had a few days ago about needing to order suitable HP nodes for the purposes
> of Socorro Search.

Tell me the ideal specs.. We won't be able to do this efficiently with blades. And anything beyond blades right now will need us to spin up a new cabinet in phx1.
Comment 25•13 years ago
(In reply to comment #24)
> (In reply to comment #23)
> > @cshields - This is an old bug but it ties directly into the conversation we
> > had a few days ago about needing to order suitable HP nodes for the purposes
> > of Socorro Search.
>
> Tell me the ideal specs.. We won't be able to do this efficiently with
> blades. And anything beyond blades right now will need us to spin up a new
> cabinet in phx1.

Rich offered some suggestions in comment #15 that involved the SL6500 with either SL335 or SL390 nodes. It would be nice if the 900GB models are available, but if they aren't then we'll just make do with 8 of the 600GB models. As mentioned in other comments, since the service itself is redundant, we wouldn't have to do RAID 1, which means we could theoretically double the available size to just over 1TB.
Comment 26 (Assignee) • 13 years ago
This is now being brought back to life in light of bug 670766. Rich, can you quote us the SL6500 (4U) with 8 of the SL390 nodes with 900GB disks? I'd like this full from the start; I don't want to have to go through this mess again for a while. Thanks!
Comment 27•13 years ago
So I was just going over disk space requirements with Anurag, and I think that the 900GB drives might be too small even if we went with RAID0.

When Socorro is throttling processing, we generate about 300k documents per day. This translates to about 30GB of disk space for the data plus indices. (30GB * 90 days) / 1.8TB per node = 1.5 nodes needed to store the data. So, we could run with a 3 node cluster at that config.

However, when we run unthrottled, we generate about 3m documents per day. That translates to 300GB per day. (300GB * 90 days) / 1.8TB per node = 15 nodes needed to store the data.

If we looked at the DL config then we could pack 8 or 12 TB of disk into each node pretty easily and that would bring us back down to a much more reasonable 5 or 6 nodes for the cluster.

Thoughts from anyone else?
Comment 28 (Assignee) • 13 years ago
Ok, we are bouncing between storage estimations, systems that need to be rebuilt NOW because we can't wait for an order, back to estimations, and now to huge-ass clusters. This looks to me like we don't really know what we want or need. We need to chat about this. I've got 12:30 (PDT) available tomorrow. Laura/Daniel, can you both make that time? Zimbra says you are both free then.
Comment 29•13 years ago
I am happy to speak tomorrow. I am sorry that we are bouncing around a bit, but it is mostly because I'm trying to make the most of a bad series of unfortunate events. :)

We have two sets of requirements in disk space, one for the short/medium term (i.e. throttled) and an eventual set of requirements that we need to fulfill before throttling can be turned off. I am trying to lay everything out now because I want us to be aware if we are buying hardware that is adequate for now only to have to replace it down the line to support an unthrottled socorro.
Comment 30•13 years ago
@cshields You asked in IRC for the following stats. Except for the horrendously underspecced disk space line, this ends up looking almost identical to the original specs described in comment #0.

=Disk space=
==Short term (i.e. with 10% processing)==
8TB total -- 30GB * 90 [days] * 2 [copies] * 1.5 [growth/overhead]
==Eventual max (i.e. with 100% processing)==
80TB total -- 300GB * 90 [days] * 2 [copies] * 1.5 [growth/overhead]

=Memory=
Given the order of magnitude difference in storage requirements, it doesn't feel right to back the desired memory into GB per TB. What I would say instead is:
Minimum - 16GB per node
Recommended - 24GB per node

=CPU=
Minimum - 2 cores per node
Recommended - 4 cores per node

=Nodes=
If we have a six node cluster with a minimum of 14TB usable disk each, then we would have a cluster capable of servicing even the eventual max size and the cluster would remain fully functional with one node down for extended servicing. I believe that having 12TB or even 10TB of disk per node would be fine to start with, it just means that we might need to order one more node if we hit the 1.5x growth and we wanted to do 100% processing.
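For reference, a minimal sketch (Python, purely illustrative, not part of any Socorro tooling) of the arithmetic behind those totals; the daily sizes, 90-day retention, 2 copies, and 1.5x growth factor are taken from the figures above, and the function names are made up here:

import math

def total_disk_tb(daily_gb, retention_days=90, copies=2, growth=1.5):
    """Total cluster disk needed, in TB, per the formula in this comment."""
    return daily_gb * retention_days * copies * growth / 1000.0

def nodes_needed(total_tb, usable_tb_per_node):
    """Smallest node count whose combined usable disk covers the total."""
    return math.ceil(total_tb / usable_tb_per_node)

throttled = total_disk_tb(30)      # ~8TB, the "short term" figure
unthrottled = total_disk_tb(300)   # ~81TB, rounded to 80TB above
print(f"throttled: {throttled:.1f} TB, unthrottled: {unthrottled:.1f} TB, "
      f"nodes at 14TB usable each: {nodes_needed(unthrottled, 14)}")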
Comment 31 (Assignee) • 13 years ago
Thanks, but that doesn't answer the question I had (what do we need to build out in terms of disk per core and disk per GB of RAM).. You are speaking in terms of nodes but there are different types of servers out there. We can have a few densely packed servers, or lots of not-so-dense servers. The difference here is how much we need to spread the disk around based on how much processing power and how much RAM is needed.

So using your figures above you are saying 2-4 cores for every 14TB, and 16-24GB of RAM for every 14TB of disk. If this is what needs to be spec'd we will go that route. If we can go denser with disk-per-node without affecting performance, we will go that route.

ES scares me a lot because we are such an early adopter that we have become a featured user on their website (for a use case that we don't even have running yet[1].. sad). I'm not comfortable that anyone knows the best practices for building out an ES cluster here, so we may be doing this blindly.

[1] - http://www.elasticsearch.org/users/
Comment 32•13 years ago
Okay. I just had a call with the lead developer of ES and did some strategizing on ways to give you better figures. I think we are going to have to do a new round of measurements to make them accurate though. There are a few optimizations we need to hit that might drastically change the max disk space requirement above. So we can assign this bug to me and I'll work with Anurag and then move it back over when we are ready to proceed.

There is definitely a restriction on using too few servers though. IO throughput would be a significantly limiting factor if we tried to put all this data on one or two massive machines vs five or six medium machines or even ten or fifteen small machines. That will continue to be the hardest thing for us to give proper guidance on other than to say please don't use less than four nodes.

As far as the early adopter thing, I'd soften the edge of that a little by saying that three of the four cases they describe there are not socorro. I think they like to point at us because we are much more open about the stuff we are doing than several of those other companies. I can ask them to move us to the bottom of the list if it makes it better. ;)
Updated•13 years ago
Assignee: mrz → cshields
Comment 33 (Assignee) • 13 years ago
(In reply to comment #32)
> I think we are going to have to do a new round of measurements to make them
> accurate though. There are a few optimizations we need to hit that might
> drastically change the max disk space requirement above. So we can assign
> this bug to me and I'll work with Anurag and then move it back over when we
> are ready to proceed.

done. pass it back when we have a better idea.
Assignee: cshields → deinspanjer
Comment 34 (Assignee) • 13 years ago
Any word on benchmarking?
Comment 35 (Reporter) • 13 years ago
Due to https://bugzilla.mozilla.org/show_bug.cgi?id=675712#c4 we're now blocked on getting ES into prod.

Daniel: Status?

Corey: What is AMO using for their ES cluster? Would that hardware be suitable for us too?
Comment 36•13 years ago
Okay, after consulting with some ES experts, we got a list of tuning/optimization strategies and some methods for testing the required amount of memory for efficient read/write characteristics. We have done testing with a full 90 day dataset and run a large set of typical use case queries through that system while in the middle of bulk loading to simulate heavy traffic conditions.

The optimizations made a huge impact in terms of disk space requirements. We can now store the full 90 day set of sampled (i.e. 10%) data in less than 3TB of total disk. To be able to store the unsampled set, we will need 30TB of total disk.

Our memory usage is very nominal for the typical query case, but we know there are going to be lots of unusual ad-hoc queries after the service launches and developers and analysts figure out what they can do with it. Hence we are sticking with the much higher estimate there.

Our figures for the production cluster are:
- 40TB total disk. RAID0 for data would be acceptable because the cluster has application level redundancy. If IT preferred to have a RAID10 for the OS partition just for ease of maintenance, that is fine, but it isn't a requirement for us.
- 20 cores total. Mid-grade processor speed is fine.
- 8GB of RAM per core with a max of 24GB per node.

Further, we need at least 5 nodes in the cluster so that we can handle machine level failures.
Comment 37 (Assignee) • 13 years ago
Let's get this sorted and ordered during all-hands week. Keep in mind, I think we are short on cabinet space for any new gear, and this will block on getting new cabinets up (something we already need to do for AMO).
Whiteboard: [2011q3] → [allhands]
Comment 38 (Assignee) • 13 years ago
> 8GB of RAM per core with a max of 24GB per node.
Intel does not make 3-core procs, and I doubt we can get dual core anymore. I will go ahead and ask Rich to quote me the following:
start with the DL360 G7 base model 633777-01
single 6-core cpu
36 GB RAM (18 x 2GB would probably be cheapest) giving us 6GB per core
256MB BBWC on the raid controller should come standard with this model
if we need a license for the full set of raid features, add that too
add a redundant PSU
2 small HDDs (like 160GB, whatever is the smallest these days)
6 large HDDs (1TB at least, understanding that this is SFF. I can't see on HP's page where my options are like I used to)
depending on the drive size, we will want to quote 8 or 10 of these.
thanks Rich!
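To sanity-check that quantity against the targets from comment 36 (40TB total disk, 20 cores, at least 5 nodes), here is a rough, non-authoritative sketch in Python; the per-node figures are read off the list above, the unmirrored data layout is an assumption, and none of this replaces the actual quote:

import math

NODE_DATA_DISK_TB = 6   # 6 large HDDs at 1TB each, assuming they serve data without mirroring
NODE_CORES = 6          # single 6-core CPU
TARGET_DISK_TB = 40
TARGET_CORES = 20
MIN_NODES = 5

nodes_for_disk = math.ceil(TARGET_DISK_TB / NODE_DATA_DISK_TB)   # -> 7
nodes_for_cores = math.ceil(TARGET_CORES / NODE_CORES)           # -> 4
quantity = max(nodes_for_disk, nodes_for_cores, MIN_NODES)       # -> 7

print(f"order at least {quantity} nodes")

Under these assumptions 7 nodes is the floor, which is roughly consistent with quoting 8 or 10 units once headroom and node failures are allowed for.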
Comment 39 (Assignee) • 13 years ago
These have been quoted.. off to purchasing..
Assignee: deinspanjer → cshields
Component: Server Operations → Server Operations: Web Operations
QA Contact: mrz → cshields
Comment 40 (Reporter) • 13 years ago
Any ETA on these?
Comment 41•13 years ago
I got the PO yesterday and pushed it through with our vendor. Will update with more info when I have it.
Comment 42•13 years ago
These configured servers have shipped via saia.com, Pro #00511304650. Current ETA to Phoenix is 11/14.
Comment 43•13 years ago
What is the status of this hardware? Has it been installed yet?
Comment 44 (Reporter) • 13 years ago
Could we get an update here?
Comment 45 (Reporter) • 12 years ago
Ping?
Comment 46•12 years ago
CCing Chris and Sheila, who are very interested in getting Socorro ElasticSearch up to give us better insight into our crash data.
Comment 47•12 years ago
mrz, bug 480503 depends on this bug. It just entered its 3rd year of being open. We really need to make some progress on getting hardware, getting Elasticsearch going on crash data, and starting to use it effectively to open up analysis techniques we used to have with the old Talkback system up until 2008. Are there any other dashboards, radars, or project lists that this needs to be on to get some priority?
Comment 48 (Assignee) • 12 years ago
Not going to close this out until 732298 (monitoring) is done, but the prod ES cluster is up and ready for use: 10.8.81.220:9200
Comment 49 (Assignee) • 12 years ago
(In reply to Corey Shields [:cshields] from comment #48)
> Not going to close this out until 732298 (monitoring) is done, but the prod
> ES cluster is up and ready for use: 10.8.81.220:9200

(background, the requirements changed on us and this was not just ES needed but bagheera)

This is all done now and documented in https://mana.mozilla.org/wiki/display/websites/Socorro+Search+Service

The VIP IPs are:
dev:   10.8.81.221:9200
stage: 10.8.81.222:9200
prod:  10.8.81.220:9200

And for kicks and giggles, some status output:

Corey-Shieldss-MacBook-Pro:~ cshields$ curl -XGET 'http://10.8.81.221:9200/_cluster/health?pretty=true'
{
  "cluster_name" : "socorro-dev",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0
}

Corey-Shieldss-MacBook-Pro:~ cshields$ curl -XGET 'http://10.8.81.222:9200/_cluster/health?pretty=true'
{
  "cluster_name" : "socorro-stage",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0
}

Corey-Shieldss-MacBook-Pro:~ cshields$ curl -XGET 'http://10.8.81.220:9200/_cluster/health?pretty=true'
{
  "cluster_name" : "socorro",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 5,
  "number_of_data_nodes" : 5,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0
}
Status: REOPENED → RESOLVED
Closed: 13 years ago → 12 years ago
Resolution: --- → FIXED
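While bug 732298 (monitoring) is still open, here is a minimal, non-authoritative sketch of the same health check done programmatically; the VIP addresses are the ones listed in comment 49, everything else (names, structure) is invented for illustration, and only the Python standard library is used:

import json
from urllib.request import urlopen

CLUSTERS = {
    "dev": "http://10.8.81.221:9200",
    "stage": "http://10.8.81.222:9200",
    "prod": "http://10.8.81.220:9200",
}

def cluster_status(base_url, timeout=5):
    """Return the ES cluster health status string ('green', 'yellow', or 'red')."""
    with urlopen(f"{base_url}/_cluster/health", timeout=timeout) as resp:
        return json.load(resp)["status"]

for name, url in CLUSTERS.items():
    try:
        print(f"{name}: {cluster_status(url)}")
    except OSError as exc:  # unreachable VIP, refused connection, timeout, ...
        print(f"{name}: unreachable ({exc})")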
Updated•11 years ago
Component: Server Operations: Web Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations
Updated•5 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard