Closed Bug 968318 Opened 9 years ago Closed 9 years ago

Increase memory on cluster nodes to 20g each


(Infrastructure & Operations :: Virtualization, task)

Windows 7





(Reporter: ekyle, Assigned: cknowles)





(1 file)

They're VMs, so over to the storage/virt team. I'm not sure we can swing 20g on each.

Also, increasing RAM will require rebooting. If we can take them all down at once, it'll probably be easier on elasticsearch, but will require coordination.
Assignee: server-ops-webops → server-ops-virtualization
Component: WebOps: IT-Managed Tools → Server Operations: Virtualization
Product: Infrastructure & Operations →
QA Contact: nmaul → dparsons
:ekyle, given that they're only using 2GB each right now, can you justify the request for ten times that amount of RAM?
They never should have had so little RAM.  The machines they are replacing have 20GB each, and even that is not always enough.  ES is fast because it keeps indexes in memory, which makes it a memory hog.

We cannot use these clusters for the existing charts until the memory is higher.  Feel free to bounce the clusters whenever.
:ekyle, thanks for explaining. Can you please give us more information about how you came up with the 20GB per node number? It feels very arbitrary, and upon logging into these servers, we've found that most of them are barely using even 1GB RAM right now. These resources are finite and particularly right now, in the middle of our massive p2v / platform upgrade, RAM is a bit tight.
    Just to second Kyle's point: I use those ES instances, and all my queries are failing with {u'status': 503, u'error': u'ReduceSearchPhaseException[Failed to execute phase [fetch], [reduce] ]; nested: OutOfMemoryError[Java heap space]; '}.
    We used to run an ES cluster similar to Kyle's, and we have 24GB of memory in our machines; you can look at them under elasticsearch[5,6,7,8]. Even with 24GB we used to have issues with memory topping out, but much less than what we are experiencing right now.

Thank you,
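(For illustration only - a minimal sketch, with hypothetical names not taken from any tool in this bug, of how a client could recognize this heap-OOM failure mode from a response payload like the one quoted above:)

```python
# Hypothetical helper: spot the Java-heap OOM signature in an
# Elasticsearch error response like the one quoted above.
def is_heap_oom(resp):
    return (resp.get("status") == 503
            and "OutOfMemoryError[Java heap space]" in resp.get("error", ""))

failed = {
    "status": 503,
    "error": ("ReduceSearchPhaseException[Failed to execute phase [fetch], "
              "[reduce] ]; nested: OutOfMemoryError[Java heap space]; "),
}
print(is_heap_oom(failed))  # → True
```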
OK, one big issue here is :ekyle's use of the word "using" in the bug description. After all the comments here, I think using the word "allocated" instead would have saved a LOT of confusion. "using" implies that the workload on the VMs is indeed only using 2GB, so it makes no sense why you'd ask for more in that situation.

Now that that's out of the way: I understand that you need _more_ RAM than you've got now; however, the part where I need more justification is the 20GB number. You're requesting roughly 100GB of RAM in this bug, and given that amount, we need more justification than simply "I've got this other cluster that doesn't work right and it's got 24GB per node". Can you provide this?
The reason for 20GB is only because that's what's on the Metrics cluster that supports the existing charts.  Occasionally, I come up against OOM errors on that cluster too, so maybe 20GB is still too small.

The RAM is only required when ES is being used.  The clusters have only just opened up, so usage is still low.
Also worth noting that the existing metrics cluster is slowly falling apart, and metrics has been anxious to get our Bugzilla data off of it into its own cluster for some time now.  The old cluster supports a variety of charts important to various people, smooney amongst others, which we need to get running on the new cluster.
Hi guys - 

Even I am confused at this point about which cluster is which and what is wrong with what... so let's sort this out:

"Legacy Cluster" = the cluster we (BIDW - me, harsha, etc.) own and want to decomm asap
"Public Cluster" = the cluster that Kyle spun up for his work
"Private Cluster" = the cluster that Kyle spun up to handle internal analyses

The Legacy Cluster has 24GB, and would regularly fall over. Harsha, it would be good if you could provide some example bug numbers for the many crises we had due to OOM issues.

Each of the new clusters has only 2GB of memory (which I would assert is not a feasible configuration).
Are one or *both* of them experiencing OOM issues?

Whereas the Legacy Cluster was IIRC on bare metal, the new Public and Private clusters are in VMs.

According to the folks administering them, they are allocated 4GB, but it sounds like they are only using 2GB of the 4GB allocated.

Harsha - Is there a whitepaper on sizing resources for ES clusters?
Possibly unrelated question: what role is ES playing (or will it play) in Bugzilla?  Is this a possible search replacement, or something that will speed queries up?

The short answer:

  * The clusters have data in a format that makes historical queries easy.
  * The clusters are fast, and can reduce load on BZAPI.

The long answer:
To clarify things, the systems appear to have 4GB RAM, but Java (ElasticSearch) is configured to use only 2GB:

/usr/bin/java -Xms256m -Xmx2048m -Xss256k -D....

This explains the confusion around how much RAM there currently is, why the usage appears low, and so forth. Java will not just eat up memory like Apache will... it's more like memcache, in that you need to tell it how much it can have. :)
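As a quick sanity check, the configured ceiling can be read straight off the process command line; a small sketch (hypothetical, not a tool anyone in this bug actually used):

```python
import re

# Extract the JVM max-heap flag (-Xmx) from a process command line, e.g. the
# one shown above, to see how much memory Java is actually allowed to use.
def max_heap(cmdline):
    m = re.search(r"-Xmx(\d+[kmgKMG]?)", cmdline)
    return m.group(1) if m else None

print(max_heap("/usr/bin/java -Xms256m -Xmx2048m -Xss256k"))  # → 2048m
```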

I have never actually looked at the legacy Metrics ES instance, and I don't know what sort of eyeballs have. One thing on my mind is "was the memory on that being used efficiently?". Specifically, when it comes to Java processes, we've often achieved major wins in performance or memory efficiency by tweaking the JVM settings to match the application and usage. Java versions and ES versions can have a significant impact, too. I don't know what, if anything, was ever tried on the legacy cluster. Consequently, it's difficult for me to say if the legacy cluster is really a good benchmark for what we need in order for these 2 new(er) clusters to work well.

My inclination is to ask Dan/Greg/Corey how much is feasible to increase in the short term. Can we go from 4GB to 6GB or 8GB per node? We can easily increase how much ES is allowed to use (controlled by /etc/sysconfig/elasticsearch... though I suspect that's not managed by puppet right now). Combined, that would be a major increase, whilst hopefully not derailing Spring Cleaning and P2V any more than necessary.

Is this a reasonable starting point for everyone?
Attached image BigDesk Profile
I will need to profile the memory used when running my standard suite of charts and ETL to know the minimum memory per node that will work.  This is complicated by the fact that the only three-node cluster is Metrics', and it is running an old version of ES with probably different memory behavior.  Furthermore, I am not sure I can simulate real load, so my memory estimate may be too low.

I opened a few charts and attached the resulting profile from one of the nodes (using bigdesk).  It looks like my charts need 20GB.

I realize that less memory may still work.  If 8GB per node can be done easily, I am willing to at least try it.
Increasing all nodes to 8GB each sounds like a good place to start. I have two questions for you:

(1) When is a good date / time to do this? Each VM will have to be shut down for a few minutes.

(2) Who can be responsible for adjusting the JVM settings to take advantage of this RAM increase after the reboot?
(1) The best time is ASAP.  We only have a couple of people using these clusters, myself included.  Ping me when work begins, and I will notify the others.  It is unlikely anyone will notice the downtime.

(2) The only person I know of who has login privileges is :fubar, so I hope he is the one to set the $ES_HEAP_SIZE environment variable (as per my reading of the elasticsearch/bin/ file).
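For reference, on a sysconfig-style install the override would typically live in /etc/sysconfig/elasticsearch (the file nmaul mentioned above), along the lines of the following; the value shown is illustrative, not what was deployed:

```shell
# /etc/sysconfig/elasticsearch -- illustrative value only
ES_HEAP_SIZE=8g
```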
ES is a distributed pile of tentacles, so shutting down a whole cluster at once is preferable. Happy to do both (1) and (2) any time.
Passing this on to :cknowles to get it done.
Assignee: server-ops-virtualization → cknowles
Alright, working with :fubar - got the nodes down and 8GB allocated to them.

My inclination is to close this out and we can revisit later - but I realize that may not be the best or desired path.  Thoughts?

Puppet updated to increase ES memory; currently running at:

/usr/bin/java -Xms256m -Xmx6826m -Xss256k

There's some fudgy math involved; puppet's setting it to 95% of available memory (converted to int) according to facter, which is reporting 7.69G. We can bump it a tiny bit higher before we run the risk of fighting with the OS for memory.
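A plausible reading of that fudgy math (an assumption on my part, not the actual puppet manifest): truncate facter's memorysize to a whole number of GB, then take 95% of it. That lands near, though not exactly on, the -Xmx6826m above:

```python
# Assumed reconstruction of puppet's heap calculation -- not the real manifest.
facter_gb = 7.69                              # what facter reported
heap_mb = int(int(facter_gb) * 1024 * 0.95)   # truncate to int GB, take 95%
print(heap_mb)  # → 6809, in the same ballpark as the deployed -Xmx6826m
```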
You can close this.  I will open another bug if I see problems, but that won't be for a couple of days.
Alright, closing this out.  Let us know if you need further assistance.

Closed: 9 years ago
Resolution: --- → FIXED
I have reviewed my charts, and they all run fine. It looks like the new version (0.90.x) fixed the memory consumption issues seen in the old version.

I must still test the ETL jobs that run on this cluster.  Unfortunately, they do not run with the new version:  There are subtle differences in accessing non-existent properties via MVEL.  I must spend time debugging these.

Overall, I am optimistic: the ETL can easily be scaled to fit whatever memory constraints exist.  The charts were my primary concern.

(In reply to Kendall Libby [:fubar] from comment #19)
> Puppet updated to increase ES memory; currently running at:
> /usr/bin/java -Xms256m -Xmx6826m -Xss256k
> there's some fudgy math involved; puppet's setting it to 95% of avail mem
> (converted to int) according to facter, which is reporting 7.69G; we can
> bump it a tiny bit higher before we run the risk of fighting with the OS for
> memory.

Unfortunately that's not quite true - you'll fight with the OS for memory well before 95%.  Lucene leans heavily on the filesystem cache for normal operations, and since that is a kernel-space activity (i.e. outside of the JVM), there needs to be sufficient RAM available to the *system* as well as to the Elasticsearch process itself.

The general rule for this is to start by bounding ES at 50% of total system memory, and then tuning from there; in practice, the sweet spot is generally somewhere between 50% and 70%, and unfortunately there are no hard-and-fast techniques for figuring it out.  In any case, setting it to 90% is almost certainly a mistake.
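The sizing rule above can be sketched as a one-liner; the node sizes used here are illustrative:

```python
# Start the ES heap at 50% of total system RAM, leaving the remainder for the
# kernel's filesystem cache that Lucene depends on; tune upward from there.
def starting_heap_mb(total_mb, fraction=0.5):
    return int(total_mb * fraction)

print(starting_heap_mb(8192))        # → 4096 (8GB node: start with a 4GB heap)
print(starting_heap_mb(8192, 0.95))  # the 95% setting leaves almost nothing
                                     # for the page cache
```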
See Also: → 972236
Product: → Infrastructure & Operations