Elasticsearch Processes > 2 alerts

RESOLVED FIXED

People

(Reporter: phrozyn, Assigned: pradcliffe+bugzilla)

Details

When I view the Elasticsearch logs, Nagios raises the elasticsearch process alert, because the log viewer's command line also contains 'elasticsearch'.

Per Pir, maybe we can tune the check to be a bit more specific.

Here are the command lines for Elasticsearch itself and for my log-viewing sessions:

Elasticsearch:

elastic+ 27417 85.4 45.1 1503708988 29707588 ? Ssl  Aug10 26401:44 /bin/java -Xms31g -Xmx31g -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -XX:+DisableExplicitGC -Dfile.encoding=UTF-8 -Djna.nosys=true -Dmapper.allow_dots_in_name=true -Des.path.home=/usr/share/elasticsearch -cp /usr/share/elasticsearch/lib/elasticsearch-2.4.6.jar:/usr/share/elasticsearch/lib/* org.elasticsearch.bootstrap.Elasticsearch start -Des.pidfile=/var/run/elasticsearch/mozdefes7.private.scl3.mozilla.com/elasticsearch.pid -Des.default.path.home=/usr/share/elasticsearch -Des.default.path.logs=/var/log/elasticsearch/mozdefes7.private.scl3.mozilla.com -Des.default.path.data=/data/elasticsearch/mozdefes7.private.scl3.mozilla.com -Des.default.path.conf=/etc/elasticsearch/mozdefes7.private.scl3.mozilla.com


Viewing Elasticsearch Logs as root user:

root     34631  0.0  0.0 110248   956 pts/0    S+   15:06   0:00 less /var/log/elasticsearch/mozdefes7.private.scl3.mozilla.com/mozdef_es_prod.log

Viewing Elasticsearch logs using sudo:

root     13183  0.0  0.0 193380  2820 pts/0    S+   15:07   0:00 sudo less /var/log/elasticsearch/mozdefes6.private.scl3.mozilla.com/mozdef_es_prod.log
root     13184  0.0  0.0 110248   960 pts/0    S+   15:07   0:00 less /var/log/elasticsearch/mozdefes6.private.scl3.mozilla.com/mozdef_es_prod.log
(Assignee)

Comment 1

a year ago
Thanks for all the detail there.

The check should be able to look for something like '/usr/share/elasticsearch/lib/elasticsearch-' rather than just 'elasticsearch'.
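
To illustrate why the narrower pattern helps, here is a minimal Python sketch. It assumes the check does an unanchored regex search over each process's argument string; the command lines are abbreviated from the ps output in the description, and `count_matches` is a hypothetical helper, not part of any Nagios plugin.

```python
import re

# Abbreviated command lines from the ps output in this bug.
COMMAND_LINES = [
    "/bin/java -Xms31g -Xmx31g -cp /usr/share/elasticsearch/lib/elasticsearch-2.4.6.jar:"
    "/usr/share/elasticsearch/lib/* org.elasticsearch.bootstrap.Elasticsearch start",
    "sudo less /var/log/elasticsearch/mozdefes6.private.scl3.mozilla.com/mozdef_es_prod.log",
    "less /var/log/elasticsearch/mozdefes6.private.scl3.mozilla.com/mozdef_es_prod.log",
]


def count_matches(pattern, command_lines):
    """Count command lines containing a match for pattern (unanchored search)."""
    regex = re.compile(pattern)
    return sum(1 for line in command_lines if regex.search(line))


# The broad pattern also matches the log path in the 'less' commands.
broad = count_matches("elasticsearch", COMMAND_LINES)

# The narrow pattern only matches the jar path on the java command line.
narrow = count_matches("/usr/share/elasticsearch/lib/elasticsearch-", COMMAND_LINES)

print(broad, narrow)
```

With these three sample lines the broad pattern counts all of them, while the narrow pattern counts only the java process.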
Assignee: nobody → pradcliffe+bugzilla
Blocks: 1332212
(Assignee)

Comment 2

a year ago
<nagios-eis:#sysadmins> (IRC) Thu 13:15:36 UTC [5013] [opsec] 
  mozdefes7.private.scl3.mozilla.com:elasticsearch process is CRITICAL: PROCS 
  CRITICAL: 2 processes with regex args 'elasticsearch' 
  (http://m.mozilla.org/elasticsearch+process)
(Assignee)

Comment 3

a year ago
Tested on the command line:

[root@nagios-eis1.private.scl3 pradcliffe]# /usr/lib64/nagios/plugins/check_nrpe -H mozdefes7.private.scl3.mozilla.com -c check_procs_regex -a /usr/share/elasticsearch/lib/elasticsearch- 1 1
PROCS OK: 1 process with regex args '/usr/share/elasticsearch/lib/elasticsearch-' | procs=1;;1:1;0;
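
For reference, the `1:1` in the perfdata above is the critical range: the check is OK only when exactly one process matches. A rough Python sketch of that range logic follows, assuming the standard Nagios plugin convention that a `min:max` range alerts when the value falls outside it. `outside_range` and `status` are hypothetical names, and only the plain `min:max` form is handled.

```python
def outside_range(value, spec):
    """Return True when value falls outside a simple 'min:max' range."""
    low, high = spec.split(":")
    return not (int(low) <= value <= int(high))


def status(procs, critical="1:1"):
    """Map a matching-process count to a check status for the given critical range."""
    return "CRITICAL" if outside_range(procs, critical) else "OK"


for procs in (0, 1, 2):
    print(procs, status(procs))
```

So a second matching process, such as a `less` session on the log file, is enough to turn the old check critical.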

diff --git a/modules/nagios4/manifests/prod/eis/services.pp b/modules/nagios4/manifests/prod/eis/services.pp
index 3248cc1716..36bdab49f6 100644
--- a/modules/nagios4/manifests/prod/eis/services.pp
+++ b/modules/nagios4/manifests/prod/eis/services.pp
@@ -384,7 +384,7 @@ class nagios4::prod::eis::services {
         },
         'check_elasticsearch' => {
             service_description => 'elasticsearch process',
-            check_command => 'check_nrpe_procs_regex!elasticsearch!1!1',
+            check_command => 'check_nrpe_procs_regex!/usr/share/elasticsearch/lib/elasticsearch-!1!1',
             contact_groups => 'eisalertsonly, sysalertsonly, eisalerts',
             max_check_attempts => 3,
             hostgroups => $nagiosbot ? {

Committed in f20a4d237f2f087a1343f008a986ce0b97f7acad

16:40 <nagios-eis:#sysadmins> pir: [opsec] 
  mozdefes7.private.scl3.mozilla.com:elasticsearch process is OK - PROCS OK: 1 
  process with regex args '/usr/share/elasticsearch/lib/elasticsearch-' Last 
  Checked: 2017-08-31 15:40:13 UTC

Please reopen if you see any problems with it...
Status: NEW → RESOLVED
Last Resolved: a year ago
Resolution: --- → FIXED