Bug 747794
Opened 13 years ago
Closed 13 years ago
Need alerts for nslog log files not being updated
Categories: mozilla.org Graveyard :: Server Operations, task
Tracking: Not tracked
Status: RESOLVED FIXED
People: Reporter: mburns, Assigned: ashish
Over the weekend (starting Fri 20th @ 7:04PM PST), VAMO log data stopped being written to nslog2. A simple 'restart nswl' in supervisorctl jolted nswl back to life, and it started filling /var/log/netscaler with the appropriate files to be processed. Prior to this restart, the /var/log/netscaler directory was empty.
We should add nagios alerts that watch this log directory (and others that Metrics processing depends on) to confirm files are being updated on a regular basis (hourly rsync cron), and alert if files are stale or nonexistent.
[root@nslog2.private.phx1 log]# cd /var/log/netscaler
[root@nslog2.private.phx1 netscaler]# ls -lha
total 84K
drwxr-xr-x 2 netscaler root 4.0K Apr 20 20:15 .
drwxr-xr-x. 13 root root 76K Apr 22 12:45 ..
[root@nslog2.private.phx1 netscaler]# supervisorctl
nswl RUNNING pid 12755, uptime 21 days, 23:25:14
supervisor> restart nswl
nswl: stopped
nswl: started
supervisor> exit
[root@nslog2.private.phx1 netscaler]# ls -lh
total 80M
[root@nslog2.private.phx1 netscaler]# ls -lh
total 331M
-rw-r--r-- 1 netscaler netscaler 0 Apr 22 13:20 lb1.addons.mozilla.org.access_2012-04-22-13
-rw-r--r-- 1 netscaler netscaler 4.9M Apr 22 13:20 lb1.snippets.mozilla.com.access_2012-04-22-13
-rw-r--r-- 1 netscaler netscaler 326M Apr 22 13:20 lb1.versioncheck.addons.mozilla.org.access_2012-04-22-13
-rw-r--r-- 1 netscaler netscaler 0 Apr 22 13:20 lb1.www.mozilla.org.access_2012-04-22-13
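For illustration, the staleness check requested above could look roughly like the sketch below. This is not an existing plugin: the names check_freshness and main, the argument order, and the thresholds are all hypothetical. It only assumes the standard Nagios convention of exit status 0 = OK, 1 = WARNING, 2 = CRITICAL. The one behaviour worth calling out is that an empty directory (the state seen before the restart) is reported as CRITICAL rather than passing silently.

```python
import fnmatch
import os
import time

# Standard Nagios plugin exit statuses.
OK, WARNING, CRITICAL = 0, 1, 2
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def check_freshness(directory, pattern, warn_secs, crit_secs, now=None):
    """Return (status, message) based on the newest file matching pattern.

    Hypothetical helper: thresholds are ages in seconds of the most
    recently modified matching file.
    """
    now = time.time() if now is None else now
    try:
        names = fnmatch.filter(os.listdir(directory), pattern)
    except OSError as exc:
        return CRITICAL, f"cannot read {directory}: {exc}"
    if not names:
        # No matching files at all is a failure, not a pass.
        return CRITICAL, f"no files matching {pattern!r} in {directory}"

    def mtime(name):
        return os.path.getmtime(os.path.join(directory, name))

    newest = max(names, key=mtime)
    age = now - mtime(newest)
    message = f"{newest} last updated {age:.0f}s ago"
    if age >= crit_secs:
        return CRITICAL, message
    if age >= warn_secs:
        return WARNING, message
    return OK, message


def main(argv):
    """Print a Nagios-style status line and return the exit status."""
    directory, pattern, warn_secs, crit_secs = argv[:4]
    status, message = check_freshness(
        directory, pattern, int(warn_secs), int(crit_secs)
    )
    print(f"{LABELS[status]}: {message}")
    return status
```

To use it as a plugin, the script would end with sys.exit(main(sys.argv[1:])) and be called with something like /var/log/netscaler 'lb1.*' 5400 7200. Those thresholds are only a guess based on the hourly cron and would need tuning.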
Comment 1 • 13 years ago (Reporter)
Autocompleted the wrong component. Sorry for the bugspam.
Assignee: nobody → server-ops
Component: Security Assurance → Server Operations
QA Contact: security-assurance → phong
Comment 2 • 13 years ago (Assignee)
Are these checks the same as (from old scrollback):
12:22:50 < nagios-sjc1> im-log02:NetScaler - logs is OK: OK: access_2012-04-21-12 is OK
20:08:12 < nagios-sjc1> [51] im-log02:check_nl_logs is CRITICAL: 30 incorrect im-log03 files
Comment 3 • 13 years ago (Assignee)
Ignore #c2. We already have a check for this:
21:47:58 < nagios-phx1> ashish: nslog2.private.phx1:NetScaler - logs is OK: OK: lb1.versioncheck.addons.mozilla.org.access_2012-04-22-21.3 is OK
However, it's b0rken :(
[root@ip-admin02 autogen]# /usr/lib64/nagios/plugins/check_nrpe -H nslog2.private.phx1 -t 300 -c check_log_file -a 150 300 /tmp 'lb1.versioncheck.*'
OK: is OK
Confirmed via nagios.log:
[1335077655] PASSIVE SERVICE CHECK: nslog2.private.phx1;NetScaler - logs;0;OK: is OK
Comment 4 • 13 years ago (Assignee)
I've added the new check to ip-admin. Should show up soon, otherwise I'll poke in the AM.
Assignee: server-ops → ashish
Status: NEW → ASSIGNED
Comment 5 • 13 years ago (Assignee)
19:36:19 < nagios-phx1> nslog2.private.phx1:NetScaler - logs - new is OK: OK: 1/1
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Updated • 10 years ago
Product: mozilla.org → mozilla.org Graveyard