Closed Bug 747794 Opened 13 years ago Closed 13 years ago

Need alerts for nslog log files not being updated

Categories

(mozilla.org Graveyard :: Server Operations, task)

x86
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mburns, Assigned: ashish)

Details

Over the weekend (starting Fri 20th @ 7:04PM PST) VAMO log data stopped being updated to nslog2. A simple 'supervisorctl ; restart nswl' jolted nswl back to life, and it started filling /var/log/netscaler with the appropriate files to be processed. Prior to this restart, the /var/log/netscaler directory was empty.

We should add nagios alerts to watch this log directory (and others that relate to Metrics processing needs) for files being updated on a regular basis (hourly rsync cron) and alert if files are stale or nonexistent.

[root@nslog2.private.phx1 log]# cd /var/log/netscaler
[root@nslog2.private.phx1 netscaler]# ls -lha
total 84K
drwxr-xr-x 2 netscaler root 4.0K Apr 20 20:15 .
drwxr-xr-x. 13 root root 76K Apr 22 12:45 ..
[root@nslog2.private.phx1 netscaler]# supervisorctl
nswl RUNNING pid 12755, uptime 21 days, 23:25:14
supervisor> restart nswl
nswl: stopped
nswl: started
supervisor> exit
[root@nslog2.private.phx1 netscaler]# ls -lh
total 80M
[root@nslog2.private.phx1 netscaler]# ls -lh
total 331M
-rw-r--r-- 1 netscaler netscaler 0 Apr 22 13:20 lb1.addons.mozilla.org.access_2012-04-22-13
-rw-r--r-- 1 netscaler netscaler 4.9M Apr 22 13:20 lb1.snippets.mozilla.com.access_2012-04-22-13
-rw-r--r-- 1 netscaler netscaler 326M Apr 22 13:20 lb1.versioncheck.addons.mozilla.org.access_2012-04-22-13
-rw-r--r-- 1 netscaler netscaler 0 Apr 22 13:20 lb1.www.mozilla.org.access_2012-04-22-13
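For illustration, the kind of staleness check requested above could look roughly like the following. This is only a sketch, not the check that was eventually deployed: the function name `check_log_dir` and the `max_age_seconds` parameter are hypothetical, and the only thing taken from Nagios itself is the plugin exit code convention (0 = OK, 1 = WARNING, 2 = CRITICAL).

```python
import os
import time

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def check_log_dir(directory, max_age_seconds, now=None):
    """Return (exit_code, message) based on the newest regular file in directory.

    Hypothetical helper: an empty or unreadable directory is CRITICAL,
    and so is a newest file older than max_age_seconds.
    """
    now = time.time() if now is None else now
    try:
        paths = [os.path.join(directory, name) for name in os.listdir(directory)]
    except OSError as exc:
        return CRITICAL, f"CRITICAL: cannot read {directory}: {exc}"

    files = [path for path in paths if os.path.isfile(path)]
    if not files:
        # This is the state nslog2 was in before the restart.
        return CRITICAL, f"CRITICAL: no files in {directory}"

    newest = max(files, key=os.path.getmtime)
    age = int(now - os.path.getmtime(newest))
    name = os.path.basename(newest)
    if age > max_age_seconds:
        return CRITICAL, f"CRITICAL: {name} is {age}s old"
    return OK, f"OK: {name} is {age}s old"
```

A plugin wrapper would print the message and exit with the returned code, for example calling `check_log_dir("/var/log/netscaler", 2 * 3600)` to allow for one missed hourly run. The two-hour threshold is an assumption, not something specified in this bug.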
Autocompleted the wrong component. Sorry for the bugspam.
Assignee: nobody → server-ops
Component: Security Assurance → Server Operations
QA Contact: security-assurance → phong
Are these checks the same as (from old scrollback):
12:22:50 < nagios-sjc1> im-log02:NetScaler - logs is OK: OK: access_2012-04-21-12 is OK
20:08:12 < nagios-sjc1> [51] im-log02:check_nl_logs is CRITICAL: 30 incorrect im-log03 files
Ignore #c2. We already have a check for this:
21:47:58 < nagios-phx1> ashish: nslog2.private.phx1:NetScaler - logs is OK: OK: lb1.versioncheck.addons.mozilla.org.access_2012-04-22-21.3 is OK

However, it's b0rken :(
[root@ip-admin02 autogen]# /usr/lib64/nagios/plugins/check_nrpe -H nslog2.private.phx1 -t 300 -c check_log_file -a 150 300 /tmp 'lb1.versioncheck.*'
OK: is OK

Confirmed via nagios.log:
[1335077655] PASSIVE SERVICE CHECK: nslog2.private.phx1;NetScaler - logs;0;OK: is OK
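The empty file name in "OK: is OK" suggests that no file matched the pattern and the check still reported OK. That is a guess, since the source of check_log_file is not shown here. A minimal sketch of guarding against that case follows. `check_pattern` is a hypothetical name, the pattern is treated as a regular expression (an assumption), and the 150 and 300 arguments from the command above are left out because their meaning is not stated in this bug.

```python
import os
import re

# Standard Nagios plugin exit codes used by this sketch.
OK, CRITICAL = 0, 2


def check_pattern(directory, pattern):
    """Return (exit_code, message) for the newest file whose name matches pattern.

    Hypothetical helper: zero matches is CRITICAL rather than OK, so the
    status line can never be built from an empty file name.
    A missing directory raises OSError, which is not handled here.
    """
    regex = re.compile(pattern)
    matches = [
        name
        for name in os.listdir(directory)
        if regex.match(name) and os.path.isfile(os.path.join(directory, name))
    ]
    if not matches:
        return CRITICAL, f"CRITICAL: no file matching {pattern!r} in {directory}"

    newest = max(matches, key=lambda n: os.path.getmtime(os.path.join(directory, n)))
    return OK, f"OK: {newest} is OK"
```

With this shape, an empty directory or a pattern that matches nothing produces a CRITICAL result instead of a silent OK.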
I've added the new check to ip-admin. Should show up soon, otherwise I'll poke in the AM.
Assignee: server-ops → ashish
Status: NEW → ASSIGNED
19:36:19 < nagios-phx1> nslog2.private.phx1:NetScaler - logs - new is OK: OK: 1/1
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard