Closed
Bug 1037873
Opened 10 years ago
Closed 10 years ago
nagios monitoring for sp-admin01.phx1.mozilla.com:/var/log/socorro/crontabber.log
Categories
(Infrastructure & Operations Graveyard :: WebOps: Socorro, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: rhelmer, Assigned: cliang)
References
Details
(Whiteboard: [kanban:https://kanbanize.com/ctrl_board/4/643] )
Can we get a simple monitor on sp-admin01:/var/log/socorro/crontabber.log? If it hasn't been written to in ~3 hours we should warn, and page at 6 hours. (We can tighten this up later as we make our jobs faster.) This would catch things like bug 1037870.
Updated • 10 years ago (Reporter)
Updated • 10 years ago (Assignee)
Assignee: server-ops-webops → cliang
Comment 1 • 10 years ago (Assignee)
I've added a log file age check that should warn if the file is older than 3 hours and page if it is older than 6 hours. (The age is given in seconds.)

This should be similar to the file age check that is run on the processors, which warns if /var/log/socorro/socorro-processor.log hasn't been written to in 60 seconds and pages if it's been more than 5 minutes since the last write to that file.

Index: modules/nagios/manifests/mozilla/services.pp
===================================================================
--- modules/nagios/manifests/mozilla/services.pp (revision 92208)
+++ modules/nagios/manifests/mozilla/services.pp (working copy)
@@ -3642,6 +3642,18 @@
             ]
         }
     },
+    'socorro-admin-crontab-log' => {
+        service_description => "Socorro Admin - crontab log file age",
+        check_command => 'check_file_age!10800!21600!/var/log/socorro/crontabber.log',
+        contact_groups => 'sysalerts, socorroalerts',
+        hostgroups => $::fqdn ? {
+            'nagios1.private.phx1.mozilla.com' => [
+                'socorro-admin'
+            ],
+            default => [
+            ]
+        }
+    },
     'socorro-web-http' => {
         service_description => 'crash-stats.m.c - http string',
         check_command => 'check_http_string!crash-stats.mozilla.com!/products/Firefox!\'Mozilla Crash Reports\'',
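For readers unfamiliar with file age checks, here is a minimal Python sketch of the threshold logic implied by `check_file_age!10800!21600!/var/log/socorro/crontabber.log`. It is not the Nagios plugin itself. The function names `classify_age` and `check_log_age`, the message text, the strict greater-than comparison, and the choice to treat a missing file as critical are all assumptions made for illustration. Only the two thresholds and the log path come from the change above.

```python
import os
import time

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2

# Thresholds taken from the check_command above, in seconds.
WARN_SECONDS = 10800   # 3 hours: warn
CRIT_SECONDS = 21600   # 6 hours: page


def classify_age(age_seconds, warn=WARN_SECONDS, crit=CRIT_SECONDS):
    """Map a file age in seconds to a Nagios-style status code.

    Assumption: an age must strictly exceed a threshold to trip it.
    """
    if age_seconds > crit:
        return CRITICAL
    if age_seconds > warn:
        return WARNING
    return OK


def check_log_age(path, warn=WARN_SECONDS, crit=CRIT_SECONDS, now=None):
    """Return (status, message) for the time since the last write to path.

    Assumption: a file that cannot be stat'ed is reported as CRITICAL,
    since a missing log is at least as bad as a stale one.
    """
    now = time.time() if now is None else now
    try:
        mtime = os.stat(path).st_mtime
    except OSError:
        return CRITICAL, f"FILE_AGE CRITICAL: cannot stat {path}"
    age = int(now - mtime)
    status = classify_age(age, warn, crit)
    label = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}[status]
    return status, f"FILE_AGE {label}: {path} is {age} seconds old"
```

Calling `check_log_age("/var/log/socorro/crontabber.log")` on the admin host would return the status code and a one-line message. A wrapper script could print the message and exit with the status code, which is how Nagios plugins conventionally report results.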
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Updated•8 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard