Closed Bug 1055600 Opened 10 years ago Closed 8 years ago

add nagios monitoring for long running aws_* scripts on aws-manager

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: massimo, Unassigned)

Massimo Gervasini [:massimo]

Reporter

Description

10 years ago

We have just noticed that aws_process_cloudtrail.py got stuck on 1st Aug (2 weeks ago), so we clearly need to monitor this script.

Right now, the only script monitored on aws-manager is aws_stop_idle.log.

From a quick check, the following scripts create a lockfile:

* aws_manager-aws_clean_log_dir.py.sh
* aws_manager-aws_get_cloudtrail_logs.py.sh
* aws_manager-aws_process_cloudtrail_logs.py.sh
* aws_manager-aws_publish_amis.py.sh
* aws_manager-aws_sanity_checker.py.sh
* aws_manager-aws_stop_idle_servo.sh
* aws_manager-aws_watch_pending.py.sh
* aws_manager-aws_watch_pending_servo.sh
* aws_manager-bld-linux64-ec2-golden.sh
* aws_manager-delete_old_spot_amis.py.sh
* aws_manager-spot_sanity_check.py.sh
* aws_manager-tag_spot_instances.py.sh
* aws_manager-try-linux64-ec2-golden.sh
* aws_manager-tst-emulator64-ec2-golden.sh
* aws_manager-tst-linux32-ec2-golden.sh
* aws_manager-tst-linux64-ec2-golden.sh

So we might want to monitor them too.
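Since these scripts already create a lockfile, one possible way to monitor them is a Nagios check on the age of each lockfile. The following is only a minimal sketch of that idea, not what was actually deployed: the function name, the thresholds, and the assumption that a missing lockfile means "not running" are all hypothetical, and the bug does not say where the lockfiles live. The exit codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import os
import time

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_lockfile_age(path, warn_seconds, crit_seconds, now=None):
    """Return (exit_code, message) based on the lockfile's modification time.

    `path`, `warn_seconds` and `crit_seconds` are hypothetical parameters;
    real values would depend on how long each script normally runs.
    """
    now = time.time() if now is None else now
    try:
        age = now - os.stat(path).st_mtime
    except FileNotFoundError:
        # Assumption: no lockfile means the script is not running right now.
        return OK, "OK: %s not present" % path
    except OSError as exc:
        return UNKNOWN, "UNKNOWN: cannot stat %s: %s" % (path, exc)

    if age >= crit_seconds:
        return CRITICAL, "CRITICAL: %s is %d seconds old" % (path, age)
    if age >= warn_seconds:
        return WARNING, "WARNING: %s is %d seconds old" % (path, age)
    return OK, "OK: %s is %d seconds old" % (path, age)


def main(argv):
    """Hypothetical entry point: <lockfile> <warn_seconds> <crit_seconds>."""
    if len(argv) != 4:
        print("UNKNOWN: usage: %s <lockfile> <warn_seconds> <crit_seconds>" % argv[0])
        return UNKNOWN
    code, message = check_lockfile_age(argv[1], float(argv[2]), float(argv[3]))
    print(message)
    return code
```

To use it as a plugin, the script would end with `sys.exit(main(sys.argv))` (after `import sys`), and Nagios would run it once per lockfile. A script stuck for two weeks, like the one described above, would show up as CRITICAL as soon as its lockfile outlived the chosen threshold.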

Rail Aliiev [:rail]

Updated

10 years ago

Comment 1

8 years ago

The goldens are monitored via nagios, some of the watcher scripts now have cron jobs that will kill them off if they run too long, and some others are monitored via papertrail.

Status: NEW → RESOLVED

Closed: 8 years ago

Resolution: --- → FIXED

Nobody; OK to take it and work on it

Assignee

Updated

7 years ago

Component: Tools → General

Categories

(Release Engineering :: General, defect)
