Closed Bug 413044 Opened 17 years ago Closed 16 years ago

create and deploy nagios plugin for buildbot master

Categories

(Release Engineering :: General, defect, P3)

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 488273

People

(Reporter: joduinn, Unassigned)

References

Details

We use nagios to monitor many different parts of our Build infrastructure. It would be nice to have nagios also monitor the health of a given buildbot master install. Some examples:
- checking the number of pending jobs queued up in the buildmaster, and sending warning notifications if the queue grows too long.
- checking the buildbot master's connection status to its build slaves. The latest Buildbot 0.7.6 does have some new email notification functionality if a slave disconnects, but it does not notify you when the slave reconnects. Nagios, however, notifies you of both disconnects and reconnects.

It would be nice if we could handle the following:
- "tell me when a slave disconnects/reconnects"
- "how many times this month did we go over some threshold queue length?" (do we need more slaves for timely turnarounds?)
- "how much slave uptime did we have?"
...

Using nagios for this means we can also use all the various nagios reporting infrastructure, rather than having to reinvent the wheel.
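As a rough illustration of the pending-queue check described above, here is a minimal sketch of how such a plugin could map a queue length onto the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function names, the thresholds, and get_pending_count() are all hypothetical; in particular, how the pending count would actually be read from a buildbot master is not specified in this bug, so that part is left as a placeholder that reports UNKNOWN.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_pending(pending, warn, crit):
    """Map a pending-job count onto a Nagios status code and one-line message."""
    if pending is None or pending < 0:
        return UNKNOWN, "UNKNOWN - could not determine pending job count"
    if pending >= crit:
        status = CRITICAL
    elif pending >= warn:
        status = WARNING
    else:
        status = OK
    # Text after '|' is Nagios performance data: label=value;warn;crit
    message = "%s - %d pending jobs | pending=%d;%d;%d" % (
        STATUS_NAMES[status], pending, pending, warn, crit)
    return status, message


def get_pending_count():
    """Hypothetical placeholder: a real plugin would query the buildbot
    master here. Returning None makes the check report UNKNOWN."""
    return None


def main():
    # Assumed example thresholds; real values would depend on the install.
    status, message = check_pending(get_pending_count(), warn=10, crit=25)
    print(message)
    return status
```

A real plugin would finish with sys.exit(main()) so that nagios sees the status as the process exit code. Emitting the count as performance data is what would let the existing nagios reporting answer questions like "how many times this month did we go over some threshold queue length".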
Priority: -- → P3
Component: Release Engineering → Release Engineering: Future
QA Contact: build → release
Sounds like a job for the build database, and we have other nagios monitoring in place for buildbot masters now too.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → DUPLICATE
Moving closed Future bugs into Release Engineering in preparation for removing the Future component.
Component: Release Engineering: Future → Release Engineering
Product: mozilla.org → Release Engineering