Closed Bug 629701 Opened 14 years ago Closed 14 years ago

enable buildbot_start alerts for mac/linux desktop slaves

Categories

(Infrastructure & Operations :: RelOps: General, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Assigned: dustin)

References

Details

(keeping this assigned to me until it's ready to run)
This will enable the alerts added to nagios in bug 629698, but kept quiet. It will only enable them for linux hosts, where runslave.py is already deployed. This requires that the reboot-idle-slaves bug be solved first, because many slaves currently go more than 7h without restarting.
Blocks: 627126
Can we go ahead with this? The same check as currently configured on linux-ix-slave10, but:
- with the freshness check set to 7h, and set to WARNING
- with all notifications disabled for all POSIX slaves.

I'll stage, and then deploy, the runslave.py changes in bug 629694. Then I can look in the web interface and see what kind of idle/hung rates we're seeing. Since we don't have Idleizer implemented yet, we'll still have lots of idle/hung slaves, which is why I don't want notifications yet.
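As a rough illustration of the freshness rule described above, the following Python sketch flags a slave as stale when its last buildbot_start report is more than 7h old. The function name, constants, and timestamps are hypothetical and chosen for illustration only; the real check lives in the nagios configuration, which is not shown in this bug.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the 7h threshold comes from the comment above.
FRESHNESS_THRESHOLD = timedelta(hours=7)

# Standard Nagios plugin state codes (only the two used here).
OK = 0
WARNING = 1


def freshness_state(last_start, now, threshold=FRESHNESS_THRESHOLD):
    """Return WARNING if the last buildbot_start report is older than
    the threshold, otherwise OK. This is a hypothetical helper, not
    the actual nagios check."""
    return WARNING if (now - last_start) > threshold else OK


if __name__ == "__main__":
    now = datetime(2011, 2, 1, 12, 0, tzinfo=timezone.utc)
    recent = now - timedelta(hours=2)   # restarted 2h ago
    stale = now - timedelta(hours=9)    # idle or hung for 9h
    print(freshness_state(recent, now))
    print(freshness_state(stale, now))
```

A stale result maps to WARNING rather than CRITICAL, matching the request above; with notifications disabled, the state would only be visible in the web interface.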
Assignee: dustin → arich
The nagios checks have been rolled out, but there's a modification that needs to be made on the slave end so that hosts talk to the nagios server in their datacenter.
Assignee: arich → dustin
Still, this bug is done. The followup work on my end is in bug 654279.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Component: Server Operations: RelEng → RelOps
Product: mozilla.org → Infrastructure & Operations