Occasionally we need to reboot many machines at the same time. In the past we've done this by collecting a list of machines and then running command-line actions to reboot the machines on that list.

Since slave health is already plumbed to talk to slaveapi, and many of the reboots we need to do are based on the length of time since a slave last reported, we should add the ability to batch reboot slaves directly from slave health.

The ability to reboot slaves marked as "broken" already exists. See "batch actions" on the slavetype page:
https://secure.pub.build.mozilla.org/builddata/reports/slave_health/slavetype.html?class=test&type=panda

"broken" is a state that corresponds to "hasn't reported a result in >6 hours", but we often know *before* that 6 hour threshold that we need to reboot a given batch of slaves.

We should add the ability to reboot slaves based on any time threshold. This can be a new batch action in the drop-down list.
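To make the request concrete, here is a rough sketch of the selection logic such a batch action might use. It is not the slave health or slaveapi code: the function name, the shape of the input data, and the slave names are all hypothetical, and it only covers picking the slaves, not issuing the reboots.

```python
from datetime import datetime, timedelta, timezone


def slaves_past_threshold(last_reported, threshold, now=None):
    """Return the names of slaves whose last report is older than `threshold`.

    `last_reported` is a hypothetical mapping of slave name to the
    timezone-aware datetime of its last reported result.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - threshold
    # Anything that last reported before the cutoff is a reboot candidate.
    return sorted(name for name, ts in last_reported.items() if ts < cutoff)


# Hypothetical data: three slaves that last reported 1, 3 and 7 hours ago.
now = datetime(2014, 1, 1, 12, 0, tzinfo=timezone.utc)
reports = {
    "panda-0001": now - timedelta(hours=1),
    "panda-0002": now - timedelta(hours=3),
    "panda-0003": now - timedelta(hours=7),
}

# The existing "broken" action roughly corresponds to a fixed 6 hour threshold.
broken = slaves_past_threshold(reports, timedelta(hours=6), now=now)

# The requested action would let the user pick the threshold, e.g. 2 hours.
stale = slaves_past_threshold(reports, timedelta(hours=2), now=now)
```

With this sample data, the 6 hour threshold selects only panda-0003, while the 2 hour threshold also picks up panda-0002.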
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
Component: Tools → General
Product: Release Engineering → Release Engineering