Closed Bug 660080 Opened 9 years ago Closed 6 years ago
From an earlier email thread:

Our current process for rebooting slaves at the end of a build is causing multiple headaches:

* masters tend to get stuck thinking the slave is still around - this is worse with slavealloc, since the slave will not necessarily re-connect to the same master. Catlee saw this today. I think it has to do with the slave powering off without terminating the buildslave process or even the TCP connection
* snow and leopard slaves have been failing to reboot but killing the buildslave process lately (bug 648665)
* The pidfiles left around when the buildslave process does not shut down cleanly cause problems on startup (bug 652847)
* Where to get count_and_reboot.py from is problematic (bug 646580, bug 659344)
* buildbot-start monitoring, and in fact the whole approach to ensuring slaves are up to date and healthy, requires frequent slave reboots (so bug 633277 will be WONTFIX'd sooner or later).

I think this may also be responsible for some of our hung reconfigs, but I can't prove that. Honestly, I can't prove any of the above.

The proposal is this: When buildslave-0.8.4pre-moz1 is completely deployed, it ships with the Idleizer, which means it has innate knowledge of how to reboot the machine. This could be trivially expanded so that a custom command sets a "reboot immediately after next disconnect" flag on the Idleizer. Then the DisconnectStep would be all that's required to reboot the slave.

Armen, given you've filed a number of reboot-related bugs recently, do you want to work on this?
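For illustration only, the "reboot immediately after next disconnect" flag could look roughly like the sketch below. This is not the Idleizer's actual interface: the class name `RebootOnDisconnect`, its methods, and the injected `reboot` callable are all hypothetical, and the sketch only shows the intended ordering (a command sets the flag, and the next disconnect triggers the reboot).

```python
class RebootOnDisconnect:
    """Hypothetical stand-in for the proposed flag; NOT the real Idleizer API."""

    def __init__(self, reboot):
        # `reboot` is an assumed callable that actually reboots the machine.
        self._reboot = reboot
        self.reboot_requested = False

    def request_reboot(self):
        # What the proposed custom command would do on the slave:
        # remember that the next disconnect should lead to a reboot.
        self.reboot_requested = True

    def on_disconnect(self):
        # Assumed to be called whenever the connection to the master closes.
        # Returns True if a reboot was triggered, False otherwise.
        if not self.reboot_requested:
            return False
        self.reboot_requested = False
        self._reboot()
        return True
```

With something of this shape, an ordinary disconnect leaves the machine alone, and only a disconnect that follows the custom command leads to a reboot, which is why the DisconnectStep alone would then be enough.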
(In reply to comment #0)
> Armen, given you've filed a number of reboot-related bugs recently, do you
> want to work on this?

No, not necessarily. Once the right version of buildbot is deployed everywhere I would be more easily persuaded.
Product: mozilla.org → Release Engineering
7 years ago
Component: Other → General Automation
QA Contact: catlee
Duping forward to bug 1028191.

If we can use Idleizer for this, great!

I think the behaviour we want is basically: run buildbot until either
a) idle after some small amount of time (30 minutes?)
b) after the current job is done

I think this could be accomplished by triggering the graceful shutdown process 30 minutes after the slave starts.
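A rough sketch of that rule, under stated assumptions: the class `GracefulShutdownTimer` and its methods are hypothetical names, times are plain seconds supplied by the caller, and nothing here uses buildbot's own graceful shutdown machinery. It only models the decision: once 30 minutes have passed since the slave started, shut down as soon as no job is running.

```python
class GracefulShutdownTimer:
    """Hypothetical model of 'graceful shutdown 30 minutes after start'."""

    def __init__(self, start_time, max_uptime=30 * 60):
        # After this moment the slave should stop as soon as it is idle.
        self.deadline = start_time + max_uptime
        self.busy = False

    def job_started(self):
        self.busy = True

    def job_finished(self):
        self.busy = False

    def should_shut_down(self, now):
        # Before the deadline: keep running, busy or not.
        # After the deadline: wait for the current job, then stop.
        return now >= self.deadline and not self.busy
```

An idle slave then stops at the 30 minute mark (case a), while a busy one stops when its current job finishes (case b).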
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1028191
Component: General Automation → General