A high percentage of both talos-linux32-ix and talos-linux64-ix machines are reported as broken in slave_health. This is causing a growing backlog of pending jobs (especially on the talos-linux64-ix machines).
slaverebooter is apparently failing to reboot these machines:
[6:03pm] coop: something is causing slaverebooter to hang before it gets to the talos-linux64-ix machines
[6:03pm] coop: since those are last in its list
[6:03pm] coop: i blame xp
Also see: https://bugzilla.mozilla.org/show_bug.cgi?id=971861#c7, which "may" be related.
:coop: now the situation seems improved: did you do anything (slaverebooter related steps, manual rebooting, ...)?
(In reply to Simone Bruno [:simone] from comment #2)
> :coop: now the situation seems improved: did you do anything (slaverebooter
> related steps, manual rebooting, ...)?

Jordan discovered that slaveapi needed the updated passwords for cltbld and Administrator. On top of that, I manually rebooted all the talos-linux* slaves that weren't still actively taking jobs.

Windows pending jobs are still high today, but I think Linux is fixed for now.
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED