Closed Bug 971737 Opened 11 years ago Closed 11 years ago

slave rebooter doesn't reboot slaves when graceful shutdown fails

Product/Component: Release Engineering :: General
Type: defect
Hardware: x86_64
OS: Linux
Priority: Not set
Severity: normal
Tracking: Not tracked

RESOLVED FIXED

People

(Reporter: bhearsum, Assigned: bhearsum)

Attachments

(1 file)

The graceful shutdown, as it turns out, always fails for slaves that are unreachable.
It seems clear that we shouldn't do a graceful shutdown if the slave is unreachable. The easiest place to put this logic would be at the start of the shutdown_buildslave action (around http://git.mozilla.org/?p=build/slaveapi.git;a=blob;f=slaveapi/actions/shutdown_buildslave.py;h=772e45d44300635e76d2b38f48d0a62c16c2c737;hb=HEAD#l24). We could just add an 'if not ping(slave): return' and that should do it... What do you think, John? After all the follow-ups here, my mind is a bit frazzled about what the best thing to do is.
Flags: needinfo?(jhopkins)
So the graceful shutdown is just to make sure we don't interrupt a build in progress before rebooting the slave. If the machine is not pingable, then it is almost certainly not doing a build. If the ping fails because the network is down, the build will fail anyway since buildbot requires a persistent connection. Adding an 'if not ping(slave): return' like you suggested makes sense to me.
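For illustration only, here is a minimal sketch of the "ping first, then shut down" guard discussed above. It is not the actual slaveapi patch: the function name shutdown_if_reachable, the graceful_shutdown callback, the is_reachable parameter, and the ping helper are all hypothetical, and the ping flags assume the Linux iputils ping binary.

```python
import subprocess


def ping(host, count=1, timeout_s=5):
    """Hypothetical helper: True if `host` answers an ICMP echo request.

    Assumes the Linux iputils `ping` binary (-c = packet count,
    -W = seconds to wait for a reply).
    """
    try:
        result = subprocess.run(
            ["ping", "-c", str(count), "-W", str(timeout_s), host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # No usable ping binary: treat the host as unreachable.
        return False
    return result.returncode == 0


def shutdown_if_reachable(slave, graceful_shutdown, is_reachable=ping):
    """Attempt a graceful shutdown only when the slave answers a ping.

    `graceful_shutdown` stands in for whatever the real action does to
    stop buildbot cleanly; it is an assumption of this sketch.
    """
    if not is_reachable(slave):
        # An unreachable slave is almost certainly not running a build,
        # so skip the graceful step and let the caller go on to reboot.
        return "skipped"
    graceful_shutdown(slave)
    return "shutdown_requested"
```

Passing the reachability check in as a parameter keeps the guard testable without touching the network, e.g. shutdown_if_reachable("talos-r3-fed-011", do_shutdown, is_reachable=lambda s: False) returns "skipped" without calling do_shutdown.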
Flags: needinfo?(jhopkins)
I gave this a try locally, using talos-r3-fed-011 (a currently unreachable machine) as a test. Seems to work.
Attachment #8374953 - Flags: review?(jhopkins)
Comment on attachment 8374953: ping before shutting down
Passes visual inspection.
Attachment #8374953 - Flags: review?(jhopkins) → review+
Comment on attachment 8374953: ping before shutting down
Restarted slaveapi prod for this.
Attachment #8374953 - Flags: checked-in+
This seems to be working.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
I think this might have caused bug 972867.
Component: Tools → General