Entirely too many WinXP test slaves are in the dead part of last-job-per-slave.html

RESOLVED FIXED

Status

Product: Infrastructure & Operations
Component: CIDuty
Priority: P2
Severity: normal
Status: RESOLVED FIXED
Reported: 6 years ago
Modified: 2 months ago

People

(Reporter: philor, Assigned: nthomas)

Tracking

Details

(Whiteboard: [buildduty][capacity])

(Reporter)

Description

6 years ago
Now that the enormous backlog of try jobs has settled into the usual pattern of the oldest 150 being all WinXP, it's time for my usual trip to http://build.mozilla.org/builds/last-job-per-slave.html#test to find that there are a ton of them in the red and yellow bits, not having taken a job for two or five or twenty days.

talos-r3-xp-035
talos-r3-xp-011
talos-r3-xp-032
talos-r3-xp-022
talos-r3-xp-039
talos-r3-xp-025
talos-r3-xp-013
talos-r3-xp-065
talos-r3-xp-068
talos-r3-xp-018
talos-r3-xp-047
talos-r3-xp-007
talos-r3-xp-074
talos-r3-xp-009
talos-r3-xp-005
talos-r3-xp-055
talos-r3-xp-071
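
The check described above (spotting slaves that have not taken a job for days) can be sketched roughly as follows. This is only an illustration: the `YELLOW_AFTER` and `RED_AFTER` thresholds, the function names, and the input mapping of slave name to last-job time are all assumptions, not how last-job-per-slave.html is actually generated.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds: the real cut-offs for the yellow and red
# sections of the report are not stated in this bug.
YELLOW_AFTER = timedelta(days=2)
RED_AFTER = timedelta(days=5)


def classify_slaves(last_job, now):
    """Map each slave name to 'ok', 'yellow' or 'red' by idle time.

    last_job: dict of slave name -> datetime of its most recent job.
    now: datetime to measure idleness against.
    """
    result = {}
    for name, finished in last_job.items():
        idle = now - finished
        if idle >= RED_AFTER:
            result[name] = "red"
        elif idle >= YELLOW_AFTER:
            result[name] = "yellow"
        else:
            result[name] = "ok"
    return result


def stale_slaves(last_job, now):
    """Return the sorted names of slaves that are not 'ok'."""
    status = classify_slaves(last_job, now)
    return sorted(name for name, state in status.items() if state != "ok")


if __name__ == "__main__":
    # Illustrative timestamps only; not taken from the real report.
    now = datetime(2012, 1, 21)
    sample = {
        "talos-r3-xp-035": now - timedelta(days=20),
        "talos-r3-xp-011": now - timedelta(days=3),
    }
    for name in stale_slaves(sample, now):
        print(name)
```

Passing `now` in explicitly, rather than calling `datetime.now()` inside the function, keeps the classification easy to test against fixed timestamps.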
(Assignee)

Comment 1

6 years ago
I'll round these up; IIRC there's already a bug on file for the underlying issue.
Assignee: nobody → nrthomas
Priority: -- → P2
(Assignee)

Comment 2

6 years ago
I've gone through the list and rebooted them. In every case they had finished a job, but the call to tools/buildfarm/maintenance/count_and_reboot.py had failed to reboot them. Some had
 C:\>C:\mozilla-build\python25\python c:\runslave.py
 ^CTerminate batch job (Y/N)? 
in the terminal, but not all.
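
One way a supervising host could catch this failure mode is to confirm that the machine's boot time actually changed after a reboot was requested. The following is a minimal sketch under assumptions: `request_reboot` and `get_boot_time` are hypothetical callables supplied by the caller, and nothing here reflects what count_and_reboot.py or briar-patch really do.

```python
import time


def reboot_and_verify(request_reboot, get_boot_time,
                      timeout_s=300, poll_s=10, sleep=time.sleep):
    """Request a reboot and report whether it appears to have happened.

    request_reboot: hypothetical callable that asks the slave to reboot.
    get_boot_time:  hypothetical callable returning the slave's last boot
                    time as a number (for example, seconds since epoch).
    Returns True if a newer boot time is seen within timeout_s seconds,
    False if the slave still reports the old boot time after that.
    """
    before = get_boot_time()
    request_reboot()

    waited = 0
    while waited < timeout_s:
        sleep(poll_s)
        waited += poll_s
        # A later boot time means the machine really did restart.
        if get_boot_time() > before:
            return True
    return False
```

A `False` result would flag a slave that accepted the reboot request but never went down, which is the state these machines were found in.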

Bear is using talos-r3-xp-032 to look at why briar-patch isn't saving these machines.
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED

Comment 3

6 years ago
talos-r3-xp-032 was trying to reboot but had an error dialog on the screen that prevented the shutdown command from finishing. This is why briar-patch seemed to be doing nothing and kept reporting the host as inactive.

I killed the dialog and the reboot happened immediately.
Product: mozilla.org → Release Engineering

Updated

2 months ago
Product: Release Engineering → Infrastructure & Operations