Closed
Bug 786914
Opened 13 years ago
Closed 13 years ago
Many test slaves not taking jobs
Categories
(Release Engineering :: General, defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: philor, Assigned: nthomas)
Details
(Whiteboard: [buildduty][capacity])
Currently, around 1000 of the 3000 pending jobs are 10.8 tests, going back about five hours. There are currently seven jobs running, so there must be, what, 73, 74 slaves not taking jobs? One possibility, that I don't know how to evaluate the probability of, is that those 73 think that their basedir is C:\talos-slave.
Comment 1•13 years ago (Assignee)
As of the 2012-08-30 02:00:04 copy of http://build.mozilla.org/builds/last-job-per-slave.html#talos
Done work in last hour:
talos-mtnlion-r5-006
talos-mtnlion-r5-014
talos-mtnlion-r5-015
talos-mtnlion-r5-017
talos-mtnlion-r5-018
talos-mtnlion-r5-053
talos-mtnlion-r5-082
Last job completed about 12.5 hours ago:
talos-mtnlion-r5-004
talos-mtnlion-r5-005
talos-mtnlion-r5-007
talos-mtnlion-r5-008
talos-mtnlion-r5-009
talos-mtnlion-r5-011
talos-mtnlion-r5-012
talos-mtnlion-r5-013
talos-mtnlion-r5-016
talos-mtnlion-r5-019
talos-mtnlion-r5-021
talos-mtnlion-r5-023
talos-mtnlion-r5-024
talos-mtnlion-r5-025
talos-mtnlion-r5-026
talos-mtnlion-r5-027
talos-mtnlion-r5-028
talos-mtnlion-r5-029
talos-mtnlion-r5-037
talos-mtnlion-r5-041
talos-mtnlion-r5-042
talos-mtnlion-r5-043
talos-mtnlion-r5-044
talos-mtnlion-r5-045
talos-mtnlion-r5-046
talos-mtnlion-r5-047
talos-mtnlion-r5-048
talos-mtnlion-r5-049
talos-mtnlion-r5-050
talos-mtnlion-r5-051
talos-mtnlion-r5-052
talos-mtnlion-r5-054
talos-mtnlion-r5-055
talos-mtnlion-r5-056
talos-mtnlion-r5-057
talos-mtnlion-r5-058
talos-mtnlion-r5-059
talos-mtnlion-r5-060
talos-mtnlion-r5-076
talos-mtnlion-r5-081
talos-mtnlion-r5-083
talos-mtnlion-r5-084
talos-mtnlion-r5-085
talos-mtnlion-r5-086
talos-mtnlion-r5-088
talos-mtnlion-r5-089
Never done a job:
talos-mtnlion-r5-020
talos-mtnlion-r5-030
talos-mtnlion-r5-031
talos-mtnlion-r5-032
talos-mtnlion-r5-033
talos-mtnlion-r5-034
talos-mtnlion-r5-035
talos-mtnlion-r5-036
talos-mtnlion-r5-038
talos-mtnlion-r5-039
talos-mtnlion-r5-040
talos-mtnlion-r5-061
talos-mtnlion-r5-062
talos-mtnlion-r5-063
talos-mtnlion-r5-064
talos-mtnlion-r5-065
talos-mtnlion-r5-066
talos-mtnlion-r5-067
talos-mtnlion-r5-068
talos-mtnlion-r5-069
talos-mtnlion-r5-070
talos-mtnlion-r5-071
talos-mtnlion-r5-072
talos-mtnlion-r5-073
talos-mtnlion-r5-074
talos-mtnlion-r5-075
talos-mtnlion-r5-077
talos-mtnlion-r5-078
talos-mtnlion-r5-079
talos-mtnlion-r5-080
talos-mtnlion-r5-087
talos-mtnlion-r5-090
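The three groups above can be reproduced from any mapping of slave name to last-completed-job time. The following is a minimal sketch, not the code behind last-job-per-slave.html: the function name `bucket_slaves`, the input dictionary shape, and the one-hour cutoff parameter are assumptions made for illustration, and the sample timestamps are made up apart from the report time and the 12.5-hour gap quoted above.

```python
from datetime import datetime, timedelta


def bucket_slaves(last_job, now, recent=timedelta(hours=1)):
    """Group slaves by how long ago they finished a job.

    last_job maps slave name -> datetime of last completed job,
    or None if the slave has never completed one (hypothetical format).
    """
    buckets = {"recent": [], "stale": [], "never": []}
    for name in sorted(last_job):
        finished = last_job[name]
        if finished is None:
            buckets["never"].append(name)
        elif now - finished <= recent:
            buckets["recent"].append(name)
        else:
            buckets["stale"].append(name)
    return buckets


# Report time taken from the comment above; per-slave times are illustrative.
now = datetime(2012, 8, 30, 2, 0, 4)
sample = {
    "talos-mtnlion-r5-006": now - timedelta(minutes=20),
    "talos-mtnlion-r5-004": now - timedelta(hours=12, minutes=30),
    "talos-mtnlion-r5-020": None,
}
result = bucket_slaves(sample, now)
```

Anything landing in the "stale" or "never" group while jobs are pending is a candidate for a closer look.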
Comment 2•13 years ago (Assignee)
(In reply to Nick Thomas [:nthomas] from comment #1)
> Last job completed about 12.5 hours ago:
Many slaves on other OSes are in the same state too: talos-r4-snow, talos-r4-leopard, talos-r3-fed*. This will be fallout from bug 786807.
> Never done a job:
These seem to have uptimes of 6 days and look like they never got rebooted after 10.8 was enabled in production. Only the staging slaves and 022 are disabled in slavealloc.
Comment 3•13 years ago (Assignee)
talos-r4-snow-* and talos-mtnlion-r5-* (where uptime > 1 hour) rebooted
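The selection rule above (name matches one of the patterns and uptime is over an hour) could be expressed roughly as follows. This is only a sketch under assumptions: `reboot_candidates` and the uptime dictionary are hypothetical, the hostnames `talos-r4-snow-001` and `talos-r3-fed-001` and all uptimes are illustrative, and how uptimes were actually collected is not stated in this bug.

```python
from datetime import timedelta
from fnmatch import fnmatch


def reboot_candidates(uptimes, patterns, min_uptime=timedelta(hours=1)):
    """Return sorted slave names that match a pattern and exceed min_uptime.

    uptimes maps slave name -> timedelta since last boot (hypothetical format).
    """
    return sorted(
        name
        for name, up in uptimes.items()
        if up > min_uptime and any(fnmatch(name, p) for p in patterns)
    )


# Illustrative data only; the 6-day uptime mirrors the figure in comment 2.
uptimes = {
    "talos-mtnlion-r5-020": timedelta(days=6),
    "talos-mtnlion-r5-006": timedelta(minutes=25),
    "talos-r4-snow-001": timedelta(hours=13),
    "talos-r3-fed-001": timedelta(hours=13),
}
candidates = reboot_candidates(
    uptimes, ["talos-r4-snow-*", "talos-mtnlion-r5-*"]
)
```

Slaves that rebooted recently are skipped, which avoids interrupting the few machines that are already taking jobs.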
Comment 4•13 years ago (Assignee)
talos-r3-fed-* and talos-r3-fed64-* done.
Assignee: nobody → nthomas
Priority: -- → P1
Summary: Nearly every 10.8 slave is not taking jobs → Many test slaves not taking jobs
Comment 5•13 years ago (Assignee)
talos-r4-lion-* done too.
Turns out talos-mtnlion-r5-080 wasn't ready for production (bug 786993) and burned ~250 builds when hg wasn't working. talos-mtnlion-r5-087 (bug 786994) also had issues; there might be more.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Comment 6•13 years ago
I went through the last-build-per-slave list for all the mtnlion machines today and opened bugs for, or reimaged/rebooted, the problematic ones. I also updated https://wiki.mozilla.org/ReferencePlatforms/HowToSetupNewPlatform to indicate that this is something you should watch after a new platform is put into production, to catch any wonky slaves.
Updated•12 years ago
Product: mozilla.org → Release Engineering
Updated•7 years ago
Component: General Automation → General