Closed
Bug 1263414
Opened 9 years ago
Closed 9 years ago
No jobs starting
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: RyanVM, Assigned: nthomas)
Looking at the trees, there are currently over 3000 pending jobs and only 77 running. Something's clearly broken here. I'm closing all trees until it's sorted out.
I'm guessing the hundreds of #buildduty alerts like this one have something to do with it:
<nagios-releng> Sat 17:00:11 PDT [4477] buildbot-master85.bb.releng.scl3.mozilla.com:buildbot is CRITICAL: PROCS CRITICAL: 0 processes with command name buildbot (http://m.mozilla.org/buildbot)
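To get a quick sense of how many masters are affected, the alerts can be tallied by host. This is a rough sketch, assuming the alert lines all follow the format of the one quoted above; `ALERT_RE` and `critical_hosts` are hypothetical names, not part of any releng tooling.

```python
import re
from collections import Counter

# Matches "[id] host:service is STATE:" as seen in the quoted alert.
# The pattern is an assumption based on that single sample line.
ALERT_RE = re.compile(
    r"\[\d+\]\s+(?P<host>[\w.-]+):(?P<service>\S+) is (?P<state>[A-Z]+):"
)


def critical_hosts(lines):
    """Count CRITICAL alerts for the buildbot service, keyed by host."""
    hosts = Counter()
    for line in lines:
        m = ALERT_RE.search(line)
        if m and m["service"] == "buildbot" and m["state"] == "CRITICAL":
            hosts[m["host"]] += 1
    return hosts


sample = [
    "<nagios-releng> Sat 17:00:11 PDT [4477] "
    "buildbot-master85.bb.releng.scl3.mozilla.com:buildbot is CRITICAL: "
    "PROCS CRITICAL: 0 processes with command name buildbot "
    "(http://m.mozilla.org/buildbot)",
]
print(len(critical_hosts(sample)))  # number of distinct hosts alerting
```

Feeding it the channel log would give the number of distinct masters with no buildbot process, which is easier to compare against the pending/running job counts than a raw alert count.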
Comment 2 (Assignee) • 9 years ago
The weekly reboot script has run off the rails. It disabled the masters in slavealloc and did a graceful stop on everything (except the schedulers, which don't have a web interface), then failed to ssh in, with output like:
Apr 09 15:01:13 dev-master2.bb.releng.use1.mozilla.com restart_masters.sh: 2016-04-09 15:01:13,572 - DEBUG - __main__ - Attempting to connect to buildbot-master79.bb.releng.usw2.mozilla.com as cltbld
Apr 09 15:01:14 dev-master2.bb.releng.use1.mozilla.com restart_masters.sh: 2016-04-09 15:01:14,115 - ERROR - __main__ - No authentication methods available
Apr 09 15:01:14 dev-master2.bb.releng.use1.mozilla.com restart_masters.sh: 2016-04-09 15:01:14,229 - ERROR - __main__ - Couldn't get console to buildbot-master79.bb.releng.usw2.mozilla.com
Working on restarting masters now.
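The ordering above (masters disabled and stopped before the ssh login was attempted) is what left everything down. One way to guard against that is a pre-flight login check before any state is changed. This is a minimal sketch only, not the actual restart_masters.sh logic: `restart_masters` and the callables `can_login`, `disable`, `graceful_stop` and `restart` are all hypothetical stand-ins for whatever the real script does.

```python
def restart_masters(masters, can_login, disable, graceful_stop, restart):
    """Restart every master, but only after confirming we can log in to all.

    All four callables are hypothetical hooks taking a hostname:
      can_login(host) -> bool, disable(host), graceful_stop(host), restart(host)
    """
    # Pre-flight: check logins first, before touching any master.
    unreachable = [m for m in masters if not can_login(m)]
    if unreachable:
        # Abort with nothing disabled or stopped, so jobs keep running.
        raise RuntimeError("cannot log in to: " + ", ".join(unreachable))

    for m in masters:
        disable(m)        # e.g. mark the master disabled in slavealloc
        graceful_stop(m)  # let running jobs finish
        restart(m)        # needs the login that was verified above
```

With a check like this, an authentication failure would surface as a single error up front rather than as a farm of stopped masters.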
Assignee: nobody → nthomas
Comment 3 (Assignee) • 9 years ago
Masters are all back up and re-enabled in slavealloc, with the two schedulers manually given 'stop update start'. Still need to remove the reconfig.lock files.
Comment 4 (Assignee) • 9 years ago
Lots of jobs are running now. There are no reconfig.lock files to clear up because the script never had a chance to log in to the boxes.
Reopen trees at will.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Comment 5 (Reporter) • 9 years ago
Trees reopened, thanks Nick!
Updated • 9 years ago
blocking-b2g: --- → 2.2r?
Updated (Reporter) • 9 years ago
blocking-b2g: 2.2r? → ---
Updated • 9 years ago
Flags: needinfo?(nthomas)
Flags: needinfo?(nthomas)
Updated • 7 years ago
Product: Release Engineering → Infrastructure & Operations
Updated • 6 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard