Closed Bug 1263414 Opened 9 years ago Closed 9 years ago

No jobs starting

Product/Component: Infrastructure & Operations Graveyard :: CIDuty
Type: task
Priority: Not set
Severity: blocker
Tracking: Not tracked
Status: RESOLVED FIXED
Reporter: RyanVM
Assigned: nthomas

Looking at the trees, there are currently over 3000 pending jobs and only 77 running. Something's clearly broken here. I'm closing all trees until it's sorted out.
I'm guessing the hundreds of #buildduty alerts like the one below have something to do with it:
<nagios-releng> Sat 17:00:11 PDT [4477] buildbot-master85.bb.releng.scl3.mozilla.com:buildbot is CRITICAL: PROCS CRITICAL: 0 processes with command name buildbot (http://m.mozilla.org/buildbot)
The weekly reboot script has run off the rails. It's disabled the masters in slavealloc, done a graceful stop on everything (except the schedulers, which don't have a web interface), then failed to ssh in with output like:
Apr 09 15:01:13 dev-master2.bb.releng.use1.mozilla.com restart_masters.sh: 2016-04-09 15:01:13,572 - DEBUG - __main__ - Attempting to connect to buildbot-master79.bb.releng.usw2.mozilla.com as cltbld
Apr 09 15:01:14 dev-master2.bb.releng.use1.mozilla.com restart_masters.sh: 2016-04-09 15:01:14,115 - ERROR - __main__ - No authentication methods available
Apr 09 15:01:14 dev-master2.bb.releng.use1.mozilla.com restart_masters.sh: 2016-04-09 15:01:14,229 - ERROR - __main__ - Couldn't get console to buildbot-master79.bb.releng.usw2.mozilla.com
Working on restarting masters now.
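For anyone triaging a similar failure, a rough sketch of how the affected masters could be listed from log output in the format quoted above. The regular expression is an assumption based only on those three lines, and the function name and sample constant are hypothetical; none of this is part of restart_masters.sh.

```python
import re

# Pattern assumed from the log lines quoted in this bug; the real script
# may emit other message shapes that this would not catch.
FAILED_CONSOLE = re.compile(r"ERROR - \S+ - Couldn't get console to (\S+)")


def unreachable_masters(log_text):
    """Return the sorted, de-duplicated hostnames the script could not reach."""
    return sorted(set(FAILED_CONSOLE.findall(log_text)))


# Sample input: the three log lines quoted in the comment above.
SAMPLE_LOG = (
    "Apr 09 15:01:13 dev-master2.bb.releng.use1.mozilla.com restart_masters.sh: "
    "2016-04-09 15:01:13,572 - DEBUG - __main__ - Attempting to connect to "
    "buildbot-master79.bb.releng.usw2.mozilla.com as cltbld\n"
    "Apr 09 15:01:14 dev-master2.bb.releng.use1.mozilla.com restart_masters.sh: "
    "2016-04-09 15:01:14,115 - ERROR - __main__ - No authentication methods available\n"
    "Apr 09 15:01:14 dev-master2.bb.releng.use1.mozilla.com restart_masters.sh: "
    "2016-04-09 15:01:14,229 - ERROR - __main__ - Couldn't get console to "
    "buildbot-master79.bb.releng.usw2.mozilla.com\n"
)

if __name__ == "__main__":
    for host in unreachable_masters(SAMPLE_LOG):
        print(host)
```

Only the "Couldn't get console" lines are matched, since they name the host directly; the "No authentication methods available" line carries no hostname of its own.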
Assignee: nobody → nthomas
Masters are all back up and re-enabled in slavealloc, with the two schedulers manually given 'stop update start'. Still need to remove the reconfig.lock files.
Lots of jobs running, no reconfig.lock to clear up because the script never had a chance to log in to the boxes. Reopen trees at will.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Trees reopened, thanks Nick!
blocking-b2g: --- → 2.2r?
blocking-b2g: 2.2r? → ---
Flags: needinfo?(nthomas)
See Also: → 1256118
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard