Closed Bug 1354786 Opened 8 years ago Closed 7 years ago

taskcluster build jobs scheduled/pending for long time/not starting (build pool not scaling up?)

Categories

(Taskcluster :: Services, enhancement)

Type: enhancement
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: aryx, Assigned: jhford)

Details

Trees are closed for this. Taskcluster build jobs are pending but won't start, or take a long time to start (e.g. some Android build jobs have been pending for 3+ hours). https://tools.taskcluster.net/aws-provisioner/ shows a small number of build jobs running and a large number pending. Jobs for pushes before 3:55am UTC have completed. The push from 4:35am UTC still has pending jobs. E.g. this push still has pending Android and OSX jobs: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=21062cd63dd72565a4f9afc35bb4f6f78726096c&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=running&filter-resultStatus=pending&filter-resultStatus=runnable
It appears the provisioner was stuck inside an iteration loop and never completed. The loop started at this point: https://papertrailapp.com/systems/taskcluster-aws-provisioner2/events?focus=787221693587431434&selected=787221693587431434 It performed some analysis, but then neither completed nor started any subsequent iterations. I restarted the app in Heroku and I see the provisioner attempting to spawn instances now. Added jhford to the bug.
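For illustration only: one common guard against an iteration that never completes is a per-iteration deadline, so a hung iteration is abandoned and the next one still runs. The sketch below is not the provisioner's code and not necessarily the fix applied for this bug; `run_iterations`, `iterate`, `max_iteration_seconds`, and `max_iterations` are hypothetical names chosen for the example.

```python
import asyncio


async def run_iterations(iterate, max_iteration_seconds, max_iterations):
    """Call `iterate` repeatedly, abandoning any call that exceeds the deadline.

    `iterate` is a hypothetical coroutine function standing in for one
    provisioning iteration. Returns a list with one outcome per iteration.
    """
    outcomes = []
    for _ in range(max_iterations):
        try:
            # wait_for cancels the iteration if it runs past the deadline,
            # so one hung iteration cannot block all later ones.
            await asyncio.wait_for(iterate(), timeout=max_iteration_seconds)
            outcomes.append("completed")
        except asyncio.TimeoutError:
            # A real service would log and alert here; the loop then
            # proceeds to the next iteration instead of stalling.
            outcomes.append("timed out")
    return outcomes
```

With a guard like this, a stalled iteration shows up as an explicit timeout in the logs rather than as silence, which is what made the stall above visible only through the growing pending count.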
Component: Queue → AWS-Provisioner
Trees reopened.
Severity: blocker → normal
Assignee: nobody → jhford
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Component: AWS-Provisioner → Services