Closed Bug 1405683 Opened 8 years ago Closed 8 years ago

Deploys can timeout leaving things in an inconsistent state

Categories

(Taskcluster :: Workers, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: garndt, Assigned: wcosta)

Details

When trying to update worker types, the script stopped after updating gecko-images and then timed out. This left things in an in-between state, with only some worker types updated and others not. Running the deploy scripts again will cause backups to be made of the mixed state.

$ ./deploy/bin/update-worker-types.js
Creating worker-types backup file.
Updating worker types...
Updating cratertest
Updating dbg-macosx64
Updating cli
Updating b2gtest
Updating balrog
Updating dbg-linux32
Updating funsize-balrog-dev
Updating android-api-15
Updating deepspeech-worker
Updating flame-kk
Updating funsize-balrog
Updating gecko-1-b-android
Updating dbg-linux64
Updating gecko-2-b-macosx64
Updating gecko-misc
Updating gecko-t-linux-xlarge
Updating mulet-opt
Updating gecko-3-b-linux
Updating gaia-decision
Updating gecko-1-b-linux
Updating gaia-cache
Updating gaia
Updating gecko-2-b-android
Updating github-worker
Updating mulet-debug
Updating hg-worker
Updating gecko-2-decision
Updating opt-linux32
Updating gecko-2-images
Updating opt-linux64
Updating funsize-mar-generator
Updating releng-svc-compute
Updating b2gbuild
Updating desktop-test-xlarge
Updating gecko-t-linux-large
Updating desktop-test-large
Updating fuzzer
Updating tcvcs-cache
Updating spidermonkey
Updating tcvcs-cache-device
Updating gecko-symbol-upload
Updating opt-macosx64
Updating tutorial
Updating desktop-test
Updating rustbuild
Updating releng-svc
Updating gecko-decision
Updating gecko-3-images
Updating gecko-3-decision
Updating gecko-3-b-macosx64
Updating taskcluster-images
Updating releng-task
Updating taskcluster-generic
Updating gecko-3-b-android
Updating symbol-upload
Updating gecko-t-linux-medium
Updating gecko-images
Error: timeout of 30000ms exceeded
    at Timeout.<anonymous> (/Users/garndt/work/projects/docker-worker/node_modules/superagent/lib/node/index.js:763:17)
    at ontimeout (timers.js:469:11)
    at tryOnTimeout (timers.js:304:5)
    at Timer.listOnTimeout (timers.js:264:5)
The immediate solution is to roll back before trying to update worker types again.
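To illustrate why rolling back first matters, here is a minimal sketch using an in-memory stand-in for the worker type definitions. It is not the docker-worker deploy code: `backup`, `updateAll`, `rollback`, and the `example-worker` worker type are hypothetical names made up for this example, and the timeout is simulated by throwing partway through the loop.

```javascript
// Hypothetical in-memory model; NOT the real update-worker-types.js.
function backup(store) {
  // Snapshot the current definitions before any update is attempted.
  return JSON.parse(JSON.stringify(store));
}

function updateAll(store, updates, failOn) {
  // Apply updates one by one; throw partway through to mimic the timeout.
  for (const [name, definition] of Object.entries(updates)) {
    if (name === failOn) {
      throw new Error('timeout of 30000ms exceeded');
    }
    store[name] = definition;
  }
}

function rollback(store, snapshot) {
  // Replace the current (possibly mixed) state with the snapshot.
  for (const name of Object.keys(store)) {
    delete store[name];
  }
  Object.assign(store, JSON.parse(JSON.stringify(snapshot)));
}

// 'example-worker' is an invented worker type used only for this sketch.
const store = {
  'gecko-images': { ami: 'old' },
  'example-worker': { ami: 'old' },
};
const snapshot = backup(store);

let mixedState = null;
try {
  updateAll(
    store,
    { 'gecko-images': { ami: 'new' }, 'example-worker': { ami: 'new' } },
    'example-worker'
  );
} catch (err) {
  // At this point one worker type is new and the other is still old.
  mixedState = backup(store);
  // Roll back before retrying, so a later backup does not capture the mix.
  rollback(store, snapshot);
}
```

If the script were simply rerun without the rollback, its first step would snapshot `mixedState` rather than the original definitions, which is the problem described above.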
Assignee: nobody → wcosta
Status: NEW → ASSIGNED
Commits pushed to master at https://github.com/taskcluster/docker-worker

https://github.com/taskcluster/docker-worker/commit/ba2fc9ef5b95f4d1a60442e89368391318238dee
Bug 1405683: Add option to disable worker types backup.

If, during the process of updating the worker types, an update fails, the worker types set will be left in an in-between state where some are updated and others not. If we run the script again, backups will be overwritten with this undesirable state.

We add an option to update-worker-types script to not generate backup file before updating.

We now also support the --test flag, which updates only ami-test and ami-test-pv worker types.

These flags can also be passed to release.sh, which will forward them to update-worker-types.

https://github.com/taskcluster/docker-worker/commit/0abfd3868dd0c25e0aff7ab5c692c40c52a06248
Merge pull request #324 from walac/master

Bug 1405683: Add option to disable worker types backup.
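The behaviour described in the commit message could be sketched roughly as follows. This is an assumption-laden illustration, not the code from the commit: the commit message names the `--test` flag but does not spell out the name of the backup option, so `--no-backup` below is an assumed spelling, and `parseFlags`, `selectWorkerTypes`, and `plan` are hypothetical helpers.

```javascript
// Hypothetical flag handling; the real option names and structure may differ.
function parseFlags(argv) {
  return {
    // '--no-backup' is an ASSUMED spelling for the "disable backup" option.
    backup: !argv.includes('--no-backup'),
    // '--test' is the flag named in the commit message.
    test: argv.includes('--test'),
  };
}

function selectWorkerTypes(allNames, flags) {
  // With --test, only the two test worker types named in the commit are updated.
  const testOnly = ['ami-test', 'ami-test-pv'];
  return flags.test
    ? allNames.filter((name) => testOnly.includes(name))
    : allNames;
}

function plan(argv, allNames) {
  // Decide whether to write a backup file and which worker types to touch.
  const flags = parseFlags(argv);
  return {
    writeBackup: flags.backup,
    targets: selectWorkerTypes(allNames, flags),
  };
}

// Example: retrying after a failed deploy without overwriting the good backup.
const retryPlan = plan(
  ['--no-backup'],
  ['gecko-images', 'ami-test', 'ami-test-pv']
);
```

Skipping the backup on a retry keeps the earlier backup file, taken before the failed run, as the rollback target.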
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Component: Docker-Worker → Workers