Closed Bug 1154276 Opened 10 years ago Closed 10 years ago

[docker-worker] Prevent tasks from being claimed when instance is shutting down

Categories

(Taskcluster :: Workers, defect)

x86
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: garndt, Assigned: garndt)

References

Details

Attachments

(1 file)

52 bytes, text/x-github-pull-request
jonasfj: review+
There have been many occasions where docker-worker is killed by the system because a spot node is being terminated. The worker then respawns right before shutdown (after syslog has been killed) and claims a task, and then the whole system goes down. The task stays claimed for up to 20 minutes (takenUntil) and is not retried by another worker during that time. On shutdown, perhaps there is a way to keep the worker from respawning or, if it does start up, to prevent it from claiming tasks.
Blocks: 1154248
I think the runlevel command will tell you if your system is currently being shut down. Maybe this could be used to check whether to respawn the worker?
Discussed on IRC: it might be possible to instruct docker-worker to wait for docker and only start up when the run level is 2-5. It would also stop when docker stops or when the run level is no longer 2-5.
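For illustration only, here is a minimal sketch of the run level check described above. It is not the code from the PR. It assumes the `runlevel` command is available and prints the previous and current run levels (for example "N 2"), or "unknown" when the level cannot be determined. The names `parse_runlevel`, `may_claim_tasks`, and `current_runlevel_output` are hypothetical.

```python
import subprocess

# Run levels in which the worker may start and claim tasks (2-5, per the
# discussion above). Anything else is treated as "shutting down or unknown".
ACTIVE_RUNLEVELS = {"2", "3", "4", "5"}


def parse_runlevel(output):
    """Return the current run level from `runlevel` output, or None.

    `runlevel` prints "<previous> <current>", e.g. "N 2". Any other shape
    (such as "unknown" or empty output) yields None.
    """
    fields = output.split()
    if len(fields) != 2:
        return None
    return fields[1]


def may_claim_tasks(output):
    """True only when the current run level is one of 2-5."""
    return parse_runlevel(output) in ACTIVE_RUNLEVELS


def current_runlevel_output():
    """Hypothetical helper: run the system `runlevel` command.

    Returns an empty string if the command is missing, so the caller
    fails closed and does not claim tasks.
    """
    try:
        result = subprocess.run(
            ["runlevel"], capture_output=True, text=True, check=False
        )
    except OSError:
        return ""
    return result.stdout


if __name__ == "__main__":
    print(may_claim_tasks(current_runlevel_output()))
```

The check fails closed: if the run level cannot be read, the worker is treated as not allowed to claim, which matches the goal of never claiming a task on a machine that is about to go away.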
Assignee: nobody → garndt
Status: NEW → ASSIGNED
Attached file GH PR 80
This is the PR that we were discussing previously. Added some notes to it to explain what it's doing.
Attachment #8595997 - Flags: review?(jopsen)
Comment on attachment 8595997 GH PR 80 Gave a light review... Looks good to me. Didn't read the test cases, but the code changes look good. Also didn't focus too much on the refactored parts. But I like the cleanup :)
Attachment #8595997 - Flags: review?(jopsen) → review+
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Component: TaskCluster → Docker-Worker
Product: Testing → Taskcluster
Component: Docker-Worker → Workers