docker-worker: Shutdown inside `idle` event handler

Status: NEW
Assignee: Unassigned
Product: Taskcluster
Component: Worker
Opened: 3 years ago
Updated: a year ago
Reporter: jonasfj
Whiteboard: [docker-worker]

Description (Reporter, 3 years ago)

See comment here:
https://github.com/taskcluster/docker-worker/commit/1e7d9bb772a3146563323635ea056d1782799523#commitcomment-10078001

Note, this could be a non-bug, so correct me if I'm wrong.

From the code it looks like we set a shutdown time, then clear it when work arrives. This means that shutdown can occur while we're in the process of calling `queue.claimTask`. That would be unfortunate.

I suggest we use an interval defined by `earliestShutdown` and `latestShutdown`,
both measured in seconds before the end of the billing cycle,
such that if we get an `idle` event while the `remainder` of the billing cycle satisfies:
  latestShutdown <= remainder <= earliestShutdown
we shut down.

This way, we shut down no earlier than `earliestShutdown` seconds
before the end of the billing cycle. And if we're closer to the end of the billing
cycle than `latestShutdown`, we don't shut down, because we might already have paid for the next cycle.

Decent defaults for AWS would be:
  earliestShutdown:   6 min
  latestShutdown:     2 min
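
A minimal sketch of the check proposed above, not the actual docker-worker code. It reads both bounds as seconds before the end of the billing cycle, so with the 6 min / 2 min defaults the window is `latestShutdown <= remainder <= earliestShutdown`. The names `shouldShutdown` and `remainderOfCycle` are hypothetical, and the one-hour billing cycle counted from boot is an assumption.

```javascript
// Defaults suggested above, expressed in seconds before the end of the cycle.
const DEFAULTS = { earliestShutdown: 6 * 60, latestShutdown: 2 * 60 };

// Hypothetical helper: seconds left in the current billing cycle.
// Assumes cycles of `cycleSeconds` (one hour here) counted from boot.
function remainderOfCycle(uptimeSeconds, cycleSeconds = 3600) {
  return cycleSeconds - (uptimeSeconds % cycleSeconds);
}

// Hypothetical helper: called from the `idle` handler to decide whether
// to shut down right now. Returns true only inside the shutdown window.
function shouldShutdown(remainder, { earliestShutdown, latestShutdown } = DEFAULTS) {
  return latestShutdown <= remainder && remainder <= earliestShutdown;
}
```

With these defaults, an `idle` event 5 minutes before the end of the cycle triggers a shutdown, while one arriving 10 minutes or 1 minute before the end does not.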

This means that we must force polling for tasks to end in less than ~3 min.
That seems reasonable; ideally we should also enforce it. If we have DNS
issues, requests could potentially take longer and the node would fail to shut down.
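
One way the polling deadline could be enforced is sketched below, assuming the claim call returns a promise. `withTimeout`, `PollTimeoutError` and `claimOrGiveUp` are hypothetical names, and `claimTask` stands in for whatever function wraps `queue.claimTask`; none of this is taken from docker-worker itself.

```javascript
// Hypothetical error type so callers can tell a deadline from a real failure.
class PollTimeoutError extends Error {}

// Reject if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new PollTimeoutError(`timed out after ${ms} ms`)), ms);
  });
  // Always clear the timer so it cannot keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Assumed deadline: the ~3 min mentioned above.
const POLL_DEADLINE_MS = 3 * 60 * 1000;

// Resolve to the claimed task, or to null if polling hit the deadline.
// Any other error from `claimTask` is rethrown.
async function claimOrGiveUp(claimTask, deadlineMs = POLL_DEADLINE_MS) {
  try {
    return await withTimeout(claimTask(), deadlineMs);
  } catch (err) {
    if (err instanceof PollTimeoutError) return null;
    throw err;
  }
}
```

Note that the timeout only stops waiting; it does not cancel the underlying request, so a claim that completes after the deadline would still need to be handled or released.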
Component: TaskCluster → Docker-Worker
Product: Testing → Taskcluster
Whiteboard: [docker-worker]
Component: Docker-Worker → Worker