Closed Bug 1372906 Opened 7 years ago Closed 4 years ago

Document worker manager's expectations of AMIs

Categories

(Taskcluster :: Services, enhancement)

enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Unassigned)

References

Details

In https://github.com/taskcluster/taskcluster-docs/pull/186 a number of comments have related to how the AWS provisioner expects the AMIs it spawns to behave:
 - start a worker on boot
 - call the provisioner secrets API
 - self-terminate
 - safety termination after 96 hours
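
For illustration only, the 96-hour safety limit could be checked along these lines. This is a minimal sketch: `MAX_LIFETIME` and `exceeded_max_lifetime` are hypothetical names, not part of the provisioner, and treating exactly 96 hours as expired is an assumption.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the 96-hour safety limit from the list above.
MAX_LIFETIME = timedelta(hours=96)


def exceeded_max_lifetime(boot_time, now=None):
    """Return True once the instance has been up for MAX_LIFETIME or longer."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - boot_time >= MAX_LIFETIME
```

A worker could call this between tasks and shut itself down once it returns True.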

I think this is a bit too specific for the manual, but would make a great fit in the reference section, probably as a separate document under the AWS provisioner (so a .md file in its docs/ directory).  Then it can be linked to from the manual.

Pete and John were the folks interested.
Pete, is this something you could add?
Flags: needinfo?(pmoore)
Assignee: nobody → pmoore
Flags: needinfo?(pmoore)
I note the docs could be like:
 - sequence chart with:
   1) boot
   2) read user-data
   3) Find workerGroup, workerId from ???
   4) get temp creds from provisioner (with retries)
   5) invalidate secret token by calling provisioner again
   6) fetch secrets from tc-secrets
   7) claim work from queue, following queue <-> worker interaction docs
 - a JSON schema for the keys to expect in user-data
 - a section for a few of these topics, explaining things like:
    * extract provisionerId / workerType from user-data
    * extract config keys from workerType definition from user-data, I suspect it's the data key
    * how workerType definition becomes a launchSpecification
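
To make the sequence above concrete, here is a rough sketch of what a worker's startup might look like. Everything in it is an assumption for illustration: the `securityToken` key name, the `data` key (only suspected above), and the four callables passed to `start_worker` are hypothetical stand-ins for the provisioner, tc-secrets and queue calls, not real client APIs. Step 3 (workerGroup / workerId) is left out because its source is still an open question.

```python
import json
import time


def parse_user_data(raw):
    """Step 2: pull the fields the worker needs out of the user-data JSON."""
    doc = json.loads(raw)
    return {
        "provisionerId": doc["provisionerId"],
        "workerType": doc["workerType"],
        "securityToken": doc["securityToken"],  # hypothetical key name
        "config": doc.get("data", {}),  # assumption: config lives under "data"
    }


def with_retries(call, attempts=5, delay=1.0, sleep=time.sleep):
    """Run call(), retrying with exponential backoff; re-raise the last error."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(delay * 2 ** attempt)


def start_worker(raw_user_data, get_credentials, remove_secret,
                 fetch_secrets, claim_work):
    """Steps 2 and 4-7; the four callables are hypothetical stand-ins."""
    ud = parse_user_data(raw_user_data)
    token = ud["securityToken"]
    # Step 4: get temp creds from the provisioner, with retries.
    creds = with_retries(lambda: get_credentials(token))
    # Step 5: invalidate the secret token so it cannot be reused.
    remove_secret(token)
    # Step 6: fetch secrets using the temporary credentials.
    secrets = fetch_secrets(creds, ud["workerType"])
    # Step 7: start claiming work from the queue.
    return claim_work(creds, ud["provisionerId"], ud["workerType"],
                      ud["config"], secrets)
```

Passing the network calls in as arguments keeps the ordering visible and lets the sequence be exercised with fakes.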

(heavily inspired by https://docs.taskcluster.net/reference/platform/taskcluster-queue/docs/worker-interaction)

I see these as important docs, written from the perspective of someone writing a worker / AMI.
These docs are important not just for making it easier for people to write workers.  They also represent a contract which helps us and others figure out what we promise and what we might change without notice.  The naming of SSH keys is an example -- we never intended that to be part of the interface, but in the absence of documentation of the interface, there was no reason for anyone to know that.

So I see this as more than a "document in some free time" bug.
Assignee: pmoore → nobody
John: should this be part of the Worker Manager (as docs or whatever)?
Flags: needinfo?(jhford)
Yes, this should be a part of the worker manager as docs since this is not something which is programmatically enforceable.
Flags: needinfo?(jhford)
Component: Documentation → Services
Summary: Document AWS provisioner's expectations of AMIs → Document worker manager's expectations of AMIs

This is largely done:

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED