Closed
Bug 1067626
Opened 10 years ago
Closed 10 years ago
docker-worker: Attachable cache on worker-host
Categories
(Taskcluster :: Workers, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: jonasfj, Assigned: garndt)
References
Details
Attachments
(1 file)
To facilitate:
- incremental builds
- updates to git/hg clones
- potential performance improvements (over AUFS)

we should have an attachable cache that lives on the worker host. We've discussed various implementation options, but I think the following solution is what we'll end up with.

The host maintains cache folders of the form:

  /var/cache/docker-worker/<cacheName>/<instanceNumber>/

A task specifies that it wants a cache as follows:

  {
    scopes: ['docker-worker:cache:try-obj-dir'],
    payload: {
      image: 'ubuntu:14.04',
      command: ['...'],
      cache: {
        '/home/worker/obj-dir': 'try-obj-dir'
      },
    },
    ...
  }

This will mount the folder "/var/cache/docker-worker/try-obj-dir/0/" under '/home/worker/obj-dir' in the worker container. If the folder "/var/cache/docker-worker/try-obj-dir/0/" is already mounted in another task, the folder "/var/cache/docker-worker/try-obj-dir/1/" will be created, and so on. The instance number exists to allow multiple instances of a folder with the same <cacheName>. We should always prefer the most recent folder when multiple are available. Cache folders should be garbage collected in an LRU fashion when host disk capacity runs low.

Notes:
- The scope "docker-worker:cache:<cacheName>" is required for a task to use a cache (prevents cache poisoning, intentional and accidental).
- Cache folders will exist on the worker host, so this does provide a way to deliver executable code to the worker host. We should consider the security implications. But for performance, stability and debuggability this is probably our best bet.
- We might want to consider limits on cache size, to avoid out-of-disk issues. (We could free disk space prior to running a task that specifies the size requirements for its cache.)

Mostly, I think we have to suck up the security implications. Otherwise, we'll have to look more into data volumes and how to manage them.
Assignee
Updated•10 years ago
Assignee: nobody → garndt
Assignee
Comment 1•10 years ago
Assignee
Comment 2•10 years ago
Comment on attachment 8501064 [details] [review]
GH PR# 49

The only thing missing from this PR is a more thorough integration test with a cache shared between workers. There is a lower-level test that makes sure a cache is reused once it's been released by a prior task.
Attachment #8501064 - Flags: review?(jlal)
Assignee
Comment 3•10 years ago
ami-eface1df has been built with these changes and is currently configured for the CLI worker type.
Comment 4•10 years ago
Comment on attachment 8501064 [details] [review]
GH PR# 49

r+ with the tests added (I would ideally like to skim them before we land) via https://gist.github.com/lightsofapollo/238057d82de16ef7b12f

I manually tested:
- scopes
- parallel tests
- serial tasks

Looks good to me. We will likely need to observe GC and other behavior in production to see how that works out.
Attachment #8501064 - Flags: review?(jlal) → review+
Assignee
Comment 5•10 years ago
Merged https://github.com/taskcluster/docker-worker/commit/2005a30ad74cc2b9e9f7166016a5ace89a114624
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Assignee
Comment 6•10 years ago
Some additional documentation on how this works:

  {
    scopes: ['docker-worker:cache:<cache name>'],
    payload: {
      image: 'ubuntu:14.04',
      command: ['...'],
      cache: {
        "<cache name>": "<mount point within container>"
      },
    },
    ...
  }

The worker host will locate <cache name>; if no unmounted instance is available, it will create a new instance to be used by the container when starting up. If there is at least one unmounted instance, it will reuse that instance for the container. If multiple are unmounted, it will use the most recently used one. This cache will be mounted within the container at the point specified in the task payload. [1]

These caches are stored on the worker host at the path specified by the worker configuration [2]. Default: /mnt/var/cache/docker-worker

Caches will be cleaned up during a garbage-collection interval if the disk-space threshold is met. All cache instances that are not currently mounted will be deleted. This could be made smarter in the future to only remove the oldest instances of each cache until enough disk space is free. We'll have to monitor how often this scenario is encountered.

[1] https://github.com/taskcluster/docker-worker/blob/2005a30ad74cc2b9e9f7166016a5ace89a114624/lib/task.js#L99
[2] https://github.com/taskcluster/docker-worker/blob/2005a30ad74cc2b9e9f7166016a5ace89a114624/config/defaults.js#L24
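The garbage-collection pass described above (delete every unmounted instance, but only once the disk-space threshold is hit) can be sketched roughly as below. The function name `collectCaches` and its parameters are hypothetical, not taken from docker-worker's actual implementation.

```javascript
// Illustrative sketch of the cache GC pass; names are assumptions.
// instances:   absolute paths of all cache-instance directories on the host
// mountedPaths: set of instance paths currently mounted into containers
// Returns the list of instance paths that would be deleted.
function collectCaches(instances, mountedPaths, diskFreeBytes, thresholdBytes) {
  if (diskFreeBytes >= thresholdBytes) {
    return []; // threshold not met: nothing is collected
  }
  // Threshold met: every instance not currently mounted is deleted.
  return instances.filter((p) => !mountedPaths.has(p));
}
```

A smarter variant, as noted above, would sort the unmounted instances by last use and delete only the oldest ones until enough disk space is free, instead of deleting all of them.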
Updated•9 years ago
Component: TaskCluster → Docker-Worker
Product: Testing → Taskcluster
Updated•5 years ago
Component: Docker-Worker → Workers