Closed
Bug 1067626
Opened 10 years ago
Closed 10 years ago
docker-worker: Attachable cache on worker-host
Categories
(Taskcluster :: Workers, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: jonasfj, Assigned: garndt)
References
Details
Attachments
(1 file)
To facilitate:
- incremental builds
- updates to git/hg clones
- potential performance improvements (over AUFS)

we should have an attachable cache that lives on the worker host. We've discussed various implementation options, but I think the following solution is what we'll end up with.

The host maintains cache folders of the form:

  /var/cache/docker-worker/<cacheName>/<instanceNumber>/

A task specifies that it wants a cache as follows:

  {
    scopes: ['docker-worker:cache:try-obj-dir'],
    payload: {
      image: 'ubuntu:14.04',
      command: ['...'],
      cache: {
        '/home/worker/obj-dir': 'try-obj-dir'
      },
    },
    ...
  }

This will mount the folder "/var/cache/docker-worker/try-obj-dir/0/" under '/home/worker/obj-dir' in the worker container. If the folder "/var/cache/docker-worker/try-obj-dir/0/" is already mounted in another task, the folder "/var/cache/docker-worker/try-obj-dir/1/" will be created, and so on. The instance number exists to allow multiple instances of a folder with the same <cacheName>. We should always prefer the most recent folder when multiple are available. Cache folders should be garbage collected in an LRU fashion when host disk capacity runs low.

Notes:
- The scope "docker-worker:cache:<cacheName>" is required for a task to use a cache (prevents cache poisoning, intentional and accidental).
- Cache folders will exist on the worker host, so this does provide a way to deliver executable code to the worker host. We should consider the security implications. But for performance, stability and debuggability this is probably our best bet.
- We might want to consider limits on cache size, to avoid out-of-disk issues. (We could free disk space prior to running a task that specifies the size requirements for its cache.)

Mostly, I think we have to suck up the security implications. Otherwise, we'll have to look more into data volumes and how to manage them.
Assignee
Updated•10 years ago
Assignee: nobody → garndt
Assignee
Comment 1•10 years ago
Assignee
Comment 2•10 years ago
Comment on attachment 8501064 [details] [review]
GH PR# 49

The only thing missing from this PR is a more thorough integration test with a cache shared between workers. There is a lower-level test that makes sure a cache is reused once it's been released by a prior task.
Attachment #8501064 - Flags: review?(jlal)
Assignee
Comment 3•10 years ago
ami-eface1df has been built with these changes and is currently configured for the CLI worker type.
Comment 4•10 years ago
Comment on attachment 8501064 [details] [review]
GH PR# 49

r+ with the tests added (I would ideally like to skim them before we land) via https://gist.github.com/lightsofapollo/238057d82de16ef7b12f

I manually tested:
- scopes
- parallel tests
- serial tasks

Looks good to me. We will likely need to observe GC and other behavior in production to see how that works out.
Attachment #8501064 - Flags: review?(jlal) → review+
Assignee
Comment 5•10 years ago
Merged https://github.com/taskcluster/docker-worker/commit/2005a30ad74cc2b9e9f7166016a5ace89a114624
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Assignee
Comment 6•10 years ago
Some additional documentation on how this works:

  {
    scopes: ['docker-worker:cache:<cache name>'],
    payload: {
      image: 'ubuntu:14.04',
      command: ['...'],
      cache: {
        "<cache name>": "<mount point within container>"
      },
    },
    ...
  }

The worker host will locate <cache name>; if no unmounted instance is available, it will create a new instance to be used by the container when starting up. If there is at least one unmounted instance, it will reuse that instance for the container. If multiple are unmounted, it will use the most recently used one. This cache will be mounted within the container at the point specified in the task payload. [1]

These caches are stored on the worker host at the path specified by the worker configuration [2]. Default: /mnt/var/cache/docker-worker

Caches will be cleaned up during a garbage-collection interval if the disk-space threshold is met. All cache instances that are not currently mounted will be deleted. This could be made smarter in the future to only remove the oldest instances of each cache until enough disk space is free. We'll have to monitor how often this scenario is encountered.

[1] https://github.com/taskcluster/docker-worker/blob/2005a30ad74cc2b9e9f7166016a5ace89a114624/lib/task.js#L99
[2] https://github.com/taskcluster/docker-worker/blob/2005a30ad74cc2b9e9f7166016a5ace89a114624/config/defaults.js#L24
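The garbage-collection pass described above (delete every unmounted instance, but only once the disk-space threshold is hit) can be sketched roughly as below. The function name `collectCaches` and its parameters are hypothetical, not taken from docker-worker's actual implementation.

```javascript
// Illustrative sketch of the cache GC pass; names are assumptions.
// instances:   absolute paths of all cache-instance directories on the host
// mountedPaths: set of instance paths currently mounted into containers
// Returns the list of instance paths that would be deleted.
function collectCaches(instances, mountedPaths, diskFreeBytes, thresholdBytes) {
  if (diskFreeBytes >= thresholdBytes) {
    return []; // threshold not met: nothing is collected
  }
  // Threshold met: every instance not currently mounted is deleted.
  return instances.filter((p) => !mountedPaths.has(p));
}
```

A smarter variant, as noted above, would sort the unmounted instances by last use and delete only the oldest ones until enough disk space is free, instead of deleting all of them.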
Updated•9 years ago
Component: TaskCluster → Docker-Worker
Product: Testing → Taskcluster
Updated•5 years ago
Component: Docker-Worker → Workers