Closed Bug 1290536 Opened 8 years ago Closed 4 years ago

Allow caches to be shared across tasks

Categories

(Taskcluster :: Workers, enhancement, P5)

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: gps, Unassigned)

Details

I'm not sure if this is the correct component...

Currently, if multiple tasks run on the same Docker worker, each task gets its own cache instance. I think this makes sense as a default behavior because it is safest.

However, in some cases consumers of a cache know how to lock files and can safely allow simultaneous access from concurrently running tasks. Mercurial store caches fall into this bucket: Mercurial obtains a lock that allows only a single writer to a repo store at a time.
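
To illustrate the kind of single-writer locking described above, here is a minimal sketch of several tasks coordinating on one shared cache directory. It is not how Mercurial implements its own store lock; it assumes a POSIX system (it uses `fcntl.flock`, an advisory lock), and the names `exclusive_cache_lock`, `populate_once`, `writer.lock` and `populated` are hypothetical.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def exclusive_cache_lock(cache_dir, lock_name="writer.lock"):
    """Hold an exclusive advisory lock on cache_dir for the duration of the block."""
    os.makedirs(cache_dir, exist_ok=True)
    lock_path = os.path.join(cache_dir, lock_name)
    with open(lock_path, "w") as lock_file:
        # Blocks until no other process (or open file) holds the lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield cache_dir
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def populate_once(cache_dir, marker="populated"):
    """Populate the shared cache if nobody has yet.

    Returns True if this call did the work, False if the cache was already populated.
    """
    with exclusive_cache_lock(cache_dir):
        marker_path = os.path.join(cache_dir, marker)
        if os.path.exists(marker_path):
            return False
        # Stand-in for an expensive operation such as a clone.
        with open(marker_path, "w") as f:
            f.write("done\n")
        return True


if __name__ == "__main__":
    shared = tempfile.mkdtemp()
    print(populate_once(shared))  # first caller does the work
    print(populate_once(shared))  # later callers reuse the result
```

With a scheme like this, only the first task to take the lock pays for the expensive operation; the others wait on the lock and then reuse the result instead of repeating the I/O.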

When triggering multiple decision tasks on try pushes, I've seen at least 4 decision tasks running on the same worker. Each task is doing its own `hg clone` into its own independent cache. This makes the tasks much slower because 4x the amount of I/O is being performed on the worker.

I'd love to be able to mark a cache as "shared" between tasks so we could make things like version control operations more efficient.

Since we currently don't run multiple tasks per worker very often, this is likely a low priority.
Component: Docker-Worker → Worker
Found in triage.

We might do this for read-only caches in the future.
Severity: normal → enhancement
QA Contact: pmoore
Pete, is this something generic-worker's mounts feature supports?
Flags: needinfo?(pmoore)
(In reply to Dustin J. Mitchell [:dustin] pronoun: he from comment #2)
> Pete, is this something generic-worker's mounts feature supports?

tl;dr: no

We typically run a single worker instance per machine; however, we can (and do) run multiple workers on some machines (for example NSS macOS workers) in order to achieve higher throughput. In this case, the OS manages the concurrency for us by running multiple workers, rather than the worker implementing this as a feature. The advantages here are i) a much simpler worker codebase, and ii) OSes do a really good job of isolating processes and running them concurrently.

At the moment, we have no provisions for multiple workers to share a single cache; the cache directories are simply moved into place during task initialisation by the "mounts" feature. When we have docker support in generic-worker, mounting caches will mean attaching volumes rather than moving folders, at which point it could be possible to share them, if we decide it is worth the engineering effort.
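
As a toy model of why move-based mounting rules out sharing (this is not generic-worker's code; the function names and directory layout are hypothetical), consider what moving a cache directory into a task directory does: once moved, the cache no longer exists in the store for any other worker to pick up.

```python
import os
import tempfile


def mount_by_move(cache_store, cache_name, task_dir):
    """Move a cache directory out of the store and into a task directory."""
    src = os.path.join(cache_store, cache_name)
    dst = os.path.join(task_dir, cache_name)
    os.rename(src, dst)  # the store no longer holds this cache
    return dst


def unmount_by_move(cache_store, cache_name, task_dir):
    """Move the cache directory back into the store when the task finishes."""
    src = os.path.join(task_dir, cache_name)
    dst = os.path.join(cache_store, cache_name)
    os.rename(src, dst)
    return dst


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    store = os.path.join(root, "store")
    task_a = os.path.join(root, "task-a")
    os.makedirs(os.path.join(store, "hg-store"))
    os.makedirs(task_a)

    mount_by_move(store, "hg-store", task_a)
    # While task A holds the cache, nothing is left in the store to share.
    print(os.path.exists(os.path.join(store, "hg-store")))
    unmount_by_move(store, "hg-store", task_a)
```

A volume-style mount, by contrast, leaves the directory where it is and exposes it at a second path, which is what would make sharing possible in principle.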
Flags: needinfo?(pmoore)
I filed this bug almost exactly 2 years ago. 2 years later and it isn't critical. It is a nice-to-have. But I understand the complexity involved with implementing it. We are sacrificing some performance by not having shared caches. But I think there are bigger problems worthy of the TC team's attention.

I think this bug could linger for a bit more.
Component: Worker → Workers

This would be a good feature to add, but probably not in docker-worker, so let's sit on it for another few years :)

Priority: -- → P5
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WONTFIX