Consider replacing Docker-based dev environment with systemd

NEW
Unassigned

Status

3 years ago

People

(Reporter: gps, Unassigned)

Description

People love to complain about the Docker dev environment in version-control-tools. The main complaints are that it is slow and buggy. Both are valid.

A lot of the badness we have in Docker stems from the following:

* Bugs in Docker and the Linux kernel, especially around stacked filesystems like aufs and overlayfs (the other Docker storage drivers are too slow)
* Image creation is too slow (Dockerfile caching is invalidated too often, and running Ansible every time is too slow)
* Docker images are too large and unwieldy
* Starting the 7+ containers that comprise our dev environment/test cluster is slow
* Making HTTP requests to the Docker API, e.g. to start processes in containers, is slow
* We have to poll for events instead of having a proper "push-based" mechanism (e.g. when we start the containers, we poll to see when an HTTP or SSH process comes online), which adds further slowness
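
To make the polling cost concrete, here is a minimal Python sketch of the kind of loop described in the last bullet. It is not the actual version-control-tools code; the function name `wait_for_port` and its parameters are hypothetical. Each failed attempt costs up to one `interval` of wall-clock time, which adds up across many services.

```python
import socket
import time


def wait_for_port(host, port, timeout=10.0, interval=0.1):
    """Poll until a TCP connection to (host, port) succeeds.

    Hypothetical helper for illustration only. Returns True once the
    port accepts a connection, or False if `timeout` seconds elapse.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means some process is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: the service is not up yet.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A push-based mechanism would replace this loop with a notification from the service (or its supervisor) at the moment it becomes ready.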

After using Docker for almost two years now, it is clear there are some fundamental and long-standing problems with it that make a Docker-based dev and test environment a continued PITA. I'm tentatively proposing swapping out the Docker dev environment for a systemd-based one.

systemd will give us a number of benefits over Docker:

* No Docker images or filesystem overhead or kernel bugs. We can manage files directly on the host filesystem.
* Having things directly on the filesystem means we can easily use modern build tools (like Bazel) to manage the state of those files from outside the "container," so we can update things faster by performing only the minimal set of changes necessary.
* We can configure dependencies directly in systemd units and use systemd to start processes as quickly as possible - without having to resort to polling from "outside." We know systemd can boot a system much faster than typical init systems. This should translate to starting the dozens of processes that comprise our dev environment faster.
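
As a rough illustration of configuring dependencies in systemd units, the Python sketch below renders unit file text using the standard `Requires=`, `After=`, `Type=` and `ExecStart=` directives. The unit names and executable paths are hypothetical, and `render_unit` is an illustrative helper, not part of any existing tooling. `Type=notify` only works if the service itself signals readiness to systemd, which is an assumption about the services involved.

```python
def render_unit(description, exec_start, requires=(), service_type="simple"):
    """Render a minimal systemd service unit as text (illustrative only)."""
    lines = ["[Unit]", f"Description={description}"]
    if requires:
        deps = " ".join(requires)
        # Requires= pulls the dependencies in; After= orders startup so
        # this unit starts only once they are considered started.
        lines.append(f"Requires={deps}")
        lines.append(f"After={deps}")
    lines += ["", "[Service]", f"Type={service_type}", f"ExecStart={exec_start}"]
    return "\n".join(lines) + "\n"


# Hypothetical units: a web service that depends on an LDAP service.
ldap_unit = render_unit(
    "LDAP server (hypothetical)",
    "/opt/vct/bin/run-ldap",  # hypothetical path
    service_type="notify",    # assumes the service tells systemd when it is ready
)
web_unit = render_unit(
    "Web server (hypothetical)",
    "/opt/vct/bin/run-web",   # hypothetical path
    requires=["vct-ldap.service"],
)
print(web_unit)
```

With units like these, systemd would decide when each process can start, so nothing outside has to poll for readiness.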

And systemd still provides:

* The ability to isolate processes from each other. The container core of Docker is basically clone(2) + chroot(2) on steroids. systemd can launch processes in cgroups/slices in chroots just like Docker can.
* The ability to use "containers" in production in the future.
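
One way the isolation could look is a transient unit launched with `systemd-run`, placed in its own slice and confined to a root directory. The Python sketch below only builds the argument list and does not execute anything. The unit name, slice name and paths are hypothetical, and whether `RootDirectory=` and `PrivateTmp=` are accepted as transient properties depends on the systemd version, so treat this as an assumption to verify.

```python
def systemd_run_argv(unit, slice_name, root_dir, command):
    """Build a systemd-run command line for an isolated process.

    Illustrative helper only; it returns the argv list without running it.
    """
    return [
        "systemd-run",
        f"--unit={unit}",               # name of the transient unit
        f"--slice={slice_name}",        # cgroup slice the process is placed in
        "-p", f"RootDirectory={root_dir}",  # chroot-like filesystem root
        "-p", "PrivateTmp=yes",         # private /tmp for the process
        "--",
    ] + list(command)


# Hypothetical example: run a web process inside its own root and slice.
argv = systemd_run_argv(
    "vct-web",                 # hypothetical unit name
    "vct.slice",               # hypothetical slice name
    "/var/lib/vct/roots/web",  # hypothetical root directory
    ["/usr/bin/run-web"],      # hypothetical command inside the root
)
print(" ".join(argv))
```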

Without Docker, we lose:

* docker-machine and the convenience of running containers on non-Linux machines (like OS X and Windows). We'll be managing our own VM, probably with Vagrant. This is mostly a solved problem.

Ansible integration is a huge open issue. We currently leverage Ansible to provision both our production servers and our Docker images. Ansible does everything from installing system packages and services to configuring the application layer. Having Docker run what is effectively an exact replica of our server environment is *really* nice. We have confidence that we'll find deployment bugs before we deploy. We've practically eliminated staging environments from our deploy process because our confidence that the local environment matches production is so high. We may lose some of this depending on how we implement the systemd approach. I think I'm fine with that. That said, we might be able to continue running Ansible in a chroot created via systemd to get the same effect.

A likely blocker to this work is transitioning our remaining infrastructure from RHEL6/CentOS6 to CentOS7. We'd like everything running CentOS7 so we have consistency and don't need to e.g. worry about running a CentOS or Ubuntu chroot just to run application X.