Accelerating continuous integration with Docker – an intern’s story
Continuous integration (CI) has become a fundamental process at many software companies in recent years. Fortunately, frameworks such as Jenkins do most of the heavy lifting and make it easy for the engineering team to examine and improve the process. With the increasing popularity of lightweight virtualization, CI has begun to embrace technologies like Docker. Here, I’d like to share my experience of accelerating continuous integration with Docker during my internship at VoltDB.
Traditionally, a single CI job occupies all the resources of its assigned machine, which often leaves much of that capacity idle and slows down the whole CI system. It’s possible to execute multiple CI jobs concurrently on one machine, but doing so raises many issues. For instance, two jobs may compete for the same port or write to the same file at the same time. Developers can certainly adjust the code to remove possible collisions, but that takes considerable effort and means the tests can no longer assume they have a machine to themselves. Virtualization is the solution here. For VoltDB, which targets the Linux platform, Docker seems to be a better choice than a virtual machine (VM), since Docker containers support most Linux distributions with a smaller footprint, faster setup and teardown, and better extensibility.
From a low-level point of view, Docker containers are just regular processes on the host machine, isolated from one another by kernel features such as namespaces and control groups. To validate the efficiency of Docker containers, I tested with three containers in my innovation week project, in which three different pieces of JUnit testing work were executed concurrently on a typical proprietary machine. The result was very promising: the total time taken was very close to the longest single job execution time on the same machine. In contrast, four or five concurrent jobs resulted in noticeable overhead. Based on this observation, the remaining work was to integrate the Docker workflow into the Jenkins CI system and optimize the work division among jobs.
The first component required was a specialized Docker image for testing purposes, which initially supported JUnit testing only. JDK, Ant and Python were installed, followed by locale and timezone settings. Since all tests assumed a non-root user, and executing as root in a container can cause permission issues with host volume mounting, a non-root user needed to be created for the container and granted ownership of the newly created working directories and the /tmp directory. This user matched the user and group of the Jenkins agent for transparency purposes. The final statements in the Dockerfile changed the default user to the non-root user and the starting directory to the JUnit working directory.
The second component required was a series of shell commands that controlled the configuration and lifecycle of Docker containers. Docker containers support mounting directories on the host machine as volumes, which makes it easy to inject any version of code into the container without static bundling. Since the three containers were designed to test the same code, it was natural to make them mount and share the input directory. However, the tests wrote temporary files and results into special subdirectories, so there was a risk of write conflicts unless the tests were reconfigured. To avoid both reconfiguration and write collisions, the solution was to mount the same input directory in every container but copy it to a working directory inside each container. To collect the artifacts, the “docker cp” command was handy for copying files from the container to the host machine. Stopping and removing the containers after the tests reclaimed all the related container storage, so there was no storage pressure.
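The mount-then-copy arrangement can be sketched as a pair of command builders. This is only an illustration, not the script used at VoltDB: the image name, container name, mount point (/input), working directory (/home/test/work) and the Ant target name are all hypothetical placeholders, and the working directory is assumed to already exist in the image.

```python
import shlex


def build_run_command(image, name, src_dir, target,
                      mount_point="/input", work_dir="/home/test/work"):
    """Build the argv for a detached container that copies the shared,
    read-only input mount into a private working directory and then
    runs one Ant target there. All paths are hypothetical defaults."""
    inner = (
        f"cp -a {shlex.quote(mount_point)}/. {shlex.quote(work_dir)}/ && "
        f"cd {shlex.quote(work_dir)} && ant {shlex.quote(target)}"
    )
    return [
        "docker", "run", "-d",                 # detached: prints the container ID
        "--name", name,
        "-v", f"{src_dir}:{mount_point}:ro",   # shared input, mounted read-only
        image,
        "sh", "-c", inner,
    ]


def build_collect_command(name, container_path, host_dir):
    """Build the argv that copies artifacts out of a container with docker cp."""
    return ["docker", "cp", f"{name}:{container_path}", host_dir]


if __name__ == "__main__":
    print(build_run_command("junit-image", "junit-1", "/ws/src", "junit_a"))
    print(build_collect_command("junit-1", "/home/test/work/obj", "out/junit-1"))
```

Mounting the input read-only is an extra safeguard added in this sketch: because each container only writes to its own copy, nothing should ever need to write to the shared directory.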
As for the lifecycle, the JUnit Ant target names, container names and output directory names were derived first. After output directories were created and containers with conflicting names were removed, “docker run” started the containers one after another in the background, with volumes mounted and the launching command specified. Its return value, namely the container ID, was recorded. The command “docker wait” accepted a container ID, blocked until the corresponding container stopped, and returned the exit status. The command was issued on every container to wait for all containers to complete, much like pthread_join(). Only when all containers completed successfully was the job marked a success.
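A simplified sketch of that lifecycle follows. It is not the original shell script: the function names, the "junit-" container naming scheme and the /input mount point are assumptions made for illustration, and the launching command is reduced to a bare "ant" invocation. The docker subcommands themselves (rm -f, run -d, wait) are standard CLI commands.

```python
import subprocess


def docker(*args):
    """Run a docker CLI command and return its stripped stdout."""
    result = subprocess.run(["docker", *args], check=True,
                            capture_output=True, text=True)
    return result.stdout.strip()


def run_targets(image, targets, src_dir, run=docker):
    """Start one detached container per target, wait for all of them,
    and return (all_succeeded, exit_codes_by_target)."""
    container_ids = {}
    for target in targets:
        name = f"junit-{target}"  # hypothetical naming scheme
        try:
            run("rm", "-f", name)  # clear any leftover container with this name
        except subprocess.CalledProcessError:
            pass  # nothing to remove
        # "docker run -d" prints the new container's ID on stdout.
        container_ids[target] = run(
            "run", "-d", "--name", name,
            "-v", f"{src_dir}:/input:ro",
            image, "ant", target,
        )
    # "docker wait" blocks until the container stops and prints its exit code.
    exit_codes = {t: int(run("wait", cid)) for t, cid in container_ids.items()}
    return all(code == 0 for code in exit_codes.values()), exit_codes
```

Because every container is started before any wait is issued, the total wall-clock time is governed by the slowest container rather than by the sum of all of them. The `run` parameter exists only so the logic can be exercised without a Docker daemon.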
The last component was integration with Jenkins and the entire proprietary CI infrastructure. We used Puppet to guarantee that Docker was installed and running on all eligible machines. Also, the Jenkins agent user was added to the docker group so the agent could issue Docker commands directly. The aforementioned container lifecycle was captured in a shell script to which the user passed a range of JUnit targets. The number of targets determined the number of containers started at once, making it easy to adapt to jobs of different workloads and make the most of Docker containers. The shell script was managed by Jenkins as a freestyle build step. As a further optimization, a private Docker registry was set up on one internal machine to improve latency and security. Once the testing Docker image was pushed to the internal registry, all machines could pull it down quickly the first time they were assigned a Docker testing job.
The unit tests were repartitioned based on past duration data to reduce overall job latency. With the increased throughput and reduced latency, the same CI job took about half the time to complete while using two-thirds as many machines as before. The unit testing portion of the CI process was greatly accelerated, which sped up the entire development workflow.
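The article does not say which method was used for the repartitioning. One common heuristic for this kind of problem is greedy longest-first assignment: sort the test groups by past duration and always give the next one to the job with the least work so far. The sketch below illustrates that heuristic only; the group names and durations are made up.

```python
import heapq


def partition_by_duration(durations, num_jobs):
    """Split test groups across num_jobs jobs using greedy longest-first
    assignment. `durations` maps a group name to its past duration.
    Returns (groups_per_job, total_load_per_job)."""
    heap = [(0.0, i) for i in range(num_jobs)]  # (current load, job index)
    heapq.heapify(heap)
    groups = [[] for _ in range(num_jobs)]
    # Longest first; ties broken by name so the result is deterministic.
    for name, secs in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        load, i = heapq.heappop(heap)  # job with the least work so far
        groups[i].append(name)
        heapq.heappush(heap, (load + secs, i))
    loads = [0.0] * num_jobs
    for load, i in heap:
        loads[i] = load
    return groups, loads


if __name__ == "__main__":
    # Made-up durations, in minutes.
    past = {"A": 50, "B": 30, "C": 20, "D": 20, "E": 10}
    groups, loads = partition_by_duration(past, 3)
    print(groups, loads)
```

The job's latency is the largest of the per-job loads, so the aim is to keep the heaviest job as light as possible. This heuristic does not always find the optimal split, but it is simple and usually close.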
Currently, Docker is only used on unit testing jobs. For other CI jobs, the runtime resource consumption must be profiled to take advantage of Docker containers efficiently on machines with different specifications. Docker also offers the opportunity to broaden testing coverage. A user can install CoreOS on host machines and deploy various Linux distribution Docker images on top of them to test against. Different flavors of JDK and Python, among others, can be bundled in separate images to further expand coverage.
To summarize, my experiment showcased the simplicity and power of Docker. There is still much room to improve continuous integration with Docker.
The experience with Docker complemented my regular development work at VoltDB. I learned about Docker and Jenkins, polished my shell scripting skills, and became familiar with many related parts of the code base.
This project started with an idea proposed during VoltDB’s quarterly innovation week, and it was added to the production workflow thanks to the support and advice of many colleagues. It felt really great to contribute work that benefitted the whole team, and it was truly a rich and unforgettable internship experience at VoltDB.