Containerizing server workloads is increasingly popular, and web servers in particular are now commonly deployed in containers. Can the same benefits be applied to databases?
Can Docker handle stateful workloads?
It’s best to start with another question: Can you even run a database in Docker? In general, Docker is not designed for stateful services. One of the main selling points of containers is that they can be stopped and started at will, and they usually connect to an authoritative data source such as a database to save their state. Any data written inside the container is ephemeral and is destroyed when the container is deleted.
This makes stateful workloads especially difficult to run, but luckily Docker has a few tools for dealing with state: volumes and bind mounts. These let you map a location on the host machine to a location in the container, so the data persists even if the container is shut down or removed. That way, you can run containers over the long term without worrying about data loss.
Volumes are the preferred method for most scenarios. You create a volume, and Docker manages where it is stored on the host:
docker volume create my-volume
Then attach this volume to a destination in the container when you start it (my-image here is a placeholder for whichever image you want to run):
docker run --mount source=my-volume,target=/app my-image
Bind mounts are simpler. Volumes use them under the hood, but with a bind mount you set the location on the host drive yourself instead of having Docker manage it. For example, assuming the nginx image:
docker run -v ~/nginxlogs:/var/log/nginx nginx
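For a database, the same idea might look like the sketch below, which keeps PostgreSQL's data directory on a named volume so the data outlives any single container. The function name, the volume and container names, and the password are illustrative placeholders. The mount target is the data directory used by the official postgres:16 image; other images and versions may use a different path.

```shell
# Sketch: run PostgreSQL with its data directory on a Docker-managed volume.
# The names (pgdata, my-postgres) and the password are placeholders.
run_postgres_with_volume() {
  volume="${1:-pgdata}"
  container="${2:-my-postgres}"

  # Create the named volume (harmless if it already exists).
  docker volume create "$volume" >/dev/null

  # Mount the volume where the official postgres:16 image keeps its data.
  docker run -d \
    --name "$container" \
    --mount "source=$volume,target=/var/lib/postgresql/data" \
    -e POSTGRES_PASSWORD=change-me \
    postgres:16
}

# Example call (requires a running Docker daemon):
#   run_postgres_with_volume pgdata my-postgres
```

Removing the container afterwards with docker rm -f leaves the named volume in place, so a new container started the same way should pick up the existing data.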
In practice, these mounts can be a little more complicated to use. Many managed Docker services, like AWS …
Should you choose Docker for your database?
Docker is generally not well suited to handling state, and Docker-based workloads typically offload that problem to a database. Given that the database is the solution to the problem, does it make sense to put the database itself in Docker?
For the most part, the answer is “usually not”. Docker has come a long way since its inception, and containerizing databases is no longer a terrible or “wrong” idea. It can certainly be done and has several advantages. However, for most general workloads, the benefits do not outweigh the complications.
To see why, let’s take a look at the benefits Docker offers:
- Easy scaling: Servers can be quickly created and destroyed to meet demand
- Simpler CI/CD tooling: Automated builds are trivial
- Infrastructure as code: All underlying libraries and configuration are managed in the Dockerfile
Most of these do not translate well to database workloads, which are often long-running and prioritize data integrity above all. In general, you don’t want to auto-scale most databases. They usually do not receive regular code updates themselves and therefore benefit less from running in containers. And if you are only mounting a local storage drive anyway, you might as well run the database outside of Docker.
If you want to get away from the complexity of managing databases, Docker isn’t the tool for the job. It just adds unnecessary complication to a workload that can easily run on a standard VPS. You’re probably much better off with a fully managed database as a service like AWS’s RDS, which brings in much of the automation that Docker is good for without the headache of doing it yourself.
The main place Docker can be useful for database workloads is in development environments. Docker makes it easier to start up new databases with different configurations, which enables quick testing. In production, however, the rules are generally stricter.
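As a rough illustration of that development workflow, the sketch below starts a disposable PostgreSQL container whose version, host port, and server settings can be varied per test run. The function name, container names, ports, and password are hypothetical. It assumes the official postgres image, which passes any arguments after the image name on to the postgres server.

```shell
# Sketch: start a disposable PostgreSQL container for local testing.
# Usage: start_dev_db NAME IMAGE_TAG HOST_PORT [extra postgres flags...]
start_dev_db() {
  name="$1"; tag="$2"; port="$3"
  shift 3

  # --rm removes the container and its anonymous data volume once it stops,
  # so every run starts from an empty database.
  docker run -d --rm \
    --name "$name" \
    -e POSTGRES_PASSWORD=dev-only-password \
    -p "$port:5432" \
    "postgres:$tag" "$@"
}

# Example calls (require a running Docker daemon):
#   start_dev_db pg15 15 5433
#   start_dev_db pg16-tuned 16 5434 -c max_connections=200
```

Because nothing is mounted from the host, stopping one of these containers with docker stop discards its data, which is usually what you want for quick tests and not what you want in production.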