Understanding usability of volumes

7/2/2019

I come from the old world of monolithic applications that write their data to a database or, sometimes, to files. Those files are usually stored in a subdirectory of the app directory, like app/data.

Now with Docker there is the concept of volumes.

I'm trying to understand:

  1. When should I use volumes?
  2. Why store my files in a volume rather than in a subdirectory of the app directory like app/data?
  3. Do I need to change the app code to write to the volume instead of app/data, or, once the app is hosted by Docker, will all files written to relative paths be saved in the mounted volume automatically?
-- Alon
docker
kubernetes
microservices

2 Answers

7/2/2019

You mount a volume into a container when you start it up. The program inside the container has no idea whether it's using a volume. Anything you read from or write to the volume path automatically uses the volume.
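To illustrate that the program does not need to know about the volume, here is a minimal sketch of an app that uses plain file I/O on a relative path. The names `DATA_DIR`, `save_note`, and `load_note` are hypothetical and only for illustration; nothing in the code refers to Docker.

```python
from pathlib import Path

# Hypothetical data directory, relative to the process's working directory.
DATA_DIR = Path("app/data")


def save_note(name: str, text: str, data_dir: Path = DATA_DIR) -> Path:
    """Write text to a file under data_dir using ordinary file I/O."""
    # Create the directory if it does not exist yet (no-op if it does,
    # for example when a volume is already mounted there).
    data_dir.mkdir(parents=True, exist_ok=True)
    target = data_dir / name
    target.write_text(text, encoding="utf-8")
    return target


def load_note(name: str, data_dir: Path = DATA_DIR) -> str:
    """Read the text back from the same directory."""
    return (data_dir / name).read_text(encoding="utf-8")
```

A relative path such as app/data is resolved against the process's working directory, so the files land in the volume only if the volume is mounted at the directory that path resolves to. For example, assuming the container's working directory is / and the container was started with something like `docker run -v mydata:/app/data myimage` (both `mydata` and `myimage` are placeholder names), `save_note("a.txt", "hello")` would write into the volume with no code changes.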

If it's an option, you should avoid storing data in volumes at all; prefer databases if you can. If all of your data is outside the container then it's very easy to run multiple copies of a container on multiple hosts. If you're storing some data in volumes then as you scale up your application you need to worry about preserving, replicating, and moving that data.

Good uses for volumes include storing local persistent data that needs to be kept across container restarts. Host bind mounts are extremely similar (IMHO they are easier to manage, but there are permission and performance problems on some platforms) and also can fill this use case; host bind mounts are also good for injecting config files into applications and reading log files back out.

(For non-Docker reasons it can be convenient to store data outside your application's directory, e.g. /var/lib/myapp. This is less relevant in Docker where you update your code by building a new image, starting a new container, and then mounting the volume over the filesystem somewhere. It doesn't really matter if your data is under your app directory or not.)


Also, you tagged this with "Kubernetes". All of the above applies there too (when I say "scale up" I'm pretty specifically thinking of Kubernetes). Kubernetes persistent volumes can be slightly trickier to use than Docker named volumes; avoid hostPath type volumes (they won't be consistent across multiple nodes). You may need to use StatefulSets instead of Deployments to give each Pod its own PersistentVolume. Getting direct access to PV contents is even harder than it is in Docker. Conversely, there are other mechanisms like ConfigMaps for some tasks like injecting config files. Stay far, far away from developer-oriented patterns that try to bind-mount application code into containers; doing that is much harder than just rebuilding your image when you need to.

-- David Maze
Source: StackOverflow

7/2/2019
  1. You need volumes for stateful applications, that is, to persist data/state.

  2. You can still use the same app/data path, but make sure that path is mounted on a persistent volume, so that the data persists when the container is gone. This way you don't have to change much.

  3. No, you don't need to change anything if you mount app/data as a volume. The application doesn't need to know about this. Just make sure app/data is a volume.
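If you want the app to double-check at startup that its data directory really is a mount, a rough sketch could look like the following. The function name `describe_data_dir` and its return values are hypothetical. It relies on `os.path.ismount`, which is only a heuristic: on POSIX it can return False for a bind mount that comes from the same filesystem as its parent directory, so treat a negative result as a hint, not proof.

```python
import os


def describe_data_dir(path: str) -> str:
    """Classify a directory as 'missing', 'mounted', or 'not-mounted'.

    Heuristic only: os.path.ismount may miss bind mounts that share a
    filesystem with the parent directory.
    """
    if not os.path.isdir(path):
        return "missing"
    if os.path.ismount(path):
        return "mounted"
    return "not-mounted"
```

For example, the app could log a warning when `describe_data_dir("/app/data")` does not return "mounted", since files written there may then be lost when the container is removed.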

-- Ijaz Ahmad Khan
Source: StackOverflow