I would like the containers in my pod to share a volume for temporary (cached) data. I don't mind if the data is lost when the pod terminates (in fact, I want the data deleted and space reclaimed).
The Kubernetes docs make an emptyDir sound like what I want:
An emptyDir volume is first created when a Pod is assigned to a Node, and exists as long as that Pod is running on that node
.. and
By default, emptyDir volumes are stored on whatever medium is backing the node - that might be disk or SSD or network storage, depending on your environment. However, you can set the emptyDir.medium field to "Memory" to tell Kubernetes to mount a tmpfs (RAM-backed filesystem) for you instead
That sounds like the default behaviour is to store the volume on disk, unless I explicitly request in-memory.
However, if I create the following pod on my GKE cluster:
apiVersion: v1
kind: Pod
metadata:
  name: alpine
spec:
  containers:
  - name: alpine
    image: alpine:3.7
    command: ["/bin/sh", "-c", "sleep 60m"]
    volumeMounts:
    - name: foo
      mountPath: /foo
  volumes:
  - name: foo
    emptyDir: {}
.. and then open a shell on the pod and write a 2 GB file to the volume:
kubectl exec -it alpine -- /bin/sh
$ cd foo/
$ dd if=/dev/zero of=file.txt count=2048 bs=1048576
Then I can see in the GKE web console that the RAM usage of the container has increased by 2 GB.
It looks to me like GKE stores emptyDir volumes in memory by default. The workload I plan to run needs plenty of memory, so I'd like the emptyDir volume to be backed by disk. Is that possible? The GKE storage docs don't have much to say on the issue.
An alternative approach might be to use local SSDs for my cached data. However, if I mount them as recommended in the GKE docs, they're shared by all pods running on the same node and the data isn't cleaned up on pod termination, which doesn't meet my goal of automatically managed resources.
Here's the output of df -h inside the container:
# df -h
Filesystem Size Used Available Use% Mounted on
overlay 96.9G 26.2G 70.7G 27% /
overlay 96.9G 26.2G 70.7G 27% /
tmpfs 7.3G 0 7.3G 0% /dev
tmpfs 7.3G 0 7.3G 0% /sys/fs/cgroup
/dev/sda1 96.9G 26.2G 70.7G 27% /foo
/dev/sda1 96.9G 26.2G 70.7G 27% /dev/termination-log
/dev/sda1 96.9G 26.2G 70.7G 27% /etc/resolv.conf
/dev/sda1 96.9G 26.2G 70.7G 27% /etc/hostname
/dev/sda1 96.9G 26.2G 70.7G 27% /etc/hosts
shm 64.0M 0 64.0M 0% /dev/shm
tmpfs 7.3G 12.0K 7.3G 0% /run/secrets/kubernetes.io/serviceaccount
tmpfs 7.3G 0 7.3G 0% /proc/kcore
tmpfs 7.3G 0 7.3G 0% /proc/timer_list
tmpfs 7.3G 0 7.3G 0% /proc/sched_debug
tmpfs 7.3G 0 7.3G 0% /sys/firmware
I discovered it's possible to SSH into the node instance, and I was able to find the 2 GB file on the node filesystem:
root@gke-cluster-foo-pool-b-22bb9925-xs5p:/# find . -name file.txt
./var/lib/kubelet/pods/79ad1aa4-4441-11e8-af32-42010a980039/volumes/kubernetes.io~empty-dir/foo/file.txt
Now that I can see the file is being written to the underlying filesystem, I'm wondering whether the RAM usage I'm seeing in the GKE web UI is the Linux filesystem cache or similar, rather than the file being stored in a RAM disk?
From the mount information you've supplied, the emptyDir volume is mounted on a drive partition, so it's working as intended and isn't mounted in memory. The memory usage you see is most likely the filesystem buffer cache: the kernel keeps freshly written pages in RAM, writes them back to disk in the background, and evicts them once there is memory pressure. Given that you have so much free memory, the system has had no need to evict them yet.
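If you'd rather check the backing store directly than read it off df, something like the following could be run inside the container. It's only a sketch: the helper name mount_backing is made up for illustration, and it assumes awk and a readable /proc/mounts are available in the image.

```shell
# Print the device and filesystem type backing a mount point.
# Usage: mount_backing <mount-point> [mounts-file]
# The second argument defaults to /proc/mounts and exists mainly for testing.
mount_backing() {
    awk -v mp="$1" '$2 == mp { print $1, $3 }' "${2:-/proc/mounts}"
}

# /foo is the mountPath from the pod spec in the question.
if [ -r /proc/mounts ]; then
    mount_backing /foo
fi
```

A disk-backed emptyDir should report a block device such as the /dev/sda1 in your df output, whereas one created with medium "Memory" should report tmpfs.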
If you still have doubts, run sync followed by echo 3 > /proc/sys/vm/drop_caches on the node to flush dirty data to disk and drop the cached pages. You should see a drop in memory usage.
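To see how much of the container's memory is page cache rather than process memory, you could also read the cgroup's memory.stat from inside the container. This is a minimal sketch, not a definitive tool: the helper name mem_breakdown is made up for illustration, and the two file locations (the usual cgroup v1 and v2 paths) are assumptions about how your node is set up.

```shell
# Summarise a cgroup memory.stat file: page cache vs. anonymous memory, in MiB.
# cgroup v1 names the fields "cache" and "rss"; cgroup v2 names them "file" and "anon".
# Exact matches on $1 skip related fields such as total_cache or rss_huge.
mem_breakdown() {
    awk '
        $1 == "cache" || $1 == "file" { cache = $2 }
        $1 == "rss"   || $1 == "anon" { anon = $2 }
        END { printf "page_cache_mib=%d anon_mib=%d\n", cache / 1048576, anon / 1048576 }
    ' "$1"
}

# Try the usual cgroup v1 and v2 locations (assumed paths); use the first one found.
for f in /sys/fs/cgroup/memory/memory.stat /sys/fs/cgroup/memory.stat; do
    if [ -r "$f" ]; then
        mem_breakdown "$f"
        break
    fi
done
```

If page_cache_mib is roughly the size of the file you wrote and shrinks after the sync and drop_caches step, the usage was cache. Bear in mind that tmpfs pages are counted under the same field, so this complements the mount information rather than replacing it.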