Docker image running out of overlay space and causing pod evictions on GKE

3/22/2020

I have created a Docker image that runs a Java task every minute (scheduled via an ExecutorService) and prints its output to the screen using a logger.

Most of the time, the output looks like the following: No requests found, sleeping for a minute
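
The scheduling loop is essentially the following (a minimal sketch, not the actual code; the class name and log message placement are placeholders):

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.log4j.Logger;

public class IngestionPoller {
    private static final Logger log = Logger.getLogger(IngestionPoller.class);

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // Runs the check once a minute; most iterations find nothing and only log.
        scheduler.scheduleAtFixedRate(
                () -> log.info("No requests found, sleeping for a minute"),
                0, 1, TimeUnit.MINUTES);
    }
}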

It is running on Google Kubernetes Engine (GKE) on Google Cloud Platform. I am running into an issue where my pod gets evicted every 5 hours with the following error:

The node was low on resource: ephemeral-storage. Container ingestion-pager-remediation was using 216Ki, which exceeds its request of 0.

What is the cause of this? At first, I thought it was caused by unclosed InputStreams/OutputStreams and HttpConnections. I went through the code and made sure every stream and connection is closed, but disk usage still increases over time.
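
For reference, each request now follows this close pattern (a simplified sketch; the class name and URL are placeholders):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class RequestClient {
    static String fetchOnce() throws Exception {
        // Placeholder URL; the real endpoint differs.
        HttpURLConnection conn =
                (HttpURLConnection) new URL("https://example.com/api").openConnection();
        // try-with-resources guarantees the reader (and its underlying InputStream) is closed
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line);
            }
            return body.toString();
        } finally {
            conn.disconnect(); // release the underlying socket as well
        }
    }
}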

When I look at disk usage, I see that my overlay usage is increasing over time. Below are two df -h snapshots taken some time apart; note the "Used" column growing even though only a single Java process is running.

/ # df -h
Filesystem                Size      Used Available Use% Mounted on
overlay                 291.2G    172.3G    118.8G  59% /
tmpfs                    64.0M         0     64.0M   0% /dev
tmpfs                   102.3G         0    102.3G   0% /sys/fs/cgroup
/dev/sda1               291.2G    172.3G    118.8G  59% /dev/termination-log
/dev/sda1               291.2G    172.3G    118.8G  59% /mount/javakeystore
/dev/sda1               291.2G    172.3G    118.8G  59% /mount/json
/dev/sda1               291.2G    172.3G    118.8G  59% /etc/resolv.conf
/dev/sda1               291.2G    172.3G    118.8G  59% /etc/hostname
/dev/sda1               291.2G    172.3G    118.8G  59% /etc/hosts
shm                      64.0M         0     64.0M   0% /dev/shm
tmpfs                   102.3G     12.0K    102.3G   0% /run/secrets/kubernetes.io/serviceaccount
tmpfs                   102.3G         0    102.3G   0% /proc/acpi
tmpfs                    64.0M         0     64.0M   0% /proc/kcore
tmpfs                    64.0M         0     64.0M   0% /proc/keys
tmpfs                    64.0M         0     64.0M   0% /proc/timer_list
tmpfs                   102.3G         0    102.3G   0% /proc/scsi
tmpfs                   102.3G         0    102.3G   0% /sys/firmware
/ # df -h
Filesystem                Size      Used Available Use% Mounted on
overlay                 291.2G    172.4G    118.8G  59% /
tmpfs                    64.0M         0     64.0M   0% /dev
tmpfs                   102.3G         0    102.3G   0% /sys/fs/cgroup
/dev/sda1               291.2G    172.4G    118.8G  59% /dev/termination-log
/dev/sda1               291.2G    172.4G    118.8G  59% /mount/javakeystore
/dev/sda1               291.2G    172.4G    118.8G  59% /mount/json
/dev/sda1               291.2G    172.4G    118.8G  59% /etc/resolv.conf
/dev/sda1               291.2G    172.4G    118.8G  59% /etc/hostname
/dev/sda1               291.2G    172.4G    118.8G  59% /etc/hosts
shm                      64.0M         0     64.0M   0% /dev/shm
tmpfs                   102.3G     12.0K    102.3G   0% /run/secrets/kubernetes.io/serviceaccount
tmpfs                   102.3G         0    102.3G   0% /proc/acpi
tmpfs                    64.0M         0     64.0M   0% /proc/kcore
tmpfs                    64.0M         0     64.0M   0% /proc/keys
tmpfs                    64.0M         0     64.0M   0% /proc/timer_list
tmpfs                   102.3G         0    102.3G   0% /proc/scsi
tmpfs                   102.3G         0    102.3G   0% /sys/firmware

Here is the only thing running in the pod:

/ # ps -Af
PID   USER     TIME  COMMAND
    1 root      0:27 java -jar -Dlog4j.configurationFile=/ingestion-pager-remediation/log4j.properties -Dorg.slf4j.simpleLogger.defaultLogLevel=info /ingestion-pager-remediation/ingest-pa
  222 root      0:00 sh
  250 root      0:00 ps -Af

As stated before, this is a simple Java program which makes a few HTTP connections, then sleeps.

Does anyone know why my overlay usage would keep increasing toward 300 GB over time?

(Edit)

I only log to stdout using this debug configuration:

log4j.rootLogger=DEBUG, STDOUT
log4j.logger.deng=INFO
log4j.appender.STDOUT=org.apache.log4j.ConsoleAppender
log4j.appender.STDOUT.layout=org.apache.log4j.PatternLayout
log4j.appender.STDOUT.layout.ConversionPattern=%5p [%t] (%F:%L) - %m%n
org.slf4j.simpleLogger.defaultLogLevel = info
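
With that PatternLayout, each iteration writes a single line to stdout along these lines (illustrative only; the thread name and source location will differ):

 INFO [pool-1-thread-1] (IngestionPoller.java:13) - No requests found, sleeping for a minute
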
-- Joe Devilla
docker
google-kubernetes-engine
java
kubernetes

0 Answers