Our on-prem Kubernetes cluster has been working fine so far, but lately jobs are failing with the error below. I checked, and there are no disk space issues on the Kubernetes master or on the worker nodes. There is plenty of space available under "/var/lib" as well as in the persistent volume claims.
Version:
Client Version: v1.17.2
Server Version: v1.17.2
Host OS:
CentOS 7.7
CNI:
Weave
Error:
The node was low on resource: ephemeral-storage. Container main was using 5056Ki, which exceeds its request of 0. Container wait was using 12Ki, which exceeds its request of 0.
Any pointers would be helpful.
Thanks, CS
The most likely cause is that pod logs or emptyDir usage are filling up your ephemeral storage.
Docker takes a conservative approach to cleaning up unused objects (often referred to as “garbage collection”), such as images, containers, volumes, and networks: these objects are generally not removed unless you explicitly ask Docker to do so. This can cause Docker to use extra disk space.
You can use Docker's prune commands to remove unused objects from the system. If you want to clean up several types of objects at once, use docker system prune. Check here for more about pruning.
There is also another tool called the garbage collector. It is a Docker tool that removes unused, abandoned, or orphaned blobs. Check here for more about it.
In the context of the Docker registry, garbage collection is the process of removing blobs from the filesystem when they are no longer referenced by a manifest. Blobs can include both layers and manifests.
If this doesn't help, you can configure the logging driver in Docker's daemon.json and limit the size of the log files:
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3",
    "labels": "production_status",
    "env": "os,customer"
  }
}
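With these options each container keeps at most max-file log files of roughly max-size each, so the logs of one container are capped at about 3 x 10m. The following is a minimal sketch of that arithmetic, not something Docker provides: the helper names `size_to_bytes` and `log_cap_bytes` are hypothetical, and treating `k`/`m`/`g` as decimal multiples is an assumption made for illustration, so read the result as an estimate.

```python
import json

# Assumption for this sketch: suffixes are decimal multiples (k=10^3, m=10^6, g=10^9).
_UNITS = {"k": 1000, "m": 1000**2, "g": 1000**3}


def size_to_bytes(text):
    """Convert a size string such as '10m' to a number of bytes."""
    text = text.strip().lower()
    if text and text[-1] in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1]]
    return int(text)  # no suffix: already a byte count


def log_cap_bytes(daemon_config, containers=1):
    """Approximate upper bound on json-file log usage for a number of containers."""
    opts = daemon_config["log-opts"]
    per_container = size_to_bytes(opts["max-size"]) * int(opts["max-file"])
    return per_container * containers


config = json.loads(
    '{"log-driver": "json-file", "log-opts": {"max-size": "10m", "max-file": "3"}}'
)
print(log_cap_bytes(config))                 # one container
print(log_cap_bytes(config, containers=50))  # hypothetical node running 50 containers
```

Multiplying by the number of containers a node typically runs gives a rough idea of how much of the node's disk the logs can take after the limit is in place.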
Another option applies if emptyDir is used. With emptyDir, a container is allowed to write any amount of data to its node's filesystem. Note that the error reports a request of 0 for both containers, which means no ephemeral-storage request was set. You can set requests and limits for local ephemeral storage like this:
spec:
  containers:
  - name: test
    image: test-image
    resources:
      requests:
        ephemeral-storage: "1Gi"
      limits:
        ephemeral-storage: "1Gi"
  # container names must be unique within a pod
  - name: test2
    image: test-image2
    resources:
      requests:
        ephemeral-storage: "2Gi"
      limits:
        ephemeral-storage: "2Gi"
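To relate the figures in the eviction message to requests like these, it can help to convert the quantities to bytes and compare them. This is a small sketch with hypothetical helper names (`quantity_to_bytes`, `exceeds_request`). It handles only the common binary and decimal suffixes, not the full Kubernetes quantity syntax.

```python
# Multipliers for the common Kubernetes quantity suffixes (binary and decimal).
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def quantity_to_bytes(quantity):
    """Convert a quantity such as '5056Ki' or '1Gi' to a number of bytes."""
    q = str(quantity).strip()
    # Try two-letter suffixes before one-letter ones.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if q.endswith(suffix):
            return int(float(q[: -len(suffix)]) * _SUFFIXES[suffix])
    return int(float(q))  # a plain number is a byte count


def exceeds_request(usage, request):
    """True if the reported usage is larger than the configured request."""
    return quantity_to_bytes(usage) > quantity_to_bytes(request)


# Values taken from the error message and from the example spec.
print(exceeds_request("5056Ki", "0"))    # usage against a request of 0
print(exceeds_request("5056Ki", "1Gi"))  # same usage against a 1Gi request
```

With a request of 0, any usage at all counts as exceeding the request, which is what the message reports for both containers.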
You can also list the running containers with docker ps, then inspect a container yourself and locate its files on the node.
The log file should be at this location:
/var/lib/docker/containers/<container-id>/<container-id>-json.log
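To see which containers have the largest logs, a short script can walk that directory and sort the log files by size. This is only a sketch: the function name `largest_files` is hypothetical, it assumes the default Docker data root shown above, and it usually needs to run as root on the node to read that directory.

```python
import os


def largest_files(root, suffix="-json.log", top=5):
    """Return (size_in_bytes, path) pairs for the biggest matching files under root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                path = os.path.join(dirpath, name)
                try:
                    found.append((os.path.getsize(path), path))
                except OSError:
                    continue  # file disappeared or is not readable
    found.sort(reverse=True)  # largest first
    return found[:top]


if __name__ == "__main__":
    # Assumed default location; adjust if Docker uses a different data root.
    for size, path in largest_files("/var/lib/docker/containers"):
        print(f"{size / 2**20:8.1f} MiB  {path}")
```

If the directory does not exist or cannot be read, the script simply prints nothing.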
Let me know if that helps.