AWS EKS - Container with error "2 nodes out of disk space"

11/26/2018

I deployed 6 containers and ran them on AWS EKS. However, after running for a period of time, the log shows an error: "2 node(s) were out of disk space". I tried to delete the containers and rebuild them, but the same error keeps happening. Does anyone have a solution?

kubectl delete pod $image_name --namespace=xxx
kubectl describe pod $name --namespace=xxx
kubectl describe pod $image_name --namespace=xxx

Name:           image_name
Namespace:      xxx
Node:           <none>
Labels:         app=label
Annotations:    <none>
Status:         Pending
IP:
Controlled By:  ReplicationController/label
Containers:
  label-container:
    Image:      image_name
    Port:       8084/TCP
    Host Port:  0/TCP
    Environment:
      SPRING_PROFILES_ACTIVE:  uatsilver
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-kv27l (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  default-token-kv27l:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-kv27l
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  10s (x7 over 41s)  default-scheduler  0/3 nodes are available: 1 Insufficient pods, 2 node(s) were not ready, 2 node(s) were out of disk space.
-- Mimi Law
aws-eks
disk
kubernetes

1 Answer

11/27/2018

Kubernetes fails to schedule your pods because the nodes are out of disk space. As Rafaf suggested in a comment, you should increase your nodes' disk space: deleting pods and relaunching them won't fix the disk space constraint on the nodes hosting/running those pods.
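
To confirm which nodes are affected, a quick check is to look at the node conditions kubectl reports; <node-name> below is a placeholder:

kubectl get nodes
kubectl describe node <node-name>
# under "Conditions", look for OutOfDisk / DiskPressure set to True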

If you used the standard/default CloudFormation template from the documentation to create your worker nodes, just bump up the NodeVolumeSize parameter: by default it's a 20 GiB EBS volume per node. You can make it as big as you need.
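
For illustration, assuming the worker nodes came from that template, an update could look roughly like this; the stack name is a placeholder, the value is in GiB, and every other template parameter would need its own ParameterKey=<name>,UsePreviousValue=true entry as well:

aws cloudformation update-stack \
  --stack-name <worker-node-stack> \
  --use-previous-template \
  --capabilities CAPABILITY_IAM \
  --parameters ParameterKey=NodeVolumeSize,ParameterValue=100

Note that the larger volume size typically only applies to newly launched instances, so existing nodes may need to be replaced by the Auto Scaling group before they pick it up.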

Also, you'll want to double-check what is actually eating that much disk on the nodes! Usually, logs are rotated properly and you shouldn't face situations like that unless you're writing data yourself (through your pods).
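
If you can reach a node (e.g. over SSH), a simple sketch of that check, assuming the default Docker runtime with its data under /var/lib/docker:

df -h                          # overall filesystem usage
sudo du -sh /var/lib/docker    # images, container layers and logs usually live here
sudo docker system prune -f    # optionally reclaim space from unused images/containers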

-- Clorichel
Source: StackOverflow