Minikube NodeUnderDiskPressure issue

8/11/2018

I'm constantly running into NodeUnderDiskPressure on pods running in Minikube. Using minikube ssh and running df -h, I see at most 50% usage on any mount: one is at 50% and the other five are under 10%.

$ df -h
Filesystem      Size  Used Avail Use% Mounted on
rootfs          7.3G  503M  6.8G   7% /
devtmpfs        7.3G     0  7.3G   0% /dev
tmpfs           7.4G     0  7.4G   0% /dev/shm
tmpfs           7.4G  9.2M  7.4G   1% /run
tmpfs           7.4G     0  7.4G   0% /sys/fs/cgroup
/dev/sda1        17G  7.5G  7.8G  50% /mnt/sda1
$ df -ih
Filesystem     Inodes IUsed IFree IUse% Mounted on
rootfs           1.9M  4.1K  1.9M    1% /
devtmpfs         1.9M   324  1.9M    1% /dev
tmpfs            1.9M     1  1.9M    1% /dev/shm
tmpfs            1.9M   657  1.9M    1% /run
tmpfs            1.9M    14  1.9M    1% /sys/fs/cgroup
/dev/sda1        9.3M  757K  8.6M    8% /mnt/sda1

The problem usually just goes away after 1-5 minutes. Strangely, restarting Minikube doesn't seem to speed this up. I've also tried removing all evicted pods (as sketched below), but again, disk usage doesn't actually look very high.
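For reference, the evicted-pod cleanup looks roughly like this (a sketch for the default namespace only; kubectl get pods -a was how non-running pods were listed on the kubectl version in use here):

$ kubectl get pods -a | grep Evicted | awk '{print $1}' | xargs kubectl delete pod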

The Docker images I'm using are just under 2GB, and I'm only spinning up a few of them, so that should still leave plenty of headroom.

Here's some kubectl describe output:

$ kubectl describe po/consumer-lag-reporter-3832025036-wlfnt
Name:           consumer-lag-reporter-3832025036-wlfnt
Namespace:      default
Node:           <none>
Labels:         app=consumer-lag-reporter
                pod-template-hash=3832025036
                tier=monitor
                type=monitor
Annotations:    kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"default","name":"consumer-lag-reporter-3832025036","uid":"342b0f72-9d12-11e8-a735...
Status:         Pending
IP:
Created By:     ReplicaSet/consumer-lag-reporter-3832025036
Controlled By:  ReplicaSet/consumer-lag-reporter-3832025036
Containers:
  consumer-lag-reporter:
    Image:  avery-image:latest
    Port:   <none>
    Command:
      /bin/bash
      -c
    Args:
      newrelic-admin run-program python manage.py lag_reporter_runner --settings-module project.settings
    Environment Variables from:
      local-config  ConfigMap  Optional: false
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-sjprm (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  default-token-sjprm:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-sjprm
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     <none>
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  15s (x7 over 46s)  default-scheduler  No nodes are available that match all of the following predicates:: NodeUnderDiskPressure (1).
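For anyone debugging the same scheduling failure: the DiskPressure condition the scheduler is reacting to is reported on the node itself, so it can be inspected directly (a sketch, assuming the default node name minikube). The Events section of the node's describe output also shows why the kubelet set the condition.

$ kubectl describe node minikube | grep -A 10 Conditions
$ kubectl get node minikube -o jsonpath='{.status.conditions[?(@.type=="DiskPressure")].status}'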

Is this a bug? Anything else I can do to debug this?

-- s g
kubernetes
minikube

1 Answer

8/23/2018

I tried the following (sketched below):

  1. Cleaning up evicted pods (with kubectl get pods -a)
  2. Cleaning up unused images (with minikube ssh + docker images)
  3. Cleaning up all non-running containers (with minikube ssh + docker ps -a)
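Steps 2 and 3 were done inside the Minikube VM. Roughly (a sketch; the prune commands assume the Docker daemon in the Minikube ISO is new enough, Docker 1.13+, to support them):

$ minikube ssh
$ docker container prune        # inside the VM: remove all stopped containers
$ docker image prune -a         # inside the VM: remove images not referenced by any container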

Disk usage remained low, as shown in my question. What finally solved it was recreating the Minikube cluster with a larger --disk-size. The key thing to note is that even though df showed I was barely using any disk, giving the VM a bigger disk still made the problem go away.
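Concretely, the recreation looks something like this (a sketch; --disk-size only takes effect when the VM is first created, which is why the delete is needed, and 50g is just an example value):

$ minikube delete
$ minikube start --disk-size=50g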

-- s g
Source: StackOverflow