kubelet.service: Unit entered failed state in not ready state node error from kubernetes cluster

1/3/2020

I am trying to deploy an springboot microservices in kubernetes cluster having 1 master and 2 worker node. When I am trying to get the node state using the command sudo kubectl get nodes, I am getting one of my worker node is not ready. It showing not ready in status.

When I am applying to troubleshoot the following command,

sudo journalctl -u kubelet

I am getting response like kubelet.service: Unit entered failed state and kubelet service stopped. The following is the response what I am getting when applying the command sudo journalctl -u kubelet.

-- Logs begin at Fri 2020-01-03 04:56:18 EST, end at Fri 2020-01-03 05:32:47 EST. --
Jan 03 04:56:25 MILDEVKUB050 systemd[1]: Started kubelet: The Kubernetes Node Agent.
Jan 03 04:56:31 MILDEVKUB050 kubelet[970]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --confi
Jan 03 04:56:31 MILDEVKUB050 kubelet[970]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --confi
Jan 03 04:56:32 MILDEVKUB050 kubelet[970]: I0103 04:56:32.053962     970 server.go:416] Version: v1.17.0
Jan 03 04:56:32 MILDEVKUB050 kubelet[970]: I0103 04:56:32.084061     970 plugins.go:100] No cloud provider specified.
Jan 03 04:56:32 MILDEVKUB050 kubelet[970]: I0103 04:56:32.235928     970 server.go:821] Client rotation is on, will bootstrap in background
Jan 03 04:56:32 MILDEVKUB050 kubelet[970]: I0103 04:56:32.280173     970 certificate_store.go:129] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-curre
Jan 03 04:56:38 MILDEVKUB050 kubelet[970]: I0103 04:56:38.107966     970 server.go:641] --cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /
Jan 03 04:56:38 MILDEVKUB050 kubelet[970]: F0103 04:56:38.109401     970 server.go:273] failed to run Kubelet: running with swap on is not supported, please disable swa
Jan 03 04:56:38 MILDEVKUB050 systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a
Jan 03 04:56:38 MILDEVKUB050 systemd[1]: kubelet.service: Unit entered failed state.
Jan 03 04:56:38 MILDEVKUB050 systemd[1]: kubelet.service: Failed with result 'exit-code'.
Jan 03 04:56:48 MILDEVKUB050 systemd[1]: kubelet.service: Service hold-off time over, scheduling restart.
Jan 03 04:56:48 MILDEVKUB050 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Jan 03 04:56:48 MILDEVKUB050 systemd[1]: Started kubelet: The Kubernetes Node Agent.
Jan 03 04:56:48 MILDEVKUB050 kubelet[1433]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --conf
Jan 03 04:56:48 MILDEVKUB050 kubelet[1433]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --conf
Jan 03 04:56:48 MILDEVKUB050 kubelet[1433]: I0103 04:56:48.901632    1433 server.go:416] Version: v1.17.0
Jan 03 04:56:48 MILDEVKUB050 kubelet[1433]: I0103 04:56:48.907654    1433 plugins.go:100] No cloud provider specified.
Jan 03 04:56:48 MILDEVKUB050 kubelet[1433]: I0103 04:56:48.907806    1433 server.go:821] Client rotation is on, will bootstrap in background
Jan 03 04:56:48 MILDEVKUB050 kubelet[1433]: I0103 04:56:48.947107    1433 certificate_store.go:129] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-curr
Jan 03 04:56:49 MILDEVKUB050 kubelet[1433]: I0103 04:56:49.263777    1433 server.go:641] --cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to
Jan 03 04:56:49 MILDEVKUB050 kubelet[1433]: F0103 04:56:49.264219    1433 server.go:273] failed to run Kubelet: running with swap on is not supported, please disable sw
Jan 03 04:56:49 MILDEVKUB050 systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a
Jan 03 04:56:49 MILDEVKUB050 systemd[1]: kubelet.service: Unit entered failed state.
Jan 03 04:56:49 MILDEVKUB050 systemd[1]: kubelet.service: Failed with result 'exit-code'.
Jan 03 04:56:59 MILDEVKUB050 systemd[1]: kubelet.service: Service hold-off time over, scheduling restart.
Jan 03 04:56:59 MILDEVKUB050 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Jan 03 04:56:59 MILDEVKUB050 systemd[1]: Started kubelet: The Kubernetes Node Agent.
Jan 03 04:56:59 MILDEVKUB050 kubelet[1500]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --conf
Jan 03 04:56:59 MILDEVKUB050 kubelet[1500]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --conf
Jan 03 04:56:59 MILDEVKUB050 kubelet[1500]: I0103 04:56:59.712729    1500 server.go:416] Version: v1.17.0
Jan 03 04:56:59 MILDEVKUB050 kubelet[1500]: I0103 04:56:59.714927    1500 plugins.go:100] No cloud provider specified.
Jan 03 04:56:59 MILDEVKUB050 kubelet[1500]: I0103 04:56:59.715248    1500 server.go:821] Client rotation is on, will bootstrap in background
Jan 03 04:56:59 MILDEVKUB050 kubelet[1500]: I0103 04:56:59.763508    1500 certificate_store.go:129] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-curr
Jan 03 04:56:59 MILDEVKUB050 kubelet[1500]: I0103 04:56:59.956706    1500 server.go:641] --cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to
Jan 03 04:56:59 MILDEVKUB050 kubelet[1500]: F0103 04:56:59.957078    1500 server.go:273] failed to run Kubelet: running with swap on is not supported, please disable sw
Jan 03 04:56:59 MILDEVKUB050 systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a
Jan 03 04:56:59 MILDEVKUB050 systemd[1]: kubelet.service: Unit entered failed state.
Jan 03 04:56:59 MILDEVKUB050 systemd[1]: kubelet.service: Failed with result 'exit-code'.
Jan 03 04:57:10 MILDEVKUB050 systemd[1]: kubelet.service: Service hold-off time over, scheduling restart.
Jan 03 04:57:10 MILDEVKUB050 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Jan 03 04:57:10 MILDEVKUB050 systemd[1]: Started kubelet: The Kubernetes Node Agent.

log file: service: Unit entered failed state

I tried by restarting the kubelet. But still there is no change in node state. Not ready state only.

Updates

When I am trying the command systemctl list-units --type=swap --state=active , then I am getting the following response,

docker@MILDEVKUB040:~$ systemctl list-units --type=swap --state=active
UNIT                                            LOAD   ACTIVE SUB    DESCRIPTION
dev-mapper-MILDEVDCR01\x2d\x2dvg\x2dswap_1.swap loaded active active /dev/mapper/MILDEVDCR01--vg-swap_1

Important

When I am getting these kind of issue with node not ready, each time I need to disable the swap and need to reload the daemon and kubelet. After that node becomes ready state. And again I need to repeat the same.

How can I find a permanent solution for this?

-- Jacob
kubelet
kubernetes

2 Answers

1/11/2020

The root cause is the swap space. To disable completely follow steps:

  • run swapoff -a: this will immediately disable swap but will activate on restart
  • remove any swap entry from /etc/fstab reboot the system.

If the swap is gone, good. If, for some reason, it is still here, you had to remove the swap partition. Repeat steps 1 and 2 and, after that, use fdisk or parted to remove the (now unused) swap partition. Use great care here: removing the wrong partition will have disastrous effects!

  • reboot

This should resolve your issue.

-- Umesh Kumhar
Source: StackOverflow

1/3/2020

failed to run Kubelet: running with swap on is not supported, please disable swap

You need to disable swap on the system for kubelet to work. You can disable swap with sudo swapoff -a

For systemd based systems, there is another way of enabling swap partitions using swap units which gets enabled whenever systemd reloads even if you have turned off swap using swapoff -a

https://www.freedesktop.org/software/systemd/man/systemd.swap.html

Check if you have any swap units using systemctl list-units --type=swap --state=active

You can permanently disable any active swap unit with systemctl mask <unit name>.

Note: Do not use systemctl disable <unit name> to disable the swap unit as swap unit will be activated again when systemd reloads. Use systemctl mask <unit name> only.

To make sure swap doesn't get re-enabled when your system reboots due to power cycle or any other reason, remove or comment out the swap entries in /etc/fstab

Summarizing:

  1. Run sudo swapoff -a

  2. Check if you have swap units with command systemctl list-units --type=swap --state=active. If there are any active swap units, mask them using systemctl mask <unit name>

  3. Remove swap entries in /etc/fstab

-- Shashank V
Source: StackOverflow