How to remove broken nodes in Kubernetes

5/9/2019

I have a Kubernetes cluster with one master and two nodes. For some reason, one node became unreachable from the cluster, so all of its pods were moved to the other node. The problem is that the broken node remains in the cluster, but I think the master should remove the node automatically and create another one.

Can anyone help me?

-- mgg
kubernetes

3 Answers

5/9/2019
  1. Cordon the node
  2. Drain the node
  3. Delete the node
  4. Reset the node (run kubeadm reset if it was joined using kubeadm)
  5. Join the node again as a fresh node
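
As a rough illustration, steps 1 to 3 could be wrapped like this. It is a minimal sketch, assuming kubectl is already configured for the cluster; `remove_node` is a hypothetical helper name and the 120-second drain timeout is an arbitrary choice. Steps 4 and 5 run on the node itself, so they appear only as comments.

```shell
# Hypothetical helper: run from a machine with kubectl access to the cluster.
remove_node() {
    node="$1"
    if [ -z "$node" ]; then
        echo "usage: remove_node <node name>" >&2
        return 1
    fi
    # 1. Cordon: mark the node unschedulable so no new pods land on it.
    kubectl cordon "$node" || return 1
    # 2. Drain: evict the remaining pods (DaemonSet-managed pods are skipped).
    #    The timeout (an assumed value) stops the command from waiting forever.
    kubectl drain "$node" --ignore-daemonsets --force --timeout=120s || return 1
    # 3. Delete the node object from the cluster.
    kubectl delete node "$node"
}

# Steps 4 and 5, on the node itself:
#   sudo kubeadm reset
#   sudo kubeadm join ...   # full command: `kubeadm token create --print-join-command` on the master
```

On a node that is truly unreachable the drain may not complete, because the kubelet cannot confirm that the pods are gone. In that case the helper stops after step 2 and the node can still be deleted directly.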
-- P Ekambaram
Source: StackOverflow

5/24/2019

Option I:

If you work on GKE and have an HA cluster, a node in the NotReady state should be deleted automatically after a couple of minutes if autoscaling is on. After a while, a new node will be added.

Option II: If you use kubeadm:

Nodes in the NotReady state aren't deleted automatically unless you have autoscaling on and an HA cluster. The node will be continuously checked and restarted.
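
To see which nodes are currently affected, something like the following could be used. It is a small sketch, assuming the default column layout of `kubectl get nodes` (NAME, STATUS, ...); `list_unready_nodes` is a hypothetical helper name.

```shell
# Hypothetical helper: print the names of nodes whose STATUS column
# does not start with "Ready" (for example "NotReady").
list_unready_nodes() {
    kubectl get nodes --no-headers | awk '$2 !~ /^Ready/ { print $1 }'
}
```

A cordoned but healthy node shows up as `Ready,SchedulingDisabled` and is therefore not listed.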

If you have Prometheus, check the metrics to see what happened on the node in the NotReady state, or run this command on the unreachable node:

$ sudo journalctl -u kubelet

If you want a node in the NotReady state to be deleted, you should do it manually:

First drain the node and make sure it is empty before shutting it down.

$ kubectl drain <node name> --delete-local-data --force --ignore-daemonsets

$ kubectl delete node <node name>

Then, on the node being removed, reset all kubeadm installed state:

$ kubeadm reset

The reset process does not reset or clean up iptables rules or IPVS tables. If you wish to reset iptables, you must do so manually:

$ iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X

If you want to reset the IPVS tables, you must run the following command:

$ ipvsadm -C
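
The node-side cleanup above could be gathered into one function. This is a sketch rather than a definitive procedure: it assumes it runs as root on the node being removed, `reset_node` is a hypothetical helper name, and the `-f` flag is added here to skip the confirmation prompt of kubeadm reset.

```shell
# Hypothetical helper: run as root on the node being removed.
reset_node() {
    # Revert what `kubeadm init` or `kubeadm join` set up on this host.
    kubeadm reset -f || return 1
    # Flush the filter, nat and mangle tables, then delete
    # user-defined chains in the filter table.
    iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X || return 1
    # Clear the IPVS tables only if ipvsadm is available.
    if command -v ipvsadm >/dev/null 2>&1; then
        ipvsadm -C
    fi
}
```

Note that flushing iptables removes every rule on the host, not only the ones Kubernetes created.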

You can also simply shut down the desired node:

$ shutdown -h now

The -h means halt, while now means that the instruction should be carried out immediately. Different delays can be used. For example, you might use +6 instead, which tells the computer to run the shutdown procedure in six minutes.
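
The immediate and delayed forms could be wrapped as follows. This is only a sketch, assuming it runs as root on the node; `schedule_halt` is a hypothetical helper name.

```shell
# Hypothetical helper: halt now, or after the given number of minutes.
schedule_halt() {
    minutes="${1:-0}"
    # Accept only a non-negative whole number of minutes.
    case "$minutes" in
        ''|*[!0-9]*) echo "usage: schedule_halt [minutes]" >&2; return 1 ;;
    esac
    if [ "$minutes" -eq 0 ]; then
        shutdown -h now
    else
        shutdown -h "+$minutes"
    fi
}

# A pending delayed shutdown can be cancelled with: shutdown -c
```

For example, `schedule_halt 6` runs `shutdown -h +6`, and `schedule_halt` with no argument runs `shutdown -h now`.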

In this case, a new node will not be added automatically.

I hope this helps.

-- MaggieO
Source: StackOverflow

5/9/2019

As soon as a node becomes unreachable over the network, e.g. stops responding to pings, the master will automatically remove that node from the cluster.

You can delete a node manually with:

kubectl delete node NODE_NAME
-- Vasily Angapov
Source: StackOverflow