I have a k8s cluster set up using kubespray.
Last week one of my k8s nodes ran very low on storage, so all the pods on it were evicted, including some important pods like calico-node and kube-proxy (I had thought these pods were critical and would never be evicted, no matter what).
After that, all the calico-node pods became not ready. When I checked the logs, they said: `Warning: Readiness probe failed: calico/node is not ready: BIRD is not ready: BGP not established with 192.168.0.xxx`, where `192.168.0.xxx` is the IP of the problematic node above.
My question is: how can I restore that node? Is it safe to just run kubespray's cluster.yml again?
My k8s version is v1.13.3.
Thanks.
When a node comes under disk pressure, its status changes to `NotReady` and a taint is added to the node: `node.kubernetes.io/disk-pressure:NoSchedule`.
All pods running on the node get evicted, except the static control-plane pods: kube-apiserver, kube-controller-manager, and kube-scheduler. The eviction manager saves those from eviction with the message: `cannot evict a critical static pod [...]`. Pods like calico-node and kube-proxy are typically deployed as DaemonSets rather than static pods, which is why they are not protected in the same way and could be evicted.
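If you want to see what was evicted, one way (on v1.13, where the `-A` shorthand is not yet available) is to list pods in the `Failed` phase; evicted pods show up there with the reason `Evicted`:

```
# List failed pods across all namespaces; evicted pods appear
# with STATUS "Evicted"
kubectl get pods --all-namespaces --field-selector=status.phase=Failed
```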
Once the node is freed from disk pressure, its status changes back to `Ready` and the previously added taint is removed. You can check this by running `kubectl describe node <node_name>`. In the `Conditions` field you should see that `DiskPressure` has changed to `False`, which means the node has enough space available again. Similar information can also be found in the `Events` field:

```
Normal  NodeReady              1s               kubelet, node1  Node node1 status is now: NodeReady
Normal  NodeHasNoDiskPressure  1s (x2 over 1s)  kubelet, node1  Node node1 status is now: NodeHasNoDiskPressure
```
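For reference, the `Conditions` section of a recovered node's `kubectl describe node` output looks roughly like this (trimmed; the node name and omitted timestamp columns are illustrative):

```
kubectl describe node node1
# ...
# Conditions:
#   Type          Status  ...  Reason                    Message
#   ----          ------       ------                    -------
#   DiskPressure  False   ...  KubeletHasNoDiskPressure  kubelet has no disk pressure
#   Ready         True    ...  KubeletReady              kubelet is posting ready status
# ...
```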
After confirming that the node is `Ready` with sufficient disk space, you can restart kubelet and run kubespray's cluster.yml; the pods will be redeployed on the node. You just have to make sure the node is ready to handle deployments first.
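A minimal sketch of those last two steps (the inventory path below is an assumption; substitute the inventory you originally deployed with):

```
# On the recovered node: restart kubelet
sudo systemctl restart kubelet

# From the machine you run kubespray from: re-run the cluster playbook
# (inventory path is an example; adjust to your own setup)
ansible-playbook -i inventory/mycluster/hosts.ini --become --become-user=root cluster.yml
```

Re-running cluster.yml is generally safe here, since the playbook is intended to be idempotent and will reconcile the node back to the desired state rather than rebuild the whole cluster.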