Does kubectl drain first make sure that pods with replicas=1 are healthy on some other node?
Assume the pod is controlled by a deployment and can indeed be moved to other nodes. As far as I can see, drain currently only evicts (deletes) the pods from the node, without scheduling replacements first.
In addition to Suresh Vishnoi's answer:
If no PodDisruptionBudget is specified and you have a deployment with one replica, the pod will be terminated first, and only then will a new pod be scheduled on another node.
To make sure your application stays available while the node is being drained, you have to specify a PodDisruptionBudget and run more than one replica. If you have only 1 pod and a PodDisruptionBudget with minAvailable: 30%, the drain will refuse to evict it, with the following error:
error when evicting pod "pod01" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
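The refusal follows from how the percentage is rounded. As a rough sketch in plain Python (this is arithmetic only, not Kubernetes code, and the helper name `required_healthy` is made up for illustration), assuming the documented behaviour that a percentage minAvailable is rounded up to a whole pod:

```python
def required_healthy(min_available_percent: int, desired_replicas: int) -> int:
    """Pods that must stay healthy; a percentage is rounded up to a whole pod."""
    # Integer ceiling division avoids floating-point surprises.
    return -(-min_available_percent * desired_replicas // 100)


for replicas in (1, 2, 4):
    need = required_healthy(30, replicas)
    # Assumes every replica is currently healthy.
    print(f"replicas={replicas}: must keep {need}, may evict {replicas - need}")
```

With one replica, 30% rounds up to 1 pod, which leaves zero allowed evictions, so the drain keeps retrying. With two replicas, one pod may be evicted at a time.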
Briefly, this is how the draining process works:
As explained in the documentation, the kubectl drain command "safely evicts all of your pods from a node before you perform maintenance on the node and allows the pod’s containers to gracefully terminate and will respect the PodDisruptionBudgets you have specified".
Drain does two things:
1. It cordons the node: the node is marked as unschedulable, so new pods cannot be scheduled on it. This makes sense: if we know the node is about to undergo maintenance, there is no point in scheduling a pod there only to reschedule it on another node. From the Kubernetes perspective, it adds a taint to the node: node.kubernetes.io/unschedulable:NoSchedule
2. It evicts/deletes the pods: once the node is marked as unschedulable, drain tries to evict the pods running on it. It uses the Eviction API, which takes PodDisruptionBudgets into account (if the Eviction API is not supported, it deletes the pods instead). It calls the DELETE method on the Kubernetes API but honors GracePeriodSeconds, so each pod gets a chance to finish its processes.
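To make the ordering concrete, here is a toy Python model of those two steps. It is not how kubectl is implemented: the `Node` and `Deployment` classes and the `drain` function are hypothetical, and the replacement pod is created by a simulated controller only after the eviction, mirroring the behaviour described above for a deployment without a PodDisruptionBudget.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    unschedulable: bool = False
    pods: list = field(default_factory=list)


@dataclass
class Deployment:
    replicas: int  # desired state

    def reconcile(self, nodes, log):
        """Simulated controller: recreate missing pods on schedulable nodes."""
        running = sum(len(n.pods) for n in nodes)
        targets = [n for n in nodes if not n.unschedulable]
        while running < self.replicas and targets:
            targets[0].pods.append("pod")
            log.append(f"scheduled replacement on {targets[0].name}")
            running += 1


def drain(node, nodes, deployment):
    log = []
    node.unschedulable = True  # step 1: cordon
    log.append(f"cordoned {node.name}")
    while node.pods:  # step 2: evict the pods one at a time
        node.pods.pop()
        log.append(f"evicted pod from {node.name}")
        # The replacement is created only after the eviction.
        deployment.reconcile(nodes, log)
    return log


a, b = Node("node-a", pods=["pod"]), Node("node-b")
events = drain(a, [a, b], Deployment(replicas=1))
print("\n".join(events))
```

The event order (cordon, evict, then reschedule) shows the window during which a single-replica deployment has no running pod.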
New pods are scheduled whenever fewer pods are available than desired (desired state != current state), irrespective of whether the cause is a drain or a node failure.
With the PodDisruptionBudget resource, you can manage the disruption caused by draining a node.
You can specify only one of maxUnavailable and minAvailable in a single PodDisruptionBudget. maxUnavailable can only be used to control the eviction of pods that have an associated controller managing them. In the examples below, “desired replicas” is the scale of the controller managing the pods being selected by the PodDisruptionBudget. https://kubernetes.io/docs/tasks/run-application/configure-pdb/#specifying-a-poddisruptionbudget
Example 1: With a minAvailable of 5, evictions are allowed as long as they leave behind 5 or more healthy pods among those selected by the PodDisruptionBudget’s selector.
Example 2: With a minAvailable of 30%, evictions are allowed as long as at least 30% of the number of desired replicas are healthy.
Example 3: With a maxUnavailable of 5, evictions are allowed as long as there are at most 5 unhealthy replicas among the total number of desired replicas.
Example 4: With a maxUnavailable of 30%, evictions are allowed as long as no more than 30% of the desired replicas are unhealthy.
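The four examples can be checked with a small Python sketch. This is a simplified model rather than the actual controller logic: the function names `scaled` and `evictions_allowed` are invented here, percentages are assumed to round up as the Kubernetes documentation describes, and an integer minAvailable is compared against the healthy pod count only.

```python
def scaled(value, total):
    """Resolve an int or a percentage string such as "30%" against total, rounding up."""
    if isinstance(value, str) and value.endswith("%"):
        return -(-int(value[:-1]) * total // 100)  # integer ceiling division
    return int(value)


def evictions_allowed(healthy, desired, min_available=None, max_unavailable=None):
    """How many more pods could be evicted right now under a single budget."""
    if (min_available is None) == (max_unavailable is None):
        raise ValueError("specify exactly one of min_available / max_unavailable")
    if min_available is not None:
        must_stay_healthy = scaled(min_available, desired)
    else:
        must_stay_healthy = desired - scaled(max_unavailable, desired)
    return max(0, healthy - must_stay_healthy)


# 10 desired replicas, all currently healthy.
print(evictions_allowed(10, 10, min_available=5))          # Example 1 -> 5
print(evictions_allowed(10, 10, min_available="30%"))      # Example 2 -> 7
print(evictions_allowed(10, 10, max_unavailable=5))        # Example 3 -> 5
print(evictions_allowed(10, 10, max_unavailable="30%"))    # Example 4 -> 3
```

Passing both fields, or neither, raises an error, reflecting the rule that a single PodDisruptionBudget may specify only one of them.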