kubectl drain and rolling update, downtime

12/15/2019

Does kubectl drain first make sure that pods with replicas=1 are healthy on some other node?
Assume the pod is controlled by a deployment and can indeed be moved to other nodes. As far as I can see, drain currently only evicts (deletes) pods from the node, without first scheduling replacements elsewhere.

-- user3599803
kubectl
kubernetes
kubernetes-deployment

2 Answers

12/17/2019

In addition to Suresh Vishnoi's answer:

If no PodDisruptionBudget is specified and you have a deployment with one replica, the pod will be terminated first, and only then will a new pod be scheduled on another node.

To make sure your application stays available during the node draining process, you have to specify a PodDisruptionBudget and run more than one replica. If you have 1 pod with minAvailable: 30%, drain will refuse to evict it and report the following error:

error when evicting pod "pod01" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
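
To see why a single replica blocks the drain, here is a rough sketch of the arithmetic in Python. It assumes, as the Kubernetes documentation describes, that a percentage minAvailable is rounded up to a whole number of pods. The function name allowed_evictions is my own and not part of any Kubernetes API.

```python
def allowed_evictions(replicas, min_available_percent):
    """How many healthy pods may be evicted under a percentage minAvailable.

    Illustrative helper only; assumes all replicas are currently healthy.
    """
    # Round the percentage up to a whole pod (integer ceiling division).
    required = (replicas * min_available_percent + 99) // 100
    return max(0, replicas - required)


for replicas in (1, 2, 4, 10):
    print(replicas, "replica(s):", allowed_evictions(replicas, 30), "eviction(s) allowed")
```

With 1 replica, 30% rounds up to 1 required pod, so 0 evictions are allowed and the drain keeps retrying. With 2 replicas, 1 pod is still required, so 1 eviction can go through.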

Briefly, this is how the draining process works:

As explained in the documentation, the kubectl drain command "safely evicts all of your pods from a node before you perform maintenance on the node and allows the pod’s containers to gracefully terminate and will respect the PodDisruptionBudgets you have specified"

Drain does two things:

  1. Cordons the node: the node is marked as unschedulable, so new pods cannot be scheduled on it. This makes sense: if we know the node will be under maintenance, there is no point in scheduling a pod there only to reschedule it on another node later. From the Kubernetes perspective, a taint is added to the node: node.kubernetes.io/unschedulable:NoSchedule

  2. Evicts/deletes the pods: after the node is marked as unschedulable, drain tries to evict the pods running on it. It uses the Eviction API, which takes PodDisruptionBudgets into account (if the Eviction API is not supported, it deletes the pods instead). It issues a DELETE call to the Kubernetes API but honors GracePeriodSeconds, so a pod is allowed to finish its processes.
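
The two steps above can be pictured with a small toy model in Python. This is only a sketch of the ordering (cordon first, then evict), not how kubectl is implemented. The Node class, the drain function and the can_evict callback are all made up for illustration, and the real command retries blocked pods (every 5s, per the error message above) instead of returning them.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a cluster node."""
    name: str
    unschedulable: bool = False
    pods: list = field(default_factory=list)


def drain(node, can_evict):
    """Toy drain: cordon the node, then try to evict each pod once.

    can_evict stands in for the disruption budget check made by the
    Eviction API. Returns the pods that could not be evicted.
    """
    node.unschedulable = True           # step 1: cordon
    blocked = []
    for pod in list(node.pods):         # step 2: evict pod by pod
        if can_evict(pod):
            node.pods.remove(pod)       # pod is deleted from this node
        else:
            blocked.append(pod)         # eviction would violate a budget
    return blocked


node = Node("node-1", pods=["web-1", "db-1"])
blocked = drain(node, can_evict=lambda pod: pod != "db-1")
print(node.unschedulable, node.pods, blocked)
```

Note that nothing in this flow waits for a replacement pod to become healthy elsewhere. That only happens indirectly, when a PodDisruptionBudget makes the eviction check fail until enough other replicas are up.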

-- KFC_
Source: StackOverflow

12/15/2019

New pods are scheduled whenever fewer pods are available than desired (desired state != current state), irrespective of whether the cause is draining or node failure.

With the PodDisruptionBudget resource you can manage the disruption during the draining of the node.

You can specify only one of maxUnavailable and minAvailable in a single PodDisruptionBudget. maxUnavailable can only be used to control the eviction of pods that have an associated controller managing them. In the examples below, “desired replicas” is the scale of the controller managing the pods being selected by the PodDisruptionBudget. https://kubernetes.io/docs/tasks/run-application/configure-pdb/#specifying-a-poddisruptionbudget

Example 1: With a minAvailable of 5, evictions are allowed as long as they leave behind 5 or more healthy pods among those selected by the PodDisruptionBudget’s selector.

Example 2: With a minAvailable of 30%, evictions are allowed as long as at least 30% of the number of desired replicas are healthy.

Example 3: With a maxUnavailable of 5, evictions are allowed as long as there are at most 5 unhealthy replicas among the total number of desired replicas.

Example 4: With a maxUnavailable of 30%, evictions are allowed as long as no more than 30% of the desired replicas are unhealthy.
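
The four examples can be worked through with a short Python sketch. It assumes 10 desired replicas, all healthy, and assumes that percentages are rounded up, as the Kubernetes documentation describes. The helper names scaled and disruptions_allowed are hypothetical and only mirror the arithmetic, not the actual controller code.

```python
def scaled(value, total):
    """Resolve an int or a percentage string such as "30%" against total, rounding up."""
    if isinstance(value, str) and value.endswith("%"):
        percent = int(value[:-1])
        return -(-percent * total // 100)  # integer ceiling division
    return int(value)


def disruptions_allowed(desired_replicas, healthy, min_available=None, max_unavailable=None):
    """Number of healthy pods that may be evicted right now (illustrative only)."""
    if (min_available is None) == (max_unavailable is None):
        raise ValueError("specify exactly one of min_available and max_unavailable")
    if min_available is not None:
        required_healthy = scaled(min_available, desired_replicas)
    else:
        required_healthy = desired_replicas - scaled(max_unavailable, desired_replicas)
    return max(0, healthy - required_healthy)


print(disruptions_allowed(10, 10, min_available=5))        # Example 1
print(disruptions_allowed(10, 10, min_available="30%"))    # Example 2
print(disruptions_allowed(10, 10, max_unavailable=5))      # Example 3
print(disruptions_allowed(10, 10, max_unavailable="30%"))  # Example 4
```

Under these assumptions the four calls give 5, 7, 5 and 3 allowed evictions. If some pods are already unhealthy, the healthy argument drops and so does the number of evictions that drain can perform.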

-- Suresh Vishnoi
Source: StackOverflow