Can a node autoscaler automatically start an extra pod when replica count is 1 & minAvailable is also 1?

6/18/2021

Our autoscaling (horizontal and vertical) works pretty well, except that scaling down the nodes is not working (yes, we already checked the usual suspects like https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#i-have-a-couple-of-nodes-with-low-utilization-but-they-are-not-scaled-down-why).

Since we want to save resources and our pods are not ultra-sensitive, we are setting the following:

Deployment

replicas: 1

PodDisruptionBudget

minAvailable: 1

HorizontalPodAutoscaler

minReplicas: 1
maxReplicas: 10
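
Put together, the relevant manifests look roughly like this (a sketch with placeholder names, labels and metric targets; API versions depend on the cluster version, older clusters use policy/v1beta1 and autoscaling/v2beta2):

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app                  # placeholder name
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: my-app               # must match the pod labels of the Deployment (replicas: 1)
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                # the Deployment with replicas: 1
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80  # placeholder target utilization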

But it now seems that this is exactly why the autoscaler is not scaling down the nodes (even though the node is only at about 30% CPU and memory utilization, and we have other nodes with more than enough memory and CPU to take these pods).

Is it possible in general for the autoscaler to start an extra pod on the free node and then remove the old pod from the old node?

-- BvuRVKyUVlViVIc7
autoscaling
kubernetes
rancher

1 Answer

6/18/2021

Is it possible in general for the autoscaler to start an extra pod on the free node and then remove the old pod from the old node?

Yes, that should be possible in general, but in order for the cluster autoscaler to remove a node, it must be possible to move all pods running on the node somewhere else.

According to the docs, there are a few types of pods that cannot be moved:

  • Pods with restrictive PodDisruptionBudget.
  • Kube-system pods that:
    • are not run on the node by default
    • don't have a pod disruption budget set or their PDB is too restrictive (since CA 0.6).
  • Pods that are not backed by a controller object (so not created by deployment, replica set, job, stateful set etc.).
  • Pods with local storage.
  • Pods that cannot be moved elsewhere due to various constraints (lack of resources, non-matching node selectors or affinity, matching anti-affinity, etc.)
  • Pods that have the following annotation set: cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
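
In your case the first bullet is the relevant one: with replicas: 1 and minAvailable: 1, the PodDisruptionBudget allows zero voluntary disruptions, so the single pod cannot be evicted and its node cannot be drained. The last bullet refers to an annotation on the pod template; a minimal sketch (placeholder names and image) of where it goes, with "false" blocking eviction and "true" explicitly marking the pod as safe to evict:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                  # placeholder
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
      annotations:
        # "false" prevents the cluster autoscaler from evicting this pod;
        # "true" explicitly marks it as safe to evict.
        cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
    spec:
      containers:
      - name: app
        image: example/app:1.0  # placeholder image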

You could check the cluster autoscaler logs; they may provide a hint as to why no scale-in happens:

kubectl -n kube-system logs -f deployment.apps/cluster-autoscaler

Without more information about your setup it is hard to guess what is going wrong, but unless you are using local storage, node selectors, affinity/anti-affinity rules etc., pod disruption budgets are the most likely candidate. Even if you are not using them explicitly, they can still prevent node scale-in if there are pods in the kube-system namespace that are missing pod disruption budgets (see this answer for an example of such a scenario in GKE).
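
One way to verify this is to look at how many disruptions the budgets currently allow, for example (namespace and name are placeholders; the exact output columns depend on the kubectl version):

# the ALLOWED DISRUPTIONS column shows 0 for budgets that currently block eviction
kubectl get pdb --all-namespaces

# inspect a single budget in detail
kubectl -n my-namespace describe pdb my-app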

-- danielorn
Source: StackOverflow